How to Recognize Rotated Text with OCR and Barcode SDK

Xiao Ling - Mar 26 '21 - Dev Community

The accuracy of optical character recognition (OCR) is sensitive to text orientation. If a text image is rotated by 90°, 180° or 270° and carries no EXIF orientation information, recognizing the text becomes challenging. For text printed next to a barcode, we propose combining Dynamsoft Label Recognition (DLR) with Dynamsoft Barcode Reader (DBR) as the OCR solution. This article demonstrates how to build a C++ command-line application that recognizes a rotated barcode label (e.g. a price tag), which general open-source or commercial OCR SDKs struggle to recognize.

OCR: rotated text recognition

Requirements

DLR API Documentation

https://www.dynamsoft.com/label-recognition/programming/c-cplusplus/user-guide.html?ver=latest

A New OCR Approach: Text Recognition by Barcode Orientation Detection

Both the Dynamsoft barcode SDK and the Dynamsoft OCR SDK can import JSON-formatted templates. To parse JSON in C++, we use json.hpp (the nlohmann::json library), a single header file that integrates easily and requires no other dependencies.

A basic DLR runtime-settings template that takes advantage of barcode results looks as follows:

{
   "CharacterModelArray" : [
      {
         "Name" : "NumberLetter"
      }
   ],
   "LabelRecognitionParameterArray" : [
      {
         "Name" : "BarcodeLabel",
         "ReferenceRegionNameArray" : [ "defaultReferenceRegion" ]
      }
   ],
   "ReferenceRegionArray" : [
      {
         "Localization" : {
            "BarcodeFormatIds" : [ "DLR_BF_ALL" ]
         },
         "Name" : "defaultReferenceRegion"
      }
   ]
}


The barcode zone is used as the reference region for text recognition. By default, the text searching area is under the barcode.
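
To make "under the barcode" concrete for a rotated image, here is a minimal geometric sketch. It is not the SDK's implementation: the `Point` and `Quad` types, the `RegionBelowBarcode` function, and the corner order (top-left, top-right, bottom-right, bottom-left in the barcode's own frame) are assumptions made for illustration. The idea is that the barcode's side edges point from its top to its bottom, so extending them gives a search region that follows the barcode's orientation.

```cpp
#include <array>
#include <cassert>

// Hypothetical types for illustration; these are not the SDK's structures.
struct Point { int x; int y; };
// Assumed corner order: top-left, top-right, bottom-right, bottom-left,
// expressed in the barcode's own frame (not the image's).
using Quad = std::array<Point, 4>;

// Returns the quadrilateral adjacent to the barcode's bottom edge.
// heightPercent is the region height relative to the barcode height.
Quad RegionBelowBarcode(const Quad& barcode, int heightPercent) {
    assert(heightPercent > 0);
    // Edge vectors running from the top corners to the bottom corners.
    const Point left{barcode[3].x - barcode[0].x, barcode[3].y - barcode[0].y};
    const Point right{barcode[2].x - barcode[1].x, barcode[2].y - barcode[1].y};

    Quad region;
    region[0] = barcode[3];  // region top-left = barcode bottom-left
    region[1] = barcode[2];  // region top-right = barcode bottom-right
    region[2] = {barcode[2].x + right.x * heightPercent / 100,
                 barcode[2].y + right.y * heightPercent / 100};
    region[3] = {barcode[3].x + left.x * heightPercent / 100,
                 barcode[3].y + left.y * heightPercent / 100};
    return region;
}
```

For an upright barcode the region extends downward in the image. For a barcode whose image has been rotated 90° clockwise, the same call yields a region to the left of the barcode, which is where the label text has moved.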

BarcodeFormatIds is mandatory. If the input barcode symbologies are not known in advance, we set DLR_BF_ALL to cover the primary 1D and 2D barcode symbologies.

Let's create DLR and DBR objects and construct a JSON object by loading the template file:

#include <cstdio>
#include <string>
#include <fstream>
#include <iostream>

#include "include/DynamsoftLabelRecognition.h"
#include "include/DynamsoftCommon.h"
#include "include/DynamsoftBarcodeReader.h"
#include "include/json.hpp"

using json = nlohmann::json;
using namespace dynamsoft::dlr;
using namespace dynamsoft::dbr;

CLabelRecognition dlr;
dlr.InitLicense("LICENSE-KEY");

CBarcodeReader dbr;
dbr.InitLicense("LICENSE-KEY");

std::ifstream templateFile("<template file>");
json templateObj;
templateFile >> templateObj;

Before invoking the OCR APIs, we determine the reference regions based on barcode recognition results:

// Decode barcodes first; their locations become the OCR reference regions.
TextResultArray *resultArray = NULL;
dbr.DecodeBuffer(imageCopy.data, imgWidth, imgHeight, imageCopy.step.p[0], IPF_RGB_888, "");
dbr.GetAllTextResults(&resultArray);

dlr.ResetRuntimeSettings();
DLRRuntimeSettings settings = {};
dlr.GetRuntimeSettings(&settings);
// Read the template name from the loaded JSON explicitly as a std::string.
std::string templateName = templateObj["LabelRecognitionParameterArray"][0]["Name"].get<std::string>();
dlr.AppendSettingsFromString(templateObj.dump().c_str());
dlr.UpdateReferenceRegionFromBarcodeResults(resultArray, templateName.c_str());
dlr.UpdateRuntimeSettings(&settings);

Here are the decoding APIs used for different data source types:

Afterward, we can use either RecognizeByBuffer or RecognizeByFile to recognize text:

// buffer
DLRImageData data = {imageCopy.step.p[0] * imgHeight, imageCopy.data, imgWidth, imgHeight, imageCopy.step.p[0], DLR_IPF_RGB_888};
dlr.RecognizeByBuffer(&data, templateName.c_str());

// file
dlr.RecognizeByFile("<image file>", templateName.c_str());
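
In the buffer variant, the first field passed to DLRImageData is the byte length of the image, computed as stride times height, and the stride (bytes per row) is passed separately. The following standalone sketch shows that arithmetic for 8-bit RGB data. The helper names `RowStride`, `BufferLength` and `PixelOffset` are hypothetical and do not come from either SDK.

```cpp
#include <cassert>
#include <cstddef>

// Bytes per row for 8-bit RGB (3 bytes per pixel), rounded up to a multiple
// of `alignment`. An alignment of 1 means rows are tightly packed.
std::size_t RowStride(std::size_t width, std::size_t alignment = 1) {
    assert(alignment > 0);
    const std::size_t raw = width * 3;
    return (raw + alignment - 1) / alignment * alignment;
}

// Total buffer size: the stride already includes any row padding.
std::size_t BufferLength(std::size_t stride, std::size_t height) {
    return stride * height;
}

// Byte offset of the first channel of pixel (x, y).
std::size_t PixelOffset(std::size_t x, std::size_t y, std::size_t stride) {
    return y * stride + x * 3;
}
```

When rows are padded, `width * 3` underestimates the stride, which is why the snippet above takes the stride from the image object instead of recomputing it from the width.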

The returned result is an array, from which we can get text and corresponding coordinate values:

DLRResultArray* pDLRResults = NULL;
dlr.GetAllDLRResults(&pDLRResults);
if (pDLRResults != NULL)
{
  int rCount = pDLRResults->resultsCount;
  for (int ri = 0; ri < rCount; ++ri)
  {
    DLRResult* result = pDLRResults->results[ri];
    int lCount = result->lineResultsCount;

    for (int li = 0; li < lCount; ++li)
    {
      printf("Line result %d: %s\r\n", li, result->lineResults[li]->text);
      DLRPoint *points = result->lineResults[li]->location.points;
      printf("x1: %d, y1: %d, x2: %d, y2: %d, x3: %d, y3: %d, x4: %d, y4: %d\r\n", points[0].x, 
      points[0].y, points[1].x, points[1].y, points[2].x, points[2].y, points[3].x, points[3].y);
    }
  }
}
dlr.FreeDLRResults(&pDLRResults);
if (resultArray) CBarcodeReader::FreeTextResults(&resultArray);
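
The four corner points printed above can also indicate how the text line is oriented in the image. The sketch below is an illustration only: it assumes that the first two points are the start and end of the line's top edge in reading order, which this article does not confirm for DLR. `Vertex` is a hypothetical stand-in for DLRPoint and `NearestRightAngle` is a hypothetical helper.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the SDK's point type.
struct Vertex { int x; int y; };

// Direction of the edge p0 -> p1 in image coordinates (y grows downward),
// snapped to the nearest of 0, 90, 180 or 270 degrees (clockwise).
int NearestRightAngle(const Vertex& p0, const Vertex& p1) {
    assert(p0.x != p1.x || p0.y != p1.y);  // the edge must have a direction
    const double kPi = std::acos(-1.0);
    double deg = std::atan2(static_cast<double>(p1.y - p0.y),
                            static_cast<double>(p1.x - p0.x)) * 180.0 / kPi;
    if (deg < 0) deg += 360.0;
    // Round to the nearest quarter turn; 360 wraps back to 0.
    return static_cast<int>(std::lround(deg / 90.0)) % 4 * 90;
}
```

Under that assumption, calling `NearestRightAngle(points[0], points[1])` inside the loop would report which of the four orientations a recognized line has.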

Label OCR Test by Orientation

To demonstrate the effectiveness of combining the Dynamsoft barcode and OCR SDKs, we can run a simple experiment by rotating a barcode label image.
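
One way to produce the rotated test inputs without an imaging library is to rotate the raw pixel buffer directly. This is a minimal sketch that assumes a tightly packed, row-major RGB888 buffer with no row padding. The function name `Rotate90Clockwise` is hypothetical, and the article does not say how its test images were actually rotated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Rotates a tightly packed RGB888 image 90 degrees clockwise.
// The result is `height` pixels wide and `width` pixels high.
std::vector<std::uint8_t> Rotate90Clockwise(const std::vector<std::uint8_t>& src,
                                            std::size_t width, std::size_t height) {
    assert(src.size() == width * height * 3);
    std::vector<std::uint8_t> dst(src.size());
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            // Source pixel (x, y) lands at column (height - 1 - y), row x.
            const std::size_t from = (y * width + x) * 3;
            const std::size_t to = (x * height + (height - 1 - y)) * 3;
            dst[to] = src[from];
            dst[to + 1] = src[from + 1];
            dst[to + 2] = src[from + 2];
        }
    }
    return dst;
}
```

Applying the function once, twice and three times gives the 90°, 180° and 270° variants; remember to swap width and height after each call.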

Sample Image
codabar

0 degree

text recognition by 0 degree

90 degree

text recognition by 90 degree

180 degree

text recognition by 180 degree

270 degree

text recognition by 270 degree

As long as a barcode is present, the text next to it can be recognized successfully.

Source Code

https://github.com/yushulx/cmake-cpp-barcode-qrcode/tree/main/examples/10.x/opencv_camera
