OpenCV and tesseract
Do you need to OCR an OpenCV image? No problem with Tesseract.
Here is a simple example of how to OCR OpenCV images.
Check that OpenCV is installed (e.g. rpm -qa | grep -i opencv on openSUSE) or download it from the official OpenCV site (e.g. for Windows).
I assume you have Tesseract installed as explained in the tesserocr blog post.
For example, we can use images available from the Electronic Text Center.
Let's create the file opencv_tesseract.cpp with the following code:
#include <cstdio>
#include <iostream>
#include <string>
#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#include <opencv2/opencv.hpp>
#include <opencv2/imgproc.hpp>

int main(int argc, char* argv[]) {
    if (argc < 2) {
        printf("Program usage:\n\t %s image_filename\n", argv[0]);
        return 0;
    }
    std::string imPath = argv[1];
    // load as 8-bit grayscale: one byte per pixel
    cv::Mat cv_image = cv::imread(imPath, cv::IMREAD_GRAYSCALE);
    if (cv_image.empty()) {
        std::cerr << "Failed to read image: " << imPath << std::endl;
        return 1;
    }
    setMsgSeverity(9);  // turn off leptonica messages
    tesseract::TessBaseAPI* ocr = new tesseract::TessBaseAPI();
    // suppress tesseract debug messages
    ocr->SetVariable("debug_file", "/dev/null");
    if (ocr->Init(NULL, "eng")) {
        std::cout << "Failed to initialize Tesseract." << std::endl;
        delete ocr;
        return 1;
    }
    ocr->SetVariable("user_defined_dpi", "300");
    // arguments: pixel data, width, height, bytes per pixel, bytes per line
    ocr->SetImage(cv_image.data, cv_image.cols, cv_image.rows, 1,
                  static_cast<int>(cv_image.step));
    char* str = ocr->GetUTF8Text();
    if (str) {
        std::cout << str << std::endl;
        delete[] str;  // the returned text must be freed with delete[]
    }
    ocr->Clear();
    ocr->End();
    delete ocr;
    return 0;
}
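The last two arguments of SetImage in the listing above are the bytes per pixel and the bytes per line. For an 8-bit grayscale image freshly loaded with cv::imread the line length equals the width, but for a cropped view that shares memory with a larger image it does not. The sketch below illustrates this with plain C++ only; GrayView, crop and pixelAt are hypothetical names made up for this example, not OpenCV or Tesseract types. In cv::Mat terms, bytesPerLine plays the role of Mat::step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an 8-bit grayscale image view (not a real
// OpenCV or Tesseract type): a pointer to the first pixel plus the layout
// information that a raw-buffer API needs.
struct GrayView {
    const std::uint8_t* data;  // first pixel of the view
    int width;                 // pixels per row
    int height;                // number of rows
    int bytesPerLine;          // distance in bytes between row starts
};

// Crop a rectangle without copying: the crop keeps the parent's stride.
GrayView crop(const GrayView& parent, int x, int y, int w, int h) {
    assert(x >= 0 && y >= 0 && w > 0 && h > 0);
    assert(x + w <= parent.width && y + h <= parent.height);
    const std::size_t offset =
        static_cast<std::size_t>(y) * parent.bytesPerLine + x;
    return GrayView{parent.data + offset, w, h, parent.bytesPerLine};
}

// Read pixel (col, row) the way a consumer of (data, bytesPerLine) would.
std::uint8_t pixelAt(const GrayView& v, int col, int row) {
    assert(col >= 0 && col < v.width && row >= 0 && row < v.height);
    return v.data[static_cast<std::size_t>(row) * v.bytesPerLine + col];
}
```

With an 8x4 parent buffer, a 3x2 crop taken at (2, 1) still has bytesPerLine equal to 8, so passing the crop's width as the line length would read the wrong pixels from the second row onward.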
Now set up the environment:
"c:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvars64.bat" x64
SET PATH=%PATH%;f:\win64\bin
SET TESSDATA_PREFIX=f:\Project\tessdata
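Tesseract's Init fails if it cannot find the language data, so it can help to check the environment before running the program. Here is a minimal sketch, assuming TESSDATA_PREFIX points directly at the folder that contains eng.traineddata (as in the SET line above). The helper names trainedDataPath and languageDataAvailable are hypothetical and not part of the Tesseract API.

```cpp
#include <cassert>
#include <cstdlib>
#include <fstream>
#include <string>

// Build the path of the language model file inside a tessdata directory.
// Assumption: the directory directly contains <lang>.traineddata.
std::string trainedDataPath(const std::string& tessdataDir,
                            const std::string& lang) {
    assert(!lang.empty());
    std::string path = tessdataDir;
    // Add a separator only if the directory does not already end with one.
    if (!path.empty() && path.back() != '/' && path.back() != '\\')
        path += '/';
    return path + lang + ".traineddata";
}

// Return true if TESSDATA_PREFIX is set and the model file can be opened.
bool languageDataAvailable(const std::string& lang) {
    const char* prefix = std::getenv("TESSDATA_PREFIX");
    if (prefix == nullptr || *prefix == '\0')
        return false;
    std::ifstream file(trainedDataPath(prefix, lang), std::ios::binary);
    return file.good();
}
```

Calling languageDataAvailable("eng") at the start of main and printing a hint when it returns false gives a clearer message than a bare initialization failure.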
Build your code:
cl /EHsc opencv_tesseract.cpp /If:\win64\include ^
  /If:\opencv2\opencv\build\include ^
  /link /LIBPATH:f:/win64/lib ^
  /LIBPATH:f:\opencv2\opencv\build\x64\vc15\lib\ ^
  tesseract41.lib leptonica-1.81.0.lib opencv_world451.lib ^
  /machine:x64 /out:opencv_tesseract.exe
And run it:
opencv_tesseract.exe robertson.jpg
 