OpenCV and tesseract
Do you need to OCR an OpenCV image? That is no problem with tesseract.
Here is a simple example of how to OCR OpenCV images.
First check that OpenCV is installed (e.g. rpm -qa | grep -i opencv on openSUSE), or download it from the official OpenCV site (e.g. the build for Windows).
I assume tesseract is already installed, as explained in the tesserocr blog.
As test input we can use images available from the Electronic Text Center.
Let's create the file opencv_tesseract.cpp with the following code:
#include <cstdio>
#include <iostream>
#include <string>
#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#include <opencv2/opencv.hpp>
#include <opencv2/imgproc.hpp>

int main(int argc, char* argv[]) {
    if (argc == 1) {
        printf("Program usage:\n\t %s image_filename\n", argv[0]);
        return 0;
    }
    std::string imPath = argv[1];
    // load the image as 8-bit grayscale (one byte per pixel)
    cv::Mat cv_image = cv::imread(imPath, cv::IMREAD_GRAYSCALE);
    if (cv_image.empty()) {
        std::cout << "Failed to read image " << imPath << std::endl;
        return 1;
    }
    setMsgSeverity(9); // turn off leptonica messages
    tesseract::TessBaseAPI* ocr = new tesseract::TessBaseAPI();
    // suppress tesseract debug messages
    ocr->SetVariable("debug_file", "/dev/null");
    if (ocr->Init(NULL, "eng")) {
        std::cout << "Failed to initialize Tesseract." << std::endl;
        delete ocr;
        return 1;
    }
    ocr->SetVariable("user_defined_dpi", "300");
    // arguments: pixel data, width, height, bytes per pixel, bytes per line
    ocr->SetImage(cv_image.data, cv_image.cols, cv_image.rows, 1,
                  static_cast<int>(cv_image.step));
    char* str = ocr->GetUTF8Text();
    if (str) {
        std::cout << str << std::endl;
        delete[] str; // the text buffer is allocated with new[]
    }
    ocr->Clear();
    ocr->End();
    delete ocr;
    return 0;
}
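The SetImage call is the step where OpenCV and tesseract meet: besides the pixel pointer it takes the width, the height, the number of bytes per pixel and the number of bytes per line. The following is only a minimal sketch of how these four numbers relate to what a cv::Mat reports. The names RawImageLayout and describe_8bit_image are hypothetical helpers made up for illustration (they are not part of either library), and the sketch assumes 8-bit samples, i.e. one byte per channel.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type (not part of tesseract or OpenCV): the last four
// arguments of SetImage(data, width, height, bytes_per_pixel, bytes_per_line).
struct RawImageLayout {
    int width;
    int height;
    int bytes_per_pixel;
    int bytes_per_line;
};

// Derive the layout from the values a cv::Mat exposes as cols, rows,
// channels() and step. Assumption: 8-bit samples, one byte per channel.
inline RawImageLayout describe_8bit_image(int cols, int rows, int channels,
                                          std::size_t row_stride_bytes) {
    assert(cols > 0 && rows > 0 && channels > 0);
    // Rows may be padded, so the stride can be larger than cols * channels,
    // but it can never be smaller.
    assert(row_stride_bytes >= static_cast<std::size_t>(cols) * channels);
    return RawImageLayout{cols, rows, channels,
                          static_cast<int>(row_stride_bytes)};
}
```

For the grayscale cv_image in the program above, describe_8bit_image(cv_image.cols, cv_image.rows, cv_image.channels(), cv_image.step) would give 1 byte per pixel and a line length equal to the row stride of the matrix.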
Now set up the environment:
"c:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Auxiliary\Build\vcvars64.bat" x64
SET PATH=%PATH%;f:\win64\bin
SET TESSDATA_PREFIX=f:\Project\tessdata
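If Init fails at run time, a wrong TESSDATA_PREFIX is a common reason. The sketch below shows one possible way to check the setting before starting OCR. It assumes that TESSDATA_PREFIX points at the directory that directly contains the *.traineddata files, as in the SET command above. The helpers traineddata_path and language_data_available are hypothetical names used only for this illustration, and the code needs C++17 for std::filesystem (with cl that means adding /std:c++17).

```cpp
#include <cassert>
#include <cstdlib>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: where the model file for `lang` is expected, assuming
// `tessdata_dir` directly contains the *.traineddata files.
inline fs::path traineddata_path(const fs::path& tessdata_dir,
                                 const std::string& lang) {
    assert(!lang.empty());
    return tessdata_dir / (lang + ".traineddata");
}

// Hypothetical helper: true if TESSDATA_PREFIX is set and the model file
// for `lang` exists in that directory.
inline bool language_data_available(const std::string& lang) {
    const char* prefix = std::getenv("TESSDATA_PREFIX");
    if (prefix == nullptr || *prefix == '\0') {
        return false; // variable is not set or is empty
    }
    return fs::exists(traineddata_path(fs::path(prefix), lang));
}
```

Calling language_data_available("eng") before ocr->Init(NULL, "eng") would let the program print a more helpful message than the generic initialization failure.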
Build your code:
cl /EHsc opencv_tesseract.cpp /If:\win64\include ^
 /If:\opencv2\opencv\build\include ^
 /link /LIBPATH:f:/win64/lib ^
 /LIBPATH:f:\opencv2\opencv\build\x64\vc15\lib\ ^
 tesseract41.lib leptonica-1.81.0.lib opencv_world451.lib ^
 /machine:x64 /out:opencv_tesseract.exe
And run it:
opencv_tesseract.exe robertson.jpg