Posts

Showing posts with the label ocr

OCR pdf file in python on the fly

With PyMuPDF and tesserocr you can easily OCR an image-only PDF.
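
As a rough illustration of the idea (not necessarily the approach taken in the post itself), here is a minimal sketch that renders each page with PyMuPDF and passes it to tesserocr through Pillow. The function names `zoom_for_dpi` and `ocr_pdf`, the 300 dpi default and the `eng` language are assumptions chosen for this example, and Pillow is an extra dependency the teaser does not mention.

```python
import io


def zoom_for_dpi(dpi):
    """Scale factor for rendering a PDF page at the given resolution.

    PDF user space is defined at 72 units per inch, so 144 dpi means zoom 2.0.
    """
    return dpi / 72.0


def ocr_pdf(path, lang="eng", dpi=300):
    """Return a list holding the recognized text of each page (sketch)."""
    # Third-party imports are local so the helper above works without them.
    import fitz  # PyMuPDF
    from PIL import Image
    from tesserocr import PyTessBaseAPI

    zoom = zoom_for_dpi(dpi)
    texts = []
    with fitz.open(path) as doc, PyTessBaseAPI(lang=lang) as api:
        for page in doc:
            # Render the page to a raster image in memory, no temporary files.
            pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom))
            image = Image.open(io.BytesIO(pix.tobytes("png")))
            api.SetImage(image)
            texts.append(api.GetUTF8Text())
    return texts
```

Calling `ocr_pdf("scan.pdf")` (a hypothetical file name) would return one string per page. Rendering at a higher resolution usually helps recognition, at the cost of speed and memory.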

Create searchable pdf with c++ and tesseract

Many office machines create a PDF as the result of a scan instead of an image. Unfortunately, they do not always include a text layer for copy and paste, or they include a text layer based on the scanner's default language rather than the document's language. In such cases you can use tesseract to create a "searchable PDF".

OpenCV and tesseract

Do you need to OCR an OpenCV image? No problem with tesseract.
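
One possible way to do this from Python, sketched here under assumptions, is to hand the raw pixel buffer of the OpenCV image to tesserocr. The helper names `image_layout` and `ocr_cv_image` are made up for this example, the image is assumed to be an 8-bit BGR array as returned by `cv2.imread`, and the post itself may use a different route.

```python
def image_layout(shape):
    """Map the shape (height, width[, channels]) of a uint8 image to
    (width, height, bytes_per_pixel, bytes_per_line)."""
    height, width = shape[:2]
    channels = shape[2] if len(shape) == 3 else 1
    return width, height, channels, width * channels


def ocr_cv_image(bgr, lang="eng"):
    """Recognize text in an 8-bit BGR OpenCV image (sketch)."""
    # Third-party imports are local so the helper above works without them.
    import cv2
    from tesserocr import PyTessBaseAPI

    # OpenCV stores colour images as BGR; convert to RGB before OCR.
    rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
    with PyTessBaseAPI(lang=lang) as api:
        api.SetImageBytes(rgb.tobytes(), *image_layout(rgb.shape))
        return api.GetUTF8Text()
```

With an image loaded by `cv2.imread("page.png")` (a hypothetical file), `ocr_cv_image(img)` would return the recognized text as a string. Passing the buffer directly avoids writing the image to disk first.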

Building tesserocr on MS Windows 64bit

If you are looking for an efficient way to use tesseract OCR in Python, you will need tesserocr. However, there is no recent version of the project for current versions of Python on Windows, so you have to build it yourself.