Cross Compile Tesseract For Android On Windows 10

Do you want to test Tesseract on Android? Here is a short description of how to cross-compile it on Windows (without Visual Studio ;-) ).

Install requirements

Windows 10 includes the command-line utility winget for installing software. This makes installing the dependencies easier:

  winget install cmake
  winget install Git.Git
Git comes with many useful GNU utilities, so we add its usr\bin directory to PATH:
  SET PATH=%PATH%;c:\Program Files\Git\usr\bin;
You may also want to use a modern terminal on Windows, so consider installing and configuring Windows Terminal:
  winget install Microsoft.WindowsTerminal

Finally, you need to (manually) download the Android NDK (e.g. android-ndk-r21e-windows-x86_64.zip) from https://developer.android.com/ndk/downloads and the SDK Platform-Tools for Windows from https://developer.android.com/studio/releases/platform-tools and unzip them to a location of your choice. In my case:

  unzip -q -o android-ndk-r21e-windows-x86_64.zip -d f:/0Android
  unzip -q -o platform-tools_r31.0.2-windows.zip -d f:/0Android

BTW: Windows 10 (build 17063 or later) includes the cURL utility for downloading ;-)

Build

Setting up environment

You can adjust these variables based on your needs:
  SET INSTALL_DIR=F:/0Android/custom
  SET NDK=f:/0Android/android-ndk-r21e
  SET TOOLCHAIN=%NDK%/toolchains/llvm/prebuilt/windows-x86_64
  SET PATH=%PATH%;%TOOLCHAIN%\bin;f:\0Android\platform-tools;
  SET TARGET=aarch64-linux-android
  SET API=21
  SET ABI=arm64-v8a
  SET MINSDKVERSION=16
  SET CXX=%TOOLCHAIN%/bin/%TARGET%%API%-clang++
  SET CC=%TOOLCHAIN%/bin/%TARGET%%API%-clang
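
If you change ABI, then TARGET has to change with it. As a rough sketch (runnable in the Git Bash shell that comes with Git), the mapping between ABI names and NDK clang triples could be written down like this. The function names ndk_triple and ndk_clang are made up for this example; the triples are the ones the NDK uses for its four supported ABIs.

```shell
# Map an Android ABI name to the clang target triple used by the NDK.
# (ndk_triple is a hypothetical helper, not part of the NDK.)
ndk_triple() {
  case "$1" in
    arm64-v8a)   echo "aarch64-linux-android" ;;
    armeabi-v7a) echo "armv7a-linux-androideabi" ;;
    x86)         echo "i686-linux-android" ;;
    x86_64)      echo "x86_64-linux-android" ;;
    *)           echo "unknown ABI: $1" >&2; return 1 ;;
  esac
}

# Compose the compiler wrapper name the same way CC is composed above:
# <triple><api>-clang
ndk_clang() {
  triple=$(ndk_triple "$1") || return 1
  echo "${triple}${2}-clang"
}

ndk_clang arm64-v8a 21   # prints aarch64-linux-android21-clang
```

The values printed for arm64-v8a and API 21 match TARGET, API and CC as set above.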
The Android NDK comes with a precompiled zlib and the clang compiler, so the build is quite easy:

PNG

 curl -L -O https://vorboss.dl.sourceforge.net/project/libpng/libpng16/1.6.37/lpng1637.zip
  unzip lpng1637.zip
  cd lpng1637
  cmake -Bbuild -G"Unix Makefiles" -DHAVE_LD_VERSION_SCRIPT=OFF ^
    -DCMAKE_TOOLCHAIN_FILE=%NDK%/build/cmake/android.toolchain.cmake ^
    -DANDROID_PLATFORM=android-21 ^
    -DCMAKE_MAKE_PROGRAM=%NDK%\prebuilt\windows-x86_64\bin\make.exe ^
    -DANDROID_TOOLCHAIN=clang -DANDROID_ABI="arm64-v8a" ^
    -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH=%INSTALL_DIR% ^
    -DCMAKE_INSTALL_PREFIX=%INSTALL_DIR% 
  cmake --build build --config Release --target install

Leptonica

 git clone --depth 1 https://github.com/DanBloomberg/leptonica.git
 cd leptonica
 cmake -Bbuild -G"Unix Makefiles" -DBUILD_PROG=OFF -DSW_BUILD=OFF ^
  -DBUILD_SHARED_LIBS=ON ^
  -DPNG_LIBRARY=%INSTALL_DIR%/lib/libpng.so -DPNG_PNG_INCLUDE_DIR=%INSTALL_DIR%\include ^
  -DCMAKE_TOOLCHAIN_FILE=%NDK%/build/cmake/android.toolchain.cmake ^
  -DANDROID_PLATFORM=android-21 ^
  -DCMAKE_MAKE_PROGRAM=%NDK%\prebuilt\windows-x86_64\bin\make.exe ^
  -DANDROID_TOOLCHAIN=clang -DANDROID_ABI=arm64-v8a -DCMAKE_BUILD_TYPE=Release ^
  -DCMAKE_INSTALL_PREFIX=%INSTALL_DIR% ^
  -DCMAKE_PREFIX_PATH=%INSTALL_DIR%;%INSTALL_DIR%/lib;%INSTALL_DIR%/include;%INSTALL_DIR%/lib/cmake
 cmake --build build --config Release --target install

tesseract

 git clone https://github.com/tesseract-ocr/tesseract.git
 cd tesseract
 cmake -Bbuild -G"Unix Makefiles" ^
  -DBUILD_TRAINING_TOOLS=OFF -DGRAPHICS_DISABLED=ON ^
  -DSW_BUILD=OFF -DOPENMP_BUILD=OFF ^
  -DBUILD_SHARED_LIBS=ON ^
  -DLeptonica_DIR=%INSTALL_DIR%\lib\cmake ^
  -DCMAKE_TOOLCHAIN_FILE=%NDK%/build/cmake/android.toolchain.cmake ^
  -DANDROID_PLATFORM=android-21 ^
  -DCMAKE_MAKE_PROGRAM=%NDK%\prebuilt\windows-x86_64\bin\make.exe ^
  -DANDROID_TOOLCHAIN=clang -DANDROID_ABI=arm64-v8a ^
  -DCMAKE_BUILD_TYPE=Release ^
  -DCMAKE_INSTALL_PREFIX=%INSTALL_DIR% ^
  -DCMAKE_PREFIX_PATH=%INSTALL_DIR%;%INSTALL_DIR%/lib;%INSTALL_DIR%/include;%INSTALL_DIR%/lib/cmake
 cmake --build build --config Release --target install
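
Before moving on to the device, it can be worth checking that the three install steps really produced the files that get pushed in the next section. Here is a minimal sketch for Git Bash; missing_artifacts is a name invented for this example, and the file list is simply the set of files this post pushes with adb.

```shell
# Print every expected build artifact that is missing under the given
# install prefix. Returns non-zero if at least one file is missing.
# (missing_artifacts is a hypothetical helper written for this post.)
missing_artifacts() {
  prefix=$1
  rc=0
  for f in bin/tesseract lib/libtesseract.so lib/libleptonica.so lib/libpng16.so; do
    if [ ! -e "$prefix/$f" ]; then
      echo "missing: $prefix/$f"
      rc=1
    fi
  done
  return $rc
}

# Example call: falls back to the install prefix used in this post
# when INSTALL_DIR is not set in the current shell.
missing_artifacts "${INSTALL_DIR:-F:/0Android/custom}" || echo "build is incomplete"
```

If nothing is printed, all four files are in place.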

Testing on android device

Installation via adb

Connect your Android device via USB cable and enable USB debugging. Check that the device is online with the command "adb devices". Make sure the device is unlocked, because you will need to confirm adb access on it. First we transfer the Tesseract binary, its libraries and the data files to a location from which executables can be run: /data/local/tmp/ (other locations do not permit running executables)

  adb push %INSTALL_DIR%/bin/tesseract /data/local/tmp/
  adb push %INSTALL_DIR%/lib/libtesseract.so /data/local/tmp/
  adb push %INSTALL_DIR%/lib/libleptonica.so /data/local/tmp/
  adb push %INSTALL_DIR%/lib/libpng16.so /data/local/tmp/
  curl -L -O "https://raw.githubusercontent.com/tesseract-ocr/tessdata/master/eng.traineddata"
  adb push eng.traineddata /data/local/tmp/
  curl -L -O https://raw.githubusercontent.com/tesseract-ocr/test/ebaee164bb39fe55b601b95b92db686d3c7da265/testing/phototest-rotated-R.png
  adb push phototest-rotated-R.png /data/local/tmp/

Testing tesseract in adb shell

  adb shell
  $ export LD_LIBRARY_PATH=/data/local/tmp:$LD_LIBRARY_PATH:.
  $ export PATH=/data/local/tmp:$PATH
  $ cd /data/local/tmp/
  $ chmod 755 tesseract
  $ ./tesseract -v
  tesseract 5.0.0-alpha-20210401
   leptonica-1.82.0
    libpng 1.6.37 : zlib 1.2.11
   Found NEON
  $ ./tesseract phototest-rotated-R.png -
  Error in pixReadMemTiff: function not present
  Error in pixReadMem: tiff: no pix returned
  Error in pixaGenerateFontFromString: pix not made
  Error in bmfCreate: font pixa not made
  This is a lot of 12 point text to test the ocr code and see if it works on all types of file format.
  The quick brown dog jumped over the lazy fox. The quick brown dog jumped over the lazy fox. The quick brown dog jumped over the lazy fox. The quick brown dog jumped over the lazy fox.

The error messages are harmless: we did not build Leptonica with all of its optional dependencies (TIFF support is missing here), and that does not affect the OCR result. If you want to turn the messages off, set this environment variable:

  $ export LEPT_MSG_SEVERITY=7

TIPS

Uninstall

If you need to uninstall leptonica or tesseract, just run this in the build directory:

  cat install_manifest.txt | dos2unix | xargs rm

All the necessary commands ship with Git ;-)
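
The dos2unix step is there to strip carriage returns from the manifest before the paths reach rm. If you would rather see what is going to be deleted first, a dry run along these lines may help. This is only a sketch: manifest_paths is a made-up helper, and tr -d '\r' stands in for dos2unix.

```shell
# Print the paths listed in a CMake install manifest, one per line,
# with carriage returns removed (tr -d '\r' is used instead of dos2unix).
# (manifest_paths is a hypothetical helper written for this post.)
manifest_paths() {
  tr -d '\r' < "$1"
}

# Dry run: list what would be removed instead of removing it.
if [ -f install_manifest.txt ]; then
  manifest_paths install_manifest.txt | sed 's/^/would remove: /'
fi
```

Once the list looks right, run the real uninstall command shown above.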

Reconfigure build

If you want to reconfigure your cmake build, the most reliable way is to remove all files in the build directory. You can do it easily from the command line:

  rm -R build/*
