

Tesseract is an OCR engine that was developed at HP Labs between 19 - it is currently managed by a team at Google. The latest stable release can be found on the downloads page of their website.

A binary installer exists for Windows, and specific instructions for installing on a Mac through MacPorts can be found in the Tesseract readme. Linux users, and anyone else compiling Tesseract from source, will also need the Leptonica library and the appropriate source-building tools installed.

Configuration

Additional Language Support

Tesseract requires little configuration out of the box. That being said, Islandora supports the installation of multiple languages for OCR processing, and may even require English language support. These additional languages can be found on Tesseract's download page. To install additional languages into Islandora, you will need to know the path to your Tesseract installation's 'tessdata' folder.
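Once the 'tessdata' folder has been located, it can help to confirm which languages are already present before downloading more. The following is a minimal sketch, not part of Islandora or Tesseract: it assumes each language pack is stored as a `<code>.traineddata` file directly inside that folder, and the function name `installed_languages` is hypothetical.

```python
from pathlib import Path


def installed_languages(tessdata_dir):
    """Return the language codes that have a .traineddata file in tessdata_dir.

    Assumption: language packs are stored as <code>.traineddata
    (for example eng.traineddata) directly inside the folder.
    """
    folder = Path(tessdata_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"tessdata folder not found: {folder}")
    # The file stem is the language code: "eng.traineddata" -> "eng".
    return sorted(p.stem for p in folder.glob("*.traineddata"))
```

Calling `installed_languages("/usr/local/share/tessdata")` would list the codes found there. That path is only an illustrative guess, so substitute the location used by your own installation.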

For the Islandora OCR module to create OCR derivatives, Tesseract 3.02.02 or higher is required. At the time of writing, this is the latest stable version. For Linux installations: while your distribution's package manager may well carry Tesseract in one of its repositories, it is extremely unlikely to be the correct version. This means that you will likely have to compile it from source.
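To check whether a packaged or compiled Tesseract meets the 3.02.02 requirement, its reported version can be compared against that minimum. The sketch below is one possible approach rather than anything Islandora itself ships. It assumes the `tesseract` binary is on the PATH and that `tesseract --version` prints a line beginning with `tesseract` followed by a dotted version number. The helper names are hypothetical.

```python
import re
import subprocess

MINIMUM = (3, 2, 2)  # 3.02.02, the minimum stated for the Islandora OCR module


def parse_version(text):
    """Extract a version tuple from `tesseract --version` style output."""
    match = re.search(r"tesseract\s+v?(\d+(?:\.\d+)*)", text)
    if match is None:
        return None
    # "3.02.02" -> (3, 2, 2); leading zeros are dropped by int().
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version, minimum=MINIMUM):
    """True if the parsed version is at least the required minimum."""
    return version is not None and version >= minimum


def installed_version():
    """Run `tesseract --version`; return None if the binary is missing."""
    try:
        result = subprocess.run(
            ["tesseract", "--version"], capture_output=True, text=True
        )
    except FileNotFoundError:
        return None
    # Read both streams, since the version line may appear on either.
    return parse_version(result.stdout + result.stderr)
```

`meets_minimum(installed_version())` returns `False` both when Tesseract is too old and when it is not installed at all, which is usually the signal to build from source.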
