
Tesseract OCR

T-Plan Robot Enterprise v2.2 introduced support for Tesseract OCR v2, which is considered to be the most accurate open source engine for optical character recognition. T-Plan Robot Enterprise integrates with this engine to recognize text displayed on the connected remote desktop or device. The engine is exposed in the scripting language as a standard image comparison method called "tocr" and can be used through the standard CompareTo, Screenshot and WaitFor commands.

The engine is not part of the T-Plan Robot Enterprise product and must be downloaded and installed separately. T-Plan Robot Enterprise then only needs to know where the engine is installed, which can be configured in the Tesseract OCR panel of the Preferences window. See the method documentation for installation and configuration instructions.

To achieve the best results, it is highly recommended to restrict the recognition to a small area where the text may appear. For example, to get the text of a button, find its corners using the image search method and then limit the OCR to the button rectangle. When the engine is used to recognize text in a larger area that contains multiple GUI controls, or even against the whole desktop, it can be distracted by the non-text GUI elements and produce inaccurate results.
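
The arithmetic behind limiting OCR to a button rectangle can be sketched as follows. This is a minimal illustration in Python, not T-Plan Robot code: the names `Rect`, `rect_from_corners` and `format_area` are hypothetical, the corner coordinates are made-up sample values, and the `x:..,y:..,w:..,h:..` string layout is an assumption that should be checked against the method documentation.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    """Rectangle given by its top left corner, width and height (pixels)."""
    x: int
    y: int
    w: int
    h: int


def rect_from_corners(top_left: Tuple[int, int],
                      bottom_right: Tuple[int, int]) -> Rect:
    """Build the rectangle spanned by two opposite corner points.

    The corner points are assumed to come from an earlier image search
    that located the button on the screen.
    """
    (x1, y1), (x2, y2) = top_left, bottom_right
    if x2 <= x1 or y2 <= y1:
        raise ValueError("bottom_right must lie below and right of top_left")
    return Rect(x1, y1, x2 - x1, y2 - y1)


def format_area(rect: Rect) -> str:
    """Format the rectangle as text (assumed layout, for illustration only)."""
    return f"x:{rect.x},y:{rect.y},w:{rect.w},h:{rect.h}"


# Sample corner coordinates (hypothetical values).
button = rect_from_corners((100, 200), (180, 225))
area = format_area(button)
print(area)
```

Passing only this rectangle to the OCR step keeps neighbouring controls out of the recognized area.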

The following example shows how to employ the OCR engine to recognize the date and time on the Windows task bar and take advantage of a regular expression to test whether the current month is August.

[Figure: Tesseract OCR demo on Windows 7]
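
The regular expression part of such a check can be sketched independently of the OCR engine. The Python snippet below is only an illustration under stated assumptions: the recognized text is a made-up sample, the date is assumed to be in month/day/year form (the actual task bar format depends on the Windows regional settings), and `AUGUST_PATTERN` and `is_august` are hypothetical names.

```python
import re

# Assumed date layout: month/day/year, for example "8/14/2014" or "08/14/2014".
# The month must be 8, optionally written with a leading zero.
AUGUST_PATTERN = re.compile(r"\b0?8/\d{1,2}/\d{4}\b")


def is_august(recognized_text: str) -> bool:
    """Return True if the recognized text contains a date in August."""
    return AUGUST_PATTERN.search(recognized_text) is not None


# Made-up sample of what the OCR might return for the task bar clock area.
sample = "10:41 AM\n8/14/2014"
print(is_august(sample))
```

The word boundary at the start of the pattern prevents a day or year digit from being mistaken for the month, so a date such as 12/8/2014 does not match.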

 

12 December 2014

Copyright © T-Plan Ltd.

Version 1.0