Image Based Text Recognition with T-Plan Robot Enterprise
2. Plugin installation and configuration
3.1 Character Image Collections
3.2 Character Capture Wizard
3.3 The "text"
Image Based Text Recognition (IBTR) is
feature introduced in T-Plan Robot Enterprise 3.0. It is also
available for T-Plan Robot 2.3 and higher in form of a standalone
The goal of this feature is to allow extraction
from a computer screen or an image. This is fairly
similar to functionality of an OCR engine save that characters are
recognized through a collection of presaved character images. rather
than character shapes stored in form of a vector graphics. Compared to
OCR this principle has one major disadvantage. As OCR recognizes
character shapes, it works out of the box for most fonts types and
sizes. Robot's IBTR on the other hand recognizes just the characters of
the particular type, size, color and background which have been
extracted from the screen and stored to an image collection.
2. Plugin installation and configuration
There's no need to install anything if you run T-Plan Robot 3.0.
To install IBTR on T-Plan Robot 2.3 and higher:
To verify that the plugin has been installed make sure that:
- Download the plugin from http://www.t-plan.ltd.uk/releases/robot/plugins/TextRecognizer.zip
- Unzip the file and copy the
plugins/ directory under the Robot installation
- Start or restart Robot.
To uninstall the plugin simply delete the
- The Tools
menu contains a new item called Character Capture
- The Script->Compareto
Command window contains a new item called "text" in the Comparison method
file from the
3. Using Image Based Text Recognition in Robot
The IBTR functionality consists of three components:
- Character image collections,
- The "text" comparison method,
- Character Capture wizard.
3.1 Character Image Collections
A Character Image Collection
is a directory with character images which meets these conventions:
An example of a character collection containing two variants of the 'a'
character and one 'o' character may look like:
- The collection directory contains just
folders named 'uNNNN' (a "character
folder") where 'NNNN' is the four digit upper case hexadecimal
UTF-8 character code. For example, the 'M' character must be
represented by a folder called 'u004D' because its UTF-8 code is
- Each character folder contains one or
more character images in a Java supported lossless format (PNG is
preferred, BMP is also supported but tends to be large). The image file
names are not relevant but the file extensions must be correct with
regard to the image format. For example, a PNG image must be stored in
a file with the ".png" extension.
Though Robot supports creation and maintenance of character image
collections through the Character Capture Wizard,
a collection may be created and edited by hand or in third
party image editors provided that the conventions are met. This
may be useful to exploit some features of the image
algorithm used internally by the IBTR. For example, to
create a character collection which would work on any background one
may edit the character images in an image editor and make the
background transparent or translucent.
- As the internal algorithm derives the
height of the text line and the space size from the collection images,
do not mix images of characters of different
and/or size into a single collection. Mixing of
character images of different font and background colors is OK as
long as the characters are of the same font type and size.
3.2 Character Capture Wizard
The Character Capture Wizard
is a front end GUI window allowing to create, view and maintain
character image collections. The wizard can be started through the Tools->Character
Capture menu item.
The very first wizard screen titled Select Image Collection
deals with selection ot the character image directory to work with:
Collection button is enabled only if the Collection field
contains an existing directory. It opens a new window with the
collection details consisting of a tree of character images and a UTF-8
char set table showing which characters are covered by the collection.
The viewer also allows basic maintenance tasks such as editing or
deletion of the character images.
To add new character images to the selected collection click Next to proceed
to the Select Text
screen. Note that this button is enabled only when Robot is connected
to a desktop. If you need to extract characters from a static image
stored to a file, use the Login
Window to load the image though the Static Image Client.
To extract new characters from the screen perform teh following steps:
Once you have selected the image and provided the text hit Next to proceed
to the last screen called Confirm Character Images:
- Click the Select Text Area
button. It will open a new window with a copy of the remote desktop.
Then drag your mouse in the image to mark the area containing text. See
the recommendations chapter below for
text selection tips. Once you are satisfied click either the green tick
button next to the selection or the Save & Close button on
the tool bar.
- Type the text the image contains into
the editor situated above the button. Make sure to provide it exactly
as it is displayed with all spaces and lines.
This screen shows the list of characters contained in the provided
text. If Robot recognizes the individual characters on the pixel level
in the image the list gets populated with the suggested character
images. Characters may be removed or modified the the Delete and Edit buttons. Be
aware that you don't have to deal with duplicate images. If a character
image already exists and it is the same as the newly extracted one, it
will be skipped automatically.
In some scenarios Robot may fail to parse the text area image for
individual characters. This may happen for example when the selected
text area does not meet the recommendations
criteria. Unrecognized characters then appear in the list with the
red N/A icon
as this one:
This behavior does not mean that the character can not be searched
using the character collections. It merely indicates that the wizard
was not able to suggest the character image. To fix this simply select
the character in the list, click Edit and define
the its image in the text area manually.
These tips suggest how to select the text area on the screen in order
to minimize the number of unrecognized characters (N/A) which
have to be fixed manually:
- The very first pixel of the selected
area (the top left corner) must be of the background color.
- The area should contain text of a single
font type, size and color displayed on a solid color background such as
for example a block of black text on a white background. If the text
does not meet these requirements process it in smaller parts which meet
this criteria (by words and/or lines).
- The area should contain one continuous
block of text displayed on a single graphical component, such as a text
message, a button label, a menu item etc.. Do not select for example an
area which contains two or more buttons because their texts may be
aligned differently along the horizontal axis.
- As Robot separates the characters by
looking for a space of background color between them, fonts where
characters overlap along the vertical or horizontal axis fail to parse
and must be extracted manually. If such a character image contains
parts of another character, these must be removed in a third party
image editor such as Gimp and the affected must be made transparent to
make the character search algorithm skip them.
3.3 The "text" Comparison Method
At the level of test scripts the Image Based Text Recognition is
supported in form of a new comparison method called "text" which can
be employed through calls of the CompareTo,
match/mismatch commands or its Java test script counterpart
methods. The comparison method supports the following parameters:
The method returns the same return values as the Tesseract OCR method.
When none of the "text",
parameters is used the calling command returns 0 (success) to indicate
a successful execution even when no text gets actually recognized. If
the text matching parameters are used the calling command will return
either success (0) or fail (non-zero value) depending on whether the
recognized text matches or not.
- The standard comparison parameters of "cmparea" and "passrate" are
supported. The "cmparea"
one can be used to limit the text recognition to a particular
rectangular area of the screen. The "passrate" one
is currently not used. If it is defined by the calling command it is
- The "chars"
parameter is mandatory and defines the path of the character
collection. If the path is relative it is resolved against the
script's template directory.
- The "tolerance"
parameter is similar to the one supported by the Enterprise
Search method ("search"). It must be an integer number
between 0 and 256. It indicates how much the Red, Green and Blue
components of a desktop pixel may differ at a maximum to consider the
color be equivalent to the corresponding character image pixel. This
allows to search for characters which are rendered with minor
differences in a dynamic way. If the parameter is not specified, it
defaults to zero and the algorithm compares pixels using exact color
- The "text", "distance" and "pattern"
parameters perform the same functionality as their counterparts in the Tesseract
The method also populates a set of variables similar to the Tesseract
OCR method. These can be used by the calling script to process the
|The recognized text (full
multiline format). The variable is created always
when the method gets executed successfully (meaning that it doesn't
for a missing desktop connection or invalid character image collection).
||Parsed text line where <n> is the
line number. Numbering starts from one.
If the method recognizes just a single line of text the content of the
variable will be equal to _TEXT_LINE1.
||Number of lines in the recognized text.
It gets populated only if the method fails to perform text
recognition, for example when there is no desktop connection or when
the specified character image collection doesn't exist or can not be
|The part of
recognized text that produced a fuzzy match meeting the criteria
specified in the "text"
|Position (index) of the matching
text described in _TEXT_MATCH. Indexing
starts from 0 which corresponds to the beginning of the recognized text.
The following examples show how to call the text recognition in a
script. More advanced examples on text matching can be found in the Tesseract
OCR method documentation whose behavior this method mostly mimicks.