Optical Character Recognition


Overview

Optical character recognition (OCR) converts images containing text into editable, machine-readable text. During testing, you often need to read the text from a graphic on the screen at run time so that you can determine whether an action completed correctly.

For example, many applications use CAPTCHA images to prevent automated spam submissions. If you need to test such an application, the ability to respond correctly to the CAPTCHA is critical.

Rapise can be integrated with two OCR engines, Tesseract OCR and Windows OCR, and provides a special OCR Global Object that makes extracting text straightforward.

Which Engine Should I Use?

There are benefits to each of the two supported engines:

Tesseract OCR

Open Source: Tesseract is a free, open-source OCR engine, making it a cost-effective choice for users.

Flexibility: It can be customized for different languages and image types.

Community Support: Tesseract has a large open-source community of developers who continuously improve the engine, keeping it up to date and adaptable to a variety of tasks.

Ease of Integration: Tesseract integrates easily with Rapise, especially for basic OCR needs in automation scripts.

Benefits:

Low cost (free).

Good for straightforward OCR tasks.

Highly customizable for advanced users.
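To make the "customizable for different languages and image types" point concrete, here is a minimal sketch of how Tesseract's own command-line interface exposes those options. It runs outside Rapise and does not use the Rapise OCR Global Object. The helper names build_tesseract_command and recognize are hypothetical, chosen for illustration. The -l (language) and --psm (page segmentation mode) flags belong to the Tesseract CLI, and the sketch assumes Tesseract 4 or later, where page segmentation modes range from 0 to 13.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng", psm=3):
    """Build a Tesseract CLI call that prints recognized text to stdout.

    Hypothetical helper for illustration only.
    lang: language code of an installed traineddata file (e.g. "eng", "deu").
    psm:  page segmentation mode; 3 is fully automatic, 7 is a single text line.
    """
    if not 0 <= psm <= 13:
        raise ValueError("psm must be between 0 and 13")
    # "stdout" as the output base tells Tesseract to print instead of writing a file.
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]


def recognize(image_path, lang="eng", psm=3):
    """Run Tesseract, if it is installed, and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang, psm),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

For a short single-line image such as a button label, a call like recognize("label.png", psm=7) would be a reasonable starting point. The file name here is a placeholder, and the language data for the chosen lang value must already be installed.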

Windows OCR

Native Integration: Windows OCR leverages the built-in OCR capabilities of the Windows operating system, ensuring seamless integration with Rapise.