Recognition settings


Aspose.OCR for C++ provides good recognition accuracy and performance by default. However, there will inevitably be cases where the default settings cannot provide reliable recognition results.

To configure Aspose.OCR recognition settings, pass an optional RecognitionSettings structure to the recognition method.

| Setting | Type | Default value | Description |
| --- | --- | --- | --- |
| all_image | bool | false | Force recognition of the entire image. Not used for receipt recognition. |
| allowed_characters | characters_allowed_type | characters_allowed_type::ALL | The predefined whitelist of characters the Aspose.OCR engine will look for. |
| alphabet | string | All symbols | A custom list of characters to be recognized, provided as a case-sensitive string. Characters that do not match the provided list are ignored. |
| auto_contrast | bool | false | Automatically increase the contrast of images before proceeding to recognition. |
| auto_denoising | bool | false | Automatically remove noise from images before proceeding to recognition. |
| correct_skew | bool | true | Automatically correct image tilt (deskew) before proceeding to recognition. |
| detect_areas_mode | detect_areas_mode_enum | detect_areas_mode_enum::DOCUMENT | Manually override the default document areas detection method. Not used for receipt recognition. |
| filters | custom_preprocessing_filters | none | Preprocessing filters to be applied to the image. |
| format | export_format | export_format::text | The format of recognition results that are stored in memory (plain text, JSON or XML). |
| ignoredCharacters | string | none | A blacklist of characters that are ignored during recognition. |
| language_alphabet | language | language::none | Specify a language for recognition. |
| lines_filtration | bool | false | Set to true to recognize text in tables. Set to false to improve performance by ignoring table structures and treating tables as plain text. |
| preprocess_area | rect* | NULL | Specify the image region to be affected by preprocessing filters. |
| rectangles | rect* | NULL | Areas of the image from which to extract text. |
| rectangles_size | size_t | 0 | The number of recognition areas. |
| save_format | file_format | file_format::txt | The file format in which the recognition result is saved. |
| skew | double | 0 | Manually rotate the image by the specified degree. Does not work if recognition areas are specified. |
| threshold_value | int | 0 | Override the automatic binarization settings. |
| upscale_small_font | bool | false | Improve small font recognition and detection of dense lines. |
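
The rectangles and rectangles_size settings listed above work as a pair: one points to an array of areas, the other states how many entries that array holds. The following is a minimal sketch of that pointer-plus-count pattern, written with stand-in types so that it compiles without the library. Region, SettingsSketch and with_regions are hypothetical names used only for illustration; the real rect and RecognitionSettings definitions come from the Aspose.OCR headers and may have a different layout.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the library's rect type (field layout is assumed).
struct Region {
    int x, y, width, height;
};

// Hypothetical stand-in that mirrors two fields from the settings list.
struct SettingsSketch {
    const Region* rectangles = nullptr;  // default: NULL
    std::size_t rectangles_size = 0;     // default: 0
};

// Set both fields together so the count always matches the array.
// An empty vector leaves the defaults (no recognition areas) in place.
SettingsSketch with_regions(const std::vector<Region>& regions) {
    SettingsSketch s;
    if (!regions.empty()) {
        s.rectangles = regions.data();
        s.rectangles_size = regions.size();
    }
    return s;
}
```

Because the settings hold only a pointer, the vector that owns the areas has to stay alive and unmodified until recognition finishes. Filling both fields in one place is a design choice that avoids a count that no longer matches the array.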

Example

The following code example shows how to fine-tune recognition:

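std::string image_path = "source.png";
// Fixed-size buffer that receives the recognized text
const size_t len = 4096;
wchar_t buffer[len] = { 0 };
// Start from the defaults and override only the settings that matter
RecognitionSettings settings;
settings.language_alphabet = language::ukr;   // recognize Ukrainian text
settings.auto_contrast = true;                // increase contrast before recognition
settings.upscale_small_font = true;           // improve small font recognition
// Recognize the image with the custom settings
size_t res_len = aspose::ocr::page_settings(image_path.c_str(), buffer, len, settings);
std::wcout << buffer;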
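
The example above prints the whole buffer and leaves res_len unused. If the text is needed as a std::wstring, a small helper can copy it out without reading past the buffer. This is only a sketch: it assumes the returned value is the number of characters written, which this page does not state, and clamp_to_wstring is a hypothetical helper, not part of Aspose.OCR.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Copy the recognized text out of a fixed-size buffer.
// `reported_len` is trusted only up to `capacity` (assumption: it is a
// character count), and copying stops at the first null terminator.
std::wstring clamp_to_wstring(const wchar_t* buf, std::size_t capacity,
                              std::size_t reported_len) {
    if (buf == nullptr || capacity == 0) {
        return std::wstring();
    }
    const std::size_t limit = std::min(reported_len, capacity);
    std::size_t n = 0;
    while (n < limit && buf[n] != L'\0') {
        ++n;
    }
    return std::wstring(buf, n);
}
```

With the variables from the example, the call would be clamp_to_wstring(buffer, len, res_len).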