Saving recognition results as a file
To save recognition results to a file or write them to a memory stream, use the page_save() method.
The output format is defined by the save_format property of the recognition settings. The following formats are supported:
Format | Description |
---|---|
file_format::txt | Plain text with line breaks. |
file_format::pdf | Portable Document Format. |
file_format::docx | Microsoft Word document (version 2007 or later). |
file_format::format_rtf | RTF, a universal format for exchanging rich text documents between different word processing programs. |
file_format::xlsx | Microsoft Excel spreadsheet (version 2007 or later). |
file_format::format_json | JSON, a popular format widely used in software development and data exchange. The de facto standard for websites and REST APIs. |
file_format::format_xml | Extensible Markup Language (XML), a universal data exchange and storage format for most systems. |
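// Build the full path to the source image in the current directory.
directory dir(".");
const string current_dir = dir.full_name();
const string image = current_dir + "p.png";
// Select the output format (DOCX here, matching the output file name).
RecognitionSettings settings;
settings.save_format = file_format::docx;
// Recognize the image and save the result as a Word document.
aspose::ocr::page_save(image.c_str(), "result.docx", settings);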
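The example above pairs file_format::docx with an output file named result.docx. The following is a minimal sketch of a helper that keeps the two in sync. It is not part of the library: sketch_format is a hypothetical stand-in for the file_format enumeration so that the snippet compiles on its own, and extension_for() and output_name() are illustrative names. The extensions are the conventional ones for each format, not values taken from the API.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the library's file_format enumeration, so that
// this sketch compiles without the Aspose.OCR headers. The enumerator names
// copy the table of supported formats.
enum class sketch_format { txt, pdf, docx, format_rtf, xlsx, format_json, format_xml };

// Returns the conventional file extension (with the leading dot) for a format.
inline std::string extension_for(sketch_format format) {
    switch (format) {
        case sketch_format::txt:         return ".txt";
        case sketch_format::pdf:         return ".pdf";
        case sketch_format::docx:        return ".docx";
        case sketch_format::format_rtf:  return ".rtf";
        case sketch_format::xlsx:        return ".xlsx";
        case sketch_format::format_json: return ".json";
        case sketch_format::format_xml:  return ".xml";
    }
    // Reached only if a value outside the enumeration is passed in.
    assert(false && "unhandled sketch_format value");
    return "";
}

// Builds an output file name such as "result.docx" from a base name.
inline std::string output_name(const std::string& base, sketch_format format) {
    return base + extension_for(format);
}
```

In real code the same switch could presumably be written against file_format directly, with the returned name passed to page_save() as the output path.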
Saving recognition results as a multi-page document
You can merge several recognition results into one document by passing several image paths to the page_save() method as a single semicolon-separated string. This is useful for recognizing books, contracts, articles, and other printouts consisting of multiple pages, as well as for batch recognition.
// Select PDF as the output format.
RecognitionSettings settings;
settings.save_format = file_format::pdf;
// Recognize the three images and merge the results into one PDF.
aspose::ocr::page_save("1.png;2.jpg;3.jpg", "result.pdf", settings);
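When the list of pages is only known at run time, the semicolon-separated string can be assembled from a container of paths. The sketch below shows one way to do that with the standard library only. join_image_paths() is a hypothetical helper, not a library function, and it assumes that a path containing a semicolon cannot be expressed in this format.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of the library): joins image paths into the
// semicolon-separated string described in the text above.
inline std::string join_image_paths(const std::vector<std::string>& paths) {
    std::string joined;
    bool first = true;
    for (const std::string& path : paths) {
        // Assumption: a semicolon inside a path would be read as a separator,
        // so such a path is treated as a programming error.
        assert(path.find(';') == std::string::npos);
        if (!first) {
            joined += ';';
        }
        joined += path;
        first = false;
    }
    return joined;
}
```

The returned string would then be passed as the first argument of page_save(), for example via joined.c_str(), with the output file name and settings supplied as in the example above.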
When saving recognition results in PDF format, the original image is placed as the background of the resulting document. The recognized text is added as an invisible overlay on top of the background image, and can be searched and copied to the clipboard.