Aspose.OCR for .NET 23.3.1 - Release Notes
This article contains a summary of recent changes, enhancements and bug fixes in Aspose.OCR for .NET 23.3.1 (March 2023) release.
GPU version: 21.6.0
Deprecation warning
The release 23.3.1 introduces a slimmer, faster and more straightforward API that can significantly simplify your code. Unfortunately, the major reorganization of classes, methods and properties result in “breaking changes”.
To make it easier to upgrade your code, we have kept all existing classes and methods fully functional, but marked them as deprecated. All of your existing code will continue to work and you can even make minor updates to it, but be aware that all deprecated elements are scheduled to be removed in release 23.11.0 (November 2023) in favor of the new API.
Time to deprecation: 6 months left.
What was changed
Key | Summary | Category |
---|---|---|
OCRNET‑636 OCRNET‑646 |
A slimmer, faster and more straightforward API has been introduced. See Added public APIs for details. | New feature |
OCRNET‑648 | Most of the existing API classes and methods have been marked as deprecated to remind you to update your existing code. They remain functional but will be removed in release 23.11.0 (November 2023) in favor of the new API introduced in this release. See Deprecated APIs for details. | Enhancement |
Public API changes and backwards compatibility
This section lists all public API changes introduced in Aspose.OCR for .NET 23.3.1 that may affect the code of existing applications.
Added public APIs:
The following public APIs have been introduced in this release:
Aspose.OCR.OcrInput
class
The universal class for providing any type of data (images, PDF documents, archives, folders, streams, arrays, and so on) to the new image processing and recognition methods.
Aspose.OCR.ImageProcessing
class
Specially adjust one or more files to improve the accuracy and reliability of the OCR. This class provides extended replacements for Aspose.OCR.AsposeOcr.PreprocessImage
method:
Method | Action |
---|---|
Save(OcrInput images, string folderPath) |
Saves processed images to a folder. Replaces PreprocessImage method. |
Render(OcrInput images) |
Processes files and returns an OcrInput object with adjusted images that can be passed to recognition methods. |
Aspose.OCR.AsposeOcr.Recognize
method
Recognize one or more files provided as an OcrInput
object. It is a universal replacement for the following recognition methods:
Method | Action |
---|---|
RecognizeImage |
Extract text from a raster image, provided as file, memory stream, or a pixel array. |
RecognizePdf |
Extract text from a PDF document. |
RecognizeTiff |
Extract text from a multi-page TIFF image. |
RecognizeDjvu |
Extract text from a DjVu file. |
RecognizeImageFromUri |
Recognize an image hosted on website without downloading it to the computer. |
RecognizeMultipleImages |
Batch recognition. |
RecognizeImageFromBase64 |
Extract text from Base64 encoded images. |
Aspose.OCR.AsposeOcr.RecognizeLines
method
Recognize files containing a single line of text in the fastest possible mode. It is an extended replacement for RecognizeLine
method.
Aspose.OCR.AsposeOcr.DetectRectangles
method
Find areas of images containing text. It is an extended replacement for GetRectangles
method.
Aspose.OCR.AsposeOcr.CalculateSkew
method
Find out skew angles of provided images. It is an universal replacement for the following methods:
Method | Action |
---|---|
CalculateSkew |
Detect the skew angle of an image. |
CalculateSkewFromUri |
Detect the skew angle of an image hosted on website without downloading it to the computer. |
AllowedSymbols
recognition setting
Limit a subset of recognized characters instead of using all symbols from the selected language. It is a replacement of the alphabet
argument of AsposeOcr
constructor and AllowedCharacters
recognition setting.
IgnoredSymbols
recognition setting
A blacklist of characters that are ignored during recognition. It is a replacement of the IgnoredCharacters
recognition setting to ensure the consistent naming.
Updated public APIs:
No changes.
Removed public APIs:
The following public APIs have been removed in this release:
detectAreas
argument of RecognitionSettings
constructor
BACKWARD INCOMPATIBILITY!
The deprecated argument of RecognitionSettings
constructor has been removed in this release.
RecognitionSettings
constructor no longer accepts detectAreas
argument. Specify the area detection mode in DetectAreasMode
recognition setting instead.
Deprecated APIs
The following public APIs have been marked as deprecated and will be removed in 23.11.0 (November 2023) release:
CalculateSkew
method
Replaced with Aspose.OCR.AsposeOcr.CalculateSkew
method.
CalculateSkewFromUri
method
Replaced with Aspose.OCR.AsposeOcr.CalculateSkew
method.
Aspose.OCR.AsposeOcr.PreprocessImage
method
Replaced with Save
method of Aspose.OCR.AsposeOcr.CalculateSkew
class.
RecognizeImage
Replaced with Aspose.OCR.AsposeOcr.Recognize
method.
RecognizePdf
Replaced with Aspose.OCR.AsposeOcr.Recognize
method.
RecognizeTiff
Replaced with Aspose.OCR.AsposeOcr.Recognize
method.
RecognizeDjvu
Replaced with Aspose.OCR.AsposeOcr.Recognize
method.
RecognizeImageFromUri
Replaced with Aspose.OCR.AsposeOcr.Recognize
method.
RecognizeMultipleImages
Replaced with Aspose.OCR.AsposeOcr.Recognize
method.
RecognizeImageFromBase64
Replaced with Aspose.OCR.AsposeOcr.Recognize
method.
RecognizeLine
Replaced with Aspose.OCR.AsposeOcr.RecognizeLines
method.
GetRectangles
Replaced with Aspose.OCR.AsposeOcr.DetectRectangles
method.
CalculateSkew
Replaced with Aspose.OCR.AsposeOcr.CalculateSkew(OcrInput images)
override.
CalculateSkewFromUri
Replaced with Aspose.OCR.AsposeOcr.CalculateSkew(OcrInput images)
method.
DocumentRecognitionSettings
No longer required. Use RecognitionSettings
class as a universal replacement.
AutoSkew
recognition setting
No longer required. Use AutoSkew
image processing filter instead.
SkewAngle
recognition setting
No longer required. Use Rotate
image processing filter instead.
ThresholdValue
recognition setting
No longer required. Use the binarization threshold
setting in Binarize
image processing filter instead.
AutoContrast
recognition setting
No longer required. Use ContrastCorrectionFilter
image processing filter instead.
AutoDenoising
recognition setting
No longer required. Use AutoDenoising
image processing filter instead.
PreprocessingFilters
recognition setting
No longer used in recognition methods. Process images before proceeding with recognition or provide processing filters in OcrInput
object.
AllowedCharacters
recognition setting
No longer required. Use the new AllowedSymbols
recognition setting instead.
IgnoredCharacters
recognition setting
No longer required. Use the new IgnoredSymbols
recognition setting instead.
AsposeOcr(string alphabet)
constructor
No longer required. Define the list of allowed characters through the new AllowedSymbols
recognition setting instead.
Examples
The examples below illustrates the changes introduced in this release:
Migrating to the new API
Original code (Aspose.OCR for .NET 23.2.1 and below):
Aspose.OCR.AsposeOcr recognitionEngine = new Aspose.OCR.AsposeOcr();
// Correct geometric distortions
Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter filters = new Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter();
filters.Add(Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter.AutoDewarping());
// Convert the first 3 pages of PDF to searchable PDF
Aspose.OCR.DocumentRecognitionSettings recognitionSettings1 = new Aspose.OCR.DocumentRecognitionSettings();
recognitionSettings1.Language = Aspose.OCR.Language.Ukr;
recognitionSettings1.AutoContrast = true;
recognitionSettings1.StartPage = 0;
recognitionSettings1.PagesNumber = 3;
List<Aspose.OCR.RecognitionResult> results = recognitionEngine.RecognizePdf("source1.pdf", recognitionSettings1);
Aspose.OCR.AsposeOcr.SaveMultipageDocument("result1.pdf", Aspose.OCR.SaveFormat.Pdf, results);
// Convert the second PDF to searchable PDF
Aspose.OCR.DocumentRecognitionSettings recognitionSettings2 = new Aspose.OCR.DocumentRecognitionSettings();
recognitionSettings2.Language = Aspose.OCR.Language.Ukr;
recognitionSettings2.AutoContrast = true;
List<Aspose.OCR.RecognitionResult> results = recognitionEngine.RecognizePdf("source2.pdf", recognitionSettings2);
Aspose.OCR.AsposeOcr.SaveMultipageDocument("result2.pdf", Aspose.OCR.SaveFormat.Pdf, results);
New code (Aspose.OCR for .NET 23.3.1 and above):
Aspose.OCR.AsposeOcr recognitionEngine = new Aspose.OCR.AsposeOcr();
Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter filters = new Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter();
filters.Add(Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter.AutoDewarping());
// Activate automatic contrast adjustment in processing filters instead of recognition settings
filters.Add(Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter.ContrastCorrectionFilter());
// Add all PDF documents to OcrInput object and apply processing filters
Aspose.OCR.OcrInput input = new Aspose.OCR.OcrInput(InputType.PDF, filters);
// Specify page numbers when adding a file to OcrInput
input.Add("source1.pdf", 0, 3);
input.Add("source2.pdf");
// Use RecognitionSettings instead of DocumentRecognitionSettings
Aspose.OCR.RecognitionSettings recognitionSettings = new Aspose.OCR.RecognitionSettings();
// Remove automatic contrast adjustment from recognition settings - it is already applied through image processing filters
recognitionSettings.Language = Aspose.OCR.Language.Ukr;
// Recognize all files in one universal call
List<Aspose.OCR.RecognitionResult> results = recognitionEngine.Recognize(input, recognitionSettings);
// Save recognition results as searchable PDFs
for(int i=0;i<results.count;i++) Aspose.OCR.AsposeOcr.SaveMultipageDocument($"result{i+1}.pdf", Aspose.OCR.SaveFormat.Pdf, results[i]);
Process and save all images from PDF documents
// Set processing filters
Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter filters = new Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter();
filters.Add(Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter.AutoDewarping());
filters.Add(Aspose.OCR.Models.PreprocessingFilters.PreprocessingFilter.ContrastCorrectionFilter());
// Add all PDF documents to OcrInput object and apply processing filters
Aspose.OCR.OcrInput input = new Aspose.OCR.OcrInput(InputType.PDF, filters);
input.Add("source1.pdf", 0, 3);
input.Add("source2.pdf");
// Save all images from provided PDFs to the folder
Aspose.OCR.ImageProcessing.Save(input, @"C:\images");
Detect skew angles
Aspose.OCR.AsposeOcr recognitionEngine = new Aspose.OCR.AsposeOcr();
// Add all PDF documents to OcrInput object
Aspose.OCR.OcrInput input = new Aspose.OCR.OcrInput(InputType.PDF);
input.Add("source1.pdf", 0, 3);
input.Add("source2.pdf");
// Detect skew angles
List<Aspose.OCR.SkewOutput> angles = recognitionEngine.CalculateSkew(input);
foreach(Aspose.OCR.SkewOutput angle in angles) Console.WriteLine($"{angle.Source} {angle.Page} {angle.ImageIndex} {angle.Angle}");