Latest release (March 2023)
This article summarizes the changes, enhancements and bug fixes in the Aspose.OCR for Java 23.3.0 (March 2023) release.
GPU version: 21.6.0
Deprecation warning
Release 23.3.0 introduces a slimmer, faster and more straightforward API that can significantly simplify your code. Unfortunately, the major reorganization of classes, methods and properties results in “breaking changes”.
To make it easier to upgrade your code, we have kept all existing classes and methods fully functional, but marked them as deprecated. All of your existing code will continue to work and you can even make minor updates to it, but be aware that all deprecated elements are scheduled to be removed in release 23.11.0 (November 2023) in favor of the new API.
Time to deprecation: 6 months left.
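The following is a minimal sketch of what this kind of deprecation means for calling code in Java. It does not use the Aspose.OCR API: TextReader, readPage, read and DeprecationDemo are hypothetical names invented for illustration. It assumes the old method is simply annotated with @Deprecated and remains functional, as described above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a library class; not part of Aspose.OCR. */
final class TextReader {
    /** New universal entry point: accepts any number of sources. */
    static List<String> read(List<String> sources) {
        List<String> results = new ArrayList<>();
        for (String source : sources) {
            results.add("text of " + source);
        }
        return results;
    }

    /** Old single-source method: still works, but flagged for future removal. */
    @Deprecated
    static String readPage(String source) {
        return read(List.of(source)).get(0);
    }
}

class DeprecationDemo {
    // Existing callers still compile; javac only reports a deprecation warning,
    // which can be silenced per method until the code is migrated.
    @SuppressWarnings("deprecation")
    static String legacyCall() {
        return TextReader.readPage("source.png");
    }

    // Migrated caller: wraps the single source in a list and takes the first result.
    static String migratedCall() {
        return TextReader.read(List.of("source.png")).get(0);
    }

    public static void main(String[] args) {
        // Both paths return the same text, so migration can proceed one call at a time.
        System.out.println(legacyCall().equals(migratedCall()));
    }
}
```

Compiling with javac -Xlint:deprecation lists every remaining call to a deprecated method, which is one way to track migration progress before the old API is removed.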
What was changed
Key | Summary | Category |
---|---|---|
OCRJAVA‑314 OCRJAVA‑315 | A slimmer, faster and more straightforward API has been introduced. See Added public APIs for details. | New feature |
OCRJAVA‑314 | Most of the existing API classes and methods have been marked as deprecated to remind you to update your existing code. They remain functional but will be removed in release 23.11.0 (November 2023) in favor of the new API introduced in this release. See Deprecated APIs for details. | Enhancement |
Public API changes and backwards compatibility
This section lists all public API changes introduced in Aspose.OCR for Java 23.3.0 that may affect the code of existing applications.
Added public APIs:
The following public APIs have been introduced in this release:
Aspose.OCR.OcrInput
class
The universal class for providing any type of data (images, PDF documents, archives, folders, streams, arrays, and so on) to the new image processing and recognition methods.
Aspose.OCR.AsposeOcr.Recognize
method
Recognize one or more files provided as an OcrInput object. It is a universal replacement for the following recognition methods:
Method | Action |
---|---|
RecognizePage | Extract text from a raster image provided as a file, memory stream, or pixel array. |
RecognizePdf | Extract text from a PDF document. |
RecognizeTiff | Extract text from a multi-page TIFF image. |
RecognizePageFromUri | Recognize a file hosted on a website without downloading it to the computer. |
RecognizeMultiplePages | Batch recognition. |
Aspose.OCR.AsposeOcr.DetectRectangles
method
Find areas of images containing text. It is an extended replacement for the getTextAreas method.
Aspose.OCR.ImageProcessing
class
Preprocess one or more files to improve the accuracy and reliability of OCR. This class provides extended replacements for the Aspose.OCR.AsposeOcr.PreprocessImage method:
Method | Action |
---|---|
Save(OcrInput images, String folderPath) | Saves processed images to a folder. Replaces the PreprocessImage method. |
Render(OcrInput images) | Processes files and returns an OcrInput object with adjusted images that can be passed to recognition methods. |
Aspose.OCR.AsposeOcr.CalculateSkew
method
Detect the skew angles of the provided images. It is a universal replacement for the following methods:
Method | Action |
---|---|
CalcSkewImage | Detect the skew angle of an image. |
CalcSkewImageFromUri | Detect the skew angle of an image hosted on a website without downloading it to the computer. |
AsposeOCRPdf.GetImages
method
Extract individual images from a PDF document.
Updated public APIs:
No changes.
Removed public APIs:
No changes.
Deprecated APIs
The following public APIs have been marked as deprecated and will be removed in the 23.11.0 (November 2023) release:
CalcSkewImage
method
Replaced with the Aspose.OCR.AsposeOcr.CalculateSkew method.
CalcSkewImageFromUri
method
Replaced with the Aspose.OCR.AsposeOcr.CalculateSkew method.
PreprocessImage
method
Replaced with the Render and Save methods of the Aspose.OCR.ImageProcessing class.
RecognizePage
method
Replaced with the Aspose.OCR.AsposeOcr.Recognize method.
RecognizePageFromUri
method
Replaced with the Aspose.OCR.AsposeOcr.Recognize method.
RecognizeMultiplePages
method
Replaced with the Aspose.OCR.AsposeOcr.Recognize method.
RecognizePdf
method
Replaced with the Aspose.OCR.AsposeOcr.Recognize method.
RecognizeTiff
method
Replaced with the Aspose.OCR.AsposeOcr.Recognize method.
RecognizeLine
method
Replaced with the Aspose.OCR.AsposeOcr.Recognize method.
getTextAreas
method
Replaced with the Aspose.OCR.AsposeOcr.DetectRectangles method.
AsposeOcr(String alphabet)
constructor
No longer required. Define the list of allowed characters through the setAllowedCharacters method of the recognition settings instead.
RecognitionSettings.setDetectAreas
method
No longer required. Disable document areas detection or override the default detection mode with the setDetectAreasMode method of the recognition settings instead.
RecognitionSettings.setAutoSkew
method
No longer required. Enable automatic skew correction image processing filter instead.
RecognitionSettings.setSkew
method
No longer required. Specify image rotation angle in image processing filters instead.
RecognitionSettings.setThresholdValue
method
No longer required. Specify binarization threshold in image processing filters instead.
RecognitionSettings.setAutoContrast
method
No longer required. Enable or disable automatic contrast adjustments in image processing filters instead.
RecognitionSettings.setAutoDenoising
method
No longer required. Enable or disable automatic noise removal in image processing filters instead.
Usage examples
The examples below illustrate the changes introduced in this release:
Migrating to the new API
Original code (Aspose.OCR for Java 23.2.0 and below):
AsposeOCR api = new AsposeOCR();
// Correct geometric distortions
PreprocessingFilter filters = new PreprocessingFilter();
filters.add(PreprocessingFilter.AutoDewarping());
// Specify recognition settings
RecognitionSettings recognitionSettings = new RecognitionSettings();
recognitionSettings.setAutoDenoising(true);
recognitionSettings.setSkew(90);
recognitionSettings.setPreprocessingFilters(filters);
recognitionSettings.setLanguage(Language.Ukr);
// Extract text from image
RecognitionResult result = api.RecognizePage("source.png", recognitionSettings);
System.out.println("Recognition result:\n" + result.recognitionText + "\n\n");
New code (Aspose.OCR for Java 23.3.0 and above):
AsposeOCR api = new AsposeOCR();
// Configure image processing
PreprocessingFilter filters = new PreprocessingFilter();
filters.add(PreprocessingFilter.AutoDewarping());
filters.add(PreprocessingFilter.AutoDenoising());
filters.add(PreprocessingFilter.Rotate(90));
// Specify recognition settings
RecognitionSettings recognitionSettings = new RecognitionSettings();
recognitionSettings.setLanguage(Language.Ukr);
// Add an image to OcrInput object and apply processing filters
OcrInput input = new OcrInput(InputType.SingleImage, filters);
input.add("source.png");
// Extract text from any source data using a universal call
ArrayList<RecognitionResult> results = api.Recognize(input, recognitionSettings);
System.out.println("Recognition result:\n" + results.get(0).recognitionText + "\n\n");
Process and save all images from PDF documents
// Set processing filters
PreprocessingFilter filters = new PreprocessingFilter();
filters.add(PreprocessingFilter.AutoDewarping());
filters.add(PreprocessingFilter.ContrastCorrection());
// Add all PDF documents to OcrInput object and apply processing filters
OcrInput input = new OcrInput(InputType.PDF, filters);
input.add("source1.pdf", 0, 3);
input.add("source2.pdf");
// Save all images from provided PDFs to the folder
ImageProcessing.Save(input, "C://images");
Detect skew angles
AsposeOCR api = new AsposeOCR();
// Add all PDF documents to OcrInput object
OcrInput input = new OcrInput(InputType.PDF);
input.add("source1.pdf", 0, 3);
input.add("source2.pdf");
// Detect skew angles
ArrayList<SkewOutput> angles = api.CalculateSkew(input);
for(SkewOutput x : angles) {
    System.out.println(x.Angle);
}