Defining the whitelist of characters
Restricting recognition to a subset of characters, rather than the full character set, can greatly improve recognition accuracy, increase speed, and reduce resource consumption. A list of characters can also be identified automatically from an image using the built-in Aspose.OCR mechanisms.
You can specify the list of characters that the Aspose.OCR engine will look for in the AllowedSymbols property of the recognition settings. The characters are provided as a case-sensitive string, so upper-case and lower-case letters must be listed separately if both are expected. Characters that do not match the provided list are ignored.
using System;
using System.Collections.Generic;

Aspose.OCR.AsposeOcr recognitionEngine = new Aspose.OCR.AsposeOcr();
// Add an image to the OcrInput object
Aspose.OCR.OcrInput input = new Aspose.OCR.OcrInput(Aspose.OCR.InputType.SingleImage);
input.Add("source.png");
// Restrict recognition to a subset of characters (case-sensitive)
Aspose.OCR.RecognitionSettings recognitionSettings = new Aspose.OCR.RecognitionSettings();
recognitionSettings.AllowedSymbols = "AÁBCDEÉFG12345";
// Recognize the image
List<Aspose.OCR.RecognitionResult> results = recognitionEngine.Recognize(input, recognitionSettings);
// Print the recognized text of each result
foreach (Aspose.OCR.RecognitionResult result in results)
{
    Console.WriteLine(result.RecognitionText);
}
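Because the whitelist is case-sensitive, it can be convenient to generate the string rather than type every variant by hand. The following is a minimal sketch of one way to do that in plain .NET. The WhitelistBuilder class and its WithBothCases method are hypothetical helpers introduced only for illustration; they are not part of Aspose.OCR. The sketch only builds the string and does not call the OCR engine.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not part of Aspose.OCR): builds a case-sensitive whitelist string.
public static class WhitelistBuilder
{
    // Adds the upper-case and lower-case form of every letter in 'letters',
    // appends the characters in 'extra' (digits, punctuation) unchanged,
    // and removes duplicates.
    public static string WithBothCases(string letters, string extra = "")
    {
        var chars = letters
            .SelectMany(c => new[] { char.ToUpperInvariant(c), char.ToLowerInvariant(c) })
            .Concat(extra)
            .Distinct();
        return new string(chars.ToArray());
    }
}

public static class Program
{
    public static void Main()
    {
        string allowed = WhitelistBuilder.WithBothCases("AÁBCDEÉFG", "12345");
        Console.WriteLine(allowed);
        // The resulting string could then be assigned to the recognition settings:
        // recognitionSettings.AllowedSymbols = allowed;
    }
}
```

Keeping the whitelist as small as the document allows is the point of the setting, so a helper like this is best used with an explicit list of expected letters rather than a whole alphabet.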
Upgrading from previous versions
Starting with Aspose.OCR for .NET 23.3.1, this recognition setting replaces the alphabet parameter of the Aspose.OCR.AsposeOcr class constructor as well as the AllowedCharacters recognition setting.