Aspose.OCR’s .NET OCR plug-in extracts text from scanned PDFs or converts them into searchable documents, preserving original images. Advanced algorithms accurately identify text and table structures, making it your go-to solution for PDF text extraction.
OcrInput object.
Get the respective assembly files from the downloads or fetch the package from NuGet to add Aspose.OCR directly to your workspace.
By default, Aspose.OCR automatically recognizes a wide range of languages based on the Extended Latin alphabet. Specifying the language explicitly can significantly improve recognition accuracy, and you should always do so when recognizing Cyrillic, Chinese, or Hindi text.
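As a rough illustration of setting the language explicitly, the sketch below recognizes a scanned PDF as Ukrainian text. It assumes the general-purpose Aspose.OCR for .NET API (`AsposeOcr`, `OcrInput`, `RecognitionSettings`, `Language.Ukr`); the file name `scan.pdf` is a placeholder, and the exact types and enum values may differ between releases or in the plug-in edition, so check the API reference for your version.

```csharp
using System;
using Aspose.OCR;

class RecognizePdfWithLanguage
{
    static void Main()
    {
        // Create the recognition engine.
        AsposeOcr engine = new AsposeOcr();

        // Queue a scanned PDF for recognition ("scan.pdf" is a placeholder path).
        OcrInput input = new OcrInput(InputType.PDF);
        input.Add("scan.pdf");

        // Name the language explicitly instead of relying on the default
        // Extended Latin detection (Language.Ukr is assumed to be available).
        RecognitionSettings settings = new RecognitionSettings();
        settings.Language = Language.Ukr;

        // Print the recognized text of every page in the document.
        foreach (RecognitionResult page in engine.Recognize(input, settings))
        {
            Console.WriteLine(page.RecognitionText);
        }
    }
}
```

Swapping `Language.Ukr` for another value of the `Language` enumeration should be the only change needed for other scripts, assuming the library version you use ships that language.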
Aspose.OCR supports popular formats from scanners or cameras, including PDF, JPEG, PNG, and TIFF. Recognition results are returned in plain text, HTML, Microsoft Word, PDF, JSON, and XML.
Good image quality is crucial for accurate OCR. Use a scanner or high-resolution camera. The library includes advanced filters to automatically improve image quality before recognition.
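The filters mentioned above can be attached to the input batch before recognition. The following is a minimal sketch, assuming the `PreprocessingFilter` class from the `Aspose.OCR.Models.PreprocessingFilters` namespace and its `AutoSkew()` and `ContrastCorrectionFilter()` factory methods; the file name is a placeholder, and the filter names should be verified against the API reference for your release.

```csharp
using System;
using Aspose.OCR;
using Aspose.OCR.Models.PreprocessingFilters;

class RecognizePdfWithFilters
{
    static void Main()
    {
        // Build a filter chain: straighten tilted pages, then boost contrast.
        // (Both filter names are assumptions based on the public API docs.)
        PreprocessingFilter filters = new PreprocessingFilter();
        filters.Add(PreprocessingFilter.AutoSkew());
        filters.Add(PreprocessingFilter.ContrastCorrectionFilter());

        // Attach the filters to the batch ("scan.pdf" is a placeholder path).
        OcrInput input = new OcrInput(InputType.PDF, filters);
        input.Add("scan.pdf");

        // Recognize with default settings and print the text of each page.
        AsposeOcr engine = new AsposeOcr();
        foreach (RecognitionResult page in engine.Recognize(input, new RecognitionSettings()))
        {
            Console.WriteLine(page.RecognitionText);
        }
    }
}
```

Filters are applied in the order they are added, so geometric corrections such as deskewing are usually placed before tonal ones.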
Explore our online documentation or visit the Aspose.OCR for .NET repository for code samples and showcase projects.