Amazon Textract is a machine learning service from AWS that automatically extracts text, handwriting and data from scanned documents. Instead of just offering basic optical character recognition (OCR), Textract goes a step further by identifying, understanding and extracting information from complex structures such as forms and tables. This makes it easier for users to transform traditional paper documents into usable digital data, streamlining processes and reducing manual data entry.

In this Onify Blueprint, we show how to 1) upload a PDF file to Amazon S3, 2) process the PDF using Amazon Textract, and 3) send a link to a form for verifying the data extracted from the PDF. The next step, deciding where to send the data from the form, is handled in another Blueprint 🙂.
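To give a rough idea of what steps 1 and 2 involve on the AWS side, here is a minimal Python sketch using boto3. It is not the Blueprint's own implementation (the Blueprint runs inside Onify), just an illustration under a few assumptions: the function names `upload_pdf`, `analyze_pdf`, `form_fields` and `run_example` are made up for this example, the bucket and key are placeholders you would supply yourself, and the asynchronous Textract API is used because it handles multi-page PDFs stored in S3. Step 3, sending the form link, is specific to Onify and is not covered here.

```python
import time


def upload_pdf(s3_client, local_path, bucket, key):
    """Step 1: upload a local PDF to S3 (bucket and key are placeholders)."""
    s3_client.upload_file(local_path, bucket, key)


def analyze_pdf(textract_client, bucket, key, poll_seconds=2.0):
    """Step 2: start an asynchronous Textract form analysis and collect all blocks."""
    job = textract_client.start_document_analysis(
        DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}},
        FeatureTypes=["FORMS"],
    )
    blocks, token = [], None
    while True:
        kwargs = {"JobId": job["JobId"]}
        if token:
            kwargs["NextToken"] = token
        page = textract_client.get_document_analysis(**kwargs)
        status = page["JobStatus"]
        if status == "IN_PROGRESS":
            time.sleep(poll_seconds)  # simple polling; a real setup might use SNS
            continue
        if status != "SUCCEEDED":
            # Treats FAILED and PARTIAL_SUCCESS alike, to keep the sketch short.
            raise RuntimeError(f"Textract job ended with status {status}")
        blocks.extend(page.get("Blocks", []))
        token = page.get("NextToken")
        if not token:
            return blocks


def _text_of(block, by_id):
    """Join the WORD children of a block into one string."""
    words = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = by_id.get(child_id, {})
            if child.get("BlockType") == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def form_fields(blocks):
    """Turn Textract KEY_VALUE_SET blocks into a {label: value} dict."""
    by_id = {b["Id"]: b for b in blocks}
    fields = {}
    for block in blocks:
        if block["BlockType"] != "KEY_VALUE_SET":
            continue
        if "KEY" not in block.get("EntityTypes", []):
            continue
        values = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                for value_id in rel["Ids"]:
                    values.append(_text_of(by_id[value_id], by_id))
        fields[_text_of(block, by_id)] = " ".join(v for v in values if v)
    return fields


def run_example(local_path, bucket, key):
    """Hypothetical end-to-end helper; needs boto3 and AWS credentials."""
    import boto3  # imported here so the parsing helpers work without it

    upload_pdf(boto3.client("s3"), local_path, bucket, key)
    return form_fields(analyze_pdf(boto3.client("textract"), bucket, key))
```

The dictionary returned by `form_fields` is the kind of data you would then pre-fill into a verification form, so a person can check what Textract extracted before it is sent anywhere else.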

For more info on this Blueprint, visit GitHub.

Author Onify
