AWS makes Textract generally available for extracting text from documents

Amazon says no machine learning expertise is needed to use the service, which automatically extracts text and data from tables or forms.

Amazon Web Services on Wednesday announced the general availability of Textract, a fully managed service that uses machine learning to automatically extract text and data, including from tables and forms. Textract was one of multiple AI-powered tools and services unveiled at last year's AWS re:Invent conference that require no machine learning expertise to use.

Typically, companies use optical character recognition (OCR) software to extract text and data from files like contracts, tax documents, expense reports or patient forms. However, traditional OCR technologies can't recognize common layouts like forms and tables. They consequently generate a lengthy and often inaccurate text dump. 

By comparison, AWS has called Textract an "OCR++" service. It can, for instance, see a document with a table and recognize that the data belongs in rows and columns. "It's able to identify there's a table and able to lay out for you what that table should look like so you can use and read that data," AWS CEO Andy Jassy said at re:Invent.

Textract's API supports multiple image formats including scans, PDFs and photos, and customers can use it with database and analytics services like Amazon Elasticsearch Service, Amazon DynamoDB and Amazon Athena. They can also use it with other machine learning services like Amazon Comprehend, Comprehend Medical, Amazon Translate or Amazon SageMaker.
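To give a sense of what calling the API looks like in practice, here is a minimal Python sketch, not an official AWS sample. It assumes the boto3 SDK, configured AWS credentials and a local JPEG or PNG image; the helper names `analyze_file` and `tables_from_blocks` are hypothetical and chosen for illustration. The first function sends the image to Textract's `analyze_document` operation, and the second rebuilds each detected table as rows and columns from the `Blocks` list in the response.

```python
def analyze_file(path, region="us-east-1"):
    """Send a local image to Textract and return the raw response.

    Assumes AWS credentials are configured and boto3 is installed.
    """
    import boto3  # imported lazily so the parser below works without it

    client = boto3.client("textract", region_name=region)
    with open(path, "rb") as f:
        return client.analyze_document(
            Document={"Bytes": f.read()},
            FeatureTypes=["TABLES", "FORMS"],
        )


def tables_from_blocks(blocks):
    """Rebuild every TABLE block as a list of rows of cell strings."""
    by_id = {b["Id"]: b for b in blocks}

    def child_ids(block):
        # A block lists its children under a CHILD relationship, if any.
        for rel in block.get("Relationships") or []:
            if rel["Type"] == "CHILD":
                yield from rel["Ids"]

    tables = []
    for block in blocks:
        if block["BlockType"] != "TABLE":
            continue
        cells = {}
        for cell_id in child_ids(block):
            cell = by_id[cell_id]
            if cell["BlockType"] != "CELL":
                continue
            words = [
                by_id[w]["Text"]
                for w in child_ids(cell)
                if by_id[w]["BlockType"] == "WORD"
            ]
            # RowIndex and ColumnIndex are 1-based positions in the table.
            cells[(cell["RowIndex"], cell["ColumnIndex"])] = " ".join(words)
        n_rows = max((r for r, _ in cells), default=0)
        n_cols = max((c for _, c in cells), default=0)
        tables.append(
            [
                [cells.get((r, c), "") for c in range(1, n_cols + 1)]
                for r in range(1, n_rows + 1)
            ]
        )
    return tables
```

With credentials in place, `tables_from_blocks(analyze_file("invoice.png")["Blocks"])` would return one list of rows per table found; the file name is a placeholder. The resulting rows could then be written to a store such as DynamoDB or indexed for search.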

Customers using the service already include The Globe and Mail, PwC, Healthfirst, UiPath, Teradact, Ripcord, BluePrism and Alfresco.

Textract is now available in the US East (Ohio), US East (N. Virginia), US West (Oregon) and EU (Ireland) regions. AWS will bring it to additional regions in the coming year.