RPA.DocumentAI: extract structured data from documents
RPA.DocumentAI: Intelligent Document Processing with various engines
Currently supported engines:
- Google: google (requires rpaframework-google)
- Base64: base64ai
- Nanonets: nanonets
Tasks
Process a real world PDF invoice (or its PNG counterpart) with the following tasks:
- Document AI Google: using Google engine
- Document AI Base64: using Base64 engine
- Document AI Nanonets: using Nanonets engine
- Document AI All: using all the available engines
- Document AI Work Items: using custom configured engines for multiple files
Secrets
We recommend using Control Room's Vault to store the API keys these libraries need, since each engine must authenticate with its external service before it can predict a document.
The expected structure is shown in our vault.yaml file, from which we can understand the following:
- Google uses a JSON (serialized dictionary) describing your service account private key under the serviceaccount field.
- Base64 needs two comma-separated values under base64ai, which unfold into the account e-mail address and an API key generated under it.
- Nanonets only requires a valid API key under nanonets.
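To make the shape of these three values concrete, here is a minimal Python sketch that unpacks them the way the list above describes. It is an illustration only, not how the library itself reads the Vault: the `parse_secrets` helper, the output key names and every value shown are hypothetical placeholders, and only the field names (serviceaccount, base64ai, nanonets) come from this README.

```python
import json


def parse_secrets(secrets: dict) -> dict:
    """Split raw secret values into per-engine credentials (hypothetical helper)."""
    parsed = {}
    if "serviceaccount" in secrets:
        # Google: a JSON-serialized dictionary describing the service account key.
        parsed["google"] = json.loads(secrets["serviceaccount"])
    if "base64ai" in secrets:
        # Base64: "e-mail,api-key"; split only on the first comma.
        email, api_key = (part.strip() for part in secrets["base64ai"].split(",", 1))
        parsed["base64ai"] = {"email": email, "api_key": api_key}
    if "nanonets" in secrets:
        # Nanonets: a single API key.
        parsed["nanonets"] = {"api_key": secrets["nanonets"].strip()}
    return parsed


# Placeholder values only; never hard-code real keys like this.
example = {
    "serviceaccount": json.dumps({"type": "service_account", "project_id": "my-project"}),
    "base64ai": "user@example.com,base64-api-key",
    "nanonets": "nanonets-api-key",
}
credentials = parse_secrets(example)
```

In a real robot these values would come from the Vault rather than from a literal dictionary.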
Use Init Engine with any of the above for a unified & simplified experience.
⚠️ At all times keep these values private (securely stored in our Vault) and treat them like passwords.
Technical information
Last updated: August 15, 2023
License: Apache License 2.0