RPA.DocumentAI

Wrapper library offering generic keywords for initializing an engine, scanning documents (PDF, PNG, etc.) and retrieving the results as fields.

This library requires rpaframework version 19.0.0 or later.

This is a helper facade for the following libraries:

  • RPA.Cloud.Google (requires rpaframework-google)
  • RPA.DocumentAI.Base64AI
  • RPA.DocumentAI.Nanonets

Working with any of them involves the following steps:

  1. Engine initialization: Init Engine
  2. Document scan: Predict
  3. Result retrieval: Get Result

Whichever engine you use, the same keywords apply; only the parameters passed to them differ (check the docs of each library for particularities). Before scanning documents, you must configure the service with a model to scan the files with and an API key that authorizes access. Once several engines are initialized, you can jump between them with Switch Engine.
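The facade idea described above can be illustrated with a small, self-contained Python sketch. It is not the library's implementation: `FakeEngine` and `EngineFacade` are hypothetical names, the engines perform no network calls, and making the most recently initialized engine the active one is an assumption of this sketch only.

```python
from typing import Any, Dict, Optional


class FakeEngine:
    """Hypothetical stand-in for a real engine; performs no network calls."""

    def __init__(self, name: str, api_key: str) -> None:
        self.name = name
        self.api_key = api_key
        self.last_result: Optional[Dict[str, Any]] = None

    def predict(self, path: str, **params: Any) -> None:
        # A real engine would upload the file; here the call is only recorded.
        self.last_result = {"engine": self.name, "file": path, "params": params}


class EngineFacade:
    """Illustrative facade: one set of methods in front of several engines."""

    def __init__(self) -> None:
        self._engines: Dict[str, FakeEngine] = {}
        self._active: Optional[str] = None

    def init_engine(self, name: str, api_key: str) -> None:
        # Assumption of this sketch: the newest engine becomes the active one.
        self._engines[name] = FakeEngine(name, api_key)
        self._active = name

    def switch_engine(self, name: str) -> None:
        if name not in self._engines:
            raise RuntimeError(f"Engine {name!r} has not been initialized")
        self._active = name

    def _current(self) -> FakeEngine:
        if self._active is None:
            raise RuntimeError("No engine initialized")
        return self._engines[self._active]

    def predict(self, path: str, **params: Any) -> None:
        # Same call for every engine; only the extra parameters differ.
        self._current().predict(path, **params)

    def get_result(self) -> Optional[Dict[str, Any]]:
        return self._current().last_result


if __name__ == "__main__":
    facade = EngineFacade()
    facade.init_engine("base64ai", api_key="dummy-key-1")
    facade.init_engine("nanonets", api_key="dummy-key-2")
    facade.switch_engine("base64ai")
    facade.predict("invoice.png")
    print(facade.get_result())
```

The point of the design is that calling code only ever talks to the facade, so moving a document from one engine to another changes the arguments of a call and not its shape.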

See Portal example: https://robocorp.com/portal/robot/robocorp/example-document-ai

Example: Robot Framework

*** Settings ***
Library    Collections    # provides the Log List keyword used below
Library    RPA.DocumentAI

*** Tasks ***
Scan Documents
    Init Engine    base64ai    vault=document_ai:base64ai
    Init Engine    nanonets    vault=document_ai:nanonets

    Switch Engine    base64ai
    Predict    invoice.png
    ${data} =    Get Result
    Log List    ${data}

    Switch Engine    nanonets
    Predict    invoice.png    model=858e4b37-6679-4552-9481-d5497dfc0b4a
    ${data} =    Get Result
    Log List    ${data}

Example: Python

from RPA.DocumentAI import DocumentAI, EngineName

lib_docai = DocumentAI()
lib_docai.init_engine(
    EngineName.GOOGLE, vault="document_ai:serviceaccount", region="eu"
)
lib_docai.predict(
    "invoice.pdf", model="df1d166771005ff4",
    project_id="complete-agency-347912", region="eu"
)
print(lib_docai.get_result())