Cloud machine learning (ML) APIs

This robot is included in our downloadable example robots. You can also find the code at the example robots repository.

Cloud APIs

Supporting advanced data analytics or decision logic beyond simple deterministic rules often requires machine learning (ML) solutions. One way to provide the needed intelligence is to use commercial cloud services that can be applied for a number of typical use cases. Such services include, for example, image analysis, natural language processing, document analysis, and optical character recognition (OCR).

These examples demonstrate how to use cloud-based ML APIs with RPA Framework. Currently, a selected set of services from AWS (RPA.Cloud.AWS), Microsoft Azure (RPA.Cloud.Azure) and Google Cloud (RPA.Cloud.Google) are available.

You can access cloud ML services that are not supported by RPA Framework by writing custom Robot Framework libraries that connect to them via REST APIs or service-specific SDKs. If your RPA team has in-house machine learning expertise, the possibilities for self-developed cloud ML services are almost limitless. A separate article on this topic will be published later.


To use cloud ML APIs, you need:

  • An account for the cloud provider of your choice.
  • The required ML service(s) activated and accessible from the public internet.
  • Credentials to the service(s) - the format depends on the cloud provider.

Set up the access keys

The authentication method varies depending on the cloud service:

  • AWS: AWS key and AWS key ID (AWS_KEY and AWS_KEY_ID)
  • Azure: Azure subscription key (AZURE_SUBSCRIPTION_KEY)
  • Google Cloud: Service credentials file

There are three different ways to set the access credentials; see the documentation of the RPA.Cloud libraries. Here we present an approach that works both in a local and a Robocorp Cloud setting.

For local robot runs, we use a vault.json file placed outside the repository, e.g., in /Users/<username>/vault.json. For AWS and Azure, we can save the key values directly in the file. For Google Cloud, the content of the service credentials file must be stored as a single string, with special characters escaped. This can be done, e.g., with the following Python script.

import json

vault = {}

vault["aws"] = {"AWS_KEY_ID": "aws-access-key-id",
                "AWS_KEY": "aws-secret-access-key"}

vault["azure"] = {"AZURE_SUBSCRIPTION_KEY": "azure-subscription-key"}

with open("path-to-gcp-credentials-file", "r") as f:
    gcp_json_str = f.read()

vault["gcp"] = {"json_content": gcp_json_str}

with open('/Users/<username>/vault.json', 'w') as f:
    json.dump(vault, f)

This will result in the following JSON file:

{
  "aws": {
    "AWS_KEY_ID": "aws-access-key-id",
    "AWS_KEY": "aws-secret-access-key"
  },
  "azure": { "AZURE_SUBSCRIPTION_KEY": "azure-subscription-key" },
  "gcp": { "json_content": "GCP service credentials file as string" }
}

If the robot is run in Robocorp Cloud, store the above credential information in your workspace vault with the names aws, azure and gcp.

Configure file vault support

To run in your local or development environment, edit the path to the vault file in devdata/env.json.
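For reference, the env.json for a file-based vault typically looks like the sketch below. The RPA_SECRET_MANAGER and RPA_SECRET_FILE variable names follow the file vault support of the RPA.Robocloud.Secrets library; verify them against the documentation of your rpaframework version, and adjust the path to where you placed your vault.json.

```json
{
  "RPA_SECRET_MANAGER": "RPA.Robocloud.Secrets.FileSecrets",
  "RPA_SECRET_FILE": "/Users/<username>/vault.json"
}
```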

Sample data

In this example, we will use images of an invoice and a sphygmomanometer to showcase the image processing functionalities. For natural language processing, we will be working on a piece of text given as a string.

The sample invoice and picture are shown below.

[Image: sample invoice]  [Image: sample picture]

Using ML services with RPA Framework

The robot code resides in the tasks.robot file. In the *** Settings *** section, we define which libraries are used and configure the cloud libraries.

Match the region argument of RPA.Cloud.AWS and RPA.Cloud.Azure to your region in AWS or Azure.

*** Settings ***
Documentation     Machine Learning API examples.
Library           RPA.FileSystem
Library           RPA.HTTP
Library           RPA.Tables
Library           RPA.Cloud.AWS
...               region=us-east-1
...               robocloud_vault_name=aws
Library           RPA.Cloud.Azure
...               region=eastus
...               robocloud_vault_name=azure
Library           RPA.Cloud.Google
...               robocloud_vault_name=gcp
...               robocloud_vault_secret_key=json_content

*** Variables ***
${INVOICE_FILE}=    ${CURDIR}${/}output${/}invoice.png
${PICTURE_FILE}=    ${CURDIR}${/}output${/}picture.jpg
${TEXT_SAMPLE}=    A software robot developer creates digital agents for robotic process
...               automation (RPA), test automation, application monitoring, or some
...               other use. Tens of thousands of new jobs are predicted to be created
...               in the RPA industry. Most of these will be for developers.
...               The demand for software robot developers is growing. Many companies
...               will employ teams of software robot developers to build and operate
...               their automated workforce. Other organizations hire external
...               developers to offer them automation with a 'robotics-as-a-service' model.

The first task is to download the sample images:

*** Tasks ***
Download sample files
    Create Directory    ${CURDIR}${/}output    parents=True
    Download    ${INVOICE_URL}    target_file=${INVOICE_FILE}    overwrite=True
    Download    ${PICTURE_URL}    target_file=${PICTURE_FILE}    overwrite=True

The basic functionality of the RPA.Cloud libraries is to send a request to the API and return the response as JSON without modifying the content. The JSON data can then be processed further based on the needs of the automated process. Such processing is not currently fully supported by RPA Framework, since each service has its own response syntax.


To analyze a scanned invoice, we use the Textract service from AWS.

*** Tasks ***
Analyze invoice with AWS Textract and find tables from the response
    Init Textract Client    use_robocloud_vault=True
    ${response}=    Analyze Document    ${INVOICE_FILE}    ${CURDIR}${/}output${/}textract.json
    ${tables}=    Get Tables
    FOR    ${key}    IN    @{tables.keys()}
        Write Table To Csv    ${tables["${key}"]}    ${CURDIR}${/}output${/}table_${key}.csv
    END

After configuring the client and getting the response from AWS, we use the Get Tables keyword to collect tables from the response. In this example, the table data is found correctly.

Qty    Description                    Unit price    Amount
1      Front and rear brake cables    100.00        100.00
2      New set of pedal arms          15.00         30.00
3      Labor 3hrs                     5.00          15.00
       Sales Tax 6.25%                              9.06

The full response from AWS is stored in a JSON file (here output/textract.json). Parsing the other data, like key-value pairs, from the JSON response needs to be done by separate functions. The syntax of the Textract response and code examples are available on the Textract developer guide.
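As a starting point, the following is a minimal sketch of extracting key-value pairs from the stored Textract JSON. It assumes the Block structure described in the Textract developer guide: KEY_VALUE_SET blocks linked to their values through VALUE relationships and to their text through CHILD relationships pointing at WORD blocks.

```python
import json


def get_kv_pairs(response):
    """Extract key-value pairs from a Textract Analyze Document response."""
    blocks = {block["Id"]: block for block in response["Blocks"]}

    def text_of(block):
        # Concatenate the text of the WORD blocks this block points to.
        words = []
        for rel in block.get("Relationships", []):
            if rel["Type"] == "CHILD":
                for child_id in rel["Ids"]:
                    child = blocks[child_id]
                    if child["BlockType"] == "WORD":
                        words.append(child["Text"])
        return " ".join(words)

    pairs = {}
    for block in blocks.values():
        # KEY blocks point at their VALUE blocks via a VALUE relationship.
        if block["BlockType"] == "KEY_VALUE_SET" and "KEY" in block.get("EntityTypes", []):
            for rel in block.get("Relationships", []):
                if rel["Type"] == "VALUE":
                    for value_id in rel["Ids"]:
                        pairs[text_of(block)] = text_of(blocks[value_id])
    return pairs


# Usage against the stored response file:
# with open("output/textract.json") as f:
#     print(get_kv_pairs(json.load(f)))
```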


The usage of RPA.Cloud.Azure is demonstrated with the Text Analytics service. In the following task, the robot recognizes the language, key phrases, and sentiment of a sample text.

*** Tasks ***
Analyze text sample with Azure
    Init Text Analytics Service    use_robocloud_vault=True
    Detect Language    Vilken språk talar man in Åbo?    ${CURDIR}${/}output${/}text_lang.json
    Key Phrases    ${TEXT_SAMPLE}    ${CURDIR}${/}output${/}text_phrases.json
    Sentiment analyze    ${TEXT_SAMPLE}    ${CURDIR}${/}output${/}text_sentiment.json

The outcome of the analysis is stored in JSON files (here output/text_lang.json, output/text_phrases.json, and output/text_sentiment.json). As a summary, we find the following information:

Key phrases:  'teams of software robot developers', 'external developers', 'test automation', 'robotic process automation', 'RPA industry', 'digital agents', 'application monitoring', 'Tens of thousands', 'new jobs', 'companies', 'demand', 'service', 'use', 'robotics', 'organizations', 'automated workforce', 'model'

Sentiment:    neutral    {'positive': 0.03, 'neutral': 0.96, 'negative': 0.01}
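The stored sentiment file can be summarized with a short Python sketch like the one below. The "documents", "sentiment", and "confidenceScores" fields assume the v3 Text Analytics response schema; older API versions use a different shape, so check the stored JSON before relying on these names.

```python
import json


def summarize_sentiment(response):
    """Return the overall sentiment and confidence scores of the first document."""
    doc = response["documents"][0]
    return doc["sentiment"], doc["confidenceScores"]


# Usage against the stored response file:
# with open("output/text_sentiment.json") as f:
#     sentiment, scores = summarize_sentiment(json.load(f))
```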

Google Cloud

From RPA.Cloud.Google we try Google Vision AI, which can be used, for example, to label an image or detect text in it.

*** Tasks ***
Analyze image with Google Vision AI
    Init Vision Client    use_robocloud_vault=True
    ${labels}=    Detect Labels    ${PICTURE_FILE}    ${CURDIR}${/}output${/}vision_labels.json
    ${text}=    Detect Text    ${PICTURE_FILE}    ${CURDIR}${/}output${/}vision_text.json
    Log    ${CURDIR}

The full responses are stored in JSON files (here output/vision_labels.json and output/vision_text.json). A summary of the outcome is shown below. Note that to get "sphygmomanometer" among the labels, one would need to train a custom model, since that term is not included in Google Vision AI's standard label set. The recognized text blocks cover almost all of the text appearing in the image. The full response data also contains the bounding box information.

Labels:

  Electronic device       0.8969995
  Measuring instrument    0.827221

Detected text:

  "beurer", "102", "63", "LINKER ARM", "LEFT ARM", "WHO", "SYS", "mmHg", "mmHg", "DIA", "mmHg", "2-3 cm", "Arterie", "artery", "LATEX", "12 20M", "9.05.", "FREE", "PULSE", "/min", "22-35 cm", "Item No.: 162.972", "urer GmbH, Söflinger Strasse 218, 89077 Ulm, Germany", "M+C", "Set"
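The stored label file can be filtered with a sketch like the one below. It assumes the saved JSON contains a top-level "labelAnnotations" list with "description" and "score" fields, as in the Vision API's annotate-image response; depending on how the library stores the result, the list may instead sit under a "responses" wrapper, so inspect the file first.

```python
import json


def top_labels(response, threshold=0.8):
    """Return (description, score) pairs for labels above the confidence threshold."""
    return [(label["description"], label["score"])
            for label in response.get("labelAnnotations", [])
            if label["score"] >= threshold]


# Usage against the stored response file:
# with open("output/vision_labels.json") as f:
#     print(top_labels(json.load(f)))
```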