Web scraper Python robot example

Get the code and run this example in your favorite editor on our Portal!

It is possible to implement robots using pure Python, without using Robot Framework. These robots are first-class citizens in Robocorp: they can be developed iteratively with our developer tools, and run and orchestrated in Control Room, like any other robot. RPA Framework, our set of open-source libraries, provides APIs for both Robot Framework and Python.

This simple robot uses the RPA.Browser.Selenium library to open a web page, search for a term, and take a screenshot of the page.

Robot script (Python)

from RPA.Browser.Selenium import Selenium

browser = Selenium()
url = "https://robocorp.com/docs/"
term = "python"
screenshot_filename = "output/screenshot.png"


def open_the_website(url: str):
    browser.open_available_browser(url)


def search_for(term: str):
    input_field = "css:input"
    browser.input_text(input_field, term)
    browser.press_keys(input_field, "ENTER")


def store_screenshot(filename: str):
    browser.screenshot(filename=filename)

# Define a main() function that calls the other functions in order:
def main():
    try:
        open_the_website(url)
        search_for(term)
        store_screenshot(screenshot_filename)
    finally:
        browser.close_all_browsers()

# Call the main() function, checking that we are running as a stand-alone script:
if __name__ == "__main__":
    main()

The script does the following:

  • Import the Selenium library from the RPA.Browser.Selenium package.
  • Initialize the browser and some variables.
  • Define the functions that implement the operations the robot is supposed to do.
  • Define the main function.
  • Call the main function.

Important: Always use try and finally to ensure that resources such as open browsers are closed even if some of the functions fail. This is similar to using [Teardown] in Robot Framework scripts. Read the Errors and Exceptions document on python.org for more information.
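
To see why the finally block matters, here is a minimal sketch that runs without a real browser. FakeBrowser and failing_step are hypothetical stand-ins invented for this illustration; they are not part of RPA Framework. The step raises an exception, yet the cleanup code still runs before the exception reaches the caller.

```python
class FakeBrowser:
    """Hypothetical stand-in for a browser; not part of RPA Framework."""

    def __init__(self):
        self.is_open = True

    def close(self):
        self.is_open = False


def failing_step():
    # Simulates a robot step that fails, e.g. an element that is not found.
    raise RuntimeError("simulated failure in a robot step")


def run(browser):
    try:
        failing_step()
    finally:
        # Runs whether or not failing_step() raised an exception.
        browser.close()


fake_browser = FakeBrowser()
try:
    run(fake_browser)
except RuntimeError as error:
    caught = str(error)
    print(f"Step failed: {error}")

print(f"Browser still open: {fake_browser.is_open}")
```

Note that finally does not swallow the exception: the failure still propagates, so the run is reported as failed, but the resource has been released by then.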

robot.yaml

The robot.yaml file for the Python robot looks like this:

tasks:
  entrypoint:
    command:
      - python
      - tasks.py

condaConfigFile: conda.yaml
artifactsDir: output
ignoreFiles:
  - .gitignore
PATH:
  - .
PYTHONPATH:
  - .

Note the command section:

command:
  - python
  - tasks.py

In this case, we tell Python to run the tasks.py file. You can customize the command for your needs.
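
One possible customization is to pass extra arguments to the script. The sketch below assumes hypothetical --term and --url options, which are not part of the original example, and parses them with the standard argparse module. With such options in place, the command list could be extended with extra items such as --term and rpa after tasks.py.

```python
import argparse


def parse_arguments(argv=None):
    """Parse hypothetical options for the robot script.

    The --term and --url options are assumptions made for this sketch.
    Passing argv=None makes argparse read the real command line.
    """
    parser = argparse.ArgumentParser(description="Web scraper robot options")
    parser.add_argument("--term", default="python", help="Search term to enter")
    parser.add_argument(
        "--url",
        default="https://robocorp.com/docs/",
        help="Page to open before searching",
    )
    return parser.parse_args(argv)


# Simulate the arguments that would follow tasks.py in the command list.
arguments = parse_arguments(["--term", "rpa"])
print(arguments.term)
print(arguments.url)
```

The defaults match the values hard-coded in the script above, so running the robot without extra arguments would behave as before.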

For comparison, the typical Robot Framework command looks like this:

command:
  - python
  - -m
  - robot
  - --report
  - NONE
  - -d
  - output
  - --logtitle
  - Task log
  - tasks.robot

In this case, we tell Python to run the robot module (python -m robot) with arguments.
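
Each item in the command list is a single argument, which is why Task log can contain a space without any quoting. As a rough illustration, assuming the items are handed to the program as separate arguments in the usual way, the standard shlex module can render the list as the equivalent shell command line:

```python
import shlex

# The Robot Framework command from robot.yaml, one list item per argument.
command = [
    "python",
    "-m",
    "robot",
    "--report",
    "NONE",
    "-d",
    "output",
    "--logtitle",
    "Task log",
    "tasks.robot",
]

# shlex.join quotes only the items that need it, here the title with a space.
command_line = shlex.join(command)
print(command_line)
# → python -m robot --report NONE -d output --logtitle 'Task log' tasks.robot
```

This is only a way to read the list; it makes no claim about how the tooling launches the process internally.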


June 28, 2021