Browser and web automation

You are already using one of the most powerful automation tools: your browser! As more and more tools and companies become web-based, learning how to automate the browser is one of the most useful skills for any Software Robot Developer.

Is browser automation what you need?

Before delving into browser automation, consider the task you are automating. Does the web application you are working with allow API access? APIs do not change as often as graphical user interfaces, so automation scripts built on an API don't break as easily. Check the HTTP section for information on working with HTTP APIs. For example, an application might expose HTTP endpoints like these:

GET https://api.spacexdata.com/v3/launches
GET https://api.spacexdata.com/v3/rockets/{{rocket_id}}
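As a minimal sketch of the API-first approach, the two requests above could be built and sent from Python using only the standard library. The helper names (launches_url, rocket_url, fetch_json) are illustrative and not part of any library, and the sketch assumes the endpoints return JSON.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the example requests above.
API_BASE = "https://api.spacexdata.com/v3"


def launches_url() -> str:
    """Return the URL that lists launches."""
    return f"{API_BASE}/launches"


def rocket_url(rocket_id: str) -> str:
    """Return the URL for one rocket; the ID is percent-encoded to keep the path valid."""
    return f"{API_BASE}/rockets/{urllib.parse.quote(rocket_id, safe='')}"


def fetch_json(url: str, timeout: float = 10.0):
    """Send a GET request and decode the JSON response body (assumes a JSON reply)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_json(launches_url()) would perform the first request. Nothing in this sketch touches a browser, which is the point: if the data is available this way, there is no user interface to break.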

Sometimes there is no API, or the API lacks some functionality needed to complete the process. In these cases, you need to interact with the application's user interface: filling and submitting forms, pressing buttons, clicking on elements, scraping content, and many other interactions.

How do you "automate" a browser?

At a high level, browser automation is made possible by a "driver", a piece of software that can manage and control the browser much like a human would, but by executing instructions written in code. Historically, obtaining and setting up drivers for browser automation has required many steps. The RPA Framework set of libraries for Python and Robot Framework automatically takes care of this setup for you so that you can get started immediately.

Which browser should you use in your automation?

You can choose the browser you prefer. Our suggestion is to use Google Chrome unless the application you are trying to automate works only with a specific browser. Different browsers might behave in different ways. A robot could work with Chrome, but not with Safari or Internet Explorer.

Which automation library should you use?

The RPA.Browser library uses Selenium under the hood, currently the most established tool for browser automation. Technically, it is based on SeleniumLibrary, to which it adds many convenient features to make your life easier as a developer.

The Robot Framework Browser library, based on the newer Playwright open-source project backed by Microsoft, provides an exciting alternative approach that promises to modernize the whole browser automation scene. You can learn more about how to work with this library and its pros and cons in our dedicated section. Give it a try!

Opening the browser

The first step is to open a browser. The easiest way is to use the Open Available Browser keyword, which will set up everything for you automatically. Internally, it will detect which browsers are installed on your machine and start the first browser it finds (it prefers Chrome!).
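Here is a minimal Python sketch of that first step. It assumes a recent rpaframework release, where the Selenium-based library is imported as RPA.Browser.Selenium (older releases may expose it under a different module path). The function names make_browser and open_and_read_title are illustrative only.

```python
def make_browser():
    """Create the library instance (assumes rpaframework is installed)."""
    # Imported lazily so the rest of this sketch loads without rpaframework.
    from RPA.Browser.Selenium import Selenium

    return Selenium()


def open_and_read_title(browser, url: str) -> str:
    """Open `url` in the first available browser and return the page title."""
    # Python counterpart of the Open Available Browser keyword.
    browser.open_available_browser(url)
    try:
        return browser.get_title()
    finally:
        # Always close the browser, even if reading the title fails.
        browser.close_all_browsers()
```

A call such as open_and_read_title(make_browser(), "https://example.com") would open the page, read its title, and close the browser again. The URL here is only a placeholder.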

Other ways to use the browser

When you open a browser using the Open Available Browser keyword, you get a blank slate: the browser is not logged into any specific account and uses default settings. This setup is sufficient for many use cases, but there are cases in which you might want to use alternative approaches:

  • You can attach to a running browser (Chrome only). This is useful when you have a browser profile where you have already performed a complicated login step, or that has plugins or configurations you want to use in your automation.

  • You can open the current user's default browser using the Open User Browser keyword.

  • You can get full control of all configuration options by using the Open Browser keyword. Then you are responsible for setting up the driver for your desired browser. Check the keyword documentation for more info.
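The first two alternatives can be sketched in Python as follows. This assumes the RPA.Browser.Selenium library from a recent rpaframework release, whose Attach Chrome Browser and Open User Browser keywords are available as the methods used below. The port number 9222 is an assumption: attaching requires Chrome to have been started with remote debugging enabled on the same port. The wrapper function names are illustrative.

```python
def attach_to_running_chrome(browser, port: int = 9222) -> None:
    """Attach to a Chrome instance that is already running.

    Assumption: Chrome was started with --remote-debugging-port=<port>.
    """
    browser.attach_chrome_browser(port)


def open_in_user_browser(browser, url: str) -> None:
    """Open `url` in the current user's default browser."""
    browser.open_user_browser(url)
```

In both functions, browser is expected to be an instance of the Selenium class from RPA.Browser.Selenium.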

Headless browser?

While developing your web-based automation, it is useful to see the browser GUI at work. Once you have verified the process works, you might want to tell the browser to run in "headless" mode, which means no window or interaction will be shown to the user. You can always reverse this change to debug the process later. The headless mode is used when running the robot in a cloud container.
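One way to make this switch easy to reverse is to drive it from an environment variable, sketched below. The variable name ROBOT_HEADLESS is hypothetical, chosen only for this example. The sketch assumes the headless argument of open_available_browser in RPA.Browser.Selenium.

```python
import os


def headless_requested(env=None) -> bool:
    """Return True when the hypothetical ROBOT_HEADLESS variable is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get("ROBOT_HEADLESS", "").strip().lower() in {"1", "true", "yes"}


def open_browser_for_run(browser, url: str, env=None) -> None:
    """Open `url`, showing the browser window unless headless mode was requested."""
    browser.open_available_browser(url, headless=headless_requested(env))
```

With this arrangement, the robot shows the browser window during development and runs without one wherever ROBOT_HEADLESS=1 is set, with no change to the code.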

Locators: a fundamental concept

Humans can easily see and interact with elements using a mouse, keyboard, touch controls, or other input devices. Your software robot will need you to point it to the elements you want to interact with using locators. Using locators, you can tell the robot-browser which form inputs to fill, buttons to press, elements to scroll into view, etc. We have instructions covering using locators in web applications.

Logging into web applications

Very often, logging into a web application is the first step you need to take.

For applications that require a simple user name and password, you will simply instruct the browser to interact with a login form using locators and the relevant keywords.
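A simple login of this kind might look like the sketch below. The locators id:username and id:password are hypothetical and depend entirely on the markup of the application you are automating. The methods used are the Python forms of the Open Available Browser, Wait Until Page Contains Element, Input Text, Input Password, and Submit Form keywords, assuming the RPA.Browser.Selenium library.

```python
def log_in(browser, login_url: str, username: str, password: str) -> None:
    """Fill in and submit a simple user name and password form."""
    browser.open_available_browser(login_url)
    # Wait for the form to be present before typing into it.
    browser.wait_until_page_contains_element("id:username")  # hypothetical locator
    browser.input_text("id:username", username)
    # Input Password keeps the value out of the robot's log output.
    browser.input_password("id:password", password)  # hypothetical locator
    # With no locator given, the first form on the page is submitted.
    browser.submit_form()
```

In a real robot, the credentials should come from a secure store rather than being written into the script.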

You can see an example in the accompanying video, which uses Robocorp Lab.

In more complex cases, you might need to set up two-step authentication, or attach to a browser where you have manually logged into the application first.

Downloading files via the browser

If a file is publicly available on the internet and you know the URL, the easiest way to download it in the context of your software robot is probably to use the RPA Framework HTTP library.
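As a sketch, a public file could be fetched like this, assuming the download method of the HTTP class in RPA.HTTP. The wrapper name download_public_file is illustrative.

```python
def download_public_file(http, url: str, target_file: str) -> str:
    """Download `url` to `target_file`, replacing any existing file, and return the path."""
    http.download(url, target_file=target_file, overwrite=True)
    return target_file
```

Here http is expected to be an instance created with HTTP() after "from RPA.HTTP import HTTP". No browser is involved, so this is faster and less fragile than clicking a download link.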

If the file is only available after you log into the application, or is produced by an export operation triggered by clicking a button, you can download it using the browser. In that case, you will probably want to change the browser's default download directory.
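The sketch below shows one way to do this. It assumes the set_download_directory method of RPA.Browser.Selenium, which is called before the browser is opened so that the setting takes effect. The export button locator and the file name are hypothetical, and wait_for_download is a helper written for this example, not a library function.

```python
import time
from pathlib import Path


def wait_for_download(directory, filename: str, timeout: float = 30.0, poll: float = 0.5) -> Path:
    """Poll until `filename` exists in `directory`, or raise TimeoutError."""
    target = Path(directory) / filename
    deadline = time.monotonic() + timeout
    while True:
        if target.is_file():
            return target
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{target} did not appear within {timeout} seconds")
        time.sleep(poll)


def export_via_browser(browser, url: str, export_button: str, directory, filename: str,
                       timeout: float = 30.0) -> Path:
    """Click an export button and wait for the resulting file to be saved."""
    # Set the download directory before the browser is opened.
    browser.set_download_directory(str(directory))
    browser.open_available_browser(url)
    browser.click_element(export_button)  # hypothetical locator supplied by the caller
    return wait_for_download(directory, filename, timeout=timeout)
```

The polling helper only checks that the file exists under its final name. For very large files, you may want an extra check that the file size has stopped changing.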

Learn more about browser and web automation

Now you should have a clearer idea of what browser automation means. However, there is much more to learn! One way is to follow our Beginners' course, which will guide you through solving a fun use case with browser automation. Also, check out the robot examples in this section and on our Portal.

Are you stuck? Get help in the forums or on our Slack!

If you have questions or need help with your automation project, register a free account and get help in our Community forums or Slack!