Browser and web automation
Learn the ropes of web browser automation to log into applications, fill forms, download files and more. Explore automation for Chrome, Safari and Firefox.
You are already using one of the most powerful automation tools: your browser! As more and more tools and companies become web-based, learning how to automate the browser is one of the most useful skills for any Software Robot Developer.
Is browser automation what you need?
Before delving into browser automation, consider the task you are automating. Does the web application you are working with allow API access? APIs tend to change less often than graphical user interfaces, so automation scripts built on them don't break as easily. Check the HTTP section for information on working with HTTP APIs.
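To make the comparison concrete, here is a minimal sketch of what the API route can look like, using only the Python standard library. The base URL, the resource path, and the bearer-token authentication are assumptions made for illustration; a real application will document its own endpoints and authentication scheme.

```python
import json
import urllib.request


def build_request(base_url, resource, token):
    """Build a GET request for a JSON API.

    The URL layout and the bearer token header are hypothetical.
    """
    url = f"{base_url.rstrip('/')}/{resource.lstrip('/')}"
    return urllib.request.Request(
        url,
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="GET",
    )


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_json(build_request("https://api.example.com", "orders", token)) would return the decoded data directly, with no browser or locators involved.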
Sometimes there is no API provided, or the API is missing some functionality needed to complete the process. In these cases, you need to interact with the application's user interface: filling and submitting forms, pressing buttons, clicking on elements, scraping content, and many other interactions.
How do you "automate" a browser?
At a high level, browser automation is made possible by a "driver", a piece of software that can manage and control the browser just like a human would do, but executing instructions written in code. Historically, obtaining and setting up drivers for browser automation has required many steps. The RPA Framework set of libraries for Python and Robot Framework automatically takes care of this setup for you so that you can get started immediately.
Which browser should you use in your automation?
You can choose the browser you prefer. Our suggestion is to use Google Chrome unless the application you are trying to automate works only with a specific browser. Different browsers might behave in different ways. A robot could work with Chrome, but not Safari or Internet Explorer.
Which automation library should you use?
Robocorp provides two main options for browser automation.
The RPA.Browser.Selenium library uses Selenium under the hood, currently the most established tool for browser automation. Technically, it is based on the SeleniumLibrary to which it adds many convenient features to make your life easier as a developer.
The RPA.Browser.Playwright library (Robot Framework Browser), based on the newer Playwright open-source project backed by Microsoft, provides an exciting alternative approach that promises to modernize the whole browser automation scene. You can learn more about how to work with this library and its pros and cons in our dedicated Playwright section. Give it a try!
Opening the browser
The first step is to open a browser. When using the RPA.Browser.Selenium library, the easiest way is to use the Open Available Browser keyword, which will set up everything for you automatically. Internally, it will detect which browsers are installed on your machine and start the first browser it finds (it prefers Chrome!).
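In Python, the Open Available Browser keyword is available as the open_available_browser method of the library class. The following is a minimal sketch, assuming the rpaframework package is installed; the read_page_title helper and the optional browser argument are hypothetical, added so the function can be exercised without starting a real browser.

```python
def read_page_title(url, browser=None):
    """Open `url` in the first available browser and return the page title.

    `browser` is expected to be an RPA.Browser.Selenium.Selenium instance;
    when omitted, one is created (this requires rpaframework).
    """
    if browser is None:
        from RPA.Browser.Selenium import Selenium

        browser = Selenium()

    browser.open_available_browser(url)
    try:
        return browser.get_title()
    finally:
        # Always close the browser, even if reading the title fails.
        browser.close_all_browsers()
```

Calling read_page_title("https://example.com") starts the browser, reads the title, and closes the browser again.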
Other ways to use the browser
When you open a browser using the Open Available Browser keyword, you get a blank state: the browser is not logged into any specific account and will use default settings. This setup is sufficient for many use cases, but there are cases in which you might want to use alternative approaches:
Attaching to a running browser (Chrome only). You might have a browser profile where you have already performed a complicated login step or have plugins or configurations you want to use in your automation.
You can open the current user's default browser using the Open User Browser keyword.
You can get full control of all configuration options by using the Open Browser keyword. Then you are responsible for setting up the driver for your desired browser. Check the keyword documentation for more info.
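As an illustration of the first alternative, the sketch below attaches to a Chrome instance that is already running. It assumes Chrome was started with remote debugging enabled (for example with the --remote-debugging-port=9222 command-line flag) and that rpaframework is installed; the port number and the attach_and_report helper are assumptions for this example, so check the keyword documentation for the exact options in your version.

```python
def attach_and_report(port=9222, browser=None):
    """Attach to an already running Chrome and return its current URL.

    Assumes Chrome listens for remote debugging on `port`.
    `browser` is expected to be an RPA.Browser.Selenium.Selenium instance.
    """
    if browser is None:
        from RPA.Browser.Selenium import Selenium

        browser = Selenium()

    # Reuse the existing session, including its logins and profile.
    browser.attach_chrome_browser(port)
    return browser.get_location()
```

Because the robot reuses the existing session, any login performed manually in that Chrome window is still in effect when the automation takes over.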
While developing your web-based automation, it is useful to see the browser GUI at work. Once you have verified the process works, you might want to tell the browser to run in "headless" mode, which means no window or interaction will be shown to the user. You can always reverse this change to debug the process later. Headless mode is also what is used when the robot runs in a cloud container.
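One way to switch between the two modes without editing the robot is to read the choice from the environment. This is a sketch under stated assumptions: the ROBOT_HEADLESS variable name and both helper functions are hypothetical, while the headless argument is passed to open_available_browser from RPA.Browser.Selenium.

```python
import os


def headless_from_env(env=None, variable="ROBOT_HEADLESS"):
    """Return True when the (hypothetical) variable is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get(variable, "").strip().lower() in {"1", "true", "yes"}


def open_site(url, headless, browser=None):
    """Open `url`, showing the browser window only when `headless` is False."""
    if browser is None:
        from RPA.Browser.Selenium import Selenium

        browser = Selenium()

    browser.open_available_browser(url, headless=headless)
    return browser
```

During development you would leave the variable unset to watch the browser work, and set it to "true" for unattended runs.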
Locators: a fundamental concept
Humans can easily see and interact with elements using a mouse, keyboard, touch controls, or other input devices. Your software robot will need you to point it to the elements you want to interact with using locators. Using locators, you can tell the robot-browser which form inputs to fill, buttons to press, elements to scroll into view, etc. We have instructions covering using locators in web applications.
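To give a flavor of what locators look like in practice, here is a small sketch using the strategy:value locator syntax of SeleniumLibrary, on which RPA.Browser.Selenium is based. The element names, the CSS selector, and both helper functions are hypothetical; a real page will need locators that match its own markup.

```python
# A subset of the locator strategies supported by SeleniumLibrary.
KNOWN_STRATEGIES = {"id", "name", "css", "xpath", "class", "link"}


def locator(strategy, value):
    """Build an explicit 'strategy:value' locator string."""
    if strategy not in KNOWN_STRATEGIES:
        raise ValueError(f"Unknown locator strategy: {strategy!r}")
    return f"{strategy}:{value}"


def search(browser, term):
    """Type `term` into a search field and click the submit button.

    `browser` is expected to be an RPA.Browser.Selenium.Selenium instance
    with a page already open. The locators below are made up.
    """
    browser.input_text(locator("name", "q"), term)
    browser.click_element(locator("css", "button[type='submit']"))
```

Stable attributes such as ids and names usually make more robust locators than long XPath expressions tied to the page layout.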
Logging into web applications
Very often, logging into a web application is the first step you need to take.
For applications that require only a user name and password, you instruct the browser to interact with the login form using locators and the relevant keywords.
In more complex cases, you might need to set up two-step authentication, or attach to a browser where you have manually logged into the application first.
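The simple user name and password case can be sketched as follows. The locators (id:username, id:password, id:logout) are hypothetical and must be adapted to the application; the method names correspond to the Input Text, Input Password, Submit Form, and Wait Until Page Contains Element keywords. In a real robot, credentials should come from a secure store rather than from the source code.

```python
def log_in(browser, login_url, username, password):
    """Log into a web application through a plain login form.

    `browser` is expected to be an RPA.Browser.Selenium.Selenium instance.
    All locators are made up for illustration.
    """
    browser.open_available_browser(login_url)
    browser.input_text("id:username", username)
    # Input Password keeps the value out of the logs.
    browser.input_password("id:password", password)
    # Without a locator, the first form on the page is submitted.
    browser.submit_form()
    # Wait for an element that only exists after a successful login.
    browser.wait_until_page_contains_element("id:logout")
```

Waiting for a post-login element at the end makes the robot fail early and clearly if the credentials are rejected.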
Downloading files via the browser
If a file is publicly available on the internet and you know the URL, the easiest way to download it in the context of your software robot is probably to use the RPA Framework
Suppose the file is only available after you log into the application, or is produced by an export operation that starts when a button is clicked. In that case, you can download it using the browser, and you will probably want to change the browser's default download directory.
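A browser-based download could be sketched like this. It assumes the set_download_directory method of RPA.Browser.Selenium is called before the browser is opened, so that the setting applies to the new session; the export button locator, the *.csv pattern, and both helper functions are hypothetical. The polling helper uses only the standard library.

```python
import time
from pathlib import Path


def wait_for_download(directory, pattern="*.csv", timeout=30.0, poll=0.5):
    """Poll `directory` until a file matching `pattern` appears."""
    deadline = time.monotonic() + timeout
    while True:
        matches = sorted(Path(directory).glob(pattern))
        if matches:
            return matches[0]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No file matching {pattern} in {directory}")
        time.sleep(poll)


def export_report(browser, url, directory, export_locator="id:export"):
    """Click a (hypothetical) export button and return the downloaded file.

    `browser` is expected to be an RPA.Browser.Selenium.Selenium instance.
    """
    # Set the download directory before opening the browser.
    browser.set_download_directory(str(directory))
    browser.open_available_browser(url)
    browser.click_element(export_locator)
    return wait_for_download(directory)
```

Pointing downloads at a directory the robot controls means the file can be found reliably, regardless of the user's own browser settings.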
Learn more about browser and web automation
Now you should have a clearer idea of what browser automation means. However, there is much more to learn! One way is to follow our Beginners' course, which will guide you through solving a fun use case with browser automation. Completing the course will grant you the Robocorp Level I certificate!
Also, check out the robot examples on our Portal.