A new option for web automation: Using the Robot Framework Browser library, based on Playwright

A bit of history 🦖

Browsers are very useful when working with the web. Millions (billions!) of people use them to complete tasks on the Internet. In addition to human users, there has always been the need to automate the browser. The use cases include testing and process automation. To automate a browser, you need a tool for doing so! 🔨

All hail king Selenium! 👑

The Selenium project has been the cornerstone of browser automation for years now. Originally developed by Jason Huggins in 2004 as an internal tool at ThoughtWorks, Selenium is now the de facto standard browser automation tool. The Selenium project has been ported to many languages, including Java, Python, C#, Ruby, JavaScript, and Kotlin. It is old and venerable!

Selenium automates browsers. That's it! - www.selenium.dev

Here's a small example of Python code using Selenium:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.expected_conditions import presence_of_element_located

with webdriver.Firefox() as driver:
    wait = WebDriverWait(driver, 10)
    driver.get("https://google.com/ncr")
    driver.find_element(By.NAME, "q").send_keys("cheese" + Keys.RETURN)
    first_result = wait.until(presence_of_element_located((By.CSS_SELECTOR, "h3>div")))
    print(first_result.get_attribute("textContent"))

Puppeteer 🎎

Selenium is not the only option for browser automation. There's Puppeteer by Google, for example:

Puppeteer is a Node library that provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default but can be configured to run full (non-headless) Chrome or Chromium. - https://github.com/puppeteer/puppeteer

Here's a small example of JavaScript code using Puppeteer:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  await page.screenshot({ path: 'example.png' });
  await browser.close();
})();

Playwright ✍️

And then there is the Playwright project by Microsoft:

Playwright enables reliable end-to-end testing for modern web apps. - https://playwright.dev/

And here's more JavaScript using Playwright:

const { webkit } = require('playwright');

(async () => {
  const browser = await webkit.launch();
  const page = await browser.newPage();
  await page.goto('http://whatsmyuseragent.org/');
  await page.screenshot({ path: `example.png` });
  await browser.close();
})();

‼ Wait a minute! Did you just copy & paste Puppeteer code there? 🤔

The code looks very similar to Puppeteer because the Playwright project started as a fork of the Puppeteer project (open-source FTW)! The fork aims to improve the browser automation APIs and to support more browsers, among other things.

Robot Framework Browser 🤖

The Robot Framework Browser library wraps and builds on top of the Playwright project, bringing all the goodness to the Robot Framework ecosystem!

Why Robot Framework Browser and Playwright? 🤖 + ✍️

Robot Framework and Selenium are good old friends. The venerable SeleniumLibrary is actively maintained and integrates with all the good things that Selenium brings to the browser automation table. But, as always, some things can be improved even further! Let's take a look at some of the selling points of the Playwright "ecosystem".

  • Always the correct version of the browser and the driver. With Selenium, you need to manage the browser and the driver executables separately. If there is a version mismatch, things might break. Playwright bundles browser executables as part of its package to avoid browser version conflicts. No more broken automation scripts after your browser decides to auto-update and the driver does not know how to talk to the newcomer! ✅

  • Auto-wait APIs. Playwright interactions auto-wait for elements to be ready. This improves reliability and simplifies automation authoring. With Selenium, you might need to explicitly wait for the elements before interacting with them. Also, this is not as simple as waiting for an element to be visible. The element needs to be actionable. This means you might need to run multiple checks on the elements before interacting with them. Playwright takes care of those checks automatically, so you don't have to!

  • Timeout-free automation. Playwright receives browser signals, like network requests, page navigations, and page load events to eliminate the need for sleep timeouts that can cause flakiness. Playwright knows when something is "done". With Selenium, you might need to explicitly wait for elements and then add timeout settings on top of that before proceeding with the execution. Lots of boilerplate code, and still a chance that you will miss some conditions that might not happen on every automation run. 🤯

  • Lean parallelization with browser contexts. Reuse a single browser instance for multiple parallelized, isolated execution environments with browser contexts. Instantiating a browser is expensive (slow). Playwright aims to minimize the need for new browser instances. This results in faster automation execution. And time is money! 🏎

  • Powerful element selectors. Playwright can rely on user-facing strings like text content and accessibility labels to select elements. You can also combine different types of selectors (CSS, XPath). With Selenium, when creating a single locator, you need to choose either the CSS or XPath strategy. With Playwright, you can use both at the same time, mixing them! This makes things like "select the parent element of this element" finally easy to achieve! 🤸‍♀️
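
To make the auto-wait idea more concrete, here is a minimal Python sketch of the kind of polling a library has to do on your behalf before it touches an element. This is a conceptual illustration only, not Playwright's implementation: the function name wait_until_actionable and the check names (visible, enabled) are hypothetical and chosen just for this example.

```python
import time
from typing import Callable, Dict


def wait_until_actionable(
    checks: Dict[str, Callable[[], bool]],
    timeout: float = 10.0,
    poll_interval: float = 0.05,
) -> None:
    """Poll until every check passes, or raise TimeoutError naming the failures."""
    deadline = time.monotonic() + timeout
    while True:
        # Re-run all checks on every iteration; any of them may flip back.
        failing = [name for name, check in checks.items() if not check()]
        if not failing:
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"not actionable, failing checks: {failing}")
        time.sleep(poll_interval)


# Hypothetical stand-in for a field that becomes enabled shortly after page load.
ready_at = time.monotonic() + 0.2
checks = {
    "visible": lambda: True,
    "enabled": lambda: time.monotonic() >= ready_at,
}

wait_until_actionable(checks, timeout=2.0)
print("field is actionable, safe to type")
```

With a Selenium-based library you would typically write this kind of waiting yourself (or pick the right explicit-wait keyword); with auto-waiting, the equivalent checks run inside every interaction.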

In a nutshell: Playwright-based browser automation promises to be more reliable, faster, and more convenient than Selenium-based solutions.

Enough of the marketing stuff. Let's cut to the beef and see what using this new thing looks like in practice, and compare it to how things are done with the good old Selenium-based library!

Installation with Robocorp

Playwright requires Node.js LTS version 12 or 14. The browser binaries for Chromium, Firefox, and WebKit work across the three platforms (Windows, macOS, Linux). The good news is that you can add all of this with a few lines in your conda.yaml file:

  • Add nodejs and robotframework-browser as dependencies.
  • Add the rfbrowser init command in the rccPostInstall section.

channels:
  - conda-forge
dependencies:
  - python=3.7.5
  - pip=20.1
  - nodejs=14.17.4
  - pip:
      - robotframework-browser==9.0.1
      - rpaframework==11.4.0
rccPostInstall:
  - rfbrowser init

Note that the rpaframework package is not required. We include it here so that we can make comparisons between Playwright and Selenium. If you use only Robot Framework Browser keywords, the robotframework-browser package is enough.

Robot Framework examples

To use the Robot Framework Browser library in Robot Framework scripts, you import the Browser library. We provide RPA.Browser.Selenium library examples for comparison and discuss the differences.

Import the library

Robocorp supports two options for browser automation. One is based on Selenium (RPA.Browser.Selenium) and the other on the Playwright-based Robot Framework Browser library (Browser). You can choose which one to use by importing the corresponding library.

Playwright:

*** Settings ***
Library    Browser

Selenium:

*** Settings ***
Library    RPA.Browser.Selenium

Open a new browser in headless mode

The default run mode with the Browser library is headless (no browser GUI). The New Page keyword opens a new page at the given URL. If no browser is open yet, one is opened first.

*** Tasks ***
Playwright: Open a browser in headless mode
    New Page    https://robotsparebinindustries.com

The default run mode with the RPA.Browser.Selenium library is GUI mode. The Open Available Browser keyword sets up the browser driver and opens the given URL. You can use the headless=True argument to force headless mode.

*** Tasks ***
Selenium: Open a browser in headless mode
    Open Available Browser    https://robotsparebinindustries.com    headless=True

Open a new browser in GUI mode

To see the browser GUI when using the Browser library, use the Open Browser keyword. We use the Browser. prefix here since the RPA.Browser.Selenium library contains a keyword with the same name.

*** Tasks ***
Playwright: Open a browser in GUI mode
    Browser.Open Browser
    New Page    https://robotsparebinindustries.com

With the RPA.Browser.Selenium library, the Open Available Browser keyword opens the browser in GUI mode by default.

*** Tasks ***
Selenium: Open a browser in GUI mode
    Open Available Browser    https://robotsparebinindustries.com

Type text into a text field

With the Browser library, the Type Text keyword takes a selector and the input text as arguments. The keyword waits for the text field to be in an actionable state automatically; no need to wait for it separately.

In this case, the text field is an input element that has an id attribute with the value of username:

<input type="text" id="username" name="username" required="" class="form-control" />

Typing input#username is enough; no need to use the css: prefix that is sometimes needed with the RPA.Browser.Selenium library to indicate a CSS selector. On the other hand, you cannot use the shorthand locators (username) supported in SeleniumLibrary.

*** Tasks ***
Playwright: Type into a text field
    New Page    https://robotsparebinindustries.com
    Type Text    input#username    maria

You could also use the ID selector here: id=username. The default selector strategy is CSS.

The RPA.Browser.Selenium library supports SeleniumLibrary shorthand locators. That's why we can just write username and it works: SeleniumLibrary looks for an element whose id or name attribute matches the given value.

*** Tasks ***
Selenium: Type into a text field
    Open Available Browser    https://robotsparebinindustries.com
    Input Text    username    maria

Type secrets and wait for elements

The Browser library keyword Type Secret types the given text, but does not log it.

In this case, the input#firstname field takes some time to load. The Browser library waits for the field automatically; no need to use explicit waits here:

*** Tasks ***
Playwright: Typing secrets and automatic waiting
    New Page    https://robotsparebinindustries.com
    Type Text    input#username    maria
    Type Secret    input#password    thoushallnotpass
    Click    button.btn-primary
    Type Text    input#firstname    First!

The equivalent keyword in RPA.Browser.Selenium is Input Password.

Here we can use shorthand locators (username, password, firstname), and instead of clicking a button to submit, we can choose to use the Submit Form keyword. We also have the Input Text When Element Is Visible keyword for explicit waiting before typing:

*** Tasks ***
Selenium: Typing secrets and explicit waiting
    Open Available Browser    https://robotsparebinindustries.com
    Input Text    username    maria
    Input Password    password    thoushallnotpass
    Submit Form
    Input Text When Element Is Visible    firstname    First!

Use complex selectors

Let's consider a more complicated case for element selection. The Swag Labs website displays a listing of products. This is one of the products from the list:

[Image: Swag Labs product card]

And this is the HTML markup for the product card:

<div class="inventory_item">
  <div class="inventory_item_img">
    <a href="./inventory-item.html?id=4" id="item_4_img_link"
      ><img class="inventory_item_img" src="./img/sauce-backpack-1200x1500.jpg"
    /></a>
  </div>
  <div class="inventory_item_label">
    <a href="./inventory-item.html?id=4" id="item_4_title_link"
      ><div class="inventory_item_name">Sauce Labs Backpack</div></a
    >
    <div class="inventory_item_desc">
      carry.allTheThings() with the sleek, streamlined Sly Pack that melds uncompromising style with unequaled laptop
      and tablet protection.
    </div>
  </div>
  <div class="pricebar">
    <div class="inventory_item_price">$29.99</div>
    <button class="btn_primary btn_inventory">ADD TO CART</button>
  </div>
</div>

We want the robot to add the Sauce Labs Backpack product to the cart by clicking the ADD TO CART button. The issue is that the product name (the product we want to add to the cart) and the add button are not hierarchically connected. They have the same parent element, though (div.inventory_item).

To click the correct add to cart button, we:

  • Find the element containing the product name (div.inventory_item_name = Sauce Labs Backpack).
  • Navigate to the parent container that contains both the name and the button (div.inventory_item).
  • Find the button element under the parent, and click it.

CSS does not support selecting elements by their textual content or selecting their parent elements. We will need to use XPath and sometimes the web element APIs to navigate the DOM. This example is derived from the Web store order processor robot example, which uses the Selenium-based RPA.Browser.Selenium library:

*** Keywords ***
Add product to cart
    ${locator}=
    ...    Set Variable
    ...    xpath://div[@class="inventory_item" and descendant::div[contains(text(), "Sauce Labs Backpack")]]
    ${product}=    Get WebElement    ${locator}
    ${add_to_cart_button}=    Set Variable    ${product.find_element_by_class_name("btn_primary")}
    Click Button    ${add_to_cart_button}

Using the Browser library, we can use chainable selectors. Combining CSS, text, and XPath selectors, the same logic might look like this:

*** Keywords ***
Add product to cart
    ${add_to_cart_button}=
    ...    Get Element
    ...    .inventory_item >> text="Sauce Labs Backpack" >> ../.. >> .btn_primary
    Click    ${add_to_cart_button}

Much shorter, very powerful, such wow! 🐶

Read more about element selectors in the Playwright documentation, and in the Playwright GitHub repository.
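
To see how such a chain reads, the following Python sketch splits the selector from the example on >> and labels each step with a selector strategy. It only illustrates the reading order and is not Playwright's parser: describe_chain is a hypothetical helper, and its rules are simplified assumptions (steps starting with // or .. are treated as XPath, steps starting with text= as text selectors, and everything else as CSS).

```python
from typing import List, Tuple


def describe_chain(selector: str) -> List[Tuple[str, str]]:
    """Split a '>>' chained selector and guess the strategy of each step."""
    steps = []
    # Naive split: assumes '>>' never appears inside a quoted text value.
    for part in selector.split(">>"):
        part = part.strip()
        if part.startswith("text="):
            strategy = "text"
        elif part.startswith(("//", "..", "xpath=")):
            strategy = "xpath"
        else:
            strategy = "css"
        steps.append((strategy, part))
    return steps


chain = '.inventory_item >> text="Sauce Labs Backpack" >> ../.. >> .btn_primary'
for strategy, part in describe_chain(chain):
    print(f"{strategy:5} {part}")
```

Each step is evaluated relative to the element matched by the previous step, which is what lets a single selector string walk down to the product name, back up to a shared ancestor, and down again to the button.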

Build keywords with JavaScript

Love your daily dose of JavaScript? Great! With Robot Framework Browser, you can implement keywords in your favorite language. Create a new JavaScript file. Let's call it module.js in this case:

async function myGoToKeyword(page, args) {
  await page.goto(args[0]);
  return await page.title();
}

exports.__esModule = true;
exports.myGoToKeyword = myGoToKeyword;

In your robot script, import the JavaScript module, and call the keyword:

*** Settings ***
Library    Browser    jsextension=${CURDIR}/module.js

*** Tasks ***
Use a JavaScript-based keyword
    New Page
    ${title}=    myGoToKeyword    https://playwright.dev
    Should Be Equal    ${title}    Playwright

Use assertions

Sometimes you want to verify (assert) expectations before proceeding with the process. Here's how you assert things with the Robot Framework Browser library:

*** Tasks ***
Playwright: Use asserts
    New Page    https://robotsparebinindustries.com
    Get Title    ==    RobotSpareBin Industries Inc. - Intranet
    Get Title    validate    value == "RobotSpareBin Industries Inc. - Intranet"
    Get Attribute    meta    charset    ==    utf-8
    Get Url    ==    https://robotsparebinindustries.com/#/

Currently supported assertion operators are:

| Operator | Alternative Operators | Description | Validate Equivalent |
|----------|-----------------------|-------------|---------------------|
| == | equal, should be | Checks if returned value is equal to expected value. | value == expected |
| != | inequal, should not be | Checks if returned value is not equal to expected value. | value != expected |
| > | greater than | Checks if returned value is greater than expected value. | value > expected |
| >= | | Checks if returned value is greater than or equal to expected value. | value >= expected |
| < | less than | Checks if returned value is less than expected value. | value < expected |
| <= | | Checks if returned value is less than or equal to expected value. | value <= expected |
| *= | contains | Checks if returned value contains expected value as substring. | expected in value |
| ^= | should start with, starts | Checks if returned value starts with expected value. | re.search(f"^{expected}", value) |
| $= | should end with, ends | Checks if returned value ends with expected value. | re.search(f"{expected}$", value) |
| matches | | Checks if given RegEx matches minimum once in returned value. | re.search(expected, value) |
| validate | | Checks if given Python expression evaluates to True. | |
| evaluate | then | When using this operator, the keyword does return the evaluated Python expression. | |
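
As a rough illustration of the "Validate Equivalent" column, the Python sketch below maps each operator to the expression listed in the table and applies it to a sample page title. The OPERATORS dictionary and the check function are hypothetical names for this example only; the library's actual assertion engine may be structured differently.

```python
import re
from typing import Any, Callable, Dict

# Each operator maps to the Python expression from the "Validate Equivalent" column.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "==": lambda value, expected: value == expected,
    "!=": lambda value, expected: value != expected,
    ">": lambda value, expected: value > expected,
    ">=": lambda value, expected: value >= expected,
    "<": lambda value, expected: value < expected,
    "<=": lambda value, expected: value <= expected,
    "*=": lambda value, expected: expected in value,
    "^=": lambda value, expected: re.search(f"^{expected}", value) is not None,
    "$=": lambda value, expected: re.search(f"{expected}$", value) is not None,
    "matches": lambda value, expected: re.search(expected, value) is not None,
}


def check(value: Any, operator: str, expected: Any) -> bool:
    """Apply one assertion operator to a returned value and an expected value."""
    return OPERATORS[operator](value, expected)


title = "RobotSpareBin Industries Inc. - Intranet"
print(check(title, "==", "RobotSpareBin Industries Inc. - Intranet"))
print(check(title, "^=", "RobotSpareBin"))
print(check(title, "$=", "Intranet"))
print(check(title, "*=", "Industries"))
print(check(title, "matches", r"Inc\. - \w+"))
```

One consequence of the equivalents shown in the table: ^= and $= go through re.search, so regular expression metacharacters in the expected value would be interpreted as such rather than matched literally.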

Limitations

  • Playwright does not support legacy Microsoft Edge or IE11 (deprecation notice). The new Microsoft Edge (on Chromium) is supported.

  • You cannot attach to an already running browser.

And so much more!

This article only scratched the surface. View the Robot Framework Browser documentation to learn about all the features, keywords, etc. Happy automation!

October 21, 2021