Cross-platform desktop automation with Robocorp and image templates (Windows, Linux, macOS)
Cross-platform image-based automation (Windows, Linux, and macOS). Sounds too good to be true? Let us introduce Robocorp image template matching!

When working with applications, human beings are decent at clicking on user interface elements. We have eyes for scraping visual information from the screen (buttons, input fields, calendar widgets...) and a brain to process that into a model we can work with.

Robots can interact with applications when we point them to the things we would like them to click on. A common way to refer to an application's elements is to use textual locators. These can be, for example, IDs (username), names (first-name-field), or positions in the structural hierarchy (the first input element on the first form).
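To make the idea of a textual locator a bit more concrete, here is a minimal Python sketch of how a string such as `id:username` could be split into a lookup strategy and a value. The `strategy:value` convention, the `Locator` type, and the `parse_locator` helper are assumptions made for illustration. They are not taken from Robocorp's libraries.

```python
from typing import NamedTuple


class Locator(NamedTuple):
    """A textual locator split into its parts (hypothetical type)."""
    strategy: str  # how to look the element up, e.g. "id" or "name"
    value: str     # what to look for, e.g. "username"


def parse_locator(text: str, default_strategy: str = "id") -> Locator:
    """Split 'strategy:value' into a Locator (hypothetical helper)."""
    strategy, separator, value = text.partition(":")
    if not separator:
        # No "strategy:" prefix was given, so fall back to the default.
        return Locator(default_strategy, text.strip())
    return Locator(strategy.strip().lower(), value.strip())


print(parse_locator("id:username"))
print(parse_locator("name:first-name-field"))
```

A real library supports many more strategies, but the principle is the same: the text tells the robot both how to search and what to search for.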

The instructions for the robot, using Robot Framework syntax, might look like this:

Sometimes an application does not expose any textual locators for the robot. One example is the 3D button on the map view of the Maps application on macOS.

There is a keyboard shortcut for the 3D view, but let’s still use the button as an example! 😅

Fret not! Robocorp Lab and Image Templates to the rescue. When we cannot describe the element's locator as text, we can capture an image of the element, give it a textual alias (Maps.3D), and tell the robot to find that image on the screen and click it.

The instructions for the robot, using Robot Framework syntax, might look like this:

Wow! That is powerful. You can click on anything you can describe as an image. And all this is cross-platform. Yes! Works on Windows, Linux, and macOS! 🤯

The secret sauce behind the cross-platform image template support is the pynput, tkinter, mss, opencv-python, and numpy Python libraries. Open-source FTW! 🤓
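Curious what "finding an image on the screen" boils down to? Below is a deliberately naive, pure-Python sketch of template matching on tiny grayscale images. It is not the code Robocorp uses (that work is done by opencv-python and numpy, which are far faster and more robust). The `find_template` and `center_of` names and the `tolerance` parameter are hypothetical and only serve the illustration.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # grayscale pixels: rows of values between 0 and 255


def find_template(
    screen: Image, template: Image, tolerance: int = 0
) -> Optional[Tuple[int, int]]:
    """Return (x, y) of the first top-left corner where the template fits.

    A position matches when every pixel differs by at most `tolerance`.
    Returns None when the template is not found.
    """
    template_h, template_w = len(template), len(template[0])
    screen_h, screen_w = len(screen), len(screen[0])
    # Slide the template over every position where it fits on the screen.
    for y in range(screen_h - template_h + 1):
        for x in range(screen_w - template_w + 1):
            if all(
                abs(screen[y + dy][x + dx] - template[dy][dx]) <= tolerance
                for dy in range(template_h)
                for dx in range(template_w)
            ):
                return (x, y)
    return None


def center_of(match: Tuple[int, int], template: Image) -> Tuple[int, int]:
    """Return the point a robot would click: the middle of the matched area."""
    x, y = match
    return (x + len(template[0]) // 2, y + len(template) // 2)


screen = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
button = [
    [9, 8],
    [7, 6],
]

match = find_template(screen, button)
print(match)
if match is not None:
    print(center_of(match, button))
```

The `tolerance` parameter hints at why real implementations work with a confidence level rather than exact equality: the same button rarely renders pixel for pixel identically on every screen.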

The template images you save with Robocorp Lab are stored in the .images directory in your robot directory. The locators.json file in the robot directory works as a “database” for the locators. Here is an example of the locators.json file:

Having the images and the locator file stored in the robot directory makes it possible for other developers to use the same locators when they work on the robot! Because locators.json is plain-text JSON, changes to it are easy to review and merge with Git.
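As a rough sketch of how an alias could be resolved through such a locator "database", the Python snippet below looks up Maps.3D and returns the path of its template image. The JSON contents, the `type` and `path` field names, the image file name, and the `resolve_image` helper are all assumptions made for illustration. The actual schema written by Robocorp Lab may differ.

```python
import json
from pathlib import Path

# Hypothetical contents of locators.json. The field names are assumptions.
EXAMPLE_LOCATORS = """
{
    "Maps.3D": {
        "type": "image",
        "path": ".images/Maps.3D.png"
    }
}
"""


def resolve_image(alias: str, locators: dict, robot_dir: Path) -> Path:
    """Return the template image path for an alias (hypothetical helper)."""
    entry = locators[alias]  # raises KeyError for an unknown alias
    if entry.get("type") != "image":
        raise ValueError(f"{alias!r} is not an image locator")
    # Image paths are assumed to be relative to the robot directory.
    return robot_dir / entry["path"]


locators = json.loads(EXAMPLE_LOCATORS)
print(resolve_image("Maps.3D", locators, Path("my-robot")))
```

Since the alias is the only thing the robot code refers to, a teammate can recapture the image and update the file without touching the robot's instructions.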

There you have it. Cross-platform image-based automation powered by Robocorp! Learn more about image template matching in the article "Locating and targeting user interface elements in Robocorp Lab".