RPA.Desktop library | Robocorp documentation

RPA.Desktop

module `RPA.Desktop`

class `RPA.Desktop.Desktop`

Desktop(locators_path: Optional[str] = None)

Desktop is a cross-platform library for navigating and interacting with desktop environments. It can be used to automate applications through the same interfaces that are available to human users.

The library includes the following features:

Mouse and keyboard input emulation
Starting and stopping applications
Finding elements through image template matching
Scraping text from given regions
Taking screenshots
Clipboard management

WARNING

Windows element selectors are not currently supported by this library; they require the use of RPA.Desktop.Windows.

Installation

The basic features such as mouse and keyboard input and application control work with a default rpaframework install.

Advanced computer-vision features such as image template matching and OCR require an additional library called rpaframework-recognition.

The dependency should be added separately by specifying it in your conda.yaml, for example as rpaframework-recognition==5.0.1. If installing recognition through pip instead of conda, the OCR feature also requires tesseract.

Locating elements

To automate actions on the desktop, a robot needs to interact with various graphical elements such as buttons or input fields. The locations of these elements can be found using a feature called locators.

A locator describes the properties or features of an element. This information can be later used to locate similar elements even when window positions or states change.

The currently supported locator types are:

Name	Arguments	Description
alias	name (str)	A custom named locator from the locator database, the default.
image	path (str)	Image of an element that is matched to current screen content.
point	x (int), y (int)	Pixel coordinates as absolute position.
offset	x (int), y (int)	Pixel coordinates relative to current mouse position.
size	width (int), height (int)	Region of fixed size, around point or screen top-left
region	left (int), top (int), right (int), bottom (int)	Bounding coordinates for a rectangular region.
ocr	text (str), confidence (float, optional)	Text to find from the current screen.

A locator is defined by its type and arguments, separated by a colon. Some example usages are shown below. Note that the prefix for alias can be omitted, as it is the default type.
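The `type:arguments` form can be illustrated with a small helper. This is only a sketch: `make_locator` is a hypothetical name that is not part of RPA.Desktop, and it assumes that multiple arguments are separated by commas, as in the `offset:200,0` locator used in the examples of this documentation.

```python
def make_locator(kind: str, *args: object) -> str:
    """Build a locator string of the form 'type:arg1,arg2' (hypothetical helper)."""
    if not args:
        raise ValueError("a locator needs at least one argument")
    return f"{kind}:{','.join(str(arg) for arg in args)}"


print(make_locator("point", 50, 100))            # point:50,100
print(make_locator("offset", 200, 0))            # offset:200,0
print(make_locator("image", "input_label.png"))  # image:input_label.png
```

Because alias is the default type, a bare name with no prefix is treated as an alias locator, so no helper is needed for that case.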

You can also pass internal region objects as locators:

Locator chaining

Often it is not enough to have one locator, but instead an element is defined through a relationship of various locators. For this use case the library supports a special syntax, which we will call locator chaining.

An example of chaining:

The supported operators are:

Operator	Description
then, +	Base locator relative to the previous one
and, &&, &	Both locators should be found
or, ||, |	Either of the locators should be found
not, !	The locator should not be found

Further examples:
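Chained locators are plain strings, so they can be assembled programmatically. The sketch below is an illustration under assumptions: `chain` and `negate` are hypothetical helpers, the image file names are placeholders, and separating operators from locators with single spaces is assumed rather than confirmed here.

```python
# Binary operators listed in the table above (keyword and symbol forms).
BINARY_OPERATORS = {"then", "+", "and", "&&", "&", "or"}


def chain(*locators: str, operator: str = "then") -> str:
    """Join locators with one binary chaining operator (hypothetical helper)."""
    if operator not in BINARY_OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    if len(locators) < 2:
        raise ValueError("chaining needs at least two locators")
    return f" {operator} ".join(locators)


def negate(locator: str) -> str:
    """Prefix a locator with 'not' so that it must NOT be found."""
    return f"not {locator}"


# A region of fixed size, positioned relative to an image match.
print(chain("image:logo.png", "offset:600,0", "size:400,200", operator="+"))
# image:logo.png + offset:600,0 + size:400,200

# Either of two labels.
print(chain("image:name.png", "image:email.png", operator="or"))
# image:name.png or image:email.png

# A locator that should be absent.
print(negate("image:cookie.png"))
# not image:cookie.png
```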

Named locators

The library supports storing locators in a database, which contains all of the required fields and various bits of metadata. This provides a single source of truth, which can be updated if a website's or application's UI changes. Robot Framework scripts then need to contain only a reference to a stored locator by name.

The main way to create named locators is with VSCode.

Read more on identifying elements and crafting locators:

Keyboard and mouse

Keyboard keywords can emulate typing text, but also pressing various function keys. The name of a key is case-insensitive and spaces will be converted to underscores, i.e. the key Page Down and page_down are equivalent.
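The naming rule can be sketched in a few lines. This mirrors the behaviour described above and is not the library's own code; `normalize_key_name` is a hypothetical name.

```python
def normalize_key_name(name: str) -> str:
    """Lower-case a key name and replace spaces with underscores."""
    return name.strip().lower().replace(" ", "_")


print(normalize_key_name("Page Down"))  # page_down
print(normalize_key_name("page_down"))  # page_down
print(normalize_key_name("CTRL"))       # ctrl
```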

The following function keys are supported:

Key	Description
shift	A generic Shift key. This is a modifier.
shift_l	The left Shift key. This is a modifier.
shift_r	The right Shift key. This is a modifier.
ctrl	A generic Ctrl key. This is a modifier.
ctrl_l	The left Ctrl key. This is a modifier.
ctrl_r	The right Ctrl key. This is a modifier.
alt	A generic Alt key. This is a modifier.
alt_l	The left Alt key. This is a modifier.
alt_r	The right Alt key. This is a modifier.
alt_gr	The AltGr key. This is a modifier.
cmd	A generic command button (Windows / Command / Super key). This may be a modifier.
cmd_l	The left command button (Windows / Command / Super key). This may be a modifier.
cmd_r	The right command button (Windows / Command / Super key). This may be a modifier.
up	An up arrow key.
down	A down arrow key.
left	A left arrow key.
right	A right arrow key.
enter	The Enter or Return key.
space	The Space key.
tab	The Tab key.
backspace	The Backspace key.
delete	The Delete key.
esc	The Esc key.
home	The Home key.
end	The End key.
page_down	The Page Down key.
page_up	The Page Up key.
caps_lock	The Caps Lock key.
f1 to f20	The function keys.
insert	The Insert key. This may be undefined for some platforms.
menu	The Menu key. This may be undefined for some platforms.
num_lock	The Num Lock key. This may be undefined for some platforms.
pause	The Pause / Break key. This may be undefined for some platforms.
print_screen	The Print Screen key. This may be undefined for some platforms.
scroll_lock	The Scroll Lock key. This may be undefined for some platforms.

When controlling the mouse, several types of actions are available. The same formatting rules as for function keys apply. The actions are as follows:

Action	Description
click	Click with left mouse button
left_click	Click with left mouse button
double_click	Double click with left mouse button
triple_click	Triple click with left mouse button
right_click	Click with right mouse button

The supported mouse button types are left, right, and middle.

Examples

Python examples follow.

The library must be imported first.

from RPA.Desktop import Desktop
desktop = Desktop()

The library can open applications and interact with them through keyboard and mouse events.

def write_entry_in_accounting(entry):
    desktop.open_application("erp_client.exe")
    desktop.click(f"image:{ROBOT_ROOT}/images/create.png")
    desktop.type_text(entry)
    desktop.press_keys("ctrl", "s")
    desktop.press_keys("enter")

Targeting can currently be done using coordinates (absolute or relative), but using template matching is preferred.

def write_to_field(text):
    desktop.move_mouse("image:input_label.png")
    desktop.move_mouse("offset:200,0")
    desktop.click()
    desktop.type_text(text)
    desktop.press_keys("enter")

Elements can be found by text too.

def click_new():
    desktop.click('ocr:"New"')

It is recommended to wait for the elements to be visible before trying any interaction. You can also pass region objects as locators.

def click_new():
    region = desktop.wait_for_element("ocr:New")
    desktop.click(region)
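The wait-then-click pattern can be wrapped in a reusable function. The following is a minimal sketch: `click_when_visible` is a hypothetical helper, `desktop` is assumed to be a `Desktop` instance, and the 10 second default timeout is an arbitrary choice for illustration.

```python
def click_when_visible(desktop, locator: str, timeout: float = 10.0):
    """Wait until the locator matches, then click the region that was found."""
    # wait_for_element raises if nothing is found within the timeout,
    # so the click below only runs when a match is available.
    region = desktop.wait_for_element(locator, timeout=timeout)
    desktop.click(region)
    return region


# Usage, assuming rpaframework is installed:
#   from RPA.Desktop import Desktop
#   click_when_visible(Desktop(), "ocr:New")
```

Passing the returned region to `click` avoids running the search a second time.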

Another way to find elements is by offsetting from an anchor:

def type_notes(text):
    desktop.click_with_offset("ocr:Notes", 500, 0)
    desktop.type_text(text)

variable `ROBOT_LIBRARY_DOC_FORMAT`

ROBOT_LIBRARY_DOC_FORMAT = 'REST'

variable `ROBOT_LIBRARY_SCOPE`

ROBOT_LIBRARY_SCOPE = 'GLOBAL'

method `add_library_components`

add_library_components(library_components: List, translation: Optional[dict] = None, translated_kw_names: Optional[list] = None)

method `clear_clipboard`

clear_clipboard()

Clear the system clipboard.

method `click`

click(locator: Optional[Union[str, Locator]] = None, action: Action = Action.click)

Click at the element indicated by locator.

Parameters

locator – Locator for click position
action – Click action, e.g. right click

method `click_with_offset`

click_with_offset(locator: Optional[Union[str, Locator]] = None, x: int = 0, y: int = 0, action: Action = Action.click)

Click at a given pixel offset from the given locator.

Parameters

locator – Locator for click start position
x – Click horizontal offset in pixels
y – Click vertical offset in pixels
action – Click action, e.g. right click

method `close_all_applications`

close_all_applications()

Close all opened applications.

method `close_application`

close_application(app: Application)

Close given application. Needs to be started with this library.

Parameters: app – App instance

method `copy_to_clipboard`

copy_to_clipboard(locator: Union[str, Locator])

Copy the value of the given input element to the system clipboard.

Parameters: locator – Locator for element
Returns: Current clipboard value

method `define_region`

define_region(left: int, top: int, right: int, bottom: int)

Return a new Region with the given dimensions.

Parameters

left – Left edge coordinate.
top – Top edge coordinate.
right – Right edge coordinate.
bottom – Bottom edge coordinate.

Usage examples:

region = desktop.define_region(10, 10, 50, 30)

method `drag_and_drop`

drag_and_drop(source: Union[str, Locator], destination: Union[str, Locator], start_delay: float = 2.0, end_delay: float = 0.5)

Drag mouse from source to destination while holding the left mouse button.

Parameters

source – Locator for start position
destination – Locator for destination position
start_delay – Delay in seconds after pressing down mouse button
end_delay – Delay in seconds before releasing mouse button
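To show what the two delays control, the drag can be approximated with the lower-level mouse methods documented on this page. This is a sketch of the described behaviour, not the library's implementation; `drag_manually` is a hypothetical name and `desktop` is assumed to be a `Desktop` instance.

```python
import time


def drag_manually(desktop, source, destination,
                  start_delay: float = 2.0, end_delay: float = 0.5) -> None:
    """Drag from source to destination while holding the left mouse button."""
    desktop.move_mouse(source)
    desktop.press_mouse_button("left")
    try:
        time.sleep(start_delay)   # pause after pressing the button down
        desktop.move_mouse(destination)
        time.sleep(end_delay)     # pause before releasing the button
    finally:
        # Always release, so a failed move does not leave the button held.
        desktop.release_mouse_button("left")
```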

method `find_element`

find_element(locator: Union[str, Locator])

Find an element defined by locator, and return its position. Raises ElementNotFound if no matches were found, or MultipleElementsFound if there were multiple matches.

Parameters: locator – Locator string

method `find_elements`

find_elements(locator: Union[str, Locator])

Find all elements defined by locator, and return their positions.

Parameters: locator – Locator string
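When zero or several matches are acceptable, `find_elements` can be used in place of `find_element`. The sketch below assumes that `find_elements` returns an empty sequence when nothing matches rather than raising; `find_first_or_none` is a hypothetical helper and `desktop` is assumed to be a `Desktop` instance.

```python
def find_first_or_none(desktop, locator: str):
    """Return the first match for locator, or None if there is no match."""
    matches = desktop.find_elements(locator)
    return matches[0] if matches else None


# Usage, assuming rpaframework is installed:
#   from RPA.Desktop import Desktop
#   region = find_first_or_none(Desktop(), "image:label.png")
#   if region is not None:
#       ...
```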

method `get_clipboard_value`

get_clipboard_value()

Read current value from system clipboard.

method `get_display_dimensions`

get_display_dimensions()

Returns the dimensions of the current virtual display, which is the combined size of all physical monitors.

method `get_keyword_arguments`

get_keyword_arguments(name)

method `get_keyword_documentation`

get_keyword_documentation(name)

method `get_keyword_names`

get_keyword_names()

method `get_keyword_source`

get_keyword_source(keyword_name)

method `get_keyword_tags`

get_keyword_tags(name)

method `get_keyword_types`

get_keyword_types(name)

method `get_mouse_position`

get_mouse_position()

Get current mouse position in pixel coordinates.

method `highlight_elements`

highlight_elements(locator: Union[str, Locator])

Draw an outline around all matching elements.

method `move_mouse`

move_mouse(locator: Union[str, Locator])

Move mouse to given coordinates.

Parameters: locator – Locator for mouse position

method `move_region`

move_region(region: Region, left: int, top: int)

Return a new Region with an offset from the given region.

Parameters

region – The region to move.
left – Amount of pixels to move left/right.
top – Amount of pixels to move up/down.

Usage examples:

region = desktop.find_element('ocr:"Net Assets"')
moved_region = desktop.move_region(region, 500, 0)
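The arithmetic behind moving a region can be sketched with a stand-in type. `Box` and `move_box` are hypothetical and are not the library's Region class; the sketch assumes screen coordinates in which positive `left` values move right and positive `top` values move down.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Stand-in for a region, described by its four edge coordinates."""
    left: int
    top: int
    right: int
    bottom: int


def move_box(box: Box, left: int, top: int) -> Box:
    """Shift all edges by the same offset, so the size stays unchanged."""
    return Box(box.left + left, box.top + top, box.right + left, box.bottom + top)


print(move_box(Box(10, 10, 50, 30), 500, 0))
# Box(left=510, top=10, right=550, bottom=30)
```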

method `open_application`

open_application(name_or_path: str, *args)

Start a given application by name (if in PATH), or by path to executable.

Parameters

name_or_path – Name or path of application
args – Command line arguments for application

Returns: Application instance
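Because `close_application` needs the instance returned by `open_application`, pairing the two in a context manager keeps the cleanup reliable. This is a sketch only: `running_application` is a hypothetical helper and `desktop` is assumed to be a `Desktop` instance.

```python
from contextlib import contextmanager


@contextmanager
def running_application(desktop, name_or_path: str, *args: str):
    """Open an application and close it again when the block exits."""
    app = desktop.open_application(name_or_path, *args)
    try:
        yield app
    finally:
        # Runs even if the automation steps inside the block raise.
        desktop.close_application(app)


# Usage, assuming rpaframework is installed:
#   from RPA.Desktop import Desktop
#   desktop = Desktop()
#   with running_application(desktop, "erp_client.exe"):
#       desktop.type_text("entry")
```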

method `open_file`

open_file(path: str)

Open a file with the default application.

Parameters: path – Path to file

method `paste_from_clipboard`

paste_from_clipboard(locator: Union[str, Locator])

Paste value from system clipboard into given element.

Parameters: locator – Locator for element

method `press_keys`

press_keys(*keys: str)

Press multiple keys down simultaneously.

Parameters: keys – Keys to press

method `press_mouse_button`

press_mouse_button(button: Any = 'left')

Press down mouse button and keep it pressed.

method `read_text`

read_text(locator: Optional[str] = None, invert: bool = False)

Read text using OCR from the screen, or an area of the screen defined by the given locator.

Parameters

locator – Location of element to read text from
invert – Invert image colors, useful for reading white text on dark background

Usage examples:

label_region = desktop.find_element("image:label.png")
value_region = desktop.move_region(label_region, 100, 0)
text = desktop.read_text(value_region)

method `release_mouse_button`

release_mouse_button(button: Any = 'left')

Release mouse button that was previously pressed.

method `resize_region`

resize_region(region: Region, left: int = 0, top: int = 0, right: int = 0, bottom: int = 0)

Return a resized new Region from a given region.

Extends edges the given amount outward from the center, i.e. positive left values move the left edge to the left.

Parameters

region – The region to resize.
left – Amount of pixels to resize left edge.
top – Amount of pixels to resize top edge.
right – Amount of pixels to resize right edge.
bottom – Amount of pixels to resize bottom edge.

Usage examples:

region = desktop.find_element('ocr:"Net Assets"')
resized_region = desktop.resize_region(region, bottom=10)
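The outward-from-center rule can be made concrete with a stand-in type. `Box` and `resize_box` are hypothetical and are not the library's Region class; the sketch only encodes the rule stated above, in which positive values push each edge away from the center.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Stand-in for a region, described by its four edge coordinates."""
    left: int
    top: int
    right: int
    bottom: int


def resize_box(box: Box, left: int = 0, top: int = 0,
               right: int = 0, bottom: int = 0) -> Box:
    """Move each edge outward by the given amount (negative values shrink)."""
    return Box(box.left - left, box.top - top, box.right + right, box.bottom + bottom)


print(resize_box(Box(10, 10, 50, 30), bottom=10))
# Box(left=10, top=10, right=50, bottom=40)
print(resize_box(Box(10, 10, 50, 30), left=5))
# Box(left=5, top=10, right=50, bottom=30)
```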

method `run_keyword`

run_keyword(name, args, kwargs=None)

method `set_clipboard_value`

set_clipboard_value(text: str)

Write given value to system clipboard.
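Automation steps that use the clipboard overwrite whatever the user had copied. One way to limit that side effect is to save and restore the value around the steps. This is a sketch under assumptions: `preserved_clipboard` is a hypothetical helper, `desktop` is assumed to be a `Desktop` instance, and the clipboard is assumed to hold plain text (or nothing, represented here as `None` or an empty string).

```python
from contextlib import contextmanager


@contextmanager
def preserved_clipboard(desktop):
    """Restore the previous clipboard value when the block exits."""
    saved = desktop.get_clipboard_value()
    try:
        yield
    finally:
        if saved:
            desktop.set_clipboard_value(saved)
        else:
            # Nothing was stored before, so leave the clipboard empty.
            desktop.clear_clipboard()
```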

method `set_default_confidence`

set_default_confidence(confidence: Optional[float] = None)

Set the default template matching confidence.

Parameters: confidence – Value from 1 to 100

method `set_default_timeout`

set_default_timeout(timeout: float = 3.0)

Set the default time to wait for elements.

Parameters: timeout – Time in seconds

method `take_screenshot`

take_screenshot(path: Optional[str] = None, locator: Optional[Union[str, Locator]] = None, embed: bool = True)

Take a screenshot of the whole screen, or an element identified by the given locator.

Parameters

path – Path to screenshot. The string {index} will be replaced with an index number to avoid overwriting previous screenshots.
locator – Element to crop screenshot to
embed – Embed screenshot into Robot Framework log
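The `{index}` placeholder can be pictured as a search for the first file name that is not yet taken. The sketch below illustrates that idea only: `next_free_path` is a hypothetical helper, and both the starting index of 1 and the first-unused-number rule are assumptions, since the library's exact numbering scheme is not described here.

```python
from pathlib import Path


def next_free_path(template: str, start: int = 1) -> Path:
    """Replace '{index}' with the first number that yields an unused path."""
    if "{index}" not in template:
        return Path(template)
    index = start
    while True:
        candidate = Path(template.replace("{index}", str(index)))
        if not candidate.exists():
            return candidate
        index += 1


# Usage: next_free_path("output/screenshot-{index}.png")
```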

method `type_text`

type_text(text: str, *modifiers: str, enter: bool = False)

Type text one letter at a time.

Parameters

text – Text to write
modifiers – Modifier or function keys held during typing
enter – Press Enter / Return key after typing text

method `type_text_into`

type_text_into(locator: Union[str, Locator], text: str, clear: bool = False, enter: bool = False)

Type text at the position indicated by given locator.

Parameters

locator – Locator of input element
text – Text to write
clear – Clear element before writing
enter – Press Enter / Return key after typing text
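Filling several inputs in sequence is a natural use of `type_text_into`. The following is a minimal sketch: `fill_fields` is a hypothetical helper, `desktop` is assumed to be a `Desktop` instance, and the locators in the usage comment are placeholders.

```python
from typing import Mapping


def fill_fields(desktop, values: Mapping[str, str]) -> None:
    """Type each value into the input identified by its locator."""
    for locator, text in values.items():
        # clear=True empties the field first, so old content is not kept.
        desktop.type_text_into(locator, text, clear=True)


# Usage, assuming rpaframework is installed:
#   from RPA.Desktop import Desktop
#   fill_fields(Desktop(), {"ocr:Name": "Jane", "ocr:Notes": "Checked"})
```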

method `wait_for_element`

wait_for_element(locator: Union[str, Locator], timeout: Optional[float] = None, interval: float = 0.5)

Wait for an element defined by locator to exist, or raise a TimeoutException if none were found within timeout.

Parameters: locator – Locator string