RPA.Desktop
module RPA.Desktop
class RPA.Desktop.Desktop
Desktop is a cross-platform library for navigating and interacting with desktop environments. It can be used to automate applications through the same interfaces that are available to human users.
The library includes the following features:
- Mouse and keyboard input emulation
- Starting and stopping applications
- Finding elements through image template matching
- Scraping text from given regions
- Taking screenshots
- Clipboard management
WARNING
Windows element selectors are not currently supported; they require the use of RPA.Desktop.Windows.
Installation
The basic features such as mouse and keyboard input and application control work with a default rpaframework install.
Advanced computer-vision features such as image template matching and OCR require an additional library called rpaframework-recognition.
The dependency should be added separately by specifying it in your conda.yaml, for example as rpaframework-recognition==5.0.1. If installing recognition through pip instead of conda, the OCR feature also requires tesseract.
Locating elements
To automate actions on the desktop, a robot needs to interact with various graphical elements such as buttons or input fields. The locations of these elements can be found using a feature called locators.
A locator describes the properties or features of an element. This information can be later used to locate similar elements even when window positions or states change.
The currently supported locator types are:
Name | Arguments | Description |
---|---|---|
alias | name (str) | A custom named locator from the locator database, the default. |
image | path (str) | Image of an element that is matched to current screen content. |
point | x (int), y (int) | Pixel coordinates as absolute position. |
offset | x (int), y (int) | Pixel coordinates relative to current mouse position. |
size | width (int), height (int) | Region of fixed size, around point or screen top-left |
region | left (int), top (int), right (int), bottom (int) | Bounding coordinates for a rectangular region. |
ocr | text (str), confidence (float, optional) | Text to find from the current screen. |
A locator is defined by its type and arguments, divided by a colon.
Some example usages are shown below. Note that the prefix for alias can be omitted as it's the default type.
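To make the type:arguments syntax concrete, here is a minimal Python sketch of how such a string could be split into its parts. It is not the library's own parser: the helper name parse_locator, the comma as argument separator, and the alias name Example.Login are assumptions made for illustration.

```python
from typing import List, Tuple

# Locator types taken from the table above; "alias" is the documented default.
LOCATOR_TYPES = {"alias", "image", "point", "offset", "size", "region", "ocr"}


def parse_locator(locator: str) -> Tuple[str, List[str]]:
    """Split 'type:arg1,arg2' into (type, [args]). Hypothetical helper."""
    prefix, sep, rest = locator.partition(":")
    if sep and prefix.strip().lower() in LOCATOR_TYPES:
        kind, raw_args = prefix.strip().lower(), rest
    else:
        # No recognised prefix: treat the whole string as an alias name.
        kind, raw_args = "alias", locator
    if kind in {"alias", "image", "ocr"}:
        # Single free-form argument (a name, path, or text); keep it whole.
        # The optional ocr confidence argument is ignored in this sketch.
        return kind, [raw_args.strip()]
    # Assumption: numeric arguments are separated by commas.
    return kind, [part.strip() for part in raw_args.split(",")]


print(parse_locator("point:50,100"))
print(parse_locator("Example.Login"))
```

With this sketch, "Example.Login" and "alias:Example.Login" resolve to the same (type, arguments) pair, which mirrors the rule that the alias prefix is optional.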
You can also pass internal region objects as locators:
Locator chaining
Often it is not enough to have one locator, but instead an element is defined through a relationship of various locators. For this use case the library supports a special syntax, which we will call locator chaining.
An example of chaining:
The supported operators are:
Operator | Description |
---|---|
then, + | Base locator relative to the previous one |
and, &&, & | Both locators should be found |
or, \|\|, \| | Either of the locators should be found |
not, ! | The locator should not be found |
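The operator aliases in the table can be read as different spellings of four canonical operations. The following Python sketch splits a chained locator into locators and canonical operator names. It is an illustration only, not the library's parser: the function name tokenize_chain is hypothetical, whitespace-separated operators are assumed, and the symbolic spellings of or are assumed by analogy with and.

```python
import shlex
from typing import List

# Operator spellings from the table above, mapped to one canonical name each.
OPERATORS = {
    "then": "then", "+": "then",
    "and": "and", "&&": "and", "&": "and",
    "or": "or", "||": "or", "|": "or",  # symbolic forms of "or" are assumed
    "not": "not", "!": "not",
}


def tokenize_chain(chain: str) -> List[str]:
    """Split a chained locator into locators and canonical operator names."""
    tokens = []
    # shlex keeps quoted text such as ocr:"Net Assets" together as one token
    # (the quotes themselves are dropped in the result).
    for word in shlex.split(chain):
        tokens.append(OPERATORS.get(word.lower(), word))
    return tokens


print(tokenize_chain("image:Login.png + offset:0,-4"))
```

A real parser would also have to handle operator precedence and operators written without surrounding spaces; this sketch deliberately leaves both out.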
Further examples:
Named locators
The library supports storing locators in a database, which contains all of the required fields and various bits of metadata. This enables having one source of truth, which can be updated if a website's or application's UI changes. Robot Framework scripts can then contain only a reference to a stored locator by name.
The main way to create named locators is with VSCode.
Read more on identifying elements and crafting locators:
- Desktop automation and RPA
- How to find user interface elements using locators and keyboard shortcuts in Windows applications
Keyboard and mouse
Keyboard keywords can emulate typing text, but also pressing various function keys.
The name of a key is case-insensitive and spaces will be converted to underscores, i.e. the key Page Down and page_down are equivalent.
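The normalization rule can be expressed in one line of Python. This is a sketch of the rule as stated, with a hypothetical helper name, and not the library's actual code.

```python
def normalize_key(name: str) -> str:
    """Lower-case a key name and replace spaces with underscores."""
    return name.strip().lower().replace(" ", "_")


# Both spellings normalize to the same key name.
print(normalize_key("Page Down"), normalize_key("page_down"))
```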
The following function keys are supported:
Key | Description |
---|---|
shift | A generic Shift key. This is a modifier. |
shift_l | The left Shift key. This is a modifier. |
shift_r | The right Shift key. This is a modifier. |
ctrl | A generic Ctrl key. This is a modifier. |
ctrl_l | The left Ctrl key. This is a modifier. |
ctrl_r | The right Ctrl key. This is a modifier. |
alt | A generic Alt key. This is a modifier. |
alt_l | The left Alt key. This is a modifier. |
alt_r | The right Alt key. This is a modifier. |
alt_gr | The AltGr key. This is a modifier. |
cmd | A generic command button (Windows / Command / Super key). This may be a modifier. |
cmd_l | The left command button (Windows / Command / Super key). This may be a modifier. |
cmd_r | The right command button (Windows / Command / Super key). This may be a modifier. |
up | An up arrow key. |
down | A down arrow key. |
left | A left arrow key. |
right | A right arrow key. |
enter | The Enter or Return key. |
space | The Space key. |
tab | The Tab key. |
backspace | The Backspace key. |
delete | The Delete key. |
esc | The Esc key. |
home | The Home key. |
end | The End key. |
page_down | The Page Down key. |
page_up | The Page Up key. |
caps_lock | The Caps Lock key. |
f1 to f20 | The function keys. |
insert | The Insert key. This may be undefined for some platforms. |
menu | The Menu key. This may be undefined for some platforms. |
num_lock | The Num Lock key. This may be undefined for some platforms. |
pause | The Pause / Break key. This may be undefined for some platforms. |
print_screen | The Print Screen key. This may be undefined for some platforms. |
scroll_lock | The Scroll Lock key. This may be undefined for some platforms. |
When controlling the mouse, several types of actions are available. The same formatting rules as for function keys apply. The actions are as follows:
Action | Description |
---|---|
click | Click with left mouse button |
left_click | Click with left mouse button |
double_click | Double click with left mouse button |
triple_click | Triple click with left mouse button |
right_click | Click with right mouse button |
The supported mouse button types are left, right, and middle.
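One way to read the action table is as a mapping from an action name to a mouse button and a click count. The Python sketch below encodes that reading. The names ACTIONS and resolve_action are hypothetical, and the library may represent actions differently internally.

```python
from typing import Tuple

# (button, number of clicks) for each action in the table above.
ACTIONS = {
    "click": ("left", 1),
    "left_click": ("left", 1),
    "double_click": ("left", 2),
    "triple_click": ("left", 3),
    "right_click": ("right", 1),
}


def resolve_action(action: str) -> Tuple[str, int]:
    """Normalize an action name and look up its button and click count."""
    # Same formatting rules as for key names: case-insensitive, spaces
    # become underscores.
    key = action.strip().lower().replace(" ", "_")
    if key not in ACTIONS:
        raise ValueError(f"Unknown mouse action: {action!r}")
    return ACTIONS[key]


print(resolve_action("Double Click"))
```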
Examples
Both Robot Framework and Python examples follow.
The library must be imported first.
The library can open applications and interact with them through keyboard and mouse events.
Targeting can currently be done using coordinates (absolute or relative), but template matching is preferred.
Elements can be found by text too.
It is recommended to wait for the elements to be visible before trying any interaction. You can also pass region objects as locators.
Another way to find elements by offsetting from an anchor:
variable ROBOT_LIBRARY_DOC_FORMAT
variable ROBOT_LIBRARY_SCOPE
method add_library_components
method clear_clipboard
Clear the system clipboard.
method click
Click at the element indicated by locator.
Parameters
- locator – Locator for click position
- action – Click action, e.g. right click
method click_with_offset
Click at a given pixel offset from the given locator.
Parameters
- locator – Locator for click start position
- x – Click horizontal offset in pixels
- y – Click vertical offset in pixels
- action – Click action, e.g. right click
method close_all_applications
Close all opened applications.
method close_application
Close given application. Needs to be started with this library.
- Parameters: app – App instance
method copy_to_clipboard
Copy the value of the given input element to the system clipboard.
- Parameters: locator – Locator for element
- Returns: Current clipboard value
method define_region
Return a new Region with the given dimensions.
Parameters
- left – Left edge coordinate.
- top – Top edge coordinate.
- right – Right edge coordinate.
- bottom – Bottom edge coordinate.
Usage examples:
method drag_and_drop
Drag mouse from source to destination while holding the left mouse button.
Parameters
- source – Locator for start position
- destination – Locator for destination position
- start_delay – Delay in seconds after pressing down mouse button
- end_delay – Delay in seconds before releasing mouse button
method find_element
Find an element defined by locator, and return its position.
Raises ElementNotFound if no matches were found, or MultipleElementsFound if there were multiple matches.
- Parameters: locator – Locator string
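The "exactly one match" contract described above can be sketched in Python as follows. The exception classes here are local stand-ins that only share their names with the ones mentioned in the text, and exactly_one is a hypothetical helper, not part of the library.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


class ElementNotFound(Exception):
    """Local stand-in: no element matched the locator."""


class MultipleElementsFound(Exception):
    """Local stand-in: more than one element matched the locator."""


def exactly_one(matches: Sequence[T], locator: str) -> T:
    """Return the single match, or raise if there are zero or several."""
    if not matches:
        raise ElementNotFound(f"No matches for {locator!r}")
    if len(matches) > 1:
        raise MultipleElementsFound(
            f"{len(matches)} matches for {locator!r}, expected exactly one"
        )
    return matches[0]


print(exactly_one([(50, 100)], "point:50,100"))
```

When several matches are acceptable, find_elements is the method to use instead, since it returns all positions rather than raising.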
method find_elements
Find all elements defined by locator, and return their positions.
- Parameters: locator – Locator string
method get_clipboard_value
Read current value from system clipboard.
method get_display_dimensions
Returns the dimensions of the current virtual display, which is the combined size of all physical monitors.
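One plausible reading of "combined size" is the bounding box that encloses every monitor. The Python sketch below computes that box from a list of monitor rectangles. The function name, the (left, top, right, bottom) tuple layout, and the bounding-box interpretation itself are assumptions, not confirmed library behaviour.

```python
from typing import Iterable, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom)


def virtual_display(monitors: Iterable[Rect]) -> Rect:
    """Return the bounding box enclosing all monitor rectangles."""
    rects = list(monitors)
    if not rects:
        raise ValueError("At least one monitor is required")
    left = min(r[0] for r in rects)
    top = min(r[1] for r in rects)
    right = max(r[2] for r in rects)
    bottom = max(r[3] for r in rects)
    return (left, top, right, bottom)


# Two 1920x1080 monitors placed side by side.
print(virtual_display([(0, 0, 1920, 1080), (1920, 0, 3840, 1080)]))
```

Note that a monitor placed to the left of or above the primary one can produce negative coordinates in such a box.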
method get_keyword_arguments
method get_keyword_documentation
method get_keyword_names
method get_keyword_source
method get_keyword_tags
method get_keyword_types
method get_mouse_position
Get current mouse position in pixel coordinates.
method highlight_elements
Draw an outline around all matching elements.
method move_mouse
Move mouse to given coordinates.
- Parameters: locator – Locator for mouse position
method move_region
Return a new Region with an offset from the given region.
Parameters
- region – The region to move.
- left – Amount of pixels to move left/right.
- top – Amount of pixels to move up/down.
Usage examples:
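The arithmetic behind moving a region can be illustrated with a small stand-in class. This Python sketch is not the library's Region type: the dataclass, the helper name shift_region, and the sign convention (positive values move right and down, as in screen coordinates) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Stand-in for a rectangular screen region."""
    left: int
    top: int
    right: int
    bottom: int


def shift_region(region: Region, left: int, top: int) -> Region:
    """Return a new Region offset horizontally by left and vertically by top."""
    # Both edges on each axis move by the same amount, so the size is kept.
    return Region(
        region.left + left,
        region.top + top,
        region.right + left,
        region.bottom + top,
    )


original = Region(10, 10, 50, 30)
print(shift_region(original, 500, 0))
```

The original region is left untouched, matching the description that a new Region is returned.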
method open_application
Start a given application by name (if in PATH), or by path to executable.
Parameters
- name_or_path – Name or path of application
- args – Command line arguments for application
- Returns: Application instance
method open_file
Open a file with the default application.
- Parameters: path – Path to file
method paste_from_clipboard
Paste value from system clipboard into given element.
- Parameters: locator – Locator for element
method press_keys
Press multiple keys down simultaneously.
- Parameters: keys – Keys to press
method press_mouse_button
Press down mouse button and keep it pressed.
method read_text
Read text using OCR from the screen, or an area of the screen defined by the given locator.
Parameters
- locator – Location of element to read text from
- invert – Invert image colors, useful for reading white text on dark background
Usage examples:
method release_mouse_button
Release mouse button that was previously pressed.
method resize_region
Return a resized new Region from a given region.
Extends edges the given amount outward from the center, i.e. positive left values move the left edge to the left.
Parameters
- region – The region to resize.
- left – Amount of pixels to resize left edge.
- top – Amount of pixels to resize top edge.
- right – Amount of pixels to resize right edge.
- bottom – Amount of pixels to resize bottom edge.
Usage examples:
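The "outward from the center" rule means that the left and top edges move in the negative direction while the right and bottom edges move in the positive direction. The Python sketch below shows that arithmetic on a stand-in class. The Region dataclass and the helper name grow_region are hypothetical and not the library's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Stand-in for a rectangular screen region."""
    left: int
    top: int
    right: int
    bottom: int


def grow_region(
    region: Region, left: int = 0, top: int = 0, right: int = 0, bottom: int = 0
) -> Region:
    """Return a new Region with each edge pushed outward by the given amount."""
    return Region(
        region.left - left,      # positive value moves the left edge left
        region.top - top,        # positive value moves the top edge up
        region.right + right,    # positive value moves the right edge right
        region.bottom + bottom,  # positive value moves the bottom edge down
    )


print(grow_region(Region(10, 10, 50, 30), bottom=10))
```

Negative amounts shrink the region instead, since they pull the corresponding edge toward the center.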
method run_keyword
method set_clipboard_value
Write given value to system clipboard.
method set_default_confidence
Set the default template matching confidence.
- Parameters: confidence – Value from 1 to 100
method set_default_timeout
Set the default time to wait for elements.
- Parameters: timeout – Time in seconds
method take_screenshot
Take a screenshot of the whole screen, or an element identified by the given locator.
Parameters
- path – Path to screenshot. The string {index} will be replaced with an index number to avoid overwriting previous screenshots.
- locator – Element to crop screenshot to
- embed – Embed screenshot into Robot Framework log
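The {index} placeholder can be understood as "use the first number that does not collide with an existing file". The Python sketch below implements that idea. The function name next_screenshot_path is hypothetical, and starting the numbering at 1 is an assumption, since the text does not state the first index.

```python
from pathlib import Path


def next_screenshot_path(template: str) -> Path:
    """Replace {index} with the first number whose file does not exist yet."""
    if "{index}" not in template:
        # No placeholder: the path is used as-is and may be overwritten.
        return Path(template)
    index = 1  # Assumption: numbering starts at 1.
    while True:
        candidate = Path(template.replace("{index}", str(index)))
        if not candidate.exists():
            return candidate
        index += 1


print(next_screenshot_path("screenshot-{index}.png"))
```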
method type_text
Type text one letter at a time.
Parameters
- text – Text to write
- modifiers – Modifier or function keys held during typing
- enter – Press Enter / Return key after typing text
method type_text_into
Type text at the position indicated by given locator.
Parameters
- locator – Locator of input element
- text – Text to write
- clear – Clear element before writing
- enter – Press Enter / Return key after typing text
method wait_for_element
Wait for an element defined by locator to exist, or raise a TimeoutException if none were found within timeout.
- Parameters: locator – Locator string
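Waiting for an element is essentially a polling loop with a deadline. The Python sketch below shows the pattern with a generic find callable standing in for the element search. The TimeoutException class is a local stand-in, and the function name wait_for, the default timeout, and the polling interval are assumptions rather than the library's actual values.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class TimeoutException(Exception):
    """Local stand-in: nothing was found before the deadline."""


def wait_for(
    find: Callable[[], Sequence[T]],
    timeout: float = 10.0,   # assumed default, in seconds
    interval: float = 0.5,   # assumed polling interval, in seconds
) -> Sequence[T]:
    """Call find() repeatedly until it returns matches or timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        matches = find()
        if matches:
            return matches
        if time.monotonic() >= deadline:
            raise TimeoutException(f"No element found within {timeout} seconds")
        time.sleep(interval)


# Simulated search that succeeds on the third attempt.
attempts = {"count": 0}


def fake_find():
    attempts["count"] += 1
    return ["element"] if attempts["count"] >= 3 else []


print(wait_for(fake_find, timeout=1.0, interval=0.01))
```

Using a monotonic clock for the deadline keeps the loop correct even if the system time changes while waiting.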