Working with HTML Tables
This robot demonstrates how to work with HTML tables using Beautiful Soup and RPA Framework.
The example HTML table
We use the table at https://www.w3schools.com/html/html_tables.asp as an example:
The HTML parser library: Beautiful Soup
The robot uses the beautifulsoup4 and the rpaframework dependencies in the conda.yaml configuration file:
Beautiful Soup is a Python library for pulling data out of HTML and XML files. It works with your favorite parser to provide idiomatic ways of navigating, searching, and modifying the parse tree. It commonly saves programmers hours or days of work.
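As a minimal sketch of the navigating, searching, and modifying described above, the snippet below parses a small, hypothetical table fragment. The markup, the `customers` id, and the variable names are illustrative assumptions, not taken from the robot. It assumes the beautifulsoup4 package is installed and uses Python's built-in `html.parser`, so no extra parser package is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, used only for illustration.
html = """
<table id="customers">
  <tr><th>Company</th><th>Country</th></tr>
  <tr><td>Alfreds Futterkiste</td><td>Germany</td></tr>
</table>
"""

# Build the parse tree with Python's built-in HTML parser.
soup = BeautifulSoup(html, "html.parser")

# Searching: find the first element matching a tag name and an attribute.
table = soup.find("table", id="customers")

# Navigating: collect the text of every header cell inside that table.
headers = [th.get_text(strip=True) for th in table.find_all("th")]

# Modifying: replace the text of the first data cell in place.
first_cell = table.find("td")
first_cell.string = "Alfreds Futterkiste GmbH"

print(headers)
print(first_cell.get_text())
```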
RPA.Tables is great for manipulating, sorting, and filtering tabular data. Common use-cases are reading and writing CSV files, inspecting files in directories, or running tasks using existing Excel data.
The HTML table custom parser library
The robot includes a custom HTML parser that uses Beautiful Soup internally. Beautiful Soup is powerful and flexible, so building a customized parser does not take long.
HTML tables come in many shapes and forms. This example uses a well-formatted and straightforward table. More complex tables might require more effort to parse. Still, the idea is the same: Read and parse the HTML. Return a generic data structure that is easy to work with.
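The "read and parse the HTML, return a generic data structure" idea could be sketched roughly as follows. This is not the robot's own library code: `parse_html_table` and the sample markup are hypothetical, and the sketch assumes a simple table with one header row, no nested tables, and no `colspan` or `rowspan` attributes. It returns a list of dictionaries, one per data row.

```python
from bs4 import BeautifulSoup


def parse_html_table(html):
    """Parse the first <table> in `html` into a list of row dictionaries."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        raise ValueError("no <table> element found in the given HTML")

    columns = None
    rows = []
    for tr in table.find_all("tr"):
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if not cells:
            continue  # skip empty rows
        # Treat the first row made up only of <th> cells as the header row.
        if columns is None and not tr.find_all("td"):
            columns = cells
            continue
        rows.append(cells)

    if columns is None:
        # No header row found: fall back to positional column names.
        width = max((len(row) for row in rows), default=0)
        columns = [str(index) for index in range(width)]

    # Pair each cell with its column name.
    return [dict(zip(columns, row)) for row in rows]


# Hypothetical sample markup for trying the function out.
sample = """
<table>
  <tr><th>Company</th><th>Country</th></tr>
  <tr><td>Alfreds Futterkiste</td><td>Germany</td></tr>
  <tr><td>Centro comercial Moctezuma</td><td>Mexico</td></tr>
</table>
"""

for row in parse_html_table(sample):
    print(row)
```

A list of dictionaries is only one possible generic structure. In the robot, the parsed rows are returned as an RPA.Tables Table instead; that step is left out here so the sketch depends only on beautifulsoup4.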
html_tables.py:
The robot
The Get HTML table keyword returns the example HTML table markup from https://www.w3schools.com/html/html_tables.asp.
The Read Table From Html keyword is provided by the html_tables.py library. It parses the given HTML table and returns it as a Table structure.
The returned data structure works with all the keywords in the RPA.Tables library.
Technical information
Last updated: August 24, 2023
License: Apache License 2.0