Creating a PDF

To make sure that her colleagues actually look at the sales data, Maria usually creates a PDF out of the sales data table in the intranet and sends it out as a company newsletter.

After all, it would be a shame if no one saw the result of all that copy-pasting!

Maria's procedure involves copying the table into Microsoft Word and then exporting it to PDF from there with some additional software. Our robot, instead, will do it all by itself, automatically.

What we want is to turn the table on the left into the PDF on the right:

HTML table and PDF file side by side

So, as always, let's start by adding a new step to our *** Tasks *** cell:

*** Tasks ***
Insert the sales data for the week and export it as a PDF
    Open The Intranet Website
    Log In
    Download The Excel File
    Fill The Form Using The Data From The Excel File
    Collect The Results
    Export The Table As A PDF

Then we add a new *** Keywords *** cell just before it:

*** Keywords ***
Export The Table As A PDF

As always, we plan to do this in steps. (Remember the poor elephant we are eating? 🐘)

Our plan for this keyword is:

  • we will isolate the part of the page that contains the sales table
  • we will assign the content of that part of the page to a variable
  • we will create a PDF with the HTML content of the table.

Getting the HTML table element out of the page

The first thing we want to do is to make sure that the table element is actually on the page when we try to "grab" it. We'll use the Wait Until Element Is Visible keyword. We just need a locator.

This is what the HTML source of the table area on the page looks like:

...
<div id="sales-results">
  <table class="table table-dark table-striped">
    ...
  </table>
</div>
...

We can see that the table is wrapped in a <div> element with an id attribute of sales-results. Our locator will then be id:sales-results.

Look at that beautiful code. It's almost like that page was created for this course! 😍

So we will modify our keyword like this:

*** Keywords ***
Export The Table As A PDF
    Wait Until Element Is Visible    id:sales-results
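
Conceptually, a "wait until" keyword keeps checking a condition until it becomes true or a timeout runs out. The Python sketch below illustrates only that idea; it is not the library's actual implementation. The wait_until helper, its default timeout and interval values, and the fake element_is_visible check are all assumptions made up for this illustration.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.2):
    """Poll `condition` until it returns True or `timeout` seconds pass.

    Hypothetical helper: the default values are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Condition not met within {timeout} seconds")
        time.sleep(interval)


# Stand-in for a real visibility check: pretend the table shows up
# the third time we look at the page.
checks = {"count": 0}


def element_is_visible():
    checks["count"] += 1
    return checks["count"] >= 3


wait_until(element_is_visible, timeout=2.0, interval=0.01)
print(f"Element became visible after {checks['count']} checks")
```

If the condition never becomes true, the sketch raises an error instead of continuing. That is the behavior we want from the robot too: it should stop rather than try to grab a table that is not there.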

Next, we want to put the actual HTML markup of that element into a variable. We can do this with the Get Element Attribute keyword (RPA.Browser library) like this:

*** Keywords ***
Export The Table As A PDF
    Wait Until Element Is Visible    id:sales-results
    ${sales_results_html}=    Get Element Attribute    id:sales-results    outerHTML

Ok, we admit this was not too easy to guess. 😅 But no panic! Let's see what's going on in this new line. We are creating a variable (${sales_results_html}=), and we are storing in it whatever the Get Element Attribute keyword returns. We pass two arguments to that keyword: the first one is the locator for the element (id:sales-results); the second is the name of the attribute of that element we want to get. In our case, we want the HTML markup of the element itself, including everything it contains, so we choose the outerHTML attribute.

You can read more about the Element API if you are interested. It gets quite technical, though. Don't say we did not warn you! 🙂
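
To make the idea of outerHTML more concrete, here is a small Python sketch that mimics it using only the standard library. It is an illustration, not what the browser or the RPA.Browser library does internally. The PAGE sample (including the made-up table row) and the OuterHtmlExtractor class are assumptions for this example, and the parser is deliberately simplistic: it does not handle void tags such as <br> written without a closing slash.

```python
from html.parser import HTMLParser

# Made-up sample page, loosely based on the markup shown earlier.
PAGE = """<main>
<h1>Sales</h1>
<div id="sales-results">
  <table class="table table-dark table-striped">
    <tr><td>Maria</td><td>1000</td></tr>
  </table>
</div>
</main>"""


class OuterHtmlExtractor(HTMLParser):
    """Collects the markup of the first element with the given id."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.depth = 0      # how deep we are inside the target element
        self.parts = []     # markup pieces, in document order
        self.done = False   # True once the target element has been closed

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0 and dict(attrs).get("id") == self.element_id:
            self.depth = 1
            self.parts.append(self.get_starttag_text())
        elif self.depth > 0:
            self.depth += 1
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth > 0:
            self.parts.append(f"</{tag}>")
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


parser = OuterHtmlExtractor("sales-results")
parser.feed(PAGE)
parser.close()

outer = "".join(parser.parts)        # the <div> itself plus everything inside it
inner = "".join(parser.parts[1:-1])  # only what is inside the <div>

print(outer)
```

The outer string starts with the <div id="sales-results"> tag and ends with its closing </div>, while inner holds only the table inside it. That is the difference between the outerHTML and innerHTML attributes.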

Alright! Now let's run our robot again:

Seeing the HTML markup of the table in the log

We can see in the log that the robot has now grabbed the HTML markup for the table:

Log containing the code for the HTML table

Creating the PDF file out of the HTML contents variable

Only one more step to go!

Now that we have the HTML contents of the table in a variable, we need to create a PDF file out of it. To do it, we will add the RPA.PDF library!

Add a new library, get new keywords... Wax on, wax off... 🥋 Practice will make us perfect! 💪

*** Settings ***
Documentation     Robot to enter weekly sales data into the RobotSpareBin Industries Intranet.
Library           RPA.Browser
Library           RPA.Excel.Files
Library           RPA.HTTP
Library           RPA.PDF

Now we can add the final line to our keyword.

*** Keywords ***
Export The Table As A PDF
    Wait Until Element Is Visible    id:sales-results
    ${sales_results_html}=    Get Element Attribute    id:sales-results    outerHTML
    Html To Pdf    ${sales_results_html}    ${CURDIR}${/}output${/}sales_results.pdf

We use the Html To Pdf keyword provided by the RPA.PDF library to create a sales_results.pdf file out of our ${sales_results_html} variable's contents, and place it again into the output folder (${CURDIR}${/}output).
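
A quick note on that path: ${CURDIR} is a built-in Robot Framework variable that holds the directory of the current robot file, and ${/} is the path separator of the operating system. The Python sketch below shows how such a path is put together. The pdf_output_path function and the "my-robot" folder name are made up for this illustration.

```python
import os


def pdf_output_path(curdir, filename="sales_results.pdf"):
    """Build the path the same way ${CURDIR}${/}output${/}<filename> does.

    Hypothetical helper: `curdir` plays the role of ${CURDIR},
    and os.sep plays the role of ${/}.
    """
    return curdir + os.sep + "output" + os.sep + filename


# "my-robot" is a made-up folder name standing in for ${CURDIR}.
print(pdf_output_path("my-robot"))
```

Because the separator comes from the operating system, the same robot code works on Windows, macOS, and Linux without changes.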

And that's it!

Here's what our robot code looks like now:

*** Settings ***
Documentation     Robot to enter weekly sales data into the RobotSpareBin Industries Intranet.
Library           RPA.Browser
Library           RPA.Excel.Files
Library           RPA.HTTP
Library           RPA.PDF

*** Keywords ***
Open The Intranet Website
    Open Available Browser    https://robotsparebinindustries.com/

*** Keywords ***
Log In
    Input Text    id:username    maria
    Input Password    id:password    thoushallnotpass
    Submit Form
    Wait Until Page Contains Element    id:sales-form

*** Keywords ***
Download The Excel File
    Download    https://robotsparebinindustries.com/SalesData.xlsx    overwrite=True

*** Keywords ***
Fill And Submit The Form For One Person
    [Arguments]    ${salesRep}
    Input Text    firstname    ${salesRep}[First Name]
    Input Text    lastname    ${salesRep}[Last Name]
    Input Text    salesresult    ${salesRep}[Sales]
    ${target_as_string}=    Convert To String    ${salesRep}[Sales Target]
    Select From List By Value    salestarget    ${target_as_string}
    Click Button    Submit

*** Keywords ***
Fill The Form Using The Data From The Excel File
    Open Workbook    SalesData.xlsx
    ${salesReps}=    Read Worksheet As Table    header=True
    Close Workbook
    FOR    ${salesRep}    IN    @{salesReps}
        Fill And Submit The Form For One Person    ${salesRep}
    END

*** Keywords ***
Collect The Results
    Screenshot    css:div.sales-summary    ${CURDIR}${/}output${/}sales_summary.png

*** Keywords ***
Export The Table As A PDF
    Wait Until Element Is Visible    id:sales-results
    ${sales_results_html}=    Get Element Attribute    id:sales-results    outerHTML
    Html To Pdf    ${sales_results_html}    ${CURDIR}${/}output${/}sales_results.pdf

*** Tasks ***
Insert the sales data for the week and export it as a PDF
    Open The Intranet Website
    Log In
    Download The Excel File
    Fill The Form Using The Data From The Excel File
    Collect The Results
    Export The Table As A PDF

Let's run the robot one final time:

Opening the PDF file

We can see a new sales_results.pdf file appear in the output directory, containing the sales data! 🎉🎉🎉

What we learned

  • You can use the Get Element Attribute keyword from the RPA.Browser library to get the actual HTML markup of any element using the outerHTML attribute.
  • The names of the libraries are case-sensitive.
  • You can create a PDF file easily starting from HTML content using the RPA.PDF library and the Html To Pdf keyword.