Plan the business data transformation

The sales data system expects the traffic data to arrive in a specific format and to conform to specific rules. This is the example payload given to you by the sales system API team:

{ "country": "three-letter-country-code-here", "year": 2022, "rate": 1.23456 }

The specification from the sales system API team reads:

"The data should contain the following:"

  • The three-letter country code (string).
  • The year (integer).
  • The average rate (float).
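
To make the expected shape concrete, here is a minimal sketch of a check you could run on each outgoing record. The function name `is_valid_sales_record` is hypothetical and not part of any Robocorp library, and treating "exactly these three keys" as a requirement is an assumption read from the example payload.

```python
def is_valid_sales_record(record: dict) -> bool:
    """Return True if the record has the three expected fields with the expected types."""
    # Assumption: the payload carries exactly these keys and nothing else.
    if set(record) != {"country", "year", "rate"}:
        return False
    country, year, rate = record["country"], record["year"], record["rate"]
    return (
        isinstance(country, str)
        and len(country) == 3
        # bool is a subclass of int in Python, so rule it out explicitly.
        and isinstance(year, int)
        and not isinstance(year, bool)
        and isinstance(rate, float)
    )


if __name__ == "__main__":
    print(is_valid_sales_record({"country": "AFG", "year": 2017, "rate": 5.99219}))
```

A check like this only verifies the format; the filtering rules are a separate concern.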

"We are only interested in the latest available data for each country, where:"

  • The data concerns both females and males.
  • The rate is the average rate.
  • The average rate is below 5.0.

You take a look at the first item in the raw JSON data you downloaded from the traffic data API:

{ "Id": 25257658, "IndicatorCode": "RS_198", "SpatialDimType": "COUNTRY", "SpatialDim": "AFG", "TimeDimType": "YEAR", "TimeDim": 2017, "Dim1Type": "SEX", "Dim1": "FMLE", "Dim2Type": null, "Dim2": null, "Dim3Type": null, "Dim3": null, "DataSourceDimType": null, "DataSourceDim": null, "Value": "6.0 [5.2-6.8]", "NumericValue": 5.99219, "Low": 5.17048, "High": 6.81715, "Comments": null, "Date": "2021-02-09T16:20:04.763+01:00", "TimeDimensionValue": "2017", "TimeDimensionBegin": "2017-01-01T00:00:00+01:00", "TimeDimensionEnd": "2017-12-31T00:00:00+01:00" }

From this item, you can make a few observations:

  • The SpatialDim attribute seems to contain the three-letter country code.
  • The TimeDim attribute seems to contain the year.
  • The Dim1 attribute seems to indicate the sex (in this case, probably female).
  • The NumericValue attribute seems to be the average rate; it falls between the Low and High attributes.

Diving deeper into the raw data, you figure out that "Dim1": "BTSX" indicates "both sexes" (female and male).
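
One way to do this kind of digging is to count the distinct Dim1 values in the data. The sketch below uses only the Python standard library; the helper name `count_dim1_values` and the sample items are made up for illustration and are not real API output.

```python
from collections import Counter


def count_dim1_values(items: list[dict]) -> Counter:
    """Count how often each Dim1 code appears in the raw items."""
    return Counter(item.get("Dim1") for item in items)


# Hypothetical, trimmed-down items; real entries have many more attributes.
sample_items = [
    {"SpatialDim": "AAA", "Dim1": "FMLE"},
    {"SpatialDim": "AAA", "Dim1": "BTSX"},
    {"SpatialDim": "BBB", "Dim1": "FMLE"},
]

if __name__ == "__main__":
    for code, count in count_dim1_values(sample_items).items():
        print(code, count)
```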

OK. You need to filter the raw data so that it contains only the entries that match the requirements: for each country, the latest available average rate below 5.0 that concerns both females and males.

In addition, you need to transform the data to conform to the format expected by the sales system API.

No worries: Robocorp supports all this data wrangling! You can use the handy RPA Framework libraries, any Python libraries (such as NumPy), or write your own solution. There are no restrictions, so use whatever makes sense to you!
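
As an illustration of the "write your own solution" route, here is a minimal plain-Python sketch of the filter and transform steps. It is not the RPA Framework approach. It rests on a few assumptions: TimeDim holds the year, entries are filtered first and the latest remaining entry per country is kept, and the function name `to_sales_records` and the sample items are hypothetical.

```python
def to_sales_records(items: list[dict], max_rate: float = 5.0) -> list[dict]:
    """Filter raw traffic items and reshape them for the sales system."""
    latest: dict[str, dict] = {}
    for item in items:
        rate = item.get("NumericValue")
        # Keep only "both sexes" entries with an average rate below the limit.
        if item.get("Dim1") != "BTSX" or rate is None or rate >= max_rate:
            continue
        country = item["SpatialDim"]
        year = item["TimeDim"]  # Assumption: TimeDim is the year as an integer.
        # Remember only the most recent matching entry for each country.
        if country not in latest or year > latest[country]["year"]:
            latest[country] = {"country": country, "year": year, "rate": float(rate)}
    return sorted(latest.values(), key=lambda record: record["country"])


# Hypothetical sample data, not real API output.
sample_items = [
    {"SpatialDim": "AAA", "TimeDim": 2015, "Dim1": "BTSX", "NumericValue": 4.8},
    {"SpatialDim": "AAA", "TimeDim": 2017, "Dim1": "BTSX", "NumericValue": 4.2},
    {"SpatialDim": "AAA", "TimeDim": 2018, "Dim1": "FMLE", "NumericValue": 3.0},
    {"SpatialDim": "BBB", "TimeDim": 2016, "Dim1": "BTSX", "NumericValue": 7.1},
]

if __name__ == "__main__":
    print(to_sales_records(sample_items))
```

Note that the order of operations matters: taking the latest entry per country first and then applying the rate limit could drop countries that this version keeps, so it is worth confirming the intended reading with the sales system API team.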