Data Pipeline in Control Room
Data Pipeline for Control Room is a foundation and a set of features that let you scale, manage, and analyze your processes in a flexible manner. The first version of Data Pipeline is already available in Control Room, and more functionality will be added over the course of 2021.
This page gives an overview of the Data Pipeline vision, which Data Pipeline features are currently available, and what is planned for future releases.
The vision for Data Pipeline
Our vision for Data Pipeline, in short, is "Built-in robustness, connectivity, scalability, analytics, and reporting for ANY automation use case."
Multiple aspects need to be taken into account when building business-critical RPA processes. Getting the robot to run is only part of the bigger picture, and typical questions that arise during development and delivery include:
- How do I input data to the robot or get out results?
- How do I scale the workload efficiently over multiple execution environments?
- How do I send the failed items to be processed manually?
- How do I automatically re-run only the failed data items? I don't want to run everything again.
- How do I integrate system X with process Y?
- How can I be sure that the robot has processed the right information?
- What happens when the execution environment crashes mid-run before the log file gets generated?
- How can I be sure I don't lose any data if the robot crashes?
- How do I measure the value generated by the process?
- My process executes 500,000 transactions a month. How can I stay on top of what is happening?
- I would like to know how many X, Y, and Z the process has gone through today. How do I do that?
Typically, finding solutions to all of these questions would take a lot of time and resources. Instead of developing custom libraries, setting up databases, monitoring systems, and analytics software, and then figuring out how to use and maintain all of that in the future, the Control Room Data Pipeline, in conjunction with RPA Framework, provides a standard, best-practice answer to all of these questions with minimal effort from the user.
A core concept of the Data Pipeline is the work item. Work items are the entities used in Control Room to store any kind of data meant to be processed by robots. By handling every piece of data in the process on a work-item basis, the state of processing is always kept up to date. Even if the robot crashes unexpectedly, the state of each item remains known, and items can be re-processed or handled manually.
Work items can be any individual pieces of data that your process handles — for example, invoices, URLs, or customer support tickets. Each work item can contain both input metadata for robots processing them as well as output data and output files.
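As a rough illustration of the concept, a work item can be thought of as an input payload plus output data, attached files, and a processing state. The class and field names below are hypothetical and for illustration only; they are not part of the Control Room API:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any

class State(Enum):
    """Processing state of a work item (illustrative, not the real schema)."""
    PENDING = "pending"
    DONE = "done"
    FAILED = "failed"

@dataclass
class WorkItem:
    # Input metadata for the robot, e.g. an invoice number or a URL
    payload: dict[str, Any] = field(default_factory=dict)
    # Output data produced during processing
    output: dict[str, Any] = field(default_factory=dict)
    # Names of attached input or output files
    files: list[str] = field(default_factory=list)
    state: State = State.PENDING

# A robot would receive the item, process it, and record the result:
invoice = WorkItem(payload={"invoice_id": "INV-1001", "amount": 249.90})
invoice.output["status"] = "approved"
invoice.state = State.DONE
```

Because the state lives on the item itself rather than only in robot memory, a crashed run leaves behind items whose states still tell you exactly what was and was not processed.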
The Data Pipeline also allows splitting work across multiple execution environments in a one-to-many style of processing, and can be used to process huge volumes of data efficiently. For example, some parts of a complex enterprise process can be parallelized across hundreds of Control Room containers, making it possible to take advantage of cloud scale and efficiency without having to host your own infrastructure.
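The one-to-many fan-out can be sketched locally. In the sketch below, local threads stand in for Control Room containers, and the `process` function and item fields are hypothetical; the point is that failed items are kept separately so that only they need to be re-run:

```python
from concurrent.futures import ThreadPoolExecutor

def process(item):
    """Pretend to process one work item; reject invalid input data."""
    if item.get("amount", 0) <= 0:
        raise ValueError(f"invalid amount in {item}")
    return {**item, "state": "done"}

items = [
    {"id": 1, "amount": 100},
    {"id": 2, "amount": -5},   # will fail and remain re-runnable
    {"id": 3, "amount": 42},
]

done, failed = [], []
with ThreadPoolExecutor(max_workers=4) as pool:
    # Submit every item, then collect results in submission order
    for item, future in [(i, pool.submit(process, i)) for i in items]:
        try:
            done.append(future.result())
        except ValueError:
            failed.append({**item, "state": "failed"})

# After fixing the data, only the items in `failed` need a re-run.
```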
Business-user-friendly UIs
Business-user-friendly UIs make inputting data, such as Excel files, for robots to process a breeze, while extensive APIs help developers implement and debug all kinds of complex data flows. Data integrations can be done in a standard way, while also allowing different trigger methods for the processing.
Measuring and monitoring
Measuring the value generated and monitoring that everything works correctly are also important aspects of RPA. The Data Pipeline provides standard metrics such as error rates, run times, and value generated as time-series graphs, but also allows digging deeper into the process when the contents of work items are mapped. Process mining, built in!
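As an illustration, metrics of this kind can be derived directly from per-item run records. The record fields below are hypothetical and do not reflect the actual Control Room schema:

```python
from statistics import mean

# Hypothetical per-item run records (illustrative field names)
records = [
    {"state": "done",   "run_time_s": 2.1, "value_eur": 5.0},
    {"state": "failed", "run_time_s": 0.4, "value_eur": 0.0},
    {"state": "done",   "run_time_s": 1.9, "value_eur": 5.0},
    {"state": "done",   "run_time_s": 2.4, "value_eur": 5.0},
]

# Share of items that ended in a failed state
error_rate = sum(r["state"] == "failed" for r in records) / len(records)
# Average processing time per item
avg_run_time = mean(r["run_time_s"] for r in records)
# Total value generated by the processed items
total_value = sum(r["value_eur"] for r in records)
```

Bucketing the same records by timestamp would yield the time-series view mentioned above.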
Current Data Pipeline functionality
Control Room currently supports minimal functionality for visualizing work items and tracking Data Pipeline progress, as well as for passing work items to robots when running them. This gives you a glimpse of the Data Pipeline and lets you start experimenting with work items.
The full functionality will be unleashed when more features are launched according to the roadmap (see "Planned functionality" below).
Related development documentation
To add or read work item data in your robots, use the RPA.Robocloud.Items library.
- Development guide for using the RPA.Robocloud.Items library
- Keyword documentation for RPA.Robocloud.Items
- Cloud Tutorial Robot — A simple tutorial robot for storing data in a work item on the Robocorp Portal
More material on Data Pipeline will be added to the documentation in the coming months.
Below is a list of some of the functionality planned for release during 2021.
- Business-user-friendly UI for inputting and outputting data, such as Excel files, to and from processes
- API offering CRUD operations on the work items in the pipeline
- Self-hosted environment groups
- Custom work item storages that can be mapped by the user
- Looping work items from the queue
- Automated exception handling
- Different exception types, e.g., system, data, or environment errors
- Persistence of data in the cloud, instead of in robot memory
- Visual, business-user-friendly dashboards of standard metrics and KPIs
- Customized analytics on how processes run
- Process mining through work item analytics attributes
- Process start and end notifications with information about processed items
- Processing history and work item metadata
For more information on the Data Pipeline, how to use it, and on the roadmap, please contact Robocorp Sales and Customer Success.