Create your Automated Data Product
Once you have examined the contents of one or more data products in a Space, created a data product in publish_db, re-used your code as a code asset, and scheduled its execution via a task, you can share this updating data product with other users on the platform.
We refer to an updating engineered data product as an automated data product on the platform. Automated data products use a Task as a Source of data and are updated and published whenever the Task is executed.
The subscription lineage of the automated data product is determined by the subscriptions the publisher of the data product has to the contributing data products.
Note: To perform this action you must have a technician role, a product owner role, and either an automation role or an automation administrator role.
You must also have successfully executed your code asset via a task.
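The prerequisites above can be read as a simple boolean check. The following is only an illustrative sketch, not a platform API: the role identifiers and the function name are hypothetical, and it assumes the requirement means a technician role and a product owner role plus at least one of the two automation roles.

```python
# Hypothetical role identifiers; the platform's real names may differ.
AUTOMATION_ROLES = {"automation", "automation_administrator"}
REQUIRED_ROLES = {"technician", "product_owner"}


def can_publish_automated_product(user_roles, task_has_succeeded):
    """Return True if the prerequisites described in the note are met.

    Assumed reading: (automation OR automation administrator)
    AND technician AND product owner, plus a task that has
    already been executed successfully.
    """
    roles = set(user_roles)
    has_automation = bool(roles & AUTOMATION_ROLES)  # at least one automation role
    has_required = REQUIRED_ROLES <= roles           # both other roles present
    return has_automation and has_required and bool(task_has_succeeded)
```

For example, a user holding only the technician and product owner roles would fail this check, as would a user with all of the roles whose task has not yet run successfully.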
To publish an automated data product:
Click on the Products icon on the navigation bar.
Click Create Product.
Click Automated data product.
Set Up your Data Product
Set publish parameters:
Enter the working Title for the data product. This can be changed when your data is finally packaged as a data product.
Set the Frequency - this is for information only and is part of the Metadata Metrics. The Task is used to determine the actual update frequency.
Add users to the Publish Team.
Click Save and Continue.
Define the Source of your Data
Source Details:
Search for and select the required Task. A Task only appears as a source to publish a data product once it has been run successfully.
Click Save and Continue.
Review the Format of your Data
If the source data files listed in publish_db are as expected, click Save and Continue. If not, check your input files for errors. The publish process does not complete successfully if this step is not correct.
A popup asks you to confirm the format of the data.
For Tabular file formats, the data discovery process is initiated to determine the structure of your data. The process can take several minutes to complete, regardless of the size of the data.
For Non Tabular formats, a Data Copy process begins, which moves the data from the staging bucket into a bucket that has been created on the platform for the data product being created.
Click Continue when this process has completed successfully.
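The branch taken after you confirm the format can be summarised as a small lookup. This is a sketch for illustration only: the function name and the returned labels are hypothetical and simply mirror the two processes described above, not any platform interface.

```python
def process_for_format(data_format):
    """Map a confirmed data format to the process the platform starts next.

    The labels returned here are illustrative names for the two
    processes described in the text, not real platform identifiers.
    """
    # Normalise spelling variants such as "Non-Tabular" or "non_tabular".
    normalized = data_format.strip().lower().replace("-", " ").replace("_", " ")
    if normalized == "tabular":
        # Structure of the data is determined; may take several minutes.
        return "data discovery"
    if normalized == "non tabular":
        # Data moves from the staging bucket to the product's own bucket.
        return "data copy"
    raise ValueError(f"Unknown data format: {data_format!r}")
```

In either case, you continue only after the process has completed successfully.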
Confirm the Schema of your Data
If the format of your data is Tabular, the next step of the publish process is to confirm the schema of your data and to select tables for preview. If your data is not tabular, this step is not required.
Review that the schema identified in the previous step is correct.
For each table in the data product, select it for preview if you want a random selection of its rows to be shown as a Data Preview on the data product page. You can choose a maximum of 15.
Select Save and Continue to proceed.
The validation process runs over all records within the data file(s).
If any tables have been selected to show a preview, the sample data is displayed as further confirmation that the data has been processed correctly.
You are now ready to add subscription templates to your data product.