Having examined the contents of one or more data products in a Space, you can create your own custom data product by combining multiple inputs. You can then share it with other users on the platform.

On the platform, a custom data product is referred to as an engineered data product. Engineered data products use a Space as a Source of data.

Note: You must have a technician role and a product manager role to perform this action.

You must have saved the results of your analysis as a table in the publish_db of the Space.

As a simple example, execute this SQL (via Hue) in a Space to create a table for a custom data product:

DROP TABLE IF EXISTS publish_db.<example table>;

CREATE TABLE publish_db.<example table> AS
SELECT COUNT(*) AS row_count FROM <dbname.tablename>;
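
The example above writes a single row count. Because an engineered data product usually combines multiple inputs, a slightly fuller sketch may be useful. It assumes Hive-style SQL in Hue, and every database, table and column name in it (sales_db.orders, crm_db.customers, publish_db.customer_order_summary and their columns) is hypothetical and should be replaced with the names in your own Space.

```sql
-- Hypothetical names throughout: replace the databases, tables and columns
-- with the ones available in your own Space.
DROP TABLE IF EXISTS publish_db.customer_order_summary;

-- Combine two inputs into one table in publish_db.
CREATE TABLE publish_db.customer_order_summary AS
SELECT
    c.customer_id,
    c.region,
    COUNT(o.order_id)  AS order_count,   -- 0 for customers with no orders
    SUM(o.order_total) AS total_spend    -- NULL for customers with no orders
FROM crm_db.customers c
LEFT JOIN sales_db.orders o
    ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.region;
```

Giving every output column an explicit alias keeps the column names readable when the table is later reviewed in the publish flow.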

  1. Click on the Products icon on the navigation bar.

  2. Click on Create Product.

  3. Click on Engineered data product to begin publishing a static, non-updating data product created from a Space.
    The New Data Product page appears.

Set up your Data Product

  1. Set publish parameters:

    1. Enter the working Title for the data product. This can be changed when your data is finally packaged as a data product.

  2. Add users to your Publish Team.

  3. Select Save and Continue.

Define the Source of your Data

  1. Source Details:

    1. Select the name of the Space in which the engineered data product has been created.

    2. Do not update the Data Location as the publish process always looks for your table(s) in the publish_db.

    3. View the Source Data Product History to find out if your engineered data product consists of any other contributing data products.

      1. Make sure you have the necessary subscriptions to any data products where license lineage is enforced.

      2. You do not need any subscriptions to data products where license lineage is not enforced.

    4. Your Space Collaborators are displayed for information.

  2. Click Save and Continue.

Review the Format of your Data

If your data is Tabular, the publish process attempts to determine its format.

  1. If the source data files listed in publish_db are as expected, click Save and Continue. If not, check your input files for errors. The publish process will not complete successfully unless the files listed at this step are correct.

  2. A popup asks you to confirm the format of the data.

    1. For Tabular file formats, the data discovery process is initiated to determine the structure of your data. The process can take several minutes to complete, regardless of the size of the data.

    2. For Non-Tabular formats, a Data Copy process begins, which moves the data from the staging bucket into a bucket created on the platform for the new data product.

  3. Click Continue when this process has completed successfully.
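
Before confirming the format, it can help to check from the Space that publish_db holds the tables you expect. The following is a minimal sketch, assuming the Hue editor in your Space accepts Hive-style SQL. The table name publish_db.customer_order_summary is hypothetical and used only for illustration.

```sql
-- List the tables currently held in publish_db.
SHOW TABLES IN publish_db;

-- Check that the table to be published is not empty.
-- customer_order_summary is a hypothetical table name.
SELECT COUNT(*) AS row_count
FROM publish_db.customer_order_summary;
```

If a table is missing or empty, re-run the SQL that created it in the Space before returning to the publish flow.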

Confirm the Schema of your Data

If the format of your data is Tabular, the next step of the publish process is to confirm the schema of your data and to select tables for preview. If your data is not Tabular, this step is not required.

  1. Check that the schema identified in the previous step is correct.

  2. For each table in the data product, select it for preview if you want a random selection of rows to be shown as a Data Preview on the data product page. You can choose a maximum of 15.

  3. Select Save and Continue to proceed.

    1. The Validation process runs over all records within the data file(s). This can take a few minutes, during which you can leave the screen and return to the publish flow later.

    2. If any tables have been selected to show a preview, the sample data is displayed as further confirmation that the data has been processed correctly.
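
To cross-check the schema and preview shown in the publish flow, you could inspect the same table from the Space. This is a sketch only, assuming Hive-style SQL in Hue. The table name publish_db.customer_order_summary is hypothetical.

```sql
-- Show column names and types to compare with the schema in the publish flow.
-- customer_order_summary is a hypothetical table name.
DESCRIBE publish_db.customer_order_summary;

-- Look at a handful of rows. The Data Preview shows a random selection,
-- so the rows returned here will not necessarily match it.
SELECT *
FROM publish_db.customer_order_summary
LIMIT 10;
```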

You are now ready to add subscription plan templates to your Data Product.

