Infoworks 6.1.3
Prepare Data

Creating a Pipeline Group

Overview

Pipelines can be grouped together in a Pipeline Group to run as a single job in Infoworks. Pipelines built in a pipeline group use a single connection and run as one transaction.

From the pipeline group, you can configure the following:

  • Data environment
  • Pipeline group name and description
  • Environment details
  • Pipelines (including pipeline version) to be included in the group
  • Execution order of pipelines added to the group

If even one pipeline job in a group fails, the pipelines that have already executed in that group are rolled back.
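The all-or-nothing behavior described here can be illustrated with a small simulation. This is only a sketch of the semantics, not how Infoworks implements it (the product relies on a single connection and one transaction). The (name, run, undo) triple and the run_group function are hypothetical names introduced for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical model: each pipeline is a (name, run, undo) triple.
Pipeline = Tuple[str, Callable[[], None], Callable[[], None]]


def run_group(pipelines: List[Pipeline]) -> bool:
    """Run pipelines in order; undo the completed ones if any pipeline fails."""
    completed: List[Pipeline] = []
    for pipeline in pipelines:
        name, run, _undo = pipeline
        try:
            run()
        except Exception as exc:
            print(f"{name} failed ({exc}); rolling back the group")
            # Undo in reverse order, most recently completed first.
            for _done_name, _done_run, undo in reversed(completed):
                undo()
            return False
        completed.append(pipeline)
    return True


# Usage: a shared list stands in for the target tables.
state: List[str] = []


def fail() -> None:
    raise RuntimeError("simulated failure")


group: List[Pipeline] = [
    ("p1", lambda: state.append("p1"), lambda: state.remove("p1")),
    ("p2", lambda: state.append("p2"), lambda: state.remove("p2")),
    ("p3", fail, lambda: None),
]
ok = run_group(group)
```

Because the third pipeline fails, the work of the first two is undone and the group as a whole reports failure.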

NOTE: A pipeline group can also be built via the REST API.
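This page does not describe the REST request format. As a rough illustration only, the settings listed above could be collected into a JSON document along the following lines. Every field name, the build_group_payload helper, and the sample values are hypothetical and are not taken from the Infoworks REST API reference; consult that reference for the actual endpoint and schema.

```python
import json


# All field names below are hypothetical placeholders, not the documented API schema.
def build_group_payload(name, environment, pipelines, description=""):
    """Collect pipeline group settings into a JSON-serializable dict."""
    return {
        "name": name,
        "description": description,
        "environment": environment,
        # Execution order follows the position in the input list, starting at 1.
        "pipelines": [
            {"pipeline": p, "version": v, "execution_order": i}
            for i, (p, v) in enumerate(pipelines, start=1)
        ],
    }


payload = build_group_payload(
    name="daily_sales_group",
    environment="example_environment",
    pipelines=[("load_orders", 3), ("aggregate_sales", 1)],
    description="Example group",
)
body = json.dumps(payload, indent=2)
```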

Pipeline Group Creation

To create a new pipeline group:

Step 1: Go to Domains and select the specific domain.

Step 2: Click New Pipeline Group.

Step 3: Provide the following pipeline group details.

Field | Description | Details
Data Environment | Name of the data environment. | Select the name from the dropdown.
Name | Name of the pipeline group. | NOTE: The name must not contain spaces.
Description | Description of the pipeline group. | -
Snowflake Warehouse | Snowflake warehouse name. | The Snowflake Warehouse drop-down appears based on the selected Snowflake profile.
Snowflake Profile | Snowflake profile name. | The Snowflake Profile drop-down appears for the selected Snowflake environment.
Run driver job on data plane | Select this checkbox to run the job driver on the data plane. | -
Compute Cluster | The compute cluster that is spun up for each table. | -
Custom Tags | Key-value pairs that help you identify resources based on settings applied to your cloud resources. | -
Associated Custom Tags | The selected custom tags appear here. | -
Query Tags | A string that is added to the Snowflake query tag and can be accessed via Query history in Snowflake. | -
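The naming rule in the table above (no spaces in the pipeline group name) is easy to check before submitting the form or a script. A minimal sketch follows; validate_group_name is a hypothetical helper, and rejecting empty names and all whitespace characters (not only the space character) is an assumption that goes beyond what this page states.

```python
def validate_group_name(name: str) -> str:
    """Return the name unchanged, or raise ValueError if it is empty or contains whitespace."""
    if not name:
        # Assumption: an empty name is not acceptable either.
        raise ValueError("pipeline group name must not be empty")
    if any(ch.isspace() for ch in name):
        raise ValueError(f"pipeline group name must not contain spaces: {name!r}")
    return name


checked = validate_group_name("daily_sales_group")
```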

Under the All Pipelines tab, add the required pipelines to the pipeline group.

Field | Description | Details
Name | Name of the pipeline | -
Created By | User who created the pipeline | -
Created At | Timestamp of pipeline creation | -
Added to the Group | Whether the pipeline is added to any pipeline group | -

Step 4: Click Add to add the pipelines to the pipeline group.

Step 5: Under the Pipelines in the Group tab, configure the following fields based on your requirement:

Field | Description
Name | Name of the pipeline.
Run Active Version | Select this checkbox to run the active version.
Version Number | The active version is the default. To run a different version, select it from the dropdown.
Execution Order | The order in which the pipelines are executed.
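How the version and ordering fields interact can be sketched as follows. GroupEntry and plan are hypothetical names, and the sketch assumes that a lower Execution Order value runs first and that the selected Version Number only applies when Run Active Version is unchecked.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GroupEntry:
    name: str
    active_version: int
    execution_order: int
    run_active_version: bool = True
    version_number: Optional[int] = None  # only used when run_active_version is False


def plan(entries: List[GroupEntry]) -> List[Tuple[str, int]]:
    """Return (pipeline name, version to run) pairs sorted by execution order."""
    result = []
    for entry in sorted(entries, key=lambda e: e.execution_order):
        if entry.run_active_version or entry.version_number is None:
            version = entry.active_version  # default: the active version
        else:
            version = entry.version_number  # explicitly selected version
        result.append((entry.name, version))
    return result


entries = [
    GroupEntry("aggregate_sales", active_version=4, execution_order=2),
    GroupEntry("load_orders", active_version=2, execution_order=1,
               run_active_version=False, version_number=1),
]
schedule = plan(entries)
```

Here load_orders runs first with its explicitly selected version, followed by aggregate_sales with its active version.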

Step 6: Click Save.

LIMITATIONS

  • Creating volatile temporary tables is not supported for pipelines running in a group.
  • Running multiple pipeline groups in parallel on a Databricks interactive cluster is not recommended, because the groups share the static session context and might produce unexpected results.
  • For overwrite pipelines, target table data is truncated when the pipeline group fails.

Last updated by Monika Momaya