Infoworks supports BigQuery as a target in data transformation pipelines. This helps data engineers onboard data to BigQuery.
Ensure that the BigQuery target data connection is configured. For more details, see Setting BigQuery Data Connection.
The following are the steps to use a BigQuery target in a pipeline:
Field | Description |
---|---|
Build Mode | The options are Overwrite, Append, and Merge. Overwrite: Drops and recreates the BigQuery target. Append: Appends data to the existing BigQuery target. Merge: Merges data into the existing table based on the natural keys. |
Data Connection | Data connection to be used by the BigQuery target. For more details, see Setting BigQuery Data Connection. |
Dataset Name | Dataset name of the BigQuery target. |
Create a dataset if it does not exist | Enable this option to create a new dataset with the name provided above. Ensure that the user has sufficient privileges to create a dataset in the BigQuery target. |
Table Name | Table name of the BigQuery target. |
Natural Keys | The required natural keys for the BigQuery target. |
Is Existing Table | When enabled, the table is treated as an existing table that is external to Infoworks. Infoworks does not create, drop, or otherwise manage the table. |
Partition Type | The type of partitioning for the BigQuery target. The options include BigQuery Load Time, Date Column, Timestamp Column, and Integer Column. |
Partition Time Interval | The time interval for the partitions. The options include Day and Hour. |
Partition Column | The column based on which the data will be partitioned. This field is displayed for the Date Column, Timestamp Column, and Integer Column partition types. |
Start | The start value for the partition of data. This field is displayed for the Integer Column partition type. |
End | The end value for the partition of data. This field is displayed for the Integer Column partition type. |
Range | The range for the partition of data. This field is displayed for the Integer Column partition type. |
Clustering Columns | The columns to be used for clustering. You can select up to 4 columns. |
Persist Staging Data in GCS | The option to indicate whether the temporary staging data must be persisted in GCS. The default option is to not persist data. |
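
The three Build Mode options described in the table can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not how Infoworks or BigQuery implements these modes: the function `apply_build_mode`, the `Row` type, and the sample column names are hypothetical, and Merge is assumed to behave as an upsert, where an incoming row replaces an existing row with the same natural key values and is otherwise added.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]


def apply_build_mode(
    existing: List[Row],
    incoming: List[Row],
    mode: str,
    natural_keys: Iterable[str] = (),
) -> List[Row]:
    """Return the target rows after one build (hypothetical, illustrative model)."""
    if mode == "overwrite":
        # The target is dropped and recreated, so only incoming rows remain.
        return [dict(row) for row in incoming]
    if mode == "append":
        # Incoming rows are added after the existing rows; nothing is replaced.
        return [dict(row) for row in existing] + [dict(row) for row in incoming]
    if mode == "merge":
        keys = tuple(natural_keys)
        if not keys:
            raise ValueError("merge mode requires at least one natural key")
        # Assumption: merge acts as an upsert keyed on the natural key values.
        merged: Dict[tuple, Row] = {}
        for row in list(existing) + list(incoming):
            merged[tuple(row[k] for k in keys)] = dict(row)
        return list(merged.values())
    raise ValueError(f"unknown build mode: {mode}")


existing_rows = [{"id": 1, "city": "Pune"}, {"id": 2, "city": "Oslo"}]
incoming_rows = [{"id": 2, "city": "Bergen"}, {"id": 3, "city": "Lima"}]

overwritten = apply_build_mode(existing_rows, incoming_rows, "overwrite")
appended = apply_build_mode(existing_rows, incoming_rows, "append")
merged_rows = apply_build_mode(existing_rows, incoming_rows, "merge", ["id"])
```

In this model, Overwrite leaves two rows, Append leaves four (including both rows with `id` 2), and Merge leaves three, with the row for `id` 2 updated. This is why the Natural Keys field matters for Merge: without keys there is no way to decide which existing rows an incoming row corresponds to.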