Infoworks 5.3.0

Exporting Data to BigQuery


Infoworks supports BigQuery as a target in data transformation pipelines, which lets data engineers onboard data to BigQuery.

Prerequisite

Ensure that the BigQuery target data connection is configured. For more details, see Setting BigQuery Data Connection.

Setting BigQuery Target Properties

To use a BigQuery target in a pipeline, perform the following steps:

Double-click the BigQuery Table node. The properties page is displayed.
Click Edit Properties, and set the following fields:

Field	Description
Build Mode	The options are Overwrite, Append, and Merge. Overwrite: drops and recreates the BigQuery target. Append: appends data to the existing BigQuery target. Merge: merges data into the existing table based on the natural keys.
Data Connection	Data connection to be used by the BigQuery target. For more details, see Setting BigQuery Data Connection.
Dataset Name	Dataset name of the BigQuery target.
Create a dataset if it does not exist	Enable this option to create a new dataset with the name provided above if one does not already exist. Ensure that the user has sufficient privileges to create a dataset in BigQuery.
Table Name	Table name of the BigQuery target.
Natural Keys	The required natural keys for the BigQuery target.
Is Existing Table	When enabled, Infoworks treats the target as an existing table that is external to Infoworks. Infoworks does not create, drop, or otherwise manage the table.
Partition Type	The type of partitioning for the table. The options are BigQuery Load Time, Date Column, Timestamp Column, and Integer Column.
Partition Time Interval	The time interval for the partitions. The options are Day and Hour.
Partition Column	The column by which the data is partitioned. This field is displayed for the Date Column, Timestamp Column, and Integer Column partition types.
Start	The start value for the partition of data. This field is displayed for the Integer Column partition type.
End	The end value for the partition of data. This field is displayed for the Integer Column partition type.
Range	The range for the partition of data. This field is displayed for the Integer Column partition type.
Clustering Columns	The columns to be used for clustering. You can select up to 4 columns.
Persist Staging Data in GCS	Indicates whether the temporary staging data is persisted in Google Cloud Storage (GCS). By default, the staging data is not persisted.
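
The dependencies between the fields above can be sketched as a small validation routine. This is only an illustrative Python sketch, not an Infoworks API: the class BigQueryTargetProperties, its attribute names, and the validate function are hypothetical. It also assumes that natural keys are needed only for the merge build mode and that leaving the partition type unset means the table is not partitioned; neither assumption is confirmed by this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Option names are copied from the table above; the identifiers are hypothetical.
BUILD_MODES = {"overwrite", "append", "merge"}
COLUMN_PARTITION_TYPES = {"Date Column", "Timestamp Column", "Integer Column"}
PARTITION_TYPES = COLUMN_PARTITION_TYPES | {"BigQuery Load Time"}
MAX_CLUSTERING_COLUMNS = 4


@dataclass
class BigQueryTargetProperties:
    """Hypothetical model of the target fields (not an Infoworks API)."""
    build_mode: str
    dataset_name: str
    table_name: str
    natural_keys: List[str] = field(default_factory=list)
    partition_type: Optional[str] = None  # assumption: None means no partitioning
    partition_column: Optional[str] = None
    start: Optional[int] = None
    end: Optional[int] = None
    range_: Optional[int] = None
    clustering_columns: List[str] = field(default_factory=list)


def validate(props: BigQueryTargetProperties) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if props.build_mode not in BUILD_MODES:
        problems.append(f"unknown build mode: {props.build_mode}")
    # Assumption: merge needs at least one natural key, because it merges on it.
    if props.build_mode == "merge" and not props.natural_keys:
        problems.append("merge requires at least one natural key")
    if props.partition_type is not None:
        if props.partition_type not in PARTITION_TYPES:
            problems.append(f"unknown partition type: {props.partition_type}")
        # Partition Column applies to the three column-based partition types.
        if (props.partition_type in COLUMN_PARTITION_TYPES
                and not props.partition_column):
            problems.append("partition column is required for this partition type")
        # Start, End, and Range apply only to the Integer Column partition type.
        if props.partition_type == "Integer Column":
            if None in (props.start, props.end, props.range_):
                problems.append("Start, End, and Range are required for Integer Column")
    if len(props.clustering_columns) > MAX_CLUSTERING_COLUMNS:
        problems.append(f"at most {MAX_CLUSTERING_COLUMNS} clustering columns are allowed")
    return problems


if __name__ == "__main__":
    props = BigQueryTargetProperties(
        build_mode="merge",
        dataset_name="sales",      # hypothetical dataset name
        table_name="orders",       # hypothetical table name
        natural_keys=["order_id"],
        partition_type="Date Column",
        partition_column="order_date",
        clustering_columns=["customer_id"],
    )
    print(validate(props))
```

A check of this kind can be run against a planned configuration before entering it in the properties page, so that missing partition values or an excess of clustering columns is caught early.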

Last updated on Jul 12, 2022