Synchronizing Data to External Target

You can configure the Target connections and sync data as described in the following sections.

Configure a Target

The following are the steps to configure a Target.

  1. From the Data Sources menu, select one of the tables and click the View Source/Ingest button.
  2. Select the source table to be synchronized to the Target.
  3. Click the configuration link, and then click the Sync to External Target tab.
  4. Configure the parameters by selecting one of the options available in the Sync Type drop-down menu.

$inline[badge,NOTE,primary] If the Sync Type is set to Append and the table is configured for the Incremental Merge mode of ingestion, the exported table does not include any duplicates.

If the table is configured for the Incremental Merge mode of ingestion and the Sync Type is set to Merge, the exported table may or may not include duplicate records.

However, records deleted during ingestion remain available in the exported table.

  5. Select one of the Target types and enter all the mandatory fields as described in the following sections:

    • Configuring BigQuery Target
    • Configuring Azure Synapse Analytics Target
    • Configuring Snowflake Target
    • Configuring Cosmos DB Target
    • Configuring Delimited File Target
    • Configuring Postgres Target
    • Configuring Teradata Target

  6. Click Save to save the configuration settings.

Configuring BigQuery Target

Following are the BigQuery configuration details:

To sync data to BigQuery, see the section $link[page,210065,Sync Data to Target,sync-data-to-target].

Configuring Azure Synapse Analytics Target

Following are the Azure Synapse Analytics Target configuration details:

To sync data to Azure Synapse Analytics, see the section $link[page,210065,Sync Data to Target,sync-data-to-target].

Configuring Snowflake Target

Following are the Snowflake configuration details:

To sync data to Snowflake, see the section $link[page,210065,Sync Data to Target,sync-data-to-target].

$inline[badge,LIMITATION,warning] To run a sync in Append or Merge mode with a Snowflake target, the table must already exist on the Snowflake target, with the same database name and schema name as configured in the Sync to Target configuration. If the table does not exist on the Snowflake target, you must create it manually.
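As a rough illustration of creating the table manually, the following Python sketch only builds the text of a CREATE TABLE IF NOT EXISTS statement that you could then run in Snowflake yourself. The function name build_create_table and the database, schema, table, and column names are hypothetical; the column types must be chosen to match the source table, and the sketch assumes unquoted identifiers.

```python
import re

# Unquoted Snowflake identifiers: a letter or underscore first, then
# letters, digits, underscores, or dollar signs.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def build_create_table(database, schema, table, columns):
    """Return a CREATE TABLE IF NOT EXISTS statement as a string.

    columns is a non-empty list of (column_name, snowflake_type) pairs.
    This helper is hypothetical and only formats text; it does not
    connect to Snowflake.
    """
    if not columns:
        raise ValueError("at least one column is required")
    for name in (database, schema, table, *(col for col, _ in columns)):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    column_sql = ", ".join(f"{col} {col_type}" for col, col_type in columns)
    return f"CREATE TABLE IF NOT EXISTS {database}.{schema}.{table} ({column_sql})"


# Hypothetical names; use the database and schema from your sync configuration.
statement = build_create_table(
    "SALES_DB", "PUBLIC", "ORDERS", [("ID", "NUMBER"), ("NAME", "VARCHAR")]
)
print(statement)
```

The database and schema passed in should be the same ones entered in the Sync to Target configuration, since the limitation above requires them to match.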

Configuring Cosmos DB Target

Following are the Cosmos DB configuration details:

To sync data to Cosmos DB, see the section $link[page,210065,Sync Data to Target,sync-data-to-target].

$inline[badge,NOTE,primary] For Dataproc 1.4 (Spark 2.x and Scala 2.11), you must set internalIpOnly to true in the {INFOWORKS_HOME}/conf/dataproc_defaults.json file to run Sync to Target with Cosmos DB successfully.
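If you prefer to script this change rather than edit the file by hand, a minimal Python sketch might look like the following. The layout of dataproc_defaults.json is not described here, so the sketch assumes only that the internalIpOnly key already exists somewhere in the file; it sets every occurrence to true and leaves the file untouched if the key is not found. The function names are hypothetical.

```python
import json
from pathlib import Path


def enable_internal_ip_only(node):
    """Set every existing "internalIpOnly" key under node to True.

    Returns the number of keys that were set. The nesting of the key in
    the real file is unknown, so the whole structure is searched.
    """
    changed = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "internalIpOnly":
                node[key] = True
                changed += 1
            else:
                changed += enable_internal_ip_only(value)
    elif isinstance(node, list):
        for item in node:
            changed += enable_internal_ip_only(item)
    return changed


def update_defaults_file(path):
    """Rewrite the JSON file at path in place, but only if the key was found."""
    file_path = Path(path)
    settings = json.loads(file_path.read_text())
    changed = enable_internal_ip_only(settings)
    if changed:
        file_path.write_text(json.dumps(settings, indent=2) + "\n")
    return changed
```

Call update_defaults_file with the expanded path of {INFOWORKS_HOME}/conf/dataproc_defaults.json, and back up the file first. A return value of 0 means the key was not present and must be added manually in the appropriate place.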

$inline[badge,LIMITATIONS,warning] The following limitations apply to Sync to Target jobs with a Cosmos DB target.

  • Export to a Cosmos DB target is not supported on Spark 2.x with Scala 2.12.
  • When running jobs on Spark 3.x with Scala 2.12, the data to be written must contain a column named "id". Otherwise, the CDC merge jobs writing data to Cosmos DB fail.
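Before starting a job, you can check the column list of the table for the required "id" column. The following Python sketch is one way to do that; the function name is hypothetical, and the exact, case-sensitive match on "id" is an assumption.

```python
def missing_id_column(column_names):
    """Return True when no column is named exactly "id".

    The comparison is case-sensitive by assumption, so "ID" or "Id"
    does not count as the required column.
    """
    return "id" not in column_names


# Hypothetical column lists for illustration.
print(missing_id_column(["id", "name", "updated_at"]))
print(missing_id_column(["customer_id", "name"]))
```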

Configuring Delimited File Target

Following are the Delimited File configuration details:

To sync data to Delimited File, see the section $link[page,210065,Sync Data to Target,sync-data-to-target].

$inline[badge,LIMITATION,warning] Complex data types are not supported for Sync Data to Target on Delimited files.

Configuring Postgres Target

Following are the Postgres configuration details:

$inline[badge,LIMITATIONS,warning]

  • For Postgres, master-slave selection during failover cannot be automated. It must be set up manually.
  • Airflow does not provide a configuration to select where reads and writes are directed.

To sync data to Postgres, see the section $link[page,210065,Sync Data to Target,sync-data-to-target].

Configuring Teradata Target

$inline[badge,NOTE,primary]

  • Indexing and partition columns must be included in the natural keys for the Merge sync type.
  • Teradata FastLoad is used by default, and it does not support duplicate records. To load duplicate records, set TYPE=DEFAULT in Additional Params.
  • (Applicable only for FastLoad) If the table export fails with the message "Details of the failure can be found in the exception chain that is accessible with getNextException", handle it in either of the following ways:
    • Set TYPE=DEFAULT in Additional Params under Connection Parameters.
    • Set an advanced configuration with the key export_datawriter_conf and the value repartition_for_td_export=true.

To sync data to Teradata, see the section $link[page,210065,Sync Data to Target,sync-data-to-target].

Sync Data to Target

After the Target is configured, perform the following steps to sync data to the Target.

Step 1: From the Data Sources menu, select one of the tables, and click the View Source button.

Step 2: Select the source table to be synchronized to Target.

Step 3: Click the Sync Data to Target button.

Step 4: Enter the mandatory fields as listed in the table below:

Step 5: Click the Sync Data to Target button.
