Infoworks supports Azure Synapse Analytics as a target for data transformation pipeline. Azure Synapse Analytics is a cloud-based enterprise data warehouse that leverages Massively Parallel Processing (MPP) to quickly run complex queries across petabytes of data and can be used as a key component of a big data solution. |
Following are the steps to use Azure Synapse Analytics target in pipeline:
| Field | Description |
|---|---|
| Build Mode | The options include overwrite, append, merge or insert-overwrite. Overwrite: Drops and recreates the Azure Synapse Analytics target. Append: Appends data to the existing Azure Synapse Analytics target. Merge: Merges data to the existing table based on the natural key. Insert-Overwrite : Overwrites all the rows based on combination of natural keys (OR condition) and inserts new rows. |
| Data Connection Name | Data connection to be used by the Azure Synapse Analytics target. |
| Schema Name | Schema name of the Azure Synapse Analytics target. |
| Create a schema if it does not exist | Enable this option to create a new schema with the name provided above. Ensure that the user has sufficient privileges to create a schema in Azure Synapse Analytics Target. |
| Table Name | Table name of the Azure Synapse Analytics target. |
| Indexing Type | The indexing options include clustered columnstore index, clustered index and nonclustered index, and a non-index option - heap. Note: Clustered Column Store and Non-Clustered indexes are not supported for tables containing column(s) of STRING data type, and Columns with STRING data type and TIMESTAMP datatype can not be used as a partitioning column.(https://docs.microsoft.com/en-us/sql/relational-databases/partitions/partitioned-tables-and-indexes?view=sql-server-ver15) |
| Distribution Type | Distribution methods for designing distributed tables. The options include hash, replication and round-robin. |
| Natural Keys | The required natural keys for the Azure Synapse Analytics target. |
| Partition Column Name | The column name for partition. |
| Partition Range | The range type. The options include left and right. |
| Boundary Values | The boundary points for partition (comma separated). $inline[badge,NOTE,primary] Boundary values must exactly match the partition column data type. For example, if the column data type is int, the boundary values can be 10,20,50,... If the column data type is date, the boundary values can be ‘11-12-2019’,’10-11-2020’,... If the column data type is string, the boundary values must be provided in single quotes (comma separated). |
| Sync Table Schema | The option to synchronise pipeline export table schema with source table schema. |

Below is the list of supported source datatypes and their mapping to Azure Synapse Analytics datatypes:
| Spark Datatype | SQL Datatype |
|---|---|
| DecimalType | decimal |
| IntegerType | int |
| FloatType | float |
| DoubleType | float |
| LongType | bigint |
| StringType | nvarchar(MAX) |
| DateType | datetime |
| TimestampType | datetime |
| BooleanType | bit |