Infoworks supports Azure Synapse Analytics as a target for data transformation pipeline. Azure Synapse Analytics is a cloud-based enterprise data warehouse that leverages Massively Parallel Processing (MPP) to quickly run complex queries across petabytes of data and can be used as a key component of a big data solution. |
Following are the steps to use Azure Synapse Analytics target in pipeline:
Field | Description |
---|---|
Build Mode | The options include overwrite, append or merge. Overwrite: Drops and recreates the Azure Synapse Analytics target. Append: Appends data to the existing Azure Synapse Analytics target. Merge: Merges data to the existing table based on the natural key. |
Data Connection Name | Data connection to be used by the Azure Synapse Analytics target. |
Schema Name | Schema name of the Azure Synapse Analytics target. |
Create a schema if it does not exist | Enable this option to create a new schema with the name provided above. Ensure that the user has sufficient privileges to create a schema in Azure Synapse Analytics Target. |
Table Name | Table name of the Azure Synapse Analytics target. |
Indexing Type | The indexing options include clustered columnstore index, clustered index and nonclustered index, and a non-index option - heap. Note: Clustered Column Store and Non-Clustered indexes are not supported for tables containing column(s) of STRING data type, and Columns with STRING data type and TIMESTAMP datatype can not be used as a partitioning column.(https://docs.microsoft.com/en-us/sql/relational-databases/partitions/partitioned-tables-and-indexes?view=sql-server-ver15) |
Distribution Type | Distribution methods for designing distributed tables. The options include hash, replication and round-robin. |
Natural Keys | The required natural keys for the Azure Synapse Analytics target. |
Partition Column Name | The column name for partition. |
Partition Range | The range type. The options include left and right. |
Boundary Values | The boundary points for partition (comma separated). |
Below is the list of supported source datatypes and their mapping to Azure Synapse Analytics datatypes:
Spark Datatype | SQL Datatype |
---|---|
DecimalType | decimal |
IntegerType | int |
FloatType | float |
DoubleType | float |
LongType | bigint |
StringType | nvarchar(MAX) |
DateType | datetime |
TimestampType | datetime |
BooleanType | bit |