To create an Amazon Redshift source, see Configuring Additional Connectors. Ensure that the Source Type selected is Amazon Redshift.
To configure an Amazon Redshift table, see Configuring a Table.
Field | Description |
---|---|
Fetch Data Using | The mechanism through which Infoworks fetches data from the database. |
Connection URL | The JDBC connection URL through which Infoworks connects to the database. A sample URL format is shown below this table. |
Username | The username for the connection to the database. |
Password | The password for the username provided. |
Source Schema | The schema in the database to be crawled. The schema value is case sensitive. |
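As a reference, a Redshift JDBC connection URL generally follows the pattern below; the cluster endpoint, port, and database name are placeholders that must be replaced with the values for your cluster (5439 is the default Redshift port):

```
jdbc:redshift://<cluster-endpoint>:<port>/<database>
```

For example, `jdbc:redshift://examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com:5439/dev`.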
Once the settings are saved, you can test the connection.
With the source metadata in the catalog, you can now configure the table for CDC and incremental synchronization.
Field | Description |
---|---|
Ingestion Configuration | |
Ingest Type | The type of synchronization for the table. The options include full refresh and incremental. |
Natural Keys | The combination of keys that uniquely identifies a row. This field is mandatory for incremental ingestion tables. It is used to identify and merge incremental data with the data that already exists on the target. |
Incremental Mode | The option to indicate whether the incremental data must be appended to or merged with the base table. This field is displayed only for incremental ingestion. The options include append and merge. |
Watermark Column | The column used to identify the incremental records, for example a last-modified timestamp. See the conceptual query after this table. |
Ingest subset of data | The option to configure filter conditions to ingest a subset of data. For more details, see Filter Query for RDBMS Sources. |
SQL Queries | |
Select Query | The additional inputs required with a select clause, such as optimization hints or lock changes. The default value is SELECT ${columnList} FROM . , where ${columnList} is the placeholder for all the columns to be queried and is pre-filled with the appropriate values. A sample customized query is shown after this table.<br><br>$inline[badge,Note,primary] Ensure that you provide the correct SQL query syntax for the database used. |
Target Configuration | |
Target Table Name | The name of the target table. The name cannot contain special characters except underscore. |
Schema Name | The name of the target schema. The name cannot contain special characters except underscore. |
Storage Format | The format in which the tables must be stored. The options include Read Optimised (Delta) and Write Optimised (Avro). |
Partition Column | The column used to partition the data in the target. Selecting the Create Derived Column option allows you to derive a column from the partition column. This option is enabled only if the partition column datatype is date or timestamp. <br><br>Provide the Derived Column Function and Derived Column Name. The data will be partitioned based on this derived column. |
Optimization Configuration | |
Split By Column | The column used to crawl the table in parallel with multiple connections to the database. The split-by column can be an existing column in the data; any column for which minimum and maximum values can be computed can serve as a split-by key. See the conceptual illustration after this table. <br><br>Select the Create Derived Split Column option and provide the Derived Split Column Function to derive a column from the split-by column. This option is enabled only if the split-by column datatype is date or timestamp. The data will be split based on the derived value. |
Generate History View | The option to preserve data in the history table. If enabled, after each CDC, the data will be appended to the history table. |
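Conceptually, watermark-based incremental ingestion fetches only the rows whose watermark value is greater than the value captured during the previous run. The query below is a simplified sketch of that idea rather than the exact statement Infoworks issues; the table name `orders`, the column `updated_at`, and the literal timestamp are hypothetical.

```sql
-- Simplified sketch of watermark-based change capture.
-- Only rows modified after the previously captured watermark value are fetched;
-- the table name, column name, and timestamp below are hypothetical.
SELECT ${columnList}
FROM orders
WHERE updated_at > '2024-01-01 00:00:00';  -- last captured watermark value
```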
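For the Select Query field, the following is a sketch of one possible customization that adds a filter condition to the default template; the table name `orders` and the column `order_status` are hypothetical, and `${columnList}` is the placeholder that Infoworks fills with the columns to be queried.

```sql
-- Hypothetical customized select query: ingest only non-cancelled orders.
-- ${columnList} is the Infoworks placeholder for the selected columns;
-- the table and column names are placeholders for your own schema.
SELECT ${columnList}
FROM orders
WHERE order_status <> 'CANCELLED';
```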
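For the Split By Column field, the statements below sketch how a split-by column divides the crawl into ranges that can be read over parallel connections. They use hypothetical names (`orders`, `order_id`) and are a conceptual illustration, not the exact statements Infoworks issues.

```sql
-- 1. Compute the overall range of the split-by column.
SELECT MIN(order_id), MAX(order_id) FROM orders;

-- 2. Each parallel connection then reads one sub-range of that column, for example:
SELECT ${columnList}
FROM orders
WHERE order_id >= 1 AND order_id < 250000;
```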
To create a JDBC URL, see Establishing a Connection.