Onboarding Data from SQL Server

Configuring a SQL Server Source

For onboarding data from a SQL Server source, see $link[page,383152,auto$]. Ensure that the Source Type selected is SQL Server.

SQL Server Configurations

Once the settings are saved, you can test the connection.

$inline[badge,Note,primary] To connect to Azure SQL using the SQL connector, you can use catalog filters to load tables and schemas from single catalog. Ensure that the catalog filter matches just one catalog. You will not be able to browse the entire source like other connectors. Perform the following:

Add the catalog name in the Connection URL string. For example, to connect to the test_sql_pool_catalog, the following string can be used: jdbc:sqlserver://<ip>/TMODE=<tmode>;database=test_sql_pool

And Use the same in the Catalog filter:

Configuring a SQL Server Table

With the source metadata in the catalog, you can now configure the table for CDC and incremental synchronization.

  • Click the Configuration link, for the desired table
  • Provide the ingestion configuration details.

Performing Incremental Ingestion from SQL Server Replication Table

Through MSSQL Server Log Based method, data is ingested and continuously synchronised with the database using captured CDC tables provided by MSSQL Server. But this method works only with the enterprise editions of MSSQL Server. In this method, change data capture must be enabled for the incremental tables. Before enabling for a table, it must be enabled for the schema in which the table is present.

Infoworks uses SQL Server transactional replication to capture change data from SQL server distribution databases. For CDC to work, you must enable SQL Server Replication on the system from which change data is to be captured. If the database has a high volume of change activity, a distributed server must be used as the host of the distribution database.

To configure CDC in Infoworks, you must define a capture registration for each source table. In the capture registration, you can select a subset of columns for which the data is to be captured.

$inline[badge,Prerequisites,success]

  1. The replication table must be enabled in the source SQL Server database.

Logs in MSSQL server is enabled using the CDC program of the SQL Server. This program is bundled with Enterprise, Developer, and Evaluation editions of MSSQL Server installations. Once logs are enabled for a table, the corresponding table changes will be stored in a capture table instance in a default database called CDC. The capture table name can be the same as source table name or user can configure this while enabling logging for the table.

Following is a two-step process {[1,2] or [1,3]} to enable logging. In these scripts, to execute the commands, we assume that there is a database called automation_db and a table timestamps_inserts_updates in that database.

  • Enable CDC for a database automation_db. This is basically executing CDC program on the database:
  • Enable CDC for table timestamp_inserts_updates with default capture instance name.

NOTE:If you do not specify the name of the capture instance, the capture instance name will be same as source table name in the CDC schema.

  • Enable CDC for table with user-given captured instance name.

NOTE: If you specify the name of the capture instance while enabling logging for the table, the source table name is timestamp_inserts_updates logging the CDC table name as name_of_capture_instance.

  • Once the logging is enabled for a table, the metadata of the corresponding captured table will be stored in the cdc.change_tables table. You can use the following query to verify list of captured tables:

Below figure shows the sample output of the CDC-enabled tables:

Following are the steps to perform incremental ingestion from SQL Server replication table:

  1. Navigate to the General Configurations page for the required table.
  2. Select the Use Capture Table for Incremental check box.
  3. In the table Capture Table Name field, enter the name of the captured table provided while enabling replication in the source database. When user performs CDC, the changed data will be fetched from this table.

$inline[badge,Known Issue,primary] If a SQL Server replication table includes any change in schema, the table must be recrawled (Tables > Recrawl Metadata) and the table configuration must be saved again (General Configurations).

Advanced Configurations

Following are the steps to set advanced configuration for a table:

  1. Click the Data Sources menu and click Ingest for the required source.

$inline[badge,NOTE,primary] For already ingested table, click View Source, click the Tables tab, click Configure for the required table and click the Advanced Configuration tab.

  1. Click the Configure Tables tab, click the Advanced Configuration tab and click Add Configuration.
  2. Enter key, value, and description. You can also select the configuration from the list displayed.

Sync Data to Target

Using this option, you can configure the Target connections and sync data as described in the section $link[page,383165,auto$]

The following are the steps to sync data to target.

  1. From the Data Sources menu, select one of the tables and click View Source/Ingest button.
  2. Select the source table to be synchronized to Target.
  3. Click the Sync Data to Target button.
  4. Enter the mandatory fields as listed in the table below:

Click $link[page,383152,auto$] to navigate back to complete the onboarding process.

VariableType to search · ESC to discard
GlossaryType to search · ESC to discard
InsertType to search · ESC to discard
No matches
On This Page