For creating a SalesForce source, see Creating a Source. Ensure that the Source Type selected is SalesForce.
For setting a SalesForce source, see Setting Up a Source.
Field | Description |
---|---|
Environment | The Salesforce environment Production or Sandbox. |
Connection URL | The connection URL through which Infoworks connects to the Salesforce instance. |
Username | Salesforce login user name. |
Authentication Type for Password | Select the authentication type from the dropdown. For example, Infoworks Managed or External Secret Store. If you select Infoworks Managed, then provide Authentication Password for Password. If you select External Secret Store, then select the Secret which contains the password. |
Authentication Type for Secret Key | Salesforce connected application secret key. If you select Infoworks Managed from the Authentication Type for Secret Key dropdown, then provide Authentication Password for Secret Key. If you select External Secret Store, select the secret from the Secret for Secret Key dropdown. |
Fetch Mechanism | The fetch mechanisms include Bulk, REST and Both. For details, see SalesForce Fetch Mechanism. |
API Version | Salesforce API version. |
Infoworks supports the following methods to crawl data from Salesforce:
REST API
Infoworks supports data ingestion via Salesforce REST API. The Salesforce REST API provides data in a paginated manner, where one API response provides the URL of the next API call. Ingestion via REST API is slower compared to bulk API but also provides features like Salesforce Custom Fields and Column Addition. This is the recommended method to crawl data via Infoworks.
Bulk API
Ingestion via Bulk API is faster and must be used in cases where data volumes are high. SalesForce Bulk API provides options to dump data into CSV for faster ingestion. Bulk API also allows data crawled to be parallelised using option like PK chunking. However, Ingestion via bulk API does NOT support the Custom fields and Column Addition features.
For configuring SalesForce tables, see Configuring a Table.
For ingesting SalesForce source data, see Onboarding Data.
Field | Description |
---|---|
Ingest Type | The type of synchronization for the table. The options include full refresh and incremental. |
Update Strategy | This field is displayed for only incremental ingestion. The options include append and merge. |
Natural Keys | The combination of keys to uniquely identify the row. This field is mandatory in incremental ingestion tables. It helps in identifying and merging incremental data with the already existing data on target. |
Watermark Column | Select single/multiple watermark columns to identify the incremental records. The selected watermark column(s) should be of the same datatype. |
Storage Format | The format in which the tables much be stored. The options include: Read Optimized (Delta),Write Optimized (Avro). |
Target Table Name | Name of the target table. |
Partition Column | The column used to partition the data in target. |
Fetch Mechanism | The fetch mechanisms include Bulk and REST . |
Enable PK Chunking | The option to enable PK chunking for the table. If enabled, SalesForce splits data based on the primary key and data will be fetched in parallel. |
Generate History View | The option specifies whether to preserve data in the history table. After each CDC, the data will be appended to the history table. |
During data crawl, the data that cannot be crawled will be stored in an error table, <tablename>_error
.