Metadata Crawl from BigQuery
Overview
This feature crawls the metadata of existing BigQuery tables so that they can be used in downstream pipelines, alongside tables ingested from other sources.
Creating a BigQuery Source
The following are the steps to create a BigQuery source:
Step 1: In the left navigation pane of the Infoworks UI, click the Data Sources icon.

Step 2: Click Onboard New Data. The Source Connectors page appears with the list of all available connectors.
Step 3: In the Search... bar, type “BigQuery Metadata Sync”.
Step 4: Click the BigQuery Metadata Sync connector. The configuration page of the connector appears.
Configuring a BigQuery Source
The following are the steps to configure a BigQuery source:
Configure Source & Target
Step 1: In the Configure Source & Target page, enter the following configuration details.
| Field | Description |
|---|---|
| Source Name | Provide a source name for the target table. |
| Project ID | Provide the respective Project ID. This ID is present in the Google BigQuery Console. |
| Data Environment | Select the environment where the tables are registered. Infoworks spawns a Spark session in the persistent cluster running in that environment and fetches all registered tables. |
| Temporary Storage | Select from one of the storage options defined in the BigQuery environment. |
| Base Location | The path to the base/target directory where all the data should be stored. |
| Make available in infoworks domains | Select the relevant domain from the dropdown list to make the source available in the selected domain. |
Step 2: Click the Save button. Click Next.
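Before filling in the form, it can help to collect the values in one place and check that none are blank. The following is a minimal sketch, not an Infoworks API: the class name `BigQuerySourceConfig`, the helper `missing_fields`, the sample values, and the assumption that every field except the domain list is required are all hypothetical and used only for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class BigQuerySourceConfig:
    """Hypothetical container mirroring the fields of the Configure Source & Target page."""
    source_name: str
    project_id: str
    data_environment: str
    temporary_storage: str
    base_location: str
    domains: tuple = ()  # assumed optional: domains the source is made available in


def missing_fields(cfg: BigQuerySourceConfig) -> list:
    """Return the names of fields that are blank (assumes all but 'domains' are required)."""
    return [
        f.name
        for f in fields(cfg)
        if f.name != "domains" and not str(getattr(cfg, f.name)).strip()
    ]


# Sample values are placeholders, not real project or environment names.
cfg = BigQuerySourceConfig(
    source_name="bq_sales_metadata",
    project_id="my-gcp-project",
    data_environment="bq_env",
    temporary_storage="",
    base_location="/iw/sources/bq_sales",
)
print(missing_fields(cfg))  # ['temporary_storage']
```

A check like this only catches blank values; whether a Project ID or storage option is valid is still determined by the Infoworks UI when you click Save.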
Select Tables
You can select the tables for which the metadata crawl is required. You can add more tables later.
Step 1: In the Select Tables step, you can choose to Browse entire source or Filter tables to browse.
Step 2: To filter the tables, enter values in Schema Name and Table Name. You can enter multiple names separated by commas and use "%" as a wildcard.
Step 3: Click Browse Source. The Browse source area appears.
By default, bulk_payload_record_size is set to 6500.
To make the tables appear faster, scroll down to the Advanced Configurations section and set bulk_payload_record_size to 100. This value can be changed at the admin and source levels.
Step 4: Select the check boxes against the relevant table(s), and click Add Selected Tables.
Step 5: Click Crawl Metadata to proceed. A success message appears.
Metadata crawl has been triggered. To view the job status, click View Job Status.
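
To preview which tables a filter is likely to select, the comma-separated names and the "%" wildcard described in the steps above can be sketched in Python. This is an illustration under assumptions, not the Infoworks implementation: it assumes "%" matches any run of characters (as in SQL `LIKE`), that matching is case-insensitive, and that an empty filter matches everything. The function names and sample tables are hypothetical.

```python
import re


def compile_filter(spec: str) -> list:
    """Turn a filter such as 'orders%, customers' into compiled patterns."""
    patterns = []
    for raw in spec.split(","):
        name = raw.strip()
        if not name:
            continue
        # Escape regex metacharacters, then let "%" match any run of characters.
        regex = ".*".join(re.escape(part) for part in name.split("%"))
        patterns.append(re.compile(regex, re.IGNORECASE))
    return patterns


def matches(name: str, patterns: list) -> bool:
    """An empty pattern list is treated as 'no filter' (assumption)."""
    return not patterns or any(p.fullmatch(name) for p in patterns)


def filter_tables(tables, schema_spec="", table_spec=""):
    """Keep (schema, table) pairs that satisfy both the schema and table filters."""
    schema_patterns = compile_filter(schema_spec)
    table_patterns = compile_filter(table_spec)
    return [
        (schema, table)
        for schema, table in tables
        if matches(schema, schema_patterns) and matches(table, table_patterns)
    ]


# Sample data, for illustration only.
tables = [
    ("sales", "orders_2023"),
    ("sales", "orders_2024"),
    ("sales", "customers"),
    ("hr", "employees"),
]
selected = filter_tables(tables, schema_spec="sales", table_spec="orders%, cust%")
print(selected)
# [('sales', 'orders_2023'), ('sales', 'orders_2024'), ('sales', 'customers')]
```

Here the schema filter `sales` excludes the `hr` schema, and the two table patterns together select every table whose name starts with `orders` or `cust`.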

© UNIPHORE TECHNOLOGIES 2025 | Confidential