Infoworks 6.1.3
Getting Started

Configuring Infoworks with Custom Data Environments

To configure and connect to the required Azure Databricks instance, navigate to Admin > Manage Data Environments, and then click the Add button under the Custom Data Environment option.

The following window appears.

An environment defines where and how your data is stored and accessed. In the UI, select the cloud platform and the execution engine that you want to use while onboarding data.

Custom Data Environment allows manual configuration with minimal auto-filling support.

For more details on configuration, see Configuring Infoworks with GCP Dataproc.

Configuring Clusters Using Init Scripts

This section explains how to configure init scripts that need to be run for cluster bootstrap.

  1. SSH into the Infoworks machine.
  2. Navigate to the path where Infoworks is installed (IW_HOME). For example, /opt/infoworks/
  3. Open the conf/<env>defaults[<cloud>].json file.
  4. Navigate to the init_scripts list and add the remote path of the script on the cluster to the “source” tag.

NOTE From 5.5.0 onward, remote init scripts stored under DBFS are not supported. Instead, store the init scripts under the /Shared directory of the Databricks workspace.

NOTE In the databricks-default-[<cloud>].json file, change the init-script location to use the workspace://<init-script-location> directory.

  5. If the script is not present on the cluster, add the local path on the Infoworks machine to the “source” tag under the init_scripts list.

Step Result: The file will be copied and added to the bootstrap list.
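
The manual edit described in the steps above could also be scripted. The following is a minimal sketch, not an Infoworks-provided tool. It assumes that init_scripts is a top-level list of objects, each with a "source" key; the actual layout of the defaults file may differ, so inspect the file first. The function name add_init_script is hypothetical.

```python
import json
from pathlib import Path


def add_init_script(conf_path, source):
    """Append {"source": source} to the init_scripts list if it is not there yet.

    ASSUMPTION: init_scripts is a top-level list of objects with a "source"
    key. This layout is a guess for illustration; adjust it to match the
    real defaults file.
    Returns True if the file was changed, False if the entry already existed.
    """
    # Per the note above, init scripts stored under DBFS are not supported.
    if source.lower().startswith("dbfs:"):
        raise ValueError("DBFS init scripts are not supported: " + source)

    path = Path(conf_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    # Create the list if the file does not have one yet.
    scripts = config.setdefault("init_scripts", [])
    for entry in scripts:
        if isinstance(entry, dict) and entry.get("source") == source:
            return False  # already listed, leave the file untouched

    scripts.append({"source": source})
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True
```

For example, add_init_script(conf_file, "workspace://<init-script-location>") would add one entry, where conf_file is the path of the defaults file opened in step 3 and the placeholder is replaced with the real script location. Back up the file before running anything like this against a live installation.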

  Last updated by Monika Momaya