Infoworks Release Notes

v5.5.0

Date of Release: September 2023

This section consists of the new features and enhancements introduced in this release.

  • Infoworks now supports EMR version 6.6 in addition to 6.2. For more information, refer to the "Type and Run Time Version" field in the Compute section. Note also that support for version 5.32 has been deprecated.
  • Infoworks now supports creating multiple profiles in a single Snowflake environment. Users of a Snowflake environment can now select a profile during source, pipeline, and pipeline group creation. For more details, refer to the Table Schema section here.
  • For sources created on Snowflake environments, the time data type is now supported. A text box accepts the precision for time data type columns; its value is prefilled during metacrawl and can be edited. Refer to Configuring a Table.
  • Infoworks now supports access control when using Databricks clusters and jobs. Infoworks accepts the access control field as a JSON string and passes it unchanged in Databricks API calls for cluster creation and job submission. Access control must be enabled and configured as per the Databricks documentation before it is used in Infoworks. For detailed information, refer to the Access Control List section here.
  • Users can now synchronize the pipeline export table schema with the source table schema for the following targets: BigQuery Target Table, Snowflake Target Table, Postgres Target Table, Teradata Target Table, Oracle Target Table, Redshift Target Table, SQL Server Target Table, and Azure Synapse Target Table. To know more, refer to the sections listed under Pipeline Targets.
  • Infoworks now supports storing the client secret of a service principal in Azure Key Vault and accessing it from there. Two service principals can now be configured: one for Azure Key Vault authentication, and another for Databricks API authentication whose client secret is stored in Azure Key Vault. Refer to Managing Secrets.
  • With SLA configurations, users can now configure the expected run time of a job (for example, in hours and minutes) at the entity level (table_groups, pipeline, pipeline_groups, workflow) and get notified if a job run exceeds the configured run time. For more details, see the SLA Configuration and Criticality and Using Operations Analyst Dashboard pages.
  • The SQL Import feature now supports DELETE and UPDATE queries. Refer to Importing SQL.
  • The onboarding flow for SFDC sources has been modified so that only the required tables can be onboarded to the source. For more information, see the Onboarding Data from SalesForce page.
  • A new read-only role has been added to provide view-only access to Infoworks. For more details, refer to the Editing a User section here.
  • Infoworks now supports native idle timeout option for Dataproc and EMR persistent clusters (previously only available for Databricks).
  • Transitive access control of domains through the “Accessible Domains” functionality has changed in a few ways. Review the specific changes here.
  • Databricks runtime version 9.1 has been deprecated. More information can be found under the Compute section.
  • The SAML ACS URL format has changed to work with V3 APIs. Check here for information.
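
Because the Databricks access control field described above is passed through as a JSON string, it can help to check the string before saving it. The following is a minimal sketch, not Infoworks code: the helper name validate_access_control is hypothetical, and the entry shape (one of user_name, group_name, or service_principal_name plus a permission_level) is an assumption based on Databricks access control entries. Confirm the exact shape and the permission levels against the Databricks documentation and the Access Control List section.

```python
import json

# Principal keys assumed for a Databricks access control entry;
# confirm against the Databricks documentation for your workspace.
PRINCIPAL_KEYS = {"user_name", "group_name", "service_principal_name"}


def validate_access_control(raw: str) -> list:
    """Parse a JSON string and check that it looks like an access control list.

    Hypothetical helper for illustration only; it is not part of Infoworks.
    """
    entries = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(entries, list):
        raise ValueError("expected a JSON array of access control entries")
    for entry in entries:
        if not isinstance(entry, dict):
            raise ValueError(f"entry is not a JSON object: {entry!r}")
        principals = PRINCIPAL_KEYS.intersection(entry)
        if len(principals) != 1:
            raise ValueError(f"entry needs exactly one principal key: {entry!r}")
        if not isinstance(entry.get("permission_level"), str):
            raise ValueError(f"entry needs a permission_level string: {entry!r}")
    return entries


# Hypothetical value for the access control field (names are placeholders).
example = json.dumps([
    {"group_name": "data-engineers", "permission_level": "CAN_MANAGE"},
    {"user_name": "analyst@example.com", "permission_level": "CAN_VIEW"},
])
checked = validate_access_control(example)
```

The check is deliberately shallow: it catches malformed JSON and missing keys, but it does not verify that a permission level is valid for clusters or for jobs, since those sets differ in Databricks.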

Resolved Issues

This section consists of the resolved issues in this release:

JIRA ID    Issue
IPD-22036  Pipeline build fails when the external target BQ table already exists and is clustered/partitioned.
IPD-22090  Export to delimited files: Timestamp Format issue.
IPD-22351  Storage project requires create permissions despite parent project feature enablement.
IPD-22615  Partition and the clustering details are missing in the big query table created by infoworks pipeline.
IPD-22721  Advance configuration for streaming_group_id_prefix not working as expected.
IPD-22025  SQL Pipeline: Sql pipeline validation API is not filtering the advanced configuration by "IsActive" flag.
IPD-22034  Non Visual Infoworks SQL Pipeline cannot execute SnowSQL.
IPD-22037  User managed table options are not available for streaming sources in CDW environment.
IPD-22048  Dropdown option is not selectable in Interactive Compute Mappings in workflow migration.
IPD-22076  Sync to external target to delimited files is showing "Please select a Data Connection" on saving the configuration.
IPD-22332  Unable to set-active pipeline API in v5.4.2.
IPD-22421  Workflow config migration API is not behaving as expected in Infoworks 5.4.2.
IPD-21534  Initialize & ingest/truncate job is not resetting/updating the value of key last_merged_watermark.
IPD-22113  v5.4.1.3 - Pipeline creation does not happen with compute template name.
IPD-22817  Need validation on batch_engine key during the pipeline creation in API flow.
IPD-22755  GET and PUT methods aren't returning table doc with user_managed_target_path key for user managed tables.
IPD-22857  Segment load using fetch type as TPT and with multi-water mark column is running for 18 hours in the environment.
IPD-22963  The second pipeline build is causing duplicate records when the load incrementally is enabled and the sync type is set to append in 5.3.X and 5.4.X versions.
IPD-22964  Kafka schema registry password emptied post upgrade to 5.4.1 from 5.3.0.x.
IPD-22983  Incremental pipeline bringing entire dataset when the source table is from confluent source with incremental mode as append.
IPD-22985  Append and merge mode tables getting overwritten in each run of confluent source ingestion.
IPD-22038  The ingestion job continued to stay in the blocked state even if we unlock them on the admin page.
IPD-23007  Project ID field issue on the csv source configured on the BigQuery data environment.
IPD-22972  Azure Key Vault drop down doesn't show context of the secret for non-Admin users.
IPD-23079  Sync to target job for Postgres fails with authentication error.
IPD-23080  Job logs show password for Sync to Target Job.
IPD-23128  Unsaved changes popup when updated table configurations via API in v5.4.2.
IPD-23129  For SCD 2 type, few fields are not getting saved in Table Document in v5.4.2.
IPD-23228  Blob storage changes for supporting 17.2 TPT.
IPD-23239  Create source API on snowflake meta sync source is failing.
IPD-23264  Relax scale and precision check in snowflake onboarding.
IPD-23197  Backend Fix (Ingestion) - UI/RESTAPI behaviour inconsistency while updating Sync to Target properties.
IPD-23200  GET source configuration call is failing on snowflake meta sync source.
IPD-23263  IWX scheduler issue - 5.3.0.14.
IPD-23418  BigQuery pipelines are failing with java.lang.IllegalArgumentException: Provided query is null or empty error after the upgrade.
IPD-23327  Inactive spark advanced configs are taking effect in 5.4.1.4.
IPD-23452  Rest API - Bug in 5.4.1 config migration where few keys are in camelCase (should be in snake_case).
IPD-23498  Project ID field issue on the teradata source configured on the BigQuery data environment.
IPD-23515  Ingestion job on BQ environment is running queries on GCP project present in service account JSON instead of parent_project.
IPD-23526  Issues while using partitioned externally created BQ table in our pipelines.
IPD-23613  Last Modified Date in Workflow is shown incorrectly.
IPD-23663  File Archival process is not archiving files with only header records post upgrade to 5.4.
IPD-23636  Pipeline Jobs list not visible.
IPD-23647  The staging table created by Infoworks during pipeline build will be persisted in the BigQuery if the load job fails.
IPD-23754  Records were missing in the BigQuery target table even though the pipeline was completed successfully.
IPD-23822  Schema name is null/empty for incremental pipelines in BQ environment on using use_ingestion_time_as_watermark key.
IPD-23719  Pipeline versions getting emptied on production.
IPD-23313  Environment details API throws authentication error for prodops user, however the same can be viewed from UI by prodops user.
IPD-23589  Issues when selecting all tables in UI for onboarding, metadata crawl.
IPD-23634  List table APIs bringing duplicates during different offsets parameters in v5.4.2.3.
IPD-23714  Source connection details API not returning the additional connection parameters in 5.4.x.
IPD-23716  Add File mapping API failing, causing the mass ingestion script failure.
IPD-23733  "Make available in Infoworks domains" not coming in source details API.
IPD-23734  Source Config migration API on changing the table group names in iw_mappings not reflecting after import.
IPD-23783  Source config migration API doesn't return use_staging_schema_for_infoworks_managed_tables.
IPD-23899  Refresh token not generated for newly created users in 5.4.1.6.
IPD-23949  Source Config Migration API on 5.4.2.4 latest build started to fail.
IPD-23963  Issues with table-level email notification.
IPD-23732  Add tables api does not return table_type in its json dump.
IPD-24007  View run is not loading workflow run page beyond recent 20 workflow runs limit.
IPD-24164  Databricks Environment - switch to global init scripts.
IPD-24213  Issues with downloading Databricks log file.
IPD-24406  Metadata crawl for Confluent Kafka source is failing with no messages found for Topic.

Known Issues

This section lists known issues that Infoworks is aware of and is working to fix in an upcoming release:

JIRA ID    Issue
IPD-23583  Pipeline that is a part of pipeline group can be deleted. Ideally, this operation should not be allowed.
IPD-23078  Table preview request times out for the first time for SQL source.

Limitations

  • Persistent Cluster: Cluster restart does not work reliably when inactivity-timeout is set to less than 10 minutes.
  • For SQL pipelines, using Snowflake variables as a table identifier or table name is not supported.
  • BigQuery as a sync to target is not supported on EMR 6.6.
  • Cluster deploy mode is not supported on EMR and Dataproc.

Installation

For Kubernetes-based installation, refer to Infoworks Installation on Azure Kubernetes Service (AKS).

For VM-based installation, refer to VM-Based Deployment.

For more information, contact support@infoworks.io.

Upgrade

For upgrading from 5.4.2.x to 5.5.0 for Azure Kubernetes, refer to Upgrading Infoworks from 5.4.2.x to 5.5.0 for Azure Kubernetes.

For upgrading from 5.0 to 5.5.0 for VM, refer to Upgrading Infoworks from 5.0 to 5.5.0 for VM.

PAM

The Product Availability Matrix (PAM) is available here.
