Infoworks Release Notes

v5.5.1

Date of Release: April 2024

This section describes the new features and enhancements introduced in this release.

  • The Airflow executor has been changed to CeleryKubernetesExecutor, which runs deferrable tasks on Celery workers and bash nodes on dedicated Kubernetes pods. For details, refer to the notes section here.
  • Infoworks now provides a log archival feature for Kubernetes-based installations, enabling periodic archiving and deletion of service logs and job logs to optimize disk usage. For more details, refer to the Log Archival Configuration section.
  • Users can now export data directly from the Data Lake to a SQL Server target table during ingestion. Refer to the Configuring SQL Server Target section here.
  • Users can now optionally set the origin 'from address' and 'sender name' via the SMTP configurations "email.fromAddress" and "email.fromName". Refer here for details.
  • Infoworks now supports ADLS Gen2 for Teradata TPT configurations. All the information on this can be found here.
  • Support has been added for downloading dataplane logs for ingestion interactive jobs. You can view the details here.
  • Users can now export data directly from the Data Lake to an Oracle target table during ingestion. Refer to the Configuring Oracle Target section.
  • Users can now sort Snowflake target tables based on the column provided in the table configuration or the pipeline target node configuration.
  • Users can delete tables from a data source that were crawled but do not need to be ingested. Detailed information can be found on the auto$ page.
  • Infoworks can now handle the Databricks limitation where the job parameter length exceeds 10,000 characters. Refer to the note here.
  • Users can now configure the Databricks cluster log path and job log path using a generalized user-defined path. Information on this can be viewed here.
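The 'from address' and 'sender name' pair described above corresponds to the standard From header of an outgoing email. As a minimal, Infoworks-independent sketch (the values below are placeholders, not real configuration), Python's standard library combines the two like this:

```python
from email.utils import formataddr

# Placeholder values standing in for the "email.fromName" and
# "email.fromAddress" SMTP configurations described above.
sender_name = "Infoworks Alerts"
from_address = "alerts@example.com"

# formataddr builds the RFC 5322 From header value: Name <address>
from_header = formataddr((sender_name, from_address))
print(from_header)  # Infoworks Alerts <alerts@example.com>
```

Setting both values lets notification emails display a friendly sender name while still originating from a fixed address.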

Resolved Issues

This section lists the issues resolved in this release:

JIRA ID    Issue
IPD-25763  Print the number of records inserted/deleted/updated to the Snowflake pipeline/pipeline group logs.
IPD-25661  Handle pipeline migration when the reference table needed to create a new table is not present.
IPD-24093  Config maps are overwritten during upgrade.
IPD-24529  A subset of users is unable to authenticate via SAML after upgrading to 5.5.
IPD-24201  Unable to use a derived column as the watermark column for incremental ingestion.
IPD-24430  Unable to delete or deactivate an advanced configuration key if the key has a trailing space.
IPD-24258  API calls fail with a "504 Gateway Time-out" error.
IPD-24502  Support ingestion from Azure Event Hub using OAuth.
IPD-24537  Unchecking "enable watermark offset" does not work in 5.5.
IPD-24546  Ingestion job is marked as failed after successfully ingesting data.
IPD-24551  Modify the manifest script to pick the correct manifest information.
IPD-24555  Filter sources by associated_domains using the Infoworks API in v5.5.
IPD-24598  Not all source tables included in the domain are listed in the table mappings dropdown.
IPD-24601  Not all reference tables are listed in the reference table dropdown of the target configuration.
IPD-24602  Pipeline config import does not proceed further and shows a blank screen.
IPD-24610  Segmentation for Snowflake tables: the query fails in the background when MOD is used in segmentation.
IPD-24547  Questions on the log path location for Databricks clusters.
IPD-24576  Teradata drivers are missing after upgrade.
IPD-24628  Unable to delete a pipeline extension in v5.5.
IPD-24672  The pipeline config migration API automatically adds audit columns to the target node when the config JSON has no audit columns in the target (not acceptable for an existing target table).
IPD-24680  Ingestion jobs fail with a class-not-found exception after the 5.5.0.2 upgrade.
IPD-24696  Change in behavior of upload schema in the 5.5 version.
IPD-24740  The Infoworks config migration API fails to update table configurations if any of the tables within the source have already been ingested.
IPD-24754  SQL node pipeline fails query validation.
IPD-24759  In the User Patch endpoint, associated domains are removed when the username is updated.
IPD-24747  Update pipelines fail with an invalid identifier for the update column.
IPD-24785  Issue in onboarding tables with the same name from different schemas.
IPD-24820  Scheduled workflows continue to trigger even if the source is made unavailable to the domain.
IPD-24821  Enhancement request: Accessible sources and "Make available in Infoworks domains" are not in sync.
IPD-24825  The pipeline group jobs "next" URL in the API does not work as expected.
IPD-24826  Pipeline status shows as Pending even though the pipeline group job is aborted.
IPD-24833  Unable to set the scheduled job time correctly when the minute is set to '00'.
IPD-24850  REST APIs do not encode '#' in the URL.
IPD-24873  Job metrics are returned empty for a completed job via the API.
IPD-24939  Configuration key to use a custom port for Teradata TPT jobs.
IPD-24944  Persist Airflow logs on volume mounts.
IPD-24946  Unable to add S3 storage in Databricks on AWS.
IPD-25236  Recrawl metadata throws "Provided Table Ids are not present in the Source".
IPD-25245  Pipeline fails with a SQL compilation error in v5.5.0.3.
IPD-25283  Setting access control/permissions (access_control_list) when creating Databricks clusters does not work in v5.5.0.3.
IPD-25307  User-managed tables should not be reference tables.
IPD-25341  Bash nodes with a custom image fail on version 5.5.0.4.
IPD-25319  Updating a CSV source affects stored access key details.
IPD-25318  Unable to save in the UI while configuring the same target details for user-managed tables.
IPD-25372  The source setup page refreshes continuously.
IPD-25362  Ingestion of a Snowflake table onboarded as a query-as-table fails with a null pointer exception.
IPD-25448  Questions on the log path location for Databricks clusters.
IPD-25368  Add warnings on a table's save action.
IPD-25542  Support Enhanced Flexibility Mode for Dataproc clusters.
IPD-25580  Jobs fail randomly during cluster creation with a file-not-found error after changing to Dataproc 2.0.
IPD-25404  Ingestion of a Snowflake table fails with a null pointer exception when a watermark offset is configured in an ingest-mode job run in v5.5.0.4.
IPD-25470  PyCryptodome package version 3.15.0 vulnerability.
IPD-25479  The "enable watermark offset" option gets disabled if the data type is changed to string.
IPD-25509  Enhancement: Ability to pass the output of a bash script to a pipeline.
IPD-25554  Unable to override workflow variables in pipeline parameters.
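One of the fixes above (IPD-24850) concerns REST API URLs not encoding '#'. As general background rather than Infoworks-specific code: '#' begins the fragment part of a URL, so a literal '#' inside a path or query value must be percent-encoded as %23 before the request is sent. A minimal sketch with Python's standard library (the value is a hypothetical identifier, not an Infoworks one):

```python
from urllib.parse import quote

# A literal '#' would otherwise be interpreted as the start of a URL
# fragment, silently truncating everything after it on the server side.
value = "job#42"  # hypothetical identifier containing '#'
encoded = quote(value, safe="")
print(encoded)  # job%2342
```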

Known Issues

The following section lists known issues that Infoworks is aware of and is working to fix in an upcoming release:

JIRA ID    Issue
IPD-25836  CDC pipeline builds in Insert Overwrite mode fail on the Azure Databricks environment with the DELTA storage format.

NOTE: When the Delta format is used for the target node in Insert Overwrite mode, the pipeline build fails on persistent and ephemeral clusters in the Azure Databricks environment with the error "Nested subquery is not supported in the DELETE condition". This is a known limitation of Databricks SQL on Delta, and the pipeline build cannot be performed on Azure Databricks in Insert Overwrite mode with the Delta format.

IPD-25881  In a Snowflake on EMR environment, pipeline group jobs fail when "run driver job on dataplane" is enabled.
IPD-25879  Sync to target to an Oracle database fails for the latest Oracle instance (Oracle version 19).
IPD-25882  An NPE occurs for Oracle sync to target when the additional params field is empty.

Limitations

  • Multiple pipelines executing in parallel on Spark compute (dataplane) cause Snowflake session failures, leading to failed pipeline builds. It is therefore recommended to run Snowflake pushdown queries on the compute plane.

Installation

For Kubernetes-based installation, refer to Infoworks Installation on Azure Kubernetes Service (AKS).

For VM-based installation, refer to VM-Based Deployment.

For more information, contact support@infoworks.io.

Upgrade

For upgrading from 5.5.0.x to 5.5.1 for Azure Kubernetes, refer to Upgrading Infoworks from 5.5.0.x to 5.5.1 for Azure Kubernetes.

For upgrading from 5.5.0 to 5.5.1 for VM, refer to Upgrading Infoworks from 5.5.0 to 5.5.1 for VM.

PAM

The Product Availability Matrix (PAM) is available here.
