Infoworks 5.4.0
Software Release Notes

Software Release Notes 5.4

Date of Release: March 2023

This section lists the new features and enhancements introduced in this release.

  • Autoscaling using Kubernetes: Infoworks now supports autoscaling when the control plane is deployed on Kubernetes, which significantly increases scalability for enterprises. For more information, refer to Autoscaling Configuration.
  • High Availability using Kubernetes: Infoworks now supports high availability when the control plane is deployed on Kubernetes.
  • External Secrets Management: Infoworks now supports storing sensitive data, such as passwords and credentials, securely in Azure Key Vault instead of the Infoworks MetaDB. For more information, refer to Storing Secrets.
  • Support for Authentication using Azure AD Service Principal and Managed Identity: Infoworks now supports accessing supported Azure resources (Databricks and Key Vault) using Service Principal and Managed Identity, in addition to the existing authentication mechanisms. For more information, refer to Service Principal and Managed Identity.
  • Pipeline Groups: In CDW (Snowflake) data environments, Infoworks provides the ability to group multiple IWX pipelines into a single transaction. This feature is primarily targeted at workload migration use cases, such as BTEQ conversion, where multiple IWX pipelines are created as part of the conversion process. For more information, refer to Creating a Pipeline Group and Building a Pipeline Group.
  • Map Roles and Domains to SAML Groups: Infoworks now lets system administrators map SAML (e.g. Active Directory) groups to IWX roles and domains. Every time a new user is added, the SAML integration manages the user's roles and domain access. For more details, refer to Configuring roles and domains.
  • Support for Role Assignment and Domain Accessibility: You can now assign roles to a user and provide access to multiple domains at the time of user creation. For more information, refer to Managing Domains and Role Assignment.
  • Parametrization: Infoworks provides the ability to pass run-time parameters to pipelines and other nodes in a workflow via workflow parameters. For more information, refer to Workflow Parameters.
  • Ability to Exclude Columns during Ingestion (Column Projection): Infoworks now provides the ability to ingest a table with a selected subset of columns in both Datalake and CDW environments. You can now choose to exclude certain columns before the ingestion job is submitted. For more details, refer to Configuring a Table.
  • AVRO & ORC Source Connector (Native): Infoworks supports onboarding of ORC and AVRO files. This allows additional use cases for fast data migration. For more information, refer to Onboarding Data from AVRO and ORC.
  • New CData Source Connector Support: Infoworks now supports the following as ingestion sources: activecampaign, apachephoenix, btrieve, cockroachdb, databricks, googlespanner, graphql, informix, neo4j, sasdatasets, sasxpt, tableaucrm. For more information, refer to the list of CDATA connectors.
  • End of Support for Certain Source Connectors: Infoworks no longer supports the following connectors as source types: alfresco, awsdatamanagement, azuredatamanagement, datarobot, digitalocean, edgaronline, evernote, fedex, financialedgenxt, hpcc, openexchangerates, quandl, salesforcechatter, sapbusinessonedi, sfeinsteinanalytics, ups, usps, wasabi.
  • Add Query Tags to Snowflake: Infoworks now provides the ability to add query tags to all Snowflake queries, helping customers manage and allocate costs internally by business unit or cost center. Query tags are available only for Snowflake CDW environments. For more information, refer to Query Tags in Snowflake.
  • PDB and PodAntiAffinity Support: Infoworks on Kubernetes now supports PodDisruptionBudget (PDB) and PodAntiAffinity. For more information, refer to Optional Configuration.
  • Ability to Perform Anti-Joins: Infoworks now provides the ability to create and edit anti-joins under transformation nodes, so that users can keep only the rows of one input that have no match in the other. For more information, refer to Joining Tables/Nodes.
  • Multi-Column CDC Watermark-based Ingestion for Infoworks Native RDBMS, Salesforce, and Generic JDBC Sources: Infoworks now allows users to configure ingestion with multiple watermark columns of the same data type for these source types. For more information, refer to RDBMS INGESTION, APPLICATION INGESTION, and GENERIC JDBC.
  • Pre and Post Job Hooks in Pipelines: Infoworks now provides the ability to add pre- and post-job hooks, which run before or after a pipeline job in the data plane. For more information, refer to Job Hooks.
  • Databricks 11.3 and Photon support: Infoworks now supports Databricks 11.3 LTS and the Photon runtime engine, enabling faster data onboarding and preparation. For more information, refer to Azure Databricks, AWS Databricks, GCP Databricks, and Snowflake.
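
The anti-join feature listed above can be illustrated with a small, self-contained Python sketch. This is not Infoworks code and does not use any Infoworks API. It assumes the common definition of a left anti-join (keep the rows of the left input whose key has no match in the right input), and the function name `left_anti_join`, the sample rows, and the `customer_id` key are all hypothetical.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def left_anti_join(left: Iterable[Row], right: Iterable[Row], key: str) -> List[Row]:
    """Return the rows of `left` whose `key` value does not appear in `right`."""
    # Collect the right-side keys once so each left row needs only a set lookup.
    right_keys = {row[key] for row in right}
    return [row for row in left if row[key] not in right_keys]


# Hypothetical sample data: customers and the orders they have placed.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
    {"customer_id": 3, "name": "Edsger"},
]
orders = [
    {"order_id": 100, "customer_id": 1},
    {"order_id": 101, "customer_id": 3},
]

# Customers that have no matching order.
customers_without_orders = left_anti_join(customers, orders, key="customer_id")
print(customers_without_orders)
```

In SQL terms, this corresponds roughly to a `NOT EXISTS` subquery on the join key. How the transformation node exposes this in the product is described in Joining Tables/Nodes.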

Resolved Issues

This section lists the issues resolved in this release:

JIRA ID    Issue
IPD-19400  A large number of jobs in the BLOCKED state would prevent execution of new jobs. This would lead to the new jobs being in the PENDING state for a long time.
IPD-19339  Even after cluster creation completes, the Creating Cluster timer keeps increasing.
IPD-19545  The list of data connections and GET data connections APIs are accessible only to Admin users.
IPD-19720  Users are unable to read the Snowflake warehouse name in the config-migration APIs.
IPD-19773  APIs fail because the private_key_file_details field is stored in JSON format instead of array format.
IPD-19751  Infoworks does not disable query caching while fetching schema from BigQuery.
IPD-19801  The job summary is not provided as a part of the Job Status API response.
IPD-19766  For BigQuery export files, Sync to Target fails when the table schema contains an array type.
IPD-19810  Users are able to set multiple default computes using API calls.
IPD-19821  When the service credential used in the BigQuery target differs from the service credential used to create the environment, Sync to Target fails with an “Invalid JWT Signature” error.
IPD-19854  For streaming sources, the Table Configuration Translator shows an error while saving advanced configuration.
IPD-19853  If the table name exceeds 27 characters, export to Teradata fails with a "table_name_temp already exist" error.
IPD-19815  There are incorrect log messages in Sync to Target for Teradata jobs in 5.3.
IPD-19900  The zip file downloaded from Application Logs does not contain cluster logs.
IPD-19945  Infoworks does not fetch the correct datatypes for the CDATA sources.
IPD-19929  In a few scenarios, upgrading from 5.3.0 to 5.3.0.5 crashes the Ingestion service.
IPD-19943  There is no provision to configure the disk space for the Dataproc clusters.
IPD-20022  Despite disabling the dataset creation in the pipeline configuration, the pipeline still creates the schema.
IPD-20087  The clustering columns are missing in the target BigQuery table.
IPD-20082  The API does not populate ClusterID for the Ephemeral cluster.
IPD-20397  Pipeline build fails when the source table column has a trailing "%".
IPD-20371  While configuring the BigQuery target, the columns are ordered alphabetically irrespective of the order the user chooses.
IPD-20570  The "Add tables to crawl" API is not working for the BigQuery Sync source.
IPD-20455  The dt advanced configuration to merge partitions is not taking effect in the pipeline job dt_batch_spark_coalesce_partitions.
IPD-20670  Google has changed the return message for exception handling of autoscaling policies, resulting in job failure.
IPD-20752  The iw_environment_cluster_policy configuration does not take effect for ephemeral clusters.
IPD-20931  The Save and Save & Add Another buttons are not working in the aggregate node.
IPD-20936  The Preview Data tab on the Infoworks pipeline sometimes fails to load data.
IPD-20949  Upgrade from 5.3.1 to 5.3.1.5 fails with an invalid image reference because the image format in the templates changed, resulting in pod failure.

Known Issues

This section lists known issues that Infoworks is aware of and is working to fix in an upcoming release:

JIRA ID    Issue
IPD-21161  Ingestion job fails for a MongoDB table when there is a timestamp field in nested columns.
IPD-20820  The Auth API does not work as expected if the restricted_visibility_mode flag or the user role is changed. It starts working again once the cache expires, either after the default expiry time of 15 minutes or after the user-configured expiry time.

Installation

For Kubernetes-based installation, refer to Infoworks Installation on Azure Kubernetes Service (AKS).

For more information, contact support@infoworks.io.

Upgrade

To upgrade to 5.4 on Kubernetes, refer to Upgrade to 5.4.0 Kubernetes.

PAM

The Product Availability Matrix (PAM) is available here.