Infoworks Release Notes

v5.3.1

Date of Release: October 2022

This section describes the new features and enhancements introduced in this release.

  • Support for AKS: Infoworks’ control plane may now be installed and used on Azure Kubernetes Service (AKS). The deployment model enables failover of Infoworks pods, providing better availability and scalability when a large number of concurrent jobs and workflows are executed. For more details, refer to Infoworks Installation on Azure Kubernetes Service (AKS).
  • Onboarding Data from Vertica: Infoworks now supports onboarding data from the Vertica data platform.
  • Ability to Exclude Columns during Ingestion (Column Projection): Infoworks now provides the ability to ingest a table with a selected subset of columns. You can now choose to exclude certain columns before the ingestion job is submitted. For more details, refer to Configuring a Table.
  • Advanced Mode for the In NotIn Node: The In NotIn pipeline node now allows multiple columns and data-modifying expressions for the inner and outer ports. This feature is supported on the Spark, Snowflake, and BigQuery execution engines. For more details, refer to Performing In NotIn Operation.
  • Streaming Deserializers: Infoworks now allows you to implement your own deserializer for deserializing the byte serialized messages from streaming sources such as Kafka and Confluent. For more details, refer to Configuring Deserializers.
  • Support for Bash Scripts on Kubernetes using Custom Images: When running bash scripts from workflows in a Kubernetes deployment, you may use custom Kubernetes containers based on images that may include libraries and tools used by the bash script. For more details, see Bash Scripts in Kubernetes using Custom Images.
  • Staging Names for CDW Target Nodes: Infoworks now allows you to create views in a Staging database (Snowflake) or a Staging dataset (BigQuery). For more details, refer to Snowflake and BigQuery targets.
  • Support for Readable Column Aliases in Pipeline-Generated Queries: A new key called dt_use_iwx_column_aliasing has been added. When this key is set to false, Infoworks uses the original column names as aliases in pipeline-generated queries. For more information, refer to Setting Pipeline Advanced Configurations.
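For the column-aliasing feature above, the advanced configuration entry might look like the following key-value pair (the key name comes from this release note; where the entry is set is described in Setting Pipeline Advanced Configurations):

```
dt_use_iwx_column_aliasing=false
```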
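To illustrate the Streaming Deserializers feature above: a custom deserializer turns the raw bytes of a Kafka or Confluent message into a structured record. The exact Infoworks deserializer interface is defined in Configuring Deserializers; the class and method names below (`JsonDeserializer`, `deserialize`) are an illustrative assumption, not the documented contract.

```python
import json

class JsonDeserializer:
    """Illustrative custom deserializer (hypothetical interface).

    Converts byte-serialized streaming messages (e.g., from Kafka or
    Confluent) into Python dicts. The actual Infoworks interface may
    require a different base class or method signature; refer to
    Configuring Deserializers for the real contract.
    """

    def deserialize(self, payload: bytes) -> dict:
        # Decode the raw bytes, then parse the JSON into a record.
        return json.loads(payload.decode("utf-8"))

# Example: deserializing a byte-encoded JSON message.
record = JsonDeserializer().deserialize(b'{"id": 1, "event": "click"}')
```

The same pattern applies to other wire formats (Avro, Protobuf): only the body of `deserialize` changes.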

Resolved Issues

This section lists the issues resolved in this release:

JIRA ID | Issue | Severity
IPD-18695 | File Preview/Schema Crawl fails for CSV ingestion from S3 buckets hosted on Gov Cloud. | High
IPD-18623 | The DISTINCT function in SQL is not imported properly in Infoworks pipelines. | Highest
IPD-18777 | Direct incremental ingestion to an existing table on Snowflake without any audit column is not allowed. | Highest
IPD-18835 | The user-managed table configuration could not be enabled because "is_table_user_managed" was not present. | Highest
IPD-18911 | The REST API does not support "HIVE_UDF" source extensions for columns. | Highest
IPD-18921 | The "in_notin" node under pipelines does not support multiple columns. | Highest
IPD-18931 | The Infoworks metacrawl job for a CSV source fetches an incorrect number of columns. | Highest
IPD-19474 | The API POST call to Pipeline Config-Migration fails. | Highest
IPD-19542 | When running ingestion on a BigQuery environment, the error table is not created in the BigQuery dataset if the source has only one error record. | Medium
IPD-19579 | The workflow variable "job_id" is not preserved; fetching it inside other nodes returns None. | High

Known Issues

This section lists limitations that Infoworks is aware of and is working to fix in an upcoming release:

JIRA ID | Issue | Severity
IPD-19633 | Sometimes, when you import a SQL file via the SQL Import section on the pipeline Settings page, the pipeline version for the SQL file is created successfully, but the application is incorrectly redirected to the Mappings page. This issue is intermittent. Workaround: Go to the pipeline Overview page and check whether the version was created. If not, try importing again. | High

Limitations

  • Limitations for the Databricks Compute "Enable Elastic Disk" Option: Elastic Disk is available only on AWS Databricks; enabling or disabling this option for Databricks on GCP or Azure has no effect. On AWS Databricks, if an Instance Pool is used in Ephemeral Compute, this option is ignored because AWS does not support Elastic Disk together with Instance Pools.

Installation

For Kubernetes-based installation, refer to Infoworks Installation on Azure Kubernetes Service (AKS).

For more information, contact support@infoworks.io.

Upgrade

For upgrading from 5.3.0 VM to 5.3.1 Kubernetes, refer to Upgrade to 5.3.1 Kubernetes.

PAM

For the PAM, see Product Availability Matrix.
