Date of Release: March 2023
This section lists the new features and enhancements introduced in this release.
- Autoscaling using Kubernetes: Infoworks now supports autoscaling when the control plane is deployed on Kubernetes, providing for significantly increased scalability for enterprises. For more information, refer to Autoscaling Configuration.
- High Availability using Kubernetes: Infoworks supports high availability when the control plane is deployed on Kubernetes.
- External Secrets Management: Infoworks now supports storing sensitive data, such as passwords and credentials securely in Azure Keyvault instead of Infoworks MetaDB. For more information, refer to Storing Secrets.
- Support for Authentication using Azure AD Service Principal and Managed Identity: Infoworks now supports accessing supported Azure resources (Databricks and Keyvault) using Service Principal and Managed Identity, alongside the existing authentication mechanisms. For more information, refer to Service Principal and Managed Identity.
- Pipeline Groups: In the CDW (Snowflake) Data Environment, Infoworks provides the ability to group multiple IWX pipelines into a single transaction. This feature primarily targets workload migration use cases, such as BTEQ conversion, where multiple IWX pipelines are created as part of the conversion process. For more information, refer to Creating a Pipeline Group and Building a Pipeline Group.
- Map Roles and Domains to SAML Groups: Infoworks now lets system administrators map SAML (for example, Active Directory) groups to IWX roles and domains. Every time a new user is added, the SAML integration manages user roles and domain accessibility. For more details, refer to Configuring roles and domains.
- Support for Role Assignment and Domain Accessibility: You can now assign roles to the user and provide access to multiple domains at the time of user creation. For more information, refer to Managing Domains and Role Assignment.
- Parametrization: Infoworks provides the ability to pass run-time parameters to pipelines and other nodes in a workflow via workflow parameters. For more information, refer to Workflow Parameters.
- Ability to Exclude Columns during Ingestion (Column Projection): Infoworks now provides the ability to ingest a table with a selected subset of columns in both Datalake and CDW environments. You can now choose to exclude certain columns before the ingestion job is submitted. For more details, refer to Configuring a Table.
- AVRO & ORC Source Connector (Native): Infoworks supports onboarding ORC and AVRO files, enabling additional fast data migration use cases. For more information, refer to Onboarding Data from AVRO and ORC.
- New CData Source Connector Support: Infoworks now supports the following as ingestion sources: activecampaign, apachephoenix, btrieve, cockroachdb, databricks, googlespanner, graphql, informix, neo4j, sasdatasets, sasxpt, tableaucrm. For more information, refer to the list of CDATA connectors.
- End-of-Support for Certain Source Connectors: Infoworks no longer supports the following connectors as source types: alfresco, awsdatamanagement, azuredatamanagement, datarobot, digitalocean, edgaronline, evernote, fedex, financialedgenxt, hpcc, openexchangerates, quandl, salesforcechatter, sapbusinessonedi, sfeinsteinanalytics, ups, usps, wasabi.
- Add Query Tags to Snowflake: Infoworks now provides the ability to add query tags to all Snowflake queries, helping customers manage and allocate costs internally by business unit or cost center. Query tags are available only for Snowflake CDW environments. For more information, refer to Query Tags in Snowflake.
- Infoworks on Kubernetes now supports PodDisruptionBudget (PDB) and PodAntiAffinity. For more information, refer to Optional Configuration.
- Ability to perform Anti-Joins: Infoworks now provides the ability to create and edit anti-joins under transformation nodes to help the user bring in the necessary data. For more information, refer to Joining Tables/Nodes.
- Multi-Column CDC Watermark-Based Ingestion for Infoworks Native RDBMS, Salesforce, and Generic JDBC Sources: Infoworks now allows users to configure ingestion with multiple watermark columns of the same data type for these source types. For more information, refer to RDBMS INGESTION, APPLICATION INGESTION, and GENERIC JDBC.
- Enable Pre- and Post-Job Hooks in Pipelines: Infoworks now provides the ability to add pre- and post-job hooks that run before or after a pipeline job in the data plane. For more information, refer to Job Hooks.
- Databricks 11.3 and Photon support: Infoworks now supports Databricks 11.3 LTS and the Photon runtime engine, enabling faster data onboarding and preparation. For more information, refer to Azure Databricks, AWS Databricks, GCP Databricks, and Snowflake.
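The PDB and PodAntiAffinity support mentioned above relies on standard Kubernetes mechanisms. As an illustrative sketch only (the names `iw-ui` and the label values here are hypothetical placeholders, not Infoworks defaults; refer to Optional Configuration for the actual settings), a PodDisruptionBudget plus an anti-affinity rule might look like:

```yaml
# PodDisruptionBudget: keep at least one replica of the hypothetical
# iw-ui service available during voluntary disruptions (e.g., node drains).
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: iw-ui-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: iw-ui
---
# In the Deployment's pod template spec, podAntiAffinity asks the
# scheduler to spread replicas across nodes, so a single node failure
# cannot take down every replica at once.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: iw-ui
          topologyKey: kubernetes.io/hostname
```

Using `preferredDuringSchedulingIgnoredDuringExecution` (rather than the `required...` variant) keeps pods schedulable even when fewer nodes than replicas are available.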
Resolved Issues
This section lists the issues resolved in this release:
JIRA ID | Issue
---|---
IPD-19400 | A large number of jobs in the BLOCKED state would prevent execution of new jobs, leaving the new jobs in the PENDING state for a long time. |
IPD-19339 | Even after cluster creation completes, the Creating Cluster timer keeps increasing. |
IPD-19545 | The list of data connections and GET data connections APIs are accessible only to Admin users. |
IPD-19720 | User is unable to read Snowflake warehouse name in config-migration APIs. |
IPD-19773 | APIs are failing because the private_key_file_details field is stored in JSON format instead of array format. |
IPD-19751 | Infoworks does not disable query caching while fetching schema from BigQuery. |
IPD-19801 | The job summary is not provided as a part of the Job Status API response. |
IPD-19766 | For BigQuery export files, Sync to Target is failing when the table schema contains array type. |
IPD-19810 | User is able to set multiple default computes using API calls. |
IPD-19821 | When the service credential used in the BigQuery target differs from the service credentials used to create the environment, Sync to Target fails with an “Invalid JWT Signature” error. |
IPD-19854 | For the Streaming sources, the Table Configuration Translator shows an error while saving advanced configuration. |
IPD-19853 | If the table name exceeds 27 characters, export to Teradata fails with a "table_name_temp already exist" error. |
IPD-19815 | There are incorrect log messages in Sync to Target for Teradata jobs in 5.3. |
IPD-19900 | The zip file downloaded from Application Logs does not contain cluster logs. |
IPD-19945 | Infoworks does not fetch the correct datatypes for the CDATA sources. |
IPD-19929 | In a few scenarios, upgrading from 5.3.0 to 5.3.0.5 crashes the Ingestion service. |
IPD-19943 | There is no provision to configure the disk space for the Dataproc clusters. |
IPD-20022 | Despite disabling the dataset creation in the pipeline configuration, the pipeline still creates the schema. |
IPD-20087 | The Clustering columns are missing in the Target BigQuery table. |
IPD-20082 | The API does not populate ClusterID for the Ephemeral cluster. |
IPD-20397 | Pipeline build fails when the source table column has trailing "% ". |
IPD-20371 | While configuring the BigQuery target, the columns are ordered alphabetically irrespective of the order the user chooses. |
IPD-20570 | The "Add tables to crawl" API is not working for BigQuery Sync source |
IPD-20455 | The dt advanced configuration dt_batch_spark_coalesce_partitions, which merges partitions, does not take effect in the pipeline job. |
IPD-20670 | Google changed the return message for exception handling of autoscaling policies, resulting in job failures. |
IPD-20752 | The iw_environment_cluster_policy configuration does not take effect for ephemeral clusters. |
IPD-20931 | The Save and Save & Add Another buttons are not working in the aggregate node. |
IPD-20936 | The Preview Data tab on the Infoworks pipeline fails to load data sometimes. |
IPD-20949 | Upgrade from 5.3.1 to 5.3.1.5 fails with an invalid image reference: a change in the image format in the templates results in pod failure. |
Known Issues
This section lists known issues that Infoworks is aware of and is working to fix in an upcoming release:
JIRA ID | Issue
---|---
IPD-21161 | The ingestion job fails for a MongoDB table when a nested column contains a timestamp field. |
IPD-20820 | The Auth API does not work as expected if the restricted_visibility_mode flag or a user role is changed. It starts working again after the default cache expiry of 15 minutes or the user-configured expiry time. |
Installation
For Kubernetes-based installation, refer to Infoworks Installation on Azure Kubernetes Service (AKS).
For more information, contact support@infoworks.io.
Upgrade
For upgrading to 5.4 Kubernetes, refer to Upgrade to 5.4.0 Kubernetes.
PAM
The Product Availability Matrix (PAM) is available here.