Incremental ingestion fails with cannot perform MERGE as multiple source rows matched and attempted to update the same t

Incremental ingestion fails with "cannot perform MERGE as multiple source rows matched and attempted to update the same target row in the Delta table" error at merge phase.

Problem Description

Incremental ingestion fails at the merge phase if the CDC data has multiple records for the given natural key with the below error.

20/06/05 05:41:30 ERROR Merger: java.lang.UnsupportedOperationException: Cannot perform MERGE as multiple source rows matched and attempted to update the same target row in the Delta table.

20/06/05 05:41:30 ERROR DistJobsDriver: io.infoworks.saas.ingestion.commons.exception.IWException: java.lang.UnsupportedOperationException: Cannot perform MERGE as multiple source rows matched and attempted to update the same target row in the Delta table.

at io.infoworks.saas.ingestion.jobs.merger.Merger.databricksMerge(Merger.java:147)

at io.infoworks.saas.ingestion.jobs.merger.Merger.runMerge(Merger.java:217)

at io.infoworks.saas.ingestion.jobs.merger.Merger.runJob(Merger.java:204)

at io.infoworks.saas.ingestion.jobs.DistJobsDriver.submitMergeJob(DistJobsDriver.java:87)

at io.infoworks.saas.ingestion.jobs.DistJobsDriver.runJobs(DistJobsDriver.java:51)

at io.infoworks.saas.ingestion.jobs.DistJobsDriver.main(DistJobsDriver.java:28)

at line06d343ff07b84216ae75d4f415b410f925.$read$$iw$$iw$$iw$$iw$$iw$$iw.(command--1:1)

at line06d343ff07b84216ae75d4f415b410f925.$read$$iw$$iw$$iw$$iw$$iw.(command--1:44)

at line06d343ff07b84216ae75d4f415b410f925.$read$$iw$$iw$$iw$$iw.(command--1:46)

at line06d343ff07b84216ae75d4f415b410f925.$read$$iw$$iw$$iw.(command--1:48)

at line06d343ff07b84216ae75d4f415b410f925.$read$$iw$$iw.(command--1:50)

Cause

This happens if the CDC job brings multiple records for the given natural key.

Solution

If the CDC table has the duplicate records then set a configuration at a table level to run dedupe on the CDC table before the merge. This will dedupe the CDC table using the natural key and ZIW_TARGET_TIMESTAMP column. Set the below key-value pair at the table level.

Copy

Applicable Infoworks versions

Infoworks Datafoundry v3.x on Databricks

VariableType to search · ESC to discard
GlossaryType to search · ESC to discard
InsertType to search · ESC to discard
No matches