Intermittent Failure of CDC SCD2 Pipeline Builds Due to Timestamp Casting Issue
Affected Versions
6.1.1
Description
CDC SCD2 pipeline builds fail intermittently because of a timestamp casting error. Spark 3.0 introduced a new datetime parser with stricter validation, so pipelines that parse timestamp strings can fail with a parsing exception that breaks the build. This article explains the root cause and the steps to resolve the issue.
Root Cause
The error stems from a change in behavior in Spark >= 3.0 related to datetime parsing. By default, Spark uses the new parser introduced in version 3.0, which may fail to parse certain datetime formats, resulting in errors such as:
org.apache.spark.SparkUpgradeException: [INCONSISTENT_BEHAVIOR_CROSS_VERSION.PARSE_DATETIME_BY_NEW_PARSER] Fail to parse '2024-11-26 7:20:59' in the new parser.
The error occurs because Spark's timeParserPolicy defaults to EXCEPTION, which raises an exception for datetime strings that do not strictly match the expected format (for example, the single-digit hour in the value above). This affects any pipeline that relies on Spark's datetime parsing.
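The difference between strict and lenient parsing can be illustrated outside Spark. The following is a minimal plain-Python analogy, not Spark's actual parser: it assumes the expected pattern is equivalent to yyyy-MM-dd HH:mm:ss with zero-padded fields, and the helper names are hypothetical.

```python
import re
from datetime import datetime

# Assumed strict shape: every field zero-padded, e.g. 2024-11-26 07:20:59.
STRICT_TS = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def is_strictly_padded(value: str) -> bool:
    """Return True only if every field has its full, zero-padded width."""
    return STRICT_TS.fullmatch(value) is not None


def parse_leniently(value: str) -> datetime:
    """Parse with Python's strptime, which accepts a single-digit hour."""
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")


sample = "2024-11-26 7:20:59"  # value taken from the error message above
print(is_strictly_padded(sample))  # the strict check rejects the value
print(parse_leniently(sample))     # the lenient parser still reads it
```

A check like is_strictly_padded could be run against a sample of source values to see whether non-padded timestamps are present; the rows that fail it are the likely cause of the intermittent failures.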
To Resolve
To work around this issue, adjust the timeParserPolicy to use the legacy datetime parser. This can be achieved using one of the following approaches:
Option 1: Configure Pipeline Settings
Navigate to the settings page of the affected pipeline.
In the Advanced Configuration section, add the following key-value pair:
- Key: iw_spark_app_conf
- Value: spark.sql.legacy.timeParserPolicy=LEGACY
Use an ephemeral cluster for pipeline execution.
Option 2: Configure Compute (Environment Level)
Open the Advanced Configuration settings for the compute in your environment.
Add the following key-value pair:
- Key: spark.sql.legacy.timeParserPolicy
- Value: LEGACY
Restart the compute to apply the changes.
Option 3: Directly Set the Property in Compute
Directly set the following configuration in the compute settings:
- Key: spark.sql.legacy.timeParserPolicy
- Value: LEGACY
Restart the compute cluster.
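After applying any of the options, it can help to confirm that the setting actually reached the Spark session. The sketch below assumes a PySpark session object is available (for example, the spark variable in a notebook attached to the compute). The helper names are hypothetical, and the fallback value EXCEPTION reflects the default described in this article.

```python
POLICY_KEY = "spark.sql.legacy.timeParserPolicy"


def time_parser_policy(spark) -> str:
    """Return the session's time parser policy as an upper-case string."""
    # spark.conf.get(key, default) reads a runtime config value in PySpark.
    # "EXCEPTION" is used as the fallback, per the default described above.
    return str(spark.conf.get(POLICY_KEY, "EXCEPTION")).upper()


def legacy_parser_enabled(spark) -> bool:
    """Return True when the legacy datetime parser is active."""
    return time_parser_policy(spark) == "LEGACY"
```

Calling legacy_parser_enabled(spark) in a session on the restarted compute should return True once the configuration has been applied; if it returns False, the key was likely set at a different level than the one the pipeline runs on.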
For more details, refer to our Knowledge Base and Best Practices!
For help, contact our support team!
© UNIPHORE TECHNOLOGIES 2025 | Confidential