Incremental ingestion fails during table crawl because the path /target_hdfs_path/table_id/merged/orc does not exist

Problem Description:

Incremental ingestion fails with an error stating that ///merged/orc does not exist. A sample stack trace looks like the following:

Root cause:

This happens when the target HDFS path is deleted manually. Infoworks maintains a directory structure for each ingestion job, and the final data set is stored inside the /merged directory at the end of each job. If this directory is deleted, the next incremental job fails with the error described above.

Solution:

To fix this issue, run the ingestion as Initialize and Ingest (full load). This repopulates the directory structure in the underlying storage location.
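
Before rerunning the job, it can help to confirm that the merged directory is really missing. The following is a minimal Python sketch, not part of Infoworks: it assumes the `hdfs` command-line client is available on the machine where it runs, and it builds the path from the /target_hdfs_path/table_id/merged/orc layout shown in the title. The function names and the returned labels ("incremental", "initialize_and_ingest") are hypothetical and only illustrate the decision; they are not Infoworks API values.

```python
import subprocess
from typing import Callable, Sequence


def merged_orc_path(target_hdfs_path: str, table_id: str) -> str:
    """Build the merged ORC path using the layout named in the article title."""
    return f"{target_hdfs_path.rstrip('/')}/{table_id}/merged/orc"


def hdfs_dir_exists(
    path: str,
    run: Callable[[Sequence[str]], int] = subprocess.call,
) -> bool:
    """Return True when `path` exists in HDFS and is a directory.

    `hdfs dfs -test -d <path>` exits with status 0 in that case.
    `run` is injectable so the check can be exercised without a cluster.
    """
    return run(["hdfs", "dfs", "-test", "-d", path]) == 0


def choose_ingestion_mode(
    target_hdfs_path: str,
    table_id: str,
    run: Callable[[Sequence[str]], int] = subprocess.call,
) -> str:
    """Suggest which kind of run to trigger (hypothetical labels only)."""
    path = merged_orc_path(target_hdfs_path, table_id)
    if hdfs_dir_exists(path, run):
        return "incremental"
    # The merged directory is gone, so a full load is needed to recreate it.
    return "initialize_and_ingest"
```

For example, `choose_ingestion_mode("/target_hdfs_path", "table_id")` returns "initialize_and_ingest" when the directory has been deleted, which matches the fix described above. The job itself still has to be triggered as Initialize and Ingest through Infoworks.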

Applicable IWX versions:

IWX 2.x, 3.x
