Count mismatch due to Hive fetching the count from metastore statistics


Problem Description:

During the pipeline build process, when we run a row count query on the pipeline target tables, the record count in Hive does not match the record count in the BigQuery target.

Root cause:

The count mismatch occurred because the Hive shell returned the count from the Hive statistics stored in the metastore instead of counting the rows in the table.
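
One way to check whether stored statistics are involved is to look at the table parameters Hive keeps in the metastore. The session below is a minimal sketch, not a record of the original investigation: target_db.target_table is a hypothetical name standing in for one of the pipeline target tables. In the output, the Table Parameters section normally includes a numRows entry, and that value can be compared with the count the Hive shell returned.

```sql
-- Hypothetical table name; substitute one of the pipeline target tables.
-- Look for numRows under "Table Parameters" in the output.
DESCRIBE FORMATTED target_db.target_table;

-- Prints the current value of the property that lets Hive answer
-- simple aggregates such as COUNT(*) from stored statistics.
SET hive.compute.query.using.stats;
```

If numRows equals the mismatched count and the property is true, the count most likely came from statistics and not from a scan of the data.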

Solution:

Set the following Hive configuration to force Hive to run the count query every time instead of answering it from statistics.

hive> set hive.compute.query.using.stats=false;
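
As a sketch of how the setting fits into a verification run, the statements below disable the statistics shortcut and then rerun the count. The table name target_db.target_table is an assumption used only for illustration. The property applies to the current session only, so it has to be set again in each new Hive session.

```sql
-- Force Hive to compute the aggregate from the data for this session.
SET hive.compute.query.using.stats=false;

-- Hypothetical table name; substitute the pipeline target table.
SELECT COUNT(*) FROM target_db.target_table;
```

For a non-interactive check, the same property can be passed on the command line with the Hive CLI's --hiveconf option, for example: hive --hiveconf hive.compute.query.using.stats=false -e "SELECT COUNT(*) FROM target_db.target_table;"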
