Impala lineage
You can use the Atlas lineage graph to understand where data comes from, what it affects downstream, and how it has changed over time, across all your data assets.
Atlas collects metadata from Impala to represent the lineage among data assets. The Atlas lineage graph shows the input and output processes that the current entity participated in. An entity is included if it is an input to a process that leads to the current entity, or if it is an output of a process for which the current entity is an input. Impala processes follow this pattern.
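
The inclusion rule above can be illustrated with a small graph walk. This is a minimal sketch, not the Atlas implementation or its API: the table names, process names, and the `lineage` function are hypothetical, and the sketch assumes the rule is applied transitively (an input of an input is also included).

```python
from collections import deque

# Hypothetical Impala processes: each reads some entities and writes others.
PROCESSES = [
    {"name": "ctas_sales_clean", "inputs": ["raw.sales"], "outputs": ["stg.sales_clean"]},
    {"name": "insert_sales_daily", "inputs": ["stg.sales_clean", "ref.stores"], "outputs": ["mart.sales_daily"]},
    {"name": "view_sales_report", "inputs": ["mart.sales_daily"], "outputs": ["mart.sales_report"]},
]


def lineage(entity, processes):
    """Return (upstream, downstream) entity sets for the given entity."""

    def walk(start, src_key, dst_key):
        # Breadth-first walk: follow every process that has the current
        # entity on its src_key side and collect what is on its dst_key side.
        seen, queue = set(), deque([start])
        while queue:
            current = queue.popleft()
            for proc in processes:
                if current in proc[src_key]:
                    for nxt in proc[dst_key]:
                        if nxt != start and nxt not in seen:
                            seen.add(nxt)
                            queue.append(nxt)
        return seen

    # Upstream: inputs of processes that lead to the entity.
    upstream = walk(entity, "outputs", "inputs")
    # Downstream: outputs of processes for which the entity was an input.
    downstream = walk(entity, "inputs", "outputs")
    return upstream, downstream


if __name__ == "__main__":
    up, down = lineage("mart.sales_daily", PROCESSES)
    print("upstream:", sorted(up))
    print("downstream:", sorted(down))
```

In this sketch, `ref.stores` appears in the lineage of `mart.sales_daily` but not in the lineage of `stg.sales_clean`, because it only shares a process with `stg.sales_clean` as a sibling input.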
Starting with Cloudera Runtime 7.1.9 SP2, Impala lineage events include an explicit operationType field. Atlas uses that value when it builds Impala process and column lineage so the lineage graph reflects the operation without relying only on parsing the full queryText. Before 7.1.9 SP2, Atlas inferred the operation from the query text, which was less reliable for some statements.
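
The difference between the two approaches can be sketched as follows. This is an illustration only, not Atlas code: the event is modeled as a plain dictionary, the `resolve_operation` function and the operation labels are hypothetical, and the fallback to text-based inference when operationType is absent is an assumption. Only the `operationType` and `queryText` field names come from the description above.

```python
import re

# Hypothetical patterns and labels used to guess the operation from query text.
# Real operation names in Atlas or Impala may differ.
_KEYWORD_LABELS = [
    (re.compile(r"^\s*create\s+table\b.*\bas\s+select\b", re.I | re.S), "CREATE_TABLE_AS_SELECT"),
    (re.compile(r"^\s*create\s+view\b", re.I), "CREATE_VIEW"),
    (re.compile(r"^\s*alter\s+view\b", re.I), "ALTER_VIEW"),
    (re.compile(r"^\s*insert\b", re.I), "INSERT"),
]


def resolve_operation(event):
    """Return (operation, how) where how is 'explicit' or 'inferred'."""
    explicit = event.get("operationType")
    if explicit:
        # Explicit field present: no need to interpret the query text.
        return explicit, "explicit"
    # Assumed fallback: guess the operation from the start of the query text.
    text = event.get("queryText", "")
    for pattern, label in _KEYWORD_LABELS:
        if pattern.search(text):
            return label, "inferred"
    return "UNKNOWN", "inferred"


if __name__ == "__main__":
    query = "/* nightly load */ insert into t select * from s"
    print(resolve_operation({"queryText": query}))
    print(resolve_operation({"operationType": "INSERT", "queryText": query}))
```

The leading comment in the example query defeats this naive text matching, which is one way text-based inference can go wrong; with the explicit field, the same event resolves without inspecting the query text at all.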
