Product Release Notes - Dec 31st, 2024

Learn about the updated naming convention from Unknown to Inferred for objects logically deduced by the Cloudera Octopai Data Lineage engine.

Introduction of Inferred Objects in Cloudera Octopai Data Lineage

The naming convention is now updated from Unknown to Inferred for objects logically deduced by the Cloudera Octopai Data Lineage engine. These Inferred objects are derived from transformations and dependencies analyzed during lineage creation, even when the source or target was not explicitly harvested. This terminology change provides greater clarity and highlights Cloudera Octopai's commitment to delivering complete lineage flows across systems.
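As a rough illustration of the idea (not Cloudera Octopai's actual implementation or API), the sketch below labels any object that appears in an analyzed dependency but was never harvested as Inferred. The `Edge` type, the `classify_objects` function, and the object names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One dependency found while analyzing a transformation (hypothetical model)."""
    source: str
    target: str


def classify_objects(harvested, edges):
    """Label every object as 'Harvested' or 'Inferred'.

    An object is 'Inferred' when it is referenced by at least one edge
    but is absent from the set of explicitly harvested objects.
    """
    labels = {name: "Harvested" for name in harvested}
    for edge in edges:
        for name in (edge.source, edge.target):
            # setdefault leaves harvested objects untouched
            labels.setdefault(name, "Inferred")
    return labels


# Hypothetical example: the staging table was never harvested,
# but two transformations reference it.
harvested = {"crm.customers", "dwh.dim_customer"}
edges = [
    Edge("crm.customers", "staging.customers_tmp"),
    Edge("staging.customers_tmp", "dwh.dim_customer"),
]
labels = classify_objects(harvested, edges)
```

Here `staging.customers_tmp` is labeled Inferred because it is only known through the dependencies that mention it, while the two harvested objects keep their Harvested label.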

Use cases for Inferred objects:

  1. Cross-System Data Lineage Analysis: Easily identify and analyze intermediate or unharvested data flows that play a crucial role in understanding dependencies across different systems.

  2. End-to-End Lineage Investigation: Gain a comprehensive view of how unharvested data objects are referenced and transformed, enabling deeper insights into the overall data lifecycle.

  3. Data Transformation Tracking: Understand how inferred objects impact downstream data by tracking transformations applied to them, helping to assess dependencies and risks.

  4. Auditing and Compliance: Leverage inferred objects to provide additional context for audit trails, ensuring that even unharvested elements are accounted for in compliance reporting.

  5. Impact Analysis for Unharvested Systems: Perform impact analysis on unharvested data objects inferred within the lineage to better prepare for system changes or migrations.

  6. Metadata Enrichment: Use inferred objects to enrich metadata, improving the quality of data catalogs and lineage for systems with limited direct connectivity.
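To make the transformation tracking and impact analysis use cases concrete, here is a minimal sketch that collects every object downstream of an inferred object. It assumes lineage is available as simple (source, target) pairs; the `downstream` function and all object names are hypothetical and do not represent a Cloudera Octopai API.

```python
from collections import defaultdict, deque


def downstream(edges, start):
    """Return every object reachable from `start` by following lineage edges."""
    graph = defaultdict(list)
    for source, target in edges:
        graph[source].append(target)

    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            # Skip objects already visited and guard against cycles back to start
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical lineage: erp.orders_raw stands in for an unharvested (inferred) source.
edges = [
    ("erp.orders_raw", "dwh.fact_orders"),
    ("dwh.fact_orders", "bi.sales_report"),
    ("crm.customers", "dwh.dim_customer"),
]
impacted = downstream(edges, "erp.orders_raw")
```

In this example, a change to the inferred object `erp.orders_raw` would affect `dwh.fact_orders` and `bi.sales_report`, but not `dwh.dim_customer`.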

Together, these use cases show the significance of inferred objects and how they contribute to a full data lineage flow. This capability supports both operational and strategic decision-making in your data ecosystem.