Cross system lineage

Cross System Lineage is a capability of Cloudera Octopai, a data and metadata management platform. Cloudera Octopai helps organizations understand, govern, and optimize their data assets across systems and platforms.

Figure 1. Visualized data flow
Cross system lineage example of a visualized data flow illustrating how data moves from its source to its destination, traversing through multiple systems, applications, and processes

Cross System Lineage specifically focuses on tracking and visualizing the flow of data across different systems within an organization. It provides a comprehensive view of how data moves from its source to its destination, traversing through multiple systems, applications, and processes.

With Cross System Lineage, users can see end-to-end data lineage regardless of how complex the data ecosystem is. They can trace the data path across systems such as databases, data warehouses, data lakes, Extract, Transform, Load (ETL) processes, Business Intelligence (BI) tools, and more.

Cross System Lineage has the following benefits:
  • Understanding data flow – Users can track the flow of data from its origin to its final destination, providing a clear understanding of how data is transformed and used throughout the organization.
  • Impact analysis – Cross System Lineage helps users identify the impact of changes or issues in one system on downstream systems. It allows organizations to assess the potential consequences of modifications, ensuring data integrity and minimizing risks.
  • Compliance and governance – By providing visibility into the movement of data across systems, Cloudera Octopai Cross System Lineage helps organizations meet compliance requirements and supports data governance initiatives. It helps organizations maintain data lineage documentation and ensure data accuracy, privacy, and security.
  • Troubleshooting and root cause analysis – When data-related issues occur, Cross System Lineage aids in identifying the root causes and troubleshooting effectively. It enables users to pinpoint where problems arise within the data flow and take appropriate actions to resolve them.

Overall, Cross System Lineage offered by Cloudera Octopai enhances data understanding, enables efficient data management, and facilitates informed decision-making across complex data landscapes by visualizing the end-to-end flow of data across systems.
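The impact analysis and root cause analysis benefits described above can be thought of as downstream and upstream traversals of a lineage graph. The following is a minimal conceptual sketch in Python, not Cloudera Octopai code or its API. The object names, the `EDGES` list, and the function names are all hypothetical and chosen only for illustration.

```python
from collections import deque

# Hypothetical lineage edges (source -> target); names are illustrative only.
EDGES = [
    ("crm_db.customers", "etl.load_customers"),
    ("etl.load_customers", "dwh.dim_customer"),
    ("dwh.dim_customer", "bi.sales_report"),
    ("erp_db.orders", "etl.load_orders"),
    ("etl.load_orders", "dwh.fact_orders"),
    ("dwh.fact_orders", "bi.sales_report"),
]


def build_adjacency(edges, reverse=False):
    """Map each object to its direct targets (or direct sources if reverse)."""
    adjacency = {}
    for source, target in edges:
        if reverse:
            source, target = target, source
        adjacency.setdefault(source, set()).add(target)
    return adjacency


def reachable(start, adjacency):
    """Breadth-first search: every object reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def impact_analysis(obj, edges=EDGES):
    """Downstream objects that a change to obj could affect."""
    return reachable(obj, build_adjacency(edges))


def root_cause_analysis(obj, edges=EDGES):
    """Upstream objects that could be the origin of an issue seen in obj."""
    return reachable(obj, build_adjacency(edges, reverse=True))


print(sorted(impact_analysis("etl.load_customers")))
print(sorted(root_cause_analysis("dwh.fact_orders")))
```

In this sketch, a change to the hypothetical `etl.load_customers` process affects the warehouse table and the report that depend on it, while an issue in `dwh.fact_orders` traces back to its ETL process and source table.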

Colorful cross system lineage bubble types and their connections.

Clicking a data object bubble displays a radial button with the following Cross System Lineage functionalities:

Figure 2. Data object bubble functionalities
One data object bubble with its functionalities numbered from one to seven. Number three is enlarged with its subfunctionalities.
  1. Hop on to Inner View – Internal lineage view of the component
  2. Lineage Expansion – Impact analysis
  3. More Actions – See or hide target and see or hide source
  4. Information – Component properties
  5. Lineage Expansion – Root cause analysis
  6. Lineage Focus – Change focus to this item
  7. Hop to Catalog Module – Automatic Data Catalog, if available
Figure 3. Data object bubbles with full circle and semi-circle
Three data object bubbles with possible expansions to both sides, or to either the left or the right.

Enhanced Focused Path Analysis

The Cloudera Octopai focused path analysis tool offers better usability and clearer visual indications with the following enhancements:
  • Visual Indicators for Selected Objects – When analyzing cross-system data flows, any object selected for focused path analysis displays a visual indication. This makes it easier for users to identify which objects are part of the focused path.
  • Improved Object Selection – If an object cannot be selected for focused path analysis, it means the map is already reduced to that specific path. The map shows all objects going through the selected object and their connected objects.
  • Stable Map Filters – If your analysis is focused on a specific path, filters cannot be activated. This ensures the stability of the map, as any filter changes would trigger a map recalculation. For optimal results, Cloudera Octopai recommends configuring filters before applying focused path analysis.
Figure 4. Active and inactive data objects
Active data object with the other objects, unrelated to this specific path analysis, greyed out. The radial button is also displayed for the active object.
Figure 5. Focused cross system map
Cross system map with its connected objects in focused path analysis mode, with filters disabled.
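The reduction that focused path analysis performs, keeping the objects that flow through the selected object and greying out the rest, can be illustrated with a small graph sketch. This is a conceptual Python example only and does not represent how Cloudera Octopai computes the map. The `focused_path` function and all object names are hypothetical.

```python
def focused_path(selected, edges):
    """Return the selected object plus everything upstream and downstream of it."""
    downstream, upstream = {}, {}
    for source, target in edges:
        downstream.setdefault(source, set()).add(target)
        upstream.setdefault(target, set()).add(source)

    def walk(adjacency):
        # Depth-first traversal from the selected object in one direction.
        seen, stack = set(), [selected]
        while stack:
            for neighbor in adjacency.get(stack.pop(), ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        return seen

    return {selected} | walk(upstream) | walk(downstream)


# Hypothetical cross system map (source -> target).
edges = [
    ("crm_db.customers", "dwh.dim_customer"),
    ("erp_db.orders", "dwh.fact_orders"),
    ("dwh.dim_customer", "bi.customer_report"),
    ("dwh.fact_orders", "bi.sales_report"),
    ("dwh.dim_customer", "bi.sales_report"),
]

active = focused_path("dwh.fact_orders", edges)
all_objects = {name for edge in edges for name in edge}
greyed_out = all_objects - active  # objects unrelated to the focused path

print(sorted(active))
print(sorted(greyed_out))
```

Here, focusing on the hypothetical `dwh.fact_orders` keeps its source table and the report it feeds active, and every other object on the map falls into the greyed-out set.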