Product Release Notes - March 2026

Cloudera Octopai Data Lineage now supports the Trino federated query engine for big data. Trino users can access distributed data while keeping governance and lineage visibility across federated query environments.

Overview

Trino users can query data in place across hybrid and multi-cloud environments while preserving governance and lineage. Combining Trino with the automated lineage of Cloudera Octopai gives you cross-system visibility into federated query workloads.

Primary benefits and use cases

The integration of Cloudera Octopai with Trino provides the following primary benefits and use cases:

Interoperability and federated access

The Trino integration in Cloudera Octopai enables customers to federate queries and securely access data from both Cloudera and non-Cloudera engines. This capability allows organizations to query data in place, improving interoperability within extended data environments.

Comprehensive visibility across the data estate

While Trino connects disparate, decentralized data sources, Cloudera Octopai keeps federated queries traceable by providing cross-system visibility across the data estate, end-to-end data lineage, and metadata across connected systems.

End-to-end impact and root cause analysis

Querying data across complex, hybrid environments complicates troubleshooting and change management. Some federated query tools limit lineage to the first level, covering only the immediately adjacent systems; Cloudera Octopai delivers multi-layered traceability. Data teams can perform impact analysis by tracing upstream to the original source systems and downstream to the BI reports and AI models that Trino supports, using a transparent, automated workflow.

Example:

If a data engineer needs to drop or rename a column in an on-premises Oracle database, they can trace the lineage to see that this specific column feeds a Trino query powering a critical executive Power BI dashboard. They can proactively update the downstream queries before making the change, preventing dashboard failures and data downtime.
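
Conceptually, the impact analysis in this example is a downstream traversal of a lineage graph. The following Python sketch illustrates the idea only. It does not call any Cloudera Octopai API, and the asset names and the LINEAGE structure are hypothetical, chosen to mirror the Oracle, Trino, and Power BI scenario above.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets that read from it.
# These names are invented for illustration and are not Octopai output.
LINEAGE = {
    "oracle.crm.customers.region": ["trino.query.exec_sales_summary"],
    "trino.query.exec_sales_summary": ["powerbi.dashboard.executive_sales"],
    "oracle.crm.customers.email": ["trino.query.marketing_list"],
}


def downstream_assets(lineage, start):
    """Return every asset reachable downstream of `start`, breadth-first."""
    seen = {start}  # guards against cycles in the graph
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Assets to review before dropping or renaming the column.
impacted = downstream_assets(LINEAGE, "oracle.crm.customers.region")
print(impacted)
```

With this sample graph, the traversal returns the Trino query first and then the Power BI dashboard, which is the order in which the downstream objects would need to be reviewed.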

Discovery and Business Glossary

By linking technical metadata from federated sources to standardized business terms, the integration aligns IT and business teams on the same definitions. Users can find the data they need and understand its business context before running a Trino query, using a single source of truth for terms and certified objects.

Example:

A business analyst wanting to report on "Customer Lifetime Value" no longer has to guess which of the dozens of cryptic, federated tables (for example, cust_ltv_v2 versus c_val_final) to query. They simply search the Cloudera Octopai business glossary for the approved term and are immediately directed to the exact, certified technical tables they should query using Trino.
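
The lookup in this example can be pictured as a mapping from an approved business term to the certified technical tables behind it. The Python sketch below is a toy illustration, not the Cloudera Octopai glossary interface. The GLOSSARY contents, the catalog and schema prefix, and the choice of cust_ltv_v2 as the certified table are all assumptions made for the example.

```python
# Hypothetical glossary: approved business term -> certified tables.
# Which table is certified is an assumption for this illustration.
GLOSSARY = {
    "customer lifetime value": ["lakehouse.analytics.cust_ltv_v2"],
}


def certified_tables(glossary, term):
    """Return the certified tables for a business term, ignoring case and
    surrounding whitespace. Returns an empty list for unknown terms."""
    return list(glossary.get(term.strip().lower(), []))


print(certified_tables(GLOSSARY, "Customer Lifetime Value"))
```

An analyst would then reference the returned table name in a Trino query instead of choosing between similarly named tables by guesswork.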

A trusted foundation for AI adoption

For AI and advanced analytics initiatives to succeed, organizations must trust the data feeding their models. Trino delivers the broad data access required for AI, and Cloudera Octopai provides the transparency and audit trails needed to verify data origins and transformations. Together, they help ensure that AI models are built on reliable, governed data.

Example:

When deploying a new AI or machine learning model for fraud detection, compliance officers can use the lineage graph to audit the exact Trino data pipelines feeding the model. They can demonstrate to internal risk boards and external regulators that the model relies exclusively on governed, bias-checked source data, which supports the safe launch of the AI initiative.
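
An audit of this kind can be thought of as an upstream traversal: walk from the model back to its original sources and check each source against a list of governed assets. The Python sketch below shows that idea under stated assumptions. It does not use any Cloudera Octopai API, and the UPSTREAM graph, the GOVERNED set, and all asset names are hypothetical.

```python
# Hypothetical lineage: each asset maps to the upstream assets it reads from.
UPSTREAM = {
    "ml.model.fraud_detector": ["trino.query.fraud_features"],
    "trino.query.fraud_features": [
        "hive.payments.transactions",
        "postgres.kyc.customers",
    ],
}

# Hypothetical set of sources that have passed governance review.
GOVERNED = {"hive.payments.transactions", "postgres.kyc.customers"}


def ungoverned_sources(upstream, governed, target):
    """Return the original sources of `target` that are not governed, sorted."""
    seen = {target}
    stack = [target]
    roots = set()
    while stack:
        node = stack.pop()
        parents = upstream.get(node, [])
        # An asset with no recorded parents is treated as an original source.
        if not parents and node != target:
            roots.add(node)
        for parent in parents:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return sorted(roots - governed)


print(ungoverned_sources(UPSTREAM, GOVERNED, "ml.model.fraud_detector"))
```

An empty result means every original source found in the sample graph is on the governed list. Any name returned would be a source to investigate before the model is approved.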

Conclusion

The integration of Cloudera Octopai with Trino creates a robust, open data fabric that improves time-to-insight and supports modern AI and analytics.

For more information, see Configuring Octopai Connector for Trino.