October 31, 2024
This release (2.9.0-b383) of Cloudera DataFlow increases developer productivity by introducing Parameter Groups, which can be shared between flow drafts. Developers can now also create NiFi 2.0 flows in the Designer and use new Cloudera-exclusive processors to build RAG data pipelines. Deployments can now be configured with a Prometheus endpoint for scraping Apache NiFi metrics. Cloudera DataFlow’s service and deployment events and alerts now support in-app and email notifications.
What's new
- Latest NiFi version
-
Flow Deployments and Test Sessions now support the latest Apache NiFi 1.27 release.
- Build NiFi 2.0 flows in Flow Designer [Technical Preview]
-
You can now select NiFi 2.0 when creating drafts and starting test sessions, and you can configure a test session to use your custom Python-based processors.
- New Cloudera-exclusive AI processors for NiFi 2 [Technical Preview]
-
You can now implement RAG pipelines by using new processors to parse, chunk, and vectorize data, bringing context to your LLMs. The following processors are now available with NiFi 2:
-
PartitionPdf, PartitionHtml, PartitionText, PartitionDocx, PartitionCsv
-
ChunkData
-
EmbedData
-
InsertToMilvus, LexicalQueryMilvus, VectorQueryMilvus
-
PutChroma, QueryChroma
-
PutOpenSearchVector, QueryOpenSearchVector
-
Bedrock
- Parameter Groups
-
You can now centrally define and manage parameter groups in a workspace and re-use them for multiple drafts, eliminating tedious copy-and-pasting of parameters and their values.
- New Resources page
-
Users can now view and manage all their workspace resources, such as deployments, drafts, parameter groups, inbound connections, and custom NAR/Python configurations, in a single place.
- Notifications via App and Email for Cloudera DataFlow service and deployment events
-
You can now receive real-time notifications for all events related to a Cloudera DataFlow Service and its deployments through the Cloudera Management Console, under the Notifications tab, and through email.
For more information, see Setting up service and deployment notifications.
- NiFi metrics can now be exposed via a Prometheus endpoint
-
You can now configure deployments to expose NiFi metrics through a Prometheus endpoint. Once set up, you can configure your Prometheus instances to scrape these endpoints, consume relevant metrics and build custom dashboards.
For more information, see Configuring access for NiFi metrics scraping.
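To illustrate what a scraper receives from such an endpoint, here is a minimal sketch that parses the Prometheus text exposition format and sums one metric across its label sets. The metric name `nifi_example_flowfiles_queued`, the `component` label, and the sample payload are hypothetical placeholders, not names taken from Cloudera DataFlow; a real setup would normally let a Prometheus server do the scraping rather than hand-written code.

```python
import re

# One sample line of the Prometheus text exposition format looks like:
#   metric_name{label="value",...} 12.5
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, raw_value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


def total(samples, name):
    """Sum the values of every sample that carries the given metric name."""
    return sum(value for sample_name, _, value in samples if sample_name == name)


# Hypothetical payload, standing in for the body returned by a metrics endpoint.
SAMPLE_PAYLOAD = """\
# HELP nifi_example_flowfiles_queued Hypothetical metric for illustration
# TYPE nifi_example_flowfiles_queued gauge
nifi_example_flowfiles_queued{component="ingest"} 42
nifi_example_flowfiles_queued{component="publish"} 7
"""

queued = total(parse_metrics(SAMPLE_PAYLOAD), "nifi_example_flowfiles_queued")
```

The same parsed samples can feed a custom dashboard or an alert rule; the actual metric names and the endpoint URL depend on how the deployment is configured.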
- New ReadyFlows
-
-
ADLS to Pinecone
-
S3 to Pinecone
-
ADLS to Milvus
-
S3 to Milvus
-
RAG Query Milvus
-
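The RAG pattern referred to in these notes (chunk source data, vectorize the chunks, then retrieve the closest chunks for a query) can be sketched in a few lines. This is a toy model under stated assumptions: the word-count "embedding" and the in-memory index are stand-ins for a real embedding model and a vector store such as Milvus or Pinecone, and none of the function names below correspond to the processors or ReadyFlows in this release.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=40, overlap=10):
    """Split text into overlapping chunks of at most `size` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]


def embed(text):
    """Toy embedding: lowercase word counts (a real pipeline calls a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents, size=40, overlap=10):
    """Chunk every document and store each chunk next to its vector."""
    index = []
    for document in documents:
        for chunk in chunk_text(document, size, overlap):
            index.append((chunk, embed(chunk)))
    return index


def query(index, question, k=1):
    """Return the k chunks whose vectors are most similar to the question."""
    question_vector = embed(question)
    ranked = sorted(index, key=lambda item: cosine(question_vector, item[1]),
                    reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


documents = [
    "Parameter groups can be shared between flow drafts.",
    "Prometheus scrapes NiFi metrics from a deployment endpoint.",
]
index = build_index(documents)
best = query(index, "How are NiFi metrics scraped?")
```

In a real pipeline, the retrieved chunks would be passed to an LLM as context alongside the user's question.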
- New Kubernetes version support
-
Cloudera DataFlow now supports EKS/AKS 1.29.
Changes and improvements
-
As part of the upgrade process to Cloudera DataFlow 2.9.0, the Azure Postgres database is migrated from a single server to a flexible server deployment.
-
Improved asset handling for deployments makes deployment creation more robust in cases where many deployments are created at the same time.
-
Kubernetes scale-up events could cause the Cloudera DataFlow application container to be rescheduled, making Cloudera DataFlow unavailable. Additional restrictions on rescheduling the Cloudera DataFlow application were added to avoid this downtime.
-
Dependencies have been updated to Java 21, Spring 6 and Spring Boot 3.
Fixed issues
- NiFi cluster failed to auto scale with a UDP inbound connection configured
- NiFi node failed to start up due to custom Kubernetes cluster domain name
- MiNiFi logging failed to clean up a full content volume
- Vault failed to start up due to insufficient wait in its postStart script
- Auto scaling driven by flow metrics did not kick in