October 31, 2024

This release (2.9.0-b383) of Cloudera DataFlow introduces Parameter Groups, which can be shared between flow drafts to increase developer productivity. Developers can now also create NiFi 2.0 flows in the Designer, using new Cloudera-exclusive processors for building RAG data pipelines. Deployments can now be configured with a Prometheus endpoint that allows scraping Apache NiFi metrics. Cloudera DataFlow’s service and deployment events and alerts now support in-app and email notifications.

What's new

Latest NiFi version

Flow Deployments and Test Sessions now support the latest Apache NiFi 1.27 release.

Build NiFi 2.0 flows in Flow Designer [Technical Preview]

You can now select NiFi 2.0 when creating drafts and start test sessions for them. You can also configure a test session to use your custom Python-based processors.

New Cloudera-exclusive AI processors for NiFi 2 [Technical Preview]

You can now implement RAG pipelines by using new processors to parse, chunk, and vectorize data, bringing context to your LLMs. The following processors are now available with NiFi 2:

  • PartitionPdf, PartitionHtml, PartitionText, PartitionDocx, PartitionCsv

  • ChunkData

  • EmbedData

  • InsertToMilvus, LexicalQueryMilvus, VectorQueryMilvus

  • PutChroma, QueryChroma

  • PutOpenSearchVector, QueryOpenSearchVector
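As a rough illustration of the chunking step in a parse, chunk, and vectorize pipeline, the sketch below splits text into fixed-size, overlapping character windows. It is not the implementation of the ChunkData processor; the function name `chunk_text` and its `chunk_size` and `overlap` parameters are hypothetical and chosen only for this example.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration only; real chunkers often split
    on sentence or section boundaries instead of raw character counts.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

The overlap keeps some shared context between neighboring chunks, so a sentence cut at a window boundary still appears whole in at least one chunk before it is embedded.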

Shared Parameter Groups

You can now centrally define and manage parameter groups in a workspace and re-use them for multiple drafts, eliminating tedious copy-and-pasting of parameters and their values.

New Resources page

You can now view and manage all your workspace resources, such as deployments, drafts, parameter groups, inbound connections, and custom NAR/Python configurations, in a single place.

Notifications via App and Email for Cloudera DataFlow service and deployment events

You can now receive real-time notifications for all events related to a Cloudera DataFlow Service and its deployments through the Cloudera Management Console, under the Notifications tab, and through email.

For more information, see Setting up service and deployment notifications.

NiFi metrics can now be exposed via a Prometheus endpoint

You can now configure deployments to expose NiFi metrics through a Prometheus endpoint. Once set up, you can configure your Prometheus instances to scrape these endpoints, consume relevant metrics and build custom dashboards.

For more information, see Configuring access for NiFi metrics scraping.
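To give a sense of what consuming such an endpoint involves, here is a minimal sketch that parses the Prometheus text exposition format into a dictionary. The endpoint URL, any authentication it requires, and the metric names in the sample are assumptions for illustration; they are not taken from the Cloudera DataFlow documentation. In practice a Prometheus server does the scraping, and the helper names `parse_metrics` and `scrape` are hypothetical.

```python
import urllib.request


def parse_metrics(text: str) -> dict[str, float]:
    """Parse Prometheus text exposition lines into {series: value}.

    Simplified: comment lines (# HELP, # TYPE) are skipped and optional
    trailing timestamps are ignored.
    """
    samples: dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Labels may contain spaces, so split after the closing brace.
            end = line.rfind("}") + 1
            series, rest = line[:end], line[end:].split()
        else:
            parts = line.split()
            series, rest = parts[0], parts[1:]
        samples[series] = float(rest[0])
    return samples


def scrape(url: str, timeout: float = 10.0) -> dict[str, float]:
    """Fetch a metrics endpoint and parse it (no authentication handled)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical sample payload; real NiFi metric names will differ.
    sample = (
        "# TYPE example_flowfiles_received_total counter\n"
        'example_flowfiles_received_total{component="My Flow"} 12\n'
        "example_up 1\n"
    )
    for series, value in parse_metrics(sample).items():
        print(series, value)
```

Each key in the result is a series name including its labels, which is usually enough for a quick check that the endpoint exposes the metrics you plan to chart.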

New ReadyFlows

  • ADLS to Pinecone

  • S3 to Pinecone

  • ADLS to Milvus

  • S3 to Milvus

  • RAG Query Milvus

New Kubernetes version support

Cloudera DataFlow now supports EKS/AKS 1.29.

Changes and improvements

  • As part of the upgrade process to Cloudera DataFlow 2.9.0, the Azure Postgres database is migrated from a single server to a flexible server deployment.

  • Improved asset handling for deployments makes deployment creation more robust in cases where many deployments are created at the same time.

  • Kubernetes scale-up events could cause the Cloudera DataFlow application container to be rescheduled, making Cloudera DataFlow unavailable. Additional restrictions on rescheduling the Cloudera DataFlow application were added to avoid downtime.

  • Dependencies have been updated to Java 21, Spring 6 and Spring Boot 3.

Fixed issues

  • NiFi cluster failed to auto scale with a UDP inbound connection configured
  • NiFi node failed to start up due to custom Kubernetes cluster domain name
  • MiNiFi logging failed to clean up a full content volume
  • Vault failed to start up due to insufficient wait in its postStart script
  • Auto scaling driven by flow metrics did not kick in