ReadyFlow: ADLS to Chroma DB [Technical Preview]

You can use the ADLS to Chroma DB [Technical Preview] ReadyFlow to consume PDF documents from ADLS, vectorize them using an OpenAI model, and write the results to Chroma DB.

This ReadyFlow consumes PDF documents from a source ADLS location, partitions the PDFs, chunks the data, vectorizes the data using an OpenAI embedding model, and stores the results in Chroma DB. The default OpenAI model is 'text-embedding-ada-002'. An OpenAI API key and a Chroma Server Authentication Token are required to run this flow. Define a KPI on the failure_WriteToChroma connection to monitor failed write operations.
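
The sequence of steps described above (partition, chunk, vectorize, store) can be illustrated with a small Python sketch. This is only a conceptual outline and not how the ReadyFlow itself is implemented. All names here (chunk_text, vectorize_and_store, fake_embed), the character-based chunking with overlap, and the record layout are assumptions made for illustration. The embedding step is a deterministic stand-in so the example runs without an OpenAI API key or a Chroma server.

```python
from typing import Callable, Dict, List, Sequence


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 20) -> List[str]:
    """Split text into fixed-size character chunks that overlap by `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def fake_embed(chunks: List[str]) -> List[List[float]]:
    """Hypothetical stand-in for an embedding model call.

    A real pipeline would send the chunks to an embedding model
    (the flow's default is 'text-embedding-ada-002') and get back
    one vector per chunk.
    """
    return [[float(len(c)), float(sum(map(ord, c)) % 97)] for c in chunks]


def vectorize_and_store(
    doc_id: str,
    pages: Sequence[str],
    embed: Callable[[List[str]], List[List[float]]],
    store: Callable[[List[Dict]], None],
    chunk_size: int = 200,
    overlap: int = 20,
) -> int:
    """Chunk each page (partition), embed the chunks, and hand records to `store`.

    Returns the number of records written.
    """
    records: List[Dict] = []
    for page_no, page_text in enumerate(pages, start=1):
        chunks = chunk_text(page_text, chunk_size, overlap)
        if not chunks:
            continue  # skip empty pages
        vectors = embed(chunks)
        for i, (chunk, vector) in enumerate(zip(chunks, vectors)):
            records.append(
                {
                    "id": f"{doc_id}-p{page_no}-c{i}",
                    "document": chunk,
                    "embedding": vector,
                    "metadata": {"source": doc_id, "page": page_no},
                }
            )
    if records:
        store(records)
    return len(records)
```

In this sketch, an in-memory list can play the role of the vector store, for example `stored = []` followed by `vectorize_and_store("report.pdf", pages, fake_embed, stored.extend)`. In a real setup, `embed` would wrap an OpenAI embeddings request and `store` would write to a Chroma collection; both need the credentials mentioned above and are left out here.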

ADLS to Chroma DB [Technical Preview] ReadyFlow details
Source              Cloudera Managed ADLS
Source Format       PDF
Destination         Chroma DB
Destination Format  Vector DB