ReadyFlow: ADLS to OpenSearch [Technical Preview]
You can use the ADLS to OpenSearch [Technical Preview] ReadyFlow to consume PDF documents from ADLS, vectorize them using an OpenAI model, and write the results to OpenSearch.
This ReadyFlow consumes PDF documents from a source ADLS location, partitions the PDFs, chunks the data, vectorizes the data using an OpenAI embedding model, and stores the results in OpenSearch. The default OpenAI model is 'text-embedding-ada-002'. An OpenAI API key and an OpenSearch password are required to run this flow. Define a KPI on the failure_WriteToOpenSearch connection to monitor failed write operations.
| ADLS to OpenSearch [Technical Preview] ReadyFlow details | |
|---|---|
| Source | Cloudera Managed ADLS |
| Source Format | PDF |
| Destination | OpenSearch |
| Destination Format | Vector DB |
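
The chunk, vectorize, and store stages described above can be illustrated with a minimal Python sketch. This is not the ReadyFlow's implementation; it only shows the general shape of the data at each stage. The chunk size, the overlap, the index name `pdf-chunks`, the field names `text`, `embedding`, and `source`, and the `embed` callable (a stand-in for a call to an OpenAI embedding model such as 'text-embedding-ada-002') are all assumptions made for illustration. The output uses the newline-delimited JSON layout that the OpenSearch `_bulk` endpoint accepts.

```python
import json
from typing import Callable, List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into fixed-size character windows that overlap.

    chunk_size and overlap are illustrative defaults, not ReadyFlow settings.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def to_bulk_ndjson(
    chunks: List[str],
    embed: Callable[[str], List[float]],
    index_name: str = "pdf-chunks",   # hypothetical index name
    source: str = "example.pdf",      # hypothetical source document name
) -> str:
    """Build a _bulk request body: one action line and one document line per chunk."""
    lines: List[str] = []
    for i, chunk in enumerate(chunks):
        action = {"index": {"_index": index_name, "_id": f"{source}-{i}"}}
        document = {"text": chunk, "embedding": embed(chunk), "source": source}
        lines.append(json.dumps(action))
        lines.append(json.dumps(document))
    # Every line of a bulk body, including the last, ends with a newline.
    return "".join(line + "\n" for line in lines)
```

To try the sketch without an API key, pass a placeholder such as `lambda s: [float(len(s))]` as `embed`. A real embedding function would return one vector per chunk, and the target index would need a vector field whose dimension matches the model's output.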
