PutQdrant

Description:

Publishes JSON data to Qdrant. Incoming data must be in JSON Lines format (one JSON object per line), and each object must have two keys: 'text' and 'metadata'. The text must be a string, while metadata must be a map with strings for values. Any additional fields will be ignored.
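As an illustration of the expected input format, the following Python sketch parses JSON Lines content and checks each record against the rules described above. The function name parse_documents, and the choice to raise an error on malformed records, are assumptions made for this example; they do not describe the processor's internal behavior.

```python
import json


def parse_documents(jsonl_text):
    """Parse JSON Lines text into (text, metadata) pairs.

    Hypothetical helper for illustration only; not part of the processor.
    """
    documents = []
    for line_number, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        text = record.get("text")
        metadata = record.get("metadata")
        if not isinstance(text, str):
            raise ValueError(f"line {line_number}: 'text' must be a string")
        if not isinstance(metadata, dict) or not all(
            isinstance(value, str) for value in metadata.values()
        ):
            raise ValueError(
                f"line {line_number}: 'metadata' must be a map with string values"
            )
        # Only 'text' and 'metadata' are kept; any other fields are ignored.
        documents.append((text, metadata))
    return documents


sample = (
    '{"text": "NiFi moves data.", "metadata": {"source": "docs"}, "extra": 1}\n'
    '{"text": "Qdrant stores vectors.", "metadata": {"source": "web"}}\n'
)
documents = parse_documents(sample)
```

Here the 'extra' field on the first line is dropped, matching the statement that additional fields are ignored.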

Tags:

qdrant, vector, vectordb, vectorstore, embeddings, ai, artificial intelligence, ml, machine learning, text, LLM

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

Display Name	API Name	Default Value	Description
Embedding Model	Embedding Model	OpenAI Model	Specifies which embedding model should be used in order to create embeddings from incoming Documents. Default model is OpenAI.
Document ID Field Name	Document ID Field Name		Specifies the name of the field in the 'metadata' element of each document where the document's ID can be found. If not specified, a UUID will be generated based on the FlowFile's filename and an incremental number. Supports Expression Language: true (will be evaluated using flow file attributes and environment variables)
Force Recreate Collection	Force Recreate Collection	False	Specifies whether to recreate the collection if it already exists, which clears the existing data.
Similarity Metric	Similarity Metric	COSINE	Specifies the similarity metric when creating the collection.
Collection Name	Collection Name	apache-nifi	The name of the Qdrant collection to use. Supports Expression Language: true (will be evaluated using flow file attributes and environment variables)
Qdrant URL	Qdrant URL	http://localhost:6333	The fully qualified URL to the Qdrant instance.
Qdrant API Key	Qdrant API Key		The API Key to use in order to authenticate with Qdrant. Can be empty. Sensitive Property: true
Prefer gRPC	Prefer gRPC	False	Specifies whether to use gRPC for interfacing with Qdrant.
Use HTTPS	Use HTTPS	False	Specifies whether to use TLS (HTTPS) while interfacing with Qdrant.
HuggingFace API Key	HuggingFace API Key		The API Key for interacting with HuggingFace. Sensitive Property: true
OpenAI API Key	OpenAI API Key		The API Key for OpenAI in order to create embeddings. Sensitive Property: true
OpenAI Model	OpenAI Model	text-embedding-ada-002	The name of the OpenAI model to use.
HuggingFace Model	HuggingFace Model	sentence-transformers/all-MiniLM-L6-v2	The name of the HuggingFace model to use.

Example Use Cases:

Use Case:

Create embeddings that semantically represent text content and upload to Qdrant - https://qdrant.tech/

Notes:

This processor assumes that the data has already been formatted as JSONL, with the text to store in Qdrant provided in the 'text' field.

Keywords:

qdrant, embedding, vector, text, vectorstore, insert

Configuration:

Configure 'Collection Name' to the name of the Qdrant collection to use.

Configure 'Qdrant URL' to the fully qualified URL of the Qdrant instance.

Configure 'Qdrant API Key' to the API Key to use in order to authenticate with Qdrant.

Configure 'Prefer gRPC' to True if you want to use gRPC for interfacing with Qdrant.

Configure 'Use HTTPS' to True if you want to use TLS (HTTPS) while interfacing with Qdrant.

Configure 'Embedding Model' to either 'Hugging Face Model' or 'OpenAI Model' to indicate whether a HuggingFace embedding model or OpenAI embeddings should be used.

Configure 'HuggingFace API Key' or 'OpenAI API Key', depending on the chosen Embedding Model.

Configure 'HuggingFace Model' or 'OpenAI Model' to the name of the model to use.

Configure 'Force Recreate Collection' to True if you want to recreate the collection if it already exists.

Configure 'Similarity Metric' to the similarity metric to use when creating the collection.

If the documents to send to Qdrant contain a unique identifier (UUID), set the 'Document ID Field Name' property to the name of the field in the 'metadata' element that contains the document ID.

This property can be left blank, in which case a UUID will be generated based on the FlowFile's filename and an incremental number.
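The ID selection described above can be sketched in Python as follows. This is only an illustration: the function name resolve_document_id, the use of uuid.uuid5 with the URL namespace, the "filename-index" naming scheme, and the fallback when the configured field is missing from a document are all assumptions, since the exact derivation used by the processor is not specified here.

```python
import uuid


def resolve_document_id(metadata, id_field_name, filename, index):
    """Return the document's own ID if configured and present, else a generated UUID.

    Hypothetical helper; the processor's actual derivation may differ.
    """
    if id_field_name and id_field_name in metadata:
        return metadata[id_field_name]
    # Assumed scheme: a deterministic UUID from the filename plus a running index.
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"{filename}-{index}"))


# 'Document ID Field Name' set to "doc_id": the value from metadata is used.
explicit_id = resolve_document_id({"doc_id": "abc-123"}, "doc_id", "docs.jsonl", 0)

# Property left blank: IDs are generated per document from filename and index.
generated_first = resolve_document_id({"source": "web"}, "", "docs.jsonl", 0)
generated_second = resolve_document_id({"source": "web"}, "", "docs.jsonl", 1)
```

Deriving the ID from the filename and index, rather than generating a random UUID, means that resending the same FlowFile produces the same IDs under this sketch.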