PutPinecone

Description:

Publishes JSON data to Pinecone. The incoming data must be in JSONL format (one JSON object per line), and each object must have two keys: 'text' and 'metadata'. The text must be a string, while metadata must be a map with strings for values. Any additional fields will be ignored.
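The input format described above can be checked before data reaches the processor. The following is a minimal sketch, not the processor's own validation logic; the function name `parse_record` and the sample line are hypothetical.

```python
import json

def parse_record(line: str) -> dict:
    """Parse one JSONL line and keep only the 'text' and 'metadata' keys."""
    doc = json.loads(line)
    if not isinstance(doc, dict):
        raise ValueError("each line must be a JSON object")
    text = doc.get("text")
    metadata = doc.get("metadata")
    if not isinstance(text, str):
        raise ValueError("'text' must be a string")
    if not isinstance(metadata, dict) or not all(
        isinstance(value, str) for value in metadata.values()
    ):
        raise ValueError("'metadata' must be a map with string values")
    # Any additional fields are dropped, mirroring "will be ignored".
    return {"text": text, "metadata": metadata}

# Hypothetical sample line; the 'extra' field is not used.
sample = '{"text": "NiFi moves data between systems.", "metadata": {"source": "intro.txt"}, "extra": 1}'
record = parse_record(sample)
```

A pre-check like this makes malformed lines fail early with a clear message instead of surfacing later in the flow.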

Tags:

pinecone, vector, vectordb, vectorstore, embeddings, ai, artificial intelligence, ml, machine learning, text, LLM

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

| Display Name | API Name | Default Value | Description |
| --- | --- | --- | --- |
| Embedding Model | Embedding Model | OpenAI Model | Specifies which embedding model should be used to create embeddings from incoming documents. The default model is OpenAI. |
| Pinecone API Key | Pinecone API Key | | The API key used to authenticate with Pinecone. Sensitive Property: true |
| HuggingFace API Key | HuggingFace API Key | | The API key for interacting with HuggingFace. Sensitive Property: true |
| OpenAI API Key | OpenAI API Key | | The API key for OpenAI, used to create embeddings. Sensitive Property: true |
| Pinecone Environment | Pinecone Environment | | The name of the Pinecone environment. This can be found in the Pinecone console next to the API key. |
| Index Name | Index Name | | The name of the Pinecone index. Supports Expression Language: true (will be evaluated using flow file attributes and environment variables) |
| Text Key | Text Key | text | The key in the document that contains the text to create embeddings for. Supports Expression Language: true (will be evaluated using flow file attributes and environment variables) |
| Namespace | Namespace | | The name of the Pinecone namespace to put the documents into. Supports Expression Language: true (will be evaluated using flow file attributes and environment variables) |
| Document ID Field Name | Document ID Field Name | | Specifies the name of the field in the 'metadata' element of each document where the document's ID can be found. If not specified, an ID will be generated based on the FlowFile's filename and an incrementing number. Supports Expression Language: true (will be evaluated using flow file attributes and environment variables) |
| OpenAI Model | OpenAI Model | text-embedding-ada-002 | The name of the OpenAI model to use. |
| HuggingFace Model | HuggingFace Model | sentence-transformers/all-MiniLM-L6-v2 | The name of the HuggingFace model to use. |
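The behavior of 'Document ID Field Name' can be illustrated with a short sketch. This is an assumption-laden illustration, not the processor's source: the helper name `resolve_ids` is hypothetical, and the `<filename>-<index>` layout of generated IDs is an assumed format, since the description only says the ID is based on the filename and a running number.

```python
from typing import Optional

def resolve_ids(documents: list, filename: str, id_field: Optional[str] = None) -> list:
    """Pick an ID for each document, or generate one when no ID field is configured."""
    ids = []
    for index, doc in enumerate(documents):
        if id_field:
            # The ID is looked up inside the document's 'metadata' element.
            ids.append(doc["metadata"][id_field])
        else:
            # Assumed format: FlowFile filename plus a running number.
            ids.append(f"{filename}-{index}")
    return ids

docs = [
    {"text": "first", "metadata": {"doc_id": "a-1"}},
    {"text": "second", "metadata": {"doc_id": "a-2"}},
]
explicit_ids = resolve_ids(docs, "batch.jsonl", id_field="doc_id")
generated_ids = resolve_ids(docs, "batch.jsonl")
```

Explicit IDs stay stable when the same document is sent again, whereas generated IDs depend on the filename and on the document's position in the file.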

Example Use Cases:

Use Case:

Create vectors/embeddings that represent text content and send the vectors to Pinecone

Notes:

This use case assumes that the data is already in JSONL format, with the text to store in Pinecone provided in the 'text' field.

Keywords:

pinecone, embedding, vector, text, vectorstore, insert

Configuration:

Configure the 'Pinecone API Key' to the appropriate authentication token for interacting with Pinecone.

Configure 'Embedding Model' to select which embeddings to use: 'OpenAI Model' for OpenAI embeddings or 'Hugging Face Model' for a HuggingFace embedding model.

Configure the 'OpenAI API Key' or 'HuggingFace API Key', depending on the chosen Embedding Model.

Set 'Pinecone Environment' to the name of your Pinecone environment.

Set 'Index Name' to the name of your Pinecone Index.

Set 'Namespace' to the appropriate namespace, or leave it empty to use the default namespace.

If the documents to send to Pinecone contain a unique identifier, set the 'Document ID Field Name' property to the name of the field that contains the document ID.

This property can be left blank, in which case a unique ID will be generated based on the FlowFile's filename.
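The configuration steps above can be summarized as a set of property values. The sketch below models them as a plain Python dictionary purely for illustration; the placeholder values, the `pinecone.index` attribute name, and the helper `unset_properties` are hypothetical and are not part of NiFi.

```python
# Property values for the insert use case, keyed by property name.
config = {
    "Pinecone API Key": "<pinecone-api-key>",
    "Embedding Model": "OpenAI Model",
    "OpenAI API Key": "<openai-api-key>",
    "Pinecone Environment": "<environment-name>",
    # Expression Language: resolved from a FlowFile attribute (hypothetical name).
    "Index Name": "${pinecone.index}",
    "Namespace": "",  # empty means the default namespace
    # 'Document ID Field Name' is left unset, so IDs are generated.
}

# Which API key each 'Embedding Model' choice depends on.
MODEL_KEY = {
    "OpenAI Model": "OpenAI API Key",
    "Hugging Face Model": "HuggingFace API Key",
}

def unset_properties(config: dict) -> list:
    """List the properties this use case asks for that have no value yet."""
    wanted = ["Pinecone API Key", "Pinecone Environment", "Index Name"]
    model = config.get("Embedding Model", "OpenAI Model")
    wanted.append(MODEL_KEY[model])
    return [name for name in wanted if not config.get(name)]

missing = unset_properties(config)
```

Switching 'Embedding Model' to 'Hugging Face Model' without supplying 'HuggingFace API Key' would make the helper report that key as unset.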



Use Case:

Update vectors/embeddings in Pinecone

Notes:

This use case assumes that the data is already in JSONL format, with the text to store in Pinecone provided in the 'text' field.

Keywords:

pinecone, embedding, vector, text, vectorstore, update, upsert

Configuration:

Configure the 'Pinecone API Key' to the appropriate authentication token for interacting with Pinecone.

Configure 'Embedding Model' to select which embeddings to use: 'OpenAI Model' for OpenAI embeddings or 'Hugging Face Model' for a HuggingFace embedding model.

Configure the 'OpenAI API Key' or 'HuggingFace API Key', depending on the chosen Embedding Model.

Set 'Pinecone Environment' to the name of your Pinecone environment.

Set 'Index Name' to the name of your Pinecone Index.

Set 'Namespace' to the appropriate namespace, or leave it empty to use the default namespace.

Set the 'Document ID Field Name' property to the name of the field that contains the identifier of the document in Pinecone to update.
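The update use case relies on sending a document whose ID already exists in the index; a Pinecone upsert overwrites the vector stored under that ID. The sketch below imitates this with an in-memory dictionary standing in for a Pinecone namespace. The field name `doc_id` and the sample records are hypothetical, and no embeddings are computed.

```python
import json

ID_FIELD = "doc_id"  # hypothetical value for 'Document ID Field Name'

# Two JSONL lines that carry the same ID in their metadata.
lines = [
    '{"text": "Draft wording.", "metadata": {"doc_id": "faq-7"}}',
    '{"text": "Final wording.", "metadata": {"doc_id": "faq-7"}}',
]

namespace = {}  # stand-in for a Pinecone namespace, keyed by vector ID
for line in lines:
    doc = json.loads(line)
    # Writing under an existing ID replaces the earlier entry.
    namespace[doc["metadata"][ID_FIELD]] = doc["text"]
```

After both lines are processed, the stand-in holds a single entry for `faq-7` containing the later text, which is the effect an update is meant to have.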