PineconeEmbeddingRetriever
Retrieves documents from the PineconeDocumentStore, based on their dense embeddings.
Basic Information
- Type:
haystack_integrations.pinecone.src.haystack_integrations.components.retrievers.pinecone.embedding_retriever.PineconeEmbeddingRetriever
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| query_embedding | List[float] | Embedding of the query. | |
| filters | Optional[Dict[str, Any]] | None | Filters applied to the retrieved Documents. The way runtime filters are applied depends on the filter_policy chosen at retriever initialization. See init method docstring for more details. |
| top_k | Optional[int] | None | Maximum number of Documents to return. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | List of Document similar to query_embedding. |
Overview
Work in Progress
Bear with us while we're working on adding pipeline examples and most common components connections.
Retrieves documents from the PineconeDocumentStore, based on their dense embeddings.
Usage example:
import os
from haystack.document_stores.types import DuplicatePolicy
from haystack import Document
from haystack import Pipeline
from haystack.components.embedders import SentenceTransformersTextEmbedder, SentenceTransformersDocumentEmbedder
from haystack_integrations.components.retrievers.pinecone import PineconeEmbeddingRetriever
from haystack_integrations.document_stores.pinecone import PineconeDocumentStore
os.environ["PINECONE_API_KEY"] = "YOUR_PINECONE_API_KEY"
document_store = PineconeDocumentStore(index="my_index", namespace="my_namespace", dimension=768)
documents = [Document(content="There are over 7,000 languages spoken around the world today."),
Document(content="Elephants have been observed to behave in a way that indicates..."),
Document(content="In certain places, you can witness the phenomenon of bioluminescent waves.")]
document_embedder = SentenceTransformersDocumentEmbedder()
document_embedder.warm_up()
documents_with_embeddings = document_embedder.run(documents)
document_store.write_documents(documents_with_embeddings.get("documents"), policy=DuplicatePolicy.OVERWRITE)
query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", SentenceTransformersTextEmbedder())
query_pipeline.add_component("retriever", PineconeEmbeddingRetriever(document_store=document_store))
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
query = "How many languages are there?"
res = query_pipeline.run({"text_embedder": {"text": query}})
assert res['retriever']['documents'][0].content == "There are over 7,000 languages spoken around the world today."
Usage Example
components:
PineconeEmbeddingRetriever:
type: pinecone.src.haystack_integrations.components.retrievers.pinecone.embedding_retriever.PineconeEmbeddingRetriever
init_parameters:
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| document_store | PineconeDocumentStore | The Pinecone Document Store. | |
| filters | Optional[Dict[str, Any]] | None | Filters applied to the retrieved Documents. |
| top_k | int | 10 | Maximum number of Documents to return. |
| filter_policy | Union[str, FilterPolicy] | FilterPolicy.REPLACE | Policy to determine how filters are applied. |
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query_embedding | List[float] | Embedding of the query. | |
| filters | Optional[Dict[str, Any]] | None | Filters applied to the retrieved Documents. The way runtime filters are applied depends on the filter_policy chosen at retriever initialization. See init method docstring for more details. |
| top_k | Optional[int] | None | Maximum number of Documents to return. |
Was this page helpful?