QdrantHybridRetriever

A component for retrieving documents from a QdrantDocumentStore using both dense and sparse vectors.

Basic Information

Type: haystack_integrations.components.retrievers.qdrant.retriever.QdrantHybridRetriever

Inputs

Parameter	Type	Default	Description
query_embedding	List[float]		Dense embedding of the query.
query_sparse_embedding	SparseEmbedding		Sparse embedding of the query.
filters	Optional[Union[Dict[str, Any], models.Filter]]	None	Filters applied to the retrieved Documents. The way runtime filters are applied depends on the `filter_policy` chosen at retriever initialization. See init method docstring for more details.
top_k	Optional[int]	None	The maximum number of documents to return. When `group_by` is set, the maximum number of groups to return.
return_embedding	Optional[bool]	None	Whether to return the embedding of the retrieved Documents.
score_threshold	Optional[float]	None	A minimum score threshold for the results. Whether returned scores lie above or below the threshold depends on the distance function used. For example, with cosine similarity, only results with higher scores are returned.
group_by	Optional[str]	None	Payload field to group by. Must be a string or number field. If the field contains more than one value, all values are used for grouping, so one point can belong to multiple groups.
group_size	Optional[int]	None	Maximum number of points to return per group. Defaults to 3 when not set.

Outputs

Parameter	Type	Default	Description
documents	List[Document]		The retrieved documents.

Overview

Work in Progress

Bear with us while we add pipeline examples and the most common component connections.

A component for retrieving documents from a QdrantDocumentStore using both dense and sparse vectors and fusing the results using Reciprocal Rank Fusion.
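To make the fusion step more concrete, here is a minimal pure-Python sketch of Reciprocal Rank Fusion. It is an illustration only, not the code Qdrant or the retriever runs: the function name `reciprocal_rank_fusion`, the document IDs, and the constant `k=60` (a commonly used value) are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used constant.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID to keep the order deterministic
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings from the dense and the sparse search
dense_ranking = ["doc_a", "doc_b", "doc_c"]
sparse_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
print(fused)
```

Documents that rank well in both lists rise to the top, while a document found by only one search can still appear in the fused result.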

Usage example:

from haystack_integrations.components.retrievers.qdrant import QdrantHybridRetriever
from haystack_integrations.document_stores.qdrant import QdrantDocumentStore
from haystack.dataclasses import Document, SparseEmbedding

document_store = QdrantDocumentStore(
    ":memory:",
    use_sparse_embeddings=True,
    recreate_index=True,
    return_embedding=True,
    wait_result_from_api=True,
)

# Index one document that carries both a dense and a sparse embedding
doc = Document(
    content="test",
    embedding=[0.5] * 768,
    sparse_embedding=SparseEmbedding(indices=[0, 3, 5], values=[0.1, 0.5, 0.12]),
)

document_store.write_documents([doc])

retriever = QdrantHybridRetriever(document_store=document_store)

# Query with a dense and a sparse embedding; the result is a dict with a "documents" key
embedding = [0.1] * 768
sparse_embedding = SparseEmbedding(indices=[0, 1, 2, 3], values=[0.1, 0.8, 0.05, 0.33])
result = retriever.run(query_embedding=embedding, query_sparse_embedding=sparse_embedding)
print(result["documents"])

Usage Example

components:
  QdrantHybridRetriever:
    type: haystack_integrations.components.retrievers.qdrant.retriever.QdrantHybridRetriever
    init_parameters:

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

Parameter	Type	Default	Description
document_store	QdrantDocumentStore		An instance of QdrantDocumentStore.
filters	Optional[Union[Dict[str, Any], models.Filter]]	None	A dictionary with filters to narrow down the search space.
top_k	int	10	The maximum number of documents to retrieve. When `group_by` is set, the maximum number of groups to return.
return_embedding	bool	False	Whether to return the embeddings of the retrieved Documents.
filter_policy	Union[str, FilterPolicy]	FilterPolicy.REPLACE	Policy to determine how filters are applied.
score_threshold	Optional[float]	None	A minimum score threshold for the results. Whether returned scores lie above or below the threshold depends on the distance function used. For example, with cosine similarity, only results with higher scores are returned.
group_by	Optional[str]	None	Payload field to group by. Must be a string or number field. If the field contains more than one value, all values are used for grouping, so one point can belong to multiple groups.
group_size	Optional[int]	None	Maximum number of points to return per group. Defaults to 3 when not set.
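The interaction between `top_k`, `group_by`, and `group_size` can be easier to follow with a small example. The sketch below mimics the described behavior in plain Python. It is not Qdrant's implementation: the helper `group_hits`, the shape of the hit dictionaries, and the choice to skip hits that lack the field are assumptions made for illustration.

```python
from typing import Any, Dict, List


def group_hits(
    hits: List[Dict[str, Any]], group_by: str, top_k: int, group_size: int = 3
) -> Dict[Any, List[Dict[str, Any]]]:
    """Group scored hits by a payload field (hypothetical helper).

    Keeps at most `group_size` hits per group and at most `top_k` groups.
    """
    groups: Dict[Any, List[Dict[str, Any]]] = {}
    # Visit the best-scoring hits first so each group fills up with its top hits
    for hit in sorted(hits, key=lambda h: h["score"], reverse=True):
        value = hit["payload"].get(group_by)
        if value is None:
            continue  # assumption: hits without the field are left out
        # A multi-valued field puts the same hit into several groups
        values = value if isinstance(value, list) else [value]
        for v in values:
            bucket = groups.setdefault(v, [])
            if len(bucket) < group_size:
                bucket.append(hit)
    # Dicts keep insertion order, so groups are ordered by their best hit
    return dict(list(groups.items())[:top_k])


hits = [
    {"id": 1, "score": 0.9, "payload": {"author": "ann"}},
    {"id": 2, "score": 0.8, "payload": {"author": ["ann", "bob"]}},
    {"id": 3, "score": 0.7, "payload": {"author": "bob"}},
    {"id": 4, "score": 0.6, "payload": {"author": "ann"}},
    {"id": 5, "score": 0.5, "payload": {"author": "cat"}},
]

grouped = group_hits(hits, group_by="author", top_k=2, group_size=2)
for author, members in grouped.items():
    print(author, [h["id"] for h in members])
```

Here `top_k=2` limits the result to two groups, and `group_size=2` limits each group to its two best hits. Hit 2 appears in both groups because its `author` field holds two values.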

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

Parameter	Type	Default	Description
query_embedding	List[float]		Dense embedding of the query.
query_sparse_embedding	SparseEmbedding		Sparse embedding of the query.
filters	Optional[Union[Dict[str, Any], models.Filter]]	None	Filters applied to the retrieved Documents. The way runtime filters are applied depends on the `filter_policy` chosen at retriever initialization. See init method docstring for more details.
top_k	Optional[int]	None	The maximum number of documents to return. When `group_by` is set, the maximum number of groups to return.
return_embedding	Optional[bool]	None	Whether to return the embedding of the retrieved Documents.
score_threshold	Optional[float]	None	A minimum score threshold for the results. Whether returned scores lie above or below the threshold depends on the distance function used. For example, with cosine similarity, only results with higher scores are returned.
group_by	Optional[str]	None	Payload field to group by. Must be a string or number field. If the field contains more than one value, all values are used for grouping, so one point can belong to multiple groups.
group_size	Optional[int]	None	Maximum number of points to return per group. Defaults to 3 when not set.
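How runtime `filters` combine with the filters set at initialization depends on `filter_policy`. The sketch below illustrates the idea in plain Python under stated assumptions: the `FilterPolicy` enum here is a local stand-in rather than the Haystack class, `resolve_filters` is a hypothetical helper, and the merge branch (joining both filters with a logical AND) is a simplification of what a merge policy might do.

```python
from enum import Enum
from typing import Any, Dict, Optional

Filters = Dict[str, Any]


class FilterPolicy(Enum):
    """Local stand-in for illustration; not imported from Haystack."""

    REPLACE = "replace"
    MERGE = "merge"


def resolve_filters(
    policy: FilterPolicy,
    init_filters: Optional[Filters],
    runtime_filters: Optional[Filters],
) -> Optional[Filters]:
    """Pick the filters that a query would use (hypothetical helper)."""
    if runtime_filters is None:
        # Nothing passed at query time: fall back to the init filters
        return init_filters
    if policy is FilterPolicy.REPLACE or init_filters is None:
        # REPLACE: runtime filters take over completely
        return runtime_filters
    # MERGE (simplified assumption): both filters must match
    return {"operator": "AND", "conditions": [init_filters, runtime_filters]}


init_filters = {"field": "meta.lang", "operator": "==", "value": "en"}
runtime_filters = {"field": "meta.year", "operator": ">=", "value": 2023}

print(resolve_filters(FilterPolicy.REPLACE, init_filters, runtime_filters))
print(resolve_filters(FilterPolicy.MERGE, init_filters, runtime_filters))
```

With the default REPLACE policy, passing `filters` to `run()` discards the filters given at initialization for that query, so include every condition you still need in the runtime filters.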
