FastembedTextEmbedder
FastembedTextEmbedder computes the embedding of a string using Fastembed embedding models.
Basic Information
- Type: haystack_integrations.fastembed.src.haystack_integrations.components.embedders.fastembed.fastembed_text_embedder.FastembedTextEmbedder
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| text | str | | A string to embed. |
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| embedding | List[float] | | A list of floats representing the embedding of the input text. run() returns it in a dictionary under the embedding key. |
Overview
Work in Progress
Bear with us while we add pipeline examples and the most common component connections.
The component takes a single string as input and returns its embedding as a list of floats. Call warm_up() before the first run(), as in the example below.
Usage example:
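from haystack_integrations.components.embedders.fastembed import FastembedTextEmbedder

# The text to embed: a single string, split across two literals for readability.
text = ("It clearly says online this will work on a Mac OS system. "
        "The disk comes and it does not, only Windows. Do Not order this if you have a Mac!!")

# Create the embedder with a model from Fastembed's model hub.
text_embedder = FastembedTextEmbedder(
    model="BAAI/bge-small-en-v1.5"
)

# Warm up the component before calling run().
text_embedder.warm_up()

# run() returns a dictionary; the vector is stored under the "embedding" key.
embedding = text_embedder.run(text)["embedding"]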
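The embedding returned by run() is a plain list of floats, so you can compare it with other embeddings using ordinary Python. The following is a minimal sketch, assuming cosine similarity as the comparison metric (this page does not prescribe one). The helper cosine_similarity is hypothetical, and the short vectors are made-up stand-ins rather than real model output.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Embeddings must have the same length.")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Treat a zero vector as having no similarity to anything.
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up stand-ins for text_embedder.run(...)["embedding"] results.
query_embedding = [0.1, 0.3, 0.5]
review_embedding = [0.2, 0.6, 1.0]

print(cosine_similarity(query_embedding, review_embedding))
```

In practice, both vectors would come from the same embedding model, since embeddings from different models are generally not comparable.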
Usage Example
components:
  FastembedTextEmbedder:
    type: fastembed.src.haystack_integrations.components.embedders.fastembed.fastembed_text_embedder.FastembedTextEmbedder
    init_parameters:
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | str | BAAI/bge-small-en-v1.5 | Local path or name of the model in Fastembed's model hub, such as BAAI/bge-small-en-v1.5 |
| cache_dir | Optional[str] | None | The path to the cache directory. Can be set using the FASTEMBED_CACHE_PATH env variable. Defaults to fastembed_cache in the system's temp directory. |
| threads | Optional[int] | None | The number of threads a single onnxruntime session can use. |
| prefix | str | | A string to add to the beginning of each text. |
| suffix | str | | A string to add to the end of each text. |
| progress_bar | bool | True | If True, displays a progress bar during embedding. |
| parallel | Optional[int] | None | If > 1, data-parallel encoding will be used, recommended for offline encoding of large datasets. If 0, use all available cores. If None, don't use data-parallel processing, use default onnxruntime threading instead. |
| local_files_only | bool | False | If True, only use the model files in the cache_dir. |
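To make the prefix, suffix, and cache_dir descriptions concrete, here is a minimal sketch of the behavior the table describes. It does not call the component. The helpers build_text_to_embed and resolve_cache_dir are hypothetical names. The sketch assumes the prefix and suffix are concatenated directly onto the text, and that an explicit cache_dir takes precedence over the FASTEMBED_CACHE_PATH variable.

```python
import os
import tempfile
from typing import Optional


def build_text_to_embed(text: str, prefix: str = "", suffix: str = "") -> str:
    """Assumed behavior: prefix and suffix are concatenated around the text."""
    return prefix + text + suffix


def resolve_cache_dir(cache_dir: Optional[str] = None) -> str:
    """Assumed lookup order: explicit argument, then env variable, then temp dir."""
    if cache_dir is not None:
        return cache_dir
    env_path = os.environ.get("FASTEMBED_CACHE_PATH")
    if env_path:
        return env_path
    # Fall back to fastembed_cache in the system's temp directory.
    return os.path.join(tempfile.gettempdir(), "fastembed_cache")


# Hypothetical prefix; whether a model benefits from one depends on the model.
print(build_text_to_embed("How do I reset my password?", prefix="query: "))
print(resolve_cache_dir())
```

Whether a prefix such as "query: " improves results depends on the model, so check the model's own documentation before setting one.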
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| text | str | | A string to embed. |