HuggingFaceLocalGenerator
Generate text using models from Hugging Face that run locally.
Basic Information
- Type: haystack.components.generators.hugging_face_local.HuggingFaceLocalGenerator
- Components it can connect with:
  - PromptBuilder: Receives a prompt from PromptBuilder.
  - AnswerBuilder: Sends generated replies to AnswerBuilder.
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| prompt | str | | A string representing the prompt. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function that is called when a new token is received from the stream. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for text generation. |
Outputs
| Parameter | Type | Description |
|---|---|---|
| replies | List[str] | A list of strings representing the generated replies. |
Overview
HuggingFaceLocalGenerator generates text using a Hugging Face model that runs locally. Depending on the model and its parameter count, running LLMs locally can require powerful hardware.
This component is designed for text generation, not for chat. If you want to use Hugging Face LLMs for chat, use HuggingFaceLocalChatGenerator instead.
The component supports two task types:
- text-generation: Supported by decoder models like GPT.
- text2text-generation: Supported by encoder-decoder models like T5.
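For a quick standalone check outside a pipeline, the component can be constructed and run directly in Python. A minimal sketch (the prompt and expected reply are illustrative):

```python
from haystack.components.generators import HuggingFaceLocalGenerator

# google/flan-t5-large is an encoder-decoder model, so the matching
# task is text2text-generation; decoder-only models use text-generation.
generator = HuggingFaceLocalGenerator(
    model="google/flan-t5-large",
    task="text2text-generation",
    generation_kwargs={"max_new_tokens": 100},
)
generator.warm_up()  # downloads (if needed) and loads the model locally

result = generator.run(prompt="What is the capital of France?")
print(result["replies"])  # a list of generated strings, e.g. ['Paris']
```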
Authorization
To authorize access to remote files, such as gated models, the component uses the HF_API_TOKEN environment variable by default. Connect Haystack Platform with Hugging Face first. For detailed instructions, see Use Hugging Face Models.
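The default token resolution can also be spelled out explicitly when constructing the component. A minimal sketch, assuming the HF_API_TOKEN environment variable is set:

```python
from haystack.components.generators import HuggingFaceLocalGenerator
from haystack.utils import Secret

# Mirrors the component's default: read HF_API_TOKEN (or HF_TOKEN) from the
# environment; strict=False means no error is raised if neither is set.
generator = HuggingFaceLocalGenerator(
    model="google/flan-t5-large",
    token=Secret.from_env_var(["HF_API_TOKEN", "HF_TOKEN"], strict=False),
)
```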
Usage Example
This query pipeline uses HuggingFaceLocalGenerator for local text generation:
```yaml
components:
  bm25_retriever:
    type: haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          hosts:
          index: 'default'
          max_chunk_bytes: 104857600
          embedding_dim: 768
          return_embedding: false
          method:
          mappings:
          settings:
          create_index: true
          http_auth:
          use_ssl:
          verify_certs:
          timeout:
      top_k: 10
      fuzziness: 0
  PromptBuilder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: |
        Given the following information, answer the question.
        Context:
        {% for document in documents %}
        {{ document.content }}
        {% endfor %}
        Question: {{ query }}
      required_variables:
      variables:
  HuggingFaceLocalGenerator:
    type: haystack.components.generators.hugging_face_local.HuggingFaceLocalGenerator
    init_parameters:
      model: google/flan-t5-large
      task: text2text-generation
      device:
      token:
        type: env_var
        env_vars:
          - HF_API_TOKEN
          - HF_TOKEN
        strict: false
      generation_kwargs:
        max_new_tokens: 100
        temperature: 0.9
      huggingface_pipeline_kwargs:
      stop_words:
      streaming_callback:
  AnswerBuilder:
    type: haystack.components.builders.answer_builder.AnswerBuilder
    init_parameters:
      pattern:
      reference_pattern:
connections:
  - sender: bm25_retriever.documents
    receiver: PromptBuilder.documents
  - sender: PromptBuilder.prompt
    receiver: HuggingFaceLocalGenerator.prompt
  - sender: HuggingFaceLocalGenerator.replies
    receiver: AnswerBuilder.replies
  - sender: bm25_retriever.documents
    receiver: AnswerBuilder.documents
inputs:
  query:
    - bm25_retriever.query
    - PromptBuilder.query
    - AnswerBuilder.query
outputs:
  answers: AnswerBuilder.answers
max_runs_per_component: 100
metadata: {}
```
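The same pipeline can be assembled in Python. The sketch below swaps the OpenSearch retriever for Haystack's in-memory BM25 retriever so it runs self-contained; the document content and query are illustrative:

```python
from haystack import Document, Pipeline
from haystack.components.builders import AnswerBuilder, PromptBuilder
from haystack.components.generators import HuggingFaceLocalGenerator
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore

# Stand-in for the OpenSearch document store in the YAML example.
document_store = InMemoryDocumentStore()
document_store.write_documents([Document(content="Paris is the capital of France.")])

template = """Given the following information, answer the question.
Context:
{% for document in documents %}
{{ document.content }}
{% endfor %}
Question: {{ query }}"""

pipeline = Pipeline()
pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=document_store))
pipeline.add_component("prompt_builder", PromptBuilder(template=template))
pipeline.add_component(
    "generator",
    HuggingFaceLocalGenerator(
        model="google/flan-t5-large",
        task="text2text-generation",
        generation_kwargs={"max_new_tokens": 100, "temperature": 0.9},
    ),
)
pipeline.add_component("answer_builder", AnswerBuilder())

# Mirrors the connections section of the YAML pipeline.
pipeline.connect("retriever.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder.prompt", "generator.prompt")
pipeline.connect("generator.replies", "answer_builder.replies")
pipeline.connect("retriever.documents", "answer_builder.documents")

query = "What is the capital of France?"
result = pipeline.run({
    "retriever": {"query": query},
    "prompt_builder": {"query": query},
    "answer_builder": {"query": query},
})
print(result["answer_builder"]["answers"])
```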
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | str | google/flan-t5-base | The Hugging Face text generation model name or path. |
| task | Optional[Literal['text-generation', 'text2text-generation']] | None | The task for the Hugging Face pipeline. Options: text-generation (decoder models like GPT), text2text-generation (encoder-decoder models like T5). If not specified, the component infers the task from the model name. |
| device | Optional[ComponentDevice] | None | The device for loading the model. If None, automatically selects the default device. If a device or device map is specified in huggingface_pipeline_kwargs, it overrides this parameter. |
| token | Optional[Secret] | Secret.from_env_var(['HF_API_TOKEN', 'HF_TOKEN'], strict=False) | The token to use as HTTP bearer authorization for remote files. If the token is specified in huggingface_pipeline_kwargs, this parameter is ignored. |
| generation_kwargs | Optional[Dict[str, Any]] | None | A dictionary with keyword arguments to customize text generation: max_length, max_new_tokens, temperature, top_k, top_p. See Hugging Face documentation. |
| huggingface_pipeline_kwargs | Optional[Dict[str, Any]] | None | Dictionary with keyword arguments to initialize the Hugging Face pipeline. These override model, task, device, and token init parameters. See Hugging Face documentation. |
| stop_words | Optional[List[str]] | None | If the model generates a stop word, the generation stops. If you provide this parameter, don't specify stopping_criteria in generation_kwargs. |
| streaming_callback | Optional[StreamingCallbackT] | None | An optional callable for handling streaming responses. |
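As an illustration of the streaming_callback and stop_words parameters, the sketch below prints tokens as they are generated and halts generation if the model emits the literal string "Question:" (an assumed stop word for this prompt style):

```python
from haystack.components.generators import HuggingFaceLocalGenerator
from haystack.dataclasses import StreamingChunk

def on_chunk(chunk: StreamingChunk) -> None:
    # Called once per streamed token; print without a trailing newline.
    print(chunk.content, end="", flush=True)

generator = HuggingFaceLocalGenerator(
    model="google/flan-t5-large",
    task="text2text-generation",
    streaming_callback=on_chunk,
    stop_words=["Question:"],  # don't also set stopping_criteria in generation_kwargs
)
```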
Run Method Parameters
These are the parameters you can configure for the run() method. You can pass these parameters at query time through the API, in Playground, or when running a job.
| Parameter | Type | Default | Description |
|---|---|---|---|
| prompt | str | | A string representing the prompt. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function that is called when a new token is received from the stream. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for text generation. |
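Run-time values override or extend what was set at init. A minimal sketch of passing generation_kwargs per call, reusing a warmed-up generator (the prompt text is illustrative):

```python
# These kwargs apply to this call only; other calls keep the init settings.
result = generator.run(
    prompt="Answer briefly: what is the boiling point of water?",
    generation_kwargs={"max_new_tokens": 20, "temperature": 0.2},
)
print(result["replies"])
```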