HuggingFaceLocalGenerator

Generates text using models from Hugging Face that run locally.

Basic Information

  • Type: haystack_integrations.generators.hugging_face_local.HuggingFaceLocalGenerator

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | str | | A string representing the prompt. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function that is called when a new token is received from the stream. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for text generation. |

Outputs

| Parameter | Type | Description |
| --- | --- | --- |
| replies | List[str] | A list of strings representing the generated replies. |

Overview

Work in Progress

Bear with us while we're working on adding pipeline examples and the most common component connections.

Running LLMs locally may require powerful hardware.

Usage Example

```yaml
components:
  HuggingFaceLocalGenerator:
    type: components.generators.hugging_face_local.HuggingFaceLocalGenerator
    init_parameters:
```
Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | str | google/flan-t5-base | The Hugging Face text generation model name or path. |
| task | Optional[Literal['text-generation', 'text2text-generation']] | None | The task for the Hugging Face pipeline. Possible options: text-generation (supported by decoder models, like GPT) and text2text-generation (supported by encoder-decoder models, like T5). If the task is specified in huggingface_pipeline_kwargs, this parameter is ignored. If not specified, the component calls the Hugging Face API to infer the task from the model name. |
| device | Optional[ComponentDevice] | None | The device for loading the model. If None, automatically selects the default device. If a device or device map is specified in huggingface_pipeline_kwargs, it overrides this parameter. |
| token | Optional[Secret] | Secret.from_env_var(['HF_API_TOKEN', 'HF_TOKEN'], strict=False) | The token to use as HTTP bearer authorization for remote files. If the token is specified in huggingface_pipeline_kwargs, this parameter is ignored. |
| generation_kwargs | Optional[Dict[str, Any]] | None | A dictionary with keyword arguments to customize text generation, for example max_length, max_new_tokens, temperature, top_k, top_p. For more information, see Hugging Face's customize-text-generation guide and transformers.GenerationConfig. |
| huggingface_pipeline_kwargs | Optional[Dict[str, Any]] | None | Dictionary with keyword arguments to initialize the Hugging Face pipeline for text generation. These keyword arguments provide fine-grained control over the Hugging Face pipeline. In case of duplication, these kwargs override the model, task, device, and token init parameters. For available kwargs, see the Hugging Face documentation. This dictionary can also include model_kwargs to specify the kwargs for model initialization: transformers.PreTrainedModel.from_pretrained. |
| stop_words | Optional[List[str]] | None | If the model generates a stop word, generation stops. If you provide this parameter, don't specify stopping_criteria in generation_kwargs. For some chat models, the output includes both the new text and the original prompt; in these cases, make sure your prompt has no stop words. |
| streaming_callback | Optional[StreamingCallbackT] | None | An optional callable for handling streaming responses. |
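Because generation_kwargs can be set both at init time and at run time, it helps to see how the two typically combine: run-time values take precedence over init-time defaults. The sketch below illustrates this merge with a plain dictionary; the function name is illustrative, not the component's internal API.

```python
# Illustrative sketch (not the component's internal API): run-time
# generation_kwargs take precedence over init-time defaults.

def merge_generation_kwargs(init_kwargs, run_kwargs=None):
    """Return init-time defaults overridden by any run-time values."""
    merged = dict(init_kwargs)       # start from the init-time defaults
    merged.update(run_kwargs or {})  # run-time values win on conflict
    return merged

defaults = {"max_new_tokens": 100, "temperature": 0.7}
per_call = {"temperature": 0.2}
print(merge_generation_kwargs(defaults, per_call))
# {'max_new_tokens': 100, 'temperature': 0.2}
```

In practice, this means you can set conservative defaults in init_parameters and override individual values (such as temperature) per query.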

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | str | | A string representing the prompt. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function that is called when a new token is received from the stream. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for text generation. |
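A streaming_callback is simply a callable invoked once per generated chunk as tokens arrive. The sketch below simulates a token stream with a plain list of strings to show the callback contract; the actual chunk object Haystack passes to the callback is not reproduced here.

```python
# Illustrative sketch of the streaming_callback contract: a callable
# invoked once per generated chunk. The token stream below is simulated;
# a real generator would call the callback as tokens are produced.

collected = []

def collect_chunk(chunk_text):
    """Called once per generated token/chunk."""
    collected.append(chunk_text)
    # A typical live-printing callback would instead do:
    # print(chunk_text, end="", flush=True)

# Simulated stream of tokens standing in for model output:
for token in ["Hello", ",", " world", "!"]:
    collect_chunk(token)

assert "".join(collected) == "Hello, world!"
```

Passing such a callable as streaming_callback lets you display partial output to users while generation is still in progress.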