
HuggingFaceAPIGenerator

Generates text using Hugging Face APIs.

Basic Information

  • Type: haystack.components.generators.hugging_face_api.HuggingFaceAPIGenerator

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | str | | A string representing the prompt. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function that is called when a new token is received from the stream. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for text generation. |

Outputs

| Parameter | Type | Description |
| --- | --- | --- |
| replies | List[str] | A list of strings representing the generated replies. |
| meta | List[Dict[str, Any]] | A list of dictionaries with metadata for each reply. Both lists have the same length n. |
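The output can be unpacked like this. The dictionary below is a hand-written stand-in shaped like the component's output; the reply text and metadata values are invented for illustration:

```python
# Hypothetical result dict shaped like HuggingFaceAPIGenerator.run() output;
# the reply text and metadata values are made up for illustration.
result = {
    "replies": ["Natural Language Processing (NLP) is a field of AI ..."],
    "meta": [{"model": "HuggingFaceH4/zephyr-7b-beta", "finish_reason": "length"}],
}

# replies and meta are parallel lists of the same length n
for reply, meta in zip(result["replies"], result["meta"]):
    print(f"{reply!r} (finish_reason={meta['finish_reason']})")
```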

Overview

Work in Progress

Bear with us while we work on adding pipeline examples and the most common component connections.

Generates text using Hugging Face APIs.

Use it with the following Hugging Face APIs:

  • Text Generation Inference (self-hosted)
  • Inference Endpoints
  • Serverless Inference API

Note: As of July 2025, the Hugging Face Inference API no longer offers generative models through the text_generation endpoint. Generative models are now available only through providers that support the chat_completion endpoint. As a result, this component might no longer work with the Hugging Face Inference API. Use the HuggingFaceAPIChatGenerator component, which supports the chat_completion endpoint.

With self-hosted text generation inference

from haystack.components.generators import HuggingFaceAPIGenerator

generator = HuggingFaceAPIGenerator(
    api_type="text_generation_inference",
    api_params={"url": "http://localhost:8080"},
)

result = generator.run(prompt="What's Natural Language Processing?")
print(result)

With the free serverless inference API

Be aware that this example might not work, as the Hugging Face Inference API no longer offers models that support the text_generation endpoint. Use the HuggingFaceAPIChatGenerator for generative models through the chat_completion endpoint.

from haystack.components.generators import HuggingFaceAPIGenerator
from haystack.utils import Secret

generator = HuggingFaceAPIGenerator(
    api_type="serverless_inference_api",
    api_params={"model": "HuggingFaceH4/zephyr-7b-beta"},
    token=Secret.from_token("<your-api-key>"),
)

result = generator.run(prompt="What's Natural Language Processing?")
print(result)

Usage Example

components:
  HuggingFaceAPIGenerator:
    type: haystack.components.generators.hugging_face_api.HuggingFaceAPIGenerator
    init_parameters:
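A filled-in version might look like this. The model name and parameter values below are illustrative choices, not defaults:

```yaml
components:
  HuggingFaceAPIGenerator:
    type: haystack.components.generators.hugging_face_api.HuggingFaceAPIGenerator
    init_parameters:
      api_type: serverless_inference_api
      api_params:
        model: HuggingFaceH4/zephyr-7b-beta   # illustrative model ID
      generation_kwargs:
        max_new_tokens: 256
        temperature: 0.7
```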

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| api_type | Union[HFGenerationAPIType, str] | | The type of Hugging Face API to use. Available types: text_generation_inference (see TGI), inference_endpoints (see Inference Endpoints), and serverless_inference_api (see Serverless Inference API; this might no longer work due to changes in the models offered in the Hugging Face Inference API, so use the HuggingFaceAPIChatGenerator component instead). |
| api_params | Dict[str, str] | | A dictionary with the following keys: model (Hugging Face model ID, required when api_type is SERVERLESS_INFERENCE_API); url (URL of the inference endpoint, required when api_type is INFERENCE_ENDPOINTS or TEXT_GENERATION_INFERENCE); and other parameters specific to the chosen API type, such as timeout, headers, or provider. |
| token | Optional[Secret] | Secret.from_env_var(['HF_API_TOKEN', 'HF_TOKEN'], strict=False) | The Hugging Face token to use as HTTP bearer authorization. Check your HF token in your account settings. |
| generation_kwargs | Optional[Dict[str, Any]] | None | A dictionary with keyword arguments to customize text generation, such as max_new_tokens, temperature, top_k, or top_p. See the Hugging Face documentation for details. |
| stop_words | Optional[List[str]] | None | An optional list of strings representing the stop words. |
| streaming_callback | Optional[StreamingCallbackT] | None | An optional callable for handling streaming responses. |
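A streaming_callback is just a callable that receives each streamed chunk as it arrives. The sketch below uses a SimpleNamespace with a .content attribute as a stand-in for the chunk object; treating .content as the token text is an assumption about the chunk shape, not a quote from Haystack's API:

```python
from types import SimpleNamespace

collected = []

def collect_chunk(chunk):
    # Each streamed chunk is assumed to carry the newly generated token
    # text in .content (SimpleNamespace stands in for the real chunk type).
    collected.append(chunk.content)
    print(chunk.content, end="", flush=True)

# Simulate the generator streaming three chunks through the callback
for token in ("Natural", " Language", " Processing"):
    collect_chunk(SimpleNamespace(content=token))

print()
print("full text:", "".join(collected))
```

You would pass such a function as streaming_callback=collect_chunk when constructing the component or calling run().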

Run Method Parameters

These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| prompt | str | | A string representing the prompt. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function that is called when a new token is received from the stream. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for text generation. |
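Per-run generation_kwargs typically take precedence over the init-time defaults. The merge sketched below with plain dicts illustrates that idea; the exact merge order is an assumption about the component's behavior, not taken from its source:

```python
# Init-time defaults vs. per-run overrides, illustrated with plain dicts.
init_generation_kwargs = {"max_new_tokens": 128, "temperature": 0.7}
run_generation_kwargs = {"temperature": 0.2}

# Later keys win in a dict merge, so the run-time temperature overrides
# the default (assumed precedence, shown here for illustration only).
merged = {**init_generation_kwargs, **run_generation_kwargs}
print(merged)
```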