HuggingFaceAPIGenerator
Generates text using Hugging Face APIs.
Basic Information
- Type: haystack.components.generators.hugging_face_api.HuggingFaceAPIGenerator
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| prompt | str | | A string representing the prompt. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function that is called when a new token is received from the stream. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for text generation. |
Outputs
The run() method returns a dictionary with the generated replies and their metadata. Both are lists of the same length.
| Parameter | Type | Description |
|---|---|---|
| replies | List[str] | A list of strings representing the generated replies. |
| meta | List[Dict[str, Any]] | A list of dictionaries containing the metadata for each reply. |
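To make the output shape concrete, here is a minimal sketch; the endpoint URL, reply text, and metadata keys in the comments are illustrative assumptions:
```python
from haystack.components.generators import HuggingFaceAPIGenerator

generator = HuggingFaceAPIGenerator(
    api_type="text_generation_inference",
    api_params={"url": "http://localhost:8080"},
)

result = generator.run(prompt="What's Natural Language Processing?")

# `result` is a dictionary with two lists of equal length:
#   result["replies"] -> ["Natural Language Processing is ..."]
#   result["meta"]    -> [{"model": "...", "finish_reason": "...", ...}]
print(result["replies"][0])
```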
Overview
Bear with us while we're working on adding pipeline examples and the most common component connections.
Generates text using Hugging Face APIs.
Use it with the following Hugging Face APIs:
- Text Generation Inference (TGI)
- Inference Endpoints
- Serverless Inference API
Note: As of July 2025, the Hugging Face Inference API no longer offers generative models through the text_generation endpoint. Generative models are now only available through providers supporting the chat_completion endpoint. As a result, this component might no longer work with the Hugging Face Inference API. Use the HuggingFaceAPIChatGenerator component, which supports the chat_completion endpoint.
With self-hosted text generation inference
```python
from haystack.components.generators import HuggingFaceAPIGenerator

generator = HuggingFaceAPIGenerator(
    api_type="text_generation_inference",
    api_params={"url": "http://localhost:8080"},
)

result = generator.run(prompt="What's Natural Language Processing?")
print(result)
```
With the free serverless inference API
Be aware that this example might not work as the Hugging Face Inference API no longer offers models that support the
text_generation endpoint. Use the HuggingFaceAPIChatGenerator for generative models through the
chat_completion endpoint.
```python
from haystack.components.generators import HuggingFaceAPIGenerator
from haystack.utils import Secret

generator = HuggingFaceAPIGenerator(
    api_type="serverless_inference_api",
    api_params={"model": "HuggingFaceH4/zephyr-7b-beta"},
    token=Secret.from_token("<your-api-key>"),
)

result = generator.run(prompt="What's Natural Language Processing?")
print(result)
```
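Since generative models are moving to the chat_completion endpoint, a minimal equivalent using HuggingFaceAPIChatGenerator might look like the following sketch; the model ID and token are placeholders:
```python
from haystack.components.generators.chat import HuggingFaceAPIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

generator = HuggingFaceAPIChatGenerator(
    api_type="serverless_inference_api",
    api_params={"model": "HuggingFaceH4/zephyr-7b-beta"},
    token=Secret.from_token("<your-api-key>"),
)

# The chat generator takes a list of ChatMessage objects instead of a raw prompt string.
result = generator.run(messages=[ChatMessage.from_user("What's Natural Language Processing?")])
print(result["replies"][0].text)
```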
Usage Example
```yaml
components:
  HuggingFaceAPIGenerator:
    type: haystack.components.generators.hugging_face_api.HuggingFaceAPIGenerator
    init_parameters:
```
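To sketch a common connection, the component is typically fed by a PromptBuilder that renders the prompt string. The template and endpoint URL below are illustrative assumptions, not part of this page's configuration:
```python
from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.generators import HuggingFaceAPIGenerator

pipeline = Pipeline()
pipeline.add_component("prompt_builder", PromptBuilder(template="Answer briefly: {{ question }}"))
pipeline.add_component(
    "generator",
    HuggingFaceAPIGenerator(
        api_type="text_generation_inference",
        api_params={"url": "http://localhost:8080"},
    ),
)

# The builder's rendered `prompt` output feeds the generator's `prompt` input.
pipeline.connect("prompt_builder.prompt", "generator.prompt")

result = pipeline.run({"prompt_builder": {"question": "What's Natural Language Processing?"}})
print(result["generator"]["replies"][0])
```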
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_type | Union[HFGenerationAPIType, str] | | The type of Hugging Face API to use. Available types: - text_generation_inference: See TGI. - inference_endpoints: See Inference Endpoints. - serverless_inference_api: See Serverless Inference API. This might no longer work due to changes in the models offered in the Hugging Face Inference API. Please use the HuggingFaceAPIChatGenerator component instead. |
| api_params | Dict[str, str] | | A dictionary with the following keys: - model: Hugging Face model ID. Required when api_type is SERVERLESS_INFERENCE_API. - url: URL of the inference endpoint. Required when api_type is INFERENCE_ENDPOINTS or TEXT_GENERATION_INFERENCE. - Other parameters specific to the chosen API type, such as timeout, headers, or provider. |
| token | Optional[Secret] | Secret.from_env_var(['HF_API_TOKEN', 'HF_TOKEN'], strict=False) | The Hugging Face token to use as HTTP bearer authorization. Check your HF token in your account settings. |
| generation_kwargs | Optional[Dict[str, Any]] | None | A dictionary with keyword arguments to customize text generation. Some examples: max_new_tokens, temperature, top_k, top_p. See the Hugging Face documentation for more information. |
| stop_words | Optional[List[str]] | None | An optional list of strings representing the stop words. |
| streaming_callback | Optional[StreamingCallbackT] | None | An optional callable for handling streaming responses. |
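For example, a streaming_callback can print tokens as they arrive. The following is a minimal sketch, assuming a self-hosted TGI endpoint at the URL shown:
```python
from haystack.components.generators import HuggingFaceAPIGenerator
from haystack.dataclasses import StreamingChunk

def print_chunk(chunk: StreamingChunk) -> None:
    # Each StreamingChunk carries the newly generated text in `content`.
    print(chunk.content, end="", flush=True)

generator = HuggingFaceAPIGenerator(
    api_type="text_generation_inference",
    api_params={"url": "http://localhost:8080"},
    streaming_callback=print_chunk,
)

generator.run(prompt="What's Natural Language Processing?")
```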
Run Method Parameters
These are the parameters you can configure for the component's run() method. This means you can pass these parameters at query time through the API, in Playground, or when running a job. For details, see Modify Pipeline Parameters at Query Time.
| Parameter | Type | Default | Description |
|---|---|---|---|
| prompt | str | | A string representing the prompt. |
| streaming_callback | Optional[StreamingCallbackT] | None | A callback function that is called when a new token is received from the stream. |
| generation_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for text generation. |
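As a sketch of passing run-time parameters, the generation_kwargs below override the init-time configuration for a single call; the values are illustrative:
```python
from haystack.components.generators import HuggingFaceAPIGenerator

generator = HuggingFaceAPIGenerator(
    api_type="text_generation_inference",
    api_params={"url": "http://localhost:8080"},
)

# Run-time generation_kwargs apply only to this call.
result = generator.run(
    prompt="Summarize Natural Language Processing in one sentence.",
    generation_kwargs={"max_new_tokens": 100, "temperature": 0.7},
)
print(result["replies"][0])
```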