SagemakerGenerator Parameters

Learn how to customize SagemakerGenerator.

YAML Init Parameters

These are the parameters you can pass to this component in the pipeline YAML configuration:

| Parameter | Type | Possible values | Description |
| --- | --- | --- | --- |
| model | String |  | The name of the SageMaker model endpoint. Required. |
| aws_access_key_id | Secret | Default: {"type": "env_var", "env_vars": ["AWS_ACCESS_KEY_ID"], "strict": False} | The Secret for the AWS access key ID. Optional. |
| aws_secret_access_key | Secret | Default: {"type": "env_var", "env_vars": ["AWS_SECRET_ACCESS_KEY"], "strict": False} | The Secret for the AWS secret access key. Optional. |
| aws_session_token | Secret | Default: {"type": "env_var", "env_vars": ["AWS_SESSION_TOKEN"], "strict": False} | The Secret for the AWS session token. Optional. |
| aws_region_name | Secret | Default: {"type": "env_var", "env_vars": ["AWS_DEFAULT_REGION"], "strict": False} | The Secret for the AWS region name. If not provided, the default region is used. Optional. |
| aws_profile_name | Secret | Default: {"type": "env_var", "env_vars": ["AWS_PROFILE"], "strict": False} | The Secret for the AWS profile name. If not provided, the default profile is used. Optional. |
| aws_custom_attributes | Dictionary of string and any | Default: None | Custom attributes to pass to SageMaker, for example {"accept_eula": True} for Llama 2 models. Optional. |
| generation_kwargs | Dictionary of string and any | Default: None | Additional keyword arguments for text generation. For a list of supported parameters, see your model's documentation page (for example, for Hugging Face models, see Run inference). For Llama 2 models, see the inference payload parameters listed below this table. Optional. |

Llama 2 models support the following inference payload parameters:

  • max_new_tokens: The model generates text until the output length (excluding the input context length) reaches max_new_tokens. If specified, it must be a positive integer.
  • temperature: A float that controls the randomness of the output. A higher temperature makes low-probability words more likely in the output sequence, while a lower temperature favors high-probability words. Setting temperature=0 results in greedy decoding. If specified, it must be a positive float.
  • top_p: In each step of text generation, sample from the smallest possible set of words whose cumulative probability is top_p. If specified, it must be a float between 0 and 1.
  • return_full_text: If True, the input text is included in the generated output text.
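
For reference, the sketch below shows how these init parameters could look in a pipeline YAML definition. It assumes the standard Haystack 2.x component serialization (a components map with type and init_parameters); the type path and the endpoint name are illustrative placeholders, and the Secret entries mirror the serialized defaults from the table above.

```yaml
components:
  generator:
    # Illustrative type path; check the SageMaker integration's documentation for the exact module path.
    type: haystack_integrations.components.generators.amazon_sagemaker.sagemaker.SagemakerGenerator
    init_parameters:
      # Name of your SageMaker inference endpoint (placeholder value).
      model: jumpstart-dft-hf-llm-falcon-7b-instruct-bf16
      # Credentials resolve from environment variables by default;
      # writing one out explicitly uses the serialized Secret form shown in the table.
      aws_access_key_id:
        type: env_var
        env_vars:
          - AWS_ACCESS_KEY_ID
        strict: false
      aws_region_name:
        type: env_var
        env_vars:
          - AWS_DEFAULT_REGION
        strict: false
```

If you omit the other AWS parameters, they fall back to the environment-variable defaults listed in the table.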


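For Llama 2 endpoints specifically, aws_custom_attributes and generation_kwargs work together: the EULA flag is passed as a custom attribute, and the inference payload parameters above go into generation_kwargs. The sketch below uses a hypothetical endpoint name and example values.

```yaml
components:
  llama2_generator:
    # Illustrative type path, as in the previous example.
    type: haystack_integrations.components.generators.amazon_sagemaker.sagemaker.SagemakerGenerator
    init_parameters:
      model: jumpstart-dft-meta-textgeneration-llama-2-7b-f   # hypothetical endpoint name
      # Llama 2 models require accepting the EULA, passed as a SageMaker custom attribute.
      aws_custom_attributes:
        accept_eula: true
      # Inference payload parameters described above.
      generation_kwargs:
        max_new_tokens: 256
        temperature: 0.7
        top_p: 0.9
        return_full_text: false
```
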
REST API Runtime Parameters

There are no runtime parameters you can pass to this component when making a request to the Search REST API endpoint.