SagemakerGenerator Parameters

Learn how to customize SagemakerGenerator.

YAML Init Parameters

These are the parameters you can pass to this component in the pipeline YAML configuration:

| Parameter | Type | Possible values | Description |
| --- | --- | --- | --- |
| model | String | | The name of the SageMaker model endpoint. Required. |
| aws_access_key_id | Secret | Default: {"type": "env_var", "env_vars": ["AWS_ACCESS_KEY_ID"], "strict": False} | The Secret for the AWS access key ID. Optional. |
| aws_secret_access_key | Secret | Default: {"type": "env_var", "env_vars": ["AWS_SECRET_ACCESS_KEY"], "strict": False} | The Secret for the AWS secret access key. Optional. |
| aws_session_token | Secret | Default: {"type": "env_var", "env_vars": ["AWS_SESSION_TOKEN"], "strict": False} | The Secret for the AWS session token. Optional. |
| aws_region_name | Secret | Default: {"type": "env_var", "env_vars": ["AWS_DEFAULT_REGION"], "strict": False} | The Secret for the AWS region name. If not provided, the default region is used. Optional. |
| aws_profile_name | Secret | Default: {"type": "env_var", "env_vars": ["AWS_PROFILE"], "strict": False} | The Secret for the AWS profile name. If not provided, the default profile is used. Optional. |
| aws_custom_attributes | Dictionary of string and any | Default: None | Custom attributes to pass to SageMaker, for example {"accept_eula": True} for Llama2 models. Optional. |
| generation_kwargs | Dictionary of string and any | Default: None | Additional keyword arguments for text generation. For a list of supported parameters, see your model's documentation page. For example, for Hugging Face models, see Run inference. Optional. |

Llama2 models support the following inference payload parameters in generation_kwargs:

- max_new_tokens: The model generates text until the output length (excluding the input context length) reaches max_new_tokens. If specified, it must be a positive integer.
- temperature: A float that controls the randomness of the output. A higher temperature makes low-probability words more likely to appear in the output sequence, and a lower temperature favors high-probability words. If temperature=0, it results in greedy decoding. If specified, it must be a positive float.
- top_p: In each step of text generation, the model samples from the smallest possible set of words whose cumulative probability is top_p. If specified, it must be a float between 0 and 1.
- return_full_text: If True, the input text is included in the generated output text.
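
The Llama2 constraints listed above can be checked before a generation_kwargs dictionary goes into the pipeline configuration. The following is a minimal sketch in plain Python, not part of Haystack or SageMaker: the helper name check_llama2_generation_kwargs is hypothetical, and it assumes that temperature=0 is accepted (because the list mentions greedy decoding) and that both top_p bounds are inclusive.

```python
from typing import Any, Dict, List


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so rule it out explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_llama2_generation_kwargs(kwargs: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return a list of problems, empty if none were found."""
    problems: List[str] = []

    if "max_new_tokens" in kwargs:
        value = kwargs["max_new_tokens"]
        if not (isinstance(value, int) and not isinstance(value, bool) and value > 0):
            problems.append("max_new_tokens must be a positive integer")

    if "temperature" in kwargs:
        value = kwargs["temperature"]
        # Assumption: 0 is allowed because it selects greedy decoding.
        if not (_is_number(value) and value >= 0):
            problems.append("temperature must be a non-negative number")

    if "top_p" in kwargs:
        value = kwargs["top_p"]
        # Assumption: both bounds are inclusive.
        if not (_is_number(value) and 0 <= value <= 1):
            problems.append("top_p must be a number between 0 and 1")

    if "return_full_text" in kwargs:
        if not isinstance(kwargs["return_full_text"], bool):
            problems.append("return_full_text must be a boolean")

    return problems


# Example values only; pick settings that suit your model.
generation_kwargs = {
    "max_new_tokens": 256,
    "temperature": 0.7,
    "top_p": 0.9,
    "return_full_text": False,
}
print(check_llama2_generation_kwargs(generation_kwargs))
print(check_llama2_generation_kwargs({"max_new_tokens": 0, "top_p": 1.5}))
```

The first call prints an empty list, and the second reports one problem for each invalid value. Other models may accept different parameters, so treat the checks as specific to the Llama2 list above.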


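The Secret defaults in the parameter table all share one shape: a type of env_var, a list of environment variable names, and strict set to False. The sketch below illustrates how such a default is expected to behave: it yields the first variable that is set, and because strict is False, it yields nothing instead of failing when none is set. The function resolve_env_var_secret is a hypothetical stand-in written for this illustration, not the actual Secret implementation.

```python
import os
from typing import Any, Mapping, Optional


def resolve_env_var_secret(
    spec: Mapping[str, Any],
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Illustrative only: resolve an env_var-style secret spec.

    Returns the value of the first environment variable in spec["env_vars"]
    that is set. If none is set, returns None when strict is False and
    raises ValueError when strict is True.
    """
    env = os.environ if environ is None else environ
    if spec.get("type") != "env_var":
        raise ValueError(f"Unsupported secret type: {spec.get('type')!r}")
    for name in spec["env_vars"]:
        value = env.get(name)
        if value is not None:
            return value
    if spec.get("strict", True):
        names = ", ".join(spec["env_vars"])
        raise ValueError(f"None of these environment variables are set: {names}")
    return None


# Default spec for aws_profile_name, copied from the parameter table.
profile_spec = {"type": "env_var", "env_vars": ["AWS_PROFILE"], "strict": False}

# A dict stands in for the real environment so the example is reproducible.
print(resolve_env_var_secret(profile_spec, {"AWS_PROFILE": "dev"}))
print(resolve_env_var_secret(profile_spec, {}))
```

With the stand-in environments shown, the first call prints dev and the second prints None, which corresponds to the "If not provided, the default profile is used" case in the table.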
REST API Runtime Parameters

There are no runtime parameters you can pass to this component when making a request to the Search REST API endpoint.