Pipeline Methods

Use your SDK to manage deepset Cloud pipelines. You can save, load, list, deploy, and undeploy pipelines using Python methods of the Pipeline class.

Prerequisites

  • You must be an Admin to perform this task.
  • If you work from your local SDK, you must have Haystack installed. For more information, see Haystack Installation. If you use Jupyter Notebooks in deepset Cloud, Haystack is already installed for you.
  • Add the API endpoint and API key to your environment variables, as shown in the example below. The API endpoint is <https://api.cloud.deepset.ai/api/v1>. See Generate an API Key.
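
For example, you can set both variables from Python before calling any of the SDK methods. This is a minimal sketch; DEEPSET_CLOUD_API_KEY and DEEPSET_CLOUD_API_ENDPOINT are the variable names the methods below fall back to when you don't pass api_key and api_endpoint explicitly.

import os

# Replace the placeholder with the API key you generated in deepset Cloud.
os.environ["DEEPSET_CLOUD_API_KEY"] = "<your API key>"
os.environ["DEEPSET_CLOUD_API_ENDPOINT"] = "https://api.cloud.deepset.ai/api/v1"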

📘

Server restart

If the Notebooks server closes while you're still working, all the files you saved are still there. When you restart the server, you'll be able to work on them again.

Save Your Pipeline to deepset Cloud

Create native Haystack pipelines and use the SDK to save them to deepset Cloud. Remember that your pipeline file should contain both a query pipeline and an indexing pipeline.

You can only save a pipeline that has not been deployed.

When using this method, you can only use DeepsetCloudDocumentStore as your document_store. Other document store types are not allowed; if you use them, they are replaced with DeepsetCloudDocumentStore with default parameters.
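
For reference, here's a minimal sketch of creating that document store for your pipelines, assuming the environment variables from the prerequisites are set:

from haystack.document_stores import DeepsetCloudDocumentStore

# With no arguments, the store targets the default workspace and falls back to
# the DEEPSET_CLOUD_API_KEY and DEEPSET_CLOUD_API_ENDPOINT environment variables.
document_store = DeepsetCloudDocumentStore()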

Here are the parameters that you can use with the save_to_deepset_cloud() method:

Method: save_to_deepset_cloud()
Description: Saves a pipeline configuration to deepset Cloud. A single pipeline configuration file must declare two pipelines: a query pipeline and an indexing pipeline.
Parameters:

  • query_pipeline - specifies the query pipeline of type BasePipeline to save.
  • index_pipeline - specifies the indexing pipeline to save.
  • pipeline_config_name - specifies the name you want to give to your pipeline file. This file is a YAML file that contains both the indexing and the query pipeline. String.
  • workspace - specifies the deepset Cloud workspace containing the pipeline. String. Optional. Set this value to default.
  • api_key - contains the secret value of the deepset Cloud API key. If no value is specified, it is read from the DEEPSET_CLOUD_API_KEY environment variable. String. Optional.
  • api_endpoint - specifies the URL of the deepset Cloud API. If not specified, it is read from the DEEPSET_CLOUD_API_ENDPOINT environment variable. The endpoint should be <https://api.cloud.deepset.ai/api/v1>. String. Optional.
  • overwrite - overwrites the configuration if it already exists. Boolean. Possible values: True/False. Exception: if a pipeline with the same name is deployed, it is not overwritten and saving the pipeline fails.

Example of usage
from haystack import Pipeline

query_pipeline = Pipeline()  # add the nodes of your query pipeline here
index_pipeline = Pipeline()  # add the nodes of your indexing pipeline here

Pipeline.save_to_deepset_cloud(
    query_pipeline=query_pipeline,
    index_pipeline=index_pipeline,
    pipeline_config_name="my_new_pipeline",
    api_endpoint=DC_API_ENDPOINT,
    api_key=DC_API_KEY,
)
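
If a configuration with the same name already exists and is not deployed, you can update it by passing overwrite=True. This is a sketch that reuses the pipelines from the example above:

Pipeline.save_to_deepset_cloud(
    query_pipeline=query_pipeline,
    index_pipeline=index_pipeline,
    pipeline_config_name="my_new_pipeline",
    api_endpoint=DC_API_ENDPOINT,
    api_key=DC_API_KEY,
    overwrite=True,  # fails if my_new_pipeline is currently deployed
)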

Generate Code from an Existing Pipeline

Turn a pipeline object into code and then continue working with it. The pipeline must already exist; otherwise, you'll receive an error message.

These are the parameters that you can use with the to_code() and to_notebook_cell() methods:

Method: to_code()
Description: Returns the pipeline as a string of code.
Parameters:

  • pipeline_variable_name - specifies the name of the variable that the generated pipeline code is assigned to. String.
  • generate_imports - includes import statements for the nodes and document store that the pipeline contains. Boolean. Possible values: True/False. Default: True.
  • add_comment - adds a comment stating that the code has been generated. Boolean. Possible values: True/False. Default: False.

Method: to_notebook_cell()
Description: Creates a new cell with the pipeline code in a Jupyter Notebook.
Parameters:

  • pipeline_variable_name - specifies the name of the variable that the generated pipeline code is assigned to. String.
  • generate_imports - includes import statements for the nodes and document store that the pipeline contains. Boolean. Possible values: True/False. Default: True.
  • add_comment - adds a comment before the code stating that it has been generated. Boolean. Possible values: True/False. Default: True.

🚧

Missing API Key

If you see an API key error after running the pipeline code in your notebook, it means that you must add the API key to the document store parameters and rerun your code. You can find the API keys when you click your name in deepset Cloud and go to Connections.
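
For example, assuming the generated code creates a DeepsetCloudDocumentStore, you can pass the key to it explicitly and rerun the cell. This is a sketch; replace the placeholder with a key from the Connections page:

from haystack.document_stores import DeepsetCloudDocumentStore

document_store = DeepsetCloudDocumentStore(
    api_key="<your API key>",  # copied from Connections in deepset Cloud
    workspace="default",
)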

Example of usage
from haystack import Pipeline

# The pipeline names must match the names declared in test_pipeline.yaml.
index_pipeline = Pipeline.load_from_yaml(
    "test_pipeline.yaml", pipeline_name="indexing_pipeline"
)
query_pipeline = Pipeline.load_from_yaml(
    "test_pipeline.yaml", pipeline_name="query_pipeline"
)

query_pipeline_code = query_pipeline.to_code(pipeline_variable_name="query_pipeline_from_code")
query_pipeline_code  # a string that contains the code of the query pipeline

query_pipeline.to_notebook_cell(pipeline_variable_name="query_pipeline_from_code")
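
Because to_code() returns plain Python source, one way to keep working with it is to execute the string, which defines the pipeline under the variable name you passed. A minimal sketch:

exec(query_pipeline_code)  # defines query_pipeline_from_code in the current scope
query_pipeline_from_code   # a regular Pipeline object you can modify and run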

List Available Pipelines

Display a list of all pipelines in deepset Cloud.

These are the parameters that you can use with the list_pipelines_on_deepset_cloud() method:

Method: list_pipelines_on_deepset_cloud()
Description: Lists all pipelines available in deepset Cloud. Returns a list of dictionaries. Each dictionary contains a name field; this is the pipeline_config_name that you can use in the load_from_deepset_cloud() method.
Parameters:

  • workspace - specifies the deepset Cloud workspace containing the pipeline. String. Optional. Set this value to default.
  • api_key - contains the secret value of the deepset Cloud API key. If no value is specified, it is read from the DEEPSET_CLOUD_API_KEY environment variable. String. Optional.
  • api_endpoint - specifies the URL of the deepset Cloud API. If not specified, it is read from the DEEPSET_CLOUD_API_ENDPOINT environment variable. The endpoint should be <https://api.cloud.deepset.ai/api/v1>. String. Optional.

Example of usage
from haystack import Pipeline
pipeline_list = Pipeline.list_pipelines_on_deepset_cloud()
print(pipeline_list)  # this lets you see the names of the pipelines
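
If you only need the configuration names, you can pull them out of the returned dictionaries. A small sketch that relies on the name field described above:

pipeline_names = [p["name"] for p in pipeline_list]  # pipeline_config_name values you can load or deploy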

Load a Pipeline

Load a pipeline from deepset Cloud.

These are the parameters that you can use with the load_from_deepset_cloud() method:

Method: load_from_deepset_cloud()
Description: Loads a pipeline from deepset Cloud.
Parameters:

  • pipeline_config_name - specifies the name of the pipeline configuration to load. String. Note: You can check this name using the list_pipelines_on_deepset_cloud() method.
  • pipeline_name - specifies which of the pipelines to load from the file. Typically, a pipeline file in deepset Cloud contains two pipelines: a query pipeline and an indexing pipeline. Here, you indicate which one you want to load. String. Possible values: query, indexing.
  • workspace - specifies the deepset Cloud workspace containing the pipeline. String. Optional. Set this value to default.
  • api_key - contains the secret value of the deepset Cloud API key. If no value is specified, it is read from the DEEPSET_CLOUD_API_KEY environment variable. String. Optional.
  • api_endpoint - specifies the URL of the deepset Cloud API. If not specified, it is read from the DEEPSET_CLOUD_API_ENDPOINT environment variable. The endpoint should be <https://api.cloud.deepset.ai/api/v1>. String. Optional.
  • overwrite_with_env_variables - overwrites the loaded configuration with values from environment variables. For example, to change the return_no_answer parameter of a FARMReader, set the READER_PARAMS_RETURN_NO_ANSWER=False variable. For nested hierarchies, use an underscore. Boolean. Possible values: True/False.

Example of usage
from haystack import Pipeline 

p = Pipeline.load_from_deepset_cloud(
    pipeline_config_name="my_pipeline", api_endpoint=API_ENDPOINT, api_key=API_KEY, pipeline_name="query"
)

p.run(query="Who is the father of Arya Stark?")
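
For example, to change a reader parameter through overwrite_with_env_variables, set the corresponding variable before loading. This is a sketch based on the FARMReader example above; the node name Reader and the variable pattern <NODE_NAME>_PARAMS_<PARAMETER_NAME> are assumptions about how your pipeline YAML names its nodes:

import os
from haystack import Pipeline

# Assumes the query pipeline contains a FARMReader node named "Reader".
os.environ["READER_PARAMS_RETURN_NO_ANSWER"] = "False"

p = Pipeline.load_from_deepset_cloud(
    pipeline_config_name="my_pipeline",
    pipeline_name="query",
    overwrite_with_env_variables=True,
)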

Deploy and Undeploy Pipelines

Deploy and undeploy pipelines that exist in deepset Cloud. While a pipeline is being deployed (or undeployed), you cannot modify it. If a pipeline is already deployed (or undeployed), nothing happens.

These are the parameters that you can use with the deploy_on_deepset_cloud() and undeploy_on_deepset_cloud() methods:

Method: deploy_on_deepset_cloud()
Description: Deploys the pipeline in deepset Cloud.
Parameters:

  • pipeline_config_name - specifies the name of the pipeline to deploy. String.
  • workspace - specifies the deepset Cloud workspace containing the pipeline. String. Optional. Set this value to default.
  • api_key - contains the secret value of the deepset Cloud API key. If no value is specified, it is read from the DEEPSET_CLOUD_API_KEY environment variable. String. Optional.
  • api_endpoint - specifies the URL of the deepset Cloud API. If not specified, it is read from the DEEPSET_CLOUD_API_ENDPOINT environment variable. The endpoint should be <https://api.cloud.deepset.ai/api/v1>. String. Optional.
  • timeout - specifies the time in seconds to wait until deployment completes. When exceeded, an error is displayed. Integer.

Method: undeploy_on_deepset_cloud()
Description: Undeploys a pipeline in deepset Cloud.
Parameters:

  • pipeline_config_name - specifies the name of the pipeline to undeploy. String.
  • workspace - specifies the deepset Cloud workspace containing the pipeline. String. Optional. Set this value to default.
  • api_key - contains the secret value of the deepset Cloud API key. If no value is specified, it is read from the DEEPSET_CLOUD_API_KEY environment variable. String. Optional.
  • api_endpoint - specifies the URL of the deepset Cloud API. If not specified, it is read from the DEEPSET_CLOUD_API_ENDPOINT environment variable. The endpoint should be <https://api.cloud.deepset.ai/api/v1>. String. Optional.
  • timeout - specifies the time in seconds to wait until undeployment completes. When exceeded, an error is displayed. Integer.

Example of usage
from haystack import Pipeline 

Pipeline.deploy_on_deepset_cloud(
    pipeline_config_name="my_pipeline", workspace="default", api_endpoint=API_ENDPOINT, api_key=API_KEY
)
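
Undeploying follows the same pattern. A sketch that reuses the same configuration name:

Pipeline.undeploy_on_deepset_cloud(
    pipeline_config_name="my_pipeline", workspace="default", api_endpoint=API_ENDPOINT, api_key=API_KEY
)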