SnowflakeExecutor

Use this component to establish a connection to a Snowflake database. You can then query your data in a Snowflake database with a deepset Cloud pipeline.

Snowflake is a cloud-based data warehousing platform that provides an SQL database engine. With SnowflakeExecutor, you can pass all the details needed to query your data in Snowflake with your deepset Cloud pipeline.

You can use SnowflakeExecutor on its own. Then, you need to provide an SQL query to it, and you'll get a database table formatted as Markdown as an output. However, a more common way is to use SnowflakeExecutor preceded by a PromptNode instructed to turn the user query into SQL, and followed by another PromptNode that answers the query based on the database tables SnowflakeExecutor retrieved.

Basic Information

  • Pipeline type: Used in query pipelines.
  • Nodes that can precede it in a pipeline: PromptNode instructed to generate SQL from the user query
  • Nodes that can follow it in a pipeline: PromptNode instructed to answer the query based on the tables SnowflakeExecutor fetched
  • Input: SQL query
  • Output: Database table

Usage Example

This is an example of SnowflakeExecutor being used between PromptNodes. The first PromptNode converts the user query into SQL and sends it to SnowflakeExecutor. SnowflakeExecutor contains all the database details. It connects to it and retrieves the table relevant to the SQL query. Then, it passes this table on to another PromptNode that answers the query based on it.

components:
  - name: sql_template #This is the template for the PromptNode that converts user query into SQL
    type: PromptTemplate
    params:
      prompt: |-
        You are a SQL expert working with Snowflake.
        Your task is to create a Snowflake SQL query for the given question.
        Refrain from explaining your answer. Your answer must be the SQL query in plain text format without using Markdown.
        Here are some relevant tables, a description about it, and their columns:
        {join(documents, delimiter=new_line, pattern=new_line+'Table Name: $table'+new_line+'Table Description: $content'+new_line+'Table Columns:'+new_line+'$table_columns')}
        User's question: {query}
        Generated SQL query:
  - name: PromptNodeSQLGenerator
    type: PromptNode
    params:
      default_prompt_template: sql_template
      max_length: 600 # The maximum number of tokens the generated SQL can have
      model_kwargs:
        # Specifies additional model settings
        temperature: 0
      model_name_or_path: gpt-4-turbo-preview
      truncate: false
  - name: SnowflakeExecutor
    type: SnowflakeExecutor
    params: # Pass all the database connection details here
      account: <SF-ACCOUNT>
      user: <SF-USER>
      warehouse: <SF-WAREHOUSE>
      database: <SF-DB>
      db_schema: <SF-DB-SCHEMA>
  - name: display_template # this is the prompt for a PromptNode that answers the query based on Snowflake database tables
    type: PromptTemplate
    params:
      output_parser:
        type: AnswerParser
      prompt: |-
        You are an expert data analyst.
        Your role is to answer the user's question {query} using the information in the table.
        You will base your response solely on the information provided in the table.
        Do not rely on your knowledge base; only the data that is in the table.
        Refrain from using the term "table" in your response, but instead, use the word "data"
        If the table is blank say:
        "The specific answer can't be found in the database. Try rephrasing your question."
        Additionally, you will present the table in a tabular format and provide the SQL query used to extract the relevant rows from the database in Markdown.
        If the table is larger than 10 rows, display the most important rows up to 10 rows. Your answer must be detailed and provide insights based on the question and the available data.
        SQL query:
        {results}
        Table:
        {table}
        Answer:
  - name: PromptNodeDisplay
    type: PromptNode
    params:
      default_prompt_template: display_template
      max_length: 4096 # The maximum number of tokens the generated answer can have
      model_kwargs:
        # Specifies additional model settings
        temperature: 0
      model_name_or_path: gpt-4-turbo-preview
      truncate: false
      
   pipelines:
      - name: PromptNodeSQLGenerator
        inputs: [Query]
      - name: SnowflakeExecutor
        inputs: [PromptNodeSQLGenerator]
      - name: PromptNodeDisplay
        inputs: [SnowflakeExecutor]

For an easy start, you can use one of the Text-to-SQL templates in deepset Cloud.

SnowflakeExecutor Parameters

ParameterTypePossible valuesDescription
userStringSnowflake user login.
Required.
accountStringSnowflake account identifier.
Required.
databaseStringName of the database to use.
Required.
db_schemaStringName of the schema to use.
Required.
api_keyStringDefault: NoneSnowflake account password. We recommend you set it on the Connections page in deepset Cloud.
Optional.
warehouseStringDefault: NoneName of the warehouse to use.
Optional.
login_timeoutIntegerDefault: 60Timeout in seconds for login. By default, 60 seconds.
Optional.