Use Snowflake Database
Query data in your Snowflake database using deepset Cloud pipelines.
About this Task
deepset Cloud provides the SnowflakeExecutor component that connects to your Snowflake database. It takes a SQL query as input and returns a database table that matches the query, and the generated answer.
Prerequisites
You need the following information about your Snowflake database:
- Snowflake account identifier
- Snowflake user login
- Warehouse name
- Schema name
- Database name
Query Your Snowflake Database
First, connect deepset Cloud to Snowflake through the Connections page:
-
Click your initials in the top right corner and select Connections.
-
Click Connect next to the provider.
-
Enter your user access token and submit it.
Then, add SnowflakeExecutor to your query pipeline.
Usage Examples
This is an example of a query pipeline that uses two Generators: one to turn a natural language query into SQL and send it to SnowflakeExecutor, and another one to build an answer based on SnowflakeExecutor's output. The generated answer includes the database table in Markdown format, the query, and the answer.
components:
bm25_retriever:
type: >-
haystack_integrations.components.retrievers.opensearch.bm25_retriever.OpenSearchBM25Retriever
init_parameters:
document_store:
type: >-
haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
use_ssl: true
verify_certs: false
hosts:
- ${OPENSEARCH_HOST}
http_auth:
- ${OPENSEARCH_USER}
- ${OPENSEARCH_PASSWORD}
embedding_dim: 768
similarity: cosine
top_k: 20
query_embedder:
type: >-
haystack.components.embedders.sentence_transformers_text_embedder.SentenceTransformersTextEmbedder
init_parameters:
model: intfloat/e5-base-v2
embedding_retriever:
type: >-
haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
init_parameters:
document_store:
type: >-
haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
init_parameters:
use_ssl: true
verify_certs: false
hosts:
- ${OPENSEARCH_HOST}
http_auth:
- ${OPENSEARCH_USER}
- ${OPENSEARCH_PASSWORD}
embedding_dim: 768
similarity: cosine
top_k: 20
document_joiner:
type: haystack.components.joiners.document_joiner.DocumentJoiner
init_parameters:
join_mode: concatenate
ranker:
type: >-
haystack.components.rankers.transformers_similarity.TransformersSimilarityRanker
init_parameters:
model: intfloat/simlm-msmarco-reranker
top_k: 8
model_kwargs:
torch_dtype: torch.float16
sql_prompt_builder:
type: haystack.components.builders.prompt_builder.PromptBuilder
init_parameters:
template: >-
You are a SQL expert working with Snowflake.
Your task is to create a Snowflake SQL query for the given question.
Refrain from explaining your answer. Your answer must be the SQL query
in plain text format without using Markdown.
Here are some relevant tables, a description about it, and their
columns:
{% for document in documents %}
Document[{{ loop.index }}]:
{{ document.content }}
{% endfor %}
User's question: {{ question }}
Generated SQL query:
sql_llm:
type: >-
deepset_cloud_custom_nodes.generators.deepset_amazon_bedrock_generator.DeepsetAmazonBedrockGenerator
init_parameters:
model: anthropic.claude-3-5-sonnet-20240620-v1:0
aws_region_name: us-east-1
max_length: 1000
model_max_length: 200000
temperature: 0
snowflake_executor:
type: >-
deepset_cloud_custom_nodes.augmenters.snowflake_executor.DeepsetSnowflakeExecutor
init_parameters:
user: <SF-USER>
account: <SF-ACCOUNT>
replies_to_sql:
type: haystack.components.converters.output_adapter.OutputAdapter
init_parameters:
template: '{{ replies[0] }}'
output_type: str
display_prompt_builder:
type: haystack.components.builders.prompt_builder.PromptBuilder
init_parameters:
template: >-
You are an expert data analyst.
Your role is to answer the user's question {query} using the information
in the table.
You will base your response solely on the information provided in the
table.
Do not rely on your knowledge base; only the data that is in the table.
Refrain from using the term "table" in your response, but instead, use
the word "data"
If the table is blank say:
"The specific answer can't be found in the database. Try rephrasing your
question."
Additionally, you will present the table in a tabular format and provide
the SQL query used to extract the relevant rows from the database in
Markdown.
If the table is larger than 10 rows, display the most important rows up
to 10 rows. Your answer must be detailed and provide insights based on
the question and the available data.
SQL query:
{{ sql_query }}
Table:
{{ table }}
Answer:
display_llm:
type: >-
deepset_cloud_custom_nodes.generators.deepset_amazon_bedrock_generator.DeepsetAmazonBedrockGenerator
init_parameters:
model: anthropic.claude-3-5-sonnet-20240620-v1:0
aws_region_name: us-east-1
max_length: 650
model_max_length: 200000
temperature: 0
answer_builder:
type: >-
deepset_cloud_custom_nodes.augmenters.deepset_answer_builder.DeepsetAnswerBuilder
init_parameters:
reference_pattern: acm
connections:
- sender: bm25_retriever.documents
receiver: document_joiner.documents
- sender: query_embedder.embedding
receiver: embedding_retriever.query_embedding
- sender: embedding_retriever.documents
receiver: document_joiner.documents
- sender: document_joiner.documents
receiver: ranker.documents
- sender: ranker.documents
receiver: sql_prompt_builder.documents
- sender: ranker.documents
receiver: answer_builder.documents
- sender: sql_prompt_builder.prompt
receiver: sql_llm.prompt
- sender: sql_llm.replies
receiver: replies_to_sql.replies
- sender: replies_to_sql.output
receiver: snowflake_executor.query
- sender: snowflake_executor.answers
receiver: display_prompt_builder.table
- sender: replies_to_sql.output
receiver: display_prompt_builder.sql_query
- sender: display_prompt_builder.prompt
receiver: answer_builder.prompt
- sender: display_prompt_builder.prompt
receiver: display_llm.prompt
- sender: display_llm.replies
receiver: answer_builder.replies
max_loops_allowed: 100
metadata: {}
inputs:
query:
- bm25_retriever.query
- query_embedder.text
- ranker.query
- sql_prompt_builder.question
- answer_builder.query
filters:
- bm25_retriever.filters
- embedding_retriever.filters
outputs:
documents: ranker.documents
answers: answer_builder.answers
Updated 14 days ago