# Use OpenAI Models

Use OpenAI models in your pipelines.

***

## About This Task

You can use OpenAI's embedding models and LLMs:

- For a list of embedding models, see [OpenAI documentation](https://platform.openai.com/docs/guides/embeddings/embedding-models).
- For a list of LLMs, see [OpenAI model overview](https://platform.openai.com/docs/models/models-overview).

## Prerequisites

You need an API key from an active OpenAI account. For details on obtaining it, see [Secret keys in OpenAI](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key).

## Use OpenAI Models

First, connect <ProductName /> to OpenAI through the Integrations page. You can set up the connection for a single workspace or for the whole organization:
<AddIntegration />

Then, add a component that uses an OpenAI model to your pipeline. Here are the components by the model type they use:

- Embedding models:
  - `OpenAITextEmbedder`: Calculates embeddings for text, such as a query. Often used in query pipelines to embed a query and pass the embedding to an embedding retriever.
  - `OpenAIDocumentEmbedder`: Calculates embeddings for documents. Often used in indexes to embed documents and pass them to `DocumentWriter`.
    <EmbeddingInfoCallout />

- LLMs:
  - `OpenAIGenerator`: Generates text using OpenAI models. Often used in RAG pipelines.

## Usage Examples

This is an example of how to use OpenAI's embedding models and an LLM in an index and a query pipeline (each in a separate tab):

<Tabs>
<TabItem value="index" label="Index" default>

```yaml 
components:
  # ...
  splitter:
    type: haystack.components.preprocessors.document_splitter.DocumentSplitter
    init_parameters:
      split_by: word
      split_length: 250
      split_overlap: 30

  document_embedder:
    type: haystack.components.embedders.openai_document_embedder.OpenAIDocumentEmbedder
    init_parameters:
      model: text-embedding-ada-002 # the model to use

  writer:
    type: haystack.components.writers.document_writer.DocumentWriter
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          embedding_dim: 1536 # must match the model's output size (1536 for text-embedding-ada-002)
          similarity: cosine
      policy: OVERWRITE

connections:  # Defines how the components are connected
  # ...
  - sender: splitter.documents
    receiver: document_embedder.documents
  - sender: document_embedder.documents
    receiver: writer.documents
```
</TabItem>
<TabItem value="query" label="Query Pipeline">

```yaml
components:
  # ...
  query_embedder:
    type: haystack.components.embedders.openai_text_embedder.OpenAITextEmbedder
    init_parameters:
      model: "text-embedding-ada-002" # same model as the document embedder in the index

  retriever:
    type: haystack_integrations.components.retrievers.opensearch.embedding_retriever.OpenSearchEmbeddingRetriever
    init_parameters:
      document_store:
        type: haystack_integrations.document_stores.opensearch.document_store.OpenSearchDocumentStore
        init_parameters:
          use_ssl: True
          verify_certs: False
          http_auth:
            - "${OPENSEARCH_USER}"
            - "${OPENSEARCH_PASSWORD}"
      top_k: 20

  prompt_builder:
    type: haystack.components.builders.prompt_builder.PromptBuilder
    init_parameters:
      template: |-
        You are a technical expert.
        You answer questions truthfully based on provided documents.
        For each document check whether it is related to the question.
        Only use documents that are related to the question to answer it.
        Ignore documents that are not related to the question.
        If the answer exists in several documents, summarize them.
        Only answer based on the documents provided. Don't make things up.
        If the documents can't answer the question or you are unsure say: 'The answer can't be found in the text'.
        These are the documents:
        {% for document in documents %}
        Document[{{ loop.index }}]:
        {{ document.content }}
        {% endfor %}
        Question: {{question}}
        Answer:

  generator:
    type: haystack.components.generators.openai.OpenAIGenerator
    init_parameters:
      model: "gpt-3.5-turbo" # the model to use
      generation_kwargs: # additional parameters for the model
        max_tokens: 400
        temperature: 0.0
        seed: 0

  answer_builder:
    type: haystack.components.builders.answer_builder.AnswerBuilder
    init_parameters: {}

  # ...

connections:  # Defines how the components are connected
  # ...
  - sender: query_embedder.embedding # OpenAITextEmbedder sends the embedded query to the retriever
    receiver: retriever.query_embedding
  - sender: retriever.documents
    receiver: prompt_builder.documents
  - sender: prompt_builder.prompt
    receiver: generator.prompt
  - sender: generator.replies
    receiver: answer_builder.replies
  # ...

inputs:
  query:
    # ...
    # OpenAITextEmbedder doesn't receive the query from any component it's connected to,
    # so it must receive it from the pipeline input. The embedding retriever takes only
    # the query embedding, so it isn't listed here.
    - "query_embedder.text"
    - "prompt_builder.question"
    - "answer_builder.query"
    # ...
  # ...
```
</TabItem>
</Tabs>

Here is how to connect the components in Pipeline Builder. In the index, `OpenAIDocumentEmbedder` receives documents from `DocumentSplitter` and then passes the embedded documents to `DocumentWriter`, which writes them into the Document Store:

<ClickableImage
  src="/img/how-tos/use_openai_models_doc_embedder.png"
  alt="In indexes, OpenAIDocumentEmbedder receives documents from DocumentSplitter and then passes the embedded documents to DocumentWriter, which writes them into the Document Store"
  size="standard"
/>

In a query pipeline, `OpenAITextEmbedder` embeds the query using the same model as the `OpenAIDocumentEmbedder` in the index. Then, it sends the embedded query to the retriever, which fetches matching documents and sends them to `PromptBuilder`. `OpenAIGenerator` then receives the rendered prompt from the `PromptBuilder` and sends the generated replies to `AnswerBuilder` to build a proper `GeneratedAnswer` object. 

<ClickableImage
  src="/img/how-tos/use_openai_models_generator.png"
  alt="In the query pipeline, OpenAITextEmbedder embeds the query using the same model as the OpenAIDocumentEmbedder in the index. Then, it sends the embedded query to the retriever, which fetches matching documents and sends them to PromptBuilder. OpenAIGenerator then receives the rendered prompt from the PromptBuilder and sends the generated replies to AnswerBuilder to build a proper GeneratedAnswer object."
  size="standard"
/>
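
Because the query embedder must use the same model as the document embedder, switching embedding models means changing both pipelines together. The fragment below is a minimal sketch of that pairing, assuming you switch to `text-embedding-3-small`, an OpenAI embedding model chosen here only for illustration. The component names match the examples above. Documents embedded with the previous model would need to be indexed again, since embeddings from different models aren't comparable.

```yaml
# Index: embeds documents at indexing time
document_embedder:
  type: haystack.components.embedders.openai_document_embedder.OpenAIDocumentEmbedder
  init_parameters:
    model: text-embedding-3-small # assumed model, for illustration only

# Query pipeline: embeds the query at search time
query_embedder:
  type: haystack.components.embedders.openai_text_embedder.OpenAITextEmbedder
  init_parameters:
    model: text-embedding-3-small # keep identical to document_embedder
```

If the new model produces embeddings of a different size, the `embedding_dim` setting of the document store would also need to change to match.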
