Skip to main content
For the complete documentation index for agents and LLMs, see llms.txt.

Haystack Enterprise Platform Documentation Knowledge Graph

This knowledge graph represents the structure and relationships between documentation topics in the dc-docs repository (Haystack Enterprise Platform documentation).

Main Documentation Areas

graph TB
Root[Haystack Enterprise Platform Documentation]

Root --> GettingStarted[Getting Started]
Root --> Learn[Learn]
Root --> Concepts[Concepts]
Root --> HowTo[How-To Guides]
Root --> Tutorials[Tutorials]
Root --> API[API Reference]
Root --> Builder[Builder]

style Root fill:#2563eb,stroke:#1e40af,color:#fff
style GettingStarted fill:#059669,stroke:#047857,color:#fff
style Learn fill:#dc2626,stroke:#b91c1c,color:#fff
style Concepts fill:#7c3aed,stroke:#6d28d9,color:#fff
style HowTo fill:#ea580c,stroke:#c2410c,color:#fff
style Tutorials fill:#0891b2,stroke:#0e7490,color:#fff
style API fill:#4b5563,stroke:#374151,color:#fff
style Builder fill:#65a30d,stroke:#4d7c0f,color:#fff

Core Concepts Relationships

graph LR
subgraph Core["Core Concepts"]
Pipelines[Pipelines]
Components[Pipeline Components]
Indexes[Indexes]
DocStores[Document Stores]
Data[Data Flow]
LLMs[Language Models]
Jobs[Jobs]
Agents[AI Agents]
end

Pipelines -->|contain| Components
Pipelines -->|use| Indexes
Pipelines -->|connect to| DocStores
Pipelines -->|use| LLMs
Pipelines -->|run as| Jobs
Pipelines -->|can include| Agents

Indexes -->|write to| DocStores
Indexes -->|process| Data

Components -->|read from| DocStores
Components -->|use| LLMs

Agents -->|use| Components
Agents -->|call| Tools[Agent Tools]
Agents -->|maintain| Memory[Agent Memory]

style Pipelines fill:#2563eb,stroke:#1e40af,color:#fff
style Components fill:#7c3aed,stroke:#6d28d9,color:#fff
style Indexes fill:#059669,stroke:#047857,color:#fff
style DocStores fill:#dc2626,stroke:#b91c1c,color:#fff
style Agents fill:#ea580c,stroke:#c2410c,color:#fff

Pipeline Components Ecosystem

graph TB
subgraph ComponentTypes["Component Types"]
Embedders[Embedders]
Generators[Generators]
Retrievers[Retrievers]
Rankers[Rankers]
Builders[Builders]
Converters[Converters]
Preprocessors[Preprocessors]
Joiners[Joiners]
Routers[Routers]
Writers[Writers]
end

subgraph Providers["Model Providers"]
OpenAI[OpenAI]
Anthropic[Anthropic]
Cohere[Cohere]
AmazonBedrock[Amazon Bedrock]
GoogleVertex[Google Vertex]
HuggingFace[Hugging Face]
Nvidia[Nvidia]
Ollama[Ollama]
end

Embedders -->|powered by| Providers
Generators -->|powered by| Providers
Rankers -->|powered by| Providers

Converters --> Preprocessors
Preprocessors --> Embedders
Embedders --> Retrievers
Retrievers --> Rankers
Rankers --> Builders
Builders --> Generators
Generators --> Writers

style ComponentTypes fill:#f3f4f6,stroke:#9ca3af
style Providers fill:#fef3c7,stroke:#fbbf24

Document Stores Relationships

graph TB
subgraph DocStores["Document Stores"]
OpenSearch[OpenSearch<br/>Core Store]
Elasticsearch[Elasticsearch]
Pinecone[Pinecone]
Weaviate[Weaviate]
Qdrant[Qdrant]
MongoDB[MongoDB Atlas]
PgVector[PgVector]
Snowflake[Snowflake]
end

subgraph Components["Components That Use Stores"]
DocWriter[DocumentWriter]
BM25[BM25 Retriever]
Embedding[Embedding Retriever]
Hybrid[Hybrid Retriever]
end

OpenSearch -->|core managed| DocWriter
OpenSearch -->|retrieve from| BM25
OpenSearch -->|retrieve from| Embedding

Elasticsearch --> DocWriter
Pinecone --> DocWriter
Weaviate --> DocWriter
Qdrant --> DocWriter
MongoDB --> DocWriter
PgVector --> DocWriter

style OpenSearch fill:#2563eb,stroke:#1e40af,color:#fff
style DocWriter fill:#059669,stroke:#047857,color:#fff

Data Flow

graph LR
subgraph Upload["Data Upload"]
Files[Files]
S3[AWS S3]
VPC[Private VPC]
end

subgraph Indexing["Indexing Process"]
Index[Index]
Preprocess[Preprocessing]
Documents[Documents]
end

subgraph Storage["Storage Layer"]
DocStore[Document Store]
Database[SQL Database]
end

subgraph Query["Query Pipeline"]
UserQuery[User Query]
Retriever[Retriever]
GenOrLLM[Generator or LLM]
AnswerBuilder[Answer builder]
Answer[Answer]
end

Files --> S3
Files --> VPC
S3 --> Index
VPC --> Index

Index --> Preprocess
Preprocess --> Documents
Documents --> DocStore
Files -->|metadata| Database

UserQuery --> Retriever
DocStore --> Retriever
Retriever --> GenOrLLM
GenOrLLM --> AnswerBuilder
AnswerBuilder --> Answer
Answer -->|results| Database

style Upload fill:#dbeafe,stroke:#3b82f6
style Indexing fill:#dcfce7,stroke:#22c55e
style Storage fill:#fce7f3,stroke:#ec4899
style Query fill:#fef3c7,stroke:#f59e0b

AI Agents Architecture

graph TB
subgraph Agent["AI Agent System"]
AgentComp[Agent Component]
LLM[Language Model]
Memory[Agent Memory]
Tools[Agent Tools]
end

subgraph ToolTypes["Tool Types"]
Pipelines[Pipelines]
CustomFunc[Custom Functions]
MCP[MCP Servers]
WebSearch[Web Search]
end

UserInput[User Input] --> AgentComp
AgentComp --> LLM
LLM -->|decides| Tools
Tools -->|execute| ToolTypes
ToolTypes -->|results| LLM
LLM -->|output| Memory
Memory -->|context| LLM
LLM --> Answer[Answer]

style AgentComp fill:#ea580c,stroke:#c2410c,color:#fff
style Tools fill:#7c3aed,stroke:#6d28d9,color:#fff
style Memory fill:#059669,stroke:#047857,color:#fff

How-To Guides Organization

graph TB
HowTo[How-To Guides]

HowTo --> BuildingAgents[Building Agents]
HowTo --> DesigningPipelines[Designing Pipelines]
HowTo --> WorkingWithIndexes[Working with Indexes]
HowTo --> WorkingWithData[Working with Data]
HowTo --> Searching[Searching]
HowTo --> Evaluating[Evaluating]
HowTo --> Optimizing[Optimizing]
HowTo --> Productionizing[Productionizing]
HowTo --> ManagingAccess[Managing Access]
HowTo --> WorkingWithJobs[Working with Jobs]
HowTo --> UsingSDK[Using SDK]

DesigningPipelines --> CreatePipeline[Create Pipeline]
DesigningPipelines --> DeployPipeline[Deploy Pipeline]
DesigningPipelines --> EditPipeline[Edit Pipeline]
DesigningPipelines --> WorkWithLLMs[Work with LLMs]
DesigningPipelines --> CustomComponents[Custom Components]
DesigningPipelines --> HostedModels[Hosted Models]

BuildingAgents --> ConfigureAgent[Configure Agent]
BuildingAgents --> AdvancedConfig[Advanced Config]
BuildingAgents --> MinimalAgent[Minimal Agent]
BuildingAgents --> Troubleshooting[Troubleshooting]

style HowTo fill:#ea580c,stroke:#c2410c,color:#fff

Learning Path

graph LR
subgraph Learn["Learn Section"]
Basics[5-Step Guide]
AppComponents[App Components]
DocRetrieval[Document Retrieval]
Extractive[Extractive QA]
RAG[RAG QA]
LLMOverview[LLM Overview]
PromptEng[Prompt Engineering]
MCP[Model Context Protocol]
end

Basics --> AppComponents
AppComponents --> DocRetrieval
DocRetrieval --> Extractive
Extractive --> RAG
RAG --> LLMOverview
LLMOverview --> PromptEng

style Basics fill:#059669,stroke:#047857,color:#fff
style RAG fill:#dc2626,stroke:#b91c1c,color:#fff

Tutorials Journey

graph TB
Tutorials[Tutorials]

subgraph Basics["Learn the Basics"]
FirstSearch[First Document Search]
FirstQA[First QA App]
RobustRAG[Robust RAG System]
DataCleaning[Data Cleaning Agent]
PII[PII Masking]
MongoDB[MongoDB RAG]
AutoTagging[Auto-tagging with LLM]
ConnectUI[Connect to UI]
UploadCLI[Upload with CLI]
end

subgraph Advanced["Learn Advanced Features"]
CustomComponent[Custom Component]
DemoApp[Demo Your App]
PythonUpload[Upload with Python]
end

subgraph RESTAPI["REST API Tutorials"]
ChatApp[Chat App API]
FeedbackAPI[Feedback API]
end

Tutorials --> Basics
Tutorials --> Advanced
Tutorials --> RESTAPI

FirstSearch --> FirstQA
FirstQA --> RobustRAG

style Tutorials fill:#0891b2,stroke:#0e7490,color:#fff

Cross-Cutting Concerns

graph TB
subgraph CrossCutting["Cross-Cutting Concerns"]
Security[Secrets & Integrations]
Roles[User Roles & Permissions]
Workspaces[Workspaces]
Organizations[Organizations]
Settings[Settings]
Status[Platform Status]
end

subgraph AllAreas["Affects All Areas"]
Pipelines2[Pipelines]
Indexes2[Indexes]
Data2[Data]
Agents2[Agents]
end

Security -.->|secures| AllAreas
Roles -.->|controls access| AllAreas
Workspaces -.->|contains| AllAreas
Organizations -.->|manages| Workspaces

style CrossCutting fill:#f3f4f6,stroke:#9ca3af

Component Providers and Integrations

graph TB
subgraph Haystack["Haystack Components"]
HaystackCore[Core Components]
Embedders2[Embedders]
Generators2[Generators]
Retrievers2[Retrievers]
Preprocessors2[Preprocessors]
Builders2[Builders]
end

subgraph DeepsetCustom["deepset Custom Nodes"]
Augmenters[Augmenters]
Code[Code]
Crawler[Firecrawl]
DeepsetGen[deepset Generators]
DeepsetConv[deepset Converters]
end

subgraph ThirdParty["Third-Party Integrations"]
OpenAI2[OpenAI]
Anthropic2[Anthropic]
Cohere2[Cohere]
Bedrock2[Amazon Bedrock]
Vertex2[Google Vertex]
Nvidia2[Nvidia]
Jina2[Jina]
Voyage2[Voyage]
Mistral2[Mistral]
end

HaystackCore --> Embedders2
HaystackCore --> Generators2
HaystackCore --> Retrievers2
HaystackCore --> Preprocessors2
HaystackCore --> Builders2

ThirdParty -.->|powers| HaystackCore
ThirdParty -.->|powers| DeepsetCustom

style Haystack fill:#2563eb,stroke:#1e40af,color:#fff
style DeepsetCustom fill:#059669,stroke:#047857,color:#fff
style ThirdParty fill:#fef3c7,stroke:#fbbf24

Documentation Structure Summary

Top-Level Organization

File counts are MDX pages under docs/ (approximate; rerun find when restructuring).

  1. Getting Started (~37 files)

    • Basic concepts
    • Quick start guide
    • What's new (releases)
    • Platform status
    • Settings management
    • Working in Haystack Enterprise Platform
  2. Learn (~9 files)

    • 5-step guide to prototyping
    • Document retrieval
    • Extractive QA
    • RAG QA
    • LLM overview
    • Prompt engineering
    • Model Context Protocol
  3. Concepts (~26 files under docs/concepts/)

    • Pipelines (including examples and multimodal topics)
    • AI Agents
    • Document stores
    • Indexes
    • Data in the platform
    • Language models
    • Jobs
    • Roles, secrets, and integrations (cross-cutting)
  4. Reference — Pipeline components (~234 files under docs/reference/pipeline-components/)

    • AI components (for example, Agent, LLM)
    • Knowledge retrieval, data processing, logic and flow
    • Third-party integrations (provider-specific components)
    • Custom Code and workspace custom components
    • Legacy and deprecated components
    • Input, output, and overview pages
  5. How-To Guides (~107 files)

    • Building agents
    • Designing pipelines (including smart connections and hosted models)
    • Working with indexes and data
    • Searching, evaluating, optimizing
    • Productionizing, managing access, jobs
    • Using the SDK and REST API workflows
  6. Tutorials (~14 files)

    • Learn the basics (9 tutorials)
    • Learn advanced features (3 tutorials)
    • REST API tutorials (2 tutorials)
  7. API Reference (~227 files)

    • Main REST API (OpenAPI-generated pages)
    • Jobs API and related endpoints
  8. Builder (1 file)

    • Deploy with Hayhooks

Key Relationships Summary

Primary Dependencies

  • Pipelines depend on Components, Indexes, Document Stores, and LLMs
  • Indexes depend on Document Stores and process Data
  • Components depend on Document Stores and LLMs
  • AI Agents depend on Components, Tools, and maintain Memory
  • Query pipelines depend on enabled Indexes
  • Document Stores are used by both Indexes (write) and Retrievers (read)

Data Flow Path

  1. Files → Upload to S3/VPC
  2. Files → Index (preprocessing)
  3. Index → Documents → Document Store
  4. User Query → Retriever → Document Store
  5. Retriever → Documents → Generator or LLM
  6. Generator or LLM → Answer builder (DeepsetAnswerBuilder or Haystack AnswerBuilder) → Answer

Agent Workflow

  1. User Input → Agent Component
  2. Agent → LLM (with tools list)
  3. LLM → Tool Call OR Direct Answer
  4. Tool → Execution → Result
  5. Result → Check Exit Condition
  6. Continue loop or return answer

Component Pipeline

Files → Converters → Preprocessors → Embedders → (Storage) → Retrievers → Rankers → Prompt builders → Generators or LLM → Answer builders → Output

Smart connections can merge compatible lists (for example, multiple retrievers into one documents input) and convert between some types (for example, string and ChatMessage), which reduces the need for joiners and adapters in many pipelines.

Cross-References and Integration Points

  • Security & Access: Applies to all pipelines, indexes, and data
  • Workspaces: Contain pipelines, indexes, and files
  • Organizations: Contain multiple workspaces
  • Jobs: Can run any pipeline type
  • LLMs: Used by Generators, ChatGenerators, the LLM component, and Agents
  • Document Stores: Central to both indexing and querying
  • Embedders: Must use same model in indexing and query pipelines
  • Agents: Can use pipelines as tools

Special Component Combinations

Common patterns documented:

  • Retriever + Ranker
  • PromptBuilder + Generator
  • ChatPromptBuilder + ChatGenerator
  • Embedder + Retriever
  • Joiner + Ranker (often replaceable with smart connections)
  • Generator or LLM + DeepsetAnswerBuilder (RAG answers with references in the UI) or Haystack AnswerBuilder
  • Input (messages) + Agent (minimal agent pipelines)
  • Router + Multiple paths
  • Validator + Loop