Skip to main content

Haystack Enterprise Platform Documentation Knowledge Graph

This knowledge graph represents the structure and relationships between documentation topics in the (dc-docs) repository.

Main Documentation Areas

graph TB
Root[deepset Cloud Documentation]

Root --> GettingStarted[Getting Started]
Root --> Learn[Learn]
Root --> Concepts[Concepts]
Root --> HowTo[How-To Guides]
Root --> Tutorials[Tutorials]
Root --> API[API Reference]
Root --> Builder[Builder]

style Root fill:#2563eb,stroke:#1e40af,color:#fff
style GettingStarted fill:#059669,stroke:#047857,color:#fff
style Learn fill:#dc2626,stroke:#b91c1c,color:#fff
style Concepts fill:#7c3aed,stroke:#6d28d9,color:#fff
style HowTo fill:#ea580c,stroke:#c2410c,color:#fff
style Tutorials fill:#0891b2,stroke:#0e7490,color:#fff
style API fill:#4b5563,stroke:#374151,color:#fff
style Builder fill:#65a30d,stroke:#4d7c0f,color:#fff

Core Concepts Relationships

graph LR
subgraph Core["Core Concepts"]
Pipelines[Pipelines]
Components[Pipeline Components]
Indexes[Indexes]
DocStores[Document Stores]
Data[Data Flow]
LLMs[Language Models]
Jobs[Jobs]
Agents[AI Agents]
end

Pipelines -->|contain| Components
Pipelines -->|use| Indexes
Pipelines -->|connect to| DocStores
Pipelines -->|use| LLMs
Pipelines -->|run as| Jobs
Pipelines -->|can include| Agents

Indexes -->|write to| DocStores
Indexes -->|process| Data

Components -->|read from| DocStores
Components -->|use| LLMs

Agents -->|use| Components
Agents -->|call| Tools[Agent Tools]
Agents -->|maintain| Memory[Agent Memory]

style Pipelines fill:#2563eb,stroke:#1e40af,color:#fff
style Components fill:#7c3aed,stroke:#6d28d9,color:#fff
style Indexes fill:#059669,stroke:#047857,color:#fff
style DocStores fill:#dc2626,stroke:#b91c1c,color:#fff
style Agents fill:#ea580c,stroke:#c2410c,color:#fff

Pipeline Components Ecosystem

graph TB
subgraph ComponentTypes["Component Types"]
Embedders[Embedders]
Generators[Generators]
Retrievers[Retrievers]
Rankers[Rankers]
Builders[Builders]
Converters[Converters]
Preprocessors[Preprocessors]
Joiners[Joiners]
Routers[Routers]
Writers[Writers]
end

subgraph Providers["Model Providers"]
OpenAI[OpenAI]
Anthropic[Anthropic]
Cohere[Cohere]
AmazonBedrock[Amazon Bedrock]
GoogleVertex[Google Vertex]
HuggingFace[Hugging Face]
Nvidia[Nvidia]
Ollama[Ollama]
end

Embedders -->|powered by| Providers
Generators -->|powered by| Providers
Rankers -->|powered by| Providers

Converters --> Preprocessors
Preprocessors --> Embedders
Embedders --> Retrievers
Retrievers --> Rankers
Rankers --> Builders
Builders --> Generators
Generators --> Writers

style ComponentTypes fill:#f3f4f6,stroke:#9ca3af
style Providers fill:#fef3c7,stroke:#fbbf24

Document Stores Relationships

graph TB
subgraph DocStores["Document Stores"]
OpenSearch[OpenSearch<br/>Core Store]
Elasticsearch[Elasticsearch]
Pinecone[Pinecone]
Weaviate[Weaviate]
Qdrant[Qdrant]
MongoDB[MongoDB Atlas]
PgVector[PgVector]
Snowflake[Snowflake]
end

subgraph Components["Components That Use Stores"]
DocWriter[DocumentWriter]
BM25[BM25 Retriever]
Embedding[Embedding Retriever]
Hybrid[Hybrid Retriever]
end

OpenSearch -->|core managed| DocWriter
OpenSearch -->|retrieve from| BM25
OpenSearch -->|retrieve from| Embedding

Elasticsearch --> DocWriter
Pinecone --> DocWriter
Weaviate --> DocWriter
Qdrant --> DocWriter
MongoDB --> DocWriter
PgVector --> DocWriter

style OpenSearch fill:#2563eb,stroke:#1e40af,color:#fff
style DocWriter fill:#059669,stroke:#047857,color:#fff

Data Flow

graph LR
subgraph Upload["Data Upload"]
Files[Files]
S3[AWS S3]
VPC[Private VPC]
end

subgraph Indexing["Indexing Process"]
Index[Index]
Preprocess[Preprocessing]
Documents[Documents]
end

subgraph Storage["Storage Layer"]
DocStore[Document Store]
Database[SQL Database]
end

subgraph Query["Query Pipeline"]
UserQuery[User Query]
Retriever[Retriever]
Generator[Generator]
Answer[Answer]
end

Files --> S3
Files --> VPC
S3 --> Index
VPC --> Index

Index --> Preprocess
Preprocess --> Documents
Documents --> DocStore
Files -->|metadata| Database

UserQuery --> Retriever
DocStore --> Retriever
Retriever --> Generator
Generator --> Answer
Answer -->|results| Database

style Upload fill:#dbeafe,stroke:#3b82f6
style Indexing fill:#dcfce7,stroke:#22c55e
style Storage fill:#fce7f3,stroke:#ec4899
style Query fill:#fef3c7,stroke:#f59e0b

AI Agents Architecture

graph TB
subgraph Agent["AI Agent System"]
AgentComp[Agent Component]
LLM[Language Model]
Memory[Agent Memory]
Tools[Agent Tools]
end

subgraph ToolTypes["Tool Types"]
Pipelines[Pipelines]
CustomFunc[Custom Functions]
MCP[MCP Servers]
WebSearch[Web Search]
end

UserInput[User Input] --> AgentComp
AgentComp --> LLM
LLM -->|decides| Tools
Tools -->|execute| ToolTypes
ToolTypes -->|results| LLM
LLM -->|output| Memory
Memory -->|context| LLM
LLM --> Answer[Answer]

style AgentComp fill:#ea580c,stroke:#c2410c,color:#fff
style Tools fill:#7c3aed,stroke:#6d28d9,color:#fff
style Memory fill:#059669,stroke:#047857,color:#fff

How-To Guides Organization

graph TB
HowTo[How-To Guides]

HowTo --> BuildingAgents[Building Agents]
HowTo --> DesigningPipelines[Designing Pipelines]
HowTo --> WorkingWithIndexes[Working with Indexes]
HowTo --> WorkingWithData[Working with Data]
HowTo --> Searching[Searching]
HowTo --> Evaluating[Evaluating]
HowTo --> Optimizing[Optimizing]
HowTo --> Productionizing[Productionizing]
HowTo --> ManagingAccess[Managing Access]
HowTo --> WorkingWithJobs[Working with Jobs]
HowTo --> UsingSDK[Using SDK]

DesigningPipelines --> CreatePipeline[Create Pipeline]
DesigningPipelines --> DeployPipeline[Deploy Pipeline]
DesigningPipelines --> EditPipeline[Edit Pipeline]
DesigningPipelines --> WorkWithLLMs[Work with LLMs]
DesigningPipelines --> CustomComponents[Custom Components]
DesigningPipelines --> HostedModels[Hosted Models]

BuildingAgents --> ConfigureAgent[Configure Agent]
BuildingAgents --> AdvancedConfig[Advanced Config]
BuildingAgents --> MinimalAgent[Minimal Agent]
BuildingAgents --> Troubleshooting[Troubleshooting]

style HowTo fill:#ea580c,stroke:#c2410c,color:#fff

Learning Path

graph LR
subgraph Learn["Learn Section"]
Basics[5-Step Guide]
AppComponents[App Components]
DocRetrieval[Document Retrieval]
Extractive[Extractive QA]
RAG[RAG QA]
LLMOverview[LLM Overview]
PromptEng[Prompt Engineering]
MCP[Model Context Protocol]
end

Basics --> AppComponents
AppComponents --> DocRetrieval
DocRetrieval --> Extractive
Extractive --> RAG
RAG --> LLMOverview
LLMOverview --> PromptEng

style Basics fill:#059669,stroke:#047857,color:#fff
style RAG fill:#dc2626,stroke:#b91c1c,color:#fff

Tutorials Journey

graph TB
Tutorials[Tutorials]

subgraph Basics["Learn the Basics"]
FirstSearch[First Document Search]
FirstQA[First QA App]
RobustRAG[Robust RAG System]
DataCleaning[Data Cleaning Agent]
PII[PII Masking]
MongoDB[MongoDB RAG]
AutoTagging[Auto-tagging with LLM]
ConnectUI[Connect to UI]
UploadCLI[Upload with CLI]
end

subgraph Advanced["Learn Advanced Features"]
CustomComponent[Custom Component]
DemoApp[Demo Your App]
PythonUpload[Upload with Python]
end

subgraph RESTAPI["REST API Tutorials"]
ChatApp[Chat App API]
FeedbackAPI[Feedback API]
end

Tutorials --> Basics
Tutorials --> Advanced
Tutorials --> RESTAPI

FirstSearch --> FirstQA
FirstQA --> RobustRAG

style Tutorials fill:#0891b2,stroke:#0e7490,color:#fff

Cross-Cutting Concerns

graph TB
subgraph CrossCutting["Cross-Cutting Concerns"]
Security[Secrets & Integrations]
Roles[User Roles & Permissions]
Workspaces[Workspaces]
Organizations[Organizations]
Settings[Settings]
Status[Platform Status]
end

subgraph AllAreas["Affects All Areas"]
Pipelines2[Pipelines]
Indexes2[Indexes]
Data2[Data]
Agents2[Agents]
end

Security -.->|secures| AllAreas
Roles -.->|controls access| AllAreas
Workspaces -.->|contains| AllAreas
Organizations -.->|manages| Workspaces

style CrossCutting fill:#f3f4f6,stroke:#9ca3af

Component Providers and Integrations

graph TB
subgraph Haystack["Haystack Components"]
HaystackCore[Core Components]
Embedders2[Embedders]
Generators2[Generators]
Retrievers2[Retrievers]
Preprocessors2[Preprocessors]
Builders2[Builders]
end

subgraph DeepsetCustom["deepset Custom Nodes"]
Augmenters[Augmenters]
Code[Code]
Crawler[Firecrawl]
DeepsetGen[deepset Generators]
DeepsetConv[deepset Converters]
end

subgraph ThirdParty["Third-Party Integrations"]
OpenAI2[OpenAI]
Anthropic2[Anthropic]
Cohere2[Cohere]
Bedrock2[Amazon Bedrock]
Vertex2[Google Vertex]
Nvidia2[Nvidia]
Jina2[Jina]
Voyage2[Voyage]
Mistral2[Mistral]
end

HaystackCore --> Embedders2
HaystackCore --> Generators2
HaystackCore --> Retrievers2
HaystackCore --> Preprocessors2
HaystackCore --> Builders2

ThirdParty -.->|powers| HaystackCore
ThirdParty -.->|powers| DeepsetCustom

style Haystack fill:#2563eb,stroke:#1e40af,color:#fff
style DeepsetCustom fill:#059669,stroke:#047857,color:#fff
style ThirdParty fill:#fef3c7,stroke:#fbbf24

Documentation Structure Summary

Top-Level Organization

  1. Getting Started (13 files)

    • Basic concepts
    • Quick start guide
    • What's new (releases)
    • Platform status
    • Settings management
    • Working in deepset Cloud
  2. Learn (9 files)

    • 5-step guide to prototyping
    • Document retrieval
    • Extractive QA
    • RAG QA
    • LLM overview
    • Prompt engineering
    • Model Context Protocol
  3. Concepts (260+ files)

    • Pipelines (3 files + examples)
    • AI Agents (4 files)
    • Pipeline Components (245 files)
      • Haystack components (200 files)
      • deepset custom nodes (45 files)
    • Document stores (8 files)
    • Indexes
    • Data flow
    • Language models
    • Jobs
  4. How-To Guides (100+ files)

    • Building agents (5 files)
    • Designing pipelines (38 files)
    • Working with indexes (6 files)
    • Working with data (9 files)
    • Searching (3 files)
    • Evaluating (5 files)
    • Optimizing (3 files)
    • Productionizing (12 files)
    • Managing access (9 files)
    • Working with jobs (3 files)
    • Using SDK (12 files)
  5. Tutorials (14 files)

    • Learn the basics (9 tutorials)
    • Learn advanced features (3 tutorials)
    • REST API tutorials (2 tutorials)
  6. API Reference (199 files)

    • Main API endpoints (178 files)
    • Jobs API endpoints (19 files)
  7. Builder (1 file)

    • Deploy with Hayhooks

Key Relationships Summary

Primary Dependencies

  • Pipelines depend on Components, Indexes, Document Stores, and LLMs
  • Indexes depend on Document Stores and process Data
  • Components depend on Document Stores and LLMs
  • AI Agents depend on Components, Tools, and maintain Memory
  • Query pipelines depend on enabled Indexes
  • Document Stores are used by both Indexes (write) and Retrievers (read)

Data Flow Path

  1. Files → Upload to S3/VPC
  2. Files → Index (preprocessing)
  3. Index → Documents → Document Store
  4. User Query → Retriever → Document Store
  5. Retriever → Documents → Generator
  6. Generator → Answer

Agent Workflow

  1. User Input → Agent Component
  2. Agent → LLM (with tools list)
  3. LLM → Tool Call OR Direct Answer
  4. Tool → Execution → Result
  5. Result → Check Exit Condition
  6. Continue loop or return answer

Component Pipeline

Files → Converters → Preprocessors → Embedders → (Storage) → Retrievers → Rankers → Builders → Generators → Answer

Cross-References and Integration Points

  • Security & Access: Applies to all pipelines, indexes, and data
  • Workspaces: Contain pipelines, indexes, and files
  • Organizations: Contain multiple workspaces
  • Jobs: Can run any pipeline type
  • LLMs: Used by Generators, ChatGenerators, and Agents
  • Document Stores: Central to both indexing and querying
  • Embedders: Must use same model in indexing and query pipelines
  • Agents: Can use pipelines as tools

Special Component Combinations

Common patterns documented:

  • Retriever + Ranker
  • PromptBuilder + Generator
  • ChatPromptBuilder + ChatGenerator
  • Embedder + Retriever
  • Joiner + Ranker
  • Generator + AnswerBuilder
  • Router + Multiple paths
  • Validator + Loop