Haystack Enterprise Platform Documentation Knowledge Graph
This knowledge graph represents the structure and relationships between documentation topics in the dc-docs repository (Haystack Enterprise Platform documentation).
Main Documentation Areas
graph TB
Root[Haystack Enterprise Platform Documentation]
Root --> GettingStarted[Getting Started]
Root --> Learn[Learn]
Root --> Concepts[Concepts]
Root --> HowTo[How-To Guides]
Root --> Tutorials[Tutorials]
Root --> API[API Reference]
Root --> Builder[Builder]
style Root fill:#2563eb,stroke:#1e40af,color:#fff
style GettingStarted fill:#059669,stroke:#047857,color:#fff
style Learn fill:#dc2626,stroke:#b91c1c,color:#fff
style Concepts fill:#7c3aed,stroke:#6d28d9,color:#fff
style HowTo fill:#ea580c,stroke:#c2410c,color:#fff
style Tutorials fill:#0891b2,stroke:#0e7490,color:#fff
style API fill:#4b5563,stroke:#374151,color:#fff
style Builder fill:#65a30d,stroke:#4d7c0f,color:#fff
Core Concepts Relationships
graph LR
subgraph Core["Core Concepts"]
Pipelines[Pipelines]
Components[Pipeline Components]
Indexes[Indexes]
DocStores[Document Stores]
Data[Data Flow]
LLMs[Language Models]
Jobs[Jobs]
Agents[AI Agents]
end
Pipelines -->|contain| Components
Pipelines -->|use| Indexes
Pipelines -->|connect to| DocStores
Pipelines -->|use| LLMs
Pipelines -->|run as| Jobs
Pipelines -->|can include| Agents
Indexes -->|write to| DocStores
Indexes -->|process| Data
Components -->|read from| DocStores
Components -->|use| LLMs
Agents -->|use| Components
Agents -->|call| Tools[Agent Tools]
Agents -->|maintain| Memory[Agent Memory]
style Pipelines fill:#2563eb,stroke:#1e40af,color:#fff
style Components fill:#7c3aed,stroke:#6d28d9,color:#fff
style Indexes fill:#059669,stroke:#047857,color:#fff
style DocStores fill:#dc2626,stroke:#b91c1c,color:#fff
style Agents fill:#ea580c,stroke:#c2410c,color:#fff
Pipeline Components Ecosystem
graph TB
subgraph ComponentTypes["Component Types"]
Embedders[Embedders]
Generators[Generators]
Retrievers[Retrievers]
Rankers[Rankers]
Builders[Builders]
Converters[Converters]
Preprocessors[Preprocessors]
Joiners[Joiners]
Routers[Routers]
Writers[Writers]
end
subgraph Providers["Model Providers"]
OpenAI[OpenAI]
Anthropic[Anthropic]
Cohere[Cohere]
AmazonBedrock[Amazon Bedrock]
GoogleVertex[Google Vertex]
HuggingFace[Hugging Face]
Nvidia[Nvidia]
Ollama[Ollama]
end
Embedders -->|powered by| Providers
Generators -->|powered by| Providers
Rankers -->|powered by| Providers
Converters --> Preprocessors
Preprocessors --> Embedders
Embedders --> Retrievers
Retrievers --> Rankers
Rankers --> Builders
Builders --> Generators
Generators --> Writers
style ComponentTypes fill:#f3f4f6,stroke:#9ca3af
style Providers fill:#fef3c7,stroke:#fbbf24
Document Stores Relationships
graph TB
subgraph DocStores["Document Stores"]
OpenSearch[OpenSearch<br/>Core Store]
Elasticsearch[Elasticsearch]
Pinecone[Pinecone]
Weaviate[Weaviate]
Qdrant[Qdrant]
MongoDB[MongoDB Atlas]
PgVector[PgVector]
Snowflake[Snowflake]
end
subgraph Components["Components That Use Stores"]
DocWriter[DocumentWriter]
BM25[BM25 Retriever]
Embedding[Embedding Retriever]
Hybrid[Hybrid Retriever]
end
OpenSearch -->|core managed| DocWriter
OpenSearch -->|retrieve from| BM25
OpenSearch -->|retrieve from| Embedding
Elasticsearch --> DocWriter
Pinecone --> DocWriter
Weaviate --> DocWriter
Qdrant --> DocWriter
MongoDB --> DocWriter
PgVector --> DocWriter
style OpenSearch fill:#2563eb,stroke:#1e40af,color:#fff
style DocWriter fill:#059669,stroke:#047857,color:#fff
Data Flow
graph LR
subgraph Upload["Data Upload"]
Files[Files]
S3[AWS S3]
VPC[Private VPC]
end
subgraph Indexing["Indexing Process"]
Index[Index]
Preprocess[Preprocessing]
Documents[Documents]
end
subgraph Storage["Storage Layer"]
DocStore[Document Store]
Database[SQL Database]
end
subgraph Query["Query Pipeline"]
UserQuery[User Query]
Retriever[Retriever]
GenOrLLM[Generator or LLM]
AnswerBuilder[Answer builder]
Answer[Answer]
end
Files --> S3
Files --> VPC
S3 --> Index
VPC --> Index
Index --> Preprocess
Preprocess --> Documents
Documents --> DocStore
Files -->|metadata| Database
UserQuery --> Retriever
DocStore --> Retriever
Retriever --> GenOrLLM
GenOrLLM --> AnswerBuilder
AnswerBuilder --> Answer
Answer -->|results| Database
style Upload fill:#dbeafe,stroke:#3b82f6
style Indexing fill:#dcfce7,stroke:#22c55e
style Storage fill:#fce7f3,stroke:#ec4899
style Query fill:#fef3c7,stroke:#f59e0b
AI Agents Architecture
graph TB
subgraph Agent["AI Agent System"]
AgentComp[Agent Component]
LLM[Language Model]
Memory[Agent Memory]
Tools[Agent Tools]
end
subgraph ToolTypes["Tool Types"]
Pipelines[Pipelines]
CustomFunc[Custom Functions]
MCP[MCP Servers]
WebSearch[Web Search]
end
UserInput[User Input] --> AgentComp
AgentComp --> LLM
LLM -->|decides| Tools
Tools -->|execute| ToolTypes
ToolTypes -->|results| LLM
LLM -->|output| Memory
Memory -->|context| LLM
LLM --> Answer[Answer]
style AgentComp fill:#ea580c,stroke:#c2410c,color:#fff
style Tools fill:#7c3aed,stroke:#6d28d9,color:#fff
style Memory fill:#059669,stroke:#047857,color:#fff
How-To Guides Organization
graph TB
HowTo[How-To Guides]
HowTo --> BuildingAgents[Building Agents]
HowTo --> DesigningPipelines[Designing Pipelines]
HowTo --> WorkingWithIndexes[Working with Indexes]
HowTo --> WorkingWithData[Working with Data]
HowTo --> Searching[Searching]
HowTo --> Evaluating[Evaluating]
HowTo --> Optimizing[Optimizing]
HowTo --> Productionizing[Productionizing]
HowTo --> ManagingAccess[Managing Access]
HowTo --> WorkingWithJobs[Working with Jobs]
HowTo --> UsingSDK[Using SDK]
DesigningPipelines --> CreatePipeline[Create Pipeline]
DesigningPipelines --> DeployPipeline[Deploy Pipeline]
DesigningPipelines --> EditPipeline[Edit Pipeline]
DesigningPipelines --> WorkWithLLMs[Work with LLMs]
DesigningPipelines --> CustomComponents[Custom Components]
DesigningPipelines --> HostedModels[Hosted Models]
BuildingAgents --> ConfigureAgent[Configure Agent]
BuildingAgents --> AdvancedConfig[Advanced Config]
BuildingAgents --> MinimalAgent[Minimal Agent]
BuildingAgents --> Troubleshooting[Troubleshooting]
style HowTo fill:#ea580c,stroke:#c2410c,color:#fff
Learning Path
graph LR
subgraph Learn["Learn Section"]
Basics[5-Step Guide]
AppComponents[App Components]
DocRetrieval[Document Retrieval]
Extractive[Extractive QA]
RAG[RAG QA]
LLMOverview[LLM Overview]
PromptEng[Prompt Engineering]
MCP[Model Context Protocol]
end
Basics --> AppComponents
AppComponents --> DocRetrieval
DocRetrieval --> Extractive
Extractive --> RAG
RAG --> LLMOverview
LLMOverview --> PromptEng
style Basics fill:#059669,stroke:#047857,color:#fff
style RAG fill:#dc2626,stroke:#b91c1c,color:#fff
Tutorials Journey
graph TB
Tutorials[Tutorials]
subgraph Basics["Learn the Basics"]
FirstSearch[First Document Search]
FirstQA[First QA App]
RobustRAG[Robust RAG System]
DataCleaning[Data Cleaning Agent]
PII[PII Masking]
MongoDB[MongoDB RAG]
AutoTagging[Auto-tagging with LLM]
ConnectUI[Connect to UI]
UploadCLI[Upload with CLI]
end
subgraph Advanced["Learn Advanced Features"]
CustomComponent[Custom Component]
DemoApp[Demo Your App]
PythonUpload[Upload with Python]
end
subgraph RESTAPI["REST API Tutorials"]
ChatApp[Chat App API]
FeedbackAPI[Feedback API]
end
Tutorials --> Basics
Tutorials --> Advanced
Tutorials --> RESTAPI
FirstSearch --> FirstQA
FirstQA --> RobustRAG
style Tutorials fill:#0891b2,stroke:#0e7490,color:#fff
Cross-Cutting Concerns
graph TB
subgraph CrossCutting["Cross-Cutting Concerns"]
Security[Secrets & Integrations]
Roles[User Roles & Permissions]
Workspaces[Workspaces]
Organizations[Organizations]
Settings[Settings]
Status[Platform Status]
end
subgraph AllAreas["Affects All Areas"]
Pipelines2[Pipelines]
Indexes2[Indexes]
Data2[Data]
Agents2[Agents]
end
Security -.->|secures| AllAreas
Roles -.->|controls access| AllAreas
Workspaces -.->|contains| AllAreas
Organizations -.->|manages| Workspaces
style CrossCutting fill:#f3f4f6,stroke:#9ca3af
Component Providers and Integrations
graph TB
subgraph Haystack["Haystack Components"]
HaystackCore[Core Components]
Embedders2[Embedders]
Generators2[Generators]
Retrievers2[Retrievers]
Preprocessors2[Preprocessors]
Builders2[Builders]
end
subgraph DeepsetCustom["deepset Custom Nodes"]
Augmenters[Augmenters]
Code[Code]
Crawler[Firecrawl]
DeepsetGen[deepset Generators]
DeepsetConv[deepset Converters]
end
subgraph ThirdParty["Third-Party Integrations"]
OpenAI2[OpenAI]
Anthropic2[Anthropic]
Cohere2[Cohere]
Bedrock2[Amazon Bedrock]
Vertex2[Google Vertex]
Nvidia2[Nvidia]
Jina2[Jina]
Voyage2[Voyage]
Mistral2[Mistral]
end
HaystackCore --> Embedders2
HaystackCore --> Generators2
HaystackCore --> Retrievers2
HaystackCore --> Preprocessors2
HaystackCore --> Builders2
ThirdParty -.->|powers| HaystackCore
ThirdParty -.->|powers| DeepsetCustom
style Haystack fill:#2563eb,stroke:#1e40af,color:#fff
style DeepsetCustom fill:#059669,stroke:#047857,color:#fff
style ThirdParty fill:#fef3c7,stroke:#fbbf24
Documentation Structure Summary
Top-Level Organization
File counts are MDX pages under docs/ (approximate; rerun find when restructuring).
-
Getting Started (~37 files)
- Basic concepts
- Quick start guide
- What's new (releases)
- Platform status
- Settings management
- Working in Haystack Enterprise Platform
-
Learn (~9 files)
- 5-step guide to prototyping
- Document retrieval
- Extractive QA
- RAG QA
- LLM overview
- Prompt engineering
- Model Context Protocol
-
Concepts (~26 files under
docs/concepts/)- Pipelines (including examples and multimodal topics)
- AI Agents
- Document stores
- Indexes
- Data in the platform
- Language models
- Jobs
- Roles, secrets, and integrations (cross-cutting)
-
Reference — Pipeline components (~234 files under
docs/reference/pipeline-components/)- AI components (for example,
Agent,LLM) - Knowledge retrieval, data processing, logic and flow
- Third-party integrations (provider-specific components)
- Custom
Codeand workspace custom components - Legacy and deprecated components
- Input, output, and overview pages
- AI components (for example,
-
How-To Guides (~107 files)
- Building agents
- Designing pipelines (including smart connections and hosted models)
- Working with indexes and data
- Searching, evaluating, optimizing
- Productionizing, managing access, jobs
- Using the SDK and REST API workflows
-
Tutorials (~14 files)
- Learn the basics (9 tutorials)
- Learn advanced features (3 tutorials)
- REST API tutorials (2 tutorials)
-
API Reference (~227 files)
- Main REST API (OpenAPI-generated pages)
- Jobs API and related endpoints
-
Builder (1 file)
- Deploy with Hayhooks
Key Relationships Summary
Primary Dependencies
- Pipelines depend on Components, Indexes, Document Stores, and LLMs
- Indexes depend on Document Stores and process Data
- Components depend on Document Stores and LLMs
- AI Agents depend on Components, Tools, and maintain Memory
- Query pipelines depend on enabled Indexes
- Document Stores are used by both Indexes (write) and Retrievers (read)
Data Flow Path
- Files → Upload to S3/VPC
- Files → Index (preprocessing)
- Index → Documents → Document Store
- User Query → Retriever → Document Store
- Retriever → Documents → Generator or LLM
- Generator or LLM → Answer builder (
DeepsetAnswerBuilderor HaystackAnswerBuilder) → Answer
Agent Workflow
- User Input → Agent Component
- Agent → LLM (with tools list)
- LLM → Tool Call OR Direct Answer
- Tool → Execution → Result
- Result → Check Exit Condition
- Continue loop or return answer
Component Pipeline
Files → Converters → Preprocessors → Embedders → (Storage) → Retrievers → Rankers → Prompt builders → Generators or LLM → Answer builders → Output
Smart connections can merge compatible lists (for example, multiple retrievers into one documents input) and convert between some types (for example, string and ChatMessage), which reduces the need for joiners and adapters in many pipelines.
Cross-References and Integration Points
- Security & Access: Applies to all pipelines, indexes, and data
- Workspaces: Contain pipelines, indexes, and files
- Organizations: Contain multiple workspaces
- Jobs: Can run any pipeline type
- LLMs: Used by Generators, ChatGenerators, the
LLMcomponent, and Agents - Document Stores: Central to both indexing and querying
- Embedders: Must use same model in indexing and query pipelines
- Agents: Can use pipelines as tools
Special Component Combinations
Common patterns documented:
- Retriever + Ranker
- PromptBuilder + Generator
- ChatPromptBuilder + ChatGenerator
- Embedder + Retriever
- Joiner + Ranker (often replaceable with smart connections)
- Generator or LLM +
DeepsetAnswerBuilder(RAG answers with references in the UI) or HaystackAnswerBuilder Input(messages) +Agent(minimal agent pipelines)- Router + Multiple paths
- Validator + Loop
Was this page helpful?