GitHubRepoViewer
Navigate and fetch content from GitHub repositories.
Basic Information
- Type: haystack_integrations.components.connectors.github.repo_viewer.GitHubRepoViewer
- Components it can connect with:
  - Input: Receives repository path and branch as input.
  - Converters: Sends fetched documents to other components for processing.
  - ChatPromptBuilder: Sends documents as context for LLM-based analysis.
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| repo | Optional[str] | None | Repository in format "owner/repo". |
| path | str | | Path within repository (default: root). |
| branch | Optional[str] | None | Git reference (branch, tag, commit) to use. |
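The `repo` input is expected in "owner/repo" form. As a minimal sketch, a caller could check that shape before passing the value to the pipeline. The helper `split_repo` below is hypothetical and not part of GitHubRepoViewer; the component's own validation may differ.

```python
from typing import Tuple


def split_repo(repo: str) -> Tuple[str, str]:
    """Split an "owner/repo" string into (owner, name).

    Hypothetical helper for illustration; not part of GitHubRepoViewer.
    """
    owner, sep, name = repo.strip().partition("/")
    # Require exactly one slash with non-empty text on both sides.
    if not sep or not owner or not name or "/" in name:
        raise ValueError(f'expected "owner/repo", got {repo!r}')
    return owner, name


print(split_repo("deepset-ai/haystack"))
```

Rejecting malformed values early gives a clearer message than an API error returned later in the run.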
Outputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| documents | List[Document] | | List of documents containing repository content. |
Overview
GitHubRepoViewer navigates and fetches content from GitHub repositories. It returns different document structures based on the path type:
For directories:
- Returns a list of Documents, one for each item
- Each Document's content is the item name
- Full path and metadata are stored in Document.meta
For files:
- Returns a single Document
- Document's content is the file content
- Full path and metadata are stored in Document.meta
For errors:
- Returns a single Document with error message as content
- Document's meta contains type="error"
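Because errors can come back as ordinary documents, downstream code may want to separate them before building a prompt. The sketch below relies only on the `type="error"` marker described above. The `Doc` class is a stand-in for a Haystack Document (an assumption that only `content` and `meta` matter here), and the sample values are illustrative.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Doc:
    """Stand-in for a Haystack Document; only content and meta are modeled."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def partition_errors(documents: List[Doc]) -> Tuple[List[Doc], List[Doc]]:
    """Return (usable, errors); errors are documents with meta["type"] == "error"."""
    usable: List[Doc] = []
    errors: List[Doc] = []
    for doc in documents:
        if doc.meta.get("type") == "error":
            errors.append(doc)
        else:
            usable.append(doc)
    return usable, errors


# Illustrative data only; real documents come from GitHubRepoViewer.
docs = [
    Doc("README.md", {"path": "README.md"}),
    Doc("Something went wrong", {"type": "error"}),
]
usable, errors = partition_errors(docs)
print(len(usable), len(errors))
```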
Authorization
To use this component, you must create a secret with your GitHub personal access token. Type GITHUB_TOKEN as the secret key. For detailed instructions on creating secrets, see Create Secrets.
Usage Example
This pipeline uses GitHubRepoViewer to fetch repository content for analysis:
components:
  GitHubRepoViewer:
    type: haystack_integrations.components.connectors.github.repo_viewer.GitHubRepoViewer
    init_parameters:
      github_token:
        type: env_var
        env_vars:
          - GITHUB_TOKEN
        strict: false
      raise_on_failure: true
      max_file_size: 1000000
      repo:
      branch: main
  ChatPromptBuilder:
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
    init_parameters:
      template:
        - role: system
          content: "You are a code analysis assistant. Analyze the provided repository content and answer questions about the code."
        - role: user
          content: "Repository content:\n{% for doc in documents %}\nFile: {{ doc.meta.path }}\n{{ doc.content }}\n{% endfor %}\n\nQuestion: {{ query }}"
  OpenAIChatGenerator:
    type: haystack.components.generators.chat.openai.OpenAIChatGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - OPENAI_API_KEY
        strict: false
      model: gpt-4o-mini
  OutputAdapter:
    type: haystack.components.converters.output_adapter.OutputAdapter
    init_parameters:
      template: '{{ replies[0] }}'
      output_type: List[str]
  AnswerBuilder:
    type: haystack.components.builders.answer_builder.AnswerBuilder
    init_parameters:
      pattern:
      reference_pattern:
connections:
  - sender: GitHubRepoViewer.documents
    receiver: ChatPromptBuilder.documents
  - sender: ChatPromptBuilder.prompt
    receiver: OpenAIChatGenerator.messages
  - sender: OpenAIChatGenerator.replies
    receiver: OutputAdapter.replies
  - sender: OutputAdapter.output
    receiver: AnswerBuilder.replies
  - sender: GitHubRepoViewer.documents
    receiver: AnswerBuilder.documents
inputs:
  query:
    - ChatPromptBuilder.query
    - AnswerBuilder.query
  repo:
    - GitHubRepoViewer.repo
  path:
    - GitHubRepoViewer.path
  branch:
    - GitHubRepoViewer.branch
outputs:
  answers: AnswerBuilder.answers
max_runs_per_component: 100
metadata: {}
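To make the user-message template in this pipeline easier to read, here is a plain-Python approximation of what its loop produces. It is not how ChatPromptBuilder renders the template (that uses Jinja), and it assumes default Jinja whitespace handling. The function name and sample data are hypothetical.

```python
from typing import Dict, List


def render_user_message(documents: List[Dict[str, str]], query: str) -> str:
    """Approximate the user-message template with plain string formatting."""
    parts = ["Repository content:\n"]
    for doc in documents:
        # Mirrors "\nFile: <doc.meta.path>\n<doc.content>\n" for each document.
        parts.append(f"\nFile: {doc['path']}\n{doc['content']}\n")
    parts.append(f"\n\nQuestion: {query}")
    return "".join(parts)


# Illustrative data only; real documents come from GitHubRepoViewer.
sample = [{"path": "src/app.py", "content": "print('hi')"}]
print(render_user_message(sample, "What does this file do?"))
```

Each fetched document contributes one "File:" header followed by its content, so a directory listing yields many short entries while a single file yields one long entry.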
Parameters
Init Parameters
These are the parameters you can configure in Pipeline Builder:
| Parameter | Type | Default | Description |
|---|---|---|---|
| github_token | Optional[Secret] | None | GitHub personal access token for API authentication. |
| raise_on_failure | bool | True | If True, raises exceptions on API errors. |
| max_file_size | int | 1000000 | Maximum file size in bytes to fetch (default: 1MB). |
| repo | Optional[str] | None | Repository in format "owner/repo". |
| branch | str | main | Git reference (branch, tag, commit) to use. |
Run Method Parameters
These are the parameters you can configure for the run() method. You can pass these parameters at query time through the API, in Playground, or when running a job.
| Parameter | Type | Default | Description |
|---|---|---|---|
| repo | Optional[str] | None | Repository in format "owner/repo". |
| path | str | | Path within repository (default: root). |
| branch | Optional[str] | None | Git reference (branch, tag, commit) to use. |
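Since `repo` and `branch` appear both as init parameters and as run parameters, it helps to think about which value applies to a given run. The sketch below assumes that a run-time value, when provided, takes precedence and that the init value is used otherwise. This precedence is an assumption not stated on this page, and `resolve_run_params` is a hypothetical helper.

```python
from typing import Dict, Optional


def resolve_run_params(
    path: str,
    repo: Optional[str] = None,
    branch: Optional[str] = None,
    *,
    init_repo: Optional[str] = None,
    init_branch: str = "main",
) -> Dict[str, str]:
    """Pick the effective repo and branch for one run (assumed precedence)."""
    # Assumption: run-time values win; init values fill in when they are None.
    effective_repo = repo if repo is not None else init_repo
    effective_branch = branch if branch is not None else init_branch
    if effective_repo is None:
        raise ValueError("repo must be set at init or at run time")
    return {"repo": effective_repo, "path": path, "branch": effective_branch}


print(resolve_run_params("docs", init_repo="owner/repo"))
print(resolve_run_params("docs", repo="other/repo", branch="dev", init_repo="owner/repo"))
```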