GitHubRepoViewer

Navigate and fetch content from GitHub repositories.

Basic Information

  • Type: haystack_integrations.components.connectors.github.repo_viewer.GitHubRepoViewer
  • Components it can connect with:
    • Input: Receives repository path and branch as input.
    • Converters: Sends fetched documents to other components for processing.
    • ChatPromptBuilder: Sends documents as context for LLM-based analysis.

Inputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| repo | Optional[str] | None | Repository in format "owner/repo". |
| path | str | | Path within repository (default: root). |
| branch | Optional[str] | None | Git reference (branch, tag, commit) to use. |

Outputs

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| documents | List[Document] | | List of documents containing repository content. |

Overview

GitHubRepoViewer navigates and fetches content from GitHub repositories. It returns different document structures based on the path type:

For directories:

  • Returns a list of Documents, one for each item
  • Each Document's content is the item name
  • Full path and metadata stored in Document.meta

For files:

  • Returns a single Document
  • Document's content is the file content
  • Full path and metadata stored in Document.meta

For errors:

  • Returns a single Document with error message as content
  • Document's meta contains type="error"
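
Because errors can come back as ordinary documents, downstream code may want to separate them from real content before building a prompt. The following is a minimal sketch of that check, not the component's own logic. The `Document` dataclass is a hypothetical stand-in for Haystack's `Document` so the example runs without Haystack installed, and `split_errors` is an illustrative helper name. The sample data is invented; only the `type="error"` marker and the `path` meta key come from this page.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Document:
    """Hypothetical stand-in for haystack.Document (content plus meta)."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def split_errors(documents: List[Document]) -> Tuple[List[Document], List[str]]:
    """Separate usable documents from error documents (meta["type"] == "error")."""
    usable: List[Document] = []
    errors: List[str] = []
    for doc in documents:
        if doc.meta.get("type") == "error":
            # For error documents, the error message is stored as the content.
            errors.append(doc.content)
        else:
            usable.append(doc)
    return usable, errors


# Illustrative data only: one directory item and one error document.
docs = [
    Document(content="README.md", meta={"path": "README.md"}),
    Document(content="Repository not found", meta={"type": "error"}),
]
usable, errors = split_errors(docs)
print([d.meta["path"] for d in usable])
print(errors)
```

A check like this is most relevant when raise_on_failure is set to false, since with the default of true, API errors are raised as exceptions instead.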

Authorization

To use this component, you must create a secret with your GitHub personal access token. Type GITHUB_TOKEN as the secret key. For detailed instructions on creating secrets, see Create Secrets.

Usage Example

This pipeline uses GitHubRepoViewer to fetch repository content for analysis:

components:
  GitHubRepoViewer:
    type: haystack_integrations.components.connectors.github.repo_viewer.GitHubRepoViewer
    init_parameters:
      github_token:
        type: env_var
        env_vars:
          - GITHUB_TOKEN
        strict: false
      raise_on_failure: true
      max_file_size: 1000000
      repo:
      branch: main

  ChatPromptBuilder:
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
    init_parameters:
      template:
        - role: system
          content: "You are a code analysis assistant. Analyze the provided repository content and answer questions about the code."
        - role: user
          content: "Repository content:\n{% for doc in documents %}\nFile: {{ doc.meta.path }}\n{{ doc.content }}\n{% endfor %}\n\nQuestion: {{ query }}"

  OpenAIChatGenerator:
    type: haystack.components.generators.chat.openai.OpenAIChatGenerator
    init_parameters:
      api_key:
        type: env_var
        env_vars:
          - OPENAI_API_KEY
        strict: false
      model: gpt-4o-mini

  OutputAdapter:
    type: haystack.components.converters.output_adapter.OutputAdapter
    init_parameters:
      template: '{{ replies[0] }}'
      output_type: List[str]

  AnswerBuilder:
    type: haystack.components.builders.answer_builder.AnswerBuilder
    init_parameters:
      pattern:
      reference_pattern:

connections:
  - sender: GitHubRepoViewer.documents
    receiver: ChatPromptBuilder.documents
  - sender: ChatPromptBuilder.prompt
    receiver: OpenAIChatGenerator.messages
  - sender: OpenAIChatGenerator.replies
    receiver: OutputAdapter.replies
  - sender: OutputAdapter.output
    receiver: AnswerBuilder.replies
  - sender: GitHubRepoViewer.documents
    receiver: AnswerBuilder.documents

inputs:
  query:
    - ChatPromptBuilder.query
    - AnswerBuilder.query
  repo:
    - GitHubRepoViewer.repo
  path:
    - GitHubRepoViewer.path
  branch:
    - GitHubRepoViewer.branch

outputs:
  answers: AnswerBuilder.answers

max_runs_per_component: 100

metadata: {}

Parameters

Init Parameters

These are the parameters you can configure in Pipeline Builder:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| github_token | Optional[Secret] | None | GitHub personal access token for API authentication. |
| raise_on_failure | bool | True | If True, raises exceptions on API errors. |
| max_file_size | int | 1000000 | Maximum file size in bytes to fetch (default: 1MB). |
| repo | Optional[str] | None | Repository in format "owner/repo". |
| branch | str | main | Git reference (branch, tag, commit) to use. |

Run Method Parameters

These are the parameters you can configure for the run() method. You can pass these parameters at query time through the API, in Playground, or when running a job.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| repo | Optional[str] | None | Repository in format "owner/repo". |
| path | str | | Path within repository (default: root). |
| branch | Optional[str] | None | Git reference (branch, tag, commit) to use. |
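
Since repo must follow the "owner/repo" format, it can help to validate query-time values before sending them. The sketch below assembles the three run parameters into a dictionary and rejects a malformed repo. `build_run_params` is a hypothetical helper, not part of the component, and the format check is an assumed simplification (exactly one slash, no whitespace); GitHub's actual naming rules are stricter. The repository name in the example is illustrative.

```python
import re
from typing import Dict, Optional

# Assumed simplification of "owner/repo": two non-empty parts, one slash, no whitespace.
_REPO_PATTERN = re.compile(r"[^/\s]+/[^/\s]+")


def build_run_params(
    path: str, repo: Optional[str] = None, branch: Optional[str] = None
) -> Dict[str, str]:
    """Collect run() parameters, leaving out optional values that were not given."""
    if repo is not None and not _REPO_PATTERN.fullmatch(repo):
        raise ValueError(f'repo must look like "owner/repo", got {repo!r}')
    params: Dict[str, str] = {"path": path}
    if repo is not None:
        params["repo"] = repo
    if branch is not None:
        params["branch"] = branch
    return params


# Illustrative values only.
print(build_run_params("README.md", repo="deepset-ai/haystack", branch="main"))
```

Leaving repo and branch out keeps them at None, in which case the component presumably falls back to the values set in its init parameters.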