GitHub

Connect a GitHub repository to train your SiteGPT chatbot on the content in your codebase, such as Markdown documentation, READMEs, and other text-based files. This is ideal for developer docs, internal guides, and any knowledge that already lives alongside your code.

Prerequisites

  • A GitHub account (personal or organization)
  • Access to the repository you want to import
  • Owner or Editor permissions on the SiteGPT chatbot

Connecting GitHub

1. Navigate to Training

Go to your chatbot dashboard and open the Training tab.

2. Add Data Source

Click Add Files and select GitHub.

3. Authenticate

Click Connect GitHub and sign in with your GitHub credentials. Grant SiteGPT permission to read the repositories you want to import from.

4. Select Content

Choose the repository and branch, then select the files or folders you want to import. You can pick an entire folder (such as a docs/ directory) or individual files.

5. Start Training

Click Import Selected to begin training. SiteGPT processes each file and adds the content to your chatbot’s knowledge base.

Supported File Types

GitHub repositories are best suited for text-based content. SiteGPT can process:

File Type    Extensions
Markdown     .md, .markdown, .mdx
Documents    .txt, .rst
Web          .html, .htm

Binary files (images, archives, compiled assets) are skipped. For best results, point SiteGPT at the documentation files in your repository rather than the entire source tree.
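
To preview which files in a repository are likely to be processed, you can check paths against the extensions listed in this section before importing. The following is a minimal sketch, not SiteGPT's actual matching logic (which is not documented here); the function names `is_supported` and `split_by_support` are hypothetical.

```python
from pathlib import PurePosixPath

# Extensions listed in this section. SiteGPT's real matching rules may differ
# (for example in how it treats case), so treat this as an approximation.
SUPPORTED_EXTENSIONS = {".md", ".markdown", ".mdx", ".txt", ".rst", ".html", ".htm"}


def is_supported(path: str) -> bool:
    """Return True if the path's final extension is in the supported list."""
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(paths):
    """Partition repository paths into (supported, skipped) lists."""
    supported = [p for p in paths if is_supported(p)]
    skipped = [p for p in paths if not is_supported(p)]
    return supported, skipped
```

For example, `split_by_support(["docs/intro.md", "assets/logo.png"])` places the Markdown file in the first list and the image in the second.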

Best Practices

Point at your docs

Most repositories keep documentation in a dedicated folder. Select that folder (for example docs/) so your chatbot trains on the content meant for readers, not build scripts or configuration files.

Use the right branch

Import from the branch that holds your published documentation (often main). Re-import after merging significant doc changes to keep answers current.
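
One way to decide whether a re-import is worthwhile is to check whether anything under your documentation folder changed since the last import. The sketch below assumes you recorded the commit you last imported from yourself (this page does not say that SiteGPT exposes it), that you run it inside a local checkout with `git` installed, and that your docs live in `docs/`. The function names are hypothetical.

```python
import subprocess
from pathlib import PurePosixPath


def changed_docs(changed_paths, docs_dir="docs"):
    """Return the changed paths that live somewhere under docs_dir."""
    root = PurePosixPath(docs_dir)
    return [p for p in changed_paths if root in PurePosixPath(p).parents]


def paths_changed_since(commit, branch="main"):
    """List files that differ between `commit` and `branch` in the current checkout.

    `commit` is assumed to be the commit you last imported from; you need to
    have noted it down yourself.
    """
    result = subprocess.run(
        ["git", "diff", "--name-only", commit, branch],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

If `changed_docs(paths_changed_since(last_imported_commit))` returns a non-empty list, the published docs have moved on and a re-import is probably due.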

Exclude noise

Skip generated output, changelogs you do not want surfaced, and test fixtures. The cleaner the selection, the more focused your chatbot’s answers will be.
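
If you plan your selection from a local checkout first, a small filter can make the exclusions explicit. This is only an illustrative sketch: the directory and file names in `EXCLUDED_DIRS` and `EXCLUDED_NAMES` are hypothetical examples, and SiteGPT's file picker does not necessarily offer an equivalent setting.

```python
from pathlib import PurePosixPath

# Hypothetical exclusions for illustration; adjust to your repository layout.
EXCLUDED_DIRS = {"build", "dist", "node_modules", "fixtures"}
EXCLUDED_NAMES = {"CHANGELOG.md"}


def is_noise(path: str) -> bool:
    """Return True if the path sits in an excluded directory or has an excluded name."""
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    in_excluded_dir = bool(EXCLUDED_DIRS.intersection(parts[:-1]))
    return in_excluded_dir or parts[-1] in EXCLUDED_NAMES


def clean_selection(paths):
    """Keep only the paths that are not considered noise."""
    return [p for p in paths if not is_noise(p)]
```

The paths that remain after `clean_selection` are the ones worth ticking in the import dialog.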

Troubleshooting

  • Verify you granted SiteGPT access to the correct repository during authentication
  • Private repositories require that the connected account has read access
  • Check that the files are in a supported text-based format
  • Try importing a smaller folder or fewer files at once
  • Very large repositories may take longer to scan
  • Confirm your GitHub connection is still authorized
  • Re-import after pushing changes; imports capture a snapshot at import time
  • Make sure you imported from the branch that contains your latest docs