Data Sources
SiteGPT supports training your chatbot from multiple data sources beyond just websites. This allows you to create a comprehensive AI assistant that can answer questions based on your entire knowledge base.
Available Data Sources
Website
Crawl your website automatically via sitemap or page-by-page
Google Drive
Import documents, sheets, and presentations from Google Drive
Notion
Connect your Notion workspace and import pages
YouTube
Train on video transcripts from YouTube channels or playlists
Dropbox
Import files and documents from Dropbox
OneDrive
Connect Microsoft OneDrive to import documents
SharePoint
Import content from Microsoft SharePoint sites
Confluence
Connect Atlassian Confluence for internal documentation
GitBook
Import documentation from GitBook
Box
Connect Box cloud storage
How Data Source Training Works
- Connect: Authenticate with your data source (OAuth for most services)
- Select: Choose which files, folders, or pages to import
- Train: SiteGPT processes the content and trains your chatbot
- Sync: Enable auto-sync to keep content updated (where supported)
Supported File Types
When importing from cloud storage services like Google Drive, Dropbox, OneDrive, or Box, SiteGPT can process:

| File Type | Extensions |
|---|---|
| Documents | .pdf, .doc, .docx, .txt, .rtf |
| Spreadsheets | .xls, .xlsx, .csv |
| Presentations | .ppt, .pptx |
| Web | .html, .htm |
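
Before importing a folder, it can help to check which files have a listed extension. The following is a minimal sketch of such a check, run locally on a list of filenames. It is not part of SiteGPT and does not call any SiteGPT API; the names `SUPPORTED_EXTENSIONS`, `classify`, and `split_by_support` are hypothetical, and the case-insensitive matching is an assumption about how extensions are treated.

```python
from pathlib import Path

# Extensions copied from the table above, grouped by file type.
SUPPORTED_EXTENSIONS = {
    "Documents": {".pdf", ".doc", ".docx", ".txt", ".rtf"},
    "Spreadsheets": {".xls", ".xlsx", ".csv"},
    "Presentations": {".ppt", ".pptx"},
    "Web": {".html", ".htm"},
}


def classify(filename):
    """Return the file type for a filename, or None if its extension is not listed."""
    # Assumption: extensions are compared case-insensitively.
    ext = Path(filename).suffix.lower()
    for file_type, extensions in SUPPORTED_EXTENSIONS.items():
        if ext in extensions:
            return file_type
    return None


def split_by_support(filenames):
    """Split filenames into (supported, unsupported) lists, preserving order."""
    supported, unsupported = [], []
    for name in filenames:
        if classify(name) is not None:
            supported.append(name)
        else:
            unsupported.append(name)
    return supported, unsupported


if __name__ == "__main__":
    # Hypothetical folder listing used only for illustration.
    files = ["FAQ.PDF", "pricing.xlsx", "logo.png", "notes"]
    supported, unsupported = split_by_support(files)
    print(supported)    # ['FAQ.PDF', 'pricing.xlsx']
    print(unsupported)  # ['logo.png', 'notes']
```

Files that land in the unsupported list could be converted to one of the listed formats or left out of the training folder.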
Best Practices
Organize your content before importing
Create dedicated folders for chatbot training content. This makes it easier to manage what gets imported and keeps your chatbot focused.
Use multiple sources for comprehensive coverage
Customers using 3+ data sources see 40% fewer “I don’t know” responses. Combine your website with internal docs and FAQs.
Keep content up to date
Enable auto-sync where available, or set reminders to retrain monthly. Outdated content leads to incorrect answers.
Review what you're importing
Only import content you want the chatbot to reference. Exclude internal-only documents, draft content, or sensitive information.
Adding a Data Source
- Navigate to your chatbot’s Training tab
- Click Add Data Source
- Select your preferred data source type
- Follow the authentication prompts
- Select the content to import
- Click Start Training
Training time depends on the amount of content. A typical import of 50-100 documents takes 2-5 minutes.