Data Sources

SiteGPT supports training your chatbot from multiple data sources beyond just websites. This allows you to create a comprehensive AI assistant that can answer questions based on your entire knowledge base.

How Data Source Training Works

  1. Connect - Authenticate with your data source (OAuth for most services)
  2. Select - Choose which files, folders, or pages to import
  3. Train - SiteGPT processes the content and trains your chatbot
  4. Sync - Enable auto-sync to keep content updated (where supported)

Supported File Types

When importing from cloud storage services like Google Drive, Dropbox, OneDrive, or Box, SiteGPT can process:
File Type       Extensions
Documents       .pdf, .doc, .docx, .txt, .rtf
Spreadsheets    .xls, .xlsx, .csv
Presentations   .ppt, .pptx
Web             .html, .htm
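
Before connecting a cloud folder, it can help to check which of its files fall under the extensions listed above. The following is a minimal sketch that runs locally on a list of file names. It is not part of SiteGPT, and the function names classify_file and split_by_support are hypothetical. It assumes extension matching is case-insensitive, which the list above does not state.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable, List, Optional, Tuple

# Extensions copied from the supported file types list, grouped by file type.
SUPPORTED_EXTENSIONS = {
    "Documents": {".pdf", ".doc", ".docx", ".txt", ".rtf"},
    "Spreadsheets": {".xls", ".xlsx", ".csv"},
    "Presentations": {".ppt", ".pptx"},
    "Web": {".html", ".htm"},
}


def classify_file(filename: str) -> Optional[str]:
    """Return the file type for a file name, or None if its extension is not listed."""
    # Assumption: matching is case-insensitive, so ".PDF" is treated like ".pdf".
    suffix = PurePosixPath(filename).suffix.lower()
    for file_type, extensions in SUPPORTED_EXTENSIONS.items():
        if suffix in extensions:
            return file_type
    return None


def split_by_support(filenames: Iterable[str]) -> Tuple[Dict[str, List[str]], List[str]]:
    """Group listed files by type and collect everything else separately."""
    supported: Dict[str, List[str]] = {}
    unsupported: List[str] = []
    for name in filenames:
        file_type = classify_file(name)
        if file_type is None:
            unsupported.append(name)
        else:
            supported.setdefault(file_type, []).append(name)
    return supported, unsupported


if __name__ == "__main__":
    supported, unsupported = split_by_support(
        ["Pricing.PDF", "faq.docx", "leads.csv", "logo.png"]
    )
    print(supported)
    print(unsupported)
```

Files that land in the unsupported list would need to be converted to one of the listed formats, or left out of the import.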

Best Practices

Create dedicated folders for chatbot training content. This makes it easier to manage what gets imported and keeps your chatbot focused.
Customers who use three or more data sources see 40% fewer “I don’t know” responses. Combine your website with internal docs and FAQs.
Enable auto-sync where available, or set reminders to retrain monthly. Outdated content leads to incorrect answers.
Only import content you want the chatbot to reference. Exclude internal-only documents, draft content, or sensitive information.
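
One way to apply the last practice is to review a folder listing before selecting content to import. The sketch below flags paths whose folder or file names suggest draft or internal material. The marker words and the function names should_exclude and review_import_list are hypothetical and only illustrate the idea; a name-based check cannot detect sensitive content inside a file, so treat it as a first pass and not as a safeguard.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Tuple

# Hypothetical naming markers; replace these with your own team's conventions.
EXCLUDE_MARKERS = ("draft", "internal", "confidential")


def should_exclude(path: str) -> bool:
    """Return True if any folder or file name in the path contains a marker word."""
    parts = [part.lower() for part in PurePosixPath(path).parts]
    return any(marker in part for part in parts for marker in EXCLUDE_MARKERS)


def review_import_list(paths: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split paths into those to import and those to hold back for manual review."""
    keep: List[str] = []
    hold_back: List[str] = []
    for path in paths:
        (hold_back if should_exclude(path) else keep).append(path)
    return keep, hold_back


if __name__ == "__main__":
    keep, hold_back = review_import_list(
        [
            "chatbot-training/faq.docx",
            "chatbot-training/Drafts/pricing-v2.docx",
            "chatbot-training/internal-roadmap.pdf",
        ]
    )
    print("Import:", keep)
    print("Review first:", hold_back)
```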

Adding a Data Source

  1. Navigate to your chatbot’s Training tab
  2. Click Add Data Source
  3. Select your preferred data source type
  4. Follow the authentication prompts
  5. Select the content to import
  6. Click Start Training

Training time depends on the amount of content. A typical import of 50 to 100 documents takes 2 to 5 minutes.

Need Help?

If you encounter issues connecting a data source, check the specific integration guide for troubleshooting steps, or contact our support team.