Your SiteGPT chatbot is trained on content measured in pages. This unified system makes it easy to understand your usage limits regardless of whether you’re adding web pages, uploading files, or pasting text.

What is a page?

A page equals 2,500 cleaned characters of text content. This is roughly equivalent to:
  • A typical web page with moderate content
  • 1-2 pages of a PDF document
  • About 400-500 words of text
“Cleaned characters” means the actual text content after removing HTML tags, scripts, styling, and other non-content elements.
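
To make the idea of cleaned characters concrete, here is a minimal sketch that strips markup from an HTML string and counts what remains. It is an illustration only: SiteGPT's actual cleaning rules are not documented on this page, and the set of skipped tags, the class name `TextExtractor`, and the function `cleaned_text` are assumptions made for the example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text while skipping elements assumed to be non-content."""

    # Assumed list of non-content tags; the real service may use other rules.
    SKIPPED = {"script", "style", "nav", "footer"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def cleaned_text(html: str) -> str:
    """Return the text outside skipped elements, with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join fragments with a space, then collapse runs of whitespace.
    return " ".join(" ".join(parser.parts).split())


sample = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><nav>Home</nav><p>Hello <b>world</b></p>"
    "<footer>Contact</footer></body></html>"
)
text = cleaned_text(sample)
print(text, len(text))
```

Only the paragraph text survives here; the stylesheet, navigation, and footer contribute nothing to the character count.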

Why pages?

The pages-based quota system provides several benefits:

Simplicity

One number to track instead of separate limits for links and files

Flexibility

Use your quota however you want — all web pages, all files, or any mix

Transparency

Clear understanding of exactly how much content you can add

Fairness

You pay for content, not arbitrary file counts

Plan limits

Each plan includes a generous pages quota:
Plan | Pages Quota | Approximate Content
Starter | 1,000 pages | ~400,000 words
Growth | 10,000 pages | ~4 million words
Scale | 50,000 pages | ~20 million words
Enterprise | 500,000 pages | ~200 million words
Not sure which plan you need? Most small to medium websites fit comfortably within the Starter plan. If you have extensive documentation or knowledge bases, consider Growth or Scale.
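
As a rough aid for choosing a plan, the sketch below picks the smallest plan whose quota covers an estimated page count. The quotas come from the table above; the names `PLAN_LIMITS` and `smallest_plan` are hypothetical and not part of any SiteGPT API.

```python
from typing import Optional

# Page quotas per plan, as listed in the plan limits table.
PLAN_LIMITS = {
    "Starter": 1_000,
    "Growth": 10_000,
    "Scale": 50_000,
    "Enterprise": 500_000,
}


def smallest_plan(pages_needed: int) -> Optional[str]:
    """Return the smallest plan that fits, or None if no listed plan does."""
    for name, limit in sorted(PLAN_LIMITS.items(), key=lambda item: item[1]):
        if pages_needed <= limit:
            return name
    return None


print(smallest_plan(800))
print(smallest_plan(12_000))
```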

How pages are calculated

When you add content to your chatbot, SiteGPT automatically calculates how many pages it will consume:

Web pages

Each URL you add is processed to extract the text content. The cleaned text is measured in characters, then divided by 2,500 to determine the page count. Example: A blog post with 5,000 characters of clean text = 2 pages

Files

Uploaded files (PDF, DOCX, etc.) are converted to text and measured the same way. Example: A 10-page PDF with ~25,000 characters = 10 pages

Raw text

When you paste text directly, the character count determines the pages. Example: 7,500 characters of pasted content = 3 pages
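
The arithmetic in the three examples above can be sketched as a single function. This page does not say how a partial page is counted, so the sketch assumes it rounds up to a whole page; treat that rounding rule, and the name `pages_for`, as assumptions.

```python
PAGE_SIZE = 2_500  # cleaned characters per page, from the definition of a page


def pages_for(char_count: int) -> int:
    """Pages consumed by char_count cleaned characters.

    Assumption: a partial page rounds up to a full page.
    """
    if char_count < 0:
        raise ValueError("char_count must be non-negative")
    # Integer ceiling division avoids floating-point rounding.
    return (char_count + PAGE_SIZE - 1) // PAGE_SIZE


print(pages_for(5_000))   # blog post example
print(pages_for(25_000))  # 10-page PDF example
print(pages_for(7_500))   # pasted text example
```

All three documented examples are exact multiples of 2,500, so they give the same result under any rounding rule.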

Viewing your usage

You can check your pages usage in several places:
  1. Dashboard overview: your chatbot dashboard shows current pages used vs. your limit.
  2. Content pages: the Links and Files pages show page counts for each item.
  3. Usage page: navigate to Account → Usage for a detailed breakdown over time.

Managing your quota

Before adding content

When you add new links or files, SiteGPT estimates the page count before processing. If the content would exceed your quota, you’ll see a warning.

Removing content

Deleting links or files immediately frees up those pages for new content.
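
The two behaviours described above (a warning when new content would exceed the quota, and pages freed on deletion) can be modelled with a small bookkeeping class. This is an illustrative model only; `PagesQuota` and its methods are hypothetical and do not correspond to a SiteGPT API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PagesQuota:
    """Tracks pages used per content item against a plan limit."""

    limit: int
    items: Dict[str, int] = field(default_factory=dict)

    @property
    def used(self) -> int:
        return sum(self.items.values())

    @property
    def remaining(self) -> int:
        return self.limit - self.used

    def would_exceed(self, estimated_pages: int) -> bool:
        """True if adding estimated_pages would go past the limit."""
        return self.used + estimated_pages > self.limit

    def add(self, name: str, pages: int) -> None:
        if self.would_exceed(pages):
            raise ValueError(f"adding {name!r} would exceed the pages quota")
        self.items[name] = pages

    def remove(self, name: str) -> None:
        # Removing an item frees its pages immediately.
        self.items.pop(name, None)


quota = PagesQuota(limit=1_000)
quota.add("product-docs", 900)
print(quota.would_exceed(200))  # check before adding more content
quota.remove("product-docs")
print(quota.remaining)
```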

Upgrading your plan

If you need more pages, you can upgrade your plan at any time from your billing page.

Tips for optimizing page usage

  • Instead of adding your entire sitemap, focus on the most relevant pages: product docs, FAQs, and key landing pages.
  • When adding sitemaps, use exclude patterns to skip pages that aren’t relevant for support (e.g., /blog/* if blog content isn’t needed).
  • If you have many small files, consider combining them into fewer, larger documents.
  • Use the content management pages to identify and remove outdated or low-value content.
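
To preview what an exclude pattern such as /blog/* would skip, you can filter a sitemap's URLs locally before adding them. The sketch below assumes shell-style glob matching against the URL path. SiteGPT's exact pattern syntax is not specified on this page, so treat the matching rule and the function name `filter_urls` as assumptions.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List
from urllib.parse import urlparse


def filter_urls(urls: Iterable[str], exclude_patterns: Iterable[str]) -> List[str]:
    """Keep URLs whose path matches none of the exclude patterns."""
    patterns = list(exclude_patterns)
    kept = []
    for url in urls:
        path = urlparse(url).path
        # Assumed semantics: glob match on the path; "*" also matches "/".
        if not any(fnmatchcase(path, pattern) for pattern in patterns):
            kept.append(url)
    return kept


sitemap_urls = [
    "https://example.com/docs/setup",
    "https://example.com/blog/2024/launch",
    "https://example.com/faq",
]
print(filter_urls(sitemap_urls, ["/blog/*"]))
```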

FAQs

What are cleaned characters?
Cleaned characters are the actual readable text content after removing HTML tags, JavaScript, CSS, navigation menus, footers, and other non-content elements. This ensures you’re only using quota for meaningful training content.

Do images and videos count toward my pages quota?
No. Only text content counts toward your pages quota. Images, videos, and other media are not included in the calculation.

What happens when I reach my pages limit?
You won’t be able to add new content until you either remove existing content or upgrade your plan. Your chatbot will continue to work normally with its current training data.

Can I see how many pages each item uses?
Yes. The Links and Files pages show the page count for each item. Hover over the page count for a tooltip explaining the calculation.