Websites & URLs
URLs
Add individual web pages to the knowledge base. TeamWeb AI fetches the page content, extracts the text, and creates searchable embeddings.
- URL – The web page address to ingest
- Context Label – A human-readable label describing the source (e.g., “Product pricing page”, “Competitor - Acme Corp”)
- Core – Whether to always include this source in the assistant’s context
- Auto-Sync – Optionally set to Daily or Weekly to automatically re-ingest
After you add a URL, TeamWeb AI processes it in the background. The status shows as pending during processing and changes to ingested once processing completes.
You can re-ingest a URL to refresh the content if the page has been updated. When auto-sync is enabled, TeamWeb AI re-fetches the URL on the configured schedule and re-embeds the content only if it has changed. Changes are detected by comparing content hashes.
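The hash-based change detection described above can be illustrated with a minimal sketch. This is not TeamWeb AI's actual implementation: the function names, the choice of SHA-256, and the whitespace normalization step are all assumptions made for illustration.

```python
import hashlib
from typing import Optional


def content_hash(text: str) -> str:
    """Return a SHA-256 hex digest of the extracted page text.

    Whitespace is collapsed first (an assumption) so that purely
    cosmetic reflows of the page do not count as a change.
    """
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def needs_reembedding(new_text: str, stored_hash: Optional[str]) -> bool:
    """Decide whether freshly fetched text should be re-embedded.

    stored_hash is the digest saved at the previous ingest, or None
    if the source has never been ingested.
    """
    return stored_hash is None or content_hash(new_text) != stored_hash
```

On each scheduled fetch, a system like this would compare the new digest with the stored one and skip the embedding step when they match, which avoids re-embedding pages that have not changed.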
Website Crawls
Crawl an entire website starting from a root URL. TeamWeb AI discovers and processes pages automatically.
- Root URL – The starting URL to begin crawling from
- Context Label – A label for the entire site (e.g., “Product documentation”, “Company blog”)
- Max Pages – The maximum number of pages to crawl (1–500, default 50)
- Core – Whether to always include this source in the assistant’s context
- Auto-Sync – Optionally set to Daily or Weekly to automatically re-crawl
The crawler follows links within the same domain and stays under the root URL path. For example, crawling https://example.com/docs will only follow links under /docs, not /blog.
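The scoping rule above can be expressed as a small check. The sketch below is illustrative only: the function name `in_crawl_scope` is hypothetical, and it assumes that "same domain" means an exact host match (whether subdomains count is not stated here).

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit


def in_crawl_scope(root_url: str, link: str, found_on: Optional[str] = None) -> bool:
    """Return True if link is on the root's host and under the root path.

    link may be relative; it is resolved against the page it was
    found on (or against the root URL if no page is given).
    """
    target = urlsplit(urljoin(found_on or root_url, link))
    root = urlsplit(root_url)

    # Skip non-web links such as mailto: or javascript:.
    if target.scheme not in ("http", "https"):
        return False
    # Assumption: "same domain" is treated as an exact host match.
    if target.hostname != root.hostname:
        return False

    # Compare on a path-segment boundary so /docs does not match /docs-old.
    root_path = root.path.rstrip("/")
    return target.path == root_path or target.path.startswith(root_path + "/")
```

With a root of https://example.com/docs, this check accepts https://example.com/docs/setup and rejects both /blog/post and https://example.com/docs-old/page.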
Discovered pages appear as child sources under the main site entry. You can view all crawled pages and their content. Re-crawling a site deletes existing content and re-discovers pages from scratch.
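To show how a page limit bounds discovery, here is a minimal sketch of a crawl loop. It is not the product's crawler: the breadth-first order, the `discover_pages` name, and the `get_links` callback (assumed to return only in-scope links for a page) are all hypothetical. Only the 1–500 range and the default of 50 come from the Max Pages setting described above.

```python
from collections import deque
from typing import Callable, Iterable, List


def discover_pages(
    root_url: str,
    get_links: Callable[[str], Iterable[str]],
    max_pages: int = 50,
) -> List[str]:
    """Discover pages breadth-first from root_url, up to max_pages.

    Every call starts with empty state, mirroring a re-crawl that
    re-discovers pages from scratch.
    """
    if not 1 <= max_pages <= 500:
        raise ValueError("max_pages must be between 1 and 500")

    seen = {root_url}          # URLs already queued, to avoid repeats
    queue = deque([root_url])  # URLs waiting to be processed
    pages: List[str] = []      # URLs processed, in discovery order

    while queue and len(pages) < max_pages:
        url = queue.popleft()
        pages.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

In this sketch the root counts toward the limit, so a limit of 1 would process only the root page. Whether TeamWeb AI counts the root the same way is not stated here.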
Keeping Content Fresh
Each URL and website card shows a freshness indicator (for example, “Synced 3h ago”) so you can see when the content was last processed. You can:
- Change sync frequency using the dropdown on each source card (No sync / Daily / Weekly)
- Manually re-ingest any individual source at any time
- Bulk re-ingest all project sources using the button in the project header
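The sync options and freshness indicator above can be modeled in a few lines. This is a sketch under stated assumptions, not the product's scheduler: the names `SYNC_INTERVALS`, `is_due`, and `freshness_label` are hypothetical, Daily and Weekly are assumed to mean 24 hours and 7 days since the last sync, and the minute and day label formats are guesses modeled on the “Synced 3h ago” example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed mapping of the dropdown options to re-sync intervals.
SYNC_INTERVALS = {
    "none": None,
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


def is_due(frequency: str, last_synced: Optional[datetime], now: datetime) -> bool:
    """Return True if a source should be re-synced at time now."""
    interval = SYNC_INTERVALS[frequency]
    if interval is None:
        return False  # "No sync": only manual re-ingest applies
    # Assumption: a scheduled source that has never synced is due.
    return last_synced is None or now - last_synced >= interval


def freshness_label(last_synced: datetime, now: datetime) -> str:
    """Format the elapsed time in the style of 'Synced 3h ago'."""
    seconds = int((now - last_synced).total_seconds())
    if seconds < 3600:
        return f"Synced {seconds // 60}m ago"
    if seconds < 86400:
        return f"Synced {seconds // 3600}h ago"
    return f"Synced {seconds // 86400}d ago"


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
print(freshness_label(now - timedelta(hours=3), now))  # Synced 3h ago
```

A manual or bulk re-ingest would bypass `is_due` entirely and simply reset the last-synced time.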