Documents

Upload documents directly to the knowledge base so your assistants can search and reference their content. TeamWeb AI extracts text from a range of common file formats, chunks it, and embeds it for semantic search.

Supported Formats

Format	Extensions	Notes
PDF	`.pdf`	Text-based PDFs only. Scanned or image-only PDFs yield no text unless they have an embedded text layer.
Microsoft Word	`.docx`	Extracts paragraph text and table content.
Microsoft Excel	`.xlsx`	Extracts cell values from all sheets.
Microsoft PowerPoint	`.pptx`	Extracts text from all slides.
HTML	`.html`, `.htm`	Strips tags, scripts, and styles; extracts visible text.
Plain Text	`.txt`, `.csv`, `.md`, `.rst`	Read as-is.
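
To illustrate what "strips tags, scripts, and styles; extracts visible text" means for HTML files, here is a minimal sketch using Python's standard `html.parser` module. It is not TeamWeb AI's actual parser; the class name `VisibleTextExtractor` and the function `extract_visible_text` are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, skipping anything inside <script> or <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # > 0 while inside a skipped element
        self._parts = []       # visible text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._parts.append(text)

    def text(self):
        return "\n".join(self._parts)


def extract_visible_text(html):
    """Return the visible text of an HTML string, one fragment per line."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p{color:red}</style>"
    "<script>var x = 1;</script></head>"
    "<body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"
)
print(extract_visible_text(sample))
```

The tags themselves never reach the output because only text nodes are collected, and the depth counter drops the contents of script and style elements.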

Uploading a Document

From the project detail page, select the Documents tab and click Upload Document.

Document – Select the file to upload
Context Label – A short description of what the document contains (e.g., “Product specification”, “Company handbook”, “Research report”)
Core – Whether to always include this document’s content in the assistant’s context

After uploading, the document is processed in the background. Text is extracted, split into chunks, and embedded for search. The status will change from pending to ingested once processing is complete.

How It Works

1. The uploaded file is saved temporarily on the server.
2. Text is extracted using format-specific parsers.
3. The extracted text is split into overlapping chunks (2000 characters with 200-character overlap).
4. Each chunk is converted into a vector embedding.
5. Chunks are stored in the knowledge base and become searchable.
6. The temporary file is removed after successful processing.
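
The chunking step can be sketched as a sliding window over the extracted text. This is an illustration under stated assumptions, not the product's implementation: it assumes plain character-based splitting with no attention to sentence or paragraph boundaries, and the function name `chunk_text` is hypothetical. Only the 2000-character size and 200-character overlap come from the description above.

```python
def chunk_text(text, chunk_size=2000, overlap=200):
    """Split text into chunks of up to chunk_size characters.

    Each chunk after the first starts `overlap` characters before the
    end of the previous one, so neighbouring chunks share that text.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # 1800 characters with the defaults
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reached the end of the text
        start += step
    return chunks


# A 5000-character sample made of repeating digits.
text = "".join(str(i % 10) for i in range(5000))
chunks = chunk_text(text)
print([len(c) for c in chunks])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, as long as it is shorter than the overlap.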

Re-uploading

To update a document with a newer version, click the re-upload button on the document card. Re-uploading replaces the existing content: the old chunks are removed, and the new file is processed to create fresh ones.

Limitations

Scanned PDFs: PDFs that contain only scanned images (no text layer) will not yield any extracted text. Use OCR software to add a text layer before uploading.
Password-protected files: Encrypted or password-protected documents cannot be processed.
File size: Very large documents produce many chunks, each of which must be embedded, so processing takes longer.
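
To get a feel for how document size translates into chunk count, the following rough estimate may help. It assumes character-based chunks of 2000 characters with a 200-character overlap and a simple sliding window; the real count may differ, and `estimate_chunk_count` is a hypothetical helper, not part of TeamWeb AI.

```python
import math


def estimate_chunk_count(text_length, chunk_size=2000, overlap=200):
    """Estimate how many overlapping chunks a text of text_length characters yields."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    if text_length <= 0:
        return 0
    if text_length <= chunk_size:
        return 1
    step = chunk_size - overlap  # new characters covered by each extra chunk
    return 1 + math.ceil((text_length - chunk_size) / step)


for length in (1_500, 5_000, 1_000_000):
    print(length, estimate_chunk_count(length))
```

Under these assumptions, a document with one million characters of extracted text produces 556 chunks, each needing its own embedding.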

Documents work best for structured, text-rich content like reports, manuals, and specifications. For brief, frequently changing information, consider using Facts instead.
