
Knowledge Base

The Knowledge Base is where you build the foundation for your AI assistant. It consists of all the documents and web pages you add to a project, which are then processed and made searchable.

Document Sources

You can add content to your knowledge base in two ways:

File Upload

Upload files directly from your computer using drag-and-drop or the file picker. Supported formats:

  • PDF — including scanned documents (extracted via OCR)
  • DOCX / DOC — Microsoft Word documents
  • HTML / HTM — static web pages

Files are uploaded to AWS S3 via a presigned URL, then processed asynchronously by the ingestion pipeline.
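
As a rough illustration of the upload step, a client could check a file's extension against the supported formats and then prepare an HTTP request for the presigned URL. This is a minimal sketch, not Opentrace's actual client code: the helper names `is_supported` and `build_upload_request` are hypothetical, and it assumes the presigned URL was issued for an HTTP PUT. The request is built but not sent.

```python
from pathlib import Path
from urllib.request import Request

# Extensions taken from the supported-formats list above.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".doc", ".html", ".htm"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one of the supported formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def build_upload_request(presigned_url: str, payload: bytes) -> Request:
    """Build (but do not send) an HTTP PUT to a presigned S3 URL.

    Assumption: the URL was presigned for PUT. How the URL is obtained is
    not shown, because the source does not describe that endpoint.
    """
    return Request(presigned_url, data=payload, method="PUT")
```

Passing the returned request to `urllib.request.urlopen` would perform the upload. Ingestion then happens asynchronously on the server side.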

Web URLs

Paste any public URL and Opentrace will crawl the page using ScrapingBee, extracting all visible content. This is useful for adding blog posts, documentation pages, or any publicly accessible web content.

How Content Becomes Searchable

After you add a document or URL, it goes through the ingestion pipeline:

  1. Partitioning — extracting text, tables, and images from the raw document
  2. Chunking — splitting content into manageable pieces (max 3,000 characters)
  3. Summarising — generating AI summaries for chunks containing tables or images
  4. Vectorization — creating 1,536-dimensional vector embeddings for semantic search
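
To make the chunking step concrete, here is a minimal sketch of splitting text into pieces of at most 3,000 characters. The strategy shown (greedily packing paragraphs, with a hard split for oversized paragraphs) is an assumption for illustration. The source states only the size limit, not how Opentrace chooses chunk boundaries, and `chunk_text` is a hypothetical name.

```python
MAX_CHUNK_CHARS = 3000  # limit stated in the chunking step above


def chunk_text(text: str, max_chars: int = MAX_CHUNK_CHARS) -> list[str]:
    """Greedily pack paragraphs into chunks of at most max_chars characters."""
    chunks: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # Hard-split any paragraph that exceeds the limit on its own.
        while len(paragraph) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        # Try to append the paragraph to the chunk being built.
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Keeping paragraphs together where possible tends to give each chunk a coherent topic, which generally helps retrieval quality.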

Once the pipeline completes, the document's chunks are stored in PostgreSQL with pgvector and are immediately searchable via the chat interface.
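
Semantic search over stored embeddings amounts to ranking chunks by how close their vectors are to the query's vector. The sketch below shows that idea in plain Python using cosine similarity. The metric is an assumption: the source does not say which distance Opentrace uses (pgvector supports cosine distance among others), and in practice the database performs this ranking. The names `cosine_similarity` and `top_k` are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float],
          chunks: list[tuple[str, Sequence[float]]],
          k: int = 3) -> list[str]:
    """Return the texts of the k chunks most similar to the query vector."""
    ranked = sorted(chunks,
                    key=lambda item: cosine_similarity(query, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]
```

Real embeddings would have 1,536 dimensions rather than the handful used in a toy example, but the ranking logic is the same.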

Managing Your Knowledge Base

The Knowledge Base has two tabs:

  • Documents Tab — view all uploaded files and URLs along with their processing status, and click any entry to inspect its individual chunks
  • Settings Tab — configure RAG strategy, embedding model, search parameters, and reranking options

Real-Time Processing Status

After upload, each document displays a live status indicator, refreshed by polling every 2 seconds, that moves through these stages:

uploading → queued → partitioning → chunking → summarising → vectorization → completed

You can click on any document to see detailed information about each processing stage.
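
The polling behaviour described above could be sketched as a loop that checks the status every 2 seconds until it reaches `completed`. This is an illustrative sketch only: `fetch_status` is a hypothetical callable standing in for whatever request the dashboard makes, and the `max_polls` budget is an assumption. The source lists no failure status, so none is handled here.

```python
import time
from typing import Callable

# Stage names in the order given in the status progression above.
STAGES = ["uploading", "queued", "partitioning", "chunking",
          "summarising", "vectorization", "completed"]


def wait_until_completed(fetch_status: Callable[[], str],
                         interval: float = 2.0,
                         max_polls: int = 300,
                         sleep: Callable[[float], None] = time.sleep) -> list[str]:
    """Poll until the status is 'completed'; return the distinct stages seen."""
    seen: list[str] = []
    for _ in range(max_polls):
        status = fetch_status()
        if status not in STAGES:
            raise ValueError(f"unexpected status: {status!r}")
        # Record a stage only when it changes, not on every poll.
        if not seen or seen[-1] != status:
            seen.append(status)
        if status == "completed":
            return seen
        sleep(interval)
    raise TimeoutError("document did not complete within the polling budget")
```

Because polling samples the status at intervals, a fast stage can finish between two polls and never be observed, so the returned list may skip stages.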
