
RAG Strategies

Opentrace supports four retrieval strategies that determine how the system finds relevant document chunks when you ask a question. Each strategy offers different trade-offs between speed, accuracy, and comprehensiveness.

Basic (Vector Search)

The simplest and fastest strategy. Your question is converted into an embedding vector and compared against all document chunk embeddings using cosine similarity.

Pros:
  • Fast — single search pass
  • Great for semantic/conceptual queries

Cons:
  • Misses keyword-exact matches
  • Less effective for precise terminology
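For intuition, here is a minimal in-memory sketch of that pass with the chunk embeddings held in a NumPy matrix; the function and parameter names are illustrative rather than Opentrace's API, and in practice the comparison runs inside the vector store:

```python
import numpy as np

def basic_vector_search(query_embedding: np.ndarray,
                        chunk_embeddings: np.ndarray,
                        top_k: int = 10) -> np.ndarray:
    """Rank document chunks by cosine similarity to the query embedding."""
    q = query_embedding / np.linalg.norm(query_embedding)
    chunks = chunk_embeddings / np.linalg.norm(chunk_embeddings, axis=1, keepdims=True)
    similarities = chunks @ q                 # one dot product per chunk = cosine similarity
    return np.argsort(-similarities)[:top_k]  # indices of the top_k most similar chunks
```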

Hybrid

Combines vector similarity search with full-text keyword search (PostgreSQL tsvector), then merges results using Reciprocal Rank Fusion (RRF).

Pros:
  • Catches both semantic and exact keyword matches
  • Configurable vector/keyword weights

Cons:
  • Slightly slower (two searches + fusion)
  • More parameters to tune

Default weights: vector_weight: 0.7, keyword_weight: 0.3. Adjust these in the project's RAG settings.
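As a rough sketch of the two passes (the table and column names such as chunks and content_tsv are assumptions, not Opentrace's actual schema), the vector pass can use pgvector's cosine-distance operator and the keyword pass a tsvector match; the two ranked ID lists are then fused with weighted RRF as described in the RRF section below:

```python
import psycopg  # assumes Postgres with the pgvector extension and a precomputed tsvector column

def hybrid_candidates(conn: psycopg.Connection, query: str,
                      query_embedding: list[float], limit: int = 20):
    """Return two ranked lists of chunk IDs: one from vector search, one from keyword search."""
    with conn.cursor() as cur:
        # Vector pass: <=> is pgvector's cosine-distance operator (smaller = more similar).
        cur.execute(
            "SELECT id FROM chunks ORDER BY embedding <=> %s::vector LIMIT %s",
            (query_embedding, limit),
        )
        vector_ids = [row[0] for row in cur.fetchall()]

        # Keyword pass: full-text match against the tsvector column, ordered by ts_rank.
        cur.execute(
            "SELECT id FROM chunks WHERE content_tsv @@ plainto_tsquery('english', %s) "
            "ORDER BY ts_rank(content_tsv, plainto_tsquery('english', %s)) DESC LIMIT %s",
            (query, query, limit),
        )
        keyword_ids = [row[0] for row in cur.fetchall()]
    return vector_ids, keyword_ids
```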

Multi-Query Vector

The LLM generates N variations of your original query, performs a vector search for each variation, then fuses all results using RRF. This casts a wider semantic net.

Pros:
  • Catches different phrasings and angles
  • Excellent for complex or ambiguous questions

Cons:
  • Slower — N searches + LLM call
  • Higher API cost (more embeddings)
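A sketch of the expansion step, assuming an OpenAI-style chat client for the rewriting call; the prompt, model name, and helper are illustrative rather than Opentrace's internals:

```python
from openai import OpenAI

client = OpenAI()  # assumes an OpenAI-compatible endpoint; the actual LLM client may differ

def expand_query(question: str, n: int = 3) -> list[str]:
    """Ask the LLM for n alternative phrasings of the user's question."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{
            "role": "user",
            "content": f"Rewrite the following question in {n} different ways, one per line:\n{question}",
        }],
    )
    lines = response.choices[0].message.content.splitlines()
    variations = [line.strip() for line in lines if line.strip()]
    return [question] + variations[:n]  # keep the original query alongside the rewrites
```

Each returned query is then embedded and searched independently, and the per-query ranked lists are merged with RRF (see the fusion sketch in the RRF section below).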

Multi-Query Hybrid

The most comprehensive strategy. Generates N query variations and runs hybrid search (vector + keyword) for each, then fuses everything with RRF.

Pros:
  • Maximum recall — best for important questions
  • Combines all search modalities

Cons:
  • Slowest and most expensive
  • May return redundant results
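Conceptually this is the two previous sketches composed: every variation goes through both the vector and the keyword pass, and all of the resulting ranked lists are fused in a single RRF step. A compressed sketch, reusing the illustrative helpers above and the rrf_fuse function shown in the RRF section below:

```python
def multi_query_hybrid(conn, question: str, embed, n: int = 3, limit: int = 20) -> list[str]:
    """Run hybrid retrieval for every query variation, then fuse all ranked lists at once."""
    ranked_lists, weights = [], []
    for variation in expand_query(question, n):
        vector_ids, keyword_ids = hybrid_candidates(conn, variation, embed(variation), limit)
        ranked_lists += [vector_ids, keyword_ids]
        weights += [0.7, 0.3]  # the project-level vector/keyword weights
    return rrf_fuse(ranked_lists, weights)  # defined in the RRF section below
```
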
Tip: Which strategy should I use? Start with Basic for speed. Switch to Hybrid if you need exact keyword matching. Use the Multi-Query variants for complex research questions where thoroughness matters more than speed.

Reciprocal Rank Fusion (RRF)

When combining results from multiple search methods, Opentrace uses RRF, a ranking algorithm that merges several ranked lists into a single one. Each result's score is summed over the lists in which it appears:

score = Σ (weight / (k + rank))

Where k is a constant (typically 60) that prevents top-ranked results from dominating. This produces a balanced ranking that respects both vector similarity and keyword relevance.
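In code, the weighted fusion used by the Hybrid and Multi-Query strategies might look like the following sketch (one weight per input list, k = 60 as above):

```python
from collections import defaultdict

def rrf_fuse(ranked_lists: list[list[str]], weights: list[float], k: int = 60) -> list[str]:
    """Merge several ranked lists of chunk IDs into one list using weighted Reciprocal Rank Fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranking, weight in zip(ranked_lists, weights):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += weight / (k + rank)       # score = Σ weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)   # highest fused score first
```

For example, a chunk ranked 1st in the vector list (weight 0.7) and 4th in the keyword list (weight 0.3) scores 0.7/(60+1) + 0.3/(60+4) ≈ 0.0162.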

Final Context Selection

After retrieval, the top chunks (limited by final_context_size) are selected and structured into three categories:

  • Texts — plain text passages
  • Tables — HTML table content
  • Images — base64 image data

These are then passed to the LLM along with your question and citation metadata.
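A sketch of that selection step, assuming each retrieved chunk carries a type field plus content, html, and base64 payloads (these field names are assumptions about the chunk schema, not Opentrace's actual data model):

```python
def build_context(ranked_chunks: list[dict], final_context_size: int) -> dict[str, list[str]]:
    """Take the top-ranked chunks and group them into the three context categories."""
    context = {"texts": [], "tables": [], "images": []}
    for chunk in ranked_chunks[:final_context_size]:     # cap the amount of retrieved context
        if chunk["type"] == "table":
            context["tables"].append(chunk["html"])      # HTML table content
        elif chunk["type"] == "image":
            context["images"].append(chunk["base64"])    # base64-encoded image data
        else:
            context["texts"].append(chunk["content"])    # plain text passage
    return context
```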
