Similarity Search

Module 4 · ~7 min read

Semantic search finds chunks about "automobile pricing" even if your query says "car costs" — because their vector representations are numerically close in embedding space. This is the core difference between vector search and traditional keyword matching.

Cosine Similarity

Qdrant measures how similar two vectors are using cosine similarity — the cosine of the angle between them:

1.0 — identical direction (same meaning)
0.0 — orthogonal (unrelated topics)
-1.0 — opposite direction (possible with bi-encoder models)

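For relevant content, most RAG results score roughly 0.7–0.95, and scores below 0.5 typically indicate a weak or off-topic match. These ranges vary with the embedding model, so treat them as rules of thumb rather than fixed thresholds.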
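To make the three reference values concrete, here is a minimal sketch of the cosine computation in plain Java. The class and method names are illustrative only; this is not how Qdrant implements the metric internally, just the same formula (dot product divided by the product of the vector lengths) applied to small hand-made vectors.

```java
// Minimal sketch: cosine similarity between two embedding vectors.
// CosineSimilarityDemo and cosine(...) are hypothetical names for illustration.
final class CosineSimilarityDemo {

    /** Returns the cosine of the angle between a and b, a value in [-1.0, 1.0]. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] v = {1f, 2f, 3f};
        System.out.println(cosine(v, new float[] {2f, 4f, 6f}));                // same direction, close to 1.0
        System.out.println(cosine(new float[] {1f, 0f}, new float[] {0f, 1f})); // orthogonal, 0.0
        System.out.println(cosine(v, new float[] {-1f, -2f, -3f}));             // opposite direction, close to -1.0
    }
}
```

Note that scaling a vector (the first example doubles every component) does not change the score: cosine similarity compares direction only, not magnitude.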

The SearchRequest API

HybridRetriever.java: dense similarity search

List<Document> results = vectorStore.similaritySearch(
    SearchRequest.builder()
        .query(query)
        .topK(topK * 2)   // fetch 2x to have headroom for RRF merge
        .build());

Why topK * 2?

The dense search is one of two retrieval sources. The final result list is capped at topK after merging with the keyword search results via Reciprocal Rank Fusion. Fetching double the target count from each source gives the RRF merge enough material to work with, especially when both sources return many of the same chunks (which collapse to one entry in the merged list).

Dense search (topK*2=20 results)     FTS search (topK*2=20 results)
               │                                   │
               └─────────────────┬─────────────────┘
                                 │  RRF merge
                                 ▼
          Top topK=10 results (with combined RRF scores)
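The effect of overlapping results on the merged list can be sketched in a few lines of Java. This is a simplified illustration, not the code from HybridRetriever.java: it works on chunk IDs instead of Document objects, and the smoothing constant K = 60 is the value conventionally used with RRF, assumed here because the section does not state the one Power RAG uses.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified sketch of a Reciprocal Rank Fusion merge over two ranked ID lists.
// RrfMergeSketch and its members are hypothetical names for illustration.
final class RrfMergeSketch {

    // Conventional RRF smoothing constant; an assumption, not taken from the source project.
    static final int K = 60;

    /** Merges two ranked lists of chunk IDs and returns at most topK IDs, best first. */
    static List<String> merge(List<String> dense, List<String> keyword, int topK) {
        Map<String, Double> scores = new LinkedHashMap<>();
        accumulate(scores, dense);
        accumulate(scores, keyword);

        List<Map.Entry<String, Double>> entries = new ArrayList<>(scores.entrySet());
        // Highest combined score first; the sort is stable, so ties keep insertion order.
        entries.sort((x, y) -> Double.compare(y.getValue(), x.getValue()));

        List<String> merged = new ArrayList<>();
        for (Map.Entry<String, Double> e : entries) {
            if (merged.size() == topK) {
                break;
            }
            merged.add(e.getKey());
        }
        return merged;
    }

    private static void accumulate(Map<String, Double> scores, List<String> ranked) {
        for (int i = 0; i < ranked.size(); i++) {
            // Ranks are 1-based. A chunk returned by both sources collapses to one
            // map entry that receives both contributions.
            scores.merge(ranked.get(i), 1.0 / (K + i + 1), Double::sum);
        }
    }

    public static void main(String[] args) {
        List<String> dense = List.of("a", "b", "c", "d");
        List<String> keyword = List.of("b", "e", "a", "f");
        System.out.println(merge(dense, keyword, 3)); // prints [b, a, e]
    }
}
```

In the example, eight retrieved entries shrink to six distinct chunks because "a" and "b" appear in both lists. Had each source returned only topK entries, heavy overlap could leave the merged list short of topK, which is the headroom the 2x fetch is meant to provide.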

Similarity Search vs Keyword Search

Aspect	Vector similarity search	Keyword / FTS search
Matches on	Semantic meaning	Exact / stemmed terms
Strength	Paraphrase, synonyms, conceptual queries	Proper nouns, codes, exact phrases
Weakness	Specific codes like "INV-2024-001"	Paraphrases and synonyms that share no terms with the query
Backend	Qdrant (HNSW index)	PostgreSQL (GIN tsvector index)

Power RAG runs both in parallel and merges the results — see Topic 14: Reciprocal Rank Fusion for the merge algorithm.

To restrict results to documents from a specific source or date range, SearchRequest.builder().filterExpression(...) accepts metadata filters. The filter is evaluated inside Qdrant as part of the search rather than on the returned results, which keeps it efficient even for large collections.
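Why it matters that the filter runs as part of the search can be shown with a small, self-contained sketch. It is a conceptual model only, not Qdrant's implementation or the Spring AI API: the Chunk record, the "source" metadata key, and the precomputed scores are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Conceptual sketch: apply the metadata filter before taking the topK by similarity.
// FilteredSearchSketch, Chunk and search(...) are hypothetical names for illustration.
final class FilteredSearchSketch {

    // Stand-in for a stored chunk: an ID, its metadata, and a precomputed similarity score.
    record Chunk(String id, Map<String, String> metadata, double score) {}

    /** Keeps only chunks whose metadata matches, then returns the topK highest-scoring IDs. */
    static List<String> search(List<Chunk> chunks, String key, String value, int topK) {
        return chunks.stream()
                .filter(c -> value.equals(c.metadata().get(key)))       // metadata filter first
                .sorted((x, y) -> Double.compare(y.score(), x.score())) // then rank by similarity
                .limit(topK)
                .map(Chunk::id)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Chunk> chunks = List.of(
                new Chunk("a", Map.of("source", "manual.pdf"), 0.90),
                new Chunk("b", Map.of("source", "faq.md"), 0.95),
                new Chunk("c", Map.of("source", "manual.pdf"), 0.80),
                new Chunk("d", Map.of("source", "manual.pdf"), 0.70));
        System.out.println(search(chunks, "source", "manual.pdf", 2)); // prints [a, c]
    }
}
```

If the filter were applied after ranking instead, the top two chunks overall would be "b" and "a", and dropping "b" would leave a single result even though other matching chunks exist. Filtering first keeps the result list filled with eligible chunks.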
