Hybrid search combines full-text (BM25) search with KNN vector search in a single query, fusing results using Reciprocal Rank Fusion (RRF). This leverages the strengths of both retrieval methods: keyword precision from BM25 and semantic understanding from vector similarity.
Full-text search excels at exact keyword matching and rare terms but misses conceptually similar content. Vector search captures semantic meaning but can be noisy on ambiguous queries. Hybrid search combines both, so documents that score well on either or both signals are surfaced.
RRF is a rank-based fusion algorithm. It operates on rank positions rather than raw scores, which avoids the need to normalize incompatible score scales (BM25 scores are unbounded; KNN distances have a different scale).
RRF_score(d) = SUM over all result sets r: weight_r / (rank_constant + rank_r(d))
Where:
- d is a document
- rank_r(d) is the document's 1-based position in result set r (sorted by that retriever's score)
- rank_constant is a smoothing constant (default: 60, configurable via the rank_constant option)
- weight_r is an optional per-retriever weight (default: 1.0)
If a document does not appear in a particular result set, its contribution from that set is 0.
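The formula can be illustrated with a small Python sketch. It is not the engine's implementation; the function name `rrf_fuse`, the sample document ids, and the tie-break by id are assumptions made for the example.

```python
from collections import defaultdict

def rrf_fuse(result_sets, rank_constant=60, weights=None):
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion.

    result_sets: list of lists, each ordered best-first by its own retriever.
    weights: optional per-list weights; every list defaults to 1.0.
    """
    if weights is None:
        weights = [1.0] * len(result_sets)
    scores = defaultdict(float)
    for ranked_ids, weight in zip(result_sets, weights):
        for rank, doc_id in enumerate(ranked_ids, start=1):  # 1-based rank
            scores[doc_id] += weight / (rank_constant + rank)
    # A document missing from a list receives no contribution from that list.
    # Ties are broken by id here, which is an assumption of this sketch.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

text_hits = [3, 1, 7]  # hypothetical BM25 order, best first
knn_hits = [1, 5, 3]   # hypothetical KNN order, nearest first
fused = rrf_fuse([text_hits, knn_hits])
for doc_id, score in fused:
    print(doc_id, round(score, 5))
# Fused order of ids: 1, 3, 5, 7
```

Documents 1 and 3 appear in both lists and therefore outrank documents 5 and 7, which appear in only one.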
rank_constant=60 is the default.
- Lower values (e.g. 10) amplify differences between top-ranked items.
- Higher values (e.g. 100) distribute influence more evenly across ranks.
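To see the effect of rank_constant numerically, the following Python sketch compares the contribution of a rank-1 hit with that of a rank-10 hit. The helper name `rrf_contribution` is hypothetical; only the formula comes from the text above.

```python
def rrf_contribution(rank, rank_constant=60, weight=1.0):
    """Score contribution of a single result at a 1-based rank."""
    return weight / (rank_constant + rank)

for rank_constant in (10, 60, 100):
    top = rrf_contribution(1, rank_constant)
    tenth = rrf_contribution(10, rank_constant)
    print(f"rank_constant={rank_constant}: rank 1 / rank 10 = {top / tenth:.2f}")
# rank_constant=10: rank 1 / rank 10 = 1.82
# rank_constant=60: rank 1 / rank 10 = 1.15
# rank_constant=100: rank 1 / rank 10 = 1.09
```

With rank_constant=10 the top hit contributes almost twice as much as the tenth; with rank_constant=100 the two are nearly equal.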
Combine MATCH(...) and KNN(...) in the WHERE clause, with OPTION fusion_method='rrf':
SELECT id, hybrid_score()
FROM t
WHERE match('machine learning')
AND knn(vec, (0.1, 0.1, 0.1, 0.1))
OPTION fusion_method='rrf';
This runs the text search and KNN search as independent parallel sub-queries, then fuses the results using RRF. Without fusion_method='rrf', the query runs as a regular KNN search filtered by the text match (pre-hybrid behavior).
- hybrid_score() - the RRF fusion score (only available in hybrid queries)
- weight() - the BM25 text match score
- knn_dist() - the vector distance (minimum across all KNN sub-queries if multiple)
| Option | Type | Default | Description |
|---|---|---|---|
| fusion_method | string | (none) | Set to 'rrf' to enable hybrid search. Required. |
| rank_constant | int | 60 | Smoothing constant in the RRF formula |
| window_size | int | 0 (auto) | How many results each sub-query retrieves before fusion. When 0, auto-computed from KNN k (with oversampling) and query LIMIT |
| fusion_weights | tuple | (all 1.0) | Per-sub-query weights for RRF scoring |
-- Default rank_constant=60 (gentler ranking)
SELECT id, hybrid_score() FROM t
WHERE match('machine learning') AND knn(vec, (0.1, 0.1, 0.1, 0.1))
OPTION fusion_method='rrf';
-- rank_constant=10 (sharper top-rank differences)
SELECT id, hybrid_score() FROM t
WHERE match('machine learning') AND knn(vec, (0.1, 0.1, 0.1, 0.1))
OPTION fusion_method='rrf', rank_constant=10;
Standard WHERE filters work alongside hybrid search. Filters are applied to both the text and KNN sub-queries:
SELECT id, category, hybrid_score()
FROM t
WHERE match('machine learning')
AND knn(vec, (0.1, 0.1, 0.1, 0.1))
AND category = 1
OPTION fusion_method='rrf';
By default, results are sorted by hybrid_score() DESC. You can override this:
-- Sort by hybrid score ascending
SELECT id, hybrid_score() FROM t
WHERE match('machine learning') AND knn(vec, (0.1, 0.1, 0.1, 0.1))
ORDER BY hybrid_score() ASC
OPTION fusion_method='rrf';
-- Sort by text weight
SELECT id, weight() FROM t
WHERE match('machine learning') AND knn(vec, (0.1, 0.1, 0.1, 0.1))
ORDER BY weight() DESC, id ASC
OPTION fusion_method='rrf';
-- Sort by KNN distance
SELECT id, knn_dist() FROM t
WHERE match('machine learning') AND knn(vec, (0.1, 0.1, 0.1, 0.1))
ORDER BY knn_dist() ASC
OPTION fusion_method='rrf';
If the text query matches no documents, only KNN results contribute to the RRF score:
SELECT id, hybrid_score() FROM t
WHERE match('xyznonexistent') AND knn(vec, (0.1, 0.1, 0.1, 0.1))
OPTION fusion_method='rrf';
-- Returns results ranked purely by KNN rank
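Why the fused order collapses to the KNN order can be checked with a short Python sketch. The helper `rrf_scores` and the sample ids are hypothetical and only mirror the RRF formula, not the engine's code.

```python
def rrf_scores(result_sets, rank_constant=60):
    """Accumulate unweighted RRF scores over ranked lists of document ids."""
    scores = {}
    for ranked_ids in result_sets:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    return scores

knn_hits = [8, 2, 5]  # hypothetical KNN order, nearest first
text_hits = []        # the text query matched nothing

scores = rrf_scores([text_hits, knn_hits])
fused_order = sorted(scores, key=scores.get, reverse=True)
print(fused_order == knn_hits)  # True: the empty list adds nothing to any score
```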
A single hybrid query can combine text search with multiple KNN searches on different vector attributes. All are fused together via RRF:
-- Three-way fusion: text + vec1 KNN + vec2 KNN
SELECT id, hybrid_score()
FROM t
WHERE match('machine learning')
AND knn(vec1, (0.1, 0.1, 0.1, 0.1))
AND knn(vec2, (1.0, 0.0, 0.0, 0.0))
OPTION fusion_method='rrf';
-- KNN-only fusion (no text), two vector searches
SELECT id, hybrid_score()
FROM t
WHERE knn(vec1, (0.1, 0.1, 0.1, 0.1))
AND knn(vec2, (1.0, 0.0, 0.0, 0.0))
OPTION fusion_method='rrf';
Multiple KNN searches without fusion_method produce an error.
By default, all sub-queries contribute equally (weight 1.0). To give different importance to text vs KNN searches, use fusion_weights with explicit aliases:
SELECT id, hybrid_score()
FROM t
WHERE match('machine learning') AS text
AND knn(vec1, (0.1, 0.1, 0.1, 0.1)) AS dense1
AND knn(vec2, (1.0, 0.0, 0.0, 0.0)) AS dense2
OPTION fusion_method='rrf',
fusion_weights=(text=0.7, dense1=0.2, dense2=0.1);
SQL:
- Use AS alias on MATCH(...) and KNN(...) to name them. There are no implicit/default aliases.
- Omitted aliases default to weight 1.0.
- Referencing a non-existent alias produces an error.
JSON:
- "query" is the fixed alias for the full-text sub-query.
- KNN aliases are set via the "name" property on each KNN entry.
- A KNN entry named "query" collides with the text alias and produces an error.
- Implicit aliases (field names without explicit "name") are not supported in fusion_weights.
You can specify weights for only some sub-queries; the rest default to 1.0:
-- Only boost text, KNN searches default to weight 1.0
SELECT id, hybrid_score()
FROM t
WHERE match('machine learning') AS text
AND knn(vec1, (0.1, 0.1, 0.1, 0.1)) AS dense1
AND knn(vec2, (1.0, 0.0, 0.0, 0.0)) AS dense2
OPTION fusion_method='rrf', fusion_weights=(text=2.0);
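The alias rules can be modelled in a few lines of Python: weights are looked up by alias, omitted aliases fall back to 1.0, and an unknown alias is rejected. This is a sketch under those stated rules only; `weighted_rrf`, the sample rankings, and the exception type are assumptions, not the engine's behavior in detail.

```python
def weighted_rrf(result_sets, fusion_weights=None, rank_constant=60):
    """result_sets maps an alias to a ranked list of document ids (best first)."""
    fusion_weights = fusion_weights or {}
    unknown = set(fusion_weights) - set(result_sets)
    if unknown:
        # Mirrors the rule that referencing a non-existent alias is an error.
        raise ValueError(f"unknown alias in fusion_weights: {sorted(unknown)}")
    scores = {}
    for alias, ranked_ids in result_sets.items():
        weight = fusion_weights.get(alias, 1.0)  # omitted aliases default to 1.0
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (rank_constant + rank)
    return scores

# Hypothetical rankings for the three sub-queries of the example above.
results = {"text": [1, 2], "dense1": [2, 3], "dense2": [3, 1]}

equal = weighted_rrf(results)                   # all weights 1.0
boosted = weighted_rrf(results, {"text": 2.0})  # only text is boosted
print(sorted(boosted, key=boosted.get, reverse=True))
```

With equal weights the three documents tie, because each holds one first and one second place. Boosting only the text sub-query moves document 1, the top text hit, to the front.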
For tables with auto-embeddings configured on a float_vector attribute, hybrid_match() provides a shorthand that automatically runs both text and KNN searches from a single query string:
-- Explicit vector field
SELECT id, hybrid_score() FROM t WHERE hybrid_match('machine learning', vec);
-- Auto-detect vector field (requires exactly one auto-embedding attribute)
SELECT id, hybrid_score() FROM t WHERE hybrid_match('machine learning');
-- With custom k and rank_constant
SELECT id, hybrid_score() FROM t
WHERE hybrid_match('machine learning', vec, {k=3})
OPTION rank_constant=10;
-- With attribute filter
SELECT id, hybrid_score() FROM t
WHERE hybrid_match('machine learning', vec) AND category=1;
hybrid_match() automatically:
- Runs the text query as a BM25 full-text search
- Generates an embedding from the same text string
- Runs a KNN search using that embedding
- Fuses results via RRF
Requirement: The vector attribute must have model_name and from configured for auto-embeddings. Without them, hybrid_match() returns an error.
For tables with auto-embeddings, a "hybrid" property provides a shorthand in JSON:
POST /search
{
"table": "hj",
"hybrid": { "query": "machine learning" }
}
POST /search
{
"table": "hj",
"hybrid": { "query": "machine learning", "field": "vec" }
}
POST /search
{
"table": "hj",
"hybrid": { "query": "machine learning" },
"options": { "rank_constant": 10 }
}
The "hybrid" property cannot be used together with "knn".
When the vector attribute has auto-embeddings, you can use "query" (string) instead of "query_vector" (array) in the knn object:
POST /search
{
"table": "ht",
"knn": { "field": "vec", "query": "machine learning", "k": 5 },
"query": { "match": { "title": "machine learning" } },
"options": { "fusion_method": "rrf" }
}
The string is automatically embedded at query time. Without auto-embeddings configured, this returns an error.
Internally, a hybrid query is split into N+1 parallel sub-queries:
- Job 0: Full-text (BM25) sub-query (skipped if text query is empty, to avoid polluting RRF with fullscan results)
- Jobs 1..N: One KNN sub-query per knn(...) entry
All sub-queries run concurrently. After all complete, the RRF fusion:
- Collects ranked results from each sub-query
- For each document, accumulates RRF score contributions from every sub-query it appears in
- Sorts by fused RRF score descending
- Sets knn_dist() to the minimum distance across all KNN sub-queries for each document
- Preserves weight() from the text sub-query
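The fusion steps above can be sketched in Python as follows. The sketch runs sequentially and does not model the concurrent sub-queries. The `FusedHit` type, the `fuse` function, the sample scores, and the use of `None` for a missing weight or distance are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FusedHit:
    doc_id: int
    hybrid_score: float = 0.0
    weight: Optional[int] = None      # BM25 weight from the text sub-query
    knn_dist: Optional[float] = None  # minimum distance across KNN sub-queries

def fuse(text_hits, knn_result_sets, rank_constant=60):
    """text_hits: [(doc_id, bm25_weight)], best first; may be empty.
    knn_result_sets: one [(doc_id, distance)] list per KNN sub-query, nearest first."""
    fused = {}
    # Text sub-query: accumulate the RRF contribution and keep weight().
    for rank, (doc_id, bm25_weight) in enumerate(text_hits, start=1):
        hit = fused.setdefault(doc_id, FusedHit(doc_id))
        hit.hybrid_score += 1.0 / (rank_constant + rank)
        hit.weight = bm25_weight
    # KNN sub-queries: accumulate contributions and track the minimum distance.
    for knn_hits in knn_result_sets:
        for rank, (doc_id, dist) in enumerate(knn_hits, start=1):
            hit = fused.setdefault(doc_id, FusedHit(doc_id))
            hit.hybrid_score += 1.0 / (rank_constant + rank)
            hit.knn_dist = dist if hit.knn_dist is None else min(hit.knn_dist, dist)
    # Final order: fused RRF score, descending.
    return sorted(fused.values(), key=lambda h: -h.hybrid_score)

hits = fuse(
    text_hits=[(1, 1680), (2, 1500)],
    knn_result_sets=[[(2, 0.10), (3, 0.25)], [(2, 0.40), (1, 0.05)]],
)
for hit in hits:
    print(hit)
```

Document 2 appears in all three result sets, so it ranks first, and its knn_dist is the smaller of its two distances (0.10).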