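The MATCH clause allows full-text searches in text fields. The input query string is tokenized using the same settings that were applied to the text during indexing. In addition to the tokenization of the input text, the query string supports a number of full-text operators that enforce various rules on how keywords should provide a valid match.
Full-text match clauses can be combined with attribute filters as an AND boolean. OR relations between full-text matches and attribute filters are not supported.
The match query is always executed first in the filtering process, followed by the attribute filters. The attribute filters are applied to the result set of the match query. A query without a match clause is called a fullscan.
There must be at most one MATCH() in the SELECT clause.
With the full-text query syntax, matching is performed across all indexed text fields of a document, unless the expression requires a match within a single field (like phrase search) or is limited by field operators.
When using JOIN queries, MATCH() can accept an optional second parameter that specifies which table the full-text search should be applied to. By default, the full-text query is applied to the left table in the JOIN operation: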
SELECT * FROM table1 LEFT JOIN table2 ON table1.id = table2.id WHERE MATCH('search query', table2);
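This allows you to perform full-text searches on a specific table in a join operation. For more details on using MATCH with JOINs, see the Joining tables section.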
MATCH('search query' [, table_name])
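- 'search query': The full-text search query string, which can include various full-text operators.
- table_name: (Optional) The name of the table to apply the full-text search to, used in JOIN queries to specify a table other than the default left table.
The SELECT statement uses a MATCH clause, which must come after WHERE, to perform full-text searches. MATCH() accepts an input string in which all full-text operators are available.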
- SQL
- MATCH with filters
SELECT * FROM myindex WHERE MATCH('"find me fast"/2');

An example of a more complex query using MATCH with WHERE filters:
SELECT * FROM myindex WHERE MATCH('cats|birds') AND (`title`='some title' AND `id`=123);

+------+------+----------------+
| id | gid | title |
+------+------+----------------+
| 1 | 11 | first find me |
| 2 | 12 | second find me |
+------+------+----------------+
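2 rows in set (0.00 sec)

Full-text matching is available in the /search endpoint and in HTTP-based clients. The following clauses can be used for performing full-text matches:
"match" is a simple query that matches the specified keywords in the specified fields.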
"query":
{
"match": { "field": "keyword" }
}
You can specify a list of fields:
"match":
{
"field1,field2": "keyword"
}
Or you can use _all or * to search all fields.
You can search all fields except one using "!field":
"match":
{
"!field1": "keyword"
}
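The field key of a "match" clause takes the shapes described above: a single field, a comma-separated list, _all or * for every field, and "!field" to exclude one field. As a rough sketch, the helper below builds each shape as a plain Python dict. The name match_clause is hypothetical and not part of any Manticore client, and the sketch only allows excluding a single field because that is the only form shown here.

```python
import json

def match_clause(keywords, fields=None, exclude=False):
    """Build a {"match": {...}} clause (hypothetical helper for illustration)."""
    if not fields:
        if exclude:
            raise ValueError("exclude needs a field name")
        key = "*"                       # search all fields
    elif exclude:
        if len(fields) != 1:
            raise ValueError("this sketch excludes exactly one field")
        key = "!" + fields[0]           # all fields except this one
    else:
        key = ",".join(fields)          # one field or a comma-separated list
    return {"match": {key: keywords}}

print(json.dumps(match_clause("keyword", ["field1", "field2"])))
print(json.dumps(match_clause("keyword", ["field1"], exclude=True)))
```

The resulting dict can be placed under the "query" key of a /search request body.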
By default, keywords are combined using the OR operator. However, you can change this behavior using the "operator" clause:
"query":
{
"match":
{
"content,title":
{
"query":"keyword",
"operator":"or"
}
}
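}
"operator" can be set to "or" or "and".
The boost modifier can also be applied. It raises the word's IDF score by the specified factor in ranking scores that incorporate IDF into their calculation. It does not affect the matching process in any way.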
"query":
{
"match":
{
"field1":
{
"query": "keyword",
"boost": 2.0
}
}
}
"match_phrase" is a query that matches the entire phrase. It is similar to the phrase operator in SQL. Here is an example:
"query":
{
"match_phrase": { "_all" : "had grown quite" }
}
"query_string" accepts an input string as a full-text query in MATCH() syntax.
"query":
{
"query_string": "Church NOTNEAR/3 street"
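}
"match_all" accepts an empty object and returns documents from the table without performing any attribute filtering or full-text matching. Alternatively, you can omit the query clause from the request, which has the same effect.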
"query":
{
"match_all": {}
}
All full-text match clauses can be combined with the must, must_not, and should operators of a JSON bool query.
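As an illustration of combining the clauses above, the sketch below assembles a bool query as a Python dict and serializes it. The helper bool_query, the table name myindex, and the field names are placeholders chosen for the example; only the "bool", "must", "must_not", and "should" keys and the clause shapes come from the JSON query syntax.

```python
import json

def bool_query(must=(), must_not=(), should=()):
    """Combine full-text clauses into a {"bool": ...} query (hypothetical helper)."""
    groups = {"must": list(must), "must_not": list(must_not), "should": list(should)}
    # Keep only the operators that actually received clauses.
    return {"bool": {name: clauses for name, clauses in groups.items() if clauses}}

query = bool_query(
    must=[{"match": {"content,title": {"query": "find me", "operator": "and"}}}],
    must_not=[{"match_phrase": {"_all": "had grown quite"}}],
)

print(json.dumps({"table": "myindex", "query": query}, indent=2))
```

Each list entry is an ordinary "match", "match_phrase", or "query_string" clause, so the individual clauses can be built and tested separately before being combined.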
- match
- match_phrase
- query_string
- PHP
- Python
- Python-asyncio
- javascript
- Java
- C#
- Rust
- TypeScript
- Go
POST /search
-d
'{
"table" : "hn_small",
"query":
{
"match":
{
"*" : "find joe"
}
},
"_source": ["story_author","comment_author"],
"limit": 1
}'

POST /search
-d
'{
"table" : "hn_small",
"query":
{
"match_phrase":
{
"*" : "find joe"
}
},
"_source": ["story_author","comment_author"],
"limit": 1
}'

POST /search
-d
'{ "table" : "hn_small",
"query":
{
"query_string": "@comment_text \"find joe fast \"/2"
},
"_source": ["story_author","comment_author"],
"limit": 1
}'

$search = new Search(new Client());
$result = $search->search('@title find me fast')->get();
foreach($result as $doc)
{
echo 'Document: '.$doc->getId();
foreach($doc->getData() as $field=>$value)
{
echo $field.': '.$value;
}
}

searchApi.search({"table":"hn_small","query":{"query_string":"@comment_text \"find joe fast \"/2"}, "_source": ["story_author","comment_author"], "limit":1})

await searchApi.search({"table":"hn_small","query":{"query_string":"@comment_text \"find joe fast \"/2"}, "_source": ["story_author","comment_author"], "limit":1})

res = await searchApi.search({"table":"hn_small","query":{"query_string":"@comment_text \"find joe fast \"/2"}, "_source": ["story_author","comment_author"], "limit":1});

query = new HashMap<String,Object>();
query.put("query_string", "@comment_text \"find joe fast \"/2");
searchRequest = new SearchRequest();
searchRequest.setIndex("hn_small");
searchRequest.setQuery(query);
searchRequest.addSourceItem("story_author");
searchRequest.addSourceItem("comment_author");
searchRequest.limit(1);
searchResponse = searchApi.search(searchRequest);

object query = new { query_string="@comment_text \"find joe fast \"/2" };
var searchRequest = new SearchRequest("hn_small", query);
searchRequest.Source = new List<string> {"story_author", "comment_author"};
searchRequest.Limit = 1;
SearchResponse searchResponse = searchApi.Search(searchRequest);

let query = SearchQuery {
query_string: Some(serde_json::json!("@comment_text \"find joe fast \"/2").into()),
..Default::default()
};
let search_req = SearchRequest {
table: "hn_small".to_string(),
query: Some(Box::new(query)),
source: serde_json::json!(["story_author", "comment_author"]),
limit: serde_json::json!(1),
..Default::default()
};
let search_res = search_api.search(search_req).await;

res = await searchApi.search({
index: 'test',
query: { query_string: "test document 1" },
"_source": ["content", "title"],
limit: 1
});

searchRequest := manticoresearch.NewSearchRequest("test")
query := map[string]interface{} {"query_string": "test document 1"}
searchRequest.SetQuery(query)
searchRequest.SetSource([]string{"content", "title"})
searchRequest.SetLimit(1)
resp, httpRes, err := apiClient.SearchAPI.Search(context.Background()).SearchRequest(*searchRequest).Execute()

{
"took" : 3,
"timed_out" : false,
"hits" : {
"hits" : [
{
"_id": 668018,
"_score" : 3579,
"_source" : {
"story_author" : "IgorPartola",
"comment_author" : "joe_the_user"
}
}
],
"total" : 88063,
"total_relation" : "eq"
}
}

{
"took" : 3,
"timed_out" : false,
"hits" : {
"hits" : [
{
"_id": 807160,
"_score" : 2599,
"_source" : {
"story_author" : "rbanffy",
"comment_author" : "runjake"
}
}
],
"total" : 2,
"total_relation" : "eq"
}
}

{
"took" : 3,
"timed_out" : false,
"hits" : {
"hits" : [
{
"_id": 807160,
"_score" : 2566,
"_source" : {
"story_author" : "rbanffy",
"comment_author" : "runjake"
}
}
],
"total" : 1864,
"total_relation" : "eq"
}
}

Document: 1
title: first find me fast
gid: 11
Document: 2
title: second find me fast
gid: 12

{'aggregations': None,
'hits': {'hits': [{'_id': '807160',
'_score': 2566,
'_source': {'comment_author': 'runjake',
'story_author': 'rbanffy'}}],
'max_score': None,
'total': 1864,
'total_relation': 'eq'},
'profile': None,
'timed_out': False,
'took': 2,
'warning': None}

{'aggregations': None,
'hits': {'hits': [{'_id': '807160',
'_score': 2566,
'_source': {'comment_author': 'runjake',
'story_author': 'rbanffy'}}],
'max_score': None,
'total': 1864,
'total_relation': 'eq'},
'profile': None,
'timed_out': False,
'took': 2,
'warning': None}

{
took: 1,
timed_out: false,
hits:
exports {
total: 1864,
total_relation: 'eq',
hits:
[ { _id: '807160',
_score: 2566,
_source: { story_author: 'rbanffy', comment_author: 'runjake' }
}
]
}
}

class SearchResponse {
took: 1
timedOut: false
aggregations: null
hits: class SearchResponseHits {
maxScore: null
total: 1864
totalRelation: eq
hits: [{_id=807160, _score=2566, _source={story_author=rbanffy, comment_author=runjake}}]
}
profile: null
warning: null
}

class SearchResponse {
took: 1
timedOut: false
aggregations: null
hits: class SearchResponseHits {
maxScore: null
total: 1864
totalRelation: eq
hits: [{_id=807160, _score=2566, _source={story_author=rbanffy, comment_author=runjake}}]
}
profile: null
warning: null
}

class SearchResponse {
took: 1
timedOut: false
aggregations: null
hits: class SearchResponseHits {
maxScore: null
total: 1864
totalRelation: eq
hits: [{_id=807160, _score=2566, _source={story_author=rbanffy, comment_author=runjake}}]
}
profile: null
warning: null
}

{
took: 1,
timed_out: false,
hits:
exports {
total: 5,
total_relation: 'eq',
hits:
[ { _id: '1',
_score: 2566,
_source: { content: 'This is a test document 1', title: 'Doc 1' }
}
]
}
}

{
"hits": {
"hits": [
{
"_id": 1,
"_score": 2566,
"_source": {
"content": "This is a test document 1",
"title": "Doc 1"
}
}
],
"total": 5,
"total_relation": "eq"
},
"timed_out": false,
"took": 0
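}

The query string can contain specific operators that define the conditions for how the words from the query string should be matched.
An implicit logical AND operator is always present, so "hello world" implies that both "hello" and "world" must be found in the matching document.
hello world
Note: There is no explicit AND operator.
The logical OR operator | has higher precedence than AND, so looking for cat | dog | mouse means looking for (cat | dog | mouse) rather than (looking for cat) | dog | mouse.
hello | world
Note: There is no OR operator. Use | instead.
hello MAYBE world
The MAYBE operator works like the | operator, but it does not return documents that match only the right subtree expression.
hello -world
hello !world
The negation operator enforces a rule that the word must not be present.
Queries containing only negations are not supported by default. To enable them, use the server option not_terms_only_allowed.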
@title hello @body world
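The field limit operator restricts the subsequent search to the specified field. By default, the query fails with an error message if the given field name does not exist in the searched table. However, this behavior can be suppressed by specifying the @@relaxed option at the beginning of the query:
@@relaxed @nosuchfield my query
This can be useful when searching through heterogeneous tables with different schemas.
Field position limits additionally restrict the search to the first N positions within a given field. For example, @body[50] hello will not match documents where the keyword hello appears at position 51 or later in the body.
@body[50] hello
Multiple-field search operator:
@(title,body) hello world
Ignore-field search operator (ignores any matches of 'hello world' from the 'title' field):
@!title hello world
Ignore-multiple-field search operator (if there are fields 'title', 'subject', and 'body', then @!(title) is equivalent to @(subject,body)):
@!(title,body) hello world
All-field search operator: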
@* hello
"hello world"
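The phrase operator requires the words to be adjacent to each other.
The phrase search operator can include a match-any-term modifier. Within the phrase operator, terms are positionally significant. When the match-any-term modifier is used, the positions of the subsequent terms in that phrase query are shifted. As a result, the modifier does not affect search performance.
Note: When using this operator with queries containing more than 31 keywords, ranking statistics (such as tf, idf, bm25) for keywords at positions 31 and above may be undercounted. This is because a 32-bit mask is used internally to track term occurrences within matches. The matching logic (finding documents) remains correct, but ranking scores may be affected for very long queries.
"exact * phrase * * for terms"
You can also use the OR operator inside quotes. When used within a phrase, the OR operator (|) must be enclosed in parentheses (). Each option is checked at the same position, and the phrase matches if any of the options fits that position.
Correct examples (with parentheses):
"( a | b ) c"
"( ( a b c ) | d ) e"
"man ( happy | sad ) but all ( ( as good ) | ( as fast ) )"
Incorrect examples (without parentheses - these will not work):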
"a | b c"
"happy | sad"
"hello world"~10
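Proximity distance is measured in words, accounts for the word count, and applies to all words within the quotes. For example, the query "cat dog mouse"~5 means that there must be a span of fewer than 8 words that contains all 3 words. Therefore, a document containing CAT aaa bbb ccc DOG eee fff MOUSE will not match this query, because the span is exactly 8 words long.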
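The span arithmetic can be checked with a small simulation. The sketch below is not how the engine is implemented: it splits on whitespace, lowercases, ignores word order (an assumption of this sketch), and simply looks for the shortest window that contains every keyword, then compares its length with keyword count plus distance.

```python
def proximity_match(document, keywords, distance):
    """Return True if all keywords fit in a span shorter than len(keywords) + distance."""
    tokens = document.lower().split()
    wanted = {k.lower() for k in keywords}
    limit = len(wanted) + distance          # the span must be strictly shorter than this
    shortest = None
    for start, token in enumerate(tokens):
        if token not in wanted:
            continue
        seen = set()
        for end in range(start, len(tokens)):
            if tokens[end] in wanted:
                seen.add(tokens[end])
            if seen == wanted:
                length = end - start + 1
                if shortest is None or length < shortest:
                    shortest = length
                break
    return shortest is not None and shortest < limit

# "cat dog mouse"~5: the span must be shorter than 3 + 5 = 8 words.
print(proximity_match("CAT aaa bbb ccc DOG eee fff MOUSE", ["cat", "dog", "mouse"], 5))
print(proximity_match("CAT aaa bbb DOG eee fff MOUSE", ["cat", "dog", "mouse"], 5))
```

The first document spans exactly 8 words and is rejected, while the second spans 7 words and is accepted.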
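Note: When using this operator with queries containing more than 31 keywords, ranking statistics (such as tf, idf, bm25) for keywords at positions 31 and above may be undercounted. This is because a 32-bit mask is used internally to track term occurrences within matches. The matching logic (finding documents) remains correct, but ranking scores may be affected for very long queries.
You can also use the OR operator inside a proximity search. When used within a proximity search, the OR operator (|) must be enclosed in parentheses (). Each option is checked separately.
Correct example (with parentheses):
"( two | four ) fish chips"~5
Incorrect example (without parentheses - this will not work):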
"two | four fish chips"~5
"the world is a wonderful place"/3
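The quorum matching operator introduces a type of fuzzy matching. It matches only those documents that meet a given threshold of the specified words. In the example above ("the world is a wonderful place"/3), it will match all documents containing at least 3 of the 6 specified words. The operator is limited to 255 keywords. Instead of an absolute number, you can also provide a value between 0.0 and 1.0 (standing for 0% and 100%), and Manticore will match only documents containing at least the specified percentage of the given words. The same example could also be expressed as "the world is a wonderful place"/0.5, which matches documents containing at least 50% of the 6 words.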
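To make the threshold arithmetic concrete, here is a minimal sketch that converts a quorum value into a required keyword count and tests a document against it. It uses whitespace tokenization only, and rounding a fractional result up is an assumption of this sketch: the text above only covers the case where 50% of 6 words is exactly 3.

```python
import math

def required_matches(keyword_count, quorum):
    """Translate the value after the slash ("3" or "0.5") into a keyword count."""
    if "." in quorum:
        # Fractional form: a share of the keywords (rounded up, by assumption).
        return math.ceil(keyword_count * float(quorum))
    return int(quorum)

def quorum_match(document, keywords, quorum):
    tokens = set(document.lower().split())
    distinct = {k.lower() for k in keywords}
    present = sum(1 for keyword in distinct if keyword in tokens)
    return present >= required_matches(len(distinct), quorum)

words = "the world is a wonderful place".split()
print(required_matches(len(words), "3"), required_matches(len(words), "0.5"))
print(quorum_match("what a wonderful world", words, "0.5"))
print(quorum_match("a quiet place", words, "3"))
```

Both forms require 3 of the 6 words here, so the first document (a, wonderful, world) passes and the second (a, place) does not.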
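The quorum operator supports the OR (|) operator. When used within quorum matching, the OR operator (|) must be enclosed in parentheses (). Only one word from each OR group counts towards the match.
Correct examples (with parentheses):
"( ( a b c ) | d ) e f g"/0.5
"happy ( sad | angry ) man"/2
Incorrect example (without parentheses - this will not work):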
"a b c | d e f g"/0.5
aaa << bbb << ccc
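The strict order operator (also known as the "before" operator) matches a document only if its argument keywords appear in the document in exactly the order specified in the query. For example, the query black << cat will match the document "black and white cat" but not the document "that cat was black". The order operator has the lowest priority. It can be applied to individual keywords as well as to more complex expressions. For example, this is a valid query: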
(bag of words) << "exact phrase" << red|green|blue
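For plain keywords, the ordering rule can be pictured with the following sketch. It handles only single keywords (not subexpressions such as phrases or OR groups), splits on whitespace, and applies no stemming, so it illustrates the rule rather than reproducing the engine.

```python
def in_strict_order(document, keywords):
    """Return True if the keywords occur in the document in the given order."""
    tokens = document.lower().split()
    position = -1
    for keyword in keywords:
        try:
            # Look for the next keyword strictly after the previous match.
            position = tokens.index(keyword.lower(), position + 1)
        except ValueError:
            return False
    return True

# black << cat
print(in_strict_order("black and white cat", ["black", "cat"]))
print(in_strict_order("that cat was black", ["black", "cat"]))
```

The first call succeeds because "cat" follows "black"; the second fails because no "cat" appears after "black".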
raining =cats and =dogs
="exact phrase"
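The exact form keyword modifier matches a document only if the keyword appears in exactly the form specified. By default, a document is considered a match if the stemmed or lemmatized keyword matches. For example, the query "runs" will match both a document containing "runs" and one containing "running", because both forms stem to "run". However, the =runs query will match only the first document. The exact form modifier requires the index_exact_words option to be enabled.
Another use case is to prevent a keyword from being expanded to its *keyword* form. For example, with index_exact_words=1 + expand_keywords=1/star, bcd will find a document containing abcde, but =bcd will not.
As a modifier that affects the keyword, it can be used within operators such as the phrase, proximity, and quorum operators. The exact form modifier can also be applied to the phrase operator itself, in which case it internally adds the exact form modifier to all terms in the phrase.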
nation* *nation* *national
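Requires min_infix_len for prefix (expansion at the tail) and/or suffix (expansion at the head) matching. If only prefixing is needed, min_prefix_len can be used instead.
The search will attempt to find all expansions of the wildcarded tokens, and each expansion is recorded as a matched hit. The number of expansions for a token can be controlled with the expansion_limit table setting. Wildcarded tokens can have a significant impact on query search time, especially when the tokens are short. In such cases, using the expansion limit is advisable.
The wildcard operator can be applied automatically if the expand_keywords table setting is used.
In addition, the following inline wildcard operators are supported:
- ? can match any single character: t?st will match test, but not teast
- % can match zero or one character: tes% will match tes or test, but not testing
The inline operators require dict=keywords (enabled by default) and prefixing/infixing to be enabled.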
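One way to reason about the inline wildcards is to translate them into an ordinary regular expression and test candidate tokens against it. The sketch below does that with Python's re module; it is an illustration of the matching rules only, and treating * as "any run of characters" is an assumption of this sketch.

```python
import re

def wildcard_to_regex(pattern):
    """Translate ?, % and * into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")       # exactly one character
        elif ch == "%":
            parts.append(".?")      # zero or one character
        elif ch == "*":
            parts.append(".*")      # any run of characters (assumed)
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

def wildcard_matches(pattern, token):
    # The whole token has to be covered by the pattern.
    return wildcard_to_regex(pattern).fullmatch(token) is not None

for pattern, token in [("t?st", "test"), ("t?st", "teast"),
                       ("tes%", "tes"), ("tes%", "test"), ("tes%", "testing")]:
    print(pattern, token, wildcard_matches(pattern, token))
```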
REGEX(/t.?e/)
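Requires the min_infix_len or min_prefix_len option and the dict=keywords option to be set (the latter is the default).
Similarly to the wildcard operators, the REGEX operator attempts to find all tokens matching the provided pattern, and each expansion is recorded as a matched hit. Note that this can have a significant impact on query search time, because the entire dictionary is scanned and every term in the dictionary is matched against the REGEX pattern.
The pattern should follow the RE2 syntax. The REGEX expression delimiter is the first symbol after the opening bracket. In other words, all text between the opening bracket followed by the delimiter and the delimiter followed by the closing bracket is treated as an RE2 expression.
Note that the terms stored in the dictionary undergo charset_table transformation. This means, for example, that REGEX may not be able to match uppercase characters if all characters are lowercased according to the charset_table (which happens by default). To successfully match a term using a REGEX expression, the pattern must correspond to the entire token. To achieve a partial match, place .* at the beginning and/or end of the pattern.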
REGEX(/.{3}t/)
REGEX(/t.*\d*/)
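The whole-token rule and the effect of lowercasing can be illustrated with a short simulation. The sketch uses Python's re module as a stand-in for RE2 (the two dialects differ in places), and the list of dictionary terms is made up for the example.

```python
import re

def regex_expansions(pattern, dictionary_terms):
    """Return the dictionary terms that the pattern covers completely."""
    compiled = re.compile(pattern)
    # Every term is tested, and only a match of the entire token counts.
    return [term for term in dictionary_terms if compiled.fullmatch(term)]

# Hypothetical dictionary content, already lowercased as a default charset_table would do.
terms = ["the", "tie", "theme", "cat"]

print(regex_expansions(r"t.?e", terms))       # whole-token match only
print(regex_expansions(r".*t.?e.*", terms))   # partial match via leading/trailing .*
print(regex_expansions(r"T.?e", terms))       # uppercase never appears in the dictionary
```

The first pattern expands to "the" and "tie", the second also picks up "theme", and the third finds nothing because the stored terms are lowercase.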
^hello world$
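The field-start and field-end keyword modifiers ensure that a keyword matches only if it appears at the very beginning or the very end of a full-text field, respectively. For example, the query "^hello world$" (enclosed in quotes to combine the phrase operator with the start/end modifiers) will match only documents that contain at least one field consisting of these two specific keywords.
boosted^1.234 boostedfieldend$^1.234
The boost modifier raises the word's IDF score by the specified factor in ranking scores that incorporate IDF into their calculation. It does not affect the matching process in any way.
hello NEAR/3 world NEAR/4 "my test"
The NEAR operator is a more generalized version of the proximity operator. Its syntax is NEAR/N, which is case-sensitive and does not allow spaces between the NEAR keyword, the slash sign, and the distance value.
While the original proximity operator works only on sets of keywords, NEAR is more versatile and can accept arbitrary subexpressions as its two arguments. It matches a document when both subexpressions are found within N words of each other, regardless of their order. NEAR is left-associative and shares the same (lowest) precedence as BEFORE.
Note that one NEAR/7 two NEAR/7 three is not exactly equivalent to "one two three"~7. The key difference is that the proximity operator allows up to 6 non-matching words between all three matching words, while the NEAR version is less restrictive: it allows up to 6 words between one and two, and then up to 6 more between that two-word match and three.
Note: When using this operator with queries containing more than 31 keywords, ranking statistics (such as tf, idf, bm25) for keywords at positions 31 and above may be undercounted. This is because a 32-bit mask is used internally to track term occurrences within matches. The matching logic (finding documents) remains correct, but ranking scores may be affected for very long queries.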
Church NOTNEAR/3 street
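The NOTNEAR operator serves as a negative assertion. It matches a document when the left argument is present and the right argument is either absent from the document or located the specified distance away from the end of the left matched argument. The distance is expressed in words. The syntax is NOTNEAR/N, which is case-sensitive and does not allow spaces between the NOTNEAR keyword, the slash sign, and the distance value. Both arguments of this operator can be terms, operators, or groups of operators.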
all SENTENCE words SENTENCE "in one sentence"
"Bill Gates" PARAGRAPH "Steve Jobs"
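The SENTENCE and PARAGRAPH operators match a document when both of their arguments are within the same sentence or the same paragraph of text, respectively. The arguments can be keywords, phrases, or instances of the same operator.
The order of the arguments within the sentence or paragraph does not matter. These operators work only with tables built with index_sp (the sentence and paragraph indexing feature) and revert to a plain AND operation otherwise. For information on what is considered a sentence and a paragraph, refer to the index_sp directive documentation.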
ZONE:(h3,h4)
only in these titles
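The ZONE limit operator closely resembles the field limit operator, but restricts matching to a specified in-field zone or list of zones. Note that the subsequent subexpressions are not required to match within a single contiguous span of a given zone and may match across multiple spans. For example, the query (ZONE:th hello world) will match the following sample document: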
<th>Table 1. Local awareness of Hello Kitty brand.</th>
.. some table data goes here ..
<th>Table 2. World-wide brand awareness.</th>
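The ZONE operator affects the query until the next field or ZONE limit operator, or until the closing parenthesis. It works only with tables built with zone support (refer to index_zones) and is ignored otherwise.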
ZONESPAN:(h2)
only in a (single) title
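The ZONESPAN limit operator resembles the ZONE operator, but requires the match to occur within a single contiguous span. In the earlier example, ZONESPAN:th hello world would not match the document, because "hello" and "world" do not appear within the same span.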
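The difference between the two operators can be reproduced on the sample document with a small simulation. The sketch below extracts zone spans with a regular expression and uses naive word splitting; it models only the "any span" versus "same span" rule, not the engine's zone indexing.

```python
import re

def zone_spans(html, zone):
    """Return the lowercased word lists of every <zone>...</zone> span."""
    tag = re.escape(zone)
    spans = re.findall(rf"<{tag}>(.*?)</{tag}>", html, flags=re.S)
    return [re.findall(r"\w+", span.lower()) for span in spans]

def zone_match(html, zone, keywords):
    # ZONE: every keyword must occur in some span of the zone.
    words = {word for span in zone_spans(html, zone) for word in span}
    return all(keyword in words for keyword in keywords)

def zonespan_match(html, zone, keywords):
    # ZONESPAN: all keywords must occur together in one span.
    return any(all(keyword in span for keyword in keywords)
               for span in zone_spans(html, zone))

doc = ("<th>Table 1. Local awareness of Hello Kitty brand.</th>\n"
       ".. some table data goes here ..\n"
       "<th>Table 2. World-wide brand awareness.</th>")

print(zone_match(doc, "th", ["hello", "world"]))      # keywords in different spans
print(zonespan_match(doc, "th", ["hello", "world"]))  # no single span has both
```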
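Since some characters function as operators in the query string, they must be escaped to prevent query errors or unwanted matching conditions.
The following characters should be escaped using a backslash (\):
! " $ ' ( ) - / < @ \ ^ | ~
To escape a single quote ('), use one backslash:
SELECT * FROM your_index WHERE MATCH('l\'italiano');
The other characters in the list above are operators or query constructs, so for the engine to treat them as plain characters they must be preceded by an escape character. The backslash itself must also be escaped, which results in two backslashes:
SELECT * FROM your_index WHERE MATCH('r\\&b | \\(official video\\)');
To use a backslash as a character, you must escape both the backslash as a character and the backslash as the escape operator, which requires four backslashes: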
SELECT * FROM your_index WHERE MATCH('\\\\ABC');
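The two escaping layers can be kept apart by handling them in two separate steps: first escape the query-syntax characters, then write the result as an SQL string literal. The sketch below follows the rules above; the function names are illustrative, and it is meant for text that should be matched literally (any operators in the input are escaped too).

```python
# Query-syntax characters from the list above, minus the single quote,
# which is handled when the SQL string literal is written.
QUERY_SPECIAL = set('!"$()-/<@\\^|~')

def escape_query_chars(text):
    """Layer 1: prefix each query-syntax character with one backslash."""
    return "".join("\\" + ch if ch in QUERY_SPECIAL else ch for ch in text)

def to_sql_literal(query):
    """Layer 2: double every backslash, then escape single quotes once."""
    return "'" + query.replace("\\", "\\\\").replace("'", "\\'") + "'"

for raw in ["(official video)", "\\ABC", "l'italiano"]:
    literal = to_sql_literal(escape_query_chars(raw))
    print("SELECT * FROM your_index WHERE MATCH(" + literal + ");")
```

The loop prints the double-backslash, four-backslash, and single-quote forms shown in the examples above.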
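When you work with JSON data in Manticore Search and need to include a double quote (") within a JSON string, it is important to handle the escaping properly. In JSON, a double quote within a string is escaped with a backslash (\). However, when JSON data is inserted through an SQL query, Manticore Search interprets the backslash (\) as an escape character within strings.
To make sure the double quote is correctly inserted into the JSON data, you need to escape the backslash itself. This results in two backslashes (\\) before the double quote. For example: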
insert into tbl(j) values('{"a": "\\"abc\\""}');
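When the JSON value is produced by a program, the same statement can be generated by serializing first and then doubling the backslashes for the SQL layer. The sketch below shows this with Python's json module; json_sql_literal is a hypothetical helper, and escaping a single quote with one backslash is carried over from the MATCH() rule as an assumption.

```python
import json

def json_sql_literal(value):
    """Serialize a value to JSON and quote it for use inside an SQL statement."""
    encoded = json.dumps(value)     # double quotes inside strings become \"
    # Double each backslash so that the SQL string parser passes \" through.
    return "'" + encoded.replace("\\", "\\\\").replace("'", "\\'") + "'"

statement = "insert into tbl(j) values(" + json_sql_literal({"a": '"abc"'}) + ");"
print(statement)
```

The printed statement has two backslashes before each inner double quote, as in the example above.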
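MySQL drivers provide escaping functions (e.g., mysqli_real_escape_string in PHP or conn.escape_string in Python), but they escape only specific characters.
You still need to add escaping for the characters from the list above that are not escaped by these functions.
Because these functions escape the backslash for you, you only need to add one backslash.
This also applies to drivers that support (client-side) prepared statements. For example, with PHP PDO prepared statements, you need to add a backslash for the $ character: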
$statement = $ln_sph->prepare( "SELECT * FROM index WHERE MATCH(:match)");
$match = '\$manticore';
$statement->bindParam(':match',$match,PDO::PARAM_STR);
$results = $statement->execute();
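This results in the final query SELECT * FROM index WHERE MATCH('\\$manticore');
The same rules as for the SQL protocol apply, with the exception that for JSON a double quote must be escaped with a single backslash, while the rest of the characters require double escaping.
When using JSON libraries or functions that convert data structures to JSON strings, the double quote and the single backslash are escaped automatically by these functions and do not need to be escaped explicitly.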
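A short check with Python's json module shows the automatic escaping at work: the program supplies only the single backslash needed by the query syntax, and the serializer produces the doubled form on the wire. The table name test is a placeholder.

```python
import json

# One backslash at the query-syntax level: the engine should see \$manticore.
query_string = "\\$manticore"

body = json.dumps({"table": "test", "query": {"query_string": query_string}})
print(body)

# Decoding the body gives back the single-backslash form.
decoded = json.loads(body)["query"]["query_string"]
print(decoded == query_string)
```

The serialized body contains two backslashes before the dollar sign, and decoding it returns the original single-backslash string.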
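The official clients use the common JSON libraries/functions of their respective programming languages under the hood. The escaping rules described above apply to them as well.
The asterisk (*) is a unique character that serves two purposes:
- as a wildcard prefix/suffix expander
- as an any-term modifier within a phrase search.
Unlike other special characters that function as operators, the asterisk cannot be escaped when it is in a position where it provides one of these functions.
In non-wildcard queries, the asterisk does not require escaping, regardless of whether it is in the charset_table.
In wildcard queries, an asterisk in the middle of a word does not require escaping. At the beginning or end of a word, the asterisk is always interpreted as the wildcard operator, even if escaping is applied.
To escape special characters in JSON nodes, use backticks. For example: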
MySQL [(none)]> select * from t where json.`a=b`=234;
+---------------------+-------------+------+
| id | json | text |
+---------------------+-------------+------+
| 8215557549554925578 | {"a=b":234} | |
+---------------------+-------------+------+
MySQL [(none)]> select * from t where json.`a:b`=123;
+---------------------+-------------+------+
| id | json | text |
+---------------------+-------------+------+
| 8215557549554925577 | {"a:b":123} | |
+---------------------+-------------+------+
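Consider this complex query example:
"hello world" @title "example program"~5 @body python -(php|perl) @* code
The full meaning of this search is:
- Locate the words 'hello' and 'world' adjacent to each other in any field of a document;
- Additionally, the same document must contain the words 'example' and 'program' in the title field, with up to, but not including, 5 words between them (for example, "example PHP program" would match, but "example script to introduce outside data into the correct context for your program" would not, because the two terms have 5 or more words between them);
- Furthermore, the same document must contain the word 'python' in the body field, while excluding 'php' or 'perl';
- Finally, the same document must contain the word 'code' in any field.
The OR operator takes precedence over AND, so "looking for cat | dog | mouse" means "looking for (cat | dog | mouse)" rather than "(looking for cat) | dog | mouse".
To understand how a query will be executed, Manticore Search provides query profiling tools for examining the query tree generated by a query expression.
To enable full-text query profiling with an SQL statement, you must activate it before executing the desired query: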
SET profiling =1;
SELECT * FROM test WHERE MATCH('@title abc* @body hey');
To view the query tree, run the SHOW PLAN command immediately after running the query:
SHOW PLAN;
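This command returns the structure of the executed query. Keep in mind that the 3 statements - SET profiling, the query, and SHOW - must be executed within the same session.
When using the HTTP JSON protocol, you can simply set "profile":true to get the full-text query tree structure in the response.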
{
"table":"test",
"profile":true,
"query":
{
"match_phrase": { "_all" : "had grown quite" }
}
}
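The response will include a profile object containing a query member.
The query property holds the transformed full-text query tree. Each node consists of:
- type: the node type, which can be AND, OR, PHRASE, KEYWORD, etc.
- description: the query subtree for this node, represented as a string (in SHOW PLAN format)
- children: any child nodes, if present
- max_field_pos: the maximum position within a field
A keyword node additionally includes:
- word: the transformed keyword.
- querypos: the position of this keyword in the query.
- excluded: the keyword is excluded from the query.
- expanded: the keyword was added by prefix expansion.
- field_start: the keyword must appear at the beginning of the field.
- field_end: the keyword must appear at the end of the field.
- boost: the keyword's IDF will be multiplied by this value.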
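Because every node uses the same type/children layout, the tree can be processed with a simple recursive walk. The sketch below collects the keyword nodes from a profile object; the sample data mirrors the shape of a profile for the query "i me", and collect_keywords is an illustrative helper rather than a client API.

```python
def collect_keywords(node):
    """Return (word, querypos) pairs for every KEYWORD node under the given node."""
    found = []
    if node.get("type") == "KEYWORD":
        found.append((node["word"], node["querypos"]))
    for child in node.get("children", []):
        found.extend(collect_keywords(child))
    return found

# Sample profile in the node layout described above.
profile = {
    "query": {
        "type": "AND",
        "description": "AND( AND(KEYWORD(i, querypos=1)), AND(KEYWORD(me, querypos=2)))",
        "children": [
            {"type": "AND", "description": "AND(KEYWORD(i, querypos=1))",
             "children": [{"type": "KEYWORD", "word": "i", "querypos": 1}]},
            {"type": "AND", "description": "AND(KEYWORD(me, querypos=2))",
             "children": [{"type": "KEYWORD", "word": "me", "querypos": 2}]},
        ],
    }
}

keywords = collect_keywords(profile["query"])
print(keywords)
```

The same walk can be extended to report the expanded or excluded flags when they are present on a keyword node.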
- SQL
- JSON
- PHP
- Python
- Python-asyncio
- javascript
- Java
- C#
- Rust
- TypeScript
- Go
SET profiling=1;
SELECT * FROM test WHERE MATCH('@title abc* @body hey');
SHOW PLAN \G

POST /search
{
"table": "forum",
"query": {"query_string": "i me"},
"_source": { "excludes":["*"] },
"limit": 1,
"profile":true
}

$result = $index->search('i me')->setSource(['excludes'=>['*']])->setLimit(1)->profile()->get();
print_r($result->getProfile());

searchApi.search({"table":"forum","query":{"query_string":"i me"},"_source":{"excludes":["*"]},"limit":1,"profile":True})

await searchApi.search({"table":"forum","query":{"query_string":"i me"},"_source":{"excludes":["*"]},"limit":1,"profile":True})

res = await searchApi.search({"table":"forum","query":{"query_string":"i me"},"_source":{"excludes":["*"]},"limit":1,"profile":true});

query = new HashMap<String,Object>();
query.put("query_string","i me");
searchRequest = new SearchRequest();
searchRequest.setIndex("forum");
searchRequest.setQuery(query);
searchRequest.setProfile(true);
searchRequest.setLimit(1);
searchRequest.setSort(new ArrayList<String>(){{
add("*");
}});
searchResponse = searchApi.search(searchRequest);

object query = new { query_string="i me" };
var searchRequest = new SearchRequest("forum", query);
searchRequest.Profile = true;
searchRequest.Limit = 1;
searchRequest.Sort = new List<Object> { "*" };
var searchResponse = searchApi.Search(searchRequest);

let query = SearchQuery {
query_string: Some(serde_json::json!("i me").into()),
..Default::default()
};
let search_req = SearchRequest {
table: "forum".to_string(),
query: Some(Box::new(query)),
sort: serde_json::json!(["*"]),
limit: serde_json::json!(1),
profile: serde_json::json!(true),
..Default::default()
};
let search_res = search_api.search(search_req).await;

res = await searchApi.search({
index: 'test',
query: { query_string: 'Text' },
_source: { excludes: ['*'] },
limit: 1,
profile: true
});

searchRequest := manticoresearch.NewSearchRequest("test")
query := map[string]interface{} {"query_string": "Text"}
source := map[string]interface{} { "excludes": []string {"*"} }
searchRequest.SetQuery(query)
searchRequest.SetSource(source)
searchRequest.SetLimit(1)
searchRequest.SetProfile(true)
res, _, _ := apiClient.SearchAPI.Search(context.Background()).SearchRequest(*searchRequest).Execute()

*************************** 1. row ***************************
Variable: transformed_tree
Value: AND(
OR(fields=(title), KEYWORD(abcx, querypos=1, expanded), KEYWORD(abcm, querypos=1, expanded)),
AND(fields=(body), KEYWORD(hey, querypos=2)))
1 row in set (0.00 sec)

{
"took":1503,
"timed_out":false,
"hits":
{
"total":406301,
"hits":
[
{
"_id": 406443,
"_score":3493,
"_source":{}
}
]
},
"profile":
{
"query":
{
"type":"AND",
"description":"AND( AND(KEYWORD(i, querypos=1)), AND(KEYWORD(me, querypos=2)))",
"children":
[
{
"type":"AND",
"description":"AND(KEYWORD(i, querypos=1))",
"children":
[
{
"type":"KEYWORD",
"word":"i",
"querypos":1
}
]
},
{
"type":"AND",
"description":"AND(KEYWORD(me, querypos=2))",
"children":
[
{
"type":"KEYWORD",
"word":"me",
"querypos":2
}
]
}
]
}
}
}

Array
(
[query] => Array
(
[type] => AND
[description] => AND( AND(KEYWORD(i, querypos=1)), AND(KEYWORD(me, querypos=2)))
[children] => Array
(
[0] => Array
(
[type] => AND
[description] => AND(KEYWORD(i, querypos=1))
[children] => Array
(
[0] => Array
(
[type] => KEYWORD
[word] => i
[querypos] => 1
)
)
)
[1] => Array
(
[type] => AND
[description] => AND(KEYWORD(me, querypos=2))
[children] => Array
(
[0] => Array
(
[type] => KEYWORD
[word] => me
[querypos] => 2
)
)
)
)
)
)

{'hits': {'hits': [{u'_id': u'100', u'_score': 2500, u'_source': {}}],
'total': 1},
'profile': {u'query': {u'children': [{u'children': [{u'querypos': 1,
u'type': u'KEYWORD',
u'word': u'i'}],
u'description': u'AND(KEYWORD(i, querypos=1))',
u'type': u'AND'},
{u'children': [{u'querypos': 2,
u'type': u'KEYWORD',
u'word': u'me'}],
u'description': u'AND(KEYWORD(me, querypos=2))',
u'type': u'AND'}],
u'description': u'AND( AND(KEYWORD(i, querypos=1)), AND(KEYWORD(me, querypos=2)))',
u'type': u'AND'}},
'timed_out': False,
'took': 0}

{'hits': {'hits': [{u'_id': u'100', u'_score': 2500, u'_source': {}}],
'total': 1},
'profile': {u'query': {u'children': [{u'children': [{u'querypos': 1,
u'type': u'KEYWORD',
u'word': u'i'}],
u'description': u'AND(KEYWORD(i, querypos=1))',
u'type': u'AND'},
{u'children': [{u'querypos': 2,
u'type': u'KEYWORD',
u'word': u'me'}],
u'description': u'AND(KEYWORD(me, querypos=2))',
u'type': u'AND'}],
u'description': u'AND( AND(KEYWORD(i, querypos=1)), AND(KEYWORD(me, querypos=2)))',
u'type': u'AND'}},
'timed_out': False,
'took': 0}

{"hits": {"hits": [{"_id": 100, "_score": 2500, "_source": {}}],
"total": 1},
"profile": {"query": {"children": [{"children": [{"querypos": 1,
"type": "KEYWORD",
"word": "i"}],
"description": "AND(KEYWORD(i, querypos=1))",
"type": "AND"},
{"children": [{"querypos": 2,
"type": "KEYWORD",
"word": "me"}],
"description": "AND(KEYWORD(me, querypos=2))",
"type": "AND"}],
"description": "AND( AND(KEYWORD(i, querypos=1)), AND(KEYWORD(me, querypos=2)))",
"type": "AND"}},
"timed_out": False,
"took": 0}

class SearchResponse {
took: 18
timedOut: false
hits: class SearchResponseHits {
total: 1
hits: [{_id=100, _score=2500, _source={}}]
aggregations: null
}
profile: {query={type=AND, description=AND( AND(KEYWORD(i, querypos=1)), AND(KEYWORD(me, querypos=2))), children=[{type=AND, description=AND(KEYWORD(i, querypos=1)), children=[{type=KEYWORD, word=i, querypos=1}]}, {type=AND, description=AND(KEYWORD(me, querypos=2)), children=[{type=KEYWORD, word=me, querypos=2}]}]}}
}

class SearchResponse {
took: 18
timedOut: false
hits: class SearchResponseHits {
total: 1
hits: [{_id=100, _score=2500, _source={}}]
aggregations: null
}
profile: {query={type=AND, description=AND( AND(KEYWORD(i, querypos=1)), AND(KEYWORD(me, querypos=2))), children=[{type=AND, description=AND(KEYWORD(i, querypos=1)), children=[{type=KEYWORD, word=i, querypos=1}]}, {type=AND, description=AND(KEYWORD(me, querypos=2)), children=[{type=KEYWORD, word=me, querypos=2}]}]}}
}

class SearchResponse {
took: 18
timedOut: false
hits: class SearchResponseHits {
total: 1
hits: [{_id=100, _score=2500, _source={}}]
aggregations: null
}
profile: {query={type=AND, description=AND( AND(KEYWORD(i, querypos=1)), AND(KEYWORD(me, querypos=2))), children=[{type=AND, description=AND(KEYWORD(i, querypos=1)), children=[{type=KEYWORD, word=i, querypos=1}]}, {type=AND, description=AND(KEYWORD(me, querypos=2)), children=[{type=KEYWORD, word=me, querypos=2}]}]}}
}

{
"hits":
{
"hits":
[{
"_id": 1,
"_score": 1480,
"_source": {}
}],
"total": 1
},
"profile":
{
"query": {
"children":
[{
"children":
[{
"querypos": 1,
"type": "KEYWORD",
"word": "i"
}],
"description": "AND(KEYWORD(i, querypos=1))",
"type": "AND"
},
{
"children":
[{
"querypos": 2,
"type": "KEYWORD",
"word": "me"
}],
"description": "AND(KEYWORD(me, querypos=2))",
"type": "AND"
}],
"description": "AND( AND(KEYWORD(i, querypos=1)), AND(KEYWORD(me, querypos=2)))",
"type": "AND"
}
},
"timed_out": False,
"took": 0
}

{
"hits":
{
"hits":
[{
"_id": 1,
"_score": 1480,
"_source": {}
}],
"total": 1
},
"profile":
{
"query": {
"children":
[{
"children":
[{
"querypos": 1,
"type": "KEYWORD",
"word": "i"
}],
"description": "AND(KEYWORD(i, querypos=1))",
"type": "AND"
},
{
"children":
[{
"querypos": 2,
"type": "KEYWORD",
"word": "me"
}],
"description": "AND(KEYWORD(me, querypos=2))",
"type": "AND"
}],
"description": "AND( AND(KEYWORD(i, querypos=1)), AND(KEYWORD(me, querypos=2)))",
"type": "AND"
}
},
"timed_out": False,
"took": 0
}

In some cases, the evaluated query tree can differ significantly from the original one due to expansions and other transformations.
- SQL
- JSON
- PHP
- Python
- Python-asyncio
- javascript
- Java
- C#
- Rust
- TypeScript
- Go
SET profiling=1;
SELECT id FROM forum WHERE MATCH('@title way* @content hey') LIMIT 1;
SHOW PLAN;

POST /search
{
"table": "forum",
"query": {"query_string": "@title way* @content hey"},
"_source": { "excludes":["*"] },
"limit": 1,
"profile":true
}

$result = $index->search('@title way* @content hey')->setSource(['excludes'=>['*']])->setLimit(1)->profile()->get();
print_r($result->getProfile());

searchApi.search({"table":"forum","query":{"query_string":"@title way* @content hey"},"_source":{"excludes":["*"]},"limit":1,"profile":True})

await searchApi.search({"table":"forum","query":{"query_string":"@title way* @content hey"},"_source":{"excludes":["*"]},"limit":1,"profile":True})

res = await searchApi.search({"table":"forum","query":{"query_string":"@title way* @content hey"},"_source":{"excludes":["*"]},"limit":1,"profile":true});

query = new HashMap<String,Object>();
query.put("query_string","@title way* @content hey");
searchRequest = new SearchRequest();
searchRequest.setIndex("forum");
searchRequest.setQuery(query);
searchRequest.setProfile(true);
searchRequest.setLimit(1);
searchRequest.setSort(new ArrayList<String>(){{
add("*");
}});
searchResponse = searchApi.search(searchRequest);

object query = new { query_string="@title way* @content hey" };
var searchRequest = new SearchRequest("forum", query);
searchRequest.Profile = true;
searchRequest.Limit = 1;
searchRequest.Sort = new List<Object> { "*" };
var searchResponse = searchApi.Search(searchRequest);

let query = SearchQuery {
query_string: Some(serde_json::json!("@title way* @content hey").into()),
..Default::default()
};
let search_req = SearchRequest {
table: "forum".to_string(),
query: Some(Box::new(query)),
sort: serde_json::json!(["*"]),
limit: serde_json::json!(1),
profile: serde_json::json!(true),
..Default::default()
};
let search_res = search_api.search(search_req).await;

res = await searchApi.search({
index: 'test',
query: { query_string: '@content 1'},
_source: { excludes: ["*"] },
limit:1,
profile: true
});

searchRequest := manticoresearch.NewSearchRequest("test")
query := map[string]interface{} {"query_string": "1*"}
source := map[string]interface{} { "excludes": []string {"*"} }
searchRequest.SetQuery(query)
searchRequest.SetSource(source)
searchRequest.SetLimit(1)
searchRequest.SetProfile(true)
res, _, _ := apiClient.SearchAPI.Search(context.Background()).SearchRequest(*searchRequest).Execute()

Query OK, 0 rows affected (0.00 sec)
+--------+
| id |
+--------+
| 711651 |
+--------+
1 row in set (0.04 sec)
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Variable | Value |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| transformed_tree | AND(
OR(
OR(
AND(fields=(title), KEYWORD(wayne, querypos=1, expanded)),
OR(
AND(fields=(title), KEYWORD(ways, querypos=1, expanded)),
AND(fields=(title), KEYWORD(wayyy, querypos=1, expanded)))),
AND(fields=(title), KEYWORD(way, querypos=1, expanded)),
OR(fields=(title), KEYWORD(way*, querypos=1, expanded))),
AND(fields=(content), KEYWORD(hey, querypos=2))) |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

{
"took":33,
"timed_out":false,
"hits":
{
"total":105,
"hits":
[
{
"_id": 711651,
"_score":2539,
"_source":{}
}
]
},
"profile":
{
"query":
{
"type":"AND",
"description":"AND( OR( OR( AND(fields=(title), KEYWORD(wayne, querypos=1, expanded)), OR( AND(fields=(title), KEYWORD(ways, querypos=1, expanded)), AND(fields=(title), KEYWORD(wayyy, querypos=1, expanded)))), AND(fields=(title), KEYWORD(way, querypos=1, expanded)), OR(fields=(title), KEYWORD(way*, querypos=1, expanded))), AND(fields=(content), KEYWORD(hey, querypos=2)))",
"children":
[
{
"type":"OR",
"description":"OR( OR( AND(fields=(title), KEYWORD(wayne, querypos=1, expanded)), OR( AND(fields=(title), KEYWORD(ways, querypos=1, expanded)), AND(fields=(title), KEYWORD(wayyy, querypos=1, expanded)))), AND(fields=(title), KEYWORD(way, querypos=1, expanded)), OR(fields=(title), KEYWORD(way*, querypos=1, expanded)))",
"children":
[
{
"type":"OR",
"description":"OR( AND(fields=(title), KEYWORD(wayne, querypos=1, expanded)), OR( AND(fields=(title), KEYWORD(ways, querypos=1, expanded)), AND(fields=(title), KEYWORD(wayyy, querypos=1, expanded))))",
"children":
[
{
"type":"AND",
"description":"AND(fields=(title), KEYWORD(wayne, querypos=1, expanded))",
"fields":["title"],
"max_field_pos":0,
"children":
[
{
"type":"KEYWORD",
"word":"wayne",
"querypos":1,
"expanded":true
}
]
},
{
"type":"OR",
"description":"OR( AND(fields=(title), KEYWORD(ways, querypos=1, expanded)), AND(fields=(title), KEYWORD(wayyy, querypos=1, expanded)))",
"children":
[
{
"type":"AND",
"description":"AND(fields=(title), KEYWORD(ways, querypos=1, expanded))",
"fields":["title"],
"max_field_pos":0,
"children":
[
{
"type":"KEYWORD",
"word":"ways",
"querypos":1,
"expanded":true
}
]
},
{
"type":"AND",
"description":"AND(fields=(title), KEYWORD(wayyy, querypos=1, expanded))",
"fields":["title"],
"max_field_pos":0,
"children":
[
{
"type":"KEYWORD",
"word":"wayyy",
"querypos":1,
"expanded":true
}
]
}
]
}
]
},
{
"type":"AND",
"description":"AND(fields=(title), KEYWORD(way, querypos=1, expanded))",
"fields":["title"],
"max_field_pos":0,
"children":
[
{
"type":"KEYWORD",
"word":"way",
"querypos":1,
"expanded":true
}
]
},
{
"type":"OR",
"description":"OR(fields=(title), KEYWORD(way*, querypos=1, expanded))",
"fields":["title"],
"max_field_pos":0,
"children":
[
{
"type":"KEYWORD",
"word":"way*",
"querypos":1,
"expanded":true
}
]
}
]
},
{
"type":"AND",
"description":"AND(fields=(content), KEYWORD(hey, querypos=2))",
"fields":["content"],
"max_field_pos":0,
"children":
[
{
"type":"KEYWORD",
"word":"hey",
"querypos":2
}
]
}
]
}
}
}

Array
(
[query] => Array
(
[type] => AND
[description] => AND( OR( OR( AND(fields=(title), KEYWORD(wayne, querypos=1, expanded)), OR( AND(fields=(title), KEYWORD(ways, querypos=1, expanded)), AND(fields=(title), KEYWORD(wayyy, querypos=1, expanded)))), AND(fields=(title), KEYWORD(way, querypos=1, expanded)), OR(fields=(title), KEYWORD(way*, querypos=1, expanded))), AND(fields=(content), KEYWORD(hey, querypos=2)))
[children] => Array
(
[0] => Array
(
[type] => OR
[description] => OR( OR( AND(fields=(title), KEYWORD(wayne, querypos=1, expanded)), OR( AND(fields=(title), KEYWORD(ways, querypos=1, expanded)), AND(fields=(title), KEYWORD(wayyy, querypos=1, expanded)))), AND(fields=(title), KEYWORD(way, querypos=1, expanded)), OR(fields=(title), KEYWORD(way*, querypos=1, expanded)))
[children] => Array
(
[0] => Array
(
[type] => OR
[description] => OR( AND(fields=(title), KEYWORD(wayne, querypos=1, expanded)), OR( AND(fields=(title), KEYWORD(ways, querypos=1, expanded)), AND(fields=(title), KEYWORD(wayyy, querypos=1, expanded))))
[children] => Array
(
[0] => Array
(
[type] => AND
[description] => AND(fields=(title), KEYWORD(wayne, querypos=1, expanded))
[fields] => Array
(
[0] => title
)
[max_field_pos] => 0
[children] => Array
(
[0] => Array
(
[type] => KEYWORD
[word] => wayne
[querypos] => 1
[expanded] => 1
)
)
)
[1] => Array
(
[type] => OR
[description] => OR( AND(fields=(title), KEYWORD(ways, querypos=1, expanded)), AND(fields=(title), KEYWORD(wayyy, querypos=1, expanded)))
[children] => Array
(
[0] => Array
(
[type] => AND
[description] => AND(fields=(title), KEYWORD(ways, querypos=1, expanded))
[fields] => Array
(
[0] => title
)
[max_field_pos] => 0
[children] => Array
(
[0] => Array
(
[type] => KEYWORD
[word] => ways
[querypos] => 1
[expanded] => 1
)
)
)
[1] => Array
(
[type] => AND
[description] => AND(fields=(title), KEYWORD(wayyy, querypos=1, expanded))
[fields] => Array
(
[0] => title
)
[max_field_pos] => 0
[children] => Array
(
[0] => Array
(
[type] => KEYWORD
[word] => wayyy
[querypos] => 1
[expanded] => 1
)
)
)
)
)
)
)
[1] => Array
(
[type] => AND
[description] => AND(fields=(title), KEYWORD(way, querypos=1, expanded))
[fields] => Array
(
[0] => title
)
[max_field_pos] => 0
[children] => Array
(
[0] => Array
(
[type] => KEYWORD
[word] => way
[querypos] => 1
[expanded] => 1
)
)
)
[2] => Array
(
[type] => OR
[description] => OR(fields=(title), KEYWORD(way*, querypos=1, expanded))
[fields] => Array
(
[0] => title
)
[max_field_pos] => 0
[children] => Array
(
[0] => Array
(
[type] => KEYWORD
[word] => way*
[querypos] => 1
[expanded] => 1
)
)
)
)
)
[1] => Array
(
[type] => AND
[description] => AND(fields=(content), KEYWORD(hey, querypos=2))
[fields] => Array
(
[0] => content
)
[max_field_pos] => 0
[children] => Array
(
[0] => Array
(
[type] => KEYWORD
[word] => hey
[querypos] => 2
)
)
)
)
)
)
{'hits': {'hits': [{u'_id': u'2811025403043381551',
u'_score': 2643,
u'_source': {}}],
'total': 1},
'profile': {u'query': {u'children': [{u'children': [{u'expanded': True,
u'querypos': 1,
u'type': u'KEYWORD',
u'word': u'way*'}],
u'description': u'AND(fields=(title), KEYWORD(way*, querypos=1, expanded))',
u'fields': [u'title'],
u'type': u'AND'},
{u'children': [{u'querypos': 2,
u'type': u'KEYWORD',
u'word': u'hey'}],
u'description': u'AND(fields=(content), KEYWORD(hey, querypos=2))',
u'fields': [u'content'],
u'type': u'AND'}],
u'description': u'AND( AND(fields=(title), KEYWORD(way*, querypos=1, expanded)), AND(fields=(content), KEYWORD(hey, querypos=2)))',
u'type': u'AND'}},
'timed_out': False,
'took': 0}
{"hits": {"hits": [{"_id": 2811025403043381551,
"_score": 2643,
"_source": {}}],
"total": 1},
"profile": {"query": {"children": [{"children": [{"expanded": True,
"querypos": 1,
"type": "KEYWORD",
"word": "way*"}],
"description": "AND(fields=(title), KEYWORD(way*, querypos=1, expanded))",
"fields": ["title"],
"type": "AND"},
{"children": [{"querypos": 2,
"type": "KEYWORD",
"word": "hey"}],
"description": "AND(fields=(content), KEYWORD(hey, querypos=2))",
"fields": ["content"],
"type": "AND"}],
"description": "AND( AND(fields=(title), KEYWORD(way*, querypos=1, expanded)), AND(fields=(content), KEYWORD(hey, querypos=2)))",
"type": "AND"}},
"timed_out": False,
"took": 0}
class SearchResponse {
took: 18
timedOut: false
hits: class SearchResponseHits {
total: 1
hits: [{_id=2811025403043381551, _score=2643, _source={}}]
aggregations: null
}
profile: {query={type=AND, description=AND( AND(fields=(title), KEYWORD(way*, querypos=1, expanded)), AND(fields=(content), KEYWORD(hey, querypos=2))), children=[{type=AND, description=AND(fields=(title), KEYWORD(way*, querypos=1, expanded)), fields=[title], children=[{type=KEYWORD, word=way*, querypos=1, expanded=true}]}, {type=AND, description=AND(fields=(content), KEYWORD(hey, querypos=2)), fields=[content], children=[{type=KEYWORD, word=hey, querypos=2}]}]}}
}
{
"hits":
{
"hits":
[{
"_id": 1,
"_score": 1480,
"_source": {}
}],
"total": 1
},
"profile":
{
"query":
{
"children":
[{
"children":
[{
"expanded": True,
"querypos": 1,
"type": "KEYWORD",
"word": "1*"
}],
"description": "AND(fields=(content), KEYWORD(1*, querypos=1, expanded))",
"fields": ["content"],
"type": "AND"
}],
"description": "AND(fields=(content), KEYWORD(1*, querypos=1))",
"type": "AND"
}},
"timed_out": False,
"took": 0
}
The SQL statement EXPLAIN QUERY displays the execution tree of a given full-text query without running an actual search against the table.
- SQL
EXPLAIN QUERY index_base '@title running @body dog'\G
*************************** 1. row ***************************
Variable: transformed_tree
Value: AND(
OR(
AND(fields=(title), KEYWORD(run, querypos=1, morphed)),
AND(fields=(title), KEYWORD(running, querypos=1, morphed))))
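AND(fields=(body), KEYWORD(dog, querypos=2, morphed)))
EXPLAIN QUERY ... option format=dot displays the execution tree of the given full-text query in a hierarchical format suitable for visualization with existing tools, for example https://dreampuf.github.io/GraphvizOnline: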

- SQL
EXPLAIN QUERY tbl 'i me' option format=dot\G
*************************** 1. row ***************************
Variable: transformed_tree
Value: digraph "transformed_tree"
{
0 [shape=record,style=filled,bgcolor="lightgrey" label="AND"]
0 -> 1
1 [shape=record,style=filled,bgcolor="lightgrey" label="AND"]
1 -> 2
2 [shape=record label="i | { querypos=1 }"]
0 -> 3
3 [shape=record,style=filled,bgcolor="lightgrey" label="AND"]
3 -> 4
4 [shape=record label="me | { querypos=2 }"]
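}
When the expression ranker is used, the computed factor values can be displayed with the PACKEDFACTORS() function.
The function returns:
- the values of document-level factors (such as bm25, field_mask, doc_word_count)
- a list of each field that produced a hit (including lcs, hit_count, word_count, sum_idf, min_hit_pos, and others)
- a list of each keyword from the query along with its tf and idf values
These values can be used to understand why certain documents receive lower or higher scores in a search, or to refine an existing ranking expression.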
- SQL
SELECT id, PACKEDFACTORS() FROM test1 WHERE MATCH('test one') OPTION ranker=expr('1')\G
id: 1
packedfactors(): bm25=569, bm25a=0.617197, field_mask=2, doc_word_count=2,
field1=(lcs=1, hit_count=2, word_count=2, tf_idf=0.152356,
min_idf=-0.062982, max_idf=0.215338, sum_idf=0.152356, min_hit_pos=4,
min_best_span_pos=4, exact_hit=0, max_window_hits=1, min_gaps=2,
exact_order=1, lccs=1, wlccs=0.215338, atc=-0.003974),
word0=(tf=1, idf=-0.062982),
word1=(tf=1, idf=0.215338)
1 row in set (0.00 sec)
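To compare factor values across documents, it can help to turn the textual PACKEDFACTORS() output into a structured value. The following is a minimal sketch in Python, not part of any Manticore client: the function name parse_packedfactors is hypothetical, and it assumes the plain-text layout shown above (name=value pairs, with per-field and per-keyword groups written as name=(...)). It also assumes that groups named word0, word1, and so on are keywords and every other group is a field, which would misclassify a field that is itself named like word0.

```python
import re

# Grouped entries look like name=(k=v, k=v, ...)
GROUP_RE = re.compile(r"(\w+)=\(([^)]*)\)")
# Plain entries look like name=number (integer or decimal, possibly negative)
PAIR_RE = re.compile(r"(\w+)=(-?\d+(?:\.\d+)?)")


def _pairs(chunk):
    """Yield (name, number) pairs found in a flat 'k=v, k=v' chunk."""
    for key, value in PAIR_RE.findall(chunk):
        yield key, float(value) if "." in value else int(value)


def parse_packedfactors(text):
    """Parse textual PACKEDFACTORS() output into nested dicts (hypothetical helper)."""
    text = " ".join(text.split())  # collapse line breaks and indentation
    result = {"doc": {}, "fields": {}, "words": {}}
    for name, body in GROUP_RE.findall(text):
        # Assumption: word<N> groups are keywords, all other groups are fields.
        target = "words" if re.fullmatch(r"word\d+", name) else "fields"
        result[target][name] = dict(_pairs(body))
    # Whatever is left once the groups are removed is document-level.
    result["doc"] = dict(_pairs(GROUP_RE.sub("", text)))
    return result


sample = """bm25=569, bm25a=0.617197, field_mask=2, doc_word_count=2,
field1=(lcs=1, hit_count=2, word_count=2, tf_idf=0.152356,
min_idf=-0.062982, max_idf=0.215338, sum_idf=0.152356, min_hit_pos=4,
min_best_span_pos=4, exact_hit=0, max_window_hits=1, min_gaps=2,
exact_order=1, lccs=1, wlccs=0.215338, atc=-0.003974),
word0=(tf=1, idf=-0.062982),
word1=(tf=1, idf=0.215338)"""

factors = parse_packedfactors(sample)
print(factors["doc"])
print(factors["fields"]["field1"]["hit_count"])
print(factors["words"])
```

With the values in a dictionary, the factors of two documents can be compared key by key to see which one drives the difference in score.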