Searching > Full text matching > Profiling

The query string can contain certain operators that allow telling the conditions of how the words from the query string should be matched.

There always is implicit AND operator, so "hello world" means that both "hello" and "world" must be present in matching document.

hello  world

OR operator precedence is higher than AND, so looking for cat | dog | mouse means looking for ( cat | dog | mouse ) and not (looking for cat) | dog | mouse.

hello | world

hello MAYBE world

MAYBE operator works much like operator | but doesn't return documents which match only right subtree expression.

hello -world
hello !world

The negation operator enforces a rule for a word to not exist.

Queries having only negations are not supported by default in Manticore Search. There's the server option not_terms_only_allowed to enable it.

@title hello @body world

Field limit operator limits subsequent searching to a given field. Normally, query will fail with an error message if given field name does not exist in the searched index. However, that can be suppressed by specifying @@relaxed option at the very beginning of the query:

@@relaxed @nosuchfield my query

This can be helpful when searching through heterogeneous indexes with different schemas.

Field position limit additionally restricts the searching to first N position within given field (or fields). For example, @body [50] hello will not match the documents where the keyword hello occurs at position 51 and below in the body.

@body[50] hello

Multiple-field search operator:

@(title,body) hello world

Ignore field search operator (will ignore any matches of 'hello world' from field 'title'):

@!title hello world

Ignore multiple-field search operator (if we have fields title, subject and body then @!(title) is equivalent to @(subject,body)):

@!(title,body) hello world

All-field search operator:

@* hello

"hello world"

The phrase operator requires the words to be next to each other.

The phrase search operator may include a match any term modifier. Terms within the phrase operator are position significant. When the 'match any term' modifier is implemented, the position of the subsequent terms from that phrase query will be shifted. Therefore, 'match any' has no impact on search performance.

"exact * phrase * * for terms"

"hello world"~10

Proximity distance is specified in words, adjusted for word count, and applies to all words within quotes. For instance, "cat dog mouse"~5 query means that there must be less than 8-word span which contains all 3 words, ie. CAT aaa bbb ccc DOG eee fff MOUSE document will not match this query, because this span is exactly 8 words long.

"the world is a wonderful place"/3

Quorum matching operator introduces a kind of fuzzy matching. It will only match those documents that pass a given threshold of given words. The example above ("the world is a wonderful place"/3) will match all documents that have at least 3 of the 6 specified words. Operator is limited to 255 keywords. Instead of an absolute number, you can also specify a number between 0.0 and 1.0 (standing for 0% and 100%), and Manticore will match only documents with at least the specified percentage of given words. The same example above could also have been written "the world is a wonderful place"/0.5 and it would match documents with at least 50% of the 6 words.

aaa << bbb << ccc

Strict order operator (aka operator "before") will match the document only if its argument keywords occur in the document exactly in the query order. For instance, black << cat query (without quotes) will match the document "black and white cat" but not the "that cat was black" document. Order operator has the lowest priority. It can be applied both to just keywords and more complex expressions, ie. this is a valid query:

(bag of words) << "exact phrase" << red|green|blue

raining =cats and =dogs
="exact phrase"

Exact form keyword modifier will match the document only if the keyword occurred in exactly the specified form. The default behavior is to match the document if the stemmed keyword matches. For instance, "runs" query will match both the document that contains "runs" and the document that contains "running", because both forms stem to just "run" while =runs query will only match the first document. Exact form operator requires index_exact_words option to be enabled.

This is a modifier that affects the keyword and thus can be used within operators such as phrase, proximity, and quorum operators. It is possible to apply an exact form modifier to the phrase operator. It's really just syntax sugar - it adds an exact form modifier to all terms contained within the phrase.

nation* *nation* *national

Requires min_infix_len for prefix (expansion in trail) and/or sufix (expansion in head). If only prefixing is wanted, min_prefix_len can be used instead.

The search will try to find all the expansions of the wildcarded tokens and each expansion is recorded as a matched hit. The number of expansions for a token can be controlled with expansion_limit index setting. Wildcarded tokens can have a big impact on the query search time, especially when tokens have short length. In such cases is desired to use the expansion limit.

The wildcard operator can be automatically applied if expand_keywords index setting is used.

In addition, inline wildcard operators are supported:

? can match any(one) character: t?st will match test, but not teast
% can match zero or one character : tes% will match tes or test, but not testing

The inline operators require dict=keywords and infixing enabled.

^hello world$

Field-start and field-end keyword modifiers will make the keyword match only if it occurred at the very start or the very end of a fulltext field, respectively. For instance, the query "^hello world$" (with quotes and thus combining phrase operator and start/end modifiers) will only match documents that contain at least one field that has exactly these two keywords.

boosted^1.234 boostedfieldend$^1.234

The boost modifier increases the word IDF score by the specified factor in ranking scores that use IDF in their formula. It does not affect the matching in any way.

hello NEAR/3 world NEAR/4 "my test"

NEAR operator is a generalized version of a proximity operator. The syntax is NEAR/N, it is case-sensitive, and no spaces are allowed between the NEAR keyword, the slash sign, and the distance value.

The original proximity operator only worked on sets of keywords. NEAR is more generic and can accept arbitrary subexpressions as its two arguments, matching the document when both subexpressions are found within N words of each other, no matter in which order. NEAR is left associative and has the same (lowest) precedence as BEFORE.

You should also note how a (one NEAR/7 two NEAR/7 three) query using NEAR is not really equivalent to a "one two three"~7 one using keyword proximity operator. The difference here is that the proximity operator allows for up to 6 non-matching words between all the 3 matching words, but the version with NEAR is less restrictive: it would allow for up to 6 words between 'one' and 'two' and then for up to 6 more between that two-word matching and a 'three' keyword.

Church NOTNEAR/3 street

Operator NOTNEAR is a negative assertion. It matches the document when left argument exists and either there is no right argument in document or right argument is distance away from left matched argument's end. The distance is specified in words. The syntax is NOTNEAR/N, it is case-sensitive, and no spaces are allowed between the NOTNEAR keyword, the slash sign, and the distance value. Both arguments of this operator might be terms or any operators or group of operators.

all SENTENCE words SENTENCE "in one sentence"

"Bill Gates" PARAGRAPH "Steve Jobs"

SENTENCE and PARAGRAPH operators matches the document when both its arguments are within the same sentence or the same paragraph of text, respectively. The arguments can be either keywords, or phrases, or the instances of the same operator.

The order of the arguments within the sentence or paragraph does not matter. These operators only work on indexes built with index_sp (sentence and paragraph indexing feature) enabled, and revert to a mere AND otherwise. Refer to the index_sp directive documentation for the notes on what's considered a sentence and a paragraph.

ZONE:(h3,h4)
only in these titles

ZONE limit operator is quite similar to field limit operator, but restricts matching to a given in-field zone or a list of zones. Note that the subsequent subexpressions are not required to match in a single contiguous span of a given zone, and may match in multiple spans. For instance, (ZONE:th hello world) query will match this example document:

<th>Table 1. Local awareness of Hello Kitty brand.</th>
.. some table data goes here ..
<th>Table 2. World-wide brand awareness.</th>

ZONE operator affects the query until the next field or ZONE limit operator, or the closing parenthesis. It only works on the indexes built with zones support (see index_zones) and will be ignored otherwise.

ZONESPAN:(h2)
only in a (single) title

ZONESPAN limit operator is similar to the ZONE operator, but requires the match to occur in a single contiguous span. In the example above, ZONESPAN:th hello world would not match the document, since "hello" and "world" do not occur within the same span.

Escaping

Last modified: September 18, 2020

As some characters are used as operators in the query string, they should be escaped to avoid query errors or unwanted matching conditions.

The following characters should be escaped using backslash (\):

!    "    $    '    (    )    -    /    <    @    \    ^    |    ~

Single quote (') can be escaped using one backslash:

SELECT * FROM your_index WHERE MATCH('l\'italiano');

The rest of the characters in the above list are operators or query constructs. In order to treat them as simple characters, the engine has to see them with a preceding escape character. The backslash must also be escaped itself, resulting in 2 backslahes:

SELECT * FROM your_index WHERE MATCH('r\\&b | \\(official video\\)');

If we want to use backslash as character, we need to escape both the backslash as character and backslash as the escape operator, resulting in 4 backslashes used:

SELECT * FROM your_index WHERE MATCH('\\\\ABC');

MySQL drivers offer escaping functions (e.g., mysqli_real_escape_string in PHP or conn.escape_string in Python), but they only escape certain characters. You will still need to add escaping for the characters from the above list that are not escaped by the respective functions. Since these functions will escape backslash for you, you only need to add one backslash.

This also applies for the drivers that support (client-side) prepared statements. For example for PHP PDO prepared statements, you need to add a backslash for $ character:

$statement = $ln_sph->prepare( "SELECT * FROM index WHERE MATCH(:match)");
$match = '\$manticore';
$statement->bindParam(':match',$match,PDO::PARAM_STR);
$results = $statement->execute();

It will make the final query SELECT * FROM index WHERE MATCH('\\$manticore');

The same rules for SQL protocol apply with the exception that for JSON the double quote must be escaped with a single backslash, while the rest of the characters must be double escaped.

If using JSON libraries/functions that convert data structures to JSON strings the double quote and single backslash are escaped automatically by these functions and must not be explicitly escaped.

The new official clients (which use HTTP as protocol) are using under the hood common JSON libraries/functions available on the respective programming language. Same rules of escaping as above are applied.

Asterisk (*) is a special character that can have two functionalities: as wildcarding prefix/suffix expander and as any term modifier inside a phrase search. Unlike the other special characters that are operators, the asterisk cannot be escaped when it's in a position to offer one of it's functionalities.

In non-wildcard queries, the asterisk doesn't require escaping, regardless if it's in the charset_table or not.

In wildcard queries, asterisk in the middle of a word doesn't require escaping. As a wildcard operator (either at start or end of the word), the asterisk will always be interpreted as the wildcard operator even if escaping is applied.

Operators Search profiling

Last modified: December 10, 2020

Example of a complex query:

"hello world" @title "example program"~5 @body python -(php|perl) @* code

The full meaning of this search is:

Find the words 'hello' and 'world' adjacently in any field in a document;
Additionally, the same document must also contain the words 'example' and 'program' in the title field, with up to, but not including, 5 words between the words in question; (E.g. "example PHP program" would be matched however "example script to introduce outside data into the correct context for your program" would not because two terms have 5 or more words between them)
Additionally, the same document must contain the word 'python' in the body field, but not contain either 'php' or 'perl';
Additionally, the same document must contain the word 'code' in any field.

OR operator precedence is higher than AND, so "looking for cat | dog | mouse" means "looking for ( cat | dog | mouse )" and not "(looking for cat) | dog | mouse".

To understand how a query will be executed, Manticore Search offer query profile tooling for viewing the query tree created by a query expression.

When using SQL statement the full-text query profiling needs to be enabled before running the desired query:

SET profiling =1;
SELECT * FROM test WHERE MATCH('@title abc* @body hey');

To view the query tree, we must run SHOW PLAN right after the execution of the query:

SHOW PLAN;

The command will return the structure of the executed query. Please note that the 3 statements - SET profiling, the query and SHOW - must run on the same session.

In HTTP, we can just enable "profile":true to get in response the full-text query tree structure.

{
  "index":"test",
  "profile":true,
  "query":
  {
    "match_phrase": { "_all" : "had grown quite" }
  }
}

The response will contain a profile object in which we can find a member query.

query property contains the transformed full-text query tree. Each node contains:

type: node type. Can be AND, OR, PHRASE, KEYWORD etc.
description: query subtree for this node shown as a string (in SHOW PLAN format)
children: child nodes, if any
max_field_pos: maximum position within a field

A keyword node will also provide:

word: transformed keyword.
querypos: position of this keyword in a query.
excluded: keyword excluded from query.
expanded: keyword added by prefix expansion.
field_start: keyword must occur at the very start of the field.
field_end: keyword must occur at the very end of the field.
boost: keyword IDF will be multiplied by this.

‹›

SQL
HTTP
PHP
Python
javascript
Java

📋

SET profiling=1;
SELECT * FROM test WHERE MATCH('@title abc* @body hey');
SHOW PLAN \G

POST /search
{
  "index": "forum",
  "query": {"query_string": "i me"},
  "_source": { "excludes":["*"] },
  "limit": 1,
  "profile":true
}

$result = $index->search('i me')->setSource(['excludes'=>['*']])->setLimit(1)->profile()->get();
print_r($result->getProfile());

searchApi.search({"index":"forum","query":{"query_string":"i me"},"_source":{"excludes":["*"]},"limit":1,"profile":True})

res = await searchApi.search({"index":"forum","query":{"query_string":"i me"},"_source":{"excludes":["*"]},"limit":1,"profile":true});

query = new HashMap<String,Object>();
query.put("query_string","i me");
searchRequest = new SearchRequest();
searchRequest.setIndex("forum");
searchRequest.setQuery(query);
searchRequest.setProfile(true);
searchRequest.setLimit(1);
searchRequest.setSort(new ArrayList<String>(){{
    add("*");
}});
searchResponse = searchApi.search(searchRequest);

‹›

Response

*************************** 1\. row ***************************
Variable: transformed_tree
   Value: AND(
  OR(fields=(title), KEYWORD(abcx, querypos=1, expanded), KEYWORD(abcm, querypos=1, expanded)),
  AND(fields=(body), KEYWORD(hey, querypos=2)))
1 row in set (0.00 sec)

Array
(
    [query] => Array
        (
            [type] => AND
            [description] => AND( AND(KEYWORD(i, querypos=1)),  AND(KEYWORD(me, querypos=2)))
            [children] => Array
                (
                    [0] => Array
                        (
                            [type] => AND
                            [description] => AND(KEYWORD(i, querypos=1))
                            [children] => Array
                                (
                                    [0] => Array
                                        (
                                            [type] => KEYWORD
                                            [word] => i
                                            [querypos] => 1
                                        )
                                )
                        )
                    [1] => Array
                        (
                            [type] => AND
                            [description] => AND(KEYWORD(me, querypos=2))
                            [children] => Array
                                (
                                    [0] => Array
                                        (
                                            [type] => KEYWORD
                                            [word] => me
                                            [querypos] => 2
                                        )
                                )
                        )
                )
        )
)

{'hits': {'hits': [{u'_id': u'100', u'_score': 2500, u'_source': {}}],
          'total': 1},
 'profile': {u'query': {u'children': [{u'children': [{u'querypos': 1,
                                                      u'type': u'KEYWORD',
                                                      u'word': u'i'}],
                                       u'description': u'AND(KEYWORD(i, querypos=1))',
                                       u'type': u'AND'},
                                      {u'children': [{u'querypos': 2,
                                                      u'type': u'KEYWORD',
                                                      u'word': u'me'}],
                                       u'description': u'AND(KEYWORD(me, querypos=2))',
                                       u'type': u'AND'}],
                        u'description': u'AND( AND(KEYWORD(i, querypos=1)),  AND(KEYWORD(me, querypos=2)))',
                        u'type': u'AND'}},
 'timed_out': False,
 'took': 0}

{"hits": {"hits": [{"_id": "100", "_score": 2500, "_source": {}}],
          "total": 1},
 "profile": {"query": {"children": [{"children": [{"querypos": 1,
                                                      "type": "KEYWORD",
                                                      "word": "i"}],
                                       "description": "AND(KEYWORD(i, querypos=1))",
                                       "type": "AND"},
                                      {"children": [{"querypos": 2,
                                                      "type": "KEYWORD",
                                                      "word": "me"}],
                                       "description": "AND(KEYWORD(me, querypos=2))",
                                       "type": "AND"}],
                        "description": "AND( AND(KEYWORD(i, querypos=1)),  AND(KEYWORD(me, querypos=2)))",
                        "type": "AND"}},
 "timed_out": False,
 "took": 0}

class SearchResponse {
    took: 18
    timedOut: false
    hits: class SearchResponseHits {
        total: 1
        hits: [{_id=100, _score=2500, _source={}}]
        aggregations: null
    }
    profile: {query={type=AND, description=AND( AND(KEYWORD(i, querypos=1)),  AND(KEYWORD(me, querypos=2))), children=[{type=AND, description=AND(KEYWORD(i, querypos=1)), children=[{type=KEYWORD, word=i, querypos=1}]}, {type=AND, description=AND(KEYWORD(me, querypos=2)), children=[{type=KEYWORD, word=me, querypos=2}]}]}}
}

In some cases the evaluated query tree can be rather different from the original one because of expansions and other transformations.

‹›

SQL
HTTP
PHP
Python
javascript
Java

📋

SET profiling=1;
SELECT id FROM forum WHERE MATCH('@title way* @content hey') LIMIT 1;
SHOW PLAN;

POST /search
{
  "index": "forum",
  "query": {"query_string": "@title way* @content hey"},
  "_source": { "excludes":["*"] },
  "limit": 1,
  "profile":true
}

$result = $index->search('@title way* @content hey')->setSource(['excludes'=>['*']])->setLimit(1)->profile()->get();
print_r($result->getProfile());

searchApi.search({"index":"forum","query":{"query_string":"@title way* @content hey"},"_source":{"excludes":["*"]},"limit":1,"profile":true})

res = await searchApi.search({"index":"forum","query":{"query_string":"@title way* @content hey"},"_source":{"excludes":["*"]},"limit":1,"profile":true});

query = new HashMap<String,Object>();
query.put("query_string","@title way* @content hey");
searchRequest = new SearchRequest();
searchRequest.setIndex("forum");
searchRequest.setQuery(query);
searchRequest.setProfile(true);
searchRequest.setLimit(1);
searchRequest.setSort(new ArrayList<String>(){{
    add("*");
}});
searchResponse = searchApi.search(searchRequest);

‹›

Response

Query OK, 0 rows affected (0.00 sec)
+--------+
| id     |
+--------+
| 711651 |
+--------+
1 row in set (0.04 sec)
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Variable         | Value                                                                                                                                                                                                                                                                                                                                                                                                                   |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| transformed_tree | AND(
  OR(
    OR(
      AND(fields=(title), KEYWORD(wayne, querypos=1, expanded)),
      OR(
        AND(fields=(title), KEYWORD(ways, querypos=1, expanded)),
        AND(fields=(title), KEYWORD(wayyy, querypos=1, expanded)))),
    AND(fields=(title), KEYWORD(way, querypos=1, expanded)),
    OR(fields=(title), KEYWORD(way*, querypos=1, expanded))),
  AND(fields=(content), KEYWORD(hey, querypos=2))) |
+------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

{
  "took":33,
  "timed_out":false,
  "hits":
  {
    "total":105,
    "hits":
    [
       {
          "_id":"711651",
          "_score":2539,
          "_source":{}
       }
    ]
  },
  "profile":
  {
    "query":
    {
      "type":"AND",
      "description":"AND( OR( OR( AND(fields=(title), KEYWORD(wayne, querypos=1, expanded)),  OR( AND(fields=(title), KEYWORD(ways, querypos=1, expanded)),  AND(fields=(title), KEYWORD(wayyy, querypos=1, expanded)))),  AND(fields=(title), KEYWORD(way, querypos=1, expanded)),  OR(fields=(title), KEYWORD(way*, querypos=1, expanded))),  AND(fields=(content), KEYWORD(hey, querypos=2)))",
      "children":
      [
        {
          "type":"OR",
          "description":"OR( OR( AND(fields=(title), KEYWORD(wayne, querypos=1, expanded)),  OR( AND(fields=(title), KEYWORD(ways, querypos=1, expanded)),  AND(fields=(title), KEYWORD(wayyy, querypos=1, expanded)))),  AND(fields=(title), KEYWORD(way, querypos=1, expanded)),  OR(fields=(title), KEYWORD(way*, querypos=1, expanded)))",
          "children":
          [
            {
               "type":"OR",
               "description":"OR( AND(fields=(title), KEYWORD(wayne, querypos=1, expanded)),  OR( AND(fields=(title), KEYWORD(ways, querypos=1, expanded)),  AND(fields=(title), KEYWORD(wayyy, querypos=1, expanded))))",
               "children":
               [
                 {
                   "type":"AND",
                   "description":"AND(fields=(title), KEYWORD(wayne, querypos=1, expanded))",
                   "fields":["title"],
                   "max_field_pos":0,
                   "children":
                   [
                     {
                       "type":"KEYWORD",
                       "word":"wayne",
                       "querypos":1,
                       "expanded":true
                     }
                   ]
                 },
                 {
                   "type":"OR",
                   "description":"OR( AND(fields=(title), KEYWORD(ways, querypos=1, expanded)),  AND(fields=(title), KEYWORD(wayyy, querypos=1, expanded)))",
                   "children":
                   [
                     {
                       "type":"AND",
                       "description":"AND(fields=(title), KEYWORD(ways, querypos=1, expanded))",
                       "fields":["title"],
                       "max_field_pos":0,
                       "children":
                       [
                         {
                           "type":"KEYWORD",
                           "word":"ways",
                           "querypos":1,
                           "expanded":true
                         }
                       ]
                     },
                     {
                       "type":"AND",
                       "description":"AND(fields=(title), KEYWORD(wayyy, querypos=1, expanded))",
                       "fields":["title"],
                       "max_field_pos":0,
                       "children":
                       [
                         {
                           "type":"KEYWORD",
                           "word":"wayyy",
                           "querypos":1,
                           "expanded":true
                         }
                       ]
                     }
                   ]
                 }
               ]
            },
            {
              "type":"AND",
              "description":"AND(fields=(title), KEYWORD(way, querypos=1, expanded))",
              "fields":["title"],
              "max_field_pos":0,
              "children":
              [
                 {
                    "type":"KEYWORD",
                    "word":"way",
                    "querypos":1,
                    "expanded":true
                 }
              ]
            },
            {
              "type":"OR",
              "description":"OR(fields=(title), KEYWORD(way*, querypos=1, expanded))",
              "fields":["title"],
              "max_field_pos":0,
              "children":
              [
                {
                  "type":"KEYWORD",
                  "word":"way*",
                  "querypos":1,
                  "expanded":true
                }
              ]
            }
          ]
        },
        {
          "type":"AND",
          "description":"AND(fields=(content), KEYWORD(hey, querypos=2))",
          "fields":["content"],
          "max_field_pos":0,
          "children":
          [
            {
              "type":"KEYWORD",
              "word":"hey",
              "querypos":2
            }
          ]
        }
      ]
    }
  }
}

{'hits': {'hits': [{u'_id': u'2811025403043381551',
                    u'_score': 2643,
                    u'_source': {}}],
          'total': 1},
 'profile': {u'query': {u'children': [{u'children': [{u'expanded': True,
                                                      u'querypos': 1,
                                                      u'type': u'KEYWORD',
                                                      u'word': u'way*'}],
                                       u'description': u'AND(fields=(title), KEYWORD(way*, querypos=1, expanded))',
                                       u'fields': [u'title'],
                                       u'type': u'AND'},
                                      {u'children': [{u'querypos': 2,
                                                      u'type': u'KEYWORD',
                                                      u'word': u'hey'}],
                                       u'description': u'AND(fields=(content), KEYWORD(hey, querypos=2))',
                                       u'fields': [u'content'],
                                       u'type': u'AND'}],
                        u'description': u'AND( AND(fields=(title), KEYWORD(way*, querypos=1, expanded)),  AND(fields=(content), KEYWORD(hey, querypos=2)))',
                        u'type': u'AND'}},
 'timed_out': False,
 'took': 0}

{"hits": {"hits": [{"_id": "2811025403043381551",
                    "_score": 2643,
                    "_source": {}}],
          "total": 1},
 "profile": {"query": {"children": [{"children": [{"expanded": True,
                                                      "querypos": 1,
                                                      "type": "KEYWORD",
                                                      "word": "way*"}],
                                       "description": "AND(fields=(title), KEYWORD(way*, querypos=1, expanded))",
                                       "fields": ["title"],
                                       "type": "AND"},
                                      {"children": [{"querypos": 2,
                                                      "type": "KEYWORD",
                                                      "word": "hey"}],
                                       "description": "AND(fields=(content), KEYWORD(hey, querypos=2))",
                                       "fields": ["content"],
                                       "type": "AND"}],
                        "description": "AND( AND(fields=(title), KEYWORD(way*, querypos=1, expanded)),  AND(fields=(content), KEYWORD(hey, querypos=2)))",
                        "type": "AND"}},
 "timed_out": False,
 "took": 0}

class SearchResponse {
    took: 18
    timedOut: false
    hits: class SearchResponseHits {
        total: 1
        hits: [{_id=2811025403043381551, _score=2643, _source={}}]
        aggregations: null
    }
    profile: {query={type=AND, description=AND( AND(fields=(title), KEYWORD(way*, querypos=1, expanded)),  AND(fields=(content), KEYWORD(hey, querypos=2))), children=[{type=AND, description=AND(fields=(title), KEYWORD(way*, querypos=1, expanded)), fields=[title], children=[{type=KEYWORD, word=way*, querypos=1, expanded=true}]}, {type=AND, description=AND(fields=(content), KEYWORD(hey, querypos=2)), fields=[content], children=[{type=KEYWORD, word=hey, querypos=2}]}]}}
}

The SQL statement EXPLAIN QUERY allows displaying the execution tree of a provided full-text query without running an actual search query on the index.

‹›

SQL

📋

EXPLAIN QUERY index_base '@title running @body dog'\G

‹›

Response

 EXPLAIN QUERY index_base '@title running @body dog'\G
*************************** 1\. row ***************************
Variable: transformed_tree
   Value: AND(
      OR(
            AND(fields=(title), KEYWORD(run, querypos=1, morphed)),
            AND(fields=(title), KEYWORD(running, querypos=1, morphed))))
  AND(fields=(body), KEYWORD(dog, querypos=2, morphed)))

EXPLAIN QUERY ... option format=dot allows displaying the execution tree of a provided full-text query in hierarchical format suitable for visualization by existing tools, for example https://dreampuf.github.io/GraphvizOnline :

EXPLAIN QUERY graphviz example

‹›

SQL

📋

EXPLAIN QUERY idx 'i me' option format=dot\G

‹›

Response

EXPLAIN QUERY idx 'i me' option format=dot\G
*************************** 1. row ***************************
Variable: transformed_tree
   Value: digraph "transformed_tree"
{
0 [shape=record,style=filled,bgcolor="lightgrey" label="AND"]
0 -> 1
1 [shape=record,style=filled,bgcolor="lightgrey" label="AND"]
1 -> 2
2 [shape=record label="i | { querypos=1 }"]
0 -> 3
3 [shape=record,style=filled,bgcolor="lightgrey" label="AND"]
3 -> 4
4 [shape=record label="me | { querypos=2 }"]
}

When expression ranker is used, it is possible to expose the values of the calculated factors using PACKEDFACTORS().

The function returns:

the values of document level factors (like bm25, field_mask, doc_word_count)
list with each field that returned a hit (like lcs, hit_count, word_count sum_idf, min_hit_pos etc.)
list with each keyword from the query and their tf and idf values

The values can be used to understand why certain documents get scored lower or higher in a search or to improve the existing ranking expression.

‹›

SQL

📋

SELECT id, PACKEDFACTORS() FROM test1 WHERE MATCH('test one') OPTION ranker=expr('1')\G

‹›

Response

             id: 1
packedfactors(): bm25=569, bm25a=0.617197, field_mask=2, doc_word_count=2,
    field1=(lcs=1, hit_count=2, word_count=2, tf_idf=0.152356,
        min_idf=-0.062982, max_idf=0.215338, sum_idf=0.152356, min_hit_pos=4,
        min_best_span_pos=4, exact_hit=0, max_window_hits=1, min_gaps=2,
        exact_order=1, lccs=1, wlccs=0.215338, atc=-0.003974),
    word0=(tf=1, idf=-0.062982),
    word1=(tf=1, idf=0.215338)
1 row in set (0.00 sec)

Escaping Boolean optimization

Last modified: December 10, 2020

Queries may be automatically optimized if OPTION boolean_simplify=1 is specified. Some transformations performed by this optimization include:

Excess brackets: ((A | B) | C) becomes ( A | B | C ); ((A B) C) becomes ( A B C )
Excess AND NOT: ((A !N1) !N2) becomes (A !(N1 | N2))
Common NOT: ((A !N) | (B !N)) becomes ((A|B) !N)
Common Compound NOT: ((A !(N AA)) | (B !(N BB))) becomes (((A|B) !N) | (A !AA) | (B !BB)) if the cost of evaluating N is greater than the added together costs of evaluating A and B
Common subterm: ((A (N | AA)) | (B (N | BB))) becomes (((A|B) N) | (A AA) | (B BB)) if the cost of evaluating N is greater than the added together costs of evaluating A and B
Common keywords: (A | "A B"~N) becomes A; ("A B" | "A B C") becomes "A B"; ("A B"~N | "A B C"~N) becomes ("A B"~N)
Common phrase: ("X A B" | "Y A B") becomes ("("X"|"Y") A B")
Common AND NOT: ((A !X) | (A !Y) | (A !Z)) becomes (A !(X Y Z))
Common OR NOT: ((A !(N | N1)) | (B !(N | N2))) becomes (( (A !N1) | (B !N2) ) !N) Note that optimizing the queries consumes CPU time, so for simple queries -or for hand-optimized queries- you'll do better with the default boolean_simplify=0 value. Simplifications are often better for complex queries, or algorithmically generated queries.

Queries like "-dog", which implicitly include all documents from the collection, can not be evaluated. This is both for technical and performance reasons. Technically, Manticore does not always keep a list of all IDs. Performance-wise, when the collection is huge (ie. 10-100M documents), evaluating such queries could take very long.

Search profiling Search results

Last modified: July 22, 2020

Full text operators

Boolean operators

AND operator

OR operator

MAYBE operator

Negation operator

Field search operator

Phrase search operator

Proximity search operator

Quorum matching operator

Strict order operator

Exact form modifier

Wildcard operators

Field-start and field-end modifier

IDF boost modifier

NEAR operator

NOTNEAR operator

SENTENCE and PARAGRAPH operators

ZONE limit operator

ZONESPAN limit operator

Escaping characters in query string

In MySQL command line client

Using MySQL drivers

In HTTP JSON API

In clients

Escaping asterisk

Search profiling

How a query is interpreted

Profiling the query tree in SQL

Profiling the query in HTTP

Profiling without running a query

Viewing the match factors values

Boolean optimization