Searching > Full text matching > Escaping

The query string can include specific operators that define the conditions for how the words from the query string should be matched.

An implicit logical AND operator is always present, so "hello world" implies that both "hello" and "world" must be found in the matching document.

hello  world

Note: There is no explicit AND operator.

hello | world

Note: There is no operator OR. Please use | instead.

hello MAYBE world

The MAYBE operator functions similarly to the | operator, but it does not return documents that match only the right subtree expression.

hello -world
hello !world

The negation operator enforces a rule for a word to not exist.

Queries containing only negations are not supported by default. To enable, use the server option not_terms_only_allowed.

@title hello @body world

The field limit operator restricts subsequent searches to a specified field. By default, the query will fail with an error message if the given field name does not exist in the searched table. However, this behavior can be suppressed by specifying the @@relaxed option at the beginning of the query:

@@relaxed @nosuchfield my query

This can be useful when searching through heterogeneous tables with different schemas.

Field position limits additionally constrain the search to the first N positions within a given field (or fields). For example, @body [50] hello will not match documents where the keyword hello appears at position 51 or later in the body.

@body[50] hello

Multiple-field search operator:

@(title,body) hello world

Ignore field search operator (ignores any matches of 'hello world' from the 'title' field):

@!title hello world

Ignore multiple-field search operator (if there are fields 'title', 'subject', and 'body', then @!(title) is equivalent to @(subject,body)):

@!(title,body) hello world

All-field search operator:

@* hello

"hello world"

The phrase operator mandates that the words be adjacent to each other.

The phrase search operator can include a match any term modifier. Within the phrase operator, terms are positionally significant. When the 'match any term' modifier is employed, the positions of the subsequent terms in that phrase query will be shifted. As a result, the 'match any' modifier does not affect search performance.

"exact * phrase * * for terms"

You can also use the OR operator inside the quotes. The OR operator (|) must be enclosed in brackets () when used inside phrases. Each option is checked at the same position, and the phrase matches if any of the options fit that position.

Correct examples (with brackets):

"( a | b ) c"
"( ( a b c ) | d ) e"
"man ( happy | sad ) but all ( ( as good ) | ( as fast ) )"

Incorrect examples (without brackets - these won't work):

"a | b c"
"happy | sad"

"hello world"~10

Proximity distance is measured in words, accounting for word count, and applies to all words within quotes. For example, the query "cat dog mouse"~5 indicates that there must be a span of fewer than 8 words containing all 3 words. Therefore, a document with CAT aaa bbb ccc DOG eee fff MOUSE will not match this query, as the span is exactly 8 words long.

You can also use the OR operator inside a proximity search. The OR operator (|) must be enclosed in brackets () when used inside proximity searches. Each option is checked separately.

Correct example (with brackets):

"( two | four ) fish chips"~5

Incorrect example (without brackets - this won't work):

"two | four fish chips"~5

"the world is a wonderful place"/3

The quorum matching operator introduces a type of fuzzy matching. It will match only those documents that meet a given threshold of specified words. In the example above ("the world is a wonderful place"/3), it will match all documents containing at least 3 of the 6 specified words. The operator is limited to 255 keywords. Instead of an absolute number, you can also provide a value between 0.0 and 1.0 (representing 0% and 100%), and Manticore will match only documents containing at least the specified percentage of the given words. The same example above could also be expressed as "the world is a wonderful place"/0.5, and it would match documents with at least 50% of the 6 words.

The quorum operator supports the OR (|) operator. The OR operator (|) must be enclosed in brackets () when used inside quorum matching. Only one word from each OR group counts toward the match.

Correct examples (with brackets):

"( ( a b c ) | d ) e f g"/0.5
"happy ( sad | angry ) man"/2

Incorrect example (without brackets - this won't work):

"a b c | d e f g"/0.5

aaa << bbb << ccc

The strict order operator (also known as the "before" operator) matches a document only if its argument keywords appear in the document precisely in the order specified in the query. For example, the query black << cat will match the document "black and white cat" but not the document "that cat was black". The order operator has the lowest priority. It can be applied to both individual keywords and more complex expressions. For instance, this is a valid query:

(bag of words) << "exact phrase" << red|green|blue

raining =cats and =dogs
="exact phrase"

The exact form keyword modifier matches a document only if the keyword appears in the exact form specified. By default, a document is considered a match if the stemmed/lemmatized keyword matches. For instance, the query "runs" will match both a document containing "runs" and one containing "running", because both forms stem to just "run". However, the =runs query will only match the first document. The exact form modifier requires the index_exact_words option to be enabled.

Another use case is to prevent expanding a keyword to its *keyword* form. For example, with index_exact_words=1 + expand_keywords=1/star, bcd will find a document containing abcde, but =bcd will not.

As a modifier affecting the keyword, it can be used within operators such as phrase, proximity, and quorum operators. Applying an exact form modifier to the phrase operator is possible, and in this case, it internally adds the exact form modifier to all terms in the phrase.

nation* *nation* *national

Requires min_infix_len for prefix (expansion in trail) and/or suffix (expansion in head). If only prefixing is desired, min_prefix_len can be used instead.

The search will attempt to find all expansions of the wildcarded tokens, and each expansion is recorded as a matched hit. The number of expansions for a token can be controlled with the expansion_limit table setting. Wildcarded tokens can have a significant impact on query search time, especially when tokens have short lengths. In such cases, it is desirable to use the expansion limit.

The wildcard operator can be automatically applied if the expand_keywords table setting is used.

In addition, the following inline wildcard operators are supported:

? can match any single character: t?st will match test, but not teast
% can match zero or one character: tes% will match tes or test, but not testing

The inline operators require dict=keywords (enabled by default) and prefixing/infixing enabled.

REGEX(/t.?e/)

Requires the min_infix_len or min_prefix_len and dict=keywords options to be set (which is a default).

Similarly to the wildcard operators, the REGEX operator attempts to find all tokens matching the provided pattern, and each expansion is recorded as a matched hit. Note, this can have a significant impact on query search time, as the entire dictionary is scanned, and every term in the dictionary undergoes matching with the REGEX pattern.

The patterns should adhere to the RE2 syntax. The REGEX expression delimiter is the first symbol after the open bracket. In other words, all text between the open bracket followed by the delimiter and the delimiter and the closed bracket is considered as a RE2 expression. Please note that the terms stored in the dictionary undergo charset_table transformation, meaning that for example, REGEX may not be able to match uppercase characters if all characters are lowercased according to the charset_table (which happens by default). To successfully match a term using a REGEX expression, the pattern must correspond to the entire token. To achieve partial matching, place .* at the beginning and/or end of your pattern.

REGEX(/.{3}t/)
REGEX(/t.*\d*/)

^hello world$

Field-start and field-end keyword modifiers ensure that a keyword only matches if it appears at the very beginning or the very end of a full-text field, respectively. For example, the query "^hello world$" (enclosed in quotes to combine the phrase operator with the start/end modifiers) will exclusively match documents containing at least one field with these two specific keywords.

boosted^1.234 boostedfieldend$^1.234

The boost modifier raises the word IDF_score by the indicated factor in ranking scores that incorporate IDF into their calculations. It does not impact the matching process in any manner.

hello NEAR/3 world NEAR/4 "my test"

The NEAR operator is a more generalized version of the proximity operator. Its syntax is NEAR/N, which is case-sensitive and does not allow spaces between the NEAR keywords, slash sign, and distance value.

While the original proximity operator works only on sets of keywords, NEAR is more versatile and can accept arbitrary subexpressions as its two arguments. It matches a document when both subexpressions are found within N words of each other, regardless of their order. NEAR is left-associative and shares the same (lowest) precedence as BEFORE.

It is important to note that one NEAR/7 two NEAR/7 three is not exactly equivalent to "one two three"~7. The key difference is that the proximity operator allows up to 6 non-matching words between all three matching words, while the version with NEAR is less restrictive: it permits up to 6 words between one and two, and then up to 6 more between that two-word match and three.

Church NOTNEAR/3 street

The NOTNEAR operator serves as a negative assertion. It matches a document when the left argument is present and either the right argument is absent from the document or the right argument is a specified distance away from the end of the left matched argument. The distance is denoted in words. The syntax is NOTNEAR/N, which is case-sensitive and does not permit spaces between the NOTNEAR keyword, slash sign, and distance value. Both arguments of this operator can be terms or any operators or group of operators.

all SENTENCE words SENTENCE "in one sentence"

"Bill Gates" PARAGRAPH "Steve Jobs"

The SENTENCE and PARAGRAPH operators match a document when both of their arguments are within the same sentence or the same paragraph of text, respectively. These arguments can be keywords, phrases, or instances of the same operator.

The order of the arguments within the sentence or paragraph is irrelevant. These operators function only with tables built with index_sp (sentence and paragraph indexing feature) enabled and revert to a simple AND operation otherwise. For information on what constitutes a sentence and a paragraph, refer to the index_sp directive documentation.

ZONE:(h3,h4)

only in these titles

The ZONE limit operator closely resembles the field limit operator but limits matching to a specified in-field zone or a list of zones. It is important to note that subsequent subexpressions do not need to match within a single continuous span of a given zone and may match across multiple spans. For example, the query (ZONE:th hello world) will match the following sample document:

<th>Table 1. Local awareness of Hello Kitty brand.</th>
.. some table data goes here ..
<th>Table 2. World-wide brand awareness.</th>

The ZONE operator affects the query until the next field or ZONE limit operator, or until the closing parenthesis. It functions exclusively with tables built with zone support (refer to index_zones) and will be disregarded otherwise.

ZONESPAN:(h2)

only in a (single) title

The ZONESPAN limit operator resembles the ZONE operator but mandates that the match occurs within a single continuous span. In the example provided earlier, ZONESPAN:th hello world would not match the document, as "hello" and "world" do not appear within the same span.

Escaping

Since certain characters function as operators in the query string, they must be escaped to prevent query errors or unintended matching conditions.

The following characters should be escaped using a backslash (\):

!    "    $    '    (    )    -    /    <    @    \    ^    |    ~

To escape a single quote ('), use one backslash:

SELECT * FROM your_index WHERE MATCH('l\'italiano');

For the other characters in the list mentioned earlier, which are operators or query constructs, they must be treated as simple characters by the engine, with a preceding escape character. The backslash must also be escaped, resulting in two backslashes:

SELECT * FROM your_index WHERE MATCH('r\\&b | \\(official video\\)');

To use a backslash as a character, you must escape both the backslash as a character and the backslash as the escape operator, which requires four backslashes:

SELECT * FROM your_index WHERE MATCH('\\\\ABC');

When you are working with JSON data in Manticore Search and need to include a double quote (") within a JSON string, it's important to handle it with proper escaping. In JSON, a double quote within a string is escaped using a backslash (\). However, when inserting the JSON data through an SQL query, Manticore Search interprets the backslash (\) as an escape character within strings.

To ensure the double quote is correctly inserted into the JSON data, you need to escape the backslash itself. This results in using two backslashes (\\) before the double quote. For example:

insert into tbl(j) values('{"a": "\\"abc\\""}');

MySQL drivers provide escaping functions (e.g., mysqli_real_escape_string in PHP or conn.escape_string in Python), but they only escape specific characters. You will still need to add escaping for the characters from the previously mentioned list that are not escaped by their respective functions. Because these functions will escape the backslash for you, you only need to add one backslash.

This also applies to drivers that support (client-side) prepared statements. For example, with PHP PDO prepared statements, you need to add a backslash for the $ character:

$statement = $ln_sph->prepare( "SELECT * FROM index WHERE MATCH(:match)");
$match = '\$manticore';
$statement->bindParam(':match',$match,PDO::PARAM_STR);
$results = $statement->execute();

This results in the final query SELECT * FROM index WHERE MATCH('\\$manticore');

The same rules for the SQL protocol apply, with the exception that for JSON, the double quote must be escaped with a single backslash, while the rest of the characters require double escaping.

When using JSON libraries or functions that convert data structures to JSON strings, the double quote and single backslash are automatically escaped by these functions and do not need to be explicitly escaped.

The official clients utilize common JSON libraries/functions available in their respective programming languages under the hood. The same rules for escaping mentioned earlier apply.

The asterisk (*) is a unique character that serves two purposes:

as a wildcard prefix/suffix expander
as an any-term modifier within a phrase search.

Unlike other special characters that function as operators, the asterisk cannot be escaped when it's in a position to provide one of its functionalities.

In non-wildcard queries, the asterisk does not require escaping, whether it's in the charset_table or not.

In wildcard queries, an asterisk in the middle of a word does not require escaping. As a wildcard operator (either at the beginning or end of the word), the asterisk will always be interpreted as the wildcard operator, even if escaping is applied.

To escape special characters in JSON nodes, use a backtick. For example:

MySQL [(none)]> select * from t where json.`a=b`=234;
+---------------------+-------------+------+
| id                  | json        | text |
+---------------------+-------------+------+
| 8215557549554925578 | {"a=b":234} |      |
+---------------------+-------------+------+

MySQL [(none)]> select * from t where json.`a:b`=123;
+---------------------+-------------+------+
| id                  | json        | text |
+---------------------+-------------+------+
| 8215557549554925577 | {"a:b":123} |      |
+---------------------+-------------+------+

Operators Search profiling

Full text operators

Boolean operators

AND operator

OR operator

MAYBE operator

Negation operator

Field search operator

Phrase search operator

Proximity search operator

Quorum matching operator

Strict order operator

Exact form modifier

Wildcard operators

REGEX operator

Field-start and field-end modifier

IDF boost modifier

NEAR operator

NOTNEAR operator

SENTENCE and PARAGRAPH operators

ZONE limit operator

ZONESPAN limit operator

Escaping characters in query string

In MySQL command line client

Using MySQL drivers

In HTTP JSON API

In clients

Escaping asterisk

Escaping json node names in SQL