Searching > Spell correction | Manticore Search Manual

Autocomplete, or word completion, predicts and suggests the end of a word or phrase as you type. It's commonly used in:

Search boxes on websites
Suggestions in search engines
Text fields in apps

Manticore offers an advanced autocomplete feature that gives suggestions while you type, similar to those in well-known search engines. This helps speed up searches and lets users find what they need faster.

In addition to basic autocomplete functionality, Manticore includes advanced features to enhance the user experience:

Spell Correction (Fuzziness): Manticore's autocomplete helps correct spelling mistakes by using algorithms that recognize and fix common errors. This means even if you type something wrong, you can still find what you were looking for.
Keyboard Layout Autodetection: Manticore can figure out which keyboard layout you are using. This is really useful in places where many languages are used, or if you accidentally type in the wrong language. For example, if you type "ghbdtn" by mistake, Manticore knows you meant to say "привет" (hello in Russian) and suggests the correct word.

Manticore's autocomplete can be tailored to match different needs and settings, making it a flexible tool for many applications.

Autocomplete

NOTE: CALL AUTOCOMPLETE and /autocomplete require Manticore Buddy. If it doesn't work, make sure Buddy is installed.

To use autocomplete in Manticore, use the CALL AUTOCOMPLETE SQL statement or its JSON equivalent /autocomplete. This feature provides word completion suggestions based on your indexed data.

Before you proceed, ensure that the table you intend to use for autocomplete has infixes enabled.

Note: There's an automatic check for min_infix_len in the table settings, which uses a 30-second cache to improve the performance of CALL AUTOCOMPLETE. After making changes to your table, there may be a brief delay the first time you use CALL AUTOCOMPLETE (though this is usually not noticeable). Only successful results are cached, so if you remove the table or disable min_infix_len, CALL AUTOCOMPLETE may temporarily return incorrect results until it eventually starts showing an error related to min_infix_len.

CALL AUTOCOMPLETE('query_beginning', 'table', [...options]);

POST /autocomplete
{
    "table":"table_name",
    "query":"query_beginning"
    [,"options": {<autocomplete options>}]
}

layouts: A comma-separated string of keyboard layout codes for detecting typing errors caused by keyboard layout mismatches (e.g., typing "ghbdtn" instead of "привет" when using the wrong layout). Manticore compares character positions across different layouts to suggest corrections. Requires at least 2 layouts to effectively detect mismatches. Available options: us, ru, ua, se, pt, no, it, gr, uk, fr, es, dk, de, ch, br, bg, be (each is described in the fuzzy search options list). Default: none
fuzziness: 0, 1, or 2 (default: 2). Maximum Levenshtein distance for finding typos. Set to 0 to disable fuzzy matching
preserve: 0 or 1 (default: 0). When set to 1, keeps words that don't have fuzzy matches in the search results (e.g., "hello wrld" returns both "hello wrld" and "hello world"). When set to 0, only returns words with successful fuzzy matches (e.g., "hello wrld" returns only "hello world"). Particularly useful for preserving short words or proper nouns that may not exist in Manticore Search
prepend: Boolean (0/1 in SQL). If true(1), adds an asterisk before the last word for prefix expansion (e.g., *word)
append: Boolean (0/1 in SQL). If true(1), adds an asterisk after the last word for suffix expansion (e.g., word*)
expansion_len: Number of characters to expand in the last word. Default: 10
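
As a rough illustration of the JSON form, the sketch below assembles a request body for POST /autocomplete from the options listed above. The helper name build_autocomplete_body is hypothetical and not part of any Manticore client; sending the request is left out.

```python
import json

# Option names taken from the list above; anything else is treated as a typo.
ALLOWED_OPTIONS = {
    "layouts", "fuzziness", "preserve", "prepend", "append", "expansion_len",
}


def build_autocomplete_body(table, query, **options):
    """Return the JSON body for POST /autocomplete as a string (hypothetical helper)."""
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError(f"unknown autocomplete options: {sorted(unknown)}")
    # The manual documents fuzziness as 0, 1, or 2.
    if options.get("fuzziness", 2) not in (0, 1, 2):
        raise ValueError("fuzziness must be 0, 1, or 2")

    body = {"table": table, "query": query}
    if options:
        body["options"] = options
    return json.dumps(body, ensure_ascii=False)


print(build_autocomplete_body("comment", "hello wrld", preserve=1))
```

Validating option names on the client side is a design choice of this sketch, made so that a misspelled option fails early instead of being sent to the server.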

Examples (in order: SQL, SQL with no fuzzy search, JSON, SQL with preserve option, JSON with preserve option):

mysql> CALL AUTOCOMPLETE('hello', 'comment');
+------------+
| query      |
+------------+
| hello      |
| helio      |
| hell       |
| shell      |
| nushell    |
| powershell |
| well       |
| help       |
+------------+

mysql> CALL AUTOCOMPLETE('hello', 'comment', 0 as fuzziness);
+-------+
| query |
+-------+
| hello |
+-------+

POST /autocomplete
{
    "table":"comment",
    "query":"hello"
}

mysql> CALL AUTOCOMPLETE('hello wrld', 'comment', 1 as preserve);
+-------------+
| query       |
+-------------+
| hello wrld  |
| hello world |
+-------------+

POST /autocomplete
{
    "table":"comment",
    "query":"hello wrld",
    "options": {
        "preserve": 1
    }
}

Response

[
  {
    "total": 8,
    "error": "",
    "warning": "",
    "columns": [
      {
        "query": {
          "type": "string"
        }
      }
    ],
    "data": [
      {
        "query": "hello"
      },
      {
        "query": "helio"
      },
      {
        "query": "hell"
      },
      {
        "query": "shell"
      },
      {
        "query": "nushell"
      },
      {
        "query": "powershell"
      },
      {
        "query": "well"
      },
      {
        "query": "help"
      }
    ]
  }
]

For a demonstration of the autocomplete functionality, see:
Blog post about Fuzzy Search and Autocomplete - https://manticoresearch.com/blog/new-fuzzy-search-and-autocomplete/

While CALL AUTOCOMPLETE is the recommended method for most use cases, Manticore also supports other controllable and customizable approaches to implement autocomplete functionality:

To autocomplete a sentence, you can use infixed search. You can find the end of a document field by providing its beginning and:

using the full-text wildcard operator * to match any characters
optionally using ^ to start from the beginning of the field
optionally using "" for phrase matching
and using result highlighting

There is an article about it in our blog and an interactive course. A quick example is:

Let's assume you have a document: My cat loves my dog. The cat (Felis catus) is a domestic species of small carnivorous mammal.
Then you can use ^, "", and * so that, as the user types, you send queries like ^"m*", ^"my *", ^"my c*", ^"my ca*", and so on.
It will find the document, and if you also do highlighting, you will get something like: <strong>My cat</strong> loves my dog. The cat ( ...
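
A minimal sketch of how an application might generate those queries, one per keystroke. The function name sentence_queries is hypothetical, and escaping of quotes or other special characters in the typed text is deliberately left out.

```python
def sentence_queries(typed):
    """Yield one full-text query for every prefix of the text typed so far."""
    for end in range(1, len(typed) + 1):
        prefix = typed[:end]
        # ^ anchors the match to the start of the field,
        # "" turns it into a phrase, * expands the last (partial) word.
        yield f'^"{prefix}*"'


for query in sentence_queries("my ca"):
    print(query)
```

In practice an application would usually send only the query for the latest keystroke, often after a short debounce delay, and not the whole sequence.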

In some cases, all you need is to autocomplete a single word or a couple of words. In this case, you can use CALL KEYWORDS.

CALL KEYWORDS is available through the SQL interface and offers a way to examine how keywords are tokenized or to obtain the tokenized forms of specific keywords. If the table enables infixes, it allows you to quickly find possible endings for given keywords, making it suitable for autocomplete functionality.

This is a great alternative to general infixed search, as it provides higher performance since it only needs the table's dictionary, not the documents themselves.

CALL KEYWORDS(text, table [, options])

The CALL KEYWORDS statement divides text into keywords. It returns the tokenized and normalized forms of the keywords, and if desired, keyword statistics. Additionally, it provides the position of each keyword in the query and all forms of tokenized keywords when the table enables lemmatizers.

Parameter	Description
text	Text to break down to keywords
table	Name of the table from which to take the text processing settings
0/1 as stats	Show statistics of keywords, default is 0
0/1 as fold_wildcards	Fold wildcards, default is 0
0/1 as fold_lemmas	Fold morphological lemmas, default is 0
0/1 as fold_blended	Fold blended words, default is 0
N as expansion_limit	Override expansion_limit defined in the server configuration, default is 0 (use value from the configuration)
docs/hits as sort_mode	Sorts output results by either 'docs' or 'hits'. No sorting is applied by default.
jieba_mode	Jieba segmentation mode for the query. See jieba_mode for more details

The examples below assume the user is trying to get an autocomplete for "my cat ...". On the application side, all you need to do is suggest to the user the endings from the "normalized" column for each new word. It often makes sense to sort by hits or docs using 'hits' as sort_mode or 'docs' as sort_mode.

Examples

MySQL [(none)]> CALL KEYWORDS('m*', 't', 1 as stats);
+------+-----------+------------+------+------+
| qpos | tokenized | normalized | docs | hits |
+------+-----------+------------+------+------+
| 1    | m*        | my         | 1    | 2    |
| 1    | m*        | mammal     | 1    | 1    |
+------+-----------+------------+------+------+

MySQL [(none)]> CALL KEYWORDS('my*', 't', 1 as stats);
+------+-----------+------------+------+------+
| qpos | tokenized | normalized | docs | hits |
+------+-----------+------------+------+------+
| 1    | my*       | my         | 1    | 2    |
+------+-----------+------------+------+------+

MySQL [(none)]> CALL KEYWORDS('c*', 't', 1 as stats, 'hits' as sort_mode);
+------+-----------+-------------+------+------+
| qpos | tokenized | normalized  | docs | hits |
+------+-----------+-------------+------+------+
| 1    | c*        | cat         | 1    | 2    |
| 1    | c*        | carnivorous | 1    | 1    |
| 1    | c*        | catus       | 1    | 1    |
+------+-----------+-------------+------+------+

MySQL [(none)]> CALL KEYWORDS('ca*', 't', 1 as stats, 'hits' as sort_mode);
+------+-----------+-------------+------+------+
| qpos | tokenized | normalized  | docs | hits |
+------+-----------+-------------+------+------+
| 1    | ca*       | cat         | 1    | 2    |
| 1    | ca*       | carnivorous | 1    | 1    |
| 1    | ca*       | catus       | 1    | 1    |
+------+-----------+-------------+------+------+

MySQL [(none)]> CALL KEYWORDS('cat*', 't', 1 as stats, 'hits' as sort_mode);
+------+-----------+------------+------+------+
| qpos | tokenized | normalized | docs | hits |
+------+-----------+------------+------+------+
| 1    | cat*      | cat        | 1    | 2    |
| 1    | cat*      | catus      | 1    | 1    |
+------+-----------+------------+------+------+
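
On the application side, turning such a result into suggestions could look like the sketch below. It assumes the client library returns each row as a dict keyed by column name with string values; that shape, and the function name endings_from_keywords, are assumptions made for illustration.

```python
def endings_from_keywords(rows, limit=5):
    """Return up to `limit` values of the "normalized" column, most hits first."""
    # sorted() is stable, so rows with equal hits keep the server's order.
    ranked = sorted(rows, key=lambda row: int(row["hits"]), reverse=True)
    return [row["normalized"] for row in ranked[:limit]]


# Rows copied from the CALL KEYWORDS('c*', ...) output above.
rows = [
    {"qpos": "1", "tokenized": "c*", "normalized": "carnivorous", "docs": "1", "hits": "1"},
    {"qpos": "1", "tokenized": "c*", "normalized": "cat", "docs": "1", "hits": "2"},
    {"qpos": "1", "tokenized": "c*", "normalized": "catus", "docs": "1", "hits": "1"},
]
print(endings_from_keywords(rows, limit=2))
```

When 'hits' as sort_mode is already passed to CALL KEYWORDS, the client-side sort is redundant and only the limit step is needed.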

There is a nice trick to improve the above algorithm: use bigram_index. When it is enabled for the table, not only single words are indexed, but also each pair of adjacent words, as a separate token.

This allows you to predict not just the current word's ending but the next word too, which is especially beneficial for autocomplete.

Examples

MySQL [(none)]> CALL KEYWORDS('m*', 't', 1 as stats, 'hits' as sort_mode);
+------+-----------+------------+------+------+
| qpos | tokenized | normalized | docs | hits |
+------+-----------+------------+------+------+
| 1    | m*        | my         | 1    | 2    |
| 1    | m*        | mammal     | 1    | 1    |
| 1    | m*        | my cat     | 1    | 1    |
| 1    | m*        | my dog     | 1    | 1    |
+------+-----------+------------+------+------+

MySQL [(none)]> CALL KEYWORDS('my*', 't', 1 as stats, 'hits' as sort_mode);
+------+-----------+------------+------+------+
| qpos | tokenized | normalized | docs | hits |
+------+-----------+------------+------+------+
| 1    | my*       | my         | 1    | 2    |
| 1    | my*       | my cat     | 1    | 1    |
| 1    | my*       | my dog     | 1    | 1    |
+------+-----------+------------+------+------+

MySQL [(none)]> CALL KEYWORDS('c*', 't', 1 as stats, 'hits' as sort_mode);
+------+-----------+--------------------+------+------+
| qpos | tokenized | normalized         | docs | hits |
+------+-----------+--------------------+------+------+
| 1    | c*        | cat                | 1    | 2    |
| 1    | c*        | carnivorous        | 1    | 1    |
| 1    | c*        | carnivorous mammal | 1    | 1    |
| 1    | c*        | cat felis          | 1    | 1    |
| 1    | c*        | cat loves          | 1    | 1    |
| 1    | c*        | catus              | 1    | 1    |
| 1    | c*        | catus is           | 1    | 1    |
+------+-----------+--------------------+------+------+

MySQL [(none)]> CALL KEYWORDS('ca*', 't', 1 as stats, 'hits' as sort_mode);
+------+-----------+--------------------+------+------+
| qpos | tokenized | normalized         | docs | hits |
+------+-----------+--------------------+------+------+
| 1    | ca*       | cat                | 1    | 2    |
| 1    | ca*       | carnivorous        | 1    | 1    |
| 1    | ca*       | carnivorous mammal | 1    | 1    |
| 1    | ca*       | cat felis          | 1    | 1    |
| 1    | ca*       | cat loves          | 1    | 1    |
| 1    | ca*       | catus              | 1    | 1    |
| 1    | ca*       | catus is           | 1    | 1    |
+------+-----------+--------------------+------+------+

MySQL [(none)]> CALL KEYWORDS('cat*', 't', 1 as stats, 'hits' as sort_mode);
+------+-----------+------------+------+------+
| qpos | tokenized | normalized | docs | hits |
+------+-----------+------------+------+------+
| 1    | cat*      | cat        | 1    | 2    |
| 1    | cat*      | cat felis  | 1    | 1    |
| 1    | cat*      | cat loves  | 1    | 1    |
| 1    | cat*      | catus      | 1    | 1    |
| 1    | cat*      | catus is   | 1    | 1    |
+------+-----------+------------+------+------+
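
With bigrams in the result, an application may want to present word completions and next-word predictions separately. The sketch below assumes, as in the output above, that a bigram appears in the "normalized" column as two words separated by a space; the function name split_bigram_rows is hypothetical.

```python
def split_bigram_rows(normalized_values):
    """Split "normalized" values into single-word and two-word suggestions."""
    words = [value for value in normalized_values if " " not in value]
    pairs = [value for value in normalized_values if " " in value]
    return words, pairs


# Values copied from the CALL KEYWORDS('cat*', ...) output above.
words, pairs = split_bigram_rows(["cat", "cat felis", "cat loves", "catus", "catus is"])
print(words)
print(pairs)
```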

CALL KEYWORDS supports distributed tables, so you can benefit from it no matter how big your data set is.

Spell correction

Spell correction, also known as:

Auto correction
Text correction
Fixing spelling errors
Typo tolerance
"Did you mean?"

and so on, is a software functionality that suggests alternatives to or makes automatic corrections of the text you have typed in. The concept of correcting typed text dates back to the 1960s when computer scientist Warren Teitelman, who also invented the "undo" command, introduced a philosophy of computing called D.W.I.M., or "Do What I Mean." Instead of programming computers to accept only perfectly formatted instructions, Teitelman argued that they should be programmed to recognize obvious mistakes.

The first well-known product to provide spell correction functionality was Microsoft Word 6.0, released in 1993.

There are a few ways spell correction can be done, but it's important to note that there is no purely programmatic way to convert your mistyped "ipone" into "iphone" with decent quality. Mostly, there has to be a dataset the system is based on. The dataset can be:

A dictionary of properly spelled words, which in turn can be:
- Based on your real data. The idea here is that, for the most part, the spelling in the dictionary made up of your data is correct, and the system tries to find a word that is most similar to the typed word (we'll discuss how this can be done with Manticore shortly).
- Or it can be based on an external dictionary unrelated to your data. The issue that may arise here is that your data and the external dictionary can be too different: some words may be missing in the dictionary, while others may be missing in your data.
Not just dictionary-based, but also context-aware, e.g., "white ber" would be corrected to "white bear," while "dark ber" would be corrected to "dark beer." The context might not just be a neighboring word in your query, but also your location, time of day, the current sentence's grammar (to change "there" to "their" or not), your search history, and virtually any other factors that can affect your intent.
Another classic approach is to use previous search queries as the dataset for spell correction. This is even more utilized in autocomplete functionality but makes sense for autocorrect too. The idea is that users are mostly right with spelling, so we can use words from their search history as a source of truth, even if we don't have the words in our documents or use an external dictionary. Context-awareness is also possible here.
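
To make the first approach concrete, here is a small sketch of dictionary-based correction using Levenshtein distance. It is an illustration of the idea only, not Manticore's implementation; the dictionary contents, the document counts, and the function names are made up for the example.

```python
def levenshtein(a, b):
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        prev = curr
    return prev[-1]


def correct(word, doc_counts, max_distance=2):
    """Return the closest dictionary word, preferring frequent words on ties."""
    best = None
    for candidate, docs in doc_counts.items():
        distance = levenshtein(word, candidate)
        if distance > max_distance:
            continue
        key = (distance, -docs, candidate)
        if best is None or key < best:
            best = key
    return best[2] if best else None


# Hypothetical dictionary built from your own data: word -> document count.
doc_counts = {"iphone": 120, "phone": 300, "ipod": 40}
print(correct("ipone", doc_counts))
```

Scanning the whole dictionary like this is only practical for small vocabularies; it is shown here for clarity, not performance.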

Manticore provides the fuzzy search option and the commands CALL QSUGGEST and CALL SUGGEST that can be used for automatic spell correction purposes.

The Fuzzy Search feature allows for more flexible matching by accounting for slight variations or misspellings in the search query. It works similarly to a normal SELECT SQL statement or a /search JSON request but provides additional parameters to control the fuzzy matching behavior.

NOTE: The fuzzy option requires Manticore Buddy. If it doesn't work, make sure Buddy is installed.

SELECT
  ...
  MATCH('...')
  ...
  OPTION fuzzy={0|1}
  [, distance=N]
  [, preserve={0|1}]
  [, layouts='{be,bg,br,ch,de,dk,es,fr,uk,gr,it,no,pt,ru,se,ua,us}']

Note: When conducting a fuzzy search via SQL, the MATCH clause should not contain any full-text operators except the phrase search operator and should only include the words you intend to match.

Examples (in order: SQL, SQL with additional filters, JSON, SQL with preserve option, JSON with preserve option):

SELECT * FROM mytable WHERE MATCH('someting') OPTION fuzzy=1, layouts='us,ua', distance=2;

Example of a more complex Fuzzy search query with additional filters:

SELECT * FROM mytable WHERE MATCH('someting') AND (category='books' AND price < 20) OPTION fuzzy=1;

POST /search
{
  "table": "test",
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "*": "ghbdtn"
          }
        }
      ]
    }
  },
  "options": {
    "fuzzy": true,
    "layouts": ["us", "ru"],
    "distance": 2
  }
}

SELECT * FROM mytable WHERE MATCH('hello wrld') OPTION fuzzy=1, preserve=1;

POST /search
{
  "table": "test",
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "*": "hello wrld"
          }
        }
      ]
    }
  },
  "options": {
    "fuzzy": true,
    "preserve": 1
  }
}

Response

+------+-------------+
| id   | content     |
+------+-------------+
|    1 | something   |
|    2 | some thing  |
+------+-------------+
2 rows in set (0.00 sec)

POST /search
{
  "table": "table_name",
  "query": {
    <full-text query>
  },
  "options": {
    "fuzzy": {true|false}
    [,"layouts": ["be","bg","br","ch","de","dk","es","fr","uk","gr","it","no","pt","ru","se","ua","us"]]
    [,"distance": N]
    [,"preserve": {0|1}]
  }
}
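
The same fuzzy settings are written differently in the two interfaces: in the SQL examples, layouts is a quoted comma-separated string, while in the JSON examples it is an array and fuzzy is a boolean. The sketch below produces both forms from one set of values; the helper name fuzzy_options is hypothetical.

```python
def fuzzy_options(distance=None, preserve=None, layouts=None):
    """Return (SQL OPTION clause, JSON "options" dict) for a fuzzy search."""
    sql_parts = ["fuzzy=1"]
    json_options = {"fuzzy": True}

    if distance is not None:
        sql_parts.append(f"distance={int(distance)}")
        json_options["distance"] = int(distance)
    if preserve is not None:
        flag = int(bool(preserve))
        sql_parts.append(f"preserve={flag}")
        json_options["preserve"] = flag
    if layouts is not None:
        # SQL takes a quoted comma-separated string, JSON takes an array.
        sql_parts.append("layouts='" + ",".join(layouts) + "'")
        json_options["layouts"] = list(layouts)

    return "OPTION " + ", ".join(sql_parts), json_options


sql_clause, json_options = fuzzy_options(distance=2, layouts=["us", "ua"])
print(sql_clause)
print(json_options)
```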

Note: If you use the query_string, be aware that it does not support full-text operators except the phrase search operator. The query string should consist solely of the words you wish to match.

fuzzy: Turn fuzzy search on or off.
distance: Set the Levenshtein distance for matching. The default is 2.
preserve: 0 or 1 (default: 0). When set to 1, keeps words that don't have fuzzy matches in the search results (e.g., "hello wrld" returns both "hello wrld" and "hello world"). When set to 0, only returns words with successful fuzzy matches (e.g., "hello wrld" returns only "hello world"). Particularly useful for preserving short words or proper nouns that may not exist in Manticore Search.
layouts: Keyboard layouts for detecting typing errors caused by keyboard layout mismatches (e.g., typing "ghbdtn" instead of "привет" when using the wrong layout). Manticore compares character positions across different layouts to suggest corrections. Requires at least 2 layouts to effectively detect mismatches. No layouts are used by default. Use an empty string '' (SQL) or array [] (JSON) to turn this off. Supported layouts include:
- be - Belgian AZERTY layout
- bg - Standard Bulgarian layout
- br - Brazilian QWERTY layout
- ch - Swiss QWERTZ layout
- de - German QWERTZ layout
- dk - Danish QWERTY layout
- es - Spanish QWERTY layout
- fr - French AZERTY layout
- uk - British QWERTY layout
- gr - Greek QWERTY layout
- it - Italian QWERTY layout
- no - Norwegian QWERTY layout
- pt - Portuguese QWERTY layout
- ru - Russian JCUKEN layout
- se - Swedish QWERTY layout
- ua - Ukrainian JCUKEN layout
- us - American QWERTY layout
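
The idea of comparing character positions across layouts can be illustrated with a simple key-position mapping between the American QWERTY and Russian JCUKEN layouts. This is only a sketch of the concept with lowercase letter and punctuation keys; it does not reproduce how Manticore implements layout detection, and the names below are made up for the example.

```python
# Keys in the same physical position, row by row (lowercase only).
US_KEYS = "qwertyuiop[]" "asdfghjkl;'" "zxcvbnm,."
RU_KEYS = "йцукенгшщзхъ" "фывапролджэ" "ячсмитьбю"

US_TO_RU = str.maketrans(US_KEYS, RU_KEYS)
RU_TO_US = str.maketrans(RU_KEYS, US_KEYS)


def relayout(text, table):
    """Re-type `text` as if the other keyboard layout had been active."""
    # Characters missing from the table (digits, spaces) are left unchanged.
    return text.lower().translate(table)


print(relayout("ghbdtn", US_TO_RU))
```

A fuller version would generate one candidate per configured layout pair and keep the candidates that actually occur in the table's dictionary.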

For a demonstration of the fuzzy search functionality, see:
Blog post about Fuzzy Search and Autocomplete - https://manticoresearch.com/blog/new-fuzzy-search-and-autocomplete/

CALL QSUGGEST and CALL SUGGEST are both accessible via SQL and support querying both local (plain and real-time) and distributed tables. The syntax is as follows:

CALL QSUGGEST(<word or words>, <table name> [,options])
CALL SUGGEST(<word or words>, <table name> [,options])

options: N as option_name[, M as another_option, ...]

These commands provide all suggestions from the dictionary for a given word. They work only on tables with infixing enabled and dict=keywords. They return the suggested keywords, Levenshtein distance between the suggested and original keywords, and the document statistics of the suggested keyword.

If the first parameter contains multiple words, then:

CALL QSUGGEST will return suggestions only for the last word, ignoring the rest.
CALL SUGGEST will return suggestions only for the first word.

That's the only difference between them. Several options are supported for customization:

Option	Description	Default
limit	Returns N top matches	5
max_edits	Keeps only dictionary words with a Levenshtein distance less than or equal to N	4
result_stats	Provides Levenshtein distance and document count of the found words	1 (enabled)
delta_len	Keeps only dictionary words with a length difference less than N	3
max_matches	Number of matches to keep	25
reject	Rejected words are matches that are not better than those already in the match queue. They are put in a rejected queue that gets reset when a word actually makes it into the match queue. This parameter defines the size of the rejected queue (as reject*max(max_matches,limit)). If the rejected queue is filled, the engine stops looking for potential matches	4
result_line	Alternate display mode that returns all suggestions, distances, and docs, each in its own row	0
non_char	Do not skip dictionary words with non-alphabetic symbols	0 (skip such words)
sentence	Returns the original sentence along with the last word replaced by the matched one.	0 (do not return the full sentence)

To show how it works, let's create a table and add a few documents to it.

create table products(title text) min_infix_len='2';
insert into products values (0,'Crossbody Bag with Tassel'), (0,'microfiber sheet set'), (0,'Pet Hair Remover Glove');

In the example below, the mistyped word "crossbudy" gets corrected to "crossbody". By default, CALL SUGGEST/QSUGGEST return:

distance - the Levenshtein distance, i.e., how many edits are needed to convert the given word into the suggestion
docs - number of documents containing the suggested word

To disable the display of these statistics, you can use the option 0 as result_stats.

Example

call suggest('crossbudy', 'products');

Response

+-----------+----------+------+
| suggest   | distance | docs |
+-----------+----------+------+
| crossbody | 1        | 1    |
+-----------+----------+------+

If the first parameter is not a single word, but multiple, then CALL SUGGEST will return suggestions only for the first word.

Example

call suggest('bagg with tasel', 'products');

Response

+---------+----------+------+
| suggest | distance | docs |
+---------+----------+------+
| bag     | 1        | 1    |
+---------+----------+------+

If the first parameter is not a single word, but multiple, then CALL QSUGGEST will return suggestions only for the last word.

Example

CALL QSUGGEST('bagg with tasel', 'products');

Response

+---------+----------+------+
| suggest | distance | docs |
+---------+----------+------+
| tassel  | 1        | 1    |
+---------+----------+------+

Adding 1 as sentence makes CALL QSUGGEST return the entire sentence with the last word corrected.

Example

CALL QSUGGEST('bag with tasel', 'products', 1 as sentence);

Response

+-------------------+----------+------+
| suggest           | distance | docs |
+-------------------+----------+------+
| bag with tassel   | 1        | 1    |
+-------------------+----------+------+
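
The difference between the two commands, and the effect of the sentence option, can be sketched in a few lines. Splitting on whitespace is a simplifying assumption (the real tokenization depends on the table settings), and both function names are hypothetical.

```python
def word_to_correct(text, command):
    """Return the word that CALL SUGGEST or CALL QSUGGEST would look at."""
    words = text.split()
    if not words:
        return None
    command = command.upper()
    if command == "SUGGEST":
        return words[0]   # CALL SUGGEST uses only the first word
    if command == "QSUGGEST":
        return words[-1]  # CALL QSUGGEST uses only the last word
    raise ValueError(f"unsupported command: {command}")


def with_last_word_replaced(text, suggestion):
    """Mimic `1 as sentence`: the original text with its last word corrected."""
    words = text.split()
    return " ".join(words[:-1] + [suggestion])


print(word_to_correct("bagg with tasel", "SUGGEST"))
print(word_to_correct("bagg with tasel", "QSUGGEST"))
print(with_last_word_replaced("bag with tasel", "tassel"))
```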

The 1 as result_line option changes the way the suggestions are displayed in the output. Instead of showing each suggestion in a separate row, it returns all suggestions in one row, all distances in another, and all docs in a third.

This interactive course shows how CALL SUGGEST works in a little web app.
