Manticore Search Manual

Manticore Search is a multi-storage database designed specifically for search, including full-text search.

Manticore was born in 2017 as a continuation of Sphinx Search. We took the best from Sphinx (a C++ core and a focus on low-level data structures and fine-tuned algorithms), added a lot of new functionality, fixed hundreds of bugs, made it easier to use, kept it open source, and made Manticore Search an even more lightweight and extremely fast database for search.

Over 20 full-text operators and over 20 ranking factors
Custom ranking
Stemming
Lemmatization
Stopwords
Synonyms
Wordforms
Advanced tokenization at character and word level
Proper Chinese segmentation
Text highlighting

A modern MPP architecture and smart query parallelization let you fully utilize all your CPU cores to lower response time as much as possible when needed.

Traditional row-wise storage for small, medium and big size datasets.

Columnar storage support via the Manticore Columnar Library for bigger datasets (much bigger than can fit in RAM).

Manticore's native syntax is SQL. It speaks SQL over HTTP and over the MySQL protocol. You can use your preferred MySQL client to connect to a Manticore Search server via the MySQL protocol from any programming language.
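As a rough illustration of SQL over HTTP, the sketch below builds (but does not send) a POST request to the /sql endpoint using only the Python standard library. The host, the port 9308, the form-encoded `query=` body and the table name `products` are assumptions made for this example; check them against your own setup and Manticore version.

```python
import urllib.parse
import urllib.request


def build_sql_request(sql, host="127.0.0.1", port=9308):
    """Build a POST request carrying one SQL statement to the /sql endpoint.

    Assumptions: the server listens for HTTP on host:port and accepts a
    form-encoded body of the shape query=<statement>.
    """
    body = urllib.parse.urlencode({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}/sql",
        data=body,
        method="POST",
    )


# 'products' is a hypothetical table name used only for illustration.
request = build_sql_request("SELECT * FROM products WHERE MATCH('red shoes')")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually run the query against a live server:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to inspect the exact body before pointing it at a real server.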

To provide a more programmatic way to manage your data and schemas, Manticore provides an HTTP JSON protocol. It is very similar to the one from Elasticsearch.
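To give a feel for the JSON protocol, here is a minimal sketch that assembles a full-text search request body. It assumes the body is POSTed to a /search endpoint and uses the keys `index`, `query` and `limit` with a `match` clause; the index name `products` and the field name `title` are hypothetical.

```python
import json


def build_search_body(index, field, text, limit=10):
    """Return a JSON string for a full-text search request.

    Assumptions: the request is POSTed to /search, and the body uses the
    keys "index", "query" and "limit" with a "match" clause on one field.
    """
    body = {
        "index": index,
        "query": {"match": {field: text}},
        "limit": limit,
    }
    return json.dumps(body)


# 'products' and 'title' are hypothetical names used only for illustration.
payload = build_search_body("products", "title", "red shoes", limit=5)
print(payload)
```

Because the request is an ordinary dictionary until the last step, an application can add filters or change the limit programmatically before serializing it.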

You can create, update and delete indexes online as well as provide schemas in a configuration file.

Being written fully in C++, Manticore Search starts fast and doesn't take much RAM. Low-level optimizations give good performance.

After a new document is added or updated it can be read immediately.

We provide interactive courses for easier learning.

Manticore is not fully ACID-compliant, but it supports isolated transactions for atomic changes and binary logging for safe writes.

Data can be distributed across servers and data centers. Any Manticore Search node can be both a load balancer and a data node. Manticore implements synchronous multi-master replication using the Galera library, which guarantees consistency between all data nodes and no data loss.

The Manticore indexer tool and rich configuration syntax help to sync existing data from MySQL, PostgreSQL, any database that speaks ODBC, and any other technology that can generate simple XML or CSV.

You can integrate Manticore Search with a MySQL/MariaDB server via the FEDERATED engine or use Manticore through ProxySQL.

Manticore has a special index type called "percolate" which implements search in reverse: you index your queries rather than your data. It's an extremely powerful tool for full-text data stream filtering: just put all your queries in the index, process your data stream by sending each batch of documents to Manticore Search, and you'll get back only those that match some of your stored queries.
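The idea of search in reverse can be shown with a toy model in plain Python. This is not how Manticore implements percolate indexes; it is a simplified sketch in which a stored query matches a document when all of the query's words occur in it. The class name `ToyPercolator` and the sample data are invented for illustration.

```python
def tokenize(text):
    """Lowercase a text and split it into a set of whitespace-separated words."""
    return set(text.lower().split())


class ToyPercolator:
    """Stores queries and reports which of them match an incoming document.

    Simplification: a stored query matches when all of its words occur in
    the document. Real full-text operators are far richer than this.
    """

    def __init__(self):
        self.queries = {}

    def store(self, query_id, query_text):
        self.queries[query_id] = tokenize(query_text)

    def matching_queries(self, document):
        words = tokenize(document)
        return sorted(qid for qid, terms in self.queries.items() if terms <= words)


percolator = ToyPercolator()
percolator.store("q1", "red shoes")
percolator.store("q2", "blue bag")

stream = ["Red leather shoes on sale", "Green hat", "A blue canvas bag"]
# Keep only the documents that match at least one stored query.
matched = [(doc, percolator.matching_queries(doc)) for doc in stream]
matched = [(doc, ids) for doc, ids in matched if ids]
print(matched)
```

The roles are swapped compared with ordinary search: the queries are the stored collection, and each incoming document acts as the lookup key.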

Manticore's possible applications include, but are not limited to:

Full-text search
- when used with small data volume you can benefit from powerful full-text search syntax and low RAM consumption (as little as 7-8 megabytes)
- when used with big data you can benefit from Manticore's high availability capabilities and ability to serve very large indexes, each taking hundreds of gigabytes of RAM
OLAP: using Manticore Search + Manticore Columnar Library, you can handle terabytes of data on a single server or multiple servers for data analysis
Faceted search
Geo-spatial search
Spell correction
Autocomplete
Data stream filtering

Read this first

Last modified: December 27, 2021

The manual is arranged as a reflection of the most likely way you would use Manticore:

starting from some basic information about it and how to install and connect
through some essential things like adding documents and running searches
to some performance optimization tips and tricks and extending Manticore with the help of plugins and custom functions

Key sections of the manual are marked with the ✔️ sign in the menu for your convenience, since their corresponding functionality is the most used. If you are new to Manticore, we highly recommend that you do not skip them.

If you are looking for a quick understanding of how Manticore works in general, the ⚡ Quick start guide section is a good place to start.

Each query example has a little icon 📋 in the top-right corner.

You can use it to copy examples to the clipboard. If the query is an HTTP request, it will be copied as a curl command. You can configure the host/port if you press ⚙️.

We love search and we've done our best to make searching in this manual as convenient as possible. Of course, it's backed by Manticore Search. Besides using the search bar, which requires opening the manual first, there is a very easy way to find something: just open mnt.cr/your-search-keyword.

There are a few things you need to understand about Manticore Search that can help you follow the best practices for using it.

A real-time index allows adding, updating and deleting documents with immediate availability of the changes.
A plain index is a mostly immutable data structure and a basic element used by real-time indexes. A plain index stores a set of documents, their common dictionary and indexation settings. One real-time index can consist of multiple plain indexes (chunks), but besides that Manticore provides direct access to building plain indexes using the indexer tool. This makes sense when your data is mostly immutable and you therefore don't need a real-time index.
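The write operations a real-time index accepts can be sketched as JSON request bodies. The helpers below only build the payloads and send nothing. The endpoint paths /insert, /replace and /delete, the `index`/`id`/`doc` keys, and the index name `products` are assumptions for this example; verify them against the JSON interface of your Manticore version.

```python
import json


def insert_request(index, doc_id, doc):
    """Return (path, body) for adding a new document."""
    return "/insert", json.dumps({"index": index, "id": doc_id, "doc": doc})


def replace_request(index, doc_id, doc):
    """Return (path, body) for overwriting a document with the same id."""
    return "/replace", json.dumps({"index": index, "id": doc_id, "doc": doc})


def delete_request(index, doc_id):
    """Return (path, body) for deleting a document by id."""
    return "/delete", json.dumps({"index": index, "id": doc_id})


# 'products' is a hypothetical real-time index used only for illustration.
requests_to_send = [
    insert_request("products", 1, {"title": "Red shoes"}),
    replace_request("products", 1, {"title": "Red leather shoes"}),
    delete_request("products", 1),
]
for path, body in requests_to_send:
    print(path, body)
```

A plain index offers no such per-document write path, which is why it suits data that rarely changes.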

Manticore Search works in two modes:

Real-time mode (RT mode). This is the default mode and lets you manage your data schema imperatively:
- allows managing your data schema online using SQL commands CREATE/ALTER/DROP TABLE and their equivalents in non-SQL clients
- in the configuration file you need to define only server-related settings including data_dir
Plain mode lets you define your data schemas in a configuration file, i.e. it provides declarative schema management. It makes sense in three cases:
- when you only deal with plain indexes
- when your data schema is very stable and you don't need replication (as it's available only in the RT mode)
- when you have to make your data schema portable (e.g. for easier deployment of it on a new server)

You cannot combine the two modes. You choose one by whether you specify data_dir in your configuration file: specifying it, which is the default behaviour, selects the RT mode. If you are unsure, our recommendation is to follow the RT mode: even if you need a plain index, you can build it with a separate plain index config and import it into your main Manticore instance.
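The rule that data_dir decides the mode can be mirrored by a small check over a configuration text. This is a toy helper, not part of Manticore: `detect_mode` is a hypothetical function, the parsing is deliberately naive, and the sample configuration values (ports, paths, index and source names) are illustrative only.

```python
import re


def detect_mode(config_text):
    """Return "rt" if the config sets data_dir, otherwise "plain".

    Simplification: any non-comment line of the form `data_dir = ...`
    counts; sections and line continuations are not parsed.
    """
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if re.match(r"data_dir\s*=", line):
            return "rt"
    return "plain"


# Illustrative configs; all values are assumptions.
rt_config = """
searchd {
    listen = 9306:mysql
    listen = 9308:http
    data_dir = /var/lib/manticore
}
"""

plain_config = """
index products {
    type = plain
    source = products_src
    path = /var/lib/manticore/products
}
searchd {
    listen = 9306:mysql
}
"""

print(detect_mode(rt_config))
print(detect_mode(plain_config))
```

In the first sample the configuration holds only server settings, so schemas would be managed online. In the second the schema lives in the file itself.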

Real-time indexes can be used in both RT and plain modes. In the RT mode a real-time index is defined with a CREATE TABLE command, while in the plain mode it is defined in the configuration file. Plain (offline) indexes are supported only in the plain mode. Plain indexes cannot be created in the RT mode, but existing plain indexes made in the plain mode can be converted to real-time indexes and imported in the RT mode.

Manticore provides multiple ways and interfaces to manage your schemas and data, but the two main ones are:

SQL. This is Manticore's native language, which enables all of Manticore's functionality. The best practice is to use SQL to:
- manage your schemas and do other DBA routines, as it's the easiest way to do that
- design your queries, as SQL is much closer to natural language than the JSON DSL, which is important when you design something new. You can use Manticore SQL via any MySQL client or the /sql endpoint.
JSON. Most functionality is also available via a JSON domain-specific language. This is especially useful when you integrate Manticore with your application, as with JSON you can do it more programmatically than with SQL. The best practice is to first explore how to do something via SQL and then use JSON to integrate it into your application.


Last modified: January 12, 2021

Last modified: February 10, 2021

Introduction

Our key features are:

Powerful and fast full-text searching which works fine for small and big datasets

Modern MPP architecture

Row-wise storage

Column-oriented storage

SQL-first

JSON over HTTP

Declarative and imperative schema management

Power of C++

Real-time inserts

Interactive courses

ACID compliance

Built-in replication and load balancing

Can sync from MySQL/PostgreSQL/ODBC/xml/csv out of the box

Integrations

Stream filtering

Possible applications:

Read this first

About this manual

Do not skip ✔️

Quick start guide

Using examples

Search in this manual

Best practices

Real-time index vs plain index

Real-time mode vs plain mode

SQL vs JSON