Changelog

Next release

Minor changes

  • ⚠️ BREAKING CHANGE: The secondary index file format has changed. If you use secondary indexes for searching (searchd.secondary_indexes = 1, which was not the default in previous versions), the new Manticore version will skip loading older index versions to prevent a performance drop. The recommendations are:
    • remove secondary index files during upgrade:
      • systemctl stop manticore
      • upgrade Manticore version
      • remove .spidx index files
      • start Manticore again
      • use ALTER TABLE <table name> REBUILD SECONDARY (not yet implemented❗) to recover secondary indexes
    • If you are running a replication cluster, full cluster restart should be performed with removal of .spidx files and ALTER TABLE <table name> REBUILD SECONDARY on all the nodes. Read about restarting a cluster for more details.
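The "remove .spidx index files" step above can be previewed before anything is deleted. The following is a minimal sketch, not an official tool: the helper names are hypothetical, and the data directory you pass in (for example the default data directory of your installation) is an assumption you need to adjust. Run it only after Manticore has been stopped.

```python
from pathlib import Path


def find_spidx_files(data_dir):
    """Return all .spidx files under data_dir, sorted for stable output."""
    return sorted(Path(data_dir).rglob("*.spidx"))


def remove_spidx_files(data_dir, dry_run=True):
    """List secondary index files and delete them unless dry_run is True.

    Hypothetical helper: it only looks at the file extension and does not
    check whether searchd is still running.
    """
    files = find_spidx_files(data_dir)
    for path in files:
        if not dry_run:
            path.unlink()
    return files
```

Calling remove_spidx_files(data_dir) with the default dry_run=True only reports what would be removed; pass dry_run=False once the list looks right.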

Packaging

  • arm64 packages for macOS and Linux
  • easier package building for contributors

Version 5.0.2

Released: May 30th 2022

Bugfixes

  • Issue #791 - wrong stack size could cause a crash.

Version 5.0.0

Released: May 18th 2022

Major new features

  • 🔬 Support for Manticore Columnar Library 1.15.2, which enables Secondary indexes beta version. Building secondary indexes is on by default for plain and real-time columnar and row-wise indexes (if Manticore Columnar Library is in use), but to enable it for searching you need to set secondary_indexes = 1 either in your configuration file or using SET GLOBAL. The new functionality is supported in all operating systems except old Debian Stretch and Ubuntu Xenial.

  • Read-only mode: you can now specify listeners that process only read queries, discarding any writes.

  • New /cli endpoint that makes running SQL queries over HTTP even easier.

  • Faster bulk INSERT/REPLACE/DELETE via JSON over HTTP: previously you could provide multiple write commands via HTTP JSON protocol, but they were processed one by one, now they are handled as a single transaction.

  • #720 Nested filters support in JSON protocol. Previously you couldn't code things like a=1 and (b=2 or c=3) in JSON: must (AND), should (OR) and must_not (NOT) worked only on the highest level. Now they can be nested.

  • Support for Chunked transfer encoding in HTTP protocol. You can now use chunked transfer in your application to transfer large batches with lower resource consumption (since you don't need to calculate Content-Length). On the server's side Manticore now always processes incoming HTTP data in streaming fashion without waiting for the whole batch to be transferred as previously, which:

    • decreases peak RAM consumption, which lowers a chance of OOM
    • decreases response time (our tests showed 11% decrease for processing a 100MB batch)
    • lets you overcome max_packet_size and transfer batches much larger than the largest allowed value of max_packet_size (128MB), e.g. 1GB at once.
  • #719 HTTP interface support of 100 Continue: now you can transfer large batches from curl (including curl libraries used by various programming languages) which by default does Expect: 100-continue and waits some time before actually sending the batch. Previously you had to add Expect: header, now it's not needed.

  • ⚠️ BREAKING CHANGE: Pseudo sharding is enabled by default. If you want to disable it make sure you add pseudo_sharding = 0 to section searchd of your Manticore configuration file.

  • Having at least one full-text field in a real-time/plain index is not mandatory anymore. You can now use Manticore even in cases not having anything to do with full-text search.

  • Fast fetching for attributes backed by Manticore Columnar Library: queries like select * from <columnar table> are now much faster than previously, especially if there are many fields in the schema.

  • ⚠️ BREAKING CHANGE: Implicit cutoff. Manticore now doesn't spend time and resources processing data you don't need in the result set which will be returned. The downside is that it affects total_found in SHOW META and hits.total in JSON output. It is now only accurate in case you see total_relation: eq while total_relation: gte means the actual number of matching documents is greater than the total_found value you've got. To retain the previous behaviour you can use search option cutoff=0, which makes total_relation always eq.

  • ⚠️ BREAKING CHANGE: All full-text fields are now stored by default. You need to use stored_fields = (empty value) to make all fields non-stored (i.e. revert to the previous behaviour).

  • #715 HTTP JSON supports search options.
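As an illustration of the nested filters item (#720) above, here is a sketch of a request body for a=1 and (b=2 or c=3). The table name "test" is hypothetical, and the use of "equals" nodes inside "bool" is my reading of the JSON query syntax, so verify it against the JSON search documentation before relying on it.

```python
import json


def nested_filter_query(index):
    """Build a JSON search body for: a=1 AND (b=2 OR c=3)."""
    return {
        "index": index,
        "query": {
            "bool": {
                # must = AND across its children
                "must": [
                    {"equals": {"a": 1}},
                    # nested bool: should = OR across its children
                    {"bool": {"should": [
                        {"equals": {"b": 2}},
                        {"equals": {"c": 3}},
                    ]}},
                ]
            }
        },
    }


# Serialized body, ready to be sent as the payload of an HTTP search request.
body = json.dumps(nested_filter_query("test"))
```

The point of the example is only the shape: a "bool" node appearing as a child of "must", which earlier versions rejected because must/should/must_not worked only at the top level.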

Minor changes

  • ⚠️ BREAKING CHANGE: Index meta file format change. Previously meta files (.meta, .sph) were in binary format, now it's just json. The new Manticore version will convert older indexes automatically, but:
    • you may get a warning like WARNING: ... syntax error, unexpected TOK_IDENT
    • you won't be able to run the index with previous Manticore versions, make sure you have a backup
  • ⚠️ BREAKING CHANGE: Session state support with help of HTTP keep-alive. This makes HTTP stateful when the client supports it too. For example, using the new /cli endpoint and HTTP keep-alive (which is on by default in all browsers) you can call SHOW META after SELECT and it will work the same way it works via mysql. Note, previously Connection: keep-alive HTTP header was supported too, but it only caused reusing the same connection. Since this version it also makes the session stateful.
  • You can now specify columnar_attrs = * to define all your attributes as columnar in the plain mode which is useful in case the list is long.
  • Faster replication SST
  • ⚠️ BREAKING CHANGE: Replication protocol has been changed. If you are running a replication cluster, then when upgrading to Manticore 5 you need to:
    • stop all your nodes first cleanly
    • and then start the node which was stopped last with --new-cluster (run tool manticore_new_cluster in Linux).
    • read about restarting a cluster for more details.
  • Replication improvements:
    • Faster SST
    • Noise resistance which can help in case of unstable network between replication nodes
    • Improved logging
  • Security improvement: Manticore now listens on 127.0.0.1 instead of 0.0.0.0 in case no listen at all is specified in config. Even though in the default configuration which is shipped with Manticore Search the listen setting is specified and it's not typical to have a configuration with no listen at all, it's still possible. Previously Manticore would listen on 0.0.0.0 which is not secure, now it listens on 127.0.0.1 which is usually not exposed to the Internet.
  • Faster aggregation over columnar attributes.
  • Increased AVG() accuracy: previously Manticore used float internally for aggregations, now it uses double which increases the accuracy significantly.
  • Improved support for JDBC MySQL driver.
  • DEBUG malloc_stats support for jemalloc.
  • optimize_cutoff is now available as a per-table setting which can be set when you CREATE or ALTER a table.
  • ⚠️ BREAKING CHANGE: query_log_format is now sphinxql by default. If you are used to plain format you need to add query_log_format = plain to your configuration file.
  • Significant memory consumption improvements: Manticore now consumes significantly less RAM under long and intensive insert/replace/optimize workloads when stored fields are used.
  • shutdown_timeout default value was increased from 3 seconds to 60 seconds.
  • Commit ffd0499d Support for Java mysql connector >= 6.0.3: in Java mysql connection 6.0.3 they changed the way they connect to mysql which broke compatibility with Manticore. The new behaviour is now supported.
  • Commit 1da6dbec disabled saving a new disk chunk on loading an index (e.g. on searchd startup).
  • Issue #746 Support for glibc >= 2.34.
  • Issue #784 count 'VIP' connections separately from usual (non-VIP). Previously VIP connections were counted towards the max_connections limit, which could cause "maxed out" error for non-VIP connections. Now VIP connections are not counted towards the limit. Current number of VIP connections can be also seen in SHOW STATUS and status.
  • ID can now be specified explicitly.
  • Issue #687 support zstd compression for mysql proto

⚠️ Other minor breaking changes

  • ⚠️ BM25F formula has been slightly updated to improve search relevance. This only affects search results in case you use function BM25F(), it doesn't change behaviour of the default ranking formula.
  • ⚠️ Changed behaviour of REST /sql endpoint: /sql?mode=raw now requires escaping and returns an array.
  • ⚠️ Format change of the response of /bulk INSERT/REPLACE/DELETE requests:
    • previously each sub-query constituted a separate transaction and resulted in a separate response
    • now the whole batch is considered a single transaction, which returns a single response
  • ⚠️ Search options low_priority and boolean_simplify now require a value (0/1): previously you could do SELECT ... OPTION low_priority, boolean_simplify, now you need to do SELECT ... OPTION low_priority=1, boolean_simplify=1.
  • ⚠️ If you are using old php, python or java clients please follow the corresponding link and find an updated version. The old versions are not fully compatible with Manticore 5.
  • ⚠️ HTTP JSON requests are now logged in different format in mode query_log_format=sphinxql. Previously only full-text part was logged, now it's logged as is.
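Because a /bulk batch is now one transaction with one response, a client typically builds the whole batch as a single NDJSON body. The sketch below assumes each line has the shape {"<command>": {"index": ..., "id": ..., "doc": ...}}, following the update example shown elsewhere in this changelog; the table name "idx", the document fields and the helper name are hypothetical.

```python
import json


def bulk_body(actions):
    """Serialize (command, payload) pairs as NDJSON: one JSON object per line."""
    lines = [json.dumps({command: payload}) for command, payload in actions]
    # NDJSON bodies conventionally end with a trailing newline.
    return "\n".join(lines) + "\n"


batch = bulk_body([
    ("insert", {"index": "idx", "id": 1, "doc": {"mode": 0}}),
    ("replace", {"index": "idx", "id": 2, "doc": {"mode": 1}}),
    ("delete", {"index": "idx", "id": 3}),
])
```

The resulting string would be posted with Content-Type: application/x-ndjson. Since the batch is processed as a single transaction, error handling on the client side should expect one response for the whole body rather than one per line.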

New packages

  • ⚠️ BREAKING CHANGE: because of the new structure when you upgrade to Manticore 5 it's recommended to remove old packages before you install the new ones:

    • RPM-based: yum remove manticore*
    • Debian and Ubuntu: apt remove manticore*
  • New deb/rpm packages structure. Previous versions provided:

    • manticore-server with searchd (main search daemon) and all needed for it
    • manticore-tools with indexer and indextool
    • manticore including everything
    • manticore-all RPM as a meta package referring to manticore-server and manticore-tools

    The new structure is:

    • manticore - deb/rpm meta package which installs all the above as dependencies
    • manticore-server-core - searchd and everything to run it alone
    • manticore-server - systemd files and other supplementary scripts
    • manticore-tools - indexer, indextool and other tools
    • manticore-common - default configuration file, default data directory, default stopwords
    • manticore-icudata, manticore-dev, manticore-converter didn't change much
    • .tgz bundle which includes all the packages
  • Support for Ubuntu Jammy

  • Support for Amazon Linux 2 via YUM repo

Bugfixes

  • Issue #287 out of memory while indexing RT index
  • Issue #604 Breaking change 3.6.0, 4.2.0 sphinxql-parser
  • Issue #667 FATAL: out of memory (unable to allocate 9007199254740992 bytes)
  • Issue #676 Strings not passed correctly to UDFs
  • Issue #698 Searchd crashes after trying to add a text column to a rt index
  • Issue #705 Indexer couldn't find all columns
  • Issue #709 Grouping by json.boolean works wrong
  • Issue #716 indextool commands related to index (eg. --dumpdict) failure
  • Issue #724 Fields disappear from the selection
  • Issue #727 .NET HttpClient Content-Type incompatibility when using application/x-ndjson
  • Issue #729 Field length calculation
  • Issue #730 create/insert into/drop columnar table has a memleak
  • Issue #731 Empty column in results under certain conditions
  • Issue #749 Crash of daemon on start
  • Issue #750 Daemon hangs on start
  • Issue #751 Crash at SST
  • Issue #752 Json attribute marked as columnar when engine='columnar'
  • Issue #753 Replication listens on 0
  • Issue #754 columnar_attrs = * is not working with csvpipe
  • Issue #755 Crash on select float in columnar in rt
  • Issue #756 Indextool changes rt index during check
  • Issue #757 Need a check for listeners port range intersections
  • Issue #758 Log original error in case RT index failed to save disk chunk
  • Issue #759 Only one error reported for RE2 config
  • Issue #760 RAM consumption changes in commit 5463778558586d2508697fa82e71d657ac36510f
  • Issue #761 3rd node doesn't make a non-primary cluster after dirty restart
  • Issue #762 Update counter gets increased by 2
  • Issue #763 New version 4.2.1 corrupt index created with 4.2.0 with morphology using
  • Issue #764 No escaping in json keys /sql?mode=raw
  • Issue #765 Using function hides other values
  • Issue #766 Memleak triggered by a line in FixupAttrForNetwork
  • Issue #767 Memleak in 4.2.0 and 4.2.1 related with docstore cache
  • Issue #768 Strange ping-pong with stored fields over network
  • Issue #769 lemmatizer_base reset to empty if not mentioned in 'common' section
  • Issue #770 pseudo_sharding makes SELECT by id slower
  • Issue #771 DEBUG malloc_stats output zeros when using jemalloc
  • Issue #772 Drop/add column makes value invisible
  • Issue #773 Can't add column bit(N) to columnar table
  • Issue #774 "cluster" gets empty on start in manticore.json
  • Commit 1da4ce89 HTTP actions are not tracked in SHOW STATUS
  • Commit 381000ab disable pseudo_sharding for low frequency single keyword queries
  • Commit 800325cc fixed stored attributes vs index merge
  • Commit cddfeed6 generalized distinct value fetchers; added specialized distinct fetchers for columnar strings
  • Commit fba4bb4f fixed fetching null integer attributes from docstore
  • Commit f3009a92 ranker could be specified twice in query log

Version 4.2.0, Dec 23 2021

Major new features

  • Pseudo-sharding support for real-time indexes and full-text queries. In the previous release we added limited pseudo sharding support. Starting from this version you can get all the benefits of pseudo sharding and your multi-core processor by just enabling searchd.pseudo_sharding. The coolest thing is that you don't need to do anything with your indexes or queries for that: just enable it, and if you have free CPU it will be used to lower your response time. It supports plain and real-time indexes for full-text, filtering and analytical queries. For example, enabling pseudo sharding can make most queries' response time about 10x lower on average on the Hacker News curated comments dataset multiplied 100 times (116 million docs in a plain index).

[Figure: Pseudo sharding on vs off in 4.2.0]

  • PQ transactions are now atomic and isolated. Previously PQ transactions support was limited. It enables much faster REPLACE into PQ, especially when you need to replace a lot of rules at once. Performance details:
    In 4.0.2 it takes 48 seconds to insert 1M PQ rules and 406 seconds to REPLACE just 40K in 10K batches:

root@perf3 ~ # mysql -P9306 -h0 -e "drop table if exists pq; create table pq (f text, f2 text, j json, s string) type='percolate';"; date; for m in `seq 1 1000`; do (echo -n "insert into pq (id,query,filters,tags) values "; for n in `seq 1 1000`; do echo -n "(0,'@f (cat | ( angry dog ) | (cute mouse)) @f2 def', 'j.json.language=\"en\"', '{\"tag1\":\"tag1\",\"tag2\":\"tag2\"}')"; [ $n != 1000 ] && echo -n ","; done; echo ";")|mysql -P9306 -h0; done; date; mysql -P9306 -h0 -e "select count(*) from pq"

Wed Dec 22 10:24:30 AM CET 2021
Wed Dec 22 10:25:18 AM CET 2021
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+

root@perf3 ~ # date; (echo "begin;"; for offset in `seq 0 10000 30000`; do n=0; echo "replace into pq (id,query,filters,tags) values "; for id in `mysql -P9306 -h0 -NB -e "select id from pq limit $offset, 10000 option max_matches=1000000"`; do echo "($id,'@f (tiger | ( angry bear ) | (cute panda)) @f2 def', 'j.json.language=\"de\"', '{\"tag1\":\"tag1\",\"tag2\":\"tag2\"}')"; n=$((n+1)); [ $n != 10000 ] && echo -n ","; done; echo ";"; done; echo "commit;") > /tmp/replace.sql; date
Wed Dec 22 10:26:23 AM CET 2021
Wed Dec 22 10:26:27 AM CET 2021
root@perf3 ~ # time mysql -P9306 -h0 < /tmp/replace.sql

real    6m46.195s
user    0m0.035s
sys     0m0.008s

Minor changes

  • optimize_cutoff is now available as a configuration option in section searchd. It's useful when you want to limit the RT chunks count in all your indexes to a particular number globally.
  • Commit 00874743 accurate count(distinct ...) and FACET ... distinct over several local physical indexes (real-time/plain) with identical fields set/order.
  • PR #598 bigint support for YEAR() and other timestamp functions.
  • Commit 8e85d4bc Adaptive rt_mem_limit. Previously Manticore Search was collecting exactly up to rt_mem_limit of data before saving a new disk chunk to disk, and while saving was still collecting up to 10% more (aka double-buffer) to minimize possible insert suspension. If that limit was also exhausted, adding new documents was blocked until the disk chunk was fully saved to disk. The new adaptive limit is built on the fact that we have auto-optimize now, so it's not a big deal if disk chunks do not fully respect rt_mem_limit and start flushing a disk chunk earlier. So, now we collect up to 50% of rt_mem_limit and save that as a disk chunk. Upon saving we look at the statistics (how much we've saved, how many new documents have arrived while saving) and recalculate the initial rate which will be used next time. For example, if we saved 90 million documents, and another 10 million docs arrived while saving, the rate is 90%, so we know that next time we can collect up to 90% of rt_mem_limit before starting flushing another disk chunk. The rate value is calculated automatically from 33.3% to 95%.
  • Issue #628 unpack_zlib for PostgreSQL source. Thank you, Dmitry Voronin for the contribution.
  • Commit 6d54cf2b indexer -v and --version. Previously you could still see indexer's version, but -v/--version were not supported.
  • Issue #662 infinite mlock limit by default when Manticore is started via systemd.
  • Commit 63c8cd05 spinlock -> op queue for coro rwlock.
  • Commit 41130ce3 environment variable MANTICORE_TRACK_RT_ERRORS useful for debugging RT segments corruption.

Breaking changes

  • Binlog version was increased, binlog from previous version won't be replayed, so make sure you stop Manticore Search cleanly during upgrade: no binlog files should be in /var/lib/manticore/binlog/ except binlog.meta after stopping the previous instance.
  • Commit 3f659f36 new column "chain" in show threads option format=all. It shows stack of some task info tickets, most useful for profiling needs, so if you are parsing show threads output be aware of the new column.
  • searchd.workers has been obsolete since 3.5.0 and is now deprecated. If you still have it in your configuration file, Manticore Search will start, but with a warning.
  • If you use PHP and PDO to access Manticore you need to set PDO::ATTR_EMULATE_PREPARES

Bugfixes

  • Issue #650 Manticore 4.0.2 slower than Manticore 3.6.3. 4.0.2 was faster than previous versions in terms of bulk inserts, but significantly slower for single document inserts. It's been fixed in 4.2.0.
  • Commit 22f4141b RT index could get corrupted under intensive REPLACE load, or it could crash
  • Commit 03be91e4 fixed average at merging groupers and group N sorter; fixed merge of aggregates
  • Commit 2ea575d3 indextool --check could crash
  • Commit 7ec76d4a RAM exhaustion issue caused by UPDATEs
  • Commit 658a727e daemon could hang on INSERT
  • Commit 46e42b9b daemon could hang on shutdown
  • Commit f8d7d517 daemon could crash on shutdown
  • Commit 733accf1 daemon could hang on crash
  • Commit f7f8bd8c daemon could crash on startup trying to rejoin cluster with invalid nodes list
  • Commit 14015561 distributed index could get completely forgotten in RT mode in case it couldn't resolve one of its agents on start
  • Issue #683 attr bit(N) engine='columnar' fails
  • Issue #682 create table fails, but leaves dir
  • Issue #663 Config fails with: unknown key name 'attr_update_reserve'
  • Issue #632 Manticore crash on batch queries
  • Issue #679 Batch queries causing crashes again with v4.0.3
  • Issue #643 Manticore 4.0.2 does not accept connections after batch of inserts
  • Issue #635 FACET query with ORDER BY JSON.field or string attribute could crash
  • Issue #634 Crash SIGSEGV on query with packedfactors
  • Commit 41657f15 morphology_skip_fields was not supported by create table

Version 4.0.2, Sep 21 2021

Major new features

  • Full support of Manticore Columnar Library. Previously Manticore Columnar Library was supported only for plain indexes. Now it's supported:

    • in real-time indexes for INSERT, REPLACE, DELETE, OPTIMIZE
    • in replication
    • in ALTER
    • in indextool --check
  • Automatic index compaction (Issue #478). Finally, you don't have to call OPTIMIZE manually or via a cron task or other kind of automation. Manticore now does it for you automatically and by default. You can set the default compaction threshold via the optimize_cutoff global variable.

  • Chunk snapshots and locks system revamp. These changes may be invisible from outside at first glance, but they improve the behaviour of many things happening in real-time indexes significantly. In a nutshell, previously most Manticore data manipulation operations relied on locks heavily, now we use disk chunk snapshots instead.

  • Significantly faster bulk INSERT performance into a real-time index. For example on Hetzner's server AX101 with SSD, 128 GB of RAM and AMD's Ryzen™ 9 5950X (16*2 cores) with 3.6.0 you could get 236K docs per second inserted into a table with schema name text, email string, description text, age int, active bit(1) (default rt_mem_limit, batch size 25000, 16 concurrent insert workers, 16 million docs inserted overall). In 4.0.2 the same concurrency/batch/count gives 357K docs per second.

  • ALTER can add/remove a full-text field (in RT mode). Previously it could only add/remove an attribute.

  • 🔬 Experimental: pseudo-sharding for full-scan queries, which parallelizes any non-full-text search query. Instead of preparing shards manually you can now just enable the new option searchd.pseudo_sharding and expect response time for non-full-text search queries to drop by a factor of up to the number of CPU cores. Note that it can easily occupy all available CPU cores, so if you care not only about latency but about throughput too, use it with caution.

Minor changes

  • Linux Mint and Ubuntu Hirsute Hippo are supported via APT repository
  • faster update by id via HTTP in big indexes in some cases (depends on the ids distribution). In 3.6.0:

time curl -X POST -d '{"update":{"index":"idx","id":4611686018427387905,"doc":{"mode":0}}}' -H "Content-Type: application/x-ndjson" http://127.0.0.1:6358/json/bulk

real    0m43.783s
user    0m0.008s
sys     0m0.007s

  • 671e65a2 - added caching to lemmatizer-uk

Breaking changes

  • the new version can read older indexes, but the older versions can't read Manticore 4's indexes
  • removed implicit sorting by id. Sort explicitly if required
  • charset_table's default value changes from 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+401->U+451, U+451 to non_cjk
  • OPTIMIZE happens automatically. If you don't need it make sure to set auto_optimize=0 in section searchd in the configuration file
  • Issue #616 ondisk_attrs_default were deprecated, now they are removed
  • for contributors: we now use Clang compiler for Linux builds as according to our tests it can build a faster Manticore Search and Manticore Columnar Library
  • if max_matches is not specified in a search query it gets updated implicitly with the lowest needed value for the sake of performance of the new columnar storage. It can affect metric total in SHOW META, but not total_found which is the actual number of found documents.

Migration from Manticore 3

  • make sure you stop Manticore 3 cleanly:
    • no binlog files should be in /var/lib/manticore/binlog/ (only binlog.meta should be in the directory)
    • otherwise the indexes Manticore 4 can't replay binlogs for won't be run
  • the new version can read older indexes, but the older versions can't read Manticore 4's indexes, so make sure you make a backup if you want to be able to roll back the new version easily
  • if you run a replication cluster make sure you:
    • stop all your nodes first cleanly
    • and then start the node which was stopped last with --new-cluster (run tool manticore_new_cluster in Linux).
    • read about restarting a cluster for more details
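The "clean stop" condition in the migration steps above (only binlog.meta left in the binlog directory) is easy to check before upgrading. The sketch below uses a hypothetical helper name; /var/lib/manticore/binlog/ is the path mentioned in this changelog, but your installation may use a different one.

```python
from pathlib import Path


def leftover_binlogs(binlog_dir):
    """Return names of files in binlog_dir other than binlog.meta."""
    return sorted(
        p.name
        for p in Path(binlog_dir).iterdir()
        if p.is_file() and p.name != "binlog.meta"
    )
```

An empty list from leftover_binlogs("/var/lib/manticore/binlog") suggests the previous instance was stopped cleanly; any other file names in the result mean there are binlogs the new version will not be able to replay.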

Bugfixes

  • Lots of replication issues have been fixed:
    • Commit 696f8649 - fixed crash during SST on joiner with active index; added sha1 verify at joiner node at writing file chunks to speed up index loading; added rotation of changed index files at joiner node on index load; added removal of index files at joiner node when active index gets replaced by a new index from donor node; added replication log points at donor node for sending files and chunks
    • Commit b296c55a - crash on JOIN CLUSTER in case the address is incorrect
    • Commit 418bf880 - during initial replication of a large index the joining node could fail with ERROR 1064 (42000): invalid GTID, (null), and the donor could become unresponsive while another node was joining
    • Commit 6fd350d2 - hash could be calculated wrong for a big index which could result in replication failure
    • Issue #615 - replication failed on cluster restart
  • Issue #574 - indextool --help doesn't display parameter --rotate
  • Issue #578 - searchd high CPU usage while idle after ca. a day
  • Issue #587 - flush .meta immediately
  • Issue #617 - manticore.json gets emptied
  • Issue #618 - searchd --stopwait fails under root. It also fixes systemctl behaviour (previously it was showing failure for ExecStop and didn't wait long enough for searchd to stop properly)
  • Issue #619 - INSERT/REPLACE/DELETE vs SHOW STATUS. command_insert, command_replace and others were showing wrong metrics
  • Issue #620 - charset_table for a plain index had a wrong default value
  • Commit 8f753688 - new disk chunks don't get mlocked
  • Issue #607 - Manticore cluster node crashes when unable to resolve a node by name
  • Issue #623 - replication of updated index can lead to undefined state
  • Commit ca03d228 - indexer could hang on indexing a plain index source with a json attribute
  • Commit 53c75305 - fixed not equal expression filter at PQ index
  • Commit ccf94e02 - fixed select windows at list queries above 1000 matches. SELECT * FROM pq ORDER BY id desc LIMIT 1000 , 100 OPTION max_matches=1100 was not working previously
  • Commit a0483fe9 - HTTPS request to Manticore could cause warning like "max packet size(8388608) exceeded"
  • Issue #648 - Manticore 3 could hang after a few updates of string attributes

Version 3.6.0, May 3rd 2021

Maintenance release before Manticore 4

Major new features

  • Support for Manticore Columnar Library for plain indexes. New setting columnar_attrs for plain indexes
  • Support for Ukrainian Lemmatizer
  • Fully revised histograms. When building an index Manticore also builds histograms for each field in it, which it then uses for faster filtering. In 3.6.0 the algorithm was fully revised and you can get a higher performance if you have a lot of data and do a lot of filtering.

Minor changes

Optimizations

  • faster JSON parsing, our tests show 3-4% lower latency on queries like WHERE json.a = 1
  • non-documented command DEBUG SPLIT as a prerequisite for automatic sharding/rebalancing

Bugfixes

  • Issue #584 - inaccurate and unstable FACET results
  • Issue #506 - Strange behavior when using MATCH: those who suffer from this issue need to rebuild the index as the problem was on the phase of building an index
  • Issue #387 - intermittent core dump when running query with SNIPPET() function
  • Stack optimizations useful for processing complex queries:
    • Issue #469 - SELECT results in CRASH DUMP
    • e8420cc7 - stack size detection for filter trees
  • Issue #461 - Update using the IN condition does not take effect correctly
  • Issue #464 - SHOW STATUS immediately after CALL PQ returns status of the query, not the server status
  • Issue #481 - Fixed static binary build
  • Issue #502 - bug in multi-queries
  • Issue #514 - Unable to use unusual names for columns when use 'create table'
  • Commit d1dbe771 - daemon crash on replay binlog with update of string attribute; set binlog version to 10
  • Commit 775d0555 - fixed expression stack frame detection runtime (test 207)
  • Commit 4795dc49 - percolate index filter and tags were empty for empty stored query (test 369)
  • Commit c3f0bf4d - breaks of replication SST flow at network with long latency and high error rate (different data centers replication); updated replication command version to 1.03
  • Commit ba2d6619 - joiner lock cluster on write operations after join into cluster (test 385)
  • Commit de4dcb9f - wildcards matching with exact modifier (test 321)
  • Commit 6524fc6a - docid checkpoints vs docstore
  • Commit f4ab83c2 - Inconsistent indexer behavior when parsing invalid xml
  • Commit 7b727e22 - Stored percolate query with NOTNEAR runs forever (test 349)
  • Commit 812dab74 - wrong weight for phrase starting with wildcard
  • Commit 1771afc6 - percolate query with wildcards generate terms without payload on matching causes interleaved hits and breaks matching (test 417)
  • Commit aa0d8c2b - fixed calculation of 'total' in case of parallelized query
  • Commit 18d81b3c - crash in Windows with multiple concurrent sessions at daemon
  • Commit 84432f23 - some index settings could not be replicated
  • Commit 93411fe6 - At a high rate of new events the netloop sometimes froze because the atomic 'kick' event was processed once for several events at a time, losing the actual actions from them
  • Commit d805fc12 - New flushed disk chunk might be lost on commit
  • Commit 63cbf008 - inaccurate 'net_read' in profiler
  • Commit f5379bb2 - Percolate issue with arabic (right to left texts)
  • Commit 49eeb420 - id not picked correctly on duplicate column name
  • Commit refactoring of network events to fix a crash in rare cases
  • e8420cc7 fix in indextool --dumpheader
  • Commit ff716353 - TRUNCATE WITH RECONFIGURE worked wrong with stored fields

Breaking changes:

  • New binlog format: you need to make a clean stop of Manticore before upgrading
  • Index format has changed slightly: the new version can read your existing indexes fine, but if you decide to downgrade from 3.6.0 to an older version the newer indexes will be unreadable
  • Replication format change: don't replicate from an older version to 3.6.0 and vice versa, switch to the new version on all your nodes at once
  • reverse_scan is deprecated. Make sure you don't use this option in your queries starting from 3.6.0, since they will fail otherwise
  • As of this release we don't provide builds for RHEL6, Debian Jessie and Ubuntu Trusty any more. If it's mission critical for you to have them supported contact us

Deprecations

  • No more implicit sorting by id. If you rely on it make sure to update your queries accordingly
  • Search option reverse_scan has been deprecated

Version 3.5.4, Dec 10 2020

New Features

  • New Python, Javascript and Java clients are generally available now and are well documented in this manual.
  • automatic drop of a disk chunk of a real-time index. This optimization enables dropping a disk chunk automatically when OPTIMIZing a real-time index when the chunk is obviously not needed any more (all the documents are suppressed). Previously it still required merging, now the chunk can be just dropped instantly. The cutoff option is ignored, i.e. even if nothing is actually merged an obsoleted disk chunk gets removed. This is useful in case you maintain retention in your index and delete older documents. Now compacting such indexes will be faster.
  • standalone NOT as an option for SELECT

Minor Changes

Deprecations

  • indexer --verbose is deprecated as it never added anything to the indexer output
  • For dumping watchdog's backtrace signal USR2 is now to be used instead of USR1

Bugfixes

Version 3.5.2, Oct 1 2020

New features

  • OPTIMIZE now reduces disk chunks to a target number of chunks (default is 2 * number of cores) instead of a single one. The optimal number of chunks can be controlled by the cutoff option.
  • NOT operator can now be used standalone. By default it is disabled since accidental single NOT queries can be slow. It can be enabled by setting new searchd directive not_terms_only_allowed to 0.
  • New setting max_threads_per_query sets how many threads a query can use. If the directive is not set, a query can use threads up to the value of threads. Per SELECT query the number of threads can be limited with OPTION threads=N overriding the global max_threads_per_query.
  • Percolate indexes can now be imported with IMPORT TABLE.
  • HTTP API /search receives basic support for faceting/grouping by new query node aggs.
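
The chunk target and the thread-limit precedence described in the list above can be modelled in a few lines of Python. This is only an illustrative sketch, not Manticore code: the function names are hypothetical, and the assumption that a per-query limit can never exceed the global threads value is ours rather than something stated in these notes.

```python
from typing import Optional


def default_optimize_cutoff(cpu_cores: int) -> int:
    """Default number of disk chunks OPTIMIZE aims for: 2 * number of cores."""
    return 2 * cpu_cores


def effective_query_threads(
    threads: int,
    max_threads_per_query: Optional[int] = None,
    option_threads: Optional[int] = None,
) -> int:
    """Model how many threads one SELECT may use (hypothetical helper)."""
    # OPTION threads=N overrides the global max_threads_per_query.
    limit = option_threads if option_threads is not None else max_threads_per_query
    # With neither value set, a query may use up to the value of `threads`.
    if limit is None:
        return threads
    # Assumption: the global pool size is a hard ceiling for any per-query limit.
    return max(1, min(limit, threads))
```

For example, with threads = 8 and max_threads_per_query = 4, a plain SELECT would be modelled as using at most 4 threads, while the same query with OPTION threads=2 would use at most 2.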

Minor changes

  • If no replication listen directive is declared, the engine will try to use ports after the defined 'sphinx' port, up to 200.
  • listen=...:sphinx needs to be explicitly set for SphinxSE connections or SphinxAPI clients.
  • SHOW INDEX STATUS outputs new metrics: killed_documents, killed_rate, disk_mapped_doclists, disk_mapped_cached_doclists, disk_mapped_hitlists and disk_mapped_cached_hitlists.
  • SQL command status now outputs Queue\Threads and Tasks\Threads.

Deprecations:

  • dist_threads is completely deprecated now, searchd will log a warning if the directive is still used.

Docker

The official Docker image is now based on Ubuntu 20.04 LTS

Packaging

Besides the usual manticore package, you can also install Manticore Search by components:

  • manticore-server-core - provides searchd, manpage, log dir, API and galera module. It will also install manticore-common as the dependency.
  • manticore-server - provides automation scripts for core (init.d, systemd), and manticore_new_cluster wrapper. It will also install manticore-server-core as the dependency.
  • manticore-common - provides config, stopwords, generic docs and skeleton folders (datadir, modules, etc.)
  • manticore-tools - provides auxiliary tools ( indexer, indextool etc.), their manpages and examples. It will also install manticore-common as the dependency.
  • manticore-icudata (RPM) or manticore-icudata-65l (DEB) - provides ICU data file for icu morphology usage.
  • manticore-devel (RPM) or manticore-dev (DEB) - provides dev headers for UDFs.

Bugfixes

  1. Commit 2a474dc1 Crash of daemon at grouper at RT index with different chunks
  2. Commit 57a19e5a Fastpath for empty remote docs
  3. Commit 07dd3f31 Expression stack frame detection runtime
  4. Commit 08ae357c Matching above 32 fields at percolate indexes
  5. Commit 16b9390f Replication listen ports range
  6. Commit 5fa671af Show create table on pq
  7. Commit 54d133b6 HTTPS port behavior
  8. Commit fdbbe524 Mixing docstore rows when replacing
  9. Commit afb53f64 Switch TFO unavailable message level to 'info'
  10. Commit 59d94cef Crash on strcmp invalid use
  11. Commit 04af0349 Adding index to cluster with system (stopwords) files
  12. Commit 50148b4e Merge indexes with large dictionaries; RT optimize of large disk chunks
  13. Commit a2adf158 Indextool can dump meta from current version
  14. Commit 69f6d5f7 Issue in group order in GROUP N
  15. Commit 24d5d80f Explicit flush for SphinxSE after handshake
  16. Commit 31c4d78a Avoid copy of huge descriptions when not necessary
  17. Commit 2959e2ca Negative time in show threads
  18. Commit f0b35710 Token filter plugin vs zero position deltas
  19. Commit a49e5bc1 Change 'FAIL' to 'WARNING' on multiple hits

Version 3.5.0, 22 Jul 2020

Major new features:

  • This release took so long because we were working hard on changing the multitasking mode from threads to coroutines. It makes configuration simpler and query parallelization much more straightforward: Manticore just uses the given number of threads (see the new setting threads), and the new mode makes sure it's done in the most optimal way.

  • Changes in highlighting:

    • any highlighting that works with several fields (highlight({},'field1, field2') or highlight in json queries) now applies limits per-field by default.
    • any highlighting that works with plain text (highlight({}, string_attr) or snippet()) now applies limits to the whole document.
    • per-field limits can be switched to global limits by limits_per_field=0 option (1 by default).
    • allow_empty is now 0 by default for highlighting via HTTP JSON.
  • The same port can now be used for http, https and binary API (to accept connections from a remote Manticore instance). listen = *:mysql is still required for connections via mysql protocol. Manticore now detects automatically the type of client trying to connect to it except for MySQL (due to restrictions of the protocol).

  • In RT mode a field can now be text and string attribute at the same time - GitHub issue #331.

    In plain mode it's called sql_field_string. Now it's available in RT mode for real-time indexes too. You can use it as shown in the example:

    create table t(f string attribute indexed);
    insert into t values(0,'abc','abc');
    select * from t where match('abc');
    +---------------------+------+
    | id                  | f    |
    +---------------------+------+
    | 2810845392541843463 | abc  |
    +---------------------+------+
    1 row in set (0.01 sec)
    
    mysql> select * from t where f='abc';
    +---------------------+------+
    | id                  | f    |
    +---------------------+------+
    | 2810845392541843463 | abc  |
    +---------------------+------+
    1 row in set (0.00 sec)

Minor changes

  • You can now highlight string attributes.
  • SSL and compression support for SQL interface
  • Support of mysql client status command.
  • Replication can now replicate external files (stopwords, exceptions etc.).
  • Filter operator in is now available via HTTP JSON interface.
  • expressions in HTTP JSON.
  • You can now change rt_mem_limit on the fly in RT mode, i.e. can do ALTER ... rt_mem_limit=<new value>.
  • You can now use separate CJK charset tables: chinese, japanese and korean.
  • thread_stack now limits maximum thread stack, not initial.
  • Improved SHOW THREADS output.
  • Display progress of long CALL PQ in SHOW THREADS.
  • cpustat, iostat, coredump can be changed during runtime with SET.
  • SET [GLOBAL] wait_timeout=NUM implemented.

Breaking changes:

  • Index format has been changed. Indexes built in 3.5.0 cannot be loaded by Manticore version < 3.5.0, but Manticore 3.5.0 understands older formats.
  • INSERT INTO PQ VALUES() (i.e. without providing column list) previously expected exactly (query, tags) as the values. It's been changed to (id,query,tags,filters). The id can be set to 0 if you want it to be auto-generated.
  • allow_empty=0 is a new default in highlighting via HTTP JSON interface.
  • Only absolute paths are allowed for external files (stopwords, exceptions etc.) in CREATE TABLE/ALTER TABLE.
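
To illustrate the new column order for INSERT INTO PQ VALUES() mentioned above, here is a minimal Python sketch that builds such a statement. The helper name pq_insert_statement, the table name and the simplistic quoting are assumptions made for the example. Real applications should rely on their client library's escaping.

```python
def pq_insert_statement(
    table: str,
    query: str,
    tags: str = "",
    filters: str = "",
    rule_id: int = 0,
) -> str:
    """Build a column-less PQ insert in the 3.5.0 order (id, query, tags, filters)."""

    def quote(value: str) -> str:
        # Minimal escaping for illustration only: backslashes and single quotes.
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

    # rule_id = 0 asks the server to auto-generate the id.
    return (
        f"INSERT INTO {table} VALUES "
        f"({rule_id}, {quote(query)}, {quote(tags)}, {quote(filters)})"
    )


print(pq_insert_statement("pq", "@title shoes", "sale", "price > 10"))
```

Before 3.5.0 the same statement would have carried only the (query, tags) pair, so code that generates column-less inserts needs to be updated when upgrading.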

Deprecations:

  • ram_chunks_count was renamed to ram_chunk_segments_count in SHOW INDEX STATUS.
  • workers is obsolete. There's only one workers mode now.
  • dist_threads is obsolete. All queries are as much parallel as possible now (limited by threads and jobs_queue_size).
  • max_children is obsolete. Use threads to set the number of threads Manticore will use (set to the # of CPU cores by default).
  • queue_max_length is obsolete. Instead of that in case it's really needed use jobs_queue_size to fine-tune internal jobs queue size (unlimited by default).
  • All /json/* endpoints are now also available without the /json/ prefix, e.g. /search, /insert, /delete, /pq etc.
  • field meaning "full-text field" was renamed to "text" in describe.

    3.4.2:

    mysql> describe t;
    +-------+--------+----------------+
    | Field | Type   | Properties     |
    +-------+--------+----------------+
    | id    | bigint |                |
    | f     | field  | indexed stored |
    +-------+--------+----------------+

    3.5.0:

    mysql> describe t;
    +-------+--------+----------------+
    | Field | Type   | Properties     |
    +-------+--------+----------------+
    | id    | bigint |                |
    | f     | text   | indexed stored |
    +-------+--------+----------------+
  • Cyrillic и doesn't map to i in non_cjk charset_table (which is a default) as it affected Russian stemmers and lemmatizers too much.
  • read_timeout is obsolete. Use network_timeout instead, which controls both reading and writing.
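
For client code that still hard-codes the old /json/* paths, the endpoint change listed above amounts to dropping the prefix. A minimal sketch in Python, where normalize_endpoint is a hypothetical helper and not part of any Manticore client:

```python
def normalize_endpoint(path: str) -> str:
    """Map a legacy /json/<name> path to its short form, e.g. /json/search -> /search."""
    prefix = "/json/"
    if path.startswith(prefix):
        # Keep everything after the prefix and restore the leading slash.
        return "/" + path[len(prefix):]
    # Paths without the prefix are returned unchanged.
    return path
```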

Packages

  • Ubuntu Focal 20.04 official package
  • deb package name changed from manticore-bin to manticore

Bugfixes:

  1. Issue #351 searchd memory leak
  2. Commit ceabe44f Tiny read out of bounds in snippets
  3. Commit 1c3e84a3 Dangerous write into local variable for crash queries
  4. Commit 26e094ab Tiny memory leak of sorter in test 226
  5. Commit d2c7f86a Huge memory leak in test 226
  6. Commit 0dd80122 Cluster shows the nodes are in sync, but count(*) shows different numbers
  7. Commit f1c1ac3f Cosmetic: Duplicate and sometimes lost warning messages in the log
  8. Commit f1c1ac3f Cosmetic: (null) index name in log
  9. Commit 359dbd30 Cannot retrieve more than 70M results
  10. Commit 19f328ee Can't insert PQ rules with no-columns syntax
  11. Commit bf685d5d Misleading error message when inserting a document to an index in a cluster
  12. Commit 2cf18c83 /json/replace and /json/update return id in exponent form
  13. Issue #324 Update json scalar properties and mva in the same query
  14. Commit d38409eb hitless_words doesn't work in RT mode
  15. Commit 5813d639 ALTER RECONFIGURE in rt mode should be disallowed
  16. Commit 5813d639 rt_mem_limit gets reset to 128M after searchd restart
  17. highlight() sometimes hangs
  18. Commit 7cd878f4 Failed to use U+code in RT mode
  19. Commit 2b213de4 Failed to use wildcard at wordforms at RT mode
  20. Commit e9d07e68 Fixed SHOW CREATE TABLE vs multiple wordform files
  21. Commit fc90a84f JSON query without "query" crashes searchd
  22. Manticore official docker couldn't index from mysql 8
  23. Commit 23e05d32 HTTP /json/insert requires id
  24. Commit bd679af0 SHOW CREATE TABLE doesn't work for PQ
  25. Commit bd679af0 CREATE TABLE LIKE doesn't work properly for PQ
  26. Commit 5eacf28f End of line in settings in show index status
  27. Commit cb153228 Empty title in "highlight" in HTTP JSON response
  28. Issue #318 CREATE TABLE LIKE infix error
  29. Commit 9040d22c RT crashes under load
  30. Commit cd512c7d Lost crash log on crash at RT disk chunk
  31. Issue #323 Import table fails and closes the connection
  32. Commit 6275316a ALTER reconfigure corrupts a PQ index
  33. Commit 9c1d221e Searchd reload issues after change index type
  34. Commit 71e2b5bb Daemon crashes on import table with missed files
  35. Issue #322 Crash on select using multiple indexes, group by and ranker = none
  36. Commit c3f58490 HIGHLIGHT() doesn't highlight in string attributes
  37. Issue #320 FACET fails to sort on string attribute
  38. Commit 4f1a1f25 Error in case of missing data dir
  39. Commit 04f4ddd4 access_* are not supported in RT mode
  40. Commit 1c0616a2 Bad JSON objects in strings: 1. CALL PQ returns "Bad JSON objects in strings: 1" when the json is greater than some value.
  41. Commit 32f943d6 RT-mode inconsistency. In some cases I can't drop the index since it's unknown and can't create it since the directory is not empty.
  42. Issue #319 Crash on select
  43. Commit 22a28dd7 max_xmlpipe2_field = 2M returned warning on 2M field
  44. Issue #342 Query conditions execution bug
  45. Commit dd8dcab2 Simple 2 terms search finds a document containing only one term
  46. Commit 90919e62 It was impossible in PQ to match a json with capital letters in keys
  47. Commit 56da086a Indexer crashes on csv+docstore
  48. Issue #363 using [null] in json attr in centos 7 causes corrupted inserted data
  49. Major Issue #345 Records not being inserted, count() is random, "replace into" returns OK
  50. max_query_time slows down SELECTs too much
  51. Issue #352 Master-agent communication fails on Mac OS
  52. Issue #328 Error when connecting to Manticore with Connector.Net/Mysql 8.0.19
  53. Commit daa760d2 Fixed escaping of \0 and optimized performance
  54. Commit 9bc5c01a Fixed count distinct vs json
  55. Commit 4f89a965 Fixed drop table at other node failed
  56. Commit 952af5a5 Fix crashes on tightly running call pq

Version 3.4.2, 10 April 2020

Critical bugfixes

Version 3.4.0, 26 March 2020

Major changes

  • server works in 2 modes: rt-mode and plain-mode
    • rt-mode requires data_dir and no index definition in config
    • in plain-mode indexes are defined in config; no data_dir allowed
  • replication available only in rt-mode

Minor changes

  • charset_table defaults to non_cjk alias
  • in rt-mode full-text fields are indexed and stored by default
  • full-text fields in rt-mode renamed from 'field' to 'text'
  • ALTER RTINDEX is renamed to ALTER TABLE
  • TRUNCATE RTINDEX is renamed to TRUNCATE TABLE

Features

  • stored-only fields
  • SHOW CREATE TABLE, IMPORT TABLE

Improvements

  • much faster lockless PQ
  • /sql can execute any type of SQL statement in mode=raw
  • alias mysql for mysql41 protocol
  • default state.sql in data_dir

Bugfixes

Version 3.3.0, 4 February 2020

Features

  • Parallel Real-Time index searching
  • EXPLAIN QUERY command
  • configuration file without index definitions (alpha version)
  • CREATE/DROP TABLE commands (alpha version)
  • indexer --print-rt - can read from a source and print INSERTs for a Real-Time index

Improvements

  • Updated to Snowball 2.0 stemmers
  • LIKE filter for SHOW INDEX STATUS
  • improved memory usage for high max_matches
  • SHOW INDEX STATUS adds ram_chunks_count for RT indexes
  • lockless PQ
  • changed LimitNOFILE to 65536

Bugfixes

Version 3.2.2, 19 December 2019

Features

  • Autoincrement ID for RT indexes
  • Highlight support for docstore via new HIGHLIGHT() function, available also in HTTP API
  • SNIPPET() can use special function QUERY() which returns current MATCH query
  • new field_separator option for highlighting functions.

Improvements and changes

  • lazy fetch of stored fields for remote nodes (can significantly increase performance)
  • strings and expressions no longer break multi-query and FACET optimizations
  • RHEL/CentOS 8 build now uses mysql libclient from mariadb-connector-c-devel
  • ICU data file is now shipped with the packages, icu_data_dir removed
  • systemd service files include 'Restart=on-failure' policy
  • indextool can now check real-time indexes online
  • default conf is now /etc/manticoresearch/manticore.conf
  • service on RHEL/CentOS renamed to 'manticore' from 'searchd'
  • removed query_mode and exact_phrase snippet's options

Bugfixes

  • Commit 6ae474c7 fix crash on SELECT query over HTTP interface
  • Commit 59577513 fix RT index saves disk chunks but does not mark some documents deleted
  • Commit e861f0fc fix crash on search of multi index or multi queries with dist_threads
  • Commit 440991fc fix crash on infix generation for long terms with wide utf8 codepoints
  • Commit 5fd599b4 fix race at adding socket to IOCP
  • Commit cf10d7d3 fix issue of bool queries vs json select list
  • Commit 996de77f fix indextool check to report wrong skiplist offset, check of doc2row lookup
  • Commit 6e3fc9e8 fix indexer produces bad index with negative skiplist offset on large data
  • Commit faed3220 fix JSON converts only numeric to string and JSON string to numeric conversion at expressions
  • Commit 53319720 fix indextool exit with error code in case multiple commands set at command line
  • Commit 795520ac fix #275 binlog invalid state on error no space left on disk
  • Commit 2284da5e fix #279 crash on IN filter to JSON attribute
  • Commit ce2e4b47 fix #281 wrong pipe closing call
  • Commit 535589ba fix server hung at CALL PQ with recursive JSON attribute encoded as string
  • Commit a5fc8a36 fix advancing beyond the end of the doclist in multiand node
  • Commit a3628617 fix retrieving of thread public info
  • Commit f8d2d7bb fix docstore cache locks

Version 3.2.0, 17 October 2019

Features

  • Document storage
  • new directives stored_fields, docstore_cache_size, docstore_block_size, docstore_compression, docstore_compression_level

Improvements and changes

  • improved SSL support
  • non_cjk built-in charset updated
  • disabled UPDATE/DELETE statements logging a SELECT in query log
  • RHEL/CentOS 8 packages

Bugfixes

Version 3.1.2, 22 August 2019

Features and Improvements

  • Experimental SSL support for HTTP API
  • field filter for CALL KEYWORDS
  • max_matches for /json/search
  • automatic sizing of default Galera gcache.size
  • improved FreeBSD support

Bugfixes

Version 3.1.0, 16 July 2019

Features and Improvements

  • replication for RealTime indexes
  • ICU tokenizer for chinese
  • new morphology option icu_chinese
  • new directive icu_data_dir
  • multiple statements transactions for replication
  • LAST_INSERT_ID() and @session.last_insert_id
  • LIKE 'pattern' for SHOW VARIABLES
  • Multiple documents INSERT for percolate indexes
  • Added time parsers for config
  • internal task manager
  • mlock for doc and hit lists components
  • jail snippets path

Removals

  • RLP library support dropped in favor of ICU; all rlp* directives removed
  • updating document ID with UPDATE is disabled

Bugfixes

Version 3.0.2, 31 May 2019

Improvements

  • added mmap readers for docs and hit lists
  • /sql HTTP endpoint response is now the same as /json/search response
  • new directives access_plain_attrs, access_blob_attrs, access_doclists, access_hitlists
  • new directive server_id for replication setups

Removals

  • removed HTTP /search endpoint

Deprecations

  • read_buffer, ondisk_attrs, ondisk_attrs_default, mlock (replaced by access_* directives)

Bugfixes

Version 3.0.0, 6 May 2019

Features and improvements

  • New index storage. Non-scalar attributes are not limited anymore to 4GB size per index
  • attr_update_reserve directive
  • String, JSON and MVA attributes can be updated using UPDATE
  • killlists are applied at index load time
  • killlist_target directive
  • multi AND searches speedup
  • better average performance and RAM usage
  • convert tool for upgrading indexes made with 2.x
  • CONCAT() function
  • JOIN CLUSTER cluster AT 'nodeaddress:port'
  • ALTER CLUSTER posts UPDATE nodes
  • node_address directive
  • list of nodes printed in SHOW STATUS

Behaviour changes

  • in case of indexes with killlists, the server doesn't rotate indexes in the order defined in the conf, but follows the chain of killlist targets
  • order of indexes in a search no longer defines the order in which killlists are applied
  • Document IDs are now signed big integers

Removed directives

  • docinfo (always extern now), inplace_docinfo_gap, mva_updates_pool

Version 2.8.2 GA, 2 April 2019

Features and improvements

  • Galera replication for percolate indexes
  • OPTION morphology

Compiling notes

Cmake minimum version is now 3.13. Compiling requires boost and libssl development libraries.

Bugfixes

Version 2.8.1 GA, 6 March 2019

Features and improvements

  • SUBSTRING_INDEX()
  • SENTENCE and PARAGRAPH support for percolate queries
  • systemd generator for Debian/Ubuntu; also added LimitCORE to allow core dumping

Bugfixes

Version 2.8.0 GA, 28 January 2019

Improvements

  • Distributed indexes for percolate indexes
  • CALL PQ new options and changes:
    • skip_bad_json
    • mode (sparsed/sharded)
    • json documents can be passed as a json array
    • shift
    • Column names 'UID', 'Documents', 'Query', 'Tags', 'Filters' were renamed to 'id', 'documents', 'query', 'tags', 'filters'
  • DESCRIBE pq TABLE
  • SELECT FROM pq WHERE UID is not possible any more, use 'id' instead
  • SELECT over pq indexes is on par with regular indexes (e.g. you can filter rules via REGEX())
  • ANY/ALL can be used on PQ tags
  • expressions have auto-conversion for JSON fields, not requiring explicit casting
  • built-in 'non_cjk' charset_table and 'cjk' ngram_chars
  • built-in stopwords collections for 50 languages
  • multiple files in a stopwords declaration can also be separated by comma
  • CALL PQ can accept JSON array of documents

Bugfixes

  • Commit a4e19af fixed cjson-related leak
  • Commit 28d8627 fixed crash because of missed value in json
  • Commit bf4e9ea fixed save of empty meta for RT index
  • Commit 33b4573 fixed lost form flag (exact) for sequence of lemmatizer
  • Commit 6b95d48 fixed string attrs > 4M use saturate instead of overflow
  • Commit 621418b fixed crash of server on SIGHUP with disabled index
  • Commit 3f7e35d fixed server crash on simultaneous API session status commands
  • Commit cd9e4f1 fixed crash of server at delete query to RT index with field filters
  • Commit 9376470 fixed crash of server at CALL PQ to distributed index with empty document
  • Commit 8868b20 fixed cut Manticore SQL error message larger 512 chars
  • Commit de9deda fixed crash on save percolate index without binlog
  • Commit 2b219e1 fixed http interface is not working in OSX
  • Commit e92c602 fixed indextool false error message on check of MVA
  • Commit 238bdea fixed write lock at FLUSH RTINDEX to not write lock whole index during save and on regular flush from rt_flush_period
  • Commit c26a236 fixed ALTER percolate index stuck waiting search load
  • Commit 9ee5703 fixed max_children to use default amount of thread_pool workers for value of 0
  • Commit 5138fc0 fixed error on indexing of data into index with index_token_filter plugin along with stopwords and stopword_step=0
  • Commit 2add3d3 fixed crash with absent lemmatizer_base when still using aot lemmatizers in index definitions

Version 2.7.5 GA, 4 December 2018

Improvements

  • REGEX function
  • limit/offset for json API search
  • profiler points for qcache

Bugfixes

  • Commit eb3c768 fixed crash of server on FACET with multiple attribute wide types
  • Commit d915cf6 fixed implicit group by at main select list of FACET query
  • Commit 5c25dc2 fixed crash on query with GROUP N BY
  • Commit 85d30a2 fixed deadlock on handling crash at memory operations
  • Commit 85166b5 fixed indextool memory consumption during check
  • Commit 58fb031 fixed gmock include not needed anymore as upstream resolve itself

Version 2.7.4 GA, 1 November 2018

Improvements

  • SHOW THREADS in case of remote distributed indexes prints the original query instead of API call
  • SHOW THREADS new option format=sphinxql prints all queries in SQL format
  • SHOW PROFILE prints additional clone_attrs stage

Bugfixes

  • Commit 4f15571 fixed failed to build with libc without malloc_stats, malloc_trim
  • Commit f974f20 fixed special symbols inside words for CALL KEYWORDS result set
  • Commit 0920832 fixed broken CALL KEYWORDS to distributed index via API or to remote agent
  • Commit fd686bf fixed distributed index agent_query_timeout propagate to agents as max_query_time
  • Commit 4ffa623 fixed total documents counter at disk chunk got affected by OPTIMIZE command and breaks weight calculation
  • Commit dcaf4e0 fixed multiple tail hits at RT index from blended
  • Commit eee3817 fixed deadlock at rotation

Version 2.7.3 GA, 26 September 2018

Improvements

  • sort_mode option for CALL KEYWORDS
  • DEBUG on VIP connection can perform 'crash ' for intentional SIGSEGV action on server
  • DEBUG can perform 'malloc_stats' for dumping malloc stats in searchd.log and 'malloc_trim' to perform a malloc_trim()
  • improved backtrace if gdb is present on the system

Bugfixes

Version 2.7.2 GA, 27 August 2018

Improvements

  • compatibility with MySQL 8 clients
  • TRUNCATE WITH RECONFIGURE
  • retired memory counter on SHOW STATUS for RT indexes
  • global cache of multi agents
  • improved IOCP on Windows
  • VIP connections for HTTP protocol
  • Manticore SQL DEBUG command which can run various subcommands
  • shutdown_token - SHA1 hash of password needed to invoke shutdown using DEBUG command
  • new stats to SHOW AGENT STATUS (_ping, _has_perspool, _need_resolve)
  • --verbose option of indexer now accepts [debugvv] for printing debug messages
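
Since shutdown_token is described above as the SHA1 hash of a password, one way to produce such a value is sketched below in Python. The helper name make_shutdown_token is hypothetical, and the choice of a lowercase hex digest over the UTF-8 bytes of the password is an assumption. Check the shutdown_token documentation for the exact format searchd expects.

```python
import hashlib


def make_shutdown_token(password: str) -> str:
    """Return the SHA1 hex digest of the password (assumed token format)."""
    # Assumption: the password is hashed as UTF-8 bytes and written as hex.
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


print(make_shutdown_token("abc"))  # a9993e364706816aba3e25717850c26c9cd0d89d
```

The resulting 40-character string would go into the configuration, while the plain password is what gets passed to the DEBUG command.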

Bugfixes

Version 2.7.1 GA, 4 July 2018

Improvements

  • improved wildcards performance on matching multiple documents at PQ
  • support for fullscan queries at PQ
  • support for MVA attributes at PQ
  • regexp and RLP support for percolate indexes

Bugfixes

Version 2.7.0 GA, 11 June 2018

Improvements

  • reduced number of syscalls to avoid Meltdown and Spectre patches impact
  • internal rewrite of local index management
  • remote snippets refactor
  • full configuration reload
  • all node connections are now independent
  • proto improvements
  • Windows communication switched from wsapoll to IO completion ports
  • TFO can be used for communication between master and nodes
  • SHOW STATUS now also outputs the server version and mysql_version_string
  • added docs_id option for documents called in CALL PQ.
  • percolate queries filter can now contain expressions
  • distributed indexes can work with FEDERATED
  • dummy SHOW NAMES COLLATE and SET wait_timeout (for better ProxySQL compatibility)

Bugfixes

Version 2.6.4 GA, 3 May 2018

Features and improvements

  • MySQL FEDERATED engine support
  • MySQL packets now return the SERVER_STATUS_AUTOCOMMIT flag, which adds compatibility with ProxySQL
  • listen_tfo - enable TCP Fast Open connections for all listeners
  • indextool --dumpheader can also dump the RT header from a .meta file
  • cmake build script for Ubuntu Bionic

Bugfixes

  • Commit 355b116 fixed invalid query cache entries for RT index
  • Commit 546e229 fixed index settings got lost next after seamless rotation
  • Commit 0c45098 fixed infix vs prefix length set; added warning on unsupported infix length
  • Commit 80542fa fixed RT indexes auto-flush order
  • Commit 705d8c5 fixed result set schema issues for index with multiple attributes and queries to multiple indexes
  • Commit b0ba932 fixed some hits got lost at batch insert with document duplicates
  • Commit 4510fa4 fixed optimize failed to merge disk chunks of RT index with large documents count

Version 2.6.3 GA, 28 March 2018

Improvements

  • jemalloc at compilation. If jemalloc is present on system, it can be enabled with cmake flag -DUSE_JEMALLOC=1

Bugfixes

  • Commit 85a6d7e fixed log expand_keywords option into Manticore SQL query log
  • Commit caaa384 fixed HTTP interface to correctly process query with large size
  • Commit e386d84 fixed crash of server on DELETE to RT index with index_field_lengths enabled
  • Commit cd538f3 fixed cpustats searchd cli option to work with unsupported systems
  • Commit 8740fd6 fixed utf8 substring matching with min lengths defined

Version 2.6.2 GA, 23 February 2018

Improvements

  • improved Percolate Queries performance in case of using NOT operator and for batched documents.
  • percolate_query_call can use multiple threads depending on dist_threads
  • new full-text matching operator NOTNEAR/N
  • LIMIT for SELECT on percolate indexes
  • expand_keywords can accept 'star','exact' (where 'star,exact' has the same effect as '1')
  • ranged-main-query for joined fields which uses the ranged query defined by sql_query_range

Bugfixes

  • Commit 72dcf66 fixed crash on searching ram segments; deadlock on save disk chunk with double buffer; deadlock on save disk chunk during optimize
  • Commit 3613714 fixed indexer crash on xml embedded schema with empty attribute name
  • Commit 48d7e80 fixed erroneous unlinking of not-owned pid-file
  • Commit a5563a4 fixed orphaned fifos sometimes left in temp folder
  • Commit 2376e8f fixed empty FACET result set with wrong NULL row
  • Commit 4842b67 fixed broken index lock when running server as windows service
  • Commit be35fee fixed wrong iconv libs on mac os
  • Commit 83744a9 fixed wrong count(*)

Version 2.6.1 GA, 26 January 2018

Improvements

  • agent_retry_count in case of agents with mirrors gives the value of retries per mirror instead of per agent, the total retries per agent being agent_retry_count*mirrors.
  • agent_retry_count can now be specified per index, overriding global value. An alias mirror_retry_count is added.
  • a retry_count can be specified in agent definition and the value represents retries per agent
  • Percolate Queries are now in HTTP JSON API at /json/pq.
  • Added -h and -v options (help and version) to executables
  • morphology_skip_fields support for Real-Time indexes
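
The retry arithmetic described above for agents with mirrors can be written out as a tiny Python sketch. Both function names are hypothetical and only restate the rules from this list: the per-index value overrides the global one, and the total per agent is agent_retry_count * mirrors.

```python
from typing import Optional


def resolve_retry_count(global_value: int, index_value: Optional[int] = None) -> int:
    """A per-index agent_retry_count, when given, overrides the global value."""
    return global_value if index_value is None else index_value


def total_retries_per_agent(agent_retry_count: int, mirrors: int) -> int:
    """agent_retry_count is counted per mirror, so the agent total is the product."""
    return agent_retry_count * mirrors
```

For instance, agent_retry_count = 3 on an agent with 2 mirrors allows up to 6 retries for that agent in total.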

Bugfixes

  • Commit a40b079 fixed ranged-main-query to correctly work with sql_range_step when used at MVA field
  • Commit f2f5375 fixed issue with blackhole system loop hung and blackhole agents seems disconnected
  • Commit 84e1f54 fixed query id to be consistent, fixed duplicated id for stored queries
  • Commit 1948423 fixed server crash on shutdown from various states
  • Commit 9a706b Commit 3495fd7 timeouts on long queries
  • Commit 3359bcd8 refactored master-agent network polling on kqueue-based systems (Mac OS X, BSD).

Version 2.6.0, 29 December 2017

Features and improvements

Bugfixes

  • Commit 0cfae4c fixed crash on debug build of server (and m.b. UB on release) when built with rlp
  • Commit 324291e fixed RT index optimize with progressive option enabled that merges kill-lists with wrong order
  • Commit ac0efee minor crash on mac
  • lots of minor fixes after thorough static code analysis
  • other minor bugfixes

Upgrade

In this release we've changed the internal protocol used by masters and agents to speak with each other. If you run Manticore Search in a distributed environment with multiple instances, make sure you upgrade the agents first, then the masters.
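
If upgrades are scripted, the ordering rule above (agents before masters) can be expressed as a simple stable sort. The sketch below is only an illustration: the instance names and the "agent" / "master" role labels are made up for the example.

```python
from typing import List, Tuple


def upgrade_order(instances: List[Tuple[str, str]]) -> List[str]:
    """Return instance names with all agents first, then all masters."""
    # sorted() is stable, so instances keep their relative order within a role.
    ranked = sorted(instances, key=lambda item: 0 if item[1] == "agent" else 1)
    return [name for name, _role in ranked]


print(upgrade_order([("m1", "master"), ("a1", "agent"), ("a2", "agent")]))
# ['a1', 'a2', 'm1']
```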

Version 2.5.1, 23 November 2017

Features and improvements

  • JSON queries on HTTP API protocol. Search, insert, update, delete and replace operations are supported. Data manipulation commands can also be sent in bulk. There are some limitations currently: MVA and JSON attributes can't be used for inserts, replaces or updates.
  • RELOAD INDEXES command
  • FLUSH LOGS command
  • SHOW THREADS can show progress of optimize, rotation or flushes.
  • GROUP N BY works correctly with MVA attributes
  • blackhole agents are run on separate thread to not affect master query anymore
  • implemented reference count on indexes, to avoid stalls caused by rotations and high load
  • SHA1 hashing implemented, not exposed yet externally
  • fixes for compiling on FreeBSD, macOS and Alpine

Bugfixes

Version 2.4.1 GA, 16 October 2017

Features and improvements

  • OR operator in WHERE clause between attribute filters
  • Maintenance mode (SET MAINTENANCE=1)
  • CALL KEYWORDS available on distributed indexes
  • Grouping in UTC
  • query_log_mode for custom log files permissions
  • Field weights can be zero or negative
  • max_query_time can now affect full-scans
  • added net_wait_tm, net_throttle_accept and net_throttle_action for network thread fine tuning (in case of workers=thread_pool)
  • COUNT DISTINCT works with facet searches
  • IN can be used with JSON float arrays
  • multi-query optimization is not broken anymore by integer/float expressions
  • SHOW META shows a multiplier row when multi-query optimization is used

Compiling

Manticore Search is built using cmake and the minimum gcc version required for compiling is 4.7.2.

Folders and service

  • Manticore Search runs under manticore user.
  • Default data folder is now /var/lib/manticore/.
  • Default log folder is now /var/log/manticore/.
  • Default pid folder is now /var/run/manticore/.

Bugfixes

  • Commit a58c619 fixed SHOW COLLATION statement that breaks java connector
  • Commit 631cf4e fixed crashes on processing distributed indexes; added locks to distributed index hash; removed move and copy operators from agent
  • Commit 942bec0 fixed crashes on processing distributed indexes due to parallel reconnects
  • Commit e5c1ed2 fixed crash at crash handler on store query to server log
  • Commit 4a4bda5 fixed a crash with pooled attributes in multiqueries
  • Commit 3873bfb reduced core size by preventing index pages from being included in the core file
  • Commit 11e6254 fixed searchd crashes on startup when invalid agents are specified
  • Commit 4ca6350 fixed indexer reports error in sql_query_killlist query
  • Commit 123a9f0 fixed fold_lemmas=1 vs hit count
  • Commit cb99164 fixed inconsistent behavior of html_strip
  • Commit e406761 fixed RT index optimize losing new settings; fixed lock leaks in optimize with the sync option
  • Commit 86aeb82 fixed processing erroneous multiqueries
  • Commit 2645230 fixed result set depends on multi-query order
  • Commit 72395d9 fixed server crash on multi-query with bad query
  • Commit f353326 fixed shared to exclusive lock
  • Commit 3754785 fixed server crash for query without indexes
  • Commit 29f360e fixed deadlock of server

Version 2.3.3, 06 July 2017

  • Manticore branding

Reporting bugs

Unfortunately, Manticore is not yet 100% bug free (even though we're working hard towards that), so you might occasionally run into some issues.

Reporting as much as possible about each bug is very important: to fix it, we need to be able to either reproduce the bug or deduce what's causing it from the information that you provide. So here are some instructions on how to do that.

Bug-tracker

We track bugs and feature requests on Github. Feel free to create a new ticket and describe your bug in detail to save both your time and the developers' time.

Documentation updates

Updates to the documentation (what you are reading now) are also done on Github.

Crashes

Manticore is written in C++, a low-level programming language that talks to the computer through few intermediate layers for faster performance. The drawback is that in rare cases there is no way to handle a bug gracefully by writing an error to a log and skipping the command that caused the problem. Instead, the program can crash, which means it stops completely and has to be restarted.

When Manticore Search crashes, please let the Manticore team know by filing a bug report on Github or, if you use Manticore's professional services, in your private helpdesk. The Manticore team needs the following:

  1. searchd log
  2. coredump
  3. query log

It will be great if you additionally do the following:

  1. run gdb to inspect the coredump:
    gdb /usr/bin/searchd </path/to/coredump>
  2. Find crashed thread id in the coredump file name (make sure you have %p in /proc/sys/kernel/core_pattern), e.g. core.work_6.29050.server_name.1637586599 means thread_id=29050
  3. In gdb run:
    set pagination off
    info threads
    # find thread number by its id (e.g. for `LWP 29050` it will be thread number 8)
    thread apply all bt
    thread <thread number>
    bt full
    info locals
    quit
  4. Provide the outputs
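
Steps 2 and 3 can be partly automated. The following is a minimal sketch, not an official Manticore tool: it assumes the coredump name follows the layout core.%e.%p.%h.%t used in the example above, and that gdb prints its usual "info threads" rows containing "(LWP <id>)". The function names and the sample gdb listing are hypothetical.

```python
import re

# Assumed coredump name layout: core.%e.%p.%h.%t
CORE_NAME = re.compile(
    r"^core\.(?P<exe>.+?)\.(?P<pid>\d+)\.(?P<host>.+)\.(?P<ts>\d+)$"
)

# One row of gdb's "info threads" listing, e.g.
#   8    Thread 0x7f2c1a7fc700 (LWP 29050) "work_6" ...
THREAD_ROW = re.compile(
    r"^\*?\s*(?P<num>\d+)\s+Thread\s+\S+\s+\(LWP\s+(?P<lwp>\d+)\)"
)


def crashed_thread_id(core_file_name: str) -> int:
    """Return the %p field (crashed thread id) from a coredump file name."""
    match = CORE_NAME.match(core_file_name)
    if match is None:
        raise ValueError(f"unexpected coredump name: {core_file_name!r}")
    return int(match.group("pid"))


def gdb_thread_number(info_threads_output: str, lwp: int) -> int:
    """Return the gdb thread number whose LWP id equals `lwp`."""
    for line in info_threads_output.splitlines():
        match = THREAD_ROW.match(line)
        if match and int(match.group("lwp")) == lwp:
            return int(match.group("num"))
    raise LookupError(f"LWP {lwp} not found in 'info threads' output")


# Illustrative listing only, not real output from a Manticore crash.
SAMPLE_INFO_THREADS = """\
  Id   Target Id                                   Frame
* 1    Thread 0x7f2c3b5f9740 (LWP 29001) "searchd" 0x0 in ?? ()
  8    Thread 0x7f2c1a7fc700 (LWP 29050) "work_6"  0x0 in ?? ()
"""

lwp = crashed_thread_id("core.work_6.29050.server_name.1637586599")
thread_number = gdb_thread_number(SAMPLE_INFO_THREADS, lwp)
```

The resulting number is what you would pass to the thread command before running bt full.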

What do I do when Manticore Search hangs?

You need to run gdb manually and collect some info that may be useful to understand why it's hanging.

  1. run show threads option format=all through a VIP port

  2. collect lsof output since hanging can be caused by too many connections or open file descriptors

    lsof -p `cat /var/run/manticore/searchd.pid`
  3. dump core

    gcore `cat /var/run/manticore/searchd.pid`

    (it will save the dump to the current dir)

  4. Install and run gdb:

    gdb /usr/bin/searchd `cat /var/run/manticore/searchd.pid`

    Note that it will halt your running searchd, but if it's already hanging that shouldn't be a problem.

  5. In gdb run:

    set pagination off
    info threads
    thread apply all bt
    quit
  6. Collect all the outputs and files and provide them in a bug report.
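
If you collect this information often, the commands from steps 2 to 5 can be assembled by a small helper. This is only a sketch under assumptions: the helper names are hypothetical, and the gdb invocation uses gdb's -batch and -ex options to replace the interactive session shown above. It only builds the argument lists; running them (for example with subprocess.run) is left to you.

```python
from pathlib import Path
from typing import List


def read_pid(pid_file: str = "/var/run/manticore/searchd.pid") -> int:
    """Read the searchd process id from its pid file."""
    return int(Path(pid_file).read_text().strip())


def hang_report_commands(pid: int, searchd: str = "/usr/bin/searchd") -> List[List[str]]:
    """Build the diagnostic commands for a hanging searchd process."""
    return [
        # Open files and connections (step 2).
        ["lsof", "-p", str(pid)],
        # Core dump written to the current directory (step 3).
        ["gcore", str(pid)],
        # Non-interactive equivalent of the gdb session (steps 4 and 5).
        [
            "gdb", "-batch",
            "-ex", "set pagination off",
            "-ex", "info threads",
            "-ex", "thread apply all bt",
            searchd, str(pid),
        ],
    ]


commands = hang_report_commands(1234)
```

As with the manual procedure, attaching gdb halts the process while the backtraces are collected.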

For experts: the macros added in this commit can be helpful to debug.

How to enable saving coredumps on crash?

[root@srv lib]# systemctl set-environment _ADDITIONAL_SEARCHD_PARAMS='--coredump'
[root@srv lib]# systemctl restart manticore
[root@srv lib]# ps aux|grep searchd
mantico+  1955  0.0  0.0  61964  1580 ?        S    11:02   0:00 /usr/bin/searchd --config /etc/manticoresearch/manticore.conf --coredump
mantico+  1956  0.6  0.0 392744  2664 ?        Sl   11:02   0:00 /usr/bin/searchd --config /etc/manticoresearch/manticore.conf --coredump
  • make sure that your OS allows you to save coredumps: /proc/sys/kernel/core_pattern should be non-empty - it is where it will save them. If you do:
    echo "/cores/core.%e.%p.%h.%t" > /proc/sys/kernel/core_pattern

    it will instruct your kernel to save them to a file like core.searchd.1773.centos-4gb-hel1-1.1636454937

  • searchd should be started with ulimit -c unlimited, but if you start Manticore via systemctl this is done for you, since the unit file contains:
    [root@srv lib]# grep CORE /lib/systemd/system/manticore.service
    LimitCORE=infinity
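
To see how a core_pattern value turns into a file name, here is a minimal sketch that mimics the substitution for the four specifiers used above (%e executable name, %p process id, %h host name, %t time of the dump). It is a simplification written for illustration, not the kernel's implementation, and the function names are hypothetical.

```python
import re


def expand_core_pattern(pattern: str, exe: str, pid: int, host: str, timestamp: int) -> str:
    """Expand %e, %p, %h, %t and %% in a core_pattern string."""
    values = {"e": exe, "p": str(pid), "h": host, "t": str(timestamp), "%": "%"}
    # Specifiers not handled by this sketch are dropped from the result.
    return re.sub(r"%(.)", lambda m: values.get(m.group(1), ""), pattern)


def pattern_has_pid(pattern: str) -> bool:
    """Check that the pattern records the id, which is needed to find the crashed thread."""
    return "%p" in pattern


name = expand_core_pattern(
    "/cores/core.%e.%p.%h.%t",
    exe="searchd", pid=1773, host="centos-4gb-hel1-1", timestamp=1636454937,
)
```

With these inputs the result matches the file name from the example, placed under /cores/.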

How do I install debug symbols?

Manticore Search and Manticore Columnar Library are written in C++, so what you get is a compact compiled binary that runs natively on your OS. When you run a binary, your system doesn't have full access to the names of the variables, functions, methods, classes etc. that it implements. All of that is provided separately in so-called "debuginfo" or "symbol" packages. Debug symbols are useful for troubleshooting and other debugging purposes: when you have symbols and your binary crashes, the state it crashed in can be shown with function names. Manticore Search writes such a backtrace to the searchd log and also generates a coredump if it was run with --coredump. Without symbols all you get is internal offsets that are difficult or impossible to decode. So if you file a bug report about a crash, in most cases the Manticore team will need debug symbols to be able to help you.

To install Manticore Search / Manticore Columnar Library debug symbols just install package *debuginfo* (centos), *dbgsym* (ubuntu, debian), *dbgsymbols* (windows, macos) of exactly the same version you are running. For example if you've installed Manticore Search in Centos 8 from package https://repo.manticoresearch.com/repository/manticoresearch/release/centos/8/x86_64/manticore-4.0.2_210921.af497f245-1.el8.x86_64.rpm the corresponding package with symbols is https://repo.manticoresearch.com/repository/manticoresearch/release/centos/8/x86_64/manticore-debuginfo-4.0.2_210921.af497f245-1.el8.x86_64.rpm

Note they have the same commit id af497f245 which corresponds to the commit this version was built from.
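
A quick way to confirm that a binary package and a symbols package belong together is to compare the version, date and commit id embedded in their file names. The sketch below assumes the naming scheme seen in the two example URLs (<version>_<date>.<commit>); the function names are hypothetical, and other distributions may name packages differently.

```python
import re

# Assumed fragment of the package file name: -4.0.2_210921.af497f245-
BUILD_TAG = re.compile(
    r"-(?P<version>\d+\.\d+\.\d+)_(?P<date>\d{6})\.(?P<commit>[0-9a-f]+)-"
)


def build_tag(package_url: str) -> tuple:
    """Return (version, date, commit id) parsed from a package URL or file name."""
    file_name = package_url.rsplit("/", 1)[-1]
    match = BUILD_TAG.search(file_name)
    if match is None:
        raise ValueError(f"cannot parse build tag from {file_name!r}")
    return (match.group("version"), match.group("date"), match.group("commit"))


def same_build(package_a: str, package_b: str) -> bool:
    """True when both packages were built from the same version and commit."""
    return build_tag(package_a) == build_tag(package_b)


BASE = "https://repo.manticoresearch.com/repository/manticoresearch/release/centos/8/x86_64/"
binary_pkg = BASE + "manticore-4.0.2_210921.af497f245-1.el8.x86_64.rpm"
symbols_pkg = BASE + "manticore-debuginfo-4.0.2_210921.af497f245-1.el8.x86_64.rpm"
matching = same_build(binary_pkg, symbols_pkg)
```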

If you installed Manticore from a Manticore APT/YUM repo you can use one of the following tools:

  • debuginfo-install in centos 7
  • dnf debuginfo-install in centos 8
  • find-dbgsym-packages in Debian and Ubuntu

to find a debug symbols package for you.

How to check the debug symbols exist?

  1. Find build id in file /usr/bin/searchd output:
[root@srv lib]# file /usr/bin/searchd
/usr/bin/searchd: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=2c582e9f564ea1fbeb0c68406c271ba27034a6d3, stripped

In this case it's 2c582e9f564ea1fbeb0c68406c271ba27034a6d3.

  2. Find symbols in /usr/lib/debug/.build-id like this:
[root@srv ~]# ls -la /usr/lib/debug/.build-id/2c/582e9f564ea1fbeb0c68406c271ba27034a6d3*
lrwxrwxrwx. 1 root root 23 Nov  9 10:42 /usr/lib/debug/.build-id/2c/582e9f564ea1fbeb0c68406c271ba27034a6d3 -> ../../../../bin/searchd
lrwxrwxrwx. 1 root root 27 Nov  9 10:42 /usr/lib/debug/.build-id/2c/582e9f564ea1fbeb0c68406c271ba27034a6d3.debug -> ../../usr/bin/searchd.debug
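
The two steps above follow a fixed rule: the first two hex characters of the build id name the subdirectory and the rest names the file. A minimal sketch of that mapping, assuming the /usr/lib/debug/.build-id layout shown in the listing (the function names are hypothetical):

```python
import re
from typing import Optional


def build_id_from_file_output(file_output: str) -> Optional[str]:
    """Extract the BuildID[sha1] value from the output of the file command."""
    match = re.search(r"BuildID\[sha1\]=([0-9a-f]+)", file_output)
    return match.group(1) if match else None


def debug_symbol_path(build_id: str, root: str = "/usr/lib/debug/.build-id") -> str:
    """Return the expected path of the .debug link for a build id."""
    build_id = build_id.strip().lower()
    if len(build_id) < 3 or re.fullmatch(r"[0-9a-f]+", build_id) is None:
        raise ValueError(f"not a hex build id: {build_id!r}")
    return f"{root}/{build_id[:2]}/{build_id[2:]}.debug"


FILE_OUTPUT = (
    "/usr/bin/searchd: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), "
    "dynamically linked (uses shared libs), for GNU/Linux 2.6.32, "
    "BuildID[sha1]=2c582e9f564ea1fbeb0c68406c271ba27034a6d3, stripped"
)
path = debug_symbol_path(build_id_from_file_output(FILE_OUTPUT))
```

If the computed path exists on your system, the symbols for that exact binary are installed.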

Uploading your data

To fix your bug developers often need to reproduce it locally. To do it they need your configuration file, index files, binlog (if present), sometimes data to index (like data from external storages or XML/CSV files) and queries.

Attach your data when you create a ticket on Github. In case it's too big or the data is sensitive feel free to upload it to our write-only FTP server:

  • ftp: dev.manticoresearch.com
  • user: manticorebugs
  • pass: shithappens
  • directory: create directory github-issue-N so we understand what data is related with what issue on Github.

It's convenient to mirror your directory to our FTP server using the lftp tool, which is available for Linux, Mac and Windows. For example, if you want to sync your local directory ftp to the directory github-issue-123, here's what you should do and what you will get:

➜  ~ lftp -e "set ftp:passive-mode off; mkdir github-issue-123; mirror -LR ftp/ github-issue-123/" -u manticorebugs,shithappens dev.manticoresearch.com
mkdir ok, `github-issue-123' created
Total: 2 directories, 1 file, 0 symlinks
New: 1 file, 0 symlinks
lftp manticorebugs@dev.manticoresearch.com:/> quit
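
If you upload data for several issues, the same lftp call can be generated from the issue number. The helper below is a hypothetical convenience sketch: it only builds the argument list shown above (server, credentials and lftp options are taken from this section) and does not run anything itself.

```python
from typing import List


def lftp_mirror_command(
    issue: int,
    local_dir: str,
    user: str = "manticorebugs",
    password: str = "shithappens",
    host: str = "dev.manticoresearch.com",
) -> List[str]:
    """Build the lftp call that uploads local_dir to github-issue-<issue>."""
    remote = f"github-issue-{issue}"
    script = (
        f"set ftp:passive-mode off; mkdir {remote}; "
        f"mirror -LR {local_dir} {remote}/"
    )
    return ["lftp", "-e", script, "-u", f"{user},{password}", host]


command = lftp_mirror_command(123, "ftp/")
```

The list can be passed to subprocess.run(command) once you have checked that local_dir points at the data you intend to share.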

DEBUG

DEBUG [ subcommand ]

The DEBUG statement is designed to call various internal or VIP commands for dev/testing purposes. It is not intended for production automation, since the syntax of the subcommand part may change freely in any build.

Call DEBUG without parameters to show the list of useful commands (in general) and subcommands (of the DEBUG statement) available in the current context.

MySQL [(none)]> debug;
+------------------------------------------------------------------+----------------------------------------------------------------------------------------+
| command                                                          | meaning                                                                                |
+------------------------------------------------------------------+----------------------------------------------------------------------------------------+
| flush logs                                                       | emulate USR1 signal                                                                    |
| reload indexes                                                   | emulate HUP signal                                                                     |
| debug token <password>                                           | calculate token for password                                                           |
| debug malloc_stats                                               | perform 'malloc_stats', result in searchd.log                                          |
| debug malloc_trim                                                | pefrorm 'malloc_trim' call                                                             |
| debug sleep <N>                                                  | sleep for <N> seconds                                                                  |
| debug tasks                                                      | display global tasks stat (use select from @@system.tasks instead)                     |
| debug sched                                                      | display task manager schedule (use select from @@system.sched instead)                 |
| debug merge <IDX> [chunk] <X> [into] [chunk] <Y> [option sync=1] | For RT index <IDX> merge disk chunk X into disk chunk Y                                |
| debug drop [chunk] <X> [from] <IDX> [option sync=1]              | For RT index <IDX> drop disk chunk X                                                   |
| debug files <IDX> [option format=all|external]                   | list files belonging to <IDX>. 'all' - including external (wordforms, stopwords, etc.) |
| debug close                                                      | ask server to close connection from it's side                                          |
| debug compress <IDX> [chunk] <X> [option sync=1]                 | Compress disk chunk X of RT index <IDX> (wipe out deleted documents)                   |
| debug split <IDX> [chunk] <X> on @<uservar> [option sync=1]      | Split disk chunk X of RT index <IDX> using set of DocIDs from @uservar                 |
| debug wait <cluster> [like 'xx'] [option timeout=3]              | wait <cluster> ready, but no more than 3 secs.                                         |
| debug wait <cluster> status <N> [like 'xx'] [option timeout=13]  | wait <cluster> commit achieve <N>, but no more than 13 secs                            |
+------------------------------------------------------------------+----------------------------------------------------------------------------------------+
16 rows in set (0.00 sec)

Same from VIP connection:

MySQL [(none)]> debug;
+------------------------------------------------------------------+----------------------------------------------------------------------------------------+
| command                                                          | meaning                                                                                |
+------------------------------------------------------------------+----------------------------------------------------------------------------------------+
| flush logs                                                       | emulate USR1 signal                                                                    |
| reload indexes                                                   | emulate HUP signal                                                                     |
| debug shutdown <password>                                        | emulate TERM signal                                                                    |
| debug crash <password>                                           | crash daemon (make SIGSEGV action)                                                     |
| debug token <password>                                           | calculate token for password                                                           |
| debug malloc_stats                                               | perform 'malloc_stats', result in searchd.log                                          |
| debug malloc_trim                                                | pefrorm 'malloc_trim' call                                                             |
| debug procdump                                                   | ask watchdog to dump us                                                                |
| debug setgdb on|off                                              | enable or disable potentially dangerous crash dumping with gdb                         |
| debug setgdb status                                              | show current mode of gdb dumping                                                       |
| debug sleep <N>                                                  | sleep for <N> seconds                                                                  |
| debug tasks                                                      | display global tasks stat (use select from @@system.tasks instead)                     |
| debug sched                                                      | display task manager schedule (use select from @@system.sched instead)                 |
| debug merge <IDX> [chunk] <X> [into] [chunk] <Y> [option sync=1] | For RT index <IDX> merge disk chunk X into disk chunk Y                                |
| debug drop [chunk] <X> [from] <IDX> [option sync=1]              | For RT index <IDX> drop disk chunk X                                                   |
| debug files <IDX> [option format=all|external]                   | list files belonging to <IDX>. 'all' - including external (wordforms, stopwords, etc.) |
| debug close                                                      | ask server to close connection from it's side                                          |
| debug compress <IDX> [chunk] <X> [option sync=1]                 | Compress disk chunk X of RT index <IDX> (wipe out deleted documents)                   |
| debug split <IDX> [chunk] <X> on @<uservar> [option sync=1]      | Split disk chunk X of RT index <IDX> using set of DocIDs from @uservar                 |
| debug wait <cluster> [like 'xx'] [option timeout=3]              | wait <cluster> ready, but no more than 3 secs.                                         |
| debug wait <cluster> status <N> [like 'xx'] [option timeout=13]  | wait <cluster> commit achieve <N>, but no more than 13 secs                            |
+------------------------------------------------------------------+----------------------------------------------------------------------------------------+
21 rows in set (0.00 sec)

All debug XXX commands should be regarded as unstable and subject to change at any moment, so don't be surprised if they do. The example output here also does not necessarily reflect the commands actually available; run it on your system to see what your instance supports. Also, no detailed documentation is provided apart from this short 'meaning' column.

As a quick illustration, two commands available only to VIP clients are described below: shutdown and crash. Both require a token, which can be generated with the debug token subcommand and put into the shutdown_token param of the searchd section of the config file. If no such section exists, or if the hash of the provided password does not match the token stored in the config, the subcommands will do nothing.

mysql> debug token hello;
+-------------+------------------------------------------+
| command     | result                                   |
+-------------+------------------------------------------+
| debug token | aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d |
+-------------+------------------------------------------+
1 row in set (0,00 sec)
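
The token in the example output equals the SHA-1 hex digest of the string hello. Assuming that holds for other passwords as well (this section does not state the algorithm, so treat it as an observation about this one example), a token could be precomputed outside the server like this:

```python
import hashlib


def debug_token(password: str) -> str:
    """Return the SHA-1 hex digest of the password (assumed token format)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


token = debug_token("hello")
```

Running debug token on your own instance remains the reliable way to obtain the value to put into shutdown_token.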

Subcommand shutdown sends a TERM signal to the server and so makes it shut down. Since this is quite dangerous (nobody wants to accidentally stop a production service), it:

  1. needs a VIP connection, and
  2. needs the password

Subcommand crash literally causes a crash. It may be used when testing a setup, for example 'how the system manager keeps the service alive' or 'how feasible coredump tracking is'.

If some commands are found generally useful, we'll move them from debug subcommands to a more generic and stable place (note the meaning of debug tasks and debug sched in the table as examples).

References

SQL commands

Schema management
Data management
  • INSERT - Adds new documents
  • REPLACE - Replaces existing documents with new ones
  • UPDATE - Does in-place update in documents
  • DELETE - Deletes documents
  • TRUNCATE TABLE - Deletes all documents from index
SELECT
Flushing misc things
  • FLUSH ATTRIBUTES - Forces flushing updated attributes to disk
  • FLUSH HOSTNAMES - Renews IPs associated with agent host names
  • FLUSH LOGS - Initiates reopen of searchd log and query log files (similar to USR1)
Real-time index optimization
Importing to a real-time index
  • ATTACH INDEX - Moves data from a plain index to a real-time index
  • IMPORT TABLE - Imports previously created RT or PQ index into a server running in RT mode
Replication
Plain index rotate
Transactions
  • BEGIN - Begins a transaction
  • COMMIT - Finishes a transaction
  • ROLLBACK - Rolls back a transaction
CALL
Plugins
Server status

HTTP endpoints

  • /sql - Allows running an SQL statement over HTTP
  • /cli - HTTP command line interface
  • /insert - Inserts a document into a real-time index
  • /pq/idx/doc - Inserts a PQ rule into a percolate index
  • /update - Updates a document in a real-time index
  • /replace - Replaces a document in a real-time index
  • /pq/idx/doc/N?refresh=1 - Replaces a PQ rule in a percolate index
  • /delete - Deletes a document in an index
  • /bulk - Perform several insert, update or delete operations in a single call
  • /search - Performs search
  • /pq/idx/search - Performs reverse search in a percolate index

Common things

Common index settings
Plain index settings
Distributed index settings
RT index settings

Full-text search operators

Functions

Mathematical
  • ABS() - Returns absolute value
  • ATAN2() - Returns arctangent function of two arguments
  • BITDOT() - Returns sum of products of each bit of a mask multiplied by its weight
  • CEIL() - Returns smallest integer value greater or equal to the argument
  • COS() - Returns cosine of the argument
  • CRC32() - Returns CRC32 value of the argument
  • EXP() - Returns exponent of the argument
  • FIBONACCI() - Returns the N-th Fibonacci number, where N is the integer argument
  • FLOOR() - Returns the largest integer value lesser or equal to the argument
  • GREATEST() - Takes JSON/MVA array as the argument and returns the greatest value in that array
  • IDIV() - Returns result of an integer division of the first argument by the second argument
  • LEAST() - Takes JSON/MVA array as the argument, and returns the least value in that array
  • LN() - Returns natural logarithm of the argument
  • LOG10() - Returns common logarithm of the argument
  • LOG2() - Returns binary logarithm of the argument
  • MAX() - Returns the bigger of two arguments
  • MIN() - Returns the smaller of two arguments
  • POW() - Returns the first argument raised to the power of the second argument
  • RAND() - Returns random float between 0..1
  • SIN() - Returns sine of the argument
  • SQRT() - Returns square root of the argument
Searching and ranking
  • BM25F() - Returns precise BM25F formula value
  • EXIST() - Replaces non-existing columns with default values
  • GROUP_CONCAT() - Produces a comma-separated list of the attribute values of all documents in the group
  • HIGHLIGHT() - Highlights search results
  • MIN_TOP_SORTVAL() - Returns sort key value of the worst found element in the current top-N matches
  • MIN_TOP_WEIGHT() - Returns weight of the worst found element in the current top-N matches
  • PACKEDFACTORS() - Outputs weighting factors
  • REMOVE_REPEATS() - Removes repeated adjusted rows with the same 'column' value
  • WEIGHT() - Returns fulltext match score
  • ZONESPANLIST() - Returns pairs of matched zone spans
  • QUERY() - Returns current full-text query
Type casting
  • BIGINT() - Forcibly promotes the integer argument to 64-bit type
  • DOUBLE() - Forcibly promotes given argument to floating point type
  • INTEGER() - Forcibly promotes given argument to 64-bit signed type
  • TO_STRING() - Forcibly promotes the argument to string type
  • UINT() - Forcibly reinterprets given argument to 64-bit unsigned type
  • SINT() - Interprets 32-bit unsigned integer as signed 64-bit integer
Arrays and conditions
  • ALL() - Returns 1 if condition is true for all elements in the array
  • ANY() - Returns 1 if condition is true for any element in the array
  • CONTAINS() - Checks whether the (x,y) point is within the given polygon
  • IF() - Checks whether the 1st argument is equal to 0.0, returns the 2nd argument if it is not zero or the 3rd one when it is
  • IN() - Returns 1 if the first argument is equal to any of the other arguments, or 0 otherwise
  • INDEXOF() - Iterates through all elements in the array and returns index of the first matching element
  • INTERVAL() - Returns index of the argument that is less than the first argument
  • LENGTH() - Returns number of elements in MVA
  • REMAP() - Allows to make some exceptions of expression values depending on the condition values
Date and time
  • NOW() - Returns current timestamp as an INTEGER
  • CURTIME() - Returns current time in local timezone
  • UTC_TIME() - Returns current time in UTC timezone
  • UTC_TIMESTAMP() - Returns current date/time in UTC timezone
  • SECOND() - Returns integer second from the timestamp argument
  • MINUTE() - Returns integer minute from the timestamp argument
  • HOUR() - Returns integer hour from the timestamp argument
  • DAY() - Returns integer day from the timestamp argument
  • MONTH() - Returns integer month from the timestamp argument
  • YEAR() - Returns integer year from the timestamp argument
  • YEARMONTH() - Returns integer year and month code from the timestamp argument
  • YEARMONTHDAY() - Returns integer year, month and day code from the timestamp argument
  • TIMEDIFF() - Returns difference between the timestamps
Geo-spatial
  • GEODIST() - Computes geosphere distance between two given points
  • GEOPOLY2D() - Creates a polygon that takes into account the Earth's curvature
  • POLY2D() - Creates a simple polygon in plain space
String
  • CONCAT() - Concatenates two or more strings
  • REGEX() - Returns 1 if regular expression matched to string of attribute and 0 otherwise
  • SNIPPET() - Highlights search results
  • SUBSTRING_INDEX() - Returns a substring of the string before the specified number of delimiter occurs
Other
  • LAST_INSERT_ID() - Returns ids of documents inserted or replaced by last statement in the current session

Common settings in configuration file

To be put to section common {} in configuration file:

indexer is a tool to create plain indexes

Indexer settings in configuration file

To be put to section indexer {} in configuration file:

Indexer start parameters
indexer [OPTIONS] [indexname1 [indexname2 [...]]]
  • --all - Rebuilds all indexes from the config
  • --buildstops - Reviews the index source, as if it were indexing the data, and produces a list of the terms that are being indexed.
  • --buildfreqs - Adds the quantity present in the index for --buildstops
  • --config, -c - Path to configuration file
  • --dump-rows - Dumps rows fetched by SQL source(s) into the specified file
  • --help - Lists all the parameters
  • --keep-attrs - Allows to reuse existing attributes on reindexing
  • --keep-attrs-names - Allows to specify attributes to reuse from the existing index
  • --merge-dst-range - Runs the filter range given upon merging
  • --merge-killlists - Changes the way kill lists are processed when merging indexes
  • --merge - Merges two plain indexes into one
  • --nohup - Indexer won't send SIGHUP if this option is on
  • --noprogress - Prevents displaying progress details
  • --print-queries - Prints out SQL queries that indexer sends to the database
  • --print-rt - Outputs data fetched from sql source(s) as INSERTs to a real-time index
  • --quiet - Prevents displaying anything
  • --rotate - Forces indexes rotation after all the indexes are built
  • --sighup-each - Forces rotation of each index after it's built
  • -v - Shows indexer version

Index converter from Manticore v2 / Sphinx v2

index_converter is a tool for converting indexes created with Sphinx/Manticore Search 2.x to Manticore Search 3.x index format.

index_converter {--config /path/to/config|--path}
Index converter start parameters
  • --config, -c - Path to indexes configuration file
  • --index - Specifies which index should be converted
  • --path - Defines path containing index(es) instead of the configuration file
  • --strip-path - Strips path from filenames referenced by index
  • --large-docid - Allows to convert documents with ids larger than 2^63
  • --output-dir - Writes the new files in a chosen folder
  • --all - Converts all indexes from the configuration file / path
  • --killlist-target - Sets the target indexes for which kill-lists will be applied

searchd is a Manticore server.

Searchd settings in a configuration file

To be put to section searchd {} in configuration file:

Searchd start parameters
searchd [OPTIONS]
  • --config, -c - Path to configuration file
  • --console - Forces running in console mode
  • --coredump - Enables saving core dump on crash
  • --cpustats - Enables CPU time reporting
  • --delete - Removes Manticore service from Microsoft Management Console and other places where the services are registered
  • --force-preread - Forbids the server to serve any incoming connection until pre-reading of the index files completes
  • --help, -h - Lists all the parameters
  • --index - Forces serving only the specified index
  • --install - Installs searchd as a service into Microsoft Management Console
  • --iostats - Enables input/output reporting
  • --listen, -l - Overrides listen from the configuration file
  • --logdebug, --logdebugv, --logdebugvv - Enables additional debug output in the server log
  • --logreplication - Enables additional replication debug output in the server log
  • --new-cluster - Bootstraps a replication cluster and makes the server a reference node with cluster restart protection
  • --new-cluster-force - Bootstraps a replication cluster and makes the server a reference node bypassing cluster restart protection
  • --nodetach - Leaves searchd in foreground
  • --ntservice - Passed by Microsoft Management Console to searchd to invoke it as a service on Windows platforms
  • --pidfile - Overrides pid_file from the configuration file
  • --port, -p - Specifies port searchd should listen on disregarding the port specified in the configuration file
  • --replay-flags - Specifies extra binary log replay options
  • --servicename - Applies the given name to searchd when installing or deleting the service, as would appear in Microsoft Management Console
  • --status - Queries running searchd to return its status
  • --stop - Stops Manticore server
  • --stopwait - Stops Manticore server gracefully
  • --strip-path - Strips path names from all the file names referenced from the index
  • -v - shows version information
Searchd environment variables

Miscellaneous index maintenance functionality useful for troubleshooting.

indextool <command> [options]
Indextool start parameters

Used to dump miscellaneous debug information about the physical index

indextool <command> [options]
  • --config, -c - Path to configuration file
  • --quiet, -q - Keeps indextool quiet - it will not output banner, etc
  • --help, -h - Lists all the parameters
  • -v - Shows version information
  • --checkconfig - Verifies configuration file
  • --buildidf - Builds IDF file from one or several dictionary dumps
  • --build-infixes - Build infixes for an existing dict=keywords index
  • --dumpheader - Quickly dumps the provided index header file
  • --dumpconfig - Dumps index definition from the given index header file in almost compliant manticore.conf file format
  • --dumpheader - Dumps index header by index name with looking up the header path in the configuration file
  • --dumpdict - Dumps index dictionary
  • --dumpdocids - Dumps document IDs by index name
  • --dumphitlist - Dumps all occurrences of the given keyword/id in the given index
  • --fold - Tests tokenization based on index's settings
  • --htmlstrip - Filters STDIN using HTML stripper settings for the given index
  • --mergeidf - Merges several .idf files into a single one
  • --morph - Applies morphology to the given STDIN and prints the result to stdout
  • --check - Checks the index data files for consistency
  • --check-id-dups - Checks if there are duplicate ids
  • --check-disk-chunk - Checks one disk chunk of an RT index
  • --strip-path - Strips path names from all the file names referenced from the index
  • --rotate - Defines whether to check index waiting for rotation in --check
  • --apply-killlists - Applies kill-lists for all indexes listed in the configuration file

Splits compound words into components.

wordbreaker [-dict path/to/dictionary_file] {split|test|bench}
Wordbreaker start parameters.

Used to extract contents of a dictionary file that uses ispell or MySpell format.

spelldump [options] <dictionary> <affix> [result] [locale-name]
  • dictionary - Dictionary's main file
  • affix - Dictionary's affix file
  • result - Specifies where the dictionary data should be output to
  • locale-name - Specifies the locale details to use

List of reserved keywords

A complete alphabetical list of keywords that are currently reserved in Manticore SQL syntax (and therefore can not be used as identifiers).

AND, AS, BY, DISTINCT, DIV, EXPLAIN, FACET, FALSE, FORCE, FROM, IGNORE, IN, INDEXES, IS, LIMIT, MOD, NOT, NULL, OFFSET, OR, ORDER, REGEX, RELOAD, SELECT, SYSFILTERS, TRUE, USE
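
When generating table or column names programmatically, it can help to check them against this list first. A minimal sketch, assuming keywords are matched case-insensitively (the helper name is hypothetical, and the set has to be kept in sync with the list above):

```python
RESERVED_KEYWORDS = frozenset(
    "AND, AS, BY, DISTINCT, DIV, EXPLAIN, FACET, FALSE, FORCE, FROM, IGNORE, "
    "IN, INDEXES, IS, LIMIT, MOD, NOT, NULL, OFFSET, OR, ORDER, REGEX, RELOAD, "
    "SELECT, SYSFILTERS, TRUE, USE".split(", ")
)


def is_reserved(identifier: str) -> bool:
    """True if the identifier collides with a reserved keyword."""
    return identifier.upper() in RESERVED_KEYWORDS


candidates = ["title", "order", "price", "limit"]
rejected = [name for name in candidates if is_reserved(name)]
```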

Documentation for old Manticore versions