Released: Oct 7 2024
- Issue #64 Resolved an issue where the `unattended-upgrades` utility, which automatically installs package updates on Debian-based systems, would incorrectly mark several Manticore packages, including `manticore-galera`, `manticore-executor`, and `manticore-columnar-lib`, for removal. This occurred due to `dpkg` mistakenly considering the virtual package `manticore-extra` as redundant. Changes were made to ensure `unattended-upgrades` no longer tries to remove essential Manticore components.
Released: August 2nd 2024
Version 6.3.6 continues the 6.3 series and includes only bug fixes.
- Issue #2477 Fixed a crash introduced in version 6.3.4, which could occur when dealing with expressions and distributed or multiple tables.
- Issue #2352 Fixed a daemon crash or internal error upon early exit caused by `max_query_time` when querying multiple indexes.
Released: July 31st 2024
Version 6.3.4 continues the 6.3 series and includes only minor improvements and bug fixes.
- Issue #2146 Improved escaping of delimiters in word forms and exceptions.
- Issue #2268 Improved external files renaming on copy for CREATE and ALTER TABLE statements.
- Issue #2315 Added string comparison operators to SELECT list expressions.
- Issue #2363 Added support for null values in Elastic-like bulk requests.
- Issue #2374 Added support for mysqldump version 9.
- Issue #2375 Improved error handling in HTTP JSON queries with JSON path to the node where the error occurs.
- Issue #2280 Fixed performance degradation in wildcard queries with many matches when disk_chunks > 1.
- Issue #2332 Fixed crash in MVA MIN or MAX SELECT list expressions for empty MVA arrays.
- Issue #2339 Fixed incorrect processing of Kibana's infinite range request.
- Issue #2342 Fixed join filter on columnar attributes from the right table when the attribute is not in the SELECT list.
- Issue #2343 Fixed duplicate 'static' specifier in Manticore 6.3.2.
- Issue #2344 Fixed LEFT JOIN returning non-matching entries when MATCH() over the right table is used.
- Issue #2350 Fixed saving of disk chunk at RT index with `hitless_words`.
- Issue #2364 The 'aggs_node_sort' property can now be added in any order among other properties.
- Issue #2368 Fixed error on full-text vs filter order in the JSON query.
- Issue #2376 Fixed bug related to incorrect JSON response for long UTF-8 requests.
- Issue #2684 Fixed calculation of presort/prefilter expressions that depend on joined attributes.
- Issue #301 Changed the method of calculating data size for metrics to read from the `manticore.json` file instead of checking the entire size of the data directory.
- Issue #302 Added handling of validation requests from Vector.dev.
Released: June 26th 2024
Version 6.3.2 continues the 6.3 series and includes several bug fixes, some of which were discovered after the release of 6.3.0.
- ⚠️Issue #2305 Updated aggs.range values to be numeric.
- Commit c51c Fixed grouping by stored check vs rset merge.
- Commit 0e85 Fixed a crash in the daemon when querying with wildcard characters in an RT index using a CRC dictionary and `local_df` enabled.
- Issue #2200 Fixed a crash in JOIN on `count(*)` without GROUP BY.
- Issue #2201 Fixed JOIN not returning a warning when attempting grouping by a full-text field.
- Issue #2230 Addressed issue where adding an attribute via `ALTER TABLE` did not take KNN options into account.
- Issue #2231 Fixed failure in removing `manticore-tools` Redhat package in version 6.3.0.
- Issue #2242 Corrected issues with JOIN and multiple FACET statements returning incorrect results.
- Issue #2250 Fixed ALTER TABLE producing an error if the table is in a cluster.
- Issue #2252 Fixed the original query being passed into buddy from the SphinxQL interface.
- Issue #2267 Improved wildcard expansion in the `CALL KEYWORDS` for RT index with disk chunks.
- Issue #271 Fixed hanging of incorrect `/cli` requests.
- Issue #274 Resolved issues where concurrent requests to Manticore could get stuck.
- Issue #275 Fixed hanging of `drop table if exists t; create table t` via `/cli`.
- Issue #2270 Added support for `cluster:name` format in the `/_bulk` HTTP endpoint.
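
As a rough illustration of the `cluster:name` format mentioned in Issue #2270, the Python sketch below only builds a newline-delimited request body and sends nothing. It assumes the Elasticsearch-style layout of one action line followed by one document line, and that the `cluster:name` value goes into the `_index` field; both are assumptions rather than details confirmed by this changelog. The function name, cluster name, table name, and documents are hypothetical.

```python
import json


def build_bulk_body(cluster, table, docs):
    """Build an NDJSON body addressing a table inside a replication cluster.

    Assumption: the target is written as "cluster:name" in the `_index`
    field of each Elasticsearch-style action line.
    """
    target = f"{cluster}:{table}"
    lines = []
    for doc_id, fields in docs:
        # Action line: which table (and cluster) the next document goes to.
        lines.append(json.dumps({"index": {"_index": target, "_id": doc_id}}))
        # Document line: the fields of the document itself.
        lines.append(json.dumps(fields))
    # Bulk bodies are newline-delimited and end with a trailing newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = build_bulk_body(
        "posts_cluster",          # hypothetical cluster name
        "articles",               # hypothetical table name
        [(1, {"title": "first"}), (2, {"title": "second"})],
    )
    print(body, end="")
```

The resulting string would then be posted to the `/_bulk` endpoint with whatever HTTP client is already in use.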
Released: May 23rd 2024
- Issue #839 Implemented float_vector data type; implemented vector search.
- Issue #1673 INNER/LEFT JOIN (beta stage).
- Issue #1744 Implemented autodetection of date formats for timestamp fields.
- Issue #1720 Changed Manticore Search license from GPLv2-or-later to GPLv3-or-later.
- Commit 7a55 Running Manticore in Windows now requires Docker to run Buddy.
- Issue #1541 Added a REGEX full-text operator.
- Issue #2091 Ubuntu Noble 24.04 support.
- Commit 514d Revamp of time operations for better performance and new date/time functions:
- CURDATE() - Returns current date in local timezone
- QUARTER() - Returns the integer quarter of the year from a timestamp argument
- DAYNAME() - Returns the weekday name for a given timestamp argument
- MONTHNAME() - Returns the name of the month for a given timestamp argument
- DAYOFWEEK() - Returns the integer weekday index for a given timestamp argument
- DAYOFYEAR() - Returns the integer day of the year for a given timestamp argument
- YEARWEEK() - Returns the integer year and the day code of the first day of current week for a given timestamp argument
- DATEDIFF() - Returns the number of days between two given timestamps
- DATE() - Formats the date part from a timestamp argument
- TIME() - Formats the time part from a timestamp argument
- timezone - Timezone used by date/time-related functions.
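
To make the descriptions above concrete, here is a small Python sketch of what several of these functions are described as returning. It is not Manticore's implementation: the helper names are hypothetical, the timezone defaults to UTC purely for reproducibility, and the sign convention of `datediff` (first argument minus second) is an assumption.

```python
from datetime import datetime, timezone

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
MONTH_NAMES = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]


def _to_dt(ts, tz=timezone.utc):
    # Interpret a Unix timestamp in the given timezone (UTC by default).
    return datetime.fromtimestamp(ts, tz)


def quarter(ts, tz=timezone.utc):
    # Integer quarter of the year, 1 to 4.
    return (_to_dt(ts, tz).month - 1) // 3 + 1


def dayname(ts, tz=timezone.utc):
    # Weekday name for the timestamp.
    return DAY_NAMES[_to_dt(ts, tz).weekday()]


def monthname(ts, tz=timezone.utc):
    # Month name for the timestamp.
    return MONTH_NAMES[_to_dt(ts, tz).month - 1]


def dayofyear(ts, tz=timezone.utc):
    # Integer day of the year, starting at 1.
    return _to_dt(ts, tz).timetuple().tm_yday


def datediff(ts_a, ts_b, tz=timezone.utc):
    # Whole calendar days between two timestamps (assumed: a minus b).
    return (_to_dt(ts_a, tz).date() - _to_dt(ts_b, tz).date()).days


if __name__ == "__main__":
    ts = 1716422400  # 2024-05-23 00:00:00 UTC
    print(quarter(ts), dayname(ts), monthname(ts), dayofyear(ts))
    print(datediff(ts, 1704067200))  # days since 2024-01-01 00:00:00 UTC
```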
- Commit 30e7 Added range, histogram, date_range, and date_histogram aggregates to the HTTP interface and similar expressions into SQL.
- Issue #1285 Support of Filebeat versions 8.10 - 8.11.
- Issue #1771 ALTER TABLE ... type='distributed'.
- Issue #1788 Added the ability to copy tables using the CREATE TABLE ... LIKE ... WITH DATA SQL statement.
- Issue #2072 Optimized the table compacting algorithm: Previously, both manual OPTIMIZE and automatic auto_optimize processes would first merge chunks to ensure the count did not exceed the limit, and then expunge deleted documents from all other chunks containing deleted documents. This approach was sometimes too resource-intensive and has been disabled. Now, chunk merging occurs solely according to the progressive_merge setting. However, chunks with a high number of deleted documents are more likely to be merged.
- Commit ce6c Added protection against loading a secondary index of a newer version.
- Issue #1417 Partial replace via REPLACE INTO ... SET.
- Commit 7c16 Updated default merge buffer sizes: `.spa` (scalar attrs): 256KB -> 8MB; `.spb` (blob attrs): 256KB -> 8MB; `.spc` (columnar attrs): 1MB, no change; `.spds` (docstore): 256KB -> 8MB; `.spidx` (secondary indexes): 256KB buffer -> 128MB memory limit; `.spi` (dictionary): 256KB -> 16MB; `.spd` (doclists): 8MB, no change; `.spp` (hitlists): 8MB, no change; `.spe` (skiplists): 256KB -> 8MB.
- Issue #1859 Added composite aggregation via JSON.
- Commit 216b Disabled PCRE.JIT due to issues with some regex patterns and no significant time benefit.
- Commit 55cd Added support for vanilla Galera v.3 (api v25) (`libgalera_smm.so` from MySQL 5.x).
- Commit 86f9 Changed metric suffix from `_rate` to `_rps`.
- Commit c0c1 Improved docs about balancer HA support.
- Commit d1d2 Changed `index` to `table` in error messages; fixed bison parser error message fixup.
- Commit fd26 Support `manticore.tbl` as table name.
- Issue #1105 Support for running indexer via systemd (docs). ❤️ Thank you, @subnix for the PR.
- Issue #1294 Secondary indexes support in GEODIST().
- Issue #1394 Simplified SHOW THREADS.
- Issue #1424 Added support for the default values (`agent_connect_timeout` and `agent_query_timeout`) for `create distributed table` statement.
- Issue #1442 Added expansion_limit search query option that overrides `searchd.expansion_limit`.
- Issue #1448 Implemented ALTER TABLE for int->bigint conversion.
- Issue #146 Meta information in MySQL response.
- Issue #1494 SHOW VERSION.
- Issue #1582 Support of deleting documents by id array via JSON.
- Issue #1589 Improve error "unsupported value type".
- Issue #1634 Added Buddy version into `SHOW STATUS`.
- Issue #1641 Match requests optimisation in case of zero docs for a keyword.
- Issue #1712 Added conversion to bool attribute from the string value of "true" and "false" on posting data.
- Issue #1713 Added access_dict table and searchd option.
- Issue #1767 Added new options: expansion_merge_threshold_docs and expansion_merge_threshold_hits to the searchd section of the config; made the threshold for merging tiny terms of the expanded terms configurable.
- Issue #1768 Added display of the last command time in `@@system.sessions`.
- Issue #1806 Upgraded Buddy Protocol to version 2.
- Issue #1810 Implemented additional request formats to the `/sql` endpoint to ease integration with libraries.
- Issue #1825 Added an Info section to SHOW BUDDY PLUGINS.
- Issue #1837 Improved memory consumption in `CALL PQ` with large packets.
- Issue #1853 Switched compiler from Clang 15 to Clang 16.
- Issue #1857 Added string comparison. ❤️ Thank you, @etcd for the PR.
- Issue #1915 Added support for joined stored fields.
- Issue #1937 Log client's host:port in query-log.
- Issue #1981 Fixed wrong error.
- Issue #1983 Introduced support for verbosity levels for the query plan via JSON.
- Issue #2010 Disabled logging of queries from Buddy unless `log_level=debug` is set.
- Issue #2035 Linux Mint 21.3 support.
- Issue #2056 Manticore couldn't be built with MySQL 8.3+.
- Issue #2112 `DEBUG DEDUP` command for real-time table chunks that can experience duplicate entries after attaching a plain table containing duplicates.
- Issue #212 Added time to SHOW QUERIES.
- Issue #218 Handle `@timestamp` column as timestamp.
- Issue #252 Implemented logic to enable/disable buddy plugins.
- Issue #254 Updated composer to a fresher version where recent CVEs are fixed.
- Issue #340 Minor optimization in Manticore systemd unit related with `RuntimeDirectory`.
- Issue #51 Added rdkafka support and updated to PHP 8.3.3.
- Issue #527 Support of attaching an RT table. New command ALTER TABLE ... RENAME.
- ⚠️Issue #1436 Fixed an IDF calculation issue. `local_df` is now a default. Improved the master-agent search protocol (version updated). If you are running Manticore Search in a distributed environment with multiple instances, make sure to first upgrade the agents, then the masters.
- ⚠️Issue #1572 Added replication of distributed tables and updated the replication protocol. If you are running a replication cluster, you need to:
- First, cleanly stop all your nodes
- Then, start the node that was stopped last with `--new-cluster`, using the tool `manticore_new_cluster` in Linux.
- Read about restarting a cluster for more details.
- ⚠️Issue #1763 HTTP API endpoint aliases `/json/*` have been deprecated.
- ⚠️Issue #1982 Changed profile to plan in JSON, added query profiling for JSON.
- ⚠️Commit e235 manticore-backup doesn't backup `plugin_dir` anymore.
- ⚠️Issue #171 Migrated Buddy to Swoole to improve performance and stability. When switching to the new version, ensure all Manticore packages are updated.
- ⚠️Issue #196 Merged all core plugins into Buddy and changed the core logic.
- ⚠️Issue #2107 Treating document IDs as numbers in `/search` responses.
- ⚠️Issue #38 Added Swoole, disabled ZTS, and removed the parallel extension.
- ⚠️Issue #1929 Overriding in `charset_table` was not working in some cases.
- Commit 3376 Fixed replication error on SST of large files.
- Commit 6d36 Added a retry mechanism to replication commands; fixed replication join failure on a busy network with packet loss.
- Commit 842e Changed the FATAL message in replication to a WARNING message.
- Commit 8c32 Fixed the calculation of the `gcache.page_size` for replication clusters without tables or with empty tables; also fixed saving and loading of the Galera options.
- Commit a2af Added functionality to skip the update nodes replication command on the node that joins the cluster.
- Commit c054 Fixed deadlock during replication on updating blob attributes versus replacing documents.
- Commit e80d Added replication_connect_timeout, replication_query_timeout, replication_retry_delay, replication_retry_count searchd config options to control network during replication similar to `searchd.agent_*` but with different defaults.
- Issue #1356 Fixed replication nodes retry after some nodes are missed and name resolution of these nodes failed.
- Issue #1445 Fixed the replication log verbosity level at the `show variables`.
- Issue #1482 Fixed a replication issue for a joiner node connecting to a cluster on a pod restarted in Kubernetes.
- Issue #1962 Fixed a long wait for replication to alter on empty cluster with an invalid node name.
- Commit 8a48 Fixed unused matches cleanup in `count distinct` which could cause a crash.
- Issue #1569 Binary log is now written with transaction granularity.
- Issue #2089 Fixed a bug associated with 64-bit IDs that could result in a "Malformed packet" error when inserting via MySQL, leading to corrupted tables and duplicate IDs.
- Issue #2160 Fixed dates being inserted as if they were in UTC instead of local time zone.
- Issue #2177 Fixed a crash that occurred when performing a search in a real-time table with a non-empty `index_token_filter`.
- Issue #2209 Changed duplicate filtering in RT columnar storage to fix crashes and wrong query results.
- Commit 001d Fixed html stripper corrupting memory after processing a joined field.
- Commit 00eb Avoided rewinding stream after flush to prevent miscommunication issues with mysqldump.
- Commit 0553 Don't wait for preread to finish if it has not started.
- Commit 055a Fixed large Buddy output string to split among multiple lines in the searchd log.
- Commit 0a88 Moved the MySQL interface warning about a failing header to the `debugv` verbosity level.
- Commit 150a Fixed race condition on multiple clusters management operations; prohibited creating multiple clusters with the same name or path.
- Commit 2e40 Fixed implicit cutoff in fulltext queries; split MatchExtended into template partD.
- Commit 75f5 Fixed the discrepancy of `index_exact_words` between indexing and loading the table to the daemon.
- Commit 7643 Fixed missed error message for invalid cluster deletion.
- Commit 7a03 Fixed CBO vs queue union; fixed CBO vs RT pseudo sharding.
- Commit 7b4e When starting without the secondary index (SI) library and parameters in the configuration, the misleading warning message 'WARNING: secondary_indexes set but failed to initialize secondary library' was issued.
- Commit 8496 Fixed hit sorting in quorum.
- Commit 8973 Fixed issue with upper case options in the ModifyTable plugin.
- Commit 9935 Fixed restoring from a dump with empty json values (represented as NULL).
- Commit a28f Fixed SST timeout at the joiner node when receiving SST by using pcon.
- Commit b5a5 Fixed a crash on selecting aliased string attribute.
- Commit c556 Added query transform of the term into `=term` of full-text query with the `morphology_skip_fields` field.
- Commit cdc3 Added missing config key (skiplist_cache_size).
- Commit cf6e Fixed crash at the expression ranker with large complex query.
- Commit e513 Fixed fulltext CBO vs invalid index hints.
- Commit eb05 Interrupt preread on shutdown for faster shutdown.
- Commit f945 Changed stack calculation for fulltext queries to avoid a crash in case of a complex query.
- Issue #1262 Fixed a crash of the indexer when indexing an SQL source with multiple columns having the same name.
- Issue #1273 Return 0 instead of
for non-existing sysvars.
- Issue #1289 Fixed indextool error when checking external files of the RT table.
- Issue #1335 Fixed query parse error due to multi wordform inside the phrase.
- Issue #1364 Added the replay of empty binlog files with old binlog versions.
- Issue #1365 Fixed removal of the last empty binlog file.
- Issue #1372 Fixed incorrect relative paths (converted to absolute from the daemon's start directory) after changes in `data_dir` affect the current work directory on daemon start.
- Issue #1393 Slowest time degradation in hn_small: fetch/cache cpu info on daemon startup.
- Issue #1395 Fixed warning regarding missing external file during index load.
- Issue #1402 Fixed crash at global groupers on free of data ptr attributes.
- Issue #1403 _ADDITIONAL_SEARCHD_PARAMS is not working.
- Issue #1427 Fixed per table `agent_query_timeout` being replaced by the default query option `agent_query_timeout`.
- Issue #1444 Fixed crash at the grouper and ranker when using `packedfactors()` with multiple values per match.
- Issue #1458 Manticore crashes on frequent index updates.
- Issue #1481 Fixed crash on cleanup of the parsed query after parse error.
- Issue #1484 Fixed HTTP JSON requests not being routed to buddy.
- Issue #1499 JSON attribute root value couldn't be an array. Fixed.
- Issue #1507 Fixed crash on table recreation within the transaction.
- Issue #1515 Fixed expansion of the short forms of the RU lemmas.
- Issue #1579 Fixed JSON and STRING attributes usage in [date_format](Functions/Date_and_time_functions.md#DATE_FORMAT()) expression.
- Issue #1580 Fixed the grouper for multiple aliases to JSON fields.
- Issue #1594 Wrong total_related in dev: fixed implicit cutoff vs limit; added better fullscan detection in json queries.
- Issue #1603 Fixed JSON and STRING attributes usage in all date expression.
- Issue #1609 crash on using LEVENSHTEIN().
- Issue #1612 Fixed memory corruption after a search query parse error with highlight.
- Issue #1614 Disabled wildcard expansion for terms shorter than `min_prefix_len` / `min_infix_len`.
- Issue #1617 Altered behavior to not log an error if Buddy handles the request successfully.
- Issue #1635 Fixed total at the meta of the search query for queries with limit set.
- Issue #1640 Impossible to use a table with an upper case via JSON in plain mode.
- Issue #1643 Provided a default `SPH_EXTNODE_STACK_SIZE` value.
- Issue #1646 Fixed SphinxQL log of negative filter with ALL/ANY on MVA attribute.
- Issue #1660 Fix application of docid killlists from other indexes. ❤️ Thank you, @raxoft for the PR.
- Issue #1668 Fixed missed matches due to early exit on raw index full scan (without any indexes iterators); removed cutoff from the plain row iterator.
- Issue #1671 Fixed `FACET` error when querying a distributed table with agent and local tables.
- Issue #1690 Fixed crash on histogram estimation for large values.
- Issue #1692 crash on alter table tbl add column col uint.
- Issue #1710 Empty result for condition `WHERE json.array IN (<value>)`.
- Issue #172 Fixed an issue with TableFormatter when sending request to `/cli`.
- Issue #1742 `CREATE TABLE` wasn't failing in case of a missing wordforms file.
- Issue #1762 The order of attributes in RT tables now follows the configuration order.
- Issue #1765 HTTP bool query with 'should' condition returns incorrect results.
- Issue #1769 Sorting by string attributes does not work with `SPH_SORT_ATTR_DESC` and `SPH_SORT_ATTR_ASC`.
- Issue #177 Disabled the `Expect: 100-continue` HTTP header for curl requests to Buddy.
- Issue #1791 crash caused by GROUP BY alias.
- Issue #1792 SQL meta summary shows wrong time on windows.
- Issue #1794 Fixed a single-term performance drop with JSON queries.
- Issue #1798 Incompatible filters didn't raise an error on `/search`.
- Issue #1802 Fixed `ALTER CLUSTER ADD` and `JOIN CLUSTER` operations to wait for each other, preventing a race condition where `ALTER` adds a table to the cluster while the donor sends tables to the joiner node.
- Issue #1811 Incorrect handling of `/pq/{table}/*` requests.
- Issue #1816 `UNFREEZE` wasn't working in some cases.
- Issue #183 Fixed an issue with MVA restoration in some cases.
- Issue #1849 Fixed indextool crash on shutdown if used with MCL.
- Issue #1866 Fixed unnecessary url decoding for `/cli_json` requests.
- Issue #1872 change plugin_dir set logic on daemon start.
- Issue #1874 alter table ... exceptions fails.
- Issue #1891 Manticore crashes with `signal 11` when inserting data.
- Issue #1920 Reduced throttling for low_priority.
- Issue #1924 Mysqldump + mysql restore bug.
- Issue #1951 Fixed incorrect creation of the distributed table in the case of a missing local table or incorrect agent description; now returns an error message.
- Issue #1972 Implemented a `FREEZE` counter to avoid freeze/unfreeze issues.
- Issue #1980 Obey query timeout in OR nodes. Previously, `max_query_time` could fail to work in some cases.
- Issue #1986 Failed to rename new to current [manticore.json].
- Issue #1988 A full-text query could ignore a `SecondaryIndex` CBO hint.
- Issue #1990 Fixed `expansion_limit` to slice final result set for call keywords from multiple disk chunks or RAM chunks.
- Issue #1994 wrong external files.
- Issue #2021 A few manticore-executor processes could be left running after stopping Manticore.
- Issue #2029 Crash using Levenshtein Distance.
- Issue #2037 Got error after multiple max operator ran on an empty index.
- Issue #2052 crash on multi-group with JSON.field.
- Issue #2067 Manticore was crashing on incorrect request to _update.
- Issue #2069 Fixed an issue with string filter comparators for closed ranges in the JSON interface.
- Issue #2082 `alter` failed when the data_dir path was located on a symlink.
- Issue #2102 Improved special handling of SELECT queries in mysqldump to ensure the resulting INSERT statements are compatible with Manticore.
- Issue #2103 Thai chars were in the wrong charsets.
- Issue #2124 Crash if I use an SQL with a reserved word.
- Issue #2154 Tables with wordforms couldn't be imported.
- Issue #2176 Fixed a crash that occurred when the engine parameter was set to 'columnar' and duplicate IDs were added via JSON.
- Issue #223 Proper error when trying to insert a document w/o schema and w/o column names.
- Issue #239 Auto-schema multi-line insert could fail.
- Issue #399 Added an error message on indexing if an id attribute is declared at the data source.
- Issue #59 Manticore cluster breakdown.
- Issue #68 optimize.php would crash if percolate table was present.
- Issue #77 Fixed errors when deploying on Kubernetes.
- Issue #97 Set VIP HTTP port as default when available.
Various improvements: improved versions check and streaming ZSTD decompression; added user prompts for version mismatches during restore; fixed incorrect prompting behavior for different versions on restore; enhanced decompression logic to read directly from the stream rather than into working memory; added `--force` flag.
- Commit 3b35 Added backup version display after Manticore search start to identify issues at this stage.
- Commit ad2e Updated error message for failed connections to the daemon.
- Commit ce5e Fixed issue with converting absolute root backup paths to relative and removed writeable check on restore to enable restoration from different paths.
- Commit db7e Added sorting to the file iterator to ensure consistency across various situations.
- Issue #106 Backup and restore of multiple configurations.
- Issue #91 Added defattr to prevent unusual user permissions in files after installation on RHEL.
- Issue #91 Added extra chown to ensure files default to the root user in Ubuntu.
- Commit f104 Vector search support.
- Commit 2169 Fixed cleanup of temporary files during the interrupted setup of the secondary index build. This resolves the issue where the daemon exceeded the open files limit when creating `tmp.spidx` files.
- Commit 709b Use separate streamvbyte library for columnar and SI.
- Commit 1c26 Added a warning that columnar storage doesn't support json attrs.
- Commit 3acd Fixed data unpacking in SI.
- Commit 574c Fixed a crash on saving a disk chunk with mixed rowwise and columnar storage.
- Commit e87f Fixed SI iterator being hinted at an already processed block.
- Issue #1474 Update is broken for rowwise MVA column with columnar engine.
- Issue #1510 Fixed crash when aggregating to a columnar attribute used in `HAVING`.
- Issue #1519 Fixed crash in `expr` ranker on using columnar attribute.
- ❗Issue #42 Support of plain indexation via environment variables.
- ❗Issue #47 Improved flexibility of configuration via environment vars.
- Issue #54 Improved the backup and restore processes for Docker.
- Issue #77 Improved entrypoint to handle backup restoration on first start only.
- Commit a27c Fixed query logging to stdout.
- Issue #38 Mute BUDDY warnings if EXTRA is not set.
- Issue #71 Fixed hostname in `manticore.conf.sh`.
Released: August 23rd 2023
Version 6.2.12 continues the 6.2 series and addresses issues discovered after the release of 6.2.0.
- ❗Issue #1351 "Manticore 6.2.0 doesn't start via systemctl on Centos 7": Modified `TimeoutStartSec` from `infinity` to `0` for better compatibility with Centos 7.
- ❗Issue #1364 "Crash after upgrading from 6.0.4 to 6.2.0": Added replay functionality for empty binlog files from older binlog versions.
- PR #1334 "fix typo in searchdreplication.cpp": Corrected a typo in `searchdreplication.cpp`: beggining -> beginning.
- Issue #1337 "Manticore 6.2.0 WARNING: conn (local)(12), sock=8088: bailing on failed MySQL header, AsyncNetInputBuffer_c::AppendData: error 11 (Resource temporarily unavailable) return -1": Lowered the verbosity level of the MySQL interface warning about the header to logdebugv.
- Issue #1355 "join cluster hangs when node_address can't be resolved": Improved replication retry when certain nodes are unreachable, and their name resolution fails. This should resolve issues in Kubernetes and Docker nodes related to replication. Enhanced the error message for replication start failures and made updates to test model 376. Additionally, provided a clear error message for name resolution failures.
- Issue #1361 "No lower case mapping for "Ø" in charset non_cjk": Adjusted the mapping for the 'Ø' character.
- Issue #1365 "searchd leaves binlog.meta and binlog.001 after clean stop": Ensured that the last empty binlog file is removed properly.
- Commit 0871: Fixed the `Thd_t` build issue on Windows related to atomic copy restrictions.
- Commit 1cc0: Addressed an issue with FT CBO vs `ColumnarScan`.
- Commit c6bf: Made corrections to test 376 and added a substitution for the `AF_INET` error in the test.
- Commit cbc3: Resolved a deadlock issue during replication when updating blob attributes versus replacing documents. Also removed the rlock of the index during commit because it's already locked at a more basic level.
- Commit 4f91 Updated info on `/bulk` endpoints in the manual.
- Support of Manticore Columnar Library v2.2.4
Released: August 4th 2023
- The query optimizer has been enhanced to support full-text queries, significantly improving search efficiency and performance.
- Integrations with:
- mysqldump - to make logical backups using `mysqldump`
- Apache Superset and Grafana to visualize data stored in Manticore
- HeidiSQL and DBForge for easier development with Manticore
- We've started using GitHub workflows, making it simpler for contributors to utilize the same Continuous Integration (CI) process that the core team applies when preparing packages. All jobs can be run on GitHub-hosted runners, which facilitates seamless testing of changes in your fork of Manticore Search.
- We've started using CLT to test complex scenarios. For example, we're now able to ensure that a package built after a commit can be properly installed across all supported Linux operating systems. The Command Line Tester (CLT) provides a user-friendly way to record tests in an interactive mode and to effortlessly replay them.
- Significant performance improvement in count distinct operation by employing a combination of hash tables and HyperLogLog.
- Enabled multithreaded execution of queries containing secondary indexes, with the number of threads limited to the count of physical CPU cores. This should considerably improve the query execution speed.
- `pseudo_sharding` has been adjusted to be limited to the number of free threads. This update considerably enhances the throughput performance.
- Users now have the option to specify the default attribute storage engine via the configuration settings, providing better customization to match specific workload requirements.
- Support for Manticore Columnar Library 2.2.0 with numerous bug fixes and improvements in Secondary indexes.
- Buddy #153: The `/pq` HTTP endpoint now serves as an alias for the `/json/pq` HTTP endpoint.
- Commit 0bf1: We've ensured multi-byte compatibility for `upper()` and `lower()`.
- Commit 2bb9: Instead of scanning the index for `count(*)` queries, a precalculated value is now returned.
- Commit 3c84: It's now possible to use `SELECT` for making arbitrary calculations and displaying `@@sysvars`. Unlike before, you are no longer limited to just one calculation. Therefore, queries like `select user(), database(), @@version_comment, version(), 1+1 as a limit 10` will return all the columns. Note that the optional 'limit' will always be ignored.
- Commit 6aca: Implemented the `CREATE DATABASE` stub query.
- Commit 9dc1: When executing `ALTER TABLE table REBUILD SECONDARY`, secondary indexes are now always rebuilt, even if attributes weren't updated.
- Commit 46ed: Sorters utilizing precalculated data are now identified before using CBO to avoid unnecessary CBO calculations.
- Commit 102a: Implemented mocking and use of the full-text expression stack to prevent daemon crashes.
- Commit 979f: A speedy code path has been added for match cloning code for matches that don't use string/mvas/json attributes.
- Commit a073: Added support for the `SELECT DATABASE()` command. However, it will always return `Manticore`. This addition is crucial for integrations with various MySQL tools.
- Commit bc04: Modified the response format of the `/cli` endpoint, and added the `/cli_json` endpoint to function as the previous `/cli`.
- Commit d70b: The `thread_stack` can now be altered during runtime using the `SET` statement. Both session-local and daemon-wide variants are available. Current values can be accessed in the `show variables` output.
- Commit d96e: Code has been integrated into CBO to more accurately estimate the complexity of executing filters over string attributes.
- Commit e77d: The DocidIndex cost calculation has been improved, enhancing overall performance.
- Commit f3ae: Load metrics, similar to 'uptime' on Linux, are now visible in the `SHOW STATUS` command.
- Commit f3cc: The field and attribute order for `DESC` and `SHOW CREATE TABLE` now match that of `SELECT * FROM`.
- Commit f3d2: Different internal parsers now provide their internal mnemonic code (e.g., `P01`) during various errors. This enhancement aids in identifying which parser caused an error and also obscures non-essential internal details.
- Issue #271 "Sometimes CALL SUGGEST does not suggest a correction of a single letter typo": Improved SUGGEST/QSUGGEST behaviour for short words: added the option `sentence` to show the entire sentence.
- Issue #696 "Percolate index does not search properly by exact phrase query when stemming enabled": The percolate query has been modified to handle an exact term modifier, improving search functionality.
- Issue #829 "DATE FORMATTING methods": added the [date_format()](../Functions/Date_and_time_functions.md#DATE_FORMAT()) select list expression, which exposes the `strftime()` function.
- Issue #961 "Sorting buckets via HTTP JSON API": introduced an optional sort property for each bucket of aggregates in the HTTP interface.
- Issue #1062 "Improve error logging of JSON insert api failure - "unsupported value type"": The `/bulk` endpoint reports information regarding the number of processed and non-processed strings (documents) in case of an error.
- Issue #1070 "CBO hints don't support multiple attributes": Enabled index hints to handle multiple attributes.
- Issue #1106 "Add tags to http search query": Tags have been added to HTTP PQ responses.
- Issue #1301 "buddy should not create table in parallel": Resolved an issue that was causing failures from parallel CREATE TABLE operations. Now, only one `CREATE TABLE` operation can run at a time.
- Issue #1303 "add support of @ to column names".
- Issue #1316 "Queries on taxi dataset are slow with ps=1": The CBO logic has been refined, and the default histogram resolution has been set to 8k for better accuracy on attributes with randomly distributed values.
- Issue #1317 "Fix CBO vs fulltext on hn dataset": Enhanced logic has been implemented for determining when to use bitmap iterator intersection and when to use a priority queue.
- Issue #1318 "columnar: change iterator interface to single-call": Columnar iterators now use a single `Get` call, replacing the previous two-step `AdvanceTo` + `Get` calls to retrieve a value.
- Issue #1319 "Aggregate calc speedup (remove CheckReplaceEntry?)": The `CheckReplaceEntry` call was removed from the group sorter to expedite the calculation of aggregate functions.
- Issue #1320 "create table read_buffer_docs/hits doesn't understand k/m/g syntax": The `CREATE TABLE` options `read_buffer_docs` and `read_buffer_hits` now support k/m/g syntax.
- Language packs for English, German and Russian can now be effortlessly installed on Linux by executing the command `apt/yum install manticore-language-packs`. On macOS, use the command `brew install manticoresoftware/tap/manticore-language-packs`.
- Field and attribute order is now consistent between `SHOW CREATE TABLE` and `DESC` operations.
- If disk space is insufficient when executing `INSERT` queries, new `INSERT` queries will fail until enough disk space becomes available.
- The UINT64() type conversion function has been added.
- The `/bulk` endpoint now processes empty lines as a commit command. More info here.
- Warnings have been implemented for invalid index hints, providing more transparency and allowing for error mitigation.
- When `count(*)` is used with a single filter, queries now leverage precalculated data from secondary indexes when available, substantially speeding up query times.
- ⚠️ Tables created or modified in version 6.2.0 cannot be read by older versions.
- ⚠️ Document IDs are now handled as unsigned 64-bit integers during indexing and INSERT operations.
- ⚠️ The syntax for query optimizer hints has been updated. The new format is `/*+ SecondaryIndex(uid) */`. Please note that the old syntax is no longer supported.
- ⚠️ Issue #1160: The usage of `@` in table names has been disallowed to prevent syntax conflicts.
- ⚠️ String fields/attributes marked as `indexed` and `attribute` are now regarded as a single field during `INSERT`, `DESC`, and `ALTER` operations.
- ⚠️ Issue #1057: MCL libraries will no longer load on systems that don't support SSE 4.2.
- ⚠️ Issue #1143: agent_query_timeout was broken. Fixed and is now effective.
- Commit 2a6e "Crash on DROP TABLE": resolved a problem causing extended wait times to finish write operations (optimize, disk chunk save) on an RT table when executing a DROP TABLE statement. Added a warning to notify when a table directory is not empty after executing a DROP TABLE command.
- Commit 2ebd: Support for columnar attributes, which was missing in the code used for grouping by multiple attributes, has been added.
- Commit 3be4 Resolved a crash issue potentially caused by disk space running out by properly handling write errors in binlog.
- Commit 6adb: A crash that occasionally occurred when using multiple columnar scan iterators (or secondary index iterators) in a query has been fixed.
- Commit 6bd9: Filters were not being removed when using sorters that use precalculated data. This issue has been fixed.
- Commit 6d03: The CBO code has been updated to provide better estimates for queries using filters over row-wise attributes executed in multiple threads.
- Commit 6dd3, Helm #56 "fatal crash dump in Kubernetes cluster": Fixed a defective bloom filter for the JSON root object; fixed daemon crash due to filtering by a JSON field.
- Commit 6e1b Rectified daemon crash caused by invalid `manticore.json` config.
- Commit 6fbc Fixed the json range filter to support int64 values.
- Commit 9c67 `.sph` files could be corrupted during `ALTER`. Fixed.
- Commit 77cc: A shared key has been added for the replication of the replace statement to resolve a `pre_commit` error occurring when replace is replicated from multiple master nodes.
- Commit 2884 resolved issues with bigint checks over functions like 'date_format()'.
- Commit 9513: Iterators are no longer displayed in SHOW META when sorters utilize precalculated data.
- Commit a2a7: The fulltext node stack size has been updated to prevent crashes on complex fulltext queries.
- Commit a062: A bug causing a crash during the replication of updates with JSON and string attributes has been resolved.
- Commit b3e6: The string builder has been updated to use 64-bit integers to avoid crashes when dealing with large data sets.
- Commit c472: Addressed a crash that was occurring with count distinct across multiple indexes.
- Commit d073: Fixed an issue where queries over disk chunks of RT indexes could be executed in multiple threads even if `pseudo_sharding` was disabled.
- Commit d205 The set of values returned by the `show index status` command has been modified and now varies depending on the type of index in use.
- Commit e9bc Fixed an HTTP error when processing bulk requests and an issue where the error wasn't being returned to the client from the net loop.
- Commit f77c use of an extended stack for PQ.
- Commit fac2 Updated the export ranker output to align with packedfactors().
- Commit ff87: Fixed an issue with the string list in the filter of the SphinxQL query log.
- Issue #589 "The charset definition seems to depend on the ordering of codes": Fixed incorrect charset mapping for duplicates.
- Issue #811 "Mapping multiple words in word forms interferes phrase search with CJK punctuations between keywords": Fixed ngram token position within phrase query with wordforms.
- Issue #834 "Equals sign in search query breaks request": Ensured the exact symbol can be escaped and fixed double exact expansion by the `expand_keywords` option.
- Issue #864 "exceptions/stopwords conflict"
- Issue #910 "Manticore crash when calling call snippets() with libstemmer_fr and index_exact_words": Resolved internal conflicts causing crashes when `SNIPPETS()` was called.
- Issue #946 "Duplicate records during SELECT": Fixed the issue of duplicate documents in the result set for a query with `not_terms_only_allowed` option to RT index with killed documents.
- Issue #967 "Using JSON arguments in UDF functions leads to a crash": Fixed a daemon crash when processing a search with pseudo-sharding enabled and UDF with JSON argument.
- Issue #1050 "count(*) in FEDERATED": Fixed a daemon crash occurring with a query through a `FEDERATED` engine with aggregate.
- Issue #1052 Fixed an issue where `rt_attr_json` column was incompatible with columnar storage.
- Issue #1072 "* is removed from search query by ignore_chars": Fixed this issue so wildcards in a query aren't impacted by `ignore_chars`.
- Issue #1075 "indextool --check fails if there's a distributed table": indextool is now compatible with instances having 'distributed' and 'template' indexes in the json config.
- Issue #1081 "particular select on particular RT dataset leads to crash of searchd": Resolved daemon crash on a query with packedfactors and large internal buffer.
- Issue #1095 "With not_terms_only_allowed deleted documents are ignored"
- Issue #1099 "indextool --dumpdocids is not working": Restored functionality of the `--dumpdocids` command.
- Issue #1100 "indextool --buildidf is not working": indextool now closes the file after finishing globalidf.
- Issue #1104 "Count(*) is trying to be treated as schema set in remote tables": Resolved an issue where an error message was being sent by the daemon for queries into the distributed index when the agent returned an empty result set.
- Issue #1109 "FLUSH ATTRIBUTES hangs with threads=1".
- Issue #1126 "Lost connection to MySQL server during query - manticore 6.0.5": Crashes that were happening when using multiple filters over columnar attributes have been addressed.
- Issue #1135 "JSON string filtering case sensitivity": Corrected the collation to function correctly for filters used in HTTP search requests.
- Issue #1140 "Match in a wrong field": Fixed the damage related with `morphology_skip_fields`.
- Issue #1155 "system remote commands via API should pass g_iMaxPacketSize": Made updates to bypass the `max_packet_size` check for replication commands between nodes. Additionally, the latest cluster error has been added to the status display.
- Issue #1302 "tmp files left on failed optimize": Corrected an issue where temporary files were left behind after an error occurred during a merge or optimize process.
- Issue #1304 "add env var for buddy start timeout": Added environment variable `MANTICORE_BUDDY_TIMEOUT` (default 3 seconds) to control the daemon's wait duration for a buddy message at startup.
- Issue #1305 "Int overflow when saving PQ meta": Mitigated excessive memory consumption by daemon on saving large PQ index.
- Issue #1306 "Can't recreate RT table after altering its external file": Rectified an error of alter with empty string for external files; fixed RT index external files left after altering external files.
- Issue #1307 "SELECT statement sum(value) as value doesn't work properly": Fixed issue where select list expression with alias could hide index attribute; also fixed sum to count in int64 for integer.
- Issue #1308 "Avoid binding to localhost in replication": Ensured replication doesn't bind to localhost for host names with multiple IPs.
- Issue #1309 "reply to mysql client failed for data larger 16Mb": Fixed the issue of returning a SphinxQL packet larger than 16Mb to the client.
- Issue #1310 "wrong reference in "paths to external files should be absolute": Corrected the display of the full path to external files in `SHOW CREATE TABLE`.
- Issue #1311 "debug build crashes on long strings in snippets": Now, long strings (>255 characters) are permitted in the text targeted by the `SNIPPET()` function.
- Issue #1312 "spurious crash on use-after-delete in kqueue polling (master-agent)": Fixed crashes when the master cannot connect to the agent on kqueue-driven systems (FreeBSD, MacOS, etc.).
- Issue #1313 "too long connect to itself": When connecting from the master to agents on MacOS/BSD, a unified connect+query timeout is now used instead of just connect.
- Issue #1314 "pq (json meta) with unreached embedded synonyms fails to load": Fixed the embedded synonyms flag in pq.
- Issue #1315 "Allow some functions (sint, fibonacci, second, minute, hour, day, month, year, yearmonth, yearmonthday) to use implicitly promoted argument values".
- Issue #1321 "Enable multithreaded SI in fullscan, but limit threads": Code has been implemented into CBO to better predict multithreaded performance of secondary indexes when they're utilized in a full-text query.
- Issue #1322 "count(*) queries still slow after using precalc sorters": Iterators are no longer initiated when employing sorters that use precalculated data, circumventing detrimental performance effects.
- Issue #1411 "query log in sphinxql does not preserve original queries for MVA's": Now, `all()/any()` is logged.
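The `/bulk` entry above says that empty lines are processed as a commit command. Here is a minimal sketch of how a client might assemble such a payload. It assumes the NDJSON `insert` action shape (`index`, `id`, `doc`) used by the JSON bulk interface. The table name `products`, the sample documents, and the helper `build_bulk_payload` are hypothetical and only serve the illustration.

```python
import json

def build_bulk_payload(batches, table="products"):
    """Serialize batches of (id, doc) pairs as NDJSON.

    A blank line is emitted between batches; per the changelog entry,
    /bulk treats an empty line as a commit command.
    """
    lines = []
    for i, batch in enumerate(batches):
        if i > 0:
            lines.append("")  # empty line: commit what was sent so far
        for doc_id, doc in batch:
            # Assumed action shape; "products" is a placeholder table name.
            action = {"insert": {"index": table, "id": doc_id, "doc": doc}}
            lines.append(json.dumps(action))
    return "\n".join(lines) + "\n"

payload = build_bulk_payload([
    [(1, {"title": "first"}), (2, {"title": "second"})],
    [(3, {"title": "third"})],
])
print(payload)
```

The payload would then be sent as the request body to the `/bulk` endpoint with whatever HTTP client the application already uses.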
Released: March 15 2023
- Improved integration with Logstash, Beats etc. including:
  - Support for Logstash versions 7.6 - 7.15, Filebeat versions 7.7 - 7.12
  - Auto-schema support.
  - Added handling of bulk requests in Elasticsearch-like format.
- Buddy commit ce90 Log Buddy version on Manticore start.
- Issue #588, Issue #942 Fixed bad character at the search meta and call keywords for bigram index.
- Issue #1027 Lowercase HTTP headers are rejected.
- ❗Issue #1039 Fixed memory leak at daemon on reading output of the Buddy console.
- Issue #1056 Fixed unexpected behavior of question mark.
- Issue #1064 - Fixed race condition in tokenizer lowercase tables causing a crash.
- Commit 59bb Fixed bulk writes processing in the JSON interface for documents with id explicitly set to null.
- Commit 7b6b Fixed term statistics in CALL KEYWORDS for multiple same terms.
- Commit f381 Default config is now created by Windows installer; paths are no longer substituted in runtime.
- Commit 6940, Commit cc5a Fixed replication issues for cluster with nodes in multiple networks.
- Commit 4972 Fixed `/pq` HTTP endpoint to be an alias of the `/json/pq` HTTP endpoint.
- Commit 3b53 Fixed daemon crash on Buddy restart.
- Buddy commit fba9 Display original error on invalid request received.
- Buddy commit db95 Allow spaces in backup path and add some magic to regexp to support single quotes also.
Released: Feb 10 2023
- Issue #1024 crash 2 Crash / Segmentation Fault on Facet search with larger number of results
- ❗Issue #1029 - WARNING: Compiled-in value KNOWN_CREATE_SIZE (16) is less than measured (208). Consider to fix the value!
- ❗Issue #1032 - Manticore 6.0.0 plain index crashes
- ❗Issue #1033 - multiple distributed lost on daemon restart
- ❗Issue #1064 - race condition in tokenizer lowercase tables
Released: Feb 7 2023
Starting with this release, Manticore Search comes with Manticore Buddy, a sidecar daemon written in PHP that handles high-level functionality that does not require super low latency or high throughput. Manticore Buddy operates behind the scenes, and you may not even realize it is running. Although it is invisible to the end user, it was a significant challenge to make Manticore Buddy easily installable and compatible with the main C++-based daemon. This major change will allow the team to develop a wide range of new high-level features, such as shards orchestration, access control and authentication, and various integrations like mysqldump, DBeaver, Grafana mysql connector. For now it already handles SHOW QUERIES, BACKUP and Auto schema.
This release also includes more than 130 bug fixes and numerous features, many of which can be considered major.
- 🔬 Experimental: you can now execute Elasticsearch-compatible insert and replace JSON queries which enables using Manticore with tools like Logstash (version < 7.13), Filebeat and other tools from the Beats family. Enabled by default. You can disable it using `SET GLOBAL ES_COMPAT=off`.
- Support for Manticore Columnar Library 2.0.0 with numerous fixes and improvements in Secondary indexes. ⚠️ BREAKING CHANGE: Secondary indexes are ON by default as of this release. Make sure you do ALTER TABLE table_name REBUILD SECONDARY if you are upgrading from Manticore 5. See below for more details.
- Commit c436 Auto-schema: you can now skip creating a table, just insert the first document and Manticore will create the table automatically based on its fields. Read more about this in detail here. You can turn it on/off using searchd.auto_schema.
- Vast revamp of cost-based optimizer which lowers query response time in many cases.
- Issue #1008 Parallelization performance estimate in CBO.
- Issue #1014 CBO is now aware of secondary indexes and can act smarter.
- Commit cef9 Encoding stats of columnar tables/fields are now stored in the meta data to help CBO make smarter decisions.
- Commit 2b95 Added CBO hints for fine-tuning its behaviour.
- Telemetry: we are excited to announce the addition of telemetry in this release. This feature allows us to collect anonymous and depersonalized metrics that will help us improve the performance and user experience of our product. Rest assured, all data collected is completely anonymous and will not be linked to any personal information. This feature can be easily turned off in the settings if desired.
- Commit 5aaf ALTER TABLE table_name REBUILD SECONDARY to rebuild secondary indexes whenever you want, for example:
  - when you migrate from Manticore 5 to the newer version,
  - when you did UPDATE (i.e. in-place update, not replace) of an attribute in the index
- Issue #821 New tool `manticore-backup` for backing up and restoring Manticore instance
- SQL command BACKUP to do backups from inside Manticore.
- SQL command SHOW QUERIES as an easy way to see running queries rather than threads.
- Issue #551 SQL command `KILL` to kill a long-running `SELECT`.
- Dynamic `max_matches` for aggregation queries to increase accuracy and lower response time.
-
Issue #822 SQL commands FREEZE/UNFREEZE to prepare a real-time/plain table for a backup.
-
Commit c470 New settings `accurate_aggregation` and `max_matches_increase_threshold` for controlled aggregation accuracy.
-
Issue #718 Support for signed negative 64-bit IDs. Note, you still can't use IDs > 2^63, but you can now use IDs in the range from -2^63 to 0.
-
As we recently added support for secondary indexes, things became confusing as "index" could refer to a secondary index, a full-text index, or a plain/real-time `index`. To reduce confusion, we are renaming the latter to "table". The following SQL/command line commands are affected by this change. Their old versions are deprecated, but still functional:
  - `index <table name>` => `table <table name>`,
  - `searchd -i / --index` => `searchd -t / --table`,
  - `SHOW INDEX STATUS` => `SHOW TABLE STATUS`,
  - `SHOW INDEX SETTINGS` => `SHOW TABLE SETTINGS`,
  - `FLUSH RTINDEX` => `FLUSH TABLE`,
  - `OPTIMIZE INDEX` => `OPTIMIZE TABLE`,
  - `ATTACH TABLE plain TO RTINDEX rt` => `ATTACH TABLE plain TO TABLE rt`,
  - `RELOAD INDEX` => `RELOAD TABLE`,
  - `RELOAD INDEXES` => `RELOAD TABLES`.

We are not planning to make the old forms obsolete, but to ensure compatibility with the documentation, we recommend changing the names in your application. What will be changed in a future release is the "index" to "table" rename in the output of various SQL and JSON commands.
-
Queries with stateful UDFs are now forced to be executed in a single thread.
-
Issue #1011 Refactoring of all related to time scheduling as a prerequisite for parallel chunks merging.
-
⚠️ BREAKING CHANGE: Columnar storage format has been changed. You need to rebuild those tables that have columnar attributes.
-
⚠️ BREAKING CHANGE: Secondary indexes file format has been changed, so if you are using secondary indexes for searching and have `searchd.secondary_indexes = 1` in your configuration file, be aware that the new Manticore version will skip loading the tables that have secondary indexes. It's recommended to:
  - Before you upgrade, change `searchd.secondary_indexes` to 0 in the configuration file.
  - Run the instance. Manticore will load up the tables with a warning.
  - Run `ALTER TABLE <table name> REBUILD SECONDARY` for each index to rebuild secondary indexes.

If you are running a replication cluster, you'll need to run `ALTER TABLE <table name> REBUILD SECONDARY` on all the nodes or follow this instruction with just one change: run the `ALTER .. REBUILD SECONDARY` instead of the `OPTIMIZE`.
-
⚠️ BREAKING CHANGE: The binlog version has been updated, so any binlogs from previous versions will not be replayed. It is important to ensure that Manticore Search is stopped cleanly during the upgrade process. This means that there should be no binlog files in `/var/lib/manticore/binlog/` except for `binlog.meta` after stopping the previous instance.
-
Issue #849 `SHOW SETTINGS`: you can now see the settings from the configuration file from inside Manticore.
-
Issue #1007 SET GLOBAL CPUSTATS=1/0 turns on/off cpu time tracking; SHOW THREADS now doesn't show CPU statistics when the cpu time tracking is off.
-
Issue #1009 RT table RAM chunk segments can now be merged while the RAM chunk is being flushed.
-
Issue #1012 Added secondary index progress to the output of indexer.
-
Issue #1013 Previously a table record could be removed by Manticore from the index list if it couldn't start serving it on start. The new behaviour is to keep it in the list to try to load it on the next start.
-
indextool --docextract returns all the words and hits belonging to requested document.
-
Commit 2b29 Environment variable `dump_corrupt_meta` enables dumping a corrupted table meta data to log in case searchd can't load the index.
-
Commit c7a3 `DEBUG META` can show `max_matches` and pseudo sharding statistics.
-
Commit 6bca A better error instead of the confusing "Index header format is not json, will try it as binary...".
-
Commit bef3 Ukrainian lemmatizer path has been changed.
-
Commit 4ae7 Secondary indexes statistics has been added to SHOW META.
-
Commit 2e7c JSON interface can now be easily visualized using Swagger Editor https://manual.manticoresearch.com/Openapi#OpenAPI-specification.
-
⚠️ BREAKING CHANGE: Replication protocol has been changed. If you are running a replication cluster, then when upgrading to Manticore 5 you need to:
  - stop all your nodes first cleanly
  - and then start the node which was stopped last with `--new-cluster` (run tool `manticore_new_cluster` in Linux).
  - read about restarting a cluster for more details.
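Two of the upgrade notes above lend themselves to a small pre-flight script: the binlog note (after a clean stop, nothing but `binlog.meta` should remain in the binlog directory) and the secondary-index note (run `ALTER TABLE <table name> REBUILD SECONDARY` for each table). The following is a minimal sketch under those stated assumptions. The function names, the sample file name `binlog.001`, and the table names are hypothetical, and the real binlog directory may differ from the `/var/lib/manticore/binlog/` path quoted in the note.

```python
import os
import tempfile

def leftover_binlogs(binlog_dir):
    """Return files in binlog_dir other than binlog.meta, sorted by name."""
    return sorted(
        name for name in os.listdir(binlog_dir)
        if name != "binlog.meta"
        and os.path.isfile(os.path.join(binlog_dir, name))
    )

def rebuild_statements(tables):
    """Build one ALTER TABLE ... REBUILD SECONDARY statement per table."""
    return [f"ALTER TABLE {t} REBUILD SECONDARY" for t in tables]

# Demo against a throwaway directory instead of the real binlog path.
with tempfile.TemporaryDirectory() as d:
    for name in ("binlog.meta", "binlog.001"):  # binlog.001 is a made-up leftover
        open(os.path.join(d, name), "w").close()
    unclean = leftover_binlogs(d)

print(unclean)
print(rebuild_statements(["products", "logs"]))
```

An empty list from `leftover_binlogs` suggests the previous instance stopped cleanly. The generated statements still have to be executed through a SQL client of your choice.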
- Refactoring of Secondary indexes integration with Columnar storage.
- Commit efe2 Manticore Columnar Library optimization which can lower response time by partial preliminary min/max evaluation.
- Commit 2757 If a disk chunk merge is interrupted, the daemon now cleans up the MCL-related tmp files.
- Commit e9c6 Columnar and secondary libraries versions are dumped to log on crash.
- Commit f5e8 Added support for quick doclist rewinding to secondary indexes.
- Commit 06df Queries like `select attr, count(*) from plain_index` (w/o filtering) are now faster in case you are using MCL.
- Commit 0a76 @@autocommit in HandleMysqlSelectSysvar for compatibility with .net connector for mysql greater than 8.25
- ⚠️ BREAKING CHANGE: MCL Issue #17 MCL: add SSE code to columnar scan. MCL now requires at least SSE4.2.
- Commit 4d19 ⚠️ BREAKING CHANGE: Support for Debian Stretch and Ubuntu Xenial has been discontinued.
- RHEL 9 support including Centos 9, Alma Linux 9 and Oracle Linux 9.
- Issue #924 Debian Bookworm support.
- Issue #636 Packaging: arm64 builds for Linuxes and MacOS.
- PR #26 Multi-architecture (x86_64 / arm64) docker image.
- Simplified package building for contributors.
- It's now possible to install a specific version using APT.
- Commit a6b8 Windows installer (previously we provided just an archive).
- Switched to compiling using CLang 15.
- ⚠️ BREAKING CHANGE: Custom Homebrew formulas including the formula for Manticore Columnar Library. To install Manticore, MCL and any other necessary components, use the following command `brew install manticoresoftware/manticore/manticoresearch manticoresoftware/manticore/manticore-extra`.
- Issue #479 Field with name `text`
- Issue #501 id can't be non bigint
- Issue #646 ALTER vs field with name "text"
- ❗Issue #652 Possible BUG: HTTP (JSON) offset and limit affects facet results
- ❗Issue #827 Searchd hangs/crashes under intensive loading
- ❗Issue #996 PQ index out of memory
- ❗Commit 1041 `binlog_flush = 1` has been broken all the time since Sphinx. Fixed.
- MCL Issue #14 MCL: crash on select when too many ft fields
- Issue #470 sql_joined_field can't be stored
- Issue #713 Crash when using LEVENSHTEIN()
- Issue #743 Manticore crashes unexpected and cant to normal restart
- Issue #788 CALL KEYWORDS through /sql returns control char which breaks json
- Issue #789 mariadb can't create table FEDERATED
- Issue #796 WARNING: dlopen() failed: /usr/bin/lib_manticore_columnar.so: cannot open shared object file: No such file or directory
- Issue #797 Manticore crashes when search with ZONESPAN is done through api
- Issue #799 wrong weight when using multiple indexes and facet distinct
- Issue #801 SphinxQL group query hangs after SQL index reprocessing
- Issue #802 MCL: Indexer crashes in 5.0.2 and manticore-columnar-lib 1.15.4
- Issue #813 Manticore 5.0.2 FEDERATED returns empty set (MySQL 8.0.28)
- Issue #824 select COUNT DISTINCT on 2 indices when result is zero throws internal error
- Issue #826 CRASH on delete query
- Issue #843 MCL: Bug with long text field
- Issue #856 5.0.2 rtindex: Aggregate search limit behavior is not as expected
- Issue #863 Hits returned is Nonetype object even for searches that should return multiple results
- Issue #870 Crash with using Attribute and Stored Field in SELECT expression
- Issue #872 table gets invisible after crash
- Issue #877 Two negative terms in search query gives error: query is non-computable
- Issue #878 a -b -c is not working via json query_string
- Issue #886 pseudo_sharding with query match
- Issue #893 Manticore 5.0.2 min/max function doesn't work as expecting ...
- Issue #896 Field "weight" is not parsed correctly
- Issue #897 Manticore service crash upon start and keep restarting
- Issue #900 group by j.a, smth works wrong
- Issue #913 Searchd crash when expr used in ranker, but only for queries with two proximities
- Issue #916 net_throttle_action is broken
- Issue #919 MCL: Manticore crashes on query execution and other crashed during cluster recovery.
- Issue #925 SHOW CREATE TABLE outputs w/o backticks
- Issue #930 It's now possible to query Manticore from Java via JDBC connector
- Issue #933 bm25f ranking problems
- Issue #934 configless indexes frozen in watchdog on the first-load state
- Issue #937 Segfault when sorting facet data
- Issue #940 crash on CONCAT(TO_STRING)
- Issue #947 In some cases a single simple select could cause the whole instance stall, so you couldn't log in to it or run any other query until the running select is done.
- Issue #948 Indexer crash
- Issue #950 wrong count from facet distinct
- Issue #953 LCS is calculating incorrectly in built-in sph04 ranker
- Issue #955 5.0.3 dev crashing
- Issue #963 FACET with json on engine columnar crash
- Issue #982 MCL: 5.0.3 crash from secondary index
- PR #984 @@autocommit in HandleMysqlSelectSysvar
- PR #985 Fix thread-chunk distribution in RT indexes
- Issue #985 Fix thread-chunk distribution in RT indexes
- Issue #986 wrong default max_query_time
- Issue #987 Crash on when using regex expression in multithreaded execution
- Issue #988 Broken backward index compatibility
- Issue #989 indextool reports error checking columnar attributes
- Issue #990 memleak of json grouper clones
- Issue #991 Memleak of levenshtein func cloning
- Issue #992 Error message lost when loading meta
- Issue #993 Propagate errors from dynamic indexes/subkeys and sysvars
- Issue #994 Crash on count distinct over a columnar string in columnar storage
- Issue #995 MCL: min(pickup_datetime) from taxi1 crashes
- Issue #997 empty excludes JSON query removes columns from select list
- Issue #998 Secondary tasks run on current scheduler sometimes cause abnormal side effects
- Issue #999 crash with facet distinct and different schemas
- Issue #1000 MCL: Columnar rt index became damaged after run without columnar library
- Issue #1001 implicit cutoff is not working in json
- Issue #1002 Columnar grouper issue
- Issue #1003 Unable to delete last field from the index
- Issue #1004 wrong behaviour after --new-cluster
- Issue #1005 "columnar library not loaded", but it's not required
- Issue #1006 no error for delete query
- Issue #1010 Fixed ICU data file location in Windows builds
- PR #1018 Handshake send problem
- Issue #1020 Display id in show create table
- Issue #1024 crash 1 Crash / Segmentation Fault on Facet search with larger number of results.
- Issue #1026 RT index: searchd "stuck" forever when many documents are being inserted and RAMchunk gets full
- Commit 4739 Thread gets stuck on shutdown while replication is busy between nodes
- Commit ab87 Mixing floats and ints in a JSON range filter could make Manticore ignore the filter
- Commit d001 Float filters in JSON were inaccurate
- Commit 4092 Discard uncommitted txns if index altered (or it can crash)
- Commit 9692 Query syntax error when using backslash
- Commit 0c19 workers_clients could be wrong in SHOW STATUS
- Commit 1772 fixed a crash on merging ram segments w/o docstores
- Commit f45b Fixed missed ALL/ANY condition for equals JSON filter
- Commit 3e83 Replication could fail with `got exception while reading ist stream: mkstemp(./gmb_pF6TJi) failed: 13 (Permission denied)` if the searchd was started from a directory it can't write to.
- Commit 92e5 Since 4.0.2 crash log included only offsets. This commit fixes that.
Released: May 30th 2022
- ❗Issue #791 - wrong stack size could cause a crash.
Released: May 18th 2022
-
🔬 Support for Manticore Columnar Library 1.15.2, which enables Secondary indexes beta version. Building secondary indexes is on by default for plain and real-time columnar and row-wise indexes (if Manticore Columnar Library is in use), but to enable it for searching you need to set `secondary_indexes = 1` either in your configuration file or using SET GLOBAL. The new functionality is supported in all operating systems except old Debian Stretch and Ubuntu Xenial.
-
Read-only mode: you can now specify listeners that process only read queries discarding any writes.
-
New /cli endpoint that makes running SQL queries over HTTP even easier.
-
Faster bulk INSERT/REPLACE/DELETE via JSON over HTTP: previously you could provide multiple write commands via HTTP JSON protocol, but they were processed one by one, now they are handled as a single transaction.
-
#720 Nested filters support in JSON protocol. Previously you couldn't code things like `a=1 and (b=2 or c=3)` in JSON: `must` (AND), `should` (OR) and `must_not` (NOT) worked only on the highest level. Now they can be nested.
-
Support for Chunked transfer encoding in the HTTP protocol. You can now use chunked transfer in your application to transmit large batches with reduced resource consumption (since calculating `Content-Length` is unnecessary). On the server side, Manticore now always processes incoming HTTP data in a streaming manner, without waiting for the entire batch to be transferred as before, which:
  - reduces peak RAM usage, lowering the risk of OOM
  - decreases response time (our tests indicated an 11% reduction for processing a 100MB batch)
  - allows you to bypass max_packet_size and transfer batches much larger than the maximum allowed value of `max_packet_size` (128MB), for example, 1GB at a time.
-
#719 HTTP interface support of `100 Continue`: now you can transfer large batches from `curl` (including curl libraries used by various programming languages) which by default does `Expect: 100-continue` and waits some time before actually sending the batch. Previously you had to add `Expect:` header, now it's not needed.
-
⚠️ BREAKING CHANGE: Pseudo sharding is enabled by default. If you want to disable it make sure you add `pseudo_sharding = 0` to section `searchd` of your Manticore configuration file.
-
Having at least one full-text field in a real-time/plain index is not mandatory anymore. You can now use Manticore even in cases not having anything to do with full-text search.
-
Fast fetching for attributes backed by Manticore Columnar Library: queries like `select * from <columnar table>` are now much faster than previously, especially if there are many fields in the schema.
-
⚠️ BREAKING CHANGE: Implicit cutoff. Manticore now doesn't spend time and resources processing data you don't need in the result set which will be returned. The downside is that it affects `total_found` in SHOW META and hits.total in JSON output. It is now only accurate in case you see `total_relation: eq` while `total_relation: gte` means the actual number of matching documents is greater than the `total_found` value you've got. To retain the previous behaviour you can use search option `cutoff=0`, which makes `total_relation` always `eq`.
-
⚠️ BREAKING CHANGE: All full-text fields are now stored by default. You need to use `stored_fields = ` (empty value) to make all fields non-stored (i.e. revert to the previous behaviour).
-
#715 HTTP JSON supports search options.
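The nested-filter entry above gives `a=1 and (b=2 or c=3)` as its example. A minimal sketch of the corresponding JSON request body follows. It assumes the `bool` query with `must`/`should` arrays and the `equals` filter of the JSON search interface. The table name `test`, the attribute names, and the helper `equals` are placeholders for illustration.

```python
import json

def equals(attr, value):
    """Build an equality filter clause for one attribute."""
    return {"equals": {attr: value}}

# a=1 AND (b=2 OR c=3): the OR group is a bool/should nested inside must.
query = {
    "index": "test",  # placeholder table name
    "query": {
        "bool": {
            "must": [
                equals("a", 1),
                {"bool": {"should": [equals("b", 2), equals("c", 3)]}},
            ]
        }
    },
}

body = json.dumps(query, indent=2)
print(body)
```

Before this release, only the outer `must`/`should`/`must_not` level was honoured, so the inner `bool` object is the part that the changelog entry makes possible.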
- ⚠️ BREAKING CHANGE: Index meta file format change. Previously meta files (`.meta`, `.sph`) were in binary format, now it's just json. The new Manticore version will convert older indexes automatically, but:
  - you can get warning like `WARNING: ... syntax error, unexpected TOK_IDENT`
  - you won't be able to run the index with previous Manticore versions, make sure you have a backup
- ⚠️ BREAKING CHANGE: Session state support with help of HTTP keep-alive. This makes HTTP stateful when the client supports it too. For example, using the new /cli endpoint and HTTP keep-alive (which is on by default in all browsers) you can call `SHOW META` after `SELECT` and it will work the same way it works via mysql. Note, previously `Connection: keep-alive` HTTP header was supported too, but it only caused reusing the same connection. Since this version it also makes the session stateful.
- You can now specify `columnar_attrs = *` to define all your attributes as columnar in the plain mode which is useful in case the list is long.
- Faster replication SST
- ⚠️ BREAKING CHANGE: Replication protocol has been changed. If you are running a replication cluster, then when upgrading to Manticore 5 you need to:
  - stop all your nodes cleanly first
  - and then start the node which was stopped last with `--new-cluster` (run the tool `manticore_new_cluster` in Linux).
  - read about restarting a cluster for more details.
- Replication improvements:
- Faster SST
- Noise resistance which can help in case of unstable network between replication nodes
- Improved logging
- Security improvement: Manticore now listens on `127.0.0.1` instead of `0.0.0.0` when no `listen` is specified in the config at all. Although the default configuration shipped with Manticore Search specifies the `listen` setting and a configuration with no `listen` at all is not typical, it's still possible. Previously Manticore would listen on `0.0.0.0`, which is not secure; now it listens on `127.0.0.1`, which is usually not exposed to the Internet.
- Faster aggregation over columnar attributes.
- Increased `AVG()` accuracy: previously Manticore used `float` internally for aggregations, now it uses `double`, which increases the accuracy significantly.
- Improved support for JDBC MySQL driver.
- `DEBUG malloc_stats` support for jemalloc.
- optimize_cutoff is now available as a per-table setting which can be set when you CREATE or ALTER a table.
- ⚠️ BREAKING CHANGE: query_log_format is now `sphinxql` by default. If you are used to the `plain` format you need to add `query_log_format = plain` to your configuration file.
- Significant memory consumption improvements: Manticore now consumes significantly less RAM under long and intensive insert/replace/optimize workloads when stored fields are used.
- shutdown_timeout default value was increased from 3 seconds to 60 seconds.
- Commit ffd0 Support for Java mysql connector >= 6.0.3: in Java mysql connection 6.0.3 they changed the way they connect to mysql which broke compatibility with Manticore. The new behaviour is now supported.
- Commit 1da6 disabled saving a new disk chunk on loading an index (e.g. on searchd startup).
- Issue #746 Support for glibc >= 2.34.
- Issue #784 Count 'VIP' connections separately from usual (non-VIP) ones. Previously VIP connections were counted towards the `max_connections` limit, which could cause a "maxed out" error for non-VIP connections. Now VIP connections are not counted towards the limit. The current number of VIP connections can also be seen in `SHOW STATUS` and `status`.
- ID can now be specified explicitly.
- Issue #687 support zstd compression for mysql proto
- ⚠️ BM25F formula has been slightly updated to improve search relevance. This only affects search results in case you use function BM25F(), it doesn't change behaviour of the default ranking formula.
- ⚠️ Changed behaviour of REST /sql endpoint: `/sql?mode=raw` now requires escaping and returns an array.
- ⚠️ Format change of the response of `/bulk` INSERT/REPLACE/DELETE requests:
  - previously each sub-query constituted a separate transaction and resulted in a separate response
  - now the whole batch is considered a single transaction, which returns a single response
- ⚠️ Search options `low_priority` and `boolean_simplify` now require a value (`0/1`): previously you could do `SELECT ... OPTION low_priority, boolean_simplify`, now you need to do `SELECT ... OPTION low_priority=1, boolean_simplify=1`.
- ⚠️ If you are using old php, python or java clients please follow the corresponding link and find an updated version. The old versions are not fully compatible with Manticore 5.
- ⚠️ HTTP JSON requests are now logged in a different format in mode `query_log_format=sphinxql`. Previously only the full-text part was logged, now the request is logged as is.
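The `total_found` / `total_relation` semantics introduced by the implicit cutoff change above can be handled on the client side roughly as follows. This is a minimal Python sketch, not part of any Manticore client: the helper name `describe_total` is hypothetical, and it assumes you have already read the two values from SHOW META or from the JSON response.

```python
def describe_total(total_found: int, total_relation: str) -> str:
    """Turn total_found + total_relation into a human-readable statement.

    Hypothetical helper: "eq" means the count is exact, "gte" means the
    real number of matches is greater than the reported value.
    """
    if total_relation == "eq":
        return f"exactly {total_found} matching documents"
    if total_relation == "gte":
        return f"more than {total_found} matching documents"
    raise ValueError(f"unexpected total_relation: {total_relation!r}")


if __name__ == "__main__":
    # Example values only; a real client would take them from the response.
    print(describe_total(1000, "gte"))
    print(describe_total(42, "eq"))
```

If exact totals matter more than speed, the changelog's own remedy is the search option `cutoff=0`, after which only the `eq` branch should be reached.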
- ⚠️ BREAKING CHANGE: because of the new structure, when you upgrade to Manticore 5 it's recommended to remove the old packages before you install the new ones:
  - RPM-based: `yum remove manticore*`
  - Debian and Ubuntu: `apt remove manticore*`
- New deb/rpm packages structure. Previous versions provided:
  - `manticore-server` with `searchd` (main search daemon) and all needed for it
  - `manticore-tools` with `indexer` and `indextool`
  - `manticore` including everything
  - `manticore-all` RPM as a meta package referring to `manticore-server` and `manticore-tools`

  The new structure is:
  - `manticore` - deb/rpm meta package which installs all the above as dependencies
  - `manticore-server-core` - `searchd` and everything to run it alone
  - `manticore-server` - systemd files and other supplementary scripts
  - `manticore-tools` - `indexer`, `indextool` and other tools
  - `manticore-common` - default configuration file, default data directory, default stopwords
  - `manticore-icudata`, `manticore-dev`, `manticore-converter` didn't change much
  - `.tgz` bundle which includes all the packages
- Support for Ubuntu Jammy
- Support for Amazon Linux 2 via YUM repo
- Issue #815 Random crash when using UDF function
- Issue #287 out of memory while indexing RT index
- Issue #604 Breaking change 3.6.0, 4.2.0 sphinxql-parser
- Issue #667 FATAL: out of memory (unable to allocate 9007199254740992 bytes)
- Issue #676 Strings not passed correctly to UDFs
- ❗Issue #698 Searchd crashes after trying to add a text column to a rt index
- Issue #705 Indexer couldn't find all columns
- ❗Issue #709 Grouping by json.boolean works wrong
- Issue #716 indextool commands related to index (eg. --dumpdict) failure
- ❗Issue #724 Fields disappear from the selection
- Issue #727 .NET HttpClient Content-Type incompatibility when using `application/x-ndjson`
- Issue #729 Field length calculation
- ❗Issue #730 create/insert into/drop columnar table has a memleak
- Issue #731 Empty column in results under certain conditions
- ❗Issue #749 Crash of daemon on start
- ❗Issue #750 Daemon hangs on start
- ❗Issue #751 Crash at SST
- Issue #752 Json attribute marked as columnar when engine='columnar'
- Issue #753 Replication listens on 0
- Issue #754 columnar_attrs = * is not working with csvpipe
- ❗Issue #755 Crash on select float in columnar in rt
- ❗Issue #756 Indextool changes rt index during check
- Issue #757 Need a check for listeners port range intersections
- Issue #758 Log original error in case RT index failed to save disk chunk
- Issue #759 Only one error reported for RE2 config
- ❗Issue #760 RAM consumption changes in commit 5463778558586d2508697fa82e71d657ac36510f
- Issue #761 3rd node doesn't make a non-primary cluster after dirty restart
- Issue #762 Update counter gets increased by 2
- Issue #763 New version 4.2.1 corrupt index created with 4.2.0 with morphology using
- Issue #764 No escaping in json keys /sql?mode=raw
- ❗Issue #765 Using function hides other values
- ❗Issue #766 Memleak triggered by a line in FixupAttrForNetwork
- ❗Issue #767 Memleak in 4.2.0 and 4.2.1 related with docstore cache
- Issue #768 Strange ping-pong with stored fields over network
- Issue #769 lemmatizer_base reset to empty if not mentioned in 'common' section
- Issue #770 pseudo_sharding makes SELECT by id slower
- Issue #771 DEBUG malloc_stats output zeros when using jemalloc
- Issue #772 Drop/add column makes value invisible
- Issue #773 Can't add column bit(N) to columnar table
- Issue #774 "cluster" gets empty on start in manticore.json
- ❗Commit 1da4 HTTP actions are not tracked in SHOW STATUS
- Commit 3810 disable pseudo_sharding for low frequency single keyword queries
- Commit 8003 fixed stored attributes vs index merge
- Commit cddf generalized distinct value fetchers; added specialized distinct fetchers for columnar strings
- Commit fba4 fixed fetching null integer attributes from docstore
- Commit f300 `ranker` could be specified twice in query log
- Pseudo-sharding support for real-time indexes and full-text queries. In the previous release we added limited pseudo sharding support. Starting from this version you can get all the benefits of pseudo sharding and your multi-core processor by just enabling searchd.pseudo_sharding. The coolest thing is that you don't need to do anything with your indexes or queries for that: just enable it, and if you have free CPU it will be used to lower your response time. It supports plain and real-time indexes for full-text, filtering and analytical queries. For example, enabling pseudo sharding can make most queries' response time about 10x lower on average on the Hacker News curated comments dataset multiplied 100 times (116 million docs in a plain index).
- Debian Bullseye is now supported.
- PQ transactions are now atomic and isolated. Previously PQ transactions support was limited. It enables much faster REPLACE into PQ, especially when you need to replace a lot of rules at once. Performance details:
- 4.0.2
- 4.2.0
It takes 48 seconds to insert 1M PQ rules and 406 seconds to REPLACE just 40K in 10K batches.
root@perf3 ~ # mysql -P9306 -h0 -e "drop table if exists pq; create table pq (f text, f2 text, j json, s string) type='percolate';"; date; for m in `seq 1 1000`; do (echo -n "insert into pq (id,query,filters,tags) values "; for n in `seq 1 1000`; do echo -n "(0,'@f (cat | ( angry dog ) | (cute mouse)) @f2 def', 'j.json.language=\"en\"', '{\"tag1\":\"tag1\",\"tag2\":\"tag2\"}')"; [ $n != 1000 ] && echo -n ","; done; echo ";")|mysql -P9306 -h0; done; date; mysql -P9306 -h0 -e "select count(*) from pq"
Wed Dec 22 10:24:30 AM CET 2021
Wed Dec 22 10:25:18 AM CET 2021
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
root@perf3 ~ # date; (echo "begin;"; for offset in `seq 0 10000 30000`; do n=0; echo "replace into pq (id,query,filters,tags) values "; for id in `mysql -P9306 -h0 -NB -e "select id from pq limit $offset, 10000 option max_matches=1000000"`; do echo "($id,'@f (tiger | ( angry bear ) | (cute panda)) @f2 def', 'j.json.language=\"de\"', '{\"tag1\":\"tag1\",\"tag2\":\"tag2\"}')"; n=$((n+1)); [ $n != 10000 ] && echo -n ","; done; echo ";"; done; echo "commit;") > /tmp/replace.sql; date
Wed Dec 22 10:26:23 AM CET 2021
Wed Dec 22 10:26:27 AM CET 2021
root@perf3 ~ # time mysql -P9306 -h0 < /tmp/replace.sql
real 6m46.195s
user 0m0.035s
sys 0m0.008s
- optimize_cutoff is now available as a configuration option in section `searchd`. It's useful when you want to limit the RT chunks count in all your indexes to a particular number globally.
- Commit 0087 Accurate count(distinct ...) and FACET ... distinct over several local physical indexes (real-time/plain) with identical fields set/order.
- PR #598 bigint support for `YEAR()` and other timestamp functions.
- Commit 8e85 Adaptive rt_mem_limit. Previously Manticore Search collected exactly up to `rt_mem_limit` of data before saving a new disk chunk to disk, and while saving it still collected up to 10% more (aka double-buffer) to minimize possible insert suspension. If that limit was also exhausted, adding new documents was blocked until the disk chunk was fully saved to disk. The new adaptive limit builds on the fact that we have auto-optimize now, so it's not a big deal if disk chunks do not fully respect `rt_mem_limit` and a disk chunk starts flushing earlier. So now we collect up to 50% of `rt_mem_limit` and save that as a disk chunk. Upon saving we look at the statistics (how much we've saved, how many new documents arrived while saving) and recalculate the initial rate which will be used next time. For example, if we saved 90 million documents and another 10 million docs arrived while saving, the rate is 90%, so we know that next time we can collect up to 90% of `rt_mem_limit` before starting to flush another disk chunk. The rate value is calculated automatically from 33.3% to 95%.
- Issue #628 unpack_zlib for PostgreSQL source. Thank you, Dmitry Voronin, for the contribution.
- Commit 6d54 `indexer -v` and `--version`. Previously you could still see indexer's version, but `-v`/`--version` were not supported.
- Issue #662 Infinite mlock limit by default when Manticore is started via systemd.
- Commit 63c8 spinlock -> op queue for coro rwlock.
- Commit 4113 Environment variable `MANTICORE_TRACK_RT_ERRORS` useful for debugging RT segments corruption.
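The adaptive `rt_mem_limit` arithmetic described above (Commit 8e85) can be illustrated with a small Python model. It is only a sketch under stated assumptions, not Manticore's actual code: it assumes the rate is `saved / (saved + arrived_while_saving)`, which reproduces the 90% figure from the changelog's example, and all function and constant names are hypothetical.

```python
# Illustrative model of the adaptive rt_mem_limit rate.
# Assumption: rate = saved / (saved + arrived_while_saving), clamped to the
# documented 33.3%..95% range. This is not taken from Manticore's source.

MIN_RATE = 0.333      # lower bound mentioned in the changelog (33.3%)
MAX_RATE = 0.95       # upper bound mentioned in the changelog (95%)
INITIAL_RATE = 0.5    # the first flush starts at 50% of rt_mem_limit


def next_flush_rate(saved_docs: int, arrived_while_saving: int) -> float:
    """Return the fraction of rt_mem_limit to collect before the next flush."""
    total = saved_docs + arrived_while_saving
    if total == 0:
        # Nothing observed yet: fall back to the initial rate.
        return INITIAL_RATE
    rate = saved_docs / total
    return max(MIN_RATE, min(MAX_RATE, rate))


def flush_threshold_bytes(rt_mem_limit: int, rate: float) -> int:
    """Amount of RAM data to collect before starting to flush a disk chunk."""
    return int(rt_mem_limit * rate)


if __name__ == "__main__":
    # The changelog's example: 90M docs saved, 10M arrived while saving.
    rate = next_flush_rate(90_000_000, 10_000_000)
    print(rate)
    print(flush_threshold_bytes(128 * 1024 * 1024, rate))
```

The clamp is what keeps a very slow save (many documents arriving meanwhile) from pushing the threshold below a third of `rt_mem_limit`, and a very fast one from pushing it above 95%.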
- Binlog version was increased, so binlog from the previous version won't be replayed. Make sure you stop Manticore Search cleanly during upgrade: no binlog files should be in `/var/lib/manticore/binlog/` except `binlog.meta` after stopping the previous instance.
- Commit 3f65 New column "chain" in `show threads option format=all`. It shows a stack of some task info tickets, most useful for profiling needs, so if you are parsing `show threads` output be aware of the new column.
- `searchd.workers` was obsoleted since 3.5.0, now it's deprecated: if you still have it in your configuration file it will trigger a warning on start. Manticore Search will start, but with a warning.
- If you use PHP and PDO to access Manticore you need to do `PDO::ATTR_EMULATE_PREPARES`
- ❗Issue #650 Manticore 4.0.2 slower than Manticore 3.6.3. 4.0.2 was faster than previous versions in terms of bulk inserts, but significantly slower for single document inserts. It's been fixed in 4.2.0.
- ❗Commit 22f4 RT index could get corrupted under intensive REPLACE load, or it could crash
- Commit 03be fixed average at merging groupers and group N sorter; fixed merge of aggregates
- Commit 2ea5 `indextool --check` could crash
- Commit 7ec7 RAM exhaustion issue caused by UPDATEs
- Commit 658a daemon could hang on INSERT
- Commit 46e4 daemon could hang on shutdown
- Commit f8d7 daemon could crash on shutdown
- Commit 733a daemon could hang on crash
- Commit f7f8 daemon could crash on startup trying to rejoin cluster with invalid nodes list
- Commit 1401 distributed index could get completely forgotten in RT mode in case it couldn't resolve one of its agents on start
- Issue #683 attr bit(N) engine='columnar' fails
- Issue #682 create table fails, but leaves dir
- Issue #663 Config fails with: unknown key name 'attr_update_reserve'
- Issue #632 Manticore crash on batch queries
- Issue #679 Batch queries causing crashes again with v4.0.3
- Commit f7f8 fixed daemon crash on startup trying to re-join cluster with invalid nodes list
- Issue #643 Manticore 4.0.2 does not accept connections after batch of inserts
- Issue #635 FACET query with ORDER BY JSON.field or string attribute could crash
- Issue #634 Crash SIGSEGV on query with packedfactors
- Commit 4165 morphology_skip_fields was not supported by create table
- Full support of Manticore Columnar Library. Previously Manticore Columnar Library was supported only for plain indexes. Now it's supported:
  - in real-time indexes for `INSERT`, `REPLACE`, `DELETE`, `OPTIMIZE`
  - in replication
  - in `ALTER`
  - in `indextool --check`
-
Automatic indexes compaction (Issue #478). Finally, you don't have to call OPTIMIZE manually or via a crontask or other kind of automation. Manticore now does it for you automatically and by default. You can set default compaction threshold via optimize_cutoff global variable.
-
Chunk snapshots and locks system revamp. These changes may be invisible from outside at first glance, but they improve the behaviour of many things happening in real-time indexes significantly. In a nutshell, previously most Manticore data manipulation operations relied on locks heavily, now we use disk chunk snapshots instead.
- Significantly faster bulk INSERT performance into a real-time index. For example, on Hetzner's server AX101 with SSD, 128 GB of RAM and AMD's Ryzen™ 9 5950X (16*2 cores), with 3.6.0 you could get 236K docs per second inserted into a table with schema `name text, email string, description text, age int, active bit(1)` (default `rt_mem_limit`, batch size 25000, 16 concurrent insert workers, 16 million docs inserted overall). In 4.0.2 the same concurrency/batch/count gives 357K docs per second.
- ALTER can add/remove a full-text field (in RT mode). Previously it could only add/remove an attribute.
- 🔬 Experimental: pseudo-sharding for full-scan queries - allows parallelizing any non-full-text search query. Instead of preparing shards manually you can now just enable the new option searchd.pseudo_sharding and expect up to `CPU cores` times lower response time for non-full-text search queries. Note that it can easily occupy all existing CPU cores, so if you care not only about latency but about throughput too, use it with caution.
- Linux Mint and Ubuntu Hirsute Hippo are supported via APT repository
- faster update by id via HTTP in big indexes in some cases (depends on the ids distribution)
- 671e65a2 - added caching to lemmatizer-uk
- 3.6.0
- 4.0.2
time curl -X POST -d '{"update":{"index":"idx","id":4611686018427387905,"doc":{"mode":0}}}' -H "Content-Type: application/x-ndjson" http://127.0.0.1:6358/json/bulk
real 0m43.783s
user 0m0.008s
sys 0m0.007s
- custom startup flags for systemd. Now you don't need to start searchd manually in case you need to run Manticore with some specific startup flag
- new function LEVENSHTEIN() which calculates Levenshtein distance
- added new searchd startup flags `--replay-flags=ignore-trx-errors` and `--replay-flags=ignore-all-errors` so one can still start searchd if the binlog is corrupted
- Issue #621 - expose errors from RE2
- more accurate COUNT(DISTINCT) for distributed indexes consisting of local plain indexes
- FACET DISTINCT to remove duplicates when you do faceted search
- exact form modifier doesn't require morphology now and works for indexes with infix/prefix search enabled
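For readers unfamiliar with the metric behind the new `LEVENSHTEIN()` function listed above: the Levenshtein distance is the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. The Python sketch below shows the classic dynamic-programming computation for reference only; it is not Manticore's implementation, and the function name `levenshtein` is illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic two-row dynamic-programming edit distance."""
    # Keep the shorter string in `b` so the rows stay as small as possible.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the processed prefix of `a` and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```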
- the new version can read older indexes, but the older versions can't read Manticore 4's indexes
- removed implicit sorting by id. Sort explicitly if required
- `charset_table`'s default value changes from `0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+401->U+451, U+451` to `non_cjk`
- `OPTIMIZE` happens automatically. If you don't need it, make sure to set `auto_optimize=0` in section `searchd` in the configuration file
- Issue #616 `ondisk_attrs_default` were deprecated, now they are removed
- for contributors: we now use Clang compiler for Linux builds as according to our tests it can build a faster Manticore Search and Manticore Columnar Library
- if max_matches is not specified in a search query it gets updated implicitly with the lowest needed value for the sake of performance of the new columnar storage. It can affect metric `total` in SHOW META, but not `total_found` which is the actual number of found documents.
- make sure you stop Manticore 3 cleanly:
  - no binlog files should be in `/var/lib/manticore/binlog/` (only `binlog.meta` should be in the directory)
  - otherwise the indexes Manticore 4 can't replay binlogs for won't be run
- the new version can read older indexes, but the older versions can't read Manticore 4's indexes, so make sure you make a backup if you want to be able to roll back the new version easily
- if you run a replication cluster make sure you:
  - stop all your nodes cleanly first
  - and then start the node which was stopped last with `--new-cluster` (run the tool `manticore_new_cluster` in Linux).
  - read about restarting a cluster for more details
- Lots of replication issues have been fixed:
- Commit 696f - fixed crash during SST on joiner with active index; added sha1 verify at joiner node at writing file chunks to speed up index loading; added rotation of changed index files at joiner node on index load; added removal of index files at joiner node when active index gets replaced by a new index from donor node; added replication log points at donor node for sending files and chunks
- Commit b296 - crash on JOIN CLUSTER in case the address is incorrect
- Commit 418b - during initial replication of a large index the joining node could fail with `ERROR 1064 (42000): invalid GTID, (null)`, and the donor could become unresponsive while another node was joining
- Commit 6fd3 - hash could be calculated wrong for a big index which could result in replication failure
- Issue #615 - replication failed on cluster restart
- Issue #574 - `indextool --help` doesn't display parameter `--rotate`
- Issue #578 - searchd high CPU usage while idle after ca. a day
- Issue #587 - flush .meta immediately
- Issue #617 - manticore.json gets emptied
- Issue #618 - searchd --stopwait fails under root. It also fixes systemctl behaviour (previously it was showing failure for ExecStop and didn't wait long enough for searchd to stop properly)
- Issue #619 - INSERT/REPLACE/DELETE vs SHOW STATUS. `command_insert`, `command_replace` and others were showing wrong metrics
- Issue #620 - `charset_table` for a plain index had a wrong default value
- Commit 8f75 - new disk chunks don't get mlocked
- Issue #607 - Manticore cluster node crashes when unable to resolve a node by name
- Issue #623 - replication of updated index can lead to undefined state
- Commit ca03 - indexer could hang on indexing a plain index source with a json attribute
- Commit 53c7 - fixed not equal expression filter at PQ index
- Commit ccf9 - fixed select windows at list queries above 1000 matches. `SELECT * FROM pq ORDER BY id desc LIMIT 1000 , 100 OPTION max_matches=1100` was not working previously
- Commit a048 - HTTPS request to Manticore could cause warning like "max packet size(8388608) exceeded"
- Issue #648 - Manticore 3 could hang after a few updates of string attributes
Maintenance release before Manticore 4
- Support for Manticore Columnar Library for plain indexes. New setting columnar_attrs for plain indexes
- Support for Ukrainian Lemmatizer
- Fully revised histograms. When building an index Manticore also builds histograms for each field in it, which it then uses for faster filtering. In 3.6.0 the algorithm was fully revised and you can get a higher performance if you have a lot of data and do a lot of filtering.
- tool `manticore_new_cluster [--force]` useful for restarting a replication cluster via systemd
- `--drop-src` for `indexer --merge`
- new mode `blend_mode='trim_all'`
- added support for escaping JSON path with backticks
- indextool --check can work in RT mode
- FORCE/IGNORE INDEX(id) for SELECT/UPDATE
- chunk id for a merged disk chunk is now unique
- indextool --check-disk-chunk CHUNK_NAME
- faster JSON parsing, our tests show 3-4% lower latency on queries like `WHERE json.a = 1`
- non-documented command `DEBUG SPLIT` as a prerequisite for automatic sharding/rebalancing
- Issue #584 - inaccurate and unstable FACET results
- Issue #506 - Strange behavior when using MATCH: those who suffer from this issue need to rebuild the index as the problem was on the phase of building an index
- Issue #387 - intermittent core dump when running query with SNIPPET() function
- Stack optimizations useful for processing complex queries:
- Issue #469 - SELECT results in CRASH DUMP
- e8420cc7 - stack size detection for filter trees
- Issue #461 - Update using the IN condition does not take effect correctly
- Issue #464 - SHOW STATUS immediately after CALL PQ returns
- Issue #481 - Fixed static binary build
- Issue #502 - bug in multi-queries
- Issue #514 - Unable to use unusual names for columns when use 'create table'
- Commit d1db - daemon crash on replay binlog with update of string attribute; set binlog version to 10
- Commit 775d - fixed expression stack frame detection runtime (test 207)
- Commit 4795 - percolate index filter and tags were empty for empty stored query (test 369)
- Commit c3f0 - breaks of replication SST flow at network with long latency and high error rate (different data centers replication); updated replication command version to 1.03
- Commit ba2d - joiner lock cluster on write operations after join into cluster (test 385)
- Commit de4d - wildcards matching with exact modifier (test 321)
- Commit 6524 - docid checkpoints vs docstore
- Commit f4ab - Inconsistent indexer behavior when parsing invalid xml
- Commit 7b72 - Stored percolate query with NOTNEAR runs forever (test 349)
- Commit 812d - wrong weight for phrase starting with wildcard
- Commit 1771 - percolate query with wildcards generate terms without payload on matching causes interleaved hits and breaks matching (test 417)
- Commit aa0d - fixed calculation of 'total' in case of parallelized query
- Commit 18d8 - crash in Windows with multiple concurrent sessions at daemon
- Commit 8443 - some index settings could not be replicated
- Commit 9341 - On high rate of adding new events netloop sometimes freezes because of atomic 'kick' event being processed once for several events at a time and losing actual actions from them status of the query, not the server status
- Commit d805 - New flushed disk chunk might be lost on commit
- Commit 63cb - inaccurate 'net_read' in profiler
- Commit f537 - Percolate issue with arabic (right to left texts)
- Commit 49ee - id not picked correctly on duplicate column name
- Commit refa of network events to fix a crash in rare cases
- e8420cc7 fix in `indextool --dumpheader`
- Commit ff71 - TRUNCATE WITH RECONFIGURE worked wrong with stored fields
- New binlog format: you need to make a clean stop of Manticore before upgrading
- Index format slightly changes: the new version can read your existing indexes fine, but if you decide to downgrade from 3.6.0 to an older version the newer indexes will be unreadable
- Replication format change: don't replicate from an older version to 3.6.0 and vice versa, switch to the new version on all your nodes at once
- `reverse_scan` is deprecated. Make sure you don't use this option in your queries as of 3.6.0, since they will fail otherwise
- As of this release we don't provide builds for RHEL6, Debian Jessie and Ubuntu Trusty any more. If it's mission critical for you to have them supported, contact us
- No more implicit sorting by id. If you rely on it make sure to update your queries accordingly
- Search option `reverse_scan` has been deprecated
- New Python, Javascript and Java clients are generally available now and are well documented in this manual.
- automatic drop of a disk chunk of a real-time index. This optimization enables dropping a disk chunk automatically when OPTIMIZing a real-time index when the chunk is obviously not needed any more (all the documents are suppressed). Previously it still required merging, now the chunk can be just dropped instantly. The cutoff option is ignored, i.e. even if nothing is actually merged an obsoleted disk chunk gets removed. This is useful in case you maintain retention in your index and delete older documents. Now compacting such indexes will be faster.
- standalone NOT as an option for SELECT
- Issue #453 New option indexer.ignore_non_plain=1 is useful in case you run `indexer --all` and have not only plain indexes in the configuration file. Without `ignore_non_plain=1` you'll get a warning and a respective exit code.
- SHOW PLAN ... OPTION format=dot and EXPLAIN QUERY ... OPTION format=dot enable visualization of full-text query plan execution. Useful for understanding complex queries.
- `indexer --verbose` is deprecated as it never added anything to the indexer output
- For dumping watchdog's backtrace signal `USR2` is now to be used instead of `USR1`
- Issue #423 cyrillic char period call snippets retain mode don't highlight
- Issue #435 RTINDEX - GROUP N BY expression select = fatal crash
- Commit 2b3b searchd status shows Segmentation fault when in cluster
- Commit 9dd2 'SHOW INDEX index.N SETTINGS' doesn't address chunks >9
- Issue #389 Bug that crashes Manticore
- Commit fba1 Converter creates broken indexes
- Commit eecd stopword_step=0 vs CALL SNIPPETS()
- Commit ea68 count distinct returns 0 at low max_matches on a local index
- Commit 362f When using aggregation stored texts are not returned in hits
- OPTIMIZE reduces disk chunks to a number of chunks (default is `2* No. of cores`) instead of a single one. The optimal number of chunks can be controlled by cutoff option.
- NOT operator can now be used standalone. By default it is disabled since accidental single NOT queries can be slow. It can be enabled by setting new searchd directive not_terms_only_allowed to `0`.
- New setting max_threads_per_query sets how many threads a query can use. If the directive is not set, a query can use threads up to the value of threads. Per `SELECT` query the number of threads can be limited with OPTION threads=N overriding the global `max_threads_per_query`.
- Percolate indexes can now be imported with IMPORT TABLE.
- HTTP API `/search` receives basic support for faceting/grouping by new query node `aggs`.
- If no replication listen directive is declared, the engine will try to use ports after the defined 'sphinx' port, up to 200.
- `listen=...:sphinx` needs to be explicitly set for SphinxSE connections or SphinxAPI clients.
- SHOW INDEX STATUS outputs new metrics: `killed_documents`, `killed_rate`, `disk_mapped_doclists`, `disk_mapped_cached_doclists`, `disk_mapped_hitlists` and `disk_mapped_cached_hitlists`.
- SQL command `status` now outputs `Queue\Threads` and `Tasks\Threads`.
- `dist_threads` is completely deprecated now, searchd will log a warning if the directive is still used.
The official Docker image is now based on Ubuntu 20.04 LTS
Besides the usual `manticore` package, you can also install Manticore Search by components:
- `manticore-server-core` - provides `searchd`, manpage, log dir, API and galera module. It will also install `manticore-common` as the dependency.
- `manticore-server` - provides automation scripts for core (init.d, systemd), and `manticore_new_cluster` wrapper. It will also install `manticore-server-core` as the dependency.
- `manticore-common` - provides config, stopwords, generic docs and skeleton folders (datadir, modules, etc.)
- `manticore-tools` - provides auxiliary tools (`indexer`, `indextool` etc.), their manpages and examples. It will also install `manticore-common` as the dependency.
- `manticore-icudata` (RPM) or `manticore-icudata-65l` (DEB) - provides ICU data file for icu morphology usage.
- `manticore-devel` (RPM) or `manticore-dev` (DEB) - provides dev headers for UDFs.
- Commit 2a47 Crash of daemon at grouper at RT index with different chunks
- Commit 57a1 Fastpath for empty remote docs
- Commit 07dd Expression stack frame detection runtime
- Commit 08ae Matching above 32 fields at percolate indexes
- Commit 16b9 Replication listen ports range
- Commit 5fa6 Show create table on pq
- Commit 54d1 HTTPS port behavior
- Commit fdbb Mixing docstore rows when replacing
- Commit afb5 Switch TFO unavailable message level to 'info'
- Commit 59d9 Crash on strcmp invalid use
- Commit 04af Adding index to cluster with system (stopwords) files
- Commit 5014 Merge indexes with large dictionaries; RT optimize of large disk chunks
- Commit a2ad Indextool can dump meta from current version
- Commit 69f6 Issue in group order in GROUP N
- Commit 24d5 Explicit flush for SphinxSE after handshake
- Commit 31c4 Avoid copy of huge descriptions when not necessary
- Commit 2959 Negative time in show threads
- Commit f0b3 Token filter plugin vs zero position deltas
- Commit a49e Change 'FAIL' to 'WARNING' on multiple hits
- This release took so long because we were working hard on changing multitasking mode from threads to coroutines. It makes configuration simpler and queries parallelization much more straightforward: Manticore just uses the given number of threads (see new setting threads) and the new mode makes sure it's done in the most optimal way.
- Changes in highlighting:
  - any highlighting that works with several fields (`highlight({},'field1, field2')` or `highlight` in json queries) now applies limits per-field by default.
  - any highlighting that works with plain text (`highlight({}, string_attr)` or `snippet()`) now applies limits to the whole document.
  - per-field limits can be switched to global limits by `limits_per_field=0` option (`1` by default).
  - allow_empty is now `0` by default for highlighting via HTTP JSON.
- The same port can now be used for http, https and binary API (to accept connections from a remote Manticore instance). `listen = *:mysql` is still required for connections via mysql protocol. Manticore now detects automatically the type of client trying to connect to it except for MySQL (due to restrictions of the protocol).
- In RT mode a field can now be text and string attribute at the same time - GitHub issue #331.

  In plain mode it's called `sql_field_string`. Now it's available in RT mode for real-time indexes too. You can use it as shown in the example:

      create table t(f string attribute indexed);
      insert into t values(0,'abc','abc');
      select * from t where match('abc');
      +---------------------+------+
      | id                  | f    |
      +---------------------+------+
      | 2810845392541843463 | abc  |
      +---------------------+------+
      1 row in set (0.01 sec)

      mysql> select * from t where f='abc';
      +---------------------+------+
      | id                  | f    |
      +---------------------+------+
      | 2810845392541843463 | abc  |
      +---------------------+------+
      1 row in set (0.00 sec)
- You can now highlight string attributes.
- SSL and compression support for SQL interface
- Support of mysql client `status` command.
- Replication can now replicate external files (stopwords, exceptions etc.).
- Filter operator `in` is now available via HTTP JSON interface.
- `expressions` in HTTP JSON.
- You can now change `rt_mem_limit` on the fly in RT mode, i.e. you can do `ALTER ... rt_mem_limit=<new value>`.
- You can now use separate CJK charset tables: `chinese`, `japanese` and `korean`.
- thread_stack now limits maximum thread stack, not initial.
- Improved `SHOW THREADS` output.
- Display progress of long `CALL PQ` in `SHOW THREADS`.
- cpustat, iostat, coredump can be changed during runtime with SET.
- `SET [GLOBAL] wait_timeout=NUM` implemented.
- Index format has been changed. Indexes built in 3.5.0 cannot be loaded by Manticore version < 3.5.0, but Manticore 3.5.0 understands older formats.
- `INSERT INTO PQ VALUES()` (i.e. without providing column list) previously expected exactly `(query, tags)` as the values. It's been changed to `(id,query,tags,filters)`. The id can be set to 0 if you want it to be auto-generated.
- `allow_empty=0` is a new default in highlighting via HTTP JSON interface.
- Only absolute paths are allowed for external files (stopwords, exceptions etc.) in `CREATE TABLE`/`ALTER TABLE`.
- `ram_chunks_count` was renamed to `ram_chunk_segments_count` in `SHOW INDEX STATUS`.
- `workers` is obsolete. There's only one workers mode now.
- `dist_threads` is obsolete. All queries are as much parallel as possible now (limited by `threads` and `jobs_queue_size`).
- `max_children` is obsolete. Use threads to set the number of threads Manticore will use (set to the # of CPU cores by default).
- `queue_max_length` is obsolete. Instead of that, in case it's really needed, use jobs_queue_size to fine-tune internal jobs queue size (unlimited by default).
- All `/json/*` endpoints are now available w/o `/json/`, e.g. `/search`, `/insert`, `/delete`, `/pq` etc.
- `field` meaning "full-text field" was renamed to "text" in `describe`.

  3.4.2:

      mysql> describe t;
      +-------+--------+----------------+
      | Field | Type   | Properties     |
      +-------+--------+----------------+
      | id    | bigint |                |
      | f     | field  | indexed stored |
      +-------+--------+----------------+

  3.5.0:

      mysql> describe t;
      +-------+--------+----------------+
      | Field | Type   | Properties     |
      +-------+--------+----------------+
      | id    | bigint |                |
      | f     | text   | indexed stored |
      +-------+--------+----------------+

- Cyrillic `и` doesn't map to `i` in `non_cjk` charset_table (which is a default) as it affected Russian stemmers and lemmatizers too much.
- `read_timeout`. Use network_timeout instead which controls both reading and writing.
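To illustrate the change to column-less `INSERT INTO PQ VALUES()` noted above, here is a small Python sketch that converts a value tuple written for the old `(query, tags)` order into the new `(id, query, tags, filters)` order. The helper names, the table name `pq`, the empty string used for `filters`, and the simplified quoting are all assumptions of this sketch; only the column order and the use of `0` for an auto-generated id come from the notes above.

```python
def migrate_pq_values(old_values, filters=""):
    """Convert old-style (query, tags) into new-style (id, query, tags, filters).

    id is set to 0 so that it is auto-generated, as described in the changelog.
    The default empty filters string is an assumption of this sketch.
    """
    query, tags = old_values
    return (0, query, tags, filters)


def render_values(values):
    """Render a tuple as a SQL VALUES list with simplified backslash escaping."""
    def quote(value):
        if isinstance(value, int):
            return str(value)
        escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
        return "'" + escaped + "'"

    return "(" + ", ".join(quote(v) for v in values) + ")"


new_values = migrate_pq_values(("@title shoes", "tag1,tag2"))
statement = "INSERT INTO pq VALUES " + render_values(new_values)
print(statement)
```

Listing the columns explicitly in the statement avoids depending on the positional order at all, which may be the more robust choice for scripts that must work across versions.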
- Ubuntu Focal 20.04 official package
- deb package name changed from `manticore-bin` to `manticore`
- Issue #351 searchd memory leak
- Commit ceab Tiny read out of bounds in snippets
- Commit 1c3e Dangerous write into local variable for crash queries
- Commit 26e0 Tiny memory leak of sorter in test 226
- Commit d2c7 Huge memory leak in test 226
- Commit 0dd8 Cluster shows the nodes are in sync, but `count(*)` shows different numbers
- Commit f1c1 Cosmetic: Duplicate and sometimes lost warning messages in the log
- Commit f1c1 Cosmetic: (null) index name in log
- Commit 359d Cannot retrieve more than 70M results
- Commit 19f3 Can't insert PQ rules with no-columns syntax
- Commit bf68 Misleading error message when inserting a document to an index in a cluster
- Commit 2cf1 `/json/replace` and `/json/update` return id in exponent form
- Issue #324 Update json scalar properties and mva in the same query
- Commit d384 `hitless_words` doesn't work in RT mode
- Commit 5813 `ALTER RECONFIGURE` in rt mode should be disallowed
- Commit 5813 `rt_mem_limit` gets reset to 128M after searchd restart
- highlight() sometimes hangs
- Commit 7cd8 Failed to use U+code in RT mode
- Commit 2b21 Failed to use wildcard at wordforms at RT mode
- Commit e9d0 Fixed `SHOW CREATE TABLE` vs multiple wordform files
- Commit fc90 JSON query without "query" crashes searchd
- Manticore official docker couldn't index from mysql 8
- Commit 23e0 HTTP /json/insert requires id
- Commit bd67 `SHOW CREATE TABLE` doesn't work for PQ
- Commit bd67 `CREATE TABLE LIKE` doesn't work properly for PQ
- Commit 5eac End of line in settings in show index status
- Commit cb15 Empty title in "highlight" in HTTP JSON response
- Issue #318 `CREATE TABLE LIKE` infix error
- Commit 9040 RT crashes under load
- cd512c7d Lost crash log on crash at RT disk chunk
- Issue #323 Import table fails and closes the connection
- Commit 6275 `ALTER reconfigure` corrupts a PQ index
- Commit 9c1d Searchd reload issues after change index type
- Commit 71e2 Daemon crashes on import table with missed files
- Issue #322 Crash on select using multiple indexes, group by and ranker = none
- Commit c3f5 `HIGHLIGHT()` doesn't highlight in string attributes
- Issue #320 `FACET` fails to sort on string attribute
- Commit 4f1a Error in case of missing data dir
- Commit 04f4 access_* are not supported in RT mode
- Commit 1c06 Bad JSON objects in strings: 1. `CALL PQ` returns "Bad JSON objects in strings: 1" when the json is greater than some value.
- Commit 32f9 RT-mode inconsistency. In some cases I can't drop the index since it's unknown and can't create it since the directory is not empty.
- Issue #319 Crash on select
- Commit 22a2 `max_xmlpipe2_field` = 2M returned warning on 2M field
- Issue #342 Query conditions execution bug
- Commit dd8d Simple 2 terms search finds a document containing only one term
- Commit 9091 It was impossible in PQ to match a json with capital letters in keys
- Commit 56da Indexer crashes on csv+docstore
- Issue #363 using `[null]` in json attr in centos 7 causes corrupted inserted data
- Major Issue #345 Records not being inserted, count() is random, "replace into" returns OK
- max_query_time slows down SELECTs too much
- Issue #352 Master-agent communication fails on Mac OS
- Issue #328 Error when connecting to Manticore with Connector.Net/Mysql 8.0.19
- Commit daa7 Fixed escaping of \0 and optimized performance
- Commit 9bc5 Fixed count distinct vs json
- Commit 4f89 Fixed drop table at other node failed
- Commit 952a Fix crashes on tightly running call pq
- Commit 2ffe fix RT index from old version fails to index data
- server works in 2 modes: rt-mode and plain-mode
- rt-mode requires data_dir and no index definition in config
- in plain-mode indexes are defined in config; no data_dir allowed
- replication available only in rt-mode
- charset_table defaults to non_cjk alias
- in rt-mode full-text fields are indexed and stored by default
- full-text fields in rt-mode renamed from 'field' to 'text'
- ALTER RTINDEX is renamed to ALTER TABLE
- TRUNCATE RTINDEX is renamed to TRUNCATE TABLE
- stored-only fields
- SHOW CREATE TABLE, IMPORT TABLE
- much faster lockless PQ
- /sql can execute any type of SQL statement in mode=raw
- alias mysql for mysql41 protocol
- default state.sql in data_dir
- Commit a533 fix crash on wrong field syntax in highlight()
- Commit 7fbb fix crash of server on replicate RT index with docstore
- Commit 24a0 fix crash on highlight to index with infix or prefix option and to index wo stored fields enabled
- Commit 3465 fix false error about empty docstore and doc-id lookup for empty index
- Commit a707 fix #314 SQL insert command with trailing semicolon
- Commit 9562 removed warning on query word(s) mismatch
- Commit b860 fix queries in snippets segmented via ICU
- Commit 5275 fix find/add race condition in docstore block cache
- Commit f06e fix mem leak in docstore
- Commit a725 fix #316 LAST_INSERT_ID returns empty on INSERT
- Commit 1ebd fix #317 json/update HTTP endpoint to support array for MVA and object for JSON attribute
- Commit e426 fix crash of indexer dumping rt without explicit id
- Parallel Real-Time index searching
- EXPLAIN QUERY command
- configuration file without index definitions (alpha version)
- CREATE/DROP TABLE commands (alpha version)
- indexer --print-rt - can read from a source and print INSERTs for a Real-Time index
- Updated to Snowball 2.0 stemmers
- LIKE filter for SHOW INDEX STATUS
- improved memory usage for high max_matches
- SHOW INDEX STATUS adds ram_chunks_count for RT indexes
- lockless PQ
- changed LimitNOFILE to 65536
- Commit 9c33 added check of index schema for duplicate attributes #293
- Commit a008 fix crash in hitless terms
- Commit 6895 fix loose docstore after ATTACH
- Commit d6f6 fix docstore issue in distributed setup
- Commit bce2 replace FixedHash with OpenHash in sorter
- Commit e0ba fix attributes with duplicated names at index definition
- Commit ca81 fix html_strip in HIGHLIGHT()
- Commit 493a fix passage macro in HIGHLIGHT()
- Commit a82d fix double buffer issues when RT index creates small or large disk chunk
- Commit a404 fix event deletion for kqueue
- Commit 8bea fix save of disk chunk for large value of rt_mem_limit of RT index
- Commit 8707 fix float overflow on indexing
- Commit a564 fix insert document with negative ID into RT index fails with error now
- Commit bbeb fix crash of server on ranker fieldmask
- Commit 3809 fix crash on using query cache
- Commit dc2a fix crash on using RT index RAM segments with parallel inserts
- Autoincrement ID for RT indexes
- Highlight support for docstore via new HIGHLIGHT() function, available also in HTTP API
- SNIPPET() can use special function QUERY() which returns current MATCH query
- new field_separator option for highlighting functions.
- lazy fetch of stored fields for remote nodes (can significantly increase performance)
- strings and expressions don't break anymore multi-query and FACET optimizations
- RHEL/CentOS 8 build now uses mysql libclient from mariadb-connector-c-devel
- ICU data file is now shipped with the packages, icu_data_dir removed
- systemd service files include 'Restart=on-failure' policy
- indextool can now check real-time indexes online
- default conf is now /etc/manticoresearch/manticore.conf
- service on RHEL/CentOS renamed to 'manticore' from 'searchd'
- removed query_mode and exact_phrase snippet's options
- Commit 6ae4 fix crash on SELECT query over HTTP interface
- Commit 5957 fix RT index saves disk chunks but does not mark some documents deleted
- Commit e861 fix crash on search of multi index or multi queries with dist_threads
- Commit 4409 fix crash on infix generation for long terms with wide utf8 codepoints
- Commit 5fd5 fix race at adding socket to IOCP
- Commit cf10 fix issue of bool queries vs json select list
- Commit 996d fix indextool check to report wrong skiplist offset, check of doc2row lookup
- Commit 6e3f fix indexer produces bad index with negative skiplist offset on large data
- Commit faed fix JSON converts only numeric to string and JSON string to numeric conversion at expressions
- Commit 5331 fix indextool exit with error code in case multiple commands set at command line
- Commit 7955 fix #275 binlog invalid state on error no space left on disk
- Commit 2284 fix #279 crash on IN filter to JSON attribute
- Commit ce2e fix #281 wrong pipe closing call
- Commit 5355 fix server hung at CALL PQ with recursive JSON attribute encoded as string
- Commit a5fc fix advancing beyond the end of the doclist in multiand node
- Commit a362 fix retrieving of thread public info
- Commit f8d2 fix docstore cache locks
- Document storage
- new directives stored_fields, docstore_cache_size, docstore_block_size, docstore_compression, docstore_compression_level
- improved SSL support
- non_cjk built-in charset updated
- disabled UPDATE/DELETE statements logging a SELECT in query log
- RHEL/CentOS 8 packages
- Commit 301a fix crash on replace document in disk chunk of RT index
- Commit 46c1 fix #269 LIMIT N OFFSET M
- Commit 92a4 fix DELETE statements with id explicitly set or id list provided to skip search
- Commit 8ca7 fix wrong index after event removed at netloop at windowspoll poller
- Commit 6036 fix float roundup at JSON via HTTP
- Commit 62f6 fix remote snippets to check empty path first; fixing windows tests
- Commit aba2 fix reload of config to work on windows same way as on linux
- Commit 6b8c fix #194 PQ to work with morphology and stemmers
- Commit 174d fix RT retired segments management
- Experimental SSL support for HTTP API
- field filter for CALL KEYWORDS
- max_matches for /json/search
- automatic sizing of default Galera gcache.size
- improved FreeBSD support
- Commit 0a1a fixed replication of RT index into node where same RT index exists and has different path
- Commit 4adc fix flush rescheduling for indexes without activity
- Commit d6c0 improve rescheduling of flushing RT/PQ indexes
- Commit d0a7 fix #250 index_field_lengths index option for TSV and CSV piped sources
- Commit 1266 fix indextool wrong report for block index check on empty index
- Commit 553c fix empty select list at Manticore SQL query log
- Commit 56c8 fix indexer -h/--help response
- replication for RealTime indexes
- ICU tokenizer for chinese
- new morphology option icu_chinese
- new directive icu_data_dir
- multiple statements transactions for replication
- LAST_INSERT_ID() and @session.last_insert_id
- LIKE 'pattern' for SHOW VARIABLES
- Multiple documents INSERT for percolate indexes
- Added time parsers for config
- internal task manager
- mlock for doc and hit lists components
- jail snippets path
- RLP library support dropped in favor of ICU; all rlp* directives removed
- updating document ID with UPDATE is disabled
- Commit f047 fix defects in concat and group_concat
- Commit b081 fix query uid at percolate index to be BIGINT attribute type
- Commit 4cd8 do not crash if failed to prealloc a new disk chunk
- Commit 1a55 add missing timestamp data type to ALTER
- Commit f3a8 fix crash of wrong mmap read
- Commit 4475 fix hash of clusters lock in replication
- Commit ff47 fix leak of providers in replication
- Commit 58dc fix #246 undefined sigmask in indexer
- Commit 3dd8 fix race in netloop reporting
- Commit a02a zero gap for HA strategies rebalancer
- added mmap readers for docs and hit lists
- `/sql` HTTP endpoint response is now the same as `/json/search` response
- new directives `access_plain_attrs`, `access_blob_attrs`, `access_doclists`, `access_hitlists`
- new directive `server_id` for replication setups
- removed HTTP `/search` endpoint
- `read_buffer`, `ondisk_attrs`, `ondisk_attrs_default`, `mlock` are replaced by `access_*` directives
- Commit 849c allow attribute names starting with numbers in select list
- Commit 48e6 fixed MVAs in UDFs, fixed MVA aliasing
- Commit 0555 fixed #187 crash when using query with SENTENCE
- Commit 93bf fixed #143 support () around MATCH()
- Commit 599e fixed save of cluster state on ALTER cluster statement
- Commit 230c fixed crash of server on ALTER index with blob attributes
- Commit 5802 fixed #196 filtering by id
- Commit 25d2 discard searching on template indexes
- Commit 2a30 fixed id column to have regular bigint type at SQL reply
- New index storage. Non-scalar attributes are not limited anymore to 4GB size per index
- attr_update_reserve directive
- String,JSON and MVAs can be updated using UPDATE
- killlists are applied at index load time
- killlist_target directive
- multi AND searches speedup
- better average performance and RAM usage
- convert tool for upgrading indexes made with 2.x
- CONCAT() function
- JOIN CLUSTER cluster AT 'nodeaddress:port'
- ALTER CLUSTER posts UPDATE nodes
- node_address directive
- list of nodes printed in SHOW STATUS
- in case of indexes with killlists, server doesn't rotate indexes in order defined in conf, but follows the chain of killlist targets
- order of indexes in a search no longer defines the order in which killlists are applied
- Document IDs are now signed big integers
- docinfo (always extern now), inplace_docinfo_gap, mva_updates_pool
- Galera replication for percolate indexes
- OPTION morphology
Cmake minimum version is now 3.13. Compiling requires boost and libssl development libraries.
- Commit 6967 fixed crash on many stars at select list for query into many distributed indexes
- Commit 36df fixed #177 large packet via Manticore SQL interface
- Commit 5793 fixed #170 crash of server on RT optimize with MVA updated
- Commit edb2 fixed server crash on binlog removed due to RT index remove after config reload on SIGHUP
- Commit bd3e fixed mysql handshake auth plugin payloads
- Commit 6a21 fixed #172 phrase_boundary settings at RT index
- Commit 3562 fixed #168 deadlock at ATTACH index to itself
- Commit 250b fixed binlog saves empty meta after server crash
- Commit 4aa6 fixed crash of server due to string at sorter from RT index with disk chunks
- SUBSTRING_INDEX()
- SENTENCE and PARAGRAPH support for percolate queries
- systemd generator for Debian/Ubuntu; also added LimitCORE to allow core dumping
- Commit 84fe fixed crash of server on match mode all and empty full text query
- Commit daa8 fixed crash on deleting of static string
- Commit 2207 fixed exit code when indextool failed with FATAL
- Commit 0721 fixed #109 no matches for prefixes due to wrong exact form check
- Commit 8af8 fixed #161 reload of config settings for RT indexes
- Commit e2d5 fixed crash of server on access of large JSON string
- Commit 75cd fixed PQ field at JSON document altered by index stripper causes wrong match from sibling field
- Commit e2f7 fixed crash of server at parse JSON on RHEL7 builds
- Commit 3a25 fixed crash of json unescaping when slash is on the edge
- Commit be9f fixed option 'skip_empty' to skip empty docs and not warn they're not valid json
- Commit 266e fixed #140 output 8 digits on floats when 6 is not enough to be precise
- Commit 3f6d fixed empty jsonobj creation
- Commit f3c7 fixed #160 empty mva outputs NULL instead of an empty string
- Commit 0afa fixed fail to build without pthread_getname_np
- Commit 9405 fixed crash on server shutdown with thread_pool workers
- Distributed indexes for percolate indexes
- CALL PQ new options and changes:
- skip_bad_json
- mode (sparsed/sharded)
- json documents can be passed as a json array
- shift
- Column names 'UID', 'Documents', 'Query', 'Tags', 'Filters' were renamed to 'id', 'documents', 'query', 'tags', 'filters'
- DESCRIBE pq TABLE
- SELECT FROM pq WHERE UID is not possible any more, use 'id' instead
- SELECT over pq indexes is on par with regular indexes (e.g. you can filter rules via REGEX())
- ANY/ALL can be used on PQ tags
- expressions have auto-conversion for JSON fields, not requiring explicit casting
- built-in 'non_cjk' charset_table and 'cjk' ngram_chars
- built-in stopwords collections for 50 languages
- multiple files in a stopwords declaration can also be separated by comma
- CALL PQ can accept JSON array of documents
- Commit a4e1 fixed cjson-related leak
- Commit 28d8 fixed crash because of missed value in json
- Commit bf4e fixed save of empty meta for RT index
- Commit 33b4 fixed lost form flag (exact) for sequence of lemmatizer
- Commit 6b95 fixed string attrs > 4M use saturate instead of overflow
- Commit 6214 fixed crash of server on SIGHUP with disabled index
- Commit 3f7e fixed server crash on simultaneous API session status commands
- Commit cd9e fixed crash of server at delete query to RT index with field filters
- Commit 9376 fixed crash of server at CALL PQ to distributed index with empty document
- Commit 8868 fixed cut Manticore SQL error message larger 512 chars
- Commit de9d fixed crash on save percolate index without binlog
- Commit 2b21 fixed http interface is not working in OSX
- Commit e92c fixed indextool false error message on check of MVA
- Commit 238b fixed write lock at FLUSH RTINDEX to not write lock whole index during save and on regular flush from rt_flush_period
- Commit c26a fixed ALTER percolate index stuck waiting search load
- Commit 9ee5 fixed max_children to use default amount of thread_pool workers for value of 0
- Commit 5138 fixed error on indexing of data into index with index_token_filter plugin along with stopwords and stopword_step=0
- Commit 2add fixed crash with absent lemmatizer_base when still using aot lemmatizers in index definitions
- REGEX function
- limit/offset for json API search
- profiler points for qcache
- Commit eb3c fixed crash of server on FACET with multiple attribute wide types
- Commit d915 fixed implicit group by at main select list of FACET query
- Commit 5c25 fixed crash on query with GROUP N BY
- Commit 85d3 fixed deadlock on handling crash at memory operations
- Commit 8516 fixed indextool memory consumption during check
- Commit 58fb fixed gmock include not needed anymore as upstream resolve itself
- SHOW THREADS in case of remote distributed indexes prints the original query instead of API call
- SHOW THREADS new option `format=sphinxql` prints all queries in SQL format
- SHOW PROFILE prints additional `clone_attrs` stage
- Commit 4f15 fixed failed to build with libc without malloc_stats, malloc_trim
- Commit f974 fixed special symbols inside words for CALL KEYWORDS result set
- Commit 0920 fixed broken CALL KEYWORDS to distributed index via API or to remote agent
- Commit fd68 fixed distributed index agent_query_timeout propagate to agents as max_query_time
- Commit 4ffa fixed total documents counter at disk chunk got affected by OPTIMIZE command and breaks weight calculation
- Commit dcaf fixed multiple tail hits at RT index from blended
- Commit eee3 fixed deadlock at rotation
- sort_mode option for CALL KEYWORDS
- DEBUG on VIP connection can perform 'crash' for intentional SIGSEGV action on server
- DEBUG can perform 'malloc_stats' for dumping malloc stats in searchd.log and 'malloc_trim' to perform a malloc_trim()
- improved backtrace if gdb is present on the system
- Commit 0f3c fixed crash or failure of rename on Windows
- Commit 1455 fixed crashes of server on 32-bit systems
- Commit ad37 fixed crash or hang of server on empty SNIPPET expression
- Commit b36d fixed broken non progressive optimize and fixed progressive optimize to not create kill-list for oldest disk chunk
- Commit 34b0 fixed queue_max_length bad reply for SQL and API at thread pool worker mode
- Commit ae4b fixed crash on adding full-scan query to PQ index with regexp or rlp options set
- Commit f80f fixed crash when call one PQ after another
- Commit 9742 refactor AcquireAccum
- Commit 39e5 fixed leak of memory after call pq
- Commit 21bc cosmetic refactor (c++11 style c-trs, defaults, nullptrs)
- Commit 2d69 fixed memory leak on trying to insert duplicate into PQ index
- Commit 5ed9 fixed crash on JSON field IN with large values
- Commit 4a52 fixed crash of server on CALL KEYWORDS statement to RT index with expansion limit set
- Commit 5526 fixed invalid filter at PQ matches query;
- Commit 204f introduce small obj allocator for ptr attrs
- Commit 2545 refactor ISphFieldFilter to refcounted flavour
- Commit 1366 fixed ub/sigsegv when using strtod on non-terminated strings
- Commit 94bc fixed memory leak in json resultset processing
- Commit e78e fixed read over the end of mem block applying attribute add
- Commit fad5 fixed refactor CSphDict for refcount flavour
- Commit fd84 fixed leak of AOT internal type outside
- Commit 5ee7 fixed memory leak tokenizer management
- Commit 116c fixed memory leak in grouper
- Commit 56fd special free/copy for dynamic ptrs in matches (memory leak grouper)
- Commit b1fc fixed memory leak of dynamic strings for RT
- Commit 517b refactor grouper
- Commit b1fc minor refactor (c++11 c-trs, some reformats)
- Commit 7034 refactor ISphMatchComparator to refcounted flavour
- Commit b1fc privatize cloner
- Commit efbc simplify native little-endian for MVA_UPSIZE, DOCINFO2ID_T, DOCINFOSETID
- Commit 6da0 add valgrind support to ubertests
- Commit 1d17 fixed crash because race of 'success' flag on connection
- Commit 5a09 switch epoll to edge-triggered flavour
- Commit 5d52 fixed IN statement in expression with formatting like at filter
- Commit bd8b fixed crash at RT index on commit of document with large docid
- Commit ce65 fixed argless options in indextool
- Commit 08c9 fixed memory leak of expanded keyword
- Commit 30c7 fixed memory leak of json grouper
- Commit 6023 fixed leak of global user vars
- Commit 7c13 fixed leakage of dynamic strings on early rejected matches
- Commit 9154 fixed leakage on length()
- Commit 43fc fixed memory leak because of strdup() in parser
- Commit 71ff fixed refactor expression parser to accurate follow refcounts
- compatibility with MySQL 8 clients
- TRUNCATE WITH RECONFIGURE
- retired memory counter on SHOW STATUS for RT indexes
- global cache of multi agents
- improved IOCP on Windows
- VIP connections for HTTP protocol
- Manticore SQL DEBUG command which can run various subcommands
- shutdown_token - SHA1 hash of password needed to invoke `shutdown` using DEBUG command
- new stats to SHOW AGENT STATUS (_ping, _has_perspool, _need_resolve)
- --verbose option of indexer now accepts [debugvv] for printing debug messages
- Commit 3900 removed wlock at optimize
- Commit 4c33 fixed wlock at reload index settings
- Commit b5ea fixed memory leak on query with JSON filter
- Commit 930e fixed empty documents at PQ result set
- Commit 53de fixed confusion of tasks due to removed one
- Commit cad9 fixed wrong remote host counting
- Commit 9000 fixed memory leak of parsed agent descriptors
- Commit 978d fixed leak in search
- Commit 0193 cosmetic changes on explicit/inline c-trs, override/final usage
- Commit 943e fixed leak of json in local/remote schema
- Commit 02db fixed leak of json sorting col expr in local/remote schema
- Commit c74d fixed leak of const alias
- Commit 6e5b fixed leak of preread thread
- Commit 39c7 fixed stuck on exit because of stuck wait in netloop
- Commit adaf fixed stuck of 'ping' behaviour on change HA agent to usual host
- Commit 32c4 separate gc for dashboard storage
- Commit 511a fixed ref-counted ptr fix
- Commit 32c4 fixed indextool crash on nonexistent index
- Commit 156e fixed output name of exceeding attr/field in xmlpipe indexing
- Commit cdac fixed default indexer's value if no indexer section in config
- Commit e61e fixed wrong embedded stopwords in disk chunk by RT index after server restart
- Commit 5fba fixed skip phantom (already closed, but not finally deleted from the poller) connections
- Commit f22a fixed blended (orphaned) network tasks
- Commit 4689 fixed crash on read action after write
- Commit 03f9 fixed searchd crashes when running tests on windows
- Commit e925 fixed handle EINPROGRESS code on usual connect()
- Commit 248b fixed connection timeouts when working with TFO
- improved wildcards performance on matching multiple documents at PQ
- support for fullscan queries at PQ
- support for MVA attributes at PQ
- regexp and RLP support for percolate indexes
- Commit 6885 fixed loss of query string
- Commit 0f17 fixed empty info at SHOW THREADS statement
- Commit 53fa fixed crash on matching with NOTNEAR operator
- Commit 2602 fixed error message on bad filter to PQ delete
- reduced number of syscalls to avoid Meltdown and Spectre patches impact
- internal rewrite of local index management
- remote snippets refactor
- full configuration reload
- all node connections are now independent
- proto improvements
- Windows communication switched from wsapoll to IO completion ports
- TFO can be used for communication between master and nodes
- SHOW STATUS now outputs to server version and mysql_version_string
- added `docs_id` option for documents called in CALL PQ.
- percolate queries filter can now contain expressions
- distributed indexes can work with FEDERATED
- dummy SHOW NAMES COLLATE and `SET wait_timeout` (for better ProxySQL compatibility)
- Commit 5bcf fixed added not equal to tags of PQ
- Commit 9ebc fixed added document id field to JSON document CALL PQ statement
- Commit 8ae0 fixed flush statement handlers to PQ index
- Commit c24b fixed PQ filtering on JSON and string attributes
- Commit 1b8b fixed parsing of empty JSON string
- Commit 1ad8 fixed crash at multi-query with OR filters
- Commit 69b8 fixed indextool to use config common section (lemmatizer_base option) for commands (dumpheader)
- Commit 6dbe fixed empty string at result set and filter
- Commit 39c4 fixed negative document id values
- Commit 266b fixed word clip length for very long words indexed
- Commit 4782 fixed matching multiple documents of wildcard queries at PQ
- MySQL FEDERATED engine support
- MySQL packets return now SERVER_STATUS_AUTOCOMMIT flag, adds compatibility with ProxySQL
- listen_tfo - enable TCP Fast Open connections for all listeners
- indexer --dumpheader can dump also RT header from .meta file
- cmake build script for Ubuntu Bionic
- Commit 355b fixed invalid query cache entries for RT index;
- Commit 546e fixed index settings got lost next after seamless rotation
- Commit 0c45 fixed infix vs prefix length set; added warning on unsupported infix length
- Commit 8054 fixed RT indexes auto-flush order
- Commit 705d fixed result set schema issues for index with multiple attributes and queries to multiple indexes
- Commit b0ba fixed some hits got lost at batch insert with document duplicates
- Commit 4510 fixed optimize failed to merge disk chunks of RT index with large documents count
- jemalloc at compilation. If jemalloc is present on system, it can be enabled with cmake flag `-DUSE_JEMALLOC=1`
- Commit 85a6 fixed log expand_keywords option into Manticore SQL query log
- Commit caaa fixed HTTP interface to correctly process query with large size
- Commit e386 fixed crash of server on DELETE to RT index with index_field_lengths enable
- Commit cd53 fixed cpustats searchd cli option to work with unsupported systems
- Commit 8740 fixed utf8 substring matching with min lengths defined
- improved Percolate Queries performance in case of using NOT operator and for batched documents.
- percolate_query_call can use multiple threads depending on dist_threads
- new full-text matching operator NOTNEAR/N
- LIMIT for SELECT on percolate indexes
- expand_keywords can accept 'star','exact' (where 'star,exact' has same effect as '1')
- ranged-main-query for joined fields which uses the ranged query defined by sql_query_range
- Commit 72dc fixed crash on searching ram segments; deadlock on save disk chunk with double buffer; deadlock on save disk chunk during optimize
- Commit 3613 fixed indexer crash on xml embedded schema with empty attribute name
- Commit 48d7 fixed erroneous unlinking of not-owned pid-file
- Commit a556 fixed orphaned fifos sometimes left in temp folder
- Commit 2376 fixed empty FACET result set with wrong NULL row
- Commit 4842 fixed broken index lock when running server as windows service
- Commit be35 fixed wrong iconv libs on mac os
- Commit 8374 fixed wrong count(*)
- agent_retry_count in case of agents with mirrors gives the value of retries per mirror instead of per agent, the total retries per agent being agent_retry_count*mirrors.
- agent_retry_count can now be specified per index, overriding global value. An alias mirror_retry_count is added.
- a retry_count can be specified in agent definition and the value represents retries per agent
- Percolate Queries are now in HTTP JSON API at /json/pq.
- Added -h and -v options (help and version) to executables
- morphology_skip_fields support for Real-Time indexes
- Commit a40b fixed ranged-main-query to correctly work with sql_range_step when used at MVA field
- Commit f2f5 fixed issue with blackhole system loop hanging and blackhole agents appearing disconnected
- Commit 84e1 fixed query id to be consistent, fixed duplicated id for stored queries
- Commit 1948 fixed server crash on shutdown from various states
- Commit 9a70, Commit 3495 fixed timeouts on long queries
- Commit 3359 refactored master-agent network polling on kqueue-based systems (Mac OS X, BSD).
- HTTP JSON: JSON queries can now do equality on attributes, MVA and JSON attributes can be used in inserts and updates, updates and deletes via JSON API can be performed on distributed indexes
- Percolate Queries
- Removed support for 32-bit docids from the code. Also removed all the code that converts/loads legacy indexes with 32-bit docids.
- Morphology only for certain fields. A new index directive morphology_skip_fields allows defining a list of fields to which morphology does not apply.
- expand_keywords can now be a query runtime directive set using the OPTION statement
- Commit 0cfa fixed crash on debug build of server (and possibly undefined behavior on release) when built with rlp
- Commit 3242 fixed RT index optimize with progressive option enabled that merges kill-lists with wrong order
- Commit ac0e minor crash on mac
- lots of minor fixes after thorough static code analysis
- other minor bugfixes
In this release we've changed the internal protocol used by masters and agents to communicate with each other. If you run Manticoresearch in a distributed environment with multiple instances, make sure you upgrade the agents first, then the masters.
- JSON queries on HTTP API protocol. Supported operations: search, insert, update, delete, replace. Data manipulation commands can also be bulked. There are some limitations currently: MVA and JSON attributes can't be used for inserts, replaces or updates.
- RELOAD INDEXES command
- FLUSH LOGS command
- SHOW THREADS can show progress of optimize, rotation or flushes.
- GROUP N BY work correctly with MVA attributes
- blackhole agents are run on separate thread to not affect master query anymore
- implemented reference count on indexes, to avoid stalls caused by rotations and high load
- SHA1 hashing implemented, not exposed yet externally
- fixes for compiling on FreeBSD, macOS and Alpine
- Commit 9897 filter regression with block index
- Commit b1c3 rename PAGE_SIZE -> ARENA_PAGE_SIZE for compatibility with musl
- Commit f213 disable googletests for cmake < 3.1.0
- Commit f30e failed to bind socket on server restart
- Commit 0807 fixed crash of server on shutdown
- Commit 3e3a fixed show threads for system blackhole thread
- Commit 262c Refactored config check of iconv, fixes building on FreeBSD and Darwin
- OR operator in WHERE clause between attribute filters
- Maintenance mode ( SET MAINTENANCE=1)
- CALL KEYWORDS available on distributed indexes
- Grouping in UTC
- query_log_mode for custom log files permissions
- Field weights can be zero or negative
- max_query_time can now affect full-scans
- added net_wait_tm, net_throttle_accept and net_throttle_action for network thread fine tuning (in case of workers=thread_pool)
- COUNT DISTINCT works with facet searches
- IN can be used with JSON float arrays
- multi-query optimization is not broken anymore by integer/float expressions
- SHOW META shows a multiplier row when multi-query optimization is used
Manticore Search is built using cmake and the minimum gcc version required for compiling is 4.7.2.
- Manticore Search runs under the manticore user.
- Default data folder is now /var/lib/manticore/ .
- Default log folder is now /var/log/manticore/ .
- Default pid folder is now /var/run/manticore/ .
- Commit a58c fixed SHOW COLLATION statement that breaks java connector
- Commit 631c fixed crashes on processing distributed indexes; added locks to distributed index hash; removed move and copy operators from agent
- Commit 942b fixed crashes on processing distributed indexes due to parallel reconnects
- Commit e5c1 fixed crash at crash handler on store query to server log
- Commit 4a4b fixed a crash with pooled attributes in multiqueries
- Commit 3873 reduced core file size by preventing index pages from being included in the core file
- Commit 11e6 fixed searchd crashes on startup when invalid agents are specified
- Commit 4ca6 fixed indexer reports error in sql_query_killlist query
- Commit 123a fixed fold_lemmas=1 vs hit count
- Commit cb99 fixed inconsistent behavior of html_strip
- Commit e406 fixed optimize of RT index losing new settings; fixed lock leaks on optimize with sync option
- Commit 86ae fixed processing erroneous multiqueries
- Commit 2645 fixed result set depends on multi-query order
- Commit 7239 fixed server crash on multi-query with bad query
- Commit f353 fixed shared to exclusive lock
- Commit 3754 fixed server crash for query without indexes
- Commit 29f3 fixed deadlock of server
- Manticore branding
Unfortunately, Manticore is not yet 100% bug-free, although the development team is working hard towards that goal. You may encounter some issues from time to time. It is crucial to report as much information as possible about each bug to fix it effectively. To fix a bug, either it needs to be reproduced and fixed or its cause needs to be deduced based on the information you provide. To help with this, please follow the instructions below.
Bugs and feature requests are tracked on GitHub. You are welcome to create a new ticket and describe your bug in detail to save time for both you and the developers.
Updates to the documentation (what you are reading now) are also done on GitHub.
Manticore Search is written in C++, which is a low-level programming language that allows for direct communication with the computer for faster performance. However, there is a drawback to this approach as in rare cases, it may not be possible to elegantly handle a bug by writing an error to a log and skipping the processing of the command that caused the problem. Instead, the program may crash, resulting in it stopping completely and needing to be restarted.
When Manticore Search crashes, it is important to let the Manticore team know by submitting a bug report on GitHub or through Manticore's professional services in your private helpdesk. The Manticore team requires the following information:
- The searchd log
- The coredump
- The query log
Additionally, it would be helpful if you could do the following:
- Run gdb to inspect the coredump:
gdb /usr/bin/searchd </path/to/coredump>
- Find the crashed thread ID in the coredump file name (make sure you have %p in /proc/sys/kernel/core_pattern), e.g. core.work_6.29050.server_name.1637586599 means thread_id=29050
- In gdb run:
set pagination off
info threads
# find the thread number by its ID (e.g. for `LWP 29050` it will be thread number 8)
thread apply all bt
thread <thread number>
bt full
info locals
quit
- Provide the outputs
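The thread ID lookup from the file name can be scripted. The following is a minimal sketch, assuming the core file name ends with the layout core.%e.%p.%h.%t used in the example above; the helper name parse_core_name is hypothetical and not part of Manticore.

```python
import os
import re

# Assumed layout: core.<executable>.<pid>.<hostname>.<unix timestamp>
# (i.e. core_pattern ending in core.%e.%p.%h.%t). Other patterns will not match.
CORE_NAME = re.compile(
    r"^core\.(?P<exe>.+?)\.(?P<pid>\d+)\.(?P<host>.+)\.(?P<ts>\d+)$"
)


def parse_core_name(path: str) -> dict:
    """Split a core file name into its %e, %p, %h and %t components."""
    name = os.path.basename(path)
    match = CORE_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected core file name: {name!r}")
    return {
        "exe": match["exe"],
        "pid": int(match["pid"]),  # the ID to look for as `LWP <pid>` in `info threads`
        "host": match["host"],
        "timestamp": int(match["ts"]),
    }


if __name__ == "__main__":
    info = parse_core_name("core.work_6.29050.server_name.1637586599")
    print(f"look for LWP {info['pid']} in the gdb `info threads` output")
```

If the executable name or host name itself contains a dot followed by digits, the split becomes ambiguous, so treat the result as a hint and confirm it in gdb.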
If Manticore Search hangs, you need to collect some information that may be useful in understanding the cause. Here's how you can do it:
- Run show threads option format=all through a VIP port.
- Collect the lsof output, as hanging can be caused by too many connections or open file descriptors:
lsof -p `cat /var/run/manticore/searchd.pid`
- Dump the core:
gcore `cat /var/run/manticore/searchd.pid`
(It will save the dump to the current directory.)
- Install and run gdb:
gdb /usr/bin/searchd `cat /var/run/manticore/searchd.pid`
Note that this will halt your running searchd, but if it's already hanging, it shouldn't be a problem.
- In gdb run:
set pagination off
info threads
thread apply all bt
quit
- Collect all the outputs and files and provide them in a bug report.
For experts: the macros added in this commit can be helpful in debugging.
- Make sure you run searchd with --coredump. To avoid modifying scripts, you can use the https://manual.manticoresearch.com/Starting_the_server/Linux#Custom-startup-flags-using-systemd method. For example:
[root@srv lib]# systemctl set-environment _ADDITIONAL_SEARCHD_PARAMS='--coredump'
[root@srv lib]# systemctl restart manticore
[root@srv lib]# ps aux|grep searchd
mantico+ 1955 0.0 0.0 61964 1580 ? S 11:02 0:00 /usr/bin/searchd --config /etc/manticoresearch/manticore.conf --coredump
mantico+ 1956 0.6 0.0 392744 2664 ? Sl 11:02 0:00 /usr/bin/searchd --config /etc/manticoresearch/manticore.conf --coredump
- Ensure that your operating system allows you to save core dumps by checking that /proc/sys/kernel/core_pattern is not empty. This is the location where the core dumps will be saved. To save core dumps to a file such as core.searchd.1773.centos-4gb-hel1-1.1636454937 , run the following command:
echo "/cores/core.%e.%p.%h.%t" > /proc/sys/kernel/core_pattern
- searchd should be started with ulimit -c unlimited. If you start Manticore using systemctl, it will automatically set the limit to infinity, as indicated by the following line in the manticore.service file:
[root@srv lib]# grep CORE /lib/systemd/system/manticore.service
LimitCORE=infinity
Manticore Search and Manticore Columnar Library are written in C++, which results in compiled binary files that execute optimally on your operating system. However, when running a binary, your system does not have full access to the names of variables, functions, methods, and classes. This information is provided in separate "debuginfo" or "symbol packages."
Debug symbols are essential for troubleshooting and debugging, as they allow you to visualize the system state when it crashed, including the names of functions. Manticore Search provides a backtrace in the searchd log and generates a coredump if run with the --coredump flag. Without symbols, all you will see is internal offsets, making it difficult or impossible to decode the cause of the crash. If you need to make a bug report about a crash, the Manticore team will often require debug symbols to assist you.
To install Manticore Search/Manticore Columnar Library debug symbols, you will need to install the *debuginfo* package for CentOS, the *dbgsym* package for Ubuntu and Debian, or the *dbgsymbols* package for Windows and macOS. These packages should be the same version as the installed Manticore. For example, if you installed Manticore Search in CentOS 8 from the package https://repo.manticoresearch.com/repository/manticoresearch/release/centos/8/x86_64/manticore-4.0.2_210921.af497f245-1.el8.x86_64.rpm , the corresponding package with symbols would be https://repo.manticoresearch.com/repository/manticoresearch/release/centos/8/x86_64/manticore-debuginfo-4.0.2_210921.af497f245-1.el8.x86_64.rpm
Note that both packages have the same commit id af497f245, which corresponds to the commit that this version was built from.
If you have installed Manticore from a Manticore APT/YUM repository, you can use one of the following tools:
- debuginfo-install in CentOS 7
- dnf debuginfo-install in CentOS 8
- find-dbgsym-packages in Debian and Ubuntu
to find a debug symbols package for you.
- Find the build ID in the output of file /usr/bin/searchd:
[root@srv lib]# file /usr/bin/searchd
/usr/bin/searchd: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=2c582e9f564ea1fbeb0c68406c271ba27034a6d3, stripped
In this case, the build ID is 2c582e9f564ea1fbeb0c68406c271ba27034a6d3.
- Find symbols in /usr/lib/debug/.build-id like this:
[root@srv ~]# ls -la /usr/lib/debug/.build-id/2c/582e9f564ea1fbeb0c68406c271ba27034a6d3*
lrwxrwxrwx. 1 root root 23 Nov 9 10:42 /usr/lib/debug/.build-id/2c/582e9f564ea1fbeb0c68406c271ba27034a6d3 -> ../../../../bin/searchd
lrwxrwxrwx. 1 root root 27 Nov 9 10:42 /usr/lib/debug/.build-id/2c/582e9f564ea1fbeb0c68406c271ba27034a6d3.debug -> ../../usr/bin/searchd.debug
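The two steps above follow a fixed naming rule: the first two hex digits of the build ID name a directory and the remainder, plus a .debug suffix, names the symbol link. A minimal sketch of that mapping is shown below; the function names are hypothetical, and the default root is the /usr/lib/debug/.build-id path from the example, which may differ on other distributions.

```python
import re
from pathlib import PurePosixPath

# Matches the BuildID field as printed by `file` in the example above.
BUILD_ID = re.compile(r"BuildID\[sha1\]=([0-9a-f]+)")


def extract_build_id(file_output: str) -> str:
    """Pull the build ID out of the text printed by `file /usr/bin/searchd`."""
    match = BUILD_ID.search(file_output)
    if match is None:
        raise ValueError("no BuildID[sha1] field found in the given text")
    return match.group(1)


def debug_symbol_path(build_id: str, root: str = "/usr/lib/debug/.build-id") -> PurePosixPath:
    """Map a build ID to the path where the .debug link is expected."""
    if len(build_id) < 3:
        raise ValueError("build ID is too short")
    return PurePosixPath(root) / build_id[:2] / f"{build_id[2:]}.debug"


if __name__ == "__main__":
    sample = (
        "/usr/bin/searchd: ELF 64-bit LSB executable, x86-64, "
        "BuildID[sha1]=2c582e9f564ea1fbeb0c68406c271ba27034a6d3, stripped"
    )
    print(debug_symbol_path(extract_build_id(sample)))
```

If the printed path does not exist on your system, the matching symbols package is most likely not installed.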
To fix your bug, developers often need to reproduce it locally. To do this, they need your configuration file, table files, binlog (if present), and sometimes source data (such as data from external storages or XML/CSV files) and queries.
Attach your data when you create a ticket on GitHub. If the data is too large or sensitive, you can upload it to our write-only S3 storage at s3://s3.manticoresearch.com/write-only/ . Here's how you can do it using the Minio client:
- Install the client: https://min.io/docs/minio/linux/reference/minio-mc.html#install-mc For example, on 64-bit Linux:
curl https://dl.min.io/client/mc/release/linux-amd64/mc \
--create-dirs \
-o $HOME/minio-binaries/mc
chmod +x $HOME/minio-binaries/mc
export PATH=$PATH:$HOME/minio-binaries/
- Add our s3 host (use the full path to the executable or change into its directory): cd $HOME/minio-binaries and then ./mc config host add manticore http://s3.manticoresearch.com:9000 manticore manticore
- Copy your files (use the full path to the executable or change into its directory): cd $HOME/minio-binaries and then ./mc cp -r issue-1234/ manticore/write-only/issue-1234 . Make sure the folder name is unique; ideally it should correspond to the issue on GitHub where you described the bug.
DEBUG [ subcommand ]
The DEBUG statement is designed for developers and testers to call various internal or VIP commands. However, it is not intended for production use, as the syntax of the subcommand component may change freely in any build.
To view a list of useful commands and DEBUG statement subcommands available in the current context, simply call DEBUG without any parameters.
mysql> debug;
+-------------------------------------------------------------------------+----------------------------------------------------------------------------------------+
| command | meaning |
+-------------------------------------------------------------------------+----------------------------------------------------------------------------------------+
| flush logs | emulate USR1 signal |
| reload indexes | emulate HUP signal |
| debug token <password> | calculate token for password |
| debug malloc_stats | perform 'malloc_stats', result in searchd.log |
| debug malloc_trim | pefrorm 'malloc_trim' call |
| debug sleep <N> | sleep for <N> seconds |
| debug tasks | display global tasks stat (use select from @@system.tasks instead) |
| debug sched | display task manager schedule (use select from @@system.sched instead) |
| debug merge <TBL> [chunk] <X> [into] [chunk] <Y> [option sync=1,byid=0] | For RT table <TBL> merge disk chunk X into disk chunk Y |
| debug drop [chunk] <X> [from] <TBL> [option sync=1] | For RT table <TBL> drop disk chunk X |
| debug files <TBL> [option format=all|external] | list files belonging to <TBL>. 'all' - including external (wordforms, stopwords, etc.) |
| debug close | ask server to close connection from it's side |
| debug compress <TBL> [chunk] <X> [option sync=1] | Compress disk chunk X of RT table <TBL> (wipe out deleted documents) |
| debug split <TBL> [chunk] <X> on @<uservar> [option sync=1] | Split disk chunk X of RT table <TBL> using set of DocIDs from @uservar |
| debug wait <cluster> [like 'xx'] [option timeout=3] | wait <cluster> ready, but no more than 3 secs. |
| debug wait <cluster> status <N> [like 'xx'] [option timeout=13] | wait <cluster> commit achieve <N>, but no more than 13 secs |
| debug meta | Show max_matches/pseudo_shards. Needs set profiling=1 |
| debug trace OFF|'path/to/file' [<N>] | trace flow to file until N bytes written, or 'trace OFF' |
| debug curl <URL> | request given url via libcurl |
+-------------------------------------------------------------------------+----------------------------------------------------------------------------------------+
19 rows in set (0.00 sec)
Same from VIP connection:
mysql> debug;
+-------------------------------------------------------------------------+----------------------------------------------------------------------------------------+
| command | meaning |
+-------------------------------------------------------------------------+----------------------------------------------------------------------------------------+
| flush logs | emulate USR1 signal |
| reload indexes | emulate HUP signal |
| debug shutdown <password> | emulate TERM signal |
| debug crash <password> | crash daemon (make SIGSEGV action) |
| debug token <password> | calculate token for password |
| debug malloc_stats | perform 'malloc_stats', result in searchd.log |
| debug malloc_trim | pefrorm 'malloc_trim' call |
| debug procdump | ask watchdog to dump us |
| debug setgdb on|off | enable or disable potentially dangerous crash dumping with gdb |
| debug setgdb status | show current mode of gdb dumping |
| debug sleep <N> | sleep for <N> seconds |
| debug tasks | display global tasks stat (use select from @@system.tasks instead) |
| debug sched | display task manager schedule (use select from @@system.sched instead) |
| debug merge <TBL> [chunk] <X> [into] [chunk] <Y> [option sync=1,byid=0] | For RT table <TBL> merge disk chunk X into disk chunk Y |
| debug drop [chunk] <X> [from] <TBL> [option sync=1] | For RT table <TBL> drop disk chunk X |
| debug files <TBL> [option format=all|external] | list files belonging to <TBL>. 'all' - including external (wordforms, stopwords, etc.) |
| debug close | ask server to close connection from it's side |
| debug compress <TBL> [chunk] <X> [option sync=1] | Compress disk chunk X of RT table <TBL> (wipe out deleted documents) |
| debug split <TBL> [chunk] <X> on @<uservar> [option sync=1] | Split disk chunk X of RT table <TBL> using set of DocIDs from @uservar |
| debug wait <cluster> [like 'xx'] [option timeout=3] | wait <cluster> ready, but no more than 3 secs. |
| debug wait <cluster> status <N> [like 'xx'] [option timeout=13] | wait <cluster> commit achieve <N>, but no more than 13 secs |
| debug meta | Show max_matches/pseudo_shards. Needs set profiling=1 |
| debug trace OFF|'path/to/file' [<N>] | trace flow to file until N bytes written, or 'trace OFF' |
| debug curl <URL> | request given url via libcurl |
+-------------------------------------------------------------------------+----------------------------------------------------------------------------------------+
24 rows in set (0.00 sec)
All debug XXX commands should be regarded as non-stable and subject to modification at any time, so don't be surprised if they change. This example output may not reflect the actual available commands, so try it on your system to see what is available on your instance. Additionally, there is no detailed documentation provided aside from this short 'meaning' column.
As a quick illustration, two commands available only to VIP clients are described below - shutdown and crash. Both require a token, which can be generated with the debug token subcommand, and added to the shutdown_token param in the searchd section of the config file. If no such section exists, or if the provided password hash does not match the token stored in the config, the subcommands will do nothing.
mysql> debug token hello;
+-------------+------------------------------------------+
| command | result |
+-------------+------------------------------------------+
| debug token | aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d |
+-------------+------------------------------------------+
1 row in set (0,00 sec)
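The sample token shown above is identical to the SHA-1 hex digest of the string hello. The sketch below reproduces that value offline. It rests on the assumption that the token is a plain SHA-1 of the password, which this text does not state, so compare the result with the output of debug token on your own instance before relying on it. The function name is hypothetical.

```python
import hashlib


def candidate_shutdown_token(password: str) -> str:
    """Return the SHA-1 hex digest of the password.

    Assumption: this matches what `debug token <password>` prints. It does
    for the sample in this section, but the algorithm is not documented here.
    """
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(candidate_shutdown_token("hello"))
```

The printed value is what would go into the shutdown_token param in the searchd section of the config file.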
The subcommand shutdown will send a TERM signal to the server, causing it to shut down. This can be dangerous, as nobody wants to accidentally stop a production service. Therefore, it requires a VIP connection and the password to be used.
The subcommand crash literally causes a crash. It may be used for testing purposes, such as to test how the system manager maintains the service's liveness or to test the feasibility of tracking coredumps.
If some commands are found to be useful in a more general context, they may be moved from the debug subcommands to a more stable and generic location (as exemplified by the debug tasks and debug sched in the table).
- CREATE TABLE - Creates new table
- CREATE TABLE LIKE - Creates table using another one as a template
- CREATE TABLE LIKE ... WITH DATA - Copies a table
- DESCRIBE - Prints out table's field list and their types
- ALTER TABLE - Changes table schema / settings
- ALTER TABLE REBUILD SECONDARY - Updates/recovers secondary indexes
- ALTER TABLE type='distributed' - Changes the list of local or remote tables in a distributed table
- ALTER TABLE RENAME - Renames a table
- DROP TABLE IF EXISTS - Deletes a table (if it exists)
- SHOW TABLES - Shows tables list
- SHOW CREATE TABLE - Shows SQL command how to create the table
- SHOW TABLE STATUS - Shows information about current table status
- SHOW TABLE SETTINGS - Shows table settings
- INSERT - Adds new documents
- REPLACE - Replaces existing documents with new ones
- REPLACE .. SET - Replaces one or multiple fields in a table
- UPDATE - Does in-place update in documents
- DELETE - Deletes documents
- TRUNCATE TABLE - Deletes all documents from table
- BACKUP - Backs up your tables
- SELECT - Searches
- WHERE - Filters
- GROUP BY - Groups search results
- GROUP BY ORDER - Orders groups
- GROUP BY HAVING - Filters groups
- OPTION - Query Options
- FACET - Faceted search
- SUB-SELECTS - About using SELECT sub-queries
- JOIN - Joining tables in SELECT
- EXPLAIN QUERY - Shows query execution plan without running the query itself
- SHOW META - Shows extended information about executed query
- SHOW PROFILE - Shows profiling information about executed query
- SHOW PLAN - Shows query execution plan after the query was executed
- SHOW WARNINGS - Shows warnings from the latest query
- FLUSH ATTRIBUTES - Forces flushing updated attributes to disk
- FLUSH HOSTNAMES - Renews IPs associated with agent host names
- FLUSH LOGS - Initiates reopen of searchd log and query log files (similar to USR1)
- FLUSH RAMCHUNK - Force creating a new disk chunk
- FLUSH TABLE - Flushes real-time table RAM chunk to disk
- OPTIMIZE TABLE - Enqueues real-time table for optimization
- ATTACH TABLE - Moves data from a plain table to a real-time table
- IMPORT TABLE - Imports previously created RT or PQ table into a server running in the RT mode
- JOIN CLUSTER - Joins a replication cluster
- ALTER CLUSTER - Adds/deletes a table to a replication cluster
- SET CLUSTER - Changes replication cluster settings
- DELETE CLUSTER - Deletes a replication cluster
- RELOAD TABLE - Rotates a plain table
- RELOAD TABLES - Rotates all plain tables
- CALL SUGGEST, CALL QSUGGEST - Suggests spell-corrected words
- CALL SNIPPETS - Builds a highlighted results snippet from provided data and query
- CALL PQ - Runs a percolate query
- CALL KEYWORDS - Used to check how keywords are tokenized. Also allows retrieving tokenized forms of provided keywords
- CREATE FUNCTION - Installs a user-defined function (UDF)
- DROP FUNCTION - Drops a user-defined function (UDF)
- CREATE PLUGIN - Installs a plugin
- CREATE BUDDY PLUGIN - Installs a Buddy plugin
- DROP PLUGIN - Drops a plugin
- DROP BUDDY PLUGIN - Drops a Buddy plugin
- RELOAD PLUGINS - Reloads all plugins from a given library
- ENABLE BUDDY PLUGIN - Reactivates a previously disabled Buddy plugin
- DISABLE BUDDY PLUGIN - Deactivates an active Buddy plugin
- SHOW STATUS - Displays a number of useful performance counters
- SHOW THREADS - Lists all currently active client threads
- SHOW VARIABLES - Lists server-wide variables and their values
- SHOW VERSION - Provides detailed version information of various components of the instance.
- /sql - Execute an SQL statement over HTTP JSON
- /cli - Provides an HTTP command line interface
- /insert - Inserts a document into a real-time table
- /pq/tbl_name/doc - Adds a PQ rule to a percolate table
- /update - Updates a document in a real-time table
- /replace - Replaces an existing document in a real-time table or inserts it if it doesn't exist
- /pq/tbl_name/doc/N?refresh=1 - Replaces a PQ rule in a percolate table
- /delete - Removes a document from a table
- /bulk - Executes multiple insert, update, or delete operations in a single call. Learn more about bulk inserts here.
- /search - Performs a search
- /search -> knn - Performs a KNN vector search
- /pq/tbl_name/search - Performs a reverse search in a percolate table
- /tbl_name/_mapping - Creates a table schema in the Elasticsearch style
- access_plain_attrs
- access_blob_attrs
- access_doclists
- access_hitlists
- access_dict
- attr_update_reserve
- bigram_freq_words
- bigram_index
- blend_chars
- blend_mode
- charset_table
- dict
- docstore_block_size
- docstore_compression
- docstore_compression_level
- embedded_limit
- exceptions
- expand_keywords
- global_idf
- hitless_words
- html_index_attrs
- html_remove_elements
- html_strip
- ignore_chars
- index_exact_words
- index_field_lengths
- index_sp
- index_token_filter
- index_zones
- infix_fields
- inplace_enable
- inplace_hit_gap
- inplace_reloc_factor
- inplace_write_factor
- killlist_target
- max_substring_len
- min_infix_len
- min_prefix_len
- min_stemming_len
- min_word_len
- morphology
- morphology_skip_fields
- ngram_chars
- ngram_len
- overshort_step
- path
- phrase_boundary
- phrase_boundary_step
- prefix_fields
- preopen
- read_buffer_docs
- read_buffer_hits
- regexp_filter
- stopwords
- stopword_step
- stopwords_unstemmed
- type
- wordforms
- local
- agent
- agent_connect_timeout
- agent_blackhole
- agent_persistent
- agent_query_timeout
- agent_retry_count
- ha_strategy
- mirror_retry_count
- rt_attr_bigint
- rt_attr_bool
- rt_attr_float
- rt_attr_float_vector
- rt_attr_json
- rt_attr_multi_64
- rt_attr_multi
- rt_attr_string
- rt_attr_timestamp
- rt_attr_uint
- rt_field
- rt_mem_limit
- OR
- MAYBE
- NOT - NOT operator
- @field - field search operator
- @field[N] - field position limit modifier
- @(field1,field2,...) - multi-field search operator
- @!field - ignore field search operator
- @!(field1,field2,...) - ignore multi-field search operator
- @* - all-field search operator
- "word1 word2 ... " - phrase search operator
- "word1 word2 ... "~N - proximity search operator
- "word1 word2 ... "/N - quorum matching operator
- word1 << word2 << word3 - strict order operator
- =word1 - exact form modifier
- ^word1 - field-start modifier
- word2$ - field-end modifier
- word^N - keyword IDF boost modifier
- word1 NEAR/N word2 - NEAR, generalized proximity operator
- word1 NOTNEAR/N word2 - NOTNEAR, negative assertion operator
- word1 PARAGRAPH word2 PARAGRAPH "word3 word4" - PARAGRAPH operator
- word1 SENTENCE word2 SENTENCE "word3 word4" - SENTENCE operator
- ZONE:(h3,h4) - ZONE limit operator
- ZONESPAN:(h2) - ZONESPAN limit operator
- @@relaxed - suppresses errors about missing fields
- t?st - wildcard operators
- REGEX(/pattern/) - REGEX operator
- ABS() - Returns absolute value
- ATAN2() - Returns arctangent function of two arguments
- BITDOT() - Returns sum of products of each bit of a mask multiplied with its weight
- CEIL() - Returns the smallest integer value greater than or equal to the argument
- COS() - Returns cosine of the argument
- CRC32() - Returns CRC32 value of the argument
- EXP() - Returns exponent of the argument
- FIBONACCI() - Returns the N-th Fibonacci number, where N is the integer argument
- FLOOR() - Returns the largest integer value less than or equal to the argument
- GREATEST() - Takes JSON/MVA array as the argument and returns the greatest value in that array
- IDIV() - Returns result of an integer division of the first argument by the second argument
- LEAST() - Takes JSON/MVA array as the argument, and returns the least value in that array
- LN() - Returns natural logarithm of the argument
- LOG10() - Returns common logarithm of the argument
- LOG2() - Returns binary logarithm of the argument
- MAX() - Returns the larger of two arguments
- MIN() - Returns the smaller of two arguments
- POW() - Returns the first argument raised to the power of the second argument
- RAND() - Returns random float between 0 and 1
- SIN() - Returns sine of the argument
- SQRT() - Returns square root of the argument
- BM25F() - Returns precise BM25F formula value
- EXIST() - Replaces non-existing columns with default values
- GROUP_CONCAT() - Produces a comma-separated list of the attribute values of all documents in the group
- HIGHLIGHT() - Highlights search results
- MIN_TOP_SORTVAL() - Returns sort key value of the worst found element in the current top-N matches
- MIN_TOP_WEIGHT() - Returns weight of the worst found element in the current top-N matches
- PACKEDFACTORS() - Outputs weighting factors
- REMOVE_REPEATS() - Removes repeated adjacent rows with the same 'column' value
- WEIGHT() - Returns fulltext match score
- ZONESPANLIST() - Returns pairs of matched zone spans
- QUERY() - Returns current full-text query
- BIGINT() - Forcibly promotes the integer argument to 64-bit type
- DOUBLE() - Forcibly promotes given argument to floating point type
- INTEGER() - Forcibly promotes given argument to 64-bit signed type
- TO_STRING() - Forcibly promotes the argument to string type
- UINT() - Converts the given argument to 32-bit unsigned integer type
- UINT64() - Converts the given argument to 64-bit unsigned integer type
- SINT() - Interprets 32-bit unsigned integer as signed 64-bit integer
- ALL() - Returns 1 if condition is true for all elements in the array
- ANY() - Returns 1 if condition is true for any element in the array
- CONTAINS() - Checks whether the (x,y) point is within the given polygon
- IF() - Checks whether the 1st argument is equal to 0.0, returns the 2nd argument if it is not zero or the 3rd one when it is
- IN() - Returns 1 if the first argument is equal to any of the other arguments, or 0 otherwise
- INDEXOF() - Iterates through all elements in the array and returns index of the first matching element
- INTERVAL() - Returns index of the argument that is less than the first argument
- LENGTH() - Returns number of elements in MVA
- REMAP() - Allows making exceptions to expression values depending on the condition values
- NOW() - Returns current timestamp as an INTEGER
- CURTIME() - Returns current time in local timezone
- CURDATE() - Returns current date in local timezone
- UTC_TIME() - Returns current time in UTC timezone
- UTC_TIMESTAMP() - Returns current date/time in UTC timezone
- SECOND() - Returns integer second from the timestamp argument
- MINUTE() - Returns integer minute from the timestamp argument
- HOUR() - Returns integer hour from the timestamp argument
- DAY() - Returns integer day from the timestamp argument
- MONTH() - Returns integer month from the timestamp argument
- QUARTER() - Returns the integer quarter of the year from a timestamp argument
- YEAR() - Returns integer year from the timestamp argument
- DAYNAME() - Returns the weekday name for a given timestamp argument
- MONTHNAME() - Returns the name of the month for a given timestamp argument
- DAYOFWEEK() - Returns the integer weekday index for a given timestamp argument
- DAYOFYEAR() - Returns the integer day of the year for a given timestamp argument
- YEARWEEK() - Returns the integer year and the day code of the first day of current week for a given timestamp argument
- YEARMONTH() - Returns integer year and month code from the timestamp argument
- YEARMONTHDAY() - Returns integer year, month and day code from the timestamp argument
- TIMEDIFF() - Returns the difference between the timestamps
- DATEDIFF() - Returns the number of days between two given timestamps
- DATE() - Formats the date part from a timestamp argument
- TIME() - Formats the time part from a timestamp argument
- DATE_FORMAT() - Returns a formatted string based on the provided date and format arguments
- GEODIST() - Computes geosphere distance between two given points
- GEOPOLY2D() - Creates a polygon that takes into account the Earth's curvature
- POLY2D() - Creates a simple polygon in plain space
- CONCAT() - Concatenates two or more strings
- REGEX() - Returns 1 if regular expression matched to string of attribute and 0 otherwise
- SNIPPET() - Highlights search results
- SUBSTRING_INDEX() - Returns the substring of the string before the specified number of delimiter occurrences
- CONNECTION_ID() - Returns the current connection ID
- KNN_DIST() - Returns KNN vector search distance
- LAST_INSERT_ID() - Returns ids of documents inserted or replaced by the last statement in the current session
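The integer "codes" returned by YEARMONTH() and YEARMONTHDAY() are easiest to see with a small sketch. The Python below is not Manticore code: it assumes the codes are the decimal packings YYYYMM and YYYYMMDD, and it assumes UTC, whereas the server applies its own timezone settings. The function names are hypothetical helpers for illustration only.

```python
from datetime import datetime, timezone


def yearmonth(ts: int) -> int:
    """Pack year and month into one integer of the form YYYYMM."""
    d = datetime.fromtimestamp(ts, tz=timezone.utc)  # assumption: UTC
    return d.year * 100 + d.month


def yearmonthday(ts: int) -> int:
    """Pack year, month and day into one integer of the form YYYYMMDD."""
    d = datetime.fromtimestamp(ts, tz=timezone.utc)  # assumption: UTC
    return d.year * 10000 + d.month * 100 + d.day


# Build a Unix timestamp for 2 August 2024, 00:00 UTC, and pack it both ways.
ts = int(datetime(2024, 8, 2, tzinfo=timezone.utc).timestamp())
print(yearmonth(ts), yearmonthday(ts))
```

Because the packed values sort in calendar order, codes of this shape are convenient for grouping or range filtering by month or day.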
To be put in the common {} section of the configuration file:
- lemmatizer_base - Lemmatizer dictionaries base path
- progressive_merge - Defines order of merging disk chunks in a real-time table
- json_autoconv_keynames - Whether and how to auto-convert key names within JSON attributes
- json_autoconv_numbers - Automatically detects and converts possible JSON strings that represent numbers into numeric attributes
- on_json_attr_error - What to do if JSON format errors are found
- plugin_dir - Location for the dynamic libraries and UDFs
indexer is a tool to create plain tables.
To be put in the indexer {} section of the configuration file:
- lemmatizer_cache - Lemmatizer cache size
- max_file_field_buffer - Maximum file field adaptive buffer size
- max_iops - Maximum indexation I/O operations per second
- max_iosize - Maximum allowed I/O operation size
- max_xmlpipe2_field - Maximum allowed field size for XMLpipe2 source type
- mem_limit - Indexing RAM usage limit
- on_file_field_error - How to handle IO errors in file fields
- write_buffer - Write buffer size
- ignore_non_plain - To ignore warnings about non-plain tables
indexer [OPTIONS] [indexname1 [indexname2 [...]]]
- --all - Rebuilds all tables from the config
- --buildstops - Analyzes the table source as if indexing the data, generating a list of indexed terms
- --buildfreqs - Adds the frequency count to the table for --buildstops
- --config, -c - Specifies the path to the configuration file
- --dump-rows - Dumps rows retrieved by SQL source(s) into the specified file
- --help - Displays all available parameters
- --keep-attrs - Allows reuse of existing attributes when reindexing
- --keep-attrs-names - Specifies which attributes to reuse from the existing table
- --merge-dst-range - Applies the given filter range during merging
- --merge-killlists - Alters kill list processing when merging tables
- --merge - Combines two plain tables into one
- --nohup - Prevents indexer from sending SIGHUP when this option is enabled
- --noprogress - Hides progress details
- --print-queries - Outputs SQL queries sent by the indexer to the database
- --print-rt - Displays data fetched from SQL source(s) as INSERTs into a real-time table
- --quiet - Suppresses all output
- --rotate - Initiates table rotation after all tables are built
- --sighup-each - Triggers rotation of each table after it's built
- -v - Displays indexer version
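The synopsis `indexer [OPTIONS] [indexname1 [indexname2 [...]]]` can be illustrated by assembling a command line from a few of the options listed above. This is only a sketch: `build_indexer_cmd` is a hypothetical helper, the configuration path and the table name `products` are placeholders, and nothing is executed here.

```python
import shlex
from typing import List, Optional


def build_indexer_cmd(config: str,
                      tables: Optional[List[str]] = None,
                      rotate: bool = False,
                      quiet: bool = False) -> List[str]:
    """Return an argv list for indexer; options first, table names last."""
    cmd = ["indexer", "--config", config]
    if rotate:
        cmd.append("--rotate")   # rotate after all tables are built
    if quiet:
        cmd.append("--quiet")    # suppress output
    if tables:
        cmd.extend(tables)       # rebuild only the named tables
    else:
        cmd.append("--all")      # no names given: rebuild every table
    return cmd


# Placeholder path and table name, shown as a shell-safe string.
cmd = build_indexer_cmd("/etc/manticoresearch/manticore.conf",
                        ["products"], rotate=True)
print(shlex.join(cmd))
```

Keeping the command as a list rather than a string means it can be passed to `subprocess.run` without shell quoting concerns.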
index_converter is a tool designed to convert tables created with Sphinx/Manticore Search 2.x into the Manticore Search 3.x table format.
index_converter {--config /path/to/config|--path}
- --config, -c - Path to table configuration file
- --index - Specifies which table to convert
- --path - Sets path containing table(s) instead of the configuration file
- --strip-path - Removes path from filenames referenced by table
- --large-docid - Allows conversion of documents with ids larger than 2^63
- --output-dir - Writes new files in a specified folder
- --all - Converts all tables from the configuration file / path
- --killlist-target - Sets target tables for applying kill-lists
searchd is the Manticore server.
To be put in the searchd {} section of the configuration file:
- access_blob_attrs - Defines how table's blob attributes file is accessed
- access_doclists - Defines how table's doclists file is accessed
- access_hitlists - Defines how table's hitlists file is accessed
- access_plain_attrs - Defines how search server accesses table's plain attributes
- access_dict - Defines how table's dictionary file is accessed
- agent_connect_timeout - Remote agent connection timeout
- agent_query_timeout - Remote agent query timeout
- agent_retry_count - Specifies the number of times Manticore tries to connect and query remote agents
- agent_retry_delay - Specifies the delay before retrying to query a remote agent in case of failure
- attr_flush_period - Sets the time period between flushing updated attributes to disk
- binlog_flush - Binary log transaction flush/sync mode
- binlog_max_log_size - Maximum binary log file size
- binlog_path - Binary log files path
- client_timeout - Maximum time to wait between requests when using persistent connections
- collation_libc_locale - Server libc locale
- collation_server - Default server collation
- data_dir - Path to data directory where Manticore stores everything (RT mode)
- docstore_cache_size - Maximum size of document blocks from document storage held in memory
- expansion_limit - Maximum number of expanded keywords for a single wildcard
- grouping_in_utc - Enables using UTC timezone for grouping time fields
- ha_period_karma - Agent mirror statistics window size
- ha_ping_interval - Interval between agent mirror pings
- hostname_lookup - Hostnames renew strategy
- jobs_queue_size - Defines the maximum number of "jobs" allowed in the queue simultaneously
- listen - Specifies IP address and port or Unix-domain socket path for searchd to listen on
- listen_backlog - TCP listen backlog
- listen_tfo - Enables TCP_FASTOPEN flag for all listeners
- log - Path to Manticore server log file
- max_batch_queries - Limits the number of queries per batch
- max_connections - Maximum number of active connections
- max_filters - Maximum allowed per-query filter count
- max_filter_values - Maximum allowed per-filter values count
- max_open_files - Maximum number of files allowed to be opened by server
- max_packet_size - Maximum allowed network packet size
- mysql_version_string - Server version string returned via MySQL protocol
- net_throttle_accept - Defines how many clients are accepted on each iteration of the network loop
- net_throttle_action - Defines how many requests are processed on each iteration of the network loop
- net_wait_tm - Controls busy loop interval of a network thread
- net_workers - Number of network threads
- network_timeout - Network timeout for client requests
- node_address - Specifies network address of the node
- persistent_connections_limit - Maximum number of simultaneous persistent connections to remote persistent agents
- pid_file - Path to Manticore server pid file
- predicted_time_costs - Costs for the query time prediction model
- preopen_tables - Determines whether to forcibly preopen all tables on startup
- pseudo_sharding - Enables pseudo-sharding for search queries to plain and real-time tables
- qcache_max_bytes - Maximum RAM allocated for cached result sets
- qcache_thresh_msec - Minimum wall time threshold for a query result to be cached
- qcache_ttl_sec - Expiration period for a cached result set
- query_log - Path to query log file
- query_log_format - Query log format
- query_log_min_msec - Prevents queries that complete faster than this threshold from being logged
- query_log_mode - Query log file permissions mode
- read_buffer_docs - Per-keyword read buffer size for document lists
- read_buffer_hits - Per-keyword read buffer size for hit lists
- read_unhinted - Unhinted read size
- rt_flush_period - How often Manticore flushes real-time tables' RAM chunks to disk
- rt_merge_iops - Maximum number of I/O operations (per second) that real-time chunks merging thread is allowed to do
- rt_merge_maxiosize - Maximum size of an I/O operation that real-time chunks merging thread is allowed to do
- seamless_rotate - Prevents searchd stalls while rotating tables with huge amounts of data to precache
- secondary_indexes - Enables using secondary indexes for search queries
- server_id - Server identifier used as a seed to generate a unique document ID
- shutdown_timeout - Searchd --stopwait timeout
- shutdown_token - SHA1 hash of the password required to invoke shutdown command from VIP SQL connection
- snippets_file_prefix - Prefix to prepend to the local file names when generating snippets in load_files mode
- sphinxql_state - Path to file where the current SQL state will be serialized
- sphinxql_timeout - Maximum time to wait between requests from a MySQL client
- ssl_ca - Path to SSL Certificate Authority certificate file
- ssl_cert - Path to server's SSL certificate
- ssl_key - Path to SSL certificate key of the server
- subtree_docs_cache - Maximum common subtree document cache size
- subtree_hits_cache - Maximum common subtree hit cache size, per-query
- timezone - Timezone used by date/time-related functions
- thread_stack - Maximum stack size for a job
- unlink_old - Whether to unlink .old table copies on successful rotation
- watchdog - Whether to enable or disable Manticore server watchdog
searchd [OPTIONS]
- --config, -c - Specifies the path to the configuration file
- --console - Forces the server to run in console mode
- --coredump - Enables core dump saving upon crash
- --cpustats - Enables CPU time reporting
- --delete - Removes the Manticore service from Microsoft Management Console and other locations where services are registered
- --force-preread - Prevents the server from serving incoming connections until table files are pre-read
- --help, -h - Displays all available parameters
- --table (--index) - Restricts the server to serve only the specified table
- --install - Installs searchd as a service in Microsoft Management Console
- --iostats - Enables input/output reporting
- --listen, -l - Overrides listen from the configuration file
- --logdebug, --logdebugv, --logdebugvv - Enables additional debug output in the server log
- --logreplication - Enables extra replication debug output in the server log
- --new-cluster - Initializes a replication cluster and sets the server as a reference node with cluster restart protection
- --new-cluster-force - Initializes a replication cluster and sets the server as a reference node, bypassing cluster restart protection
- --nodetach - Keeps searchd running in the foreground
- --ntservice - Used by Microsoft Management Console to launch searchd as a service on Windows platforms
- --pidfile - Overrides pid_file in the configuration file
- --port, -p - Specifies the port searchd should listen on, ignoring the port specified in the configuration file
- --replay-flags - Sets additional binary log replay options
- --servicename - Assigns the given name to searchd when installing or deleting the service, as displayed in Microsoft Management Console
- --status - Queries the running search service to return its status
- --stop - Stops the Manticore server
- --stopwait - Stops the Manticore server gracefully
- --strip-path - Removes path names from all file names referenced in the table
- -v - Displays version information
- MANTICORE_TRACK_DAEMON_SHUTDOWN - Enables detailed logging during searchd shutdown
Assorted table maintenance features helpful for troubleshooting.
indextool <command> [options]
Utilized for dumping various debug information related to the physical table.
- --config, -c - Specifies the path to the configuration file
- --quiet, -q - Keeps indextool quiet; no banner output, etc.
- --help, -h - Lists all available parameters
- -v - Displays version information
- --checkconfig - Verifies the configuration file
- --buildidf - Builds an IDF file from one or more dictionary dumps
- --build-infixes - Builds infixes for an existing dict=keywords table
- --dumpheader - Quickly dumps the provided table header file
- --dumpconfig - Dumps table definition from the given table header file in a nearly compliant manticore.conf format
- --dumpheader - Dumps table header by table name while looking up the header path in the configuration file
- --dumpdict - Dumps the table dictionary
- --dumpdocids - Dumps document IDs by table name
- --dumphitlist - Dumps all occurrences of the given keyword/id in the specified table
- --docextract - Runs table check pass on entire dictionary/docs/hits and collects all words and hits belonging to the requested document
- --fold - Tests tokenization based on table settings
- --htmlstrip - Filters STDIN using HTML stripper settings for the specified table
- --mergeidf - Merges multiple .idf files into a single file
- --morph - Applies morphology to the provided STDIN and outputs the result to stdout
- --check - Checks table data files for consistency
- --check-id-dups - Checks for duplicate IDs
- --check-disk-chunk - Checks a single disk chunk of an RT table
- --strip-path - Removes path names from all file names referenced in the table
- --rotate - Determines whether to check a table waiting for rotation when using --check
- --apply-killlists - Applies kill-lists for all tables listed in the configuration file
Splits compound words into their components.
wordbreaker [-dict path/to/dictionary_file] {split|test|bench}
- STDIN - Accepts a string to break into parts
- -dict - Specifies the dictionary file to use
- split|test|bench - Specifies the command
Extracts the contents of a dictionary file using ispell or MySpell format.
spelldump [options] <dictionary> <affix> [result] [locale-name]
- dictionary - Main dictionary file
- affix - Affix file for the dictionary
- result - Specifies the output destination for the dictionary data
- locale-name - Specifies the locale details to use
A comprehensive alphabetical list of keywords currently reserved in Manticore SQL syntax (thus, they cannot be used as identifiers).
AND, AS, BY, COLUMNARSCAN, DATE_ADD, DATE_SUB, DAY, DISTINCT, DIV, DOCIDINDEX, EXPLAIN, FACET, FALSE, FORCE, FROM, HOUR, IGNORE, IN, INDEXES, INNER, INTERVAL, IS, JOIN, KNN, LEFT, LIMIT, MINUTE, MOD, MONTH, NOT, NO_COLUMNARSCAN, NO_DOCIDINDEX, NO_SECONDARYINDEX, NULL, OFFSET, ON, OR, ORDER, QUARTER, REGEX, RELOAD, SECOND, SECONDARYINDEX, SELECT, SYSFILTERS, TRUE, WEEK, YEAR
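Since these words cannot be used as identifiers, it may help to check candidate column or table names before generating a schema. The sketch below copies the list above into a set; `is_reserved` is a hypothetical helper, and the case-insensitive comparison is an assumption based on usual SQL keyword handling.

```python
# Reserved keywords copied from the list above.
RESERVED = frozenset("""
AND AS BY COLUMNARSCAN DATE_ADD DATE_SUB DAY DISTINCT DIV DOCIDINDEX
EXPLAIN FACET FALSE FORCE FROM HOUR IGNORE IN INTERVAL INDEXES INNER IS
JOIN KNN LEFT LIMIT MINUTE MOD MONTH NOT NO_COLUMNARSCAN NO_DOCIDINDEX
NO_SECONDARYINDEX NULL OFFSET ON OR ORDER QUARTER REGEX RELOAD SECOND
SECONDARYINDEX SELECT SYSFILTERS TRUE WEEK YEAR
""".split())


def is_reserved(name: str) -> bool:
    """Return True if name collides with a reserved keyword.

    Assumption: keywords are matched case-insensitively.
    """
    return name.upper() in RESERVED


# Flag the proposed names that would clash with the keyword list.
proposed = ["title", "year", "price", "order"]
clashes = [n for n in proposed if is_reserved(n)]
print(clashes)
```

A check like this only covers the words listed here; the list may differ between Manticore versions.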
- 2.4.1
- 2.5.1
- 2.6.0
- 2.6.1
- 2.6.2
- 2.6.3
- 2.6.4
- 2.7.0
- 2.7.1
- 2.7.2
- 2.7.3
- 2.7.4
- 2.7.5
- 2.8.0
- 2.8.1
- 2.8.2
- 3.0.0
- 3.0.2
- 3.1.0
- 3.1.2
- 3.2.0
- 3.2.2
- 3.3.0
- 3.4.0
- 3.4.2
- 3.5.0
- 3.5.2
- 3.5.4
- 4.0.2
- 4.2.0
- 5.0.2. Installation page
- 6.0.0. Installation page
- 6.0.2. Installation page
- 6.0.4. Installation page
- 6.2.0. Installation page
- 6.2.12. Installation page
- 6.3.0. Installation page
- 6.3.2. Installation page
- 6.3.4. Installation page