The lemmatizer_base is an optional configuration directive that specifies the base path for lemmatizer dictionaries. The default path is /usr/share/manticore.
The lemmatizer implementation in Manticore Search (see Morphology to learn what lemmatizers are) is dictionary-driven and requires specific dictionary files for different languages. These files can be downloaded from the Manticore website (https://manticoresearch.com/install/#other-downloads).
Example:
lemmatizer_base = /usr/share/manticore/
The progressive_merge is a configuration directive that, when enabled, merges real-time table disk chunks from smaller to larger ones. This approach speeds up the merging process and reduces read/write amplification. By default, this setting is enabled. If disabled, the chunks are merged in the order they were created.
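To see why merging from smaller to larger chunks can reduce write amplification, consider the toy cost model below. It is only an illustration, not Manticore's actual merge algorithm: it assumes chunks are merged pairwise until one remains, and that each merge writes as many bytes as the combined size of its inputs. The function names and chunk sizes are hypothetical.

```python
import heapq

def cost_creation_order(sizes):
    """Total bytes written when chunks are merged in the order they were created."""
    total = 0
    merged = sizes[0]
    for size in sizes[1:]:
        merged += size   # merge the running result with the next chunk
        total += merged  # the whole merged chunk is rewritten
    return total

def cost_smallest_first(sizes):
    """Total bytes written when the two smallest chunks are always merged first."""
    heap = list(sizes)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total

chunks = [100, 5, 3, 2]  # hypothetical chunk sizes, oldest first
print(cost_creation_order(chunks))  # 323
print(cost_smallest_first(chunks))  # 125
```

In this model, merging in creation order rewrites the large chunk at every step, while merging the smallest chunks first touches it only once.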
The json_autoconv_keynames is an optional configuration directive that determines if and how to auto-convert key names within JSON attributes. The known value is 'lowercase'. By default, this setting is unspecified (meaning no conversion occurs).
When set to lowercase, key names within JSON attributes will be automatically converted to lowercase during indexing. This conversion applies to JSON attributes from all data sources, including SQL and XMLpipe2.
Example:
json_autoconv_keynames = lowercase
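The effect of the lowercase setting can be sketched in Python. This is an illustration of the described behavior, not Manticore's implementation; the helper name lowercase_keys is hypothetical, and the sketch assumes that keys in nested objects are converted as well.

```python
import json

def lowercase_keys(value):
    """Return a copy of a parsed JSON value with all object keys lowercased."""
    if isinstance(value, dict):
        return {key.lower(): lowercase_keys(item) for key, item in value.items()}
    if isinstance(value, list):
        return [lowercase_keys(item) for item in value]
    return value  # strings, numbers, booleans and null are left untouched

doc = json.loads('{"Title": "Hello", "Meta": {"PageCount": 3}}')
print(json.dumps(lowercase_keys(doc)))
# {"title": "Hello", "meta": {"pagecount": 3}}
```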
The json_autoconv_numbers is an optional configuration directive that determines whether to automatically detect and convert JSON strings that represent numbers into numeric attributes. The default value is 0 (do not convert strings into numbers).
When this option is set to 1, values such as "1234" will be indexed as numbers instead of strings. If the option is set to 0, such values will be indexed as strings. This conversion applies to JSON attributes from all data sources, including SQL and XMLpipe2.
Example:
json_autoconv_numbers = 1
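The following Python sketch illustrates the difference between the two settings. It is not Manticore's implementation: the helper name convert_numbers is hypothetical, and the rule for what counts as "a string that represents a number" (plain decimal integers and decimals with a fractional part) is an assumption made for this example.

```python
import json
import re

# Assumed definition of a numeric string for this sketch only.
INT_RE = re.compile(r"-?\d+")
FLOAT_RE = re.compile(r"-?\d+\.\d+")

def convert_numbers(value):
    """Return a copy of a parsed JSON value with numeric strings turned into numbers."""
    if isinstance(value, dict):
        return {k: convert_numbers(v) for k, v in value.items()}
    if isinstance(value, list):
        return [convert_numbers(v) for v in value]
    if isinstance(value, str):
        if INT_RE.fullmatch(value):
            return int(value)
        if FLOAT_RE.fullmatch(value):
            return float(value)
    return value  # non-numeric strings and other types are kept as-is

doc = json.loads('{"price": "1234", "ratio": "0.5", "sku": "12ab"}')
print(convert_numbers(doc))
# {'price': 1234, 'ratio': 0.5, 'sku': '12ab'}
```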
on_json_attr_error is an optional configuration directive that specifies the action to take if JSON format errors are found. The default value is ignore_attr (ignore errors). This setting applies only to sql_attr_json attributes.
By default, JSON format errors are ignored (ignore_attr), and the indexer tool will show a warning. Setting this option to fail_index will cause indexing to fail at the first JSON format error.
Example:
on_json_attr_error = ignore_attr
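The two policies can be sketched in Python as follows. This only models the behavior described above and is not the indexer's code: the function index_json_attrs and the sample rows are hypothetical, and storing None for an ignored attribute is an assumption about what "ignoring" means.

```python
import json
import warnings

def index_json_attrs(rows, on_json_attr_error="ignore_attr"):
    """Parse raw JSON strings, handling malformed ones per the chosen policy."""
    parsed = []
    for row_id, raw in rows:
        try:
            parsed.append((row_id, json.loads(raw)))
        except json.JSONDecodeError as err:
            if on_json_attr_error == "fail_index":
                # Stop at the first malformed document.
                raise RuntimeError(f"row {row_id}: invalid JSON: {err}") from err
            # ignore_attr: warn, keep the row, leave the attribute empty (assumed).
            warnings.warn(f"row {row_id}: invalid JSON ignored: {err}")
            parsed.append((row_id, None))
    return parsed

rows = [(1, '{"a": 1}'), (2, '{"a": ')]  # the second document is truncated
print(index_json_attrs(rows))  # [(1, {'a': 1}), (2, None)]
```

Calling index_json_attrs(rows, "fail_index") with the same input raises an error at row 2 instead of continuing.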
The plugin_dir is an optional configuration directive that specifies the trusted location for dynamic libraries (UDFs). The default path is /usr/local/lib/manticore/.
This directive sets the trusted directory from which the UDF libraries can be loaded.
Example:
plugin_dir = /usr/local/lib/manticore/
Manticore Search supports the use of special suffixes to simplify numeric values with specific meanings. These suffixes are categorized into size suffixes and time suffixes. The common format for suffixes is an integer followed by a literal, such as 10k or 100d. Literals are case-insensitive, so 10W and 10w are considered the same.
- Size suffixes: These suffixes can be used in settings that define the size of something, such as memory buffer, disk file size, or RAM limit. If no suffix is specified, the value is considered in bytes by default. The available size suffixes are:
  - k for kilobytes (1k = 1024 bytes)
  - m for megabytes (1m = 1024k)
  - g for gigabytes (1g = 1024m)
  - t for terabytes (1t = 1024g)
- Time suffixes: These suffixes can be used in settings that define time interval values, such as delays or timeouts. Unadorned values for these parameters usually have a documented scale, but instead of guessing, you can use an explicit suffix. The available time suffixes are:
  - us for microseconds
  - ms for milliseconds
  - s for seconds
  - m for minutes
  - h for hours
  - d for days
  - w for weeks
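The suffix rules above can be expressed as a small Python sketch. It is not Manticore's parser: the names parse_size and parse_time are hypothetical, and the choice to return time values in microseconds and to reject time values without a suffix (because their scale depends on the individual setting) is an assumption of this example.

```python
import re

# Size multipliers in bytes; a bare integer is taken as bytes.
SIZE_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}
# Time multipliers in microseconds.
TIME_UNITS = {
    "us": 1,
    "ms": 1_000,
    "s": 1_000_000,
    "m": 60 * 1_000_000,
    "h": 3_600 * 1_000_000,
    "d": 86_400 * 1_000_000,
    "w": 7 * 86_400 * 1_000_000,
}

def _parse(value, units):
    # Lowercasing first makes the literal case-insensitive (10W == 10w).
    match = re.fullmatch(r"(\d+)([a-z]*)", value.strip().lower())
    if not match or match.group(2) not in units:
        raise ValueError(f"unrecognized value: {value!r}")
    return int(match.group(1)) * units[match.group(2)]

def parse_size(value):
    """Return the size in bytes."""
    return _parse(value, SIZE_UNITS)

def parse_time(value):
    """Return the interval in microseconds; an explicit suffix is required."""
    return _parse(value, TIME_UNITS)

print(parse_size("10k"))                       # 10240
print(parse_size("2G"))                        # 2147483648
print(parse_time("5m"))                        # 300000000
print(parse_time("10W") == parse_time("10w"))  # True
```

Note that the literal m means megabytes in a size setting and minutes in a time setting, which is why the sketch keeps two separate tables.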
Manticore configuration supports shebang syntax, allowing the configuration to be written in a programming language and interpreted at loading. This enables dynamic settings, such as generating tables by querying a database table, modifying settings based on external factors, or including external files containing table and source declarations.
The configuration file is parsed by the declared interpreter, and the output is used as the actual configuration. This occurs each time the configuration is read, not only at searchd startup.
Note: This feature is not available on the Windows platform.
In the following example, PHP is used to create multiple tables with different names and to scan a specific folder for files containing extra table declarations:
#!/usr/bin/php
...
<?php for ($i=1; $i<=6; $i++) { ?>
table test_<?=$i?> {
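    type = rt
<?php /* PHP drops a newline that directly follows a closing tag, so emit one explicitly */ ?>
    path = /var/lib/manticore/data/test_<?=$i . "\n"?>
    rt_field = subject
    ...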
}
<?php } ?>
...
<?php
// Include every *.conf file found in the extra configuration folder.
$confd_folder = '/etc/manticore.conf.d/';
$files = scandir($confd_folder);
foreach ($files as $file) {
    // Skip the current and parent directory entries.
    if ($file == '.' || $file == '..') {
        continue;
    }
    $fp = new SplFileInfo($confd_folder . $file);
    if ('conf' == $fp->getExtension()) {
        include($confd_folder . $file);
    }
}
?>
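Since the configuration is simply the output of the declared interpreter, the same idea can be sketched in another language. The Python version below is a hedged illustration, not an official example: the interpreter path /usr/bin/python3 and the helper names render_tables and render_includes are assumptions, and only the table declarations from the PHP example are generated (the settings elided there are not filled in). Unlike PHP's include, this sketch copies the extra files verbatim instead of executing them.

```python
#!/usr/bin/python3
from pathlib import Path

def render_tables(count):
    """Return declarations for tables test_1 .. test_<count>."""
    blocks = []
    for i in range(1, count + 1):
        blocks.append(
            f"table test_{i} {{\n"
            f"    type = rt\n"
            f"    path = /var/lib/manticore/data/test_{i}\n"
            f"    rt_field = subject\n"
            f"}}\n"
        )
    return "\n".join(blocks)

def render_includes(folder):
    """Return the concatenated contents of every *.conf file in folder."""
    directory = Path(folder)
    if not directory.is_dir():
        return ""  # nothing to include if the folder is missing
    return "".join(p.read_text() for p in sorted(directory.glob("*.conf")))

if __name__ == "__main__":
    # Whatever is written to standard output becomes the configuration.
    print(render_tables(6))
    print(render_includes("/etc/manticore.conf.d/"))
```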