Manticore 的数据类型可以分为两类:全文字段和属性。
Manticore 中的字段名称必须遵循以下规则:
- 可以包含字母(a-z,A-Z)、数字(0-9)和连字符(-)
- 必须以字母开头
- 数字只能出现在字母之后
- 下划线(
_)是唯一允许的特殊字符
- 字段名称不区分大小写
例如:
- 有效的字段名称:
title,product_id,user_name_2
- 无效的字段名称:
2title,-price,user@name
全文字段:
- 可以用自然语言处理算法进行索引,因此可以搜索关键词
- 不能用于排序或分组
- 可以检索原始文档内容
- 可以使用原始文档内容进行高亮显示
全文字段由数据类型 text 表示。所有其他数据类型称为“属性”。
属性是与每个文档关联的非全文值,可用于在搜索期间执行非全文过滤、排序和分组。
通常希望不仅基于匹配的文档 ID 及其排名来处理全文搜索结果,还基于多个每个文档的值。例如,可能需要按日期然后按相关性对新闻搜索结果进行排序,或者在指定价格范围内搜索产品,或者将博客搜索限制为特定用户发布的帖子,或者按月份对结果进行分组。为了高效实现这些功能,Manticore 不仅支持全文字段,还允许为每个文档添加额外的属性。这些属性可用于过滤、排序或分组全文匹配,或仅通过属性进行搜索。
属性与全文字段不同,不进行全文索引。它们存储在表中,但不能像全文那样搜索。
属性的一个好例子是论坛帖子表。假设只有标题和内容字段需要全文搜索——但有时也需要限制搜索到某个作者或子论坛(即仅搜索具有特定 author_id 或 forum_id 值的行);或者按 post_date 列排序匹配结果;或者按 post_date 的月份对匹配帖子进行分组并计算每组的匹配计数。
CREATE TABLE forum(title text, content text, author_id int, forum_id int, post_date timestamp);
POST /cli -d "CREATE TABLE forum(title text, content text, author_id int, forum_id int, post_date timestamp)"
$index = new \Manticoresearch\Index($client);
$index->setName('forum');
$index->create([
'title'=>['type'=>'text'],
'content'=>['type'=>'text'],
'author_id'=>['type'=>'int'],
'forum_id'=>['type'=>'int'],
'post_date'=>['type'=>'timestamp']
]);
utilsApi.sql('CREATE TABLE forum(title text, content text, author_id int, forum_id int, post_date timestamp)')
await utilsApi.sql('CREATE TABLE forum(title text, content text, author_id int, forum_id int, post_date timestamp)')
res = await utilsApi.sql('CREATE TABLE forum(title text, content text, author_id int, forum_id int, post_date timestamp)');
utilsApi.sql("CREATE TABLE forum(title text, content text, author_id int, forum_id int, post_date timestamp)");
utilsApi.Sql("CREATE TABLE forum(title text, content text, author_id int, forum_id int, post_date timestamp)");
utils_api.sql("CREATE TABLE forum(title text, content text, author_id int, forum_id int, post_date timestamp)", Some(true)).await;
table forum
{
type = rt
path = forum
# when configuring fields via config, they are indexed (and not stored) by default
rt_field = title
rt_field = content
# this option needs to be specified for the field to be stored
stored_fields = title, content
rt_attr_uint = author_id
rt_attr_uint = forum_id
rt_attr_timestamp = post_date
}
此示例展示了运行一个通过 author_id、forum_id 过滤并按 post_date 排序的全文查询。
select * from forum where author_id=123 and forum_id in (1,3,7) order by post_date desc
POST /search
{
"table": "forum",
"query":
{
"match_all": {},
"bool":
{
"must":
[
{ "equals": { "author_id": 123 } },
{ "in": { "forum_id": [1,3,7] } }
]
}
},
"sort": [ { "post_date": "desc" } ]
}
$client->search([
'table' => 'forum',
'query' =>
[
'match_all' => [],
'bool' => [
'must' => [
'equals' => ['author_id' => 123],
'in' => [
'forum_id' => [
1,3,7
]
]
]
]
],
'sort' => [
['post_date' => 'desc']
]
]);
searchApi.search({"table":"forum","query":{"match_all":{},"bool":{"must":[{"equals":{"author_id":123}},{"in":{"forum_id":[1,3,7]}}]}},"sort":[{"post_date":"desc"}]})
await searchApi.search({"table":"forum","query":{"match_all":{},"bool":{"must":[{"equals":{"author_id":123}},{"in":{"forum_id":[1,3,7]}}]}},"sort":[{"post_date":"desc"}]})
res = await searchApi.search({"table":"forum","query":{"match_all":{},"bool":{"must":[{"equals":{"author_id":123}},{"in":{"forum_id":[1,3,7]}}]}},"sort":[{"post_date":"desc"}]});
HashMap<String,Object> filters = new HashMap<String,Object>(){{
put("must", new HashMap<String,Object>(){{
put("equals",new HashMap<String,Integer>(){{
put("author_id",123);
}});
put("in",
new HashMap<String,Object>(){{
put("forum_id",new int[] {1,3,7});
}});
}});
}};
Map<String,Object> query = new HashMap<String,Object>();
query.put("match_all",null);
query.put("bool",filters);
SearchRequest searchRequest = new SearchRequest();
searchRequest.setIndex("forum");
searchRequest.setQuery(query);
searchRequest.setSort(new ArrayList<Object>(){{
add(new HashMap<String,String>(){{ put("post_date","desc");}});
}});
SearchResponse searchResponse = searchApi.search(searchRequest);
object query = new { match_all=null };
var searchRequest = new SearchRequest("forum", query);
var boolFilter = new BoolFilter();
boolFilter.Must = new List<Object> {
new EqualsFilter("author_id", 123),
new InFilter("forum_id", new List<Object> {1,3,7})
};
searchRequest.AttrFilter = boolFilter;
searchRequest.Sort = new List<Object> { new SortOrder("post_date", SortOrder.OrderEnum.Desc) };
var searchResponse = searchApi.Search(searchRequest);
let query = SearchQuery::new();
let mut sort = HashMap::new();
sort.insert("post_date".to_string(), serde_json::json!("desc"));
let search_request = SearchRequest {
table: "forum".to_string(),
query: Some(Box::new(query)),
sort: serde_json::json!(sort)
..Default::default()
};
let search_res = search_api.search(search_req).await;
Manticore 支持两种类型的属性存储:
顾名思义,它们以不同方式存储数据。传统的行存储:
- 存储未压缩的属性
- 同一文档的所有属性存储在一行中,彼此相邻
- 行按顺序存储
- 访问属性基本上是通过将行 ID 乘以步幅(单个向量的长度),然后从计算出的内存位置获取请求的属性。这提供了非常低的随机访问延迟。
- 属性必须在内存中才能获得可接受的性能,否则由于行存储的特性,Manticore 可能需要从磁盘读取过多不必要的数据,这在许多情况下是次优的。
而列存储:
- 每个属性独立存储在其单独的“列”中,与其他属性无关
- 存储被分割成 65536 条目的块
- 块以压缩形式存储。这通常允许只存储少数不同的值,而不是像行存储那样存储所有值。高压缩比允许更快地从磁盘读取,并大大降低内存需求
- 索引数据时,每个块独立选择存储方案。例如,如果一个块中的所有值都相同,则采用“const”存储,整个块只存储一个值。如果每个块中唯一值少于 256 个,则采用“table”存储,存储指向值表的索引而非值本身
- 如果明确请求的值不在块中,则可以提前拒绝该块的搜索。
列存储设计用于处理无法全部装入内存的大量数据,因此建议:
- 如果你有足够的内存存放所有属性,使用行存储会更有利
- 否则,列存储仍能提供不错的性能,同时大幅降低内存占用,这将允许你在表中存储更多文档
传统的行存储是默认的,因此如果您希望所有内容都以行存储方式存储,创建表时无需做任何操作。
要启用列存储,您需要:
- 在 CREATE TABLE 中指定
engine='columnar',使表的所有属性都为列存储。然后,如果您想保持某个特定属性为行存储,需要在声明该属性时添加 engine='rowwise'。例如:create table tbl(title text, type int, price float engine='rowwise') engine='columnar'
- 在
CREATE TABLE 中为特定属性指定 engine='columnar',使其为列存储。例如:create table tbl(title text, type int, price float engine='columnar');
或
create table tbl(title text, type int, price float engine='columnar') engine='rowwise';
- 在 plain 模式 中,您需要在 columnar_attrs 中列出您希望为列存储的属性。
以下是 Manticore Search 支持的数据类型列表:
文档标识符是一个必需的属性,必须是唯一的 64 位无符号整数。创建表时可以显式指定文档 ID,但即使未指定,文档 ID 也始终启用。文档 ID 不能被更新。
创建表时,您可以显式指定 ID,但无论使用何种数据类型,它始终按上述方式行为——存储为无符号 64 位,但以有符号 64 位整数形式暴露。
mysql> CREATE TABLE tbl(id bigint, content text);
DESC tbl;
+---------+--------+----------------+
| Field | Type | Properties |
+---------+--------+----------------+
| id | bigint | |
| content | text | indexed stored |
+---------+--------+----------------+
2 rows in set (0.00 sec)
您也可以完全省略指定 ID,它会自动启用。
mysql> CREATE TABLE tbl(content text);
DESC tbl;
+---------+--------+----------------+
| Field | Type | Properties |
+---------+--------+----------------+
| id | bigint | |
| content | text | indexed stored |
+---------+--------+----------------+
2 rows in set (0.00 sec)
在处理文档 ID 时,重要的是要知道它们在内部存储为无符号 64 位整数,但根据接口不同,处理方式有所不同:
MySQL/SQL 接口:
- 大于 2^63-1 的 ID 会显示为负数。
- 过滤此类大 ID 时,必须使用它们的有符号表示。
- 使用 UINT64() 函数查看实际的无符号值。
JSON/HTTP 接口:
- ID 始终以其原始无符号值显示,无论大小。
- 过滤时可以使用有符号和无符号表示。
- 插入操作接受完整的无符号 64 位范围。
例如,创建一个表并插入一些接近 2^63 的值:
mysql> create table t(id_text string);
Query OK, 0 rows affected (0.01 sec)
mysql> insert into t values(9223372036854775807, '2 ^ 63 - 1'),(9223372036854775808, '2 ^ 63');
Query OK, 2 rows affected (0.00 sec)
某些 ID 在结果中显示为负数,因为它们超过了 2^63-1。然而,使用 UINT64(id) 可以显示它们的实际无符号值:
mysql> select *, uint64(id) from t;
+----------------------+------------+---------------------+
| id | id_text | uint64(id) |
+----------------------+------------+---------------------+
| 9223372036854775807 | 2 ^ 63 - 1 | 9223372036854775807 |
| -9223372036854775808 | 2 ^ 63 | 9223372036854775808 |
+----------------------+------------+---------------------+
2 rows in set (0.00 sec)
--- 2 out of 2 results in 0ms ---
查询 ID 小于 2^63 的文档时,可以直接使用无符号值:
mysql> select * from t where id = 9223372036854775807;
+---------------------+------------+
| id | id_text |
+---------------------+------------+
| 9223372036854775807 | 2 ^ 63 - 1 |
+---------------------+------------+
1 row in set (0.00 sec)
--- 1 out of 1 results in 0ms ---
但对于从 2^63 开始的 ID,您需要使用有符号值:
mysql> select * from t where id = -9223372036854775808;
+----------------------+---------+
| id | id_text |
+----------------------+---------+
| -9223372036854775808 | 2 ^ 63 |
+----------------------+---------+
1 row in set (0.00 sec)
--- 1 out of 1 results in 0ms ---
如果使用无符号值,则会出现错误:
mysql> select * from t where id = 9223372036854775808;
ERROR 1064 (42000): number 9223372036854775808 is out of range [-9223372036854775808..9223372036854775807]
超出 64 位范围的值会触发类似错误:
mysql> select * from t where id = -9223372036854775809;
ERROR 1064 (42000): number -9223372036854775809 is out of range [-9223372036854775808..9223372036854775807]
MySQL/SQL 和 JSON/HTTP 接口在处理非常大的文档 ID 时行为差异更明显。以下是一个完整示例:
MySQL/SQL 接口:
mysql> drop table if exists t; create table t; insert into t values(17581446260360033510);
Query OK, 0 rows affected (0.01 sec)
mysql> select * from t;
+---------------------+
| id |
+---------------------+
| -865297813349518106 |
+---------------------+
mysql> select *, uint64(id) from t;
+---------------------+----------------------+
| id | uint64(id) |
+---------------------+----------------------+
| -865297813349518106 | 17581446260360033510 |
+---------------------+----------------------+
mysql> select * from t where id = -865297813349518106;
+---------------------+
| id |
+---------------------+
| -865297813349518106 |
+---------------------+
mysql> select * from t where id = 17581446260360033510;
ERROR 1064 (42000): number 17581446260360033510 is out of range [-9223372036854775808..9223372036854775807]
JSON/HTTP 接口:
# Search returns the original unsigned value
curl -s 0:9308/search -d '{"table": "t"}'
{
"took": 0,
"timed_out": false,
"hits": {
"total": 1,
"total_relation": "eq",
"hits": [
{
"_id": 17581446260360033510,
"_score": 1,
"_source": {}
}
]
}
}
# Both signed and unsigned values work for filtering
curl -s 0:9308/search -d '{"table": "t", "query": {"equals": {"id": 17581446260360033510}}}'
curl -s 0:9308/search -d '{"table": "t", "query": {"equals": {"id": -865297813349518106}}}'
# Insert with maximum unsigned 64-bit value
curl -s 0:9308/insert -d '{"table": "t", "id": 18446744073709551615, "doc": {}}'
这意味着在处理大文档 ID 时:
- MySQL 接口 查询时需要使用有符号表示,但可以用
UINT64() 显示无符号值
- JSON 接口 始终使用无符号值显示,过滤时接受两种表示
通用语法:
string|text [stored|attribute] [indexed]
属性:
indexed - 全文索引(可用于全文查询)
stored - 存储在文档存储中(存储在磁盘上,不在内存中,延迟读取)
attribute - 使其成为字符串属性(可用于排序/分组)
指定至少一个属性会覆盖所有默认属性(见下文),即如果您决定使用自定义属性组合,需要列出所有想要的属性。
未指定属性时:
string 和 text 是别名,但如果不指定任何属性,它们默认含义不同:
- 仅
string 默认表示 attribute(详见下文)。
- 仅
text 默认表示 stored + indexed(详见下文)。
text(仅 text 或 text/string indexed)数据类型构成表的全文部分。文本字段被索引,可以搜索关键词。
文本通过分析器管道处理,将文本转换为单词,应用形态学变换等。最终,从该文本构建全文表(一种特殊数据结构,支持快速关键词搜索)。
全文字段只能用于 MATCH() 子句,不能用于排序或聚合。单词存储在倒排索引中,附带它们所属字段的引用和字段中的位置。这允许在每个字段内搜索单词,并使用高级操作符如邻近搜索。默认情况下,字段的原始文本既被索引又存储在文档存储中。这意味着原始文本可以随查询结果返回,并用于搜索结果高亮。
CREATE TABLE products(title text);
POST /cli -d "CREATE TABLE products(title text)"
$index = new \Manticoresearch\Index($client);
$index->setName('products');
$index->create([
'title'=>['type'=>'text']
]);
utilsApi.sql('CREATE TABLE products(title text)')
await utilsApi.sql('CREATE TABLE products(title text)')
res = await utilsApi.sql('CREATE TABLE products(title text)');
utilsApi.sql("CREATE TABLE products(title text)");
utilsApi.Sql("CREATE TABLE products(title text)");
utils_api.sql("CREATE TABLE products(title text)", Some(true)).await;
table products
{
type = rt
path = products
# when configuring fields via config, they are indexed (and not stored) by default
rt_field = title
# this option needs to be specified for the field to be stored
stored_fields = title
}
CREATE TABLE products(title text indexed);
POST /cli -d "CREATE TABLE products(title text indexed)"
$index = new \Manticoresearch\Index($client);
$index->setName('products');
$index->create([
'title'=>['type'=>'text','options'=>['indexed']]
]);
utilsApi.sql('CREATE TABLE products(title text indexed)')
await utilsApi.sql('CREATE TABLE products(title text indexed)')
res = await utilsApi.sql('CREATE TABLE products(title text indexed)');
utilsApi.sql("CREATE TABLE products(title text indexed)");
utilsApi.Sql("CREATE TABLE products(title text indexed)");
utils_api.sql("CREATE TABLE products(title text indexed)", Some(true)).await;
table products
{
type = rt
path = products
# when configuring fields via config, they are indexed (and not stored) by default
rt_field = title
}
字段是有名称的,您可以将搜索限制在单个字段(例如仅搜索“title”)或字段子集(例如仅“title”和“abstract”)中。您最多可以有256个全文字段。
select * from products where match('@title first');
POST /search
{
"table": "products",
"query":
{
"match": { "title": "first" }
}
}
$index->setName('products')->search('@title')->get();
searchApi.search({"table":"products","query":{"match":{"title":"first"}}})
await searchApi.search({"table":"products","query":{"match":{"title":"first"}}})
res = await searchApi.search({"table":"products","query":{"match":{"title":"first"}}});
utilsApi.sql("CREATE TABLE products(title text indexed)");
utilsApi.Sql("CREATE TABLE products(title text indexed)");
utils_api.sql("CREATE TABLE products(title text indexed)", Some(true)).await;
与全文字段不同,字符串属性(仅 string 或 string/text attribute)按接收的原样存储,不能用于全文搜索。相反,它们会在结果中返回,可以在 WHERE 子句中用于比较过滤或 REGEX,并且可以用于排序和聚合。通常,不建议在字符串属性中存储大量文本,而是将字符串属性用于元数据,如名称、标题、标签、键。
如果您还想索引字符串属性,可以同时指定为 string attribute indexed。这将允许全文搜索,并且作为属性工作。
CREATE TABLE products(title text, keys string);
POST /cli -d "CREATE TABLE products(title text, keys string)"
$index = new \Manticoresearch\Index($client);
$index->setName('products');
$index->create([
'title'=>['type'=>'text'],
'keys'=>['type'=>'string']
]);
utilsApi.sql('CREATE TABLE products(title text, keys string)')
await utilsApi.sql('CREATE TABLE products(title text, keys string)')
res = await utilsApi.sql('CREATE TABLE products(title text, keys string)');
utilsApi.sql("CREATE TABLE products(title text, keys string)");
utilsApi.Sql("CREATE TABLE products(title text, keys string)");
utils_api.sql("CREATE TABLE products(title text, keys string)", Some(true)).await;
table products
{
type = rt
path = products
rt_field = title
stored_fields = title
rt_attr_string = keys
}
Manticore 没有专门的二进制数据字段类型,但您可以通过使用 base64 编码和 text stored 或 string stored 字段类型(它们是同义词)安全地存储它。如果不对二进制数据进行编码,部分数据可能会丢失——例如,如果遇到空字节,Manticore 会截断字符串末尾。
下面是一个示例,我们使用 base64 编码 ls 命令,将其存储在 Manticore 中,然后解码以验证 MD5 校验和保持不变:
# md5sum /bin/ls
43d1b8a7ccda411118e2caba685f4329 /bin/ls
# encoded_data=`base64 -i /bin/ls `
# mysql -P9306 -h0 -e "drop table if exists test; create table test(data text stored); insert into test(data) values('$encoded_data')"
# mysql -P9306 -h0 -NB -e "select data from test" | base64 -d > /tmp/ls | md5sum
43d1b8a7ccda411118e2caba685f4329 -
CREATE TABLE products(title text, price int);
POST /cli -d "CREATE TABLE products(title text, price int)"
$index = new \Manticoresearch\Index($client);
$index->setName('products');
$index->create([
'title'=>['type'=>'text'],
'price'=>['type'=>'int']
]);
utilsApi.sql('CREATE TABLE products(title text, price int)')
await utilsApi.sql('CREATE TABLE products(title text, price int)')
res = await utilsApi.sql('CREATE TABLE products(title text, price int)');
utilsApi.sql("CREATE TABLE products(title text, price int)");
utilsApi.Sql("CREATE TABLE products(title text, price int)");
utils_api.sql("CREATE TABLE products(title text, price int)", Some(true)).await;
table products
{
type = rt
path = products
rt_field = title
stored_fields = title
rt_attr_uint = type
}
整数可以通过指定位数来存储为小于32位的大小。例如,如果我们想存储一个我们知道不会大于8的数值,则类型可以定义为 bit(3)。位数整数的性能比全尺寸整数慢,但它们需要更少的内存。它们以32位块保存,因此为了节省空间,应该将它们放在属性定义的末尾(否则位数整数夹在两个全尺寸整数之间时,也会占用32位)。
CREATE TABLE products(title text, flags bit(3), tags bit(2) );
POST /cli -d "CREATE TABLE products(title text, flags bit(3), tags bit(2))"
$index = new \Manticoresearch\Index($client);
$index->setName('products');
$index->create([
'title'=>['type'=>'text'],
'flags'=>['type'=>'bit(3)'],
'tags'=>['type'=>'bit(2)']
]);
utilsApi.sql('CREATE TABLE products(title text, flags bit(3), tags bit(2) ')
await utilsApi.sql('CREATE TABLE products(title text, flags bit(3), tags bit(2) ')
res = await utilsApi.sql('CREATE TABLE products(title text, flags bit(3), tags bit(2) ');
utilsApi.sql("CREATE TABLE products(title text, flags bit(3), tags bit(2)");
utilsApi.Sql("CREATE TABLE products(title text, flags bit(3), tags bit(2)");
utils_api.sql("CREATE TABLE products(title text, flags bit(3), tags bit(2)", Some(true)).await;
table products
{
type = rt
path = products
rt_field = title
stored_fields = title
rt_attr_uint = flags:3
rt_attr_uint = tags:2
}
CREATE TABLE products(title text, price bigint );
POST /cli -d "CREATE TABLE products(title text, price bigint)"
$index = new \Manticoresearch\Index($client);
$index->setName('products');
$index->create([
'title'=>['type'=>'text'],
'price'=>['type'=>'bigint']
]);
utilsApi.sql('CREATE TABLE products(title text, price bigint )')
await utilsApi.sql('CREATE TABLE products(title text, price bigint )')
res = await utilsApi.sql('CREATE TABLE products(title text, price bigint )');
utilsApi.sql("CREATE TABLE products(title text, price bigint )");
utilsApi.Sql("CREATE TABLE products(title text, price bigint )");
utils_api.sql("CREATE TABLE products(title text, price bigint )", Some(true)).await;
table products
{
type = rt
path = products
rt_field = title
stored_fields = title
rt_attr_bigint = type
}
CREATE TABLE products(title text, sold bool );
POST /cli -d "CREATE TABLE products(title text, sold bool)"
$index = new \Manticoresearch\Index($client);
$index->setName('products');
$index->create([
'title'=>['type'=>'text'],
'sold'=>['type'=>'bool']
]);
utilsApi.sql('CREATE TABLE products(title text, sold bool )')
await utilsApi.sql('CREATE TABLE products(title text, sold bool )')
res = await utilsApi.sql('CREATE TABLE products(title text, sold bool )');
utilsApi.sql("CREATE TABLE products(title text, sold bool )");
utilsApi.Sql("CREATE TABLE products(title text, sold bool )");
utils_api.sql("CREATE TABLE products(title text, sold bool )", Some(true)).await;
table products
{
type = rt
path = products
rt_field = title
stored_fields = title
rt_attr_bool = sold
}
时间戳类型表示Unix时间戳,存储为32位整数。与基本整数不同,时间戳类型允许使用时间和日期函数。字符串值的转换遵循以下规则:
- 没有分隔符且长度至少为10个字符的数字,按原样转换为时间戳。
%Y-%m-%dT%H:%M:%E*S%Z
%Y-%m-%d'T'%H:%M:%S%Z
%Y-%m-%dT%H:%M:%E*S
%Y-%m-%dT%H:%M:%s
%Y-%m-%dT%H:%M
%Y-%m-%dT%H
%Y-%m-%d
%Y-%m
%Y
这些转换说明符的含义详见strptime手册,其中 %E*S 表示毫秒。
注意,普通表中不支持时间戳的自动转换。
CREATE TABLE products(title text, date timestamp);
POST /cli -d "CREATE TABLE products(title text, date timestamp)"
$index = new \Manticoresearch\Index($client);
$index->setName('products');
$index->create([
'title'=>['type'=>'text'],
'date'=>['type'=>'timestamp']
]);
utilsApi.sql('CREATE TABLE products(title text, date timestamp)')
await utilsApi.sql('CREATE TABLE products(title text, date timestamp)')
res = await utilsApi.sql('CREATE TABLE products(title text, date timestamp)');
utilsApi.sql("CREATE TABLE products(title text, date timestamp)");
utilsApi.Sql("CREATE TABLE products(title text, date timestamp)");
utils_api.sql("CREATE TABLE products(title text, date timestamp)", Some(true)).await;
table products
{
type = rt
path = products
rt_field = title
stored_fields = title
rt_attr_timestamp = date
}
CREATE TABLE products(title text, coeff float);
POST /cli -d "CREATE TABLE products(title text, coeff float)"
$index = new \Manticoresearch\Index($client);
$index->setName('products');
$index->create([
'title'=>['type'=>'text'],
'coeff'=>['type'=>'float']
]);
utilsApi.sql('CREATE TABLE products(title text, coeff float)')
await utilsApi.sql('CREATE TABLE products(title text, coeff float)')
res = await utilsApi.sql('CREATE TABLE products(title text, coeff float)');
utilsApi.sql("CREATE TABLE products(title text, coeff float)");
utilsApi.Sql("CREATE TABLE products(title text, coeff float)");
utils_api.sql("CREATE TABLE products(title text, coeff float)", Some(true)).await;
table products
{
type = rt
path = products
rt_field = title
stored_fields = title
rt_attr_float = coeff
}
与整数类型不同,由于可能存在舍入误差,不建议直接比较两个浮点数是否相等。更可靠的方法是使用近似相等比较,通过检查绝对误差范围。
select abs(a-b)<=0.00001 from products
POST /search
{
"table": "products",
"query": { "match_all": {} } },
"expressions": { "eps": "abs(a-b)" }
}
$index->setName('products')->search('')->expression('eps','abs(a-b)')->get();
searchApi.search({"table":"products","query":{"match_all":{}},"expressions":{"eps":"abs(a-b)"}})
await searchApi.search({"table":"products","query":{"match_all":{}},"expressions":{"eps":"abs(a-b)"}})
res = await searchApi.search({"table":"products","query":{"match_all":{}},"expressions":{"eps":"abs(a-b)"}});
searchRequest = new SearchRequest();
searchRequest.setIndex("forum");
query = new HashMap<String,Object>();
query.put("match_all",null);
searchRequest.setQuery(query);
Object expressions = new HashMap<String,Object>(){{
put("ebs","abs(a-b)");
}};
searchRequest.setExpressions(expressions);
searchResponse = searchApi.search(searchRequest);
object query = new { match_all=null };
var searchRequest = new SearchRequest("forum", query);
searchRequest.Expressions = new List<Object>{
new Dictionary<string, string> { {"ebs", "abs(a-b)"} }
};
var searchResponse = searchApi.Search(searchRequest);
let query = SearchQuery::new();
let mut expr = HashMap::new();
expr.insert("ebs".to_string(), serde_json::json!("abs(a-b)"));
let expressions: [HashMap; 1] = [expr];
let search_request = SearchRequest {
table: "forum".to_string(),
query: Some(Box::new(query)),
expressions: serde_json::json!(expressions),
..Default::default(),
};
let search_res = search_api.search(search_req).await;
另一种替代方法,也可以用来执行 IN(attr,val1,val2,val3),是通过选择一个乘数因子,将浮点数转换为整数进行比较。下面的示例演示了如何修改 IN(attr,2.0,2.5,3.5) 以使用整数值。
select in(ceil(attr*100),200,250,350) from products
POST /search
{
"table": "products",
"query": { "match_all": {} } },
"expressions": { "inc": "in(ceil(attr*100),200,250,350)" }
}
$index->setName('products')->search('')->expression('inc','in(ceil(attr*100),200,250,350)')->get();
searchApi.search({"table":"products","query":{"match_all":{}}},"expressions":{"inc":"in(ceil(attr*100),200,250,350)"}})
await searchApi.search({"table":"products","query":{"match_all":{}}},"expressions":{"inc":"in(ceil(attr*100),200,250,350)"}})
res = await searchApi.search({"table":"products","query":{"match_all":{}}},"expressions":{"inc":"in(ceil(attr*100),200,250,350)"}});
searchRequest = new SearchRequest();
searchRequest.setIndex("forum");
query = new HashMap<String,Object>();
query.put("match_all",null);
searchRequest.setQuery(query);
Object expressions = new HashMap<String,Object>(){{
put("inc","in(ceil(attr*100),200,250,350)");
}};
searchRequest.setExpressions(expressions);
searchResponse = searchApi.search(searchRequest);
object query = new { match_all=null };
var searchRequest = new SearchRequest("forum", query);
searchRequest.Expressions = new List<Object> {
new Dictionary<string, string> { {"ebs", "in(ceil(attr*100),200,250,350)"} }
};
var searchResponse = searchApi.Search(searchRequest);
let query = SearchQuery::new();
let mut expr = HashMap::new();
expr.insert("ebs".to_string(), serde_json::json!("in(ceil(attr*100),200,250,350)"));
let expressions: [HashMap; 1] = [expr];
let search_request = SearchRequest {
table: "forum".to_string(),
query: Some(Box::new(query)),
expressions: serde_json::json!(expressions),
..Default::default(),
};
let search_res = search_api.search(search_req).await;
Manticore 中的浮点值以精确的精度显示,以确保它们反映存储的确切值。引入这种方法是为了防止精度丢失,特别是在地理坐标等情况下,四舍五入到小数点后6位会导致不准确。
现在,Manticore 首先输出一个6位数字,然后解析并与原始值比较。如果不匹配,则会添加更多数字,直到匹配为止。
例如,如果插入的浮点值是 19.45,Manticore 会显示为 19.450001,以准确表示存储的值。
insert into t(id, f) values(1, 19.45)
--------------
Query OK, 1 row affected (0.02 sec)
--------------
select * from t
--------------
+------+-----------+
| id | f |
+------+-----------+
| 1 | 19.450001 |
+------+-----------+
1 row in set (0.00 sec)
--- 1 out of 1 results in 0ms ---
此数据类型允许存储 JSON 对象,这对于处理无模式数据特别有用。在定义 JSON 值时,确保包含对象的起始和结束大括号 { 和 },或数组的方括号 [ 和 ]。虽然列式存储不支持 JSON,但它可以存储在传统的行存储中。值得注意的是,这两种存储类型可以在同一个表中结合使用。
CREATE TABLE products(title text, data json);
POST /cli -d "CREATE TABLE products(title text, data json)"
$index = new \Manticoresearch\Index($client);
$index->setName('products');
$index->create([
'title'=>['type'=>'text'],
'data'=>['type'=>'json']
]);
utilsApi.sql('CREATE TABLE products(title text, data json)')
await utilsApi.sql('CREATE TABLE products(title text, data json)')
res = await utilsApi.sql('CREATE TABLE products(title text, data json)');
utilsApi.sql("CREATE TABLE products(title text, data json)");
utilsApi.Sql("CREATE TABLE products(title text, data json)");
utils_api.sql("CREATE TABLE products(title text, data json)", Some(true)).await;
table products
{
type = rt
path = products
rt_field = title
stored_fields = title
rt_attr_json = data
}
select indexof(x>2 for x in data.intarray) from products
POST /search
{
"table": "products",
"query": { "match_all": {} } },
"expressions": { "idx": "indexof(x>2 for x in data.intarray)" }
}
$index->setName('products')->search('')->expression('idx','indexof(x>2 for x in data.intarray)')->get();
searchApi.search({"table":"products","query":{"match_all":{}},"expressions":{"idx":"indexof(x>2 for x in data.intarray)"}})
await searchApi.search({"table":"products","query":{"match_all":{}},"expressions":{"idx":"indexof(x>2 for x in data.intarray)"}})
res = await searchApi.search({"table":"products","query":{"match_all":{}},"expressions":{"idx":"indexof(x>2 for x in data.intarray)"}});
searchRequest = new SearchRequest();
searchRequest.setIndex("forum");
query = new HashMap<String,Object>();
query.put("match_all",null);
searchRequest.setQuery(query);
Object expressions = new HashMap<String,Object>(){{
put("idx","indexof(x>2 for x in data.intarray)");
}};
searchRequest.setExpressions(expressions);
searchResponse = searchApi.search(searchRequest);
object query = new { match_all=null };
var searchRequest = new SearchRequest("forum", query);
searchRequest.Expressions = new List<Object> {
new Dictionary<string, string> { {"idx", "indexof(x>2 for x in data.intarray)"} }
};
var searchResponse = searchApi.Search(searchRequest);
let query = SearchQuery::new();
let mut expr = HashMap::new();
expr.insert("idx".to_string(), serde_json::json!("indexof(x>2 for x in data.intarray)"));
let expressions: [HashMap; 1] = [expr];
let search_request = SearchRequest {
table: "forum".to_string(),
query: Some(Box::new(query)),
expressions: serde_json::json!(expressions),
..Default::default(),
};
let search_res = search_api.search(search_req).await;
文本属性被视为字符串,因此不能在全文匹配表达式中使用它们。但是,可以使用字符串函数,如 REGEX()。
select regex(data.name, 'est') as c from products where c>0
POST /search
{
"table": "products",
"query":
{
"match_all": {},
"range": { "c": { "gt": 0 } } }
},
"expressions": { "c": "regex(data.name, 'est')" }
}
$index->setName('products')->search('')->expression('idx',"regex(data.name, 'est')")->filter('c','gt',0)->get();
searchApi.search({"table":"products","query":{"match_all":{},"range":{"c":{"gt":0}}}},"expressions":{"c":"regex(data.name, 'est')"}})
await searchApi.search({"table":"products","query":{"match_all":{},"range":{"c":{"gt":0}}}},"expressions":{"c":"regex(data.name, 'est')"}})
res = await searchApi.search({"table":"products","query":{"match_all":{},"range":{"c":{"gt":0}}}},"expressions":{"c":"regex(data.name, 'est')"}}});
searchRequest = new SearchRequest();
searchRequest.setIndex("forum");
query = new HashMap<String,Object>();
query.put("match_all",null);
query.put("range", new HashMap<String,Object>(){{
put("c", new HashMap<String,Object>(){{
put("gt",0);
}});
}});
searchRequest.setQuery(query);
Object expressions = new HashMap<String,Object>(){{
put("idx","indexof(x>2 for x in data.intarray)");
}};
searchRequest.setExpressions(expressions);
searchResponse = searchApi.search(searchRequest);
object query = new { match_all=null };
var searchRequest = new SearchRequest("forum", query);
var rangeFilter = new RangeFilter("c");
rangeFilter.Gt = 0;
searchRequest.AttrFilter = rangeFilter;
searchRequest.Expressions = new List<Object> {
new Dictionary<string, string> { {"idx", "indexof(x>2 for x in data.intarray)"} }
};
var searchResponse = searchApi.Search(searchRequest);
let query = SearchQuery::new();
let mut expr = HashMap::new();
expr.insert("idx".to_string(), serde_json::json!("indexof(x>2 for x in data.intarray)"));
let expressions: [HashMap; 1] = [expr];
let search_request = SearchRequest {
table: "forum".to_string(),
query: Some(Box::new(query)),
expressions: serde_json::json!(expressions),
..Default::default(),
};
let search_res = search_api.search(search_req).await;
对于 JSON 属性,在某些情况下可能需要强制数据类型以确保功能正常。例如,在处理浮点值时,必须使用 DOUBLE() 以实现正确排序。
select * from products order by double(data.myfloat) desc
POST /search
{
"table": "products",
"query": { "match_all": {} } },
"sort": [ { "double(data.myfloat)": { "order": "desc"} } ]
}
$index->setName('products')->search('')->sort('double(data.myfloat)','desc')->get();
searchApi.search({"table":"products","query":{"match_all":{}}},"sort":[{"double(data.myfloat)":{"order":"desc"}}]})
await searchApi.search({"table":"products","query":{"match_all":{}}},"sort":[{"double(data.myfloat)":{"order":"desc"}}]})
res = await searchApi.search({"table":"products","query":{"match_all":{}}},"sort":[{"double(data.myfloat)":{"order":"desc"}}]});
searchRequest = new SearchRequest();
searchRequest.setIndex("forum");
query = new HashMap<String,Object>();
query.put("match_all",null);
searchRequest.setQuery(query);
searchRequest.setSort(new ArrayList<Object>(){{
add(new HashMap<String,String>(){{ put("double(data.myfloat)",new HashMap<String,String>(){{ put("order","desc");}});}});
}});
searchResponse = searchApi.search(searchRequest);
object query = new { match_all=null };
var searchRequest = new SearchRequest("forum", query);
searchRequest.Sort = new List<Object> {
new SortOrder("double(data.myfloat)", SortOrder.OrderEnum.Desc)
};
var searchResponse = searchApi.Search(searchRequest);
let query = SearchQuery::new();
let mut sort = HashMap::new();
sort.insert("double(data.myfloat)".to_string(), serde_json::json!("desc"));
let search_request = SearchRequest {
table: "forum".to_string(),
query: Some(Box::new(query)),
sort: serde_json::json!(sort)
..Default::default()
};
let search_res = search_api.search(search_req).await;
Float vector attributes allow storing variable-length lists of floats, primarily used for machine learning applications and similarity searches. This type differs from multi-valued attributes (MVAs) in several important ways:
- Preserves the exact order of values (unlike MVAs which may reorder)
- Retains duplicate values (unlike MVAs which deduplicate)
- No additional processing during insertion (unlike MVAs which sort and deduplicate)
Float vector attributes allow storing variable-length lists of floats, primarily used for machine learning applications and similarity searches.
- Currently only supported in real-time tables
- Can only be utilized in KNN (k-nearest neighbor) searches
- Not supported in plain tables or other functions/expressions
- When used with KNN settings, you cannot
UPDATE float_vector values. Use REPLACE instead
- When used without KNN settings, you can
UPDATE float_vector values
- Float vectors cannot be used in regular filters or sorting
- The only way to filter by
float_vector values is through vector search operations (KNN)
- Text embeddings for semantic search
- Recommendation system vectors
- Image embeddings for similarity search
- Feature vectors for machine learning
Keep in mind that the float_vector data type is not compatible with the Auto schema mechanism.
For more details on setting up float vectors and using them in searches, see KNN search.
The most convenient way to work with float vectors is using auto embeddings. This feature automatically generates embeddings from your text data using machine learning models, eliminating the need to manually compute and insert vectors.
- Simplified workflow: Just insert text, embeddings are generated automatically
- No manual vector computation: No need to run separate embedding models
- Consistent embeddings: Same model ensures consistent vector representations
- Multiple model support: Choose from sentence-transformers, OpenAI, Voyage, and Jina models
- Flexible field selection: Control which fields are used for embedding generation
When creating a table with auto embeddings, specify these additional parameters:
MODEL_NAME: The embedding model to use for automatic vector generation
FROM: Which fields to use for embedding generation (empty string means all text/string fields)
Supported embedding models:
- Sentence Transformers: Any suitable BERT-based Hugging Face model (e.g.,
sentence-transformers/all-MiniLM-L6-v2) — no API key needed. Manticore downloads the model when you create the table.
- OpenAI: OpenAI embedding models like
openai/text-embedding-ada-002 - requires API_KEY='<OPENAI_API_KEY>' parameter
- Voyage: Voyage AI embedding models - requires
API_KEY='<VOYAGE_API_KEY>' parameter
- Jina: Jina AI embedding models - requires
API_KEY='<JINA_API_KEY>' parameter
Using sentence-transformers model (no API key needed)
CREATE TABLE products (
title TEXT,
description TEXT,
embedding_vector FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2'
MODEL_NAME='sentence-transformers/all-MiniLM-L6-v2' FROM='title'
);
Using OpenAI model (requires API_KEY parameter)
CREATE TABLE products_openai (
title TEXT,
content TEXT,
embedding_vector FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='cosine'
MODEL_NAME='openai/text-embedding-ada-002' FROM='title,content' API_KEY='<OPENAI_API_KEY>'
);
Using all text fields for embeddings (FROM is empty)
CREATE TABLE products_all_fields (
title TEXT,
description TEXT,
tags TEXT,
embedding_vector FLOAT_VECTOR KNN_TYPE='hnsw' HNSW_SIMILARITY='l2'
MODEL_NAME='sentence-transformers/all-MiniLM-L6-v2' FROM=''
);
The FROM parameter controls which fields are used for embedding generation:
- Specific fields:
FROM='title' - only the title field is used
- Multiple fields:
FROM='title,description' - both title and description are concatenated and used
- All text fields:
FROM='' (empty) - all text (full-text field) and string (string attribute) fields in the table are used
- Empty vectors: You can still insert empty vectors using
() to exclude documents from vector search
When using auto embeddings, do not specify the vector field in your INSERT statements. The embeddings are automatically generated from the specified text fields:
-- Insert text data - embeddings generated automatically
INSERT INTO products (title, description) VALUES
('smartphone', 'latest mobile device with camera'),
('laptop computer', 'portable workstation for developers');
-- Insert with empty vector (excluded from vector search)
INSERT INTO products (title, description, embedding_vector) VALUES
('no-vector item', 'this item has no embedding', ());
Alternatively, you can work with manually computed float vectors.
CREATE TABLE products(title text, image_vector float_vector);
POST /cli -d "CREATE TABLE products(title text, image_vector float_vector)"
$index = new \Manticoresearch\Index($client);
$index->setName('products');
$index->create([
'title'=>['type'=>'text'],
'image_vector'=>['type'=>'float_vector']
]);
utilsApi.sql('CREATE TABLE products(title text, image_vector float_vector)')
await utilsApi.sql('CREATE TABLE products(title text, image_vector float_vector)')
res = await utilsApi.sql('CREATE TABLE products(title text, image_vector float_vector)');
utilsApi.sql("CREATE TABLE products(title text, image_vector float_vector)");
utilsApi.Sql("CREATE TABLE products(title text, image_vector float_vector)");
utils_api.sql("CREATE TABLE products(title text, image_vector float_vector)", Some(true)).await;
table products
{
type = rt
path = products
rt_field = title
stored_fields = title
rt_attr_float_vector = image_vector
}
Multi-value attributes allow storing variable-length lists of 32-bit unsigned integers. This can be useful for storing one-to-many numeric values, such as tags, product categories, and properties.
CREATE TABLE products(title text, product_codes multi);
POST /cli -d "CREATE TABLE products(title text, product_codes multi)"
$index = new \Manticoresearch\Index($client);
$index->setName('products');
$index->create([
'title'=>['type'=>'text'],
'product_codes'=>['type'=>'multi']
]);
utilsApi.sql('CREATE TABLE products(title text, product_codes multi)')
await utilsApi.sql('CREATE TABLE products(title text, product_codes multi)')
res = await utilsApi.sql('CREATE TABLE products(title text, product_codes multi)');
utilsApi.sql("CREATE TABLE products(title text, product_codes multi)");
utilsApi.Sql("CREATE TABLE products(title text, product_codes multi)");
utils_api.sql("CREATE TABLE products(title text, product_codes multi)", Some(true)).await;
table products
{
type = rt
path = products
rt_field = title
stored_fields = title
rt_attr_multi = product_codes
}
It supports filtering and aggregation, but not sorting. Filtering can be done using a condition that requires at least one element to pass (using ANY()) or all elements (ALL()) to pass.
select * from products where any(product_codes)=3
POST /search
{
"table": "products",
"query":
{
"match_all": {},
"equals" : { "any(product_codes)": 3 }
}
}
$index->setName('products')->search('')->filter('any(product_codes)','equals',3)->get();
searchApi.search({"table":"products","query":{"match_all":{},"equals":{"any(product_codes)":3}}}})
await searchApi.search({"table":"products","query":{"match_all":{},"equals":{"any(product_codes)":3}}}})
res = await searchApi.search({"table":"products","query":{"match_all":{},"equals":{"any(product_codes)":3}}}})'
searchRequest = new SearchRequest();
searchRequest.setIndex("forum");
query = new HashMap<String,Object>();
query.put("match_all",null);
query.put("equals",new HashMap<String,Integer>(){{
put("any(product_codes)",3);
}});
searchRequest.setQuery(query);
searchResponse = searchApi.search(searchRequest);
object query = new { match_all=null };
var searchRequest = new SearchRequest("forum", query);
searchRequest.AttrFilter = new EqualsFilter("any(product_codes)", 3);
var searchResponse = searchApi.Search(searchRequest);
let query = SearchQuery::new();
let search_request = SearchRequest {
table: "forum".to_string(),
query: Some(Box::new(query)),
..Default::default(),
};
let search_res = search_api.search(search_req).await;
Information like least or greatest element and length of the list can be extracted. An example shows ordering by the least element of a multi-value attribute.
select least(product_codes) l from products order by l asc
POST /search
{
"table": "products",
"query":
{
"match_all": {},
"sort": [ { "product_codes":{ "order":"asc", "mode":"min" } } ]
}
}
$index->setName('products')->search('')->sort('product_codes','asc','min')->get();
searchApi.search({"table":"products","query":{"match_all":{},"sort":[{"product_codes":{"order":"asc","mode":"min"}}]}})
await searchApi.search({"table":"products","query":{"match_all":{},"sort":[{"product_codes":{"order":"asc","mode":"min"}}]}})
res = await searchApi.search({"table":"products","query":{"match_all":{},"sort":[{"product_codes":{"order":"asc","mode":"min"}}]}});
searchRequest = new SearchRequest();
searchRequest.setIndex("forum");
query = new HashMap<String,Object>();
query.put("match_all",null);
searchRequest.setQuery(query);
searchRequest.setSort(new ArrayList<Object>(){{
add(new HashMap<String,String>(){{ put("product_codes",new HashMap<String,String>(){{ put("order","asc");put("mode","min");}});}});
}});
searchResponse = searchApi.search(searchRequest);
object query = new { match_all=null };
var searchRequest = new SearchRequest("forum", query);
searchRequest.Sort = new List<Object> {
new SortMVA("product_codes", SortOrder.OrderEnum.Asc, SortMVA.ModeEnum.Min)
};
searchResponse = searchApi.Search(searchRequest);
let query = SearchQuery::new();
let mut sort_opts = HashMap::new();
sort_opts.insert("order".to_string(), serde_json::json!("asc"));
sort_opts.insert("mode".to_string(), serde_json::json!("min"));
sort_expr.insert("product_codes".to_string(), serde_json::json!(sort_opts));
let sort: [HashMap; 1] = [sort_expr];
let search_req = SearchRequest {
table: "forum".to_string(),
query: Some(Box::new(query)),
sort: serde_json::json!(sort),
..Default::default(),
};
let search_res = search_api.search(search_req).await;
When grouping by a multi-value attribute, a document will contribute to as many groups as there are different values associated with that document. For instance, if a collection contains exactly one document having a 'product_codes' multi-value attribute with values 5, 7, and 11, grouping on 'product_codes' will produce 3 groups with COUNT(*)equal to 1 and GROUPBY() key values of 5, 7, and 11, respectively. Also, note that grouping by multi-value attributes may lead to duplicate documents in the result set because each document can participate in many groups.
insert into products values ( 1, 'doc one', (5,7,11) );
select id, count(*), groupby() from products group by product_codes;
Query OK, 1 row affected (0.00 sec)
+------+----------+-----------+
| id | count(*) | groupby() |
+------+----------+-----------+
| 1 | 1 | 11 |
| 1 | 1 | 7 |
| 1 | 1 | 5 |
+------+----------+-----------+
3 rows in set (0.00 sec)
The order of the numbers inserted as values of multivalued attributes is not preserved. Values are stored internally as a sorted set.
insert into product values (1,'first',(4,2,1,3));
select * from products;
POST /insert
{
"table":"products",
"id":1,
"doc":
{
"title":"first",
"product_codes":[4,2,1,3]
}
}
POST /search
{
"table": "products",
"query": { "match_all": {} }
}
$index->addDocument([
"title"=>"first",
"product_codes"=>[4,2,1,3]
]);
$index->search('')-get();
indexApi.insert({"table":"products","id":1,"doc":{"title":"first","product_codes":[4,2,1,3]}})
searchApi.search({"table":"products","query":{"match_all":{}}})
await indexApi.insert({"table":"products","id":1,"doc":{"title":"first","product_codes":[4,2,1,3]}})
await searchApi.search({"table":"products","query":{"match_all":{}}})
await indexApi.insert({"table":"products","id":1,"doc":{"title":"first","product_codes":[4,2,1,3]}});
res = await searchApi.search({"table":"products","query":{"match_all":{}}});
InsertDocumentRequest newdoc = new InsertDocumentRequest();
HashMap<String,Object> doc = new HashMap<String,Object>(){{
put("title","first");
put("product_codes",new int[] {4,2,1,3});
}};
newdoc.index("products").id(1L).setDoc(doc);
sqlresult = indexApi.insert(newdoc);
Map<String,Object> query = new HashMap<String,Object>();
query.put("match_all",null);
SearchRequest searchRequest = new SearchRequest();
searchRequest.setIndex("products");
searchRequest.setQuery(query);
SearchResponse searchResponse = searchApi.search(searchRequest);
System.out.println(searchResponse.toString() );
Dictionary<string, Object> doc = new Dictionary<string, Object>();
doc.Add("title", "first");
doc.Add("product_codes", new List<Object> {4,2,1,3});
InsertDocumentRequest newdoc = new InsertDocumentRequest(index: "products", id: 1, doc: doc);
var sqlresult = indexApi.Insert(newdoc);
object query = new { match_all=null };
var searchRequest = new SearchRequest("products", query);
var searchResponse = searchApi.Search(searchRequest);
Console.WriteLine(searchResponse.ToString())
let mut doc = HashMap::new();
doc.insert("title".to_string(), serde_json::json!("first"));
doc.insert("product_codes".to_string(), serde_json::json!([4,2,1,3]));
let insert_req = InsertDocumentRequest::new("products".to_string(), serde_json::json!(doc));
let insert_res = index_api.insert(insert_req).await;
let query = SearchQuery::new();
let search_req = SearchRequest {
table: "forum".to_string(),
query: Some(Box::new(query)),
..Default::default(),
};
let search_res = search_api.search(search_req).await;
println!("{:?}", search_res);
Query OK, 1 row affected (0.00 sec)
+------+---------------+-------+
| id | product_codes | title |
+------+---------------+-------+
| 1 | 1,2,3,4 | first |
+------+---------------+-------+
1 row in set (0.01 sec)
{
"table":"products",
"_id":1,
"created":true,
"result":"created",
"status":201
}
{
"took":0,
"timed_out":false,
"hits":{
"total":1,
"hits":[
{
"_id": 1,
"_score":1,
"_source":{
"product_codes":[
1,
2,
3,
4
],
"title":"first"
}
}
]
}
}
Array
(
[_index] => products
[_id] => 1
[created] => 1
[result] => created
[status] => 201
)
Array
(
[took] => 0
[timed_out] =>
[hits] => Array
(
[total] => 1
[hits] => Array
(
[0] => Array
(
[_id] => 1
[_score] => 1
[_source] => Array
(
[product_codes] => Array
(
[0] => 1
[1] => 2
[2] => 3
[3] => 4
)
[title] => first
)
)
)
)
)
{'created': True,
'found': None,
'id': 1,
'table': 'products',
'result': 'created'}
{'hits': {'hits': [{u'_id': u'1',
u'_score': 1,
u'_source': {u'product_codes': [1, 2, 3, 4],
u'title': u'first'}}],
'total': 1},
'profile': None,
'timed_out': False,
'took': 29}
{'created': True,
'found': None,
'id': 1,
'table': 'products',
'result': 'created'}
{'hits': {'hits': [{u'_id': u'1',
u'_score': 1,
u'_source': {u'product_codes': [1, 2, 3, 4],
u'title': u'first'}}],
'total': 1},
'profile': None,
'timed_out': False,
'took': 29}
{"took":0,"timed_out":false,"hits":{"total":1,"hits":[{"_id": 1,"_score":1,"_source":{"product_codes":[1,2,3,4],"title":"first"}}]}}
class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 1
hits: [{_id=1, _score=1, _source={product_codes=[1, 2, 3, 4], title=first}}]
aggregations: null
}
profile: null
}
class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 1
hits: [{_id=1, _score=1, _source={product_codes=[1, 2, 3, 4], title=first}}]
aggregations: null
}
profile: null
}
class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 1
hits: [{_id=1, _score=1, _source={product_codes=[1, 2, 3, 4], title=first}}]
aggregations: null
}
profile: null
}
A data type that allows storing variable-length lists of 64-bit signed integers. It has the same functionality as multi-value integer.
CREATE TABLE products(title text, values multi64);
POST /cli -d "CREATE TABLE products(title text, values multi64)"
$index = new \Manticoresearch\Index($client);
$index->setName('products');
$index->create([
'title'=>['type'=>'text'],
'values'=>['type'=>'multi64']
]);
utilsApi.sql('CREATE TABLE products(title text, values multi64))')
await utilsApi.sql('CREATE TABLE products(title text, values multi64))')
res = await utilsApi.sql('CREATE TABLE products(title text, values multi64))');
utilsApi.sql("CREATE TABLE products(title text, values multi64))");
utilsApi.Sql("CREATE TABLE products(title text, values multi64))");
utils_api.sql("CREATE TABLE products(title text, values multi64))", Some(true)).await;
table products
{
type = rt
path = products
rt_field = title
stored_fields = title
rt_attr_multi_64 = values
}
When you use the columnar storage you can specify the following properties for the attributes.
By default, Manticore Columnar storage stores all attributes in a columnar fashion, as well as in a special docstore row by row. This enables fast execution of queries like SELECT * FROM ..., especially when fetching a large number of records at once. However, if you are sure that you do not need it or wish to save disk space, you can disable it by specifying fast_fetch='0' when creating a table or (if you are defining a table in a config) by using columnar_no_fast_fetch as shown in the following example.
create table t(a int, b int fast_fetch='0') engine='columnar'; desc t;
source min {
type = mysql
sql_host = localhost
sql_user = test
sql_pass =
sql_db = test
sql_query = select 1, 1 a, 1 b
sql_attr_uint = a
sql_attr_uint = b
}
table tbl {
path = tbl/col
source = min
columnar_attrs = *
columnar_no_fast_fetch = b
}
+-------+--------+---------------------+
| Field | Type | Properties |
+-------+--------+---------------------+
| id | bigint | columnar fast_fetch |
| a | uint | columnar fast_fetch |
| b | uint | columnar |
+-------+--------+---------------------+
3 rows in set (0.00 sec)
+-------+--------+---------------------+
| Field | Type | Properties |
+-------+--------+---------------------+
| id | bigint | columnar fast_fetch |
| a | uint | columnar fast_fetch |
| b | uint | columnar |
+-------+--------+---------------------+