Фасетный поиск так же важен для современного поискового приложения, как автозаполнение, исправление опечаток и подсветка поисковых ключевых слов highlighting, особенно в продуктах электронной коммерции.

Фасетный поиск полезен при работе с большими объемами данных и различными взаимосвязанными свойствами, такими как размер, цвет, производитель или другие факторы. При запросе больших объемов данных результаты поиска часто включают множество записей, которые не соответствуют ожиданиям пользователя. Фасетный поиск позволяет конечному пользователю явно определить критерии, которым должны удовлетворять результаты поиска.
В Manticore Search есть оптимизация, которая сохраняет набор результатов исходного запроса и повторно использует его для каждого вычисления фасета. Поскольку агрегации применяются к уже вычисленному подмножеству документов, они выполняются быстро, и общее время выполнения часто может быть лишь немного дольше, чем у первоначального запроса. Фасеты можно добавлять к любому запросу, и фасет может быть любым атрибутом или выражением. Результат фасета включает значения фасета и количество по фасету. К фасетам можно получить доступ с помощью SQL-запроса SELECT, объявляя их в самом конце запроса.
Значения фасета могут исходить из атрибута, свойства JSON внутри JSON-атрибута или выражения. Значения фасета также могут иметь псевдонимы, но псевдоним должен быть уникальным во всех наборах результатов (основной набор результатов запроса и другие наборы результатов фасетов). Значение фасета выводится из агрегированного атрибута/выражения, но также может исходить из другого атрибута/выражения.
FACET {expr_list} [BY {expr_list} ] [DISTINCT {field_name}] [ORDER BY {expr | FACET()} {ASC | DESC}] [LIMIT [offset,] count]
Несколько объявлений фасетов должны разделяться пробелом.
Фасеты можно определить в узле aggs:
"aggs" :
{
"group name" :
{
"terms" :
{
"field":"attribute name",
"size": 1000
}
"sort": [ {"attribute name": { "order":"asc" }} ]
}
}
где:
group name— это псевдоним, присвоенный агрегации- значение
fieldдолжно содержать имя атрибута или выражения, по которому выполняется фасетирование - необязательный параметр
sizeуказывает максимальное количество бакетов, включаемых в результат. Если не указан, наследует лимит основного запроса. Подробнее см. в разделе Размер результата фасета. - необязательный параметр
sortзадает массив атрибутов и/или дополнительных свойств с использованием того же синтаксиса, что и "параметр sort в основном запросе".
Набор результатов будет содержать узел aggregations с возвращенными фасетами, где key — агрегированное значение, а doc_count — количество в агрегации.
"aggregations": {
"group name": {
"buckets": [
{
"key": 10,
"doc_count": 1019
},
{
"key": 9,
"doc_count": 954
},
{
"key": 8,
"doc_count": 1021
},
{
"key": 7,
"doc_count": 1011
},
{
"key": 6,
"doc_count": 997
}
]
}
}
- SQL
- JSON
- PHP
- Python
- Python-asyncio
- Javascript
- Java
- C#
- Rust
- TypeScript
- Go
SELECT *, price AS aprice FROM facetdemo LIMIT 10 FACET price LIMIT 10 FACET brand_id LIMIT 5;POST /search -d '
{
"table" : "facetdemo",
"query" : {"match_all" : {} },
"limit": 5,
"aggs" :
{
"group_property" :
{
"terms" :
{
"field":"price"
}
},
"group_brand_id" :
{
"terms" :
{
"field":"brand_id"
}
}
}
}
'$index->setName('facetdemo');
$search = $index->search('');
$search->limit(5);
$search->facet('price','price');
$search->facet('brand_id','group_brand_id');
$results = $search->get();res =searchApi.search({"table":"facetdemo","query":{"match_all":{}},"limit":5,"aggs":{"group_property":{"terms":{"field":"price",}},"group_brand_id":{"terms":{"field":"brand_id"}}}})res = await searchApi.search({"table":"facetdemo","query":{"match_all":{}},"limit":5,"aggs":{"group_property":{"terms":{"field":"price",}},"group_brand_id":{"terms":{"field":"brand_id"}}}})res = await searchApi.search({"table":"facetdemo","query":{"match_all":{}},"limit":5,"aggs":{"group_property":{"terms":{"field":"price",}},"group_brand_id":{"terms":{"field":"brand_id"}}}});aggs = new HashMap<String,Object>(){{
put("group_property", new HashMap<String,Object>(){{
put("terms", new HashMap<String,Object>(){{
put("field","price");
}});
}});
put("group_brand_id", new HashMap<String,Object>(){{
put("terms", new HashMap<String,Object>(){{
put("field","brand_id");
}});
}});
}};
searchRequest = new SearchRequest();
searchRequest.setIndex("facetdemo");
searchRequest.setLimit(5);
query = new HashMap<String,Object>();
query.put("match_all",null);
searchRequest.setQuery(query);
searchRequest.setAggs(aggs);
searchResponse = searchApi.search(searchRequest);var agg1 = new Aggregation("group_property", "price");
var agg2 = new Aggregation("group_brand_id", "brand_id");
object query = new { match_all=null };
var searchRequest = new SearchRequest("facetdemo", query);
searchRequest.Limit = 5;
searchRequest.Aggs = new List<Aggregation> {agg1, agg2};
var searchResponse = searchApi.Search(searchRequest);let query = SearchQuery::new();
let aggTerms1 = AggTerms::new("price");
let agg1 = Aggregation {
terms: Some(Box::new(aggTerms1)),
..Default::default(),
};
let aggTerms2 = AggTerms::new("brand_id");
let agg2 = Aggregation {
terms: Some(Box::new(aggTerms2)),
..Default::default(),
};
let mut aggs = HashMap::new();
aggs.insert("group_property".to_string(), agg1);
aggs.insert("group_brand_id".to_string(), agg2);
let search_req = SearchRequest {
table: "facetdemo".to_string(),
query: Some(Box::new(query)),
aggs: serde_json::json!(aggs),
limit: serde_json::json!(5),
..Default::default(),
};
let search_res = search_api.search(search_req).await;res = await searchApi.search({
index: 'test',
query: { match_all:{} },
aggs: {
name_group: {
terms: { field : 'name' }
},
cat_group: {
terms: { field: 'cat' }
}
}
});query := map[string]interface{} {}
searchRequest.SetQuery(query)
aggByName := manticoreclient.NewAggregation()
aggTerms := manticoreclient.NewAggregationTerms()
aggTerms.SetField("name")
aggByName.SetTerms(aggTerms)
aggByCat := manticoreclient.NewAggregation()
aggTerms.SetField("cat")
aggByCat.SetTerms(aggTerms)
aggs := map[string]Aggregation{} { "name_group": aggByName, "cat_group": aggByCat }
searchRequest.SetAggs(aggs)
res, _, _ := apiClient.SearchAPI.Search(context.Background()).SearchRequest(*searchRequest).Execute()+------+-------+----------+---------------------+------------+-------------+---------------------------------------+------------+--------+
| id | price | brand_id | title | brand_name | property | j | categories | aprice |
+------+-------+----------+---------------------+------------+-------------+---------------------------------------+------------+--------+
| 1 | 306 | 1 | Product Ten Three | Brand One | Six_Ten | {"prop1":66,"prop2":91,"prop3":"One"} | 10,11 | 306 |
| 2 | 400 | 10 | Product Three One | Brand Ten | Four_Three | {"prop1":69,"prop2":19,"prop3":"One"} | 13,14 | 400 |
...
| 9 | 560 | 6 | Product Two Five | Brand Six | Eight_Two | {"prop1":90,"prop2":84,"prop3":"One"} | 13,14 | 560 |
| 10 | 229 | 9 | Product Three Eight | Brand Nine | Seven_Three | {"prop1":84,"prop2":39,"prop3":"One"} | 12,13 | 229 |
+------+-------+----------+---------------------+------------+-------------+---------------------------------------+------------+--------+
10 rows in set (0.00 sec)
+-------+----------+
| price | count(*) |
+-------+----------+
| 306 | 7 |
| 400 | 13 |
...
| 229 | 9 |
| 595 | 10 |
+-------+----------+
10 rows in set (0.00 sec)
+----------+----------+
| brand_id | count(*) |
+----------+----------+
| 1 | 1013 |
| 10 | 998 |
| 5 | 1007 |
| 8 | 1033 |
| 7 | 965 |
+----------+----------+
5 rows in set (0.00 sec){
"took": 3,
"timed_out": false,
"hits": {
"total": 10000,
"hits": [
{
"_id": 1,
"_score": 1,
"_source": {
"price": 197,
"brand_id": 10,
"brand_name": "Brand Ten",
"categories": [
10
]
}
},
...
{
"_id": 5,
"_score": 1,
"_source": {
"price": 805,
"brand_id": 7,
"brand_name": "Brand Seven",
"categories": [
11,
12,
13
]
}
}
]
},
"aggregations": {
"group_property": {
"buckets": [
{
"key": 1000,
"doc_count": 11
},
{
"key": 999,
"doc_count": 12
},
...
{
"key": 991,
"doc_count": 7
}
]
},
"group_brand_id": {
"buckets": [
{
"key": 10,
"doc_count": 1019
},
{
"key": 9,
"doc_count": 954
},
{
"key": 8,
"doc_count": 1021
},
{
"key": 7,
"doc_count": 1011
},
{
"key": 6,
"doc_count": 997
}
]
}
}
}Array
(
[price] => Array
(
[buckets] => Array
(
[0] => Array
(
[key] => 1000
[doc_count] => 11
)
[1] => Array
(
[key] => 999
[doc_count] => 12
)
[2] => Array
(
[key] => 998
[doc_count] => 7
)
[3] => Array
(
[key] => 997
[doc_count] => 14
)
[4] => Array
(
[key] => 996
[doc_count] => 8
)
)
)
[group_brand_id] => Array
(
[buckets] => Array
(
[0] => Array
(
[key] => 10
[doc_count] => 1019
)
[1] => Array
(
[key] => 9
[doc_count] => 954
)
[2] => Array
(
[key] => 8
[doc_count] => 1021
)
[3] => Array
(
[key] => 7
[doc_count] => 1011
)
[4] => Array
(
[key] => 6
[doc_count] => 997
)
)
)
){'aggregations': {u'group_brand_id': {u'buckets': [{u'doc_count': 1019,
u'key': 10},
{u'doc_count': 954,
u'key': 9},
{u'doc_count': 1021,
u'key': 8},
{u'doc_count': 1011,
u'key': 7},
{u'doc_count': 997,
u'key': 6}]},
u'group_property': {u'buckets': [{u'doc_count': 11,
u'key': 1000},
{u'doc_count': 12,
u'key': 999},
{u'doc_count': 7,
u'key': 998},
{u'doc_count': 14,
u'key': 997},
{u'doc_count': 8,
u'key': 996}]}},
'hits': {'hits': [{u'_id': u'1',
u'_score': 1,
u'_source': {u'brand_id': 10,
u'brand_name': u'Brand Ten',
u'categories': [10],
u'price': 197,
u'property': u'Six',
u'title': u'Product Eight One'}},
{u'_id': u'2',
u'_score': 1,
u'_source': {u'brand_id': 6,
u'brand_name': u'Brand Six',
u'categories': [12, 13, 14],
u'price': 671,
u'property': u'Four',
u'title': u'Product Nine Seven'}},
{u'_id': u'3',
u'_score': 1,
u'_source': {u'brand_id': 3,
u'brand_name': u'Brand Three',
u'categories': [13, 14, 15],
u'price': 92,
u'property': u'Six',
u'title': u'Product Five Four'}},
{u'_id': u'4',
u'_score': 1,
u'_source': {u'brand_id': 10,
u'brand_name': u'Brand Ten',
u'categories': [11],
u'price': 713,
u'property': u'Five',
u'title': u'Product Eight Nine'}},
{u'_id': u'5',
u'_score': 1,
u'_source': {u'brand_id': 7,
u'brand_name': u'Brand Seven',
u'categories': [11, 12, 13],
u'price': 805,
u'property': u'Two',
u'title': u'Product Ten Three'}}],
'max_score': None,
'total': 10000},
'profile': None,
'timed_out': False,
'took': 4}{'aggregations': {u'group_brand_id': {u'buckets': [{u'doc_count': 1019,
u'key': 10},
{u'doc_count': 954,
u'key': 9},
{u'doc_count': 1021,
u'key': 8},
{u'doc_count': 1011,
u'key': 7},
{u'doc_count': 997,
u'key': 6}]},
u'group_property': {u'buckets': [{u'doc_count': 11,
u'key': 1000},
{u'doc_count': 12,
u'key': 999},
{u'doc_count': 7,
u'key': 998},
{u'doc_count': 14,
u'key': 997},
{u'doc_count': 8,
u'key': 996}]}},
'hits': {'hits': [{u'_id': u'1',
u'_score': 1,
u'_source': {u'brand_id': 10,
u'brand_name': u'Brand Ten',
u'categories': [10],
u'price': 197,
u'property': u'Six',
u'title': u'Product Eight One'}},
{u'_id': u'2',
u'_score': 1,
u'_source': {u'brand_id': 6,
u'brand_name': u'Brand Six',
u'categories': [12, 13, 14],
u'price': 671,
u'property': u'Four',
u'title': u'Product Nine Seven'}},
{u'_id': u'3',
u'_score': 1,
u'_source': {u'brand_id': 3,
u'brand_name': u'Brand Three',
u'categories': [13, 14, 15],
u'price': 92,
u'property': u'Six',
u'title': u'Product Five Four'}},
{u'_id': u'4',
u'_score': 1,
u'_source': {u'brand_id': 10,
u'brand_name': u'Brand Ten',
u'categories': [11],
u'price': 713,
u'property': u'Five',
u'title': u'Product Eight Nine'}},
{u'_id': u'5',
u'_score': 1,
u'_source': {u'brand_id': 7,
u'brand_name': u'Brand Seven',
u'categories': [11, 12, 13],
u'price': 805,
u'property': u'Two',
u'title': u'Product Ten Three'}}],
'max_score': None,
'total': 10000},
'profile': None,
'timed_out': False,
'took': 4}{"took":0,"timed_out":false,"hits":{"total":10000,"hits":[{"_id": 1,"_score":1,"_source":{"price":197,"brand_id":10,"brand_name":"Brand Ten","categories":[10],"title":"Product Eight One","property":"Six"}},{"_id": 2,"_score":1,"_source":{"price":671,"brand_id":6,"brand_name":"Brand Six","categories":[12,13,14],"title":"Product Nine Seven","property":"Four"}},{"_id": 3,"_score":1,"_source":{"price":92,"brand_id":3,"brand_name":"Brand Three","categories":[13,14,15],"title":"Product Five Four","property":"Six"}},{"_id": 4,"_score":1,"_source":{"price":713,"brand_id":10,"brand_name":"Brand Ten","categories":[11],"title":"Product Eight Nine","property":"Five"}},{"_id": 5,"_score":1,"_source":{"price":805,"brand_id":7,"brand_name":"Brand Seven","categories":[11,12,13],"title":"Product Ten Three","property":"Two"}}]}}class SearchResponse {
took: 0
timedOut: false
aggregations: {group_property={buckets=[{key=1000, doc_count=11}, {key=999, doc_count=12}, {key=998, doc_count=7}, {key=997, doc_count=14}, {key=996, doc_count=8}]}, group_brand_id={buckets=[{key=10, doc_count=1019}, {key=9, doc_count=954}, {key=8, doc_count=1021}, {key=7, doc_count=1011}, {key=6, doc_count=997}]}}
hits: class SearchResponseHits {
maxScore: null
total: 10000
hits: [{_id=1, _score=1, _source={price=197, brand_id=10, brand_name=Brand Ten, categories=[10], title=Product Eight One, property=Six}}, {_id=2, _score=1, _source={price=671, brand_id=6, brand_name=Brand Six, categories=[12, 13, 14], title=Product Nine Seven, property=Four}}, {_id=3, _score=1, _source={price=92, brand_id=3, brand_name=Brand Three, categories=[13, 14, 15], title=Product Five Four, property=Six}}, {_id=4, _score=1, _source={price=713, brand_id=10, brand_name=Brand Ten, categories=[11], title=Product Eight Nine, property=Five}}, {_id=5, _score=1, _source={price=805, brand_id=7, brand_name=Brand Seven, categories=[11, 12, 13], title=Product Ten Three, property=Two}}]
}
profile: null
}class SearchResponse {
took: 0
timedOut: false
aggregations: {group_property={buckets=[{key=1000, doc_count=11}, {key=999, doc_count=12}, {key=998, doc_count=7}, {key=997, doc_count=14}, {key=996, doc_count=8}]}, group_brand_id={buckets=[{key=10, doc_count=1019}, {key=9, doc_count=954}, {key=8, doc_count=1021}, {key=7, doc_count=1011}, {key=6, doc_count=997}]}}
hits: class SearchResponseHits {
maxScore: null
total: 10000
hits: [{_id=1, _score=1, _source={price=197, brand_id=10, brand_name=Brand Ten, categories=[10], title=Product Eight One, property=Six}}, {_id=2, _score=1, _source={price=671, brand_id=6, brand_name=Brand Six, categories=[12, 13, 14], title=Product Nine Seven, property=Four}}, {_id=3, _score=1, _source={price=92, brand_id=3, brand_name=Brand Three, categories=[13, 14, 15], title=Product Five Four, property=Six}}, {_id=4, _score=1, _source={price=713, brand_id=10, brand_name=Brand Ten, categories=[11], title=Product Eight Nine, property=Five}}, {_id=5, _score=1, _source={price=805, brand_id=7, brand_name=Brand Seven, categories=[11, 12, 13], title=Product Ten Three, property=Two}}]
}
profile: null
}class SearchResponse {
took: 0
timedOut: false
aggregations: {group_property={buckets=[{key=1000, doc_count=11}, {key=999, doc_count=12}, {key=998, doc_count=7}, {key=997, doc_count=14}, {key=996, doc_count=8}]}, group_brand_id={buckets=[{key=10, doc_count=1019}, {key=9, doc_count=954}, {key=8, doc_count=1021}, {key=7, doc_count=1011}, {key=6, doc_count=997}]}}
hits: class SearchResponseHits {
maxScore: null
total: 10000
hits: [{_id=1, _score=1, _source={price=197, brand_id=10, brand_name=Brand Ten, categories=[10], title=Product Eight One, property=Six}}, {_id=2, _score=1, _source={price=671, brand_id=6, brand_name=Brand Six, categories=[12, 13, 14], title=Product Nine Seven, property=Four}}, {_id=3, _score=1, _source={price=92, brand_id=3, brand_name=Brand Three, categories=[13, 14, 15], title=Product Five Four, property=Six}}, {_id=4, _score=1, _source={price=713, brand_id=10, brand_name=Brand Ten, categories=[11], title=Product Eight Nine, property=Five}}, {_id=5, _score=1, _source={price=805, brand_id=7, brand_name=Brand Seven, categories=[11, 12, 13], title=Product Ten Three, property=Two}}]
}
profile: null
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 5,
"hits": [
{
"_id": 1,
"_score": 1,
"_source": {
"content": "Text 1",
"name": "Doc 1",
"cat": 1
}
},
...
{
"_id": 5,
"_score": 1,
"_source": {
"content": "Text 5",
"name": "Doc 5",
"cat": 4
}
}
]
},
"aggregations": {
"name_group": {
"buckets": [
{
"key": "Doc 1",
"doc_count": 1
},
...
{
"key": "Doc 5",
"doc_count": 1
}
]
},
"cat_group": {
"buckets": [
{
"key": 1,
"doc_count": 2
},
...
{
"key": 4,
"doc_count": 1
}
]
}
}
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 5,
"hits": [
{
"_id": 1,
"_score": 1,
"_source": {
"content": "Text 1",
"name": "Doc 1",
"cat": 1
}
},
...
{
"_id": 5,
"_score": 1,
"_source": {
"content": "Text 5",
"name": "Doc 5",
"cat": 4
}
}
]
},
"aggregations": {
"name_group": {
"buckets": [
{
"key": "Doc 1",
"doc_count": 1
},
...
{
"key": "Doc 5",
"doc_count": 1
}
]
},
"cat_group": {
"buckets": [
{
"key": 1,
"doc_count": 2
},
...
{
"key": 4,
"doc_count": 1
}
]
}
}
}Данные можно фасетировать, агрегируя другой атрибут или выражение. Например, если документы содержат как идентификатор бренда, так и его название, можно вернуть в фасете названия брендов, но агрегировать идентификаторы брендов. Это можно сделать с помощью FACET {expr1} BY {expr2}
- SQL
SELECT * FROM facetdemo FACET brand_name by brand_id;+------+-------+----------+---------------------+-------------+-------------+---------------------------------------+------------+
| id | price | brand_id | title | brand_name | property | j | categories |
+------+-------+----------+---------------------+-------------+-------------+---------------------------------------+------------+
| 1 | 306 | 1 | Product Ten Three | Brand One | Six_Ten | {"prop1":66,"prop2":91,"prop3":"One"} | 10,11 |
| 2 | 400 | 10 | Product Three One | Brand Ten | Four_Three | {"prop1":69,"prop2":19,"prop3":"One"} | 13,14 |
....
| 19 | 855 | 1 | Product Seven Two | Brand One | Eight_Seven | {"prop1":63,"prop2":78,"prop3":"One"} | 10,11,12 |
| 20 | 31 | 9 | Product Four One | Brand Nine | Ten_Four | {"prop1":79,"prop2":42,"prop3":"One"} | 12,13,14 |
+------+-------+----------+---------------------+-------------+-------------+---------------------------------------+------------+
20 rows in set (0.00 sec)
+-------------+----------+
| brand_name | count(*) |
+-------------+----------+
| Brand One | 1013 |
| Brand Ten | 998 |
| Brand Five | 1007 |
| Brand Nine | 944 |
| Brand Two | 990 |
| Brand Six | 1039 |
| Brand Three | 1016 |
| Brand Four | 994 |
| Brand Eight | 1033 |
| Brand Seven | 965 |
+-------------+----------+
10 rows in set (0.00 sec)Если нужно удалить дубликаты из бакетов, возвращаемых FACET, можно использовать DISTINCT field_name, где field_name — поле, по которому нужно выполнить дедупликацию. Это также может быть id (что является значением по умолчанию), если вы выполняете FACET-запрос к распределенной таблице и не уверены, что у вас уникальные идентификаторы в таблицах (таблицы должны быть локальными и иметь одинаковую схему).
Если в вашем запросе несколько объявлений FACET, field_name должен быть одинаковым во всех них.
DISTINCT возвращает дополнительный столбец count(distinct ...) перед столбцом count(*), что позволяет получить оба результата без необходимости делать дополнительный запрос.
- SQL
- JSON
SELECT brand_name, property FROM facetdemo FACET brand_name distinct property;POST /sql -d 'SELECT brand_name, property FROM facetdemo FACET brand_name distinct property'+-------------+----------+
| brand_name | property |
+-------------+----------+
| Brand Nine | Four |
| Brand Ten | Four |
| Brand One | Five |
| Brand Seven | Nine |
| Brand Seven | Seven |
| Brand Three | Seven |
| Brand Nine | Five |
| Brand Three | Eight |
| Brand Two | Eight |
| Brand Six | Eight |
| Brand Ten | Four |
| Brand Ten | Two |
| Brand Four | Ten |
| Brand One | Nine |
| Brand Four | Eight |
| Brand Nine | Seven |
| Brand Four | Five |
| Brand Three | Four |
| Brand Four | Two |
| Brand Four | Eight |
+-------------+----------+
20 rows in set (0.00 sec)
+-------------+--------------------------+----------+
| brand_name | count(distinct property) | count(*) |
+-------------+--------------------------+----------+
| Brand Nine | 3 | 3 |
| Brand Ten | 2 | 3 |
| Brand One | 2 | 2 |
| Brand Seven | 2 | 2 |
| Brand Three | 3 | 3 |
| Brand Two | 1 | 1 |
| Brand Six | 1 | 1 |
| Brand Four | 4 | 5 |
+-------------+--------------------------+----------+
8 rows in set (0.00 sec){
"took": 0,
"timed_out": false,
"hits": {
"total": 20,
"total_relation": "eq",
"hits": [
{
"_score": 1,
"_source": {
"brand_name": "Brand Nine",
"property": "Four"
}
},
{
"_score": 1,
"_source": {
"brand_name": "Brand Ten",
"property": "Four"
}
},
...
{
"_score": 1,
"_source": {
"brand_name": "Brand Four",
"property": "Eight"
}
}
]
},
"aggregations": {
"brand_name": {
"buckets": [
{
"key": "Brand Nine",
"doc_count": 3,
"count(distinct property)": 3
},
{
"key": "Brand Ten",
"doc_count": 3,
"count(distinct property)": 2
},
...
{
"key": "Brand Two",
"doc_count": 1,
"count(distinct property)": 1
},
{
"key": "Brand Six",
"doc_count": 1,
"count(distinct property)": 1
},
{
"key": "Brand Four",
"doc_count": 5,
"count(distinct property)": 4
}
]
}
}
}Фасеты могут агрегировать по выражениям. Классический пример — сегментация цен по определенным диапазонам:
- SQL
- JSON
- PHP
- Python
- Python-asyncio
- Javascript
- Java
- C#
- Rust
- TypeScript
- Go
SELECT * FROM facetdemo FACET INTERVAL(price,200,400,600,800) AS price_range ;POST /search -d '
{
"table": "facetdemo",
"query":
{
"match_all": {}
},
"expressions":
{
"price_range": "INTERVAL(price,200,400,600,800)"
},
"aggs":
{
"group_property":
{
"terms":
{
"field": "price_range"
}
}
}
}$index->setName('facetdemo');
$search = $index->search('');
$search->limit(5);
$search->expression('price_range','INTERVAL(price,200,400,600,800)');
$search->facet('price_range','group_property');
$results = $search->get();
print_r($results->getFacets());res =searchApi.search({"table":"facetdemo","query":{"match_all":{}},"expressions":{"price_range":"INTERVAL(price,200,400,600,800)"},"aggs":{"group_property":{"terms":{"field":"price_range"}}}})res = await searchApi.search({"table":"facetdemo","query":{"match_all":{}},"expressions":{"price_range":"INTERVAL(price,200,400,600,800)"},"aggs":{"group_property":{"terms":{"field":"price_range"}}}})res = await searchApi.search({"table":"facetdemo","query":{"match_all":{}},"expressions":{"price_range":"INTERVAL(price,200,400,600,800)"},"aggs":{"group_property":{"terms":{"field":"price_range"}}}});searchRequest = new SearchRequest();
expressions = new HashMap<String,Object>(){{
put("price_range","INTERVAL(price,200,400,600,800)");
}};
searchRequest.setExpressions(expressions);
aggs = new HashMap<String,Object>(){{
put("group_property", new HashMap<String,Object>(){{
put("terms", new HashMap<String,Object>(){{
put("field","price_range");
}});
}});
}};
searchRequest.setIndex("facetdemo");
searchRequest.setLimit(5);
query = new HashMap<String,Object>();
query.put("match_all",null);
searchRequest.setQuery(query);
searchRequest.setAggs(aggs);
searchResponse = searchApi.search(searchRequest);var expr = new Dictionary<string, string> { {"price_range", "INTERVAL(price,200,400,600,800"} } ;
var agg = new Aggregation("group_property", "price_range");
object query = new { match_all=null };
var searchRequest = new SearchRequest("facetdemo", query);
searchRequest.Limit = 5;
searchRequest.Expressions = new List<Object> {expr};
searchRequest.Aggs = new List<Aggregation> {agg};
var searchResponse = searchApi.Search(searchRequest);let query = SearchQuery::new();
let aggTerms1 = AggTerms::new("price_range");
let agg1 = Aggregation {
terms: Some(Box::new(aggTerms1)),
..Default::default(),
};
let mut aggs = HashMap::new();
aggs.insert("group_property".to_string(), agg1);
let mut expr = HashMap::new();
expr.insert("price_range".to_string(), "INTERVAL(price,200,400,600,800");
let expressions: [HashMap; 1] = [expr];
let search_req = SearchRequest {
table: "facetdemo".to_string(),
query: Some(Box::new(query)),
expressions: serde_json::json!(expressions),
aggs: serde_json::json!(aggs),
limit: serde_json::json!(5),
..Default::default(),
};
let search_res = search_api.search(search_req).await;res = await searchApi.search({
index: 'test',
query: { match_all:{} },
expressions: { cat_range: "INTERVAL(cat,1,3)" }
aggs: {
expr_group: {
terms: { field : 'cat_range' }
}
}
});query := map[string]interface{} {}
searchRequest.SetQuery(query)
exprs := map[string]string{} { "cat_range": "INTERVAL(cat,1,3)" }
searchRequest.SetExpressions(exprs)
aggByExpr := manticoreclient.NewAggregation()
aggTerms := manticoreclient.NewAggregationTerms()
aggTerms.SetField("cat_range")
aggByExpr.SetTerms(aggTerms)
aggs := map[string]Aggregation{} { "expr_group": aggByExpr }
searchRequest.SetAggs(aggs)
res, _, _ := apiClient.SearchAPI.Search(context.Background()).SearchRequest(*searchRequest).Execute()+------+-------+----------+---------------------+-------------+-------------+---------------------------------------+------------+-------------+
| id | price | brand_id | title | brand_name | property | j | categories | price_range |
+------+-------+----------+---------------------+-------------+-------------+---------------------------------------+------------+-------------+
| 1 | 306 | 1 | Product Ten Three | Brand One | Six_Ten | {"prop1":66,"prop2":91,"prop3":"One"} | 10,11 | 1 |
...
+------+-------+----------+---------------------+-------------+-------------+---------------------------------------+------------+-------------+
20 rows in set (0.00 sec)
+-------------+----------+
| price_range | count(*) |
+-------------+----------+
| 0 | 1885 |
| 3 | 1973 |
| 4 | 2100 |
| 2 | 1999 |
| 1 | 2043 |
+-------------+----------+
5 rows in set (0.01 sec){
"took": 3,
"timed_out": false,
"hits": {
"total": 10000,
"hits": [
{
"_id": 1,
"_score": 1,
"_source": {
"price": 197,
"brand_id": 10,
"brand_name": "Brand Ten",
"categories": [
10
],
"price_range": 0
}
},
...
{
"_id": 20,
"_score": 1,
"_source": {
"price": 227,
"brand_id": 3,
"brand_name": "Brand Three",
"categories": [
12,
13
],
"price_range": 1
}
}
]
},
"aggregations": {
"group_property": {
"buckets": [
{
"key": 4,
"doc_count": 2100
},
{
"key": 3,
"doc_count": 1973
},
{
"key": 2,
"doc_count": 1999
},
{
"key": 1,
"doc_count": 2043
},
{
"key": 0,
"doc_count": 1885
}
]
}
}
}Array
(
[group_property] => Array
(
[buckets] => Array
(
[0] => Array
(
[key] => 4
[doc_count] => 2100
)
[1] => Array
(
[key] => 3
[doc_count] => 1973
)
[2] => Array
(
[key] => 2
[doc_count] => 1999
)
[3] => Array
(
[key] => 1
[doc_count] => 2043
)
[4] => Array
(
[key] => 0
[doc_count] => 1885
)
)
)
){'aggregations': {u'group_brand_id': {u'buckets': [{u'doc_count': 1019,
u'key': 10},
{u'doc_count': 954,
u'key': 9},
{u'doc_count': 1021,
u'key': 8},
{u'doc_count': 1011,
u'key': 7},
{u'doc_count': 997,
u'key': 6}]},
u'group_property': {u'buckets': [{u'doc_count': 11,
u'key': 1000},
{u'doc_count': 12,
u'key': 999},
{u'doc_count': 7,
u'key': 998},
{u'doc_count': 14,
u'key': 997},
{u'doc_count': 8,
u'key': 996}]}},
'hits': {'hits': [{u'_id': u'1',
u'_score': 1,
u'_source': {u'brand_id': 10,
u'brand_name': u'Brand Ten',
u'categories': [10],
u'price': 197,
u'property': u'Six',
u'title': u'Product Eight One'}},
{u'_id': u'2',
u'_score': 1,
u'_source': {u'brand_id': 6,
u'brand_name': u'Brand Six',
u'categories': [12, 13, 14],
u'price': 671,
u'property': u'Four',
u'title': u'Product Nine Seven'}},
{u'_id': u'3',
u'_score': 1,
u'_source': {u'brand_id': 3,
u'brand_name': u'Brand Three',
u'categories': [13, 14, 15],
u'price': 92,
u'property': u'Six',
u'title': u'Product Five Four'}},
{u'_id': u'4',
u'_score': 1,
u'_source': {u'brand_id': 10,
u'brand_name': u'Brand Ten',
u'categories': [11],
u'price': 713,
u'property': u'Five',
u'title': u'Product Eight Nine'}},
{u'_id': u'5',
u'_score': 1,
u'_source': {u'brand_id': 7,
u'brand_name': u'Brand Seven',
u'categories': [11, 12, 13],
u'price': 805,
u'property': u'Two',
u'title': u'Product Ten Three'}}],
'max_score': None,
'total': 10000},
'profile': None,
'timed_out': False,
'took': 0}{'aggregations': {u'group_brand_id': {u'buckets': [{u'doc_count': 1019,
u'key': 10},
{u'doc_count': 954,
u'key': 9},
{u'doc_count': 1021,
u'key': 8},
{u'doc_count': 1011,
u'key': 7},
{u'doc_count': 997,
u'key': 6}]},
u'group_property': {u'buckets': [{u'doc_count': 11,
u'key': 1000},
{u'doc_count': 12,
u'key': 999},
{u'doc_count': 7,
u'key': 998},
{u'doc_count': 14,
u'key': 997},
{u'doc_count': 8,
u'key': 996}]}},
'hits': {'hits': [{u'_id': u'1',
u'_score': 1,
u'_source': {u'brand_id': 10,
u'brand_name': u'Brand Ten',
u'categories': [10],
u'price': 197,
u'property': u'Six',
u'title': u'Product Eight One'}},
{u'_id': u'2',
u'_score': 1,
u'_source': {u'brand_id': 6,
u'brand_name': u'Brand Six',
u'categories': [12, 13, 14],
u'price': 671,
u'property': u'Four',
u'title': u'Product Nine Seven'}},
{u'_id': u'3',
u'_score': 1,
u'_source': {u'brand_id': 3,
u'brand_name': u'Brand Three',
u'categories': [13, 14, 15],
u'price': 92,
u'property': u'Six',
u'title': u'Product Five Four'}},
{u'_id': u'4',
u'_score': 1,
u'_source': {u'brand_id': 10,
u'brand_name': u'Brand Ten',
u'categories': [11],
u'price': 713,
u'property': u'Five',
u'title': u'Product Eight Nine'}},
{u'_id': u'5',
u'_score': 1,
u'_source': {u'brand_id': 7,
u'brand_name': u'Brand Seven',
u'categories': [11, 12, 13],
u'price': 805,
u'property': u'Two',
u'title': u'Product Ten Three'}}],
'max_score': None,
'total': 10000},
'profile': None,
'timed_out': False,
'took': 0}{"took":0,"timed_out":false,"hits":{"total":10000,"hits":[{"_id": 1,"_score":1,"_source":{"price":197,"brand_id":10,"brand_name":"Brand Ten","categories":[10],"title":"Product Eight One","property":"Six","price_range":0}},{"_id": 2,"_score":1,"_source":{"price":671,"brand_id":6,"brand_name":"Brand Six","categories":[12,13,14],"title":"Product Nine Seven","property":"Four","price_range":3}},{"_id": 3,"_score":1,"_source":{"price":92,"brand_id":3,"brand_name":"Brand Three","categories":[13,14,15],"title":"Product Five Four","property":"Six","price_range":0}},{"_id": 4,"_score":1,"_source":{"price":713,"brand_id":10,"brand_name":"Brand Ten","categories":[11],"title":"Product Eight Nine","property":"Five","price_range":3}},{"_id": 5,"_score":1,"_source":{"price":805,"brand_id":7,"brand_name":"Brand Seven","categories":[11,12,13],"title":"Product Ten Three","property":"Two","price_range":4}},{"_id": 6,"_score":1,"_source":{"price":420,"brand_id":2,"brand_name":"Brand Two","categories":[10,11],"title":"Product Two One","property":"Six","price_range":2}},{"_id": 7,"_score":1,"_source":{"price":412,"brand_id":9,"brand_name":"Brand Nine","categories":[10],"title":"Product Four Nine","property":"Eight","price_range":2}},{"_id": 8,"_score":1,"_source":{"price":300,"brand_id":9,"brand_name":"Brand Nine","categories":[13,14,15],"title":"Product Eight Four","property":"Five","price_range":1}},{"_id": 9,"_score":1,"_source":{"price":728,"brand_id":1,"brand_name":"Brand One","categories":[11],"title":"Product Nine Six","property":"Four","price_range":3}},{"_id": 10,"_score":1,"_source":{"price":622,"brand_id":3,"brand_name":"Brand Three","categories":[10,11],"title":"Product Six Seven","property":"Two","price_range":3}},{"_id": 11,"_score":1,"_source":{"price":462,"brand_id":5,"brand_name":"Brand Five","categories":[10,11],"title":"Product Ten Two","property":"Eight","price_range":2}},{"_id": 12,"_score":1,"_source":{"price":939,"brand_id":7,"brand_name":"Brand Seven","categories":[12,13],"title":"Product Nine Seven","property":"Six","price_range":4}},{"_id": 13,"_score":1,"_source":{"price":948,"brand_id":8,"brand_name":"Brand Eight","categories":[12],"title":"Product Ten One","property":"Six","price_range":4}},{"_id": 14,"_score":1,"_source":{"price":900,"brand_id":9,"brand_name":"Brand Nine","categories":[12,13,14],"title":"Product Ten Nine","property":"Three","price_range":4}},{"_id": 15,"_score":1,"_source":{"price":224,"brand_id":3,"brand_name":"Brand Three","categories":[13],"title":"Product Two Six","property":"Four","price_range":1}},{"_id": 16,"_score":1,"_source":{"price":713,"brand_id":10,"brand_name":"Brand Ten","categories":[12],"title":"Product Two Four","property":"Six","price_range":3}},{"_id": 17,"_score":1,"_source":{"price":510,"brand_id":2,"brand_name":"Brand Two","categories":[10],"title":"Product Ten Two","property":"Seven","price_range":2}},{"_id": 18,"_score":1,"_source":{"price":702,"brand_id":10,"brand_name":"Brand Ten","categories":[12,13],"title":"Product Nine One","property":"Three","price_range":3}},{"_id": 19,"_score":1,"_source":{"price":836,"brand_id":4,"brand_name":"Brand Four","categories":[10,11,12],"title":"Product Four Five","property":"Two","price_range":4}},{"_id": 20,"_score":1,"_source":{"price":227,"brand_id":3,"brand_name":"Brand Three","categories":[12,13],"title":"Product Three Four","property":"Ten","price_range":1}}]}}class SearchResponse {
took: 0
timedOut: false
aggregations: {group_property={buckets=[{key=4, doc_count=2100}, {key=3, doc_count=1973}, {key=2, doc_count=1999}, {key=1, doc_count=2043}, {key=0, doc_count=1885}]}}
hits: class SearchResponseHits {
maxScore: null
total: 10000
hits: [{_id=1, _score=1, _source={price=197, brand_id=10, brand_name=Brand Ten, categories=[10], title=Product Eight One, property=Six, price_range=0}}, {_id=2, _score=1, _source={price=671, brand_id=6, brand_name=Brand Six, categories=[12, 13, 14], title=Product Nine Seven, property=Four, price_range=3}}, {_id=3, _score=1, _source={price=92, brand_id=3, brand_name=Brand Three, categories=[13, 14, 15], title=Product Five Four, property=Six, price_range=0}}, {_id=4, _score=1, _source={price=713, brand_id=10, brand_name=Brand Ten, categories=[11], title=Product Eight Nine, property=Five, price_range=3}}, {_id=5, _score=1, _source={price=805, brand_id=7, brand_name=Brand Seven, categories=[11, 12, 13], title=Product Ten Three, property=Two, price_range=4}}]
}
profile: null
}class SearchResponse {
took: 0
timedOut: false
aggregations: {group_property={buckets=[{key=4, doc_count=2100}, {key=3, doc_count=1973}, {key=2, doc_count=1999}, {key=1, doc_count=2043}, {key=0, doc_count=1885}]}}
hits: class SearchResponseHits {
maxScore: null
total: 10000
hits: [{_id=1, _score=1, _source={price=197, brand_id=10, brand_name=Brand Ten, categories=[10], title=Product Eight One, property=Six, price_range=0}}, {_id=2, _score=1, _source={price=671, brand_id=6, brand_name=Brand Six, categories=[12, 13, 14], title=Product Nine Seven, property=Four, price_range=3}}, {_id=3, _score=1, _source={price=92, brand_id=3, brand_name=Brand Three, categories=[13, 14, 15], title=Product Five Four, property=Six, price_range=0}}, {_id=4, _score=1, _source={price=713, brand_id=10, brand_name=Brand Ten, categories=[11], title=Product Eight Nine, property=Five, price_range=3}}, {_id=5, _score=1, _source={price=805, brand_id=7, brand_name=Brand Seven, categories=[11, 12, 13], title=Product Ten Three, property=Two, price_range=4}}]
}
profile: null
}class SearchResponse {
took: 0
timedOut: false
aggregations: {group_property={buckets=[{key=4, doc_count=2100}, {key=3, doc_count=1973}, {key=2, doc_count=1999}, {key=1, doc_count=2043}, {key=0, doc_count=1885}]}}
hits: class SearchResponseHits {
maxScore: null
total: 10000
hits: [{_id=1, _score=1, _source={price=197, brand_id=10, brand_name=Brand Ten, categories=[10], title=Product Eight One, property=Six, price_range=0}}, {_id=2, _score=1, _source={price=671, brand_id=6, brand_name=Brand Six, categories=[12, 13, 14], title=Product Nine Seven, property=Four, price_range=3}}, {_id=3, _score=1, _source={price=92, brand_id=3, brand_name=Brand Three, categories=[13, 14, 15], title=Product Five Four, property=Six, price_range=0}}, {_id=4, _score=1, _source={price=713, brand_id=10, brand_name=Brand Ten, categories=[11], title=Product Eight Nine, property=Five, price_range=3}}, {_id=5, _score=1, _source={price=805, brand_id=7, brand_name=Brand Seven, categories=[11, 12, 13], title=Product Ten Three, property=Two, price_range=4}}]
}
profile: null
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 5,
"hits": [
{
"_id": 1,
"_score": 1,
"_source": {
"content": "Text 1",
"name": "Doc 1",
"cat": 1,
"cat_range": 1
}
},
...
{
"_id": 5,
"_score": 1,
"_source": {
"content": "Text 5",
"name": "Doc 5",
"cat": 4,
"cat_range": 2,
}
}
]
},
"aggregations": {
"expr_group": {
"buckets": [
{
"key": 0,
"doc_count": 0
},
{
"key": 1,
"doc_count": 3
},
{
"key": 2,
"doc_count": 2
}
]
}
}
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 5,
"hits": [
{
"_id": 1,
"_score": 1,
"_source": {
"content": "Text 1",
"name": "Doc 1",
"cat": 1,
"cat_range": 1
}
},
...
{
"_id": 5,
"_score": 1,
"_source": {
"content": "Text 5",
"name": "Doc 5",
"cat": 4,
"cat_range": 2
}
}
]
},
"aggregations": {
"expr_group": {
"buckets": [
{
"key": 0,
"doc_count": 0
},
{
"key": 1,
"doc_count": 3
},
{
"key": 2,
"doc_count": 2
}
]
}
}
}Facets can aggregate over multi-level grouping, with the result set being the same as if the query performed a multi-level grouping:
- SQL
SELECT *,INTERVAL(price,200,400,600,800) AS price_range FROM facetdemo
FACET price_range AS price_range,brand_name ORDER BY brand_name asc;+------+-------+----------+---------------------+-------------+-------------+---------------------------------------+------------+-------------+
| id | price | brand_id | title | brand_name | property | j | categories | price_range |
+------+-------+----------+---------------------+-------------+-------------+---------------------------------------+------------+-------------+
| 1 | 306 | 1 | Product Ten Three | Brand One | Six_Ten | {"prop1":66,"prop2":91,"prop3":"One"} | 10,11 | 1 |
...
+------+-------+----------+---------------------+-------------+-------------+---------------------------------------+------------+-------------+
20 rows in set (0.00 sec)
+--------------+-------------+----------+
| fprice_range | brand_name | count(*) |
+--------------+-------------+----------+
| 1 | Brand Eight | 197 |
| 4 | Brand Eight | 235 |
| 3 | Brand Eight | 203 |
| 2 | Brand Eight | 201 |
| 0 | Brand Eight | 197 |
| 4 | Brand Five | 230 |
| 2 | Brand Five | 197 |
| 1 | Brand Five | 204 |
| 3 | Brand Five | 193 |
| 0 | Brand Five | 183 |
| 1 | Brand Four | 195 |
...Facets can aggregate over histogram values by constructing fixed-size buckets over the values. The key function is:
key_of_the_bucket = interval + offset * floor ( ( value - offset ) / interval )
The histogram argument interval must be positive, and the histogram argument offset must be positive and less than interval. By default, the buckets are returned as an array. The histogram argument keyed makes the response a dictionary with the bucket keys.
- SQL
- JSON
- JSON 2
SELECT COUNT(*), HISTOGRAM(price, {hist_interval=100}) as price_range FROM facets GROUP BY price_range ORDER BY price_range ASC;POST /search -d '
{
"size": 0,
"table": "facets",
"aggs": {
"price_range": {
"histogram": {
"field": "price",
"interval": 300
}
}
}
}'POST /search -d '
{
"size": 0,
"table": "facets",
"aggs": {
"price_range": {
"histogram": {
"field": "price",
"interval": 300,
"keyed": true
}
}
}
}'+----------+-------------+
| count(*) | price_range |
+----------+-------------+
| 5 | 0 |
| 5 | 100 |
| 1 | 300 |
| 4 | 400 |
| 1 | 500 |
| 3 | 700 |
| 1 | 900 |
+----------+-------------+{
"took": 0,
"timed_out": false,
"hits": {
"total": 20,
"total_relation": "eq",
"hits": []
},
"aggregations": {
"price_range": {
"buckets": [
{
"key": 0,
"doc_count": 10
},
{
"key": 300,
"doc_count": 6
},
{
"key": 600,
"doc_count": 3
},
{
"key": 900,
"doc_count": 1
}
]
}
}
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 20,
"total_relation": "eq",
"hits": []
},
"aggregations": {
"price_range": {
"buckets": {
"0": {
"key": 0,
"doc_count": 10
},
"300": {
"key": 300,
"doc_count": 6
},
"600": {
"key": 600,
"doc_count": 3
},
"900": {
"key": 900,
"doc_count": 1
}
}
}
}
}Facets can aggregate over histogram date values, which is similar to the normal histogram. The difference is that the interval is specified using a date or time expression. Such expressions require special support because the intervals are not always of fixed length. Values are rounded to the closest bucket using the following key function:
key_of_the_bucket = interval * floor ( value / interval )
The histogram parameter calendar_interval understands months to have different amounts of days.
Unlike calendar_interval, the fixed_interval parameter uses a fixed number of units and does not deviate, regardless of where it falls on the calendar. However fixed_interval cannot process units such as months because a month is not a fixed quantity. Attempting to specify units like weeks or months for fixed_interval will result in an error.
The accepted intervals are described in the date_histogram expression. By default, the buckets are returned as an array. The histogram argument keyed makes the response a dictionary with the bucket keys.
- SQL
- JSON
SELECT count(*), DATE_HISTOGRAM(tm, {calendar_interval='month'}) AS months FROM idx_dates GROUP BY months ORDER BY months ASCPOST /search -d '
{
"table": "idx_dates",
"size": 0,
"aggs": {
"months": {
"date_histogram": {
"field": "tm",
"keyed": true,
"calendar_interval": "month"
}
}
}
}'+----------+------------+
| count(*) | months |
+----------+------------+
| 442 | 1485907200 |
| 744 | 1488326400 |
| 720 | 1491004800 |
| 230 | 1493596800 |
+----------+------------+{
"timed_out": false,
"hits": {
"total": 2136,
"total_relation": "eq",
"hits": []
},
"aggregations": {
"months": {
"buckets": {
"2017-02-01T00:00:00": {
"key": 1485907200,
"key_as_string": "2017-02-01T00:00:00",
"doc_count": 442
},
"2017-03-01T00:00:00": {
"key": 1488326400,
"key_as_string": "2017-03-01T00:00:00",
"doc_count": 744
},
"2017-04-01T00:00:00": {
"key": 1491004800,
"key_as_string": "2017-04-01T00:00:00",
"doc_count": 720
},
"2017-05-01T00:00:00": {
"key": 1493596800,
"key_as_string": "2017-05-01T00:00:00",
"doc_count": 230
}
}
}
}
}Facets can aggregate over a set of ranges. The values are checked against the bucket range, where each bucket includes the from value and excludes the to value from the range.
Setting the keyed property to true makes the response a dictionary with the bucket keys rather than an array.
- SQL
- JSON
- JSON 2
SELECT COUNT(*), RANGE(price, {range_to=150},{range_from=150,range_to=300},{range_from=300}) price_range FROM facets GROUP BY price_range ORDER BY price_range ASC;POST /search -d '
{
"size": 0,
"table": "facets",
"aggs": {
"price_range": {
"range": {
"field": "price",
"ranges": [
{
"to": 99
},
{
"from": 99,
"to": 550
},
{
"from": 550
}
]
}
}
}
}'POST /search -d '
{
"size":0,
"table":"facets",
"aggs":{
"price_range":{
"range":{
"field":"price",
"keyed":true,
"ranges":[
{
"from":100,
"to":399
},
{
"from":399
}
]
}
}
}
}'+----------+-------------+
| count(*) | price_range |
+----------+-------------+
| 8 | 0 |
| 2 | 1 |
| 10 | 2 |
+----------+-------------+{
"took": 0,
"timed_out": false,
"hits": {
"total": 20,
"total_relation": "eq",
"hits": []
},
"aggregations": {
"price_range": {
"buckets": [
{
"key": "*-99",
"to": 99,
"doc_count": 5
},
{
"key": "99-550",
"from": 99,
"to": 550,
"doc_count": 11
},
{
"key": "550-*",
"from": 550,
"doc_count": 4
}
]
}
}
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 20,
"total_relation": "eq",
"hits": []
},
"aggregations": {
"price_range": {
"buckets": {
"100-399": {
"from": 100,
"to": 399,
"doc_count": 6
},
"399-*": {
"from": 399,
"doc_count": 9
}
}
}
}
}Facets can aggregate over a set of date ranges, which is similar to the normal range. The difference is that the from and to values can be expressed in Date math expressions. This aggregation includes the from value and excludes the to value for each range. Setting the keyed property to true makes the response a dictionary with the bucket keys rather than an array.
- SQL
- JSON
SELECT COUNT(*), DATE_RANGE(tm, {range_to='2017||+2M/M'},{range_from='2017||+2M/M',range_to='2017||+5M/M'},{range_from='2017||+5M/M'}) AS points FROM idx_dates GROUP BY points ORDER BY points ASC;POST /search -d '
{
"table": "idx_dates",
"size": 0,
"aggs": {
"points": {
"date_range": {
"field": "tm",
"keyed": true,
"ranges": [
{
"to": "2017||+2M/M"
},
{
"from": "2017||+2M/M",
"to": "2017||+4M/M"
},
{
"from": "2017||+4M/M",
"to": "2017||+5M/M"
},
{
"from": "2017||+5M/M"
}
]
}
}
}
}'+----------+--------+
| count(*) | points |
+----------+--------+
| 442 | 0 |
| 1464 | 1 |
| 230 | 2 |
+----------+--------+{
"timed_out": false,
"hits": {
"total": 2136,
"total_relation": "eq",
"hits": []
},
"aggregations": {
"points": {
"buckets": {
"*-2017-03-01T00:00:00": {
"to": "2017-03-01T00:00:00",
"doc_count": 442
},
"2017-03-01T00:00:00-2017-04-01T00:00:00": {
"from": "2017-03-01T00:00:00",
"to": "2017-04-01T00:00:00",
"doc_count": 744
},
"2017-04-01T00:00:00-2017-05-01T00:00:00": {
"from": "2017-04-01T00:00:00",
"to": "2017-05-01T00:00:00",
"doc_count": 720
},
"2017-05-01T00:00:00-*": {
"from": "2017-05-01T00:00:00",
"doc_count": 230
}
}
}
}
}Facets support the ORDER BY clause just like a standard query. Each facet can have its own ordering, and the facet ordering doesn't affect the main result set's ordering, which is determined by the main query's ORDER BY. Sorting can be done on attribute name, count (using COUNT(*), COUNT(DISTINCT attribute_name)), or the special FACET() function, which provides the aggregated data values. By default, a query with ORDER BY COUNT(*) will sort in descending order.
- SQL
- JSON
SELECT * FROM facetdemo
FACET brand_name BY brand_id ORDER BY FACET() ASC
FACET brand_name BY brand_id ORDER BY brand_name ASC
FACET brand_name BY brand_id order BY COUNT(*) DESC;
FACET brand_name BY brand_id order BY COUNT(*);POST /search -d '
{
"table":"table_name",
"aggs":{
"group_property":{
"terms":{
"field":"a"
},
"sort":[
{
"count(*)":{
"order":"desc"
}
}
]
}
}
}'+------+-------+----------+---------------------+-------------+-------------+---------------------------------------+------------+
| id | price | brand_id | title | brand_name | property | j | categories |
+------+-------+----------+---------------------+-------------+-------------+---------------------------------------+------------+
| 1 | 306 | 1 | Product Ten Three | Brand One | Six_Ten | {"prop1":66,"prop2":91,"prop3":"One"} | 10,11 |
...
| 20 | 31 | 9 | Product Four One | Brand Nine | Ten_Four | {"prop1":79,"prop2":42,"prop3":"One"} | 12,13,14 |
+------+-------+----------+---------------------+-------------+-------------+---------------------------------------+------------+
20 rows in set (0.01 sec)
+-------------+----------+
| brand_name | count(*) |
+-------------+----------+
| Brand One | 1013 |
| Brand Two | 990 |
| Brand Three | 1016 |
| Brand Four | 994 |
| Brand Five | 1007 |
| Brand Six | 1039 |
| Brand Seven | 965 |
| Brand Eight | 1033 |
| Brand Nine | 944 |
| Brand Ten | 998 |
+-------------+----------+
10 rows in set (0.01 sec)
+-------------+----------+
| brand_name | count(*) |
+-------------+----------+
| Brand Eight | 1033 |
| Brand Five | 1007 |
| Brand Four | 994 |
| Brand Nine | 944 |
| Brand One | 1013 |
| Brand Seven | 965 |
| Brand Six | 1039 |
| Brand Ten | 998 |
| Brand Three | 1016 |
| Brand Two | 990 |
+-------------+----------+
10 rows in set (0.01 sec)
+-------------+----------+
| brand_name | count(*) |
+-------------+----------+
| Brand Six | 1039 |
| Brand Eight | 1033 |
| Brand Three | 1016 |
| Brand One | 1013 |
| Brand Five | 1007 |
| Brand Ten | 998 |
| Brand Four | 994 |
| Brand Two | 990 |
| Brand Seven | 965 |
| Brand Nine | 944 |
+-------------+----------+
10 rows in set (0.01 sec){
"took": 0,
"timed_out": false,
"hits": {
"total": 6,
"total_relation": "eq",
"hits": [
{
"_id": 1515697460415037554,
"_score": 1,
"_source": {
"a": 1
}
},
{
"_id": 1515697460415037555,
"_score": 1,
"_source": {
"a": 2
}
},
{
"_id": 1515697460415037556,
"_score": 1,
"_source": {
"a": 2
}
},
{
"_id": 1515697460415037557,
"_score": 1,
"_source": {
"a": 3
}
},
{
"_id": 1515697460415037558,
"_score": 1,
"_source": {
"a": 3
}
},
{
"_id": 1515697460415037559,
"_score": 1,
"_source": {
"a": 3
}
}
]
},
"aggregations": {
"group_property": {
"buckets": [
{
"key": 3,
"doc_count": 3
},
{
"key": 2,
"doc_count": 2
},
{
"key": 1,
"doc_count": 1
}
]
}
}
}By default, each facet result set is limited to 20 values. The number of facet values can be controlled with the LIMIT clause individually for each facet by providing either a number of values to return in the format LIMIT count or with an offset as LIMIT offset, count.
The maximum facet values that can be returned is limited by the query's max_matches setting. If you want to implement dynamic max_matches (limiting max_matches to offset + per page for better performance), it must be taken into account that a too low max_matches value can affect the number of facet values. In this case, a minimum max_matches value should be used that is sufficient to cover the number of facet values.
- SQL
- JSON
- PHP
- Python
- Python-asyncio
- Javascript
- Java
- C#
- Rust
- TypeScript
- Go
SELECT * FROM facetdemo
FACET brand_name BY brand_id ORDER BY FACET() ASC LIMIT 0,1
FACET brand_name BY brand_id ORDER BY brand_name ASC LIMIT 2,4
FACET brand_name BY brand_id order BY COUNT(*) DESC LIMIT 4;POST /search -d '
{
"table" : "facetdemo",
"query" : {"match_all" : {} },
"limit": 5,
"aggs" :
{
"group_property" :
{
"terms" :
{
"field":"price",
"size":1,
}
},
"group_brand_id" :
{
"terms" :
{
"field":"brand_id",
"size":3
}
}
}
}
'$index->setName('facetdemo');
$search = $index->search('');
$search->limit(5);
$search->facet('price','price',1);
$search->facet('brand_id','group_brand_id',3);
$results = $search->get();
print_r($results->getFacets());res =searchApi.search({"table":"facetdemo","query":{"match_all":{}},"limit":5,"aggs":{"group_property":{"terms":{"field":"price","size":1,}},"group_brand_id":{"terms":{"field":"brand_id","size":3}}}})res = await searchApi.search({"table":"facetdemo","query":{"match_all":{}},"limit":5,"aggs":{"group_property":{"terms":{"field":"price","size":1,}},"group_brand_id":{"terms":{"field":"brand_id","size":3}}}})res = await searchApi.search({"table":"facetdemo","query":{"match_all":{}},"limit":5,"aggs":{"group_property":{"terms":{"field":"price","size":1,}},"group_brand_id":{"terms":{"field":"brand_id","size":3}}}});searchRequest = new SearchRequest();
aggs = new HashMap<String,Object>(){{
put("group_property", new HashMap<String,Object>(){{
put("terms", new HashMap<String,Object>(){{
put("field","price");
put("size",1);
}});
}});
put("group_brand_id", new HashMap<String,Object>(){{
put("terms", new HashMap<String,Object>(){{
put("field","brand_id");
put("size",3);
}});
}});
}};
searchRequest.setIndex("facetdemo");
searchRequest.setLimit(5);
query = new HashMap<String,Object>();
query.put("match_all",null);
searchRequest.setQuery(query);
searchRequest.setAggs(aggs);
searchResponse = searchApi.search(searchRequest);var agg1 = new Aggregation("group_property", "price");
agg1.Size = 1;
var agg2 = new Aggregation("group_brand_id", "brand_id");
agg2.Size = 3;
agg2.Size = 100;
object query = new { match_all=null };
var searchRequest = new SearchRequest("facetdemo", query);
searchRequest.Aggs = new List<Aggregation> {agg1, agg2};
var searchResponse = searchApi.Search(searchRequest);let query = SearchQuery::new();
let aggTerms1 = AggTerms {
field: "price".to_string(),
size: Some(1),
};
let agg1 = Aggregation {
terms: Some(Box::new(aggTerms1)),
..Default::default(),
};
let aggTerms1 = AggTerms {
field: "brand_id".to_string(),
size: Some(3),
};
let agg2 = Aggregation {
terms: Some(Box::new(aggTerms2)),
..Default::default(),
};
let mut aggs = HashMap::new();
aggs.insert("group_property".to_string(), agg1);
aggs.insert("group_brand_id".to_string(), agg2);
let search_req = SearchRequest {
table: "facetdemo".to_string(),
query: Some(Box::new(query)),
aggs: serde_json::json!(aggs),
limit: serde_json::json!(5),
..Default::default(),
};
let search_res = search_api.search(search_req).await;res = await searchApi.search({
index: 'test',
query: { match_all:{} },
aggs: {
name_group: {
terms: { field : 'name', size: 1 }
},
cat_group: {
terms: { field: 'cat' }
}
}
});query := map[string]interface{} {}
searchRequest.SetQuery(query)
aggByName := manticoreclient.NewAggregation()
aggTerms := manticoreclient.NewAggregationTerms()
aggTerms.SetField("name")
aggByName.SetTerms(aggTerms)
aggByName.SetSize(1)
aggByCat := manticoreclient.NewAggregation()
aggTerms.SetField("cat")
aggByCat.SetTerms(aggTerms)
aggs := map[string]Aggregation{} { "name_group": aggByName, "cat_group": aggByCat }
searchRequest.SetAggs(aggs)
res, _, _ := apiClient.SearchAPI.Search(context.Background()).SearchRequest(*searchRequest).Execute()+------+-------+----------+---------------------+-------------+-------------+---------------------------------------+------------+
| id | price | brand_id | title | brand_name | property | j | categories |
+------+-------+----------+---------------------+-------------+-------------+---------------------------------------+------------+
| 1 | 306 | 1 | Product Ten Three | Brand One | Six_Ten | {"prop1":66,"prop2":91,"prop3":"One"} | 10,11 |
...
| 20 | 31 | 9 | Product Four One | Brand Nine | Ten_Four | {"prop1":79,"prop2":42,"prop3":"One"} | 12,13,14 |
+------+-------+----------+---------------------+-------------+-------------+---------------------------------------+------------+
20 rows in set (0.01 sec)
+-------------+----------+
| brand_name | count(*) |
+-------------+----------+
| Brand One | 1013 |
+-------------+----------+
1 rows in set (0.01 sec)
+-------------+----------+
| brand_name | count(*) |
+-------------+----------+
| Brand Four | 994 |
| Brand Nine | 944 |
| Brand One | 1013 |
| Brand Seven | 965 |
+-------------+----------+
4 rows in set (0.01 sec)
+-------------+----------+
| brand_name | count(*) |
+-------------+----------+
| Brand Six | 1039 |
| Brand Eight | 1033 |
| Brand Three | 1016 |
+-------------+----------+
3 rows in set (0.01 sec){
"took": 3,
"timed_out": false,
"hits": {
"total": 10000,
"hits": [
{
"_id": 1,
"_score": 1,
"_source": {
"price": 197,
"brand_id": 10,
"brand_name": "Brand Ten",
"categories": [
10
]
}
},
...
{
"_id": 5,
"_score": 1,
"_source": {
"price": 805,
"brand_id": 7,
"brand_name": "Brand Seven",
"categories": [
11,
12,
13
]
}
}
]
},
"aggregations": {
"group_property": {
"buckets": [
{
"key": 1000,
"doc_count": 11
}
]
},
"group_brand_id": {
"buckets": [
{
"key": 10,
"doc_count": 1019
},
{
"key": 9,
"doc_count": 954
},
{
"key": 8,
"doc_count": 1021
}
]
}
}
}Array
(
[price] => Array
(
[buckets] => Array
(
[0] => Array
(
[key] => 1000
[doc_count] => 11
)
)
)
[group_brand_id] => Array
(
[buckets] => Array
(
[0] => Array
(
[key] => 10
[doc_count] => 1019
)
[1] => Array
(
[key] => 9
[doc_count] => 954
)
[2] => Array
(
[key] => 8
[doc_count] => 1021
)
)
)
){'aggregations': {u'group_brand_id': {u'buckets': [{u'doc_count': 1019,
u'key': 10},
{u'doc_count': 954,
u'key': 9},
{u'doc_count': 1021,
u'key': 8}]},
u'group_property': {u'buckets': [{u'doc_count': 11,
u'key': 1000}]}},
'hits': {'hits': [{u'_id': u'1',
u'_score': 1,
u'_source': {u'brand_id': 10,
u'brand_name': u'Brand Ten',
u'categories': [10],
u'price': 197,
u'property': u'Six',
u'title': u'Product Eight One'}},
{u'_id': u'2',
u'_score': 1,
u'_source': {u'brand_id': 6,
u'brand_name': u'Brand Six',
u'categories': [12, 13, 14],
u'price': 671,
u'property': u'Four',
u'title': u'Product Nine Seven'}},
{u'_id': u'3',
u'_score': 1,
u'_source': {u'brand_id': 3,
u'brand_name': u'Brand Three',
u'categories': [13, 14, 15],
u'price': 92,
u'property': u'Six',
u'title': u'Product Five Four'}},
{u'_id': u'4',
u'_score': 1,
u'_source': {u'brand_id': 10,
u'brand_name': u'Brand Ten',
u'categories': [11],
u'price': 713,
u'property': u'Five',
u'title': u'Product Eight Nine'}},
{u'_id': u'5',
u'_score': 1,
u'_source': {u'brand_id': 7,
u'brand_name': u'Brand Seven',
u'categories': [11, 12, 13],
u'price': 805,
u'property': u'Two',
u'title': u'Product Ten Three'}}],
'max_score': None,
'total': 10000},
'profile': None,
'timed_out': False,
'took': 0}{'aggregations': {u'group_brand_id': {u'buckets': [{u'doc_count': 1019,
u'key': 10},
{u'doc_count': 954,
u'key': 9},
{u'doc_count': 1021,
u'key': 8}]},
u'group_property': {u'buckets': [{u'doc_count': 11,
u'key': 1000}]}},
'hits': {'hits': [{u'_id': u'1',
u'_score': 1,
u'_source': {u'brand_id': 10,
u'brand_name': u'Brand Ten',
u'categories': [10],
u'price': 197,
u'property': u'Six',
u'title': u'Product Eight One'}},
{u'_id': u'2',
u'_score': 1,
u'_source': {u'brand_id': 6,
u'brand_name': u'Brand Six',
u'categories': [12, 13, 14],
u'price': 671,
u'property': u'Four',
u'title': u'Product Nine Seven'}},
{u'_id': u'3',
u'_score': 1,
u'_source': {u'brand_id': 3,
u'brand_name': u'Brand Three',
u'categories': [13, 14, 15],
u'price': 92,
u'property': u'Six',
u'title': u'Product Five Four'}},
{u'_id': u'4',
u'_score': 1,
u'_source': {u'brand_id': 10,
u'brand_name': u'Brand Ten',
u'categories': [11],
u'price': 713,
u'property': u'Five',
u'title': u'Product Eight Nine'}},
{u'_id': u'5',
u'_score': 1,
u'_source': {u'brand_id': 7,
u'brand_name': u'Brand Seven',
u'categories': [11, 12, 13],
u'price': 805,
u'property': u'Two',
u'title': u'Product Ten Three'}}],
'max_score': None,
'total': 10000},
'profile': None,
'timed_out': False,
'took': 0}{"took":0,"timed_out":false,"hits":{"total":10000,"hits":[{"_id": 1,"_score":1,"_source":{"price":197,"brand_id":10,"brand_name":"Brand Ten","categories":[10],"title":"Product Eight One","property":"Six"}},{"_id": 2,"_score":1,"_source":{"price":671,"brand_id":6,"brand_name":"Brand Six","categories":[12,13,14],"title":"Product Nine Seven","property":"Four"}},{"_id": 3,"_score":1,"_source":{"price":92,"brand_id":3,"brand_name":"Brand Three","categories":[13,14,15],"title":"Product Five Four","property":"Six"}},{"_id": 4,"_score":1,"_source":{"price":713,"brand_id":10,"brand_name":"Brand Ten","categories":[11],"title":"Product Eight Nine","property":"Five"}},{"_id": 5,"_score":1,"_source":{"price":805,"brand_id":7,"brand_name":"Brand Seven","categories":[11,12,13],"title":"Product Ten Three","property":"Two"}}]}}class SearchResponse {
took: 0
timedOut: false
aggregations: {group_property={buckets=[{key=1000, doc_count=11}]}, group_brand_id={buckets=[{key=10, doc_count=1019}, {key=9, doc_count=954}, {key=8, doc_count=1021}]}}
hits: class SearchResponseHits {
maxScore: null
total: 10000
hits: [{_id=1, _score=1, _source={price=197, brand_id=10, brand_name=Brand Ten, categories=[10], title=Product Eight One, property=Six}}, {_id=2, _score=1, _source={price=671, brand_id=6, brand_name=Brand Six, categories=[12, 13, 14], title=Product Nine Seven, property=Four}}, {_id=3, _score=1, _source={price=92, brand_id=3, brand_name=Brand Three, categories=[13, 14, 15], title=Product Five Four, property=Six}}, {_id=4, _score=1, _source={price=713, brand_id=10, brand_name=Brand Ten, categories=[11], title=Product Eight Nine, property=Five}}, {_id=5, _score=1, _source={price=805, brand_id=7, brand_name=Brand Seven, categories=[11, 12, 13], title=Product Ten Three, property=Two}}]
}
profile: null
}class SearchResponse {
took: 0
timedOut: false
aggregations: {group_property={buckets=[{key=1000, doc_count=11}]}, group_brand_id={buckets=[{key=10, doc_count=1019}, {key=9, doc_count=954}, {key=8, doc_count=1021}]}}
hits: class SearchResponseHits {
maxScore: null
total: 10000
hits: [{_id=1, _score=1, _source={price=197, brand_id=10, brand_name=Brand Ten, categories=[10], title=Product Eight One, property=Six}}, {_id=2, _score=1, _source={price=671, brand_id=6, brand_name=Brand Six, categories=[12, 13, 14], title=Product Nine Seven, property=Four}}, {_id=3, _score=1, _source={price=92, brand_id=3, brand_name=Brand Three, categories=[13, 14, 15], title=Product Five Four, property=Six}}, {_id=4, _score=1, _source={price=713, brand_id=10, brand_name=Brand Ten, categories=[11], title=Product Eight Nine, property=Five}}, {_id=5, _score=1, _source={price=805, brand_id=7, brand_name=Brand Seven, categories=[11, 12, 13], title=Product Ten Three, property=Two}}]
}
profile: null
}class SearchResponse {
took: 0
timedOut: false
aggregations: {group_property={buckets=[{key=1000, doc_count=11}]}, group_brand_id={buckets=[{key=10, doc_count=1019}, {key=9, doc_count=954}, {key=8, doc_count=1021}]}}
hits: class SearchResponseHits {
maxScore: null
total: 10000
hits: [{_id=1, _score=1, _source={price=197, brand_id=10, brand_name=Brand Ten, categories=[10], title=Product Eight One, property=Six}}, {_id=2, _score=1, _source={price=671, brand_id=6, brand_name=Brand Six, categories=[12, 13, 14], title=Product Nine Seven, property=Four}}, {_id=3, _score=1, _source={price=92, brand_id=3, brand_name=Brand Three, categories=[13, 14, 15], title=Product Five Four, property=Six}}, {_id=4, _score=1, _source={price=713, brand_id=10, brand_name=Brand Ten, categories=[11], title=Product Eight Nine, property=Five}}, {_id=5, _score=1, _source={price=805, brand_id=7, brand_name=Brand Seven, categories=[11, 12, 13], title=Product Ten Three, property=Two}}]
}
profile: null
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 5,
"hits": [
{
"_id": 1,
"_score": 1,
"_source": {
"content": "Text 1",
"name": "Doc 1",
"cat": 1
}
},
...
{
"_id": 5,
"_score": 1,
"_source": {
"content": "Text 5",
"name": "Doc 5",
"cat": 4
}
}
]
},
"aggregations": {
"name_group": {
"buckets": [
{
"key": "Doc 1",
"doc_count": 1
}
]
},
"cat_group": {
"buckets": [
{
"key": 1,
"doc_count": 2
},
...
{
"key": 4,
"doc_count": 1
}
]
}
}
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 5,
"hits": [
{
"_id": 1,
"_score": 1,
"_source": {
"content": "Text 1",
"name": "Doc 1",
"cat": 1
}
},
...
{
"_id": 5,
"_score": 1,
"_source": {
"content": "Text 5",
"name": "Doc 5",
"cat": 4
}
}
]
},
"aggregations": {
"name_group": {
"buckets": [
{
"key": "Doc 1",
"doc_count": 1
}
]
},
"cat_group": {
"buckets": [
{
"key": 1,
"doc_count": 2
},
...
{
"key": 4,
"doc_count": 1
}
]
}
}
}При использовании SQL поиск с фасетами возвращает несколько наборов результатов. Клиент/библиотека/коннектор MySQL, используемый должен поддерживать несколько наборов результатов, чтобы получить доступ к наборам результатов фасетов.
Внутренне FACET является сокращением для выполнения мультизапроса, где первый запрос содержит основной поисковый запрос, а остальные запросы в пакете имеют каждый свою кластеризацию. Как и в случае мультизапроса, общая оптимизация запросов может сработать для фасетного поиска, что означает, что поисковый запрос выполняется только один раз, а фасеты работают с результатом поискового запроса, при этом каждый фасет добавляет лишь часть времени к общему времени запроса.
Чтобы проверить, был ли фасетный поиск выполнен в оптимизированном режиме, вы можете посмотреть в журнал запросов, где все записанные запросы будут содержать строку xN, где N — количество запросов, выполненных в оптимизированной группе. Кроме того, вы можете проверить вывод оператора SHOW META, который отобразит метрику multiplier:
- SQL
SELECT * FROM facetdemo FACET brand_id FACET price FACET categories;
SHOW META LIKE 'multiplier';+------+-------+----------+---------------------+-------------+-------------+---------------------------------------+------------+
| id | price | brand_id | title | brand_name | property | j | categories |
+------+-------+----------+---------------------+-------------+-------------+---------------------------------------+------------+
| 1 | 306 | 1 | Product Ten Three | Brand One | Six_Ten | {"prop1":66,"prop2":91,"prop3":"One"} | 10,11 |
...
+----------+----------+
| brand_id | count(*) |
+----------+----------+
| 1 | 1013 |
...
+-------+----------+
| price | count(*) |
+-------+----------+
| 306 | 7 |
...
+------------+----------+
| categories | count(*) |
+------------+----------+
| 10 | 2436 |
...
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| multiplier | 4 |
+---------------+-------+
1 row in set (0.00 sec)Одна из главных особенностей Manticore Search — возможность комбинировать полнотекстовый поиск с геолокацией. Например, ритейлер может предложить поиск, где пользователь ищет товар, а набор результатов может указывать ближайший магазин, в котором этот товар есть в наличии, чтобы пользователь мог прийти в магазин и забрать его. Туристический сайт может предоставлять результаты на основе поиска, ограниченного определённой областью, и сортировать результаты по расстоянию от точки (например, «поиск музеев рядом с отелем»).
Для выполнения гео-поиска документ должен содержать пары координат широты/долготы. Координаты могут храниться как атрибуты с плавающей точкой. Если в документе несколько локаций, может быть удобно использовать JSON-атрибут для хранения пар координат.
table myrt
{
...
rt_attr_float = lat
rt_attr_float = lon
...
}
Координаты могут храниться в градусах или радианах.
Если для атрибутов широты и долготы сгенерированы вторичные индексы, они могут автоматически использоваться для ускорения гео-поисков, если оптимизатор на основе стоимости решит их применить.
Чтобы узнать расстояние между двумя точками, можно использовать функцию GEODIST(). GEODIST требует две пары координат в качестве первых четырёх параметров.
5-й параметр в упрощённом JSON-формате может настраивать определённые аспекты функции. По умолчанию GEODIST ожидает координаты в радианах, но можно добавить in=degrees, чтобы использовать градусы в качестве входных данных. Координаты, для которых выполняется вычисление гео-расстояния, должны иметь тот же тип (градусы или радианы), что и хранящиеся в таблице; в противном случае результаты будут некорректными.
Вычисленное расстояние по умолчанию в метрах, но с опцией out его можно преобразовать в километры, футы или мили. Наконец, по умолчанию используется метод расчёта под названием adaptive. Альтернативный метод на основе алгоритма haversine также доступен; однако он медленнее и менее точен.
Результат функции — расстояние — можно использовать в предложении ORDER BY для сортировки результатов:
SELECT *, GEODIST(40.7643929, -73.9997683, lat, lon, {in=degrees, out=miles}) AS distance FROM myindex WHERE MATCH('...') ORDER BY distance ASC, WEIGHT() DESC;
Или для ограничения результатов радиальной областью вокруг точки:
SELECT *,GEODIST(40.7643929, -73.9997683, lat,lon, {in=degrees, out=miles}) AS distance FROM myindex WHERE MATCH('...') AND distance <1000 ORDER BY WEIGHT(), DISTANCE ASC;
Ещё одна функция гео-поиска — возможность определить, находится ли локация внутри заданной области. Специальная функция создаёт объект многоугольника, который затем используется другой функцией для проверки, содержатся ли координаты внутри этого многоугольника или нет.
Доступны две функции для создания многоугольника:
- GEOPOLY2D() — создаёт многоугольник с учётом кривизны Земли
- POLY2D() — создаёт простой многоугольник в плоском пространстве
POLY2D подходит для гео-поиска, когда стороны области короче 500 км (для многоугольников с 3-4 сторонами; для многоугольников с большим числом сторон следует учитывать меньшие значения). Для областей с более длинными сторонами требуется использовать GEOPOLY2D для сохранения точности результатов. GEOPOLY2D ожидает координаты в виде пар широты/долготы в градусах; использование радиан даст результаты в плоском пространстве (аналогично POLY2D).
CONTAINS() принимает многоугольник и набор координат на вход и возвращает 1, если точка находится внутри многоугольника, или 0 в противном случае.
SELECT *,CONTAINS(GEOPOLY2D(40.76439, -73.9997, 42.21211, -73.999, 42.21211, -76.123, 40.76439, -76.123), 41.5445, -74.973) AS inside FROM myindex WHERE MATCH('...') AND inside=1;
Percolate queries are also known as Persistent queries, Prospective search, document routing, search in reverse, and inverse search.
The traditional way of conducting searches involves storing documents and performing search queries against them. However, there are cases where we want to apply a query to a newly incoming document to signal a match. Some scenarios where this is desired include monitoring systems that collect data and notify users about specific events, such as reaching a certain threshold for a metric or a particular value appearing in the monitored data. Another example is news aggregation, where users may want to be notified only about certain categories or topics, or even specific "keywords."
In these situations, traditional search is not the best fit, as it assumes the desired search is performed over the entire collection. This process gets multiplied by the number of users, resulting in many queries running over the entire collection, which can cause significant additional load. The alternative approach described in this section involves storing the queries instead and testing them against an incoming new document or a batch of documents.
Google Alerts, AlertHN, Bloomberg Terminal, and other systems that allow users to subscribe to specific content utilize similar technology.
- See percolate for information on creating a PQ table.
- See Adding rules to a percolate table to learn how to add percolate rules (also known as PQ rules). Here's a quick example:
The key thing to remember about percolate queries is that your search queries are already in the table. What you need to provide are documents to check if any of them match any of the stored rules.
You can perform a percolate query via SQL or JSON interfaces, as well as using programming language clients. The SQL approach offers more flexibility, while the HTTP method is simpler and provides most of what you need. The table below can help you understand the differences.
| Desired Behavior | SQL | HTTP |
|---|---|---|
| Provide a single document | CALL PQ('tbl', '{doc1}') |
query.percolate.document{doc1} |
| Provide a single document (alternative) | CALL PQ('tbl', 'doc1', 0 as docs_json) |
- |
| Provide multiple documents | CALL PQ('tbl', ('doc1', 'doc2'), 0 as docs_json) |
- |
| Provide multiple documents (alternative) | CALL PQ('tbl', ('{doc1}', '{doc2}')) |
- |
| Provide multiple documents (alternative) | CALL PQ('tbl', '[{doc1}, {doc2}]') |
- |
| Return matching document ids | 0/1 as docs (disabled by default) | Enabled by default |
| Use document's own id to show in the result | 'id field' as docs_id (disabled by default) | Not available |
| Consider input documents are JSON | 1 as docs_json (1 by default) | Enabled by default |
| Consider input documents are plain text | 0 as docs_json (1 by default) | Not available |
| Sparsed distribution mode | default | default |
| Sharded distribution mode | sharded as mode | Not available |
| Return all info about matching query | 1 as query (0 by default) | Enabled by default |
| Skip invalid JSON | 1 as skip_bad_json (0 by default) | Not available |
| Extended info in SHOW META | 1 as verbose (0 by default) | Not available |
| Define the number which will be added to document ids if no docs_id fields provided (mostly relevant in distributed PQ modes) | 1 as shift (0 by default) | Not available |
To demonstrate how this works, here are a few examples. Let's create a PQ table with two fields:
- title (text)
- color (string)
and three rules in it:
- Just full-text. Query:
@title bag - Full-text and filtering. Query:
@title shoes. Filters:color='red' - Full-text and more complex filtering. Query:
@title shoes. Filters:color IN('blue', 'green')
- SQL
- JSON
- PHP
- Python
- Python-asyncio
- javascript
- Java
- C#
- Rust
- TypeScript
- Go
CREATE TABLE products(title text, color string) type='pq';
INSERT INTO products(query) values('@title bag');
INSERT INTO products(query,filters) values('@title shoes', 'color=\'red\'');
INSERT INTO products(query,filters) values('@title shoes', 'color in (\'blue\', \'green\')');
select * from products;PUT /pq/products/doc/
{
"query": {
"match": {
"title": "bag"
}
},
"filters": ""
}
PUT /pq/products/doc/
{
"query": {
"match": {
"title": "shoes"
}
},
"filters": "color='red'"
}
PUT /pq/products/doc/
{
"query": {
"match": {
"title": "shoes"
}
},
"filters": "color IN ('blue', 'green')"
}$index = [
'table' => 'products',
'body' => [
'columns' => [
'title' => ['type' => 'text'],
'color' => ['type' => 'string']
],
'settings' => [
'type' => 'pq'
]
]
];
$client->indices()->create($index);
$query = [
'table' => 'products',
'body' => [ 'query'=>['match'=>['title'=>'bag']]]
];
$client->pq()->doc($query);
$query = [
'table' => 'products',
'body' => [ 'query'=>['match'=>['title'=>'shoes']],'filters'=>"color='red'"]
];
$client->pq()->doc($query);
$query = [
'table' => 'products',
'body' => [ 'query'=>['match'=>['title'=>'shoes']],'filters'=>"color IN ('blue', 'green')"]
];
$client->pq()->doc($query);utilsApi.sql('create table products(title text, color string) type=\'pq\'')
indexApi.insert({"table" : "products", "doc" : {"query" : "@title bag" }})
indexApi.insert({"table" : "products", "doc" : {"query" : "@title shoes", "filters": "color='red'" }})
indexApi.insert({"table" : "products", "doc" : {"query" : "@title shoes","filters": "color IN ('blue', 'green')" }})await utilsApi.sql('create table products(title text, color string) type=\'pq\'')
await indexApi.insert({"table" : "products", "doc" : {"query" : "@title bag" }})
await indexApi.insert({"table" : "products", "doc" : {"query" : "@title shoes", "filters": "color='red'" }})
await indexApi.insert({"table" : "products", "doc" : {"query" : "@title shoes","filters": "color IN ('blue', 'green')" }})res = await utilsApi.sql('create table products(title text, color string) type=\'pq\'');
res = indexApi.insert({"table" : "products", "doc" : {"query" : "@title bag" }});
res = indexApi.insert({"table" : "products", "doc" : {"query" : "@title shoes", "filters": "color='red'" }});
res = indexApi.insert({"table" : "products", "doc" : {"query" : "@title shoes","filters": "color IN ('blue', 'green')" }});utilsApi.sql("create table products(title text, color string) type='pq'", true);
doc = new HashMap<String,Object>(){{
put("query", "@title bag");
}};
newdoc = new InsertDocumentRequest();
newdoc.index("products").setDoc(doc);
indexApi.insert(newdoc);
doc = new HashMap<String,Object>(){{
put("query", "@title shoes");
put("filters", "color='red'");
}};
newdoc = new InsertDocumentRequest();
newdoc.index("products").setDoc(doc);
indexApi.insert(newdoc);
doc = new HashMap<String,Object>(){{
put("query", "@title shoes");
put("filters", "color IN ('blue', 'green')");
}};
newdoc = new InsertDocumentRequest();
newdoc.index("products").setDoc(doc);
indexApi.insert(newdoc);utilsApi.Sql("create table products(title text, color string) type='pq'", true);
Dictionary<string, Object> doc = new Dictionary<string, Object>();
doc.Add("query", "@title bag");
InsertDocumentRequest newdoc = new InsertDocumentRequest(index: "products", doc: doc);
indexApi.Insert(newdoc);
doc = new Dictionary<string, Object>();
doc.Add("query", "@title shoes");
doc.Add("filters", "color='red'");
newdoc = new InsertDocumentRequest(index: "products", doc: doc);
indexApi.Insert(newdoc);
doc = new Dictionary<string, Object>();
doc.Add("query", "@title bag");
doc.Add("filters", "color IN ('blue', 'green')");
newdoc = new InsertDocumentRequest(index: "products", doc: doc);
indexApi.Insert(newdoc);utils_api.sql("create table products(title text, color string) type='pq'", Some(true)).await;
let mut doc1 = HashMap::new();
doc1.insert("query".to_string(), serde_json::json!("@title bag"));
let insert_req1 = InsertDocumentRequest::new("products".to_string(), serde_json::json!(doc1));
index_api.insert(insert_req1).await;
let mut doc2 = HashMap::new();
doc2.insert("query".to_string(), serde_json::json!("@title shoes"));
doc2.insert("filters".to_string(), serde_json::json!("color='red'"));
let insert_req2 = InsertDocumentRequest::new("products".to_string(), serde_json::json!(doc2));
index_api.insert(insert_req2).await;
let mut doc3 = HashMap::new();
doc3.insert("query".to_string(), serde_json::json!("@title bag"));
doc3.insert("filters".to_string(), serde_json::json!("color IN ('blue', 'green')"));
let insert_req3 = InsertDocumentRequest::new("products".to_string(), serde_json::json!(doc3));
index_api.insert(insert_req3).await;res = await utilsApi.sql("create table test_pq(title text, color string) type='pq'");
res = indexApi.insert({
index: 'test_pq',
doc: { query : '@title bag' }
});
res = indexApi.insert(
index: 'test_pq',
doc: { query: '@title shoes', filters: "color='red'" }
});
res = indexApi.insert({
index: 'test_pq',
doc: { query : '@title shoes', filters: "color IN ('blue', 'green')" }
});apiClient.UtilsAPI.Sql(context.Background()).Body("create table test_pq(title text, color string) type='pq'").Execute()
indexDoc := map[string]interface{} {"query": "@title bag"}
indexReq := manticoreclient.NewInsertDocumentRequest("test_pq", indexDoc)
apiClient.IndexAPI.Insert(context.Background()).InsertDocumentRequest(*indexReq).Execute();
indexDoc = map[string]interface{} {"query": "@title shoes", "filters": "color='red'"}
indexReq = manticoreclient.NewInsertDocumentRequest("test_pq", indexDoc)
apiClient.IndexAPI.Insert(context.Background()).InsertDocumentRequest(*indexReq).Execute();
indexDoc = map[string]interface{} {"query": "@title shoes", "filters": "color IN ('blue', 'green')"}
indexReq = manticoreclient.NewInsertDocumentRequest("test_pq", indexDoc)
apiClient.IndexAPI.Insert(context.Background()).InsertDocumentRequest(*indexReq).Execute();+---------------------+--------------+------+---------------------------+
| id | query | tags | filters |
+---------------------+--------------+------+---------------------------+
| 1657852401006149635 | @title shoes | | color IN ('blue, 'green') |
| 1657852401006149636 | @title shoes | | color='red' |
| 1657852401006149637 | @title bag | | |
+---------------------+--------------+------+---------------------------+{
"table": "products",
"type": "doc",
"_id": 1657852401006149661,
"result": "created"
}
{
"table": "products",
"type": "doc",
"_id": 1657852401006149662,
"result": "created"
}
{
"table": "products",
"type": "doc",
"_id": 1657852401006149663,
"result": "created"
}Array(
[table] => products
[type] => doc
[_id] => 1657852401006149661
[result] => created
)
Array(
[table] => products
[type] => doc
[_id] => 1657852401006149662
[result] => created
)
Array(
[table] => products
[type] => doc
[_id] => 1657852401006149663
[result] => created
){'created': True,
'found': None,
'id': 0,
'table': 'products',
'result': 'created'}
{'created': True,
'found': None,
'id': 0,
'table': 'products',
'result': 'created'}
{'created': True,
'found': None,
'id': 0,
'table': 'products',
'result': 'created'}{'created': True,
'found': None,
'id': 0,
'table': 'products',
'result': 'created'}
{'created': True,
'found': None,
'id': 0,
'table': 'products',
'result': 'created'}
{'created': True,
'found': None,
'id': 0,
'table': 'products',
'result': 'created'}"table":"products","_id":0,"created":true,"result":"created"}
{"table":"products","_id":0,"created":true,"result":"created"}
{"table":"products","_id":0,"created":true,"result":"created"}{total=0, error=, warning=}
class SuccessResponse {
index: products
id: 0
created: true
result: created
found: null
}
class SuccessResponse {
index: products
id: 0
created: true
result: created
found: null
}
class SuccessResponse {
index: products
id: 0
created: true
result: created
found: null
}{total=0, error="", warning=""}
class SuccessResponse {
index: products
id: 0
created: true
result: created
found: null
}
class SuccessResponse {
index: products
id: 0
created: true
result: created
found: null
}
class SuccessResponse {
index: products
id: 0
created: true
result: created
found: null
}{total=0, error="", warning=""}
class SuccessResponse {
index: products
id: 0
created: true
result: created
found: null
}
class SuccessResponse {
index: products
id: 0
created: true
result: created
found: null
}
class SuccessResponse {
index: products
id: 0
created: true
result: created
found: null
}{
"table":"test_pq",
"_id":1657852401006149661,
"created":true,
"result":"created"
}
{
"table":"test_pq",
"_id":1657852401006149662,
"created":true,
"result":"created"
}
{
"table":"test_pq",
"_id":1657852401006149663,
"created":true,
"result":"created"
}{
"table":"test_pq",
"_id":1657852401006149661,
"created":true,
"result":"created"
}
{
"table":"test_pq",
"_id":1657852401006149662,
"created":true,
"result":"created"
}
{
"table":"test_pq",
"_id":1657852401006149663,
"created":true,
"result":"created"
}The first document doesn't match any rules. It could match the first two, but they require additional filters.
The second document matches one rule. Note that CALL PQ by default expects a document to be a JSON, but if you use 0 as docs_json, you can pass a plain string instead.
- SQL
- JSON
- PHP
- Python
- Python-asyncio
- javascript
- Java
- C#
- Rust
- TypeScript
- Go
CALL PQ('products', 'Beautiful shoes', 0 as docs_json);
CALL PQ('products', 'What a nice bag', 0 as docs_json);
CALL PQ('products', '{"title": "What a nice bag"}');POST /pq/products/_search
{
"query": {
"percolate": {
"document": {
"title": "What a nice bag"
}
}
}
}$percolate = [
'table' => 'products',
'body' => [
'query' => [
'percolate' => [
'document' => [
'title' => 'What a nice bag'
]
]
]
]
];
$client->pq()->search($percolate);searchApi.percolate('products',{"query":{"percolate":{"document":{"title":"What a nice bag"}}}})await searchApi.percolate('products',{"query":{"percolate":{"document":{"title":"What a nice bag"}}}})res = await searchApi.percolate('products',{"query":{"percolate":{"document":{"title":"What a nice bag"}}}});PercolateRequest percolateRequest = new PercolateRequest();
query = new HashMap<String,Object>(){{
put("percolate",new HashMap<String,Object >(){{
put("document", new HashMap<String,Object >(){{
put("title","what a nice bag");
}});
}});
}};
percolateRequest.query(query);
searchApi.percolate("test_pq",percolateRequest);Dictionary<string, Object> percolateDoc = new Dictionary<string, Object>();
percolateDoc.Add("document", new Dictionary<string, Object> {{ "title", "what a nice bag" }});
Dictionary<string, Object> query = new Dictionary<string, Object> {{ "percolate", percolateDoc }};
PercolateRequest percolateRequest = new PercolateRequest(query=query);
searchApi.Percolate("test_pq",percolateRequest);let mut percolate_doc_fields = HashMap::new();
percolate_doc_fileds.insert("title".to_string(), "what a nice bag");
let mut percolate_doc = HashMap::new();
percolate_doc.insert("document".to_string(), percolate_doc_fields);
let percolate_query = PercolateRequestQuery::new(serde_json::json!(percolate_doc));
let percolate_req = PercolateRequest::new(percolate_query);
search_api.percolate("test_pq", percolate_req).await;res = await searchApi.percolate('test_pq', { query: { percolate: { document : { title : 'What a nice bag' } } } } );query := map[string]interface{} {"title": "what a nice bag"}
percolateRequestQuery := manticoreclient.NewPercolateQuery(query)
percolateRequest := manticoreclient.NewPercolateRequest(percolateRequestQuery)
res, _, _ := apiClient.SearchAPI.Percolate(context.Background(), "test_pq").PercolateRequest(*percolateRequest).Execute()+---------------------+
| id |
+---------------------+
| 1657852401006149637 |
+---------------------+
+---------------------+
| id |
+---------------------+
| 1657852401006149637 |
+---------------------+{
"took": 0,
"timed_out": false,
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"table": "products",
"_type": "doc",
"_id": 1657852401006149644,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}Array
(
[took] => 0
[timed_out] =>
[hits] => Array
(
[total] => 1
[max_score] => 1
[hits] => Array
(
[0] => Array
(
[_index] => products
[_type] => doc
[_id] => 1657852401006149644
[_score] => 1
[_source] => Array
(
[query] => Array
(
[match] => Array
(
[title] => bag
)
)
)
[fields] => Array
(
[_percolator_document_slot] => Array
(
[0] => 1
)
)
)
)
)
){'hits': {'hits': [{u'_id': u'2811025403043381480',
u'table': u'products',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'@title bag'}},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [1]}}],
'total': 1},
'profile': None,
'timed_out': False,
'took': 0}{'hits': {'hits': [{u'_id': u'2811025403043381480',
u'table': u'products',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'@title bag'}},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [1]}}],
'total': 1},
'profile': None,
'timed_out': False,
'took': 0}{
"took": 0,
"timed_out": false,
"hits": {
"total": 1,
"hits": [
{
"table": "products",
"_type": "doc",
"_id": 2811045522851233808,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 1
maxScore: 1
hits: [{_index=products, _type=doc, _id=2811045522851234109, _score=1, _source={query={ql=@title bag}}, fields={_percolator_document_slot=[1]}}]
aggregations: null
}
profile: null
}class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 1
maxScore: 1
hits: [{_index=products, _type=doc, _id=2811045522851234109, _score=1, _source={query={ql=@title bag}}, fields={_percolator_document_slot=[1]}}]
aggregations: null
}
profile: null
}class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 1
maxScore: 1
hits: [{_index=products, _type=doc, _id=2811045522851234109, _score=1, _source={query={ql=@title bag}}, fields={_percolator_document_slot=[1]}}]
aggregations: null
}
profile: null
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 1,
"hits": [
{
"table": "test_pq",
"_type": "doc",
"_id": 1657852401006149661,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 1,
"hits": [
{
"table": "test_pq",
"_type": "doc",
"_id": 1657852401006149661,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}- SQL
- JSON
- PHP
- Python
- Python-asyncio
- javascript
- Java
- C#
- Rust
- TypeScript
- Go
CALL PQ('products', '{"title": "What a nice bag"}', 1 as query);POST /pq/products/_search
{
"query": {
"percolate": {
"document": {
"title": "What a nice bag"
}
}
}
}$percolate = [
'table' => 'products',
'body' => [
'query' => [
'percolate' => [
'document' => [
'title' => 'What a nice bag'
]
]
]
]
];
$client->pq()->search($percolate);searchApi.percolate('products',{"query":{"percolate":{"document":{"title":"What a nice bag"}}}})await searchApi.percolate('products',{"query":{"percolate":{"document":{"title":"What a nice bag"}}}})res = await searchApi.percolate('products',{"query":{"percolate":{"document":{"title":"What a nice bag"}}}});PercolateRequest percolateRequest = new PercolateRequest();
query = new HashMap<String,Object>(){{
put("percolate",new HashMap<String,Object >(){{
put("document", new HashMap<String,Object >(){{
put("title","what a nice bag");
}});
}});
}};
percolateRequest.query(query);
searchApi.percolate("test_pq",percolateRequest);Dictionary<string, Object> percolateDoc = new Dictionary<string, Object>();
percolateDoc.Add("document", new Dictionary<string, Object> {{ "title", "what a nice bag" }});
Dictionary<string, Object> query = new Dictionary<string, Object> {{ "percolate", percolateDoc }};
PercolateRequest percolateRequest = new PercolateRequest(query=query);
searchApi.Percolate("test_pq",percolateRequest);let mut percolate_doc_fields = HashMap::new();
percolate_doc_fileds.insert("title".to_string(), "what a nice bag");
let mut percolate_doc = HashMap::new();
percolate_doc.insert("document".to_string(), percolate_doc_fields);
let percolate_query = PercolateRequestQuery::new(serde_json::json!(percolate_doc));
let percolate_req = PercolateRequest::new(percolate_query);
search_api.percolate("test_pq", percolate_req).await;res = await searchApi.percolate('test_pq', { query: { percolate: { document : { title : 'What a nice bag' } } } } );query := map[string]interface{} {"title": "what a nice bag"}
percolateRequestQuery := manticoreclient.NewPercolateQuery(query)
percolateRequest := manticoreclient.NewPercolateRequest(percolateRequestQuery)
res, _, _ := apiClient.SearchAPI.Percolate(context.Background(), "test_pq").PercolateRequest(*percolateRequest).Execute()+---------------------+------------+------+---------+
| id | query | tags | filters |
+---------------------+------------+------+---------+
| 1657852401006149637 | @title bag | | |
+---------------------+------------+------+---------+{
"took": 0,
"timed_out": false,
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"table": "products",
"_type": "doc",
"_id": 1657852401006149644,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}Array
(
[took] => 0
[timed_out] =>
[hits] => Array
(
[total] => 1
[max_score] => 1
[hits] => Array
(
[0] => Array
(
[_index] => products
[_type] => doc
[_id] => 1657852401006149644
[_score] => 1
[_source] => Array
(
[query] => Array
(
[match] => Array
(
[title] => bag
)
)
)
[fields] => Array
(
[_percolator_document_slot] => Array
(
[0] => 1
)
)
)
)
)
){'hits': {'hits': [{u'_id': u'2811025403043381480',
u'table': u'products',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'@title bag'}},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [1]}}],
'total': 1},
'profile': None,
'timed_out': False,
'took': 0}{'hits': {'hits': [{u'_id': u'2811025403043381480',
u'table': u'products',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'@title bag'}},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [1]}}],
'total': 1},
'profile': None,
'timed_out': False,
'took': 0}{
"took": 0,
"timed_out": false,
"hits": {
"total": 1,
"hits": [
{
"table": "products",
"_type": "doc",
"_id": 2811045522851233808,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 1
maxScore: 1
hits: [{_index=products, _type=doc, _id=2811045522851234109, _score=1, _source={query={ql=@title bag}}, fields={_percolator_document_slot=[1]}}]
aggregations: null
}
profile: null
}class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 1
maxScore: 1
hits: [{_index=products, _type=doc, _id=2811045522851234109, _score=1, _source={query={ql=@title bag}}, fields={_percolator_document_slot=[1]}}]
aggregations: null
}
profile: null
}class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 1
maxScore: 1
hits: [{_index=products, _type=doc, _id=2811045522851234109, _score=1, _source={query={ql=@title bag}}, fields={_percolator_document_slot=[1]}}]
aggregations: null
}
profile: null
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 1,
"hits": [
{
"table": "test_pq",
"_type": "doc",
"_id": 1657852401006149661,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 1,
"hits": [
{
"table": "test_pq",
"_type": "doc",
"_id": 1657852401006149661,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}Обратите внимание, что с помощью CALL PQ вы можете предоставить несколько документов разными способами:
- как массив простых документов в круглых скобках
('doc1', 'doc2'). Для этого требуется0 as docs_json - как массив JSON в круглых скобках
('{doc1}', '{doc2}') - или как стандартный JSON-массив
'[{doc1}, {doc2}]'
- SQL
- JSON
- PHP
- Python
- Python-asyncio
- javascript
- Java
- C#
- Rust
- TypeScript
- Go
CALL PQ('products', ('nice pair of shoes', 'beautiful bag'), 1 as query, 0 as docs_json);
CALL PQ('products', ('{"title": "nice pair of shoes", "color": "red"}', '{"title": "beautiful bag"}'), 1 as query);
CALL PQ('products', '[{"title": "nice pair of shoes", "color": "blue"}, {"title": "beautiful bag"}]', 1 as query);POST /pq/products/_search
{
"query": {
"percolate": {
"documents": [
{"title": "nice pair of shoes", "color": "blue"},
{"title": "beautiful bag"}
]
}
}
}$percolate = [
'table' => 'products',
'body' => [
'query' => [
'percolate' => [
'documents' => [
['title' => 'nice pair of shoes','color'=>'blue'],
['title' => 'beautiful bag']
]
]
]
]
];
$client->pq()->search($percolate);searchApi.percolate('products',{"query":{"percolate":{"documents":[{"title":"nice pair of shoes","color":"blue"},{"title":"beautiful bag"}]}}})await searchApi.percolate('products',{"query":{"percolate":{"documents":[{"title":"nice pair of shoes","color":"blue"},{"title":"beautiful bag"}]}}})res = await searchApi.percolate('products',{"query":{"percolate":{"documents":[{"title":"nice pair of shoes","color":"blue"},{"title":"beautiful bag"}]}}});percolateRequest = new PercolateRequest();
query = new HashMap<String,Object>(){{
put("percolate",new HashMap<String,Object >(){{
put("documents", new ArrayList<Object>(){{
add(new HashMap<String,Object >(){{
put("title","nice pair of shoes");
put("color","blue");
}});
add(new HashMap<String,Object >(){{
put("title","beautiful bag");
}});
}});
}});
}};
percolateRequest.query(query);
searchApi.percolate("products",percolateRequest);var doc1 = new Dictionary<string, Object>();
doc1.Add("title","nice pair of shoes");
doc1.Add("color","blue");
var doc2 = new Dictionary<string, Object>();
doc2.Add("title","beautiful bag");
var docs = new List<Object> {doc1, doc2};
Dictionary<string, Object> percolateDoc = new Dictionary<string, Object> {{ "documents", docs }};
Dictionary<string, Object> query = new Dictionary<string, Object> {{ "percolate", percolateDoc }};
PercolateRequest percolateRequest = new PercolateRequest(query=query);
searchApi.Percolate("products",percolateRequest);let mut percolate_doc_fields1 = HashMap::new();
percolate_doc_fields1.insert("title".to_string(), "nice pair of shoes");
percolate_doc_fields1.insert("color".to_string(), "blue");
let mut percolate_doc_fields2 = HashMap::new();
percolate_doc_fields2.insert("title".to_string(), "beautiful bag");
let mut percolate_doc_fields_list: [HashMap; 2] = [percolate_doc_fields1, percolate_doc_fields2];
let mut percolate_doc = HashMap::new();
percolate_doc.insert("documents".to_string(), percolate_doc_fields_list);
let percolate_query = PercolateRequestQuery::new(serde_json::json!(percolate_doc));
let percolate_req = PercolateRequest::new(percolate_query);
search_api.percolate("products", percolate_req).await;docs = [ {title : 'What a nice bag'}, {title : 'Really nice shoes'} ];
res = await searchApi.percolate('test_pq', { query: { percolate: { documents : docs } } } );doc1 := map[string]interface{} {"title": "What a nice bag"}
doc2 := map[string]interface{} {"title": "Really nice shoes"}
query := []interface{} {doc1, doc2}
percolateRequestQuery := manticoreclient.NewPercolateQuery(query)
percolateRequest := manticoreclient.NewPercolateRequest(percolateRequestQuery)
res, _, _ := apiClient.SearchAPI.Percolate(context.Background(), "test_pq").PercolateRequest(*percolateRequest).Execute()+---------------------+------------+------+---------+
| id | query | tags | filters |
+---------------------+------------+------+---------+
| 1657852401006149637 | @title bag | | |
+---------------------+------------+------+---------+
+---------------------+--------------+------+-------------+
| id | query | tags | filters |
+---------------------+--------------+------+-------------+
| 1657852401006149636 | @title shoes | | color='red' |
| 1657852401006149637 | @title bag | | |
+---------------------+--------------+------+-------------+
+---------------------+--------------+------+---------------------------+
| id | query | tags | filters |
+---------------------+--------------+------+---------------------------+
| 1657852401006149635 | @title shoes | | color IN ('blue, 'green') |
| 1657852401006149637 | @title bag | | |
+---------------------+--------------+------+---------------------------+{
"took": 0,
"timed_out": false,
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"table": "products",
"_type": "doc",
"_id": 1657852401006149644,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
2
]
}
},
{
"table": "products",
"_type": "doc",
"_id": 1657852401006149646,
"_score": "1",
"_source": {
"query": {
"ql": "@title shoes"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}Array
(
[took] => 23
[timed_out] =>
[hits] => Array
(
[total] => 2
[max_score] => 1
[hits] => Array
(
[0] => Array
(
[_index] => products
[_type] => doc
[_id] => 2810781492890828819
[_score] => 1
[_source] => Array
(
[query] => Array
(
[match] => Array
(
[title] => bag
)
)
)
[fields] => Array
(
[_percolator_document_slot] => Array
(
[0] => 2
)
)
)
[1] => Array
(
[_index] => products
[_type] => doc
[_id] => 2810781492890828821
[_score] => 1
[_source] => Array
(
[query] => Array
(
[match] => Array
(
[title] => shoes
)
)
)
[fields] => Array
(
[_percolator_document_slot] => Array
(
[0] => 1
)
)
)
)
)
){'hits': {'hits': [{u'_id': u'2811025403043381494',
u'table': u'products',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'@title bag'}},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [2]}},
{u'_id': u'2811025403043381496',
u'table': u'products',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'@title shoes'}},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [1]}}],
'total': 2},
'profile': None,
'timed_out': False,
'took': 0}{'hits': {'hits': [{u'_id': u'2811025403043381494',
u'table': u'products',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'@title bag'}},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [2]}},
{u'_id': u'2811025403043381496',
u'table': u'products',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'@title shoes'}},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [1]}}],
'total': 2},
'profile': None,
'timed_out': False,
'took': 0}{
"took": 6,
"timed_out": false,
"hits": {
"total": 2,
"hits": [
{
"table": "products",
"_type": "doc",
"_id": 2811045522851233808,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
2
]
}
},
{
"table": "products",
"_type": "doc",
"_id": 2811045522851233810,
"_score": "1",
"_source": {
"query": {
"ql": "@title shoes"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 2
maxScore: 1
hits: [{_index=products, _type=doc, _id=2811045522851234133, _score=1, _source={query={ql=@title bag}}, fields={_percolator_document_slot=[2]}}, {_index=products, _type=doc, _id=2811045522851234135, _score=1, _source={query={ql=@title shoes}}, fields={_percolator_document_slot=[1]}}]
aggregations: null
}
profile: null
}class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 2
maxScore: 1
hits: [{_index=products, _type=doc, _id=2811045522851234133, _score=1, _source={query={ql=@title bag}}, fields={_percolator_document_slot=[2]}}, {_index=products, _type=doc, _id=2811045522851234135, _score=1, _source={query={ql=@title shoes}}, fields={_percolator_document_slot=[1]}}]
aggregations: null
}
profile: null
}class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 2
maxScore: 1
hits: [{_index=products, _type=doc, _id=2811045522851234133, _score=1, _source={query={ql=@title bag}}, fields={_percolator_document_slot=[2]}}, {_index=products, _type=doc, _id=2811045522851234135, _score=1, _source={query={ql=@title shoes}}, fields={_percolator_document_slot=[1]}}]
aggregations: null
}
profile: null
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 2,
"hits": [
{
"table": "test_pq",
"_type": "doc",
"_id": 1657852401006149661,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
},
{
"table": "test_pq",
"_type": "doc",
"_id": 1657852401006149662,
"_score": "1",
"_source": {
"query": {
"ql": "@title shoes"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 2,
"hits": [
{
"table": "test_pq",
"_type": "doc",
"_id": 1657852401006149661,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
},
{
"table": "test_pq",
"_type": "doc",
"_id": 1657852401006149662,
"_score": "1",
"_source": {
"query": {
"ql": "@title shoes"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}Использование опции 1 as docs позволяет увидеть, какие из предоставленных документов соответствуют каким правилам.
- SQL
- JSON
- PHP
- Python
- Python-asyncio
- javascript
- Java
- C#
- Rust
- TypeScript
- Go
CALL PQ('products', '[{"title": "nice pair of shoes", "color": "blue"}, {"title": "beautiful bag"}]', 1 as query, 1 as docs);POST /pq/products/_search
{
"query": {
"percolate": {
"documents": [
{"title": "nice pair of shoes", "color": "blue"},
{"title": "beautiful bag"}
]
}
}
}$percolate = [
'table' => 'products',
'body' => [
'query' => [
'percolate' => [
'documents' => [
['title' => 'nice pair of shoes','color'=>'blue'],
['title' => 'beautiful bag']
]
]
]
]
];
$client->pq()->search($percolate);searchApi.percolate('products',{"query":{"percolate":{"documents":[{"title":"nice pair of shoes","color":"blue"},{"title":"beautiful bag"}]}}})await searchApi.percolate('products',{"query":{"percolate":{"documents":[{"title":"nice pair of shoes","color":"blue"},{"title":"beautiful bag"}]}}})res = await searchApi.percolate('products',{"query":{"percolate":{"documents":[{"title":"nice pair of shoes","color":"blue"},{"title":"beautiful bag"}]}}});percolateRequest = new PercolateRequest();
query = new HashMap<String,Object>(){{
put("percolate",new HashMap<String,Object >(){{
put("documents", new ArrayList<Object>(){{
add(new HashMap<String,Object >(){{
put("title","nice pair of shoes");
put("color","blue");
}});
add(new HashMap<String,Object >(){{
put("title","beautiful bag");
}});
}});
}});
}};
percolateRequest.query(query);
searchApi.percolate("products",percolateRequest);var doc1 = new Dictionary<string, Object>();
doc1.Add("title","nice pair of shoes");
doc1.Add("color","blue");
var doc2 = new Dictionary<string, Object>();
doc2.Add("title","beautiful bag");
var docs = new List<Object> {doc1, doc2};
Dictionary<string, Object> percolateDoc = new Dictionary<string, Object> {{ "documents", docs }};
Dictionary<string, Object> query = new Dictionary<string, Object> {{ "percolate", percolateDoc }};
PercolateRequest percolateRequest = new PercolateRequest(query=query);
searchApi.Percolate("products",percolateRequest);let mut percolate_doc_fields1 = HashMap::new();
percolate_doc_fields1.insert("title".to_string(), "nice pair of shoes");
percolate_doc_fields1.insert("color".to_string(), "blue");
let mut percolate_doc_fields2 = HashMap::new();
percolate_doc_fields2.insert("title".to_string(), "beautiful bag");
let mut percolate_doc_fields_list: [HashMap; 2] = [percolate_doc_fields1, percolate_doc_fields2];
let mut percolate_doc = HashMap::new();
percolate_doc.insert("documents".to_string(), percolate_doc_fields_list);
let percolate_query = PercolateRequestQuery::new(serde_json::json!(percolate_doc));
let percolate_req = PercolateRequest::new(percolate_query);
search_api.percolate("products", percolate_req).await;docs = [ {title : 'What a nice bag'}, {title : 'Really nice shoes'} ];
res = await searchApi.percolate('test_pq', { query: { percolate: { documents : docs } } } );doc1 := map[string]interface{} {"title": "What a nice bag"}
doc2 := map[string]interface{} {"title": "Really nice shoes"}
query := []interface{} {doc1, doc2}
percolateRequestQuery := manticoreclient.NewPercolateQuery(query)
percolateRequest := manticoreclient.NewPercolateRequest(percolateRequestQuery)
res, _, _ := apiClient.SearchAPI.Percolate(context.Background(), "test_pq").PercolateRequest(*percolateRequest).Execute()+---------------------+-----------+--------------+------+---------------------------+
| id | documents | query | tags | filters |
+---------------------+-----------+--------------+------+---------------------------+
| 1657852401006149635 | 1 | @title shoes | | color IN ('blue, 'green') |
| 1657852401006149637 | 2 | @title bag | | |
+---------------------+-----------+--------------+------+---------------------------+{
"took": 0,
"timed_out": false,
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"table": "products",
"_type": "doc",
"_id": 1657852401006149644,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
2
]
}
},
{
"table": "products",
"_type": "doc",
"_id": 1657852401006149646,
"_score": "1",
"_source": {
"query": {
"ql": "@title shoes"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}Array
(
[took] => 23
[timed_out] =>
[hits] => Array
(
[total] => 2
[max_score] => 1
[hits] => Array
(
[0] => Array
(
[_index] => products
[_type] => doc
[_id] => 2810781492890828819
[_score] => 1
[_source] => Array
(
[query] => Array
(
[match] => Array
(
[title] => bag
)
)
)
[fields] => Array
(
[_percolator_document_slot] => Array
(
[0] => 2
)
)
)
[1] => Array
(
[_index] => products
[_type] => doc
[_id] => 2810781492890828821
[_score] => 1
[_source] => Array
(
[query] => Array
(
[match] => Array
(
[title] => shoes
)
)
)
[fields] => Array
(
[_percolator_document_slot] => Array
(
[0] => 1
)
)
)
)
)
){'hits': {'hits': [{u'_id': u'2811025403043381494',
u'table': u'products',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'@title bag'}},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [2]}},
{u'_id': u'2811025403043381496',
u'table': u'products',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'@title shoes'}},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [1]}}],
'total': 2},
'profile': None,
'timed_out': False,
'took': 0}{'hits': {'hits': [{u'_id': u'2811025403043381494',
u'table': u'products',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'@title bag'}},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [2]}},
{u'_id': u'2811025403043381496',
u'table': u'products',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'@title shoes'}},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [1]}}],
'total': 2},
'profile': None,
'timed_out': False,
'took': 0}{
"took": 6,
"timed_out": false,
"hits": {
"total": 2,
"hits": [
{
"table": "products",
"_type": "doc",
"_id": 2811045522851233808,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
2
]
}
},
{
"table": "products",
"_type": "doc",
"_id": 2811045522851233810,
"_score": "1",
"_source": {
"query": {
"ql": "@title shoes"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 2
maxScore: 1
hits: [{_index=products, _type=doc, _id=2811045522851234133, _score=1, _source={query={ql=@title bag}}, fields={_percolator_document_slot=[2]}}, {_index=products, _type=doc, _id=2811045522851234135, _score=1, _source={query={ql=@title shoes}}, fields={_percolator_document_slot=[1]}}]
aggregations: null
}
profile: null
}class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 2
maxScore: 1
hits: [{_index=products, _type=doc, _id=2811045522851234133, _score=1, _source={query={ql=@title bag}}, fields={_percolator_document_slot=[2]}}, {_index=products, _type=doc, _id=2811045522851234135, _score=1, _source={query={ql=@title shoes}}, fields={_percolator_document_slot=[1]}}]
aggregations: null
}
profile: null
}class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 2
maxScore: 1
hits: [{_index=products, _type=doc, _id=2811045522851234133, _score=1, _source={query={ql=@title bag}}, fields={_percolator_document_slot=[2]}}, {_index=products, _type=doc, _id=2811045522851234135, _score=1, _source={query={ql=@title shoes}}, fields={_percolator_document_slot=[1]}}]
aggregations: null
}
profile: null
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 2,
"hits": [
{
"table": "test_pq",
"_type": "doc",
"_id": 1657852401006149661,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
},
{
"table": "test_pq",
"_type": "doc",
"_id": 1657852401006149662,
"_score": "1",
"_source": {
"query": {
"ql": "@title shoes"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 2,
"hits": [
{
"table": "test_pq",
"_type": "doc",
"_id": 1657852401006149661,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
},
{
"table": "test_pq",
"_type": "doc",
"_id": 1657852401006149662,
"_score": "1",
"_source": {
"query": {
"ql": "@title shoes"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}По умолчанию идентификаторы совпадающих документов соответствуют их относительным номерам в списке, который вы предоставляете. Однако в некоторых случаях у каждого документа уже есть свой собственный идентификатор. Для этого случая существует опция 'id field name' as docs_id для CALL PQ.
Обратите внимание, что если идентификатор не может быть найден по указанному имени поля, правило PQ не будет отображено в результатах.
Эта опция доступна только для CALL PQ через SQL.
- SQL
CALL PQ('products', '[{"id": 123, "title": "nice pair of shoes", "color": "blue"}, {"id": 456, "title": "beautiful bag"}]', 1 as query, 'id' as docs_id, 1 as docs);+---------------------+-----------+--------------+------+---------------------------+
| id | documents | query | tags | filters |
+---------------------+-----------+--------------+------+---------------------------+
| 1657852401006149664 | 456 | @title bag | | |
| 1657852401006149666 | 123 | @title shoes | | color IN ('blue, 'green') |
+---------------------+-----------+--------------+------+---------------------------+При использовании CALL PQ с отдельными JSON можно использовать опцию 1 в качестве skip_bad_json, чтобы пропускать любые недопустимые JSON в вводе. В приведённом ниже примере второй запрос не выполняется из-за недопустимого JSON, но третий запрос избегает ошибки, используя 1 в качестве skip_bad_json. Имейте в виду, что эта опция недоступна при отправке JSON-запросов по HTTP, так как в этом случае весь JSON-запрос должен быть валидным.
- SQL
CALL PQ('products', ('{"title": "nice pair of shoes", "color": "blue"}', '{"title": "beautiful bag"}'));
CALL PQ('products', ('{"title": "nice pair of shoes", "color": "blue"}', '{"title": "beautiful bag}'));
CALL PQ('products', ('{"title": "nice pair of shoes", "color": "blue"}', '{"title": "beautiful bag}'), 1 as skip_bad_json);+---------------------+
| id |
+---------------------+
| 1657852401006149635 |
| 1657852401006149637 |
+---------------------+
ERROR 1064 (42000): Bad JSON objects in strings: 2
+---------------------+
| id |
+---------------------+
| 1657852401006149635 |
+---------------------+Перколяционные запросы разработаны с учётом высокой пропускной способности и больших объёмов данных. Чтобы оптимизировать производительность для снижения задержек и увеличения пропускной способности, рассмотрите следующее.
Существует два режима распределения для перколяционной таблицы и способа работы перколяционного запроса с ней:
- Sparse (по умолчанию). Идеально для: большого количества документов, зеркальных PQ таблиц. Когда ваш набор документов большой, но набор запросов, хранящихся в PQ таблице, мал, режим sparse будет полезен. В этом режиме пакет документов, который вы передаёте, будет разделён между количеством агентов, так что каждый узел обрабатывает только часть документов из вашего запроса. Manticore разбивает ваш набор документов и распределяет части между зеркалами. После того как агенты закончат обработку запросов, Manticore собирает и объединяет результаты, возвращая итоговый набор запросов, как если бы он пришёл из одной таблицы. Используйте репликацию для поддержки процесса.
- Sharded. Идеально для: большого количества правил PQ, правил, разделённых между PQ таблицами. В этом режиме весь набор документов транслируется всем таблицам распределённой PQ таблицы без первоначального разделения документов. Это полезно, когда вы отправляете относительно небольшой набор документов, но количество хранимых запросов велико. В этом случае более целесообразно хранить только часть правил PQ на каждом узле, а затем объединять результаты, возвращаемые узлами, которые обрабатывают один и тот же набор документов против разных наборов правил PQ. Этот режим должен быть установлен явно, так как он подразумевает увеличение сетевого трафика и ожидает таблицы с разными PQ, что репликация не может сделать из коробки.
Предположим, у вас есть таблица pq_d2, определённая как:
table pq_d2
{
type = distributed
agent = 127.0.0.1:6712:pq
agent = 127.0.0.1:6712:ptitle
}
Каждая из 'pq' и 'ptitle' содержит:
- SQL
- JSON
- PHP
- Python
- Python-asyncio
- javascript
- Java
- C#
- Rust
- TypeScript
- Go
SELECT * FROM pq;POST /pq/pq/_search$params = [
'table' => 'pq',
'body' => [
]
];
$response = $client->pq()->search($params);searchApi.search({"table":"pq","query":{"match_all":{}}})await searchApi.search({"table":"pq","query":{"match_all":{}}})res = await searchApi.search({"table":"pq","query":{"match_all":{}}});Map<String,Object> query = new HashMap<String,Object>();
query.put("match_all",null);
SearchRequest searchRequest = new SearchRequest();
searchRequest.setIndex("pq");
searchRequest.setQuery(query);
SearchResponse searchResponse = searchApi.search(searchRequest);object query = new { match_all=null };
SearchRequest searchRequest = new SearchRequest("pq", query);
SearchResponse searchResponse = searchApi.Search(searchRequest);let query = SearchQuery::new();
let search_req = SearchRequest {
table: "pq".to_string(),
query: Some(Box::new(query)),
..Default::default(),
};
let search_res = search_api.search(search_req).await;res = await searchApi.search({"table":"test_pq","query":{"match_all":{}}});query := map[string]interface{} {}
percolateRequestQuery := manticoreclient.NewPercolateRequestQuery(query)
percolateRequest := manticoreclient.NewPercolateRequest(percolateRequestQuery)
res, _, _ := apiClient.SearchAPI.Percolate(context.Background(), "test_pq").PercolateRequest(*percolateRequest).Execute()+------+-------------+------+-------------------+
| id | query | tags | filters |
+------+-------------+------+-------------------+
| 1 | filter test | | gid>=10 |
| 2 | angry | | gid>=10 OR gid<=3 |
+------+-------------+------+-------------------+
2 rows in set (0.01 sec){
"took":0,
"timed_out":false,
"hits":{
"total":2,
"hits":[
{
"_id": 1,
"_score":1,
"_source":{
"query":{ "ql":"filter test" },
"tags":"",
"filters":"gid>=10"
}
},
{
"_id": 2,
"_score":1,
"_source":{
"query":{"ql":"angry"},
"tags":"",
"filters":"gid>=10 OR gid<=3"
}
}
]
}
}(
[took] => 0
[timed_out] =>
[hits] =>
(
[total] => 2
[hits] =>
(
[0] =>
(
[_id] => 1
[_score] => 1
[_source] =>
(
[query] =>
(
[ql] => filter test
)
[tags] =>
[filters] => gid>=10
)
),
[1] =>
(
[_id] => 1
[_score] => 1
[_source] =>
(
[query] =>
(
[ql] => angry
)
[tags] =>
[filters] => gid>=10 OR gid<=3
)
)
)
)
){'hits': {'hits': [{u'_id': u'2811025403043381501',
u'_score': 1,
u'_source': {u'filters': u"gid>=10",
u'query': u'filter test',
u'tags': u''}},
{u'_id': u'2811025403043381502',
u'_score': 1,
u'_source': {u'filters': u"gid>=10 OR gid<=3",
u'query': u'angry',
u'tags': u''}}],
'total': 2},
'profile': None,
'timed_out': False,
'took': 0}{'hits': {'hits': [{u'_id': u'2811025403043381501',
u'_score': 1,
u'_source': {u'filters': u"gid>=10",
u'query': u'filter test',
u'tags': u''}},
{u'_id': u'2811025403043381502',
u'_score': 1,
u'_source': {u'filters': u"gid>=10 OR gid<=3",
u'query': u'angry',
u'tags': u''}}],
'total': 2},
'profile': None,
'timed_out': False,
'took': 0}{"hits": {"hits": [{"_id": 2811025403043381501,
"_score": 1,
"_source": {"filters": u"gid>=10",
"query": "filter test",
"tags": ""}},
{"_id": 2811025403043381502,
"_score": 1,
"_source": {"filters": u"gid>=10 OR gid<=3",
"query": "angry",
"tags": ""}}],
"total": 2},
"timed_out": false,
"took": 0}class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 2
maxScore: null
hits: [{_id=2811045522851233962, _score=1, _source={filters=gid>=10, query=filter test, tags=}}, {_id=2811045522851233951, _score=1, _source={filters=gid>=10 OR gid<=3, query=angry,tags=}}]
aggregations: null
}
profile: null
}class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 2
maxScore: null
hits: [{_id=2811045522851233962, _score=1, _source={filters=gid>=10, query=filter test, tags=}}, {_id=2811045522851233951, _score=1, _source={filters=gid>=10 OR gid<=3, query=angry,tags=}}]
aggregations: null
}
profile: null
}class SearchResponse {
took: 0
timedOut: false
hits: class SearchResponseHits {
total: 2
maxScore: null
hits: [{_id=2811045522851233962, _score=1, _source={filters=gid>=10, query=filter test, tags=}}, {_id=2811045522851233951, _score=1, _source={filters=gid>=10 OR gid<=3, query=angry,tags=}}]
aggregations: null
}
profile: null
}{
'hits':
{
'hits':
[{
'_id': '2811025403043381501',
'_score': 1,
'_source':
{
'filters': "gid>=10",
'query': 'filter test',
'tags': ''
}
},
{
'_id':
'2811025403043381502',
'_score': 1,
'_source':
{
'filters': "gid>=10 OR gid<=3",
'query': 'angry',
'tags': ''
}
}],
'total': 2
},
'profile': None,
'timed_out': False,
'took': 0
}{
'hits':
{
'hits':
[{
'_id': '2811025403043381501',
'_score': 1,
'_source':
{
'filters': "gid>=10",
'query': 'filter test',
'tags': ''
}
},
{
'_id':
'2811025403043381502',
'_score': 1,
'_source':
{
'filters': "gid>=10 OR gid<=3",
'query': 'angry',
'tags': ''
}
}],
'total': 2
},
'profile': None,
'timed_out': False,
'took': 0
}И вы выполняете CALL PQ на распределённой таблице с несколькими документами.
- SQL
- JSON
- PHP
- Python
- Python-asyncio
- javascript
- Java
- C#
- Rust
- TypeScript
- Go
CALL PQ ('pq_d2', ('{"title":"angry test", "gid":3 }', '{"title":"filter test doc2", "gid":13}'), 1 AS docs);POST /pq/pq/_search -d '
"query":
{
"percolate":
{
"documents" : [
{ "title": "angry test", "gid": 3 },
{ "title": "filter test doc2", "gid": 13 }
]
}
}
'$params = [
'table' => 'pq',
'body' => [
'query' => [
'percolate' => [
'documents' => [
[
'title'=>'angry test',
'gid' => 3
],
[
'title'=>'filter test doc2',
'gid' => 13
],
]
]
]
]
];
$response = $client->pq()->search($params);searchApi.percolate('pq',{"percolate":{"documents":[{"title":"angry test","gid":3},{"title":"filter test doc2","gid":13}]}})await searchApi.percolate('pq',{"percolate":{"documents":[{"title":"angry test","gid":3},{"title":"filter test doc2","gid":13}]}})res = await searchApi.percolate('pq',{"percolate":{"documents":[{"title":"angry test","gid":3},{"title":"filter test doc2","gid":13}]}});percolateRequest = new PercolateRequest();
query = new HashMap<String,Object>(){{
put("percolate",new HashMap<String,Object >(){{
put("documents", new ArrayList<Object>(){{
add(new HashMap<String,Object >(){{
put("title","angry test");
put("gid",3);
}});
add(new HashMap<String,Object >(){{
put("title","filter test doc2");
put("gid",13);
}});
}});
}});
}};
percolateRequest.query(query);
searchApi.percolate("pq",percolateRequest);var doc1 = new Dictionary<string, Object>();
doc1.Add("title","angry test");
doc1.Add("gid",3);
var doc2 = new Dictionary<string, Object>();
doc2.Add("title","filter test doc2");
doc2.Add("gid",13);
var docs = new List<Object> {doc1, doc2};
Dictionary<string, Object> percolateDoc = new Dictionary<string, Object> {{ "documents", docs }};
Dictionary<string, Object> query = new Dictionary<string, Object> {{ "percolate", percolateDoc }};
PercolateRequest percolateRequest = new PercolateRequest(query=query);
searchApi.Percolate("pq",percolateRequest);let mut percolate_doc_fields1 = HashMap::new();
percolate_doc_fields1.insert("title".to_string(), "angry test");
percolate_doc_fields1.insert("gid".to_string(), 3);
let mut percolate_doc_fields2 = HashMap::new();
percolate_doc_fields2.insert("title".to_string(), "filter test doc2");
percolate_doc_fields2.insert("gid".to_string(), 13);
let mut percolate_doc_fields_list: [HashMap; 2] = [percolate_doc_fields1, percolate_doc_fields2];
let mut percolate_doc = HashMap::new();
percolate_doc.insert("documents".to_string(), percolate_doc_fields_list);
let percolate_query = PercolateRequestQuery::new(serde_json::json!(percolate_doc));
let percolate_req = PercolateRequest::new(percolate_query);
search_api.percolate("pq", percolate_req).await;docs = [ {title : 'What a nice bag'}, {title : 'Really nice shoes'} ];
res = await searchApi.percolate('test_pq', { query: { percolate: { documents : docs } } } );doc1 := map[string]interface{} {"title": "What a nice bag"}
doc2 := map[string]interface{} {"title": "Really nice shoes"}
query := []interface{} {doc1, doc2}
percolateRequestQuery := manticoreclient.NewPercolateQuery(query)
percolateRequest := manticoreclient.NewPercolateRequest(percolateRequestQuery)
res, _, _ := apiClient.SearchAPI.Percolate(context.Background(), "test_pq").PercolateRequest(*percolateRequest).Execute()+------+-----------+
| id | documents |
+------+-----------+
| 1 | 2 |
| 2 | 1 |
+------+-----------+{
"took":0,
"timed_out":false,
"hits":{
"total":2,"hits":[
{
"_id": 2,
"_score":1,
"_source":{
"query":{"title":"angry"},
"tags":"",
"filters":"gid>=10 OR gid<=3"
}
}
{
"_id": 1,
"_score":1,
"_source":{
"query":{"ql":"filter test"},
"tags":"",
"filters":"gid>=10"
}
},
]
}
}(
[took] => 0
[timed_out] =>
[hits] =>
(
[total] => 2
[hits] =>
(
[0] =>
(
[_index] => pq
[_type] => doc
[_id] => 2
[_score] => 1
[_source] =>
(
[query] =>
(
[ql] => angry
)
[tags] =>
[filters] => gid>=10 OR gid<=3
),
[fields] =>
(
[_percolator_document_slot] =>
(
[0] => 1
)
)
),
[1] =>
(
[_index] => pq
[_id] => 1
[_score] => 1
[_source] =>
(
[query] =>
(
[ql] => filter test
)
[tags] =>
[filters] => gid>=10
)
[fields] =>
(
[_percolator_document_slot] =>
(
[0] => 0
)
)
)
)
)
){'hits': {'hits': [{u'_id': u'2811025403043381480',
u'table': u'pq',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'angry'},u'tags':u'',u'filters':u"gid>=10 OR gid<=3"},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [1]}},
{u'_id': u'2811025403043381501',
u'table': u'pq',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'filter test'},u'tags':u'',u'filters':u"gid>=10"},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [1]}}],
'total': 2},
'profile': None,
'timed_out': False,
'took': 0}{'hits': {'hits': [{u'_id': u'2811025403043381480',
u'table': u'pq',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'angry'},u'tags':u'',u'filters':u"gid>=10 OR gid<=3"},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [1]}},
{u'_id': u'2811025403043381501',
u'table': u'pq',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'filter test'},u'tags':u'',u'filters':u"gid>=10"},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [1]}}],
'total': 2},
'profile': None,
'timed_out': False,
'took': 0}{'hits': {'hits': [{u'_id': u'2811025403043381480',
u'table': u'pq',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'angry'},u'tags':u'',u'filters':u"gid>=10 OR gid<=3"},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [1]}},
{u'_id': u'2811025403043381501',
u'table': u'pq',
u'_score': u'1',
u'_source': {u'query': {u'ql': u'filter test'},u'tags':u'',u'filters':u"gid>=10"},
u'_type': u'doc',
u'fields': {u'_percolator_document_slot': [1]}}],
'total': 2},
'profile': None,
'timed_out': False,
'took': 0}class SearchResponse {
took: 10
timedOut: false
hits: class SearchResponseHits {
total: 2
maxScore: 1
hits: [{_index=pq, _type=doc, _id=2811045522851234165, _score=1, _source={query={ql=@title angry}}, fields={_percolator_document_slot=[1]}}, {_index=pq, _type=doc, _id=2811045522851234166, _score=1, _source={query={ql=@title filter test doc2}}, fields={_percolator_document_slot=[2]}}]
aggregations: null
}
profile: null
}class SearchResponse {
took: 10
timedOut: false
hits: class SearchResponseHits {
total: 2
maxScore: 1
hits: [{_index=pq, _type=doc, _id=2811045522851234165, _score=1, _source={query={ql=@title angry}}, fields={_percolator_document_slot=[1]}}, {_index=pq, _type=doc, _id=2811045522851234166, _score=1, _source={query={ql=@title filter test doc2}}, fields={_percolator_document_slot=[2]}}]
aggregations: null
}
profile: null
}class SearchResponse {
took: 10
timedOut: false
hits: class SearchResponseHits {
total: 2
maxScore: 1
hits: [{_index=pq, _type=doc, _id=2811045522851234165, _score=1, _source={query={ql=@title angry}}, fields={_percolator_document_slot=[1]}}, {_index=pq, _type=doc, _id=2811045522851234166, _score=1, _source={query={ql=@title filter test doc2}}, fields={_percolator_document_slot=[2]}}]
aggregations: null
}
profile: null
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 2,
"hits": [
{
"table": "test_pq",
"_type": "doc",
"_id": 1657852401006149661,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
},
{
"table": "test_pq",
"_type": "doc",
"_id": 1657852401006149662,
"_score": "1",
"_source": {
"query": {
"ql": "@title shoes"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}{
"took": 0,
"timed_out": false,
"hits": {
"total": 2,
"hits": [
{
"table": "test_pq",
"_type": "doc",
"_id": 1657852401006149661,
"_score": "1",
"_source": {
"query": {
"ql": "@title bag"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
},
{
"table": "test_pq",
"_type": "doc",
"_id": 1657852401006149662,
"_score": "1",
"_source": {
"query": {
"ql": "@title shoes"
}
},
"fields": {
"_percolator_document_slot": [
1
]
}
}
]
}
}В предыдущем примере мы использовали режим по умолчанию — sparse. Чтобы продемонстрировать режим sharded, давайте создадим распределённую PQ таблицу, состоящую из 2 локальных PQ таблиц, и добавим 2 документа в "products1" и 1 документ в "products2":
create table products1(title text, color string) type='pq';
create table products2(title text, color string) type='pq';
create table products_distributed type='distributed' local='products1' local='products2';
INSERT INTO products1(query) values('@title bag');
INSERT INTO products1(query,filters) values('@title shoes', 'color=\'red\'');
INSERT INTO products2(query,filters) values('@title shoes', 'color in (\'blue\', \'green\')');
Теперь, если вы добавите 'sharded' as mode к CALL PQ, документы будут отправлены во все таблицы агента (в данном случае только локальные таблицы, но они могут быть удалёнными для использования внешнего оборудования). Этот режим недоступен через JSON-интерфейс.
- SQL
CALL PQ('products_distributed', ('{"title": "nice pair of shoes", "color": "blue"}', '{"title": "beautiful bag"}'), 'sharded' as mode, 1 as query);+---------------------+--------------+------+---------------------------+
| id | query | tags | filters |
+---------------------+--------------+------+---------------------------+
| 1657852401006149639 | @title bag | | |
| 1657852401006149643 | @title shoes | | color IN ('blue, 'green') |
+---------------------+--------------+------+---------------------------+Обратите внимание, что синтаксис зеркал агентов в конфигурации (когда несколько хостов указаны в одной строке agent, разделённые |) не имеет никакого отношения к режиму запроса CALL PQ. Каждый agent всегда представляет один узел, независимо от количества HA-зеркал, указанных для этого агента.
В некоторых случаях вы можете захотеть получить более подробную информацию о производительности запроса percolate. Для этой цели существует опция 1 as verbose, которая доступна только через SQL и позволяет сохранять больше метрик производительности. Вы можете просмотреть их с помощью запроса SHOW META, который можно выполнить после CALL PQ. Подробнее см. в разделе SHOW META.
- 1 as verbose
- 0 as verbose
CALL PQ('products', ('{"title": "nice pair of shoes", "color": "blue"}', '{"title": "beautiful bag"}'), 1 as verbose); show meta;CALL PQ('products', ('{"title": "nice pair of shoes", "color": "blue"}', '{"title": "beautiful bag"}'), 0 as verbose); show meta;+---------------------+
| id |
+---------------------+
| 1657852401006149644 |
| 1657852401006149646 |
+---------------------+
+-------------------------+-----------+
| Variable name | Value |
+-------------------------+-----------+
| total | 0.000 sec |
| setup | 0.000 sec |
| queries_matched | 2 |
| queries_failed | 0 |
| document_matched | 2 |
| total_queries_stored | 3 |
| term_only_queries | 3 |
| fast_rejected_queries | 0 |
| time_per_query | 27, 10 |
| time_of_matched_queries | 37 |
+-------------------------+-----------++---------------------+
| id |
+---------------------+
| 1657852401006149644 |
| 1657852401006149646 |
+---------------------+
+-----------------------+-----------+
| Variable name | Value |
+-----------------------+-----------+
| total | 0.000 sec |
| queries_matched | 2 |
| queries_failed | 0 |
| document_matched | 2 |
| total_queries_stored | 3 |
| term_only_queries | 3 |
| fast_rejected_queries | 0 |
+-----------------------+-----------+Автозаполнение, или дополнение слова, предсказывает и предлагает окончание слова или фразы по мере ввода. Обычно используется в:
- Поисковых строках на сайтах
- Подсказках в поисковых системах
- Текстовых полях в приложениях
Manticore предлагает продвинутую функцию автозаполнения, которая дает подсказки во время ввода, аналогично известным поисковым системам. Это помогает ускорить поиск и позволяет пользователям быстрее находить нужное.
Помимо базовой функциональности автозаполнения, Manticore включает расширенные возможности для улучшения пользовательского опыта:
- Коррекция орфографии (Fuzziness): Автозаполнение Manticore помогает исправлять орфографические ошибки с помощью алгоритмов, которые распознают и исправляют распространённые ошибки. Это значит, что даже если вы ошиблись при вводе, вы всё равно сможете найти то, что искали.
- Автоматическое определение раскладки клавиатуры: Manticore может определить, какую раскладку клавиатуры вы используете. Это особенно полезно в местах с множеством языков или если вы случайно печатаете не на том языке. Например, если вы ошибочно ввели "ghbdtn", Manticore поймёт, что вы хотели написать "привет" и предложит правильное слово.
Автозаполнение Manticore можно настроить под разные нужды и параметры, что делает его гибким инструментом для многих приложений.

ПРИМЕЧАНИЕ:
CALL AUTOCOMPLETEи/autocompleteтребуют Manticore Buddy. Если не работает, убедитесь, что Buddy установлен.
Для использования автозаполнения в Manticore используйте SQL-запрос CALL AUTOCOMPLETE или его JSON-эквивалент /autocomplete. Эта функция предоставляет предложения по дополнению слов на основе ваших индексированных данных.
Перед использованием убедитесь, что в таблице, которую вы собираетесь использовать для автозаполнения, включены инфиксы.
Примечание: В настройках таблицы есть автоматическая проверка min_infix_len, которая использует кэш на 30 секунд для улучшения производительности CALL AUTOCOMPLETE. После внесения изменений в таблицу может быть небольшая задержка при первом использовании CALL AUTOCOMPLETE (обычно незаметная). Кэшируются только успешные результаты, поэтому если вы удалите таблицу или отключите min_infix_len, CALL AUTOCOMPLETE может временно возвращать некорректные результаты, пока не начнёт показывать ошибку, связанную с min_infix_len.
CALL AUTOCOMPLETE('query_beginning', 'table', [...options]);
POST /autocomplete
{
"table":"table_name",
"query":"query_beginning"
[,"options": {<autocomplete options>}]
}
layouts: Строка с разделёнными запятыми кодами раскладок клавиатуры для обнаружения ошибок ввода из-за несоответствия раскладок (например, ввод "ghbdtn" вместо "привет" при неправильной раскладке). Manticore сравнивает позиции символов в разных раскладках для предложения исправлений. Требуется минимум 2 раскладки для эффективного обнаружения несоответствий. Доступные варианты: us, ru, ua, se, pt, no, it, gr, uk, fr, es, dk, de, ch, br, bg, be (подробнее здесь). По умолчанию: nonefuzziness:0,1или2(по умолчанию:2). Максимальное расстояние Левенштейна для поиска опечаток. Установите в0для отключения нечеткого поискаpreserve:0или1(по умолчанию:0). При значении1сохраняет слова без нечетких совпадений в результатах поиска (например, "hello wrld" возвращает и "hello wrld", и "hello world"). При значении0возвращает только слова с успешными нечеткими совпадениями (например, "hello wrld" возвращает только "hello world"). Особенно полезно для сохранения коротких слов или собственных имён, которые могут отсутствовать в Manticore Searchprepend: Булево (0/1 в SQL). Если true(1), добавляет звёздочку перед последним словом для расширения префикса (например,*word)append: Булево (0/1 в SQL). Если true(1), добавляет звёздочку после последнего слова для расширения суффикса (например,word*)expansion_len: Количество символов для расширения в последнем слове. По умолчанию:10force_bigrams: Булево (0/1 в SQL). Принудительно использует биграммы (2-символьные n-граммы) вместо триграмм для всех длин слов, что может улучшить совпадения для слов с ошибками перестановки символов. По умолчанию:0(использовать триграммы для слов длиной ≥6 символов)
- SQL
- SQL with no fuzzy search
- JSON
- SQL with preserve option
- JSON with preserve option
mysql> CALL AUTOCOMPLETE('hello', 'comment');
+------------+
| query |
+------------+
| hello |
| helio |
| hell |
| shell |
| nushell |
| powershell |
| well |
| help |
+------------+mysql> CALL AUTOCOMPLETE('hello', 'comment', 0 as fuzziness);
+-------+
| query |
+-------+
| hello |
+-------+POST /autocomplete
{
"table":"comment",
"query":"hello"
}mysql> CALL AUTOCOMPLETE('hello wrld', 'comment', 1 as preserve);
+------------+
| query |
+------------+
| hello wrld |
| hello world|
+------------+POST /autocomplete
{
"table":"comment",
"query":"hello wrld",
"options": {
"preserve": 1
}
}[
{
"total": 8,
"error": "",
"warning": "",
"columns": [
{
"query": {
"type": "string"
}
}
],
"data": [
{
"query": "hello"
},
{
"query": "helio"
},
{
"query": "hell"
},
{
"query": "shell"
},
{
"query": "nushell"
},
{
"query": "powershell"
},
{
"query": "well"
},
{
"query": "help"
}
]
}
][
{
"total": 2,
"error": "",
"warning": "",
"columns": [
{
"query": {
"type": "string"
}
}
],
"data": [
{
"query": "hello wrld"
},
{
"query": "hello world"
}
]
}
]Опция force_bigrams помогает с словами, в которых есть ошибки перестановки символов, например "ipohne" вместо "iphone". Используя биграммы вместо триграмм, алгоритм лучше справляется с перестановками символов.
- SQL
- JSON
mysql> CALL AUTOCOMPLETE('ipohne', 'products', 1 as force_bigrams);POST /autocomplete
{
"table":"products",
"query":"ipohne",
"options": {
"force_bigrams": 1
}
}+--------+
| query |
+--------+
| iphone |
+--------+[
{
"total": 1,
"error": "",
"warning": "",
"columns": [
{
"query": {
"type": "string"
}
}
],
"data": [
{
"query": "iphone"
}
]
}
]- Демонстрация показывает работу автозаполнения:

- Блог о нечетком поиске и автозаполнении - https://manticoresearch.com/blog/new-fuzzy-search-and-autocomplete/
Хотя CALL AUTOCOMPLETE является рекомендуемым методом для большинства случаев, Manticore также поддерживает другие управляемые и настраиваемые подходы для реализации функции автозаполнения:
Для автозаполнения предложения можно использовать инфиксный поиск. Вы можете найти конец поля документа, указав его начало и:
- используя оператор подстановки полного текста
*для совпадения с любыми символами - опционально используя
^для начала с начала поля - опционально используя
""для поиска фразы - и используя подсветку результатов
В нашем блоге есть статья об этом и интерактивный курс. Быстрый пример:
- Предположим, у вас есть документ:
My cat loves my dog. The cat (Felis catus) is a domestic species of small carnivorous mammal. - Затем вы можете использовать
^,""и*, чтобы по мере ввода пользователем формировать запросы, например:^"m*",^"my *",^"my c*",^"my ca*"и так далее - Он найдет документ, и если вы также используете подсветку, вы получите что-то вроде:
<b>My cat</b> loves my dog. The cat ( ...
В некоторых случаях вам нужно автозаполнить одно слово или пару слов. В этом случае можно использовать CALL KEYWORDS.
CALL KEYWORDS доступна через SQL-интерфейс и предоставляет способ проверить, как ключевые слова токенизируются, или получить токенизированные формы конкретных ключевых слов. Если в таблице включены инфиксы, это позволяет быстро находить возможные окончания для заданных ключевых слов, что подходит для функции автозаполнения.
Это отличная альтернатива общему поиску с инфиксами, так как обеспечивает более высокую производительность, поскольку требует только словаря таблицы, а не самих документов.
CALL KEYWORDS(text, table [, options])
Оператор CALL KEYWORDS разбивает текст на ключевые слова. Он возвращает токенизированные и нормализованные формы ключевых слов, а при необходимости — статистику по ключевым словам. Кроме того, он предоставляет позицию каждого ключевого слова в запросе и все формы токенизированных ключевых слов, если в таблице включены лемматизаторы.
| Параметр | Описание |
|---|---|
| text | Текст для разбиения на ключевые слова |
| table | Имя таблицы, из которой берутся настройки обработки текста |
| 0/1 as stats | Показать статистику по ключевым словам, по умолчанию 0 |
| 0/1 as fold_wildcards | Обрабатывать подстановочные знаки, по умолчанию 0 |
| 0/1 as fold_lemmas | Обрабатывать морфологические леммы, по умолчанию 0 |
| 0/1 as fold_blended | Обрабатывать составные слова, по умолчанию 0 |
| N as expansion_limit | Переопределить expansion_limit, заданный в конфигурации сервера, по умолчанию 0 (использовать значение из конфигурации) |
| docs/hits as sort_mode | Сортировать результаты по 'docs' или 'hits'. По умолчанию сортировка не применяется. |
| jieba_mode | Режим сегментации Jieba для запроса. Подробнее см. jieba_mode |
Примеры показывают, как это работает, если предположить, что пользователь пытается получить автозаполнение для "my cat ...". На стороне приложения все, что нужно — предлагать пользователю окончания из столбца "normalized" для каждого нового слова. Часто имеет смысл сортировать по hits или docs, используя 'hits' as sort_mode или 'docs' as sort_mode.
- Examples
MySQL [(none)]> CALL KEYWORDS('m*', 't', 1 as stats);
+------+-----------+------------+------+------+
| qpos | tokenized | normalized | docs | hits |
+------+-----------+------------+------+------+
| 1 | m* | my | 1 | 2 |
| 1 | m* | mammal | 1 | 1 |
+------+-----------+------------+------+------+
MySQL [(none)]> CALL KEYWORDS('my*', 't', 1 as stats);
+------+-----------+------------+------+------+
| qpos | tokenized | normalized | docs | hits |
+------+-----------+------------+------+------+
| 1 | my* | my | 1 | 2 |
+------+-----------+------------+------+------+
MySQL [(none)]> CALL KEYWORDS('c*', 't', 1 as stats, 'hits' as sort_mode);
+------+-----------+-------------+------+------+
| qpos | tokenized | normalized | docs | hits |
+------+-----------+-------------+------+------+
| 1 | c* | cat | 1 | 2 |
| 1 | c* | carnivorous | 1 | 1 |
| 1 | c* | catus | 1 | 1 |
+------+-----------+-------------+------+------+
MySQL [(none)]> CALL KEYWORDS('ca*', 't', 1 as stats, 'hits' as sort_mode);
+------+-----------+-------------+------+------+
| qpos | tokenized | normalized | docs | hits |
+------+-----------+-------------+------+------+
| 1 | ca* | cat | 1 | 2 |
| 1 | ca* | carnivorous | 1 | 1 |
| 1 | ca* | catus | 1 | 1 |
+------+-----------+-------------+------+------+
MySQL [(none)]> CALL KEYWORDS('cat*', 't', 1 as stats, 'hits' as sort_mode);
+------+-----------+------------+------+------+
| qpos | tokenized | normalized | docs | hits |
+------+-----------+------------+------+------+
| 1 | cat* | cat | 1 | 2 |
| 1 | cat* | catus | 1 | 1 |
+------+-----------+------------+------+------+Есть хороший трюк, как улучшить описанный выше алгоритм — использовать bigram_index. Если он включен для таблицы, то в нем индексируется не просто отдельное слово, а каждая пара слов, стоящих друг за другом, как отдельный токен.
Это позволяет предсказывать не только окончание текущего слова, но и следующее слово, что особенно полезно для автозаполнения.
- Examples
MySQL [(none)]> CALL KEYWORDS('m*', 't', 1 as stats, 'hits' as sort_mode);
+------+-----------+------------+------+------+
| qpos | tokenized | normalized | docs | hits |
+------+-----------+------------+------+------+
| 1 | m* | my | 1 | 2 |
| 1 | m* | mammal | 1 | 1 |
| 1 | m* | my cat | 1 | 1 |
| 1 | m* | my dog | 1 | 1 |
+------+-----------+------------+------+------+
MySQL [(none)]> CALL KEYWORDS('my*', 't', 1 as stats, 'hits' as sort_mode);
+------+-----------+------------+------+------+
| qpos | tokenized | normalized | docs | hits |
+------+-----------+------------+------+------+
| 1 | my* | my | 1 | 2 |
| 1 | my* | my cat | 1 | 1 |
| 1 | my* | my dog | 1 | 1 |
+------+-----------+------------+------+------+
MySQL [(none)]> CALL KEYWORDS('c*', 't', 1 as stats, 'hits' as sort_mode);
+------+-----------+--------------------+------+------+
| qpos | tokenized | normalized | docs | hits |
+------+-----------+--------------------+------+------+
| 1 | c* | cat | 1 | 2 |
| 1 | c* | carnivorous | 1 | 1 |
| 1 | c* | carnivorous mammal | 1 | 1 |
| 1 | c* | cat felis | 1 | 1 |
| 1 | c* | cat loves | 1 | 1 |
| 1 | c* | catus | 1 | 1 |
| 1 | c* | catus is | 1 | 1 |
+------+-----------+--------------------+------+------+
MySQL [(none)]> CALL KEYWORDS('ca*', 't', 1 as stats, 'hits' as sort_mode);
+------+-----------+--------------------+------+------+
| qpos | tokenized | normalized | docs | hits |
+------+-----------+--------------------+------+------+
| 1 | ca* | cat | 1 | 2 |
| 1 | ca* | carnivorous | 1 | 1 |
| 1 | ca* | carnivorous mammal | 1 | 1 |
| 1 | ca* | cat felis | 1 | 1 |
| 1 | ca* | cat loves | 1 | 1 |
| 1 | ca* | catus | 1 | 1 |
| 1 | ca* | catus is | 1 | 1 |
+------+-----------+--------------------+------+------+
MySQL [(none)]> CALL KEYWORDS('cat*', 't', 1 as stats, 'hits' as sort_mode);
+------+-----------+------------+------+------+
| qpos | tokenized | normalized | docs | hits |
+------+-----------+------------+------+------+
| 1 | cat* | cat | 1 | 2 |
| 1 | cat* | cat felis | 1 | 1 |
| 1 | cat* | cat loves | 1 | 1 |
| 1 | cat* | catus | 1 | 1 |
| 1 | cat* | catus is | 1 | 1 |
+------+-----------+------------+------+------+CALL KEYWORDS поддерживает распределённые таблицы, так что независимо от размера вашего набора данных вы можете использовать эту функцию.