Elasticsearch Basic Query Tutorial

2023-07-18 08:24 | Source: compiled from the web

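Preface

What is ES? Elasticsearch (ES) is an open-source, distributed full-text search engine built on Apache Lucene. It exposes a simple RESTful API that hides Lucene's complexity. Beyond full-text search, ES can also be described as:

1. A distributed, real-time document store in which every field is indexed and searchable.
2. A distributed, real-time analytics search engine.
3. A system that scales to hundreds or thousands of servers and handles petabytes of structured or unstructured data.

How ES organizes data: an analogy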

| Relational DB | Elasticsearch |
| --- | --- |
| database | indices |
| tables | types |
| rows | documents |
| columns | fields |

Installing ES on macOS

- 1. Update brew: ```brew update```
- 2. Install Java 1.8: ```brew cask install homebrew/cask-versions/java8```
- 3. Install ES: ```brew install elasticsearch```
- 4. Start ES locally: ```brew services start elasticsearch```
- 5. Check the installation on port 9200: ```http://localhost:9200```
- 6. Install Kibana, a companion tool that lets you interact with ES from a web page: ```brew install kibana```
- 7. Start Kibana locally: ```brew services start kibana```
- 8. Open the Kibana console on port 5601: ```http://localhost:5601```

I. Basic create, read, update and delete in ES

1. Create a document (updated if it exists, created if it does not)

    PUT test/doc/2
    { "name": "wangfei", "age": 27, "desc": "热天还不让后人不认同" }

    PUT test/doc/1
    { "name": "wangjifei", "age": 27, "desc": "萨芬我反胃为范围额" }

    PUT test/doc/3
    { "name": "wangyang", "age": 30, "desc": "点在我心内的几首歌" }

2. Get information about an index

    GET test

3. Get a specific document

    GET test/doc/1
    GET test/doc/2

4. Get all data under an index

    GET test/doc/_search

or

    GET test/doc/_search
    {
      "query": { "match_all": {} }
    }

5. Delete a specific document

    DELETE test/doc/3

6. Delete an index

    DELETE test

7. Updating a specific document

The wrong way: a PUT replaces the whole document, so every field you do not specify is lost and only the fields in the request body remain.

    PUT test/doc/1
    { "name": "王计飞" }

The right way: use POST with _update after the id, and put the fields to change inside the doc property.

    POST test/doc/1/_update
    {
      "doc": { "desc": "生活就像 茫茫海上" }
    }

II. Two ways to query ES

1. Query-string search

    GET test/doc/_search?q=name:wangfei

2. Structured query (a match query targets a single field; it cannot combine several fields by itself)

    GET test/doc/_search
    {
      "query": { "match": { "name": "wang" } }
    }

III. The match family

1. match_all (return everything)

    GET test/doc/_search
    {
      "query": { "match_all": {} }
    }

2. match_phrase (phrase query)

Prepare the data:

    PUT test1/doc/1
    { "title": "中国是世界上人口最多的国家" }

    PUT test1/doc/2
    { "title": "美国是世界上军事实力最强大的国家" }

    PUT test1/doc/3
    { "title": "北京是中国的首都" }

Query:

    GET test1/doc/_search
    {
      "query": { "match": { "title": "中国" } }
    }

>>> Output

    {
      "took": 241, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 3, "max_score": 0.68324494,
        "hits": [
          { "_index": "test1", "_type": "doc", "_id": "1", "_score": 0.68324494,
            "_source": { "title": "中国是世界上人口最多的国家" } },
          { "_index": "test1", "_type": "doc", "_id": "3", "_score": 0.5753642,
            "_source": { "title": "北京是中国的首都" } },
          { "_index": "test1", "_type": "doc", "_id": "2", "_score": 0.39556286,
            "_source": { "title": "美国是世界上军事实力最强大的国家" } }
        ]
      }
    }

The documents containing 中国 come back as expected, but so does the one about 美国, which is not what we wanted. The reason is that the default analyzer splits Chinese text one character at a time. Searching for 中国 therefore searches for 中 and 国 separately, and the 国 in 美国 matches as well. We think of 中国 as a phrase with a specific meaning, but out of the box ES handles Chinese word segmentation poorly. Plugins for Chinese are covered later; for now the problem can be solved with a phrase query, match_phrase:

    GET test1/doc/_search
    {
      "query": { "match_phrase": { "title": "中国" } }
    }

>>> Query result

    {
      "took": 10, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 2, "max_score": 0.5753642,
        "hits": [
          { "_index": "test1", "_type": "doc", "_id": "1", "_score": 0.5753642,
            "_source": { "title": "中国是世界上人口最多的国家" } },
          { "_index": "test1", "_type": "doc", "_id": "3", "_score": 0.5753642,
            "_source": { "title": "北京是中国的首都" } }
        ]
      }
    }

Suppose we search for the two words 中国 and 世界 without knowing how many other terms sit between them. The query then needs some leeway, which is what slop provides. It behaves roughly like the regular expression 中国.*?世界, except that slop caps how far apart the terms may be. The default slop is 0.

    GET test1/doc/_search
    {
      "query": {
        "match_phrase": {
          "title": { "query": "中国世界", "slop": 2 }
        }
      }
    }

>>> Query result

    {
      "took": 23, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 1, "max_score": 0.7445889,
        "hits": [
          { "_index": "test1", "_type": "doc", "_id": "1", "_score": 0.7445889,
            "_source": { "title": "中国是世界上人口最多的国家" } }
        ]
      }
    }

3. match_phrase_prefix (leftmost-prefix query): "starts with" search for suggestions

Prepare the data:

    PUT test2/doc/1
    { "title": "prefix1", "desc": "beautiful girl you are beautiful so" }

    PUT test2/doc/2
    { "title": "beautiful", "desc": "I like basking on the beach" }

Search for data containing a word that starts with specific letters:

    GET test2/doc/_search
    {
      "query": { "match_phrase_prefix": { "desc": "bea" } }
    }

>>> Query result

    {
      "took": 5, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 2, "max_score": 0.39556286,
        "hits": [
          { "_index": "test2", "_type": "doc", "_id": "1", "_score": 0.39556286,
            "_source": { "title": "prefix1", "desc": "beautiful girl you are beautiful so" } },
          { "_index": "test2", "_type": "doc", "_id": "2", "_score": 0.2876821,
            "_source": { "title": "beautiful", "desc": "I like basking on the beach" } }
        ]
      }
    }

Search for a phrase whose last word is a prefix:

    GET test2/doc/_search
    {
      "query": { "match_phrase_prefix": { "desc": "you are bea" } }
    }

>>> Query result

    {
      "took": 28, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 1, "max_score": 0.8630463,
        "hits": [
          { "_index": "test2", "_type": "doc", "_id": "1", "_score": 0.8630463,
            "_source": { "title": "prefix1", "desc": "beautiful girl you are beautiful so" } }
        ]
      }
    }

The max_expansions parameter: prefix queries can be expensive, so use this parameter to limit how many terms the prefix is expanded to.

    GET test2/doc/_search
    {
      "query": {
        "match_phrase_prefix": {
          "desc": { "query": "bea", "max_expansions": 1 }
        }
      }
    }

4. multi_match (multi-field query)

multi_match searches for the same keyword in several fields. It can also stand in for match_phrase and match_phrase_prefix; you only need to set type.

    GET test2/doc/_search
    {
      "query": {
        "multi_match": { "query": "beautiful", "fields": ["title", "desc"] }
      }
    }

>>> Query result

    {
      "took": 43, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 2, "max_score": 0.39556286,
        "hits": [
          { "_index": "test2", "_type": "doc", "_id": "1", "_score": 0.39556286,
            "_source": { "title": "prefix1", "desc": "beautiful girl you are beautiful so" } },
          { "_index": "test2", "_type": "doc", "_id": "2", "_score": 0.2876821,
            "_source": { "title": "beautiful", "desc": "I like basking on the beach" } }
        ]
      }
    }

With type set to phrase, it is equivalent to a phrase query:

    GET test1/doc/_search
    {
      "query": {
        "multi_match": { "query": "中国", "fields": ["title"], "type": "phrase" }
      }
    }

>>> Query result

    {
      "took": 47, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 2, "max_score": 0.5753642,
        "hits": [
          { "_index": "test1", "_type": "doc", "_id": "1", "_score": 0.5753642,
            "_source": { "title": "中国是世界上人口最多的国家" } },
          { "_index": "test1", "_type": "doc", "_id": "3", "_score": 0.5753642,
            "_source": { "title": "北京是中国的首都" } }
        ]
      }
    }

With type set to phrase_prefix, it is equivalent to a leftmost-prefix query:

    GET test2/doc/_search
    {
      "query": {
        "multi_match": { "query": "bea", "fields": ["desc"], "type": "phrase_prefix" }
      }
    }

>>> Query result: the output printed at this point in the source repeats the test1 result of the previous query, so it is omitted here. The query should match the same two test2 documents as the match_phrase_prefix search for "bea" above.

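Summary of the match queries

1. match: returns documents that match any of the analyzed query terms.

2. match_all: returns everything.

3. match_phrase: phrase query. It builds on match by requiring the terms to appear together as a phrase; slop sets how many positions may separate them.

4. match_phrase_prefix: prefix query. The last term of the phrase is matched as a prefix, which makes it useful for search suggestions. Pair it with max_expansions, whose default is 50.

5. multi_match: multi-field query. It is quite flexible and can do the work of match_phrase and match_phrase_prefix.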
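The difference between match and match_phrase on Chinese text can be sketched in plain Python, without a running cluster. This is only a toy model under stated assumptions: `tokenize` splits text into single characters (as the default analyzer does for Chinese in the examples above), and `slop` is simplified to "extra positions allowed between in-order terms", which is narrower than Lucene's actual sloppy-phrase rule. The function names are made up for illustration.

```python
# Toy model of match vs. match_phrase; not how Elasticsearch is implemented.
DOCS = {
    1: "中国是世界上人口最多的国家",
    2: "美国是世界上军事实力最强大的国家",
    3: "北京是中国的首都",
}


def tokenize(text):
    """Assumption: one token per character, mimicking per-character CJK splitting."""
    return [ch for ch in text if not ch.isspace()]


def match(doc, query):
    """match: a document hits if it shares at least one token with the query."""
    return bool(set(tokenize(doc)) & set(tokenize(query)))


def match_phrase(doc, query, slop=0):
    """match_phrase (simplified): terms must appear in order, with at most
    `slop` extra positions in total between the first and last term."""
    tokens, terms = tokenize(doc), tokenize(query)
    if not terms:
        return False
    for start, token in enumerate(tokens):
        if token != terms[0]:
            continue
        pos, found = start, True
        for term in terms[1:]:
            try:
                # Earliest next occurrence keeps the span as short as possible.
                pos = tokens.index(term, pos + 1)
            except ValueError:
                found = False
                break
        if found and (pos - start) - (len(terms) - 1) <= slop:
            return True
    return False


def search(predicate, query, **kwargs):
    """Return the ids of all documents accepted by the predicate."""
    return [doc_id for doc_id, text in DOCS.items() if predicate(text, query, **kwargs)]


if __name__ == "__main__":
    print(search(match, "中国"))                      # all three documents
    print(search(match_phrase, "中国"))               # documents 1 and 3
    print(search(match_phrase, "中国世界", slop=2))    # document 1 only
```

In this model 中国世界 needs one extra position (the character 是), so it fails with the default slop of 0 and succeeds with slop 2, mirroring the query results shown earlier.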

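IV. Sorting in ES

In ES 6.8.4, analyzed fields such as text fields cannot be sorted on directly. To sort on such a field it has to be indexed twice: once analyzed (for searching) and once not analyzed (for sorting). The text fields that ES generates by default through dynamic mapping are made sortable this way, via a keyword sub-field.

Sorting on text fields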
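A minimal sketch of the "index twice" idea, written as Python helpers that only build request bodies (nothing is sent to a cluster). The helper names are hypothetical; the mapping shape follows the ES 6.x style used in this tutorial (with a `doc` type), and the `keyword` sub-field with `ignore_above: 256` mirrors what dynamic mapping generates in the `GET test` output later on.

```python
import json


def sortable_text_mapping(field, doc_type="doc"):
    """Body for `PUT <index>`: a text field plus a non-analyzed keyword sub-field."""
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    field: {
                        "type": "text",  # analyzed copy, used for full-text search
                        "fields": {
                            # non-analyzed copy, used for sorting and aggregations
                            "keyword": {"type": "keyword", "ignore_above": 256}
                        },
                    }
                }
            }
        }
    }


def sort_by_keyword(field, order="asc"):
    """Body for `GET <index>/_search`: sort on the keyword sub-field, not the text field."""
    return {
        "query": {"match_all": {}},
        "sort": [{field + ".keyword": {"order": order}}],
    }


if __name__ == "__main__":
    print(json.dumps(sortable_text_mapping("name"), indent=2))
    print(json.dumps(sort_by_keyword("name", "desc"), indent=2))
```

The point of the design is that the search body sorts on `name.keyword`, while full-text queries keep targeting `name`.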

Descending sort

    GET test/doc/_search
    {
      "query": { "match_all": {} },
      "sort": [ { "age": { "order": "desc" } } ]
    }

>>> Sorted result

    {
      "took": 152, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 3, "max_score": null,
        "hits": [
          { "_index": "test", "_type": "doc", "_id": "3", "_score": null,
            "_source": { "name": "wangyang", "age": 30, "desc": "点在我心内的几首歌" },
            "sort": [ 30 ] },
          { "_index": "test", "_type": "doc", "_id": "2", "_score": null,
            "_source": { "name": "wangfei", "age": 27, "desc": "热天还不让后人不认同" },
            "sort": [ 27 ] },
          { "_index": "test", "_type": "doc", "_id": "1", "_score": null,
            "_source": { "name": "wangjifei", "age": 27, "desc": "生活就像 茫茫海上" },
            "sort": [ 27 ] }
        ]
      }
    }

Ascending sort

    GET test/doc/_search
    {
      "query": { "match_all": {} },
      "sort": [ { "age": { "order": "asc" } } ]
    }

V. Pagination in ES

from: the offset to start from. size: how many results to return.

    GET test/doc/_search
    {
      "query": { "match_phrase_prefix": { "name": "wang" } },
      "from": 0,
      "size": 1
    }

>>> Query result

    {
      "took": 3, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 3, "max_score": 0.2876821,
        "hits": [
          { "_index": "test", "_type": "doc", "_id": "2", "_score": 0.2876821,
            "_source": { "name": "wangfei", "age": 27, "desc": "热天还不让后人不认同" } }
        ]
      }
    }

VI. bool queries in ES (must, should)

must (the value of must is a list, so it can hold several conditions side by side; a document is returned only if it satisfies all of them)

#### Single condition

    GET test/doc/_search
    {
      "query": {
        "bool": {
          "must": [ { "match": { "name": "wangfei" } } ]
        }
      }
    }

>>> Query result

    {
      "took": 4, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 1, "max_score": 0.2876821,
        "hits": [
          { "_index": "test", "_type": "doc", "_id": "2", "_score": 0.2876821,
            "_source": { "name": "wangfei", "age": 27, "desc": "热天还不让后人不认同" } }
        ]
      }
    }

#### Several conditions combined

    GET test/doc/_search
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "name": "wanggfei" } },
            { "match": { "age": 25 } }
          ]
        }
      }
    }

>>> Query result

    {
      "took": 21, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": { "total": 0, "max_score": null, "hits": [ ] }
    }

should (a document is returned as long as it satisfies one of the conditions)

    GET test/doc/_search
    {
      "query": {
        "bool": {
          "should": [
            { "match": { "name": "wangjifei" } },
            { "match": { "age": 27 } }
          ]
        }
      }
    }

>>> Query result

    {
      "took": 34, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 2, "max_score": 1.287682,
        "hits": [
          { "_index": "test", "_type": "doc", "_id": "1", "_score": 1.287682,
            "_source": { "name": "wangjifei", "age": 27, "desc": "生活就像 茫茫海上" } },
          { "_index": "test", "_type": "doc", "_id": "2", "_score": 1.0,
            "_source": { "name": "wangfei", "age": 27, "desc": "热天还不让后人不认同" } }
        ]
      }
    }

must_not (as the name suggests, documents matching any of the conditions are excluded)

    GET test/doc/_search
    {
      "query": {
        "bool": {
          "must_not": [
            { "match": { "name": "wangjifei" } },
            { "match": { "age": 27 } }
          ]
        }
      }
    }

>>> Query result

    {
      "took": 13, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 1, "max_score": 1.0,
        "hits": [
          { "_index": "test", "_type": "doc", "_id": "3", "_score": 1.0,
            "_source": { "name": "wangyang", "age": 30, "desc": "点在我心内的几首歌" } }
        ]
      }
    }

filter (conditional filtering; a range is expressed with range, where gt means greater than, lt less than, gte greater than or equal, and lte less than or equal)

    GET test/doc/_search
    {
      "query": {
        "bool": {
          "must": [ { "match": { "name": "wangjifei" } } ],
          "filter": {
            "range": { "age": { "gte": 10, "lt": 27 } }
          }
        }
      }
    }

>>> Query result

    {
      "took": 33, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": { "total": 0, "max_score": null, "hits": [ ] }
    }
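The way the four bool clauses combine can be sketched as a small in-memory evaluator. This is an illustration only, under loud assumptions: conditions are exact field/value comparisons (no analysis, no scoring), the function names are invented, and the range filter is hard-wired to the `age` field. One detail it does try to reflect: `should` is only required to match when the bool query has no `must` or `filter` clause.

```python
# The three documents of the `test` index used above.
DOCS = [
    {"_id": "1", "name": "wangjifei", "age": 27},
    {"_id": "2", "name": "wangfei", "age": 27},
    {"_id": "3", "name": "wangyang", "age": 30},
]


def in_range(value, gt=None, gte=None, lt=None, lte=None):
    """range: every bound that is given must hold."""
    return ((gt is None or value > gt) and (gte is None or value >= gte)
            and (lt is None or value < lt) and (lte is None or value <= lte))


def bool_search(docs, must=(), should=(), must_not=(), age_range=None):
    """Return the ids of documents accepted by the combined clauses."""
    hits = []
    for doc in docs:
        if not all(doc.get(field) == value for field, value in must):
            continue  # must: every condition has to hold (AND)
        if any(doc.get(field) == value for field, value in must_not):
            continue  # must_not: no condition may hold (NOT)
        if age_range is not None and not in_range(doc["age"], **age_range):
            continue  # filter: restricts hits without affecting scores
        # should: at least one has to hold (OR), but only when it stands alone
        should_required = bool(should) and not must and age_range is None
        if should_required and not any(doc.get(f) == v for f, v in should):
            continue
        hits.append(doc["_id"])
    return hits


if __name__ == "__main__":
    conditions = [("name", "wangjifei"), ("age", 27)]
    print(bool_search(DOCS, should=conditions))     # ['1', '2']
    print(bool_search(DOCS, must_not=conditions))   # ['3']
    print(bool_search(DOCS, must=[("name", "wangjifei")],
                      age_range={"gte": 10, "lt": 27}))  # []
```

The last call returns nothing for the same reason as the filter query above: wangjifei is 27, and `lt: 27` excludes 27 itself.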

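Summary of bool queries

must: AND relation, equivalent to and in a relational database.

should: OR relation, equivalent to or in a relational database.

must_not: NOT relation, equivalent to not in a relational database.

filter: filter condition.

range: the range used for filtering.

gt: greater than, equivalent to > in a relational database.

gte: greater than or equal, equivalent to >= in a relational database.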

lt: less than, equivalent to < in a relational database.

lte: less than or equal, equivalent to <= in a relational database.

(The beginning of section VII is missing from the source; only the final query result of that section survives.)

>>> Query result

    {
      "took": 58, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 1, "max_score": 0.2876821,
        "hits": [
          { "_index": "test3", "_type": "doc", "_id": "1", "_score": 0.2876821,
            "_source": { "name": "顾老二", "age": 30 } }
        ]
      }
    }

VIII. Highlighting query results

Default highlighting

    GET test3/doc/_search
    {
      "query": { "match": { "name": "顾老二" } },
      "highlight": { "fields": { "name": {} } }
    }

>>> Query result (by default ES wraps each matched term in em tags; the tags were stripped from this listing in the source)

    {
      "took": 216, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 1, "max_score": 0.8630463,
        "hits": [
          { "_index": "test3", "_type": "doc", "_id": "1", "_score": 0.8630463,
            "_source": { "name": "顾老二", "age": 30, "from": "gu",
                         "desc": "皮肤黑、武器长、性格直", "tags": [ "黑", "长", "直" ] },
            "highlight": { "name": [ "顾老二" ] } }
        ]
      }
    }

Custom highlighting: inside highlight, pre_tags supplies the opening half of a custom tag (attributes and styles can be added here), and post_tags supplies the closing half, so that together they form a complete tag. Which fields are highlighted is still decided by fields. (The tag values in the request below, and the highlighted characters in its result, were stripped in the source.)

    GET test3/doc/_search
    {
      "query": { "match": { "desc": "性格直" } },
      "highlight": {
        "pre_tags": "",
        "post_tags": "",
        "fields": { "desc": {} }
      }
    }

>>> Query result

    {
      "took": 6, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 1, "max_score": 0.8630463,
        "hits": [
          { "_index": "test3", "_type": "doc", "_id": "1", "_score": 0.8630463,
            "_source": { "name": "顾老二", "age": 30, "from": "gu",
                         "desc": "皮肤黑、武器长、性格直", "tags": [ "黑", "长", "直" ] },
            "highlight": { "desc": [ "皮肤黑、武器长、" ] } }
        ]
      }
    }

X. Exact and fuzzy queries

A term query looks up the exact term in the inverted index, that is, it performs an exact match. The difference between term and match is that match goes through an analyzer: the query text is analyzed first (different analyzers give slightly different results) and the resulting terms are then matched. term skips analysis and looks up the exact value in the inverted index directly.

#### Prepare the data

    PUT w1
    {
      "mappings": {
        "doc": {
          "properties": {
            "t1": { "type": "text" },
            "t2": { "type": "keyword" }
          }
        }
      }
    }

    PUT w1/doc/1
    { "t1": "hi single dog", "t2": "hi single dog" }

Comparing the two (results are described in the comments rather than shown):

    # t1 is text, so it is analyzed; match also analyzes the query text,
    # so both of the following queries find the document.
    GET w1/doc/_search
    { "query": { "match": { "t1": "hi single dog" } } }

    GET w1/doc/_search
    { "query": { "match": { "t1": "hi" } } }

    # t2 is keyword, so it is stored as one unanalyzed term; match uses the
    # field's own analysis, which does not split the query text either, so only
    # the full value "hi single dog" finds the document.
    GET w1/doc/_search
    { "query": { "match": { "t2": "hi" } } }

    GET w1/doc/_search
    { "query": { "match": { "t2": "hi single dog" } } }

    # t1 is text and analyzed; term does not analyze the query text,
    # so only a single indexed term such as "hi" finds the document.
    GET w1/doc/_search
    { "query": { "term": { "t1": "hi single dog" } } }

    GET w1/doc/_search
    { "query": { "term": { "t1": "hi" } } }

    # t2 is keyword and not analyzed; term does not analyze the query text,
    # so only the full value "hi single dog" finds the document.
    GET w1/doc/_search
    { "query": { "term": { "t2": "hi single dog" } } }

    GET w1/doc/_search
    { "query": { "term": { "t2": "hi" } } }

Looking up several exact values (terms)

#### First way

    GET test/doc/_search
    {
      "query": {
        "bool": {
          "should": [
            { "term": { "age": 27 } },
            { "term": { "age": 28 } }
          ]
        }
      }
    }

#### Second way

    GET test/doc/_search
    {
      "query": {
        "terms": { "age": [ "27", "28" ] }
      }
    }

>>> Both ways return the following result

    {
      "took": 10, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 2, "max_score": 1.0,
        "hits": [
          { "_index": "test", "_type": "doc", "_id": "2", "_score": 1.0,
            "_source": { "name": "wangfei", "age": 27, "desc": "热天还不让后人不认同" } },
          { "_index": "test", "_type": "doc", "_id": "1", "_score": 1.0,
            "_source": { "name": "wangjifei", "age": 27, "desc": "生活就像 茫茫海上" } }
        ]
      }
    }

XI. Aggregations in ES: avg, max, min, sum

#### Prepare the data

    PUT zhifou/doc/1
    { "name": "顾老二", "age": 30, "from": "gu", "desc": "皮肤黑、武器长、性格直", "tags": ["黑", "长", "直"] }

    PUT zhifou/doc/2
    { "name": "大娘子", "age": 18, "from": "sheng", "desc": "肤白貌美,娇憨可爱", "tags": ["白", "富", "美"] }

    PUT zhifou/doc/3
    { "name": "龙套偏房", "age": 22, "from": "gu", "desc": "mmp,没怎么看,不知道怎么形容", "tags": ["造数据", "真", "难"] }

    PUT zhifou/doc/4
    { "name": "石头", "age": 29, "from": "gu", "desc": "粗中有细,狐假虎威", "tags": ["粗", "大", "猛"] }

    PUT zhifou/doc/5
    { "name": "魏行首", "age": 25, "from": "广云台", "desc": "仿佛兮若轻云之蔽月,飘飘兮若流风之回雪,mmp,最后竟然没有嫁给顾老二!", "tags": ["闭月", "羞花"] }

    GET zhifou/doc/_search
    { "query": { "match_all": {} } }

Requirement 1: the average age of the people whose from is gu.

    GET zhifou/doc/_search
    {
      "query": { "match": { "from": "gu" } },
      "aggs": {
        "my_avg": { "avg": { "field": "age" } }
      },
      "_source": ["name", "age"]
    }

>>> Query result

    {
      "took": 83, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 3, "max_score": 0.6931472,
        "hits": [
          { "_index": "zhifou", "_type": "doc", "_id": "4", "_score": 0.6931472,
            "_source": { "name": "石头", "age": 29 } },
          { "_index": "zhifou", "_type": "doc", "_id": "1", "_score": 0.2876821,
            "_source": { "name": "顾老二", "age": 30 } },
          { "_index": "zhifou", "_type": "doc", "_id": "3", "_score": 0.2876821,
            "_source": { "name": "龙套偏房", "age": 22 } }
        ]
      },
      "aggregations": { "my_avg": { "value": 27.0 } }
    }

In this example the query first matches the documents whose from is gu, and the average is computed over those hits. The aggregation is declared inside aggs; my_avg is a name we choose for the result, and it wraps the computed average. The field to aggregate on is age, and the operation is avg, giving the average age.

If you only care about the aggregated value and not the matching documents, set size to 0:

    GET zhifou/doc/_search
    {
      "query": { "match": { "from": "gu" } },
      "aggs": {
        "my_avg": { "avg": { "field": "age" } }
      },
      "size": 0,
      "_source": ["name", "age"]
    }

>>> Query result

    {
      "took": 35, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": { "total": 3, "max_score": 0.0, "hits": [ ] },
      "aggregations": { "my_avg": { "value": 27.0 } }
    }

Requirement 2: the maximum age

    GET zhifou/doc/_search
    {
      "query": { "match_all": {} },
      "aggs": {
        "my_max": { "max": { "field": "age" } }
      },
      "size": 0,
      "_source": ["name", "age", "from"]
    }

>>> Query result

    {
      "took": 10, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": { "total": 5, "max_score": 0.0, "hits": [ ] },
      "aggregations": { "my_max": { "value": 30.0 } }
    }

Requirement 3: the minimum age

    GET zhifou/doc/_search
    {
      "query": { "match_all": {} },
      "aggs": {
        "my_min": { "min": { "field": "age" } }
      },
      "size": 0,
      "_source": ["name", "age", "from"]
    }

>>> Query result

    {
      "took": 2, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": { "total": 5, "max_score": 0.0, "hits": [ ] },
      "aggregations": { "my_min": { "value": 18.0 } }
    }

Requirement 4: the sum of the ages of the matching documents

    GET zhifou/doc/_search
    {
      "query": { "match": { "from": "gu" } },
      "aggs": {
        "my_sum": { "sum": { "field": "age" } }
      },
      "size": 0,
      "_source": ["name", "age", "from"]
    }

>>> Query result

    {
      "took": 4, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": { "total": 3, "max_score": 0.0, "hits": [ ] },
      "aggregations": { "my_sum": { "value": 81.0 } }
    }

XII. Grouping in ES

Requirement: group everyone into the age ranges 15~20, 20~25 and 25~30, and compute the average age of each group.

    GET zhifou/doc/_search
    {
      "size": 0,
      "query": { "match_all": {} },
      "aggs": {
        "age_group": {
          "range": {
            "field": "age",
            "ranges": [
              { "from": 15, "to": 20 },
              { "from": 20, "to": 25 },
              { "from": 25, "to": 30 }
            ]
          }
        }
      }
    }

>>> Query result

    {
      "took": 9, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": { "total": 5, "max_score": 0.0, "hits": [ ] },
      "aggregations": {
        "age_group": {
          "buckets": [
            { "key": "15.0-20.0", "from": 15.0, "to": 20.0, "doc_count": 1 },
            { "key": "20.0-25.0", "from": 20.0, "to": 25.0, "doc_count": 1 },
            { "key": "25.0-30.0", "from": 25.0, "to": 30.0, "doc_count": 2 }
          ]
        }
      }
    }

In this example the aggregation named age_group uses range to do the grouping: field says the groups are based on age, ranges lists the groups, and from and to are the bounds of each one. Next we compute the average age inside each group.

    GET zhifou/doc/_search
    {
      "size": 0,
      "query": { "match_all": {} },
      "aggs": {
        "age_group": {
          "range": {
            "field": "age",
            "ranges": [
              { "from": 15, "to": 20 },
              { "from": 20, "to": 25 },
              { "from": 25, "to": 30 }
            ]
          },
          "aggs": {
            "my_avg": { "avg": { "field": "age" } }
          }
        }
      }
    }

>>> Query result

    {
      "took": 1, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": { "total": 5, "max_score": 0.0, "hits": [ ] },
      "aggregations": {
        "age_group": {
          "buckets": [
            { "key": "15.0-20.0", "from": 15.0, "to": 20.0, "doc_count": 1,
              "my_avg": { "value": 18.0 } },
            { "key": "20.0-25.0", "from": 20.0, "to": 25.0, "doc_count": 1,
              "my_avg": { "value": 22.0 } },
            { "key": "25.0-30.0", "from": 25.0, "to": 30.0, "doc_count": 2,
              "my_avg": { "value": 27.0 } }
          ]
        }
      }
    }
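The bucket counts above follow from the bounds of a range aggregation: `from` is inclusive and `to` is exclusive. A short Python sketch, using only the five ages from the zhifou data (the function name is made up for illustration), reproduces the same buckets and per-bucket averages:

```python
AGES = [30, 18, 22, 29, 25]  # ages of the five zhifou documents


def range_buckets(values, ranges):
    """Group values into [low, high) buckets and average each bucket."""
    buckets = []
    for low, high in ranges:
        members = [v for v in values if low <= v < high]  # from inclusive, to exclusive
        buckets.append({
            "key": "{}-{}".format(float(low), float(high)),
            "doc_count": len(members),
            "my_avg": sum(members) / len(members) if members else None,
        })
    return buckets


if __name__ == "__main__":
    for bucket in range_buckets(AGES, [(15, 20), (20, 25), (25, 30)]):
        print(bucket)
```

Age 30 lands in no bucket because the last range stops just short of 30, which is why the doc_count values add up to 4 rather than 5.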

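Summary of aggregations in ES: an aggregation always runs over the documents matched by the query, so the hits are found first and the aggregation function is then applied to them.

avg: average

max: maximum

min: minimum

sum: sum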
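The "query first, then aggregate" order can be mirrored in a few lines of Python over the same zhifou data. This is only a sketch: `aggregate` is a hypothetical helper, and the query is reduced to exact field equality.

```python
PEOPLE = [
    {"name": "顾老二", "age": 30, "from": "gu"},
    {"name": "大娘子", "age": 18, "from": "sheng"},
    {"name": "龙套偏房", "age": 22, "from": "gu"},
    {"name": "石头", "age": 29, "from": "gu"},
    {"name": "魏行首", "age": 25, "from": "广云台"},
]


def aggregate(docs, field, query=None):
    """Step 1: keep the docs matching `query`; step 2: compute metrics on `field`."""
    matched = [d for d in docs
               if query is None or all(d.get(k) == v for k, v in query.items())]
    values = [d[field] for d in matched]
    if not values:
        return {"count": 0}
    return {
        "count": len(values),
        "avg": sum(values) / len(values),
        "max": float(max(values)),
        "min": float(min(values)),
        "sum": float(sum(values)),
    }


if __name__ == "__main__":
    print(aggregate(PEOPLE, "age", {"from": "gu"}))  # avg 27.0, sum 81.0
    print(aggregate(PEOPLE, "age"))                  # max 30.0, min 18.0
```

The values agree with the my_avg, my_sum, my_max and my_min results shown in section XI.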

XIII. Mappings in ES

    GET test

>>> Query result

    {
      "test": {
        "aliases": { },
        "mappings": {
          "doc": {
            "properties": {
              "age": { "type": "long" },
              "desc": { "type": "text",
                        "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } },
              "name": { "type": "text",
                        "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } }
            }
          }
        },
        "settings": {
          "index": {
            "creation_date": "1569133097594",
            "number_of_shards": "5",
            "number_of_replicas": "1",
            "uuid": "AztO9waYQiyHvzP6dlk4tA",
            "version": { "created": "6080299" },
            "provided_name": "test"
          }
        }
      }
    }

The response has two main parts. The first part describes the test index itself: whether it has aliases, followed by the mappings, which contain the type doc and the detailed mapping of every field under properties. The second part is the settings of the test index, including its creation time, the number of primary and replica shards, the UUID and so on.

1. What are mappings?

A mapping lets you customize an index when it is created so that it fits the business scenario more closely. It defines how a document and the fields it contains are stored and indexed.

2. Field data types

A field can be a simple type such as text, keyword, date, long, double, boolean or ip. It can be a type that supports the hierarchical nature of JSON, such as object or nested. Or it can be a specialized type such as geo_point, geo_shape or completion.

It is often useful to index the same field in different ways for different purposes. For example, a string field can be indexed as a text field for full-text search and as a keyword field for sorting or aggregations. Alternatively, a string field can be indexed with the standard analyzer, the English analyzer and the French analyzer. This is the purpose of multi-fields, and most data types support them through the fields parameter.

A simple mapping example:

    PUT mapping_test
    {
      "mappings": {
        "test1": {
          "properties": {
            "name": { "type": "text" },
            "age": { "type": "long" }
          }
        }
      }
    }

While creating the index with PUT mapping_test, we customize its type (much like designing a table schema) by adding a mapping type named test1; the fields are all declared inside properties.

    GET mapping_test

>>> Query result

    {
      "mapping_test": {
        "aliases": { },
        "mappings": {
          "test1": {
            "properties": {
              "age": { "type": "long" },
              "name": { "type": "text" }
            }
          }
        },
        "settings": {
          "index": {
            "creation_date": "1570794586526",
            "number_of_shards": "5",
            "number_of_replicas": "1",
            "uuid": "P4-trriPTxq-nJj89iYXZA",
            "version": { "created": "6080299" },
            "provided_name": "mapping_test"
          }
        }
      }
    }

The result should look familiar: the mapping type is test1 and the individual fields sit inside properties.

3. The three states of dynamic

A mapping can be dynamic (dynamic mapping), static or explicit (explicit mapping), or strict (strict mapping). Which one applies is controlled by the dynamic property, and dynamic mapping is the default.

##### Dynamic mapping (the default)

    PUT test4
    {
      "mappings": {
        "doc": {
          "properties": {
            "name": { "type": "text" },
            "age": { "type": "long" }
          }
        }
      }
    }

    GET test4/_mapping

>>> Query result (the sex field appears here, so this output was apparently captured after the document below had been indexed)

    {
      "test4": {
        "mappings": {
          "doc": {
            "properties": {
              "age": { "type": "long" },
              "name": { "type": "text" },
              "sex": { "type": "text",
                       "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } } }
            }
          }
        }
      }
    }

##### Add data

    PUT test4/doc/1
    { "name": "wangjifei", "age": "18", "sex": "不详" }

##### View the data

    GET test4/doc/_search
    { "query": { "match_all": {} } }

>>> Query result

    {
      "took": 8, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 1, "max_score": 1.0,
        "hits": [
          { "_index": "test4", "_type": "doc", "_id": "1", "_score": 1.0,
            "_source": { "name": "wangjifei", "age": "18", "sex": "不详" } }
        ]
      }
    }

Testing a static mapping: when ES sees a new field and dynamic is false, it does not add the field to the mapping or index it, so the field cannot be searched, but its value is still kept in the stored document.

##### Create a static mapping

    PUT test5
    {
      "mappings": {
        "doc": {
          "dynamic": false,
          "properties": {
            "name": { "type": "text" },
            "age": { "type": "long" }
          }
        }
      }
    }

##### Insert data

    PUT test5/doc/1
    { "name": "wangjifei", "age": "18", "sex": "不详" }

##### Query on the new field

    GET test5/doc/_search
    { "query": { "match": { "sex": "不详" } } }

>>> Query result

    {
      "took": 9, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": { "total": 0, "max_score": null, "hits": [ ] }
    }

##### View all data

    GET /test5/doc/_search
    { "query": { "match_all": {} } }

>>> Query result

    {
      "took": 1, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 1, "max_score": 1.0,
        "hits": [
          { "_index": "test5", "_type": "doc", "_id": "1", "_score": 1.0,
            "_source": { "name": "wangjifei", "age": "18", "sex": "不详" } }
        ]
      }
    }

Testing a strict mapping: when ES sees a new field and dynamic is strict, it raises an error and the document is not inserted.

##### Create a strict mapping

    PUT test6
    {
      "mappings": {
        "doc": {
          "dynamic": "strict",
          "properties": {
            "name": { "type": "text" },
            "age": { "type": "long" }
          }
        }
      }
    }

##### Insert data

    PUT test6/doc/1
    { "name": "wangjifei", "age": "18", "sex": "不详" }

>>> Insert result

    {
      "error": {
        "root_cause": [
          {
            "type": "strict_dynamic_mapping_exception",
            "reason": "mapping set to strict, dynamic introduction of [sex] within [doc] is not allowed"
          }
        ],
        "type": "strict_dynamic_mapping_exception",
        "reason": "mapping set to strict, dynamic introduction of [sex] within [doc] is not allowed"
      },
      "status": 400
    }

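Summary:

Dynamic mapping (dynamic: true, also the default when the property is omitted): new fields are added to the mapping automatically.

Static mapping (dynamic: false): new fields are ignored. When a document brings a new field, no mapping is added for it; the field only shows up in the returned document source and cannot be searched.

Strict mode (dynamic: strict): a new field causes an exception.

Static mapping is the most commonly used of the three. It is like the HTML img tag: src is the built-in attribute, and you can still add id or class when you need them. If you know your data very well and it will not change for a long time, strict is also a good choice.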
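The three dynamic settings can be compared side by side with a small Python model. It is a sketch under stated assumptions: the mapping is reduced to a flat field-to-type dictionary, `index_document` and `guess_type` are hypothetical helpers, and type guessing is far cruder than what ES really does (for example, ES also adds a keyword sub-field to new strings).

```python
class StrictMappingError(Exception):
    """Stands in for strict_dynamic_mapping_exception."""


def guess_type(value):
    """Very rough type guess for a new field (assumption, not ES's real rules)."""
    if isinstance(value, bool):  # checked before int because bool is an int subclass
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    return "text"


def index_document(properties, doc, dynamic="true"):
    """Return (updated mapping, stored source) for one indexing request."""
    updated = dict(properties)
    for field, value in doc.items():
        if field in updated:
            continue
        if dynamic == "strict":
            raise StrictMappingError(
                "dynamic introduction of [{}] is not allowed".format(field))
        if dynamic == "true":
            updated[field] = guess_type(value)  # the new field becomes searchable
        # dynamic == "false": the field is left out of the mapping
    return updated, dict(doc)  # the source always keeps every submitted field


if __name__ == "__main__":
    mapping = {"name": "text", "age": "long"}
    doc = {"name": "wangjifei", "age": "18", "sex": "不详"}
    print(index_document(mapping, doc, "true"))
    print(index_document(mapping, doc, "false"))
```

With "false" the returned mapping is unchanged while the source still contains sex, which matches the test5 behaviour above: the field is visible in match_all results but a match on sex finds nothing.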

4. The index property of mappings

The index property defaults to true. If it is set to false, ES does not build an index for that field, which means the field cannot be used as a search condition.

    PUT test7
    {
      "mappings": {
        "doc": {
          "properties": {
            "name": { "type": "text", "index": true },
            "age": { "type": "long", "index": false }
          }
        }
      }
    }

##### Insert data

    PUT test7/doc/1
    { "name": "wangjifei", "age": 18 }

##### Query on the indexed field

    GET test7/doc/_search
    { "query": { "match": { "name": "wangjifei" } } }

>>> Query result

    {
      "took": 18, "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 1, "max_score": 0.2876821,
        "hits": [
          { "_index": "test7", "_type": "doc", "_id": "1", "_score": 0.2876821,
            "_source": { "name": "wangjifei", "age": 18 } }
        ]
      }
    }

##### Query on the non-indexed field

    GET test7/doc/_search
    { "query": { "match": { "age": 18 } } }

>>> Query result (truncated in the source)

    {
      "error": {
        "root_cause": [
          {
            "type": "query_shard_exception",
            "reason": "failed to create query: {\n \"match\" : {\n \"age\" : {\n \"query\" : 18,\n \"operator\" : \"OR\",\n \"prefix_length\" : 0,\n \"max_expansions\" : 50,\n \"fuzzy_transpositions\" : true,\n \"lenient\" : false,\n \"zero_terms_query\" : \"NONE\",\n \"auto_generate_synonyms_phrase_query\" : true,\n \"boost\" : 1.0\n }\n }\n}",
            "index_uuid": "fzN9frSZRy2OzinRjeMKGA",
            "index": "test7"
          }
        ],
        "type": "search_phase_execution_exception",
        "reason": "all shards failed",
        "phase": "query",
        "grouped": true,
        "failed_shards": [
          {
            "shard": 0,
            "index": "test7",
            "node": "INueKtviRpO1dbNWngcjJA",
            "reason": {
              "type": "query_shard_exception",
              "reason": "failed to create query: {\n \"match\" : {\n \"age\" : {\n \"query\" : 18,\n \"operator\" : \"OR\",\n \"prefix_length\" : 0,\n \"max_expansions\" : 50,\n \"fuzzy_transpositions\" : true,\n \"lenient\" : false,\n \"zero_terms_query\" : \"NONE\",\n \"auto_generate_synonyms_phrase_query\" : true,\n \"boost\" : 1.0\n }\n }\n}",
              "index_uuid": "fzN9frSZRy2OzinRjeMKGA",
              "index": "test7",
              "caused_by": {

