Elasticsearch: Using field collapsing to reduce search results based on a single field
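Field collapsing allows search results to be collapsed based on the values of a field. Collapsing is done by selecting only the top sorted document per collapse key. This is not hard to understand. Take a Baidu Music page as an example:
In the page above, we can see many favorite songs on display. Each of these songs may be the most prominent one from an album. When we build such a page, there is no need to put every song of an album in the cover position. We may only want to show the most-clicked or most popular song as the representative of the album. When we click on the album, we can then see the other songs in it:
Field collapsing is made for exactly this. The same applies to news headlines shown in a title bar: after clicking through, we can see more news from the related category.
Let's walk through an example to show how to use it.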
Preparing the data
The data we use today is a dataset of the best games. We can download it from my GitHub project:
git clone https://github.com/liu-xiao-guo/best_games_json_data
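Then we import the downloaded JSON data into Elasticsearch as follows:
We name the index best_games:
Now our data is ready. The index contains 500 documents in total. Each document in the index looks like this: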
{"id":"madden-nfl-2002-ps2-2001","name":"Madden NFL 2002","year":2001,"platform":"PS2","genre":"Sports","publisher":"Electronic Arts","global_sales":3.08,"critic_score":94,"user_score":7,"developer":"EA Sports","image_url":"http://www.mobygames.com/images/covers/l/202684-madden-nfl-2002-playstation-2-back-cover.png"}
The mapping of the best_games index is:
{ "best_games" : { "mappings" : { "_meta" : { "created_by" : "ml-file-data-visualizer" }, "properties" : { "critic_score" : { "type" : "long" }, "developer" : { "type" : "text" }, "genre" : { "type" : "keyword" }, "global_sales" : { "type" : "double" }, "id" : { "type" : "keyword" }, "image_url" : { "type" : "keyword" }, "name" : { "type" : "text" }, "platform" : { "type" : "keyword" }, "publisher" : { "type" : "keyword" }, "user_score" : { "type" : "long" }, "year" : { "type" : "long" } } } }}
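The `_meta.created_by` value in the mapping suggests the data was imported through Kibana's file Data Visualizer. As an alternative, the import could be scripted against the `_bulk` API. The following is a minimal sketch that only builds the NDJSON request body; the helper name `to_bulk_ndjson` and the shortened sample document are assumptions for illustration, and sending the payload (for example with an HTTP POST to `_bulk` using the `Content-Type: application/x-ndjson` header) is left out.

```python
import json


def to_bulk_ndjson(docs, index_name):
    """Build an Elasticsearch _bulk request body (NDJSON) for the given documents.

    Hypothetical helper: each document becomes two lines, an action line
    followed by the document source.
    """
    lines = []
    for doc in docs:
        # Action line: index the next line into index_name with an auto-generated _id.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The _bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Shortened version of the sample document shown above.
sample = [
    {
        "id": "madden-nfl-2002-ps2-2001",
        "name": "Madden NFL 2002",
        "year": 2001,
        "platform": "PS2",
        "publisher": "Electronic Arts",
        "critic_score": 94,
    }
]

payload = to_bulk_ndjson(sample, "best_games")
print(payload)
```

Building the payload separately from sending it keeps the sketch independent of any particular HTTP client or Elasticsearch client library.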
Field collapsing
Now let's search our data using collapsing:
GET best_games/_search
{
  "query": {
    "match": { "name": "Final Fantasy" }
  },
  "collapse": {
    "field": "publisher"
  },
  "sort": [
    { "critic_score": { "order": "desc" } }
  ]
}
The search result is:
{ "took" : 1, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 11, "relation" : "eq" }, "max_score" : null, "hits" : [ { "_index" : "best_games", "_type" : "_doc", "_id" : "E3JzF28BjrINWI3xtt80", "_score" : null, "_source" : { "id" : "final-fantasy-ix-ps-2000", "name" : "Final Fantasy IX", "year" : 2000, "platform" : "PS", "genre" : "Role-Playing", "publisher" : "SquareSoft", "global_sales" : 5.3, "critic_score" : 94, "user_score" : 8, "developer" : "SquareSoft", "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg" }, "fields" : { "publisher" : [ "SquareSoft" ] }, "sort" : [ 94 ] }, { "_index" : "best_games", "_type" : "_doc", "_id" : "wnJzF28BjrINWI3xtt40", "_score" : null, "_source" : { "id" : "final-fantasy-vii-ps-1997", "name" : "Final Fantasy VII", "year" : 1997, "platform" : "PS", "genre" : "Role-Playing", "publisher" : "Sony Computer Entertainment", "global_sales" : 9.72, "critic_score" : 92, "user_score" : 9, "developer" : "SquareSoft", "image_url" : "https://r.hswstatic.com/w_907/gif/finalfantasyvii-MAIN.jpg" }, "fields" : { "publisher" : [ "Sony Computer Entertainment" ] }, "sort" : [ 92 ] }, { "_index" : "best_games", "_type" : "_doc", "_id" : "_nJzF28BjrINWI3xtt40", "_score" : null, "_source" : { "id" : "final-fantasy-xii-ps2-2006", "name" : "Final Fantasy XII", "year" : 2006, "platform" : "PS2", "genre" : "Role-Playing", "publisher" : "Square Enix", "global_sales" : 5.95, "critic_score" : 92, "user_score" : 7, "developer" : "Square Enix", "image_url" : "https://m.media-amazon.com/images/M/MV5BM2I4MDMyMDQtNjM2OC00ZWNkLTg0ODQtNzYxZjY0M2QxODQyXkEyXkFqcGdeQXVyNjY5NTM5MjA@._V1_.jpg" }, "fields" : { "publisher" : [ "Square Enix" ] }, "sort" : [ 92 ] }, { "_index" : "best_games", "_type" : "_doc", "_id" : "FXJzF28BjrINWI3xtt80", "_score" : null, "_source" : { "id" : "final-fantasy-x-2-ps2-2003", 
"name" : "Final Fantasy X-2", "year" : 2003, "platform" : "PS2", "genre" : "Role-Playing", "publisher" : "Electronic Arts", "global_sales" : 5.29, "critic_score" : 85, "user_score" : 6, "developer" : "SquareSoft", "image_url" : "https://upload.wikimedia.org/wikipedia/en/thumb/6/6c/FFX-2_box.jpg/220px-FFX-2_box.jpg" }, "fields" : { "publisher" : [ "Electronic Arts" ] }, "sort" : [ 85 ] } ] }}
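The result above shows:
- We search for all games whose name matches Final Fantasy and sort them by critic_score in descending order.
- Because we use collapse and group by publisher, each publisher can contribute only one search result, even though a publisher may have many matching games.
For example, we can look up the games whose publisher is SquareSoft and whose name contains Final Fantasy. There are three of them: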
GET best_games/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "Final Fantasy" } },
        { "match": { "publisher": "SquareSoft" } }
      ]
    }
  },
  "sort": [
    { "critic_score": { "order": "desc" } }
  ]
}
The result of the query above:
"hits" : [ { "_index" : "best_games", "_type" : "_doc", "_id" : "E3JzF28BjrINWI3xtt80", "_score" : null, "_source" : { "id" : "final-fantasy-ix-ps-2000", "name" : "Final Fantasy IX", "year" : 2000, "platform" : "PS", "genre" : "Role-Playing", "publisher" : "SquareSoft", "global_sales" : 5.3, "critic_score" : 94, "user_score" : 8, "developer" : "SquareSoft", "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg" }, "sort" : [ 94 ] }, { "_index" : "best_games", "_type" : "_doc", "_id" : "0nJzF28BjrINWI3xtt40", "_score" : null, "_source" : { "id" : "final-fantasy-viii-ps-1999", "name" : "Final Fantasy VIII", "year" : 1999, "platform" : "PS", "genre" : "Role-Playing", "publisher" : "SquareSoft", "global_sales" : 7.86, "critic_score" : 90, "user_score" : 8, "developer" : "SquareSoft", "image_url" : "https://gamingheartscollection.files.wordpress.com/2018/02/final-fantasy-8.png?w=585" }, "sort" : [ 90 ] }, { "_index" : "best_games", "_type" : "_doc", "_id" : "SHJzF28BjrINWI3xtuA1", "_score" : null, "_source" : { "id" : "final-fantasy-tactics-ps-1997", "name" : "Final Fantasy Tactics", "year" : 1997, "platform" : "PS", "genre" : "Role-Playing", "publisher" : "SquareSoft", "global_sales" : 2.45, "critic_score" : 83, "user_score" : 8, "developer" : "SquareSoft", "image_url" : "https://www.thefinalfantasy.com/gallery/screenshots/ff-tactics/dynamic_previews/ff-tactics-screenshot-1_scale_800_700.jpg" }, "sort" : [ 83 ] } ] }
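But because we used collapse, only one game is returned for this publisher, and it is the one with the highest critic_score.
Note: the field used for collapsing must be a single-valued numeric or keyword field with doc_values enabled.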
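To make the semantics concrete, here is a small in-memory sketch of what collapsing does: sort the hits, then keep only the first hit for each collapse key. This illustrates the behavior only and is not how Elasticsearch implements it internally. The function name `collapse` is hypothetical, and the list contains just the six Final Fantasy games that appear in the outputs above, reduced to three fields.

```python
def collapse(hits, field, sort_key):
    """Keep only the top-sorted hit for each distinct value of `field`."""
    # Sort descending; Python's sort is stable, so ties keep their input order.
    ranked = sorted(hits, key=lambda h: h[sort_key], reverse=True)
    seen = set()
    collapsed = []
    for hit in ranked:
        key = hit[field]
        if key not in seen:
            # First (best-ranked) hit for this collapse key wins.
            seen.add(key)
            collapsed.append(hit)
    return collapsed


# Subset of the matching games, taken from the responses shown above.
games = [
    {"name": "Final Fantasy IX", "publisher": "SquareSoft", "critic_score": 94},
    {"name": "Final Fantasy VII", "publisher": "Sony Computer Entertainment", "critic_score": 92},
    {"name": "Final Fantasy XII", "publisher": "Square Enix", "critic_score": 92},
    {"name": "Final Fantasy VIII", "publisher": "SquareSoft", "critic_score": 90},
    {"name": "Final Fantasy X-2", "publisher": "Electronic Arts", "critic_score": 85},
    {"name": "Final Fantasy Tactics", "publisher": "SquareSoft", "critic_score": 83},
]

top_per_publisher = collapse(games, "publisher", "critic_score")
for game in top_per_publisher:
    print(game["publisher"], "->", game["name"], game["critic_score"])
```

The three SquareSoft games shrink to Final Fantasy IX alone, which matches the collapsed response above.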
Expanding collapse results
We can also expand the top hit of each collapsed group by using the inner_hits option:
GET best_games/_search
{
  "query": {
    "match": { "name": "Final Fantasy" }
  },
  "collapse": {
    "field": "publisher",
    "inner_hits": {
      "name": "top 3 games",
      "size": 3,
      "sort": [ { "user_score": "desc" } ]
    }
  },
  "sort": [
    { "critic_score": { "order": "desc" } }
  ]
}
Running it gives the following result:
"hits" : [ { "_index" : "best_games", "_type" : "_doc", "_id" : "E3JzF28BjrINWI3xtt80", "_score" : null, "_source" : { "id" : "final-fantasy-ix-ps-2000", "name" : "Final Fantasy IX", "year" : 2000, "platform" : "PS", "genre" : "Role-Playing", "publisher" : "SquareSoft", "global_sales" : 5.3, "critic_score" : 94, "user_score" : 8, "developer" : "SquareSoft", "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg" }, "fields" : { "publisher" : [ "SquareSoft" ] }, "sort" : [ 94 ], "inner_hits" : { "top 3 games" : { "hits" : { "total" : { "value" : 3, "relation" : "eq" }, "max_score" : null, "hits" : [ { "_index" : "best_games", "_type" : "_doc", "_id" : "0nJzF28BjrINWI3xtt40", "_score" : null, "_source" : { "id" : "final-fantasy-viii-ps-1999", "name" : "Final Fantasy VIII", "year" : 1999, "platform" : "PS", "genre" : "Role-Playing", "publisher" : "SquareSoft", "global_sales" : 7.86, "critic_score" : 90, "user_score" : 8, "developer" : "SquareSoft", "image_url" : "https://gamingheartscollection.files.wordpress.com/2018/02/final-fantasy-8.png?w=585" }, "sort" : [ 8 ] }, { "_index" : "best_games", "_type" : "_doc", "_id" : "E3JzF28BjrINWI3xtt80", "_score" : null, "_source" : { "id" : "final-fantasy-ix-ps-2000", "name" : "Final Fantasy IX", "year" : 2000, "platform" : "PS", "genre" : "Role-Playing", "publisher" : "SquareSoft", "global_sales" : 5.3, "critic_score" : 94, "user_score" : 8, "developer" : "SquareSoft", "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg" }, "sort" : [ 8 ] }, { "_index" : "best_games", "_type" : "_doc", "_id" : "SHJzF28BjrINWI3xtuA1", "_score" : null, "_source" : { "id" : "final-fantasy-tactics-ps-1997", "name" : "Final Fantasy Tactics", "year" : 1997, "platform" : "PS", "genre" : "Role-Playing", "publisher" : "SquareSoft", "global_sales" : 2.45, "critic_score" : 83, "user_score" : 8, 
"developer" : "SquareSoft", "image_url" : "https://www.thefinalfantasy.com/gallery/screenshots/ff-tactics/dynamic_previews/ff-tactics-screenshot-1_scale_800_700.jpg" }, "sort" : [ 8 ] } ] } } } },
We can see that, for each publisher, inner_hits contains up to three games under the name top 3 games, and that they are sorted by user_score.
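On the client side, the nested inner_hits structure has to be unpacked for each collapsed hit. The sketch below assumes the search response has already been parsed into a Python dict; the function name `inner_hit_names` is hypothetical, and the `response` literal is a heavily trimmed copy of the output above that keeps only the keys the function reads.

```python
def inner_hit_names(response, field, inner_name):
    """Map each collapse key to the names of the documents in one inner_hits group."""
    result = {}
    for hit in response["hits"]["hits"]:
        # The collapse key is returned under "fields" as a one-element list.
        key = hit["fields"][field][0]
        # Each named inner_hits group has its own hits.hits array.
        inner = hit["inner_hits"][inner_name]["hits"]["hits"]
        result[key] = [doc["_source"]["name"] for doc in inner]
    return result


# Trimmed response: only the parts needed for this illustration.
response = {
    "hits": {
        "hits": [
            {
                "_source": {"name": "Final Fantasy IX"},
                "fields": {"publisher": ["SquareSoft"]},
                "inner_hits": {
                    "top 3 games": {
                        "hits": {
                            "hits": [
                                {"_source": {"name": "Final Fantasy VIII", "user_score": 8}},
                                {"_source": {"name": "Final Fantasy IX", "user_score": 8}},
                                {"_source": {"name": "Final Fantasy Tactics", "user_score": 8}},
                            ]
                        }
                    }
                },
            }
        ]
    }
}

names_by_publisher = inner_hit_names(response, "publisher", "top 3 games")
print(names_by_publisher)
```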
It is also possible to request multiple inner_hits for each collapsed hit. This is useful when you want several representations of the collapsed hits.
GET best_games/_search
{
  "query": {
    "match": { "name": "Final Fantasy" }
  },
  "collapse": {
    "field": "publisher",
    "inner_hits": [
      {
        "name": "top user liked",
        "size": 3,
        "sort": [ { "user_score": "desc" } ]
      },
      {
        "name": "top most recent games",
        "size": 3,
        "sort": [ { "year": "desc" } ]
      }
    ]
  },
  "sort": [
    { "critic_score": { "order": "desc" } }
  ]
}
The result is:
"hits" : [ { "_index" : "best_games", "_type" : "_doc", "_id" : "E3JzF28BjrINWI3xtt80", "_score" : null, "_source" : { "id" : "final-fantasy-ix-ps-2000", "name" : "Final Fantasy IX", "year" : 2000, "platform" : "PS", "genre" : "Role-Playing", "publisher" : "SquareSoft", "global_sales" : 5.3, "critic_score" : 94, "user_score" : 8, "developer" : "SquareSoft", "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg" }, "fields" : { "publisher" : [ "SquareSoft" ] }, "sort" : [ 94 ], "inner_hits" : { "top user liked" : { "hits" : { "total" : { "value" : 3, "relation" : "eq" }, "max_score" : null, "hits" : [ { "_index" : "best_games", "_type" : "_doc", "_id" : "0nJzF28BjrINWI3xtt40", "_score" : null, "_source" : { "id" : "final-fantasy-viii-ps-1999", "name" : "Final Fantasy VIII", "year" : 1999, "platform" : "PS", "genre" : "Role-Playing", "publisher" : "SquareSoft", "global_sales" : 7.86, "critic_score" : 90, "user_score" : 8, "developer" : "SquareSoft", "image_url" : "https://gamingheartscollection.files.wordpress.com/2018/02/final-fantasy-8.png?w=585" }, "sort" : [ 8 ] }, { "_index" : "best_games", "_type" : "_doc", "_id" : "E3JzF28BjrINWI3xtt80", "_score" : null, "_source" : { "id" : "final-fantasy-ix-ps-2000", "name" : "Final Fantasy IX", "year" : 2000, "platform" : "PS", "genre" : "Role-Playing", "publisher" : "SquareSoft", "global_sales" : 5.3, "critic_score" : 94, "user_score" : 8, "developer" : "SquareSoft", "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg" }, "sort" : [ 8 ] }, { "_index" : "best_games", "_type" : "_doc", "_id" : "SHJzF28BjrINWI3xtuA1", "_score" : null, "_source" : { "id" : "final-fantasy-tactics-ps-1997", "name" : "Final Fantasy Tactics", "year" : 1997, "platform" : "PS", "genre" : "Role-Playing", "publisher" : "SquareSoft", "global_sales" : 2.45, "critic_score" : 83, "user_score" : 8, "developer" : "SquareSoft", "image_url" : "https://www.thefinalfantasy.com/gallery/screenshots/ff-tactics/dynamic_previews/ff-tactics-screenshot-1_scale_800_700.jpg" }, "sort" : [ 8 ] } ] } }, "top most recent games" : { "hits" : { "total" : { "value" : 3, "relation" : "eq" }, "max_score" : null, "hits" : [ { "_index" : "best_games", "_type" : "_doc", "_id" : "E3JzF28BjrINWI3xtt80", "_score" : null, "_source" : { "id" : "final-fantasy-ix-ps-2000", "name" : "Final Fantasy IX", "year" : 2000, "platform" : "PS", "genre" : "Role-Playing", "publisher" : "SquareSoft", "global_sales" : 5.3, "critic_score" : 94, "user_score" : 8, "developer" : "SquareSoft", "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg" }, "sort" : [ 2000 ] }, { "_index" : "best_games", "_type" : "_doc", "_id" : "0nJzF28BjrINWI3xtt40", "_score" : null, "_source" : { "id" : "final-fantasy-viii-ps-1999", "name" : "Final Fantasy VIII", "year" : 1999, "platform" : "PS", "genre" : "Role-Playing", "publisher" : "SquareSoft", "global_sales" : 7.86, "critic_score" : 90, "user_score" : 8, "developer" : "SquareSoft", "image_url" : "https://gamingheartscollection.files.wordpress.com/2018/02/final-fantasy-8.png?w=585" }, "sort" : [ 1999 ] }, { "_index" : "best_games", "_type" : "_doc", "_id" : "SHJzF28BjrINWI3xtuA1", "_score" : null, "_source" : { "id" : "final-fantasy-tactics-ps-1997", "name" : "Final Fantasy Tactics", "year" : 1997, "platform" : "PS", "genre" : "Role-Playing", "publisher" : "SquareSoft", "global_sales" : 2.45, "critic_score" : 83, "user_score" : 8, "developer" : "SquareSoft", "image_url" : "https://www.thefinalfantasy.com/gallery/screenshots/ff-tactics/dynamic_previews/ff-tactics-screenshot-1_scale_800_700.jpg" }, "sort" : [ 1997 ] } ] } } } },
This way, for each publisher we get its three games most liked by users, together with its three most recent games.
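When several inner_hits groups are needed, it can be convenient to assemble the request body in code rather than by hand. The following sketch builds the same body as the request above as a plain Python dict. The helper names `inner_hits_group` and `collapse_request` are hypothetical, and actually sending the body to `best_games/_search` with an HTTP or Elasticsearch client is left out.

```python
import json


def inner_hits_group(name, sort_field, size=3):
    """Describe one named inner_hits group, sorted descending by sort_field."""
    return {"name": name, "size": size, "sort": [{sort_field: "desc"}]}


def collapse_request(query, field, inner_hits, sort):
    """Assemble a search body that collapses on `field` with the given inner_hits."""
    return {
        "query": query,
        "collapse": {"field": field, "inner_hits": inner_hits},
        "sort": sort,
    }


body = collapse_request(
    query={"match": {"name": "Final Fantasy"}},
    field="publisher",
    inner_hits=[
        inner_hits_group("top user liked", "user_score"),
        inner_hits_group("top most recent games", "year"),
    ],
    sort=[{"critic_score": {"order": "desc"}}],
)

# Print the body as JSON, ready to paste into the Kibana console.
print(json.dumps(body, indent=2))
```

Each group name must be unique within the request, because the response keys the inner_hits results by that name.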