Advanced Usage of the Elasticsearch Search Engine

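An index alias works like a shortcut or symbolic link: it can point to one or more indices and can be used in any API that expects an index name. Aliases give applications a great deal of flexibility.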

Elasticsearch: Using Index Aliases

  • What aliases are for

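As business requirements evolve during development, older business logic has to be updated or even refactored. For Elasticsearch, adapting to the new logic may mean modifying an existing index, for example adjusting certain fields or even rebuilding the index. Doing so can disrupt the business and may even require downtime. Elasticsearch provides index aliases to address this: applications talk to the alias, so the index behind it can be replaced without changing client code.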

  • Query aliases
GET /nba/_alias
GET /_alias
  • Add an alias
POST /_aliases
{
"actions": [
{
"add": {
"index": "nba",
"alias": "nba_v1.0"
}
}
]
}
PUT /nba/_alias/nba_v1.1
  • Remove an alias
POST /_aliases
{
"actions": [
{
"remove": {
"index": "nba",
"alias": "nba_v1.0"
}
}
]
}
DELETE /nba/_alias/nba_v1.1
  • Rename an alias
POST /_aliases
{
"actions": [
{
"remove": {
"index": "nba",
"alias": "nba_v1.0"
}
},
{
"add": {
"index": "nba",
"alias": "nba_v2.0"
}
}
]
}
  • Point one alias at multiple indices
POST /_aliases
{
"actions": [
{
"add": {
"index": "nba",
"alias": "national_player"
}
},
{
"add": {
"index": "wnba",
"alias": "national_player"
}
}
]
}
  • Give one index multiple aliases
POST /_aliases
{
"actions": [
{
"add": {
"index": "nba",
"alias": "nba_v2.1"
}
},
{
"add": {
"index": "nba",
"alias": "nba_v2.2"
}
}
]
}
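
The alias requests above all share one shape: a list of add and remove actions posted to /_aliases, which Elasticsearch applies atomically within a single request. As a minimal sketch, a small Python helper can assemble that body. The name alias_actions is hypothetical and not part of any Elasticsearch client; sending the body is left to whatever HTTP client you use.

```python
import json


def alias_actions(*ops):
    """Build an _aliases request body from (action, index, alias) triples."""
    actions = []
    for action, index, alias in ops:
        if action not in ("add", "remove"):
            raise ValueError(f"unsupported action: {action}")
        actions.append({action: {"index": index, "alias": alias}})
    return {"actions": actions}


# Rename nba_v1.0 to nba_v2.0 with one request body.
rename_body = alias_actions(
    ("remove", "nba", "nba_v1.0"),
    ("add", "nba", "nba_v2.0"),
)
print(json.dumps(rename_body, indent=2))
```

Keeping the remove and the add in the same body mirrors the rename request shown earlier, so there is no moment at which neither alias exists.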
  • Read through an alias

When the alias points to a single index, the request returns that one index.

GET /nba_v2.1

When the alias points to multiple indices, the request returns all of them.

GET /national_player
  • Write through an alias

When the alias points to a single index, write operations can go through the alias.

POST /nba_v2.1/_doc/566
{
"countryEn": "Croatia",
"teamName": "快船",
"birthDay": 858661200000,
"country": "克罗地亚",
"teamCityEn": "LA",
"code": "ivica_zubac",
"displayAffiliation": "Croatia",
"displayName": "伊维察 祖巴茨哥哥",
"schoolType": "",
"teamConference": "西部",
"teamConferenceEn": "Western",
"weight": "108.9 公斤",
"teamCity": "洛杉矶",
"playYear": 3,
"jerseyNo": "40",
"teamNameEn": "Clippers",
"draft": 2016,
"displayNameEn": "Ivica Zubac",
"heightValue": 2.16,
"birthDayStr": "1997-03-18",
"position": "中锋",
"age": 22,
"playerId": "1627826"
}

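When the alias points to multiple indices, one of them must be designated as the write index with is_write_index; otherwise Elasticsearch rejects writes sent through the alias.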

POST /_aliases
{
"actions": [
{
"add": {
"index": "nba",
"alias": "national_player",
"is_write_index": true
}
},
{
"add": {
"index": "wnba",
"alias": "national_player"
}
}
]
}
POST /national_player/_doc/566
{
"countryEn": "Croatia",
"teamName": "快船",
"birthDay": 858661200000,
"country": "克罗地亚",
"teamCityEn": "LA",
"code": "ivica_zubac",
"displayAffiliation": "Croatia",
"displayName": "伊维察 祖巴茨妹妹",
"schoolType": "",
"teamConference": "西部",
"teamConferenceEn": "Western",
"weight": "108.9 公斤",
"teamCity": "洛杉矶",
"playYear": 3,
"jerseyNo": "40",
"teamNameEn": "Clippers",
"draft": 2016,
"displayNameEn": "Ivica Zubac",
"heightValue": 2.16,
"birthDayStr": "1997-03-18",
"position": "中锋",
"age": 22,
"playerId": "1627826"
}
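
To make the write rule concrete, here is a simplified Python model of how the target of a write through an alias is chosen. It is only a sketch of the rule described above, not Elasticsearch's implementation; resolve_write_index and its input format are made up for illustration.

```python
def resolve_write_index(alias_members):
    """Return the index that would receive writes sent to an alias.

    alias_members maps index name -> is_write_index flag
    (True, False, or None when the flag was not set).
    """
    explicit = [name for name, flag in alias_members.items() if flag is True]
    if len(explicit) == 1:
        return explicit[0]
    if len(explicit) > 1:
        raise ValueError("more than one index is marked as the write index")
    if len(alias_members) == 1:
        # A single index with no explicit flag acts as the write index.
        name, flag = next(iter(alias_members.items()))
        if flag is None:
            return name
    raise ValueError("alias has no write index")


print(resolve_write_index({"nba": None}))
print(resolve_write_index({"nba": True, "wnba": None}))
```

With {"nba": None, "wnba": None}, the model raises an error, which corresponds to the case where national_player points to both indices and neither is marked with is_write_index.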

Elasticsearch: How to Rebuild an Index

Background

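Elasticsearch is a near-real-time distributed search engine that provides search services to users. When we decide to store some kind of data, its structure has to be fully defined when the index is created, and many index settings and fixed configuration values can never be changed afterwards. When the data structure needs to change, the index has to be rebuilt, and the Elastic team provides a number of tools to help developers do so.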

Steps

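  • Give nba an alias, nba_latest, and use nba_latest for all external access.
  • Create a new index, nba_20220101, copying the structure of nba and changing fields as the business requires.
  • Copy the data from nba to nba_20220101.
  • Add the alias nba_latest to nba_20220101 and remove it from nba.
  • Delete the nba index.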

Externally, the nba index is always accessed through the nba_latest alias.
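
The five steps can be summarized as an ordered list of requests. The sketch below only builds that list as (method, path, body) tuples; reindex_plan is a hypothetical helper, the mapping passed in is abbreviated to a single field, and no request is actually sent.

```python
def reindex_plan(old_index, new_index, alias, new_mappings):
    """Return the alias-based reindex steps as (method, path, body) tuples."""
    return [
        # 1. Expose the old index through the alias.
        ("POST", "/_aliases",
         {"actions": [{"add": {"index": old_index, "alias": alias}}]}),
        # 2. Create the new index with the adjusted mappings.
        ("PUT", f"/{new_index}", {"mappings": new_mappings}),
        # 3. Copy the documents across.
        ("POST", "/_reindex",
         {"source": {"index": old_index}, "dest": {"index": new_index}}),
        # 4. Move the alias to the new index in one request.
        ("POST", "/_aliases",
         {"actions": [
             {"add": {"index": new_index, "alias": alias}},
             {"remove": {"index": old_index, "alias": alias}},
         ]}),
        # 5. Drop the old index.
        ("DELETE", f"/{old_index}", None),
    ]


plan = reindex_plan("nba", "nba_20220101", "nba_latest",
                    {"properties": {"jerseyNo": {"type": "keyword"}}})
for method, path, _ in plan:
    print(method, path)
```

Because clients only ever use nba_latest, step 4 is the single point at which they start seeing the new index.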

1. Create a new index (for example, to change a field type: jerseyNo becomes keyword)

PUT /nba_20220101
{
"mappings": {
"properties": {
"age": {
"type": "integer"
},
"birthDay": {
"type": "date"
},
"birthDayStr": {
"type": "keyword"
},
"code": {
"type": "text"
},
"country": {
"type": "keyword"
},
"countryEn": {
"type": "keyword"
},
"displayAffiliation": {
"type": "text"
},
"displayName": {
"type": "text"
},
"displayNameEn": {
"type": "text"
},
"draft": {
"type": "long"
},
"heightValue": {
"type": "float"
},
"jerseyNo": {
"type": "keyword"
},
"playYear": {
"type": "long"
},
"playerId": {
"type": "keyword"
},
"position": {
"type": "text"
},
"schoolType": {
"type": "text"
},
"teamCity": {
"type": "text"
},
"teamCityEn": {
"type": "text"
},
"teamConference": {
"type": "keyword"
},
"teamConferenceEn": {
"type": "keyword"
},
"teamName": {
"type": "keyword"
},
"teamNameEn": {
"type": "keyword"
},
"weight": {
"type": "text"
}
}
}
}

2. Copy the data from the old index to the new index

Synchronous: the call returns only after the reindex has finished.

POST /_reindex
{
"source": {
"index": "nba"
},
"dest": {
"index": "nba_20220101"
}
}

Asynchronous: if the reindex takes a long time, add the wait_for_completion=false parameter so that the call returns immediately with a task ID.

POST /_reindex?wait_for_completion=false
{
"source": {
"index": "nba"
},
"dest": {
"index": "nba_20220101"
}
}
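
The returned task ID can be passed to GET /_tasks/<task_id> to check on the reindex. As a rough sketch, the function below summarizes such a response, assuming it has the shape documented for the task management API (a completed flag plus a task.status object with total, created, updated and deleted counters). The sample response is made up for illustration.

```python
def reindex_progress(task_response):
    """Summarize a GET /_tasks/<task_id> response for a reindex task."""
    status = task_response["task"]["status"]
    done = status["created"] + status["updated"] + status["deleted"]
    total = status["total"]
    fraction = done / total if total else 0.0
    return task_response["completed"], fraction


# Illustrative response fragment; the numbers are invented for this example.
sample = {
    "completed": False,
    "task": {
        "status": {"total": 500, "created": 100, "updated": 25, "deleted": 0}
    },
}
completed, fraction = reindex_progress(sample)
print(completed, f"{fraction:.0%}")
```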

3. Switch the alias

POST /_aliases
{
"actions": [
{
"add": {
"index": "nba_20220101",
"alias": "nba_latest"
}
},
{
"remove": {
"index": "nba",
"alias": "nba_latest"
}
}
]
}

4. Delete the old index

DELETE /nba

5. Access the new index through the alias

POST /nba_latest/_search
{
"query": {
"match": {
"displayNameEn": "james"
}
}
}

Elasticsearch: The refresh Operation

The ideal search:

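In the ideal case, newly added data would be searchable the moment it is indexed, but that is not what happens in practice. To see this, we chain two requests: index a document, then search immediately.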

curl -X PUT 'localhost:9200/star/_doc/888' -H 'Content-Type: application/json' -d '{ "displayName": "蔡徐坤" }'
curl -X GET 'localhost:9200/star/_search?pretty'

Force a refresh

curl -X PUT 'localhost:9200/star/_doc/666?refresh' -H 'Content-Type: application/json' -d '{ "displayName": "杨超越" }'
curl -X GET 'localhost:9200/star/_search?pretty'
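
The difference between the two pairs of requests can be pictured with a toy model in which a document is buffered on write and only becomes searchable after a refresh. This is a deliberately simplified illustration, not how Elasticsearch is implemented; ToyIndex and its methods are invented for the example.

```python
class ToyIndex:
    """Toy model: documents become searchable only after a refresh."""

    def __init__(self):
        self._buffer = {}      # indexed but not yet searchable
        self._searchable = {}  # visible to search

    def index(self, doc_id, doc, refresh=False):
        self._buffer[doc_id] = doc
        if refresh:
            self.refresh()

    def refresh(self):
        # Make everything buffered so far visible to search.
        self._searchable.update(self._buffer)
        self._buffer.clear()

    def search(self):
        return sorted(self._searchable)


star = ToyIndex()
star.index("888", {"displayName": "蔡徐坤"})
print(star.search())  # document 888 is not visible yet
star.index("666", {"displayName": "杨超越"}, refresh=True)
print(star.search())  # both documents are visible after the refresh
```

In the real system the periodic refresh plays the role of calling refresh() on a timer, which is why the first search usually succeeds if you wait about a second.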

Change the refresh interval (the default is 1s)

PUT /star/_settings
{
"index": {
"refresh_interval": "5s"
}
}

Disable automatic refresh

PUT /star/_settings
{
"index": {
"refresh_interval": "-1"
}
}
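
One situation where disabling refresh is commonly used is a large import: turn refresh off, load the data, then restore the interval and refresh once. The original text does not state this motivation, so treat it as an assumption. The sketch below only lists the requests involved; bulk_load_plan is a hypothetical helper and nothing is sent.

```python
def bulk_load_plan(index, restore_interval="1s"):
    """List the settings requests around a large import as (method, path, body)."""
    settings_path = f"/{index}/_settings"
    return [
        # Stop periodic refreshes while data is being loaded.
        ("PUT", settings_path, {"index": {"refresh_interval": "-1"}}),
        # ... the documents would be indexed here ...
        # Restore periodic refreshes.
        ("PUT", settings_path, {"index": {"refresh_interval": restore_interval}}),
        # Make everything loaded so far searchable right away.
        ("POST", f"/{index}/_refresh", None),
    ]


for method, path, body in bulk_load_plan("star"):
    print(method, path, body)
```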

Elasticsearch: Highlighting

Preface

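When a result set contains many matching results, how can we spot the one we want at a glance? Many websites solve this with highlighting: if we search for 小d课堂, every occurrence of 小d课堂 in the results is highlighted.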

Highlight query

POST /nba_latest/_search
{
"query": {
"match": {
"displayNameEn": "james"
}
},
"highlight": {
"fields": {
"displayNameEn": {}
}
}
}

Custom highlight tags

POST /nba_latest/_search
{
"query": {
"match": {
"displayNameEn": "james"
}
},
"highlight": {
"fields": {
"displayNameEn": {
"pre_tags": [
"<h1>"
],
"post_tags": [
"</h1>"
]
}
}
}
}
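
In the search response, each hit carries a highlight object that maps the field name to a list of fragments wrapped in the configured tags. As a small sketch, the helper below collects those fragments; highlighted_fragments is a hypothetical name and the sample response is a hand-written fragment, not real output.

```python
def highlighted_fragments(search_response, field):
    """Collect the highlight fragments for one field from a _search response."""
    fragments = []
    for hit in search_response["hits"]["hits"]:
        # Hits without a match in this field have no entry for it.
        fragments.extend(hit.get("highlight", {}).get(field, []))
    return fragments


# Illustrative response fragment; the document content is made up.
sample = {
    "hits": {"hits": [
        {"_id": "1",
         "_source": {"displayNameEn": "James Harden"},
         "highlight": {"displayNameEn": ["<h1>James</h1> Harden"]}},
        {"_id": "2", "_source": {"displayNameEn": "Ivica Zubac"}},
    ]}
}
print(highlighted_fragments(sample, "displayNameEn"))
```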

Elasticsearch: Query Suggestions

What query suggestions are

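  • Query suggestions give users a better search experience. They cover term checking (spelling correction) and autocompletion.
  • Term checking

  • Autocompletion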

Suggester

  • Term suggester
  • Phrase suggester
  • Completion suggester

Fields

Term suggester

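The term suggester tokenizes the input text and offers term suggestions for each token. With suggest_mode set to missing, it only suggests alternatives for tokens that are not found in the index.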

POST /nba_latest/_search
{
"suggest": {
"my-suggestion": {
"text": "jamse hardne",
"term": {
"suggest_mode": "missing",
"field": "displayNameEn"
}
}
}
}
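
The response contains one entry per input token under suggest.my-suggestion, each with a list of options. As a minimal sketch, the function below rebuilds the query text from the top option of each entry; apply_term_suggestions is a hypothetical helper, and the scores and frequencies in the sample are invented.

```python
def apply_term_suggestions(suggest_entries):
    """Rebuild the query text, replacing each token with its top suggestion."""
    words = []
    for entry in suggest_entries:
        options = entry["options"]
        # Keep the original token when there is nothing to suggest.
        words.append(options[0]["text"] if options else entry["text"])
    return " ".join(words)


# Illustrative response fragment; score and freq values are made up.
sample = {"suggest": {"my-suggestion": [
    {"text": "jamse", "offset": 0, "length": 5,
     "options": [{"text": "james", "score": 0.8, "freq": 5}]},
    {"text": "hardne", "offset": 6, "length": 6,
     "options": [{"text": "harden", "score": 0.8, "freq": 1}]},
]}}
print(apply_term_suggestions(sample["suggest"]["my-suggestion"]))
```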

Phrase suggester

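The phrase suggester builds on the term suggester and also takes the relationship between terms into account, such as whether they occur together in the indexed text, how close they are to each other, and how frequent they are.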

POST /nba_latest/_search
{
"suggest": {
"my-suggestion": {
"text": "jamse harden",
"phrase": {
"field": "displayNameEn"
}
}
}
}

Completion suggester

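The completion suggester provides autocompletion as the user types. Note that it only works on a field mapped with type completion; in the nba_20220101 mapping shown earlier teamCityEn is a text field, so its mapping would have to be changed before this request can succeed.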

POST /nba_latest/_search
{
"suggest": {
"my-suggestion": {
"text": "Miam",
"completion": {
"field": "teamCityEn"
}
}
}
}
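
Conceptually, completion is a prefix lookup over the values of the suggest field. The toy function below shows that idea with a sorted list and binary search. It is not how Elasticsearch implements the completion suggester, it is case-sensitive for simplicity, and the city list is invented for the example.

```python
import bisect


def complete(prefix, sorted_values, size=5):
    """Return up to `size` values that start with `prefix` (case-sensitive)."""
    # All values sharing the prefix sit next to each other in sorted order.
    start = bisect.bisect_left(sorted_values, prefix)
    matches = []
    for value in sorted_values[start:]:
        if not value.startswith(prefix):
            break
        matches.append(value)
        if len(matches) == size:
            break
    return matches


cities = sorted(["LA", "Miami", "Milwaukee", "Minnesota"])
print(complete("Miam", cities))
print(complete("Mi", cities, size=2))
```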


Editor: 武晓燕 Source: 今日头条