2017-05-23

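Elasticsearch phrase suggester and phonetic differences

I would like to know whether there is any way to make the phrase suggester correct misspellings at the start of a word that come from phonetic differences.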

Elasticsearch 5.1.2

Tested in Kibana 5.1.2

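For example:

Instead of "circus" someone writes "sircus", or instead of "coding" someone writes "koding". Interestingly, instead of "phrase" you can write "frase" and still get a suggestion.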

Here is my setup.

Settings:

PUT test_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "suggests_analyzer": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "asciifolding",
            "shingle_filter"
          ],
          "type": "custom"
        },
        "reverse": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["standard", "reverse"]
        }
      },
      "filter": {
        "shingle_filter": {
          "min_shingle_size": 2,
          "max_shingle_size": 5,
          "type": "shingle"
        }
      }
    }
  },
  "mappings": {
    "test_type": {
      "properties": {
        "suggest_field": {
          "type": "text",
          "analyzer": "suggests_analyzer",
          "fields": {
            "reverse": {
              "type": "text",
              "analyzer": "reverse"
            }
          }
        }
      }
    }
  }
}

Some documents:

POST test_index/test_type/_bulk 
{"index":{}} 
{ "suggest_field": "phrase"} 
{"index":{}} 
{ "suggest_field": "Circus"} 
{"index":{}} 
{ "suggest_field": "Coding"} 

Query:

POST /test_index/_search
{
  "suggest": {
    "text": "sircus",
    "simple_phrase": {
      "phrase": {
        "field": "suggest_field",
        "max_errors": 0.9,
        "highlight": {
          "pre_tag": "<em>",
          "post_tag": "</em>"
        },
        "direct_generator": [
          {
            "field": "suggest_field",
            "suggest_mode": "always"
          },
          {
            "field": "suggest_field.reverse",
            "suggest_mode": "always",
            "pre_filter": "reverse",
            "post_filter": "reverse"
          }
        ]
      }
    }
  }
}

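In addition, I repeated the following steps several times (between 5 and 10), without changing anything:

  • Delete the index
  • PUT the index with its settings and mappings
  • Add the documents
  • Run the query (with "codign")

Sometimes I get a suggestion and sometimes I don't. Is there an explanation for that?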
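One thing that may be worth ruling out for the intermittent results: Elasticsearch is near real-time, so documents sent with `_bulk` only become visible to search (and to suggesters) after a refresh, which by default happens about once per second. If the query runs right after indexing, it might hit an index that does not yet expose the new terms. The sketch below only builds the request sequence for one run of the steps above, with `refresh=wait_for` added to the bulk call. The index name `test_index`, the type name `test_type`, and the helper names are assumptions made for illustration, and this is not confirmed to be the cause of the behaviour described.

```python
import json

INDEX = "test_index"    # assumed index name
DOC_TYPE = "test_type"  # assumed mapping type (Elasticsearch 5.x still uses types)


def bulk_body(values):
    """Build the newline-delimited _bulk payload for the given field values."""
    lines = []
    for value in values:
        lines.append(json.dumps({"index": {}}))
        lines.append(json.dumps({"suggest_field": value}))
    # The _bulk API expects the payload to end with a newline.
    return "\n".join(lines) + "\n"


def repro_steps(settings_and_mappings, values, suggest_query):
    """Return the (method, path, body) sequence for one run of the experiment."""
    return [
        ("DELETE", f"/{INDEX}", None),
        ("PUT", f"/{INDEX}", json.dumps(settings_and_mappings)),
        # refresh=wait_for: respond only once the documents are searchable.
        ("POST", f"/{INDEX}/{DOC_TYPE}/_bulk?refresh=wait_for", bulk_body(values)),
        ("POST", f"/{INDEX}/_search", json.dumps(suggest_query)),
    ]


if __name__ == "__main__":
    steps = repro_steps({}, ["phrase", "Circus", "Coding"], {"suggest": {"text": "codign"}})
    for method, path, _body in steps:
        print(method, path)
```

If the results become stable once the bulk request waits for a refresh, timing was the likely culprit. If they stay inconsistent, the cause is probably elsewhere, for example in how the few documents are spread across shards.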

This can be corrected using the term suggester: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-term.html

Answer

Try setting "prefix_length": 0 in the direct_generator.
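To see why this setting matters: `prefix_length` is the number of leading characters a candidate must share with the typed term, and it defaults to 1, so "sircus" can never reach "circus" through a generator that keeps the default. The toy model below illustrates that rule with a plain Levenshtein distance. It is a simplified sketch, not Lucene's actual candidate generation (which by default also counts a transposition as a single edit), and the function names are made up for illustration.

```python
import json


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def is_candidate(typed: str, indexed: str, prefix_length: int = 1, max_edits: int = 2) -> bool:
    """Toy filter: the first `prefix_length` characters must match exactly,
    and the whole term must be within `max_edits` edits."""
    if typed[:prefix_length] != indexed[:prefix_length]:
        return False
    return edit_distance(typed, indexed) <= max_edits


# A direct generator entry with the prefix requirement switched off.
generator = {"field": "suggest_field", "suggest_mode": "always", "prefix_length": 0}

if __name__ == "__main__":
    for typed, indexed in [("sircus", "circus"), ("koding", "coding"), ("codign", "coding")]:
        print(typed, indexed,
              is_candidate(typed, indexed, prefix_length=1),
              is_candidate(typed, indexed, prefix_length=0))
    print(json.dumps(generator))
```

In this model "sircus" and "koding" are rejected with `prefix_length=1` and accepted with `prefix_length=0`, while "codign" passes either way because its first letter is correct. Dropping the prefix requirement makes the generator consider many more terms, so it can be slower on a large index.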