2016-06-14 169 views

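The problem is that any character sequence containing the boost operator "^" (caret) returns no search results. Elasticsearch finds nothing when the search text includes the special character "^".

However, according to the Elasticsearch documentation below: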

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_reserved_characters

    • The characters && || ! ( ) { } [ ] ^ " ~ * ? : \ can be escaped with the \ symbol

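We have a requirement to do a "contains" search in Elasticsearch using an n-gram analyzer.

Below are the settings and mapping for the sample use case: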

{
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "nGram_analyzer": {
                        "filter": [
                            "lowercase",
                            "asciifolding"
                        ],
                        "type": "custom",
                        "tokenizer": "ngram_tokenizer"
                    },
                    "whitespace_analyzer": {
                        "filter": [
                            "lowercase",
                            "asciifolding"
                        ],
                        "type": "custom",
                        "tokenizer": "whitespace"
                    }
                },
                "tokenizer": {
                    "ngram_tokenizer": {
                        "token_chars": [
                            "letter",
                            "digit",
                            "punctuation",
                            "symbol"
                        ],
                        "min_gram": "2",
                        "type": "nGram",
                        "max_gram": "20"
                    }
                }
            }
        }
    },
    "mappings": {
        "employee": {
            "properties": {
                "employeeName": {
                    "type": "string",
                    "analyzer": "nGram_analyzer",
                    "search_analyzer": "whitespace_analyzer"
                }
            }
        }
    }
}

We have an employee name that includes special characters, such as XYZ%^&*

The sample query used for the contains search is as follows:

GET 
{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "employeeName": {
                            "query": "xyz%^",
                            "type": "boolean",
                            "operator": "or"
                        }
                    }
                }
            ]
        }
    }
}

Even if we try to escape it as "query": "xyz%\^", the request errors out. So we cannot run a contains search for any text that includes "^" (caret).
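One likely cause of the error with the escaped form is the JSON syntax itself rather than the search: JSON permits only a fixed set of backslash escapes, and \^ is not one of them, so a strict parser rejects the request body before any query runs. The sketch below uses Python's standard json module as a stand-in for a strict JSON parser; it does not call Elasticsearch, and the helper name parses is made up for illustration.

```python
import json


def parses(body: str) -> bool:
    """Return True if body is valid JSON, False otherwise."""
    try:
        json.loads(body)
        return True
    except json.JSONDecodeError:
        return False


# Raw strings so the backslashes reach the JSON parser unchanged.
single_backslash = r'{"query": "xyz%\^"}'   # \^ is not a legal JSON escape
double_backslash = r'{"query": "xyz%\\^"}'  # \\ is a legal escape for one backslash

print(parses(single_backslash))   # False
print(parses(double_backslash))   # True
print(json.loads(double_backslash)["query"])  # xyz%\^
```

Doubling the backslash only makes the body parse. Whether the resulting text then matches anything depends on how the field was analyzed at index time.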

Any help is much appreciated.

Answer


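There is a bug in the ngram tokenizer that is related to this issue.

Essentially, the ngram tokenizer does not consider ^ to be a Symbol, Letter, or Punctuation character. As a result, it splits the input on ^.

Example (with xyz%^ URL-encoded):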

GET <index_name>/_analyze?tokenizer=ngram_tokenizer&text=xyz%25%5E
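The percent-encoded text in the request above can be reproduced with Python's standard library. This is only a small sketch of the encoding step: % becomes %25 and ^ becomes %5E.

```python
from urllib.parse import quote, unquote

text = "xyz%^"

# safe="" forces every reserved character to be percent-encoded.
encoded = quote(text, safe="")
print(encoded)                   # xyz%25%5E

# Decoding gives back the original text.
print(unquote(encoded) == text)  # True
```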

The response from the Analyze API contains no ^ in any token, as shown below:

{
    "tokens": [
        {
            "token": "xy",
            "start_offset": 0,
            "end_offset": 2,
            "type": "word",
            "position": 0
        },
        {
            "token": "xyz",
            "start_offset": 0,
            "end_offset": 3,
            "type": "word",
            "position": 1
        },
        {
            "token": "xyz%",
            "start_offset": 0,
            "end_offset": 4,
            "type": "word",
            "position": 2
        },
        {
            "token": "yz",
            "start_offset": 1,
            "end_offset": 3,
            "type": "word",
            "position": 3
        },
        {
            "token": "yz%",
            "start_offset": 1,
            "end_offset": 4,
            "type": "word",
            "position": 4
        },
        {
            "token": "z%",
            "start_offset": 2,
            "end_offset": 4,
            "type": "word",
            "position": 5
        }
    ]
}

Because "^" is never indexed, the query token xyz%^ has nothing to match against.
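The behavior described in this answer can be imitated outside Elasticsearch. The following is a rough Python sketch, not the actual Lucene code. It assumes that the "symbol" class covers the Unicode categories Sm, Sc and So but not Sk (the category of ^), which is one way to reproduce the observed split. The function names are invented for illustration, and the asciifolding filter is omitted.

```python
import unicodedata

MIN_GRAM, MAX_GRAM = 2, 20  # values taken from the question's settings


def is_token_char(ch: str) -> bool:
    """Approximate token_chars = letter, digit, punctuation, symbol."""
    cat = unicodedata.category(ch)
    if cat[0] in ("L", "P") or cat == "Nd":
        return True
    # Assumption: modifier symbols (Sk), such as '^', are not counted as symbols.
    return cat in ("Sm", "Sc", "So")


def split_chunks(text: str) -> list[str]:
    """Split text at every character that is not a token character."""
    chunks, current = [], ""
    for ch in text:
        if is_token_char(ch):
            current += ch
        elif current:
            chunks.append(current)
            current = ""
    if current:
        chunks.append(current)
    return chunks


def index_tokens(text: str) -> list[str]:
    """n-grams of each chunk, lowercased (index-time analysis)."""
    return [
        chunk[i:i + n].lower()
        for chunk in split_chunks(text)
        for i in range(len(chunk))
        for n in range(MIN_GRAM, MAX_GRAM + 1)
        if i + n <= len(chunk)
    ]


def search_tokens(query: str) -> list[str]:
    """Whitespace split plus lowercase (search-time analysis)."""
    return [t.lower() for t in query.split()]


print(index_tokens("xyz%^"))  # ['xy', 'xyz', 'xyz%', 'yz', 'yz%', 'z%']

indexed = set(index_tokens("XYZ%^&*"))
print(search_tokens("xyz%^")[0] in indexed)  # False: no indexed token contains '^'
print(search_tokens("xyz%")[0] in indexed)   # True
```

Under these assumptions the search-time token xyz%^ keeps its caret while every index-time token has lost it, so the two sides can never be equal.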


Thanks for the reply, @keety. – user1876040