2015-04-17 32 views

Answers

0

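Found a good solution: create a custom analyzer that uses a shingle filter with "" as the token separator, and use it in the query (via a bool query combined with the standard queries).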
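A minimal sketch of what that shingle setup might look like as index settings, written in the same older (pre-5.x) mapping syntax as the other answer. The names `shingle_analyzer`, `concat_shingle` and the `shingled` sub-field are hypothetical, chosen for illustration; `token_separator`, `min_shingle_size`, `max_shingle_size` and `output_unigrams` are standard options of the `shingle` token filter. With an empty separator, adjacent words such as "wave board" also produce the joined token "waveboard".

```json
{
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "shingle_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "concat_shingle"]
                    }
                },
                "filter": {
                    "concat_shingle": {
                        "type": "shingle",
                        "min_shingle_size": 2,
                        "max_shingle_size": 2,
                        "output_unigrams": true,
                        "token_separator": ""
                    }
                }
            }
        }
    },
    "mappings": {
        "doc": {
            "properties": {
                "body": {
                    "type": "string",
                    "fields": {
                        "shingled": {
                            "type": "string",
                            "analyzer": "shingle_analyzer"
                        }
                    }
                }
            }
        }
    }
}
```

A bool query could then combine the standard field with the shingled one, for example (again only a sketch under the assumptions above):

```json
{
    "query": {
        "bool": {
            "should": [
                { "match": { "body": "wave board" } },
                { "match": { "body.shingled": "wave board" } }
            ]
        }
    }
}
```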

0

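To do this at analysis time, you can also use what are known as "decompounding" token filters. Here is an example that decomposes the text "catdogmouse" into the tokens "cat", "dog", and "mouse":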

POST /decom
{
    "settings": {
        "index": {
            "analysis": {
                "analyzer": {
                    "decom_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["decom_filter"]
                    }
                },
                "filter": {
                    "decom_filter": {
                        "type": "dictionary_decompounder",
                        "word_list": ["cat", "dog", "mouse"]
                    }
                }
            }
        }
    },
    "mappings": {
        "doc": {
            "properties": {
                "body": {
                    "type": "string",
                    "analyzer": "decom_analyzer"
                }
            }
        }
    }
}

Then you can see how they are applied to certain terms:

POST /decom/_analyze?field=body&pretty
racecatthings
{
    "tokens" : [ {
        "token" : "racecatthings",
        "start_offset" : 1,
        "end_offset" : 14,
        "type" : "<ALPHANUM>",
        "position" : 1
    }, {
        "token" : "cat",
        "start_offset" : 1,
        "end_offset" : 14,
        "type" : "<ALPHANUM>",
        "position" : 1
    } ]
}

And another (you should be able to extrapolate from this to separate "waveboard" into "wave" and "board"):

POST /decom/_analyze?field=body&pretty
catdogmouse
{
    "tokens" : [ {
        "token" : "catdogmouse",
        "start_offset" : 1,
        "end_offset" : 12,
        "type" : "<ALPHANUM>",
        "position" : 1
    }, {
        "token" : "cat",
        "start_offset" : 1,
        "end_offset" : 12,
        "type" : "<ALPHANUM>",
        "position" : 1
    }, {
        "token" : "dog",
        "start_offset" : 1,
        "end_offset" : 12,
        "type" : "<ALPHANUM>",
        "position" : 1
    }, {
        "token" : "mouse",
        "start_offset" : 1,
        "end_offset" : 12,
        "type" : "<ALPHANUM>",
        "position" : 1
    } ]
}
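To adapt this to the "wave" / "board" case, the only part that should need to change is the word list of the filter. A sketch of that fragment of the `analysis` settings, assuming the same `decom_filter` name as above and a dictionary containing just those two words:

```json
{
    "filter": {
        "decom_filter": {
            "type": "dictionary_decompounder",
            "word_list": ["wave", "board"]
        }
    }
}
```

As in the outputs above, the original compound token is kept alongside any dictionary words found inside it, so running `_analyze` on the compound is a quick way to check that the word list covers the intended parts.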
+1

Nice solution, but the downside is that it requires a dictionary – longliveenduro