2014-12-04 51 views
1

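Cannot match numbers in Haystack with Elasticsearch

I have some products, such as "99% chocolate". If I search for "chocolate", it matches this item, but if I search for "99", it does not. I ran into the same problem as the question "Using django haystack autocomplete with elasticsearch to search for digits/numbers?", but nobody answered it. Can anyone help?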

Edit 2: Sorry, I left out an important detail. Searching for numbers works on its own, but autocomplete does not. Here are the relevant lines:

# The relevant line in my search index
    name_auto = indexes.EdgeNgramField(model_attr='name')

# The relevant line in my view
prodSqs = SearchQuerySet().models(Product).autocomplete(name_auto=request.GET.get('q', ''))

Edit: Here is the result of running the text through the analyzer:

curl -XGET 'localhost:9200/haystack/_analyze?analyzer=standard&pretty' -d '99% chocolate' 
{ 
    "tokens" : [ { 
    "token" : "99", 
    "start_offset" : 0, 
    "end_offset" : 2, 
    "type" : "<NUM>", 
    "position" : 1 
    }, { 
    "token" : "chocolate", 
    "start_offset" : 4, 
    "end_offset" : 13, 
    "type" : "<ALPHANUM>", 
    "position" : 2 
    } ] 
} 

Which analyzer are you using on the field? You can see how Elasticsearch tokenizes everything by running the text through the analyze API. http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-analyze.html – 2014-12-04 06:35:50

@AlainCollins Sorry, I have updated the question to reflect the fact that regular search works fine. It is autocomplete that does not match numbers. – Riz 2014-12-12 18:32:15

Answers

3

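I finally found the answer here: ElasticSearch: EdgeNgrams and Numbers

Add the following classes, then change the ENGINE under HAYSTACK_CONNECTIONS in your settings file to use CustomElasticsearchSearchEngine instead of the default Haystack engine: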

from haystack.backends.elasticsearch_backend import (
    ElasticsearchSearchBackend,
    ElasticsearchSearchEngine,
)


class CustomElasticsearchBackend(ElasticsearchSearchBackend):
    """
    The default ElasticsearchSearchBackend settings don't tokenize strings of digits the same way as words, so they
    get lost: the lowercase tokenizer is the culprit. Switching to the standard tokenizer and doing the
    case-insensitivity in the filter seems to do the job.
    """
    def __init__(self, connection_alias, **connection_options):
        # see https://stackoverflow.com/questions/13636419/elasticsearch-edgengrams-and-numbers
        analyzer = self.DEFAULT_SETTINGS['settings']['analysis']['analyzer']['edgengram_analyzer']
        analyzer['tokenizer'] = 'standard'
        # DEFAULT_SETTINGS is shared at class level, so avoid appending the filter twice.
        if 'lowercase' not in analyzer['filter']:
            analyzer['filter'].append('lowercase')
        super(CustomElasticsearchBackend, self).__init__(connection_alias, **connection_options)


class CustomElasticsearchSearchEngine(ElasticsearchSearchEngine):
    backend = CustomElasticsearchBackend
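
For reference, the settings change described above might look roughly like the sketch below. The dotted path `myapp.search_backends` is a hypothetical module name (use wherever you saved the two classes), and the `URL` value is an assumption about a local Elasticsearch instance; the index name `haystack` is taken from the curl call in the question.

```python
# settings.py (sketch, not a complete settings file)
HAYSTACK_CONNECTIONS = {
    "default": {
        # Hypothetical dotted path: point it at the module holding the classes above.
        "ENGINE": "myapp.search_backends.CustomElasticsearchSearchEngine",
        # Assumed local Elasticsearch address.
        "URL": "http://127.0.0.1:9200/",
        # Index name as used in the question's curl call.
        "INDEX_NAME": "haystack",
    },
}
```

Analyzer settings are applied when the index is created, so the existing index most likely has to be rebuilt (for example with Haystack's `rebuild_index` management command) before numeric prefixes start matching.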
0

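Running your string 99% chocolate through the standard analyzer gives the correct result (99 is a term on its own), so if you are not currently using that analyzer, you should switch to it.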

curl -XGET 'localhost:9200/myindex/_analyze?analyzer=standard&pretty' -d '99% chocolate' 
{ 
    "tokens" : [ { 
    "token" : "99", 
    "start_offset" : 0, 
    "end_offset" : 2, 
    "type" : "<NUM>", 
    "position" : 1 
    }, { 
    "token" : "chocolate", 
    "start_offset" : 4, 
    "end_offset" : 13, 
    "type" : "<ALPHANUM>", 
    "position" : 2 
    } ] 
} 
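
Note that the output above only covers the standard analyzer; an EdgeNgramField is indexed with Haystack's separate edge n-gram analyzer, which by default is built on the lowercase tokenizer. The following pure-Python sketch only approximates the two tokenizers to show where "99" disappears. The function names are made up for illustration, the regexes are ASCII-only simplifications rather than Elasticsearch's actual rules, and the `min_gram=2` / `max_gram=15` values are assumed defaults.

```python
import re


def lowercase_tokenize(text):
    """Rough stand-in for the `lowercase` tokenizer: split on non-letters, then lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z]+", text)]


def standard_tokenize_lowercased(text):
    """Rough stand-in for the `standard` tokenizer followed by a lowercase filter."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def edge_ngrams(token, min_gram=2, max_gram=15):
    """Front edge n-grams of a single token (gram sizes are assumed values)."""
    longest = min(len(token), max_gram)
    return [token[:size] for size in range(min_gram, longest + 1)]


def indexed_terms(tokens):
    """All edge n-grams that would end up in the index for the given tokens."""
    return [gram for token in tokens for gram in edge_ngrams(token)]


text = "99% Chocolate"
print(indexed_terms(lowercase_tokenize(text)))            # digits are dropped before n-gramming
print(indexed_terms(standard_tokenize_lowercased(text)))  # "99" survives as its own term
```

With the letter-only tokenizer there is no "99" token to build n-grams from, which would explain why a plain search on a standard-analyzed field finds "99" while autocomplete on the edge n-gram field does not.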

Sorry, I updated the question to reflect the fact that regular search works fine. It is autocomplete that does not match numbers. – Riz 2014-12-12 18:33:11