2012-01-11

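Solr schema for prefix search, howto?

I have read many questions on Stack Overflow but have not found an answer on how to do prefix search in Solr. For example, given the text "solr documentation is unreadable", I need it to match queries such as "solr docu*", "documentation unread*", and "unreadable is so*", but not "un* so*". This is what I have so far: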

<fieldType name="prefix_search" class="solr.TextField">
    <analyzer>
        <tokenizer class="solr.LowerCaseTokenizerFactory"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="30" side="front"/>
    </analyzer>
</fieldType>

But sometimes it returns unexpected results, and it also matches the "un* so*" query. Maybe the problem is with the PHP SolrClient? Thanks for your replies!

Answers

ReversedWildcardFilterFactory is what you are looking for. You can then test it easily with curl, as follows:

curl 'http://example.com:8080/solr/select?q=prefix_search:un*+AND+prefix_search:so*'

<!-- Just like text_general except it reverses the characters of
     each token, to enable more efficient leading wildcard queries. -->
<fieldType name="text_general_rev" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
                maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
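
A side note on the unexpected results mentioned in the question: an `<analyzer>` element without a `type` attribute is used at both index time and query time, so with the `prefix_search` type the query terms are also split into edge n-grams. That is a plausible cause of the surprising matches, though not a confirmed one. One common approach, separate from the reversed-wildcard type above, is to build the n-grams only at index time. Below is a minimal sketch under that assumption. The type name `prefix_search_index_only` is hypothetical, and the tokenizer and gram sizes are illustrative choices, not something taken from the question.

```xml
<!-- Hypothetical field type: edge n-grams are produced at index time only. -->
<fieldType name="prefix_search_index_only" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <!-- Index every leading prefix of each word, e.g. "d", "do", "doc", ... -->
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="30"/>
    </analyzer>
    <analyzer type="query">
        <!-- No n-gram filter here: query terms are only tokenized and lowercased. -->
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
```

With a type like this, a plain query term such as `docu` (without `*`) should match documents containing "documentation", because that prefix is already in the index. It does not by itself reject a query like "un so", since both terms are still valid prefixes of indexed words; that would need additional handling on the query side.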