2011-06-03 73 views

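Solr SnowballPorterFilterFactory filter gives wrong suggestions

I use SnowballPorterFilterFactory in both the index and query analyzers. When I search for the word "apple", Solr finds the relevant articles, but it also decides the word is misspelled and suggests "appl". If I search for "蘋果", it works correctly: no suggestion is given, and the articles containing "蘋果" are found.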

In schema.xml:

<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt" enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords_en.txt"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms_en.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_en.txt" enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords_en.txt"/>
    </analyzer>
</fieldType>

Any ideas on how to get rid of the incorrect suggestions?

Answer

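You shouldn't use the same field for both searching and spellchecking. The spellcheck dictionary is built from the terms indexed in its source field, and in a stemmed field those terms are stems such as "appl", so stems are what come back as suggestions. Add a separate field without stemming and use it for spellchecking.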

Example:

<!-- Basic Text Field for use with Spell Correction -->
<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" preserveOriginal="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
</fieldType>

<!-- TextSpell --> 
<field name="textSpelling" type="textSpell" indexed="true" stored="false" multiValued="true"/> 
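
Nothing shown here fills the textSpelling field, so it presumably has to be populated from the fields you search on. A minimal sketch, assuming the searchable content lives in fields named `title` and `body` (hypothetical names, substitute your own), is to add copyField rules to schema.xml:

```xml
<!-- Assumption: "title" and "body" are placeholder names for your own searchable fields. -->
<!-- Copy their raw input text into the unstemmed spellcheck field at index time. -->
<copyField source="title" dest="textSpelling"/>
<copyField source="body" dest="textSpelling"/>
```

copyField passes the original, pre-analysis text to the destination, so textSpelling is analyzed only by the textSpell type and the dictionary holds whole words such as "apple" rather than stems such as "appl". Documents need to be reindexed after the schema change.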

Then in your solrconfig.xml:

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <lst name="spellchecker">
        <str name="name">default</str>
        <str name="field">textSpelling</str>
        <str name="termSourceField">textSpelling</str>
        <str name="accuracy">0.7</str>
        <str name="spellcheckIndexDir">./spellchecker</str>
        <str name="queryAnalyzerFieldType">text</str>
        <str name="buildOnOptimize">true</str>
    </lst>
</searchComponent>
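
The component only produces suggestions once a request handler runs it. As a rough sketch, assuming a dedicated handler named `/spell` (the handler name and the default values are illustrative, not part of the configuration above), the wiring in solrconfig.xml could look like this:

```xml
<!-- Assumption: the handler name "/spell" and the default values below are illustrative. -->
<requestHandler name="/spell" class="solr.SearchHandler">
    <lst name="defaults">
        <!-- Turn spellchecking on and use the dictionary named "default". -->
        <str name="spellcheck">true</str>
        <str name="spellcheck.dictionary">default</str>
        <str name="spellcheck.count">5</str>
    </lst>
    <!-- Run the spellcheck component after the regular query components. -->
    <arr name="last-components">
        <str>spellcheck</str>
    </arr>
</requestHandler>
```

With buildOnOptimize the dictionary is rebuilt only when the index is optimized, so after the first reindex a single request with spellcheck.build=true can be used to build it on demand. It is also worth checking that the field type named in queryAnalyzerFieldType (`text` in the component config) does not stem. If it does, the query term would be reduced to "appl" before the lookup, and pointing it at textSpell is one option.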