Compass query containing / (slash)

I am using Compass-based indexing in my project. My annotation-based configuration for the field "name" is:

@SearchableProperty(name="name") 
@SearchableMetaData(name="ordering_name", index=Index.NOT_ANALYZED) 
private String name;

Now the following values are stored in the "name" field:

1. Temp 0 New n/a 
2. e/f search 
3. c/d search

The search results for the different scenarios are as follows:

1. 'c/d' -> +(+alias:TempClass +(c/d*)) +(alias:TempClass) -> 1 record found 
2. 'n/a' -> +(+alias:TempClass +(n/a*)) +(alias:TempClass) -> 0 record found 
3. 'search' -> +(+alias:TempClass +(search*)) +(alias:TempClass) -> 2 records found

So when I search for 'n/a', it should find the first record, whose value is 'Temp 0 New n/a'.

Any help would be highly appreciated!

Source

2012-04-26 Nirmal

I see that '(n/a*)' should be '(*n/a)'. – 2012-04-26 11:20:54

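@Joop ... please check the update in the question. – Nirmal 2012-04-26 11:35:54

Sorry, another (too) wild guess then: "n/a" might not be the value itself, but part of a toString saying "not available". Maybe try searching for "/a". – 2012-04-26 11:50:57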

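At some point, your query analysis does not match your document analysis.

Most likely you are internally using Lucene's StandardAnalyzer for query parsing but not at index time, as denoted by: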

@SearchableMetaData(name="ordering_name", index=Index.NOT_ANALYZED)

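The StandardTokenizer used internally by this analyzer treats the character / as a word boundary (as a space would be), producing the tokens n and a. Later, the token a is removed by a StopFilter.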
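To make the effect easier to see without a Lucene dependency, here is a minimal plain-Java sketch of that behavior. It is only an approximation, not Lucene's tokenizer: the class name `SlashTokenizerSketch`, the `analyze` helper, and the three-word stop list are all assumptions made for illustration (Lucene's default English stop list is longer).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Simplified stand-in for StandardAnalyzer-style analysis; this is NOT Lucene code.
final class SlashTokenizerSketch {

    // Assumption: a tiny illustrative stop list.
    static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "an", "the"));

    // Splits on anything that is not a letter or digit (so '/' acts like a space),
    // lowercases each token, then drops stop words, as a StopFilter would.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            String token = raw.toLowerCase(Locale.ROOT);
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("c/d e/f n/a")); // [c, d, e, f, n]
        System.out.println(analyze("n/a"));         // [n]
    }
}
```

In this model the query text "n/a" is reduced to the single token n, which is why nothing resembling "n/a" is left to match against.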

The following code illustrates this explanation (the input is "c/d e/f n/a"):

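import java.io.StringReader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
import org.apache.lucene.util.Version;

// Tokenize the sample input with StandardAnalyzer and print each token with its position
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);
TokenStream tokenStream = analyzer.tokenStream("CONTENT", new StringReader("c/d e/f n/a"));
CharTermAttribute term = tokenStream.getAttribute(CharTermAttribute.class);
PositionIncrementAttribute position = tokenStream.getAttribute(PositionIncrementAttribute.class);
int pos = 0;
while (tokenStream.incrementToken()) {
    String termStr = term.toString();
    int incr = position.getPositionIncrement();
    if (incr == 0) {
        // Token shares the position of the previous one
        System.out.print(" [" + termStr + "]");
    } else {
        // Token starts a new position
        pos += incr;
        System.out.println(" " + pos + ": [" + termStr + "]");
    }
}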

You will see the following extracted tokens:

1: [c] 
2: [d] 
3: [e] 
4: [f] 
5: [n]

Note that the token a, expected at position 6, is missing. As you can see, Lucene's QueryParser also performs this tokenization:

QueryParser parser = new QueryParser(Version.LUCENE_36, "content", new StandardAnalyzer(Version.LUCENE_36)); 
System.out.println(parser.parse("+n/a*"));

The output is:

+content:n

Edit: The solution is to use WhitespaceAnalyzer and to set the field to analyzed. The following code is a proof of concept in plain Lucene:

IndexWriter writer = new IndexWriter(new RAMDirectory(), new IndexWriterConfig(Version.LUCENE_36, new WhitespaceAnalyzer(Version.LUCENE_36))); 
Document doc = new Document(); 
doc.add(new Field("content","Temp 0 New n/a", Store.YES, Index.ANALYZED)); 
writer.addDocument(doc); 
writer.commit(); 
IndexReader reader = IndexReader.open(writer, true); 
IndexSearcher searcher = new IndexSearcher(reader); 
BooleanQuery query = new BooleanQuery(); 
QueryParser parser = new QueryParser(Version.LUCENE_36, "content", new WhitespaceAnalyzer(Version.LUCENE_36)); 
TopDocs docs = searcher.search(parser.parse("+n/a"), 10); 
System.out.println(docs.totalHits); 
writer.close();

The output is: 1.
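The difference between the two setups can also be sketched in plain Java, without Lucene on the classpath. This is a rough model under stated assumptions, not how Lucene matches documents internally: the class `AnalyzerMismatchSketch`, its helper methods, and the short stop list are hypothetical, and a query is treated as matching when every query term appears among the indexed terms.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Rough model of index-time vs. query-time analysis; this is NOT Lucene code.
final class AnalyzerMismatchSketch {

    // Assumption: a tiny illustrative stop list.
    static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("a", "an", "the"));

    // Models Index.NOT_ANALYZED: the whole value is stored as one term.
    static Set<String> keywordTerms(String value) {
        return Collections.singleton(value);
    }

    // Models whitespace analysis: split on whitespace only, no lowercasing, no stop words.
    static Set<String> whitespaceTerms(String text) {
        return new LinkedHashSet<String>(Arrays.asList(text.trim().split("\\s+")));
    }

    // Models StandardAnalyzer-like analysis: split on non-alphanumerics, lowercase, drop stop words.
    static Set<String> standardLikeTerms(String text) {
        Set<String> terms = new LinkedHashSet<String>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            String token = raw.toLowerCase(Locale.ROOT);
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // A query matches when it has at least one term and all of its terms are indexed.
    static boolean matches(Set<String> indexedTerms, Set<String> queryTerms) {
        return !queryTerms.isEmpty() && indexedTerms.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        String value = "Temp 0 New n/a";
        // Mismatched analysis: index holds the whole string, query is reduced to "n".
        System.out.println(matches(keywordTerms(value), standardLikeTerms("n/a")));  // false
        // Same whitespace analysis on both sides: "n/a" survives as one term.
        System.out.println(matches(whitespaceTerms(value), whitespaceTerms("n/a"))); // true
    }
}
```

The point of the sketch is only that both sides must produce the same terms. With whitespace-only splitting, "n/a" stays intact at index time and at query time, so the lookup succeeds.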

Source

2012-06-18 15:55:31 jspboix

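Now I understand what exactly is happening in my application because of StandardAnalyzer. Is there any way to override the stop words (I need to remove 'a' from the stop word list)? Any other suggestion to solve this would be appreciated.... – Nirmal 2012-06-19 11:38:01

I just edited the answer and gave a possible solution. I hope it helps! – jspboix 2012-06-19 13:49:40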
