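Find and rank multiple phrase matches in Lucene-indexed documents
Given a set of documents containing text, I want to search for a phrase, return every match, and rank the matches. I know how to get Lucene/Solr to tell me which documents match and how to highlight the hits within a document, but how do I get a ranking that includes multiple matches from the same document?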
First document. It has a single line of text.
Second document. This text line is quite short.
This is another line containing more text and is a bit longer.
If I search for "text line", I would like it to find three matches, ranked as follows:
2nd document -> ...This "text line" is quite short.
1st document -> ...It has a single "line of text".
2nd document -> ...another "line containing more text" and is...
Is this possible? If so, how?
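I originally had a more complex question that includes this one, here: http://stackoverflow.com/questions/8883390/obtain-metadata-associated-with-matched-content-in-solr-lucene – 2012-01-17 13:40:02
Why do you want document 2 twice in the results? Maybe you should index each line as its own document... – naresh 2012-01-18 09:44:02
That's what I'm saying: if you want to match on lines, index each line as a document. – milan 2012-01-18 10:24:19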
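To make the "one line per document" suggestion concrete, here is a minimal sketch in plain Python. It is not Lucene or Solr code and does not reproduce their scoring. It only assumes that each sentence is treated as its own searchable unit that remembers its parent document, and that a hit ranks higher when the query terms sit closer together. The helper names (`split_sentences`, `min_span`, `rank_matches`) and the naive sentence splitter are hypothetical, chosen for illustration.

```python
import re
from itertools import product


def split_sentences(text):
    # Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def min_span(tokens, terms):
    # Length in tokens of the shortest window that contains every query term,
    # in any order. Returns None when some term does not occur at all.
    positions = [[i for i, tok in enumerate(tokens) if tok == term] for term in terms]
    if any(not p for p in positions):
        return None
    return min(max(combo) - min(combo) + 1 for combo in product(*positions))


def rank_matches(docs, query):
    # Treat each sentence as its own unit, keeping the parent document id.
    terms = query.lower().split()
    hits = []
    for doc_id, text in docs.items():
        for sentence in split_sentences(text):
            tokens = re.findall(r"\w+", sentence.lower())
            span = min_span(tokens, terms)
            if span is not None:
                hits.append((span, doc_id, sentence))
    # Smaller span means the terms are closer together, so it ranks first.
    hits.sort(key=lambda hit: hit[0])
    return hits


docs = {
    1: "First document. It has a single line of text.",
    2: "Second document. This text line is quite short. "
       "This is another line containing more text and is a bit longer.",
}

ranked = rank_matches(docs, "text line")
for span, doc_id, sentence in ranked:
    print(f"document {doc_id} (span {span}): {sentence}")
```

With the sample documents from the question, this lists the document 2 sentence containing "text line" first, then the document 1 sentence, then the longer document 2 sentence, which is the order asked for. In Lucene or Solr the analogous approach would presumably be to index each line or sentence as a separate document with a stored field for the parent document id, and to query with a phrase query that allows some slop so that nearby, reordered terms still match.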