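How can I get all the terms and docId lists from ElasticSearch?

How can I get all the terms and their document lists from ES? For example, the inverted index data looks like the following list of terms and documents: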

word1: doc1,doc5,doc6... 
word2: doc3,doc9,doc12... 
word3: doc5,doc100...

I just want to fetch all the terms and their corresponding document lists. Is there any API that lets me do this? Thanks!


2016-04-26 Jack

I would try using this tool: https://simpsora.wordpress.com/2014/05/06/using-luke-with-elasticsearch/

To retrieve this, you should understand a little about how Lucene works. In Lucene, an index is structured as Fields -> Terms -> PostingLists (represented as PostingsEnums), as you seem to know already.
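To make that nesting concrete, here is a toy model in plain Java. It is only a sketch: the class `InvertedIndexSketch` and its methods are made up for illustration and are not Lucene classes. It just shows the field -> term -> doc ID list shape and how walking it yields the "term: docs" lines from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: nested maps standing in for Lucene's Fields -> Terms -> postings.
// None of these names are Lucene classes.
class InvertedIndexSketch {
    // field name -> (term -> list of doc IDs in insertion order)
    private final Map<String, Map<String, List<Integer>>> fields = new TreeMap<>();

    // Record that `term` occurs in `field` of document `docId`.
    void add(String field, String term, int docId) {
        List<Integer> postings = fields
                .computeIfAbsent(field, f -> new TreeMap<>())
                .computeIfAbsent(term, t -> new ArrayList<>());
        // Skip consecutive duplicates; assumes docs are added in increasing ID order.
        if (postings.isEmpty() || postings.get(postings.size() - 1) != docId) {
            postings.add(docId);
        }
    }

    // Walk field -> term -> postings and render one "term: doc,doc,..." line per term.
    List<String> dump() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Map<String, List<Integer>>> field : fields.entrySet()) {
            for (Map.Entry<String, List<Integer>> term : field.getValue().entrySet()) {
                StringBuilder line = new StringBuilder(term.getKey()).append(": ");
                List<Integer> docs = term.getValue();
                for (int i = 0; i < docs.size(); i++) {
                    if (i > 0) line.append(",");
                    line.append("doc").append(docs.get(i));
                }
                lines.add(line.toString());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add("body", "word1", 1);
        index.add("body", "word1", 5);
        index.add("body", "word2", 3);
        index.dump().forEach(System.out::println);
    }
}
```

Lucene stores all of this in a far more compact on-disk form, but the traversal order (fields, then terms, then documents) is the same idea.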

To retrieve these values, you can use the following as a template for a Lucene tool (assuming you have access to the underlying reader, an AtomicReader):

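import org.apache.lucene.index.Fields;
import org.apache.lucene.index.MultiFields;
import org.apache.lucene.index.PostingsEnum;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.util.BytesRef;

// NOTE: iterator(...) and postings(...) take different arguments in
// different Lucene versions; check the Javadoc for the version you use.

// get every one of the fields in the reader
Fields fields = MultiFields.getFields(reader);
for (String field : fields) {
    // get the Terms for the field (null if the field has no terms)
    Terms fieldTerms = fields.terms(field);
    if (fieldTerms == null) {
        continue;
    }
    TermsEnum terms = fieldTerms.iterator(null);

    // a term is represented by a BytesRef in Lucene
    // and we will iterate across all of them using
    // the TermsEnum syntax (read the Lucene docs for this)
    BytesRef t;
    while ((t = terms.next()) != null) {
        // get the PostingsEnum (note that this is called
        // DocsEnum in Lucene 4.X) which represents an enumeration
        // of all of the documents for the Term t
        PostingsEnum docs = terms.postings(null, null);
        // utf8ToString() decodes text terms; plain toString() prints raw bytes
        StringBuilder line = new StringBuilder(t.utf8ToString()).append(": ");
        while (docs.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
            line.append(docs.docID()).append(", ");
        }
        System.out.println(line);
    }
}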

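I haven't actually had a chance to run this exact code (I have a similar tool that I wrote for comparing my own particular Lucene indexes), but hopefully it gives you a general understanding of Lucene's structure so that you can write your own tool.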

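The tricky part will be getting an explicit AtomicReader from your index, but I'm sure there are other StackOverflow answers to help you with that! (As a small hint, you might want to look at opening your index with DirectoryReader#open(File f)#leaves().)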
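One thing to keep in mind if you follow the leaves() hint: each leaf is a reader over a single segment and numbers its documents starting from 0, and Lucene exposes the segment's offset as docBase on the leaf's reader context. The plain-Java sketch below only illustrates that arithmetic; `Leaf` and `globalDocIds` are hypothetical names, not Lucene classes.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of segment-local vs. index-wide doc IDs.
// `Leaf` is a hypothetical stand-in for one per-segment reader, not a Lucene class.
class DocBaseSketch {
    static final class Leaf {
        final int docBase;      // offset of this segment's first doc in the whole index
        final int[] localDocs;  // doc IDs as a per-segment postings enum would report them

        Leaf(int docBase, int... localDocs) {
            this.docBase = docBase;
            this.localDocs = localDocs;
        }
    }

    // Convert each segment-local doc ID to an index-wide one by adding the leaf's docBase.
    static List<Integer> globalDocIds(List<Leaf> leaves) {
        List<Integer> global = new ArrayList<>();
        for (Leaf leaf : leaves) {
            for (int local : leaf.localDocs) {
                global.add(leaf.docBase + local);
            }
        }
        return global;
    }

    public static void main(String[] args) {
        List<Leaf> leaves = new ArrayList<>();
        leaves.add(new Leaf(0, 0, 4));   // first segment: local docs 0 and 4
        leaves.add(new Leaf(10, 2));     // second segment starts at index-wide doc 10
        System.out.println(globalDocIds(leaves));
    }
}
```

If you read postings through the top-level reader with MultiFields, as in the template above, the doc IDs you get back are already index-wide, so this adjustment only matters when you iterate leaf by leaf.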


2016-04-27 00:45:25 Almog
