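Elasticsearch significant terms aggregation

I have started using the significant terms aggregation to see which keywords are significant in a group of documents compared with the entire set of documents I have indexed.
It worked well until a lot of documents had been indexed. Now, for the same query that used to work, Elasticsearch just says: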
SearchPhaseExecutionException[Failed to execute phase [query],
all shards failed; shardFailures {[OIWBSjVzT1uxfxwizhS5eg][demo_paragraphs][0]:
CircuitBreakingException[Data too large, data for field [text]
would be larger than limit of [633785548/604.4mb]];
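As a quick sanity check on the numbers in that message, here is a minimal Python sketch that pulls the field name and byte limit out of the exception text and converts the limit to megabytes. The helper names (`parse_breaker_limit`, `bytes_to_mb`, `implied_heap_bytes`) are made up for illustration. The 60% breaker fraction is an assumption about the cluster configuration and is not stated anywhere in the error, so the heap estimate is only a guess under that assumption.

```python
import re

# Error text copied from the exception above.
ERROR_TEXT = (
    "CircuitBreakingException[Data too large, data for field [text] "
    "would be larger than limit of [633785548/604.4mb]]"
)


def parse_breaker_limit(message):
    """Return (field_name, limit_in_bytes) parsed from the exception text."""
    match = re.search(
        r"data for field \[(\w+)\].*?limit of \[(\d+)/", message, re.S
    )
    if match is None:
        raise ValueError("no circuit breaker limit found in message")
    return match.group(1), int(match.group(2))


def bytes_to_mb(n_bytes):
    """Convert bytes to megabytes, using 1 mb = 1024 * 1024 bytes."""
    return n_bytes / (1024 * 1024)


def implied_heap_bytes(limit_bytes, breaker_fraction=0.6):
    """Estimate the JVM heap size behind the limit.

    ASSUMPTION: the breaker limit is a fixed fraction of the heap; 0.6 is
    a guess and should be replaced with the value configured on the node.
    """
    return limit_bytes / breaker_fraction


field, limit_bytes = parse_breaker_limit(ERROR_TEXT)
limit_mb = bytes_to_mb(limit_bytes)
heap_mb = bytes_to_mb(implied_heap_bytes(limit_bytes))

print(f"field={field} limit={limit_mb:.1f}mb implied_heap={heap_mb:.0f}mb")
```

If the assumed fraction is right, the limit corresponds to a heap of roughly 1 GB, which would suggest the node is running with a small heap.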
My query looks like this:
POST /demo_paragraphs/_search
{
  "query": {
    "match": {
      "django_target_id": 1915661
    }
  },
  "aggregations": {
    "signKeywords": {
      "significant_terms": {
        "field": "text"
      }
    }
  }
}
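For reproducing the failure from a script rather than a REST console, the same request body can be built programmatically. This is only a sketch: the function name `build_significant_terms_request` is hypothetical, and it only constructs and serializes the body shown above without sending anything to a cluster.

```python
import json


def build_significant_terms_request(target_id, field="text",
                                    agg_name="signKeywords"):
    """Build the search body from the question as a plain dict."""
    return {
        "query": {"match": {"django_target_id": target_id}},
        "aggregations": {
            agg_name: {"significant_terms": {"field": field}}
        },
    }


body = build_significant_terms_request(1915661)
payload = json.dumps(body, indent=2)
print(payload)
```

The resulting `payload` string is what would be sent as the body of `POST /demo_paragraphs/_search`.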
And the document structure:
"_source": {
"django_ct": "citations.citation",
"django_target_id": 1915661,
"django_id": 3414077,
"internal_citation_id": "CR7_151",
"django_source_id": 1915654,
"text": "Mucin 1 (MUC1) is a protein heterodimer that is overexpressed in lung cancers [6]. MUC1 consists of two subunits, an N-terminal extracellular subunit (MUC1-N) and a C-terminal transmembrane subunit (MUC1-C). Overexpression of MUC1 is sufficient for the induction of anchorage independent growth and tumorigenicity [7]. Other studies have shown that the MUC1-C cytoplasmic domain is responsible for the induction of the malignant phenotype and that MUC1-N is dispensable for transformation [8]. Overexpression of",
"id": "citations.citation.3414077",
"num_distinct_citations": 0
}
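The data I index are paragraphs of scientific papers. No single document is particularly large.
Any ideas on how to analyze or solve the problem?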
So, a real slowdown... increasing the memory at first really slowed Elasticsearch down. I'll have to try a newer version. – paweloque 2014-11-25 14:10:39