2014-10-01 53 views
1

Datastax Solr nodes: nodetool repair stuck

We have a DataStax Enterprise Solr cluster (version 4.5) on CentOS spanning two data centers (DC1 in Europe, DC2 in North America):

DC1: 2 nodes with rf set to 2 
DC2: 1 node with rf set to 1

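Each node has 2 cores and 4 GB of RAM. We have created only one keyspace; the 2 nodes in DC1 hold 400 MB of data each, while the node in DC2 is empty.

If I start nodetool repair on the node in DC2, the command works for about 20-30 minutes, then stops making progress and hangs.

On the DC2 node, I can see this in the logs: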

WARN [NonPeriodicTasks:1] 2014-10-01 05:57:44,188 WorkPool.java (line 398) Timeout while waiting for workers when flushing pool {}. IndexCurrent timeout is Failure to flush may cause excessive growth of Cassandra commit log. 
millis, consider increasing it, or reducing load on the node. 
ERROR [NonPeriodicTasks:1] 2014-10-01 05:57:44,190 CassandraDaemon.java (line 199) Exception in thread Thread[NonPeriodicTasks:1,5,main] 
org.apache.solr.common.SolrException: java.lang.RuntimeException: Timeout while waiting for workers when flushing pool {}. IndexCurrent timeout is Failure to flush may cause excessive growth of Cassandra commit log. 
millis, consider increasing it, or reducing load on the node. 
    at com.datastax.bdp.search.solr.handler.update.CassandraDirectUpdateHandler.commit(CassandraDirectUpdateHandler.java:351) 
    at com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex.doCommit(AbstractSolrSecondaryIndex.java:994) 
    at com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex.forceBlockingFlush(AbstractSolrSecondaryIndex.java:139) 
    at org.apache.cassandra.db.index.SecondaryIndexManager.flushIndexesBlocking(SecondaryIndexManager.java:338) 
    at org.apache.cassandra.db.index.SecondaryIndexManager.maybeBuildSecondaryIndexes(SecondaryIndexManager.java:144) 
    at org.apache.cassandra.streaming.StreamReceiveTask$OnCompletionRunnable.run(StreamReceiveTask.java:113) 
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) 
    at java.util.concurrent.FutureTask.run(Unknown Source) 
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source) 
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) 
    at java.lang.Thread.run(Unknown Source) 
Caused by: java.lang.RuntimeException: Timeout while waiting for workers when flushing pool {}. IndexCurrent timeout is Failure to flush may cause excessive growth of Cassandra commit log. 
millis, consider increasing it, or reducing load on the node. 
    at com.datastax.bdp.concurrent.WorkPool.doFlush(WorkPool.java:399) 
    at com.datastax.bdp.concurrent.WorkPool.flush(WorkPool.java:339) 
    at com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex.flushIndexUpdates(AbstractSolrSecondaryIndex.java:484) 
    at com.datastax.bdp.search.solr.handler.update.CassandraDirectUpdateHandler.commit(CassandraDirectUpdateHandler.java:278) 
    ... 12 more 
WARN [commitScheduler-3-thread-1] 2014-10-01 05:58:47,351 WorkPool.java (line 398) Timeout while waiting for workers when flushing pool {}. IndexCurrent timeout is Failure to flush may cause excessive growth of Cassandra commit log. 
millis, consider increasing it, or reducing load on the node. 
ERROR [commitScheduler-3-thread-1] 2014-10-01 05:58:47,352 SolrException.java (line 136) auto commit error...:org.apache.solr.common.SolrException: java.lang.RuntimeException: Timeout while waiting for workers when flushing pool {}. IndexCurrent timeout is Failure to flush may cause excessive growth of Cassandra commit log. 
millis, consider increasing it, or reducing load on the node. 
    at com.datastax.bdp.search.solr.handler.update.CassandraDirectUpdateHandler.commit(CassandraDirectUpdateHandler.java:351) 
    at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216) 
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) 
    at java.util.concurrent.FutureTask.run(Unknown Source) 
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source) 
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) 
    at java.lang.Thread.run(Unknown Source) 
Caused by: java.lang.RuntimeException: Timeout while waiting for workers when flushing pool {}. IndexCurrent timeout is Failure to flush may cause excessive growth of Cassandra commit log. 
millis, consider increasing it, or reducing load on the node. 
    at com.datastax.bdp.concurrent.WorkPool.doFlush(WorkPool.java:399) 
    at com.datastax.bdp.concurrent.WorkPool.flush(WorkPool.java:339) 
    at com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex.flushIndexUpdates(AbstractSolrSecondaryIndex.java:484) 
    at com.datastax.bdp.search.solr.handler.update.CassandraDirectUpdateHandler.commit(CassandraDirectUpdateHandler.java:278) 
    ... 8 more 

I tried increasing some of the timeouts in the cassandra.yaml file, with no luck. Thanks.

+0

DataStax has a support article on troubleshooting hanging repairs: https://support.datastax.com/entries/27229736-Troubleshooting-Hanging-Repairs – Aaron 2014-10-01 14:49:52

Answers

1

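Your nodes are well below the recommended specification for a DSE Solr installation.

I usually recommend at least 8 cores and at least 64 GB of memory, with a heap allocation of 12-14 GB.

The following troubleshooting guide is quite good: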

https://support.datastax.com/entries/38367716-Solr-Configuration-Best-Practices-and-Troubleshooting-Tips

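Your current data load is small, so you may not need the full amount of memory - my guess is that the bottleneck here is CPU.

If you are not running 4.0.4 or 4.5.2, I would move to one of those versions.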

1

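Two items that may help:

  1. The RuntimeException you see in the logs is on the code path that commits index changes to disk in Lucene, so I would definitely determine whether writing to disk is your bottleneck. (Are you using separate physical disks for your data and your commit log?)

  2. In the meantime, the parameter you may need to adjust is the one in dse.yaml that controls the WorkPool flush timeout, called flush_max_time_per_core.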
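For the first item, a quick way to check whether the data and commit log directories share a device is to compare their device IDs. This is only a minimal sketch: the function name `same_device` is made up for illustration, and a differing device ID shows different filesystems, not necessarily different physical disks (two partitions on one disk also differ).

```python
import os


def same_device(path_a: str, path_b: str) -> bool:
    """Return True when both paths are on the same filesystem device.

    Compares st_dev from os.stat(). Two partitions of one physical
    disk report different st_dev values, so a False result does not
    prove that the paths sit on separate physical disks.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

On a node you would call it with the directories configured in cassandra.yaml (`data_file_directories` and `commitlog_directory`), for example `same_device("/var/lib/cassandra/data", "/var/lib/cassandra/commitlog")`. Those two paths are assumed defaults, not taken from the question.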

0

One way to reduce contention from Solr indexing is to increase the autoSoftCommit maxTime in solrconfig.xml:

<autoSoftCommit> 
    <maxTime>1000000</maxTime> 
</autoSoftCommit>
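If you would rather script the change than edit the file by hand, the sketch below rewrites the value with Python's standard XML parser. It assumes that `autoSoftCommit` sits under `updateHandler` (the usual Solr layout); `set_soft_commit_max_time` and `SAMPLE` are hypothetical names used only for illustration. Note that ElementTree drops XML comments when parsing, so for a heavily commented solrconfig.xml a manual edit may be preferable.

```python
import xml.etree.ElementTree as ET

# Hypothetical, stripped-down solrconfig.xml used only for this example.
SAMPLE = """<config>
  <updateHandler class="solr.DirectUpdateHandler2">
    <autoSoftCommit>
      <maxTime>1000</maxTime>
    </autoSoftCommit>
  </updateHandler>
</config>"""


def set_soft_commit_max_time(solrconfig_xml: str, millis: int) -> str:
    """Return the config text with autoSoftCommit/maxTime set to millis.

    Missing updateHandler, autoSoftCommit, or maxTime elements are
    created so the function also works on a config without them.
    """
    root = ET.fromstring(solrconfig_xml)
    node = root
    for tag in ("updateHandler", "autoSoftCommit", "maxTime"):
        child = node.find(tag)
        if child is None:
            child = ET.SubElement(node, tag)
        node = child
    node.text = str(millis)  # node is now the maxTime element
    return ET.tostring(root, encoding="unicode")
```

Calling `set_soft_commit_max_time(SAMPLE, 1000000)` returns the same document with the soft-commit interval raised to the value suggested above; the updated file still has to be loaded onto the node in whatever way your DSE version expects.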