2010-08-20 216 views
5

我有一個問題,我的java environement。我從現在開始運行Solr 1.3(搜索引擎),現在突然間我遇到了很多麻煩。我所有的線程池(250)都被隨機阻塞,每天一次或兩次。我沒有對我的solr應用程序或我的tomcat服務器進行任何更改。Java線程阻塞

我正在運行tomcat 5.5.25和Solr 1.3。我有一個線程轉儲當系統完全超載:

IGOT像240線程像這樣的:

"http-8080-Processor1" daemon prio=10 tid=0x0000000000b2e000 nid=0x193 waiting for monitor entry [0x000000004066c000..0x000000004066cb20] 
    java.lang.Thread.State: BLOCKED (on object monitor) 
    at java.util.logging.StreamHandler.publish(StreamHandler.java:174) 
    - waiting to lock <0x00007fe37e72b340> (a java.util.logging.ConsoleHandler) 
    at java.util.logging.ConsoleHandler.publish(ConsoleHandler.java:88) 
    at java.util.logging.Logger.log(Logger.java:472) 
    at java.util.logging.Logger.doLog(Logger.java:494) 
    at java.util.logging.Logger.log(Logger.java:517) 
    at java.util.logging.Logger.info(Logger.java:1036) 
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1212) 
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303) 
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232) 
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) 
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) 
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213) 
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172) 
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) 
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117) 
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108) 
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151) 
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:874) 
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665) 
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528) 
    at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81) 
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689) 
    at java.lang.Thread.run(Thread.java:619) 

我們可以看到,這個線程被阻塞並等待:< 0x00007fe37e72b340>

究竟是誰擁有的線程< 0x00007fe37e72b340>是這樣的一個:

"http-8080-Processor156" daemon prio=10 tid=0x0000000000df2000 nid=0x1e52 runnable [0x0000000044521000..0x0000000044521c20] 
    java.lang.Thread.State: RUNNABLE 
    at java.io.FileOutputStream.writeBytes(Native Method) 
    at java.io.FileOutputStream.write(FileOutputStream.java:260) 
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105) 
    - locked <0x00007fe37e3abcd8> (a java.io.BufferedOutputStream) 
    at java.io.PrintStream.write(PrintStream.java:430) 
    - locked <0x00007fe37e3abca0> (a java.io.PrintStream) 
    at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202) 
    at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:272) 
    at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:276) 
    at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:122) 
    - locked <0x00007fe37e72cd90> (a java.io.OutputStreamWriter) 
    at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:212) 
    at java.util.logging.StreamHandler.flush(StreamHandler.java:225) 
    - locked <0x00007fe37e72b340> (a java.util.logging.ConsoleHandler) 
    at java.util.logging.ConsoleHandler.publish(ConsoleHandler.java:89) 
    at java.util.logging.Logger.log(Logger.java:472) 
    at java.util.logging.Logger.doLog(Logger.java:494) 
    at java.util.logging.Logger.log(Logger.java:517) 
    at java.util.logging.Logger.info(Logger.java:1036) 
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1212) 
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303) 
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232) 
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) 
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) 
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213) 
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172) 
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) 
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117) 
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108) 
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151) 
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:874) 
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665) 
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528) 
    at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81) 
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689) 
    at java.lang.Thread.run(Thread.java:619) 

,這是我THRE的最後一部分廣告轉儲:

"ContainerBackgroundProcessor[StandardEngine[Catalina]]" daemon prio=10 tid=0x00007f6510349800 nid=0xbff waiting on condition [0x0000000041d8d000..0x0000000041d8dd20] 
    java.lang.Thread.State: TIMED_WAITING (sleeping) 
    at java.lang.Thread.sleep(Native Method) 
    at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1548) 
    at java.lang.Thread.run(Thread.java:619) 

"pool-1-thread-1" prio=10 tid=0x0000000000c26400 nid=0xbfe waiting on condition [0x000000004200e000..0x000000004200eca0] 
    java.lang.Thread.State: WAITING (parking) 
    at sun.misc.Unsafe.park(Native Method) 
    - parking to wait for <0x00007f651b275510> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) 
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) 
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925) 
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358) 
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:946) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:906) 
    at java.lang.Thread.run(Thread.java:619) 

"Low Memory Detector" daemon prio=10 tid=0x00007f6510004400 nid=0xbfa runnable [0x0000000000000000..0x0000000000000000] 
    java.lang.Thread.State: RUNNABLE 

"CompilerThread1" daemon prio=10 tid=0x00007f6510001000 nid=0xbf9 waiting on condition [0x0000000000000000..0x0000000040d5e340] 
    java.lang.Thread.State: RUNNABLE 

"CompilerThread0" daemon prio=10 tid=0x00000000006bc400 nid=0xbf8 waiting on condition [0x0000000000000000..0x0000000040c5d2d0] 
    java.lang.Thread.State: RUNNABLE 

"Signal Dispatcher" daemon prio=10 tid=0x00000000006bb000 nid=0xbf7 runnable [0x0000000000000000..0x0000000040b5da30] 
    java.lang.Thread.State: RUNNABLE 

"Finalizer" daemon prio=10 tid=0x0000000000690c00 nid=0xbf6 in Object.wait() [0x000000004065e000..0x000000004065ed20] 
    java.lang.Thread.State: WAITING (on object monitor) 
    at java.lang.Object.wait(Native Method) 
    - waiting on <0x00007f651aa10258> (a java.lang.ref.ReferenceQueue$Lock) 
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116) 
    - locked <0x00007f651aa10258> (a java.lang.ref.ReferenceQueue$Lock) 
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132) 
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159) 

"Reference Handler" daemon prio=10 tid=0x000000000068f400 nid=0xbf5 in Object.wait() [0x000000004055d000..0x000000004055dca0] 
    java.lang.Thread.State: WAITING (on object monitor) 
    at java.lang.Object.wait(Native Method) 
    - waiting on <0x00007f651aa10338> (a java.lang.ref.Reference$Lock) 
    at java.lang.Object.wait(Object.java:485) 
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116) 
    - locked <0x00007f651aa10338> (a java.lang.ref.Reference$Lock) 

"main" prio=10 tid=0x0000000000622400 nid=0xbeb runnable [0x0000000000000000..0x00007fff69fcbba0] 
    java.lang.Thread.State: RUNNABLE 

"VM Thread" prio=10 tid=0x000000000068a000 nid=0xbf4 runnable 

"GC task thread#0 (ParallelGC)" prio=10 tid=0x000000000062cc00 nid=0xbec runnable 

"GC task thread#1 (ParallelGC)" prio=10 tid=0x000000000062e000 nid=0xbed runnable 

"GC task thread#2 (ParallelGC)" prio=10 tid=0x000000000062f400 nid=0xbee runnable 

"GC task thread#3 (ParallelGC)" prio=10 tid=0x0000000000630400 nid=0xbef runnable 

"GC task thread#4 (ParallelGC)" prio=10 tid=0x0000000000631800 nid=0xbf0 runnable 

"GC task thread#5 (ParallelGC)" prio=10 tid=0x0000000000632c00 nid=0xbf1 runnable 

"GC task thread#6 (ParallelGC)" prio=10 tid=0x0000000000634000 nid=0xbf2 runnable 

"GC task thread#7 (ParallelGC)" prio=10 tid=0x0000000000635400 nid=0xbf3 runnable 

"VM Periodic Task Thread" prio=10 tid=0x00007f6510006800 nid=0xbfb waiting on condition 

JNI global references: 1201 

我知道這是不是一個線程死鎖的概率,因爲一個線程實際上是與所有的所有其他線程想要的ressource捉迷藏。

任何人都知道什麼會導致此概率?

回答

0

我從來沒有使用過java.util.logging,所以我不知道我的建議是有用的,但netherless:
嘗試使用不同java.util.logging.Logger實例,因此並非所有的240個線程將被阻塞在同一個顯示器上
(它將有助於Logger的不同實例使用java.util.logging.ConsoleHandler的不同實例)。

0

看來,擁有「0x00007fe37e72b340」的線程在IO級別被阻塞。也許一個磁盤(RAID?)問題?

你可以做一個線程轉儲5分鐘後,看看是否同一個線程仍然被阻止?

+0

感謝您的提示,虐待嘗試在幾小時內監測下次崩潰期間的線程活動;) – 2010-08-20 15:32:47

5

你所有的線程都在記錄事物。他們都需要不時寫在磁盤上。 每當您的240個線程中有一個線程遇到日誌記錄線時,就會出現磁盤訪問問題。

它讓我感到有鎖的線程處於RUNNABLE狀態。

我想可能是在等待一些外部的ressource被釋放(例如像磁盤訪問)

你的磁盤空間不足?您最近是否在存儲系統中更改過某些內容?

+1

我認爲這是正確的軌道。我會看看外部因素。另外,如果它不是寫入傳統文件,而是寫入Unix命名管道,請確保有人正在以足夠的速度讀取管道的另一端。如果緩衝區填滿,你會阻止。 – 2010-08-20 14:34:22

+0

磁盤空間很好,我們沒有改變系統上的任何東西。我們試圖在另一臺服務器上切換系統,並得到相同的概率。 – 2010-08-20 15:30:50

+0

我不認爲這是線程安全的問題。它早就破產了。 240線程對於JVM不是問題。 240線程編輯單個文件可能是。 一個醜陋的解決方法是減少記錄的消息量。 嘗試使用getLoggerNames(),getLogger()和setLevel()從 調整事物http://download-llnw.oracle.com/javase/1.4.2/docs/api/java/util/logging/LogManager.html和 http://download-llnw.oracle.com/javase/1.4.2/docs/api/java/util/logging/Logger.html – BenoitParis 2010-08-20 16:00:58

0

如果您有非常詳細的日誌,則在每個日誌記錄之後刷新會變得非常昂貴。

質量問題的解決方法是清理日誌記錄,可能基於審計。

作爲一個快速修復,覆蓋StreamHandler.flushOutputStream.flush不立即做。每隔一段時間只衝洗一次。但請注意,如果您這樣做,您可能會在崩潰之前立即丟失日誌記錄數據。

5

如果您在Windows下運行並且java應用程序啓動控制檯,請注意不要單擊DOS框。窗口的蹩腳標記和複製「功能」塊輸出到ConsoleHandler。所以任何試圖寫入屏幕的記錄器都會被阻塞。寫入控制檯是在本地調用中完成的,因此,當實際上它被阻塞時,java線程看起來會處於RUNNING狀態,只是無法將阻止的狀態反饋迴應用程序(因爲您處於本地空間)。

如果應用程序被阻止(您已在DOS框中單擊),請按下轉義以繼續。

0

根據你的日誌,這個問題涉及java.util.logging.ConsoleHandler的使用。

首先嚐試通過從'$ {TOMCAT_HOME} /conf/logging.properties'中的'處理程序'和'.handlers'列表中刪除它來禁用控制檯處理程序。看看問題是否仍然存在。

如果有幫助,那肯定是ConsoleHandler的輸出問題。嘗試檢查是否存在有關'catalina.out'文件的問題。這是tomcat重定向其控制檯輸出的文件。