2013-03-06 59 views
4

我們遇到Amazon-hosted Play application with RDS Mysql問題。連接偶爾掛在Amazon RDS上MySQL&Play Framework 1.2.5(c3p0默認設置)

有時應用程序最終沒有響應。今天我檢查與jstack -F線程狀態,並有卡在本地代碼爲C3P0連接多線程:

Thread 14060: (state = IN_NATIVE) 
- java.net.SocketInputStream.socketRead0(java.io.FileDescriptor, byte[], int, int, int) @bci=0 (Compiled frame; information may be imprecise) 
- java.net.SocketInputStream.read(byte[], int, int, int) @bci=79, line=150 (Compiled frame) 
- java.net.SocketInputStream.read(byte[], int, int) @bci=11, line=121 (Compiled frame) 
- com.mysql.jdbc.util.ReadAheadInputStream.fill(int) @bci=262, line=114 (Compiled frame) 
- com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(byte[], int, int) @bci=176, line=161 (Compiled frame) 
- com.mysql.jdbc.util.ReadAheadInputStream.read(byte[], int, int) @bci=48, line=189 (Compiled frame) 
- com.mysql.jdbc.MysqlIO.readFully(java.io.InputStream, byte[], int, int) @bci=34, line=3036 (Compiled frame) 
- com.mysql.jdbc.MysqlIO.reuseAndReadPacket(com.mysql.jdbc.Buffer, int) @bci=23, line=3489 (Compiled frame) 
- com.mysql.jdbc.MysqlIO.reuseAndReadPacket(com.mysql.jdbc.Buffer) @bci=3, line=3478 (Compiled frame) 
- com.mysql.jdbc.MysqlIO.checkErrorPacket(int) @bci=12, line=4019 (Compiled frame) 
- com.mysql.jdbc.MysqlIO.sendCommand(int, java.lang.String, com.mysql.jdbc.Buffer, boolean, java.lang.String, int) @bci=428, line=2490 (Compiled frame) 
- com.mysql.jdbc.MysqlIO.sqlQueryDirect(com.mysql.jdbc.StatementImpl, java.lang.String, java.lang.String, com.mysql.jdbc.Buffer, int, int, int, boolean, java.lang.String, com.mysql.jdbc.Field[]) @bci=559, line=2651 (Compiled frame) 
- com.mysql.jdbc.ConnectionImpl.execSQL(com.mysql.jdbc.StatementImpl, java.lang.String, int, com.mysql.jdbc.Buffer, int, int, boolean, java.lang.String, com.mysql.jdbc.Field[], boolean) @bci=130, line=2677 (Compiled frame) 
- com.mysql.jdbc.ConnectionImpl.execSQL(com.mysql.jdbc.StatementImpl, java.lang.String, int, com.mysql.jdbc.Buffer, int, int, boolean, java.lang.String, com.mysql.jdbc.Field[]) @bci=17, line=2627 (Compiled frame) 
- com.mysql.jdbc.StatementImpl.executeQuery(java.lang.String) @bci=534, line=1556 (Compiled frame) 
- com.mysql.jdbc.DatabaseMetaData$9.forEach(java.lang.Object) @bci=174, line=5013 (Compiled frame) 
- com.mysql.jdbc.IterateBlock.doForAll() @bci=25, line=51 (Compiled frame) 
- com.mysql.jdbc.DatabaseMetaData.getTables(java.lang.String, java.lang.String, java.lang.String, java.lang.String[]) @bci=280, line=4991 (Compiled frame) 
- com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnectionNoQuery(java.sql.Connection, java.lang.Throwable[]) @bci=21, line=185 (Compiled frame) 
- com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnection(java.sql.Connection, java.lang.String, java.lang.Throwable[]) @bci=7, line=62 (Compiled frame) 
- com.mchange.v2.c3p0.AbstractConnectionTester.activeCheckConnection(java.sql.Connection) @bci=4, line=67 (Compiled frame) 
- com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.testPooledConnection(java.lang.Object) @bci=122, line=368 (Compiled frame) 
- com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.refurbishIdleResource(java.lang.Object) @bci=26, line=310 (Compiled frame) 
- com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask.run() @bci=35, line=1999 (Interpreted frame) 
- com.mchange.v2.async.ThreadPerTaskAsynchronousRunner$TaskThread.run() @bci=41, line=255 (Interpreted frame) 

DefaultConnectionTester.activeCheckConnectionNoQuery似乎如果沒有其他連接測試設置檢查了DatabaseMetaData.getTables。

C3p0手冊中有關於configuring proper connection testing的說明和this answer說明了使用Play!的c3p0設置。但我想知道:

  • 有沒有什麼辦法來檢查這是什麼問題?
    • 標準「MySQL連接後8小時被打死」我的結果異常之前遇到
  • 似乎與C3P0 /播放一些默認的連接測試/ Hibernate的,因爲這樣做?
    • 我有一種預感,在默認情況C3P0沒有做連接測試
      • 默認值idleConnectionTestPeriod = 0,testConnectionOnCheckin =假,testConnectionOnCheckout =假
  • 是否有辦法設置連接測試查詢的超時時間?
    • 即使我配置了一些合理的preferredTestQuery,例如,該查詢的風險是否也會像當前的getTables()調用一樣掛起?
  • 有什麼我應該考慮的嗎?

更新與automaticTestTable(也不是辦法)

我們設置c3p0.automaticTestTable有更合理的連接測試,但似乎並沒有解決問題,只是改變了一下堆棧跟蹤線程。

在類似的情況下,我們現在有線程堆棧如下:

"Task-[email protected]6064cd08" daemon prio=10 tid=0x00007f53d403a000 nid=0x63dd runnable [0x00007f53dce4a000] 
    java.lang.Thread.State: RUNNABLE 
     at java.net.SocketInputStream.socketRead0(Native Method) 
     at java.net.SocketInputStream.read(SocketInputStream.java:150) 
     at java.net.SocketInputStream.read(SocketInputStream.java:121) 
     at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:114) 
     at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:161) 
     at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:189) 
     - locked <0x00000000dbecd0d8> (a com.mysql.jdbc.util.ReadAheadInputStream) 
     at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3036) 
     at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3489) 
     at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3478) 
     at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4019) 
     at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2490) 
     at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2651) 
     at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2677) 
     - locked <0x00000000dbec1a70> (a com.mysql.jdbc.JDBC4Connection) 
     at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2627) 
     at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1556) 
     - locked <0x00000000dbec1a70> (a com.mysql.jdbc.JDBC4Connection) 
     - locked <0x00000000d1330078> (a com.mysql.jdbc.StatementImpl) 
     at com.mchange.v2.c3p0.impl.DefaultConnectionTester.activeCheckConnection(DefaultConnectionTester.java:73) 
     at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.testPooledConnection(C3P0PooledConnectionPool.java:374) 
     at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.refurbishIdleResource(C3P0PooledConnectionPool.java:310) 
     at com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask.run(BasicResourcePool.java:1999) 
     at com.mchange.v2.async.ThreadPerTaskAsynchronousRunner$TaskThread.run(ThreadPerTaskAsynchronousRunner.java:255) 

播放線程在等待獲得一個數據庫連接:

"play-thread-3" prio=10 tid=0x00007f53c400b000 nid=0x5b3e in Object.wait() [0x00007f53ddd57000] 
    java.lang.Thread.State: TIMED_WAITING (on object monitor) 
     at java.lang.Object.wait(Native Method) 
     - waiting on <0x00000000db9f38f8> (a com.mchange.v2.resourcepool.BasicResourcePool) 
     at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:579) 
     - locked <0x00000000db9f38f8> (a com.mchange.v2.resourcepool.BasicResourcePool) 
     at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584) 
     ... (näitä vaihtelevat määrät) 
     at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584) 
     - locked <0x00000000db9f38f8> (a com.mchange.v2.resourcepool.BasicResourcePool) 
     at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584) 
     - locked <0x00000000db9f38f8> (a com.mchange.v2.resourcepool.BasicResourcePool) 
     at com.mchange.v2.resourcepool.BasicResourcePool.checkoutResource(BasicResourcePool.java:477) 
     at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:525) 
     at com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:128) 
     at org.hibernate.ejb.connection.InjectedDataSourceConnectionProvider.getConnection(InjectedDataSourceConnectionProvider.java:71) 
     at org.hibernate.jdbc.ConnectionManager.openConnection(ConnectionManager.java:446) 
     at org.hibernate.jdbc.ConnectionManager.getConnection(ConnectionManager.java:167) 
     at org.hibernate.jdbc.JDBCContext.connection(JDBCContext.java:160) 
     at org.hibernate.transaction.JDBCTransaction.begin(JDBCTransaction.java:81) 
     at org.hibernate.impl.SessionImpl.beginTransaction(SessionImpl.java:1473) 
     at org.hibernate.ejb.TransactionImpl.begin(TransactionImpl.java:60) 
     at play.db.jpa.JPAPlugin.startTx(JPAPlugin.java:377) 
     at play.db.jpa.JPAPlugin.beforeInvocation(JPAPlugin.java:345) 
     at play.plugins.PluginCollection.beforeInvocation(PluginCollection.java:473) 
     at play.Invoker$Invocation.before(Invoker.java:217) 
     at play.Invoker$Invocation.run(Invoker.java:277) 
     at play.server.PlayHandler$NettyInvocation.run(PlayHandler.java:229) 
     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) 
     at java.util.concurrent.FutureTask.run(FutureTask.java:166) 
     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) 
     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) 
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) 
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) 
     at java.lang.Thread.run(Thread.java:722) 

一些注意事項:

  • 它有可能(至少在理論上)許多連接恰好在堆棧轉儲的同時進行測試,儘管我發現這不太可能。
  • another question有人提到數據死鎖或硬件/網絡問題可能會導致這種行爲。
    • 至少應該不會有任何正在進行的模式修改。

BR,Touko


回答自己:

  • 沒有理由找到了連接懸掛:(
    • RDS有error.log中的MySQL ,不幸的是沒有
  • 播放有C3P0 ComboPooledDataSource一些默認設置(DBPlugin.onApplicationStart()
    • idleTimeExcessConnections:從application.conf db.pool.maxIdleTimeExcessConnections(默認爲0)
    • checkoutTimeout:從應用.conf文件db.pool.timeout(默認5000)
    • idleConnectionTestPeriod:10
    • setTestConnectionOnCheckin:真
    • 在此外,CONF/c3p0.properties可以用來設置自定義屬性
  • 沒有找到方法來設置連接測試(短暫DefaultConnectionTester源代碼查找它可能不支持)

超時我們將繼續設置c3p0.automaticTestTable並檢查稍後是否會有任何問題。

更新:不幸的是我們沒有擺脫問題尚未:(還與automaticTestTable連接測試似乎已經卡住java.net.SocketInputStream.socketRead0(本機方法)


接聽再次我自己:

我們結束了嘗試使用c3p0設置「爲連接測試查詢設置超時時間」方法:s maxAdministrativeTaskTime。這是一個小問題,所以看看它是否可以作爲最終解決方案(如果有效)。

到目前爲止還沒有問題,我們將看看它是如何工作的。日誌中沒有c3p0錯誤/堆棧跟蹤,我希望這個中斷會顯示在日誌中?或者超時還沒有發生,或者我們應該調整日誌配置以顯示c3p0日誌。

更新(2013年4月4日):在此之後,問題(可能)沒有發生如此頻繁,但已發生飄飛:(

+0

FYI大家:C3P0記錄與播放似乎是有點問題的。在登錄(具有類似socketRead0問題)最近有人問在https://groups.google.com/forum/#!topic/play-framework/o6dQymbq4hg/discussion – Touko 2013-03-15 06:58:48

回答

3

再次回答自己:

我們結束首先使用爲MySQL驅動程序指定的socketTimeout,以便連接測試不會卡住。

看來,底層TCP連接,最終使沒有從服務器(TCP ACK或任何東西)的答案。

UPDATE:最後的溶液用給tcpdump的爲AWS支持了,他們發現了AWS基礎設施中的一些情況。他們勸我們在EC2實例的安全組打開臨時端口範圍。更多的討論在AWS forum