2015-03-02 137 views

Netty server does not close/release sockets

I'm running into a resource problem in my Netty server application:

[io.netty.channel.DefaultChannelPipeline] An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.: java.io.IOException: Too many open files 
    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) [rt.jar:1.7.0_60] 
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241) [rt.jar:1.7.0_60] 
    at io.netty.channel.socket.nio.NioServerSocketChannel.doReadMessages(NioServerSocketChannel.java:135) [netty-all-4.0.25.Final.jar:4.0.25.Final] 
    at io.netty.channel.nio.AbstractNioMessageChannel$NioMessageUnsafe.read(AbstractNioMessageChannel.java:69) [netty-all-4.0.25.Final.jar:4.0.25.Final] 
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) [netty-all-4.0.25.Final.jar:4.0.25.Final] 
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) [netty-all-4.0.25.Final.jar:4.0.25.Final] 
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) [netty-all-4.0.25.Final.jar:4.0.25.Final] 
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) [netty-all-4.0.25.Final.jar:4.0.25.Final] 
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) [netty-all-4.0.25.Final.jar:4.0.25.Final] 
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) [netty-all-4.0.25.Final.jar:4.0.25.Final] 
    at java.lang.Thread.run(Thread.java:745) [rt.jar:1.7.0_60] 

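As a workaround I increased the maximum number of open files with ulimit -n, but I can still see the number of open files/sockets increasing steadily: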

lsof -p 5604 | grep socket | wc -l 
The count is now over 3000 ...

With netstat I can't see any open or pending connections ...
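For anyone who wants to watch the same number from inside the JVM rather than through lsof, here is a minimal sketch. It assumes a HotSpot/OpenJDK-style JVM on a Unix-like system, where the platform MXBean implements com.sun.management.UnixOperatingSystemMXBean. The class name FdProbe is hypothetical and is not part of the application in the question.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper (not from the original application): reports the
// JVM's own file descriptor usage, roughly what `lsof -p <pid> | wc -l` shows.
class FdProbe {

    /** Open file descriptors of this JVM, or -1 if the platform does not expose the value. */
    static long openFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1L; // e.g. on Windows
    }

    /** Descriptor limit for this JVM (the value raised via ulimit -n), or -1 if unavailable. */
    static long maxFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os).getMaxFileDescriptorCount();
        }
        return -1L;
    }

    public static void main(String[] args) {
        System.out.println("open fds: " + openFileDescriptors() + " / limit: " + maxFileDescriptors());
    }
}
```

Logging this value periodically next to the connection events makes it easier to see whether the count tracks the number of live channels or keeps growing independently of them.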

I use a ReadTimeoutHandler to close unused connections, together with the following exception-handling code:

@Override 
public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) throws Exception { 
    if (cause instanceof ReadTimeoutException) {
        logger.debug("Read timeout - close connection");
    } else {
        logger.info(cause.getMessage());
    }
    ctx.close(); 
} 

The server bootstrap looks like this:

ServerBootstrap b = new ServerBootstrap(); 
b.group(bossGroup, workerGroup)
    .channel(NioServerSocketChannel.class)
    .childHandler(new ChannelInitializer<SocketChannel>() {
        @Override
        public void initChannel(SocketChannel ch) throws Exception {
            ch.pipeline().addLast(new ReadTimeoutHandler(60));
            ch.pipeline().addLast(new LoggingHandler(mySpec.getPortLookupKey().toLowerCase()));
            ch.pipeline().addLast(new RawMessageEncoder());
            ch.pipeline().addLast(new RawMessageDecoder());
            ch.pipeline().addLast(new RequestServerHandler(ctx.getWorkManager(), factory));
        }
    })
    .option(ChannelOption.SO_BACKLOG, 128)
    .childOption(ChannelOption.SO_KEEPALIVE, true);

ChannelFuture channelFuture = b.bind(port).sync(); 

What am I missing? Shouldn't the number of open files decrease once a connection is closed (by the remote host or by the timeout handler)?

What do I need to do to conserve resources?
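As a baseline for that expectation, the sketch below uses plain JDK NIO (no Netty) to open a batch of loopback connections, close them, and compare the descriptor counts. It only illustrates that closing a channel normally returns its descriptor; it does not reproduce or diagnose the leak described here. The class name FdReleaseDemo is hypothetical, and the counting again assumes a Unix-like JVM that exposes UnixOperatingSystemMXBean.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not taken from the application in the question.
class FdReleaseDemo {

    /** Open descriptors of this JVM, or -1 where the platform does not expose them. */
    static long openFds() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1L;
    }

    /** Returns {before, during, after} descriptor counts around opening and closing the connections. */
    static long[] run(int connections) throws IOException {
        List<SocketChannel> channels = new ArrayList<SocketChannel>();
        ServerSocketChannel server = ServerSocketChannel.open();
        try {
            // Bind to an ephemeral loopback port.
            server.socket().bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0), connections);
            long before = openFds();
            for (int i = 0; i < connections; i++) {
                // Each connection costs two descriptors here: client side and accepted side.
                channels.add(SocketChannel.open(server.socket().getLocalSocketAddress()));
                channels.add(server.accept());
            }
            long during = openFds();
            for (SocketChannel ch : channels) {
                ch.close();
            }
            long after = openFds();
            return new long[] {before, during, after};
        } finally {
            for (SocketChannel ch : channels) {
                ch.close(); // no-op for channels that are already closed
            }
            server.close();
        }
    }

    public static void main(String[] args) throws IOException {
        long[] counts = run(20);
        System.out.println("before=" + counts[0] + " during=" + counts[1] + " after=" + counts[2]);
    }
}
```

If the count in the real server does not fall back in the same way after INACTIVE/UNREGISTERED, comparing which descriptors remain in the lsof output may narrow down what is holding them open.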

Update: I am using Netty 4.0.25.

Update 2: As requested, I moved the LoggingHandler in front of the ReadTimeoutHandler; here are the logs. Case where the client disconnects normally:

09:41:39,755 [3-1] [id: 0xca6601a2, /127.0.0.1:64258 => /127.0.0.1:4300] REGISTERED 
09:41:39,756 [3-1] [id: 0xca6601a2, /127.0.0.1:64258 => /127.0.0.1:4300] ACTIVE 
09:41:39,810 [3-1] [id: 0xca6601a2, /127.0.0.1:64258 => /127.0.0.1:4300] RECEIVED(1024B) 
09:41:39,813 [3-1] [id: 0xca6601a2, /127.0.0.1:64258 => /127.0.0.1:4300] RECEIVED(1024B) 
09:41:39,814 [3-1] [id: 0xca6601a2, /127.0.0.1:64258 => /127.0.0.1:4300] RECEIVED(150B) 
09:41:40,854 [3-1] [id: 0xca6601a2, /127.0.0.1:64258 => /127.0.0.1:4300] WRITE(1385B) 
09:41:40,855 [3-1] [id: 0xca6601a2, /127.0.0.1:64258 => /127.0.0.1:4300] FLUSH 
09:41:40,861 [3-1] [id: 0xca6601a2, /127.0.0.1:64258 :> /127.0.0.1:4300] INACTIVE 
09:41:40,864 [3-1] [id: 0xca6601a2, /127.0.0.1:64258 :> /127.0.0.1:4300] UNREGISTERED 

Case where the client does not disconnect:

10:04:24,104 [3-1] [id: 0x48076684, /127.0.0.1:50525 => /127.0.0.1:4300] REGISTERED 
10:04:24,107 [3-1] [id: 0x48076684, /127.0.0.1:50525 => /127.0.0.1:4300] ACTIVE 
10:04:24,594 [3-1] [id: 0x48076684, /127.0.0.1:50525 => /127.0.0.1:4300] RECEIVED(1024B) 
10:04:24,597 [3-1] [id: 0x48076684, /127.0.0.1:50525 => /127.0.0.1:4300] RECEIVED(1024B) 
10:04:24,598 [3-1] [id: 0x48076684, /127.0.0.1:50525 => /127.0.0.1:4300] RECEIVED(150B) 
10:04:25,638 [3-1] [id: 0x48076684, /127.0.0.1:50525 => /127.0.0.1:4300] WRITE(1383B) 
10:04:25,639 [3-1] [id: 0x48076684, /127.0.0.1:50525 => /127.0.0.1:4300] FLUSH 
10:05:25,389 [3-1] [id: 0x48076684, /127.0.0.1:50525 => /127.0.0.1:4300] CLOSE() 
10:05:25,390 [3-1] [id: 0x48076684, /127.0.0.1:50525 :> /127.0.0.1:4300] CLOSE() 
10:05:25,390 [3-1] [id: 0x48076684, /127.0.0.1:50525 :> /127.0.0.1:4300] INACTIVE 
10:05:25,394 [3-1] [id: 0x48076684, /127.0.0.1:50525 :> /127.0.0.1:4300] UNREGISTERED 

So there is a 60-second gap before the close (as expected from the ReadTimeoutHandler).

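After some more analysis, I have the impression that the number of open files increases even when the client disconnects normally! Also, there is no CLOSE() in that case ...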

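Which Netty version are you using? – HCarrasko 2015-03-02 19:22:43

Could you insert a 'LoggingHandler' with a sufficiently high log level before the 'ReadTimeoutHandler' in the pipeline and update your question with the log? – trustin 2015-03-04 04:05:11

Maybe it is more related to the SO; those connection events look normal to me, and I have never seen a problem like this. There is a newer Netty version, 4.0.28. Could you try it? – crigore 2015-06-26 08:07:39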

Answers

Maybe it is this Netty issue: https://github.com/netty/netty/issues/1731

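This is expected behavior, and not something that can be changed. The JVM is signaling that it cannot accept the channel, so no connection can be established and no response can be sent. The client will see a connection failure. If you have a load balancer, it should retry against an alternate host or return a 503 on behalf of your application.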
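The "retry against an alternate host" idea can be sketched on the client side roughly as follows. This is only an illustration using plain java.net sockets; it assumes the failure surfaces as an IOException from connect (for example, connection refused or a timeout), which may not hold in every failure mode. The class and method names (FailoverConnector, connectToFirstAvailable) are hypothetical and do not come from Netty or from the linked issue.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;

// Hypothetical client-side helper: tries each candidate host in order and
// returns the first socket that connects.
class FailoverConnector {

    static Socket connectToFirstAvailable(List<InetSocketAddress> hosts, int timeoutMillis)
            throws IOException {
        IOException last = null;
        for (InetSocketAddress host : hosts) {
            Socket socket = new Socket();
            try {
                socket.connect(host, timeoutMillis);
                return socket; // caller is responsible for closing it
            } catch (IOException e) {
                last = e;
                socket.close(); // release the descriptor of the failed attempt
            }
        }
        if (last == null) {
            throw new IOException("no hosts given");
        }
        throw last; // every candidate failed; report the most recent error
    }
}
```

Closing the socket after a failed attempt matters here for the same reason as on the server: each unsuccessful attempt would otherwise keep a descriptor open on the client.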
