2012-08-04 637 views
1

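IBM WebSphere MQ cluster channel ends abnormally and recovers frequently

In a clustered environment, I see channels to a specific server end abnormally and recover frequently during the day. For example, QMGR A has several QMGRs (B, C, D, E, F) connecting to it, each on a different server.
The cluster receiver channels of QMGRs B, C, D, E and F from QMGR A end abnormally and recover quite frequently within a day.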

QMGR log on host B

 

08/04/2012 08:44:09 AM - Process(17174.16023) User(mqad) Program(amqrmppa) 
AMQ9259: Connection timed out from host 'HOST.A'. 

EXPLANATION: 
A connection from host 'HOST.A' over TCP/IP timed out. 
ACTION: 
Check to see why data was not received in the expected time. Correct the 
problem. Reconnect the channel, or wait for a retrying channel to reconnect 
itself. 
----- amqccita.c : 3546 ------------------------------------------------------- 
08/04/2012 08:44:09 AM - Process(17174.16023) User(mqad) Program(amqrmppa) 
AMQ9999: Channel program ended abnormally. 

EXPLANATION: 
Channel program 'CHANNEL.TO.B' ended abnormally. 
ACTION: 
Look at previous error messages for channel program 'CHANNEL.TO.B' in the 
error files to determine the cause of the failure. 


Recorded in the QMGR log

 

    ------------------------------------------------------------------------------- 
08/04/12 08:44:41 - Process(1720412.1165) User(mqad) Program(amqrmppa) 
AMQ9209: Connection to host 'HOST.B (139.120.210.19)' closed. 

EXPLANATION: 
An error occurred receiving data from 'HOST.B (139.120.210.19)' over TCP/IP. 
The connection to the remote host has unexpectedly terminated. 
ACTION: 
Tell the systems administrator. 
----- amqccita.c : 3094 ------------------------------------------------------- 
08/04/12 08:44:41 - Process(1720412.1165) User(mqad) Program(amqrmppa) 
AMQ9999: Channel program ended abnormally. 

EXPLANATION: 
Channel program 'CHANNEL.TO.B' ended abnormally. 
ACTION: 
Look at previous error messages for channel program 'CHANNEL.TO.B' in the 
error files to determine the cause of the failure. 
----- amqrccca.c : 777 -------------------------------------------------------- 
08/04/12 08:44:41 - Process(1720412.1175) User(mqad) Program(amqrmppa) 
AMQ9209: Connection to host 'HOST.C (155.10.186.20)' closed. 

EXPLANATION: 
An error occurred receiving data from 'HOST.C (155.10.186.20)' over TCP/IP. 
The connection to the remote host has unexpectedly terminated. 
ACTION: 
Tell the systems administrator. 
----- amqccita.c : 3094 ------------------------------------------------------- 
08/04/12 08:44:41 - Process(1720412.1175) User(mqad) Program(amqrmppa) 
AMQ9999: Channel program ended abnormally. 

EXPLANATION: 
Channel program 'CHANNEL.TO.C' ended abnormally. 
ACTION: 
Look at previous error messages for channel program 'CHANNEL.TO.C' in the 
error files to determine the cause of the failure. 
    ------------------------------------------------------------------------------- 

QMGR log on host C

 
------------------------------------------------------------------------------- 
08/04/12 08:44:35 - Process(462890.4658) User(mqad) Program(amqrmppa) 
AMQ9259: Connection timed out from host 'HOST.A'. 

EXPLANATION: 
A connection from host 'HOST.A' over TCP/IP timed out. 
ACTION: 
Check to see why data was not received in the expected time. Correct the 
problem. Reconnect the channel, or wait for a retrying channel to reconnect 
itself. 
----- amqccita.c : 3341 ------------------------------------------------------- 
08/04/12 08:44:35 - Process(462890.4658) User(mqad) Program(amqrmppa) 
AMQ9999: Channel program ended abnormally. 

EXPLANATION: 
Channel program 'CHANNEL.TO.C' ended abnormally. 
ACTION: 
Look at previous error messages for channel program 'CHANNEL.TO.C' in the 
error files to determine the cause of the failure. 
----- amqrmrsa.c : 468 -------------------------------------------------------- 

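I am trying to understand what is causing this. If queue manager A is overloaded with too many connections, could that lead to this behaviour? I do not see any TCP/IP error codes recorded in the qmgr logs.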

+1

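Please provide the CLUSRCVR definitions from A, B and C, and also the qm.ini file. There are a number of tuning options that could shed light on what is happening. Also, which version of WMQ is this? – 2012-08-04 16:26:22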

+0

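Hi Vignesh, without additional information this question is hard to answer. Can we consider it abandoned and close it? If not, can you provide more information? – 2014-04-08 00:10:58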

Answers

4

It looks like you are running a version of MQ earlier than V7.1? In MQ V7.1 that error message was updated from:

AMQ9259: Connection timed out from host 'HOST.A'. 

EXPLANATION: 
A connection from host 'HOST.A' over TCP/IP timed out. 
ACTION: 
Check to see why data was not received in the expected time. Correct the 
problem. Reconnect the channel, or wait for a retrying channel to reconnect 
itself. 

to:

AMQ9259: Connection timed out from host 'HOST.A'. 

EXPLANATION: 
A connection from host 'HOST.A' over TCP/IP timed out. 
ACTION: 
The select() [TIMEOUT] 60 seconds call timed out. Check to see why data was 
not received in the expected time. Correct the problem. Reconnect the channel, 
or wait for a retrying channel to reconnect itself. 

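The most likely cause of the AMQ9259 error message is that your receive timeout settings have caused the channel to pop out of its receive and close the channel down. I suggest you look at the receive timeout settings in your qm.ini file to see whether they are set to something shorter than the heartbeat interval.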
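The comparison suggested above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not an MQ API: the function name and both parameters are hypothetical, the 60 seconds is taken from the select() timeout shown in the V7.1 message text, and the 300-second heartbeat interval is an assumed example value that you would replace with the HBINT of your own channel.

```python
def receive_gives_up_early(receive_timeout_s: float, heartbeat_interval_s: float) -> bool:
    """Return True when the receiver stops waiting before the next heartbeat is due.

    Both arguments are plain numbers in seconds that you read off your own
    configuration; nothing here talks to a queue manager.
    """
    if receive_timeout_s <= 0 or heartbeat_interval_s <= 0:
        raise ValueError("both values must be positive numbers of seconds")
    return receive_timeout_s < heartbeat_interval_s


if __name__ == "__main__":
    # 60 s comes from the select() timeout in the message above;
    # 300 s is an assumed heartbeat interval used only for illustration.
    if receive_gives_up_early(60, 300):
        print("An idle channel would time out before a heartbeat can arrive.")
    else:
        print("The receive timeout is at least as long as the heartbeat interval.")
```

If the function returns True for your values, an otherwise healthy but idle channel would be expected to hit the timeout, which would match the pattern of repeated AMQ9259 entries.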

The channels restart automatically because you have retry intervals defined. This is good!
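To see how often the channels drop and recover, the error log entries can be tallied by message code. The sketch below is a minimal parser that assumes the entry layout shown in the question (a timestamp line containing "- Process(", followed by a line starting with the AMQ code). The function names and the regular expressions are my own, and the sample text is copied from the question.

```python
import re
from collections import Counter

# Timestamp line, e.g. "08/04/12 08:44:41 - Process(1720412.1165) User(mqad) ..."
# The optional AM/PM suffix covers the "08/04/2012 08:44:09 AM" variant.
HEADER = re.compile(r"^\s*(\d{2}/\d{2}/\d{2,4}) (\d{2}:\d{2}:\d{2})(?: [AP]M)? - Process\(")
# Message line, e.g. "AMQ9999: Channel program ended abnormally."
MESSAGE = re.compile(r"^\s*(AMQ\d{4}): (.*\S)")


def parse_entries(text):
    """Return (timestamp, code, summary) for each entry that has a timestamp line."""
    entries = []
    stamp = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            stamp = f"{header.group(1)} {header.group(2)}"
            continue
        message = MESSAGE.match(line)
        if message and stamp is not None:
            entries.append((stamp, message.group(1), message.group(2)))
            stamp = None  # the next message needs its own timestamp line
    return entries


def count_by_code(entries):
    """Count how many entries were logged for each AMQ message code."""
    return Counter(code for _, code, _ in entries)


SAMPLE = """\
08/04/12 08:44:41 - Process(1720412.1165) User(mqad) Program(amqrmppa)
AMQ9209: Connection to host 'HOST.B (139.120.210.19)' closed.

EXPLANATION:
An error occurred receiving data from 'HOST.B (139.120.210.19)' over TCP/IP.
ACTION:
Tell the systems administrator.
----- amqccita.c : 3094 -------------------------------------------------------
08/04/12 08:44:41 - Process(1720412.1165) User(mqad) Program(amqrmppa)
AMQ9999: Channel program ended abnormally.
"""

if __name__ == "__main__":
    for code, total in sorted(count_by_code(parse_entries(SAMPLE)).items()):
        print(code, total)
```

Running the same functions over the logs from each host and comparing the timestamps should show whether the AMQ9259 timeouts on B and C line up with the AMQ9209 entries on A.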