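Sometimes long-running ssh commands stop printing to stdout

I've been using Net::SSH::Perl to automate running some scripts on remote boxes. However, some of these scripts take a really long time to complete (an hour or two), and sometimes I stop getting data from them without actually losing the connection.

Here's the code I'm using: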
    sub run_regression_tests {
        for(my $i = 0; $i < @servers; $i++){
            my $inner = $users[$i];
            foreach(@$inner){
                my $user = $_;
                my $server = $servers[$i];
                my $outFile;
                open($outFile, ">" . $outputDir . $user . "@" . $server . ".log.txt");
                print $outFile "Opening connection to $user at $server on " . localtime() . "\n\n";
                close($outFile);
                my $pid = $pm->start and next;
                print "Connecting to ${user}\@" . "$server...\n";
                my $hasWentToDownloadYet = 0;
                my $ssh = Net::SSH::Perl->new($server, %sshParams);
                $ssh->login($user, $password);
                $ssh->register_handler("stdout", sub {
                    my($channel, $buffer) = @_;
                    my $outFile;
                    open($outFile, ">>", $outputDir . $user . "@" . $server . ".log.txt");
                    print $outFile $buffer->bytes;
                    close($outFile);
                    my @lines = split("\n", $buffer->bytes);
                    foreach(@lines){
                        if($_ =~ m/REGRESSION TEST IS COMPLETE/){
                            $ssh->_disconnect();
                            if(!$hasWentToDownloadYet){
                                $hasWentToDownloadYet = 1;
                                print "Caught exit signal.\n";
                                print("Regression tests for ${user}\@${server} finised.\n");
                                download_regression_results($user, $server);
                                $pm->finish;
                            }
                        }
                    }
                });
                $ssh->register_handler("stderr", sub {
                    my($channel, $buffer) = @_;
                    my $outFile;
                    open($outFile, ">>", $outputDir . $user . "@" . $server . ".log.txt");
                    print $outFile $buffer->bytes;
                    close($outFile);
                });
                if($debug){
                    $ssh->cmd('tail -fn 40 /GDS/gds/gdstest/t-gds-master/bin/comp.reg');
                }else{
                    my ($stdout, $stderr, $exit) = $ssh->cmd('. ./.profile && cleanall && my.comp.reg');
                    if(!$exit){
                        print "SSH connection failed for ${user}\@${server} finised.\n";
                    }
                }
                #$ssh->cmd('. ./.profile');
                if(!$hasWentToDownloadYet){
                    $hasWentToDownloadYet = 1;
                    print("Regression tests for ${user}\@${server} finised.\n");
                    download_regression_results($user, $server);
                }
                $pm->finish;
            }
        }
        sleep(1);
        print "\n\n\nAll tests started. Tests typically take 1 hour to complete.\n";
        print "If they take significantly less time, there could be an error.\n";
        print "\n\nNo output will be printed until all commands have executed and finished.\n";
        print "If you wish to watch the progress tail -f one of the logs this script produces.\n Example:\n\t" . 'tail -f ./user@server.log.txt' . "\n";
        $pm->wait_all_children;
        print "\n\nAll Tests are Finished. \n";
    }

And here are my %sshParams:

    my %sshParams = (
        protocol => '2',
        port => '22',
        options => [
            "TCPKeepAlive yes",
            "ConenctTimeout 10",
            "BatchMode yes"
        ]
    );
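
Sometimes a random one of the long-running commands just stops printing to stdout and firing the stdout or stderr handlers, yet it never exits. The ssh connection doesn't die (as far as I can tell), because $ssh->cmd is still blocking.

Any idea how to correct this behavior?
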
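Do you have shell access to the server you are running this command on? If you do, you can check whether the ssh command is still around with 'ps auxgmww | grep ssh'. That at least lets you test your assumption that the ssh process is still working. If it is, you can run ps to get the process ID of your program and then run 'strace -fp $PID' (substituting your program's PID for $PID). See whether that reveals why it might be stuck.

Also log in to the remote server, see which processes are running there, and strace those as well to find out where they are stuck. Could it be that, in some cases, one of the regression tests you are running expects something on STDIN?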
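
The checks suggested in the two comments above could look roughly like the sketch below. The `list_matching` helper is hypothetical, and the `perl` and `my.comp.reg` patterns and the `$PID` placeholder are only illustrative guesses at what to search for. As far as I know, Net::SSH::Perl speaks the protocol inside the Perl process itself, so on the client side the forked worker may show up as `perl` rather than `ssh`.

```shell
#!/bin/sh
# Hypothetical helper: list processes whose command line contains PATTERN,
# leaving out the grep processes that the pipeline itself starts.
list_matching() {
    ps auxww | grep -- "$1" | grep -v grep
}

# On the machine running the Perl script: are the forked workers still alive?
# (pattern is an assumption; adjust it to however the script was started)
#   list_matching perl

# On the remote server: is the regression script still running?
#   list_matching my.comp.reg

# Attach to a process that looks stuck and follow its children.
# PID is a placeholder for a process ID taken from the listings above.
#   strace -f -p "$PID"
```

If strace shows the remote process sitting in a read() on file descriptor 0, that would fit the guess that a test is waiting for input on STDIN.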

Is 'ConenctTimeout' a typo in your question, or in your actual setup? – TLP