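Perl program error
I wrote a Perl program that takes two inputs: an Excel sheet (converted to a text file by changing the .xls extension to .txt) and a sequence file. The Excel sheet contains the start and end points of regions in the sequence file (plus 70 flanking values on either side of each matched region). These regions need to be cut out and written to a third, output file. There are 300 values. The program reads the start and end points of the sequence to cut each time, but it keeps telling me that the value lies outside the length of the input, when it clearly does not. I just can't seem to get this fixed.
Here is the program: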
use strict;
use warnings;
my $blast;
my $i;
my $idline;
my $sequence;
print "Enter Your BLAST result file name:\t";
chomp($blast = <STDIN>); # BLAST result file name
print "\n";
my $database;
print "Enter Your Gene list file name:\t";
chomp($database = <STDIN>); # sequence file
print "\n";
open IN, "$blast" or die "Can not open file $blast: $!";
my @ids = ();
my @seq_start = ();
my @seq_end = ();
while (<IN>) {
    # split each line of the result file on tabs
    my @feilds = split("\t", $_);
    push(@ids, $feilds[0]); # copy the name of the sequence
    # field index 6 (the seventh column) is the start point of the region to cut
    push(@seq_start, $feilds[6]);
    # field index 7 (the eighth column) is the end point of the region to cut
    push(@seq_end, $feilds[7]);
}
close IN;
open OUT, ">Result.fasta" or die "Can not open file Result.fasta: $!";
for ($i = 0; $i <= $#ids; $i++) {
    ($sequence) = &block($ids[$i]);
    ($idline, $sequence) = split("\n", $sequence);
    # extract the sequence from the start point to the end point
    my $seqlen = $seq_end[$i] - $seq_start[$i] - 1;
    my $Nucleotides = substr($sequence, $seq_start[$i], $seqlen); # store the extracted substring in $Nucleotides
    $Nucleotides =~ s/(.{1,60})/$1\n/gs;
    print OUT "$idline\n";
    print OUT "$Nucleotides\n";
}
print "\nExtraction Completed...";
sub block {
    # fetch the record for an id, which is the first column of the BLAST output file
    my $id1 = shift;
    print "$id1\n";
    my $start = ();
    open IN3, "$database" or die "Can not open file $database: $!";
    my $blockseq = "";
    while (<IN3>) {
        if (($_ =~ /^>/) && ($start)) {
            last;
        }
        if (($_ !~ /^>/) && ($start)) {
            chomp;
            $blockseq .= $_;
        }
        if (/^>$id1/) {
            my $start = $. - 1;
            my $blockseq .= $_;
        }
    }
    close IN3;
    return ($blockseq);
}
BLAST result file: http://www.fileswap.com/dl/Ws7ehftejp/
Sequence file: http://www.fileswap.com/dl/lPwuGh2oKM/
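Errors:
substr outside of string at Nucleotide_Extractor.pl line 39.
Use of uninitialized value $Nucleotides in substitution (s///) at Nucleotide_Extractor.pl line 41.
Use of uninitialized value $Nucleotides in concatenation (.) or string at Nucleotide_Extractor.pl line 44.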
Any help is much appreciated, and questions are always welcome.
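What is the phytophthora file? Without it I cannot work through the block function. It looks like you are taking a substring whose start index lies beyond the length of the string, like substr("Hello", 45, 4). Since that returns nothing, $Nucleotides is left uninitialized as well. I suggest you check the indices you pass to substr. – xtreak 2014-09-19 05:42:51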
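To make the substr point concrete, here is a minimal sketch of a range check done before slicing. The helper name `safe_substr` is hypothetical and not part of the original script; it only illustrates one way to catch an offset or length that does not fit the string instead of letting substr return undef.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): returns the requested
# slice, or undef with a warning when the range does not fit in the string.
sub safe_substr {
    my ($string, $offset, $length) = @_;
    if ($offset < 0 || $length < 0 || $offset + $length > length($string)) {
        warn sprintf "range %d+%d does not fit in a string of length %d\n",
            $offset, $length, length($string);
        return;
    }
    return substr($string, $offset, $length);
}

print safe_substr("Hello", 1, 3), "\n";               # prints "ell"
my $slice = safe_substr("Hello", 45, 4);              # warns, returns undef
print defined $slice ? $slice : "(no slice)", "\n";   # prints "(no slice)"
```

In the posted loop, a check like this would also flag the case where `$sequence` comes back empty, which is worth ruling out before suspecting the start and end columns.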
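@Wordzilla That was the name of the sequence file behind the link I gave in the question. I have uploaded both input files to fileswap and provided the links. Please download both files and work with them. The sequence belongs to an organism called Phytophthora. I have changed the file name now. Thanks – 2014-09-19 05:56:01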
You should also use strict in your script and declare all variables with 'my', i.e. 'my $sequence = ...'. – 2014-09-19 07:04:22
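On the subject of `my`: in the `block` sub as posted, the branch `if (/^>$id1/)` declares `$start` and `$blockseq` again with `my`, which creates new variables scoped to that branch. The outer `$start` is then never set and the outer `$blockseq` stays empty, which would plausibly explain the "substr outside of string" message. Below is a sketch with the inner `my` removed. The `$fasta_file` parameter, the `$found` flag, and the sample data are assumptions made for illustration; the original reads the global `$database` instead.

```perl
use strict;
use warnings;

# Sketch of the lookup with the inner `my` removed, so the assignments reach
# the variables declared at the top of the sub.
sub block {
    my ($id, $fasta_file) = @_;
    my $found    = 0;     # becomes true once the wanted header has been seen
    my $blockseq = "";

    open my $in, '<', $fasta_file or die "Can not open file $fasta_file: $!";
    while (my $line = <$in>) {
        chomp $line;
        if ($line =~ /^>/) {
            last if $found;                  # next record starts: stop
            if ($line =~ /^>\Q$id\E/) {      # \Q...\E: match the id literally
                $found    = 1;
                $blockseq = "$line\n";       # header line first
            }
        }
        elsif ($found) {
            $blockseq .= $line;              # then the joined sequence lines
        }
    }
    close $in;
    return $blockseq;
}

# Made-up FASTA text, passed as a scalar reference so open() reads it from
# memory; a real file name works the same way.
my $fasta = ">seq1 demo\nACGT\nTTGA\n>seq2 demo\nGGCC\n";
my ($idline, $sequence) = split /\n/, block("seq1", \$fasta);
print "$idline\n$sequence\n";
```

Like the original pattern, this matches the id as a prefix of the header, so an id such as `seq1` would also match a header starting with `>seq10`; anchoring on the character that follows the id would tighten that if it matters for the data.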