How to match two sequences using arrays in Perl

When looping through two arrays, I'm confused about how to advance the pointer through one array while holding it constant in the other. So, for example:

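Array 1: A T C G T C G A G C G
Array 2: A C G T C C T G T C G

So the A in the first array matches the A in the second array, and we move on to the next element. But since the T does not match the C at the second index, I want the program to compare that T to the next element in array 2 (the G), and so on, until it finds a matching T.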

my ($array1ref, $array2ref) = @_; 

my @array1 = @$array1ref; 
my @array2= @$array2ref; 
my $count = 0; 
foreach my $element (@array1) { 
foreach my $element2 (@array2) { 
if ($element eq $element2) { 
$count++; 
    }else { ??????????? 


}

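Source

2013-05-02 user2344516

Nested loops make no sense here. You don't want to loop over the second array multiple times.

You didn't specify what you want to happen after resynchronizing, so you'll want to start with the following and adjust as needed.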

my ($array1, $array2) = @_; 

my $idx1 = 0; 
my $idx2 = 0; 
while ($idx1 < @$array1 && $idx2 < @$array2) { 
    if ($array1->[$idx1] eq $array2->[$idx2]) { 
     ++$idx1; 
     ++$idx2; 
    } else { 
     ++$idx2; 
    } 
} 

...

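As is, the above code will leave $idx1 at the last index that could not (eventually) be resynchronized. If you instead want to stop as soon as you first resynchronize, you want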

my ($array1, $array2) = @_; 

my $idx1 = 0; 
my $idx2 = 0; 
my $mismatched = 0; 
while ($idx1 < @$array1 && $idx2 < @$array2) { 
    if ($array1->[$idx1] eq $array2->[$idx2]) { 
     last if $mismatched;   
     ++$idx1; 
     ++$idx2; 
    } else { 
     ++$mismatched; 
     ++$idx2; 
    } 
} 

...
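
To see where the indexes end up, here is a small sketch that wraps the first loop in a sub and runs it on the data from the question. The name walk_indexes is made up for this illustration and is not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: runs the first loop and returns the final indexes.
sub walk_indexes {
    my ($array1, $array2) = @_;
    my ($idx1, $idx2) = (0, 0);
    while ($idx1 < @$array1 && $idx2 < @$array2) {
        if ($array1->[$idx1] eq $array2->[$idx2]) {
            ++$idx1;    # match: advance both indexes
            ++$idx2;
        } else {
            ++$idx2;    # mismatch: advance only the second index
        }
    }
    return ($idx1, $idx2);
}

my @a1 = qw(A T C G T C G A G C G);
my @a2 = qw(A C G T C C T G T C G);
my ($idx1, $idx2) = walk_indexes(\@a1, \@a2);
print "idx1 = $idx1, idx2 = $idx2\n";    # idx1 = 7, idx2 = 11
```

With this data the second array is exhausted while $idx1 is still 7, so the elements of the first array from index 7 onward are never matched.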

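Source

2013-05-02 19:50:36 ikegami

A plain foreach loop won't cut it: we'll either want to loop while there are still elements available in both arrays, or iterate over indices that we can increment as we like: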

EL1: while (defined(my $el1 = shift @array1) and @array2) { 
    EL2: while(defined(my $el2 = shift @array2)) { 
    ++$count and next EL1 if $el1 eq $el2; # break out of inner loop 
    } 
}

or

my $j = 0; # index of @array2 
for (my $i = 0; $i <= $#array1; $i++) { 
    $j++ until $j > $#array2 or $array1[$i] eq $array2[$j]; 
    last if $j > $#array2; 
    $count++; 
}

or any combination thereof.

Source

2013-05-02 19:51:08 amon

This is too complex for a for loop; use a while loop instead:

my ($array1ref, $array2ref) = @_; 

my @array1 = @$array1ref; 
my @array2= @$array2ref; 
my $count = 0; 
my ($index, $index2) = (0,0); 
#loop while indexes are in arrays 
while($index <= $#array1 && $index2 <= $#array2) { 
    if($array1[$index] eq $array2[$index2]) { 
     $index++; 
     $index2++; 
    } else { 
     #increment index until we find a match or run off the end of @array2 
     $index2++ until $index2 > $#array2 or $array1[$index] eq $array2[$index2]; 
    } 
}

Source

2013-05-02 19:51:36 user1937198

Here is one possibility. It uses indexes to walk through both lists.

my @array1 = qw(A T C G T C G A G C G); 
my @array2 = qw(A C G T C C T G T C G); 

my $count = 0; 
my $idx1 = 0; 
my $idx2 = 0; 

while(($idx1 < scalar @array1) && ($idx2 < scalar @array2)) { 
    if($array1[$idx1] eq $array2[$idx2]) { 
     print "Match of $array1[$idx1] array1 \@ $idx1 and array2 \@ $idx2\n"; 
     $idx1++; 
     $idx2++; 
     $count++; 
    } else { 
     $idx2++; 
    } 
} 

print "Count = $count\n";

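Source

2013-05-02 19:54:56

You can use a while loop to search for matches. If you find a match, advance in both arrays. If you don't, advance only the second array. At the end, you can print the remaining unmatched characters from the first array:

# [1, 2, 3] is a reference to an anonymous array (1, 2, 3) 
# qw(1 2 3) is quote-word shorthand for ('1', '2', '3') 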
my $arr1 = [qw(A T C G T C G A G C G)]; 
my $arr2 = [qw(A C G T C C T G T C G)]; 

my $idx1 = 0; 
my $idx2 = 0; 

# Find matched characters 
# In numeric context, @$arr_ref is the size of the array referenced by $arr_ref 
while ($idx1 < @$arr1 && $idx2 < @$arr2) { 
    my $char1 = $arr1->[$idx1]; 
    my $char2 = $arr2->[$idx2]; 
    if ($char1 eq $char2) { 
     # Matched character, advance arr1 and arr2 
     printf("%s %s -- arr1[%d] matches arr2[%d]\n", $char1, $char2, $idx1, $idx2); 
     ++$idx1; 
     ++$idx2; 
    } else { 
     # Unmatched character, advance arr2 
     printf(". %s -- skipping arr2[%d]\n", $char2, $idx2); 
     ++$idx2; 
    } 
} 

# Remaining unmatched characters 
while ($idx1 < @$arr1) { 
    my $char1 = $arr1->[$idx1]; 
    printf("%s . -- arr1[%d] is beyond the end of arr2\n", $char1, $idx1); 
    $idx1++; 
}

The script prints:

A A -- arr1[0] matches arr2[0] 
. C -- skipping arr2[1] 
. G -- skipping arr2[2] 
T T -- arr1[1] matches arr2[3] 
C C -- arr1[2] matches arr2[4] 
. C -- skipping arr2[5] 
. T -- skipping arr2[6] 
G G -- arr1[3] matches arr2[7] 
T T -- arr1[4] matches arr2[8] 
C C -- arr1[5] matches arr2[9] 
G G -- arr1[6] matches arr2[10] 
A . -- arr1[7] is beyond the end of arr2 
G . -- arr1[8] is beyond the end of arr2 
C . -- arr1[9] is beyond the end of arr2 
G . -- arr1[10] is beyond the end of arr2

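Source

2013-05-02 20:02:43 Andomar

It seems you could do this quite easily with grep, provided you're guaranteed that array2 is always as long as or longer than array1. Something like this: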

sub align 
{ 
    my ($array1, $array2) = @_; 
    my $index = 0; 

    return grep 
      { 
       $array1->[$index] eq $array2->[$_] ? ++$index : 0 
      } 0 .. scalar(@$array2) - 1; 
}

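Basically, the grep is saying "return the list of increasing indices into array2 that match consecutive elements of array1."

If you run the above with this test code, you can see that it returns the expected alignment array: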

my @array1 = qw(A T C G T C G A G C G); 
my @array2 = qw(A C G T C C T G T C G); 

say join ",", align \@array1, \@array2;

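This outputs the expected mapping: 0,3,4,7,8,9,10. The list means that @array1[0 .. 6] corresponds to @array2[0,3,4,7,8,9,10].

(Note: you need use Modern::Perl or similar, such as use feature 'say', to use say.)

Now, you haven't really said what output you need from the operation. I assumed you wanted this mapping array. If you instead just need a count of the elements of @array2 that were skipped while aligning it with @array1, you can still use the grep above, but instead of returning the list, just return scalar(@$array2) - $index at the end.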
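
As a rough sketch of that counting variant: the sub name count_unmatched is hypothetical, and a plain for loop stands in for the grep so that nothing is built up only to be thrown away. It returns scalar(@$array2) - $index as described, which also counts any trailing elements of @array2 left over after the last match.

```perl
use strict;
use warnings;

# Hypothetical variant of align(): returns how many elements of @$array2
# were not paired with an element of @$array1.
sub count_unmatched {
    my ($array1, $array2) = @_;
    my $index = 0;
    for my $element (@$array2) {
        last if $index >= @$array1;    # all of array1 is already matched
        ++$index if $array1->[$index] eq $element;
    }
    return scalar(@$array2) - $index;
}

my @array1 = qw(A T C G T C G A G C G);
my @array2 = qw(A C G T C C T G T C G);
print count_unmatched(\@array1, \@array2), "\n";    # 11 - 7 = 4
```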

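Source

2013-05-06 09:45:39

As you may know, your problem is called sequence alignment. There are well-developed algorithms for doing this efficiently, and one such module, Algorithm::NeedlemanWunsch, is available on CPAN. Here is how you could apply it to your problem.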

#!/usr/bin/perl 

use Algorithm::NeedlemanWunsch; 

my $arr1 = [qw(A T C G T C G A G C G)]; 
my $arr2 = [qw(A C G T C C T G T C G)]; 

my $matcher = Algorithm::NeedlemanWunsch->new(sub {@_==0 ? -1 : $_[0] eq $_[1] ? 1 : -2}); 

my (@align1, @align2); 
my $result = $matcher->align($arr1, $arr2, 
    { 
    align => sub {unshift @align1, $arr1->[shift]; unshift @align2, $arr2->[shift]}, 
    shift_a => sub {unshift @align1, $arr1->[shift]; unshift @align2,   '.'}, 
    shift_b => sub {unshift @align1,   '.'; unshift @align2, $arr2->[shift]}, 
    }); 

print join("", @align1), "\n"; 
print join("", @align2), "\n";

This prints the optimal solution in terms of the costs we specified in the constructor:

ATCGT.C.GAGCG 
A.CGTCCTG.TCG

This is a very different approach from the one in your original question, but I think it is worth knowing about.

Source
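
For readers curious about what the module does internally, here is a minimal sketch of the dynamic-programming recurrence behind Needleman-Wunsch. It computes only the optimal score, not the alignment itself. The sub name nw_score is made up for this illustration, it is not the module's actual implementation, and the costs (match +1, mismatch -2, gap -1) are assumed to mirror the scoring sub passed to the constructor above.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: returns only the optimal alignment score,
# assuming match +1, mismatch -2, gap -1.
sub nw_score {
    my ($seq_a, $seq_b) = @_;
    my ($match, $mismatch, $gap) = (1, -2, -1);

    # @prev holds the scores of the previous row of the table;
    # row 0 aligns an empty prefix of seq_a against gaps only.
    my @prev = map { $_ * $gap } 0 .. scalar(@$seq_b);
    for my $i (1 .. scalar(@$seq_a)) {
        my @curr = ($i * $gap);
        for my $j (1 .. scalar(@$seq_b)) {
            my $pair = $seq_a->[$i - 1] eq $seq_b->[$j - 1] ? $match : $mismatch;
            $curr[$j] = max(
                $prev[$j - 1] + $pair,    # align the two elements
                $prev[$j]     + $gap,     # element of seq_a against a gap
                $curr[$j - 1] + $gap,     # element of seq_b against a gap
            );
        }
        @prev = @curr;
    }
    return $prev[-1];
}

print nw_score([qw(A C G T)], [qw(A C G T)]), "\n";    # 4
```

Keeping only two rows is enough for the score; recovering the alignment itself would need the full table or traceback pointers, which is the part the module handles through its callbacks.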

2013-05-09 22:26:17
