在Perl中需要峯值信號檢測方面的幫助

大家好我有一些酵母菌羣平板圖像的強度值。我需要能夠從強度值中找到峯值。下面是一個示例圖像，顯示了繪圖時值的外觀。在Perl中需要峯值信號檢測方面的幫助

的一些價值觀

這些值顯示在75.4和75.3兩個峯的例子，你可以看到，值增加後減小。這種變化並不總是一樣的。

圖強度值

http://lh4.ggpht.com/_aEDyS6ECO8s/THKTLgDPhaI/AAAAAAAAAio/HQW7Ut-HBhA/s400/peaks.png從 research

其中之一，我想這樣做是爲了保存每個組，即山中的哈希，然後尋找一個組中的最大價值的東西。其中一個如果我看到的問題是如何確定每個組的邊界。

這裏是我到目前爲止的代碼的鏈接： http://paste-it.net/public/y485822/

這裏是一個鏈接到一個完整的數據集： http://paste-it.net/public/ub121b4/

我寫我的代碼在Perl。任何幫助將不勝感激。謝謝

來源

2010-08-23 Alos

我相信你會得到更多的幫助是你要顯示你現在遇到問題的代碼，而不是要求其他人爲你提供算法和代碼。您的示例代碼可能包含樣本數據的數組，並且您希望返回的峯的預期結果。 – mfontani 2010-08-23 15:49:01

你需要決定你想要的峯值的地方。這裏的方法在數據的廣泛區域內找到峯值和谷值。

use strict; 
use warnings; 

my @data = (
    5.7, 5.3, 8.2, 16.5, 34.2, 58.8, 75.4, 75, 65.9, 62.6, 
    58.6, 66.4, 71.4, 53.5, 40.5, 26.8, 14.2, 8.6, 5.9, 7.7, 
    14.9, 30.5, 49.9, 69.1, 75.3, 69.8, 58.8, 57.2, 56.3, 67.1, 
    69, 45.1, 27.6, 13.4, 8, 5, 
); 

# Determine mean. Or use Statistics::Descriptive. 
my $sum; 
$sum += $_ for @data; 
my $mean = $sum/@data; 

# Make a pass over the data to find contiguous runs of values 
# that are either less than or greater than the mean. Also 
# keep track of the mins and maxes within those groups. 
my $group = -1; 
my $gt_mean_prev = ''; 
my @mins_maxs; 
my $i = -1; 

for my $d (@data){ 
    $i ++; 
    my $gt_mean = $d > $mean ? 1 : 0; 

    unless ($gt_mean eq $gt_mean_prev){ 
     $gt_mean_prev = $gt_mean; 
     $group ++; 
     $mins_maxs[$group] = $d; 
    } 

    if ($gt_mean){ 
     $mins_maxs[$group] = $d if $d > $mins_maxs[$group]; 
    } 
    else { 
     $mins_maxs[$group] = $d if $d < $mins_maxs[$group]; 
    } 

    $d = { 
     i  => $i, 
     val  => $d, 
     group => $group, 
     gt_mean => $gt_mean, 
    }; 
} 

# A fun picture. 
for my $d (@data){ 
    printf 
     "%6.1f %2d %1s %1d %3s %s\n", 
     $d->{val}, 
     $d->{i}, 
     $d->{gt_mean} ? '+' : '-', 
     $d->{group}, 
     $d->{val} == $mins_maxs[$d->{group}] ? '==>' : '', 
     '.' x ($d->{val}/2), 
    ;

}

輸出：

5.7 0 - 0  .. 
    5.3 1 - 0 ==> .. 
    8.2 2 - 0  .... 
    16.5 3 - 0  ........ 
    34.2 4 - 0  ................. 
    58.8 5 + 1  ............................. 
    75.4 6 + 1 ==> ..................................... 
    75.0 7 + 1  ..................................... 
    65.9 8 + 1  ................................ 
    62.6 9 + 1  ............................... 
    58.6 10 + 1  ............................. 
    66.4 11 + 1  ................................. 
    71.4 12 + 1  ................................... 
    53.5 13 + 1  .......................... 
    40.5 14 - 2  .................... 
    26.8 15 - 2  ............. 
    14.2 16 - 2  ....... 
    8.6 17 - 2  .... 
    5.9 18 - 2 ==> .. 
    7.7 19 - 2  ... 
    14.9 20 - 2  ....... 
    30.5 21 - 2  ............... 
    49.9 22 + 3  ........................ 
    69.1 23 + 3  .................................. 
    75.3 24 + 3 ==> ..................................... 
    69.8 25 + 3  .................................. 
    58.8 26 + 3  ............................. 
    57.2 27 + 3  ............................ 
    56.3 28 + 3  ............................ 
    67.1 29 + 3  ................................. 
    69.0 30 + 3  .................................. 
    45.1 31 + 3  ...................... 
    27.6 32 - 4  ............. 
    13.4 33 - 4  ...... 
    8.0 34 - 4  .... 
    5.0 35 - 4 ==> ..

來源

2010-08-23 16:30:59 FMc

僅供參考，您從原始數據的開頭刪除了幾個數據點。 – ysth 2010-08-23 23:22:37

@ysth謝謝，很好。 – FMc 2010-08-23 23:35:11

my @data = ...; 

# filter out sequential duplicate values 
my @orig_index = 0; 
my @deduped = $data[0]; 
for my $index (1..$#data) { 
    if ($data[$index] != $data[$index-1]) { 
     push @deduped, $data[$index]; 
     push @orig_index, $index; 
    } 
} 

# add a sentinel (works for both ends) 
push @deduped, -9**9**9; 

my @local_maxima_indexes; 
for my $index (0..$#deduped-1) { 
    if ($deduped[$index] > $deduped[$index-1] && $deduped[$index] > $deduped[$index+1]) { 
     push @local_maxima_indexes, $orig_index[$index]; 
    } 
}

注意，這考慮了第一值的局部最大值，也是值71.4和69.我不知道你怎麼樣正在區分你想要包括哪些。

來源

2010-08-23 16:33:14 ysth

你有控制數據集嗎？如果是這樣，我建議使用酵母強度和對照圖像之間簡單的對數比例來標準化數據。

然後，您可以使用perl port of ChiPOTle搶顯著峯，這聽起來方式比搜索局部/全局最大值更穩健等

的Chipotle「是用來分析芯片到芯片微陣列數據的峯值尋找算法「，但是我在其他許多應用程序中成功地使用了它（例如ChIP-seq，它確實比您的情況更接近它的原始目的）。

生成的日誌（酵母/對照）負值將用於建立顯着性估計的高斯背景模型。該算法然後使用錯誤發現率進行多個測試校正。

Here's the original paper。

來源

2010-08-24 06:03:28

嗨，佩德羅，我正在做的是把酵母盤子變成矩陣然後看強度值。然後繪製數值，峯值代表有菌落的區域。在這種情況下沒有控制數據集。我正在看看ChiPOTle謝謝！ – Alos 2010-08-24 15:44:35

在Perl中需要峯值信號檢測方面的幫助

回答

相關問題