0
所以我試圖完成的任務是從字符串s中獲取長度爲n的所有可能模式的頻率列表。哈斯克爾頻率計數
Input (Text,Length)
Output (Frequency)
String -> Int -> [Int]
freqCount s n = frequency list [int] in alphabetical order
[ 「A」, 「B」, 「C」, 「d」]
其中s中的字符被限定於以上所提到的四個,所以我第一步驟是獲得長度爲n的所有可能的排列都允許重複。
permutationsR k = sort(replicateM k ['A','B','C','D'])
例如,
permutationsR 2
會給輸出
["AA","AB","AC","AD","BA","BB","BC","BD","CA","CB","CC","CD","DA","DB","DC","DD"]
,然後每個模式得到了多少次,每次發生計數 所以像
patternCount:: String -> String -> Int
patternCount text pattern = length (filter (\x -> x == pattern) [take (length(pattern)) (drop x text)| x <- [0..length(text)- length(pattern)]])
frequencyCount s n = map (\x -> patternCount s x) (permutationsR n)
但是我想,這是怎麼回事是非常低效的,因爲我基本上是通過整個列表來檢查每個模式長度(permutationsRn)次,而不是我理由應該是poss只需要一次迭代即可完成。
有沒有一種方法可以生成頻率圖,就像我在命令式語言中那樣。
即僞
where s = string
and n = length of pattern
//pattern is a map where key = pattern and value = frequencyCount
patterns = {"AA":0,"AB":0,"AC:0...}
for (i = 0; i < (length s - n); i++){
patterns[s[i:(i+n)]] += 1
}
基本上只是通過迭代一次,從分割的字符串(I:I + N) 和更新模式映射與每次出現。
例如輸入,輸出將是這樣的
s= "AABBCA"
n = 2
frequencyList s n = [1,1,0,0,0,1,1,0,1,0,0,0,0,0,0,0]