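I have a bunch of data collected with, for example, awk to filter out specific information, from which I want the times, the average and ß. Each line holds a time, a source node and a destination node: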
1.00 3 4
1.00 0 1
51.00 1 4
84.00 3 4
95.00 0 2
110.00 2 4
120.00 0 1
121.00 1 2
124.00 2 4
158.00 3 4
159.00 1 3
172.00 0 4
214.00 0 4
223.00 2 4
224.00 1 2
228.00 1 4
229.00 0 1
232.00 2 3
233.00 3 4
233.00 1 3
246.00 0 2
292.00 0 3
294.00 0 4
294.00 2 4
294.00 3 4
318.00 1 2
331.00 0 1
383.00 2 4
402.00 3 4
The output I want to generate then looks like this:
node_src node_dst time_repeated time1 time2 ... average_time ß
Details:
*node_src = 2nd column
*node_dst = 3rd column
*time_repeated = the number of times the same pair is repeated, e.g. 3 4 is repeated 5 times
*time1, time2, ... = the values of column 1
*average_time = the average of the intervals between consecutive times (see the example below)
*ß = time_repeated/average_time
I have tried to generate a result like this:
node1 node2 nbrepeated time1 time2 time3 time4 time5 time6 time7 average ß
2 4 6 110.0 124.0 223.0 294.0 383.0 461.0 543.0 6.0 0
2 3 1 232.0 402.0 0.0 0.0 0.0 0.0 0.0 1.0 0
1 3 2 159.0 233.0 521.0 0.0 0.0 0.0 0.0 2.0 4
1 2 4 121.0 224.0 318.0 461.0 573.0 0.0 0.0 4.0 5
0 4 4 172.0 214.0 294.0 415.0 543.0 0.0 0.0 4.0 5
0 2 5 95.0 246.0 415.0 536.0 572.0 588.0 0.0 5.0 :
0 3 3 292.0 403.0 455.0 588.0 0.0 0.0 0.0 3.0 :
1 4 2 51.0 228.0 494.0 0.0 0.0 0.0 0.0 2.0 :
0 1 4 1.0 120.0 229.0 331.0 536.0 0.0 0.0 4.0 :
3 4 6 1.0 84.0 158.0 233.0 294.0 402.0 431.0 6.0 :
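I cannot work out the average time and ß because of how they have to be calculated. For a row such as the following, the average time is found like this: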
121.0 224.0 318.0 461.0 573.0
avg_time = ((224-121)+(318-224)+(461-318)+(573-461))/4
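Written out for a general row, the formula above might look as follows. This is only a restatement under the assumption that the times of one pair are sorted in increasing order; the symbols t_1 ... t_n (the n times of one pair) are introduced here for illustration and are not part of the original notation.

```latex
\[
\text{avg\_time}
  = \frac{1}{n-1}\sum_{i=1}^{n-1}\bigl(t_{i+1}-t_i\bigr)
  = \frac{t_n - t_1}{n-1},
\qquad
\text{e.g.}\quad \frac{573-121}{4} = \frac{452}{4} = 113 .
\]
```

Because the intermediate terms cancel, only the first time, the last time and the number of times appear to be needed, whatever the number of time fields.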
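The challenge here is to make it dynamic, because the number of time fields is unknown... and to do it using bash...
Here is the code, thanks to glenn jackman: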
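One way to handle an unknown number of time fields might be a small shell function that accepts any number of arguments. The sketch below is an assumption, not part of the original script: the name avg_interval is hypothetical, and because bash arithmetic is integer-only the floating-point work is handed to awk.

```shell
#!/bin/bash
# avg_interval: hypothetical helper (not part of the original script).
# Prints the mean gap between consecutive timestamps given as arguments.
avg_interval() {
    # With fewer than two timestamps there is no interval to average.
    if [ "$#" -lt 2 ]; then
        echo "0.00"
        return
    fi
    # One timestamp per line, then let awk do the floating-point arithmetic.
    printf '%s\n' "$@" | awk '
        NR > 1 { sum += $1 - prev }   # gap to the previous timestamp
               { prev = $1 }
        END    { printf "%.2f\n", sum / (NR - 1) }
    '
}

avg_interval 121.0 224.0 318.0 461.0 573.0   # prints 113.00
```

The function does not care how many times are passed, so it could be called with the positional parameters of each pair, for example avg_interval "$@" after a set -- on the stored times.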
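#!/bin/bash
# Group the timestamps by "src:dst" pair (associative arrays need bash 4+).
declare -A t
while read -r tm f1 f2; do
    t["$f1:$f2"]+=" $tm"
done < "$1"

# Find the largest number of timestamps recorded for any pair.
max=0
for key in "${!t[@]}"; do
    set -- ${t[$key]}    # unquoted on purpose: split the list into $1, $2, ...
    [[ $# -gt $max ]] && max=$#
done

{
    printf "field1 field2 nbrepeated"
    for i in $(seq "$max"); do printf " %s" "time$i"; done
    echo " average_time beta"
    for key in "${!t[@]}"; do
        f1=${key%:*}
        f2=${key#*:}
        set -- ${t[$key]}
        f3=$(($# - 1))   # number of repetitions
        f4=$(($# - 1))   # placeholder: should become avg_time
        f5=1             # placeholder: should become beta
        printf "%d %d %d" "$f1" "$f2" "$f3"
        # Pad every row to $max time columns, using 0 for missing times.
        for i in $(seq "$max"); do
            printf " %.1f" "${1-0}"
            shift
        done
        printf " %.1f %.1f" "$f4" "$f5"
        echo ""
    done
} | column -t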
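Modifications that still need to be made:
- find the average time: avg_time
- find beta
P.S.: usually, to find an average, people do sum/NR, but that is not my problem here.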
Case solved: here is the output
field1 field2 nbrepeated time1 time2 time3 time4 time5 time6 time7 average_time beta
2 4 6 110.0 124.0 223.0 294.0 383.0 461.0 543.0 72.16 0.08
2 3 1 232.0 402.0 0.0 0.0 0.0 0.0 0.0 170.00 0.00
1 3 2 159.0 233.0 521.0 0.0 0.0 0.0 0.0 181.00 0.01
1 2 4 121.0 224.0 318.0 461.0 573.0 0.0 0.0 113.00 0.03
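For illustration only, values along these lines could be computed with a sketch like the one below. It is not the code that produced the output above: the function name summarize_pairs is hypothetical, it prints only the pair, nbrepeated, average_time and beta columns, and it assumes the input is sorted by time so that the sum of consecutive gaps reduces to the last time minus the first.

```shell
#!/bin/bash
# summarize_pairs: hypothetical helper (an assumption, not the original solution).
# Reads "time src dst" lines from the file named in $1, assumed sorted by time,
# and prints: src dst nbrepeated average_time beta
summarize_pairs() {
    awk '
        {
            key = $2 " " $3
            if (!(key in first)) first[key] = $1   # first time seen for the pair
            last[key] = $1                          # latest time seen so far
            count[key]++
        }
        END {
            for (key in count) {
                n    = count[key] - 1               # number of intervals
                avg  = (n > 0)   ? (last[key] - first[key]) / n : 0
                beta = (avg > 0) ? n / avg : 0
                printf "%s %d %.2f %.2f\n", key, n, avg, beta
            }
        }
    ' "$1" | sort -n -k1,1 -k2,2
}
```

It could be run as summarize_pairs data.txt, where data.txt is a placeholder file name. Since printf rounds to two decimals, the last digit may differ from the figures listed above, which look truncated (72.17 rather than 72.16, for instance).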
Follow-up question: http://stackoverflow.com/questions/6198882/awk-calculate-the-average-for-different-interval-of-time – daxim 2011-06-01 10:27:43
That is a bit more code than I would like; can you narrow the problem down and include what you have tried? – 2011-06-01 10:39:36
See also: http://stackoverflow.com/questions/6185305/awk-regroup-by-lines-pattern – ex001 2011-06-01 10:42:11