2017-09-11 56 views
-1

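Comparison of two data frames

I have an Excel sheet with 15,200 rows that holds a structural analysis of trees. There are 48 structures in total, and each one has been counted on every tree. For example, tree 12607 has 3 structures CV11, 1 structure IN12, and none (0) of the remaining structures. So the table is essentially a huge grid with a lot of 0s and a few counts of structures per tree. The last column is the value of the tree, which depends on the structures found on it (each structure, by its presence, gives the tree a certain number of points).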

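The question is: are there structures, or combinations of structures, that give a tree a high value? Of course, from the value of each structure we can already see which ones are worth more than others (for example, structure CV11 is worth 15 and structure IN12 is worth 4). What I want to know is this: if we take all trees with a final value above 100 (a new data frame "data100") and compare them with the trees whose final value is below 100 (another data frame "data0"), can we find a significant difference in the number and occurrence of the structures found on these trees? A structure with a high value might, after all, be found only on trees below 100, for example because that structure prevents other structures from occurring on the same tree.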

Voilà, I hope I have given enough detail... If you have any ideas or suggestions for solving this problem, that would be great!

Here is my script.

> data100 
     CV11 CV12 CV13 CV14 CV15 CV21 CV22 CV23 CV24 CV25 CV26 CV31 CV32 CV33 CV41 CV42 CV43 CV44 CV51 CV52 IN11 IN12 IN13 
1  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
2  0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
3  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 
4  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 
5  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 
6  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 
7  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
8  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
9  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
10  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
11  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
12  0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 
13  0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
14  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
15  0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
     IN14 IN21 IN22 IN23 IN31 IN32 IN33 IN34 BA11 BA12 BA21 DE11 DE12 DE13 DE14 DE15 GR11 GR12 GR13 GR21 GR22 GR31 GR32 
1  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
2  0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 
3  0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 
4  0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 
5  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
6  0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 
7  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
8  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
9  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
10  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
11  0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 2 0 0 0 0 0 
12  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 3 0 0 
13  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 3 0 0 
14  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 
15  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
     EP11 EP12 EP13 EP14 EP21 EP31 EP32 EP33 EP34 EP35 NE11 NE12 NE21 OT11 OT12 OT21 OT22 ecoval 
1  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0 
2  1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  56 
3  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  10 
4  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  10 
5  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  4 
6  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  24 
7  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0 
8  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0 
9  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0 
10  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  0 
11  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  18 
12  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  63 
13  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  77 
14  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  54 
15  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  20 
[ reached getOption("max.print") -- omitted 60749 rows ] 
> sortdata100 <- data100[order(data100$ecoval, decreasing = TRUE), ]  # sort trees by value, highest first

> rsortdata100<-sortdata100[sortdata100$ecoval>100,] 
> rsortdata100 <- na.omit(rsortdata100)  # 181 rows
> rsortdata100 
     CV11 CV12 CV13 CV14 CV15 CV21 CV22 CV23 CV24 CV25 CV26 CV31 CV32 CV33 CV41 CV42 CV43 CV44 CV51 CV52 IN11 IN12 IN13 
1291  0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
1083  0 4 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
3919  0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 
14685 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 
4021  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 
5452  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
14686 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 
4022  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 
1013  0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
2895  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
4719  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 
682  0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 
3444  0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
1299  0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 
2713  0 0 0 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 
     IN14 IN21 IN22 IN23 IN31 IN32 IN33 IN34 BA11 BA12 BA21 DE11 DE12 DE13 DE14 DE15 GR11 GR12 GR13 GR21 GR22 GR31 GR32 
1291  0 0 0 0 0 0 0 0 30 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
1083  3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
3919  0 0 1 0 2 0 0 0 2 0 0 0 3 0 0 0 0 0 0 11 0 0 0 
14685 0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
4021  0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
5452  0 0 1 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 
14686 0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 0 0 0 0 0 2 
4022  0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
1013  0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
2895  0 0 0 1 0 0 0 0 4 0 0 3 0 4 3 0 0 0 0 0 0 0 0 
4719  0 0 0 0 0 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
682  0 0 0 0 0 0 0 0 0 0 0 0 0 2 1 0 0 0 0 0 0 0 0 
3444  0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
1299  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 
2713  0 0 0 2 0 3 0 0 2 0 0 0 1 5 1 0 0 0 0 0 0 0 0 
     EP11 EP12 EP13 EP14 EP21 EP31 EP32 EP33 EP34 EP35 NE11 NE12 NE21 OT11 OT12 OT21 OT22 ecoval 
1291  0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1192 
1083  0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 424 
3919  1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 380 
14685 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 370 
4021  0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 358 
5452  0 0 0 0 0 0 1 0 0 11 0 0 0 0 1 0 0 356 
14686 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 354 
4022  0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 346 
1013  0 8 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 326 
2895  0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 325 
4719  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 324 
682  0 0 0 6 0 0 0 0 0 0 0 0 0 0 0 0 0 311 
3444  0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 306 
1299  0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 302 
2713  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 302 
[ reached getOption("max.print") -- omitted 166 rows ] 
> data0 <- sortdata100[sortdata100$ecoval < 100, ]  # note: trees with ecoval == 100 fall in neither group
> data0<-na.omit(data0) 
> data0 
     CV11 CV12 CV13 CV14 CV15 CV21 CV22 CV23 CV24 CV25 CV26 CV31 CV32 CV33 CV41 CV42 CV43 CV44 CV51 CV52 IN11 IN12 IN13 
4728  0 0 0 1 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 
5339  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 
11766 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
796  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
3561  0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 
10581 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 
10618 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 0 0 0 0 
14376 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 
14389 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 
790  0 0 0 1 0 0 0 0 1 0 0 2 0 0 0 0 0 0 0 0 1 0 0 
3974  0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 
4739  0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 1 0 0 0 0 0 0 
156  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
2740  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
2950  0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 
     IN14 IN21 IN22 IN23 IN31 IN32 IN33 IN34 BA11 BA12 BA21 DE11 DE12 DE13 DE14 DE15 GR11 GR12 GR13 GR21 GR22 GR31 GR32 
4728  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 
5339  1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 
11766 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 
796  1 1 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
3561  0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
10581 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 
10618 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 
14376 1 0 0 0 0 0 0 0 1 0 0 0 0 2 0 0 0 0 0 0 0 0 0 
14389 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 1 0 0 0 0 0 0 0 
790  0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 
3974  0 0 0 0 0 0 0 0 1 0 0 0 4 0 0 0 1 0 0 0 0 0 0 
4739  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
156  0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 
2740  0 0 0 0 0 0 0 0 0 0 0 0 0 6 2 0 0 0 0 0 0 0 0 
2950  0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
     EP11 EP12 EP13 EP14 EP21 EP31 EP32 EP33 EP34 EP35 NE11 NE12 NE21 OT11 OT12 OT21 OT22 ecoval 
4728  0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0  99 
5339  0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0  99 
11766 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1  99 
796  1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  98 
3561  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  98 
10581 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0  98 
10618 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0  98 
14376 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  98 
14389 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  98 
790  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  97 
3974  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  97 
4739  0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 1 0  97 
156  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  96 
2740  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0  96 
2950  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  96 
[ reached getOption("max.print") -- omitted 14984 rows ] 
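One possible way to approach the comparison asked about above is to test, structure by structure, whether presence of the structure differs between the high-value and low-value trees, and then correct for the number of tests. The sketch below is only an illustration under stated assumptions: the data frame `toy`, the cut-off `threshold`, and the value of 10 for DE13 are made up, and Fisher's exact test with a Benjamini-Hochberg correction is one reasonable choice, not the only one. Because `ecoval` is computed from the structures themselves, any such test is partly circular and should be read as descriptive.

```r
# Toy stand-in for the real table: counts of three structures plus ecoval.
# All counts are simulated; only the CV11 = 15 and IN12 = 4 weights come
# from the question, the DE13 weight is an assumption.
set.seed(1)
n <- 200
toy <- data.frame(
  CV11 = rpois(n, 0.2),
  IN12 = rpois(n, 0.5),
  DE13 = rpois(n, 0.3)
)
toy$ecoval <- 15 * toy$CV11 + 4 * toy$IN12 + 10 * toy$DE13

threshold <- 20  # hypothetical cut-off for the toy data; the question uses 100
high <- toy$ecoval > threshold
structures <- setdiff(names(toy), "ecoval")

# One Fisher exact test per structure: presence (count > 0) versus group.
# Fixing the factor levels guarantees a 2 x 2 table even if a cell is empty.
p_raw <- sapply(structures, function(s) {
  present <- factor(toy[[s]] > 0, levels = c(FALSE, TRUE))
  group   <- factor(high, levels = c(FALSE, TRUE))
  fisher.test(table(present, group))$p.value
})

# Share of trees carrying each structure in both groups, plus p-values
# adjusted for multiple testing (Benjamini-Hochberg).
result <- data.frame(
  structure  = structures,
  share_high = sapply(structures, function(s) mean(toy[[s]][high] > 0)),
  share_low  = sapply(structures, function(s) mean(toy[[s]][!high] > 0)),
  p_adjusted = p.adjust(p_raw, method = "BH")
)
print(result)
```

On the real table the same loop would run over all structure columns, with `threshold` set to 100. A rank-based test on the counts (for example `wilcox.test`) could replace the presence/absence test if the number of occurrences matters more than mere presence.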
+2

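Sorry, this is not clear to me either. Please read [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). This will make it much easier for others to help you. – zx8754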

Answers

0

Maybe something like this?

library(dplyr) 
data %>% group_by(ecoval > 100) %>% summarize_all(mean) 

which should give you the mean of every column, separately for the trees with ecoval > 100 and those with ecoval <= 100.
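For readers without dplyr, roughly the same summary can be sketched in base R with `aggregate()`. The data frame `toy` and the helper column `high` below are made up for illustration and are not part of the original data.

```r
# Made-up toy data: two structure counts and the tree value.
toy <- data.frame(
  CV11   = c(0, 1, 0, 2, 0, 1),
  IN12   = c(1, 0, 0, 3, 0, 0),
  ecoval = c(4, 15, 0, 150, 0, 120)
)

# Label each tree by whether its value exceeds 100, then average
# every other column within each label.
toy$high <- toy$ecoval > 100
group_means <- aggregate(. ~ high, data = toy, FUN = mean)
print(group_means)
```

The result has one row per group (FALSE and TRUE) and one column of means per structure, which matches the shape of the dplyr output.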

+0

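Thank you very much for your answer! I am not quite sure how to interpret the R output: what are the FALSE and TRUE rows? What do the means on the row named TRUE represent? –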

+0

'A tibble: 2 × 65
ecoval > 100 CV11 CV12 CV13 CV14 CV15 CV21 CV22 CV23 CV24
1 FALSE 0.00299880 0.003398641 0.0003332001 0..0005331201 0.005997601 0.00206584 0.003531921 0.00146608
2 TRUE 0.03314917 0.154696133 0.0441988950 0.535911602 0.0552486188 0.060773481 0.03867403 0.077348066 0.03867403' –

+0

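I grouped the rows by the condition 'ecoval > 100', so the row containing 'TRUE' summarizes the trees with 'ecoval > 100', while the row containing 'FALSE' summarizes the trees with 'ecoval <= 100' :) –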
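To make the meaning of the TRUE and FALSE rows concrete, here is a small sketch that computes the same two means by hand for a single column. The data frame `toy` and its values are invented for illustration.

```r
# Invented example: counts of one structure and the tree value.
toy <- data.frame(
  CV14   = c(0, 0, 1, 4, 0),
  ecoval = c(10, 99, 302, 326, 0)
)

is_high <- toy$ecoval > 100              # TRUE for trees above the threshold

mean_true  <- mean(toy$CV14[is_high])    # what the TRUE row reports for CV14
mean_false <- mean(toy$CV14[!is_high])   # what the FALSE row reports for CV14

cat("TRUE row:", mean_true, " FALSE row:", mean_false, "\n")
```

Each cell of the grouped summary is simply the average count of that structure over the trees in the corresponding group.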