2016-09-23
1

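Counting the number of zeros in a "melted" data frame

Hi, I'm learning R and trying to count how many zeros there are in melted data. Specifically, I want to know how many zeros fall under columns b and c, and print both results. Here is an example I put together: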

library(reshape) 
library(plyr) 
library(dplyr) 
id = c(1,2,3,4,5,6,7,8,9,10) 
b = c(0,0,5,6,3,7,2,8,1,8) 
c = c(0,4,9,87,0,87,0,4,5,0) 
test = data.frame(id,b,c) 
test_melt = melt(test, id.vars = "id") 
test_melt 

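I imagine I should write an if statement, something along the lines of if (test$value == 0) { print() }, but how do I tell R to count the zeros for each column once the columns have been melted?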

Answers

2

With your data:

test_melt %>% 
    group_by(variable) %>% 
    summarize(zeroes = sum(value == 0)) 
# # A tibble: 2 x 2 
# variable zeroes 
#  <fctr> <int> 
# 1  b  2 
# 2  c  4 

Base R:

aggregate(test_melt$value, by = list(variable = test_melt$variable), 
      FUN = function(x) sum(x == 0)) 
# variable x 
# 1  b 2 
# 2  c 4 
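
For comparison, here is a minimal sketch of two other base R routes to the same per-variable counts. It assumes the melted layout from the question (columns id, variable, value) and rebuilds that data frame by hand so the block runs without reshape; the names zeros_tapply and zeros_table are made up for illustration.

```r
# Rebuild the melted data by hand (assumption: same layout as
# melt(test, id.vars = "id") in the question).
test_melt <- data.frame(
  id = rep(1:10, times = 2),
  variable = factor(rep(c("b", "c"), each = 10)),
  value = c(0, 0, 5, 6, 3, 7, 2, 8, 1, 8,
            0, 4, 9, 87, 0, 87, 0, 4, 5, 0)
)

# value == 0 is a logical vector; summing it counts the TRUEs.
# tapply() applies that sum separately within each level of variable.
zeros_tapply <- tapply(test_melt$value == 0, test_melt$variable, sum)

# Cross-tabulate variable against the zero / non-zero flag and keep
# the column that holds the zero counts.
zeros_table <- table(test_melt$variable, test_melt$value == 0)[, "TRUE"]

print(zeros_tapply)
print(zeros_table)
```

Both rely on the same idea as the answers above: the comparison is vectorized, so no if statement or loop is needed.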

...and, out of curiosity, a quick benchmark:

library(microbenchmark) 
microbenchmark(
    dplyr = group_by(test_melt, variable) %>% summarize(zeroes = sum(value == 0)), 
    base1 = aggregate(test_melt$value, by = list(variable = test_melt$variable), FUN = function(x) sum(x == 0)), 
    # @PankajKaundal's suggested "formula" notation reads easier 
    base2 = aggregate(value ~ variable, test_melt, function(x) sum(x == 0)) 
) 
# Unit: microseconds 
# expr  min  lq  mean median  uq  max neval 
# dplyr 916.421 986.985 1069.7000 1022.1760 1094.7460 2272.636 100 
# base1 647.658 682.302 783.2065 715.3045 765.9940 1905.411 100 
# base2 813.219 867.737 950.3247 897.0930 959.8175 2017.001 100 

If the first option doesn't work for you, then something else is wrong. Your data works fine on my machine. Either way, glad to hear it's working. – r2evans

0
sum(test_melt$value==0) 

This should do it.


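It works, but it counts the total number of zeros, whereas I need to know how many zeros are in b and how many are in c. Should I use the unique() function? – marianess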
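
A minimal sketch of how the one-liner could be narrowed to per-column counts, assuming the example data from the question. unique() returns the distinct values rather than counts, so it probably isn't the tool here. The names zeros_wide, zeros_b and zeros_c are made up for illustration, and the melted frame is rebuilt by hand so the block runs without reshape.

```r
# Same example data as in the question, before melting.
test <- data.frame(
  id = 1:10,
  b = c(0, 0, 5, 6, 3, 7, 2, 8, 1, 8),
  c = c(0, 4, 9, 87, 0, 87, 0, 4, 5, 0)
)

# On the wide data: comparing to 0 gives a logical matrix, and
# colSums() counts the TRUEs in each column.
zeros_wide <- colSums(test[c("b", "c")] == 0)
print(zeros_wide)

# On the melted data (assumption: same layout as melt(test, id.vars = "id")):
# restrict value to one variable first, then count the zeros.
test_melt <- data.frame(
  id = rep(test$id, times = 2),
  variable = factor(rep(c("b", "c"), each = nrow(test))),
  value = c(test$b, test$c)
)
zeros_b <- sum(test_melt$value[test_melt$variable == "b"] == 0)
zeros_c <- sum(test_melt$value[test_melt$variable == "c"] == 0)
print(c(b = zeros_b, c = zeros_c))
```

The subsetting version gets repetitive with many variables, which is where the grouped approaches (group_by/summarize or aggregate) are more convenient.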

0

This might help. Is this what you're looking for?

> test_melt[4] <- 1
> test_melt2 <- aggregate(V4 ~ value + variable, test_melt, sum)
> test_melt2
   value variable V4
1      0        b  2
2      1        b  1
3      2        b  1
4      3        b  1
5      5        b  1
6      6        b  1
7      7        b  1
8      8        b  2
9      0        c  4
10     4        c  2
11     5        c  1
12     9        c  1
13    87        c  2

V4 is the count 

It doesn't count the zeros :( but it does look like the aggregate() function is the key – marianess


V4 is the count of the zeros, and of every other value, for variables b and c. –
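
To get from that full tally to just the zero counts, one option is to keep only the rows where value is 0. This is a sketch that assumes the same melted layout; the helper column is added by name (test_melt$V4) rather than by position, and zero_counts is a made-up name.

```r
# Melted data rebuilt by hand (assumption: same layout as
# melt(test, id.vars = "id") in the question).
test_melt <- data.frame(
  id = rep(1:10, times = 2),
  variable = factor(rep(c("b", "c"), each = 10)),
  value = c(0, 0, 5, 6, 3, 7, 2, 8, 1, 8,
            0, 4, 9, 87, 0, 87, 0, 4, 5, 0)
)

# Helper column of ones, so that summing it counts rows.
test_melt$V4 <- 1

# Tally every (value, variable) combination, as in the answer above.
test_melt2 <- aggregate(V4 ~ value + variable, test_melt, sum)

# Keep only the rows that describe zeros: one row per variable.
zero_counts <- subset(test_melt2, value == 0)
print(zero_counts)
```

The V4 column of zero_counts then holds the number of zeros for b and for c.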