Percentage per group in dplyr

2017-08-02

How can I compute, in dplyr, the percentage that a value of one column makes up within each group of another column?

df contains the following records:

    A target
    a 1
    b 0
    a 0
    a 1

This takes care of the first part:

df %>% 
    group_by(A) %>% 
    summarise(n = n())

And this is my attempt at the second part:

df %>% 
    group_by(A, target) %>% 
    summarise (n = n(), target_sum = sum(target))%>% 
    filter(target == 1) %>% 
    mutate(freq = n/target_sum)

But the quotient is not taken from the right values.

In Python/pandas,

grouped = df_original.groupby(['A', 'target']).size() 
df = (grouped/grouped.groupby(level=0).sum()) 
grouped = df.reset_index(name='percentageA') 
groupedOnly = grouped[grouped.target == 1]

would perform the desired calculation and give this result:

a 1 0.666667

2017-08-02 Georg Heiler

Answers

Using table with prop.table seems overly complicated. Try:

df %>% 
    group_by(A) %>% 
    summarise (mean(target)) 

# A tibble: 2 x 2 
#  A `mean(target)` 
#  <fctr>   <dbl> 
# 1  a  0.6666667 
# 2  b  0.0000000

2017-08-02 17:43:52 Alex

We can do this in base R:

prop.table(table(df), 1)[,2] 
#         a         b 
# 0.6666667 0.0000000

2017-08-02 17:40:36 akrun

Here is one way that lets you see how the data flows, but I prefer Alex's solution for its efficiency.

df <- tribble(
    ~A , ~target, 
    "a" , 1, 
    "b" , 0, 
    "a" , 0, 
    "a" , 1 
) 


df %>% 
    group_by(A) %>% 
    mutate(n = n()) %>% 
    group_by(A,target,n) %>% 
    mutate(n_target = n()) %>% 
    mutate(freq = n_target/n) %>% 
    filter(target==1) %>% 
    ungroup() %>% 
    distinct(A,target,freq)
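
The same per-group share can also be computed from counts, much like the pandas version in the question. This is only a minimal sketch, assuming the four-row df shown above; the use of count() and the name result are illustrative choices, not taken from the original answers.

```r
library(dplyr)

# Same four records as in the question (rebuilt here so the sketch is self-contained)
df <- tibble::tribble(
    ~A,  ~target,
    "a", 1,
    "b", 0,
    "a", 0,
    "a", 1
)

result <- df %>%
    count(A, target) %>%           # one row per (A, target) pair, count stored in n
    group_by(A) %>%
    mutate(freq = n / sum(n)) %>%  # share of each target value within its A group
    ungroup() %>%
    filter(target == 1)            # keep only the rows for target == 1

print(result)
```

With this data, result has a single row for group a with freq equal to 2/3. Unlike mean(target), group b does not appear, because it has no rows with target == 1. The count-based form may be preferable when target can take more than two values.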

2017-08-02 17:58:20
