2017-02-14 62 views

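String input to dplyr group_by

I need to understand how to pass string values to dplyr's group_by function (NSE). My dataset and code work fine with "group_by", but not with the "group_by_" version. I cannot find my mistake.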

ID,Region,Dimension,BlogsInd.,BlogsNews,BlogsTech,Columns 
1,PK,Dim1,-4.75,NA,NA,NA 
2,PK,Dim1,-5.69,NA,NA,NA 
3,PK,Dim1,-0.27,NA,NA,NA 
4,PK,Dim1,-2.76,NA,NA,NA 
5,PK,Dim1,-8.24,NA,NA,NA 
6,PK,Dim1,-12.51,NA,NA,NA 
7,PK,Dim1,-1.28,NA,NA,NA 
8,PK,Dim1,0.95,NA,NA,NA 
9,PK,Dim1,-5.96,NA,NA,NA 
10,PK,Dim1,-8.81,NA,NA,NA 
11,PK,Dim1,-8.46,NA,NA,NA 
12,PK,Dim1,-6.15,NA,NA,NA 
13,PK,Dim1,-13.98,NA,NA,NA 
14,PK,Dim1,-16.43,NA,NA,NA 
15,PK,Dim1,-4.09,NA,NA,NA 
16,PK,Dim1,-11.06,NA,NA,NA 
17,PK,Dim1,-9.04,NA,NA,NA 
18,PK,Dim1,-8.56,NA,NA,NA 
19,PK,Dim1,-8.13,NA,NA,NA 
20,PK,Dim2,-14.46,NA,NA,NA 
21,PK,Dim2,-4.21,NA,NA,NA 
22,PK,Dim2,-4.96,NA,NA,NA 
23,PK,Dim2,-5.48,NA,NA,NA 
24,PK,Dim2,-4.53,NA,NA,NA 
25,PK,Dim2,6.31,NA,NA,NA 
26,PK,Dim2,-11.16,NA,NA,NA 
27,PK,Dim2,-1.27,NA,NA,NA 
28,PK,Dim2,-11.49,NA,NA,NA 
29,PK,Dim2,-0.9,NA,NA,NA 
30,PK,Dim2,-12.27,NA,NA,NA 
31,PK,Dim2,6.85,NA,NA,NA 
32,PK,Dim2,-5.21,NA,NA,NA 
33,PK,Dim2,-1.06,NA,NA,NA 
34,PK,Dim2,-2.6,NA,NA,NA 
35,PK,Dim2,-0.95,NA,NA,NA 
36,PK,Dim3,-0.82,NA,NA,NA 
37,PK,Dim3,-7.65,NA,NA,NA 
38,PK,Dim3,0.64,NA,NA,NA 
39,PK,Dim3,-2.25,NA,NA,NA 
40,PK,Dim3,-1.58,NA,NA,NA 
41,PK,Dim3,-5.73,NA,NA,NA 
42,PK,Dim3,0.37,NA,NA,NA 
43,PK,Dim3,-5.46,NA,NA,NA 
44,PK,Dim3,-3.48,NA,NA,NA 
45,PK,Dim3,0.88,NA,NA,NA 
46,PK,Dim3,-2.11,NA,NA,NA 
47,PK,Dim3,-10.13,NA,NA,NA 
48,PK,Dim3,-2.08,NA,NA,NA 
49,PK,Dim3,-4.33,NA,NA,NA 
50,PK,Dim3,1.09,NA,NA,NA 
51,PK,Dim3,-4.23,NA,NA,NA 
52,PK,Dim3,-1.46,NA,NA,NA 
53,PK,Dim3,9.37,NA,NA,NA 
54,PK,Dim3,5.84,NA,NA,NA 
55,PK,Dim3,8.21,NA,NA,NA 
56,PK,Dim3,7.34,NA,NA,NA 
57,PK,Dim4,1.83,NA,NA,NA 
58,PK,Dim4,14.39,NA,NA,NA 
59,PK,Dim4,22.02,NA,NA,NA 
60,PK,Dim4,4.83,NA,NA,NA 
61,PK,Dim4,-3.24,NA,NA,NA 
62,PK,Dim4,-5.69,NA,NA,NA 
63,PK,Dim4,-22.92,NA,NA,NA 
64,PK,Dim4,0.41,NA,NA,NA 
65,PK,Dim4,-4.42,NA,NA,NA 
66,PK,Dim4,-10.72,NA,NA,NA 
67,PK,Dim4,-11.29,NA,NA,NA 
68,PK,Dim4,-2.89,NA,NA,NA 
69,PK,Dim4,-7.59,NA,NA,NA 
70,PK,Dim4,-7.45,NA,NA,NA 
71,US,Dim1,-12.49,NA,NA,NA 
72,US,Dim1,-11.59,NA,NA,NA 
73,US,Dim1,-4.6,NA,NA,NA 
74,US,Dim1,-22.83,NA,NA,NA 
75,US,Dim1,-4.83,NA,NA,NA 
76,US,Dim1,-14.76,NA,NA,NA 
77,US,Dim1,-15.93,NA,NA,NA 
78,US,Dim1,-2.78,NA,NA,NA 
79,US,Dim1,-16.39,NA,NA,NA 
80,US,Dim1,-15.22,NA,NA,NA 
81,US,Dim1,3.25,NA,NA,NA 
82,US,Dim1,-2.73,NA,NA,NA 
83,US,Dim1,0.96,NA,NA,NA 
84,US,Dim1,-1.12,NA,NA,NA 
85,US,Dim1,-0.33,NA,NA,NA 
86,US,Dim1,-6.45,NA,NA,NA 
87,US,Dim1,2.52,NA,NA,NA 
88,US,Dim1,3.18,NA,NA,NA 
89,US,Dim1,4.65,NA,NA,NA 
90,US,Dim2,-1.75,NA,NA,NA 
91,US,Dim2,-0.22,NA,NA,NA 
92,US,Dim2,8.16,NA,NA,NA 
93,US,Dim2,1.89,NA,NA,NA 
94,US,Dim2,4.31,NA,NA,NA 
95,US,Dim2,-0.41,NA,NA,NA 
96,US,Dim2,-23.02,NA,NA,NA 
97,US,Dim2,3.87,NA,NA,NA 
98,US,Dim2,-4.76,NA,NA,NA 
99,US,Dim2,4.95,NA,NA,NA 
100,US,Dim2,4.78,NA,NA,NA 
101,US,Dim2,-15.11,NA,NA,NA 
102,US,Dim2,-3.74,NA,NA,NA 
103,US,Dim2,-6.15,NA,NA,NA 
104,US,Dim2,-8.33,NA,NA,NA 
105,US,Dim2,-5.55,NA,NA,NA 
106,US,Dim3,-5.1,NA,NA,NA 
107,US,Dim3,-0.41,NA,NA,NA 
108,US,Dim3,-8,NA,NA,NA 
109,US,Dim3,-11.8,NA,NA,NA 
110,US,Dim3,-10.39,NA,NA,NA 
111,US,Dim3,-14.98,NA,NA,NA 
112,US,Dim3,-13.14,NA,NA,NA 
113,US,Dim3,-16.06,NA,NA,NA 
114,US,Dim3,-16.75,NA,NA,NA 
115,US,Dim3,-17.58,NA,NA,NA 
116,US,Dim3,-13.12,NA,NA,NA 
117,US,Dim3,-15.69,NA,NA,NA 
118,US,Dim3,-9.29,NA,NA,NA 
119,US,Dim3,-14.93,NA,NA,NA 
120,US,Dim3,-18.75,NA,NA,NA 
121,US,Dim3,-16.15,NA,NA,NA 
122,US,Dim3,-14.38,NA,NA,NA 
123,US,Dim3,-11.33,NA,NA,NA 
124,US,Dim3,2.06,NA,NA,NA 
125,US,Dim3,1.55,NA,NA,NA 
126,US,Dim3,3.17,NA,NA,NA 
127,US,Dim4,3.33,NA,NA,NA 
128,US,Dim4,-3.31,NA,NA,NA 
129,US,Dim4,5.67,NA,NA,NA 
130,US,Dim4,-1.94,NA,NA,NA 
131,US,Dim4,-4.2,NA,NA,NA 
132,US,Dim4,-13.53,NA,NA,NA 
133,US,Dim4,-10.84,NA,NA,NA 
134,US,Dim4,-1.04,NA,NA,NA 
135,US,Dim4,-8.02,NA,NA,NA 
136,US,Dim4,-14.65,NA,NA,NA 
137,US,Dim4,-6.39,NA,NA,NA 
138,US,Dim4,-3.69,NA,NA,NA 
139,US,Dim4,-11.62,NA,NA,NA 
140,US,Dim4,-3.02,NA,NA,NA 
141,US,Dim4,-28.84,NA,NA,NA 

attach(dims_Blog) 
d1 <- dims_Blog %>% group_by(Dimension, Region) %>% summarise(mean=mean(BlogsInd., na.rm=TRUE)) 
d1 
Dimension Region  mean 
    <fctr> <fctr>  <dbl> 
1  Dim1  PK -3.7385551 
2  Dim1  US -4.2264179 
3  Dim2  PK 1.9985551 
4  Dim2  US 1.3509577 
5  Dim3  PK 0.8965019 
6  Dim3  US 1.5335199 
7  Dim4  PK 1.4830672 
8  Dim4  US 0.3913806 

But the same code written with the other version does not work. Where am I going wrong?

d1 <- dims_Blog %>% group_by_("Dimension", "Region") %>% summarise_(mean="mean(BlogsInd.)", na.rm=TRUE) 
> d1 
Source: local data frame [8 x 4] 
Groups: Dimension [?] 

    Dimension Region mean na.rm 
    <fctr> <fctr> <dbl> <lgl> 
1  Dim1  PK NA TRUE 
2  Dim1  US NA TRUE 
3  Dim2  PK NA TRUE 
4  Dim2  US NA TRUE 
5  Dim3  PK NA TRUE 
6  Dim3  US NA TRUE 
7  Dim4  PK NA TRUE 
8  Dim4  US NA TRUE 

The idea is that you can pass names stored in an object, rather than the quoted version. – akrun


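Yes, I understand that. I am using it inside a function (so I pass the string names of the relevant columns to group_by_). But as you can see above, it does not work for me. I need to know where the mistake is in my use of "group_by_". –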


You may need to use 'group_by_(.dots =' – akrun

Answer


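The problem is not NSE but the na.rm argument. In the first example you pass this argument to mean; in the second it sits outside the mean call, so summarise_ interprets it as a new variable to add. That is why the output has an extra na.rm column filled with TRUE, and why mean is NA wherever a group contains missing values. After moving na.rm into the mean call, I get the same result from both approaches: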

d1 <- dims_Blog %>% group_by(Dimension, Region) %>% summarise(mean=mean(BlogsInd., na.rm=TRUE)) 
d2 <- dims_Blog %>% group_by_("Dimension", "Region") %>% summarise_(mean="mean(BlogsInd., na.rm=TRUE)") 
identical(d1,d2) #Returns TRUE 
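
To make the effect of the argument placement easier to see, here is a minimal sketch on a small made-up data frame (the names toy, g and x are illustrative and not from the question). It uses the plain summarise verb, which treats a stray named argument the same way: as a new column.

```r
library(dplyr)

# Hypothetical toy data: group "a" contains one missing value.
toy <- data.frame(g = c("a", "a", "b"), x = c(1, NA, 3))

# na.rm sits outside mean(): summarise() reads it as a new column called
# na.rm, and mean(x) still sees the NA in group "a".
outside <- toy %>%
  group_by(g) %>%
  summarise(mean = mean(x), na.rm = TRUE)

# na.rm sits inside mean(): missing values are dropped before averaging.
inside <- toy %>%
  group_by(g) %>%
  summarise(mean = mean(x, na.rm = TRUE))

names(outside)  # "g" "mean" "na.rm"
names(inside)   # "g" "mean"
```

In outside, the mean for group "a" is NA and there is an extra logical column; in inside, the mean for group "a" is 1.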

My problem is that I pass the field name into a function as a string object. See my last comment above: with 'y = "BlogsInd."', calling 'summarise_(mean = "mean(y, na.rm = TRUE)")' gives an object not found error for y. –


That is a different problem. You need to build the string from the variable's value, so 'summarise_(mean = paste("mean(", y, ", na.rm = TRUE)"))' should work. –


Thank you very much. That is the solution I have been looking for for days. R is not easy. –
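
For readers wrapping this in a function, the following sketch first shows the string that the suggested paste-style call builds. It then shows one possible alternative for recent dplyr (1.0.0 or later, where the underscore verbs are deprecated). The alternative is not what the answer above uses. The helper name mean_by and the toy data are assumptions made for illustration.

```r
library(dplyr)

# The string the comment's approach builds from a column name held in y.
y <- "BlogsInd."
expr_string <- paste0("mean(", y, ", na.rm = TRUE)")
expr_string  # "mean(BlogsInd., na.rm = TRUE)"

# Hypothetical helper: column names arrive as strings. across(all_of())
# selects the grouping columns and .data[[ ]] looks up the value column.
mean_by <- function(data, group_cols, value_col) {
  data %>%
    group_by(across(all_of(group_cols))) %>%
    summarise(mean = mean(.data[[value_col]], na.rm = TRUE), .groups = "drop")
}

# Made-up rows that reuse the question's column names.
toy <- data.frame(
  Region = c("PK", "PK", "US", "US"),
  Dimension = c("Dim1", "Dim1", "Dim1", "Dim1"),
  BlogsInd. = c(-4, NA, 2, 4)
)

res <- mean_by(toy, c("Dimension", "Region"), "BlogsInd.")
res
```

With the toy rows above, res has one row per Dimension and Region pair, with means of -4 for PK and 3 for US.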