2017-10-11

How to convert a factor column of scientific-notation numbers to numeric in a data frame using dplyr

I have the following data frame:

library(tidyverse) 
df <- structure(list(rank = structure(c(1L, 10L, 11L, 12L, 13L, 14L, 
15L, 16L, 17L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L), .Label = c("1", 
"10", "11", "12", "13", "14", "15", "16", "17\n*", "2", "3", 
"4", "5", "6", "7", "8", "9"), class = "factor"), p_value = structure(c(2L, 
5L, 17L, 16L, 13L, 12L, 11L, 10L, 9L, 8L, 4L, 3L, 14L, 7L, 6L, 
1L, 15L), .Label = c("1e-12", "1e-12262", "1e-164", "1e-176", 
"1e-2381", "1e-26", "1e-27", "1e-274", "1e-369", "1e-397", "1e-413", 
"1e-422", "1e-429", "1e-57", "1e-6", "1e-855", "1e-919"), class = "factor")), row.names = c(NA, 
-17L), class = c("tbl_df", "tbl", "data.frame"), .Names = c("rank", 
"p_value")) 

df looks like this:

# A tibble: 17 x 2
      rank  p_value
    <fctr>   <fctr>
 1       1 1e-12262
 2       2  1e-2381
 3       3   1e-919
 4       4   1e-855
 5       5   1e-429
 6       6   1e-422
 7       7   1e-413
 8       8   1e-397
 9       9   1e-369
10      10   1e-274
11      11   1e-176
12      12   1e-164
13      13    1e-57
14      14    1e-27
15      15    1e-26
16      16    1e-12
17 "17\n*"     1e-6

My question is how to convert the type of the p_value column from fctr to numeric, so that I can perform mathematical operations on it.

I tried the following, but it fails with an error:

> df %>% mutate(logp = log(p_value)) 
Error in mutate_impl(.data, dots) : 
    Evaluation error: ‘log’ not meaningful for factors. 

Answer


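You can convert these to numbers as shown below. You first need to convert the factor to character before converting it to numeric; otherwise you just get the integer codes of the factor levels.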
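As a minimal sketch of why the as.character() step matters, consider a small factor f. It is hypothetical and for illustration only (not the question's data), and its levels are set explicitly so that the level codes are unambiguous:

```r
# Hypothetical factor for illustration; levels given explicitly
f <- factor(c("1e-12", "1e-6", "1e-26"),
            levels = c("1e-12", "1e-26", "1e-6"))

# as.numeric() on a factor returns the integer level codes,
# i.e. the position of each value within levels(f)
codes <- as.numeric(f)

# Converting to character first parses the labels themselves
values <- as.numeric(as.character(f))

# An equivalent base R idiom: parse the levels once, then index by the factor
values2 <- as.numeric(levels(f))[f]

print(codes)
print(values)
```

Here codes holds the level positions (1, 3, 2), while values and values2 hold the actual numbers. Applied to the question's data frame, the conversion is: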

df %>% mutate(logp = log(as.numeric(as.character(p_value)))) 

# A tibble: 17 x 3
      rank  p_value       logp
    <fctr>   <fctr>      <dbl>
 1       1 1e-12262       -Inf
 2       2  1e-2381       -Inf
 3       3   1e-919       -Inf
 4       4   1e-855       -Inf
 5       5   1e-429       -Inf
 6       6   1e-422       -Inf
 7       7   1e-413       -Inf
 8       8   1e-397       -Inf
 9       9   1e-369       -Inf
10      10   1e-274 -630.90832
11      11   1e-176 -405.25498
12      12   1e-164 -377.62396
13      13    1e-57 -131.24735
14      14    1e-27  -62.16980
15      15    1e-26  -59.86721
16      16    1e-12  -27.63102
17 "17\n*"     1e-6  -13.81551
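
The -Inf values in the first nine rows appear because numbers such as 1e-369 are smaller than the smallest positive double (about 4.9e-324), so as.numeric() returns 0 and log(0) is -Inf. This goes beyond the original answer, but if finite logarithms are needed, one possible approach is to compute the log from the text of the number without ever forming the tiny value. The sketch below assumes every label has the form "<mantissa>e<exponent>"; df_small and the helper column names are hypothetical and used only for illustration:

```r
library(dplyr)

# Hypothetical subset of the p-values, stored as a factor as in the question
df_small <- tibble(p_value = factor(c("1e-12262", "1e-369", "1e-274", "1e-6")))

df_log <- df_small %>%
  mutate(
    p_chr    = as.character(p_value),
    # Split e.g. "1e-369" into mantissa "1" and exponent "-369"
    mantissa = as.numeric(sub("e.*$", "", p_chr)),
    exponent = as.numeric(sub("^.*e", "", p_chr)),
    # log(m * 10^k) = log(m) + k * log(10), which stays finite
    logp     = log(mantissa) + exponent * log(10)
  )

print(df_log)
```

For values that fit in a double, such as 1e-274, this should agree with log(as.numeric(as.character(p_value))); for the smaller ones it gives a finite result instead of -Inf.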