2017-08-24 79 views
5

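Add a column if it does not exist

I have a bunch of data frames with different variables. I want to read them into R and add columns to the ones that are missing some variables, so that they all share a common set of standard variables, even if some of those variables are unobserved.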

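In other words: is there a way in the tidyverse to add a column of NAs when the column does not exist? My current attempt works for adding a new variable when the column does not exist (top_speed), but fails when the column already exists (mpg): it sets every observation to the first value, that of the Mazda RX4.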

library(tidyverse) 
mtcars %>% 
    tbl_df() %>% 
    rownames_to_column("car") %>% 
    mutate(top_speed = ifelse("top_speed" %in% names(.), top_speed, NA), 
     mpg = ifelse("mpg" %in% names(.), mpg, NA)) %>% 
    select(car, top_speed, mpg, everything()) 

# # A tibble: 32 x 13 
#     car top_speed mpg cyl disp hp drat wt qsec vs am gear carb 
#    <chr>  <lgl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> 
# 1   Mazda RX4  NA 21  6 160.0 110 3.90 2.620 16.46  0  1  4  4 
# 2  Mazda RX4 Wag  NA 21  6 160.0 110 3.90 2.875 17.02  0  1  4  4 
# 3  Datsun 710  NA 21  4 108.0 93 3.85 2.320 18.61  1  1  4  1 
# 4 Hornet 4 Drive  NA 21  6 258.0 110 3.08 3.215 19.44  1  0  3  1 
# 5 Hornet Sportabout  NA 21  8 360.0 175 3.15 3.440 17.02  0  0  3  2 
# 6   Valiant  NA 21  6 225.0 105 2.76 3.460 20.22  1  0  3  1 
# 7  Duster 360  NA 21  8 360.0 245 3.21 3.570 15.84  0  0  3  4 
# 8   Merc 240D  NA 21  4 146.7 62 3.69 3.190 20.00  1  0  4  2 
# 9   Merc 230  NA 21  4 140.8 95 3.92 3.150 22.90  1  0  4  2 
# 10   Merc 280  NA 21  6 167.6 123 3.92 3.440 18.30  1  0  4  4 

Answers

3

We can write a helper function that creates any missing columns:

fncols <- function(data, cname) { 
    add <-cname[!cname%in%names(data)] 

    if(length(add)!=0) data[add] <- NA 
    data 
} 
fncols(mtcars, "mpg") 
fncols(mtcars, c("topspeed","nhj","mpg")) 
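
As a rough sketch of how this helper might be applied to the question's use case (several data frames, one standard set of variables), the snippet below pads each data frame and then puts the columns in a common order. The names `standard_vars`, `dfs`, `padded`, and `aligned` are hypothetical and chosen only for illustration; `fncols` is repeated so the sketch runs on its own.

```r
# Helper from the answer above, repeated so this sketch is self-contained
fncols <- function(data, cname) {
  add <- cname[!cname %in% names(data)]
  if (length(add) != 0) data[add] <- NA
  data
}

# Hypothetical standard variable set (an assumption for this example)
standard_vars <- c("mpg", "cyl", "top_speed")

# Two example data frames that are each missing different variables
dfs <- list(
  a = mtcars[1:3, c("mpg", "cyl")],
  b = mtcars[4:6, c("cyl", "hp")]
)

# Add the missing columns as NA, then keep only the standard variables,
# in the standard order (this drops extras such as hp)
padded  <- lapply(dfs, fncols, cname = standard_vars)
aligned <- lapply(padded, function(d) d[standard_vars])
```

Because every element of `aligned` then has the same columns in the same order, the pieces can be stacked with `do.call(rbind, aligned)` if a single data frame is wanted.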
+1

Thanks very much... I think this is the fastest-running one so far... I will be using it in a chain with multiple (~15) variables... `mtcars %>% fncols("mpg") %>% fncols("top_speed")` – gjabel

+1

Rather than using it in a chain, see the edited version – Onyambu

+0

@Onyambu Thanks for the edit. This is more general – akrun

1

You can use the rowwise function like this:

library(tidyverse) 
mtcars %>% 
    tbl_df() %>% 
    rownames_to_column("car") %>% 
    rowwise() %>% 
    mutate(top_speed = ifelse("top_speed" %in% names(.), top_speed, NA), 
     mpg = ifelse("mpg" %in% names(.), mpg, NA)) %>% 
    select(car, top_speed, mpg, everything()) 
0

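You can bind the columns of the new data.frame to a dummy, complete data.frame filled with NA, rename the duplicated columns, and then select only the original names.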

# your default complete vector of col names 
standard.variables = names(mtcars) 
# prep 
default=mtcars %>% mutate_all(.funs=function(x) NA) 
# treat with a data.frame missing 3 columns 
test=mtcars %>% select(-mpg, -disp, -am) 
bind_cols(test, default) %>% setNames(make.names(names(.), unique=TRUE)) %>% 
    select_(.dots=standard.variables) %>% head(2) 
#### mpg cyl disp hp drat wt qsec vs am gear carb 
#### 1 NA 6 NA 110 3.9 2.620 16.46 0 NA 4 4 
#### 2 NA 6 NA 110 3.9 2.875 17.02 0 NA 4 4 
1

Try the following:

library(tidyverse) 

mtcars %>% 
    tbl_df() %>% 
    rownames_to_column("car") %>% 
    mutate(top_speed = if ("top_speed" %in% names(.)){return(top_speed)}else{return(NA)}, 
     mpg = if ("mpg" %in% names(.)){return(mpg)}else{return(NA)}) %>% 
    select(car, top_speed, mpg, everything()) 
# A tibble: 32 x 13 
       car top_speed mpg cyl disp hp drat wt qsec vs am gear carb 
       <chr>  <lgl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> 
1   Mazda RX4  NA 21.0  6 160.0 110 3.90 2.620 16.46  0  1  4  4 
2  Mazda RX4 Wag  NA 21.0  6 160.0 110 3.90 2.875 17.02  0  1  4  4 
3  Datsun 710  NA 22.8  4 108.0 93 3.85 2.320 18.61  1  1  4  1 
4 Hornet 4 Drive  NA 21.4  6 258.0 110 3.08 3.215 19.44  1  0  3  1 
5 Hornet Sportabout  NA 18.7  8 360.0 175 3.15 3.440 17.02  0  0  3  2 
6   Valiant  NA 18.1  6 225.0 105 2.76 3.460 20.22  1  0  3  1 
7  Duster 360  NA 14.3  8 360.0 245 3.21 3.570 15.84  0  0  3  4 
8   Merc 240D  NA 24.4  4 146.7 62 3.69 3.190 20.00  1  0  4  2 
9   Merc 230  NA 22.8  4 140.8 95 3.92 3.150 22.90  1  0  4  2 
10   Merc 280  NA 19.2  6 167.6 123 3.92 3.440 18.30  1  0  4  4 
# ... with 22 more rows 

I think ifelse() does not inherit the class of the object it is given.
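
The behaviour seen in the question can be reproduced outside of `mutate()`. The base R documentation states that `ifelse()` returns a value with the same shape as its `test` argument, so a length-one test such as `"mpg" %in% names(.)` yields a length-one result, which is then recycled down the column. A plain `if`/`else` returns the whole vector. The sketch below uses a made-up vector `x` and the hypothetical names `from_ifelse` and `from_if` purely for illustration.

```r
# Stand-in for a column such as mpg (values are illustrative)
x <- c(21.0, 22.8, 21.4)

# The test has length one, so ifelse() returns a single element of x
from_ifelse <- ifelse(TRUE, x, NA)

# if/else evaluates to whichever branch is chosen, unchanged
from_if <- if (TRUE) x else NA

length(from_ifelse)  # 1
length(from_if)      # 3
```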

2

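If you have an empty data frame containing all the names you want to check for, you can use bind_rows to add the columns.

I used purrr::map_dfr to build the empty tibble with the appropriate column names.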

columns = c("top_speed", "mpg") %>% 
    map_dfr(~tibble(!!.x := logical())) 

# A tibble: 0 x 2 
# ... with 2 variables: top_speed <lgl>, mpg <lgl> 

bind_rows(columns, mtcars) 

# A tibble: 32 x 12 
    top_speed mpg cyl disp hp drat wt qsec vs am gear carb 
     <lgl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> 
1  NA 21.0  6 160.0 110 3.90 2.620 16.46  0  1  4  4 
2  NA 21.0  6 160.0 110 3.90 2.875 17.02  0  1  4  4 
3  NA 22.8  4 108.0 93 3.85 2.320 18.61  1  1  4  1 