2017-07-28

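How to resolve the error in label_eurostat(): "Dictionary information is missing"

I regularly download a dataset from Eurostat with the eurostat package for R and label it with the function label_eurostat(). The code below worked fine in the past, but it has been giving me an error since this week: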

> emprt <- get_eurostat("lfst_r_lfe2emprt", time_format = "num") 
> emprt <- filter(emprt, sex == "T", age == "Y15-64", geo %in% c("AT", "DE", "FR")) 
> emprt <- dcast(emprt, geo ~ time) 
Using values as value column: use value.var to override. 
> emprt <- label_eurostat(emprt, lang = "de") 
Error in label_eurostat(emprt, lang = "de") : 
Dictionary information is missing 

I also tried specifying the dictionary explicitly, but then received a different warning message:

> emprt <- label_eurostat(emprt, dic = "geo", lang = "de") 
Warning message: 
In label_eurostat(emprt, dic = "geo", lang = "de") : 
    All labels for geo were not found. 

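I am not sure whether this is the right dictionary to choose, but it is the only one I found at Eurostat. I have also seen that other people have had problems with this function that produce an error like this: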

Error in `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels, : 
factor level [19] is duplicated 

But I am not sure whether this is related to my problem. I am thankful for any hint!

Answer

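You could use either of the following: reshape first and then label only the geo column, or label the long data first and then reshape.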

packageVersion("eurostat") 
# [1] ‘3.1.1’ 
library(eurostat) 
library(tidyverse) 
library(reshape2) 

# Option 1: reshape first, drop unused factor levels, then label only geo 
get_eurostat("lfst_r_lfe2emprt", time_format = "num") %>% 
    filter(sex == "T", age == "Y15-64", geo %in% c("AT", "DE", "FR")) %>% 
    dcast(geo ~ time, value.var = "values") %>% 
    droplevels() %>% 
    mutate(geo = label_eurostat(geo, dic = "geo", lang = "de")) 

# Option 2: label the long data first, then reshape 
get_eurostat("lfst_r_lfe2emprt", time_format = "num") %>% 
    filter(sex == "T", age == "Y15-64", geo %in% c("AT", "DE", "FR")) %>% 
    label_eurostat(lang = "de") %>% 
    dcast(geo ~ time, value.var = "values") 

As for the warning: if you do not drop the unused geo factor levels, label_eurostat may assign duplicate labels. Consider, for example:

get_eurostat("lfst_r_lfe2emprt", time_format = "num") %>% 
    pull(geo) %>% 
    levels %>% 
    grep(pattern = "^DE3", value = TRUE) 
# [1] "DE3" "DE30" 

If you now look at get_eurostat_dic("geo"), both DE3 and DE30 map to Berlin:

get_eurostat_dic("geo") %>% filter(grepl("^DE30?$", code_name)) 
# # A tibble: 2 x 2 
# code_name full_name 
#  <chr>  <chr> 
# 1  DE3 Berlin 
# 2  DE30 Berlin 
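
The effect of dropping unused levels can be reproduced offline with a toy dictionary. This is only a sketch in base R and does not call the eurostat package: the DE3 and DE30 rows are copied from the output above, while the AT row and the names `dic`, `geo`, `labels_all` and `labels_used` are made up for illustration.

```r
# Toy stand-in for get_eurostat_dic("geo"); the AT row is a made-up example
dic <- data.frame(
  code_name = c("AT", "DE3", "DE30"),
  full_name = c("Austria", "Berlin", "Berlin"),
  stringsAsFactors = FALSE
)

# After filtering, only "AT" is left, but the factor still carries all levels
geo <- factor("AT", levels = c("AT", "DE3", "DE30"))

# Labels looked up for every level, whether it is used or not
labels_all <- dic$full_name[match(levels(geo), dic$code_name)]
has_dup_before <- anyDuplicated(labels_all) > 0

# After droplevels() only the levels that actually occur are looked up
labels_used <- dic$full_name[match(levels(droplevels(geo)), dic$code_name)]
has_dup_after <- anyDuplicated(labels_used) > 0

print(c(before = has_dup_before, after = has_dup_after))
```

With all levels kept, "Berlin" is looked up twice; once the unused levels are dropped, the remaining labels are unique.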

Side note: you do not need reshape2::dcast if you have the tidyverse loaded; you could use select(geo, time, values) %>% spread(time, values) instead.
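
The tidyverse variant of the reshape can be tried without downloading anything. The sketch below assumes only that dplyr and tidyr are installed; the data frame `toy` and its numbers are invented and merely mimic the geo, time and values columns of the Eurostat table.

```r
library(dplyr)
library(tidyr)

# Invented long-format data with the same column names as the Eurostat table
toy <- data.frame(
  geo    = c("AT", "AT", "DE", "DE"),
  time   = c(2015, 2016, 2015, 2016),
  values = c(1, 2, 3, 4),
  sex    = "T",
  stringsAsFactors = FALSE
)

# Keep only the columns needed, then turn each year into its own column
wide <- toy %>%
  select(geo, time, values) %>%
  spread(time, values)

print(wide)
```

Selecting geo, time and values first matters: any other column left in the data would be kept as an extra identifier column in the wide result.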
