2015-11-07 91 views

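Converting strings to numeric values in R

I am reading a .csv file (exported from Excel) into R and extracting the 5th column of my data. I get a character vector, but I need to convert the strings in bmi2014 to numeric values without losing information. This is what I have tried so far: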

options(stringsAsFactors = FALSE) 
setwd("~/CAD Project") 
bmi <- read.csv("~/CAD Project/BMI sex and province.csv") 
bmi <- bmi[8:46 , ] #removing rows I don't need 
bmi <- bmi[, 2:6] #removing columns I don't need 
bmi2014 <- bmi[, 5] 
bmi2014 
[1] "272,818" "146,959" "125,859" "65,238" "32,132" "33,106"   
"443,317" "234,307" "209,010" "355,959" "192,160" "163,799"  
"3,226,705" "1,865,444" "1,361,261" "5,508,224" "3,133,853" "2,374,371" 
"533,910" "296,162" "237,748" "446,312" "254,005" "192,307" 
[25] "1,658,172" "984,981" "673,190" "1,667,339" "990,920" "676,418"   
"15,453" "8,482" "6,971"  "19,607" "11,312" "8,294"  "9,469"   
"5,187"  "4,282"  
mydata <- as.numeric(as.character(bmi2014)) 
Warning message: 
NAs introduced by coercion 

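I also tried type.convert, as well as

as.matrix(sapply(bmi2014, as.numeric), na.rm = TRUE) 

but nothing seems to solve the problem without returning NA values. What else can I try so that I get a vector of numbers such as 272818, 146959, and so on? Thanks!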

Answers


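The problem is the commas (,). as.numeric cannot parse a string that contains a thousands separator, so it returns NA. You have to remove the commas with gsub before converting to numeric.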

bmi2014 <-c("272,818","146,959","125,859","65,238", "32,132","33,106", 
"443,317","234,307","209,010") 
as.numeric(gsub(",","",bmi2014)) 
[1] 272818 146959 125859 65238 32132 33106 443317 234307 209010 
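
If the same cleanup is needed in several places, the gsub step can be wrapped in a small function. This is only a sketch: to_number is a hypothetical helper name, not something from the question or from base R, and it assumes the commas are purely thousands separators (never decimal marks).

```r
# Hypothetical helper: strip thousands separators, then convert to numeric.
# Assumes every comma is a grouping mark, not a decimal mark.
to_number <- function(x) {
  as.numeric(gsub(",", "", x, fixed = TRUE))
}

# A few values taken from the question's output
bmi2014 <- c("272,818", "3,226,705", "5,187", "15,453")
values <- to_number(bmi2014)

# Any NA left here would mean a string contained something other than digits and commas
if (anyNA(values)) {
  warning("some values could not be converted")
}
print(values)
```

Using fixed = TRUE treats the comma as a literal character rather than a regular expression, which is all that is needed here.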

Or use a locale-aware parser, such as readr, which can handle them. If they are only thousands separators, simply removing them is probably easier.
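
A minimal sketch of the readr route mentioned in this comment, assuming the readr package is installed (it is not part of base R). parse_number drops grouping marks according to the locale it is given; the comma is spelled out explicitly here so the assumption is visible.

```r
# Assumes readr is installed: install.packages("readr")
library(readr)

# A few values taken from the question's output
bmi2014 <- c("272,818", "3,226,705", "5,187")

# Treat "," as the grouping (thousands) mark while parsing
values <- parse_number(bmi2014, locale = locale(grouping_mark = ","))
print(values)
```

For data where the comma is the decimal mark instead, the locale would need decimal_mark = "," and a different grouping mark, so it is worth checking which convention the file uses before choosing either approach.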


Thank you so much! – csik