2017-07-18

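I have a CSV file that contains non-ASCII characters. I just want to strip those characters and read the file. The goal is to skip only the non-ASCII characters when reading with read.table.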

> tables <- lapply('/.././abc.csv', read.csv,header=F,stringsAsFactors=FALSE,fileEncoding="UTF-8") 
Warning message: 
In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : 
    invalid input found on input connection '/.././abc.csv' 
> df= suppressWarnings(do.call(rbind, tables)) 

It does not read the complete file. It reads only the records that come before the non-ASCII characters and skips every record after them.

I cannot use iconv('/.././abc.csv', "latin1", "ASCII", sub=""), because it expects x to be a character vector, not a file path.
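As a small illustration of that point (a sketch, not part of the original post): iconv converts the strings it is handed, so a path is treated as text to convert and the file behind it is never opened. The variable names and the sample field below are made up for the example.

```r
# iconv() works on the elements of a character vector.
path <- "/.././abc.csv"

# The path itself is plain ASCII, so it comes back unchanged;
# the file it points to is not read.
converted_path <- iconv(path, "latin1", "ASCII", sub = "")

# Hypothetical field with one Latin-1 byte (0xE9) between the digits.
field <- rawToChar(as.raw(c(0x31, 0x35, 0xE9, 0x31)))

# With sub = "", the byte that has no ASCII equivalent is dropped.
cleaned <- iconv(field, "latin1", "ASCII", sub = "")
```

So the conversion has to be applied to the file's contents (for example, its lines as a character vector), not to its name.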

cat '/.././abc.csv' 
88036,120,151036.656250,2017-07-17 22:27:49,17-07-17 22:27:49 
88036,120,151036.671875,2017-07-17 22:27:53,17-07-17 22:27:53 
88036,310,151036.687500,2017-07-17 22:27:58,17-07-17 22:27:58 
88036,310,151036.703▒▒F▒▒B▒▒▒D▒%▒▒▒2▒T▒▒K222642,17-07-17 22:28:03,2017-07-17 22:28:03 
88036,310,151036.484375,2017-07-17 22:26:54,17-07-17 22:26:54 
88036,310,151036.500000,2017-07-17 22:26:59,17-07-17 22:26:59 

After reading the CSV file, the last 2 records are missing. Any help would be appreciated.

Answer


If you read the file in first, you can then run:

td <- td[,lapply(.SD,function(x){ iconv(x, "latin1", "ASCII", sub="")})]

This assumes you read your CSV file in as a data.table (here named td).
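One way to make the "read it first, then clean it" idea work even when read.csv stops early is to read the raw lines, strip the non-ASCII bytes, and only then parse the CSV. The following is a minimal base R sketch under stated assumptions, not the answerer's exact method: it uses readLines and read.csv(text = ...) instead of data.table, and the temporary file with its three rows is a made-up stand-in for abc.csv.

```r
# Hypothetical stand-in for abc.csv: three rows, the second one
# containing stray high bytes in the middle of a field.
path <- tempfile(fileext = ".csv")
con <- file(path, "wb")
writeBin(charToRaw("88036,120,151036.656250,2017-07-17 22:27:49\n"), con)
writeBin(c(charToRaw("88036,310,151036.703"),
           as.raw(c(0xE2, 0x96, 0xFF)),
           charToRaw(",2017-07-17 22:28:03\n")), con)
writeBin(charToRaw("88036,310,151036.484375,2017-07-17 22:26:54\n"), con)
close(con)

# Read the lines as-is, with no fileEncoding conversion on the connection.
lines <- readLines(path, warn = FALSE)

# Drop every byte that has no ASCII equivalent.
clean <- iconv(lines, from = "latin1", to = "ASCII", sub = "")

# Parse the cleaned lines as CSV.
df <- read.csv(text = clean, header = FALSE, stringsAsFactors = FALSE)
```

If the file also contains embedded NUL bytes, readLines(path, skipNul = TRUE) may be needed. Note that stripping bytes only removes the non-ASCII part: any ASCII letters left behind in a damaged field (as in the fourth row of the sample above) stay in the data, so that row may still need manual inspection.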