Problem reading data in R using read.csv and read.table


I have data that was exported from MySQL using the following command:

SELECT 
    id_code,info_text INTO OUTFILE '/tmp/company-desc.csv' 
    FIELDS TERMINATED BY ';' 
    OPTIONALLY ENCLOSED BY '"' 
    LINES TERMINATED BY '\n' 
FROM 
    dx_company WHERE LENGTH(id_code) = 8 AND 
    id_code REGEXP '^[0-9]+$';

But when I try to load the CSV into R using the following command,

dt.companydesc <- read.csv("company-desc.csv",sep=';',fill=T, encoding = "UTF-8",quote="\n",header=FALSE)

or

dt.companydesc <- read.csv("company-desc.csv",sep=';',fill=T, encoding = "UTF-8",quote="\"",header=FALSE)

it yields a result like this:

Id code description 
2345  This is the description \n344555 \n737384 \n388383 \n000083

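Some ids get mixed into the description. Essentially, the quotes and \n cause problems when the file is read. Whatever I try to fix this disturbs the whole table. I have also tried gsub and readLines. Any help would be appreciated.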

Snapshot (CSV file):

"1000004";"general" 
    "1000000";"licensed version, and products" 
    "1000007";"" 
    "1000003";"" 
    "1000002";"" 
    "1000006";"" 
    "1000002";"automobiles; well organised"

Desired output:

Id_code Description 
    1000004 general 
    1000000 licensed version, and products 
    1000007 NA 
    1000003 NA 
    1000002 NA 
    1000006 NA 
    1000002 automobiles and industry; well organised

Source

2015-10-05 Maddy

Posted an example along with the expected output. –

My guess is that your `quote` argument is incorrect, but I can't be sure without seeing a sample of the CSV file. – Benjamin

`quote = "\n"` is something I have never seen. You told MySQL that the separator is a comma, and then you use ';' when calling `read.csv`. Are you sure? – nicola
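For reference, here is a minimal sketch of a `read.csv` call whose arguments mirror the export options shown in the question (`FIELDS TERMINATED BY ';'` and `OPTIONALLY ENCLOSED BY '"'`). It reads the sample lines from the question's snapshot through the `text` argument. Reading both columns as character and converting empty descriptions to `NA` afterwards are assumptions made here to match the desired output. This only covers the snapshot: descriptions containing embedded newlines or backslash escapes written by MySQL may need extra handling.

```r
# Sample lines copied from the CSV snapshot in the question.
csv_text <- '"1000004";"general"
"1000000";"licensed version, and products"
"1000007";""
"1000003";""
"1000002";""
"1000006";""
"1000002";"automobiles; well organised"'

dt.companydesc <- read.csv(
  text = csv_text,            # use file = "company-desc.csv" for the real export
  sep = ";",                  # matches FIELDS TERMINATED BY ';'
  quote = "\"",               # matches OPTIONALLY ENCLOSED BY '"'
  header = FALSE,
  col.names = c("Id_code", "Description"),
  colClasses = "character",   # keep ids as text so leading zeros survive
  stringsAsFactors = FALSE
)

# Turn empty descriptions into NA explicitly.
dt.companydesc$Description[dt.companydesc$Description %in% ""] <- NA_character_

print(dt.companydesc)
```

Keeping `Id_code` as character is a deliberate choice in this sketch: the garbled output in the question contains an id such as 000083, which would lose its leading zeros if converted to a number.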

Here is one way using data.table::fread, which is also fast:

library(data.table)  # v1.9.6+

# The CSV content is passed directly as a string; na.strings="" maps empty fields to NA.
fread('"1000004";"general"
"1000000";"licensed version, and products"
"1000007";""
"1000003";""
"1000002";""
"1000006";""
"1000002";"automobiles; well organised"',
      na.strings="", header=FALSE,
      col.names=c("Id_code", "Description"))

#    Id_code                    Description
# 1: 1000004                        general
# 2: 1000000 licensed version, and products
# 3: 1000007                             NA
# 4: 1000003                             NA
# 5: 1000002                             NA
# 6: 1000006                             NA
# 7: 1000002    automobiles; well organised

Source

2015-10-05 12:07:24 Arun

They should avoid the CSV step and query the database directly from R. – Roland
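A rough sketch of what querying the database directly could look like, assuming the DBI and RMariaDB packages are installed and the MySQL server is reachable from R. The helper name `fetch_company_desc` and all connection arguments are hypothetical placeholders, not values from the question. Only the table, columns, and WHERE clause come from the original export statement.

```r
# Hypothetical helper: the connection details are supplied by the caller.
fetch_company_desc <- function(dbname, host, username, password) {
  con <- DBI::dbConnect(
    RMariaDB::MariaDB(),
    dbname = dbname,
    host = host,
    username = username,
    password = password
  )
  # Close the connection even if the query fails.
  on.exit(DBI::dbDisconnect(con), add = TRUE)

  # Same selection as the export in the question, without INTO OUTFILE.
  DBI::dbGetQuery(con, "
    SELECT id_code, info_text
    FROM dx_company
    WHERE LENGTH(id_code) = 8
      AND id_code REGEXP '^[0-9]+$'
  ")
}
```

Calling something like `fetch_company_desc("mydb", "localhost", "myuser", Sys.getenv("DB_PASSWORD"))` (placeholder values) would return a data frame, so the quoting and line-terminator issues of the intermediate CSV file never arise.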

Thanks @Arun, it works :))) – Maddy
