2017-02-19 38 views
0

How do I write and run a SQL query in R?

Here are my tables:

x <- read.csv("C:/Users/Admin/Downloads/Set 1-1.csv",sep=",",dec=".") 
y <- read.csv("C:/Users/Admin/Downloads/Set 1-2 - Copy.csv",sep=",",dec=".") 
y$score <- 1 

I tried to combine them:

library("sqldf") 
select clientid,emailmessageid,null cnttrn,idatediff,null score from x 
union all select clientid,emailmessageid,cnttrn,idatediff,score from y 

but I got the following errors:

select clientid,emailmessageid,null cnttrn,idatediff,null score from x

Error: unexpected symbol in "select clientid"

union all select clientid,emailmessageid,cnttrn,idatediff,score from y

Error: unexpected symbol in "union all"

Please help me correct this. Thanks.

dput(X)

ClientID EmailMessageId MinDate MaxDate IdSlip WwsCreatedDate ProductArticle ProductGroupName MainProductGroupName CategoryGroupName QtytItems SumAmount iDateDiff 
3E34C0C9FC05975CC0F01D7A3DEE73D022538FA04B17A0316178E090C04F84A8 894DB62F7B7A6ED2 31.08.2016 31.08.2016 4A19280A1164CF3F4A701EF9AE97A1F1084B611000B94C02 24.09.2015 item1 item2 item3 item4 1 580.0 -342 
3E34C0C9FC05975CC0F01D7A3DEE73D022538FA04B17A0316178E090C04F84A8 894DB62F7B7A6ED2 31.08.2016 31.08.2016 4A19280A1164CF3F4A701EF9AE97A1F1084B611000B94C02 24.09.2015 item1 item2 item3 item4 1 3190.0 -342 

dput(Y)

ClientID EmailMessageId CntTrn iDateDiff score 
86139F31664463A8B7592B6887B731A9FC2C3489BB1756A5BF334CFDEA4EF604 9EDCC1391C208BA0 1 4 1 
BD483D69913E3EBFE5FBA87A1FFAB7DCD061055FFB4342C2F27AC01F36833254 EF72D53990BC4805 1 5 1 
0B3B2F06C3033B3AFD83BA59B405BCC79BC69801FD3B69931F117B8D754A80EB 9EDCC1391C208BA0 1 3 1 
+1

You need to wrap your query in 'sqldf()', like 'sqldf("select * from x")'.

+0

I tried that, but I get the same error :'(

+0

Use 'dput(head(x))' and 'dput(head(y))' to make it reproducible.

Answers

3

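This runs without errors for me. The only difference is the query formatting: the whole SQL statement is passed to sqldf() as a quoted string. Is the result correct?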

library(sqldf) 

y <- read.table(text = "ClientID EmailMessageId CntTrn iDateDiff score 
86139F31664463A8B7592B6887B731A9FC2C3489BB1756A5BF334CFDEA4EF604 9EDCC1391C208BA0 1 4 1 
BD483D69913E3EBFE5FBA87A1FFAB7DCD061055FFB4342C2F27AC01F36833254 EF72D53990BC4805 1 5 1 
0B3B2F06C3033B3AFD83BA59B405BCC79BC69801FD3B69931F117B8D754A80EB 9EDCC1391C208BA0 1 3 1", header = TRUE) 

x <- read.table(header = TRUE, text = "ClientID EmailMessageId MinDate MaxDate IdSlip WwsCreatedDate ProductArticle ProductGroupName MainProductGroupName CategoryGroupName QtytItems SumAmount iDateDiff 
3E34C0C9FC05975CC0F01D7A3DEE73D022538FA04B17A0316178E090C04F84A8 894DB62F7B7A6ED2 31.08.2016 31.08.2016 4A19280A1164CF3F4A701EF9AE97A1F1084B611000B94C02 24.09.2015 item1 item2 item3 item4 1 580.0 -342 
3E34C0C9FC05975CC0F01D7A3DEE73D022538FA04B17A0316178E090C04F84A8 894DB62F7B7A6ED2 31.08.2016 31.08.2016 4A19280A1164CF3F4A701EF9AE97A1F1084B611000B94C02 24.09.2015 item1 item2 item3 item4 1 3190.0 -342") 

sqldf(" 
SELECT 
    ClientId, 
    EmailMessageId, 
    null CntTrn, 
    iDateDiff, 
    null Score 
FROM x 

UNION ALL 

SELECT 
    ClientId, 
    EmailMessageId, 
    CntTrn, 
    iDateDiff, 
    Score 
FROM y") 
+0

Yes, it works, but if I want to add columns such as ProductGroupName or MainProductGroupName from (x), I get an error for either column: '> mydata = sqldf(" + SELECT + ClientID, + EmailMessageId, + null ProductGroupName + null CntTrn, + iDateDiff, + null Score + FROM x + + UNION ALL + + SELECT + ClientID, + EmailMessageId, + ProductGroupName + CntTrn, + iDateDiff, + Score + FROM y") Error in rsqlite_send_query(conn@ptr, statement) : near "null": syntax error' Please help.

+0

ProductGroupName does not exist in the dataset; how do I add it?

+1

Just add any column to the SELECT clause, as you would in ordinary SQL. If you want a new column with no data: 'sqldf("SELECT ClientID, EmailMessageId, null ProductGroupName, null CntTrn, iDateDiff, null Score FROM x UNION ALL SELECT ClientID, EmailMessageId, NULL, CntTrn, iDateDiff, Score FROM y")'
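
For reference, here is a minimal sketch of the pattern discussed in this thread. It assumes (this is not confirmed by the asker) that the goal is to keep ProductGroupName from x and fill it with NULL for the rows coming from y, which has no such column. The query in the earlier comment appears to be missing a comma after 'null ProductGroupName', which would explain the 'near "null": syntax error'. The tiny data frames and the shortened IDs below are made up for illustration only.

```r
library(sqldf)

# Small stand-ins for the real tables (illustrative values, not the asker's data)
x <- data.frame(
  ClientID         = c("A1", "A1"),
  EmailMessageId   = c("M1", "M1"),
  ProductGroupName = c("item2", "item2"),
  iDateDiff        = c(-342L, -342L),
  stringsAsFactors = FALSE
)
y <- data.frame(
  ClientID         = c("B1", "B2"),
  EmailMessageId   = c("M2", "M3"),
  CntTrn           = c(1L, 1L),
  iDateDiff        = c(4L, 5L),
  score            = c(1, 1),
  stringsAsFactors = FALSE
)

# Both SELECTs must list the same number of columns in the same order.
# A column missing from one table is filled with NULL, and every item
# in the SELECT list is separated by a comma.
combined <- sqldf("
  SELECT ClientID, EmailMessageId, ProductGroupName,
         NULL AS CntTrn, iDateDiff, NULL AS score
  FROM x
  UNION ALL
  SELECT ClientID, EmailMessageId, NULL AS ProductGroupName,
         CntTrn, iDateDiff, score
  FROM y
")

print(combined)
```

The column names of the result come from the first SELECT, so the aliases in the second SELECT are optional and are kept only for readability.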