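Aggregate in R with multiple conditions
I would like to summarize a data set in R. I am a beginner in R. The code below works, but it takes a lot of steps. Is there a simpler way to achieve this? I would like to accomplish the following: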
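1) Group by Client_ID
2) Count all ClaimNumbers (whether associated with DS or not)
3) Count only the ClaimNumbers with DS
4) Sum Retail and WS for DS only
5) Also, I want to count each claim only once. In the data, a claim number is repeated for each service year and service.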
# example
ds <- read.table(text = "
Client_ID ClaimNumber ServiceYr Service Retail WS
A00002 WC1 2012 DS 100 25
A00002 WC1 2013 DS 100 25
A00002 WC1 2014 BR 50 10
A00002 WC2 2014 BR 50 10
A00002 WC3 2014 BR 50 10
A00003 WC4 2014 BR 50 10
A00003 WC4 2015 BR 50 10
A00003 WC5 2015 BR 50 10
A00003 WC5 2016 BR 50 10
A00003 WC6 2016 DS 100 25",
sep="",header=TRUE)
library(sqldf)  # provides sqldf(), used in the queries below
# group by client ID and claim number to get one row per claim number
total_claims <- sqldf("select Client_ID,ClaimNumber from ds group
by Client_ID,ClaimNumber")
# For DS claims only - group by client ID and claim number
# to get one row per claim number
ds_claims <- sqldf("select Client_ID,ClaimNumber, sum(Retail) as Retail,
sum(WS) as WS from ds where Service='DS' group by Client_ID,ClaimNumber")
# count the total number of claims by client
total_counts <- aggregate(total_claims[,2],by=list(total_claims$Client_ID),FUN=length)
# fix column headers
colnames(total_counts)[1:2] <- c("Client_ID","ClaimCount")
# count the number of DS claims by client
ds_claim_counts <- aggregate(ds_claims[,2],by=list(ds_claims$Client_ID),FUN=length)
# fix column headers
colnames(ds_claim_counts)[1:2] <- c("Client_ID","ClaimCount")
# merge to get both total counts and ds counts on the same table
total <- merge(total_counts,ds_claim_counts, by="Client_ID",all.x=TRUE)
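# merge to add ds retail and ws amounts to total table
# (sum per client first, so a client with several DS claims still gets one row)
ds_amounts <- aggregate(cbind(Retail, WS) ~ Client_ID, data=ds_claims, FUN=sum)
total <- merge(total,ds_amounts, by="Client_ID",all.x=TRUE)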
# fix column headers
colnames(total)[2:3] <- c("Total_CC","DS_CC")
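One possible shortening, offered as a sketch rather than a definitive answer: base R alone can produce the same four columns by splitting the rows per client and using `unique()` to count each claim once. The names `summarise_client` and `total2` are made up for illustration, and the data frame is retyped from the example above so the block runs on its own. Unlike the merged table, a client with no DS claims would get 0 here rather than NA.

```r
# Same example data as above, rebuilt so this sketch is self-contained
ds <- data.frame(
  Client_ID   = rep(c("A00002", "A00003"), each = 5),
  ClaimNumber = c("WC1", "WC1", "WC1", "WC2", "WC3",
                  "WC4", "WC4", "WC5", "WC5", "WC6"),
  ServiceYr   = c(2012, 2013, 2014, 2014, 2014,
                  2014, 2015, 2015, 2016, 2016),
  Service     = c("DS", "DS", "BR", "BR", "BR",
                  "BR", "BR", "BR", "BR", "DS"),
  Retail      = c(100, 100, 50, 50, 50, 50, 50, 50, 50, 100),
  WS          = c(25, 25, 10, 10, 10, 10, 10, 10, 10, 25),
  stringsAsFactors = FALSE
)

# Hypothetical helper: summarise the rows of a single client
summarise_client <- function(d) {
  is_ds <- d$Service == "DS"
  data.frame(
    Client_ID = d$Client_ID[1],
    Total_CC  = length(unique(d$ClaimNumber)),         # each claim counted once
    DS_CC     = length(unique(d$ClaimNumber[is_ds])),  # distinct DS claims only
    Retail    = sum(d$Retail[is_ds]),                  # amounts from DS rows only
    WS        = sum(d$WS[is_ds])
  )
}

# Split by client, summarise each piece, and stack the results
total2 <- do.call(rbind, lapply(split(ds, ds$Client_ID), summarise_client))
rownames(total2) <- NULL
print(total2)
```

This trades the two SQL queries and two merges for a single pass per client; whether it is actually simpler to read is a matter of taste.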
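Please take a look at these tips on how to produce a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve), as well as this post on [creating a great example in R](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – lmo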