2016-09-16 68 views
-1

Creating columns to binarise dates in R

I have a dataset:

Date        Customer ID  Customer  Delivery City  Category
31/12/2015  14057267     a         NewCity        Software - System Infrastructure
31/12/2015  14057267     a         NewCity        Software - Information/Data Management
31/12/2015  14057267     a         NewCity        Software - Information/Data Management
31/12/2015  14057267     b         NewCity        Software - Information/Data Management
31/12/2015  14057267     b         OldCity        Software - Information/Data Management
31/12/2015  14057267     c         OldCity        Software - Information/Data Management
31/12/2015  14057267     c         OldCity        Software - Information/Data Management

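I want to create new columns based on the date: if the largest day of the month is 31, I need one column per day, 31 in total. Each column holds 0 or 1 depending on the day in the Date column. For example, if the day is 01, then X_1 = 1 and the remaining 30 columns X_2 ... X_31 = 0. In other words, I want to binarise the date. Likewise, I want columns X_a, X_b, X_c for the customer name, also holding 0 or 1.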

Can anyone help?

+0

Could you please provide a sample of your data ('dput(head(your_data))') and the expected output? –

Answers

2

How about the following (only two columns of the data frame are shown):

# initial dataframe 
head(df) 
# Date  Customer 
#1 01/12/2015  b 
#2 02/12/2015  c 
#3 03/12/2015  a 
#4 04/12/2015  b 
#5 05/12/2015  b 
#6 06/12/2015  b 

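# extract the two-digit day from the dd/mm/yyyy date strings
df$X <- substring(as.character(df$Date), 1, 2)
# one 0/1 indicator column per day (no intercept); [-3] drops the helper column X
df <- cbind.data.frame(df, model.matrix(~X-1, df))[-3]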

# final dataframe 
head(df) 
# Date  Customer X01 X02 X03 X04 X05 X06 X07 X08 X09 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23 X24 X25 X26 X27 X28 X29 X30 X31 
#1 01/12/2015  b 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
#2 02/12/2015  c 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
#3 03/12/2015  a 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
#4 04/12/2015  b 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
#5 05/12/2015  b 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
#6 06/12/2015  b 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
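
The same idea extends to the customer columns the question also asks for. What follows is only a sketch under assumptions: the three-row data frame is made up for illustration, and the use of a factor with fixed levels 1 to 31 (so that every day gets a column even when it does not occur in the data) is an addition, not part of the code above.

```r
# Hypothetical example data (not the asker's real data)
df <- data.frame(
  Date     = c("01/12/2015", "02/12/2015", "31/12/2015"),
  Customer = c("b", "c", "a"),
  stringsAsFactors = FALSE
)

# Day of month as a factor with all 31 levels, so every day gets a column
day <- factor(as.integer(format(as.Date(df$Date, format = "%d/%m/%Y"), "%d")),
              levels = 1:31)
day_cols <- model.matrix(~ day - 1)
colnames(day_cols) <- paste0("X_", 1:31)

# One 0/1 column per customer name
cust <- factor(df$Customer)
cust_cols <- model.matrix(~ cust - 1)
colnames(cust_cols) <- paste0("X_", levels(cust))

# Original columns followed by the day and customer indicators
result <- cbind(df, day_cols, cust_cols)
```

Parsing the date with as.Date rather than taking a substring avoids depending on the day always occupying the first two characters, at the cost of assuming the dd/mm/yyyy format holds for every row.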
+0

Thanks for sending this through, but I've already found a convenient way: for (level in unique(B2B_Data$Day)) { B2B_Data[paste("Day", level, sep = "_")] <- ifelse(B2B_Data$Day == level, 1, 0) } – user6016731
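
The loop from the comment above can be written out as a small runnable sketch. B2B_Data and its Day column are the names used in the comment; the four example rows are invented for illustration.

```r
# Hypothetical stand-in for the commenter's B2B_Data
B2B_Data <- data.frame(Day = c(1, 2, 2, 31))

# Add one 0/1 column per distinct day present in the data
for (level in unique(B2B_Data$Day)) {
  B2B_Data[paste("Day", level, sep = "_")] <- ifelse(B2B_Data$Day == level, 1, 0)
}

names(B2B_Data)
```

Note that this creates columns only for days that actually occur in the data, so a month with missing days yields fewer than 31 indicator columns.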