2015-10-19
0
> str(store) 
'data.frame': 1115 obs. of 10 variables: 
$ Store     : int 1 2 3 4 5 6 7 8 9 10 ... 
$ StoreType    : Factor w/ 4 levels "a","b","c","d": 3 1 1 3 1 1 1 1 1 1 ... 
$ Assortment    : Factor w/ 3 levels "a","b","c": 1 1 1 3 1 1 3 1 3 1 ... 
$ CompetitionDistance  : int 1270 570 14130 620 29910 310 24000 7520 2030 3160 ... 
$ CompetitionOpenSinceMonth: int 9 11 12 9 4 12 4 10 8 9 ... 
$ CompetitionOpenSinceYear : int 2008 2007 2006 2009 2015 2013 2013 2014 2000 2009 ... 
$ Promo2     : int 0 1 1 0 0 0 0 0 0 0 ... 
$ Promo2SinceWeek   : int NA 13 14 NA NA NA NA NA NA NA ... 
$ Promo2SinceYear   : int NA 2010 2011 NA NA NA NA NA NA NA ... 
$ PromoInterval   : Factor w/ 4 levels "","Feb,May,Aug,Nov",..: 1 3 3 1 1 1 1 1 1 1 ... 

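I am trying to replace NA values depending on the value of Promo2: where Promo2 is 0 the NA should become 0, and where Promo2 is 1 it should be replaced by the column mean.

I don't understand why my code does not modify the store data.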

for (i in 1:nrow(store)){ 
    if(is.na(store[i,])== TRUE & store$Promo2[i] ==0){ 
    store[i,] <- ifelse(is.na(store[i,]),0,store[i,]) 
    } 
    else if (is.na(store[i,])== TRUE & store$Promo2[i] ==1){ 
    for(j in 1:ncol(store)){ 
     store[is.na(store[i,j]), j] <- mean(store[,j], na.rm = TRUE) 
    } 
    } 
} 
You need to learn some basic R.

Answers

3

For the Promo2SinceWeek column:

store$Promo2SinceWeek[store$Promo2==0 & is.na(store$Promo2SinceWeek)] <- 0 
store$Promo2SinceWeek[store$Promo2==1 & is.na(store$Promo2SinceWeek)] <- mean(store$Promo2SinceWeek, na.rm=TRUE) 

Use the same approach for the other columns. Vectorization is a very useful feature of R.
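The same pattern can be applied to several columns at once. The following is only a sketch on made-up toy data (the values and the `cols`, `miss` and `col_mean` names are assumptions for illustration, not part of the question). It also assumes that the mean of the originally observed values is wanted: in the two-line version, the mean is computed after the zeros have been written, so those zeros enter the average. Here the mean is taken first.

```r
# Toy data mirroring the question's layout (values invented for illustration)
store <- data.frame(
  Promo2          = c(0, 1, 1, 0, 1),
  Promo2SinceWeek = c(NA, 13, 14, NA, NA),
  Promo2SinceYear = c(NA, 2010, 2011, NA, NA)
)

# Numeric columns to fill, excluding the condition column itself
cols <- setdiff(names(store)[sapply(store, is.numeric)], "Promo2")

for (nm in cols) {
  x    <- store[[nm]]
  miss <- is.na(x)
  # Take the mean before writing any zeros, so the fills cannot skew it
  col_mean <- mean(x, na.rm = TRUE)
  x[miss & store$Promo2 == 0] <- 0
  x[miss & store$Promo2 == 1] <- col_mean
  store[[nm]] <- x
}

store
```

The loop runs once per column rather than once per row, and each assignment inside it is vectorized over all rows.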

0

To fix the for loop:

for(i in 1:nrow(store)) { 
    col <- which(is.na(store[i,])) 
    store[i,][col] <- if(store$Promo2[i] == 1) colMeans(store[col], na.rm=TRUE) else 0 
} 

Or, if you don't want any if statements:

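for (i in 1:nrow(store)) { 

    store[i,][is.na(store[i,]) & store$Promo2[i] ==0] <- 0 

    # drop = FALSE keeps a data frame even when only one column is selected, 
    # which colMeans() requires 
    store[i,][is.na(store[i,]) & store$Promo2[i] ==1] <- 
     colMeans(store[,is.na(store[i,]) & store$Promo2[i] ==1, drop = FALSE], na.rm = TRUE) 

} 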

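Your loop does not work because an if statement accepts only a single condition value from its test. Your loop passes it if(is.na(store[i,])== TRUE & store$Promo2[i] ==0), but that condition evaluates to many values: TRUE FALSE FALSE FALSE TRUE.... It is a series of TRUEs and FALSEs when it should be exactly one value, either a single TRUE or a single FALSE. When given several values, if uses only the first one (with a warning); since R 4.2.0 this is an error instead.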
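A small illustration of the point about condition length, using a made-up named vector in place of one row of the data (the names `row`, `cond` and `action` are hypothetical). It shows that the test produces one value per column, and that `any()` or `anyNA()` combined with `&&` reduces it to the single TRUE or FALSE that `if` expects.

```r
# One "row" with two missing values (toy values, for illustration only)
row <- c(Promo2 = 0, Promo2SinceWeek = NA, Promo2SinceYear = NA)

# Element-wise test: one logical value per column, not a single value
cond <- is.na(row) & row["Promo2"] == 0
length(cond)

# Collapse to a single logical before handing it to if()
any(cond)
all(cond)

# && works on single values, so this condition is safe for if()
action <- if (anyNA(row) && row[["Promo2"]] == 0) "fill with 0" else "leave"
action
```

Whether `any()` or `all()` is the right reduction depends on the intent; for "does this row contain any NA", `anyNA()` is the direct choice.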

Reproducible example:

store 
#                  Promo2 gear carb 
#Mazda RX4              1   NA   NA 
#Mazda RX4 Wag          1    4    4 
#Datsun 710             1    4    1 
#Hornet 4 Drive         0    3    1 
#Hornet Sportabout      0    3   NA 
#Valiant                0    3    1 

    for(i in 1:nrow(store)) { 
     col <- which(is.na(store[i,])) 
     store[i,][col] <- if(store$Promo2[i] == 1) colMeans(store[col], na.rm=TRUE) else 0 
    } 

store 
#                  Promo2 gear carb 
#Mazda RX4              1  3.4 1.75 
#Mazda RX4 Wag          1  4.0 4.00 
#Datsun 710             1  4.0 1.00 
#Hornet 4 Drive         0  3.0 1.00 
#Hornet Sportabout      0  3.0 0.00 
#Valiant                0  3.0 1.00 

Data

store <- head(mtcars) 
store <- store[-(1:8)] 
names(store)[1] <- "Promo2" 
store[1,2] <- NA 
store[5,3] <- NA 
store[1,3] <- NA 
store