2017-04-23 83 views
-1

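Recoding a variable in R

I am working with NLS data and want to run a logistic regression of marital status on several independent variables. Marital status is coded as follows (count, code, label):

1084   1  Never married, cohabiting
2441   2  Never married, not cohabiting
2744   3  Married, spouse present
 188   4  Married, spouse absent
  18   5  Separated, cohabiting
  66   6  Separated, not cohabiting
 202   7  Divorced, cohabiting
 361   8  Divorced, not cohabiting
   4   9  Widowed, cohabiting
  12  10  Widowed, not cohabiting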

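I want just two groups, ever married and never married: codes 1 and 2 should be collapsed into married = 0 and all remaining codes into married = 1. My dataset is called nlsy. I know this is a basic question, but I would appreciate any help. Thanks!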

+1

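Please provide a [reproducible example] when asking a question. Please also illustrate the problem with an example of your expected output. –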

Answers

0

Try something like this (substitute your actual variable names, since you did not provide a minimal reproducible example):

nlsy$never_married <- nlsy$marital_status %in% c("1084 1 Never married, cohabiting", "2441 2 Never married, not cohabiting") 

This adds a column to your data.frame (assuming nlsy is a data.frame) holding a logical value: TRUE if never married, FALSE otherwise.
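
If the column holds the numeric codes 1 to 10 rather than label strings (as the coding table in the question suggests), the same idea might look like the sketch below. The column name `marital_status` and the toy data frame are assumptions for illustration, not the real NLSY variable names.

```r
# Toy stand-in for the real data; `marital_status` is a hypothetical column name.
nlsy <- data.frame(marital_status = c(1, 2, 3, 4, 8, 10))

# Codes 1 and 2 are the two "never married" categories.
nlsy$never_married <- nlsy$marital_status %in% c(1, 2)

# 0/1 indicator in the direction the question asks for:
# 0 = never married, 1 = every other category.
nlsy$married <- as.integer(!nlsy$never_married)

print(nlsy)
```

Either the logical column or the 0/1 column can serve as the response in a binomial `glm()`; the 0/1 version simply matches the coding the question describes.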

+0

I ended up doing the following. It's not pretty, but it seems to work:
nlsy$married[nlsy$married == 1] <- 0
nlsy$married[nlsy$married == 2] <- 0
nlsy$married[nlsy$married == 3] <- 1
nlsy$married[nlsy$married == 4] <- 1
nlsy$married[nlsy$married == 5] <- 1
nlsy$married[nlsy$married == 6] <- 1
nlsy$married[nlsy$married == 7] <- 1
nlsy$married[nlsy$married == 8] <- 1
nlsy$married[nlsy$married == 9] <- 1
nlsy$married[nlsy$married == 10] <- 1
– krilee
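
The ten assignments in the comment above could probably be collapsed into one vectorised step. This is a minimal sketch, assuming the codes are stored as numbers in `nlsy$married` as in the comment. It writes to a new column with the hypothetical name `married01` so the original codes are not overwritten, and it keeps missing values as NA.

```r
# Toy stand-in for the real data, including one missing value.
nlsy <- data.frame(married = c(1, 2, 3, 7, NA, 10))

# ifelse() propagates NA from its test, so a missing code stays missing.
# (A bare %in% would turn NA into FALSE, so == with | is used for the test.)
nlsy$married01 <- ifelse(nlsy$married == 1 | nlsy$married == 2, 0, 1)

# Cross-tabulate the old codes against the new indicator to check the recode.
print(table(nlsy$married, nlsy$married01, useNA = "ifany"))
```

Keeping the original column untouched means the recode can be rerun or checked later without reloading the data.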

0

Use this. What you did is not wrong, but it is the long way around.

# install.packages("dplyr")  # run once if dplyr is not installed
library(dplyr)

# Summary table from the question: one row per marital-status category with its count
a <- data.frame(
  status = c("Never married, cohabiting", "Never married, not cohabiting",
             "Married, spouse present", "Married, spouse absent",
             "Separated, cohabiting", "Separated, not cohabiting",
             "Divorced, cohabiting", "Divorced, not cohabiting",
             "Widowed, cohabiting", "Widowed, not cohabiting"),
  value = c(1084, 2441, 2744, 188, 18, 66, 202, 361, 4, 12)
)

# married_status is 1 for every ever-married category, 0 for the two never-married ones
a <- a %>%
  mutate(married_status = as.numeric(status %in%
    c("Married, spouse present", "Married, spouse absent",
      "Separated, cohabiting", "Separated, not cohabiting",
      "Divorced, cohabiting", "Divorced, not cohabiting",
      "Widowed, cohabiting", "Widowed, not cohabiting"))) %>%
  select(-status)  # drop the text label, keeping value and married_status

Let me know if you have any questions.