Having difficulty in R

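I am trying to subset a data set according to the following requirements:

1. ethnicity is xyz
2. education is a bachelor's degree or higher, i.e. Bachelor's Degree or Graduate Degree
3. I then want to look at the income brackets of the people who meet the requirements above. The brackets look like $30,000 - $39,999 or $100,000 - $124,999.
4. Finally, as my end result, I want to see the subset obtained in the third item (above) alongside the column indicating whether these people are religious. In the data set, the values are religious and not religious.

So it would look something like this: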

income    religious 
$30,000 - $39,999  not religious 
$50,000 - $59,999   religious 
    ....     .... 
    ....     ....

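Keep in mind that the rows listed are only those that satisfy conditions 1 and 2.

Please keep in mind that I am new to programming. I have been trying to figure this out for a long time and have dug through a lot of posts, but I can't seem to get anything to work. How do I solve this? Someone please help.

So as not to take away from the clarity of the post, I will put what I have tried below (but feel free to ignore it, as it is probably rubbish).

I have tried many variations of the following just to get to step 3, but all have failed miserably, and I am about to smash my head with the keyboard: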

df$income[which(df$ethnicity == "xyz" & df$education %in% c("Bachelor's Degree", "Graduate Degree"), ]

I have also tried:

race <- df$ethnicity == "xyz" 
ba_ma_phd <- df$education %in% c("Graduate Degree", "Bachelor's Degree") 
income_sub <- df$income[ba_ma_phd & race]

I believe income_sub gets me to step 3, but I have no idea how to get from there to step 4.

Source

2015-10-04 AlanH

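You are almost there; since income is a vector rather than a data frame, you don't need the trailing comma. I.e. you can use `df$income[which(df$ethnicity == "xyz" & df$education %in% c("Bachelor's Degree", "Graduate Degree"))]`. Note that if ethnicity or education has missing values, you may want to include a non-missing check in your subset statement. (If you want to create a subsetted data frame, don't include `df$income` at the start; just use `df` and keep the comma, so `sub_df <- df[which(df$ethnicity == "xyz" & df$education %in% c("Bachelor's Degree", "Graduate Degree")), ]`.) – user20650

@user20650 So how do I then get the corresponding `religious` column? – AlanH

I'm a bit unclear about what you want... could it be just `table(sub_df$income, sub_df$religious)`, or do you want the full columns, `sub_df[c("income", "religious")]`? – user20650

Changing my comment into an answer, as it was getting a bit too long.

First, your code: you are almost there. Because income is a vector rather than a data frame, you don't need the trailing comma, i.e. you can use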

df$income[which(df$ethnicity == "xyz" & 
     df$education %in% c("Bachelor's Degree", "Graduate Degree"))] 
# note: no comma after which(...), because df$income is a vector

If you want to create a subsetted data frame, don't start with df$income; just use df and keep the comma this time. This subsets the rows but keeps all columns.

sub_df <- df[which(df$ethnicity == "xyz" & 
     df$education %in% c("Bachelor's Degree", "Graduate Degree")), ]
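As a minimal sketch of the row subset described here, assuming a small made-up data frame (every value in `df` below is invented for illustration and is not from the question's data), the following shows why wrapping the condition in which() can matter: which() drops rows where the condition is NA, whereas indexing with the bare logical vector returns an all-NA row for each of them.

```r
# Toy data, invented for illustration only
df <- data.frame(
  ethnicity = c("xyz", "xyz", "abc", NA, "xyz"),
  education = c("Bachelor's Degree", "High School", "Graduate Degree",
                "Graduate Degree", "Graduate Degree"),
  income    = c("$30,000 - $39,999", "$20,000 - $29,999", "$50,000 - $59,999",
                "$40,000 - $49,999", "$100,000 - $124,999"),
  religious = c("not religious", "religious", "religious",
                "religious", "religious"),
  stringsAsFactors = FALSE
)

# Logical condition; it is NA for the row whose ethnicity is missing
keep <- df$ethnicity == "xyz" &
  df$education %in% c("Bachelor's Degree", "Graduate Degree")

# which() turns the logical vector into row numbers and drops the NA
sub_df <- df[which(keep), ]

# Indexing with the raw logical vector keeps an all-NA row for the NA
sub_df_na <- df[keep, ]

nrow(sub_df)     # 2
nrow(sub_df_na)  # 3
```

If the data has no missing values in ethnicity or education, both forms give the same result.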

To then look at the income levels in the subsetted data, you can use table:

table(sub_df$income)

You can use table again to examine the number of observations in each income bracket by religious status.

table(sub_df$income, sub_df$religious)
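To make the shape of that cross-tabulation concrete, here is a small sketch on an invented `sub_df` (the three rows are made up for illustration). The conversion to a long data frame with as.data.frame() is an optional extra step, not part of the answer above, and the column names `income`, `religious` and `n` are chosen here only for readability.

```r
# Toy subset, invented for illustration only
sub_df <- data.frame(
  income    = c("$30,000 - $39,999", "$30,000 - $39,999", "$100,000 - $124,999"),
  religious = c("not religious", "religious", "religious"),
  stringsAsFactors = FALSE
)

# Rows are income brackets, columns are religious status, cells are counts
tab <- table(sub_df$income, sub_df$religious)
tab

# Optional: one row per (income, religious) combination, including zero counts
counts <- as.data.frame(tab, stringsAsFactors = FALSE)
names(counts) <- c("income", "religious", "n")
counts
```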

If you just want to select the income and religious columns, you can also do that using [

sub_df[c("religious", "income")]
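The row filter and the column selection can also be combined in one call with base R's subset(). This is an alternative sketch, not the approach used above, and the data frame below is invented for illustration. Like which(), subset() drops rows where the condition evaluates to NA.

```r
# Toy data, invented for illustration only
df <- data.frame(
  ethnicity = c("xyz", "abc", "xyz", NA),
  education = c("Bachelor's Degree", "Graduate Degree",
                "Graduate Degree", "Graduate Degree"),
  income    = c("$30,000 - $39,999", "$50,000 - $59,999",
                "$100,000 - $124,999", "$40,000 - $49,999"),
  religious = c("not religious", "religious", "religious", "religious"),
  stringsAsFactors = FALSE
)

# Filter rows on conditions 1 and 2, then keep only the two columns of interest
result <- subset(
  df,
  ethnicity == "xyz" &
    education %in% c("Bachelor's Degree", "Graduate Degree"),
  select = c(income, religious)
)
result
```

subset() is convenient interactively; for code inside functions, explicit indexing with [ as shown above is usually the safer choice.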

Source

2015-10-04 22:57:57 user20650

Thank you so much. This took me a very long time :( – AlanH

You're very welcome. The [R tag info](http://stackoverflow.com/tags/r/info) has some very useful links. – user20650

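library(dplyr)

df %>%
  # keep only rows matching both conditions
  filter(ethnicity == "xyz" &
           education %in% c("Bachelor's Degree", "Graduate Degree")) %>%
  # one summary row per religious status
  group_by(religious) %>%
  # min()/max() compare income as text unless it is an ordered factor
  summarize(lower_bound = min(income),
            upper_bound = max(income))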
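One caveat when taking min() and max() of bracket labels: if income is stored as plain text, the comparison is alphabetical rather than by dollar amount, so a label starting with "$100,000" will typically sort before one starting with "$30,000". A possible workaround, sketched here in base R under the assumption that the brackets are known in advance (the three labels and their order in `bracket_levels` are supplied by hand for illustration), is to convert income to an ordered factor first.

```r
# Toy income labels, invented for illustration only
income <- c("$30,000 - $39,999", "$100,000 - $124,999", "$50,000 - $59,999")

# Assumed bracket order, lowest to highest; adjust to the real data's brackets
bracket_levels <- c("$30,000 - $39,999",
                    "$50,000 - $59,999",
                    "$100,000 - $124,999")

# An ordered factor makes min() and max() follow the bracket order
income_ord <- factor(income, levels = bracket_levels, ordered = TRUE)

lowest  <- min(income_ord)
highest <- max(income_ord)

as.character(lowest)   # "$30,000 - $39,999"
as.character(highest)  # "$100,000 - $124,999"
```

Any label missing from `bracket_levels` becomes NA, so it is worth checking the result with table(income_ord, useNA = "ifany") before summarising.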

Source

2015-10-04 22:22:10 bramtayl
