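I'm fairly new to R. How can I use the "survey" package (http://r-survey.r-forge.r-project.org/survey/) to analyze a multiple-response question in a weighted sample? The tricky part is that respondents can tick more than one response, so the answers are stored across several columns.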

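Example:

I have survey data from 500 respondents drawn at random from 10 districts across the country. Suppose the main question asked (stored in the column H1_AreYouHappy) was: 'Are you happy?' - Yes/No/Don't know

Respondents were then asked a follow-up question: 'Why are you (happy)?' This was a multiple-choice question where more than one answer box could be ticked, so the answers are stored in separate columns, for example:

H1Yes_Why1 (0/1, i.e. box ticked or not ticked) - 'Because of the economy';

H1Yes_Why2 (0/1) - 'Because I'm healthy';

H1Yes_Why3 (0/1) - 'Because of my social life'.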

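I applied post-stratification weights based on each district's actual share of the population (the code uses myDataFrame, the fake data set shown further down):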

library(survey) 
# Create an unweighted survey object 
mySurvey.unweighted <- svydesign(ids=~1, data=myDataFrame) 

# Choose which variable contains the sample distribution to be weighted by 
sample.distribution <- list(~District) 

# Specify (from Census data) how often each level occurs in the population 
population.distribution <- data.frame(District = c('Green', 'Red','Orange','Blue','Purple','Grey','Black','Yellow','White','Lavender'), 
           freq = c(0.1824885, 0.0891206, 0.1381343, 0.1006533, 0.1541269, 0.0955853, 0.0268172, 0.0398353, 0.0809459, 0.0922927)) 

# Apply the weights 
mySurvey.rake <- rake(design = mySurvey.unweighted, sample.margins=sample.distribution, population.margins=list(population.distribution)) 

# Calculate the weighted mean for the main question 
svymean(~H1_AreYouHappy, mySurvey.rake) 

# How can I calculate the WEIGHTED means for the multiple choice - multiple response follow-up question?
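As a side note, the idea behind weighting to a single margin can be sketched in base R without the survey package. The toy data and the population shares below are hypothetical and only illustrate the arithmetic: each respondent's weight is the population share of their district divided by the sample share of that district. The scaling of the weights (here they sum to the sample size) does not affect a weighted mean.

```r
# Toy sample: district of six respondents (hypothetical data)
district <- c("Green", "Green", "Green", "Green", "Red", "Red")
# A 0/1 answer for each respondent (hypothetical data)
happy <- c(1, 1, 1, 0, 1, 0)

# Known population shares (hypothetical numbers for illustration)
pop.share <- c(Green = 0.5, Red = 0.5)

# Share of the sample that falls in each district
sample.share <- prop.table(table(district))

# Adjustment factor per district: population share / sample share
adj <- pop.share[names(sample.share)] / as.numeric(sample.share)

# One weight per respondent, looked up by district name
w <- as.numeric(adj[district])

# Weighted versus unweighted mean of the 0/1 answer
weighted.mean(happy, w)
mean(happy)
```

Green is over-represented in this toy sample, so its respondents are weighted down and the Red respondents are weighted up.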

Here is my fake data set:

districts <- c('Green', 'Red','Orange','Blue','Purple','Grey','Black','Yellow','White','Lavender') 
myDataFrame <- data.frame(H1_AreYouHappy=sample(c('Yes','No','Dont Know'),500,rep=TRUE), 
          H1Yes_Why1 = sample(0:1,500,rep=TRUE), 
          H1Yes_Why2 = sample(0:1,500,rep=TRUE), 
          H1Yes_Why3 = sample(0:1,500,rep=TRUE), 
          District = sample(districts,500,rep=TRUE), stringsAsFactors=TRUE)

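Having applied post-stratification weights with the R "survey" package, how can I calculate the weighted means for the multiple-response question (i.e. across the 0/1 response columns)?

If I wanted it unweighted, I could use this function to calculate the frequencies across all columns that match my prefix 'H1Yes_Why':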

multipleResponseFrequencies = function(data, question.prefix) { 
    # Find the columns with the questions 
    a = grep(question.prefix, names(data)) 
    # Find the total number of responses 
    b = sum(data[, a] != 0) 
    # Find the totals for each question 
    d = colSums(data[, a] != 0) 
    # Find the number of respondents 
    e = sum(rowSums(data[,a]) !=0) 
    # d and b combined into one vector. This is the overall frequency 
    f = as.numeric(c(d, b)) 
    result <- data.frame(question = c(names(d), "Total"), 
         freq = f, 
         percent = (f/b)*100, 
         percentofcases = (f/e)*100) 
    result 
} 
multipleResponseFrequencies(myDataFrame, 'H1Yes_Why')

Any help would be greatly appreciated.

Source

2016-07-30 Chris G.

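You might be better off working through one of the examples at http://asdfree.com/ –

@AnthonyDamico Do your examples show how to analyze multiple-response questions? Is there any example? – SmallChess

I think you want: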

svyratio(~ H1Yes_Why1 + H1Yes_Why2 + H1Yes_Why3 , ~ as.numeric(H1Yes_Why1 + H1Yes_Why2 + H1Yes_Why3) , mySurvey.rake)
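To see what such a ratio represents, here is a minimal base R sketch of a weighted analogue of the question's multipleResponseFrequencies(). The function name weightedMultipleResponse and the toy data are hypothetical; the weight vector w is assumed to hold one case weight per row (for a survey design it could, for instance, be taken from weights(mySurvey.rake)). It gives point estimates only, with no standard errors. The percent column, divided by 100, should correspond to the ratio of each item's weighted total to the weighted total of all ticks.

```r
# Weighted analogue of the question's multipleResponseFrequencies().
# `w` is a numeric vector of case weights, one per row of `data`
# (assumption: supplied by the caller, e.g. w <- weights(mySurvey.rake)).
weightedMultipleResponse <- function(data, question.prefix, w) {
    # Columns holding the 0/1 response indicators
    cols <- grep(question.prefix, names(data))
    # Logical matrix: TRUE where a box was ticked
    ticked <- data[, cols, drop = FALSE] != 0
    # Weighted number of ticks per item (row i is multiplied by w[i])
    item.totals <- colSums(ticked * w)
    # Weighted number of ticks over all items
    all.ticks <- sum(item.totals)
    # Weighted number of respondents who ticked at least one box
    cases <- sum(w[rowSums(ticked) > 0])
    data.frame(question = names(item.totals),
               weighted.freq = as.numeric(item.totals),
               percent = as.numeric(100 * item.totals / all.ticks),
               percentofcases = as.numeric(100 * item.totals / cases))
}

# Toy data (hypothetical): four respondents, three response columns
toy <- data.frame(H1Yes_Why1 = c(1, 0, 1, 0),
                  H1Yes_Why2 = c(1, 1, 0, 0),
                  H1Yes_Why3 = c(0, 0, 1, 0))
w <- c(2, 1, 1, 4)
res <- weightedMultipleResponse(toy, "H1Yes_Why", w)
res
```

In the toy data the fourth respondent ticked nothing, so their weight counts toward neither the responses nor the cases, mirroring how the unweighted function treats respondents with no ticks.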

Source

2017-01-23 19:02:41
