2016-05-31

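SAS proc freq for different values of a variable

I have a table that contains the courses each student selected over 2 semesters. None of these students validated their first semester, so valid_or_not_of_semester='N' on every row where semester='1st'.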

student semester course_selected valid_or_not_of_semester 
    A   1st   math    N 
    A   1st   english   N 
    A   2nd   math    Y 
    A   2nd   english   Y 
    B   1st   math    N 
    B   2nd   math    Y 
    B   2nd   english   Y 
    C   1st   math    N 
    C   2nd   math    N 

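For the students who selected math (or english) in the 1st semester, I want to check whether they selected math (or english) in the 2nd semester. If they did, I want to build a cross table that counts how many of them validated, or did not validate, their 2nd semester: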

-------------------------------------------------------------------------- 
    1st semester \ 2nd semester |   Math  | English 
    invalid  \    |---------------------|-------------------- 
    students  \   | valid | invalid | valid | invalid 
-------------------------------------------------------------------------- 
      Math     | 2 | 1  | 2 |  0 
-------------------------------------------------------------------------- 
     English    | 1 | 0  | 1 |  0 
-------------------------------------------------------------------------- 

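Each row counts the students who did not validate the 1st semester and who selected that course in the 1st semester. The columns split those students by the course they selected in the 2nd semester and by whether that semester is valid or invalid. More precisely,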

-------------------------------------------------------------------------- 
    1st semester \ 2nd semester |   Math  | English 
    invalid  \    |---------------------|-------------------- 
    students  \   | valid | invalid | valid | invalid 
-------------------------------------------------------------------------- 
      Math     | 2 |  1  | 2 |  0 
            |   |   | 
            \/  \/  \/
         (students A&B) (student C) (students A&B) 

I tried a DATA step merge followed by PROC SQL:

data math; 
    merge have 
    have (where=(semester='1st') in=these); 
    by student; 
    if these then output; 
run; 

proc sql; 
    create table result as 
    select count(distinct student) as nb_student 
    from math (where=(semester='2nd')) 
    group by course_selected, valid_or_not_of_semester; 
quit; 

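Then I do the same thing for english.

But is there a way to get the results for both courses directly? How can I use proc freq?

Thanks in advance for your answers.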

Answer

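This does not produce exactly the table you are looking for, but it does generate the values you are interested in. The idea is to transpose the original dataset so that each student becomes a single row, and then count the observations.

You may also want to look at proc tabulate, but you could run into problems there because some students are counted more than once (a student who took two courses in the 1st semester contributes to two rows of the table).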

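/* Sample data from the question */
data temp; 
    input student $ semester $ course_selected $ valid_or_not_of_semester $; 
    datalines; 
    A 1st math N 
    A 1st english N 
    A 2nd math Y 
    A 2nd english Y 
    B 1st math N 
    B 2nd math Y 
    B 2nd english Y 
    C 1st math N 
    C 2nd math N 
    ; 
run; 

/* BY-group processing in proc transpose needs the data sorted by student */
proc sort data = temp; 
    by student; 
run; 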

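/* One row per student; the two ID variables are concatenated into the
   new column names: math1st, english1st, math2nd, english2nd */
proc transpose data = temp out = temp2; 
    by student; 
    id course_selected semester; 
    var valid_or_not_of_semester; 
run; 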

/* Column names read <1st-semester course>_<2nd-semester course>_<valid|invalid>.
   The result is a single row of counts, so no DISTINCT is needed. */
proc sql; 
    create table temp3 as select 
        sum(case when math1st = "N" and math2nd = "Y" then 1 else 0 end) as math_math_valid, 
        sum(case when math1st = "N" and math2nd = "N" then 1 else 0 end) as math_math_invalid, 
        sum(case when english1st = "N" and math2nd = "Y" then 1 else 0 end) as english_math_valid, 
        sum(case when english1st = "N" and math2nd = "N" then 1 else 0 end) as english_math_invalid, 
        sum(case when math1st = "N" and english2nd = "Y" then 1 else 0 end) as math_english_valid, 
        sum(case when math1st = "N" and english2nd = "N" then 1 else 0 end) as math_english_invalid, 
        sum(case when english1st = "N" and english2nd = "Y" then 1 else 0 end) as english_english_valid, 
        sum(case when english1st = "N" and english2nd = "N" then 1 else 0 end) as english_english_invalid 
    from temp2; 
quit;
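
If the cross-table layout itself matters, one possible alternative (a sketch, not part of the original answer) is to pair every 1st-semester course with every 2nd-semester course of the same student and let proc freq or proc tabulate do the counting. The dataset name pairs and the variables course_1st, course_2nd and valid_2nd are made up for this illustration, and the sketch assumes a student selects a given course at most once per semester, so that each row of pairs stands for one student.

```sas
/* Assumption: temp holds the data from the question, with at most one
   row per student, semester and course. */
proc sql;
    create table pairs as
    select a.student,
           a.course_selected          as course_1st,
           b.course_selected          as course_2nd,
           b.valid_or_not_of_semester as valid_2nd
    from temp as a
         inner join temp as b
         on a.student = b.student
    where a.semester = '1st'
      and a.valid_or_not_of_semester = 'N'   /* 1st semester not validated */
      and b.semester = '2nd';
quit;

/* One line per combination that occurs in the data */
proc freq data = pairs;
    tables course_1st * course_2nd * valid_2nd / list nocum nopercent;
run;

/* Rows = 1st-semester course, columns = 2nd-semester course by validity.
   PRINTMISS keeps combinations that never occur; MISSTEXT shows them as 0. */
proc tabulate data = pairs;
    class course_1st course_2nd valid_2nd;
    table course_1st, course_2nd * valid_2nd * n = '' / printmiss misstext = '0';
run;
```

With the sample data this should give the same counts as the table in the question, although the class levels are ordered alphabetically by default (english before math, N before Y), so the columns will not appear in the same order as in the hand-drawn layout.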