2017-11-17 183 views

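Different results from SAS PROC UNIVARIATE and the R quantile function

I am using the quantile function in R to compute the 90th, 75th, 50th, and 25th percentiles, but my colleague ran the same calculation with SAS PROC UNIVARIATE and got different results (for example, the 90th percentile from R is 47.36, but the 90th percentile from SAS is 50.64). I am trying to find out why. Can anyone give me some guidance?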

R code:

quantile(c(43.55,41.30,39.40,40.93,38.74,39.97,45.38,41.48,45.01,42.03,44.71,43.42,45.83,43.44,37.84,50.64,53.16,45.95), probs = c(0.90,0.10,0.75,0.50,0.25))

SAS code:

data x; 
    input x; 
    datalines; 
    43.55 
    41.30 
    39.40 
    40.93 
    38.74 
    39.97 
    45.38 
    41.48 
    45.01 
    42.03 
    44.71 
    43.42 
    45.83 
    43.44 
    37.84 
    50.64 
    53.16 
    45.95 

    ; 
    run; 
    proc univariate data=x noprint ; 
    var x; 
    output out=new p90=p90 p10=p10 q3=p75 median=p50 q1=p25 ; 
    run; 

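In R, 'quantile()' has a 'type =' argument. (There is no single agreed definition of a quantile.) Try 'type = 3' (which should match the SAS definition). See '?quantile' for the different definitions. – MrFlick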
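One quick way to act on this comment is to compute the 90th percentile under every definition that '?quantile' documents and see which ones land on the SAS value. The following is only a sketch on the data from the question; the names `p90_by_type` and `ty` are made up for illustration.

```r
# Data from the question
x <- c(43.55, 41.30, 39.40, 40.93, 38.74, 39.97, 45.38, 41.48, 45.01,
       42.03, 44.71, 43.42, 45.83, 43.44, 37.84, 50.64, 53.16, 45.95)

# 90th percentile under each of the nine definitions (type = 1..9)
p90_by_type <- sapply(1:9, function(ty) {
  quantile(x, probs = 0.90, type = ty, names = FALSE)
})
names(p90_by_type) <- paste0("type", 1:9)
print(round(p90_by_type, 2))

# Which definitions reproduce the 50.64 reported from SAS?
print(names(p90_by_type)[abs(p90_by_type - 50.64) < 1e-8])
```

Comparing a single percentile is not conclusive, since two definitions can agree at one probability and differ at another, so it is worth repeating the check for the other percentiles as well.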

Answer


The default method in R is type 7, whereas the SAS default is probably the empirical distribution function with averaging.
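To see where 47.36 and 50.64 come from, both rules can be worked by hand for the 90th percentile. This is a sketch under two assumptions: that R's type 7 interpolates linearly at position (n - 1) * p + 1, as '?quantile' describes, and that the averaging rule writes n * p = j + g and takes the (j + 1)-th ordered value when g > 0, or the mean of the j-th and (j + 1)-th when g = 0. All variable names are illustrative.

```r
# Sorted data from the question
x <- sort(c(43.55, 41.30, 39.40, 40.93, 38.74, 39.97, 45.38, 41.48, 45.01,
            42.03, 44.71, 43.42, 45.83, 43.44, 37.84, 50.64, 53.16, 45.95))
n <- length(x)  # 18 observations
p <- 0.90

# R default (type 7): linear interpolation at position h = (n - 1) * p + 1
h  <- (n - 1) * p + 1            # 16.3
lo <- floor(h)                   # 16
type7_by_hand <- x[lo] + (h - lo) * (x[lo + 1] - x[lo])

# Empirical distribution function with averaging: n * p = j + g
np <- n * p                      # 16.2
j  <- floor(np)                  # 16
g  <- np - j                     # 0.2
# g > 0: take the next order statistic; g == 0: average the two neighbours
edf_by_hand <- if (g > 1e-8) x[j + 1] else mean(x[c(j, j + 1)])

print(c(type7 = type7_by_hand, edf_avg = edf_by_hand))
```

Type 7 lands between the 16th and 17th ordered values (45.95 and 50.64), while the empirical-distribution rule jumps straight to the 17th, which accounts for the gap reported in the question.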

If you add the option type = 1 in R, you get the 50.64 that SAS reported for the 90th percentile.

quantile(c(43.55,41.30,39.40,40.93,38.74,39.97,45.38,41.48,45.01,
           42.03,44.71,43.42,45.83,43.44,37.84,50.64,53.16,45.95),
         probs = c(0.90, 0.10, 0.75, 0.50, 0.25),
         type = 1)
  90%   10%   75%   50%   25% 
50.64 38.74 45.38 43.42 40.93 
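One caveat worth checking: '?quantile' describes type = 1 as the inverse of the empirical distribution function, and type = 2 as the same rule with averaging at discontinuities. If the SAS default really is the averaging variant, type = 2 may be the closer match. The two can differ only where n * p is a whole number, which for these 18 values happens at the median (18 * 0.5 = 9). A small sketch to compare them side by side; the name `comparison` is made up for illustration.

```r
# Data and probabilities from the question
x <- c(43.55, 41.30, 39.40, 40.93, 38.74, 39.97, 45.38, 41.48, 45.01,
       42.03, 44.71, 43.42, 45.83, 43.44, 37.84, 50.64, 53.16, 45.95)
probs <- c(0.90, 0.10, 0.75, 0.50, 0.25)

# One row per definition, one column per requested percentile
comparison <- rbind(
  type1 = quantile(x, probs = probs, type = 1),  # EDF without averaging
  type2 = quantile(x, probs = probs, type = 2)   # EDF with averaging
)
print(comparison)
```

If the SAS output shows a median of 43.43 rather than 43.42, that would point to the averaging definition.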

Thank you very much!