2016-02-25

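Theil index: Python vs. R

I want to calculate the Theil index in both Python and R, but with the functions I was given I get different answers. Here is the formula I am trying to use: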

[Image: Theil calculation formula]
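Assuming the formula referred to here is the standard Theil T index (an assumption, although it is what the `parameter = 0` branch of the R code further down computes), it can be written as follows, where $N$ is the number of observations and $\mu$ is their mean:

```latex
T = \frac{1}{N} \sum_{i=1}^{N} \frac{x_i}{\mu} \ln\!\left(\frac{x_i}{\mu}\right),
\qquad
\mu = \frac{1}{N} \sum_{i=1}^{N} x_i
```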

Using the ineq package in R, I can easily get the Theil index:

library(ineq) 
x=c(26.1,16.1,15.5,15.4,14.8,14.7,13.7,12.1,11.7,11.6,11,10.8,10.8,7.5) 
Theil(x) 
0.04152699 

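This implementation seems to make sense. I can look at the provided code to see exactly which calculations are performed, and it appears to follow the formula (zeros are removed first so that the logarithm can be taken):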

getAnywhere(Theil) 
Out[24]: 
A single object matching ‘Theil’ was found 
It was found in the following places 
    package:ineq 
    namespace:ineq 
with value 

function (x, parameter = 0, na.rm = TRUE)
{
    if (!na.rm && any(is.na(x)))
        return(NA_real_)
    x <- as.numeric(na.omit(x))
    if (is.null(parameter))
        parameter <- 0
    if (parameter == 0) {
        x <- x[!(x == 0)]
        Th <- x/mean(x)
        Th <- sum(x * log(Th))
        Th <- Th/sum(x)
    }
    else {
        Th <- exp(mean(log(x)))/mean(x)
        Th <- -log(Th)
    }
    Th
}
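For comparison, here is a minimal Python sketch that mirrors the R function above line by line. The name `theil_like_ineq` is made up for illustration and is not part of any library; it only assumes the logic shown in the R source (drop missing values, drop zeros in the `parameter == 0` branch, then apply the formula).

```python
import math


def theil_like_ineq(x, parameter=0):
    """Hypothetical port of the R function shown above (not a library API)."""
    # Mirror na.omit(): drop missing values.
    values = [float(v) for v in x if not math.isnan(float(v))]
    if parameter == 0:
        # Drop zeros so the logarithm is defined, as the R code does.
        values = [v for v in values if v != 0]
        mean = sum(values) / len(values)
        # sum(x * log(x / mean(x))) / sum(x)
        return sum(v * math.log(v / mean) for v in values) / sum(values)
    # Any other parameter: -log(geometric mean / arithmetic mean).
    mean = sum(values) / len(values)
    geometric_mean = math.exp(sum(math.log(v) for v in values) / len(values))
    return -math.log(geometric_mean / mean)


x = [26.1, 16.1, 15.5, 15.4, 14.8, 14.7, 13.7, 12.1, 11.7, 11.6, 11, 10.8, 10.8, 7.5]
theil_t = theil_like_ineq(x)
print(theil_t)  # agrees with the R output 0.04152699 to the digits shown
```

Note that the raw amounts go in directly: the division by `mean` and by `sum(values)` takes care of scaling inside the function.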

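However, I found that this question has already been answered for Python. The code from that answer is below, but for some reason the results do not match: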

import math
import numpy as np

def T(x):
    # Redundancy = maximum entropy - actual entropy
    n = len(x)
    maximum_entropy = math.log(n)
    actual_entropy = H(x)
    redundancy = maximum_entropy - actual_entropy
    inequality = 1 - math.exp(-redundancy)
    return redundancy, inequality

def Group_negentropy(x_i):
    # x_i * log(x_i), using the convention 0 * log(0) = 0
    if x_i == 0:
        return 0
    else:
        return x_i * math.log(x_i)

def H(x):
    # Entropy: minus the sum of x_i * log(x_i) over all x_i
    entropy = 0.0
    for x_i in x:
        entropy += Group_negentropy(x_i)
    return -entropy

x = np.array([26.1,16.1,15.5,15.4,14.8,14.7,13.7,12.1,11.7,11.6,11,10.8,10.8,7.5])
T(x)
(512.62045438815949, 1.0)

Answer


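It is not stated explicitly in the other question, but that implementation expects its input to be normalized, so that each x_i is a proportion of total income rather than an actual amount. (That is why the other code has an error_if_not_in_range01 function, which raises an error if any x_i is not between 0 and 1.)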
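To see why normalized input makes the entropy-based code agree with the R formula, write $p_i = x_i / \sum_j x_j = x_i / (N\mu)$, so that $\sum_i p_i = 1$ and $N p_i = x_i / \mu$. The redundancy computed by `T` is then:

```latex
\ln N - H(p)
  = \ln N \sum_{i=1}^{N} p_i + \sum_{i=1}^{N} p_i \ln p_i
  = \sum_{i=1}^{N} p_i \ln (N p_i)
  = \frac{1}{N} \sum_{i=1}^{N} \frac{x_i}{\mu} \ln\!\left(\frac{x_i}{\mu}\right)
```

The first step relies on $\sum_i p_i = 1$, which does not hold for raw amounts; that is why the unnormalized call returns a large number instead of the Theil index.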

If you normalize your x, you get the same result as the R code:

>>> T(x/x.sum()) 
(0.041526988117662533, 0.0406765553418974) 

(The first value is what R reports.)
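One way to avoid the mismatch altogether is to normalize inside the function. The sketch below is an illustration only: `theil_redundancy` is a hypothetical name, and the sum check is an assumption about reasonable input validation, not something taken from the other question's code.

```python
import math


def theil_redundancy(values):
    """Return (redundancy, inequality) for raw, non-negative amounts.

    Hypothetical helper: it converts the amounts to shares of the total
    before computing the entropy, so callers do not have to normalize.
    """
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    total = float(sum(values))
    if total <= 0:
        raise ValueError("values must have a positive sum")
    # Each share lies in [0, 1] and the shares sum to 1.
    shares = [v / total for v in values]
    # Entropy with the convention 0 * log(0) = 0.
    entropy = -sum(s * math.log(s) for s in shares if s > 0)
    redundancy = math.log(len(shares)) - entropy
    inequality = 1 - math.exp(-redundancy)
    return redundancy, inequality


x = [26.1, 16.1, 15.5, 15.4, 14.8, 14.7, 13.7, 12.1, 11.7, 11.6, 11, 10.8, 10.8, 7.5]
redundancy, inequality = theil_redundancy(x)
print(redundancy, inequality)  # same pair as T(x/x.sum()) above, up to rounding
```

One difference to keep in mind: this sketch, like the Python code in the question, counts zero entries in `len(shares)`, whereas the R function drops zeros before computing the mean, so the two can disagree when the data contain zeros.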