Locally weighted smoothing for a binary-valued random variable

F(x) = 1 with probability g(x)

F(x) = 0 with probability 1 - g(x)

where 0 < g(x) < 1.

Assume g(x) = x. Say I observe this variable without knowing the function g, and obtain 200 samples as follows:

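import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import binned_statistic

# Column 0 holds x, column 1 holds the observed 0/1 outcome
list = np.empty(shape=(200, 2))

g = np.random.rand(200)  # g(x) = x, so x itself is the success probability
for i in range(len(g)):
    # Draw 1 with probability g[i], 0 otherwise
    list[i] = (g[i], np.random.choice([0, 1], p=[1 - g[i], g[i]]))

print(list)
plt.plot(list[:, 0], list[:, 1], 'o')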

Plot of 0s and 1s

I would like to recover the function g from these points. The best I could think of is to draw a histogram and use the mean statistic:

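# Mean of the 0/1 outcomes within each of 10 equal-width bins of x
bin_means, bin_edges, bin_number = binned_statistic(list[:,0], list[:,1], statistic='mean', bins=10)
# Draw each bin mean as a horizontal segment spanning its bin
plt.hlines(bin_means, bin_edges[:-1], bin_edges[1:], lw=2)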

Histogram mean statistics
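As a short aside on why the mean is the natural statistic here: for a 0/1 variable the expected value equals the probability of a 1, so the average outcome of the points near x estimates g(x), provided g changes little across the bin. In the notation of the question:

```latex
\mathbb{E}[F(x)] = 1 \cdot g(x) + 0 \cdot \bigl(1 - g(x)\bigr) = g(x)
```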

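Instead, I would like to have a continuous estimate of the generating function.

I guess this is related to kernel density estimation, but I could not find the right pointers.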
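One possible continuous analogue of the binned mean above is a kernel-weighted average (a Nadaraya-Watson estimator), which replaces the hard bin edges with smoothly decaying weights. The following is only a minimal sketch under assumptions not in the original post: the function name `kernel_mean`, the Gaussian kernel, and the bandwidth of 0.1 are all illustrative choices.

```python
import numpy as np

def kernel_mean(x, y, grid, bandwidth=0.1):
    """Gaussian-weighted mean of y around each grid point (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Scaled distances between every grid point and every sample,
    # shape (len(grid), len(x))
    z = (grid[:, None] - x[None, :]) / bandwidth
    weights = np.exp(-0.5 * z ** 2)
    # Weighted average of the 0/1 outcomes, so each value lies in [0, 1]
    return (weights @ y) / weights.sum(axis=1)

# Simulated data as in the question, with g(x) = x (seed chosen arbitrarily)
rng = np.random.default_rng(0)
x = rng.random(200)
y = (rng.random(200) < x).astype(float)

grid = np.linspace(0.0, 1.0, 101)
g_hat = kernel_mean(x, y, grid, bandwidth=0.1)
```

Plotting `g_hat` against `grid` gives a smooth curve instead of steps. A smaller bandwidth follows the data more closely but is noisier, much like using more bins.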

Source

2017-07-14 user1860037

You can find KDEs in 'statsmodels' and 'sklearn'; 'scipy' has one too. If you just want to take a look, use 'seaborn' and its 'distplot' or 'kdeplot'. But why do you want a KDE for binary data?

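@MarvinTaschenberger My remark about KDE may have been misleading. It seems I have a logistic regression problem: https://en.wikipedia.org/wiki/Logistic_regression#Example:_Probability_of_passing_an_exam_versus_hours_of_study. But I do not want to fit a model. I want to plot it in a smooth way. – user1860037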

This also looks relevant: http://thestatsgeek.com/2014/09/13/checking-functional-form-in-logistic-regression-using-loess/ – user1860037

A simple way, without explicitly fitting an estimator:

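import seaborn as sns
# 'x', 'y' and df are placeholders for your own column names and DataFrame;
# logistic=True requires statsmodels to be installed
g = sns.lmplot(x='x', y='y', data=df, y_jitter=.02, logistic=True)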

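Plug in your exogenous variable as x= and, likewise, your dependent variable as y=. y_jitter adds a little vertical noise to the points, which improves visibility if you have many data points. logistic=True is the main point here: it gives you the logistic regression line for the data.

Seaborn is basically built around matplotlib and works well with pandas, in case you want to put your data into a DataFrame.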

Source

2017-07-14 16:11:02

Now I understand that what I was looking for is locally weighted scatterplot smoothing. Thanks for pointing me to sns. df = pd.DataFrame(); df['x'] = list[:,0]; df['y'] = list[:,1]; sns.lmplot(x='x', y='y', data=df, lowess=True); plt.show() – user1860037
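For readers who want to see what locally weighted scatterplot smoothing does, here is a rough NumPy-only sketch of the core idea: at each evaluation point, fit a straight line to the nearest fraction of the data using tricube weights. It is not the routine that seaborn calls for lowess=True, and it omits the robustness iterations of full LOWESS. The name `local_linear_smooth` and the value `frac=0.5` are assumptions made for illustration.

```python
import numpy as np

def local_linear_smooth(x, y, grid, frac=0.5):
    """Tricube-weighted local linear fit at each grid point (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    grid = np.asarray(grid, dtype=float)
    k = max(2, int(np.ceil(frac * len(x))))  # neighbours used in each local fit
    fitted = np.empty(len(grid))
    for i, x0 in enumerate(grid):
        dist = np.abs(x - x0)
        idx = np.argsort(dist)[:k]              # indices of the k nearest samples
        h = dist[idx].max()                     # local bandwidth
        w = (1.0 - (dist[idx] / h) ** 3) ** 3   # tricube weights in [0, 1]
        # Weighted least squares for y ~ a + b * (x - x0);
        # the intercept a is the fitted value at x0
        X = np.column_stack([np.ones(k), x[idx] - x0])
        sw = np.sqrt(w)
        coef, *_ = np.linalg.lstsq(X * sw[:, None], y[idx] * sw, rcond=None)
        fitted[i] = coef[0]
    return fitted

# Simulated data as in the question, with g(x) = x (seed chosen arbitrarily)
rng = np.random.default_rng(0)
x = rng.random(200)
y = (rng.random(200) < x).astype(float)

grid = np.linspace(0.0, 1.0, 51)
g_smooth = local_linear_smooth(x, y, grid, frac=0.5)
```

Because each local fit is a straight line, the result is not constrained to [0, 1] and may overshoot slightly near the edges. Clipping with `np.clip(g_smooth, 0, 1)` is one simple remedy if the curve is to be read as a probability.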
