如何生成由N = 100
從2維高斯分佈畫出2維樣本x = (x1,x2)T ∈ R2
的數據組,平均生成由N = 100 2維樣本
µ = (1,1)T
的數據集
和協方差矩陣
Σ = (0.3 0.2
0.2 0.2)
有人告訴我,你可以使用Matlab的功能randn
,但不知道如何在Python中實現它?
如何生成由N = 100
從2維高斯分佈畫出2維樣本x = (x1,x2)T ∈ R2
的數據組,平均生成由N = 100 2維樣本
µ = (1,1)T
的數據集
和協方差矩陣
Σ = (0.3 0.2
0.2 0.2)
有人告訴我,你可以使用Matlab的功能randn
,但不知道如何在Python中實現它?
只是要詳細說明@ EamonNerbonne的答案:下面使用協方差矩陣的Cholesky decomposition從不相關的正態分佈隨機變量生成相關變量。
import numpy as np
import matplotlib.pyplot as plt
linalg = np.linalg
N = 1000
mean = [1,1]
cov = [[0.3, 0.2],[0.2, 0.2]]
data = np.random.multivariate_normal(mean, cov, N)
L = linalg.cholesky(cov)
# print(L.shape)
# (2, 2)
uncorrelated = np.random.standard_normal((2,N))
data2 = np.dot(L,uncorrelated) + np.array(mean).reshape(2,1)
# print(data2.shape)
# (2, 1000)
plt.scatter(data2[0,:], data2[1,:], c='green')
plt.scatter(data[:,0], data[:,1], c='yellow')
plt.show()
通過np.random.multivariate_normal
產生黃色點。綠點是通過將正態分佈點乘以喬列斯基分解矩陣L
而生成的。
是的,這正是我想到的:-) - 很好的例子! – 2013-02-17 14:10:17
當我嘗試運行它時,出現此錯誤:L = linalg.cholesky(cov) NameError:名稱'linalg'未定義 – pythonnewbie 2013-02-17 20:32:35
哎呀。我忘了加入'linalg = np.linalg'。 (帖子已被更正。) – unutbu 2013-02-17 20:49:01
您正在尋找numpy.random.multivariate_normal
代碼
>>> import numpy
>>> print numpy.random.multivariate_normal([1,1], [[0.3, 0.2],[0.2, 0.2]], 100)
[[ 0.02999043 0.09590078]
[ 1.35743021 1.08199363]
[ 1.15721179 0.87750625]
[ 0.96879114 0.94503228]
[ 1.23989167 1.13473083]
[ 1.55917608 0.81530847]
[ 0.89985651 0.7071519 ]
[ 0.37494324 0.739433 ]
[ 1.45121732 1.17168444]
[ 0.69680785 1.2727178 ]
[ 0.35600769 0.46569276]
[ 2.14187488 1.8758589 ]
[ 1.59276393 1.54971412]
[ 1.71227009 1.63429704]
[ 1.05013136 1.1669758 ]
[ 1.34344004 1.37369725]
[ 1.82975724 1.49866636]
[ 0.80553877 1.26753018]
[ 1.74331784 1.27211784]
[ 1.23044292 1.18110192]
[ 1.07675493 1.05940509]
[ 0.15495771 0.64536509]
[ 0.77409745 1.0174171 ]
[ 1.20062726 1.3870498 ]
[ 0.39619719 0.77919884]
[ 0.87209168 1.00248145]
[ 1.32273339 1.54428262]
[ 2.11848535 1.44338789]
[ 1.45226461 1.42061198]
[ 0.33775737 0.24968543]
[ 1.06982557 0.64674411]
[ 0.92113229 1.0583153 ]
[ 0.54987592 0.73198037]
[ 1.06559727 0.77891362]
[ 0.84371805 0.72957046]
[ 1.83614557 1.40582746]
[ 0.53146009 0.72294094]
[ 0.98927818 0.73732053]
[ 1.03984002 0.89426628]
[ 0.38142362 0.32471126]
[ 1.44464929 1.15407227]
[-0.22601279 0.21045592]
[-0.01995875 0.45051782]
[ 0.58779449 0.44486237]
[ 1.31335981 0.92875936]
[ 0.42200098 0.6942829 ]
[ 0.10714426 0.11083002]
[ 1.44997839 1.19052704]
[ 0.78630506 0.45877582]
[ 1.63432202 1.95066539]
[ 0.56680926 0.92203111]
[ 0.08841491 0.62890576]
[ 1.4703602 1.4924649 ]
[ 1.01118864 1.44749407]
[ 1.19936276 1.02534702]
[ 0.67893239 0.8482461 ]
[ 0.71537211 0.53279103]
[ 1.08031573 1.00779064]
[ 0.66412568 0.57121041]
[ 0.96098528 0.72318386]
[ 0.7690299 0.76058713]
[ 0.77466896 0.77559282]
[ 0.47906664 0.58602633]
[ 0.52481326 0.78486453]
[-0.40240438 0.17374116]
[ 0.75730444 0.22365892]
[ 0.67811008 1.17730408]
[ 1.62245699 1.71775386]
[ 1.12317847 1.04252136]
[-0.06461117 0.23557416]
[ 0.46299482 0.51585414]
[ 0.88125676 1.23284201]
[ 0.57920534 0.63765861]
[ 0.88239858 1.32092112]
[ 0.63500551 0.94788141]
[ 1.76588148 1.63856465]
[ 0.65026599 0.6899672 ]
[ 0.06854287 0.29712499]
[ 0.61575737 0.87526625]
[ 0.30057552 0.54475194]
[ 0.66578769 0.21034844]
[ 0.94670438 0.7699764 ]
[ 0.39870371 0.91681577]
[ 1.37531351 1.62337899]
[ 1.92350877 1.34382017]
[ 0.56631877 0.77456137]
[ 1.18702642 0.63700271]
[ 0.74002244 1.04535471]
[ 0.3272063 0.75097037]
[ 1.57583435 1.55809705]
[ 0.44325124 0.39620769]
[ 0.59762516 0.58304621]
[ 0.72253698 0.68302097]
[ 0.93459597 1.01101948]
[ 0.50139577 0.52500942]
[ 0.84696441 0.68679341]
[ 0.63483432 0.22205385]
[ 1.43642478 1.34724612]
[ 1.58663111 1.49941374]
[ 0.73832806 0.95690866]]
>>>
雖然numpy的具有方便實用的功能,可以隨時「重新調整」多個獨立的正態分佈變量來匹配您給出的協方差矩陣。因此,如果您可以生成每個元素正態分佈的列向量x
(或按矩陣分組的多個向量),並且您按矩陣M
進行縮放,則結果將具有協變性M M^T
。相反,如果你將協方差C
分解成M M^T
的形式,那麼即使沒有numpy提供的效用函數(只是將你正常分佈的一組矢量乘以M
),生成這樣的分佈也非常簡單。
這也許不是你直接尋找答案,但要記住例如,它是非常有用的:
你嘗試過'numpy'嗎? – ATOzTOA 2013-02-17 10:58:17
Python模塊'numpy','scipy'和'MDP'實現了大量用Matlab可以完成的工作。 – Will 2013-02-17 11:02:03