2017-09-14 142 views

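Python - clipping data to fit profiles

I have several data sets to which I am trying to fit different profiles. In one of them there is contamination at the center of the minimum that prevents me from getting a good fit, as you can see in this image:

[image: fits of the profiles]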

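How can I clip these spikes at the bottom of my data, given that the spike is not always at the same position? Or how would you handle data like this? I am using lmfit to fit the profiles, in this case a Lorentzian and a Gaussian. Here is a minimal working example, with initial values chosen so that the fit starts close to the data: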

import numpy as np 
import matplotlib.pyplot as plt 
from lmfit import Model 
from lmfit.models import GaussianModel, ConstantModel, LorentzianModel 

x = np.array([4085.18084467, 4085.38084374, 4085.5808428 , 4085.78084186, 4085.98084092, 4086.18083999, 4086.38083905, 4086.58083811, 4086.78083717, 4086.98083623, 4087.1808353 , 4087.38083436, 4087.58083342, 4087.78083248, 4087.98083155, 4088.18083061, 4088.38082967, 4088.58082873, 4088.78082779, 4088.98082686, 4089.18082592, 4089.38082498, 4089.58082404, 4089.78082311, 4089.98082217, 4090.18082123, 4090.38082029, 4090.58081935, 4090.78081842, 4090.98081748, 4091.18081654, 4091.3808156 , 4091.58081466, 4091.78081373, 4091.98081279, 4092.18081185, 4092.38081091, 4092.58080998, 4092.78080904, 4092.9808081 , 4093.18080716, 4093.38080622, 4093.58080529, 4093.78080435, 4093.98080341, 4094.18080247, 4094.38080154, 4094.5808006 , 4094.78079966, 4094.98079872, 4095.18079778, 4095.38079685, 4095.58079591, 4095.78079497, 4095.98079403, 4096.1807931 , 4096.38079216, 4096.58079122, 4096.78079028, 4096.98078934, 4097.18078841, 4097.38078747, 4097.58078653, 4097.78078559,4097.98078466, 4098.18078372, 4098.38078278, 4098.58078184, 4098.7807809 , 4098.98077997, 4099.18077903, 4099.38077809, 4099.58077715, 4099.78077622, 4099.98077528, 4100.18077434, 4100.3807734 , 4100.58077246, 4100.78077153, 4100.98077059, 4101.18076965, 4101.38076871, 4101.58076778, 4101.78076684, 4101.9807659 , 4102.18076496, 4102.38076402, 4102.58076309, 4102.78076215, 4102.98076121, 4103.18076027, 4103.38075934, 4103.5807584 , 4103.78075746, 4103.98075652, 4104.18075558, 4104.38075465, 4104.58075371, 4104.78075277, 4104.98075183, 4105.1807509 , 4105.38074996, 4105.58074902, 4105.78074808, 4105.98074714, 4106.18074621, 4106.38074527, 4106.58074433, 4106.78074339, 4106.98074246, 4107.18074152, 4107.38074058, 4107.58073964, 4107.7807387 , 4107.98073777, 4108.18073683, 4108.38073589, 4108.58073495, 4108.78073401, 4108.98073308, 4109.18073214, 4109.3807312 , 4109.58073026, 4109.78072933, 4109.98072839, 4110.18072745, 4110.38072651, 4110.58072557, 4110.78072464, 4110.9807237 , 4111.18072276, 4111.38072182, 
4111.58072089, 4111.78071995, 4111.98071901, 4112.18071807, 4112.38071713, 4112.5807162 , 4112.78071526, 4112.98071432, 4113.18071338, 4113.38071245, 4113.58071151, 4113.78071057, 4113.98070963, 4114.18070869, 4114.38070776, 4114.58070682, 4114.78070588, 4114.98070494, 4115.18070401, 4115.38070307, 4115.58070213, 4115.78070119, 4115.98070025, 4116.18069932, 4116.38069838, 4116.58069744, 4116.7806965 , 4116.98069557, 4117.18069463, 4117.38069369, 4117.58069275, 4117.78069181, 4117.98069088, 4118.18068994, 4118.380689 , 4118.58068806, 4118.78068713, 4118.98068619, 4119.18068525, 4119.38068431, 4119.58068337, 4119.78068244, 4119.9806815 , 4120.18068056, 4120.38067962, 4120.58067869, 4120.78067775, 4120.98067681, 4121.18067587, 4121.38067493, 4121.580674 , 4121.78067306, 4121.98067212, 4122.18067118, 4122.38067025, 4122.58066931, 4122.78066837, 4122.98066743, 4123.18066649, 4123.38066556, 4123.58066462, 4123.78066368, 4123.98066274, 4124.1806618 , 4124.38066087, 4124.58065993, 4124.78065899, 4124.98065805, 4125.18065712, 4125.38065618, 4125.58065524, 4125.7806543 , 4125.98065336, 4126.18065243, 4126.38065149, 4126.58065055, 4126.78064961, 4126.98064868, 4127.18064774, 4127.3806468 , 4127.58064586, 4127.78064492, 4127.98064399, 4128.18064305, 4128.38064211, 4128.58064117, 4128.78064024, 4128.9806393 , 4129.18063836, 4129.38063742, 4129.58063648, 4129.78063555, 4129.98063461, 4130.18063367, 4130.38063273, 4130.5806318 , 4130.78063086, 4130.98062992, 4131.18062898, 4131.38062804, 4131.58062711, 4131.78062617, 4131.98062523, 4132.18062429, 4132.38062336, 4132.58062242, 4132.78062148, 4132.98062054, 4133.1806196 , 4133.38061867, 4133.58061773, 4133.78061679, 4133.98061585, 4134.18061492, 4134.38061398, 4134.58061304, 4134.7806121 , 4134.98061116]) 
y = np.array([0.90312759, 1.00923175, 0.94618369, 0.98284045, 0.91510612,  0.96737804, 0.97690214, 0.94363369, 1.00887784, 1.00110387,  0.91647096, 0.97943202, 1.00672907, 1.01552094, 1.01089407,  0.96914584, 0.9908419 , 1.0176613 , 0.97032148, 0.96003562,  0.9702355 , 0.93684173, 0.94652734, 0.94895018, 1.01214356,  0.85777678, 0.89308203, 0.9789272 , 0.93901884, 0.9684622 ,  0.96969321, 0.86326307, 0.89607392, 0.92459571, 1.00454429,  1.06019733, 0.97291196, 0.95646497, 0.95899707, 1.02830351,  0.94938178, 0.91481128, 0.92606219, 0.97085631, 0.93597434,  0.91316857, 0.90644542, 0.91726926, 0.91686184, 0.96445563,  0.92166362, 0.95831572, 0.93859066, 0.85285273, 0.89944073,  0.91812428, 0.94265677, 0.88281406, 0.9470601 , 0.94921529,  0.97289222, 0.94632251, 0.96633195, 0.94096512, 0.95324803,  0.90920845, 0.92100257, 0.91181745, 0.95715298, 0.91715382,  0.90219214, 0.87585035, 0.86592191, 0.89335902, 0.85536392,  0.89619274, 0.9450366 , 0.82780137, 0.81214176, 0.83461329,  0.82858317, 0.80851704, 0.79253546, 0.85440086, 0.81679169,  0.80579976, 0.72312218, 0.75583125, 0.75204599, 0.84519188,  0.68686821, 0.71472154, 0.71706318, 0.72640234, 0.70526356,  0.68295282, 0.66795774, 0.65004383, 0.68096834, 0.72697547,  0.72436393, 0.77128385, 0.79666758, 0.67349101, 0.61479406,  0.57046337, 0.51614312, 0.52945366, 0.53112169, 0.53757761,  0.56680358, 0.63839684, 0.60704329, 0.62377533, 0.67862515,  0.64587581, 0.71316115, 0.76309798, 0.72217569, 0.7477785 ,  0.79731849, 0.76934137, 0.77063868, 0.77871584, 0.77688526,  0.84342722, 0.85382332, 0.88700466, 0.85837992, 0.79589266,  0.83798993, 0.79835529, 0.84612746, 0.83214907, 0.86373676,  0.90729115, 0.82111605, 0.86165685, 0.84090099, 0.90389133,  0.89554032, 0.90792356, 0.92798016, 0.95588479, 0.95019718,  0.95447497, 0.89845759, 0.91638311, 0.99263342, 0.97477606,  0.95482538, 0.94489498, 0.94344967, 0.90526465, 0.92538486,  0.96279787, 0.94005143, 0.96842454, 0.92296494, 0.89954172,  0.8684367 , 0.95039002, 
0.95229769, 0.93752274, 0.94741173,  0.96704449, 1.01130839, 0.95499414, 0.99596569, 0.95130622,  1.00014723, 1.00252218, 0.95130331, 1.0022896 , 0.99851989,  0.94405282, 0.95814021, 0.94851972, 1.01302067, 1.01400272,  0.97960083, 0.97070283, 1.01312797, 0.9842154 , 1.01147273,  0.97331853, 0.91403182, 0.96813051, 0.92319169, 0.9294103 ,  0.96960715, 0.94811518, 0.97115083, 0.84687543, 0.90725159,  0.88061293, 0.87319615, 0.85331661, 0.89775082, 0.90956716,  0.83174505, 0.89753388, 0.89554364, 0.95329739, 0.87687031,  0.93883127, 0.97433899, 0.99515225, 0.97519981, 0.91956466,  0.97977674, 0.93582089, 1.00662722, 0.90157277, 1.02887754,  0.9777419 , 0.94257094, 1.02359615, 0.98968414, 1.00075502,  1.03230265, 1.05904074, 1.00488442, 1.05507886, 1.05085518,  1.02561781, 1.05896008, 0.98024381, 1.08005691, 0.94528977,  1.03853637, 1.02064405, 1.0467137 , 1.05375156, 1.12907949,  0.99295611, 1.06601022, 1.02846374, 0.98006807, 0.96446772,  0.97702428, 0.97788589, 0.93889781, 0.96366778, 0.96645265,  0.95857242, 1.05796304, 0.99441763, 1.00573183, 1.05001927]) 
e = np.array([0.0647344 , 0.04583914, 0.05665552, 0.04447208, 0.05644753,  0.03968611, 0.05985188, 0.04252311, 0.03366922, 0.04237672,  0.03765898, 0.03290132, 0.04626836, 0.05106203, 0.03619188,  0.03944098, 0.08115469, 0.05859644, 0.06091101, 0.05170821,  0.0427244 , 0.06804469, 0.06708318, 0.03369381, 0.04160575,  0.08007032, 0.09292148, 0.04378329, 0.08216214, 0.06087074,  0.05375458, 0.06185891, 0.06385766, 0.08084546, 0.04864063,  0.06400878, 0.04988693, 0.06689165, 0.05989534, 0.08010138,  0.0681177 , 0.04478208, 0.03876582, 0.05977015, 0.06610619,  0.05020086, 0.07244604, 0.0445143 , 0.06970626, 0.04423994,  0.0414573 , 0.06892836, 0.05715395, 0.04014724, 0.07908425,  0.06082051, 0.08380691, 0.08576757, 0.06571406, 0.04842625,  0.05298355, 0.05271857, 0.06340425, 0.10849621, 0.0811072 ,  0.03642638, 0.10614094, 0.09865099, 0.06711037, 0.10244762,  0.11843505, 0.1092357 , 0.09748241, 0.09657009, 0.09970179,  0.10203563, 0.18494082, 0.14097796, 0.1151294 , 0.16172895,  0.17611204, 0.16226913, 0.2295418 , 0.17795924, 0.1253298 ,  0.1771586 , 0.15139061, 0.14739618, 0.1620105 , 0.19158538,  0.21431605, 0.19292715, 0.23308884, 0.30519423, 0.31401994,  0.30569885, 0.31216375, 0.35147676, 0.25016472, 0.16232236,  0.09058787, 0.0604483 , 0.05168302, 0.21432774, 0.38149791,  0.5061975 , 0.44281541, 0.50646427, 0.43761581, 0.44989111,  0.47778238, 0.39944325, 0.32462726, 0.34560857, 0.3175776 ,  0.30253441, 0.23059451, 0.24516185, 0.20708065, 0.26429751,  0.1830661 , 0.15155041, 0.16497299, 0.15794139, 0.13626666,  0.17839823, 0.13502886, 0.14148522, 0.10869864, 0.11723602,  0.09074029, 0.06922157, 0.07719777, 0.13181317, 0.11441895,  0.10655855, 0.12073767, 0.0846133 , 0.07974657, 0.06538693,  0.0573741 , 0.07864047, 0.08351471, 0.08130351, 0.0768824 ,  0.07951992, 0.04478989, 0.0765122 , 0.04842814, 0.04355571,  0.05138656, 0.07215294, 0.04681987, 0.05790133, 0.06163808,  0.082449 , 0.06127927, 0.04971221, 0.05107901, 0.04493687,  0.06072161, 0.06094332, 
0.03630467, 0.04162285, 0.04058228,  0.04526251, 0.06191432, 0.04901982, 0.0454908 , 0.06186274,  0.0407017 , 0.03865571, 0.04353665, 0.03898987, 0.04666321,  0.05856035, 0.04225933, 0.04797901, 0.03523971, 0.04728414,  0.05494382, 0.04773011, 0.954, 0.05651663, 0.03625933,  0.03596701, 0.03800191, 0.06267668, 0.06431192, 0.0602614 ,  0.05139896, 0.04571979, 0.04375182, 0.0576867 , 0.07491418,  0.05339972, 0.07619115, 0.11569378, 0.07087871, 0.09076518,  0.13554717, 0.07811761, 0.07180695, 0.05831886, 0.06042863,  0.08759576, 0.06650081, 0.08420164, 0.08185432, 0.04338836,  0.04970979, 0.04008252, 0.03605485, 0.03456321, 0.05594584,  0.03856822, 0.03576337, 0.03118799, 0.0441686 , 0.0469118 ,  0.03591666, 0.03562582, 0.04934832, 0.03280972, 0.03201576,  0.04338048, 0.07443531, 0.04121059, 0.03774147, 0.03717577,  0.03354207, 0.03806978, 0.0319364 , 0.03715712, 0.0379478 ,  0.04867626, 0.0304592 , 0.03393844, 0.034518 , 0.04293514,  0.05177898, 0.05332907, 0.0352937 , 0.03359781, 0.04625272,  0.03733088, 0.03501259, 0.03346308, 0.04333749, 0.05741173]) 

cont = ConstantModel(prefix='cte_') 
pars = cont.guess(y, x=x) 

gauss = GaussianModel(prefix='g_') 
pars.update(gauss.make_params())  
pars['cte_c'].set(1) 
pars['g_center'].set(4125, min=4120, max=4130) 
pars['g_sigma'].set(1, min=0.5) 
pars['g_amplitude'].set(-0.2, min=-0.5) 

loren = LorentzianModel(prefix='l_') 
pars.update(loren.make_params())  
pars['l_center'].set(4106, min=4095, max=4115) 
pars['l_sigma'].set(4, max=6) 
pars['l_amplitude'].set(-6., max=-4.) 

model = gauss + loren + cont 

init = model.eval(pars, x=x) 
result = model.fit(y, pars, x=x, weights=1/e) 

#print(result.fit_report(min_correl=0.5)) 

fig, ax = plt.subplots(figsize=(8,6)) 

ax.plot(x, y, 'k-', lw=2) # data in black
ax.plot(x, init, 'g--', lw=2) # initial guess 
ax.plot(x, result.best_fit, 'r-', lw=2) # best fit 
ax.set(xlim=(4085,4135), ylim=(0.4,1.14)) 

Answer

If the bad point is always at the same x value, you can remove that point from the data, perhaps with something like this:

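import numpy as np

def index_nearest(array, value):
    """Return the index of the element of array nearest to value."""
    return np.abs(array - value).argmin()

# 4150 is a placeholder: use the x value of the bad point in your data
ybad = index_nearest(x, 4150)
y[ybad] = np.nan       # flag the bad point
good = np.isfinite(y)  # boolean mask of the points to keep
# apply the same mask to x, y and the uncertainties e so they stay aligned
x, y, e = x[good], y[good], e[good]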

Then fit your model to the data with those bad points removed.
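If the contamination covers a few neighbouring points rather than a single one, the same idea can be extended to a small window. The following is only a sketch: `window_mask`, the center of 4105 and the half-width of 1 are hypothetical choices for illustration, and the arrays are synthetic stand-ins for the `x`, `y`, `e` of the question, not the real data.

```python
import numpy as np

def window_mask(x, center, half_width):
    """Boolean mask that is False within half_width of center, True elsewhere."""
    return np.abs(x - center) > half_width

# Synthetic stand-ins for the x, y, e arrays of the question (not the real data).
x = np.linspace(4085.0, 4135.0, 251)
y = 1.0 - 0.4 / (1.0 + ((x - 4106.0) / 4.0) ** 2)
e = np.full_like(x, 0.05)

# Hypothetical contaminated region: 4105 +/- 1. Adjust this for each data set.
good = window_mask(x, center=4105.0, half_width=1.0)

# Apply the same mask to all three arrays so they stay aligned.
x_fit, y_fit, e_fit = x[good], y[good], e[good]
print(len(x), "points before,", len(x_fit), "points kept")
```

The masked arrays can then be passed to the fit from the question, for example `model.fit(y_fit, pars, x=x_fit, weights=1/e_fit)`. Since the position of the feature changes between data sets, the center would have to be chosen per data set.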

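But also: if a point is not obviously wrong and the data are "just" noisy, there is probably no advantage in removing what merely looks like a bad point. Your data look quite noisy to me, but it is hard to see a systematically bad point. If you do remove a point, keep in mind that you are asserting that this measurement is not just affected by normal noise but is actually wrong.

Finally: another way to deal with noisy data might be to try smoothing the data, for example with a Savitzky-Golay filter. With this approach there is always some danger of smoothing out real features, but a modest S-G filter is often good at cleaning up noisy data enough to detect features. Of course, if fitting the filtered data gives results that are very different from fitting the unfiltered data, you will probably want to understand why.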
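As a rough illustration of the Savitzky-Golay suggestion, here is a minimal sketch using `scipy.signal.savgol_filter`. The data are synthetic (a Lorentzian dip plus Gaussian noise), and the window length of 11 points and polynomial order of 3 are assumptions picked for this example, not values tuned to the data in the question.

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic noisy absorption profile (not the data from the question).
rng = np.random.default_rng(0)
x = np.linspace(4085.0, 4135.0, 251)
clean = 1.0 - 0.4 / (1.0 + ((x - 4106.0) / 4.0) ** 2)
noisy = clean + rng.normal(0.0, 0.03, x.size)

# A modest filter: 11-point window, cubic polynomial.
# window_length is chosen odd and must be larger than polyorder.
smooth = savgol_filter(noisy, window_length=11, polyorder=3)

# Compare the scatter around the known clean profile before and after.
print("rms before smoothing:", np.std(noisy - clean))
print("rms after smoothing: ", np.std(smooth - clean))
```

A wider window smooths more but is also more likely to flatten a narrow feature, so it is worth comparing fits to the filtered and unfiltered data as suggested above.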

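Thanks for your answer. Yes, the data are noisy. But the main problem is that the contamination in the core of the larger line (the one I am fitting with a Lorentzian profile) is present in all cases. It is not just noise, and its position on the x axis also changes, so if I simply remove a large part of the data to clip that feature, I lose a lot of information about the shape of the profile. Also, I cannot smooth the data in this case. As I said, the noise is not the problem; it is the contamination at the center of the main minimum that causes the trouble. – JVR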

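You *can* remove points, and that is what you actually asked how to do (I think). If some x values are always bad, you can remove them. If a point *is* bad, you are not removing data you care about. However, I am not sure I see an obviously bad spot in the data: it looks too noisy to be certain that the variation at the largest peak is not random. Also, although it is hard to tell from the plot, it does not seem to always be at the same x position. If you have not done so already, I would encourage you to investigate carefully whether there really is a bad point. –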
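One possible way to check whether there really is a bad region, without fixing its position in advance, is to look at the normalized residuals of a fit and reject points that deviate by more than a few sigma, then refit. This iterative sigma clipping is not something proposed in the thread; it is sketched here as an assumption. The example uses `scipy.optimize.curve_fit` on synthetic data with an artificial spike, and the names `lorentz_dip` and `clipped_fit`, the 3-sigma threshold and the spike itself are all hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

def lorentz_dip(x, c, depth, center, gamma):
    """Constant continuum c minus a Lorentzian dip."""
    return c - depth / (1.0 + ((x - center) / gamma) ** 2)

def clipped_fit(x, y, e, p0, nsigma=3.0, max_iter=5):
    """Fit, reject points with |residual / e| >= nsigma, and refit.

    Stops when the set of accepted points no longer changes or after
    max_iter fits. Returns the last fitted parameters and the mask.
    """
    good = np.ones(x.size, dtype=bool)
    for _ in range(max_iter):
        popt, _ = curve_fit(lorentz_dip, x[good], y[good], p0=p0,
                            sigma=e[good], absolute_sigma=True)
        resid = (y - lorentz_dip(x, *popt)) / e  # normalized residuals
        new_good = np.abs(resid) < nsigma
        if np.array_equal(new_good, good):
            break
        good = new_good
    return popt, good

# Synthetic data with an artificial spike near the line core (not real data).
rng = np.random.default_rng(1)
x = np.linspace(4085.0, 4135.0, 251)
e = np.full(x.size, 0.02)
y = lorentz_dip(x, 1.0, 0.4, 4106.0, 4.0) + rng.normal(0.0, e)
spike = np.abs(x - 4105.0) < 0.7  # hypothetical contaminated points
y[spike] += 0.25

popt, good = clipped_fit(x, y, e, p0=[1.0, 0.3, 4105.0, 3.0])
print("fitted center:", popt[2])
print("points rejected:", np.count_nonzero(~good))
```

This only works when the contamination stands well above the noise. If the rejected points are scattered over the whole range instead of clustering in the line core, that would support the view that the variation is mostly random.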