How to use `Series.interpolate` in pandas and modify the old values

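The `interpolate` method in pandas uses the valid data to interpolate the NaN values. However, it leaves the old valid data unchanged, as in the code below.

Is there any way to use the `interpolate` method so that the old values are changed too and the series becomes smooth?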

In [1]: %matplotlib inline 
In [2]: from scipy.interpolate import UnivariateSpline as spl 
In [3]: import numpy as np 
In [4]: import pandas as pd 
In [5]: samples = { 0.0: 0.0, 0.4: 0.5, 0.5: 0.9, 0.6: 0.7, 0.8:0.3, 1.0: 1.0 } 
In [6]: x, y = zip(*sorted(samples.items())) 

In [7]: df1 = pd.DataFrame(index=np.linspace(0, 1, 31), columns=['raw', 'itp'], dtype=float) 

In [8]: df1.loc[x] = np.array(y)[:, None] 
In [9]: df1['itp'].interpolate('spline', order=3, inplace=True) 
In [10]: df1.plot(style={'itp': 'b-', 'raw': 'rs'}, figsize=(8, 6))

In [11]: df2 = pd.DataFrame(index=np.linspace(0, 1, 31), columns=['raw', 'itp'], dtype=float) 
In [12]: df2.loc[x, 'raw'] = y 
In [13]: f = spl(x, y, k=3) 
In [14]: df2['itp'] = f(df2.index) 
In [15]: df2.plot(style={'itp': 'b-', 'raw': 'rs'}, figsize=(8, 6))

Source

2015-08-15 Eastsun

When you use `Series.interpolate` with `method='spline'`, under the hood pandas uses `interpolate.UnivariateSpline`.

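The spline returned by `UnivariateSpline` is not guaranteed to pass through the data points given as input unless `s=0`. However, by default `s=None`, which uses a different smoothing factor and therefore leads to a different result.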
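To make the effect of `s` concrete, here is a minimal sketch that fits both variants to the sample points from the question and measures how far each spline lands from the input data. The variable names are illustrative only.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Sample points taken from the question.
x = np.array([0.0, 0.4, 0.5, 0.6, 0.8, 1.0])
y = np.array([0.0, 0.5, 0.9, 0.7, 0.3, 1.0])

smoothing = UnivariateSpline(x, y, k=3)            # s=None: smoothing spline
interpolating = UnivariateSpline(x, y, k=3, s=0)   # s=0: interpolating spline

# Largest distance between each spline and the input data at the sample points.
max_dev_smoothing = np.max(np.abs(smoothing(x) - y))
max_dev_interpolating = np.max(np.abs(interpolating(x) - y))

print(f"s=None: max deviation {max_dev_smoothing:.3g}")
print(f"s=0:    max deviation {max_dev_interpolating:.3g}")
```

With `s=0` the deviation should be at the level of floating-point round-off, while the default fit misses the sample points by a visible amount.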

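The `Series.interpolate` method always fills in NaN values without changing the non-NaN values. There is no way to make `Series.interpolate` modify the non-NaN values. So, when `s != 0`, the result has jagged jumps: the filled-in values follow the smoothing spline, while the original values stay where they were.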
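A quick way to check this behaviour is to compare the originally valid entries before and after the call. This is a small sketch on a made-up series with a default integer index (the values and gap positions are assumptions for illustration); it needs SciPy installed for `method='spline'`.

```python
import numpy as np
import pandas as pd

# Illustrative series with gaps; six valid points are enough for a cubic spline.
ser = pd.Series([0.0, np.nan, 0.5, 0.9, np.nan, 0.7, np.nan, 0.3, 1.0])
known = ser.notna()

filled = ser.interpolate(method='spline', order=3)

# The originally valid entries come back untouched; only the NaNs are filled.
unchanged = filled[known].equals(ser[known])
remaining_nans = int(filled.isna().sum())

print("valid values unchanged:", unchanged)
print("NaNs left:", remaining_nans)
```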

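So if you want spline interpolation with `s=None` (the default) but without the jagged jumps, then, as you have already discovered, you must call `UnivariateSpline` directly and overwrite all the values in `df['itp']`: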

df['itp'] = interpolate.UnivariateSpline(x, y, k=3)(df.index)

If you want a cubic spline that passes through all the non-NaN data points, then use `s=0`:

df['itp'] = df['itp'].interpolate('spline', order=3, s=0)

import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 
import scipy.interpolate as interpolate 

samples = { 0.0: 0.0, 0.4: 0.5, 0.5: 0.9, 0.6: 0.7, 0.8:0.3, 1.0: 1.0 } 
x, y = zip(*sorted(samples.items())) 

fig, ax = plt.subplots(nrows=3, sharex=True) 
df1 = pd.DataFrame(index=np.linspace(0, 1, 31), columns=['raw', 'itp'], dtype=float) 
df1.loc[x] = np.array(y)[:, None] 

df2 = df1.copy() 
df3 = df1.copy() 

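# Assign the result back instead of using inplace=True on a selected column,
# which may not update the DataFrame in newer pandas versions.
df1['itp'] = df1['itp'].interpolate('spline', order=3)           # default s=None
df2['itp'] = interpolate.UnivariateSpline(x, y, k=3)(df2.index)  # overwrite all values
df3['itp'] = df3['itp'].interpolate('spline', order=3, s=0)      # pass through the data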
for i, df in enumerate((df1, df2, df3)): 
    df.plot(style={'itp': 'b-', 'raw': 'rs'}, figsize=(8, 6), ax=ax[i]) 
plt.show()
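The direct-call approach above can also be wrapped in a small helper. The sketch below is one possible way to do it: `spline_smooth` is a hypothetical name (not part of pandas or SciPy), and it assumes a sorted numeric index. The sample values are placed by position so the example does not depend on exact float label matches.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import UnivariateSpline


def spline_smooth(series, k=3, s=None):
    """Fit a spline to the non-NaN entries and evaluate it at every index value.

    Unlike Series.interpolate, this also overwrites the originally valid values.
    Hypothetical helper; assumes a sorted numeric index.
    """
    valid = series.dropna()
    spline = UnivariateSpline(valid.index.to_numpy(dtype=float),
                              valid.to_numpy(dtype=float), k=k, s=s)
    return pd.Series(spline(series.index.to_numpy(dtype=float)),
                     index=series.index, name=series.name)


raw = pd.Series(np.nan, index=np.linspace(0, 1, 31))
# Positions 0, 12, 15, 18, 24, 30 correspond to x = 0.0, 0.4, 0.5, 0.6, 0.8, 1.0.
raw.iloc[[0, 12, 15, 18, 24, 30]] = [0.0, 0.5, 0.9, 0.7, 0.3, 1.0]

smooth = spline_smooth(raw)       # s=None: smooth curve, old values are replaced
exact = spline_smooth(raw, s=0)   # s=0: curve passes through the old values
```

Here `smooth` corresponds to the second panel of the plot above and `exact` to the third.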

Source

2015-08-15 10:43:39 unutbu
