2009-01-28 242 views
18

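Calculate exponential moving average in Python

I have a range of dates and a measurement on each of those dates. I'd like to calculate an exponential moving average for each of the dates. Does anybody know how to do this?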

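I'm new to Python. It doesn't appear that averages are built into the standard Python library, which strikes me as a little odd. Maybe I'm not looking in the right place.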

So, given the following code, how could I calculate the moving weighted average of IQ points for calendar dates?

from datetime import date 
days = [date(2008,1,1), date(2008,1,2), date(2008,1,7)] 
IQ = [110, 105, 90] 

(There's probably a better way to structure the data; any advice would be appreciated.)

+1

Averages aren't actually in the library because they are so simple: sum(IQ)/len(IQ) gives you the arithmetic mean of IQ. – Kiv 2009-01-28 18:19:50

+1

A simple average is... simple. But more complex algorithms might be useful in the standard library. – Jim 2009-01-28 18:26:58

+1

numpy and scipy have a ton of statistical functions, including averages :) – Ryan 2009-01-28 18:28:30

Answers

17

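EDIT: It seems that the mov_average_expw() function from the scikits.timeseries.lib.moving_funcs submodule of SciKits (add-on toolkits that complement SciPy) better suits the wording of your question.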


To calculate an exponential smoothing of your data with a smoothing factor alpha (this is (1 - alpha) in Wikipedia's terms):

>>> alpha = 0.5
>>> assert 0 < alpha <= 1.0
>>> today = max(days)
>>> sum(alpha**(today - day).days * iq for day, iq in zip(days, IQ))
95.0

The above is not pretty, so let's refactor it a bit:

from collections import namedtuple 
from operator import itemgetter 

def smooth(iq_data, alpha=1, today=None): 
    """Perform exponential smoothing with factor `alpha`. 

    Time period is a day. 
    Each time period the value of `iq` drops `alpha` times. 
    The most recent data is the most valuable one. 
    """ 
    assert 0 < alpha <= 1 

    if alpha == 1: # no smoothing
        return sum(map(itemgetter(1), iq_data))

    if today is None:
        today = max(map(itemgetter(0), iq_data))

    return sum(alpha**((today - date).days) * iq for date, iq in iq_data)

IQData = namedtuple("IQData", "date iq") 

if __name__ == "__main__": 
    from datetime import date 

    days = [date(2008,1,1), date(2008,1,2), date(2008,1,7)] 
    IQ = [110, 105, 90] 
    iqdata = list(map(IQData, days, IQ)) 
    print("\n".join(map(str, iqdata))) 

    print(smooth(iqdata, alpha=0.5)) 

Example:

$ python26 smooth.py 
IQData(date=datetime.date(2008, 1, 1), iq=110) 
IQData(date=datetime.date(2008, 1, 2), iq=105) 
IQData(date=datetime.date(2008, 1, 7), iq=90) 
95.0 
4

I don't know Python, but for the averaging part, do you mean an exponentially decaying low-pass filter of the form

y_new = y_old + (input - y_old)*alpha 

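where alpha = dt/tau, dt = the time step of the filter, and tau = the time constant of the filter? (The variable-time-step form of this is as follows; just clip dt/tau so that it is no more than 1.0.)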

y_new = y_old + (input - y_old)*dt/tau 

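If you want to filter on something like a date, make sure you convert it to a floating-point quantity, such as the number of seconds since Jan 1 1970.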
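
As a rough sketch of the variable-time-step form above applied to the question's data: the function name `lowpass` and the choice of `tau = 3.0` days are illustrative assumptions, and the dates are converted to floating-point days rather than seconds.

```python
from datetime import date

def lowpass(times, values, tau):
    """Exponentially decaying low-pass filter with a variable time step.

    times  -- sample times as floats (here: days), in ascending order
    values -- one measurement per sample time
    tau    -- filter time constant, in the same unit as `times`
    """
    y = values[0]                      # start the filter at the first measurement
    out = [y]
    for i in range(1, len(values)):
        dt = times[i] - times[i - 1]
        alpha = min(dt / tau, 1.0)     # clip dt/tau so it never exceeds 1.0
        y = y + (values[i] - y) * alpha
        out.append(y)
    return out

days = [date(2008, 1, 1), date(2008, 1, 2), date(2008, 1, 7)]
IQ = [110, 105, 90]

# date.toordinal() counts whole days, so this gives time in floating-point days.
t = [float(d.toordinal()) for d in days]
smoothed = lowpass(t, IQ, tau=3.0)
print(smoothed)
```

With these numbers the five-day gap before the last sample is longer than `tau`, so the clipped `alpha` is 1.0 and the filter output simply jumps to the newest measurement.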

8

I did a bit of googling and found the following sample code (http://osdir.com/ml/python.matplotlib.general/2005-04/msg00044.html):

from numpy import array  # array() below is NumPy's array constructor

def ema(s, n):
    """
    returns an n period exponential moving average for
    the time series s

    s is a list ordered from oldest (index 0) to most
    recent (index -1)
    n is an integer

    returns a numeric array of the exponential
    moving average
    """
    s = array(s)
    ema = []
    j = 1

    #get n sma first and calculate the next n period ema
    sma = sum(s[:n]) / float(n)  # float() avoids integer division on Python 2
    multiplier = 2/float(1 + n)
    ema.append(sma)

    #EMA(current) = ((Price(current) - EMA(prev)) x Multiplier) + EMA(prev)
    ema.append(((s[n] - sma) * multiplier) + sma)

    #now calculate the rest of the values
    for i in s[n+1:]:
        tmp = ((i - ema[j]) * multiplier) + ema[j]
        j = j + 1
        ema.append(tmp)

    return ema
+0

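Why does the function use a local variable with the same name as the function? Apart from making the code slightly less legible, it could introduce hard-to-detect logic errors further down the line... – 2012-06-04 11:13:25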

+0

What's the point of 's = array(s)'? I had syntax errors until I just commented it out. – swdev 2017-11-06 07:43:58
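
On the two comments above: `array` is NumPy's array constructor, so the snippet only runs with `from numpy import array`, and it uses the array for nothing beyond slicing and arithmetic that a plain list also supports. Below is a minimal sketch of the same calculation without NumPy, with the result list renamed so it no longer shadows the function. The names `ema_list` and `averages` and the sample prices are illustrative assumptions, not part of the linked code.

```python
def ema_list(s, n):
    """Return the n-period exponential moving average of the list s.

    s is ordered from oldest (index 0) to most recent (index -1).
    The first value is the simple average of the first n samples; each
    later value is (price - previous_ema) * multiplier + previous_ema.
    """
    if n < 1 or len(s) < n:
        raise ValueError("need n >= 1 and at least n samples")
    multiplier = 2 / float(1 + n)
    averages = [sum(s[:n]) / float(n)]      # seed with the simple average
    for price in s[n:]:
        prev = averages[-1]
        averages.append((price - prev) * multiplier + prev)
    return averages

prices = [1.0, 2.0, 3.0, 4.0, 5.0]
print(ema_list(prices, 3))   # [2.0, 3.0, 4.0]
```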

5

My Python is a little bit rusty (anyone can feel free to edit this code to make corrections, if I've messed up the syntax somehow), but here goes....

def movingAverageExponential(values, alpha, epsilon = 0):

    if not 0 < alpha < 1:
        raise ValueError("out of range, alpha='%s'" % alpha)

    if not 0 <= epsilon < alpha:
        raise ValueError("out of range, epsilon='%s'" % epsilon)

    result = [None] * len(values)

    for i in range(len(result)):
        currentWeight = 1.0

        numerator = 0
        denominator = 0
        for value in values[i::-1]:
            numerator += value * currentWeight
            denominator += currentWeight

            currentWeight *= alpha
            if currentWeight < epsilon:
                break

        result[i] = numerator/denominator

    return result

For each position in the list, this function works backward through the earlier elements, accumulating a weighted average, and stops once the weight coefficient for an element drops below the given epsilon.

The results are written into a preallocated list ([None] * len(values)) in their original order, so nothing has to be reversed before the list is returned to the caller.

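The 'alpha' argument is the decay factor on each iteration. For example, if you used an alpha of 0.5, then today's moving average value would be composed of the following weighted values:

today:  1.0
yesterday: 0.5
2 days ago: 0.25
3 days ago: 0.125
...etc...

Of course, if you've got a huge array of values, the values from ten or fifteen days ago won't contribute very much to today's weighted average. The 'epsilon' argument lets you set a cutoff point, below which you will cease to care about old values (since their contribution to today's value will be insignificant).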

You'd invoke the function something like this:

result = movingAverageExponential(values, 0.75, 0.0001) 
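
With `epsilon = 0` the nested loops revisit every earlier value for each position, which is quadratic in the length of the list. As a hedged sketch (the name `weighted_ema_linear` is an assumption, and it deliberately ignores the `epsilon` cutoff), the same normalized weighted average can be carried forward in a single pass, because both the numerator and the denominator at each position equal `alpha` times their previous value plus the new term:

```python
def weighted_ema_linear(values, alpha):
    """Normalized exponentially weighted average, one pass, no cutoff.

    At each position the weights are 1, alpha, alpha**2, ... going back
    in time, and the weighted sum is divided by the sum of the weights.
    """
    if not 0 < alpha < 1:
        raise ValueError("out of range, alpha='%s'" % alpha)
    result = []
    numerator = 0.0
    denominator = 0.0
    for value in values:
        numerator = value + alpha * numerator      # older terms decay by alpha
        denominator = 1.0 + alpha * denominator    # and so do their weights
        result.append(numerator / denominator)
    return result

print(weighted_ema_linear([110, 105, 90], 0.5))
```

For the three IQ values this gives 110, then 160/1.5, then 170/1.75, matching the weights 1.0, 0.5, 0.25 applied backward from each position.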
2

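I found the above code snippet by @earino pretty useful, but I needed something that could continuously smooth a stream of values, so I refactored it to this: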

def exponential_moving_average(period=1000):
    """ Exponential moving average. Smooths the values sent in over the period.
    Send in values - at first it'll return a simple average, but as soon as
    it's gathered 'period' values, it'll start to use the Exponential Moving
    Average to smooth the values.
    period: int - how many values to smooth over (default=1000). """
    multiplier = 2/float(1 + period)
    cum_temp = yield None # We are being primed

    # Start by just returning the simple average until we have enough data.
    # Stop one short of 'period': the value received as the loop ends
    # is the period-th one.
    for i in range(1, period):
        cum_temp += yield cum_temp/float(i)

    # Grab the simple average of the first 'period' values
    ema = cum_temp/float(period)

    # and start calculating the exponentially smoothed average
    while True:
        ema = (((yield ema) - ema) * multiplier) + ema

And I use it like this:

def temp_monitor(pin):
    """ Read from the temperature monitor - and smooth the value out.
    The sensor is noisy, so we use exponential smoothing. """
    ema = exponential_moving_average()
    next(ema) # Prime the generator

    while True:
        yield ema.send(val_to_temp(pin.read()))

(where pin.read() produces the next value I'd like to consume).
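
To try the generator pattern without any hardware, here is a self-contained sketch that feeds a plain list through a simplified coroutine. The name `ema_stream`, the fixed `alpha` parameter and the sample readings are assumptions made for illustration; unlike the code above, it skips the simple-average warm-up and starts from the first value.

```python
def ema_stream(alpha):
    """Coroutine: send() it a value, get back the running EMA."""
    value = yield None            # primed by next(); the first send() lands here
    ema = value                   # start the average at the first value
    while True:
        value = yield ema
        ema = ema + (value - ema) * alpha

smoother = ema_stream(alpha=0.5)
next(smoother)                    # prime the generator
readings = [20.0, 22.0, 21.0, 25.0]
smoothed = [smoother.send(r) for r in readings]
print(smoothed)                   # [20.0, 21.0, 21.0, 23.0]
```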

5

The matplotlib.org examples (http://matplotlib.org/examples/pylab_examples/finance_work2.html) provide one good example of an Exponential Moving Average (EMA) function using numpy:

import numpy as np

def moving_average(x, n, type):
    x = np.asarray(x)
    if type=='simple':
        weights = np.ones(n)
    else:
        weights = np.exp(np.linspace(-1., 0., n))

    weights /= weights.sum()

    a = np.convolve(x, weights, mode='full')[:len(x)]
    a[:n] = a[n]
    return a
1

Here's a simple sample I worked up based on http://stockcharts.com/school/doku.php?id=chart_school:technical_indicators:moving_averages

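Note that unlike in their spreadsheet, I don't calculate the SMA, and I don't wait until 10 samples have arrived before generating the EMA. This means my values differ slightly, but if you chart them, the curve follows theirs exactly after 10 samples. During the first 10 samples, the EMA I calculate is appropriately smoothed.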

def emaWeight(numSamples): 
    return 2/float(numSamples + 1) 

def ema(close, prevEma, numSamples): 
    return ((close-prevEma) * emaWeight(numSamples)) + prevEma 

samples = [ 
22.27, 22.19, 22.08, 22.17, 22.18, 22.13, 22.23, 22.43, 22.24, 22.29, 
22.15, 22.39, 22.38, 22.61, 23.36, 24.05, 23.75, 23.83, 23.95, 23.63, 
23.82, 23.87, 23.65, 23.19, 23.10, 23.33, 22.68, 23.10, 22.40, 22.17, 
] 
emaCap = 10 
e=samples[0] 
for s in range(len(samples)): 
    numSamples = emaCap if s > emaCap else s 
    e = ema(samples[s], e, numSamples) 
    print(e)
6

I'm always calculating EMAs with Pandas:

Here is an example of how to do it:

import pandas as pd 
import numpy as np 

def ema(values, period):
    values = np.array(values)
    # pd.ewma() was removed in pandas 0.23; Series.ewm() replaces it
    return pd.Series(values).ewm(span=period).mean().iloc[-1]

values = [9, 5, 10, 16, 5] 
period = 5 

print(ema(values, period))

More info about Pandas EWMA:

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.ewma.html

2

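You can also use the SciPy filter method, because the EMA is an IIR filter. Measured with timeit on large data sets on my system, this is approximately 64 times faster than the enumerate() approach.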

import numpy as np 
from scipy.signal import lfilter 

x = np.random.normal(size=1234) 
alpha = .1 # smoothing coefficient 
zi = [alpha * x[0]] # seed the filter state so that the output starts at x[0]
# filter can process blocks of continuous data if <zi> is maintained 
y, zi = lfilter([1.-alpha], [1., -alpha], x, zi=zi) 
0

A fast way (copy-pasted from here) is the following:

import numpy as np

def ExpMovingAverage(values, window):
    """ Numpy implementation of EMA 
    """ 
    weights = np.exp(np.linspace(-1., 0., window)) 
    weights /= weights.sum() 
    a = np.convolve(values, weights, mode='full')[:len(values)] 
    a[:window] = a[window] 
    return a