2016-02-26 87 views

How do I select specific rows of a pandas DataFrame for an OLS regression?

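I have a pandas DataFrame with 1000 rows. I want to regress column A on columns B and C, using only the first 10 rows. When I type:

mod = pd.ols(y=df['A'], x=df[['B','C']], window=10)

I get the regression results for rows 991-1000. How do I specify that I want the FIRST (or second, etc.) 10 rows?

Thanks in advance.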

Maybe it helps to add `head(10)`, as in `mod = pd.ols(y=df['A'].head(10), x=df[['B','C']].head(10), window=10)` – jezrael

Thank you, that worked! Do you know how I would get rows 2-11, or 10-20?

What is your index? `print(df.index)` – jezrael

Answer

I think you can use iloc, which slices by position and excludes the end, so 2:12 selects ten rows:

mod = pd.ols(y=df['A'].iloc[2:12], x=df[['B','C']].iloc[2:12], window=10) 
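
As far as I know, `pd.ols` was deprecated in pandas 0.18 and removed in 0.20, so the call above only runs on old versions. Here is a minimal sketch of the same row selection on a current pandas. The fit is done with `numpy.linalg.lstsq` as a stand-in, which is my substitution and not what the answer used. The synthetic `df` and the helper name `fit_ols` are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the asker's 1000-row frame (assumed data).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(1000, 2)), columns=["B", "C"])
df["A"] = 1.0 + 2.0 * df["B"] - 3.0 * df["C"] + rng.normal(scale=0.1, size=1000)


def fit_ols(frame):
    """Least-squares fit of A on B and C plus an intercept (hypothetical helper)."""
    X = np.column_stack([frame[["B", "C"]].to_numpy(), np.ones(len(frame))])
    coef, *_ = np.linalg.lstsq(X, frame["A"].to_numpy(), rcond=None)
    return pd.Series(coef, index=["B", "C", "intercept"])


# iloc slices by position and excludes the end, so 2:12 is ten rows.
subset = df.iloc[2:12]
beta = fit_ols(subset)
print(len(subset))
print(beta)
```

Changing the slice to `df.iloc[10:20]` or `df.head(10)` selects a different block without touching the fitting code.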

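Or ix. With the default integer index, ix slices by label and includes the end label, so 2:11 selects the same ten rows:

mod = pd.ols(y=df.ix[2:11, 'A'], x=df.ix[2:11, ['B', 'C']], window=10)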
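
For readers on a newer pandas: `.ix` was deprecated in 0.20 and removed in 1.0, and `.loc` (label-based, end included) and `.iloc` (position-based, end excluded) replace it. A small sketch of the difference on a default integer index, using a made-up frame:

```python
import pandas as pd

# Made-up frame with the default 0..19 integer index (assumption).
df = pd.DataFrame({"A": range(20), "B": range(20, 40), "C": range(40, 60)})

by_position = df.iloc[2:12]               # positions 2..11, end excluded
by_label = df.loc[2:11, ["A", "B", "C"]]  # labels 2..11, end included
one_too_many = df.loc[2:12]               # labels 2..12, eleven rows

print(len(by_position), len(by_label), len(one_too_many))
print(by_position.equals(by_label))
```

If the index is not the default 0..n-1 range, the label and position slices no longer line up, which is why the comment above asks what the index looks like.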

If you need all the groups, loop with range:

models = []
for i in range(10):
    # rows at positions i .. i+9; keep every fit instead of overwriting mod
    models.append(pd.ols(y=df['A'].iloc[i:i + 10], x=df[['B','C']].iloc[i:i + 10], window=10))
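
The loop above starts a window at each of the first ten rows, so the windows overlap. If "first, second, etc." means consecutive non-overlapping blocks of ten rows, stepping the start position by 10 is one way to get them. This is only a sketch: the data is synthetic, the variable names are mine, and `numpy.linalg.lstsq` stands in for `pd.ols`.

```python
import numpy as np
import pandas as pd

# Synthetic 1000-row frame (assumed data, not the asker's).
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(1000, 2)), columns=["B", "C"])
df["A"] = 0.5 + 1.5 * df["B"] + 0.7 * df["C"] + rng.normal(scale=0.1, size=1000)

block = 10
rows = []
for start in range(0, len(df), block):
    chunk = df.iloc[start:start + block]
    # Design matrix: B, C and a column of ones for the intercept.
    X = np.column_stack([chunk[["B", "C"]].to_numpy(), np.ones(len(chunk))])
    coef, *_ = np.linalg.lstsq(X, chunk["A"].to_numpy(), rcond=None)
    rows.append({"start": start, "B": coef[0], "C": coef[1], "intercept": coef[2]})

# One row of coefficients per block: start 0 is rows 0-9, start 10 is rows 10-19, ...
betas = pd.DataFrame(rows).set_index("start")
print(betas.head())
```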

If you need help with ols, try help(pd.ols), because this function is missing from the pandas documentation:

In [79]: help(pd.ols) 
Help on function ols in module pandas.stats.interface: 

ols(**kwargs) 
    Returns the appropriate OLS object depending on whether you need 
    simple or panel OLS, and a full-sample or rolling/expanding OLS. 

    Will be a normal linear regression or a (pooled) panel regression depending 
    on the type of the inputs: 

    y : Series, x : DataFrame -> OLS 
    y : Series, x : dict of DataFrame -> OLS 
    y : DataFrame, x : DataFrame -> PanelOLS 
    y : DataFrame, x : dict of DataFrame/Panel -> PanelOLS 
    y : Series with MultiIndex, x : Panel/DataFrame + MultiIndex -> PanelOLS 

    Parameters 
    ---------- 
    y: Series or DataFrame 
     See above for types 
    x: Series, DataFrame, dict of Series, dict of DataFrame, Panel 
    weights : Series or ndarray 
     The weights are presumed to be (proportional to) the inverse of the 
     variance of the observations. That is, if the variables are to be 
     transformed by 1/sqrt(W) you must supply weights = 1/W 
    intercept: bool 
     True if you want an intercept. Defaults to True. 
    nw_lags: None or int 
     Number of Newey-West lags. Defaults to None. 
    nw_overlap: bool 
     Whether there are overlaps in the NW lags. Defaults to False. 
    window_type: {'full sample', 'rolling', 'expanding'} 
     'full sample' by default 
    window: int 
     size of window (for rolling/expanding OLS). If window passed and no 
     explicit window_type, 'rolling" will be used as the window_type 

    Panel OLS options: 
     pool: bool 
      Whether to run pooled panel regression. Defaults to true. 
     entity_effects: bool 
      Whether to account for entity fixed effects. Defaults to false. 
     time_effects: bool 
      Whether to account for time fixed effects. Defaults to false. 
     x_effects: list 
      List of x's to account for fixed effects. Defaults to none. 
     dropped_dummies: dict 
      Key is the name of the variable for the fixed effect. 
      Value is the value of that variable for which we drop the dummy. 

      For entity fixed effects, key equals 'entity'. 

      By default, the first dummy is dropped if no dummy is specified. 
     cluster: {'time', 'entity'} 
      cluster variances 

    Examples 
    -------- 
    # Run simple OLS. 
    result = ols(y=y, x=x) 

    # Run rolling simple OLS with window of size 10. 
    result = ols(y=y, x=x, window_type='rolling', window=10) 
    print(result.beta) 

    result = ols(y=y, x=x, nw_lags=1) 

    # Set up LHS and RHS for data across all items 
    y = A 
    x = {'B' : B, 'C' : C} 

    # Run panel OLS. 
    result = ols(y=y, x=x) 

    # Run expanding panel OLS with window 10 and entity clustering. 
    result = ols(y=y, x=x, cluster='entity', window_type='expanding', window=10) 

    Returns 
    ------- 
    The appropriate OLS object, which allows you to obtain betas and various 
    statistics, such as std err, t-stat, etc.
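
As the docstring says, passing `window` without `window_type` selects a rolling regression, which fits one model per window of 10 consecutive rows and not a single model on ten chosen rows. That would be consistent with the question's output ending on rows 991-1000, since that is the last window. A rough sketch of what a rolling fit does, with `numpy.linalg.lstsq` standing in for `pd.ols` and synthetic data (both are my assumptions):

```python
import numpy as np
import pandas as pd

# Synthetic 1000-row frame (assumed data).
rng = np.random.default_rng(2)
df = pd.DataFrame(rng.normal(size=(1000, 2)), columns=["B", "C"])
df["A"] = 1.0 + 2.0 * df["B"] - 3.0 * df["C"] + rng.normal(scale=0.1, size=1000)

window = 10
X_all = np.column_stack([df[["B", "C"]].to_numpy(), np.ones(len(df))])
y_all = df["A"].to_numpy()

# One fit per window of 10 consecutive rows, labelled by the window's last row.
coefs = []
for end in range(window, len(df) + 1):
    coef, *_ = np.linalg.lstsq(X_all[end - window:end], y_all[end - window:end], rcond=None)
    coefs.append(coef)

rolling_beta = pd.DataFrame(coefs, index=df.index[window - 1:], columns=["B", "C", "intercept"])
print(len(rolling_beta))                      # number of windows
print(rolling_beta.index[[0, -1]].tolist())   # last row of the first and last window
```

With 1000 rows and a window of 10 there are 991 windows, and the final one covers positions 990 to 999. Slicing the frame down to ten rows before fitting, as in the answer, leaves exactly one window.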