2017-04-24 108 views
1

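How do I update a specific DataFrame slice of a column with new values?

I have a DataFrame with a 'pred' column that is empty, and I want to update it with some specific values. They were originally in a numpy array, but I put them in a Series called "this":

print(type(predictions))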

print(predictions) 
['collection2' 'collection2' 'collection2' 'collection1' 'collection2' 
'collection1'] 

this = pd.Series(predictions, index=test_indices) 

print(type(data)) 
<class 'pandas.core.frame.DataFrame'> 

print(data.shape) 
(35, 4) 

print(data.iloc[test_indices]) 
    class   pred           text \ 
223 collection2 [] Fellow-Citizens of the Senate and House of Rep... 
20 collection1 [] The period for a new election of a citizen to ... 
12 collection1 [] Fellow Citizens of the Senate and of the House... 
13 collection1 [] Whereas combinations to defeat the execution o... 
212 collection2 [] MR. PRESIDENT AND FELLOW-CITIZENS OF NEW-YORK:... 
230 collection2 [] Fellow-Countrymen:\nAt this second appearing t... 

               title 
223        First Annual Message 
20         Farewell Address 
12     Fifth Annual Message to Congress 
13 Proclamation against Opposition to Execution o... 
212        Cooper Union Address 
230       Second Inaugural Address 

print(type(this)) 
<class 'pandas.core.series.Series'> 

print(this.shape) 
(6,) 

print(this) 
0 collection2 
1 collection1 
2 collection1 
3 collection1 
4 collection2 
5 collection2 

I thought I could do it like this:

data.iloc[test_indices, [4]] = this 

but that results in

IndexError: positional indexers are out-of-bounds 

data.ix[test_indices, ['pred']] = this 
KeyError: '[0] not in index' 

Answers

1

Try:

data.loc[data.index[test_indices], 'pred'] = this 
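
As a minimal sketch of why this works: `data.index[test_indices]` turns the positions into row labels, which is what `.loc` expects. The small frame below is a hypothetical stand-in for the one in the question, not the asker's actual data. Note that pandas aligns a Series on its index when assigning, so a Series whose index does not match the target row labels can leave NaN behind; the sketch passes the raw numpy array to avoid that. It also shows a purely positional variant with `.iloc`, where the column position is looked up with `columns.get_loc` instead of being hard-coded.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's frame: labels are not 0..n-1
data = pd.DataFrame(
    {
        "class": ["collection1", "collection2", "collection1", "collection2"],
        "pred": [None] * 4,
    },
    index=[20, 223, 12, 212],
)
test_indices = [1, 3]  # integer positions, not labels
predictions = np.array(["collection2", "collection1"])

# Positional variant on a copy: look up the column position by name
by_position = data.copy()
by_position.iloc[test_indices, by_position.columns.get_loc("pred")] = predictions

# Label-based variant: convert positions to labels, then use .loc
row_labels = data.index[test_indices]
data.loc[row_labels, "pred"] = predictions

print(data)
```

Rows outside `test_indices` keep their original empty value in both variants.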
1

I prefer .ix over .loc. You can use

data.ix[bool_series, 'pred'] = this 

Here, bool_series is a boolean Series that is True for the rows whose values you want to update and False otherwise. For example:

bool_series = ((data['col1'] > some_number) & (data['col2'] < some_other_number)) 

However, make sure the 'pred' column already exists before using data.ix[bool_series, 'pred']. Otherwise, it will give an error.
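
A small sketch of the boolean-mask approach, written with `.loc` rather than `.ix` because `.ix` was deprecated in pandas 0.20 and removed in pandas 1.0. The column names `col1` and `col2`, the thresholds, and the values assigned are all made up for illustration. The 'pred' column is created first, as advised above, and the new values are passed as a plain list whose length matches the number of True rows.

```python
import pandas as pd

# Hypothetical data; col1/col2 and the thresholds are illustrative only
data = pd.DataFrame({"col1": [1, 5, 8, 3], "col2": [10, 2, 1, 7]})
data["pred"] = None  # create the target column before assigning into it

some_number, some_other_number = 2, 5
bool_series = (data["col1"] > some_number) & (data["col2"] < some_other_number)

# One new value per True row, in row order
new_values = ["a", "b"]
data.loc[bool_series, "pred"] = new_values

print(data)
```

Only the two rows where both conditions hold are updated; the others keep their empty value.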

+0

ix is going to be deprecated – piRSquared

+0

Oh, thanks for the update. I wasn't aware of that. –
