2016-04-30 70 views
How do I drop the DATE and TIME rows that have NaN values in the 'oo' column? How to drop datetimes with NaN values

Here is my CSV:

DATE,TIME,OPEN,HIGH,LOW,CLOSE,VOLUME
02/03/1997,09:04:00,3046.00,3048.50,3046.00,3047.50,505
02/03/1997,09:05:00,3047.00,3048.00,3046.00,3047.00,162
02/03/1997,09:06:00,3047.50,3048.00,3047.00,3047.50,98
02/03/1997,09:07:00,3047.50,3047.50,3047.00,3047.50,228
02/03/1997,09:08:00,3048.00,3048.00,3047.50,3048.00,136
02/03/1997,09:09:00,3048.00,3048.00,3046.50,3046.50,174
02/03/1997,09:10:00,3046.50,3046.50,3045.00,3045.00,134
02/03/1997,09:11:00,3045.50,3046.00,3044.00,3045.00,43
02/03/1997,09:12:00,3045.00,3045.50,3045.00,3045.00,214
02/03/1997,09:13:00,3045.50,3045.50,3045.50,3045.50,8
02/03/1997,09:14:00,3045.50,3046.00,3044.50,3044.50,152
02/03/1997,09:15:00,3044.00,3044.00,3042.50,3042.50,126
02/03/1997,09:16:00,3043.50,3043.50,3043.00,3043.00,128
02/03/1997,09:17:00,3042.50,3043.50,3042.50,3043.50,23
02/03/1997,09:18:00,3043.50,3044.50,3043.00,3044.00,51
02/03/1997,09:19:00,3044.50,3044.50,3043.00,3043.00,18
02/03/1997,09:20:00,3043.00,3045.00,3043.00,3045.00,23
02/03/1997,09:21:00,3045.00,3045.00,3044.50,3045.00,51
02/03/1997,09:22:00,3045.00,3045.00,3045.00,3045.00,47
02/03/1997,09:23:00,3045.50,3046.00,3045.00,3045.00,77
02/03/1997,09:24:00,3045.00,3045.00,3045.00,3045.00,131
02/03/1997,09:25:00,3044.50,3044.50,3043.50,3043.50,138
02/03/1997,09:26:00,3043.50,3043.50,3043.50,3043.50,6
02/03/1997,09:27:00,3043.50,3043.50,3043.00,3043.00,56
02/03/1997,09:28:00,3043.00,3044.00,3043.00,3044.00,32
02/03/1997,09:29:00,3044.50,3044.50,3044.50,3044.50,63
02/03/1997,09:30:00,3045.00,3045.00,3045.00,3045.00,28

Here is my code:

exp = pd.read_csv('example.txt', parse_dates = [["DATE", "TIME"]], index_col=0) 

exp['oo'] = opcl.OPEN.resample("5Min").first() 
print exp['oo'] 

and I get this:

DATE_TIME 
1997-02-03 09:04:00  NaN 
1997-02-03 09:05:00 3047.0 
1997-02-03 09:06:00  NaN 
1997-02-03 09:07:00  NaN 
1997-02-03 09:08:00  NaN 
1997-02-03 09:09:00  NaN 
1997-02-03 09:10:00 3046.5 

I want to get rid of every DATE_TIME row that has a NaN value in the 'oo' column. I tried this:

exp['oo'] = exp['oo'].dropna() 

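But I get the same result. I have read all through http://pandas.pydata.org/pandas-docs/stable/missing_data.html

and I have also searched this site.

I would like to keep my CSV reading code as it is, but I don't know.

If anyone can help, it would be greatly appreciated. Thank you very much for your time.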
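For readers following along, here is a minimal sketch of why assigning the result of dropna() back into the column appears to do nothing. The frame below is a hypothetical stand-in for `exp`, built by hand from a few OPEN values in the CSV; the `subset` argument of `DataFrame.dropna` is one possible way to drop the rows, not necessarily what the asker ended up using.

```python
import numpy as np
import pandas as pd

# Hypothetical one-minute frame standing in for `exp` (values from the CSV above).
idx = pd.date_range("1997-02-03 09:04", periods=7, freq="min")
exp = pd.DataFrame(
    {"OPEN": [3046.0, 3047.0, 3047.5, 3047.5, 3048.0, 3048.0, 3046.5]},
    index=idx,
)

# A sparse column like the one in the question: values only on 5-minute marks.
exp["oo"] = [np.nan, 3047.0, np.nan, np.nan, np.nan, np.nan, 3046.5]

# dropna() returns a shorter Series, but assigning it back to a column
# re-aligns it on the frame's index, so the dropped rows become NaN again.
exp["oo"] = exp["oo"].dropna()
still_nan = int(exp["oo"].isna().sum())

# Dropping rows from the frame itself removes those timestamps.
cleaned = exp.dropna(subset=["oo"])
print(cleaned)
```

The point of the sketch is that the rows have to be removed from the DataFrame, because a column always has the same length as the frame's index.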

'opcl' is not defined above. – Alexander

Answers

I think you want something like this:

>>> exp.OPEN.resample("5Min", how='first') 

DATE_TIME 
1997-02-03 09:00:00 3046.0 
1997-02-03 09:05:00 3047.0 
1997-02-03 09:10:00 3046.5 
1997-02-03 09:15:00 3044.0 
1997-02-03 09:20:00 3043.0 
1997-02-03 09:25:00 3044.5 
1997-02-03 09:30:00 3045.0 
Freq: 5T, Name: OPEN, dtype: float64 
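As a rough end-to-end sketch of the same idea with the method-style call the question already uses (`.resample(...).first()`), the snippet below reads the first few CSV rows from an inline string instead of example.txt. The inline text, the explicit `pd.to_datetime` step, and the month/day/year format are assumptions made for illustration; the format is chosen only because it matches the 1997-02-03 index shown in the question.

```python
import io
import pandas as pd

# First seven rows of the CSV from the question, inlined for a self-contained example.
csv_text = """DATE,TIME,OPEN,HIGH,LOW,CLOSE,VOLUME
02/03/1997,09:04:00,3046.00,3048.50,3046.00,3047.50,505
02/03/1997,09:05:00,3047.00,3048.00,3046.00,3047.00,162
02/03/1997,09:06:00,3047.50,3048.00,3047.00,3047.50,98
02/03/1997,09:07:00,3047.50,3047.50,3047.00,3047.50,228
02/03/1997,09:08:00,3048.00,3048.00,3047.50,3048.00,136
02/03/1997,09:09:00,3048.00,3048.00,3046.50,3046.50,174
02/03/1997,09:10:00,3046.50,3046.50,3045.00,3045.00,134
"""

raw = pd.read_csv(io.StringIO(csv_text))

# Assumption: dates are month/day/year, matching the index shown in the question.
raw.index = pd.to_datetime(
    raw["DATE"] + " " + raw["TIME"], format="%m/%d/%Y %H:%M:%S"
)
raw.index.name = "DATE_TIME"
exp = raw.drop(columns=["DATE", "TIME"])

# First OPEN value in each 5-minute bin; the result has its own 5-minute index.
oo = exp["OPEN"].resample("5min").first()
print(oo)
```

Keeping `oo` as its own Series, rather than assigning it into the one-minute frame, avoids the NaN rows, since the resampled result only has one entry per 5-minute bin.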

When I do this, I still get all NaN for some reason –

Which version of Pandas are you using? 'pd.__version__' – Alexander

I am using 0.18.0 –