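Drop DataFrame rows after the first NaN

I have a problem for which I can only find solutions to the opposite problem. I need to be able to drop all rows in a DataFrame that come after the first NaN value in a specific column. I haven't been able to find a function that works like pandas' first_valid_index, but in reverse.

What I have is something like this: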

import pandas as pd

data = {'state': ['Ohio', 'Ohio', 'Ohio', 'NaN', 'Nevada'],
        'year': [2000, 2001, 2002, 2001, 2002],
        'pop': [1.5, 1.7, 3.6, 2.4, 2.9]}
frame = pd.DataFrame(data)

What I'd like to end up with is this:

data = {'state': ['Ohio', 'Ohio', 'Ohio'], 
     'year': [2000, 2001, 2002], 
     'pop': [1.5, 1.7, 3.6]} 
frame = pd.DataFrame(data)

So once the first NaN is found in the state column, the DataFrame is cut so that it only includes the rows above it.

Thanks a lot!

Source

2017-08-03 jim mako

Assuming the 'NaN' in the sample data set stands for an actual NaN:

In [341]: new = frame.loc[:frame.state.isnull().idxmax()-1] 

In [342]: new 
Out[342]: 
    pop state year 
0 1.5 Ohio 2000 
1 1.7 Ohio 2001 
2 3.6 Ohio 2002

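Explanation: idxmax() returns the index of the first occurrence of the maximum value. On the boolean mask produced by isnull(), the maximum is True, so this is the index of the first NaN. Because .loc slices include the end label, 1 is subtracted to leave the NaN row out, which relies on the frame having the default integer index.

Demo: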
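To make the idxmax() behaviour concrete, here is a minimal sketch on a standalone Series. The variable names are just for illustration and aren't from the answer. It also shows the edge case where the column has no NaN at all: the mask is all False, so idxmax() simply returns the first label.

```python
import numpy as np
import pandas as pd

state = pd.Series(['Ohio', 'Ohio', 'Ohio', np.nan, 'Nevada'])
mask = state.isnull()            # False, False, False, True, False

first_nan = mask.idxmax()        # label of the first True (first NaN)
first_valid = mask.idxmin()      # label of the first False (first non-null)

# No NaN anywhere: every value in the mask is False,
# so idxmax() falls back to the first label in the index.
no_nan_mask = pd.Series(['Ohio', 'Nevada']).isnull()
fallback = no_nan_mask.idxmax()

print(first_nan, first_valid, fallback)
```

So if there may be no NaN in the column, it is worth checking mask.any() before trusting the result of idxmax().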

In [345]: frame.loc[1,'state'] = np.nan 

In [346]: frame 
Out[346]: 
    pop state year 
0 1.5 Ohio 2000 
1 1.7  NaN 2001 
2 3.6 Ohio 2002 
3 2.4  NaN 2001 
4 2.9 Nevada 2002 

In [347]: frame.loc[:frame.state.isnull().idxmax()-1] 
Out[347]: 
    pop state year 
0 1.5 Ohio 2000 

In [348]: frame.state.isnull().idxmax() 
Out[348]: 1

Source

2017-08-03 16:28:46 MaxU

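Shouldn't this be 'idxmin'? –

Evidently, the 'NaN' here is a string. – Zero

@WillemVanOnsem, no, 'idxmin()' would return the index of the __first__ non-null value – MaxU

The solution below also works if NaN is the first element of the series, or if there are no NaN values in the series.

As NaN, I accept either null values or any string that starts with NaN.

It finds the index location of the first NaN value (or None if there are no NaN values) and then slices the dataframe.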
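As the comment above points out, the 'NaN' in the question's sample data is a string, so isnull() won't flag it. A minimal sketch of one way to turn that placeholder into a real missing value first, assuming the placeholder is exactly the string 'NaN':

```python
import pandas as pd

frame = pd.DataFrame({'state': ['Ohio', 'Ohio', 'Ohio', 'NaN', 'Nevada'],
                      'year': [2000, 2001, 2002, 2001, 2002],
                      'pop': [1.5, 1.7, 3.6, 2.4, 2.9]})

# The string 'NaN' is an ordinary value, so nothing is reported as missing.
missing_before = int(frame['state'].isnull().sum())

# mask() replaces entries where the condition is True with a missing value.
frame['state'] = frame['state'].mask(frame['state'] == 'NaN')
missing_after = int(frame['state'].isnull().sum())

print(missing_before, missing_after)
```

After this conversion the isnull()-based approaches in the answers apply as written.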

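# True where state is missing or is a string starting with 'NaN'
idx = (frame['state'].isnull() | frame['state'].str.startswith('NaN', na=False))
# Label of the first match, or None (keep every row) if there is no match
idx = idx.idxmax() if idx.any() else None
frame[:idx]  # positional slice; assumes the default 0..n-1 index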
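Both answers rely on the default integer index. If the frame might have some other index, a cumulative-sum mask is one possible alternative. This is only a sketch, not something from the answers above, and rows_before_first_nan is a made-up helper name:

```python
import numpy as np
import pandas as pd

def rows_before_first_nan(df, column):
    """Return the rows of df that come before the first missing value in column."""
    # The running count of NaNs stays at 0 until the first NaN is reached.
    nan_count = df[column].isnull().cumsum()
    return df[nan_count == 0]

frame = pd.DataFrame({'state': ['Ohio', 'Ohio', 'Ohio', np.nan, 'Nevada'],
                      'year': [2000, 2001, 2002, 2001, 2002]},
                     index=list('abcde'))   # non-default index on purpose

trimmed = rows_before_first_nan(frame, 'state')
print(trimmed)
```

With no NaN in the column this returns every row, and with a NaN in the first row it returns an empty frame, so neither case needs special handling.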

Source

2017-08-03 16:40:13 Alexander

++ for handling the case where there is no NaN – MaxU
