Replacing missing values and updating old values in a DataFrame using NumPy and Pandas

I am trying to replace the missing values in my DataFrame, which are represented by '...', with np.nan. I also want to update some old values, but my approach does not seem to work.

Here is my code:

import numpy as np 
import pandas as pd 


def func(): 
    energy=pd.ExcelFile('Energy Indicators.xls').parse('Energy') 
    energy=energy.iloc[16:][['Environmental Indicators: Energy','Unnamed: 3','Unnamed: 4','Unnamed: 5']].copy() 
    energy.columns=['Country', 'Energy Supply', 'Energy Supply per Capita', '% Renewable'] 
    o="..." 
    n=np.NaN 

    # Trying to replace missing values with np.nan values 
    energy[energy['Energy Supply']==o]=n 


    energy['Energy Supply']=energy['Energy Supply']*1000000 


    # Here, I want to replace old values by new ones ==> Same problem 
    old=["Republic of Korea","United States of America","United Kingdom of " 
           +"Great Britain and Northern Ireland","China, Hong " 
           +"Kong Special Administrative Region"] 
    new=["South Korea","United States","United Kingdom","Hong Kong"] 
    for i in range(0,4): 
        energy[energy['Country']==old[i],'Country']=new[i] 


    return energy

Here is the .xls file I am working with: https://drive.google.com/file/d/0B80lepon1RrYeDRNQVFWYVVENHM/view?usp=sharing


2017-10-21 sali333

I would use a regex-based df.replace:

energy = energy.replace(r'\s*\.+\s*', np.nan, regex=True)
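A minimal sketch of what this call does, on a small made-up frame (the rows below are hypothetical, not taken from the spreadsheet). My understanding is that when the replacement is not a string, pandas replaces the whole cell as soon as the pattern is found anywhere in it, so a text cell that merely contains a dot could also be turned into NaN. The anchored pattern in the second call is my own more conservative variant, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical rows: real numbers stay numeric, placeholders are text.
df = pd.DataFrame({
    "Country": ["Aland", "Borduria", "Carpania"],
    "Energy Supply": [321, "...", " ... "],
})

# Pattern from the answer: optional whitespace around one or more dots.
cleaned = df.replace(r"\s*\.+\s*", np.nan, regex=True)

# Anchored variant (an assumption on my part): only cells consisting
# entirely of dots and surrounding whitespace are treated as missing.
anchored = df.replace(r"^\s*\.+\s*$", np.nan, regex=True)

print(cleaned["Energy Supply"].isna().tolist())
print(anchored["Energy Supply"].isna().tolist())
```

Both calls leave numeric cells untouched, because the regex is only applied to string values.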

MaxU suggested an alternative, which will work if your cells contain no special or whitespace characters other than the dots.

energy = energy.replace('...', np.nan, regex=False)
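For completeness, here is a rough sketch of how the literal replacement could fit into the rest of the question's function, including the country renaming that the question also asks about. The frame and its supply figures are invented for illustration, and using pd.to_numeric plus a rename mapping is my suggestion rather than something confirmed in this thread.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the parsed sheet; the supply figures are invented.
energy = pd.DataFrame({
    "Country": ["Republic of Korea", "United States of America", "Iceland"],
    "Energy Supply": [11007, "...", 117],
})

# Literal replacement: only cells exactly equal to '...' become NaN.
energy = energy.replace("...", np.nan)

# Make the column numeric before scaling it, as the question does.
energy["Energy Supply"] = pd.to_numeric(energy["Energy Supply"]) * 1_000_000

# Rename countries with a mapping instead of looping over list indices.
renames = {
    "Republic of Korea": "South Korea",
    "United States of America": "United States",
    "United Kingdom of Great Britain and Northern Ireland": "United Kingdom",
    "China, Hong Kong Special Administrative Region": "Hong Kong",
}
energy["Country"] = energy["Country"].replace(renames)

print(energy)
```

If you prefer to keep the loop from the question, the assignment needs .loc to combine a row mask with a column label: energy.loc[energy['Country'] == old[i], 'Country'] = new[i].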


2017-10-21 23:34:16

I think it should be energy = energy.replace('...', np.nan, regex=False) – MaxU

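@MaxU regex defaults to False, which means something was off with the column values (possibly stray whitespace), so I decided to go with a regex. I'll add yours as well! –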

energy = energy.replace('...', np.nan) works well – sali333
