2016-07-07 235 views
1

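Unable to access Excel files using pandas in Python

Hello, I want to run my Python code over several Excel files, read the data from each file, and save it all into one DataFrame. Here is my code: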

import os 
import glob 
import pandas as pd 


path =r'C:\Users\user1\Desktop\test' 
files = os.listdir(path) 
files_xls = [f for f in files if f[-3:] == 'xls'] 
df = pd.DataFrame() 
for f in files_xls: 
    filename, ext = os.path.splitext(f) 
    data = pd.read_excel(f, filename) 
    df = df.append(data) 

a = df.describe() 
print (a) 

I get this error. The first file in my working folder is TEST.XLS:

Traceback (most recent call last): 
    File "test.py", line 20, in <module> 
    data = pd.read_excel(f, filename) 
    File "C:\Users\user1\AppData\Local\Programs\Python\Python35-32\lib\site-packages\pandas\io\excel.py", line 170, in read_excel
    io = ExcelFile(io, engine=engine) 
    File "C:\Users\user1\AppData\Local\Programs\Python\Python35-32\lib\site-packages\pandas\io\excel.py", line 227, in __init__ 
    self.book = xlrd.open_workbook(io) 
    File "C:\Users\user1\AppData\Local\Programs\Python\Python35-32\lib\site-packages\xlrd\__init__.py", line 395, in open_workbook 
    with open(filename, "rb") as f: 
FileNotFoundError: [Errno 2] No such file or directory: 'test.xls' 
+0

I tested your code with the line 'data = pd.read_excel(f, filename)' changed to 'data = pd.read_excel(f)' and it works fine. Why are you passing the 'filename' argument? – Valilutzik

+0

I tried that. It gives me the same error. –

+0

Did you try the solution below? – Valilutzik

Answers

1
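import os
import pandas as pd

path = r'C:\Users\user1\Desktop\test'
# Make the data folder the working directory so that the bare file names
# returned by os.listdir() resolve correctly.
os.chdir(path)
files = os.listdir(path)
files_xls = [f for f in files if f[-3:] == 'xls']
# Collect the frames and concatenate once; DataFrame.append was removed in pandas 2.0.
frames = []
for f in files_xls:
    data = pd.read_excel(f)
    frames.append(data)
df = pd.concat(frames)

a = df.describe()
print(a)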
+0

Hi, I'm not quite sure where to add this. –

+0

You can put it on the line immediately after the 'path' variable is defined. By the way, use 'data = pd.read_excel(f)' instead of 'data = pd.read_excel(f, filename)' – Valilutzik
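
One caveat with the os.chdir approach above is that it changes the working directory for the whole process. The following is a minimal sketch, not part of the original answer, of limiting that change to a block; the helper name `working_directory` is hypothetical, and the demonstration uses a temporary folder with an empty placeholder file instead of real Excel data.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def working_directory(folder):
    """Temporarily switch the working directory (hypothetical helper)."""
    previous = os.getcwd()
    os.chdir(folder)
    try:
        yield
    finally:
        # Always restore the original directory, even if the body raises.
        os.chdir(previous)


# Demonstration in a throwaway folder so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    open(os.path.join(folder, 'test.xls'), 'w').close()  # empty placeholder file
    with working_directory(folder):
        # Inside the block, the bare name resolves against `folder`.
        print(os.path.exists('test.xls'))
    # Outside the block, the original working directory is active again.
    print(os.getcwd())
```

On Python 3.11 and later, the standard library's contextlib.chdir offers the same behaviour.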

0

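The file cannot be found because you are passing a relative path to the Excel file. Relative paths are resolved against the current working directory, and your Python script is probably not being run from the folder that contains the files. Use an absolute path instead, which works no matter where the script is run from. You can build one by joining the folder path to the file name with os.path.join():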

import os 
import pandas as pd 

path = r'C:\Users\user1\Desktop\test' 

files = os.listdir(path) 
files_xls = [f for f in files if f[-3:] == 'xls'] 

dfList = [] 
for f in files_xls: 
    data = pd.read_excel(os.path.join(path, f)) 
    dfList.append(data) 

df = pd.concat(dfList) 
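
To see why the join matters, here is a small sketch using only the standard library. It assumes nothing about the asker's real folder: it creates a temporary one with an empty placeholder file, and the helper name `list_excel_paths` is made up for illustration. It also matches the extension case-insensitively, which the original check does not.

```python
import os
import tempfile


def list_excel_paths(folder):
    """Return the full paths of the .xls files in `folder` (hypothetical helper)."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith('.xls')  # also accepts names such as TEST.XLS
    )


# Demonstration in a throwaway folder so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    open(os.path.join(folder, 'test.xls'), 'w').close()  # empty placeholder file
    bare_names = os.listdir(folder)
    full_paths = list_excel_paths(folder)
    # os.listdir() returns names without the folder, so they only resolve
    # if the current working directory happens to be `folder`.
    print(bare_names, os.path.exists(bare_names[0]))
    print(full_paths, os.path.exists(full_paths[0]))
```

Each path returned by the helper can be passed straight to pd.read_excel().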

Alternatively, using glob avoids the extension check and returns the full paths of the files:

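import glob
import os
import pandas as pd

path = r'C:\Users\user1\Desktop\test'
# Build the pattern with os.path.join; '\*.xls' relies on an invalid escape sequence.
files_xls = glob.glob(os.path.join(path, '*.xls'))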

dfList = [] 
for f in files_xls: 
    data = pd.read_excel(f) 
    dfList.append(data) 

df = pd.concat(dfList)
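
As a quick check of what glob returns, the sketch below runs against a temporary folder with empty placeholder files, so the file names are invented and no real Excel data is involved. The helper name `find_xls` is hypothetical. The results are sorted because glob.glob() returns matches in arbitrary order.

```python
import glob
import os
import tempfile


def find_xls(folder):
    """Return the full paths of files matching *.xls in `folder` (hypothetical helper)."""
    return sorted(glob.glob(os.path.join(folder, '*.xls')))


# Demonstration in a throwaway folder so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    for name in ('jan.xls', 'feb.xls', 'notes.txt'):
        open(os.path.join(folder, name), 'w').close()  # empty placeholder files
    for found in find_xls(folder):
        # Each result already includes the folder, so no join is needed.
        print(found)
```

Note that the pattern '*.xls' does not match files ending in '.xlsx'; those would need their own pattern.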