2014-11-05 52 views

Answer

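OK, after going through the __init__, book, and xlsx modules of the xlrd code on GitHub (https://github.com/python-excel/xlrd/), I see that the Book object returned after reading has no attribute that holds the file type. The closest I can get is to use a log file and set verbosity to True: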

import xlrd

def ReadSpreadsheet(filePath):
    with open(filePath + '.log.txt', 'w') as myLog:  # closed even if xlrd raises
        myLog.write('Opening ' + filePath + '\n')
        wBook = xlrd.open_workbook(filePath, logfile=myLog, verbosity=True)  # verbosity is an int level; True counts as 1

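This function writes a log file that shows the components of each file. With four files tested, the logs make it very obvious which files are recognized as XLSX, which are recognized as XLS, and which one is not recognized at all: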

Office 2010 XLSX file:

>>> testing_xls.ReadSpreadsheet('MS.xlsx') 

Opening MS.xlsx 
ZIP component_names: 
['[Content_Types].xml', 
'_rels/.rels', 
'xl/_rels/workbook.xml.rels', 
'xl/workbook.xml', 
'xl/sharedStrings.xml', 
'xl/worksheets/_rels/sheet1.xml.rels', 
'xl/theme/theme1.xml', 
'xl/styles.xml', 
'xl/worksheets/sheet1.xml', 
'docProps/core.xml', 
'xl/printerSettings/printerSettings1.bin', 
'docProps/app.xml'] 

Office 2010 XLS file:

>>> testing_xls.ReadSpreadsheet('MS.xls') 

Opening MS.xls 
CODEPAGE: codepage 1200 -> encoding 'utf_16_le' 
DATEMODE: datemode 0 
Countries: (1, 1) 

Colour indexes used: 
[] 

NOTE *** sheet 0 (u'Sheet1'): DIMENSIONS R,C = 26,9 should be 23,9 

LibreOffice 4.2 XLSX file:

>>> testing_xls.ReadSpreadsheet('Libre.xlsx') 

Opening Libre.xlsx 
ZIP component_names: 
[u'_rels/.rels', 
u'docProps/app.xml', 
u'docProps/core.xml', 
u'xl/_rels/workbook.xml.rels', 
u'xl/sharedStrings.xml', 
u'xl/worksheets/_rels/sheet1.xml.rels', 
u'xl/worksheets/sheet1.xml', 
u'xl/styles.xml', 
u'xl/workbook.xml', 
u'[Content_Types].xml'] 

LibreOffice 4.2 ODS file:

>>> testing_xls.ReadSpreadsheet('Libre.ods') 
Traceback (most recent call last): 
    File "<stdin>", line 1, in <module> 
    File "testing_xls.py", line 6, in ReadSpreadsheet 
    wBook = xlrd.open_workbook(filePath, logfile=myLog, verbosity=True) 
    File "/usr/local/lib/python2.7/dist-packages/xlrd/__init__.py", line 422, in open_workbook 
    raise XLRDError('Openoffice.org ODS file; not supported') 
xlrd.biffh.XLRDError: Openoffice.org ODS file; not supported 

[xlrd wrote nothing to the log file.]
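
The component lists in the logs above suggest a check that does not need xlrd at all. Below is a minimal sketch, assuming that an XLSX package always contains xl/workbook.xml (as in both XLSX logs), that an ODS package contains content.xml, and that a legacy XLS file starts with the OLE2 compound-document signature. guess_spreadsheet_type and OLE2_SIGNATURE are names made up for illustration; neither is part of xlrd.

```python
import zipfile

# First 8 bytes of an OLE2 compound document, the container used by legacy .xls files.
OLE2_SIGNATURE = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def guess_spreadsheet_type(file_path):
    """Return 'XLSX', 'ODS', 'XLS' or None (hypothetical helper, not part of xlrd)."""
    if zipfile.is_zipfile(file_path):
        with zipfile.ZipFile(file_path) as archive:
            names = archive.namelist()
        if "xl/workbook.xml" in names:   # seen in both XLSX logs above
            return "XLSX"
        if "content.xml" in names:       # assumption: main part of an ODF package
            return "ODS"
        return None                      # some other kind of ZIP archive
    with open(file_path, "rb") as handle:
        if handle.read(8) == OLE2_SIGNATURE:
            return "XLS"
    return None
```

This is only a heuristic: other OLE2 documents (old .doc files, for example) share the same signature, so 'XLS' here really means "OLE2 container".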

I figure I could catch the XLRDError and return ODS, or read the log and return XLSX if component_names is found and XLS if CODEPAGE is found.
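
A minimal sketch of that idea, assuming Python 3 and an xlrd version that still opens XLSX files and logs the same markers as in the output above. classify_xlrd_log and detect_file_type are hypothetical names, and the log is kept in memory with io.StringIO instead of a file on disk.

```python
import io


def classify_xlrd_log(log_text):
    """Map xlrd's verbose log output to 'XLSX', 'XLS' or None."""
    if "ZIP component_names" in log_text:  # marker seen in the XLSX logs
        return "XLSX"
    if "CODEPAGE" in log_text:             # marker seen in the XLS log
        return "XLS"
    return None


def detect_file_type(file_path):
    """Open the file with xlrd and infer its type from the log or the error."""
    # Imported here so the log classifier above can be used without xlrd installed.
    import xlrd
    from xlrd.biffh import XLRDError

    log = io.StringIO()
    try:
        xlrd.open_workbook(file_path, logfile=log, verbosity=1)
    except XLRDError as error:
        # Relies on the message text shown in the traceback above.
        if "ODS" in str(error):
            return "ODS"
        raise
    return classify_xlrd_log(log.getvalue())
```

Matching on log text and error messages is fragile, since neither is a documented interface. xlrd 2.0 and later only read .xls, so there an XLSX file would surface as an XLRDError rather than a log entry.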

I'll leave this answer up for a few days; if nobody objects or comes up with a better one, I'll accept my own answer. Thanks! – 2014-11-07 15:35:38