How do I read the contents of the files returned by os.walk?

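Using Python 3, I want to os.walk a directory of files, read each one into a binary object (a string?) and do some further processing on them. First step, though: how do I read the files that os.walk returns? How do I get at a file's contents?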

# NOTE: Execute with python3.2.2 

import os 
import sys 

path = "/home/user/my-files" 

count = 0 
successcount = 0 
errorcount = 0 
i = 0 

# for directory in dirs
for (root, dirs, files) in os.walk(path):
    # print(path)
    print(dirs)
    # print(files)

    for file in files:

        base, ext = os.path.splitext(file)
        fullpath = os.path.join(root, file)

        # Read the file into binary? --------
        input = open(fullpath, "r")
        content = input.read()
        length = len(content)
        count += 1
        print(" file: ---->", base, "/", ext, " [count:", count, "]", "[length:", length, "]")
        print("fullpath: ---->", fullpath)

Error:

Traceback (most recent call last): 
    File "myFileReader.py", line 41, in <module> 
    content = input.read() 
    File "/usr/lib/python3.2/codecs.py", line 300, in decode 
    (result, consumed) = self._buffer_decode(data, self.errors, final) 
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe2 in position 11: invalid continuation byte

Source

2011-12-28 DrLou

To read a binary file you must open it in binary mode. Change

input = open(fullpath, "r")

to

input = open(fullpath, "rb")

The result of read() will then be a bytes() object.
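A minimal sketch of that change applied to the loop from the question, assuming the goal is just to collect each file's raw bytes. The helper name read_tree_bytes is hypothetical, and the with statement is an addition so that each file is closed once it has been read.

```python
import os


def read_tree_bytes(path):
    """Yield (fullpath, content) for every file under path, content as bytes.

    read_tree_bytes is a hypothetical helper name, not part of the question.
    """
    for root, dirs, files in os.walk(path):
        for name in files:
            fullpath = os.path.join(root, name)
            # "rb" returns the raw bytes and skips text decoding entirely,
            # so read() cannot raise UnicodeDecodeError here
            with open(fullpath, "rb") as handle:
                yield fullpath, handle.read()


if __name__ == "__main__":
    # Path taken from the question; adjust as needed
    for fullpath, content in read_tree_bytes("/home/user/my-files"):
        print(fullpath, type(content).__name__, len(content))
```

Each content value is a bytes object, so len(content) counts bytes rather than characters.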

Source

2011-12-29 04:29:35

Thanks, Lennart - yes, that was the secret sauce I needed. I'm a bit new to Python 3! – DrLou 2011-12-29 17:03:29

This isn't actually specific to Python 3. Binary files should be opened with the 'b' flag in Python 2 as well. – 2011-12-29 20:20:45

Yeah, in retrospect this all seems a bit silly - but that's how we idiots learn! You were probably thinking: RTFM! Thanks again for your help. – DrLou 2014-11-03 21:17:34

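Because some of your files are binary, they cannot be decoded into the Unicode characters that Python 3 uses for every string in the interpreter. Note that one of the big changes between Python 2 and Python 3 is that the default string type went from a byte string to a Unicode string, which means a character can no longer be treated as simply one byte. (Yes, a text string in Python 3 can need two or four times as much memory as the same string in Python 2: Python 3.2 stores each character in 2 or 4 bytes, depending on how the interpreter was built.)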

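You have several options, and the right one depends on your project:

Ignore binary files by filtering on the file extension.
Read the binary files too, and either catch the decoding exception if and when it occurs and skip that file, or use one of the methods described in this thread: How can I detect if a file is binary (non-text) in python?

As one way of doing this, you could edit your solution to simply catch the UnicodeDecodeError and skip the file.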
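The first two options can be combined in one pass. The following is only a sketch under stated assumptions: the function name read_text_files and the extension set BINARY_EXTENSIONS are hypothetical, and UTF-8 is assumed as the text encoding because that is the codec named in the traceback.

```python
import os

# Hypothetical set of extensions to treat as binary; adjust for your data
BINARY_EXTENSIONS = {".png", ".jpg", ".gif", ".zip", ".pdf"}


def read_text_files(path, skip_extensions=BINARY_EXTENSIONS):
    """Return (texts, skipped).

    texts maps each readable file's path to its decoded contents;
    skipped lists the paths that were filtered out or failed to decode.
    """
    texts = {}
    skipped = []
    for root, dirs, files in os.walk(path):
        for name in files:
            fullpath = os.path.join(root, name)
            ext = os.path.splitext(name)[1].lower()
            if ext in skip_extensions:
                # Option 1: filter by file extension
                skipped.append(fullpath)
                continue
            try:
                # Assumption: text files are UTF-8 encoded
                with open(fullpath, "r", encoding="utf-8") as handle:
                    texts[fullpath] = handle.read()
            except UnicodeDecodeError:
                # Option 2: the bytes are not valid UTF-8, so skip the file
                skipped.append(fullpath)
    return texts, skipped
```

Catching the exception is a heuristic rather than a real binary check: a binary file whose bytes happen to form valid UTF-8 will still be read as text.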

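Whatever you decide, note that if the files on your system use a variety of character encodings, you will need to specify the encoding explicitly, because otherwise Python 3 assumes the platform's default encoding (UTF-8 in your case, as the traceback shows).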
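A short sketch of passing the encoding explicitly. The helper name read_text is hypothetical; encoding and errors are the standard keyword arguments of the built-in open().

```python
def read_text(fullpath, encoding="utf-8", errors="strict"):
    """Read a text file with an explicit encoding.

    read_text is a hypothetical helper. errors="strict" raises
    UnicodeDecodeError on bad bytes; errors="replace" substitutes U+FFFD.
    """
    with open(fullpath, "r", encoding=encoding, errors=errors) as handle:
        return handle.read()
```

For example, read_text(fullpath, encoding="latin-1") reads a Latin-1 file, and read_text(fullpath, errors="replace") keeps going past undecodable bytes at the cost of losing them. Which encoding each file really uses still has to be known or guessed; the sketch does not detect it.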

For reference, here is a great presentation on I/O in Python 3: http://www.dabeaz.com/python3io/MasteringIO.pdf

Source

2011-12-29 04:56:41

Thanks for the link and your comments - these are very useful for my learning process. So far, at least, all of the files have been readable as binary. – DrLou 2011-12-29 17:07:20
