Remove all BOMs from a file with multiple BOMs

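I have a text file containing multiple lines that begin with a byte order mark (BOM). Passing encoding='utf-8-sig' to open removes the BOM at the start of the file, but all subsequent BOMs remain. Is there a more correct way to remove them than this: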

import codecs

filepath = 'foo.txt'
bom_len = len(codecs.BOM_UTF8)

def remove_bom(s):
    # Work on the UTF-8 bytes so the BOM can be matched as a byte sequence.
    s = s.encode('utf-8')

    # Strip the BOM only when it actually leads the line.
    if s.startswith(codecs.BOM_UTF8):
        s = s[bom_len:]

    return s.decode('utf-8')

try:
    with open(filepath, encoding='utf-8-sig') as file_object:
        for line in file_object:
            line = line.rstrip()
            line = remove_bom(line)
            if line != '':
                print([line[0]])
except FileNotFoundError:
    print('No file found at ' + filepath)

Source

2016-06-10 maxhallinan

Read the file as a binary string, count how many BOMs it contains, then remove that count * 3 bytes from the start of the string.

I may be misunderstanding: wouldn't that only work if the multiple BOMs were *all* at the start of the file? This file has BOMs throughout. – maxhallinan

I had a similar problem. This did the trick for me:

import codecs 
with open(path, "rb") as infile: 
    bytecontent = infile.read() 
bytecontent = bytecontent.replace(codecs.BOM_UTF8, b"")

Source

2017-12-12 13:03:28 walkslowly
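
The answer above stops at the cleaned bytes. As a minimal sketch of how that could be carried through to text, the helper below reads the file in binary mode, removes every occurrence of the UTF-8 BOM, and decodes the result. The name strip_all_boms and its parameters are hypothetical, chosen for illustration, and the sketch assumes the file is otherwise valid UTF-8.

```python
import codecs
from pathlib import Path


def strip_all_boms(path, encoding="utf-8"):
    """Return the file's text with every UTF-8 BOM removed.

    Hypothetical helper for illustration; assumes the rest of the
    file decodes cleanly with the given encoding.
    """
    raw = Path(path).read_bytes()
    # codecs.BOM_UTF8 is the byte sequence EF BB BF. Replace it wherever
    # it occurs, not only at the very start of the file.
    cleaned = raw.replace(codecs.BOM_UTF8, b"")
    return cleaned.decode(encoding)
```

The same result should be reachable at the text level: after decoding with plain 'utf-8', each BOM appears as the character '\ufeff', so text.replace('\ufeff', '') removes them as well. Working on bytes just avoids decoding before the cleanup.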
