2016-08-18 67 views
2

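zipfile in Python produces not quite normal ZIP files

In my project a set of files is created and packed into a ZIP archive to be used on an Android phone. The Android application opens such ZIP files to read initial data and then stores the results of its work in the same ZIP. I have no access to the source code of that Android application, nor to the old script that generated the ZIP files (in fact, I do not know how the old ZIP files were created). However, the structure of the ZIP archive is known, and I have written a new Python script to produce the same files.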

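I ran into the following problem: ZIP files produced by my script cannot be opened by the Android application (an error message about an incorrect file structure appears). But if I unpack all the content and pack it back into a new ZIP file with the same name using WinZIP, 7-Zip or "Send to -> Compressed (zipped) folder" (in Windows 7), the file is processed normally on the phone. This leads me to the conclusion that the problem is not in the Android application.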

The code snippet that packs the folder into a ZIP is as follows

import os
import shutil
import zipfile

# make zip (prefix is the name of the directory to pack, defined elsewhere)
try:
    with zipfile.ZipFile(prefix + '.zip', 'w') as zipf:
        for root, dirs, files in os.walk(prefix):
            for file in files:
                zipf.write(os.path.join(root, file))
    # remove the directory that was packed
    shutil.rmtree(prefix)
    # report the result
    print('File ' + prefix + '.zip was created')
except Exception:
    print('Unexpected error occurred while creating file ' + prefix + '.zip')

After I noticed that the files were not compressed, I added the compression option:

zipfile.ZipFile(prefix + '.zip', 'w', zipfile.ZIP_DEFLATED) 

But this did not solve my problem, and setting allowZip64 to True did not change the situation either.

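By the way, the ZIP file made with zipfile.ZIP_DEFLATED is about 5 kilobytes larger than the ZIP file produced by Windows and about 14 kilobytes smaller than the 7-Zip result for the same content. At the same time, all of these ZIP files can be opened in 7-Zip and Windows Explorer for visual comparison.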

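So I have three related questions:

1) What could cause this strange behavior of my script with zipfile?

2) How else can I influence what zipfile produces?

3) How can I check a ZIP file created with zipfile to find possible structural problems, or to make sure there are none?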

Of course, if I have to give up on zipfile, I can use an external archiver (e.g. 7-Zip) to pack the files, but I would like to find an elegant solution if one exists.
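For the third question, a basic integrity check can be sketched with the standard library alone. The helper name `check_zip` and the backslash rule are assumptions made for illustration. A check like this only validates the archive itself; it cannot tell what a particular consumer (such as the Android application) expects to find inside.

```python
import io
import zipfile


def check_zip(source):
    """Return a list of problems found in a ZIP archive (empty if none).

    `source` may be a path or a binary file-like object.
    (Hypothetical helper, not part of the original script.)
    """
    if not zipfile.is_zipfile(source):
        return ["not recognised as a ZIP archive"]
    problems = []
    with zipfile.ZipFile(source) as zf:
        # testzip() reads every member and returns the name of the first
        # one with a bad CRC or header, or None if all are readable.
        bad = zf.testzip()
        if bad is not None:
            problems.append("corrupt member: " + bad)
        for info in zf.infolist():
            # Assumption: backslashes in stored names are worth flagging,
            # since member names are expected to use forward slashes.
            if "\\" in info.filename:
                problems.append("backslash in name: " + info.filename)
    return problems


if __name__ == "__main__":
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("en_US_00001/info.xml", "<info/>")
    print(check_zip(buf))
```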

UPDATE:

To check the content of the ZIP file created with zipfile, I did the following

import os
import shutil
import zipfile
from contextlib import closing

# make zip
flist = []
try:
    with zipfile.ZipFile(prefix + '.zip', 'w', zipfile.ZIP_DEFLATED) as zipf:
        for root, dirs, files in os.walk(prefix):
            for file in files:
                zipf.write(os.path.join(root, file))
                # store the item in the list, using '/' as in the archive
                flist.append(os.path.join(root, file).replace("\\", "/"))
    # remove the directory that was packed
    shutil.rmtree(prefix)
    # report the result
    print('File ' + prefix + '.zip was created')
except Exception:
    print('Unexpected error occurred while creating file ' + prefix + '.zip')
# check the zip
with closing(zipfile.ZipFile(prefix + '.zip')) as zfile:
    for info in zfile.infolist():
        print(info.filename +
              ' (extra = ' + str(info.extra) +
              '; compress_type = ' + ('ZIP_DEFLATED' if info.compress_type == zipfile.ZIP_DEFLATED else 'NOT ZIP_DEFLATED') +
              ')')
        # remove the item from the list of written files
        if info.filename in flist:
            flist.remove(info.filename)
        else:
            print(info.filename + ' is unexpected item')
print('Number of items that were missed:')
print(len(flist))

and got the following output:

File en_US_00001.zip was created 
en_US_00001/en_US_00001_0001/en_US_00001_0001_big.png (extra = b''; compress_type = ZIP_DEFLATED) 
en_US_00001/en_US_00001_0001/en_US_00001_0001_info.xml (extra = b''; compress_type = ZIP_DEFLATED) 
en_US_00001/en_US_00001_0001/en_US_00001_0001_small.png (extra = b''; compress_type = ZIP_DEFLATED) 
en_US_00001/en_US_00001_0001/en_US_00001_0001_source.pkl (extra = b''; compress_type = ZIP_DEFLATED) 
en_US_00001/en_US_00001_0001/en_US_00001_0001_source.tex (extra = b''; compress_type = ZIP_DEFLATED) 
en_US_00001/en_US_00001_0001/en_US_00001_0001_user.png (extra = b''; compress_type = ZIP_DEFLATED) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_big.png (extra = b''; compress_type = ZIP_DEFLATED) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_info.xml (extra = b''; compress_type = ZIP_DEFLATED) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_small.png (extra = b''; compress_type = ZIP_DEFLATED) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_source.pkl (extra = b''; compress_type = ZIP_DEFLATED) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_source.tex (extra = b''; compress_type = ZIP_DEFLATED) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_user.png (extra = b''; compress_type = ZIP_DEFLATED) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_big.png (extra = b''; compress_type = ZIP_DEFLATED) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_info.xml (extra = b''; compress_type = ZIP_DEFLATED) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_small.png (extra = b''; compress_type = ZIP_DEFLATED) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_source.pkl (extra = b''; compress_type = ZIP_DEFLATED) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_source.tex (extra = b''; compress_type = ZIP_DEFLATED) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_user.png (extra = b''; compress_type = ZIP_DEFLATED) 
Number of items that were missed: 
0 

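So everything that was written was read back, but the question remains: was everything that is needed actually written? For example, Harold mentioned relative paths in a comment... perhaps that is the key to the answer.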

UPDATE 2

When I replace zipfile with a call to the external 7-Zip, using the code

import shutil
import subprocess
import zipfile
from contextlib import closing

# make zip
subprocess.call(["7z.exe", "a", prefix + ".zip", prefix])
shutil.rmtree(prefix)
# check the zip
with closing(zipfile.ZipFile(prefix + '.zip')) as zfile:
    for info in zfile.infolist():
        print(info.filename)
        print(' (extra = ' + str(info.extra) + '; compress_type = ' + str(info.compress_type) + ')')
print('Values for compress_type:')
print(str(zipfile.ZIP_DEFLATED) + ' = ZIP_DEFLATED')
print(str(zipfile.ZIP_STORED) + ' = ZIP_STORED')

the following result is produced

Creating archive en_US_00001.zip 

Compressing en_US_00001\en_US_00001_0001\en_US_00001_0001_big.png 
Compressing en_US_00001\en_US_00001_0001\en_US_00001_0001_info.xml 
Compressing en_US_00001\en_US_00001_0001\en_US_00001_0001_small.png 
Compressing en_US_00001\en_US_00001_0001\en_US_00001_0001_source.pkl 
Compressing en_US_00001\en_US_00001_0001\en_US_00001_0001_source.tex 
Compressing en_US_00001\en_US_00001_0001\en_US_00001_0001_user.png 
Compressing en_US_00001\en_US_00001_0002\en_US_00001_0002_big.png 
Compressing en_US_00001\en_US_00001_0002\en_US_00001_0002_info.xml 
Compressing en_US_00001\en_US_00001_0002\en_US_00001_0002_small.png 
Compressing en_US_00001\en_US_00001_0002\en_US_00001_0002_source.pkl 
Compressing en_US_00001\en_US_00001_0002\en_US_00001_0002_source.tex 
Compressing en_US_00001\en_US_00001_0002\en_US_00001_0002_user.png 
Compressing en_US_00001\en_US_00001_0003\en_US_00001_0003_big.png 
Compressing en_US_00001\en_US_00001_0003\en_US_00001_0003_info.xml 
Compressing en_US_00001\en_US_00001_0003\en_US_00001_0003_small.png 
Compressing en_US_00001\en_US_00001_0003\en_US_00001_0003_source.pkl 
Compressing en_US_00001\en_US_00001_0003\en_US_00001_0003_source.tex 
Compressing en_US_00001\en_US_00001_0003\en_US_00001_0003_user.png 

Everything is Ok 

en_US_00001/ 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00Faf\xd2Y\xf9\xd1\x01Faf\xd2Y\xf9\xd1\x01%\xc9c\xd2Y\xf9\xd1\x01'; compress_type = 0) 
en_US_00001/en_US_00001_0001/ 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xbe(e\xd2Y\xf9\xd1\x01\xbe(e\xd2Y\xf9\xd1\x016\xf0c\xd2Y\xf9\xd1\x01'; compress_type = 0) 
en_US_00001/en_US_00001_0001/en_US_00001_0001_big.png 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00G\x17d\xd2Y\xf9\xd1\x01G\x17d\xd2Y\xf9\xd1\x01G\x17d\xd2Y\xf9\xd1\x01'; compress_type = 8) 
en_US_00001/en_US_00001_0001/en_US_00001_0001_info.xml 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00X>d\xd2Y\xf9\xd1\x01X>d\xd2Y\xf9\xd1\x01X>d\xd2Y\xf9\xd1\x01'; compress_type = 8) 
en_US_00001/en_US_00001_0001/en_US_00001_0001_small.png 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00z\x8cd\xd2Y\xf9\xd1\x01ied\xd2Y\xf9\xd1\x01ied\xd2Y\xf9\xd1\x01'; compress_type = 8) 
en_US_00001/en_US_00001_0001/en_US_00001_0001_source.pkl 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\x8b\xb3d\xd2Y\xf9\xd1\x01\x8b\xb3d\xd2Y\xf9\xd1\x01\x8b\xb3d\xd2Y\xf9\xd1\x01'; compress_type = 8) 
en_US_00001/en_US_00001_0001/en_US_00001_0001_source.tex 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xad\x01e\xd2Y\xf9\xd1\x01\xad\x01e\xd2Y\xf9\xd1\x01\xad\x01e\xd2Y\xf9\xd1\x01'; compress_type = 8) 
en_US_00001/en_US_00001_0001/en_US_00001_0001_user.png 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xbe(e\xd2Y\xf9\xd1\x01\xbe(e\xd2Y\xf9\xd1\x01\xbe(e\xd2Y\xf9\xd1\x01'; compress_type = 8) 
en_US_00001/en_US_00001_0002/ 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x005:f\xd2Y\xf9\xd1\x015:f\xd2Y\xf9\xd1\x01\xcfOe\xd2Y\xf9\xd1\x01'; compress_type = 0) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_big.png 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xe0ve\xd2Y\xf9\xd1\x01\xcfOe\xd2Y\xf9\xd1\x01\xcfOe\xd2Y\xf9\xd1\x01'; compress_type = 8) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_info.xml 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xf1\x9de\xd2Y\xf9\xd1\x01\xe0ve\xd2Y\xf9\xd1\x01\xe0ve\xd2Y\xf9\xd1\x01'; compress_type = 8) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_small.png 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\x02\xc5e\xd2Y\xf9\xd1\x01\x02\xc5e\xd2Y\xf9\xd1\x01\x02\xc5e\xd2Y\xf9\xd1\x01'; compress_type = 8) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_source.pkl 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\x13\xece\xd2Y\xf9\xd1\x01\x13\xece\xd2Y\xf9\xd1\x01\x13\xece\xd2Y\xf9\xd1\x01'; compress_type = 8) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_source.tex 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00$\x13f\xd2Y\xf9\xd1\x01$\x13f\xd2Y\xf9\xd1\x01$\x13f\xd2Y\xf9\xd1\x01'; compress_type = 8) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_user.png 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x005:f\xd2Y\xf9\xd1\x015:f\xd2Y\xf9\xd1\x015:f\xd2Y\xf9\xd1\x01'; compress_type = 8) 
en_US_00001/en_US_00001_0003/ 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xdf\xc0g\xd2Y\xf9\xd1\x01\xdf\xc0g\xd2Y\xf9\xd1\x01Faf\xd2Y\xf9\xd1\x01'; compress_type = 0) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_big.png 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00W\x88f\xd2Y\xf9\xd1\x01W\x88f\xd2Y\xf9\xd1\x01W\x88f\xd2Y\xf9\xd1\x01'; compress_type = 8) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_info.xml 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00h\xaff\xd2Y\xf9\xd1\x01h\xaff\xd2Y\xf9\xd1\x01h\xaff\xd2Y\xf9\xd1\x01'; compress_type = 8) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_small.png 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\x9b$g\xd2Y\xf9\xd1\x01y\xd6f\xd2Y\xf9\xd1\x01y\xd6f\xd2Y\xf9\xd1\x01'; compress_type = 8) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_source.pkl 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xacKg\xd2Y\xf9\xd1\x01\xacKg\xd2Y\xf9\xd1\x01\xacKg\xd2Y\xf9\xd1\x01'; compress_type = 8) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_source.tex 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xce\x99g\xd2Y\xf9\xd1\x01\xce\x99g\xd2Y\xf9\xd1\x01\xce\x99g\xd2Y\xf9\xd1\x01'; compress_type = 8) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_user.png 
    (extra = b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00\xdf\xc0g\xd2Y\xf9\xd1\x01\xdf\xc0g\xd2Y\xf9\xd1\x01\xdf\xc0g\xd2Y\xf9\xd1\x01'; compress_type = 8) 

Values for compress_type: 
8 = ZIP_DEFLATED 
0 = ZIP_STORED 

As far as I understand, the most important findings are:

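  • there are items with folder information (e.g. en_US_00001/, en_US_00001/en_US_00001_0001/) that were not present in the ZIP made by zipfile in my script
  • for folders, 7-Zip produces entries with compress_type == ZIP_STORED, while for files compress_type == ZIP_DEFLATED
  • the extra field has different values (rather long byte strings instead of empty ones)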
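The long `extra` values in the 7-Zip listing begin with header ID 0x000a, which the ZIP specification assigns to the NTFS extra field carrying modification, access and creation times. The sketch below decodes them under that layout; the function name `parse_ntfs_times` is made up for illustration, and the sample bytes are copied from the `en_US_00001/` entry above.

```python
import struct
from datetime import datetime, timedelta


def parse_ntfs_times(extra):
    """Return (mtime, atime, ctime) from an NTFS extra field, or None.

    Layout assumed here (header ID 0x000a): 4 reserved bytes, then
    attribute tag 0x0001 holding three 64-bit FILETIME values
    (100 ns ticks since 1601-01-01 UTC).
    """
    pos = 0
    while pos + 4 <= len(extra):
        header_id, size = struct.unpack_from("<HH", extra, pos)
        body = extra[pos + 4:pos + 4 + size]
        pos += 4 + size
        if header_id != 0x000A or len(body) < 32:
            continue  # some other extra block, skip it
        tag, tag_size = struct.unpack_from("<HH", body, 4)
        if tag == 0x0001 and tag_size == 24:
            ticks = struct.unpack_from("<QQQ", body, 8)
            epoch = datetime(1601, 1, 1)
            return tuple(epoch + timedelta(microseconds=t // 10)
                         for t in ticks)
    return None


# The extra value printed for "en_US_00001/" in the 7-Zip listing.
sample = (b'\n\x00 \x00\x00\x00\x00\x00\x01\x00\x18\x00'
          b'Faf\xd2Y\xf9\xd1\x01Faf\xd2Y\xf9\xd1\x01%\xc9c\xd2Y\xf9\xd1\x01')
print(parse_ntfs_times(sample))
```

If this reading is right, the differing `extra` values are only timestamps, which leaves the missing folder entries as the more likely cause of the rejection.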
+2

Have you tried passing 'arcname' to 'zipf.write' to store the relative path of each file instead of the full system path? You can get the relative path with 'os.path.relpath'. – gukoff
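A minimal sketch of that suggestion, assuming the names inside the archive should be relative to the parent of the packed folder; `zip_tree` is a hypothetical helper, not code from the question.

```python
import os
import zipfile


def zip_tree(src_dir, zip_path):
    """Pack every file under src_dir, storing paths relative to its parent."""
    base = os.path.dirname(os.path.abspath(src_dir))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zipf:
        for root, dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                # arcname is the name stored inside the archive; without
                # it, the path passed to write() is used.
                zipf.write(full, arcname=os.path.relpath(full, base))
```

With this, `zip_tree("C:/work/en_US_00001", "en_US_00001.zip")` would store names starting with `en_US_00001/` regardless of where the folder lives on disk.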

+0

How about using 'ZipFile.infolist()' to compare a good archive with a bad one and get information about the structure? –
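Such a comparison could be sketched as follows. The helper names `summarize` and `diff_archives` and the choice of compared fields are assumptions for illustration.

```python
import zipfile


def summarize(source):
    """Map each member name to (compress_type, has_extra, is_directory)."""
    with zipfile.ZipFile(source) as zf:
        return {
            info.filename: (info.compress_type, bool(info.extra),
                            info.is_dir())
            for info in zf.infolist()
        }


def diff_archives(good, bad):
    """List (name, good_summary, bad_summary) for members that differ.

    A summary of None means the member is missing from that archive.
    """
    a, b = summarize(good), summarize(bad)
    return [(name, a.get(name), b.get(name))
            for name in sorted(set(a) | set(b))
            if a.get(name) != b.get(name)]
```

Run against an archive repacked by 7-Zip and one written by the original script, a call like `diff_archives("good.zip", "bad.zip")` would be expected to list the folder entries as present only in the first archive.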

+0

@AlastairMcCormack I have added some code that opens and analyzes the file created by 'zipfile': 'infolist' and 'extractall' work fine, and there are no problems opening the ZIP file – VolAnd

Answers

1

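Based on the differences listed in UPDATE 2 of the question and on the examples in another question about zipfile, I tried the following code to add directories to the ZIP file and check the result: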

import os
import shutil
import zipfile
from contextlib import closing

# make zip
try:
    with zipfile.ZipFile(prefix + '.zip', 'w', zipfile.ZIP_DEFLATED) as zipf:
        # explicit entry for the top-level directory; the trailing separator
        # marks it as a folder (on Windows, ZipInfo converts '\\' to '/')
        info = zipfile.ZipInfo(prefix + '\\')
        zipf.writestr(info, '')
        for root, dirs, files in os.walk(prefix):
            # explicit entry for every subdirectory
            for d in dirs:
                info = zipfile.ZipInfo(os.path.join(root, d) + '\\')
                zipf.writestr(info, '')
            for file in files:
                zipf.write(os.path.join(root, file))
    # remove the directory that was packed
    shutil.rmtree(prefix)
    # report the result
    print('File ' + prefix + '.zip was created')
except Exception:
    print('Unexpected error occurred while creating file ' + prefix + '.zip')
# check zip content
with closing(zipfile.ZipFile(prefix + '.zip')) as zfile:
    for info in zfile.infolist():
        print(info.filename)
        print(' (extra = ' + str(info.extra) + '; compress_type = ' + str(info.compress_type) + ')')
print('Values for compress_type:')
print(str(zipfile.ZIP_DEFLATED) + ' = ZIP_DEFLATED')
print(str(zipfile.ZIP_STORED) + ' = ZIP_STORED')

The output is

File en_US_00001.zip was created 
en_US_00001/ 
    (extra = b''; compress_type = 0) 
en_US_00001/en_US_00001_0001/ 
    (extra = b''; compress_type = 0) 
en_US_00001/en_US_00001_0002/ 
    (extra = b''; compress_type = 0) 
en_US_00001/en_US_00001_0003/ 
    (extra = b''; compress_type = 0) 
en_US_00001/en_US_00001_0001/en_US_00001_0001_big.png 
    (extra = b''; compress_type = 8) 
en_US_00001/en_US_00001_0001/en_US_00001_0001_info.xml 
    (extra = b''; compress_type = 8) 
en_US_00001/en_US_00001_0001/en_US_00001_0001_small.png 
    (extra = b''; compress_type = 8) 
en_US_00001/en_US_00001_0001/en_US_00001_0001_source.pkl 
    (extra = b''; compress_type = 8) 
en_US_00001/en_US_00001_0001/en_US_00001_0001_source.tex 
    (extra = b''; compress_type = 8) 
en_US_00001/en_US_00001_0001/en_US_00001_0001_user.png 
    (extra = b''; compress_type = 8) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_big.png 
    (extra = b''; compress_type = 8) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_info.xml 
    (extra = b''; compress_type = 8) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_small.png 
    (extra = b''; compress_type = 8) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_source.pkl 
    (extra = b''; compress_type = 8) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_source.tex 
    (extra = b''; compress_type = 8) 
en_US_00001/en_US_00001_0002/en_US_00001_0002_user.png 
    (extra = b''; compress_type = 8) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_big.png 
    (extra = b''; compress_type = 8) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_info.xml 
    (extra = b''; compress_type = 8) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_small.png 
    (extra = b''; compress_type = 8) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_source.pkl 
    (extra = b''; compress_type = 8) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_source.tex 
    (extra = b''; compress_type = 8) 
en_US_00001/en_US_00001_0003/en_US_00001_0003_user.png 
    (extra = b''; compress_type = 8) 
Values for compress_type: 
8 = ZIP_DEFLATED 
0 = ZIP_STORED 

Adding a trailing slash to the directory names (+'\\' or +'/') appears to be mandatory.

And most importantly, the ZIP file is now correctly accepted by the Android application.
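A possibly tidier variant, offered as a sketch and not as the method used above: in Python 3, `ZipFile.write` accepts a directory path and stores it as an empty entry whose name ends with "/", so the entries need not be built by hand with `ZipInfo`. The helper name `zip_with_dirs` is hypothetical, and whether the Android application accepts the result has not been verified.

```python
import os
import zipfile


def zip_with_dirs(src_dir, zip_path):
    """Pack src_dir, emitting an explicit entry for each directory."""
    base = os.path.dirname(os.path.abspath(src_dir))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zipf:
        for root, dirs, files in os.walk(src_dir):
            # Writing a directory path stores an entry named "<path>/",
            # so every folder visited by os.walk gets its own entry.
            zipf.write(root, arcname=os.path.relpath(root, base))
            for name in sorted(files):
                full = os.path.join(root, name)
                zipf.write(full, arcname=os.path.relpath(full, base))
```

Unlike the listing above, this interleaves each folder entry with its files, which is closer to the order 7-Zip produced.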

+0

So I guessed right? Congratulations on solving your problem. –

+0

This works, but there may be a more elegant solution... – VolAnd
