2016-06-14

Unicode error with pyPdf

I am trying to download several PDFs with the requests library and merge them with pyPdf. In general this works fine, but for some PDFs I just get an error.

MWE.py

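import requests
from pyPdf import PdfFileWriter, PdfFileReader
from StringIO import StringIO

# `session` and `response` come from the requests call that downloads the
# PDF; that part of the script is not shown in this excerpt.

# Wrap the downloaded bytes in an in-memory file so pyPdf can read them.
input = PdfFileReader(StringIO(response.content))
# Workaround: decrypt with an empty password (this is the call that fails).
input.decrypt("")
output = PdfFileWriter()
output.addPage(input.getPage(0))

outputStream = file("document-output.pdf", "wb")
output.write(outputStream)
outputStream.close()

session.close()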

Error

Traceback (most recent call last):
  File "mwe.py", line 21, in <module>
    input.decrypt("")
  File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 894, in decrypt
    return self._decrypt(password)
  File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 904, in _decrypt
    user_password, key = self._authenticateUserPassword(password)
  File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 945, in _authenticateUserPassword
    encrypt.get("/EncryptMetadata", BooleanObject(False)).getObject())
  File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 1818, in _alg35
    key = _alg32(password, rev, keylen, owner_entry, p_entry, id1_entry)
  File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 1729, in _alg32
    m.update(id1_entry)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)

To produce the traceback I read the input from a file, but I don't think that matters in this case.

I found some questions related to this problem, but I could not solve my specific issue with them.

Are you going to share the rest of the traceback?

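The error occurs in the decrypt method, doesn't it? The PDF is not actually encrypted, but I found this workaround with an empty password. Without it, I get an 'Exception: file has not been decrypted' error inside the addPage method.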

Why are you using 'file'? You should really use 'open'.
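
A minimal sketch of what this comment suggests, written for Python 3, where the `file` builtin no longer exists. The helper `write_output` and the placeholder bytes are hypothetical; with pyPdf, the `output.write(outputStream)` call from the question would go inside the `with` block instead.

```python
import os
import tempfile


def write_output(path, data):
    """Write bytes to path, closing the file even if write() raises."""
    with open(path, "wb") as output_stream:
        output_stream.write(data)


# Hypothetical usage with placeholder bytes instead of a real PDF.
target = os.path.join(tempfile.mkdtemp(), "document-output.pdf")
write_output(target, b"%PDF-1.4 placeholder")
```

Using `with` removes the need for an explicit `close()` call.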

Answers
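
The traceback ends in `m.update(id1_entry)`, which suggests that `id1_entry` reaches the MD5 hash as a text (unicode) string whose first two characters are outside ASCII; Python 2 then tries to convert it with the `ascii` codec and fails. The sketch below reproduces that mechanism in Python 3 with an explicit `encode` call. The value of `id1_entry` is made up, and encoding with `latin-1` is only an assumption about how the original bytes could be recovered; it would itself fail for characters above code point 255.

```python
import hashlib

# Hypothetical stand-in for the document ID entry: a text string whose
# first two characters are outside ASCII (not taken from the real PDF).
id1_entry = "\xfe\xffexample-id"


def implicit_ascii(text):
    """Mimic the implicit ASCII conversion Python 2 applies to unicode."""
    return text.encode("ascii")


try:
    implicit_ascii(id1_entry)
    error_message = None
except UnicodeEncodeError as exc:
    error_message = str(exc)

# Possible workaround (an assumption, not pyPdf's own fix): convert the text
# to bytes explicitly before hashing. latin-1 maps code points 0-255
# one-to-one onto byte values.
id1_bytes = id1_entry.encode("latin-1")
digest = hashlib.md5(id1_bytes).hexdigest()

print(error_message)
print(digest)
```

If this is indeed the cause, the fix belongs wherever the ID string is handed to the hash, so that bytes rather than text reach `m.update`.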