How to fix TypeError: Unicode-objects must be encoded before hashing?

130

I am getting this error:

Traceback (most recent call last): 
    File "python_md5_cracker.py", line 27, in <module> 
    m.update(line) 
TypeError: Unicode-objects must be encoded before hashing

when I try to run this code in Python 3.2.2:

import hashlib, sys 
m = hashlib.md5() 
hash = "" 
hash_file = input("What is the file name in which the hash resides? ") 
wordlist = input("What is your wordlist? (Enter the file name) ") 
try: 
     hashdocument = open(hash_file,"r") 
except IOError: 
     print("Invalid file.") 
     input() 
     sys.exit() 
else: 
     hash = hashdocument.readline() 
     hash = hash.replace("\n","") 

try: 
     wordlistfile = open(wordlist,"r") 
except IOError: 
     print("Invalid file.") 
     input() 
     sys.exit() 
else: 
     pass 
for line in wordlistfile: 
     m = hashlib.md5() #flush the buffer (this caused a massive problem when placed at the beginning of the script, because the buffer kept getting overwritten, thus comparing incorrect hashes) 
     line = line.replace("\n","") 
     m.update(line) 
     word_hash = m.hexdigest() 
     if word_hash==hash: 
       print("Collision! The word corresponding to the given hash is", line) 
       input() 
       sys.exit() 

print("The hash given does not correspond to any supplied word in the wordlist.") 
input() 
sys.exit()

2011-09-28 JohnnyFromBF

I found that opening the file with 'rb' helped in my case. – dlamblin

132

It is probably looking for a character encoding from wordlistfile.

wordlistfile = open(wordlist,"r",encoding='utf-8')

Or, if you are working on a line-by-line basis:

line.encode('utf-8')
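
The line-by-line approach can be sketched roughly as follows. This is only an illustration, assuming the wordlist is UTF-8 text; the function name `find_word` and the sample words are hypothetical and do not come from the question.

```python
import hashlib

def find_word(target_hash, words, encoding="utf-8"):
    """Return the first word whose MD5 hex digest equals target_hash, else None."""
    for word in words:
        word = word.rstrip("\n")
        # md5() accepts only bytes, so the str is encoded first
        digest = hashlib.md5(word.encode(encoding)).hexdigest()
        if digest == target_hash:
            return word
    return None

# Hypothetical data standing in for the hash file and the wordlist file
target = hashlib.md5(b"hello").hexdigest()
found = find_word(target, ["apple\n", "hello\n", "zebra\n"])
print(found)
```

Creating a fresh `hashlib.md5(...)` object per word also avoids the stale-buffer problem mentioned in the question's comment.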

2011-09-28 15:10:20 cwallenpoole

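`open(wordlist, "r", encoding='utf-8')`: why open the file with a specific encoding? The encoding argument selects the codec used for decoding; without it, a platform-dependent encoding is used. –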

The error already says what you have to do. MD5 operates on bytes, so you have to encode the Unicode string into bytes, for example with line.encode('utf-8').

2011-09-28 15:09:17

+52

Downvoted, because the sentence "The error already says what you have to do." is rude and adds nothing. – timthelion

+12

@timthelion It adds the implication that reading comprehension is a prerequisite for programming. Horrible, I know. –

+11

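@timthelion Really. You have nothing better to do than pass moral judgment on the wording of an answer that is about 4 years old. It is not rude (it is factual), and it is helpful (reading the message carefully can lead you to the solution). – sehe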

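Please take a look at that answer first.

Now, the error message is clear: you can only use bytes, not Python strings (which used to be called unicode in Python < 3), so you have to encode the string with your preferred encoding: utf-32, utf-16, utf-8, or even one of the restricted 8-bit encodings (which some might call code pages).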
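
To see why the choice of encoding matters, here is a small sketch. The sample word and the three encodings are picked only for illustration. Each encoding produces different bytes and therefore a different digest, while passing the `str` itself raises the `TypeError` from the question (the exact wording of the message may differ between Python versions).

```python
import hashlib

word = "café"  # sample string containing a non-ASCII character

digests = {}
for encoding in ("utf-8", "utf-16", "latin-1"):
    data = word.encode(encoding)  # str -> bytes
    digests[encoding] = hashlib.md5(data).hexdigest()
    print(encoding, len(data), digests[encoding])

# Hashing the str directly is what triggers the error
try:
    hashlib.md5(word)
except TypeError as exc:
    print("TypeError:", exc)
```

So the hash stored in the hash file only matches if the candidate words are encoded the same way as when that hash was originally computed.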

When you read from the file, Python 3 automatically decodes the bytes in your wordlist file to Unicode. I suggest you do:

m.update(line.encode(wordlistfile.encoding))

so that the data pushed to the MD5 algorithm is encoded exactly like the underlying file.
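
A rough way to check this round trip, assuming a small temporary file written in latin-1 (the encoding and the file contents here are made up for the demonstration):

```python
import hashlib
import os
import tempfile

# Write a one-word wordlist in a non-UTF-8 encoding (latin-1, chosen for illustration)
raw = "café\n".encode("latin-1")
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(raw)

try:
    # newline="" turns off newline translation so the decoded text maps back to the same bytes
    with open(path, "r", encoding="latin-1", newline="") as wordlistfile:
        for line in wordlistfile:
            # Re-encode with the same codec the file object used for decoding
            text_digest = hashlib.md5(line.encode(wordlistfile.encoding)).hexdigest()
finally:
    os.remove(path)

bytes_digest = hashlib.md5(raw).hexdigest()
print(text_digest == bytes_digest)
```

Because the line is re-encoded with `wordlistfile.encoding`, the digest of the decoded text agrees with the digest of the raw bytes on disk.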

2011-10-15 14:14:05 tzot

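You have to specify an encoding such as utf-8. Try this simple approach.

This example generates a random 256-bit number and hashes its decimal string with SHA-256: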

>>> import hashlib, random 
>>> hashlib.sha256(str(random.getrandbits(256)).encode('utf-8')).hexdigest() 
'cd183a211ed2434eac4f31b317c573c50e6c24e3a28b82ddcb0bf8bedf387a9f'

2014-03-19 12:03:59

You can open the file in binary mode:

import hashlib 

with open(hash_file) as file: 
    control_hash = file.readline().rstrip("\n") 

wordlistfile = open(wordlist, "rb") 
# ... 
for line in wordlistfile: 
    if hashlib.md5(line.rstrip(b'\n\r')).hexdigest() == control_hash: 
        # collision: this line (a bytes object) hashes to control_hash
        break
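
For a version that can be run end to end, here is a sketch under stated assumptions: the helper name `crack_md5` and the temporary wordlist are hypothetical and exist only for the demo. In binary mode each line is already a `bytes` object, so no encoding step is needed.

```python
import hashlib
import os
import tempfile

def crack_md5(control_hash, wordlist_path):
    """Return the matching word as bytes, or None if no line matches."""
    with open(wordlist_path, "rb") as wordlistfile:
        for line in wordlistfile:
            word = line.rstrip(b"\r\n")  # strip Unix or Windows line endings
            if hashlib.md5(word).hexdigest() == control_hash:
                return word
    return None

# Hypothetical wordlist written to a temporary file for the demo
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"apple\r\nsecret\r\nzebra\n")

try:
    match = crack_md5(hashlib.md5(b"secret").hexdigest(), path)
finally:
    os.remove(path)

print(match)
```

The result is `bytes`; call `.decode()` with the file's encoding if a `str` is needed for display.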

2014-03-25 19:36:49 jfs

To store a password (Python 3):

import hashlib, os 
password_salt = os.urandom(32).hex() 
password = '12345' 

hash = hashlib.sha512() 
hash.update(('%s%s' % (password_salt, password)).encode('utf-8')) 
password_hash = hash.hexdigest()
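
Checking a password later follows the same encode-then-hash pattern. The sketch below is an assumption about how that might look and is not part of the original answer; the helper names `hash_password` and `check_password` are hypothetical.

```python
import hashlib
import hmac
import os

def hash_password(password, salt):
    """SHA-512 hex digest of salt + password, encoded as UTF-8 before hashing."""
    return hashlib.sha512((salt + password).encode("utf-8")).hexdigest()

def check_password(candidate, salt, stored_hash):
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(hash_password(candidate, salt), stored_hash)

salt = os.urandom(32).hex()
stored_hash = hash_password("12345", salt)

print(check_password("12345", salt, stored_hash))
print(check_password("54321", salt, stored_hash))
```

The salt has to be stored alongside the hash, since the same salt is needed to recompute the digest.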

2017-09-11 09:09:18
