Using a variable to recursively hash files in Python gives a wrong hash
I am trying to print out the MD5 hashes of the files within a directory recursively using Python, but I'm having problems with the variable in the open command producing false hashes. Here's the code:
    import os
    import hashlib

    blocksize = 65536
    md5_hash = hashlib.md5()

    for root, dirs, files in os.walk('/path/to/folder'):
        for filename in files:
            os.chdir(root)
            with open(filename, 'rb') as cur_file:
                print(filename)
                while True:
                    data = cur_file.read(blocksize)
                    if not data:
                        break
                    md5_hash.update(data)
            print(md5_hash.hexdigest())

If I change the "filename" variable to a specific file, like this:
    with open('nameoffile.txt', 'rb') as cur_file:

then the correct hash is produced, leading me to believe my loops are faulty in some way. Am I on the right track with that? How can I fix the variable or the loops so they work properly?
You never reset the hash object, i.e. you calculate the hash of the concatenation of all the files read so far. Try moving md5_hash = hashlib.md5() into the loop:
    for root, dirs, files in os.walk('/path/to/folder'):
        for filename in files:
            md5_hash = hashlib.md5()  # fresh hash object for each file
            os.chdir(root)
            with open(filename, 'rb') as cur_file:
                print(filename)
                while True:
                    data = cur_file.read(blocksize)
                    if not data:
                        break
                    md5_hash.update(data)
            print(md5_hash.hexdigest())

Also: why do you chdir? open(os.path.join(root, filename), 'rb') should work fine without the additional syscall (and without the possible indeterminate state in case of an error).
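Putting both suggestions together, the per-file reset and the os.path.join call could be sketched roughly like this. This is only one possible arrangement, written for Python 3. The helper names md5_of_file and md5_of_tree are made up for illustration and do not appear in the question, and '/path/to/folder' is the same placeholder path as above.

```python
import hashlib
import os

BLOCKSIZE = 65536  # same chunk size as in the question


def md5_of_file(path, blocksize=BLOCKSIZE):
    """Return the hex MD5 digest of one file, read in chunks."""
    md5_hash = hashlib.md5()  # new hash object on every call, so nothing carries over
    with open(path, 'rb') as cur_file:
        # iter() with a sentinel stops when read() returns b'' at end of file
        for data in iter(lambda: cur_file.read(blocksize), b''):
            md5_hash.update(data)
    return md5_hash.hexdigest()


def md5_of_tree(top):
    """Yield (path, digest) for every file below top, without calling os.chdir."""
    for root, dirs, files in os.walk(top):
        for filename in files:
            path = os.path.join(root, filename)
            yield path, md5_of_file(path)


if __name__ == '__main__':
    for path, digest in md5_of_tree('/path/to/folder'):
        print(path, digest)
```

Because the hash object lives inside md5_of_file, it cannot accumulate data from earlier files, and since the working directory never changes, a relative top directory keeps working as well.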