2012-02-29 52 views
0

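How can I report how much of a file has been processed every 5 seconds? I think I need threads, but how would I control them? How do I report progress while processing a large file?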

#!/bin/env python
# -*- coding: utf8 -*-

import os
import sys
import logging
import hashlib

logger = logging.getLogger()
FORMAT = "%(asctime)s %(levelname)s: %(message)s"
logging.basicConfig(format=FORMAT, level=logging.DEBUG, datefmt="%H:%M:%S")

class fileScanner:
    readBytes = 0
    lastReadBytes = 0
    fileSize = 0
    reportSeconds = 5

    def scanFile(self, filePath):
        self.readBytes = 0
        self.lastReadBytes = 0

        logging.getLogger()
        self.fileSize = os.path.getsize(filePath)

        with open(filePath, 'rb') as f:
            m = hashlib.sha512()
            while True:
                data = f.read(1024)
                if not data:
                    break
                self.readBytes += len(data)
                m.update(data)
            return m.hexdigest()
        raise IOError("Couldn't process file '%s'" % filePath)

    def reportProcess(self):
        logging.getLogger()
        percent = float((self.readBytes/self.fileSize) * 100)
        secAvg = (self.readBytes - self.lastReadBytes)/self.reportSeconds
        estimatedTime = (self.fileSize - self.readBytes)/secAvg
        logging.info("%s%% (%s/%s bytes) read in average of %s MB/sec. Estimated time left: %s seconds." % (percent, self.readBytes, self.fileSize, secAvg, estimatedTime))
        self.lastReadBytes = self.readBytes


if __name__ == "__main__":
    fs = fileScanner()
    hash = fs.scanFile('largefile.dat')

How do I start and stop reportProcess()?

Yes, I know the calculations may be wrong.
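
On the threading route raised in the question, here is a minimal sketch of one way to start and stop a periodic reporter. It assumes Python 3, and the names ProgressReporter, get_progress, interval and report are hypothetical, not part of the original code. A threading.Event doubles as the sleep timer and the stop signal, so stop() takes effect without waiting out the full interval.

```python
import threading


class ProgressReporter:
    """Call report(get_progress()) every `interval` seconds until stopped."""

    def __init__(self, get_progress, interval=5.0, report=print):
        self._get_progress = get_progress  # callable returning the current progress
        self._interval = interval
        self._report = report
        self._stop_event = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait() returns False on timeout and True once the event is set,
        # so the loop reports on each timeout and exits as soon as stop() is called.
        while not self._stop_event.wait(self._interval):
            self._report(self._get_progress())

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop_event.set()
        self._thread.join()
```

With the class from the question, this could be used roughly as reporter = ProgressReporter(lambda: fs.readBytes), then reporter.start() before fs.scanFile(...) and reporter.stop() in a finally block afterwards. The worker thread only reads the counter, so no lock is used in this sketch.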

+1

When you write fs = fileScanner, you are not creating an instance of fileScanner. You are only assigning the class object to fs. It should be fs = fileScanner(). – 2012-02-29 18:35:45

Answers

1

Just call reportProcess from inside the read loop once every 5 seconds:

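lastTime = time.time()  # needs "import time" at the top of the module
while True:
    data = f.read(1024)
    if not data:
        break
    self.readBytes += len(data)
    m.update(data)  # keep feeding the hash, as in the original loop
    if time.time() - lastTime > 5:
        self.reportProcess()
        lastTime = time.time()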
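
For reference, a self-contained sketch of the same idea, assuming Python 3. The function name hash_with_progress and its parameters report_seconds, chunk_size and report are hypothetical and only serve the illustration. It uses time.monotonic() rather than time.time() so that system clock adjustments cannot skew the interval.

```python
import hashlib
import time


def hash_with_progress(path, report_seconds=5.0, chunk_size=64 * 1024, report=print):
    """Return the SHA-512 hex digest of `path`, reporting progress periodically."""
    digest = hashlib.sha512()
    read_bytes = 0
    last_time = time.monotonic()
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            digest.update(data)
            read_bytes += len(data)
            now = time.monotonic()
            # Report only once at least `report_seconds` have passed since the last report.
            if now - last_time >= report_seconds:
                report("%d bytes read so far" % read_bytes)
                last_time = now
    return digest.hexdigest()
```

No thread is involved: the report runs in the reading loop itself, so it can only fire between two reads and may be slightly late if a single read blocks for a long time.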

Unrelated: why are you using class-level attributes? Normally they should be instance-level, for example:

class FileScanner:
    def __init__(self):
        self.readBytes = 0
        self.lastReadBytes = 0
+0

I use class-level attributes to indicate which attributes an instance will have and to expose them to *sphinx-doc*. – 2012-02-29 18:58:46

0

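Couldn't you call reportProcess() from inside the while loop in scanFile()? For example, call reportProcess() after every x bytes read (add a condition inside the while loop). Would that solve your problem?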
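
A minimal sketch of that byte-count variant, assuming Python 3. The names hash_reporting_every, report_every, chunk_size and report are hypothetical and not taken from the question's code.

```python
import hashlib


def hash_reporting_every(path, report_every=10 * 1024 * 1024, chunk_size=64 * 1024, report=print):
    """Return the SHA-512 hex digest of `path`, reporting after every `report_every` bytes."""
    digest = hashlib.sha512()
    read_bytes = 0
    next_report = report_every
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"" (end of file).
        for data in iter(lambda: f.read(chunk_size), b""):
            digest.update(data)
            read_bytes += len(data)
            if read_bytes >= next_report:
                report(read_bytes)
                # Advance past every threshold this chunk crossed.
                while next_report <= read_bytes:
                    next_report += report_every
    return digest.hexdigest()
```

Reporting by byte count is simple, but the time between reports then depends on read speed, so it does not guarantee one report every 5 seconds.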