2011-01-23 191 views
29

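Downloading large files with Node.js while avoiding high memory consumption

I'm trying to create a file downloader that runs as a background service, but when a large file is scheduled, it is first held entirely in memory and only written to disk once the download ends.

Considering that I may have many files downloading at the same time, how can I have each file written to disk progressively so that memory is conserved?

Here is the code I'm using: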

var sys = require("sys"), 
    http = require("http"), 
    url = require("url"), 
    path = require("path"), 
    fs = require("fs"), 
    events = require("events"); 

var downloadfile = "http://nodejs.org/dist/node-v0.2.6.tar.gz"; 

var host = url.parse(downloadfile).hostname 
var filename = url.parse(downloadfile).pathname.split("/").pop() 

var theurl = http.createClient(80, host); 
var requestUrl = downloadfile; 
sys.puts("Downloading file: " + filename); 
sys.puts("Before download request"); 
var request = theurl.request('GET', requestUrl, {"host": host}); 
request.end(); 

var dlprogress = 0; 


setInterval(function() { 
    sys.puts("Download progress: " + dlprogress + " bytes"); 
}, 1000); 


request.addListener('response', function (response) { 
    response.setEncoding('binary') 
    sys.puts("File size: " + response.headers['content-length'] + " bytes.") 
    var body = ''; 
    response.addListener('data', function (chunk) { 
     dlprogress += chunk.length; 
     body += chunk; 
    }); 
    response.addListener("end", function() { 
     fs.writeFileSync(filename, body, 'binary'); 
     sys.puts("After download finished"); 
    }); 

}); 
+0

Any chance you could share the final result? I'm looking for something like this... – Eli 2011-10-07 17:40:20

+0

I tried to implement a function that follows 302 redirects, but I don't think it works properly. Maybe you can give it a try. Here it is: https://gist.github.com/1297063 – Carlosedp 2011-10-18 23:33:34

Answers

26

I changed the callback to:

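request.addListener('response', function (response) {
    // Open the target file as a stream so each chunk is written as it arrives.
    var downloadfile = fs.createWriteStream(filename, {'flags': 'a'});
    sys.puts("File size " + filename + ": " + response.headers['content-length'] + " bytes.");
    response.addListener('data', function (chunk) {
        dlprogress += chunk.length;
        // chunk is a Buffer here (setEncoding is no longer called), so no
        // encoding argument is needed; the original `encoding='binary'`
        // also created an accidental global variable.
        downloadfile.write(chunk);
    });
    response.addListener("end", function() {
        downloadfile.end();
        sys.puts("Finished downloading " + filename);
    });
});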

This works perfectly.
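The same idea can be written against the current core API, since `http.createClient` and the `sys` module used above have since been deprecated or removed in later Node.js releases. The following is only a sketch: `downloadToFile` and its parameters are hypothetical names that do not appear in the original answer, and it assumes a plain `http://` URL that does not redirect.

```javascript
const fs = require('fs');
const http = require('http');
const { pipeline } = require('stream');

// Hypothetical helper (not from the original answer): streams the body of
// `fileUrl` straight into `destPath` and calls `callback(err)` exactly once.
function downloadToFile(fileUrl, destPath, callback) {
  let done = false;
  function finish(err) {
    if (done) return;
    done = true;
    callback(err);
  }

  const request = http.get(fileUrl, (response) => {
    if (response.statusCode !== 200) {
      response.resume(); // discard the body so the socket is released
      finish(new Error('Unexpected status code: ' + response.statusCode));
      return;
    }
    // pipeline() forwards chunks as they arrive, respects backpressure,
    // and destroys both streams if either one fails.
    pipeline(response, fs.createWriteStream(destPath), finish);
  });
  request.on('error', finish);
}
```

Because chunks are handed to the write stream as they arrive, memory use should stay roughly bounded by the stream buffers rather than by the file size. A call would look like `downloadToFile(downloadfile, filename, function (err) { ... })`, reusing the variable names from the question.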

+0

Shouldn't we prefer setEncoding(null) over 'binary'? – 2011-02-04 12:34:30

+0

`{'flags':'a'}` will append the data to the file if it already exists – respectTheCode 2012-08-23 12:33:53
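To make the difference between the two flags concrete, here is a small sketch. It uses `fs.writeFileSync` with its `flag` option rather than a write stream, only because the synchronous call keeps the demonstration short; the file system flags `'a'` and `'w'` have the same meaning in both APIs. The temporary file name is an arbitrary choice for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Arbitrary scratch file used only for this demonstration.
const target = path.join(os.tmpdir(), 'flags-demo-' + process.pid + '.txt');

fs.writeFileSync(target, 'first');                  // creates the file
fs.writeFileSync(target, 'second', { flag: 'a' });  // 'a' appends to existing content
const appended = fs.readFileSync(target, 'utf8');

fs.writeFileSync(target, 'third', { flag: 'w' });   // 'w' truncates before writing
const truncated = fs.readFileSync(target, 'utf8');

fs.unlinkSync(target);
console.log(appended);  // firstsecond
console.log(truncated); // third
```

For a downloader this means that re-running a download with `'a'` would produce a file containing both copies back to back, so `'w'` (the default for `fs.createWriteStream`) is usually the safer choice unless resuming is intended.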

1

Instead of holding the content in memory, you should write it to the file, opened in append mode, inside the "data" event listener.

2

When downloading large files, use fs.write rather than writeFile, because writeFile overwrites the previous content.

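// options, sendstatus and sendendstatus are assumed to be defined elsewhere in the program.
function downloadfile(res) {
    var size = 0; // bytes received so far
    var requestserver = http.request(options, function(r) {
        console.log('STATUS: ' + r.statusCode);
        console.log('HEADERS: ' + JSON.stringify(r.headers));

        var fd = fs.openSync('sai.tar.gz', 'w');

        r.on('data', function (chunk) {
            size += chunk.length;
            console.log(size + ' bytes received');
            sendstatus(res, size);
            fs.write(fd, chunk, 0, chunk.length, null, function(er, written) {
                if (er) console.log('write error: ' + er.message);
            });
        });
        r.on('end', function() {
            console.log('\nended from server');
            fs.closeSync(fd);
            sendendstatus(res);
        });
    });
    requestserver.end(); // send the request; without this call it never starts
}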
+1

fs.write is not safe if you don't wait for the callback. You should use a WriteStream. – respectTheCode 2012-08-23 12:32:10
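One way to act on this comment is to let a write stream queue the chunks in order and to pause the source whenever the stream's buffer fills up. The sketch below is an illustration under assumptions, not the commenter's code: `writeWithBackpressure` is a hypothetical name, and `source` stands for any readable stream, such as the HTTP response `r` in the answer above.

```javascript
const fs = require('fs');

// Hypothetical helper: copies every chunk from `source` into the file at
// `destPath`, pausing the source while the write stream's buffer is full.
function writeWithBackpressure(source, destPath, callback) {
  const file = fs.createWriteStream(destPath); // default flags: 'w' (truncate)
  let finished = false;
  function done(err) {
    if (finished) return;
    finished = true;
    callback(err);
  }

  source.on('data', (chunk) => {
    // write() returns false once the internal buffer has reached its limit.
    if (!file.write(chunk)) {
      source.pause();
      file.once('drain', () => source.resume());
    }
  });
  source.on('end', () => file.end());
  source.on('error', (err) => {
    file.destroy();
    done(err);
  });
  file.on('error', done);
  file.on('finish', () => done(null)); // all data has been flushed
}
```

Unlike calling `fs.write` repeatedly and closing the descriptor on `'end'`, this waits for the `'finish'` event, so the callback only fires after every queued chunk has been written. In practice `source.pipe(file)` or `stream.pipeline` does the same bookkeeping automatically.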

2

Would the request package work for your use case?

It lets you do things like this:

request(downloadurl).pipe(fs.createWriteStream(downloadtohere)) 
1

As Carter Cole suggested, use streams. Here is a more complete example:

var inspect = require('eyespect').inspector(); 
var request = require('request'); 
var filed = require('filed'); 
var temp = require('temp'); 
var downloadURL = 'http://upload.wikimedia.org/wikipedia/commons/e/ec/Hazard_Creek_Kayaker.JPG'; 
var downloadPath = temp.path({prefix: 'singlePageRaw', suffix: '.jpg'}); 

var downloadFile = filed(downloadPath); 
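// Keep a reference to the request itself: pipe() returns the destination
// stream, so chaining it would make r point at downloadFile instead.
var r = request(downloadURL);
r.pipe(downloadFile);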


r.on('data', function(data) { 
    inspect('binary data received'); 
}); 
downloadFile.on('end', function() { 
    inspect(downloadPath, 'file downloaded to path'); 
}); 

downloadFile.on('error', function (err) { 
    inspect(err, 'error downloading file'); 
}); 

You may need to install the modules, which you can do with npm install filed request eyespect temp

4

Take a look at http-request:

var http = require('http-request');

// shorthand syntax, buffered response 
http.get('http://localhost/get', function (err, res) { 
    if (err) throw err; 
    console.log(res.code, res.headers, res.buffer.toString()); 
}); 

// save the response to 'myfile.bin' with a progress callback 
http.get({ 
    url: 'http://localhost/get', 
    progress: function (current, total) { 
     console.log('downloaded %d bytes from %d', current, total); 
    } 
}, 'myfile.bin', function (err, res) { 
    if (err) throw err; 
    console.log(res.code, res.headers, res.file); 
});
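
A progress callback like the one above can also be approximated with core modules only, by placing a small pass-through stream between the response and the file. This is a sketch under assumptions: `createProgressStream` is a hypothetical helper, not part of http-request or Node.js, and `total` is expected to come from the `content-length` header, which a server may omit.

```javascript
const { Transform } = require('stream');

// Hypothetical helper: returns a stream that forwards every chunk unchanged
// and reports the running byte count through onProgress(current, total).
function createProgressStream(total, onProgress) {
  let current = 0;
  return new Transform({
    transform(chunk, encoding, callback) {
      current += chunk.length;
      onProgress(current, total);
      callback(null, chunk); // pass the chunk through untouched
    }
  });
}
```

Inside a response handler it could be used as `response.pipe(createProgressStream(Number(response.headers['content-length']), report)).pipe(fs.createWriteStream(filename))`, where `report` is whatever function logs or stores the progress. Since the counter only looks at `chunk.length`, nothing is buffered beyond the chunk currently in flight.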