2012-12-07 51 views
16

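WebRequest fails to download large files (~1 GB) properly

I am trying to download a large file from a public URL. It seemed to work fine at first, but about 1 in 10 computers time out. My initial attempt used WebClient.DownloadFileAsync, but because it would never complete I fell back to using WebRequest.Create and reading the response stream directly.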

My first version using WebRequest.Create hit the same problem as WebClient.DownloadFileAsync: the operation times out and the file is never completed.

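My next version added retries when the download times out. Here is where it gets weird. The download does eventually finish, with one retry that fetches the last 7092 bytes. The downloaded file has exactly the same size as the source, but it is corrupt and differs from the source file. I would expect the corruption to be in the last 7092 bytes, but that is not the case.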

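Using BeyondCompare I found that 2 chunks of bytes are missing from the corrupt file, adding up to exactly the missing 7092 bytes! The missing bytes are at offsets 1CA49FF0 and 1E31F380, well before the point where the download timed out and restarted.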

What could be going on here? Any hints on how to track this problem down further?

Here is the code in question.

public void DownloadFile(string sourceUri, string destinationPath)
{
    //roughly based on: http://stackoverflow.com/questions/2269607/how-to-programmatically-download-a-large-file-in-c-sharp
    //not using WebClient.DownloadFileAsync as it seems to stall out on large files rarely for unknown reasons.

    using (var fileStream = File.Open(destinationPath, FileMode.Create, FileAccess.Write, FileShare.Read))
    {
        long totalBytesToReceive = 0;
        long totalBytesReceived = 0;
        int attemptCount = 0;
        bool isFinished = false;

        while (!isFinished)
        {
            attemptCount += 1;

            if (attemptCount > 10)
            {
                throw new InvalidOperationException("Too many attempts to download. Aborting.");
            }

            try
            {
                var request = (HttpWebRequest)WebRequest.Create(sourceUri);

                request.Proxy = null;//http://stackoverflow.com/questions/754333/why-is-this-webrequest-code-slow/935728#935728
                _log.AddInformation("Request #{0}.", attemptCount);

                //continue downloading from last attempt.
                if (totalBytesReceived != 0)
                {
                    _log.AddInformation("Request resuming with range: {0} , {1}", totalBytesReceived, totalBytesToReceive);
                    request.AddRange(totalBytesReceived, totalBytesToReceive);
                }

                using (var response = request.GetResponse())
                {
                    _log.AddInformation("Received response. ContentLength={0} , ContentType={1}", response.ContentLength, response.ContentType);

                    if (totalBytesToReceive == 0)
                    {
                        totalBytesToReceive = response.ContentLength;
                    }

                    using (var responseStream = response.GetResponseStream())
                    {
                        _log.AddInformation("Beginning read of response stream.");
                        var buffer = new byte[4096];
                        int bytesRead = responseStream.Read(buffer, 0, buffer.Length);
                        while (bytesRead > 0)
                        {
                            fileStream.Write(buffer, 0, bytesRead);
                            totalBytesReceived += bytesRead;
                            bytesRead = responseStream.Read(buffer, 0, buffer.Length);
                        }

                        _log.AddInformation("Finished read of response stream.");
                    }
                }

                _log.AddInformation("Finished downloading file.");
                isFinished = true;
            }
            catch (Exception ex)
            {
                _log.AddInformation("Response raised exception ({0}). {1}", ex.GetType(), ex.Message);
            }
        }
    }
}

Here is the log output from a corrupt download:

Request #1. 
Received response. ContentLength=939302925 , ContentType=application/zip 
Beginning read of response stream. 
Response raised exception (System.Net.WebException). The operation has timed out. 
Request #2. 
Request resuming with range: 939295833 , 939302925 
Received response. ContentLength=7092 , ContentType=application/zip 
Beginning read of response stream. 
Finished read of response stream. 
Finished downloading file. 
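
The resume step in the log can be sanity-checked with a little arithmetic. The following is only a sketch, not a diagnosis of the corruption: `ResumeCheck`, `NextRange`, `ExpectedLength` and `IsConsistent` are hypothetical helper names introduced here. It relies on the fact that HTTP byte ranges are inclusive at both ends, so the last valid index is the total length minus one.

```csharp
using System;

public static class ResumeCheck
{
    // HTTP byte ranges are inclusive on both ends, so the last
    // valid byte index of a file is totalLength - 1.
    public static (long From, long To) NextRange(long bytesReceived, long totalLength)
    {
        if (bytesReceived < 0 || bytesReceived >= totalLength)
            throw new ArgumentOutOfRangeException(nameof(bytesReceived));
        return (bytesReceived, totalLength - 1);
    }

    // Number of bytes a server should send for an inclusive range.
    public static long ExpectedLength(long from, long to)
    {
        return to - from + 1;
    }

    // True when the number of bytes already written to disk matches
    // the counter that was used to build the Range header.
    public static bool IsConsistent(long filePosition, long bytesReceived)
    {
        return filePosition == bytesReceived;
    }
}
```

With the numbers from the log, `NextRange(939295833, 939302925)` gives the range 939295833 to 939302924, and `ExpectedLength` for that range is 7092, which matches the ContentLength of request #2. Before appending resumed data it may also be worth checking that the response status is `HttpStatusCode.PartialContent` (206) and that `fileStream.Position` equals `totalBytesReceived`.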
+1

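I can think of two things off the top of my head. a) Increase the timeout for large files (if possible). b) Could encoding and decoding the data be corrupting it? I ran into this in a different project. Try encoding it with UTF-8. – Steven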

+0

It shouldn't be an encoding issue; it's a binary blob (a zip file). – Spish

+5

Sounds to me like you are trying to debug a server error from the wrong end of the wire. –

Answers

0

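This is the method I usually use, and so far it has not failed me for the same kind of load you need. Try changing some of your code to match mine and see if that helps.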

// Make sure the destination folder exists before opening the file.
if (!Directory.Exists(localFolder))
{
    Directory.CreateDirectory(localFolder);
}

try
{
    // Note: Path.Combine is meant for file-system paths; on Windows it joins with '\'.
    HttpWebRequest httpRequest = (HttpWebRequest)WebRequest.Create(Path.Combine(uri, filename));
    httpRequest.Method = "GET";

    // if the URI doesn't exist, exception gets thrown here...
    using (HttpWebResponse httpResponse = (HttpWebResponse)httpRequest.GetResponse())
    {
        using (Stream responseStream = httpResponse.GetResponseStream())
        {
            using (FileStream localFileStream =
                new FileStream(Path.Combine(localFolder, filename), FileMode.Create))
            {
                var buffer = new byte[4096];
                long totalBytesRead = 0;
                int bytesRead;

                // Copy the response to disk in 4 KB chunks until the stream ends.
                while ((bytesRead = responseStream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    totalBytesRead += bytesRead;
                    localFileStream.Write(buffer, 0, bytesRead);
                }
            }
        }
    }
}
catch (Exception ex)
{
    // Rethrows unchanged; add logging here if needed.
    throw;
}
0

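You should change the timeout settings. There seem to be two possible timeout problems:

  • Client-side timeout - try changing the timeout in WebClient. I have found this is sometimes needed for large file downloads.
  • Server-side timeout - try changing the timeout on the server. You can verify whether this is the problem by using another client, such as PostMan.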
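
As a minimal sketch of the client-side option, assuming the HttpWebRequest approach from the question rather than WebClient (which exposes no timeout property of its own): `TimeoutExample` and `CreateRequest` are hypothetical names, and the values passed in are placeholders, not recommendations. `Timeout` governs `GetResponse`, while `ReadWriteTimeout` governs reads on the response stream; their documented defaults are 100 seconds and 5 minutes.

```csharp
using System;
using System.Net;

public static class TimeoutExample
{
    // Builds a request with explicit timeouts. Creating the request
    // does not contact the server; that happens on GetResponse().
    public static HttpWebRequest CreateRequest(
        string uri, TimeSpan responseTimeout, TimeSpan readWriteTimeout)
    {
        var request = (HttpWebRequest)WebRequest.Create(uri);

        // Maximum wait for GetResponse() to return, in milliseconds.
        request.Timeout = (int)responseTimeout.TotalMilliseconds;

        // Maximum wait for a single read or write on the stream, in milliseconds.
        request.ReadWriteTimeout = (int)readWriteTimeout.TotalMilliseconds;

        return request;
    }
}
```

A call might look like `TimeoutExample.CreateRequest(sourceUri, TimeSpan.FromMinutes(2), TimeSpan.FromMinutes(10))`. Raising the limits can hide a stalled connection for longer, so it is probably best combined with retry logic rather than used in place of it.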
0

The way you read the file through a buffer looks odd to me. Maybe the problem is that you do

while(bytesRead > 0) 

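If for some reason the stream returns no bytes at some point while the download is still unfinished, the loop exits and never comes back. You should get the Content-Length and increase a variable totalBytesReceived by bytesRead on each pass. Finally, change the loop to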

while(totalBytesReceived < ContentLength)
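
The suggestion above could be sketched roughly as follows. `StreamCopy` and `CopyExactly` are hypothetical names. One caveat that shapes the sketch: `Stream.Read` returning 0 means the stream has ended, so looping on the byte count alone would spin forever on a truncated response; this version throws instead, which lets a caller decide whether to retry.

```csharp
using System;
using System.IO;

public static class StreamCopy
{
    // Copies exactly expectedLength bytes from source to destination.
    // Throws if the source ends before that many bytes were read.
    public static long CopyExactly(Stream source, Stream destination, long expectedLength)
    {
        var buffer = new byte[4096];
        long total = 0;

        while (total < expectedLength)
        {
            // Never ask for more than the number of bytes still expected.
            int toRead = (int)Math.Min(buffer.Length, expectedLength - total);
            int read = source.Read(buffer, 0, toRead);

            if (read == 0)
            {
                // A return value of 0 means the stream ended early.
                throw new EndOfStreamException(
                    string.Format("Stream ended after {0} of {1} bytes.", total, expectedLength));
            }

            destination.Write(buffer, 0, read);
            total += read;
        }

        return total;
    }
}
```

In the question's code this would be called with `response.ContentLength` as the expected length, which assumes the server reports a length at all.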