Reading a huge int array from a binary file
Task
I have a large file (about 20 GB) that contains integers, and I want to read them in C#.
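Simple approach
Reading the file into memory (as a byte array) is quite fast (I am using an SSD, and the whole file fits into memory). But when I read those bytes with a BinaryReader (via a MemoryStream), the ReadInt32 method takes significantly longer than reading the file into memory. I expected disk I/O to be the bottleneck, but it is the conversion!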
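Idea and question
Is there a way to cast the whole byte array directly to an int array, without converting it element by element with the ReadInt32 method?
Timing reported by the program below: File written in 5499ms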
using System;
using System.Diagnostics;
using System.IO;
using System.Text;

class Program
{
    static int size = 256 * 1024 * 1024; // number of ints, i.e. 1 GiB of data
    static string filename = @"E:\testfile";

    static void Main(string[] args)
    {
        Write(filename, size);
        int[] result = Read(filename, size);
        Console.WriteLine(result.Length);
    }

    // Writes the integers 0..size-1 to the file, 4 bytes each.
    static void Write(string filename, int size)
    {
        Stopwatch stopwatch = new Stopwatch();
        stopwatch.Start();
        BinaryWriter bw = new BinaryWriter(new FileStream(filename, FileMode.Create), Encoding.UTF8);
        for (int i = 0; i < size; i++)
        {
            bw.Write(i);
        }
        bw.Close();
        stopwatch.Stop();
        Console.WriteLine(String.Format("File written in {0}ms", stopwatch.ElapsedMilliseconds));
    }

    // Reads the whole file into memory, then converts it to int[] one value at a time.
    static int[] Read(string filename, int size)
    {
        // Phase 1: load the raw bytes (timed separately).
        Stopwatch stopwatch = new Stopwatch();
        stopwatch.Start();
        byte[] buffer = File.ReadAllBytes(filename);
        BinaryReader br = new BinaryReader(new MemoryStream(buffer), Encoding.UTF8);
        stopwatch.Stop();
        Console.WriteLine(String.Format("File read into memory in {0}ms", stopwatch.ElapsedMilliseconds));

        // Phase 2: convert the bytes with ReadInt32 (this is the slow part).
        stopwatch.Reset();
        stopwatch.Start();
        int[] result = new int[size];
        for (int i = 0; i < size; i++)
        {
            result[i] = br.ReadInt32();
        }
        br.Close();
        stopwatch.Stop();
        Console.WriteLine(String.Format("Byte-array casted to int-array in {0}ms", stopwatch.ElapsedMilliseconds));
        return result;
    }
}
- File … in 3382ms
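As an illustration of the "convert everything at once" idea asked about above, here is a minimal sketch using Buffer.BlockCopy. The class and method names (BulkConvert, ToInt32Array) are made up for this example and are not part of the original program. It assumes the bytes are already in the machine's native byte order, which holds on little-endian hardware for data written with BinaryWriter.Write(int). It has not been timed against the program above.

```csharp
using System;

static class BulkConvert
{
    // Copies the raw bytes into a freshly allocated int[] with a single block copy.
    // Assumption: the byte order of the data matches the machine's native order.
    public static int[] ToInt32Array(byte[] buffer)
    {
        if (buffer.Length % sizeof(int) != 0)
            throw new ArgumentException("Length must be a multiple of 4.", "buffer");

        int[] result = new int[buffer.Length / sizeof(int)];
        // The count argument is in bytes, not in elements.
        Buffer.BlockCopy(buffer, 0, result, 0, buffer.Length);
        return result;
    }
}
```

In the Read method this would replace the ReadInt32 loop with `int[] result = BulkConvert.ToInt32Array(buffer);`. Note that a single byte[] cannot hold much more than 2 GB, so a 20 GB file would still have to be processed in chunks.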
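You will have to perform the conversion eventually. Could you read the array into memory and use BitConverter to get values from the array as needed? – 2014-11-02 14:36:13
Possible duplicate of http://stackoverflow.com/questions/3206391/directly-reading-large-binary-file-in-c-sharp-w-out-copying. – 2014-11-02 14:37:35
@PatrickHofman: It seems he already knows how to read the file into memory. – 2014-11-02 14:43:40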
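The BitConverter suggestion in the first comment could look roughly like the sketch below: keep the byte array as it is and decode a value only when it is needed. The names LazyInts and GetInt32 are hypothetical. BitConverter.ToInt32 uses the machine's native byte order, so this again assumes the file was written on hardware with the same byte order.

```csharp
using System;

static class LazyInts
{
    // Decodes the index-th 32-bit integer directly from the byte buffer,
    // without allocating or filling an int[].
    public static int GetInt32(byte[] buffer, int index)
    {
        return BitConverter.ToInt32(buffer, index * sizeof(int));
    }
}
```

For example, `LazyInts.GetInt32(buffer, 10)` reads bytes 40 to 43 of the buffer. Whether this helps depends on how many of the values are actually accessed.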