I need to split a large file into chunks. Since my file is large (50 GB), I need to split it into larger chunks.

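#include <fstream>
#include <iostream>
#include <vector>

// Bytes read per chunk (about 11 MB); raise it for MB- or GB-sized chunks.
#define BUFFER_SIZE 11788889

using namespace std;

int main()
{
    ifstream infile("hello.txt", ios::binary);
    // Heap-allocated buffer: an 11 MB array on the stack can overflow it.
    vector<char> buffer(BUFFER_SIZE);
    // streamoff is wide enough for offsets in a 50 GB file; int would overflow.
    streamoff read_file_position = infile.tellg();
    cout << "input file position " << read_file_position << endl;
    // The gcount() check lets the final, shorter chunk through as well.
    while (infile.read(buffer.data(), BUFFER_SIZE) || infile.gcount() > 0)
    {
        read_file_position += infile.gcount();
        cout << "input file position " << read_file_position << endl;
    }
}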

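What I have tried only splits my file into chunks of a fixed number of bytes. Splitting it into MB- or GB-sized chunks would be great, so any way to split it into larger chunks would help. Because my records do not have a fixed length, the chunk size will vary.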
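One way to get variable-sized chunks that never cut a record is to read fixed-size blocks and carry the incomplete tail over to the next block. The sketch below assumes each record ends with a newline, as the sample records in this thread suggest. `for_each_chunk` and `handle` are hypothetical names introduced only for illustration.

```cpp
#include <cstddef>
#include <fstream>
#include <functional>
#include <string>
#include <vector>

// Reads the file in blocks of block_size bytes and calls handle() with text
// that always ends on a record boundary. Assumes one record per line.
void for_each_chunk(const std::string& path, std::size_t block_size,
                    const std::function<void(const std::string&)>& handle) {
    std::ifstream in(path, std::ios::binary);
    std::vector<char> block(block_size);
    std::string carry;  // incomplete record left over from the previous block
    while (in) {
        in.read(block.data(), static_cast<std::streamsize>(block.size()));
        const std::streamsize got = in.gcount();
        if (got <= 0) break;
        carry.append(block.data(), static_cast<std::size_t>(got));
        const std::size_t cut = carry.rfind('\n');
        if (cut == std::string::npos) continue;  // no complete record yet
        handle(carry.substr(0, cut + 1));        // whole records only
        carry.erase(0, cut + 1);                 // keep the partial tail
    }
    if (!carry.empty()) handle(carry);  // last record without a trailing newline
}
```

With a block size of, say, 64 MB, each call to `handle` receives roughly 64 MB of complete records, and memory use stays near one block plus one partial record.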

Source

2014-12-04 Ayushi Tripathi

Why do you need to split it into chunks? Do you mean in memory, or do you mean splitting it into separate smaller files on disk? – EJP 2014-12-04 08:53:02

Because I need to hand these chunks to separate threads later on. – 2014-12-04 10:00:43

But that is a later step. Splitting the file into chunks comes first. – 2014-12-04 10:10:43

Yes, but because the file is large, I do not want to write it out to another file and waste time.

My records look like this:

ID:1002:: TP://reports/timing_report1.txt::TPS:counter/ffa::TPE: counter/ffd:: PGR: CLK::PTY:max::SL:-0.48::LAY:M2:: SEL::SLLT:1.0:: PTY:ANY::LAY:M1&M2:: PRG:ANY:: CELL:ANY:: REG:ANY 
ID:1003:: TP://reports/timing_report1.txt::TPS:counter/ffb::TPE: counter/ffc:: PGR:CLK:: PTY:max::SL:-0.3::LAY:M1:: SEL::SLLT:1.0:: PTY:ANY::LAY: M1&M2:: PRG:ANY:: CELL:ANY:: REG:ANY

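Now, if I split the file into chunks, I do not want a chunk to contain half a record. Every chunk should hold whole records only. If a split point falls in the middle of a record, I need to search for the next occurrence of ID, keep the first half of that record with the previous chunk, and start the next chunk at that ID.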
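The boundary search described above can be done without copying any data: seek to each nominal split point, skip to the end of the current line, and record the offset of the next line that begins with `ID:`. This is a minimal sketch, assuming every record is a single line starting with `ID:`. `chunk_offsets` and `target` are hypothetical names, not from the original post.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Returns the byte offset at which each chunk starts, followed by the file
// size. Chunk i covers the half-open range [offsets[i], offsets[i + 1]).
// Assumes each record is one line that begins with "ID:".
std::vector<std::int64_t> chunk_offsets(const std::string& path,
                                        std::int64_t target) {
    std::vector<std::int64_t> offsets;
    std::ifstream in(path, std::ios::binary);
    if (!in || target <= 0) return offsets;
    in.seekg(0, std::ios::end);
    const std::int64_t size = in.tellg();
    offsets.push_back(0);
    std::string line;
    for (std::int64_t pos = target; pos < size; pos = offsets.back() + target) {
        in.clear();
        in.seekg(pos - 1);       // one byte back, so a record starting at pos is kept
        std::getline(in, line);  // discard the rest of the record we landed in
        std::int64_t start = -1;
        while (true) {
            const std::int64_t here = in.tellg();  // offset of the next line
            if (!std::getline(in, line)) break;    // reached end of file
            if (line.compare(0, 3, "ID:") == 0) { start = here; break; }
        }
        if (start < 0) break;  // no further record: the last chunk runs to EOF
        offsets.push_back(start);
    }
    offsets.push_back(size);
    return offsets;
}
```

Only a few lines are read around each split point, so for a 50 GB file with a `target` of about 1 GB this touches a tiny fraction of the data. The resulting ranges can then be read independently.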

Source

2014-12-04 09:54:50

If you want to read the data in chunks and pass those chunks to several threads, do the following:

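#include <cstdio>
#include <thread>
#include <vector>

struct YourRecord { char data[256]; };   // placeholder fixed-size record
const size_t NUM_THREADS = 4;
void process(const YourRecord *rec) { (void)rec; /* per-record work goes here */ }

int main()
{
    std::vector<YourRecord> manyChunks(NUM_THREADS);
    FILE *f = fopen("hello.txt", "rb");
    size_t n;
    // fread returns how many complete records it read (at most NUM_THREADS)
    while (f && (n = fread(manyChunks.data(), sizeof(YourRecord), NUM_THREADS, f)) > 0)
    {
        std::vector<std::thread> threads;
        for (size_t i = 0; i < n; ++i)       // record i goes to thread i
            threads.emplace_back(process, &manyChunks[i]);
        for (auto &t : threads) t.join();    // wait before the buffer is reused
    }
    if (f) fclose(f);
}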

Source

2014-12-04 10:43:06 Fl0

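My record size is not fixed. It varies, and so does the file size. That is why I used #define for the buffer size, so that I can change it at any time. I want to split my file into chunks so that it can be read efficiently. Threading will come once the records are split properly. The main concern right now is just splitting the file into proper chunks. – 2014-12-04 11:45:27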
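Once a chunk is known to begin and end on record boundaries, a worker can open the file itself and read only its own byte range, so nothing has to be written to intermediate files. The following is a rough sketch under that assumption, with one `ID:` record per line. `read_records`, `begin` and `end` are hypothetical names used for illustration.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Returns the records found in the byte range [begin, end) of the file.
// Assumes begin and end both fall on record boundaries and that each record
// is one line starting with "ID:".
std::vector<std::string> read_records(const std::string& path,
                                      std::int64_t begin, std::int64_t end) {
    std::vector<std::string> records;
    std::ifstream in(path, std::ios::binary);  // each caller opens its own stream
    if (!in || begin < 0 || begin >= end) return records;
    in.seekg(begin);
    std::string line;
    std::int64_t pos = begin;
    while (pos < end && std::getline(in, line)) {
        pos += static_cast<std::int64_t>(line.size()) + 1;  // +1 for '\n'
        if (!line.empty() && line.back() == '\r') line.pop_back();  // CRLF files
        if (line.compare(0, 3, "ID:") == 0) records.push_back(line);
    }
    return records;
}
```

Because every call uses its own `std::ifstream`, several threads can run this on different ranges of the same file without sharing any stream state.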
