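I have a large number of Excel worksheets that I want to read into MySQL using PHPExcel. How can I use PHPExcel to read large worksheets from a large Excel file (27 MB+)?
I'm currently using a recent patch that lets you read a worksheet without opening the whole file. That way, I can read one worksheet at a time.
However, one Excel file is 27 MB. I can read the first worksheet successfully because it is small, but the second worksheet is so large that the cron job that started the process at 22:00 had not finished by 8:00 the next morning. The worksheet is simply too big.
Is there any way to read a worksheet in a few rows at a time, e.g. something like this: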
$inputFileType = 'Excel2007';
$inputFileName = 'big_file.xlsx';
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
$worksheetNames = $objReader->listWorksheetNames($inputFileName);
foreach ($worksheetNames as $sheetName) {
    // BELOW IS "WISH CODE": getWorksheetWithRows() does not exist in PHPExcel
    for ($row = 1; $row <= $max_rows; $row += 100) {
        $dataset = $objReader->getWorksheetWithRows($row, $row + 100);
        save_dataset_to_database($dataset);
    }
}
Addendum
@mark, I used the code you posted to create the following example:
function readRowsFromWorksheet() {
    $file_name = htmlentities($_POST['file_name']);
    $file_type = htmlentities($_POST['file_type']);
    echo 'Read rows from worksheet:<br />';
    debug_log('----------start');
    $objReader = PHPExcel_IOFactory::createReader($file_type);
    $chunkSize = 20;
    $chunkFilter = new ChunkReadFilter();
    $objReader->setReadFilter($chunkFilter);
    for ($startRow = 2; $startRow <= 240; $startRow += $chunkSize) {
        // Tell the filter which rows to keep, then reload the file with that filter applied
        $chunkFilter->setRows($startRow, $chunkSize);
        $objPHPExcel = $objReader->load('data/' . $file_name);
        debug_log('reading chunk starting at row ' . $startRow);
        $sheetData = $objPHPExcel->getActiveSheet()->toArray(null, true, true, true);
        var_dump($sheetData);
        echo '<hr />';
    }
    debug_log('end');
}
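As the following log shows, it runs fine on a small 8K Excel file, but when I run it on a 3 MB Excel file, it never gets past the first chunk. Is there any way to optimize this code for performance? Otherwise it doesn't look performant enough to get chunks out of a large Excel file: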
2011-01-12 11:07:15: ----------start
2011-01-12 11:07:15: reading chunk starting at row 2
2011-01-12 11:07:15: reading chunk starting at row 22
2011-01-12 11:07:15: reading chunk starting at row 42
2011-01-12 11:07:15: reading chunk starting at row 62
2011-01-12 11:07:15: reading chunk starting at row 82
2011-01-12 11:07:15: reading chunk starting at row 102
2011-01-12 11:07:15: reading chunk starting at row 122
2011-01-12 11:07:15: reading chunk starting at row 142
2011-01-12 11:07:15: reading chunk starting at row 162
2011-01-12 11:07:15: reading chunk starting at row 182
2011-01-12 11:07:15: reading chunk starting at row 202
2011-01-12 11:07:15: reading chunk starting at row 222
2011-01-12 11:07:15: end
2011-01-12 11:07:52: ----------start
2011-01-12 11:08:01: reading chunk starting at row 2
(...at 11:18, CPU usage at 93% still running...)
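The ChunkReadFilter class used in readRowsFromWorksheet() above is not shown in the post. A minimal sketch of what such a filter might look like follows, assuming PHPExcel's PHPExcel_IReadFilter interface and its readCell($column, $row, $worksheetName) method. The property names and the stand-in interface at the top are assumptions added so the sketch runs on its own. It always keeps row 1 as the headings row, matching the message printed in Addendum 3.

```php
<?php
// Stand-in so this sketch runs without PHPExcel installed (an assumption for
// illustration only); with the real library loaded, this block is skipped.
if (!interface_exists('PHPExcel_IReadFilter')) {
    interface PHPExcel_IReadFilter
    {
        public function readCell($column, $row, $worksheetName = '');
    }
}

// Read filter that accepts the headings row plus one window of rows.
class ChunkReadFilter implements PHPExcel_IReadFilter
{
    private $startRow = 0;
    private $endRow = 0;

    // Select rows $startRow .. $startRow + $chunkSize - 1 for the next load().
    public function setRows($startRow, $chunkSize)
    {
        $this->startRow = $startRow;
        $this->endRow = $startRow + $chunkSize;
    }

    // The reader calls this for each cell; returning false skips the cell.
    public function readCell($column, $row, $worksheetName = '')
    {
        if ($row == 1) {
            return true; // always keep the headings row
        }
        return $row >= $this->startRow && $row < $this->endRow;
    }
}
```

Note that a filter like this only decides which cells are kept in memory. Each call to load() still has to parse the file again, which is consistent with the roughly constant time per chunk seen in the Addendum 2 log.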
Addendum 2
When I comment out:
//$sheetData = $objPHPExcel->getActiveSheet()->toArray(null, true, true, true);
//var_dump($sheetData);
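then it parses at an acceptable speed (the log below shows about 7 to 8 seconds per 20-row chunk). Is there any way to improve the performance of toArray()?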
2011-01-12 11:40:51: ----------start
2011-01-12 11:40:59: reading chunk starting at row 2
2011-01-12 11:41:07: reading chunk starting at row 22
2011-01-12 11:41:14: reading chunk starting at row 42
2011-01-12 11:41:22: reading chunk starting at row 62
2011-01-12 11:41:29: reading chunk starting at row 82
2011-01-12 11:41:37: reading chunk starting at row 102
2011-01-12 11:41:45: reading chunk starting at row 122
2011-01-12 11:41:52: reading chunk starting at row 142
2011-01-12 11:42:00: reading chunk starting at row 162
2011-01-12 11:42:07: reading chunk starting at row 182
2011-01-12 11:42:15: reading chunk starting at row 202
2011-01-12 11:42:22: reading chunk starting at row 222
2011-01-12 11:42:22: end
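One option for the toArray() cost, sketched here under assumptions, is to dump only the rows of the current chunk instead of the whole sheet extent. The sketch assumes a PHPExcel version whose Worksheet class provides rangeToArray() (the comment at the end of this post mentions it as newly added) and getHighestColumn(). The helper names chunkRangeAddress() and chunkToArray() are hypothetical, and the actual speed gain has not been measured here.

```php
<?php
// Hypothetical helper: A1-style address covering one chunk of rows,
// from column A to $lastColumn.
function chunkRangeAddress($startRow, $chunkSize, $lastColumn)
{
    $endRow = $startRow + $chunkSize - 1;
    return 'A' . $startRow . ':' . $lastColumn . $endRow;
}

// Hypothetical helper: convert only the chunk's rows to an array.
// $sheet is expected to be a PHPExcel_Worksheet, e.g. $objPHPExcel->getActiveSheet().
function chunkToArray($sheet, $startRow, $chunkSize)
{
    $range = chunkRangeAddress($startRow, $chunkSize, $sheet->getHighestColumn());
    // Arguments: range, value for empty cells, calculate formulas,
    // apply number formatting, index result by row number and column letter.
    return $sheet->rangeToArray($range, null, true, true, true);
}
```

Inside the chunk loop this would replace the toArray() call with something like `$sheetData = chunkToArray($objPHPExcel->getActiveSheet(), $startRow, $chunkSize);`. Passing false for the formula and formatting arguments may reduce the work further if raw values are enough.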
Addendum 3
This seems to work adequately, at least on the 3 MB file:
for ($startRow = 2; $startRow <= 240; $startRow += $chunkSize) {
    echo 'Loading WorkSheet using configurable filter for headings row 1 and for rows ', $startRow, ' to ', ($startRow + $chunkSize - 1), '<br />';
    $chunkFilter->setRows($startRow, $chunkSize);
    $objPHPExcel = $objReader->load('data/' . $file_name);
    debug_log('reading chunk starting at row ' . $startRow);
    // Walk the rows and cells with iterators instead of building one big array
    foreach ($objPHPExcel->getActiveSheet()->getRowIterator() as $row) {
        $cellIterator = $row->getCellIterator();
        $cellIterator->setIterateOnlyExistingCells(false);
        echo '<tr>';
        foreach ($cellIterator as $cell) {
            if (!is_null($cell)) {
                //$value = $cell->getCalculatedValue();
                $rawValue = $cell->getValue();
                debug_log($rawValue);
            }
        }
    }
}
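The loop above only logs each raw value. Since the goal is to get the data into MySQL, one way the rows of a chunk might be saved is with a single multi-row INSERT per chunk. The following is a sketch under assumptions: buildBatchInsert() and saveChunk() are hypothetical helpers, the table and column names are placeholders supplied by the caller, and $pdo is assumed to be an already connected PDO instance.

```php
<?php
// Hypothetical helper: builds one multi-row INSERT with positional placeholders.
// $table and $columns are assumed to be trusted identifiers, not user input,
// and every row is assumed to hold exactly one value per column.
function buildBatchInsert($table, array $columns, array $rows)
{
    $rowPlaceholder = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    $sql = 'INSERT INTO ' . $table . ' (' . implode(', ', $columns) . ') VALUES '
         . implode(', ', array_fill(0, count($rows), $rowPlaceholder));

    // Flatten the rows into one parameter list, in row order.
    $params = array();
    foreach ($rows as $row) {
        foreach ($row as $value) {
            $params[] = $value;
        }
    }
    return array($sql, $params);
}

// Hypothetical helper: writes one chunk of rows through a prepared statement.
function saveChunk(PDO $pdo, $table, array $columns, array $rows)
{
    if (count($rows) === 0) {
        return; // nothing to insert for an empty chunk
    }
    list($sql, $params) = buildBatchInsert($table, $columns, $rows);
    $pdo->prepare($sql)->execute($params);
}
```

In the iterator loop, the values from $cell->getValue() could be collected into one array per row and handed to saveChunk() once per chunk, so the database sees one statement per chunk rather than one per cell.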
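The var_dump of $sheetData was only in my code snippet to demonstrate how the chunking works; it is probably not something you would need in "real world" use. If you do need to dump worksheet data, the rangeToArray() method that I am now adding to the Worksheet class will also be more efficient than the toArray() method. – 2011-01-12 11:53:43
@Edward Tanguay Hi, did you ever find a solution or an alternative? I'm running into the same problem. – 2013-10-11 09:07:51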