2011-02-07 186 views
4

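I'm working with an xlsx Excel file on the server side in C#. The spreadsheet has 15 columns (cells) in total, and some values are missing in the data rows. The first row is my header, so it correctly has all 15 cells. In the data rows, however, some cells may be empty, so Open XML gives me a "jagged" set of cell values: row 1 has the full 15 cells, while row 2 might have only 13 because two of its values are empty. How do I map this data properly? Everything is effectively shifted to the left, so my cell values end up in the wrong columns. What am I missing? In Open XML terminology the cells seem to be "collapsed". How can I "uncollapse" the cells of an Open XML spreadsheet?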

// Requires: using System.Linq; using DocumentFormat.OpenXml.Packaging; using DocumentFormat.OpenXml.Spreadsheet;
WorkbookPart workbookPart = spreadSheetDocument.WorkbookPart;
IEnumerable<Sheet> sheets = workbookPart.Workbook.GetFirstChild<Sheets>().Elements<Sheet>();
string relationshipId = sheets.First().Id.Value;
WorksheetPart worksheetPart = (WorksheetPart)workbookPart.GetPartById(relationshipId);
Worksheet workSheet = worksheetPart.Worksheet;
SheetData sheetData = workSheet.GetFirstChild<SheetData>();
IEnumerable<Row> rows = sheetData.Descendants<Row>();

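CLARIFICATION: Here is another way to ask the question. What if I want to take the contents of the Excel file and put them into a DataTable? I want all the columns of data to line up correctly. How can I accomplish that?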

The same question is asked better than I managed here: reading Excel Open XML is ignoring blank cells

The MS article touches on this topic: http://msdn.microsoft.com/en-us/library/documentformat.openxml.spreadsheet.row.aspx – Shane 2011-02-07 15:58:16

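Can you elaborate? Are you modifying an existing xlsx file and getting bad output? Or are you trying to read data from an existing xlsx file and the OpenXML cells don't match your expectations? I can't work out what the question is or what your code is trying to accomplish. – Sapph 2011-02-07 16:01:18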

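I'm trying to read an XLSX file and work with it, for example with: foreach (Row row in rows). The first row has the correct 15 cells, but row 2 has only 13 cells, because two of them are just empty strings in my Excel file. So all the data in row 2 is shifted over. None of the cells map correctly any more; for row 2, for example, cell 6 actually holds the value of cell 7, which is wrong. It's hard to explain, I guess. – Shane 2011-02-07 16:05:27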

Answers

3

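One way to accomplish what you want is to work out the largest column index across all the rows and then fill in every missing cell with a blank one. That keeps all your columns lined up. Here is a snippet that finds the largest column index: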

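int? biggestColumnIndex = 0;
foreach (Row row in rows)
{
    // Look at the last cell in the row; rows without any cells are skipped
    Cell lastCell = row.Elements<Cell>().LastOrDefault();
    if (lastCell != null)
    {
        // Check whether this row has a bigger column index than the previous rows.
        // GetColumnIndexFromName expects the column letters only, so strip the row number first.
        int? columnIndex = GetColumnIndexFromName(GetColumnName(lastCell.CellReference));
        biggestColumnIndex = columnIndex.HasValue && columnIndex > biggestColumnIndex ? columnIndex : biggestColumnIndex;
    }
}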

/// <summary>
/// Given a column name, returns the zero based column index (A = 0, Z = 25, AA = 26).
/// Note: Only the leading letters are used, so a full cell reference such as "AB12"
/// is also accepted and the row number is ignored.
/// </summary>
/// <param name="columnName">Column Name (ie. A or AB)</param>
/// <returns>Zero based index if the conversion was successful; otherwise null</returns>
public static int? GetColumnIndexFromName(string columnName)
{
    if (string.IsNullOrEmpty(columnName))
    {
        return null;
    }

    int index = 0;
    int letterCount = 0;

    foreach (char c in columnName.ToUpperInvariant())
    {
        // Stop at the first character that is not a column letter (ie. the row number)
        if (c < 'A' || c > 'Z')
        {
            break;
        }

        // Column names are a base 26 number without a zero digit (A = 1 ... Z = 26)
        index = (index * 26) + (c - 'A' + 1);
        letterCount++;
    }

    // Convert the one based value to a zero based index
    return letterCount > 0 ? index - 1 : (int?)null;
}

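Once you have the largest column index, call the InsertCellsForCellRange method to fill in all the missing cells with blank cells. When you then read your data, everything should line up. (All the helper methods are listed below InsertCellsForCellRange.)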
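
InsertCellsForCellRange takes cell references rather than an index, so the largest column index has to be turned back into a column name before the call. Here is a minimal sketch of that step. It assumes a hypothetical helper named GetColumnNameFromIndex (not part of the code in this answer or of the Open XML SDK) and a zero based index like the one GetColumnIndexFromName returns:

```csharp
using System;

public static class ColumnNames
{
    // Hypothetical helper: converts a zero based column index to an Excel
    // column name (0 -> "A", 25 -> "Z", 26 -> "AA").
    public static string GetColumnNameFromIndex(int columnIndex)
    {
        if (columnIndex < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(columnIndex));
        }

        string name = string.Empty;
        int n = columnIndex + 1; // one based, because column names have no zero digit

        while (n > 0)
        {
            int remainder = (n - 1) % 26;          // 0 = A ... 25 = Z
            name = (char)('A' + remainder) + name; // prepend the letter
            n = (n - 1) / 26;
        }

        return name;
    }
}
```

With that, the end of the range could be built as GetColumnNameFromIndex(biggestColumnIndex.Value) followed by the index of the last row, and passed to InsertCellsForCellRange together with "A1" as the start reference. Where the last row index comes from is left open here.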

/// <summary> 
/// Inserts cells if required for a rectangular range of cells 
/// </summary> 
/// <param name="startCellReference">Upper left cell of the rectangle</param> 
/// <param name="endCellReference">Lower right cell of the rectangle</param> 
/// <param name="worksheetPart">Worksheet part to insert cells</param> 
public static void InsertCellsForCellRange(string startCellReference, string endCellReference, WorksheetPart worksheetPart) 
{ 
    uint startRow = GetRowIndex(startCellReference); 
    uint endRow = GetRowIndex(endCellReference); 
    string startColumn = GetColumnName(startCellReference); 
    string endColumn = GetColumnName(endCellReference); 

    // Insert the cells row by row if necessary 
    for (uint currentRow = startRow; currentRow <= endRow; currentRow++) 
    { 
     string currentCell = startColumn + currentRow.ToString(); 
     string endCell = IncrementCellReference(endColumn + currentRow.ToString(), CellReferencePartEnum.Column); 

     // Check to make sure all cells exist in the range; if not create them 
     while (!currentCell.Equals(endCell)) 
     { 
      if (GetCell(worksheetPart, currentCell) == null) 
      { 
       InsertCell(GetColumnName(currentCell), GetRowIndex(currentCell), worksheetPart); 
      } 

      // Move the reference to the next cell in the range 
      currentCell = IncrementCellReference(currentCell, CellReferencePartEnum.Column); 
     } 
    } 
} 

     /// <summary> 
     /// Given a cell name, parses the specified cell to get the row index. 
     /// </summary> 
     /// <param name="cellReference">Address of the cell (ie. B2)</param> 
     /// <returns>Row Index (ie. 2)</returns> 
     public static uint GetRowIndex(string cellReference) 
     { 
      // Create a regular expression to match the row index portion of the cell name.
      Regex regex = new Regex(@"\d+"); 
      Match match = regex.Match(cellReference); 

      return uint.Parse(match.Value); 
     } 



    /// <summary> 
    /// Given a cell name, parses the specified cell to get the column name. 
    /// </summary> 
    /// <param name="cellReference">Address of the cell (ie. B2)</param> 
    /// <returns>Column Name (ie. B)</returns> 
    public static string GetColumnName(string cellReference) 
    { 
     // Create a regular expression to match the column name portion of the cell name. 
     Regex regex = new Regex("[A-Za-z]+"); 
     Match match = regex.Match(cellReference); 

     return match.Value; 
    } 

     /// <summary> 
     /// Increments the reference of a given cell. This reference comes from the CellReference property 
     /// on a Cell. 
     /// </summary> 
     /// <param name="reference">reference string</param> 
     /// <param name="cellRefPart">indicates what is to be incremented</param> 
     /// <returns></returns> 
     public static string IncrementCellReference(string reference, CellReferencePartEnum cellRefPart) 
     { 
      string newReference = reference; 

      if (cellRefPart != CellReferencePartEnum.None && !String.IsNullOrEmpty(reference)) 
      { 
       string[] parts = Regex.Split(reference, "([A-Z]+)"); 

       if (cellRefPart == CellReferencePartEnum.Column || cellRefPart == CellReferencePartEnum.Both) 
       { 
        List<char> col = parts[1].ToCharArray().ToList(); 
        bool needsIncrement = true; 
        int index = col.Count - 1; 

        do 
        { 
         // increment the last letter 
         col[index] = Letters[Letters.IndexOf(col[index]) + 1]; 

         // if it is the last letter, then we need to roll it over to 'A' 
         if (col[index] == Letters[Letters.Count - 1]) 
         { 
          col[index] = Letters[0]; 
         } 
         else 
         { 
          needsIncrement = false; 
         } 

        } while (needsIncrement && --index >= 0); 

        // If true, then we need to add another letter to the mix. Initial value was something like "ZZ" 
        if (needsIncrement) 
        { 
         col.Add(Letters[0]); 
        } 

        parts[1] = new String(col.ToArray()); 
       } 

       if (cellRefPart == CellReferencePartEnum.Row || cellRefPart == CellReferencePartEnum.Both) 
       { 
        // Increment the row number. A reference is invalid without this component, so we assume it will always be present.
        parts[2] = (int.Parse(parts[2]) + 1).ToString(); 
       } 

       newReference = parts[1] + parts[2]; 
      } 

      return newReference; 
     } 

     /// <summary> 
     /// Returns a cell Object corresponding to a specific address on the worksheet
     /// </summary>
     /// <param name="workSheetPart">WorkSheet to search for cell address</param>
     /// <param name="cellAddress">Cell Address (ie. B2)</param> 
     /// <returns>Cell Object</returns> 
     public static Cell GetCell(WorksheetPart workSheetPart, string cellAddress) 
     { 
      return workSheetPart.Worksheet.Descendants<Cell>() 
           .Where(c => cellAddress.Equals(c.CellReference)) 
           .SingleOrDefault(); 
     } 

     /// <summary> 
     /// Inserts a new cell at the specified colName and rowIndex. If a cell 
     /// already exists, then the existing cell is returned. 
     /// </summary> 
     /// <param name="colName">Column Name</param> 
     /// <param name="rowIndex">Row Index</param> 
     /// <param name="worksheetPart">Worksheet Part</param> 
     /// <returns>Inserted Cell</returns> 
     public static Cell InsertCell(string colName, uint rowIndex, WorksheetPart worksheetPart) 
     { 
      return InsertCell(colName, rowIndex, worksheetPart, null); 
     } 

     /// <summary> 
     /// Inserts a new cell at the specified colName and rowIndex. If a cell 
     /// already exists, then the existing cells are shifted to the right. 
     /// </summary> 
     /// <param name="colName">Column Name</param> 
     /// <param name="rowIndex">Row Index</param> 
     /// <param name="worksheetPart">Worksheet Part</param> 
     /// <param name="cell"></param> 
     /// <returns>Inserted Cell</returns> 
     public static Cell InsertCell(string colName, uint rowIndex, WorksheetPart worksheetPart, Cell insertCell) 
     { 
      Worksheet worksheet = worksheetPart.Worksheet; 
      SheetData sheetData = worksheet.GetFirstChild<SheetData>(); 
      string insertReference = colName + rowIndex; 

      // If the worksheet does not contain a row with the specified row index, insert one. 
      Row row; 
      if (sheetData.Elements<Row>().Where(r => r.RowIndex == rowIndex).Count() != 0) 
      { 
       row = sheetData.Elements<Row>().Where(r => r.RowIndex == rowIndex).First(); 
      } 
      else 
      { 
       row = new Row() { RowIndex = rowIndex }; 
       sheetData.Append(row); 
      } 

      Cell retCell = row.Elements<Cell>().FirstOrDefault(c => c.CellReference.Value == colName + rowIndex); 
      // If retCell is not null and we are not inserting a new cell, then just skip everything and return the cell 
      if (retCell != null) 
      { 
       // NOTE: the 'if' conditions are not combined because we want to skip the parent 'else' when the outer 'if' is true.
       // If retCell is not null and we are inserting a new cell, then move all existing cells to the right.
       if (insertCell != null) 
       { 
        // Get all the cells in the row with equal or higher column values than the one being inserted. 
        // Add the cell to be inserted into the temp list and re-index all of the cells. 
        List<Cell> cells = row.Descendants<Cell>().Where(c => String.Compare(c.CellReference.Value, insertReference) >= 0).ToList(); 
        cells.Insert(0, insertCell); 
        string cellReference = insertReference; 

        foreach (Cell cell in cells) 
        { 
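         // Update the reference of each cell, then advance to the next column.
         // IncrementCellReference returns the new reference, so the result must be assigned back.
         cell.CellReference = new StringValue(cellReference);
         cellReference = IncrementCellReference(cellReference, CellReferencePartEnum.Column);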
        } 

        // actually insert the new cell into the row 
        retCell = row.InsertBefore(insertCell, retCell); // at this point, retCell still points to the row that had the insertReference 
       } 
      } 
      // Else retCell is null, this means no cell exists at the specified location so we need to put a new cell in that space. 
      // If a cell was passed into this method, then it will be inserted. If not, a new one will be inserted. 
      else 
      { 
       // Cells must be in sequential order according to CellReference. Determine where to insert the new cell.
       // Sequential order can't be plain string comparison order; it has to be Excel order ("A", "B", ... "Z", "AA", "AB", etc.),
       // so column names are compared by length first and alphabetically second.
       Cell refCell = null;
       foreach (Cell cell in row.Elements<Cell>())
       {
        string cellColumn = Regex.Replace(cell.CellReference.Value, @"\d", "");
        if (cellColumn.Length > colName.Length ||
         (cellColumn.Length == colName.Length && string.Compare(cellColumn, colName, StringComparison.OrdinalIgnoreCase) > 0))
        { 
         refCell = cell; 
         break; 
        } 
       } 

       // Insert cell parameter is supplied, otherwise, create a new cell 
       retCell = insertCell ?? new Cell() { CellReference = insertReference }; 
       row.InsertBefore(retCell, refCell); 
      } 

      return retCell; 
     } 

//Other missing pieces 

public enum CellReferencePartEnum 
    { 
     None, 
     Column, 
     Row, 
     Both 
    } 

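// The trailing ' ' is a sentinel: IncrementCellReference uses it to detect that 'Z' has rolled over.
private static List<char> Letters = new List<char>() { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', ' ' };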
5

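As I understand it, you are iterating over the cells in a row and assuming that the first cell you read is in column A, the second in column B, and so on?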

I suggest you parse (with a regex?) the column out of

DocumentFormat.OpenXml.Spreadsheet.Cell currentcell 
currentcell.CellReference 

CellReference holds the column position: it gives you the cell reference in "A1" notation. Extract the column part ("A", "B", "CC", etc.).

You will have to do this for every cell in a row and, if the cell for some column is missing, fill in a placeholder value. DBNull, perhaps?

I don't know whether there is another way to find out which column a cell belongs to.
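
The approach described above can be sketched without modifying the document: take each cell's column from its reference and store the value at that index, leaving DBNull in the gaps. The sketch assumes each row has already been read into (Reference, Value) pairs, for example from Cell.CellReference and the cell's text. The class and method names are hypothetical. The first row is treated as a header that starts at column A, has no gaps, and has unique names.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class JaggedRows
{
    // Zero based column index from a reference such as "C2" (A -> 0, AA -> 26).
    // Returns -1 when the reference does not start with a letter.
    public static int ColumnIndex(string cellReference)
    {
        int index = 0;
        foreach (char c in cellReference.ToUpperInvariant())
        {
            if (c < 'A' || c > 'Z') break; // stop at the row number
            index = (index * 26) + (c - 'A' + 1);
        }
        return index - 1;
    }

    // Builds a DataTable whose columns come from the first (header) row.
    public static DataTable ToDataTable(
        IReadOnlyList<IReadOnlyList<(string Reference, string Value)>> rows)
    {
        var table = new DataTable();
        if (rows.Count == 0) return table;

        foreach (var header in rows[0])
        {
            table.Columns.Add(header.Value);
        }

        for (int r = 1; r < rows.Count; r++)
        {
            DataRow dataRow = table.NewRow(); // every field starts out as DBNull
            foreach (var cell in rows[r])
            {
                int col = ColumnIndex(cell.Reference);
                if (col >= 0 && col < table.Columns.Count)
                {
                    dataRow[col] = (object)cell.Value ?? DBNull.Value;
                }
            }
            table.Rows.Add(dataRow);
        }

        return table;
    }
}
```

For a data row that contains only A2 and C2, the value from C2 ends up in the third column and the second column stays DBNull, so nothing is shifted to the left.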
