2017-05-31 115 views

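I am trying to load the entire contents of a .docx file into a text area in Java, but I only get the plain text, without the images or tables. Any suggestions? Thanks in advance. How can I get the images and tables out of a .docx file using Apache POI?

My code is: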

// Runs inside a Swing event handler; `content` is the JTextArea that shows the document.
try {
    JFileChooser chooser = new JFileChooser();
    chooser.showOpenDialog(null);
    XWPFDocument doc = new XWPFDocument(new FileInputStream(chooser.getSelectedFile()));
    // XWPFWordExtractor returns the document text only, so images are not included
    XWPFWordExtractor extract = new XWPFWordExtractor(doc);
    content.setText(extract.getText());
    content.setFont(new Font("Serif", Font.ITALIC, 16));
    content.setLineWrap(true);
    content.setWrapStyleWord(true);
    content.setBackground(Color.white);
} catch (Exception e) {
    JOptionPane.showMessageDialog(null, e);
}

Answer


To extract tables, use List<XWPFTable> table = doc.getTables().

See the example below:

import java.io.FileInputStream;
import java.io.IOException;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFTable;
import org.apache.poi.xwpf.usermodel.XWPFTableCell;
import org.apache.poi.xwpf.usermodel.XWPFTableRow;

public static void readWordDocument() {
    String fileName = "C:\\sample.docx";

    // XWPFDocument reads only the OOXML .docx format; a legacy .doc file needs HWPFDocument instead
    if (!fileName.toLowerCase().endsWith(".docx")) {
        throw new IllegalArgumentException("Not a .docx file: " + fileName);
    }

    // try-with-resources closes both the stream and the document
    try (FileInputStream in = new FileInputStream(fileName);
         XWPFDocument doc = new XWPFDocument(in)) {
        for (XWPFTable table : doc.getTables()) {
            printTable(table);
        }
    } catch (IOException e) {
        // also covers FileNotFoundException, which is a subclass of IOException
        e.printStackTrace();
    }
}

// Prints the text of every cell, then descends into any tables nested inside that cell.
private static void printTable(XWPFTable table) {
    for (XWPFTableRow row : table.getRows()) {
        for (XWPFTableCell cell : row.getTableCells()) {
            System.out.println(cell.getText());
            for (XWPFTable nested : cell.getTables()) {
                printTable(nested);
            }
        }
    }
}
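
The loops above print each cell on its own line, which loses the row and column layout. Since the goal in the question is a text area, which can only show plain text, here is a rough sketch of one way to lay the cell strings out as aligned rows. TableTextFormatter and its format method are hypothetical names, not part of Apache POI. The sketch assumes the cell text has already been collected into a list of rows (for example from xwpfTableCell.getText()) and that the text area uses a monospaced font; with a proportional font the columns will not line up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Apache POI): lays table cell text out as plain text. */
final class TableTextFormatter {

    /** Pads every column to its widest cell so rows line up in a monospaced font. */
    static String format(List<List<String>> rows) {
        // First pass: find the widest cell in each column.
        List<Integer> widths = new ArrayList<>();
        for (List<String> row : rows) {
            for (int c = 0; c < row.size(); c++) {
                int len = row.get(c).length();
                if (c == widths.size()) {
                    widths.add(len);
                } else if (len > widths.get(c)) {
                    widths.set(c, len);
                }
            }
        }
        // Second pass: emit each row with left-aligned, padded cells separated by " | ".
        StringBuilder out = new StringBuilder();
        for (List<String> row : rows) {
            for (int c = 0; c < row.size(); c++) {
                if (c > 0) {
                    out.append(" | ");
                }
                int width = Math.max(1, widths.get(c)); // a width of 0 is not a valid format width
                out.append(String.format("%-" + width + "s", row.get(c)));
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("Name", "Qty"),
                Arrays.asList("Apple", "3"));
        System.out.print(format(rows));
    }
}
```

The returned string could then be appended to the text area, for instance with content.append(...), next to the text from XWPFWordExtractor.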

To extract images, use List<XWPFPictureData> piclist = docx.getAllPictures().

See the example below:

import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFPictureData;

public static void extractImages(String src) {
    // open the .docx file; try-with-resources closes the stream and the document
    try (FileInputStream fs = new FileInputStream(src);
         XWPFDocument docx = new XWPFDocument(fs)) {
        // get all pictures embedded in the document
        List<XWPFPictureData> piclist = docx.getAllPictures();
        int i = 0;
        for (XWPFPictureData pic : piclist) {
            // Write the stored bytes unchanged under the picture's own extension.
            // Re-encoding everything as "jpg" through ImageIO fails for formats
            // ImageIO cannot decode and can fail for images with transparency.
            String ext = pic.suggestFileExtension();
            Files.write(Paths.get("D:/imagefromword" + i + "." + ext), pic.getData());
            i++;
        }
    } catch (IOException e) {
        // report the failure instead of exiting silently
        e.printStackTrace();
    }
}
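
For comparison, here is a sketch of the same idea without Apache POI. A .docx file is a ZIP archive, and Word typically stores embedded pictures as entries under word/media/, so the JDK's own java.util.zip classes can copy them out. DocxMediaExtractor and extract are hypothetical names, and the word/media/ prefix is an assumption about how the file was written; a document that keeps its pictures elsewhere in the package would need getAllPictures() as shown above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper: copies embedded media out of a .docx using only the JDK. */
final class DocxMediaExtractor {

    /** Copies every entry under word/media/ into outDir and returns the written paths. */
    static List<Path> extract(InputStream docx, Path outDir) throws IOException {
        List<Path> written = new ArrayList<>();
        Files.createDirectories(outDir);
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (entry.isDirectory() || !name.startsWith("word/media/")) {
                    continue; // not an embedded media file
                }
                // Keep only the file name so an entry cannot be written outside outDir.
                String fileName = name.substring(name.lastIndexOf('/') + 1);
                Path target = outDir.resolve(fileName);
                // Files.copy reads up to the end of the current entry and leaves the stream open.
                Files.copy(zip, target, StandardCopyOption.REPLACE_EXISTING);
                written.add(target);
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a .docx file, args[1]: output directory (both supplied by the caller)
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            for (Path p : extract(in, Paths.get(args[1]))) {
                System.out.println(p);
            }
        }
    }
}
```

This only recovers the image files themselves. It says nothing about where each picture sits in the text, which still has to come from the document model.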