2013-03-27 45 views

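Java code to read all .xml files in a folder and the value between specific tags

I have a folder that contains only .xml files. My program needs to read each file and then return the names of the files that have 'false' between the tags (isTest in the snippet below). I was thinking of something like: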

    final Pattern pattern = Pattern.compile("<isTest>(.+?)</isTest>");
    final Matcher matcher = pattern.matcher("<isTest>false</isTest>");
    // group(1) is only valid after a successful find()
    if (matcher.find()) {
        System.out.println(matcher.group(1));
    }

I'm new to Java, so any help would be greatly appreciated.

Can you tell me where I'm going wrong?

public class FileIO
{
    public static void main(String[] args)
    {
        File dir = new File("d:\temp");

        List<String> list = new ArrayList<String>();

        //storing the names of the files in an array.
        if (dir.isDirectory())
        {
            String[] fileList = dir.list();
            Pattern p = Pattern.compile("^(.*?)\\.xml$");

            for (String file : fileList)
            {
                Matcher m = p.matcher(file);
                if (m.matches())
                {
                    list.add(m.group(1));
                }
            }
        }

        try
        {

            XPathFactory xPathFactory = XPathFactory.newInstance();
            XPath xpath = xPathFactory.newXPath();
            DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = docBuilderFactory.newDocumentBuilder();

            //Loop over files

            for (int i = 0; i < fileList.length; i++)
            {
                Document doc = builder.parse(fileList[i]);
                boolean matches = "false".equals(xpath.evaluate("//isTest/text()", doc));
            }
        }

        catch(Exception e)
        {
            e.printStackTrace();
        }
    }
}

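Parsing HTML, and XML in general, with regular expressions is almost always a bad idea. If you have to check the content between tags, you might consider actually parsing the XML (using SAX or something more useful). – TC1 2013-03-27 15:18:15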

Answers


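If the files have an XSD that you can use, JAXB is the solution of choice. You don't want to use regular expressions on XML, because CDATA sections will ruin your day, and so will nested tags.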
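To make the CDATA point concrete, here is a minimal sketch that runs the regular expression from the question and an XPath lookup over the same input. The class name `RegexVsParser`, its two helper methods, and the sample XML string are made up for illustration; only the pattern itself comes from the question.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; the name and the sample XML are illustrative only.
final class RegexVsParser {

    // Same pattern as in the question: returns the raw characters between the tags.
    static String viaRegex(String xml) {
        Matcher m = Pattern.compile("<isTest>(.+?)</isTest>").matcher(xml);
        return m.find() ? m.group(1) : null;
    }

    // Lets an XML parser resolve the CDATA section before the value is read.
    static String viaXPath(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate("string(//isTest)", new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<config><isTest><![CDATA[false]]></isTest></config>";
        System.out.println("regex: " + viaRegex(xml));
        System.out.println("xpath: " + viaXPath(xml));
    }
}
```

For this input the regex hands back the CDATA wrapper along with the value, so a comparison against "false" fails, while the XPath version yields the plain text.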

Using SAX like this is one possible solution (adapted from here):


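import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// The class name is illustrative; the original snippet showed only main().
public class SaxIsTest
{
    // Creating the parser and parsing both throw checked exceptions.
    public static void main(String[] args) throws Exception
    {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser saxParser = factory.newSAXParser();

        DefaultHandler handler = new DefaultHandler() {

            // True while the parser is inside an <isTest> element.
            boolean isTest = false;
            // characters() may deliver the text in several chunks, so collect them.
            StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName,
                    Attributes attributes) throws SAXException {

                System.out.println("Start Element :" + qName);

                if (qName.equalsIgnoreCase("isTest")) {
                    isTest = true;
                    text.setLength(0);
                }
            }

            @Override
            public void endElement(String uri, String localName,
                    String qName) throws SAXException {

                System.out.println("End Element :" + qName);

                // The complete value is only known once the element closes.
                if (qName.equalsIgnoreCase("isTest")) {
                    System.out.println("is test : " + text);
                    isTest = false;
                }
            }

            @Override
            public void characters(char ch[], int start, int length) throws SAXException {

                if (isTest) {
                    text.append(ch, start, length);
                }
            }
        };

        // Passing a File avoids having the path interpreted as a URI.
        saxParser.parse(new File("c:\\file.xml"), handler);
    }
}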

The SAX code is probably more efficient (memory-wise), but here is a snippet of an XPath version, which is probably shorter in terms of lines:

XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xpath = xPathFactory.newXPath();
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = docBuilderFactory.newDocumentBuilder();

/* Loop over files; inside the loop, file is the java.io.File being checked */

Document doc = builder.parse(file);
// evaluate() returns the first text node found under an isTest element, or "" if there is none
boolean matches = "false".equals(xpath.evaluate("//isTest/text()", doc));
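Putting the XPath snippet into the folder loop from the question could look roughly like the sketch below. It also sidesteps three problems in the question's code: `"d:\temp"` contains a tab escape (`\t`) rather than a backslash, `fileList` is declared inside the `if` block and is out of scope in the later loop, and `builder.parse(String)` treats a bare file name as a URI instead of resolving it against the folder. The class name `IsTestScanner` and the method `filesWithIsTestFalse` are hypothetical, and trimming whitespace around the value is an assumption about what counts as 'false'.

```java
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

// Hypothetical helper; the class and method names are illustrative only.
final class IsTestScanner {

    // Returns the sorted names of the .xml files in dir whose first
    // <isTest> element contains "false" (surrounding whitespace ignored).
    static List<String> filesWithIsTestFalse(Path dir) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Compile once and reuse the expression for every file.
        XPathExpression isTest = xpath.compile("string(//isTest)");

        List<String> names = new ArrayList<>();
        // The glob keeps only entries whose file name ends in .xml.
        try (DirectoryStream<Path> xmlFiles = Files.newDirectoryStream(dir, "*.xml")) {
            for (Path file : xmlFiles) {
                Document doc = builder.parse(file.toFile());
                if ("false".equals(isTest.evaluate(doc).trim())) {
                    names.add(file.getFileName().toString());
                }
            }
        }
        // Directory iteration order is unspecified, so sort for stable output.
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Assumed default folder; note the escaped backslash.
        Path dir = Paths.get(args.length > 0 ? args[0] : "d:\\temp");
        System.out.println(filesWithIsTestFalse(dir));
    }
}
```

A malformed file makes `builder.parse` throw and aborts the whole scan here; catching the exception per file inside the loop would be one way to skip bad files instead.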