2017-05-24

OracleWebRowSet writeXml method fails to escape special characters such as the ampersand (&)

OracleWebRowSet has a writeXml(FileWriter) method that converts a result set into an XML file.

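When I use it, it does not escape special characters such as the ampersand (&), so the generated XML file is not well-formed under the XML 1.0 standard.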
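To make the failure concrete, here is a minimal sketch that checks whether a fragment is well-formed using only the JDK's built-in parser. The class name WellFormedCheck and the sample columnValue fragments are illustrative assumptions, not output captured from OracleWebRowSet.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any Oracle or JDK API.
public class WellFormedCheck {

    /** Returns true if the given text parses as well-formed XML. */
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the document, e.g. because of a bare '&'.
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not well-formedness results.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Bare ampersand: rejected by the parser.
        System.out.println(isWellFormed("<columnValue>Tom & Jerry</columnValue>"));     // false
        // Escaped ampersand: accepted.
        System.out.println(isWellFormed("<columnValue>Tom &amp; Jerry</columnValue>")); // true
    }
}
```

Running the same check over the generated files is one way to confirm which of them are affected.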

The default WebRowSet implementation from rt.jar works fine, but I have specific reasons for using OracleWebRowSet.

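I tried StringEscapeUtils.ESCAPE_XML10.translate(), but it cannot be registered as a rule that is applied while the XML is written; it only translates a string on the spot.

For example: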

OracleWebRowSet owrs = new OracleWebRowSet();
FileWriter fWriter = new FileWriter("file1.xml");
// setEscapeProcessing controls JDBC escape syntax in SQL statements, not XML escaping
owrs.setEscapeProcessing(true);
// this is where the result set is converted to XML, but not escaped properly
owrs.writeXml(fWriter);
fWriter.flush();

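I am stuck. I could read each generated XML file back as text, escape the content, and write it back to the file, but that does not seem efficient when processing 700 XML files.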
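For reference, the post-processing idea described above could look roughly like the sketch below. It assumes the files are UTF-8 encoded and that a bare ampersand is the only problem; it does not handle stray '<' characters in values, and it would also rewrite ampersands inside CDATA sections. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;

// Hypothetical helper for the "fix the files afterwards" approach.
public class BareAmpersandFixer {

    // An '&' that does not start a predefined entity or a character reference.
    private static final Pattern BARE_AMP =
            Pattern.compile("&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)");

    /** Replaces every bare '&' with "&amp;" and leaves existing references alone. */
    public static String escapeBareAmpersands(String xml) {
        return BARE_AMP.matcher(xml).replaceAll("&amp;");
    }

    /** Rewrites one file in place, only if something changed (assumes UTF-8). */
    public static void fixFile(Path file) throws IOException {
        String original = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        String fixed = escapeBareAmpersands(original);
        if (!fixed.equals(original)) {
            Files.write(file, fixed.getBytes(StandardCharsets.UTF_8));
        }
    }

    /** Applies fixFile to every *.xml file directly inside the given directory. */
    public static void fixDirectory(Path dir) throws IOException {
        try (DirectoryStream<Path> xmlFiles = Files.newDirectoryStream(dir, "*.xml")) {
            for (Path file : xmlFiles) {
                fixFile(file);
            }
        }
    }
}
```

This means a second full read and write of every file, which is the cost the question is trying to avoid.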

Any solutions? Anyone?

Answer

I found a workaround for this problem, but I am not sure whether it is the right approach.

Here it goes...

Update:

Extend java.io.FileWriter and override the write(String) method:

package customizations.java.io;

import java.io.FileWriter;
import java.io.IOException;
import java.util.regex.Pattern;
import org.apache.commons.lang3.StringEscapeUtils;

/**
 * FileWriter that escapes XML special characters in strings passed to write(String).
 * Only write(String) is intercepted, so this relies on the caller handing markup and
 * column values to the writer through that overload.
 */
public class XMLFileWriter extends FileWriter {
    // Opening and closing <html> tags, matched case-insensitively.
    private static final Pattern HTML_OPEN =
            Pattern.compile("(?i)(.*)<[\\s]*html(.*)>(.*)", Pattern.DOTALL);
    private static final Pattern HTML_CLOSE =
            Pattern.compile("(?i)(.*)<[\\s]*/html[\\s]*>(.*)", Pattern.DOTALL);
    // A tag with text before it, after it, or both, i.e. markup embedded in a value.
    private static final Pattern TAG_INSIDE_TEXT =
            Pattern.compile("(.+)<[^/?](\"[^\"]*\"|'[^']*'|[^'\">])*[^?]>(.+)", Pattern.DOTALL);
    private static final Pattern TAG_BEFORE_TEXT =
            Pattern.compile("^<[^/?](\"[^\"]*\"|'[^']*'|[^'\">])*[^?]>(.+)", Pattern.DOTALL);
    private static final Pattern TAG_AFTER_TEXT =
            Pattern.compile("(.+)<[^/?](\"[^\"]*\"|'[^']*'|[^'\">])*[^?]>$", Pattern.DOTALL);

    public XMLFileWriter(String fileName) throws IOException {
        super(fileName);
    }

    @Override
    public void write(String str) throws IOException {
        boolean opensHtml = HTML_OPEN.matcher(str).find();
        boolean closesHtml = HTML_CLOSE.matcher(str).find();

        // CLOB columns may contain HTML whose tags would violate the WebRowSet XML
        // schema, so open a CDATA section at <html> and close it at </html>.
        if (opensHtml) {
            str = "<![CDATA[" + str;
        }
        if (closesHtml) {
            str = str + "]]>";
        }

        // Escape only strings that look like values; a string that is exactly one
        // tag matches none of the patterns and is written unchanged.
        if (!opensHtml && !closesHtml && needsEscaping(str)) {
            str = StringEscapeUtils.ESCAPE_XML10.translate(str);
        }
        super.write(str);
    }

    private static boolean needsEscaping(String str) {
        // indexOf also catches '&' in multi-line values, which the original
        // str.matches("(.*)&(.*)") missed because '.' does not match line breaks.
        return str.indexOf('&') >= 0
                || TAG_INSIDE_TEXT.matcher(str).find()
                || TAG_BEFORE_TEXT.matcher(str).find()
                || TAG_AFTER_TEXT.matcher(str).find();
    }
}

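So whenever OracleWebRowSet writes a string, our code kicks in and checks whether the text needs escaping. We have to restrict where StringEscapeUtils is applied; otherwise the XML tags themselves would be escaped as well, which breaks the XML file.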
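The difference between escaping only the value and escaping the whole chunk can be seen in the small sketch below. The escapeXmlText method is a hand-written stand-in used so the example runs on the JDK alone; it is not the commons-lang3 implementation, and the columnValue element is only an illustrative sample.

```java
// Hypothetical demo class; escapeXmlText is a simplified stand-in escaper.
public class EscapeScopeDemo {

    /** Escapes the five XML special characters in a piece of text. */
    public static String escapeXmlText(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String value = "Tom & Jerry";

        // Escaping only the value keeps the markup intact.
        System.out.println("<columnValue>" + escapeXmlText(value) + "</columnValue>");
        // prints: <columnValue>Tom &amp; Jerry</columnValue>

        // Escaping the whole chunk turns the tags into plain text as well.
        System.out.println(escapeXmlText("<columnValue>" + value + "</columnValue>"));
        // prints: &lt;columnValue&gt;Tom &amp; Jerry&lt;/columnValue&gt;
    }
}
```

This is why the writer above first decides whether a string looks like a value before translating it.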

The modified calling code looks like this:

OracleWebRowSet owrs = new OracleWebRowSet();
XMLFileWriter fWriter = new XMLFileWriter("file1.xml");
owrs.setEscapeProcessing(true);
// the result set is converted to XML here; XMLFileWriter escapes the text as it is written
owrs.writeXml(fWriter);
fWriter.flush();
fWriter.close();

I hope this helps anyone who stumbles on this issue. If the code can be improved, please post your suggestions.