2013-04-25

I am using StringEscapeUtils to escape and unescape HTML. In the code below, two strings appear to have the same content, yet the equals method returns false.

import org.apache.commons.lang.StringEscapeUtils; 

public class EscapeUtils { 

    public static void main(String args[]) { 

     String string = "    4-Spaces    ,\"Double Quote\", 'Single Quote', \\Back-Slash\\, /Forward Slash/ "; 

     String escaped = StringEscapeUtils.escapeHtml(string); 
     String myEscaped = escapeHtml(string); 

     String unescaped = StringEscapeUtils.unescapeHtml(escaped); 
     String myUnescaped = StringEscapeUtils.unescapeHtml(myEscaped); 

     System.out.println("Real String: " + string); 
     System.out.println(); 
     System.out.println("Escaped String: " + escaped); 
     System.out.println("My Escaped String: " + myEscaped); 
     System.out.println(); 
     System.out.println("Unescaped String: " + unescaped); 
     System.out.println("My Unescaped String: " + myUnescaped); 
     System.out.println(); 
     System.out.println("Comparison:"); 
     System.out.println("Real String == Unescaped String: " + string.equals(unescaped)); 
     System.out.println("Real String == My Unescaped String: " + string.equals(myUnescaped)); 
     System.out.println("Unescaped String == My Unescaped String: " + unescaped.equals(myUnescaped)); 

    } 

    public static String escapeHtml(String s) { 
     String escaped = ""; 
     if(null != s) { 
      escaped = StringEscapeUtils.escapeHtml(s); 
      // Replace a few more characters with their HTML codes
      escaped = escaped.replaceAll(" ","&nbsp;"); 
      escaped = escaped.replaceAll("'","&#39;"); 
      escaped = escaped.replaceAll("\\\\","&#92;"); 
      escaped = escaped.replaceAll("/","&#47;"); 
     } 
     return escaped; 
    } 

} 

Output:

Real String:     4-Spaces    ,"Double Quote", 'Single Quote', \Back-Slash\, /Forward Slash/ 

Escaped String:     4-Spaces    ,&quot;Double Quote&quot;, 'Single Quote', \Back-Slash\, /Forward Slash/ 
My Escaped String: &nbsp;&nbsp;&nbsp;&nbsp;4-Spaces&nbsp;&nbsp;&nbsp;&nbsp;,&quot;Double&nbsp;Quote&quot;,&nbsp;&#39;Single&nbsp;Quote&#39;,&nbsp;&#92;Back-Slash&#92;,&nbsp;&#47;Forward&nbsp;Slash&#47;&nbsp;

Unescaped String:     4-Spaces    ,"Double Quote", 'Single Quote', \Back-Slash\, /Forward Slash/ 
My Unescaped String:     4-Spaces    ,"Double Quote", 'Single Quote', \Back-Slash\, /Forward Slash/  

Comparison: 
Real String == Unescaped String: true 
Real String == My Unescaped String: false 
Unescaped String == My Unescaped String: false 

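escaped is the escaped form of the real string, and unescaped is the result of unescaping it. myEscaped is first escaped by the same process, and then a few more characters are replaced with their HTML codes. myUnescaped is the unescaped content of myEscaped, and it looks identical to the real string.

The output shows that the real string, unescaped, and myUnescaped have the same content. However, as the comparison section shows, myUnescaped is equal to neither string nor unescaped.

I don't understand what is actually happening here. Can anyone explain?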

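Oh, my head is spinning. – muneebShabbir 2013-04-25 06:44:50

Could you debug and inspect the character arrays of the strings, then share what you find? – muneebShabbir 2013-04-25 06:52:47

I don't see the line 'Escaped String == My Escaped String:' in your code. Could you add that comparison to your program? – Patashu 2013-04-25 07:04:04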

Answers

3

This happens because, when escaping the HTML, you replace every space with &nbsp; (see the line marked HERE):

public static String escapeHtml(String s) { 
     String escaped = ""; 
     if(null != s) { 
      escaped = StringEscapeUtils.escapeHtml(s); 
      escaped = escaped.replaceAll(" ","&nbsp;"); // HERE 
      escaped = escaped.replaceAll("'","&#39;"); 
      escaped = escaped.replaceAll("\\\\","&#92;"); 
      escaped = escaped.replaceAll("/","&#47;"); 
     } 
     return escaped; 
    } 

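StringEscapeUtils.escapeHtml itself, on the other hand, does not escape ' '. Here is the example from their website:

"bread" & "butter" 

becomes

&quot;bread&quot; &amp; &quot;butter&quot; 

This means StringEscapeUtils.escapeHtml leaves spaces untouched.

If you remove escaped = escaped.replaceAll(" ","&nbsp;"); from escapeHtml, unescaped and myUnescaped match!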
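To see where two look-alike strings actually diverge, a small helper can report the first differing position. This is only a sketch using the standard library; the class name FirstMismatch and the method firstMismatch are hypothetical and not part of StringEscapeUtils. The second string uses U+00A0, the character that &nbsp; unescapes to.

```java
public class FirstMismatch {

    // Returns the index of the first position where a and b differ,
    // or -1 if the two strings are equal.
    static int firstMismatch(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        for (int i = 0; i < limit; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        // No difference in the common prefix: equal only if lengths match.
        return a.length() == b.length() ? -1 : limit;
    }

    public static void main(String[] args) {
        String plain = "Double Quote";       // ordinary space, U+0020
        String nbsp = "Double\u00A0Quote";   // non-breaking space, U+00A0

        System.out.println(plain.equals(nbsp)); // false
        int i = firstMismatch(plain, nbsp);
        System.out.println(i);                  // 6
        // Prints the code points at the mismatch: U+0020 vs U+00A0
        System.out.printf("U+%04X vs U+%04X%n", (int) plain.charAt(i), (int) nbsp.charAt(i));
    }
}
```

Both strings print identically on a console, which is why the output in the question looks the same even though equals returns false.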

1

Following Apurv's answer, I inspected the bytes of each string.

String:  32, 32, 32, 32, 52, 45, 83, 112, 97, 99, 101, 115, 32, 32, 32, 32, 44, 34, 68, 111, 117, 98, 108, 101, 32, 81, 117, 111, 116, 101, 34, 44, 32, 39, 83, 105, 110, 103, 108, 101, 32, 81, 117, 111, 116, 101, 39, 44, 32, 92, 66, 97, 99, 107, 45, 83, 108, 97, 115, 104, 92, 44, 32, 47, 70, 111, 114, 119, 97, 114, 100, 32, 83, 108, 97, 115, 104, 47, 32 
unescaped : 32, 32, 32, 32, 52, 45, 83, 112, 97, 99, 101, 115, 32, 32, 32, 32, 44, 34, 68, 111, 117, 98, 108, 101, 32, 81, 117, 111, 116, 101, 34, 44, 32, 39, 83, 105, 110, 103, 108, 101, 32, 81, 117, 111, 116, 101, 39, 44, 32, 92, 66, 97, 99, 107, 45, 83, 108, 97, 115, 104, 92, 44, 32, 47, 70, 111, 114, 119, 97, 114, 100, 32, 83, 108, 97, 115, 104, 47, 32 
myUnescaped: -96, -96, -96, -96, 52, 45, 83, 112, 97, 99, 101, 115, -96, -96, -96, -96, 44, 34, 68, 111, 117, 98, 108, 101, -96, 81, 117, 111, 116, 101, 34, 44, -96, 39, 83, 105, 110, 103, 108, 101, -96, 81, 117, 111, 116, 101, 39, 44, -96, 92, 66, 97, 99, 107, 45, 83, 108, 97, 115, 104, 92, 44, -96, 47, 70, 111, 114, 119, 97, 114, 100, -96, 83, 108, 97, 115, 104, 47, -96 

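It appears that in myUnescaped the spaces have been converted to byte -96 instead of 32. Byte -96 is 0xA0, the non-breaking space (U+00A0) that &nbsp; unescapes to. It is not an ASCII character, and it is a different character from the ordinary space (32).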
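A byte dump like the one above can be reproduced with the standard library alone. The sketch below assumes ISO-8859-1 as the encoding, because that is a charset in which U+00A0 becomes the single byte -96; the class name ByteDump and the method latin1Bytes are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ByteDump {

    // Encodes with an explicit charset so the result does not depend
    // on the platform default, then formats the signed byte values.
    static String latin1Bytes(String s) {
        return Arrays.toString(s.getBytes(StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) {
        System.out.println(latin1Bytes(" 4"));      // [32, 52]   ordinary space
        System.out.println(latin1Bytes("\u00A04")); // [-96, 52]  non-breaking space
    }
}
```

With UTF-8 the non-breaking space would instead encode as two bytes (-62, -96), so naming the charset explicitly keeps the dump comparable across machines.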

So I wrote an unescapeHtml method as follows. It first replaces &nbsp; with an ordinary space and then uses StringEscapeUtils to unescape the HTML.

public static String unescapeHtml(String s) { 
    String unescaped = ""; 
    if(null != s) { 
     unescaped = s.replaceAll("&nbsp;", " "); 
     unescaped = StringEscapeUtils.unescapeHtml(unescaped); 
    } 
    return unescaped; 
} 

Then I obtained myUnescaped with the following code.

String myUnescaped = unescapeHtml(myEscaped); 

This gives me a myUnescaped string that is equal to both string and unescaped.
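A related option, not taken from the answer itself, is to unescape first and normalize afterwards by mapping U+00A0 back to an ordinary space. The sketch below uses only the standard library and hard-codes the unescaped text rather than calling StringEscapeUtils; the class name NbspNormalizer and the method normalizeSpaces are hypothetical.

```java
public class NbspNormalizer {

    // Replaces every non-breaking space (U+00A0) with an ordinary space (U+0020).
    // Returns an empty string for null, mirroring the helpers in this thread.
    static String normalizeSpaces(String s) {
        return s == null ? "" : s.replace('\u00A0', ' ');
    }

    public static void main(String[] args) {
        String original = "    4-Spaces    ,";
        // Stand-in for what unescaping "&nbsp;...4-Spaces&nbsp;...," would yield.
        String unescaped = "\u00A0\u00A0\u00A0\u00A04-Spaces\u00A0\u00A0\u00A0\u00A0,";

        System.out.println(original.equals(unescaped));                  // false
        System.out.println(original.equals(normalizeSpaces(unescaped))); // true
    }
}
```

Note that this also converts any non-breaking spaces that were present in the original input, so it only round-trips cleanly when the source text contains none.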

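Alternatively, I replaced spaces with &#32; instead of &nbsp;. This does not require writing an unescapeHtml method. The updated code of the escapeHtml method is as follows.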

public static String escapeHtml(String s) { 
    String escaped = ""; 
    if(null != s) { 
     escaped = StringEscapeUtils.escapeHtml(s); 
     escaped = escaped.replaceAll(" ","&#32;"); //updated line 
     escaped = escaped.replaceAll("'","&#39;"); 
     escaped = escaped.replaceAll("\\\\","&#92;"); 
     escaped = escaped.replaceAll("/","&#47;"); 
    } 
    return escaped; 
}