2010-11-01 65 views
2

Using PrintWriter, I am getting Chinese junk characters in browser

I am using PrintWriter as follows to get the output in the browser:

PrintWriter pw = response.getWriter(); 
StringBuffer sb = getTextFromDatabase(); 
pw.print(sb); 

However, this prints the following Chinese junk characters:

格㸳潃浭湥獴⼼㍨‾瓊扡敬㰾牴戠捧汯牯✽䔣䔷䔷❆㰾摴傾獯整⁤湏›〱㈭ⴷ〲〱ㄠ㨴㌰㔺਱‬祂›教桳慷瑮丠祡歡⠊湹祡歡捀獩潣挮浯਩硅散汬湥㱴琯㹤⼼牴㰾牴戠捧汯牯✽䔣䔷䔷❆㰾摴㰾琯㹤⼼牴㰾牴戠捧汯牯✽䔣䔷䔷❆㰾摴傾獯整湏›〱㈭ⴷ〲〱ㄠ㨴㐰ㄺ਱‬祂›教桳慷瑮丠祡歡⠊湹祡歡捀獩潣挮浯਩敶祲朠潯㱤琯㹤⼼牴㰾牴戠捧汯牯✽䔣䔷䔷❆㰾摴㰾琯㹤⼼牴㰾牴戠捧汯牯✽䔣䔷䔷❆㰾摴傾獯整⁤湏›〱㈭ⴷ〲〱ㄠ㨴㜱㌺ਸ਼‬祂›教桳慷瑮丠祡歡⠊湹祡歡捀獩潣挮浯਩桔獩槧⁳瀦琠獥㱴琯㹤⼼牴㰾琯扡敬㰾牢⼠‾格㸳潐瑳夠畯⁲潃浭湥㱴栯㸳㰠潦浲慍瑣潩㵮䌢浯敭瑮即牥汶瑥•敭桴摯∽敧≴渠浡㵥撟浯敭瑮瀠浲•湯畳浢瑩∽爠瑥牽慖楬慤整瀠浲⤨∻‾瓊扡敬†眠摩桴∽〳 ∰栠楥檜㵴㌢〰㸢ठ瓊㹲瓊㹤氼扡汥映牯∽慮敭㸢潃浭湥㩴猼慰汣獡㵳洢湡呤汃獡≳⨾⼼灳湡㰾氯扡汥㰾牢㸯瓊硥懾敲⁡慮敭∽潣瑮湥≴槧㵤撟浯敭瑮硔䅴敲≡撓慬獳∽整璸牡慥氠牡敧•潣獬∽㠲•潲獷∽∶㸠⼼整璸牡慥㰾琯㹤⼼牴㰾牴㰾摴㰾慬敢潦㵲渢浡≥舉浡㩥猼慰汣獡㵳洢湡呤汃獡≳⨾灳湡㰾氯扡汥㰾牢㸯椼灮瑵槧㵤渢浡≥琠灹㵥琢硥≴渠浡㵥渢浡≥撓慬獳∽慮敭•慶畲㵥∢洠硡敬杮桴∽㔲∵†楳驅∽㘳⼢㰾琯㹤⼼牴㰾牴㰾摴㰾慬敢潦㵲攢憖汩㸢ⵅ慍汩㰺灳湡撓慬獳∽憖摮䍔慬獳㸢㰪猯慰㹮⼼慬敢㹬戼⽲㰾湩異⁴摩∽浥楡≬琠灹㵥琢硥≴渠浡㵥攢憖汩•汣獡㵳攢憖汩•慶畲㵥∢洠硡敬杮桴∽㔲∵†楳驅∽㘳⼢㰾琯㹤⼼牴㰾牴㰾摴㰾湩異⁴琠灹㵥猢扵業≴†慮敭∽潰瑳•慶畲㵥倢獯≴㸯⼼摴㰾琯㹲⼼懾汢㹥⼼潦浲

I tried to use String instead of StringBuffer, but that didn't help. I also tried to set the content type header as follows:

response.setContentType("text/html;charset=UTF-8"); 

before getting the response writer, but that did not help either.

There are no issues with the data in the DB, as I have used the same data for 2 different purposes: in one I get correct output, but in the other I get the above junk. I have used the above code in a JSP using scriptlets, and I have also set the content type for the JSP.

+2

Are the junk characters already in the 'StringBuffer'? Can you paste some sample output? This sounds like an encoding problem. – 2010-11-01 12:32:56

Answers

0

Clearly, you have some kind of encoding problem here, but my guess is it is on the server or database side, not in the browser.

"In the DB there are no issues with the data as I have used the same data for 2 different options, but in one I get correct output and in the other junk."

I don't find that argument convincing. In fact, I think you may be overlooking the real cause of the problem.

What I think you need to do is add some server-side logging to capture what is actually in that StringBuffer that you are sending to the PrintWriter.
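As a rough illustration of that kind of logging, here is a minimal sketch. The class name CharDump and the method dump are hypothetical and not taken from the question. It renders the first few chars of the buffer as hex code units, so that the log shows what the string really contains regardless of how the console or the browser decodes it.

```java
public class CharDump {

    // Hypothetical helper: renders the first `limit` chars of the text
    // as space-separated UTF-16 code units in U+XXXX notation.
    public static String dump(CharSequence text, int limit) {
        StringBuilder out = new StringBuilder();
        int n = Math.min(limit, text.length());
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", (int) text.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the StringBuffer returned by getTextFromDatabase().
        StringBuffer sb = new StringBuffer("<h3>");
        // prints: U+003C U+0068 U+0033 U+003E
        System.out.println(dump(sb, 16));
    }
}
```

If the log shows values such as U+683C where U+003C was expected, the text was already wrong before it reached the PrintWriter.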

Also, look at what is different about the way that the server side handles the "2 different options". (What do you mean by that phrase?)

Finally, please provide some REAL code, not just 3-line snippets that won't compile.

2

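Getting Chinese characters as mojibake indicates a mix-up between UTF-16LE and UTF-8: as the recovery snippet further down shows, bytes that are valid UTF-8 (here plain ASCII HTML) are being decoded as UTF-16LE somewhere. UTF-16LE stores each character of the Basic Multilingual Plane in 2 bytes, so every pair of single-byte ASCII characters is merged into one 16-bit code unit, and those code units usually fall in the CJK (Chinese/Japanese/Korean) ranges.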

To fix this, you need to either show the data as UTF-16LE or have stored the data in the DB as UTF-8 from the beginning. Since you're attempting to display it as UTF-8, I think that your DB has to be reconfigured/converted to use UTF-8 instead of UTF-16LE.


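Unrelated to the concrete problem, storing HTML (which is what these characters originally represent) in a database is really a bad idea ;) Here is the original content: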

<h3>Comments</h3> <table><tr bgcolor='#E7E7EF'><td>Posted On: 10-27-2010 14:03:51 
, By: Yeshwant Nayak 
([email protected]) 
Excellent</td></tr><tr bgcolor='#E7E7EF'><td></td></tr><tr bgcolor='#E7E7EF'><td>Posted On: 10-27-2010 14:04:11 
, By: Yeshwant Nayak 
([email protected]) 
very good</td></tr><tr bgcolor='#E7E7EF'><td></td></tr><tr bgcolor='#E7E7EF'><td>Posted On: 10-27-2010 14:17:36 
, By: Yeshwant Nayak 
([email protected]) 
This is to test</td></tr></table><br /> <h3>Post Your Comment</h3> <form action="CommentsServlet" method="get" name="commentForm" onsubmit=" return ValidateForm();"> <table width="300" height="300"> <tr><td><label for="name">Comment:<span class="mandTClass">*</span></label><br/><textarea name="content" id="commentTxtArea" class="textarea large" cols="28" rows="6" ></textarea></td></tr><tr><td><label for="name">Name:<span class="mandTClass">*</span></label><br/><input id="name" type="text" name="name" class="name" value="" maxlength="255" size="36"/></td></tr><tr><td><label for="email">E-Mail:<span class="mandTClass">*</span></label><br/><input id="email" type="text" name="email" class="email" value="" maxlength="255" size="36"/></td></tr><tr><td><input type="submit" name="post" value="Post"/></td></tr></table></form 

Here is how you can restore the incorrectly encoded Chinese characters to the original text:

String incorrect = "格㸳潃浭湥獴⼼㍨‾琼扡敬㰾牴戠捧汯"; 
String original = new String(incorrect.getBytes("UTF-16LE"), "UTF-8"); 

Note that this should not be used as a solution! It only serves as evidence of the root cause of the problem.
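The recovery snippet above runs the mix-up in reverse. For completeness, here is a minimal sketch of the forward direction, assuming the stored bytes are plain ASCII/UTF-8 HTML that gets decoded as UTF-16LE somewhere along the way. The class and method names (MojibakeDemo, misread, recover) are made up for illustration, and where exactly the wrong decoding happens in the asker's setup is not known.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Encodes the text as UTF-8 and then (wrongly) decodes those bytes
    // as UTF-16LE, so every two bytes become one 16-bit code unit.
    public static String misread(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_16LE);
    }

    // Same idea as the recovery snippet: get the raw bytes back via
    // UTF-16LE and decode them with the encoding they really use.
    public static String recover(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.UTF_16LE);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 18 ASCII characters, so the byte count is even and no
        // trailing byte is left over when pairing them up.
        String original = "<h3>Comments</h3> ";
        String garbled = misread(original);

        // 18 bytes collapse into 9 chars, mostly CJK ideographs.
        System.out.println(garbled.length());
        System.out.println(garbled);
        System.out.println(recover(garbled).equals(original));
    }
}
```

The first two bytes, '<' (0x3C) and 'h' (0x68), combine into the little-endian code unit 0x683C, which is the first character of the junk output in the question.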

+2

+1 because you have a parrot on your head – willcodejavaforfood 2010-11-03 13:38:23