2012-02-04

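How do I ignore HTML tags in XML? I am using Ruby 1.8.7 and receive XML content as a string in an API response. I want to parse this response so that I get the text without the HTML tags: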

<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<response>\n <data>\n <publisher_share_percent>0.0</publisher_share_percent>\n <detailed_description>&lt;b&gt;this is the testing detailed&lt;/b&gt; </detailed_description>\n <title>Only &#163;5.00. food (Regular &#163;50.00/90% discount)</title>\n </data>\n <request_id>ed96dd50-3127-012f-3e93-042b2b8686e6</request_id>\n <message>The resource has been created successfully.</message>\n <status>201</status>\n</response>\n 

Answers

You can use CGI::unescapeHTML to turn the escaped entities back into literal characters:

require 'cgi' 
CGI::unescapeHTML("Usage: foo &quot;bar&quot; &lt;baz&gt;") 
# => "Usage: foo \"bar\" <baz>" 
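
Unescaping alone turns `&lt;b&gt;` back into a literal `<b>` tag, so the markup is still present afterwards. A minimal sketch of one way to drop it, assuming the input is the escaped text of the `detailed_description` element from the question and that the markup is simple enough for a naive regex (the variable names and the regex are illustrative, not part of the original answer):

```ruby
require 'cgi'

# Assumed input: the escaped text content of <detailed_description>.
escaped = "&lt;b&gt;this is the testing detailed&lt;/b&gt; "

# Step 1: turn the entities back into literal markup.
html = CGI.unescapeHTML(escaped)

# Step 2: remove anything that looks like a tag.
# This is a naive pattern; it is only adequate for simple, well-formed markup.
plain = html.gsub(/<[^>]*>/, '').strip

puts html
puts plain
```

For anything beyond trivial markup, a real HTML parser is likely a safer choice than the regex.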

If you treat the XML as what it is, XML, and parse it with an XML parser, the job becomes much easier:

require 'nokogiri' 

xml = <<EOT 
<?xml version="1.0" encoding="UTF-8"?> 
<response> 
    <data> 
    <publisher_share_percent>0.0</publisher_share_percent> 
    <detailed_description>&lt;b&gt;this is the testing detailed&lt;/b&gt; </detailed_description> 
    <title>Only &#163;5.00. food (Regular &#163;50.00/90% discount)</title> 
    </data> 
    <request_id>ed96dd50-3127-012f-3e93-042b2b8686e6</request_id> 
    <message>The resource has been created successfully.</message> 
    <status>201</status> 
    </response> 
EOT 

doc = Nokogiri::XML(xml) 
puts doc.at('detailed_description').text 
puts doc.at('title').text 

Saving the file and running it outputs:

ruby ~/Desktop/test2.rb 
<b>this is the testing detailed</b> 
Only £5.00. food (Regular £50.00/90% discount)