2009-11-23 126 views
2

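Convert an XML collection (of Pivotal Tracker stories) to a Ruby hash/object

I have a collection of stories in XML format. I would like to parse the file and return each story as either a hash or a Ruby object, so that I can process the data further within a Ruby script.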

Does Nokogiri support this, or is there a better tool/library to use?

The XML document has the following structure, as returned by Pivotal Tracker's web API:

<?xml version="1.0" encoding="UTF-8"?> 
<stories type="array" count="145" total="145"> 
    <story> 
    <id type="integer">16376</id> 
    <story_type>feature</story_type> 
    <url>http://www.pivotaltracker.com/story/show/16376</url> 
    <estimate type="integer">2</estimate> 
    <current_state>accepted</current_state> 
    <description>A description</description> 
    <name>Receivable index listing will allow selection viewing</name> 
    <requested_by>Tony Superman</requested_by> 
    <owned_by>Tony Superman</owned_by> 
    <created_at type="datetime">2009/11/04 15:49:43 WST</created_at> 
    <accepted_at type="datetime">2009/11/10 11:06:16 WST</accepted_at> 
    <labels>index ui,receivables</labels> 
    </story> 
    <story> 
    <id type="integer">17427</id> 
    <story_type>feature</story_type> 
    <url>http://www.pivotaltracker.com/story/show/17427</url> 
    <estimate type="integer">3</estimate> 
    <current_state>unscheduled</current_state> 
    <description></description> 
    <name>Validations in wizards based on direction</name> 
    <requested_by>Matthew McBoggle</requested_by> 
    <created_at type="datetime">2009/11/17 15:52:06 WST</created_at> 
    </story> 
    <story> 
    <id type="integer">17426</id> 
    <story_type>feature</story_type> 
    <url>http://www.pivotaltracker.com/story/show/17426</url> 
    <estimate type="integer">2</estimate> 
    <current_state>unscheduled</current_state> 
    <description>Manual payment needs a description field.</description> 
    <name>Add description to manual payment</name> 
    <requested_by>Tony Superman</requested_by> 
    <created_at type="datetime">2009/11/17 15:10:41 WST</created_at> 
    <labels>payment process</labels> 
    </story> 
    <story> 
    <id type="integer">17636</id> 
    <story_type>feature</story_type> 
    <url>http://www.pivotaltracker.com/story/show/17636</url> 
    <estimate type="integer">3</estimate> 
    <current_state>unscheduled</current_state> 
    <description>The SMS and email templates needs to be editable by merchants.</description> 
    <name>Notifications are editable by the merchant</name> 
    <requested_by>Matthew McBoggle</requested_by> 
    <created_at type="datetime">2009/11/19 16:44:08 WST</created_at> 
    </story> 
</stories> 

Answers

5

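You can leverage the Hash extensions in ActiveSupport. Then you only need to parse the document with Nokogiri and convert each node in the resulting node set to a hash. This method preserves the attribute typing (e.g. integers, dates, arrays). (Of course, if you are using Rails you do not have to require/include ActiveSupport, or Nokogiri if you already have it in your environment. I am assuming a pure Ruby implementation here.)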

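require 'nokogiri'
require 'active_support'
require 'active_support/core_ext/hash/conversions' # provides Hash.from_xml on ActiveSupport 3+

# On ActiveSupport 2.x (current when this was written) the equivalent was:
#   require 'activesupport'; include ActiveSupport::CoreExtensions::Hash

doc = Nokogiri::XML.parse(File.read('yourdoc.xml'))
# One hash per <story> node; Hash.from_xml applies the type="..." attributes.
my_hash = doc.search('//story').map { |e| Hash.from_xml(e.to_xml)['story'] }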

This produces an array of hashes (one per story node) and preserves the typing based on the attributes, as shown below:

my_hash.first['name'] 
=> "Receivable index listing will allow selection viewing" 

my_hash.first['id'] 
=> 16376 

my_hash.first['id'].class 
=> Fixnum   # reported as Integer on Ruby 2.4 and later

my_hash.first['created_at'].class 
=> Time 
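
Since the question also asks about Ruby objects, the hashes can be wrapped in a plain `Struct`. The following is only a sketch: `Story`, `STORY_FIELDS` and `story_from_hash` are hypothetical names, and the field list is assumed from the child elements in the sample XML rather than from any Pivotal Tracker specification.

```ruby
# Hypothetical field list, taken from the elements seen in the sample XML.
STORY_FIELDS = %i[id story_type url estimate current_state description name
                  requested_by owned_by created_at accepted_at labels].freeze

# Hypothetical value object; keyword_init lets it be built from a hash.
Story = Struct.new(*STORY_FIELDS, keyword_init: true)

# Build a Story from a string-keyed hash such as one element of my_hash.
# Unknown keys are dropped; fields missing from the hash stay nil.
def story_from_hash(hash)
  known = hash.select { |key, _| STORY_FIELDS.include?(key.to_sym) }
  Story.new(**known.transform_keys(&:to_sym))
end

sample = { 'id' => 17427,
           'name' => 'Validations in wizards based on direction',
           'current_state' => 'unscheduled' }
story = story_from_hash(sample)
puts story.name
puts story.owned_by.inspect
```

A `Struct` gives attribute readers such as `story.name` while remaining easy to convert back with `story.to_h`. Stories without an `owned_by` or `labels` element simply return `nil` for those fields.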

1

I think you can stick with this answer.

A simpler one can be found here.

1

This XML was generated by Rails' ActiveRecord#to_xml method. If you are using Rails, you should be able to parse it with Hash.from_xml.

+0

I am not using Rails in this example. – mlambie 2009-11-23 05:49:51

2

A (sort of) one-line solution would look like this:

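require 'nokogiri'

# str_xml contains your XML
xml = Nokogiri::XML.parse(str_xml)
# For each <story>, collect its child elements (skipping whitespace text nodes) into a hash of name => text.
xml.search('//story').to_a.map{|node| node.children.inject({}){|a,c| a[c.name] = c.text if c.class == Nokogiri::XML::Element; a}}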

This returns an array of hashes:

>> xml.search('//story').to_a.map{|node| node.children.inject({}){|a,c| a[c.name] = c.text if c.class == Nokogiri::XML::Element; a}} 
=> [{"id"=>"16376", "story_type"=>"feature", "url"=>"http://www.pivotaltracker.com/story/show/16376", "estimate"=>"2", "current_state"=>"accepted", "description"=>"A description", "name"=>"Receivable index listing will allow selection viewing", "requested_by"=>"Tony Superman", "owned_by"=>"Tony Superman", "created_at"=>"2009/11/04 15:49:43 WST", "accepted_at"=>"2009/11/10 11:06:16 WST", "labels"=>"index ui,receivables"}, {"id"=>"17427", "story_type"=>"feature", "url"=>"http://www.pivotaltracker.com/story/show/17427", "estimate"=>"3", "current_state"=>"unscheduled", "description"=>"", "name"=>"Validations in wizards based on direction", "requested_by"=>"Matthew McBoggle", "created_at"=>"2009/11/17 15:52:06 WST"}, {"id"=>"17426", "story_type"=>"feature", "url"=>"http://www.pivotaltracker.com/story/show/17426", "estimate"=>"2", "current_state"=>"unscheduled", "description"=>"Manual payment needs a description field.", "name"=>"Add description to manual payment", "requested_by"=>"Tony Superman", "created_at"=>"2009/11/17 15:10:41 WST", "labels"=>"payment process"}, {"id"=>"17636", "story_type"=>"feature", "url"=>"http://www.pivotaltracker.com/story/show/17636", "estimate"=>"3", "current_state"=>"unscheduled", "description"=>"The SMS and email templates needs to be editable by merchants.", "name"=>"Notifications are editable by the merchant", "requested_by"=>"Matthew McBoggle", "created_at"=>"2009/11/19 16:44:08 WST"}] 

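However, this ignores all XML attributes, so every value comes back as a string. Then again, you have not said what should be done with them ;)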
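
If the `type` attributes do matter, the same walk over the child elements can cast values as it goes. The sketch below makes several assumptions: it uses REXML from Ruby's standard distribution instead of Nokogiri so that it runs without extra gems, the helper names `cast_story_value` and `stories_from_xml` are hypothetical, and only `type="integer"` is converted. Datetime values are left as strings, because a zone abbreviation such as `WST` is ambiguous.

```ruby
require 'rexml/document'

# Hypothetical helper: convert one child element's text using its type="..." attribute.
def cast_story_value(element)
  text = element.text.to_s # empty elements such as <description></description> become ""
  case element.attributes['type']
  when 'integer' then text.empty? ? nil : Integer(text, 10)
  else text # datetime and untyped values stay strings in this sketch
  end
end

# Returns an array with one hash per <story> element.
def stories_from_xml(xml)
  doc = REXML::Document.new(xml)
  doc.get_elements('//story').map do |story|
    story.elements.to_a.each_with_object({}) do |child, hash|
      hash[child.name] = cast_story_value(child)
    end
  end
end

sample_xml = <<~XML
  <stories type="array" count="1" total="1">
    <story>
      <id type="integer">17427</id>
      <estimate type="integer">3</estimate>
      <description></description>
      <name>Validations in wizards based on direction</name>
      <created_at type="datetime">2009/11/17 15:52:06 WST</created_at>
    </story>
  </stories>
XML

stories = stories_from_xml(sample_xml)
p stories.first['id']
p stories.first['created_at']
```

With Nokogiri the structure would be the same; only the parsing and traversal calls differ.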

0

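Maybe a Ruby interface to the Pivotal Tracker API would be a better solution for your task; see https://github.com/jsmestad/pivotal-tracker ... You could then fetch the stories as plain Ruby objects, like this (from the docs):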

@a_project = PivotalTracker::Project.find(84739)        
@a_project.stories.all(:label => 'overdue', :story_type => ['bug', 'chore'])