Extracting data (attributes) from an XML file using only R

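I am trying to extract data from an XML file that looks like the one below. I need the id attribute of every node element whose type is 0, and the solution has to be in R. At the moment I can get the type with xmlToList("test.xml")[[3]][[1]] and the id with xmlToList("test.xml")[[3]][[4]]. By changing the 3 to 6, 9 and so on, I can retrieve all the types and ids I need. I am not sure this is the right approach, though, because it relies on positional indexes that will shift if the XML structure changes. Can you suggest another, simpler way to extract the data from the XML, or any improvement to my non-ideal solution? Thanks!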

<?xml version="1.0" encoding="UTF-8"?> 
<image name="test1" id="367432589" width="952" height="1024" create_date="Mar 2, 2009" > 
    <nodes> 
    <node type="16" name="Target532" url="/cgi/im?id=5657" id="5657" x="67" y="45" width="153" height="69"> 
     <alt>Synthesis1</alt> 
     <Appearance TextArea="Rectangle: 550" Comlex="Boolean: true" /> 
    </node> 
    <node type="0" name="Target1" url="/cgi/im?id=680" id="680" x="193" y="535" width="70" height="70"> 
     <alt>Object &lt;b&gt;Target1&lt;TestingCond32</alt> 
     <Appearance TextArea="Rectangle: 210" Comlex="Boolean: false" /> 
    </node> 
    </nodes> 
    <edges> 
    <edge type="-100" id="234523"> 
     <alt /> 
     <Appearance Visualization="String: Hexa" HexagonIndex="Integer: 0" /> 
    </edge> 
    <edge type="-100" id="23"> 
     <alt /> 
     <Appearance Visualization="String: Hexa" HexagonIndex="Integer: 0" /> 
    </edge> 
    </edges> 
</image>

I am new to XML and have only basic knowledge of R. Thanks!

Source

2012-07-07 John Amraph

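If you are new to parsing, I'd suggest looking at this thread on talkstats.com [(LINK)](http://www.talkstats.com/showthread.php/26153-Still-trying-to-learn-to-scrape?highlight=scrape). In that thread I asked a lot of beginner questions, and Bryan Goodrich gave great advice and guidance. I have been meaning to write a blog post on getting started with scraping for a while. – 2012-07-07 22:36:28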

You can try the following:

xpathSApply(xdata,"//*/node[@type=\"0\"]/@id") 

> xpathSApply(xdata,"//*/node[@type=\"0\"]/@id") 
    id 
"680"

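This looks for an element named "node" that has an attribute "type" with the value 0, and then returns the value of the id attribute of that element.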
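The answer does not show how xdata was created. A minimal sketch, assuming it is a document parsed with xmlParse() from the XML package; the inline xml_text string is a shortened copy of the question's file, used here only so the example runs on its own:

```r
library(XML)

# Shortened copy of the XML from the question (assumption: the real data
# lives in "test.xml" and would be parsed with xmlParse("test.xml")).
xml_text <- '<?xml version="1.0" encoding="UTF-8"?>
<image name="test1" id="367432589">
  <nodes>
    <node type="16" name="Target532" id="5657"><alt>Synthesis1</alt></node>
    <node type="0" name="Target1" id="680"><alt>Object</alt></node>
  </nodes>
  <edges>
    <edge type="-100" id="234523"><alt /></edge>
  </edges>
</image>'

# Parse the text into an internal document that XPath queries can run on.
xdata <- xmlParse(xml_text, asText = TRUE)

# Select the id attribute of every <node> element whose type attribute is "0".
ids <- xpathSApply(xdata, "//node[@type='0']/@id")

# Drop the attribute names to get a plain character vector of ids.
ids <- as.character(ids)
print(ids)
```

Because the query matches on the element name and attribute value rather than on position, it should keep working if elements are added or reordered in the file.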

Source

2012-07-07 22:21:07 shhhhimhuntingrabbits

Sorry, should I apply xpathSApply to the XML or to the list? – 2012-07-08 00:10:08
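On the question in the comment above: the XPath functions in the XML package work on a parsed document (or a node from it), not on the list returned by xmlToList(). A small sketch under that assumption, which also shows reading attributes by name with getNodeSet() and xmlGetAttr(); the variable names doc, nodes, info and wanted are illustrative only:

```r
library(XML)

# Tiny stand-in for the question's file (assumption: same element and
# attribute names as in test.xml).
xml_text <- '<image><nodes>
  <node type="16" id="5657"/>
  <node type="0" id="680"/>
</nodes></image>'

# A parsed document, not the output of xmlToList().
doc <- xmlParse(xml_text, asText = TRUE)

# Collect all <node> elements, then read their attributes by name.
nodes <- getNodeSet(doc, "//nodes/node")
info <- data.frame(
  type = sapply(nodes, xmlGetAttr, "type"),
  id   = sapply(nodes, xmlGetAttr, "id"),
  stringsAsFactors = FALSE
)

# Keep only the ids whose type is "0".
wanted <- info$id[info$type == "0"]
print(wanted)
```

This variant keeps type and id side by side, which may be convenient if more than one type has to be inspected later.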

Works like a charm! Thank you very much. Regular expressions are power! – 2012-07-08 00:15:32

http://stackoverflow.com/questions/31423931/extract-data-from-raw-html-in-r Could you help with this question? – 2015-07-15 07:33:56
