2010-07-17

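Building a structure graph of an XML document

I want to build a graph showing which tags are used as children of which other tags in a given XML document.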

I've written this function to get the set of unique child tags for a given tag in an lxml.etree tree:

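from itertools import chain


def iter_unique_child_tags(root, tag):
    """Iterates through unique child tags for all instances of tag.

    Iteration starts at `root`.
    """
    found_child_tags = set()
    # iterdescendants() yields matching elements below `root`, not `root` itself.
    instances = root.iterdescendants(tag)
    # Iterating an element yields its direct children, which replaces the
    # deprecated getchildren() call.
    child_nodes = chain.from_iterable(instances)
    child_tags = (n.tag for n in child_nodes)
    for t in child_tags:
        # Yield each child tag only the first time it is seen.
        if t not in found_child_tags:
            found_child_tags.add(t)
            yield t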
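
The same traversal can be widened to cover every tag in one pass, which is closer to what a graph needs. This is only a minimal sketch: it assumes the standard library's xml.etree.ElementTree instead of lxml so it runs without extra dependencies, and the helper name child_tag_map is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def child_tag_map(root):
    """Map each tag to the set of tags that appear as its direct children.

    Hypothetical helper for illustration; tags without children get no entry.
    """
    mapping = defaultdict(set)
    # root.iter() visits the root element and all of its descendants.
    for parent in root.iter():
        # Iterating an element yields its direct children only.
        for child in parent:
            mapping[parent.tag].add(child.tag)
    return dict(mapping)


if __name__ == "__main__":
    sample = ET.fromstring("<a><b><c/></b><b><d/></b><c/></a>")
    for parent_tag, child_tags in sorted(child_tag_map(sample).items()):
        print(parent_tag, "->", sorted(child_tags))
```

Each key/value pair in the returned dictionary corresponds to a set of parent-to-child edges in the structure graph.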

Is there a general-purpose graph-building function I could use with this to produce a dot file or a graph in some other format?

I'm also starting to suspect that there is a tool designed explicitly for this purpose; what might that be?

Answers


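I ended up using python-graph. I also ended up using argparse to build a command-line interface that pulls some basic information out of XML documents and builds graph images in the formats supported by pydot. It's called xmlearn, and it is useful: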

usage: xmlearn [-h] [-i INFILE] [-p PATH] {graph,dump,tags} ...

optional arguments:
  -h, --help            show this help message and exit
  -i INFILE, --infile INFILE
                        The XML file to learn about. Defaults to stdin.
  -p PATH, --path PATH  An XPath to be applied to various actions.
                        Defaults to the root node.

subcommands:
  {graph,dump,tags}
    dump                Dump xml data according to a set of rules.
    tags                Show information about tags.
    graph               Build a graph from the XML tags relationships.
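
The implementation behind the graph subcommand is not shown here. As a rough illustration of the general idea only, and not xmlearn's actual code, the sketch below collects parent-to-child tag edges and writes Graphviz DOT text by hand. It assumes the standard library's xml.etree.ElementTree in place of lxml, python-graph, or pydot, and the function name tag_graph_to_dot is hypothetical.

```python
import xml.etree.ElementTree as ET


def tag_graph_to_dot(root, name="xml_tags"):
    """Return DOT text with one edge per distinct (parent tag, child tag) pair.

    Hypothetical helper for illustration; not part of xmlearn.
    """
    edges = set()
    # Visit the root and every descendant, recording each parent/child tag pair once.
    for parent in root.iter():
        for child in parent:
            edges.add((parent.tag, child.tag))

    lines = ["digraph %s {" % name]
    # Sort the edges so the output is stable between runs.
    for parent_tag, child_tag in sorted(edges):
        # Quote node names so namespaced tags such as "{uri}tag" stay valid DOT.
        lines.append('    "%s" -> "%s";' % (parent_tag, child_tag))
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = ET.fromstring("<a><b><c/></b><c/></a>")
    print(tag_graph_to_dot(sample))
```

The returned text can be saved to a file and rendered with Graphviz, for example with `dot -Tpng`.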