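Write XML from a list of paths/values

This is a follow-up to a previous question: Write xml with a path and value. I would now like to add two more things: 1) attributes and 2) multiple items under the same parent node. Here is the list of paths: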

[ 
    {'Path': 'Item/Info/Name', 'Value': 'Body HD'}, 
    {'Path': 'Item/Info/Synopsis', 'Value': 'A great movie'}, 
    {'Path': 'Item/Locales/Locale[@Country="US"][@Language="ES"]/Name', 'Value': 'El Grecco'},  
    {'Path': 'Item/Genres/Genre', 'Value': 'Action'}, 
    {'Path': 'Item/Genres/Genre', 'Value': 'Drama'}, 
    {'Path': 'Item/Purchases/Purchase[@Country="US"]/HDPrice', 'Value': '10.99'}, 
    {'Path': 'Item/Purchases/Purchase[@Country="US"]/SDPrice', 'Value': '9.99'}, 
    {'Path': 'Item/Purchases/Purchase[@Country="CA"]/SDPrice', 'Value': '4.99'}, 
]

The XML that should be generated is:

<Item>
    <Info>
        <Name>Body HD</Name>
        <Synopsis>A great movie</Synopsis>
    </Info>
    <Locales>
        <Locale Country="US" Language="ES">
            <Name>El Grecco</Name>
        </Locale>
    </Locales>
    <Genres>
        <Genre>Action</Genre>
        <Genre>Drama</Genre>
    </Genres>
    <Purchases>
        <Purchase Country="US">
            <HDPrice>10.99</HDPrice>
            <SDPrice>9.99</SDPrice>
        </Purchase>
        <Purchase Country="CA">
            <SDPrice>4.99</SDPrice>
        </Purchase>
    </Purchases>
</Item>

How would I build this?


2016-08-16 David542

In 'Item/Locales/Locale[Country="US"]Language=["ES"]/Name', why is 'Country' *inside* the brackets while 'Language' is *outside*?

@Robᵩ Thanks, I updated it. – David542

To build the XML tree from the XPaths and values, I use regular expressions and lxml:

import re 

from lxml import etree

The entries are:

entries = [ 
    {'Path': 'Item/Info/Name', 'Value': 'Body HD'}, 
    {'Path': 'Item/Info/Synopsis', 'Value': 'A great movie'}, 
    {'Path': 'Item/Locales/Locale[@Country="US"][@Language="ES"]/Name', 'Value': 'El Grecco'}, 
    {'Path': 'Item/Genres/Genre', 'Value': 'Action'}, 
    {'Path': 'Item/Genres/Genre', 'Value': 'Drama'}, 
    {'Path': 'Item/Purchases/Purchase[@Country="US"]/HDPrice', 'Value': '10.99'}, 
    {'Path': 'Item/Purchases/Purchase[@Country="US"]/SDPrice', 'Value': '9.99'}, 
    {'Path': 'Item/Purchases/Purchase[@Country="CA"]/SDPrice', 'Value': '4.99'}, 
]

To parse each XPath step, I use the following (very simple) regular expressions:

TAG_REGEX = r"(?P<tag>\w+)" 
CONDITION_REGEX = r"(?P<condition>(?:\[.*?\])*)" 
STEP_REGEX = TAG_REGEX + CONDITION_REGEX 
ATTR_REGEX = r"@(?P<key>\w+)=\"(?P<value>.*?)\"" 

search_step = re.compile(STEP_REGEX, flags=re.DOTALL).search 
findall_attr = re.compile(ATTR_REGEX, flags=re.DOTALL).findall 


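def parse_step(step):
    """Split one XPath step into its tag name and its attribute dict."""
    mo = search_step(step)
    if mo:
        tag = mo.group("tag")
        condition = mo.group("condition")
        return tag, dict(findall_attr(condition))
    # No tag name could be found in this step.
    raise ValueError(step)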

parse_step returns a tag name and a dictionary of attributes.

Then I process every entry the same way to build the XML tree:
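To make the return value concrete, here is a small self-contained sketch that repeats the regular expressions from this answer and calls parse_step on three steps taken from the entries above. The only addition is the three print calls at the end.

```python
import re

TAG_REGEX = r"(?P<tag>\w+)"
CONDITION_REGEX = r"(?P<condition>(?:\[.*?\])*)"
STEP_REGEX = TAG_REGEX + CONDITION_REGEX
ATTR_REGEX = r"@(?P<key>\w+)=\"(?P<value>.*?)\""

search_step = re.compile(STEP_REGEX, flags=re.DOTALL).search
findall_attr = re.compile(ATTR_REGEX, flags=re.DOTALL).findall


def parse_step(step):
    """Split one XPath step into its tag name and its attribute dict."""
    mo = search_step(step)
    if mo:
        return mo.group("tag"), dict(findall_attr(mo.group("condition")))
    raise ValueError(step)


# A plain step has no predicates, so the attribute dict is empty.
print(parse_step("Genre"))
# Each [@key="value"] predicate becomes one dictionary item.
print(parse_step('Locale[@Country="US"][@Language="ES"]'))
# The leading slash of the absolute root step is skipped by search().
print(parse_step("/Item"))
```

Note that the attribute regex only recognizes double-quoted values, which is what the entries in the question use.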

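root = None
for entry in entries:
    path = entry["Path"]
    parts = path.split("/")
    # Make the first step absolute so that it matches the root element itself.
    xpath_list = ["/" + parts[0]] + parts[1:]
    curr = root
    for xpath in xpath_list:
        tag_name, attrs = parse_step(xpath)
        if curr is None:
            # First entry: create the root element.
            root = curr = etree.Element(tag_name, **attrs)
        else:
            nodes = curr.xpath(xpath)
            if nodes:
                # Reuse the first node that matches this step.
                curr = nodes[0]
            else:
                curr = etree.SubElement(curr, tag_name, **attrs)
    if curr.text:
        # The node already holds a value: add a sibling with the same tag and attributes.
        curr = etree.SubElement(curr.getparent(), curr.tag, **curr.attrib)
    curr.text = entry["Value"]

# tostring() returns bytes by default; encoding="unicode" returns a str.
print(etree.tostring(root, pretty_print=True, encoding="unicode"))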

The result is:

<Item>
  <Info>
    <Name>Body HD</Name>
    <Synopsis>A great movie</Synopsis>
  </Info>
  <Locales>
    <Locale Country="US" Language="ES">
      <Name>El Grecco</Name>
    </Locale>
  </Locales>
  <Genres>
    <Genre>Action</Genre>
    <Genre>Drama</Genre>
  </Genres>
  <Purchases>
    <Purchase Country="US">
      <HDPrice>10.99</HDPrice>
      <SDPrice>9.99</SDPrice>
    </Purchase>
    <Purchase Country="CA">
      <SDPrice>4.99</SDPrice>
    </Purchase>
  </Purchases>
</Item>
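One way to double-check a result like this is to query every original Path back out of the generated document and confirm its Value is there. The sketch below does that with the standard library's xml.etree.ElementTree, whose limited XPath support is enough for the attribute predicates used here. The helper name missing_entries is hypothetical, and the sketch assumes the first path step is a plain root tag without predicates.

```python
import xml.etree.ElementTree as ET

RESULT_XML = """<Item>
  <Info><Name>Body HD</Name><Synopsis>A great movie</Synopsis></Info>
  <Locales>
    <Locale Country="US" Language="ES"><Name>El Grecco</Name></Locale>
  </Locales>
  <Genres><Genre>Action</Genre><Genre>Drama</Genre></Genres>
  <Purchases>
    <Purchase Country="US"><HDPrice>10.99</HDPrice><SDPrice>9.99</SDPrice></Purchase>
    <Purchase Country="CA"><SDPrice>4.99</SDPrice></Purchase>
  </Purchases>
</Item>"""

entries = [
    {'Path': 'Item/Info/Name', 'Value': 'Body HD'},
    {'Path': 'Item/Info/Synopsis', 'Value': 'A great movie'},
    {'Path': 'Item/Locales/Locale[@Country="US"][@Language="ES"]/Name', 'Value': 'El Grecco'},
    {'Path': 'Item/Genres/Genre', 'Value': 'Action'},
    {'Path': 'Item/Genres/Genre', 'Value': 'Drama'},
    {'Path': 'Item/Purchases/Purchase[@Country="US"]/HDPrice', 'Value': '10.99'},
    {'Path': 'Item/Purchases/Purchase[@Country="US"]/SDPrice', 'Value': '9.99'},
    {'Path': 'Item/Purchases/Purchase[@Country="CA"]/SDPrice', 'Value': '4.99'},
]


def missing_entries(root, entries):
    """Return the entries whose Path/Value pair cannot be found under root."""
    missing = []
    for entry in entries:
        # Split off the root tag; the rest is a path relative to the root.
        root_tag, _, rest = entry["Path"].partition("/")
        if root.tag != root_tag:
            texts = []
        elif not rest:
            texts = [root.text]
        else:
            texts = [node.text for node in root.findall(rest)]
        if entry["Value"] not in texts:
            missing.append(entry)
    return missing


root = ET.fromstring(RESULT_XML)
print(missing_entries(root, entries))
```

An empty list means every Path/Value pair was found, including both Genre values.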


2016-08-16 21:40:43

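Nice work! The only thing missing is multiple values for Genre. It seems each successive Genre overwrites the previous one. – David542

There is a limitation! In this implementation, when the XPath matches, I take the first node ('curr = nodes[0]') and do not append a new child element. It is hard to guess what to do: append a new element or use the last one? For example, you cannot add a second element at the root level.

If the text already exists, we can duplicate the last element. See the edit.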
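As an illustration of one possible answer to the question raised in these comments, here is a minimal sketch that uses only the standard library (xml.etree.ElementTree in place of lxml). It is not the code from the answer: the names STEP_RE, ATTR_RE and build_tree are hypothetical, the step is matched with fullmatch instead of search, the last matching node is reused rather than the first, and a second root tag is rejected with a ValueError. Like the original, it assumes attribute values never contain a slash.

```python
import re
import xml.etree.ElementTree as ET

STEP_RE = re.compile(r"(?P<tag>\w+)(?P<condition>(?:\[.*?\])*)")
ATTR_RE = re.compile(r'@(?P<key>\w+)="(?P<value>.*?)"')


def parse_step(step):
    """Split one step such as Tag[@a="1"] into ('Tag', {'a': '1'})."""
    mo = STEP_RE.fullmatch(step)
    if mo is None:
        raise ValueError(step)
    return mo.group("tag"), dict(ATTR_RE.findall(mo.group("condition")))


def build_tree(entries):
    """Build an element tree from a list of {'Path': ..., 'Value': ...} dicts."""
    root = None
    for entry in entries:
        steps = entry["Path"].split("/")
        tag, attrs = parse_step(steps[0])
        if root is None:
            root = ET.Element(tag, attrs)
        elif root.tag != tag:
            # A document has exactly one root element.
            raise ValueError("second root element: " + tag)
        parent, curr = None, root
        for step in steps[1:]:
            tag, attrs = parse_step(step)
            matches = curr.findall(step)
            parent = curr
            # Assumption: reuse the last match, otherwise create the node.
            curr = matches[-1] if matches else ET.SubElement(parent, tag, attrs)
        if curr.text and parent is not None:
            # The leaf already has a value: append a sibling instead.
            curr = ET.SubElement(parent, curr.tag, dict(curr.attrib))
        curr.text = entry["Value"]
    return root


sample = [
    {'Path': 'Item/Genres/Genre', 'Value': 'Action'},
    {'Path': 'Item/Genres/Genre', 'Value': 'Drama'},
    {'Path': 'Item/Purchases/Purchase[@Country="US"]/HDPrice', 'Value': '10.99'},
]
print(ET.tostring(build_tree(sample), encoding="unicode"))
```

With this rule, repeated leaf paths such as Item/Genres/Genre produce sibling elements, while intermediate nodes such as Purchase[@Country="US"] are shared between entries. Two separate parent blocks with identical tag and attributes still cannot be expressed by the path list alone.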
