2016-09-18 343 views
5

How can I read Python tuples using PyYAML?

I have the following YAML file named input.yaml:

cities: 
    1: [0,0] 
    2: [4,0] 
    3: [0,4] 
    4: [4,4] 
    5: [2,2] 
    6: [6,2] 
highways: 
    - [1,2] 
    - [1,3] 
    - [1,5] 
    - [2,4] 
    - [3,4] 
    - [5,4] 
start: 1 
end: 4 

I load it with PyYAML and print the result as follows:

import yaml 

# Recent PyYAML releases require an explicit Loader argument for yaml.load().
with open("input.yaml", "r") as f:
    data = yaml.load(f, Loader=yaml.Loader)

print(data) 

The result is the following data structure:

{ 'cities': { 1: [0, 0] 
      , 2: [4, 0] 
      , 3: [0, 4] 
      , 4: [4, 4] 
      , 5: [2, 2] 
      , 6: [6, 2] 
      } 
, 'highways': [ [1, 2] 
       , [1, 3] 
       , [1, 5] 
       , [2, 4] 
       , [3, 4] 
       , [5, 4] 
       ] 
, 'start': 1 
, 'end': 4 
} 

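As you can see, each city and each highway is represented as a list. However, I want them to be represented as tuples. So I convert them to tuples manually using comprehensions: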

import yaml 

# Recent PyYAML releases require an explicit Loader argument for yaml.load().
with open("input.yaml", "r") as f:
    data = yaml.load(f, Loader=yaml.Loader)

data["cities"] = {k: tuple(v) for k, v in data["cities"].items()} 
data["highways"] = [tuple(v) for v in data["highways"]] 

print(data) 

However, this looks like a hack. Is there a way to instruct PyYAML to read them directly as tuples instead of lists?

Answers

5

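I wouldn't call what you did a hack. As far as I understand, your alternative is to use Python-specific tags in your YAML file so that the data is represented appropriately when the file is loaded. However, this requires you to modify your YAML file, which can be very tedious and far from ideal if the file is large.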

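Take a look at the PyYAML documentation, which explains this further. Ultimately, you want to place !!python/tuple in front of each structure that should be represented as a tuple. With your sample data, it would look like this: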

YAML file:

cities: 
    1: !!python/tuple [0,0] 
    2: !!python/tuple [4,0] 
    3: !!python/tuple [0,4] 
    4: !!python/tuple [4,4] 
    5: !!python/tuple [2,2] 
    6: !!python/tuple [6,2] 
highways: 
    - !!python/tuple [1,2] 
    - !!python/tuple [1,3] 
    - !!python/tuple [1,5] 
    - !!python/tuple [2,4] 
    - !!python/tuple [3,4] 
    - !!python/tuple [5,4] 
start: 1 
end: 4 

Sample code:

import yaml 

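# !!python/tuple is a Python-specific tag, so a non-safe Loader is needed;
# recent PyYAML releases also require the Loader argument to be explicit.
with open('y.yaml') as f:
    d = yaml.load(f, Loader=yaml.Loader)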

print(d) 

This will output:

{'cities': {1: (0, 0), 2: (4, 0), 3: (0, 4), 4: (4, 4), 5: (2, 2), 6: (6, 2)}, 'start': 1, 'end': 4, 'highways': [(1, 2), (1, 3), (1, 5), (2, 4), (3, 4), (5, 4)]} 
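One caveat with the tagged file is worth seeing in isolation: the safe loader does not know the !!python/tuple tag. The following is a minimal sketch, assuming PyYAML is installed; the inline document and the variable names (TAGGED, safe_result, unsafe_data) are made up for illustration.

```python
import yaml

# Illustrative one-line document that uses the Python-specific tag.
TAGGED = "point: !!python/tuple [0, 0]\n"

# safe_load() has no constructor registered for the tag, so it raises.
try:
    yaml.safe_load(TAGGED)
    safe_result = "loaded"
except yaml.constructor.ConstructorError:
    safe_result = "rejected"

# A non-safe loader accepts the tag and builds a real tuple.
unsafe_data = yaml.load(TAGGED, Loader=yaml.Loader)

print(safe_result)
print(unsafe_data)
```

This is why the tagged approach is usually only appropriate for YAML files you fully control.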
+1

IMO the biggest problem with the tags is that they force you to use the unsafe `Constructor` via the load() method, and you can no longer use the `SafeConstructor` via `safe_load()`. – Anthon

+0

Indeed, I don't think there is a better way to do what I want. –

1

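Depending on where your YAML input comes from, your "hack" is a good solution, especially if you use yaml.safe_load() instead of the unsafe yaml.load(). If only the "leaf" sequences in your YAML file need to be tuples, you can do¹: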

import pprint 
import ruamel.yaml 
from ruamel.yaml.constructor import SafeConstructor 


def construct_yaml_tuple(self, node): 
    seq = self.construct_sequence(node) 
    # only make "leaf sequences" into tuples, you can add dict 
    # and other types as necessary 
    if seq and isinstance(seq[0], (list, tuple)): 
        return seq
    return tuple(seq) 

SafeConstructor.add_constructor(
    u'tag:yaml.org,2002:seq', 
    construct_yaml_tuple) 

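# safe_load() is the older function-style API; recent ruamel.yaml releases
# favor a ruamel.yaml.YAML(typ='safe') object instead.
with open('input.yaml') as fp:
    data = ruamel.yaml.safe_load(fp)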
pprint.pprint(data, width=24) 

which prints:

{'cities': {1: (0, 0),
            2: (4, 0),
            3: (0, 4),
            4: (4, 4),
            5: (2, 2),
            6: (6, 2)},
 'end': 4,
 'highways': [(1, 2),
              (1, 3),
              (1, 5),
              (2, 4),
              (3, 4),
              (5, 4)],
 'start': 1}

If you then need to process more material in which sequences have to be "normal" lists again, use:

SafeConstructor.add_constructor(
    u'tag:yaml.org,2002:seq', 
    SafeConstructor.construct_yaml_seq) 

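¹ This was done using ruamel.yaml, a YAML 1.2 parser of which I am the author. You should be able to do the same with the older PyYAML if you only ever need to support YAML 1.1 and/or cannot upgrade for some reason.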
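For readers who stay on PyYAML, the same idea can be sketched roughly as follows. This is an illustration under assumptions, not the author's code: TupleSafeLoader and construct_leaf_tuple are hypothetical names, and registering the constructor on a SafeLoader subclass (rather than on SafeLoader itself) is a design choice made here so that plain yaml.safe_load() keeps returning lists.

```python
import yaml


class TupleSafeLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass that builds leaf sequences as tuples."""


def construct_leaf_tuple(loader, node):
    # deep=True makes sure nested containers are fully built before inspection.
    seq = loader.construct_sequence(node, deep=True)
    # Keep a list when any element is itself a container.
    if any(isinstance(item, (list, tuple, dict)) for item in seq):
        return seq
    return tuple(seq)


# Override the standard sequence tag on the subclass only.
TupleSafeLoader.add_constructor("tag:yaml.org,2002:seq", construct_leaf_tuple)

DOC = """
cities:
    1: [0,0]
    2: [4,0]
highways:
    - [1,2]
    - [2,4]
start: 1
end: 2
"""

data = yaml.load(DOC, Loader=TupleSafeLoader)
print(data)
```

Because only the subclass is modified, there is no need to restore the default sequence constructor afterwards.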

+0

Unfortunately, I need `highways` to be a list of tuples rather than a tuple of tuples. Still, using `safe_load` instead of `load` is a good suggestion. Thanks. –

+1

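Oops, I missed that. I fixed it by making the tuple constructor check the first sequence element and not convert to a tuple if that element is a list. You can of course fine-tune this (check all elements, check for dicts, etc.). – Anthon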
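The fine-tuning mentioned in this comment (checking all elements, checking for dicts) can also be done after loading, independent of the YAML library. Below is a minimal pure-Python sketch; leaf_lists_to_tuples is a hypothetical helper name and the sample dict simply mirrors the shape of the loaded data from the question.

```python
def leaf_lists_to_tuples(obj):
    """Recursively turn lists that contain no containers into tuples."""
    if isinstance(obj, dict):
        return {key: leaf_lists_to_tuples(value) for key, value in obj.items()}
    if isinstance(obj, list):
        items = [leaf_lists_to_tuples(value) for value in obj]
        # A list holding any list, tuple or dict stays a list.
        if any(isinstance(value, (list, tuple, dict)) for value in items):
            return items
        # Note: an empty list also ends up as an empty tuple here.
        return tuple(items)
    return obj


loaded = {
    "cities": {1: [0, 0], 2: [4, 0]},
    "highways": [[1, 2], [2, 4]],
    "start": 1,
    "end": 2,
}
converted = leaf_lists_to_tuples(loaded)
print(converted)
```

Unlike the constructor-based variant, this works on whatever safe_load() returns and checks every element rather than only the first.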

0

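I ran into the same problem as in the question, and I was not too happy with either of the two answers. While browsing the PyYAML documentation I found two really interesting methods: yaml.add_constructor and yaml.add_implicit_resolver.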

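The implicit resolver removes the need to tag every entry with !!python/tuple by matching strings against a regular expression. I also wanted to use tuple syntax, that is, write tuple: (10,120) instead of writing a list tuple: [10,120] that then gets converted to a tuple, which I personally found very annoying. I also did not want to install an external library. Here is the code: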

import yaml
import re

# Convert a string written as a tuple into a Python tuple.
def yml_tuple_constructor(loader, node):
    # This little parser is really just for what I needed; feel free to change it!
    def parse_tup_el(el):
        # Try to convert to int or float, otherwise keep the string.
        if el.isdigit():
            return int(el)
        try:
            return float(el)
        except ValueError:
            return el

    value = loader.construct_scalar(node)
    # Remove the surrounding () from the string.
    tup_elements = value[1:-1].split(',')
    # Remove the last element if the tuple was written as (x,b,).
    if tup_elements[-1] == '':
        tup_elements.pop(-1)
    tup = tuple(map(parse_tup_el, tup_elements))
    return tup

# !tuple is my own tag name; I think you could choose anything you want.
yaml.add_constructor(u'!tuple', yml_tuple_constructor)
# Spot the strings written as tuples in the YAML.
yaml.add_implicit_resolver(u'!tuple', re.compile(r"\(([^,\W]{,},){,}[^,\W]*\)"))

Finally, by executing this:

>>> yml = yaml.load("""
... cities:
...     1: (0,0)
...     2: (4,0)
...     3: (0,4)
...     4: (4,4)
...     5: (2,2)
...     6: (6,2)
... highways:
...     - (1,2)
...     - (1,3)
...     - (1,5)
...     - (2,4)
...     - (3,4)
...     - (5,4)
... start: 1
... end: 4""", Loader=yaml.Loader)
>>> yml['cities'] 
{1: (0, 0), 2: (4, 0), 3: (0, 4), 4: (4, 4), 5: (2, 2), 6: (6, 2)} 
>>> yml['highways'] 
[(1, 2), (1, 3), (1, 5), (2, 4), (3, 4), (5, 4)] 

There may be a drawback when using safe_load instead of load with this approach; I did not test that.
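On the untested safe_load point: yaml.add_constructor and yaml.add_implicit_resolver called without a Loader argument do not register anything on SafeLoader, so safe_load() would still return "(0,0)" as a plain string. One possible way around this, sketched under assumptions, is to register the tag on a SafeLoader subclass. ParenTupleLoader, INT_TUPLE and construct_int_tuple are hypothetical names, and the regular expression is a simplified one that only accepts integer tuples, not the pattern from the answer.

```python
import re
import yaml


class ParenTupleLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass that understands (a,b) scalars."""


# Assumption: only integer tuples such as (1,2) or (1,2,) are needed.
INT_TUPLE = re.compile(r"^\(\s*-?\d+(\s*,\s*-?\d+)*\s*,?\s*\)$")


def construct_int_tuple(loader, node):
    text = loader.construct_scalar(node)
    # Drop the parentheses, split on commas, skip an empty trailing piece.
    return tuple(int(part) for part in text.strip()[1:-1].split(",") if part.strip())


ParenTupleLoader.add_constructor("!tuple", construct_int_tuple)
# The last argument lists the first characters that can start a match.
ParenTupleLoader.add_implicit_resolver("!tuple", INT_TUPLE, ["("])

doc = """
cities:
    1: (0,0)
    2: (4,0)
highways:
    - (1,2)
    - (2,4)
start: 1
"""

data = yaml.load(doc, Loader=ParenTupleLoader)
print(data)
```

Since the subclass inherits from SafeLoader, arbitrary Python object tags remain rejected while the custom tuple syntax is recognized.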