2015-06-21 46 views
1

Parsing multi-line script output into a dictionary using regular expressions

I got the following script output:

*************************************************** 
[g4u2680c]: searching for domains 
--------------------------------------------------- 
host = g4u2680c.houston.example.com 
     ipaddr = [16.208.16.72] 
     VLAN = [352] 
     Gateway= [16.208.16.1] 
     Subnet = [255.255.248.0] 
     Subnet = [255.255.248.0] 
     Cluster= [g4u2679c g4u2680c g9u1484c g9u1485c] 

host = g4u2680c.houston.example.com 
     ipaddr = [16.208.16.72] 
     VLAN = [352] 
     Gateway= [16.208.16.1] 
     Subnet = [255.255.248.0] 
     Subnet = [255.255.248.0] 
     Cluster= [g4u2679c g4u2680c g9u1484c g9u1485c] 

* script completed Mon Jun 15 06:13:14 UTC 2015 ** 
* sleeping 30 to avoid DOS on dns via a loop ** 

I need to extract the two host listings into a dictionary, with the brackets removed.

Here is my code:

#!/bin/env python 

import re 

text="""*************************************************** 
[g4u2680c]: searching for domains 
--------------------------------------------------- 
host = g4u2680c.houston.example.com 
     ipaddr = [16.208.16.72] 
     VLAN = [352] 
     Gateway= [16.208.16.1] 
     Subnet = [255.255.248.0] 
     Subnet = [255.255.248.0] 
     Cluster= [g4u2679c g4u2680c g9u1484c g9u1485c] 

host = g4u2680c.houston.example.com 
     ipaddr = [16.208.16.72] 
     VLAN = [352] 
     Gateway= [16.208.16.1] 
     Subnet = [255.255.248.0] 
     Subnet = [255.255.248.0] 
     Cluster= [g4u2679c g4u2680c g9u1484c g9u1485c] 

* script completed Mon Jun 15 06:13:14 UTC 2015 ** 
* sleeping 30 to avoid DOS on dns via a loop ** 
*************************************************** 
""" 

seq = re.compile(r"host.+?\n\n",re.DOTALL) 

a=seq.findall(text) 

matches = re.findall(r'\w.+=.+', a[0]) 

matches = [m.split('=', 1) for m in matches] 

matches = [ [m[0].strip().lower(), m[1].strip().lower()] for m in matches] 

#should have function with regular expression to remove bracket here 

d = dict(matches) 

print d 

This is what I have so far for the first host:

{'subnet': '[255.255.248.0]', 'vlan': '[352]', 'ipaddr': '[16.208.16.72]', 'cluster': '[g4u2679c g4u2680c g9u1484c g9u1485c]', 'host': 'g4u2680c.houston.example.com', 'gateway': '[16.208.16.1]'} 

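I need help finding a regular expression that removes the brackets, since the dictionary values contain data both with and without brackets.

Or perhaps there is a better and simpler way to convert the raw script output into a dictionary.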

+0

Check my answer.. –

Answers

1

You can use: (\w+)\s*=\s*\[?([^\n\]]+)\]?

demo

import re 
# r'' instead of ur'': the ur prefix is a syntax error on Python 3
p = re.compile(r'(\w+)\s*=\s*\[?([^\n\]]+)\]?', re.MULTILINE) 
test_str = u"host = g4u2680c.houston.example.com\n   ipaddr = [16.208.16.72]\n   VLAN = [352]\n   Gateway= [16.208.16.1]\n   Subnet = [255.255.248.0]\n   Subnet = [255.255.248.0]\n   Cluster= [g4u2679c g4u2680c g9u1484c g9u1485c]\n\nhost = g4u2680c.houston.example.com\n   ipaddr = [16.208.16.72]\n   VLAN = [352]\n   Gateway= [16.208.16.1]\n   Subnet = [255.255.248.0]\n   Subnet = [255.255.248.0]\n   Cluster= [g4u2679c g4u2680c g9u1484c g9u1485c]\n" 

re.findall(p, test_str) 
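
Since the question asks for a dictionary per host, here is a minimal sketch of one way to group the (key, value) pairs that this pattern returns. The helper name `parse_hosts` and the shortened `sample` text are made up for illustration, and the sketch assumes every host block begins with a `host = ...` line.

```python
import re

# Pattern from the answer: a key, '=', then a value with optional brackets.
PAIR = re.compile(r'(\w+)\s*=\s*\[?([^\n\]]+)\]?')

def parse_hosts(output):
    """Return one dict per host block (hypothetical helper)."""
    hosts = []
    for key, value in PAIR.findall(output):
        key = key.lower()
        if key == 'host':
            hosts.append({})  # a 'host' line opens a new block
        if hosts:             # ignore pairs seen before the first host
            hosts[-1][key] = value.strip()
    return hosts

# Shortened version of the script output from the question.
sample = (
    "[g4u2680c]: searching for domains\n"
    "---------------------------------------------------\n"
    "host = g4u2680c.houston.example.com\n"
    "     ipaddr = [16.208.16.72]\n"
    "     VLAN = [352]\n"
    "\n"
    "host = g4u2680c.houston.example.com\n"
    "     ipaddr = [16.208.16.72]\n"
    "     Cluster= [g4u2679c g4u2680c]\n"
    "\n"
    "* script completed Mon Jun 15 06:13:14 UTC 2015 **\n"
)

hosts = parse_hosts(sample)
print(hosts)
```

Banner and trailer lines contain no `=`, so they never match. Repeated keys inside one block (the duplicated Subnet line in the real output) simply overwrite each other.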
+0

Nice. And thanks for the demo site :) –

+0

OK, thanks as well. If this helped you, please accept the answer. –

1

You can simply use re.findall and dict:

>>> dict([(i,j.strip('[]')) for i,j in re.findall(r'(\w+)\s*=\s*(.+)',text)]) 
{'Subnet': '255.255.248.0', 'VLAN': '352', 'ipaddr': '16.208.16.72', 'Cluster': 'g4u2679c g4u2680c g9u1484c g9u1485c', 'host': 'g4u2680c.houston.example.com', 'Gateway': '16.208.16.1'} 

The brackets are removed with the str.strip method.
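
One thing to watch with str.strip: it only removes characters from the two ends of the string, so a trailing space after the closing bracket stops it early. A small sketch, assuming the script output may carry trailing whitespace (the `raw` text and host names below are invented for the example), that adds space and tab to the characters being stripped:

```python
import re

# Invented sample; note the trailing spaces on the first two lines.
raw = "host = a.example.com \n   VLAN = [352] \n   Cluster= [n1 n2]\n"

pairs = re.findall(r'(\w+)\s*=\s*(.+)', raw)

# Stripping only '[]' leaves the closing bracket behind a trailing space.
brackets_only = dict((k, v.strip('[]')) for k, v in pairs)

# Stripping whitespace and brackets together handles both cases.
cleaned = dict((k, v.strip(' \t[]')) for k, v in pairs)

print(brackets_only['VLAN'])  # '352] ' (bracket survives)
print(cleaned)
```

Note also that a single dict built from the whole output keeps only the last value for each repeated key, so with two host blocks only the second one survives.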

+0

Your solution doesn't match the hostname, since the hostname has no brackets. –

+0

@SharuzzamanAhmatRaslan If you want the hostname as well, you can loop over the result of 're.findall()' and remove the brackets with 'str.strip'. – Kasramvd

+1

I wish I could accept multiple answers, because your answer is interesting too –

0

You can try this.

matches = [m.replace('[','').replace(']','').split('=', 1) for m in matches]
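
For context, here is roughly how that line would slot into the code from the question. This is only a sketch on a shortened `block` string (an assumption standing in for `a[0]` in the original script):

```python
import re

# Stand-in for a[0] in the question: the text of one host block.
block = (
    "host = g4u2680c.houston.example.com\n"
    "     ipaddr = [16.208.16.72]\n"
    "     Cluster= [g4u2679c g4u2680c g9u1484c g9u1485c]\n"
)

matches = re.findall(r'\w.+=.+', block)

# Drop every '[' and ']' first, then split each line on the first '='.
pairs = [m.replace('[', '').replace(']', '').split('=', 1) for m in matches]

# Same normalisation as the question: trim whitespace and lowercase.
d = {key.strip().lower(): value.strip().lower() for key, value in pairs}

print(d)
```

Unlike str.strip, replace removes brackets anywhere in the line, including the middle of a value, which is fine for this output but worth knowing if values can legitimately contain brackets.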