1
解析我有包含XML信封(2種類型的XML結構的:請求和響應)的日誌文件。我需要做的是分析此文件,提取XML-S並把它們放到2個數組作爲字符串(請求和響應的第2個數組第1個數組),所以我可以在以後解析。如何從日誌文件提取XML在python
任何想法如何我可以在Python實現這一目標?日誌文件的
片段被解析(日誌包含):
2014-10-31 12:27:33,600 INFO Recharger_MTelemedia2Channel [mbpa.module.mgw.mtelemedia.mtbilling.MTSender][] Sending BILL request
2014-10-31 12:27:33,601 INFO Recharger_MTelemedia2Channel [mbpa.module.mgw.mtelemedia.mtbilling.MTSender][] <?xml version="1.0" encoding="UTF-8"?>
<request xmlns="XXX" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<transactionheader>
<username>XXX</username>
<password>XXX</password>
<time>31/10/2014 12:27:33</time>
<clientreferencenumber>123</clientreferencenumber>
<numberrequests>3</numberrequests>
<information>Description</information>
<postbackurl>http://localhost/status</postbackurl>
</transactionheader>
<transactiondetails>
<items>
<item id="1" client="XXX1" keyword="test"/>
<item id="2" client="XXX2" keyword="test"/>
<item id="3" client="XXX3" keyword="test"/>
</items>
</transactiondetails>
</request>
2014-10-31 12:27:34,487 INFO Recharger_MTelemedia2Channel [mbpa.module.mgw.mtelemedia.mtbilling.MTSender][] Response code 200 for bill request
2014-10-31 12:27:34,489 INFO Recharger_MTelemedia2Channel [mbpa.module.mgw.mtelemedia.mtbilling.MTSender][] <?xml version="1.0" encoding="UTF-8"?>
<response xmlns="XXX" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<serverreferencenumber>XXX123XXX</serverreferencenumber>
<clientreferencenumber>123</clientreferencenumber>
<information>Queued for Processing</information>
<status>OK</status>
</response>
的答覆非常感謝!
問候, 羅伯特
使用像etree解析器:https://docs.python.org/2/library/xml.etree.elementtree.html – Paco 2014-10-31 10:36:31
使用正則表達式identifiing他們後刪除非XML線,然後一個標準的解析器會做。除了XML解析(beautifulsoup對我很好)我會把請求和響應放在字典中,請求作爲關鍵字。 – 2014-10-31 10:43:42