Using Python to extract strings from a file and tidy them up


"": "<a href=\"#\" class=\"tree-title\" title=\"IP: 10.0.0.1\nHostname: hello1\nModel: 2901\nVersion: 1.1.1.1_80000\nState: Normal\">hello1(10.0.0.1)</a>" 
    }, 
    { 
    "": "<a href=\"#\" class=\"tree-title\" title=\"IP: 10.0.0.2\nHostname: hello2\nModel: 2911\nVersion: 1.1.1.1_80000\nState: Normal\">hello2 (10.0.0.2)</a>" 
    }, 
    { 
    "": "<a href=\"#\" class=\"tree-title\" title=\"IP: 10.0.0.3\nHostname: hello3\nModel: 2911\nVersion: 1.1.1.1_80000\nState: Normal\">hello3(10.0.0.3)</a>" 
    }, 
    {

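This is not a proper structure, because it was scraped and dumped into a text file. There are more than 100 segments like these. Despite how it looks, the page is not just HTML, so I cannot simply extract the data as a structured form.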

Now I want to use Python to extract an ordered list of the hostname, model number and IP address.

So the new lines would look like this:

hostname: hello1  Model No: 2901  IP address: 10.0.0.1
hostname: hello2  Model No: 2911  IP address: 10.0.0.2
hostname: hello3  Model No: 2911  IP address: 10.0.0.3

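But I am struggling to work out how to do this: extract the necessary information from the first segment, then from the next one, and so on.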

Any suggestions would be much appreciated.

Source

2016-11-18 whatis.python

Try writing some code. Give it an attempt. –

I think you need the magic of regular expressions. [The re module in Python](https://docs.python.org/2/library/re.html) – Ezio
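
A minimal sketch of the regex approach suggested in the comment above. It assumes the dump stores `\n` and `\"` as literal backslash sequences, exactly as they appear in the question, and that every segment lists IP, Hostname and Model in that order. The names `RECORD`, `extract_devices` and `format_line` are hypothetical and chosen only for illustration.

```python
import re

# Sample copied from the question. A raw string keeps \" and \n as literal
# backslash sequences, which is how they appear in the dumped text file.
SAMPLE = r'''
    "": "<a href=\"#\" class=\"tree-title\" title=\"IP: 10.0.0.1\nHostname: hello1\nModel: 2901\nVersion: 1.1.1.1_80000\nState: Normal\">hello1(10.0.0.1)</a>"
    },
    {
    "": "<a href=\"#\" class=\"tree-title\" title=\"IP: 10.0.0.2\nHostname: hello2\nModel: 2911\nVersion: 1.1.1.1_80000\nState: Normal\">hello2 (10.0.0.2)</a>"
    },
    {
    "": "<a href=\"#\" class=\"tree-title\" title=\"IP: 10.0.0.3\nHostname: hello3\nModel: 2911\nVersion: 1.1.1.1_80000\nState: Normal\">hello3(10.0.0.3)</a>"
    },
'''

# Assumption: fields are separated either by a literal "\n" (backslash + n)
# or by real whitespace, and field values contain neither.
RECORD = re.compile(
    r'IP:\s*(?P<ip>[\d.]+)'
    r'(?:\\n|\s)+Hostname:\s*(?P<hostname>[^\\\s]+)'
    r'(?:\\n|\s)+Model:\s*(?P<model>[^\\\s]+)'
)


def extract_devices(text):
    """Return (hostname, model, ip) tuples in the order they appear in text."""
    return [
        (m.group('hostname'), m.group('model'), m.group('ip'))
        for m in RECORD.finditer(text)
    ]


def format_line(hostname, model, ip):
    """Format one device in the layout requested in the question."""
    return 'hostname: {}  Model No: {}  IP address: {}'.format(hostname, model, ip)


if __name__ == '__main__':
    for device in extract_devices(SAMPLE):
        print(format_line(*device))
```

To run this against the real dump, read the file into a string (for example from a hypothetical `devices.txt`) and pass it to `extract_devices` instead of `SAMPLE`. Because `finditer` scans the whole text, it does not matter that the surrounding braces are not valid JSON.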