Need to read data from inside parentheses in a text file with Python

My text file has lines like the one below. Lines of this type occur many times in the file.

[Nov 22 22:27:13] INFO - [com.macys.seo.business.impl.LinkBusinessImpl] - Executing Search(WS) Gateway request: KeywordVO (keyword = GUESS score = 83965 normalizedKeyword = GUESS productIds = [] categoryIds = [] hotListed = false blackListed = false globalHotList = false url = /buy/GUESS)

I want to extract only the following data into a file, like this:

keyword = GUESS, score = 83965, hotListed = false, globalHotList = false url = /buy/GUESS

This is what I have so far:

def get_sentences(filename):
    # Collect the lines found between a line containing '(' and one containing ')'.
    results = []
    with open(filename) as file_contents:  # e.g. r'd:\log.log'
        d1, d2 = '(', ')'  # just example delimiters
        for line in file_contents:
            if d1 in line:
                results = []
            elif d2 in line:
                yield results
            else:
                results.append(line)
    print(results)

Please advise.

Source

2011-11-27 newcane

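Does the solution have to be in Python? Which part are you stuck on? – Johnsyweb

I need to use Python only. My code snippet is def get_sentences(filename): with open('d:\log.log') as file_contents: d1, d2 = '(', ')' # just example delimiters for line in file_contents: if d1 in line: results = [] elif d2 in line: yield results else: results.append(line) print results – newcane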

Regular expressions can help you do the parsing in a single pass:

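import re, pprint

# Read the whole log at once; the raw string keeps the backslash in the path literal.
with open(r'd:\log.log') as f:
    s = f.read()
# Capture the text inside the parentheses that follow "KeywordVO ".
results = re.findall(r'KeywordVO \((.*?)\)', s)
pprint.pprint(results)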

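The regex above uses KeywordVO to identify which parentheses are the relevant ones (I'm guessing you don't want the (WS) part of the sample text to match). You may need to look closely at the log file to work out the exact regex that extracts the data you need.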
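To illustrate why the KeywordVO prefix matters, here is a small sketch run against a single sample line. The sample is reconstructed from the question, so the exact wording of real log lines is an assumption.

```python
import re

# Sample line reconstructed from the question; real log lines may differ.
line = ("[Nov 22 22:27:13] INFO - [com.macys.seo.business.impl.LinkBusinessImpl] - "
        "Executing Search(WS) Gateway request: KeywordVO (keyword = GUESS score = 83965 "
        "normalizedKeyword = GUESS productIds = [] categoryIds = [] hotListed = false "
        "blackListed = false globalHotList = false url = /buy/GUESS)")

# Any parenthesised group: this also picks up the unwanted "WS".
any_parens = re.findall(r'\((.*?)\)', line)

# Anchored on the KeywordVO prefix: only the key/value payload is captured.
keyword_vo = re.findall(r'KeywordVO \((.*?)\)', line)

print(len(any_parens), len(keyword_vo))
```

The unanchored pattern returns two groups for this line, while the anchored one returns only the payload that starts with keyword = GUESS.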

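Once you have the long text string with all of the key/value pairs, use another regular expression to split them: r'[A-Za-z]+\s*=\s*[A-Za-z\[\]\,]'. This regex is tricky because you want to capture complex values on the right-hand side of the equals sign without accidentally capturing the next key (unfortunately, the key/value pairs are not separated by commas or any other delimiter). As written, the value character class matches a single character and excludes digits and /, so treat it as a starting point rather than a finished pattern.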
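One possible way to stop a value from swallowing the next key is a lookahead that ends the value just before the next "key =" or at the end of the string. This is a sketch of that idea, not the pattern given in the answer. The helper name split_pairs and the sample payload are assumptions made for illustration.

```python
import re

# Key, "=", then a lazy value that stops right before the next "key =" or at the end.
PAIR_RE = re.compile(r'(\w+)\s*=\s*(.*?)(?=\s+\w+\s*=|$)')

def split_pairs(payload):
    """Return a dict of key/value strings from an undelimited 'k = v k = v' payload."""
    return dict(PAIR_RE.findall(payload))

# Payload as it would come out of the KeywordVO (...) capture (assumed sample).
payload = ("keyword = GUESS score = 83965 normalizedKeyword = GUESS productIds = [] "
           "categoryIds = [] hotListed = false blackListed = false "
           "globalHotList = false url = /buy/GUESS")

fields = split_pairs(payload)
print(fields['keyword'], fields['score'], fields['url'])
```

Because the value is matched lazily up to the lookahead, a multi-word value such as keyword = GUESS JEANS would stay in one piece, as long as the value itself contains no "word =" sequence.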

Good luck with the parsing :-)

Source

2011-11-27 04:42:29

You can use a regular expression:

>>> re.findall(r'\w+ = \S+', the_text) 
['keyword = GUESS', 'score = 83965', 'normalizedKeyword = GUESS', 
'productIds = []', 'categoryIds = []', 'hotListed = false', 
'blackListed = false', 'globalHotList = false', 'url = /buy/GUESS']

Then you can split on = and grab the ones you need.

Something like:

>>> data = re.findall(r'\w+ = \S+', the_text) 
>>> ok = ('keyword', 'score', 'hotListed', 'url') 
>>> [i for i in [i.split(' = ') for i in data] if i[0] in ok]
[['keyword', 'GUESS'], ['score', '83965'], ['hotListed', 'false'], ['url', '/buy/GUESS']]
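Putting the pieces together, the following is a minimal sketch of going from log lines to the output file described in the question. It assumes the pairs are always written as "key = value" with single spaces and that values contain no whitespace. The names extract_records and write_records, the WANTED tuple and both file paths are hypothetical. It first captures the text inside KeywordVO (...) so that the closing parenthesis does not end up in the last value, and it splits on ' = ' (with the spaces), because splitting on a bare '=' leaves a trailing space on each key and the membership test then fails.

```python
import re

# Fields to keep, taken from the desired output in the question (an assumption).
WANTED = ('keyword', 'score', 'hotListed', 'globalHotList', 'url')

def extract_records(lines, wanted=WANTED):
    """Yield one 'key = value, ...' string per KeywordVO entry found in lines."""
    for line in lines:
        match = re.search(r'KeywordVO \((.*?)\)', line)
        if match is None:
            continue  # not a KeywordVO line
        items = re.findall(r'\w+ = \S+', match.group(1))
        pairs = (item.split(' = ', 1) for item in items)
        yield ', '.join(f'{key} = {value}' for key, value in pairs if key in wanted)

def write_records(log_path, out_path):
    # Both paths are placeholders; an open file iterates line by line.
    with open(log_path) as log, open(out_path, 'w') as out:
        for record in extract_records(log):
            out.write(record + '\n')
```

A call such as write_records(r'd:\log.log', r'd:\keywords.txt') would then write one line per KeywordVO entry; the output file name here is only an example.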

Source

2011-11-27 04:44:07 JBernardo

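I'm stuck on using the regex to split the pairs and grab the ones I need, please help – newcane

@Ryder I edited the answer – JBernardo

import re, pprint with open('d:\log.log') as f: s = f.read() results = re.findall(r'\w+ = \S+', s) This prints all the key/value pairs, but I only need the ones defined in ok: ok = ('keyword', 'score', 'hotListed', 'url') [i for i in [i.split('=') for i in results] if i[0] in ok] pprint.pprint(results) – newcane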
