Python: Using a regular expression to get specific text from a line in a file

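I'm searching a text log file line by line with Python, and I want to save a certain part of each line as a variable. I'm using a regular expression, but I don't think I'm using it correctly, because I always get None for my variable string_I_want. I looked at other regex questions here and saw people adding .group() to the end of their re.search call, but that gives me an error. I'm not very familiar with regex, and I can't figure out where I'm going wrong.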

Log file sample:

2016-03-08 11:23:25 test_data:0317: m=string_I_want max_count: 17655, avg_size: 320, avg_rate: 165

My script:

def get_data(log_file):

    # Read file line by line
    with open(log_file) as f:
        f = f.readlines()

        for line in f:
            date = line[0:10]
            time = line[11:19]

            string_I_want = re.search(r'/m=\w*/g', line)

            print date, time, string_I_want

2016-05-16 Catherine

The regex is wrong: you are using the JavaScript regex format. – rock321987

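Don't just guess at what those `re` functions and methods do: read the "Regular Expression HOWTO" (https://docs.python.org/2/howto/regex.html) for a comprehensive introduction to using regular expressions in Python 2, and refer to the `re` reference documentation (https://docs.python.org/2/library/re.html) when you need to look up details. It will save you time in the long run.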

You need to remove the /.../ delimiters along with the global flag, and use a capturing group:

mObj = re.search(r'm=(\w+)',line) 
if mObj: 
    string_I_want = mObj.group(1)
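
To see why the original call returned None, and why appending .group() then raised an error, here is a minimal sketch. It uses the sample log line from the question; the variable names js_style, fixed, and value are only illustrative.

```python
import re

line = ("2016-03-08 11:23:25 test_data:0317: m=string_I_want "
        "max_count: 17655, avg_size: 320, avg_rate: 165")

# In Python the slashes and the trailing "g" are ordinary characters, so this
# pattern looks for the literal text "/m=.../g", which the line does not contain.
js_style = re.search(r'/m=\w*/g', line)
print(js_style)  # None; calling js_style.group() would raise AttributeError

# Without the delimiters, and with a capturing group around the value:
fixed = re.search(r'm=(\w+)', line)
value = fixed.group(1) if fixed else None
print(value)
```

Checking the match object before calling .group(1) keeps the loop from failing on lines that have no m= field.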

See the regex demo and this Python demo:

import re 
p = r'm=(\w+)'    # Init the regex with a raw string literal (so, no need to use \\w, just \w is enough) 
s = "2016-03-08 11:23:25 test_data:0317: m=string_I_want max_count: 17655, avg_size: 320, avg_rate: 165" 
mObj = re.search(p, s)  # Execute a regex-based search 
if mObj:     # Check if we got a match 
    print(mObj.group(1)) # DEMO: Print the Group 1 value

Pattern details:

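m= - matches the literal character sequence m= (add a space or \b before it if it must match as a whole word)
(\w+) - Group 1, capturing one or more alphanumeric or underscore characters. We can use the .group(1) method to refer to this value.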
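
The note about \b can be illustrated with a short sketch. The input string below is hypothetical (it is not from the question's log) and was chosen so that another field name happens to end in "m".

```python
import re

# Hypothetical line in which another field name happens to end in "m"
s = "num=5 m=string_I_want"

loose = re.search(r'm=(\w+)', s)      # also matches the "m=" inside "num="
strict = re.search(r'\bm=(\w+)', s)   # \b requires "m" to start a word

print(loose.group(1))
print(strict.group(1))
```

Here the pattern without \b picks up the value of num, because re.search returns the leftmost match, while the pattern with \b skips it and reaches the standalone m= field.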

2016-05-16 10:21:30

Use:

(?<=\sm=)\S+

Example:

In [135]: s = '2016-03-08 11:23:25 test_data:0317: m=string_I_want max_count: 17655, avg_size: 320, avg_rate: 165' 

In [136]: re.search(r'(?<=\sm=)\S+', s).group() 
Out[136]: 'string_I_want'
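
As a rough comparison, the sketch below shows how \S+ differs from \w+ after the lookbehind, and what happens on a line without the field. The hyphenated value and the second input string are made up for illustration.

```python
import re

# Hypothetical value containing a hyphen, to show the difference
s = "test_data:0317: m=foo-bar max_count: 17655"

with_w = re.search(r'(?<=\sm=)\w+', s)   # stops at the first non-word character
with_S = re.search(r'(?<=\sm=)\S+', s)   # runs up to the next whitespace
print(with_w.group())
print(with_S.group())

# On a line without " m=", re.search returns None, so chaining .group()
# directly onto the call would raise AttributeError.
no_match = re.search(r'(?<=\sm=)\S+', "a line without the field")
print(no_match)
```

Because the lookbehind is not part of the match, .group() with no argument already returns just the value; no capturing group is needed.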

2016-05-16 10:22:47 heemayl

Here is what you need:

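import re

def get_data(logfile):
    # The with block closes the file even if an error occurs
    with open(logfile, "r") as f:
        for line in f:
            # re.search returns None when a line has no m= field,
            # so check the match before calling .group()
            match = re.search(r'(?<=\sm=)\S+', line)
            if match:
                print(match.group())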
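
If the date and time from the question's script are wanted as well, one possible arrangement is sketched below. The names M_FIELD and parse_lines, the named group value, and the second sample line are assumptions made for this illustration; the fixed-width slices are taken from the question.

```python
import re

# Compile once; the named group "value" is an illustrative choice
M_FIELD = re.compile(r'\bm=(?P<value>\S+)')

def parse_lines(lines):
    """Yield (date, time, value) for each line that has an m= field."""
    for line in lines:
        match = M_FIELD.search(line)
        if match is None:
            continue  # skip lines without the field
        # Fixed-width slices, as in the question's script
        yield line[0:10], line[11:19], match.group('value')

sample = [
    "2016-03-08 11:23:25 test_data:0317: m=string_I_want max_count: 17655, avg_size: 320, avg_rate: 165",
    "2016-03-08 11:23:26 a line with no matching field",
]
records = list(parse_lines(sample))
print(records)
```

Since parse_lines accepts any iterable of strings, an open file object can be passed in directly, for example inside a with open(logfile) as f: block.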

2016-05-16 10:33:49
