2017-08-30 58 views
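Problem: I am parsing a log written by a service I installed, which uses a custom date field. I want to match each log line and then check whether new entries have arrived in the log file. Is my regular expression for parsing the date in this custom format wrong?

I want the regular expression to match the date at the start of each log line exactly. The relevant part of the code is below.

Code: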

def matchDate(self, line):
    matchThis = ""
    # Target format: Thu Jul 27 00:03:27 2017
    matched = re.match(r'\d\d\d\ \d\d\d \d\d\ \d\d:\d\d:\d\d \d\d\d\d', line)
    print(matched)
    if matched:
        # matches a date and adds it to matchThis
        matchThis = matched.group()
        print('Match found {}'.format(matchThis))
    else:
        matchThis = "NONE"
    return matchThis

def log_parse(self):
    currentDict = {}
    with open(self.default_log, 'r') as f:
        for line in f:
            print(line)
            if line.startswith(self.matchDate(line), 0, 24):
                if currentDict:
                    yield currentDict
                currentDict = {
                    "date": line.split('[')[0][:24],
                    "no": line.split(']')[0][-4:-1],
                    "type": line.split(':')[0][-4:-1],
                    "text": line.split(':')[1][1:]
                }
            else:
                pass
                # currentDict['text'] += line
        yield currentDict

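I then tried to solve the problem with this pattern:

'[A-Za-z]{3} [A-Za-z]{3} [0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{4}' 

However, this new regex does not match in my code the way it does in the regex editor [http://regexr.com/3gl67].

Any suggestions on how to solve this and how to match the log line exactly?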

Example log lines:

Wed Aug 30 13:05:47 2017 [3163] INFO: Something new, the something you looking for is hidden. Update finished. 
Wed Aug 2 13:05:47 2017 [3163] INFO: Something new, the something you looking for is hidden. Update finished. 


`r'\w{3,4} \w{3,4} \d+ \d{2}:\d{2}:\d{2} \d{4}'` Is this what you need? – Sraw

'Wed Aug 2 13:05:47 2017 [3163] INFO: Something new, the something you looking for is hidden. Update finished.' does not match –

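Actually it is not 'Wed Aug 2 13:05:47 2017 [3163] INFO: Something new, the something you looking for is hidden. Update finished.' but 'Wed Aug  2 13:05:47 2017 [3163] INFO: Something new, the something you looking for is hidden. Update finished.': there is an extra space after Aug. –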
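The extra space described in the comment above is the usual padding for single-digit days in this timestamp style. The following is a minimal sketch, not a definitive fix: it assumes the month and day are separated by one or more spaces and that the day has one or two digits. The names `STRICT_RE` and `TOLERANT_RE` and the shortened sample lines are made up for illustration.

```python
import re

# The pattern from the question: exactly one space and exactly two day digits.
STRICT_RE = re.compile(
    r'[A-Za-z]{3} [A-Za-z]{3} [0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2} [0-9]{4}')

# Assumed variant: ' +' accepts one or more spaces before the day,
# and \d{1,2} accepts a one- or two-digit day.
TOLERANT_RE = re.compile(
    r'[A-Za-z]{3} [A-Za-z]{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}')

# Shortened sample lines (message text abbreviated for the example).
lines = [
    'Wed Aug 30 13:05:47 2017 [3163] INFO: Update finished.',
    'Wed Aug  2 13:05:47 2017 [3163] INFO: Update finished.',  # space-padded day
]

strict_hits = [STRICT_RE.match(line) for line in lines]
tolerant_hits = [TOLERANT_RE.match(line) for line in lines]

for line, strict, tolerant in zip(lines, strict_hits, tolerant_hits):
    print(line[:24], '| strict:', bool(strict), '| tolerant:', bool(tolerant))
```

With these two lines, the strict pattern accepts only the first, while the tolerant one accepts both.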

Answer

I wrote this code, which may help you detect the required pattern:

import re

# Detecting timestamps such as: Thu Jul 27 00:03:27 2017

line = 'Wed Aug 30 13:05:47 2017 [3163] INFO: Something new, the something you looking for is hidden. Update finished.'

# Raw strings keep the backslashes intact; each piece ends with the space that follows it.
days = r'(?:Sat|Sun|Mon|Tue|Wed|Thu|Fri) '
# Three-letter month names, as in the sample lines ("Jul", not "July").
months = r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) '
# One or two digits, optionally space-padded (e.g. "Aug  2").
day_number = r' ?\d{1,2} '
time = r'\d{1,2}:\d{1,2}:\d{1,2} '
year = r'\d{4} '
date = days + months + day_number

pattern = date + time + year

date_matched = re.findall(date, line)
time_matched = re.findall(time, line)
year_matched = re.findall(year, line)
full_matched = re.findall(pattern, line)
print(date_matched, year_matched, time_matched, full_matched)

if len(full_matched) > 0:
    print('yes')
else:
    print('no')

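I used specific patterns for the month, day, year, and time. I am not very familiar with re.match, so I used re.findall. My priority was keeping the code simple and clear, so I expect a more efficient pattern or implementation is possible. In any case, I hope this comes in handy.

Good luck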
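On the difference between the two functions: `re.match` only succeeds when the pattern matches at the very start of the string, while `re.findall` returns every non-overlapping match anywhere in it. For deciding whether a line begins a new log record, the anchored behaviour is probably what is wanted. The sketch below illustrates this and then converts the matched text with `datetime.strptime`. The helper name `leading_date`, the sample strings, and the assumption of English day and month names (`%a` and `%b` are locale-dependent) are mine, not part of the original code.

```python
import re
from datetime import datetime

# Assumed date pattern: tolerant of a space-padded, one- or two-digit day.
DATE_RE = re.compile(r'[A-Za-z]{3} [A-Za-z]{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}')

def leading_date(line):
    """Return the timestamp at the start of `line` as a datetime, or None."""
    m = DATE_RE.match(line)  # match() only succeeds at position 0
    if m is None:
        return None
    try:
        # %a and %b are locale-dependent; English names are assumed here.
        return datetime.strptime(m.group(), '%a %b %d %H:%M:%S %Y')
    except ValueError:
        # Looked like a date but is not a valid one (e.g. an unknown month).
        return None

# Hypothetical sample strings for illustration.
header = 'Wed Aug 30 13:05:47 2017 [3163] INFO: Update finished.'
continuation = '    retry scheduled for Wed Aug 30 13:05:47 2017'

print(leading_date(header))           # a datetime: the line starts with a date
print(leading_date(continuation))     # None: the date is not at the start
print(DATE_RE.findall(continuation))  # findall still finds it mid-line
```

Parsing with `strptime` also rejects strings that merely have the right shape, which a pattern built from `[A-Za-z]{3}` alone cannot do.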
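As a side note on the field extraction in the question's `log_parse`: the slices built on `split(':')` and `[-4:-1]` are fragile, because the timestamp itself contains colons and `[-4:-1]` drops the last character of the bracketed number. One possible alternative is a single pattern with named groups. This is only a sketch under assumptions: the layout `date [number] LEVEL: text` is inferred from the two sample lines, and `LOG_LINE` and `parse_line` are hypothetical names.

```python
import re

# Hypothetical pattern; the group names mirror the keys used in the question.
LOG_LINE = re.compile(
    r'(?P<date>[A-Za-z]{3} [A-Za-z]{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}) '
    r'\[(?P<no>\d+)\] '
    r'(?P<type>[A-Z]+): '
    r'(?P<text>.*)'
)

def parse_line(line):
    """Return a dict of fields for a log line, or None if it does not match."""
    m = LOG_LINE.match(line)
    return m.groupdict() if m else None

sample = ('Wed Aug 30 13:05:47 2017 [3163] INFO: Something new, the something '
          'you looking for is hidden. Update finished.')
record = parse_line(sample)
print(record['date'], record['no'], record['type'])
print(record['text'])
```

Lines for which `parse_line` returns None could then be treated as continuation text of the previous record, which is what the commented-out `currentDict['text'] += line` in the question appears to intend.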