2014-08-27 81 views
0

Capturing pieces inside a .*? regex

I have a log entry that looks like this:

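EVENT: "[INIT]WinEvtLog: Security: AUDIT_SUCCESS(528): Security: Administrator: AMAZON-D071A6F8: AMAZON-D071A6F8: Successful Logon: User Name: Administrator Domain: AMAZON-D071A6F8 Logon ID: (0x0,0x1054A66) Logon Type: 10 Logon Process: User32 Authentication Package: Negotiate Workstation Name: AMAZON-D071A6F8 Logon GUID: - Caller User Name: AMAZON-D071A6F8$ Caller Domain: WORKGROUP Caller Logon ID: (0x0,0x3E7) Caller Process ID: 968 Transited Services: - Source Network Address: 10.0.0.200 Source Port: 60054 [END]";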

I capture the log with this regular expression:

EVENT:\s\"\[INIT\](?P<log>.*?)\[END\]\"; 

I do this because I want to display the entire EVENT later.

There are also pieces inside (?P<log>) that I want to capture. For example,

Source\sPort:\s(?P<src_port>\d+) 
Source\sNetwork\sAddress:\s(?P<src_network_addr>\S+) 

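and other fields that appear within the EVENT.

I don't know how to write a regular expression that captures the entire EVENT as well as the pieces inside it.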

Answers

2

Put the capturing groups inside another capturing group:

EVENT:\s\"\[INIT\](?P<log>.*?Source\sNetwork\sAddress:\s(?P<src_network_addr>\S+).*?Source\sPort:\s(?P<src_port>\d+).*?)\[END\]\" 


The regex above captures log, as well as the src_network_addr and src_port that appear inside log.
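As a rough illustration, the nested pattern above can be tried in Python like this. The sample line is a shortened, made-up entry in the same shape as the question's log, not the original data.

```python
import re

# Pattern from the answer: src_network_addr and src_port are nested inside log.
NESTED = re.compile(
    r'EVENT:\s"\[INIT\](?P<log>.*?'
    r'Source\sNetwork\sAddress:\s(?P<src_network_addr>\S+).*?'
    r'Source\sPort:\s(?P<src_port>\d+).*?)\[END\]"'
)

# Shortened, hypothetical sample line (assumption, not the original log).
sample = ('EVENT: "[INIT]Logon Type: 10 Source Network Address: 10.0.0.200 '
          'Source Port: 60054 [END]";')

m = NESTED.search(sample)
groups = m.groupdict()

# The outer group holds everything between [INIT] and [END] ...
print(repr(groups["log"]))
# ... and the inner groups hold the pieces found inside it.
print(groups["src_network_addr"], groups["src_port"])
```

Note that log keeps the space before [END], because the final .*? is inside the outer group.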

+0

This only works if the elements appear in order and are non-optional. – 2014-08-27 17:45:51

+0

It is based on the input that was posted. – 2014-08-27 17:46:34
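If the fields may appear in any order, or may be missing, one possible workaround is to match in two passes: capture the whole entry first, then search inside it for each field. The sketch below is only an assumption about how that could look. The names EVENT_RE, FIELD_RES and parse_event are hypothetical and do not come from either answer.

```python
import re

# Pass 1: the question's original pattern, capturing the whole entry as "log".
EVENT_RE = re.compile(r'EVENT:\s"\[INIT\](?P<log>.*?)\[END\]"')

# Pass 2: one independent pattern per field of interest (hypothetical table).
FIELD_RES = {
    "src_network_addr": re.compile(r"Source\sNetwork\sAddress:\s(\S+)"),
    "src_port": re.compile(r"Source\sPort:\s(\d+)"),
}


def parse_event(text):
    """Return a dict with the log and each field, or None if no EVENT is found."""
    event = EVENT_RE.search(text)
    if event is None:
        return None
    log = event.group("log")
    result = {"log": log}
    for name, field_re in FIELD_RES.items():
        found = field_re.search(log)
        # A missing field is reported as None instead of failing the match.
        result[name] = found.group(1) if found else None
    return result


# Fields in the opposite order to the sample log.
swapped = parse_event(
    'EVENT: "[INIT]Source Port: 60054 Source Network Address: 10.0.0.200 [END]";'
)
# Only one of the two fields is present.
partial = parse_event('EVENT: "[INIT]Random text Source Port: 60054 [END]";')

print(swapped)
print(partial)
```

The trade-off is an extra search per field, in exchange for patterns that stay short and do not depend on field order.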

1

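The regex listed below matches any event log entry that starts with EVENT: "[INIT] and ends with [END]";. If any of the phrases of interest occur in the entry, they are captured.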

Note the use of nested capturing groups: (?P<log>...(?P<src_port>...)...). The outer group captures the whole pattern, including whatever the inner groups capture.

Also note that any group that does not participate in the match is still present in the result dict, with a value of None.

import re 
from pprint import pprint 


texts=[ 
    'EVENT: "[INIT]WinEvtLog: Security: AUDIT_SUCCESS(528): Security: Administrator: AMAZON-D071A6F8: AMAZON-D071A6F8: Successful Logon: User Name: Administrator Domain: AMAZON-D071A6F8 Logon ID: (0x0,0x1054A66) Logon Type: 10 Logon Process: User32 Authentication Package: Negotiate Workstation Name: AMAZON-D071A6F8 Logon GUID: - Caller User Name: AMAZON-D071A6F8$ Caller Domain: WORKGROUP Caller Logon ID: (0x0,0x3E7) Caller Process ID: 968 Transited Services: - Source Network Address: 10.0.0.200 Source Port: 60054 [END]";', 
    'EVENT: "[INIT]Random text with one match Source Port: 60054 And stuff at end [END]";', 
    'EVENT: "[INIT]Random text with no matches [END]";'] 


# Verbose mode is passed as a flag: an inline (?x) must be the very first
# thing in the pattern, otherwise Python 3.11+ raises an error.
pattern = re.compile(r'''
    EVENT:\s"\[INIT]                      # anchor at the beginning
    (?P<log>                              # record the entire entry
      (?:                                 # consisting of:
        (?:Source\sNetwork\sAddress:\s    # src_network_address
          (?P<src_network_address>\S+))
        |                                 # OR
        (?:Source\sPort:\s                # src_port
          (?P<src_port>\S+))
        |                                 # OR
        .*?                               # anything else
      )*                                  # as many times as required
    )
    \s\[END]";$                           # anchor at the end
    ''', re.VERBOSE)

for text in texts:
    match = pattern.match(text)
    if match:
        pprint(match.groupdict())

Result:

{'log': 'WinEvtLog: Security: AUDIT_SUCCESS(528): Security: Administrator: AMAZON-D071A6F8: AMAZON-D071A6F8: Successful Logon: User Name: Administrator Domain: AMAZON-D071A6F8 Logon ID: (0x0,0x1054A66) Logon Type: 10 Logon Process: User32 Authentication Package: Negotiate Workstation Name: AMAZON-D071A6F8 Logon GUID: - Caller User Name: AMAZON-D071A6F8$ Caller Domain: WORKGROUP Caller Logon ID: (0x0,0x3E7) Caller Process ID: 968 Transited Services: - Source Network Address: 10.0.0.200 Source Port: 60054', 
'src_network_address': '10.0.0.200', 
'src_port': '60054'} 
{'log': 'Random text with one match Source Port: 60054 And stuff at end', 
'src_network_address': None, 
'src_port': '60054'} 
{'log': 'Random text with no matches', 
'src_network_address': None, 
'src_port': None}
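The two notes above (the outer group includes what the inner groups capture, and groups that do not participate come back as None) can be seen in isolation with a toy pattern. The patterns and inputs below are made up for illustration only.

```python
import re

# Outer group "outer" wraps inner group "inner".
nested = re.match(r'(?P<outer>a(?P<inner>b)c)', 'abc')
print(nested.groupdict())              # {'outer': 'abc', 'inner': 'b'}

# Group "a" is optional and does not take part in this match.
optional = re.match(r'(?P<a>x)?(?P<b>y)', 'y')
print(optional.groupdict())            # {'a': None, 'b': 'y'}

# groupdict() accepts a default to use instead of None.
print(optional.groupdict(default=''))  # {'a': '', 'b': 'y'}
```

Passing default to groupdict() can be convenient when the result is going to be formatted as text and None would need special handling.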