2017-07-28

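Cannot create a good regular expression for mod-security log analysis

I cannot come up with a regular expression that captures the complete data between --c5eda821-H-- and --c5eda821-Z--.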

My regular expression for this is:

re.compile('--([a-f0-9]{8})-H-(.+?)--[a-f0-9]{8}', re.MULTILINE | re.DOTALL) 

--c5eda821-F-- 
HTTP/1.1 200 OK 
X-Powered-By: PHP/5.5.9-1ubuntu4.21 
Expires: Thu, 19 Nov 1981 08:52:00 GMT 
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0 
Pragma: no-cache 
X-XSS-Protection: 0 
Vary: Accept-Encoding 
Content-Encoding: gzip 
X-Content-Type-Options: nosniff 
X-Frame-Options: sameorigin 
Content-Length: 1567 
Keep-Alive: timeout=5, max=99 
Connection: Keep-Alive 
Content-Type: text/html 

--c5eda821-E-- 

--c5eda821-H-- 
Message: Warning. String match "0" at RESPONSE_HEADERS:X-XSS-Protection. [file "/usr/share/modsecurity-crs/optional_rules/modsecurity_crs_55_application_defects.conf"] [line "141"] [id "981403"] [msg "AppDefect: IE8's XSS protection Filter is Disabled."] [data "X-XSS-Protection: 0"] [tag "WASCTC/WASC-15"] [tag "MISCONFIGURATION"] [tag "http://websecuritytool.codeplex.com/wikipage?title=Checks#internet-explorer-xss-filter-disabled"] 
Apache-Handler: application/x-httpd-php 
Stopwatch: 1501247328871413 10305 (- - -) 
Stopwatch2: 1501247328871413 10305; combined=2942, p1=395, p2=2280, p3=34, p4=41, p5=147, sr=108, sw=45, l=0, gc=0 
Response-Body-Transformed: Dechunked 
Producer: ModSecurity for Apache/2.7.7 (http://www.modsecurity.org/); OWASP_CRS/2.2.8. 
Server: Apache 
WebApp-Info: "default" "59EFAF5D261B7D5BE14460C1BF3EE0A9" "" 
Engine-Mode: "DETECTION_ONLY" 

--c5eda821-Z-- 

What is the problem? – revo

I am getting an error: AttributeError: 'NoneType' object has no attribute 'groups' – dumbo

Possible duplicate? https://stackoverflow.com/questions/15232832/python-regex-attributeerror-nonetype-object-has-no-attribute-groups – revo

Answers

This works for me:

>>> haystack = """--c5eda821-H- 

Message: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. [file "/usr/share/modsecurity-crs/optional_rules/modsecurity_crs_43_csrf_protection.conf"] [line "31"] [id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."] 
Message: Warning. Pattern match "(.*?)=(?i)(?!.*httponly.*)(.*$)" at RESPONSE_HEADERS:Set-Cookie. [file "/usr/share/modsecurity-crs/optional_rules/modsecurity_crs_55_application_defects.conf"] [line "83"] [id "981184"] [msg "AppDefect: Missing HttpOnly Cookie Flag for auth."] [tag "WASCTC/WASC-15"] [tag "MISCONFIGURATION"] [tag "http://websecuritytool.codeplex.com/wikipage?title=Checks#cookie-not-setting-httponly-flag"] 
Apache-Handler: application/x-httpd-php 
Stopwatch: 1501247328778702 7722 (- - -) 
Stopwatch2: 1501247328778702 7722; combined=2901, p1=886, p2=1609, p3=54, p4=87, p5=213, sr=309, sw=52, l=0, gc=0 
Response-Body-Transformed: Dechunked 
Producer: ModSecurity for Apache/2.7.7 (http://www.modsecurity.org/); 
OWASP_CRS/2.2.8. 
Server: Apache 
WebApp-Info: "default" "59EFAF5D261B7D5BE14460C1BF3EE0A9" "" 
Engine-Mode: "DETECTION_ONLY" 


--c5eda821-Z--""" 

>>> import re
>>> print(re.search(r'--[\da-f]{8}-\w-(.+?)--[\da-f]{8}-\w--$', haystack, re.M|re.DOTALL).group(1))


Message: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. [file "/usr/share/modsecurity-crs/optional_rules/modsecurity_crs_43_csrf_protection.conf"] [line "31"] [id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."] 
Message: Warning. Pattern match "(.*?)=(?i)(?!.*httponly.*)(.*$)" at RESPONSE_HEADERS:Set-Cookie. [file "/usr/share/modsecurity-crs/optional_rules/modsecurity_crs_55_application_defects.conf"] [line "83"] [id "981184"] [msg "AppDefect: Missing HttpOnly Cookie Flag for auth."] [tag "WASCTC/WASC-15"] [tag "MISCONFIGURATION"] [tag "http://websecuritytool.codeplex.com/wikipage?title=Checks#cookie-not-setting-httponly-flag"] 
Apache-Handler: application/x-httpd-php 
Stopwatch: 1501247328778702 7722 (- - -) 
Stopwatch2: 1501247328778702 7722; combined=2901, p1=886, p2=1609, p3=54, p4=87, p5=213, sr=309, sw=52, l=0, gc=0 
Response-Body-Transformed: Dechunked 
Producer: ModSecurity for Apache/2.7.7 (http://www.modsecurity.org/); 
OWASP_CRS/2.2.8. 
Server: Apache 
WebApp-Info: "default" "59EFAF5D261B7D5BE14460C1BF3EE0A9" "" 
Engine-Mode: "DETECTION_ONLY" 

The error message you describe occurs because re.search returns None when there is no match, and None has no groups attribute.

To prevent this exception, test the return value of the method to check whether anything matched:

import re

# [\da-f] covers every hex digit; the original [\da-e] would miss ids containing "f"
regex = re.compile(r'--[\da-f]{8}-\w-(.+?)--[\da-f]{8}-\w--$', re.M|re.DOTALL)
match = regex.search(haystack)
if match:
    print(match.group(1))
else:
    print("No match")
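
The same guard can be wrapped in a small helper that returns None when the section is missing, so nothing raises. This is only a minimal sketch: the function name `extract_section`, the sample string, and the assumption that every boundary line has the form `--<8 hex digits>-<letter>--` are mine, not something the question or ModSecurity defines.

```python
import re

def extract_section(log_text, letter):
    """Return the body of audit-log section `letter`, or None if it is absent."""
    # Assumed boundary format: --c5eda821-H-- (8 hex digits, then one letter).
    pattern = re.compile(
        r'^--[0-9a-f]{8}-' + re.escape(letter) + r'--[ \t]*\n'
        r'(.*?)^--[0-9a-f]{8}-[A-Z]--',
        re.MULTILINE | re.DOTALL,
    )
    match = pattern.search(log_text)
    if match is None:          # no such section: report it, do not raise
        return None
    return match.group(1).strip()

# Hypothetical, shortened sample in the same shape as the log above.
sample = "--c5eda821-H--\nServer: Apache\n\n--c5eda821-Z--\n"
print(extract_section(sample, "H"))
print(extract_section(sample, "B"))
```

The caller can then write `if body is not None:` once, without repeating the match check at every call site.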

[Update]

Yeah, you are right in the case of only one long string; I have many more in the same fashion. But I only want the content between those tags. – dumbo

Try this:

>>> regex = re.compile(r'--[\da-f]{8}-\w--(.+?)--[\da-f]{8}-\w--', re.M|re.DOTALL) 

>>> for i, match in enumerate(regex.findall(haystack)): 
...  print('{:02d}-> {}...'.format(i, match[:15].strip()))  

00-> HTTP/1.1 200 O... 
01-> Message: Warni... 

The findall method returns a list of matches. If you only want the last match:

>>> matches = regex.findall(haystack) 

>>> print(matches[-1]) 

Or just the second one:

>>> print(matches[1]) 
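
If the position in the list is not reliable, the sections can be looked up by id and letter. The sketch below is one way to do that with `re.finditer` and named groups. The names `SECTION` and `sections`, the shortened sample, and the assumed boundary format `--<8 hex digits>-<letter>--` are illustrative assumptions. The closing boundary is checked with a lookahead and is not consumed, so adjacent sections such as E and H are each reported.

```python
import re

# Assumed boundary format: --<8 hex digits>-<section letter>--
SECTION = re.compile(
    r'^--(?P<id>[0-9a-f]{8})-(?P<part>[A-Z])--[ \t]*\n'
    r'(?P<body>.*?)(?=^--[0-9a-f]{8}-[A-Z]--)',
    re.MULTILINE | re.DOTALL,
)

def sections(log_text):
    """Map (transaction id, section letter) to the stripped section body."""
    return {
        (m.group('id'), m.group('part')): m.group('body').strip()
        for m in SECTION.finditer(log_text)
    }

# Hypothetical, shortened sample in the same shape as the log above.
sample = (
    "--c5eda821-F--\nHTTP/1.1 200 OK\n\n"
    "--c5eda821-E--\n\n"
    "--c5eda821-H--\nServer: Apache\n\n"
    "--c5eda821-Z--\n"
)
parsed = sections(sample)
print(parsed[('c5eda821', 'H')])
```

The final Z marker only closes the entry, so it does not appear as a key of its own.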

Yeah, you are right in the case of only one long string; I have many more in the same fashion. But I only want the content between those tags. – dumbo

OK, see whether the updated answer is what you want. If not, please be more specific. For example, is it always "H" and "Z"? –

You can use two calls to re.sub to remove the parts you do not want. When this works in your case, the regular expressions are usually simpler.

>>> import re 
>>> text = open('temp.txt').read() 
>>> re.sub(r'--c5eda821-Z--', '', re.sub(r'--c5eda821-H-', '', text)) 
'\n\nMessage: Warning. Match of "eq 1" against "&ARGS:CSRF_TOKEN" required. [file "/usr/share/modsecurity-crs/optional_rules/modsecurity_crs_43_csrf_protection.conf"] [line "31"] [id "981143"] [msg "CSRF Attack Detected - Missing CSRF Token."]\nMessage: Warning. Pattern match "(.*?)=(?i)(?!.*httponly.*)(.*$)" at RESPONSE_HEADERS:Set-Cookie. [file "/usr/share/modsecurity-crs/optional_rules/modsecurity_crs_55_application_defects.conf"] [line "83"] [id "981184"] [msg "AppDefect: Missing HttpOnly Cookie Flag for auth."] [tag "WASCTC/WASC-15"] [tag "MISCONFIGURATION"] [tag "http://websecuritytool.codeplex.com/wikipage?title=Checks#cookie-not-setting-httponly-flag"]\nApache-Handler: application/x-httpd-php\nStopwatch: 1501247328778702 7722 (- - -)\nStopwatch2: 1501247328778702 7722; combined=2901, p1=886, p2=1609, p3=54, p4=87, p5=213, sr=309, sw=52, l=0, gc=0\nResponse-Body-Transformed: Dechunked\nProducer: ModSecurity for Apache/2.7.7 (http://www.modsecurity.org/); OWASP_CRS/2.2.8.\nServer: Apache\nWebApp-Info: "default" "59EFAF5D261B7D5BE14460C1BF3EE0A9" ""\nEngine-Mode: "DETECTION_ONLY"\n\n\n' 
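
When the id differs from entry to entry, the two literal patterns can be replaced by one generic boundary pattern. This is only a sketch: `BOUNDARY_LINE` and the sample string are hypothetical, and the pattern assumes boundary lines of the form `--<8 hex digits>-<letter>--`. Note that it removes the markers of every section, so the bodies of all sections remain, not only H.

```python
import re

# Assumed boundary format: --<8 hex digits>-<section letter>-- on its own line.
BOUNDARY_LINE = re.compile(r'^--[0-9a-f]{8}-[A-Z]--[ \t]*\n?', re.MULTILINE)

# Hypothetical, shortened sample in the same shape as the log above.
sample = "--c5eda821-H--\nServer: Apache\n--c5eda821-Z--\n"

# Delete each boundary line (including its newline) and keep everything else.
cleaned = BOUNDARY_LINE.sub('', sample)
print(repr(cleaned))
```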

Edit in response to the comment: in that case I would suggest an approach like this.

>>> import re 
>>> with open('temp.txt') as text: 
...  for line in text.readlines(): 
...   if re.match(r'--c5[a-z]{3}821-[A-Z]-', line.strip()): 
...    continue 
...   else: 
...    print(line.strip()) 
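
For a file with many ids, the line-by-line idea can be extended to remember which section each line belongs to. The following is a rough sketch under assumptions of my own: the names `BOUNDARY` and `collect`, the made-up second id `0a1b2c3d`, and the boundary format `--<8 hex digits>-<letter>--`.

```python
import re
from collections import defaultdict

# Assumed boundary format: --<8 hex digits>-<section letter>--
BOUNDARY = re.compile(r'^--([0-9a-f]{8})-([A-Z])--\s*$')

def collect(lines, wanted='H'):
    """Group the non-empty lines of section `wanted` by transaction id."""
    result = defaultdict(list)
    current = None  # (id, letter) of the section currently being read
    for line in lines:
        m = BOUNDARY.match(line)
        if m:
            current = m.groups()   # a boundary line starts a new section
            continue
        if current and current[1] == wanted and line.strip():
            result[current[0]].append(line.strip())
    return dict(result)

# Hypothetical sample with two entries; the second id is invented.
sample = [
    "--c5eda821-H--\n", "Server: Apache\n", "\n", "--c5eda821-Z--\n",
    "--0a1b2c3d-H--\n", 'Engine-Mode: "DETECTION_ONLY"\n', "--0a1b2c3d-Z--\n",
]
print(collect(sample))
```

Because `collect` only iterates over its argument, an open file object can be passed in place of the list, which avoids reading the whole log into memory.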

Sorry, there are many IDs like --c5eda821-Z- with different tags at the end. – dumbo

Please see the edit. –