2016-04-29 73 views
1

Regex to get the words after an identified string

Here is the content:

Timestamp: 24-03-2016 19:59:11 
Title:GetData() 
Message: Received request to get data 
Machine: LTPN 

---------------------------------------- 
Timestamp: 24-03-2016 20:15:34 
Title:GetData() 
Message: ERROR [08001] [Microsoft][ODBC SQL Server Driver][DBNETLIB]SQL Server does not exist or access denied. 
ERROR [01000] [Microsoft][ODBC SQL Server Driver][DBNETLIB]ConnectionOpen (Connect()). 
ERROR [01S00] [Microsoft][ODBC SQL Server Driver]Invalid connection string attribute 
Machine: LTPN 

---------------------------------------- 

I need to capture the words after the colon (:), that is, "GetData()", "Received request to get data", and "LTPN". I hope someone can help me.

With the following regular expressions I get the whole line, which is not what I want.

^\s*Title:.+ gives "Title:GetData()" 
^\s*Message:.+ gives "Message: Received request to get data" 
^\s*Machine:.\S+ gives "Machine: LTPN" 

But I want the following output:

GetData() 
Received request to get data 
LTPN 
+0

How are you parsing this text (e.g. with a language like Java, or a tool like Notepad++)?

+0

From what I understand, you want something like **[this](https://regex101.com/r/jR4sZ3/1)** – rock321987

+0

His tool may not support negative lookahead; a better solution is to split each line on the colon and extract the RHS.
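The split-on-colon idea from the comment above can be sketched without any regex at all. This is only an illustration in Python, not something the asker's tool necessarily supports; `SAMPLE`, `WANTED`, and `values_after_colon` are hypothetical names made up for this example.

```python
# One record from the question, used as sample input.
SAMPLE = """\
Timestamp: 24-03-2016 19:59:11
Title:GetData()
Message: Received request to get data
Machine: LTPN
"""

# Keys whose right-hand side we want to keep (an assumption for this sketch).
WANTED = {"Title", "Message", "Machine"}


def values_after_colon(text):
    """Return the text after the first colon for every wanted 'Key: value' line."""
    values = []
    for line in text.splitlines():
        # partition() splits at the first colon only, so the colons inside
        # the timestamp (19:59:11) never cause a second split.
        key, sep, rhs = line.partition(":")
        if sep and key.strip() in WANTED:
            values.append(rhs.strip())
    return values


print(values_after_colon(SAMPLE))
# ['GetData()', 'Received request to get data', 'LTPN']
```

Splitting at the first colon only is the important detail here, because the values themselves may contain colons.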

Answers

0

Try using a lookbehind...

(?<=Title:).* 

Or, since it looks like you want the value after every colon ->

(?<=^.*:).* 
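Whether these two patterns work depends on the regex engine. As a rough check, here is a sketch using Python's `re` module as a stand-in (an assumption; it is not the engine the asker uses, and other engines have different lookbehind rules). The helper `after` and the `SAMPLE` string are hypothetical names for this example.

```python
import re

SAMPLE = (
    "Timestamp: 24-03-2016 19:59:11\n"
    "Title:GetData()\n"
    "Message: Received request to get data\n"
    "Machine: LTPN\n"
)


def after(label, text):
    """Return what follows 'label:' on its line, using a fixed-width lookbehind."""
    m = re.search(r"(?<=" + re.escape(label) + r":).*", text)
    # The lookbehind cannot absorb the optional space after the colon,
    # so strip the captured value afterwards.
    return m.group(0).strip() if m else None


print(after("Title", SAMPLE))    # GetData()
print(after("Message", SAMPLE))  # Received request to get data
print(after("Machine", SAMPLE))  # LTPN

# The second pattern needs a variable-width lookbehind, which Python's re rejects.
try:
    re.compile(r"(?<=^.*:).*")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

print(variable_width_ok)  # False
```

So the first form is the more portable of the two; the second only works in engines that allow unbounded lookbehind.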
+0

When I use this regex in a Grok filter, it doesn't work.

+0

I don't know much about Grok filters, but you might want to try: (?<=\b.*:).*\b(?:\(\))? – TwistedStem

+0

Oh, scratch that, I just found some documentation on Logstash and grok filters... \b is a backspace there, not the traditional "first or last character in a word" – TwistedStem

0

Use parentheses to capture the part you want, e.g. ^\s*Message:(.+) will return Received request to get data

/^\s*\w+:(.+)/gm 

is more generic and works across multiple lines in one go.
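For illustration, here is how the capture-group approach behaves in Python's `re` module (an assumption; the asker's environment may differ). The `/gm` flags correspond roughly to `findall` plus `re.MULTILINE`. The `SELECTIVE` pattern is a variation added for this sketch, not part of the answer; it limits the match to the three labels the question asks about.

```python
import re

SAMPLE = """\
Timestamp: 24-03-2016 19:59:11
Title:GetData()
Message: Received request to get data
Machine: LTPN
"""

# The generic pattern from the answer: capture everything after "word:" on each line.
GENERIC = re.compile(r"^\s*\w+:(.+)", re.MULTILINE)
all_values = [v.strip() for v in GENERIC.findall(SAMPLE)]
print(all_values)
# ['24-03-2016 19:59:11', 'GetData()', 'Received request to get data', 'LTPN']

# Hypothetical variation: only the labels wanted in the question.
SELECTIVE = re.compile(r"^\s*(?:Title|Message|Machine):\s*(.+)", re.MULTILINE)
wanted = [v.strip() for v in SELECTIVE.findall(SAMPLE)]
print(wanted)
# ['GetData()', 'Received request to get data', 'LTPN']
```

Note that the generic pattern also picks up the Timestamp line, so it returns one more value than the question asks for.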

+0

When I use this regex in a grok filter, it doesn't work.

0

I guess you need:

Title:(.*?)\sMessage:\s?(.*?)\sMachine:\s?(.*?)$ 

Regex explanation:

Title: matches the characters Title: literally (case insensitive) 
1st Capturing group (.*?) 
    .*? matches any character (except newline) 
     Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy] 
\s match any white space character [\r\n\t\f ] 
Message: matches the characters Message: literally (case insensitive) 
\s? match any white space character [\r\n\t\f ] 
    Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy] 
2nd Capturing group (.*?) 
    .*? matches any character (except newline) 
     Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy] 
\s match any white space character [\r\n\t\f ] 
Machine: matches the characters Machine: literally (case insensitive) 
\s? match any white space character [\r\n\t\f ] 
    Quantifier: ? Between zero and one time, as many times as possible, giving back as needed [greedy] 
3rd Capturing group (.*?) 
    .*? matches any character (except newline) 
     Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy] 
$ assert position at end of a line 
g modifier: global. All matches (don't return on first match) 
m modifier: multi-line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string) 
i modifier: insensitive. Case insensitive match (ignores case of [a-zA-Z]) 

Regex101 Demo
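One thing worth checking with this pattern is the second record in the question, where the Message spans several lines. Since . does not match a newline, the pattern as written only matches single-line messages. The sketch below uses Python's `re` as a stand-in engine (an assumption), with the error text shortened for the example; the `re.DOTALL` variant is an addition for this sketch, not part of the answer.

```python
import re

# Two records shaped like the ones in the question (error text shortened).
LOG = """\
Timestamp: 24-03-2016 19:59:11
Title:GetData()
Message: Received request to get data
Machine: LTPN

----------------------------------------
Timestamp: 24-03-2016 20:15:34
Title:GetData()
Message: ERROR [08001] first line of the error
ERROR [01000] second line of the error
Machine: LTPN

----------------------------------------
"""

PATTERN = r"Title:(.*?)\sMessage:\s?(.*?)\sMachine:\s?(.*?)$"

# As in the answer: multi-line and case-insensitive, but . stops at newlines.
strict = re.findall(PATTERN, LOG, re.MULTILINE | re.IGNORECASE)

# Same pattern, but . may also match newlines, so multi-line messages are captured.
relaxed = re.findall(PATTERN, LOG, re.MULTILINE | re.IGNORECASE | re.DOTALL)

print(len(strict))   # 1
print(strict[0])     # ('GetData()', 'Received request to get data', 'LTPN')
print(len(relaxed))  # 2
print(relaxed[1][1].splitlines())
# ['ERROR [08001] first line of the error', 'ERROR [01000] second line of the error']
```

With the dot-matches-newline behaviour switched on, the lazy groups still stop at the next label, so each record stays separate.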

+0

When I use this regex in a grok filter, it doesn't work.

+0

'grok filter'? Did you mention that before?

0

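OK, I looked at the Logstash documentation and found that grok filters use Oniguruma regular expressions. I also read a bit further into the docs, and I think you may be making more work for yourself than you need to. Try this: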

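filter { 
    multiline { 
        # Any line that does not start with "Timestamp" is appended to the previous event.
        pattern => "^Timestamp" 
        what => "previous" 
        negate => true 
    } 
    grok { 
        # DATESTAMP is a stock grok pattern. TITLE, MESSAGE and MACHINE do not appear to be
        # stock patterns, so they would probably need to be defined as custom patterns.
        match => ["message", "(?m)%{DATESTAMP:Timestamp}\s+%{TITLE}\s+%{MESSAGE}\s+%{MACHINE}"] 
    } 
} 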

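I'll fully admit that I have never used Logstash or grok filters; this is purely from what I saw in the documentation. It looks like the value after the colon in the match statement is the heading that precedes the value, and it seems that some of them, such as title, message, and machine, are already built in.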

Hope it works for you.
