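Regex question: until the next match or the end of the document
I am extracting data from some documents with a document parser that I am writing in C#. The documents have the following form: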
(Type 1): (potentially multi-lined string)
(Type 2): (potentially multi-lined string)
(Type 3): (potentially multi-lined string)
...
(Type N): (potentially multi-lined string)
(Type 1): (potentially multi-lined string)
...
End Of Document.
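The document repeats (Type 1) - (Type N) M times in the same format.
I am having trouble with the multi-line strings and with finding the last iteration of (Type 1) - (Type N).
What I need is to capture each (potentially multi-lined string) in a group named after the (Type #) that precedes it.
Here is a snippet of the document that I want to match: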
Name: John Dow
Position: VP. over Development
Bio: Here is a really long string of un important stuff that could include words like "Bio" or "Name". Some times I have problems here, but for the most part it should be normal Bio information
Position History: Vp. over Development
Sr. Project Manager
Jr. Project Manager
Developer
Peon
Notes: Here are some notes that may or may not be multilined and if it is, all the lines need to be captured for this person.
Name: Joe Noob
Position: Peon
Bio: I'm a peon, so I have little bio
Position History: Peon
Notes: few notes
Name: Jane Smith
Position: VP. over Sales
Bio: Here is a really long string of more un important stuff that could include words like "Bio" or "Name". Some times I have problems here, but for the most part it should be normal Bio information
Position History: Vp. over Sales
Sales Manager
Secretary
Notes: Here are some notes that may or may not be multilined and if it is, all the lines need to be captured for this person.
The order of the (Type #) labels is always the same, and each label is always preceded by a newline.
What I have so far:
Name:\s(?:(?<Name>.*?)\r\n)+?Position:\s(?:(?<Position>.*?)\r\n)+?Bio:\s(?:(?<Bio>.*?)\r\n)+?Position History:\s(?:(?<PositionHistory>.*?)\r\n)+?Notes:\s(?:(?<Notes>.*?)\r\n)+?
Any help would be greatly appreciated!
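One possible direction, sketched under assumptions rather than offered as a definitive fix: let each value be a lazy `.*?` in Singleline mode, so it can span lines, and end the last value with a lookahead for either the next "Name:" line or the end of the input. The class name `RecordRegexSketch`, the group name `PositionHistory`, and the sample text are made up for illustration. The sketch also assumes a label never appears at the start of a line inside a value, and that "End Of Document." is a description, not literal text in the file.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RecordRegexSketch
{
    // Singleline: '.' also matches newlines, so a lazy group can span several lines.
    // Multiline: '^' matches at the start of every line, not only at the start of the input.
    // Each group ends at the newline before the next label. The last group ends
    // before the next "Name:" line, or at the end of the input (\z), ignoring
    // trailing whitespace.
    public static readonly Regex Record = new Regex(
        @"^Name:\s(?<Name>.*?)\r?\n" +
        @"Position:\s(?<Position>.*?)\r?\n" +
        @"Bio:\s(?<Bio>.*?)\r?\n" +
        @"Position History:\s(?<PositionHistory>.*?)\r?\n" +
        @"Notes:\s(?<Notes>.*?)(?=\r?\nName:\s|\s*\z)",
        RegexOptions.Singleline | RegexOptions.Multiline);

    public static void Main()
    {
        // Hypothetical two-record sample in the format described above.
        string text =
            "Name: John Dow\r\n" +
            "Position: VP. over Development\r\n" +
            "Bio: May mention \"Bio\" or \"Name\"\r\n" +
            "on a second line\r\n" +
            "Position History: Vp. over Development\r\n" +
            "Developer\r\n" +
            "Notes: first line\r\n" +
            "second line\r\n" +
            "Name: Joe Noob\r\n" +
            "Position: Peon\r\n" +
            "Bio: I'm a peon\r\n" +
            "Position History: Peon\r\n" +
            "Notes: few notes\r\n";

        foreach (Match m in Record.Matches(text))
        {
            Console.WriteLine("Name  = " + m.Groups["Name"].Value);
            Console.WriteLine("Notes = " + m.Groups["Notes"].Value);
        }
    }
}
```

Because the lookahead does not consume the newline before the next "Name:", the following record can still start its own match there, and the `\s*\z` alternative covers the last record in the document.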
To add to that, you would need to refactor the code to handle different label values. – 2011-01-25 16:56:56
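As a rough illustration of what handling different label values could look like, the sketch below builds the pattern from a list of labels instead of hard-coding them. Everything here is an assumption for illustration: the names `LabelledRecordParser`, `BuildPattern`, and `Parse` are hypothetical, the labels are assumed to appear in a fixed order with each one at the start of a line, and the groups are named `g0`, `g1`, ... because a label such as "Position History" is not a valid group name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LabelledRecordParser
{
    // Builds one record pattern from the labels, given in document order.
    public static Regex BuildPattern(IReadOnlyList<string> labels)
    {
        // One "Label:\s(?<gN>.*?)" piece per label; Regex.Escape guards
        // against labels that contain regex metacharacters.
        var parts = labels.Select((label, i) =>
            Regex.Escape(label) + @":\s(?<g" + i + @">.*?)");
        string body = string.Join(@"\r?\n", parts);
        // The last value ends before the next first label or at the end of the input.
        string tail = @"(?=\r?\n" + Regex.Escape(labels[0]) + @":\s|\s*\z)";
        return new Regex("^" + body + tail,
            RegexOptions.Singleline | RegexOptions.Multiline);
    }

    // Returns one dictionary per record, keyed by the original label text.
    public static List<Dictionary<string, string>> Parse(
        string text, IReadOnlyList<string> labels)
    {
        Regex pattern = BuildPattern(labels);
        var records = new List<Dictionary<string, string>>();
        foreach (Match m in pattern.Matches(text))
        {
            var record = new Dictionary<string, string>();
            for (int i = 0; i < labels.Count; i++)
            {
                record[labels[i]] = m.Groups["g" + i].Value;
            }
            records.Add(record);
        }
        return records;
    }

    public static void Main()
    {
        var labels = new[] { "Name", "Position", "Bio", "Position History", "Notes" };
        // Hypothetical single-record sample.
        string text =
            "Name: Joe Noob\r\n" +
            "Position: Peon\r\n" +
            "Bio: I'm a peon, so I have little bio\r\n" +
            "Position History: Peon\r\n" +
            "Notes: few notes\r\n";

        foreach (var record in Parse(text, labels))
        {
            foreach (var pair in record)
            {
                Console.WriteLine(pair.Key + " => " + pair.Value);
            }
        }
    }
}
```

With this shape, supporting another document type would mean passing a different label list, not editing the pattern by hand.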