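I already posted a similar question, "Character extraction using regular expressions in Python", but I have another problem involving non-greedy quantifiers, so I am asking again with a different example. I need to use a regular expression in Python to extract every relevant part of a string that lies between two specific matches. Here is an example text: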
example = """
The Bank does offer a hybrid loan. Hybrid loans are loans that start as a
fixed rate mortgage but after a set number of years automatically adjust
to an adjustable rate mortgage. The Bank offers a three year fixed rate mortgage
after which the interest rate will adjust annually. Item 1. Business 3-13 Item 1a.
Risk Factors 13-15 Item 1b. Unresolved Staff Comments 15 Item 2. Properties 15-16
The forward-looking statements are made as of the date of this report,
and the Company assumes no obligation to update the forward-looking statements
or to update the reasons why actual results could differ from those projected
in the forward-looking statements. PART 1. ITEM 1. BUSINESS
General Farmers & Merchants Bancorp, Inc. (Company) is a bank holding company
incorporated under the laws of Ohio in 1985 and elected to become a financial
holding company under the Federal Reserve in 2014. Our primary subsidiary,
The Farmers & Merchants State Bank (Bank) is a\n community bank operating
in Northwest Ohio since 1897.ITEM 2. PROPERTIES Our principal office is located in Archbold, Ohio.
The Bank operates from the facilities at 307 North Defiance Street.
In addition, the Bank owns the property from 200 to 208 Ditto Street,
Archbold, Ohio, which it uses for Bank parking and a community mini-park area.
"""
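I want to extract every part of the text that starts at a match of "Item 1." and ends at a match of "Item 2." (in any letter case), so the final result should look like this: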
final_result_1 = """
ITEM 1. BUSINESS
General Farmers & Merchants Bancorp, Inc. (Company) is a bank holding company
incorporated under the laws of Ohio in 1985 and elected to become a financial
holding company under the Federal Reserve in 2014. Our primary subsidiary,
The Farmers & Merchants State Bank (Bank) is a\n community bank operating
in Northwest Ohio since 1897.
"""
final_result_2 = """
Item 1. Business 3-13 Item 1a.
Risk Factors 13-15 Item 1b. Unresolved Staff Comments 15
"""
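The final results should be ordered by text length, so `final_result_1` is the longer of the two extracted parts and `final_result_2` is the shorter. You can refer to the answers to the previous question here. Thanks in advance!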
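I'd love to help, but this question is very confusing. Could you create some short sample text and explain what output you want? –
@krcoder, you need to exclude "ITEM 2" from the text, right? –
@code_byter, that's right, and "Item 2" is excluded from `final_result_2` as well. – krcoder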
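One possible approach, given the clarification that the closing "Item 2." must be excluded: combine a non-greedy `.*?` with a lookahead for the end marker, match case-insensitively across newlines, and sort the matches by length. This is only a sketch. The helper name `extract_sections`, its default patterns, and the shortened `sample` text are assumptions made for illustration and are not part of the original question.

```python
import re


def extract_sections(text, start=r"item\s+1\.", end=r"item\s+2\."):
    """Return each span from a `start` match up to (not including)
    the next `end` match, longest first.

    `start` and `end` are regex fragments (hypothetical defaults).
    """
    pattern = re.compile(
        # .*? is non-greedy, so each match stops at the nearest end marker;
        # the lookahead (?=...) checks for the end marker without consuming it.
        rf"{start}.*?(?={end})",
        re.IGNORECASE | re.DOTALL,  # ignore case; let "." span newlines
    )
    sections = [m.group(0).strip() for m in pattern.finditer(text)]
    return sorted(sections, key=len, reverse=True)


# Shortened stand-in for the example text in the question.
sample = """
Intro text. Item 1. Business 3-13 Item 1a.
Risk Factors 13-15 Item 2. Properties 15-16
More text. PART 1. ITEM 1. BUSINESS
The Company is a bank holding company
incorporated under the laws of Ohio in 1985.ITEM 2. PROPERTIES Our office.
"""

results = extract_sections(sample)
for section in results:
    print(repr(section))
```

Because the pattern requires a literal dot right after the `1`, "Item 1a." and "Item 1b." do not start a new match. Two caveats: an "Item 1." with no later "Item 2." is skipped, and if "Item 1." occurs twice before the next "Item 2.", the match begins at the first occurrence.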