How to use a regular expression to match a block from a start header to an end header

I want to capture the whole block from a start header up to, but not including, the end header. For example, given this input:

<section1> 
Base_Currency=EUR 
Description=Revaluation 
Grouping_File 
<section2>

The match result should be:

<section1> 
Base_Currency=EUR 
Description=Revaluation 
Grouping_File

The question is: how do I write a regular expression pattern for this match in Java?

Source

2017-02-21 QSY

regexr.com helps me a lot in situations like this. It also gives you a cheat sheet. – Midnightas

Can you give an example? – QSY

If your input looks like the following

<section1> 
Base_Currency=EUR 
Description=Revaluation 
Grouping_File 
<section2> 
Base_Currency=EUR 
Description=Revaluation 
Grouping_File 
<section3> 
Base_Currency=EUR 
Description=Revaluation 
Grouping_File

then you can use the following regular expression

(?s)(<section\d+>.*?)(?=<section\d+>|$)

The explanation of this regular expression is

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  (?s)                     set flags for this block (with . matching
                           \n) (case-sensitive) (with ^ and $
                           matching normally) (matching whitespace
                           and # normally)
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    <section                 '<section'
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
    >                        '>'
--------------------------------------------------------------------------------
    .*?                      any character (0 or more times (matching
                             the least amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    <section                 '<section'
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
    >                        '>'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    $                        before an optional \n, and the end of
                             the string
--------------------------------------------------------------------------------
  )                        end of look-ahead
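
As a rough illustration, the pattern above could be applied in Java as sketched below. The class name SectionBlocks and the method extract are made up for this example, and the trim() call is an added assumption: group 1 still ends with the line break that precedes the next header, so it is stripped here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class SectionBlocks {
    // Same pattern as above; backslashes are doubled for the Java string literal.
    private static final Pattern BLOCK =
            Pattern.compile("(?s)(<section\\d+>.*?)(?=<section\\d+>|$)");

    /** Returns each block from its header up to (not including) the next header. */
    static List<String> extract(String input) {
        List<String> blocks = new ArrayList<>();
        Matcher m = BLOCK.matcher(input);
        while (m.find()) {
            // Assumption: surrounding whitespace is not wanted, so it is trimmed.
            blocks.add(m.group(1).trim());
        }
        return blocks;
    }

    public static void main(String[] args) {
        String input = "<section1>\nBase_Currency=EUR\nDescription=Revaluation\nGrouping_File\n"
                + "<section2>\nBase_Currency=EUR\n";
        for (String block : extract(input)) {
            System.out.println("[" + block + "]");
        }
    }
}
```

Because .*? is lazy, each match stops at the first position where the look-ahead succeeds, which is why the next header is never part of the captured block.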

If you only want to match a single tag, you can use

(?s)(<section\d+>[^<]*)

The explanation of this expression is

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  (?s)                     set flags for this block (with . matching
                           \n) (case-sensitive) (with ^ and $
                           matching normally) (matching whitespace
                           and # normally)
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    <section                 '<section'
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
    >                        '>'
--------------------------------------------------------------------------------
    [^<]*                    any character except: '<' (0 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \1
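
A minimal sketch of this simpler pattern in Java follows. The names SingleSection and first are invented for the example, and returning null when nothing matches is an arbitrary choice. Note that [^<]* stops at any '<', so this variant only fits input whose section bodies never contain that character.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class SingleSection {
    // [^<]* consumes everything up to the next '<' or the end of the input.
    private static final Pattern FIRST = Pattern.compile("(?s)(<section\\d+>[^<]*)");

    /** Returns the first section block (trimmed), or null when no header is found. */
    static String first(String input) {
        Matcher m = FIRST.matcher(input);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        System.out.println(first("<section1>\nBase_Currency=EUR\nGrouping_File\n<section2>\nfoo"));
        // A '<' inside the body ends the match early:
        System.out.println(first("<section1>\nLimit=<none>\n<section2>"));
    }
}
```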

Source

2017-02-21 15:34:56

If your entire input is in this format, you can simply split it:

String[] sections = input.split("\\R(?=<)");

\R means "any newline sequence" and (?=<) means "the next character is a '<'".
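
The split approach might look like this as a complete program. The class name SplitSections and the sample input are illustrative only, and \R requires Java 8 or later.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of the original answer.
class SplitSections {
    /** Splits at every line break that is immediately followed by '<'. */
    static String[] split(String input) {
        // The look-ahead does not consume '<', so each piece keeps its header.
        return input.split("\\R(?=<)");
    }

    public static void main(String[] args) {
        String input = "<section1>\nBase_Currency=EUR\nGrouping_File\n<section2>\nfoo";
        System.out.println(Arrays.toString(split(input)));
    }
}
```

The line break that precedes each header is consumed by the split, so the pieces do not end with a trailing newline.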

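But if that is not the case, you will need the following from the regex toolbox:

the DOTALL flag, so that the dot matches newlines too
the MULTILINE flag, so that ^ matches at the start of each line too
a negative look-ahead, so that you stop consuming at the start of the next section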

Assuming a "section" starts with a '<' at the start of a line:

"(?sm)^<\\w+>(.(?!^<))*"

Here is how you use it:

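import java.util.regex.Matcher;
import java.util.regex.Pattern;

String input = "<section1>\nBase_Currency=EUR\nDescription=Revaluation\nGrouping_File\n<section2>\nfoo";
Matcher matcher = Pattern.compile("(?sm)^<\\w+>(.(?!^<))*").matcher(input);
while (matcher.find()) {
    // Each match is one section: its header plus everything before the next header line.
    String section = matcher.group();
}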
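
For reference, the inline (?sm) flags can also be written with the Pattern.DOTALL and Pattern.MULTILINE constants. The sketch below wraps the loop in a method that collects the sections into a list. The class and method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class LineAnchoredSections {
    // Equivalent to "(?sm)^<\\w+>(.(?!^<))*", with the flags passed as constants.
    private static final Pattern SECTION =
            Pattern.compile("^<\\w+>(.(?!^<))*", Pattern.DOTALL | Pattern.MULTILINE);

    /** Returns every section, from its header line up to the next header line. */
    static List<String> extract(String input) {
        List<String> sections = new ArrayList<>();
        Matcher matcher = SECTION.matcher(input);
        while (matcher.find()) {
            sections.add(matcher.group());
        }
        return sections;
    }

    public static void main(String[] args) {
        String input = "<section1>\nBase_Currency=EUR\nDescription=Revaluation\n"
                + "Grouping_File\n<section2>\nfoo";
        extract(input).forEach(s -> System.out.println("[" + s + "]"));
    }
}
```

Because the look-ahead is anchored with ^, a '<' in the middle of a line does not end a section. Only a '<' at the start of a line does.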

Source

2017-02-21 15:27:23 Bohemian
