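This is an extension of the PHP sentence boundaries question on SO. Can the PHP sentence-boundary split also handle blank lines?
I would like to know how to change the regex so that line breaks are preserved as well.
The sample code splits some text into sentences, removes one sentence, and then puts the text back together: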
<?php
$re = '/# Split sentences on whitespace between them.
(?<= # Begin positive lookbehind.
[.!?] # Either an end of sentence punct,
| [.!?][\'"] # or end of sentence punct and quote.
) # End positive lookbehind.
(?<! # Begin negative lookbehind.
Mr\. # Skip either "Mr."
| Mrs\. # or "Mrs.",
| Ms\. # or "Ms.",
| Jr\. # or "Jr.",
| Dr\. # or "Dr.",
| Prof\. # or "Prof.",
| Sr\. # or "Sr.",
| T\.V\.A\. # or "T.V.A.",
# or... (you get the idea).
) # End negative lookbehind.
[\s+|^$] # Split on whitespace between sentences or empty lines.
/ix';
$text = <<<EOL
This is paragraph one. This is sentence one. Sentence two!

This is paragraph two. This is sentence three. Sentence four!
EOL;
echo "\nBefore: \n" . $text . "\n";
$sentences = preg_split($re, $text, -1);
$sentences[1] = " "; // remove 'sentence one'
// put text back together
$text = implode($sentences);
echo "\nAfter: \n" . $text . "\n";
?>
Running this, the output is:
Before:
This is paragraph one. This is sentence one. Sentence two!

This is paragraph two. This is sentence three. Sentence four!
After:
This is paragraph one. Sentence two!
This is paragraph two. This is sentence three. Sentence four!
I am trying to make the 'After' text the same as the 'Before' text, only with the one sentence removed:
After:
This is paragraph one. Sentence two!

This is paragraph two. This is sentence three. Sentence four!
I was hoping this could be done with a small tweak to the regex, but what am I missing?
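There seems to be a problem with this regex: '[\s+|^$]' is a character class, so it really matches a single whitespace character or a literal '+', '|', '^' or '$'. Use '(?:\h+|^$)' instead; I think that should do it. –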
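To illustrate the point about the character class, here is a small check. It is only a sketch: the probe strings and variable names are made up for the demonstration. It shows which single characters '[\s+|^$]' accepts, and what the suggested non-capturing group matches on a run of spaces.

```php
<?php
// Inside [...] the characters +, | and $ are literals, and ^ is literal too
// because it is not the first character of the class.
$class = '/[\s+|^$]/';

// Hypothetical probe strings, chosen only for this demonstration.
$probes = [' ', "\n", '+', '|', '^', '$', 'a', ''];
$classHits = [];
foreach ($probes as $probe) {
    $classHits[] = preg_match($class, $probe);
}
// Whitespace and the four symbols match; "a" and the empty string do not.
print_r($classHits);

// The grouped form is a real alternation: a run of horizontal whitespace,
// or an empty subject/line.
$group = '/(?:\h+|^$)/';
preg_match($group, "a   b", $m);
var_dump($m[0]); // the whole run of three spaces
```

Note that '\h' covers horizontal whitespace only, so line breaks are not matched by the first branch of the group.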
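I think you could just drop the '+' after '\s', or use '\s{1}' if you really need it to match exactly one, because '\s+' is consuming the other whitespace. Essentially you need 'array("stuff", "\n", "stuff");', but I am not sure without testing it, and this is too complex to just run in my head. – ArtisticPhoenix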
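One way to get an array of that shape is to capture the separator and pass PREG_SPLIT_DELIM_CAPTURE to preg_split(), so the whitespace between sentences comes back as its own elements. This is a minimal sketch, not a tested answer from the thread. It assumes the sample text has a blank line between the two paragraphs, shortens the abbreviation list for brevity, and replaces the character class with a capturing '(\s+)'.

```php
<?php
// Same lookbehind idea as in the question, with a shortened abbreviation
// list. The separator is a capturing group so preg_split() can return it.
$re = '/
    (?<= [.!?] | [.!?][\'"] )          # after end-of-sentence punctuation
    (?<! Mr\. | Mrs\. | Ms\. | Dr\. )  # but not after these abbreviations
    (\s+)                              # capture the whole whitespace run
/ix';

// Assumed sample text: two paragraphs separated by a blank line.
$text = "This is paragraph one. This is sentence one. Sentence two!\n\n"
      . "This is paragraph two. This is sentence three. Sentence four!";

// The result alternates: sentence, separator, sentence, separator, ...
$parts = preg_split($re, $text, -1, PREG_SPLIT_DELIM_CAPTURE);

// Remove "This is sentence one." (index 2) and the space after it (index 3).
array_splice($parts, 2, 2);

// Gluing with an empty string restores the original separators untouched.
$result = implode('', $parts);
echo $result, "\n";
```

Because every separator is kept verbatim, the blank line between the paragraphs survives the round trip, and removing a sentence together with its trailing separator avoids leaving a double space behind.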