Regular expressions and pspell_check with UTF-8 (umlauts)

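I'm having trouble with this piece of code. It is supposed to take a string, split it into words, and check each word against a dictionary. However, when the string contains umlauts, the word gets split apart at the umlaut.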

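I'm fairly sure the problem is the character class [A-ZäöüÄÖÜ\']. It seems I'm including the special characters incorrectly, but what is the right way to do it?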

$string = "Rechtschreibprüfung";  
preg_match_all("/[A-ZäöüÄÖÜ\']{1,16}/i", $string, $words); 
for ($i = 0; $i < count($words[0]); ++$i) { 
    if (!pspell_check($pspell_link, $words[0][$i])) { 
     $array[] = $words[0][$i];    
    } 
}

Result:

$array[0] = "Rechtschreibprü"
$array[1] = "fung"

Source

2016-06-07 Shaeldon

You just need '/\p{L}+/u' –

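@WiktorStribiżew Thanks, that seems to work. Care to post it as an answer? Since I've never really got the hang of this stuff, can you recommend any good reading? – Shaeldon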

To match chunks of Unicode letters, you can use

'/\p{L}+/u'

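\p{L} matches any Unicode letter, + matches one or more occurrences of the preceding subpattern, and the /u modifier makes both the pattern and the subject string be treated as Unicode (UTF-8) strings.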
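As a minimal sketch, this is how the pattern could be applied to the string from the question. It assumes the script file and the input are both UTF-8 encoded; the variable names follow the question.

```php
<?php
// Sketch: extract runs of Unicode letters from a UTF-8 string.
// Assumes this source file is saved as UTF-8.
$string = "Rechtschreibprüfung";

// \p{L}+ : one or more Unicode letters
// /u     : treat pattern and subject as UTF-8
// preg_match_all() returns false on error (for example, invalid UTF-8 input).
$count = preg_match_all('/\p{L}+/u', $string, $words);

if ($count === false) {
    echo "Pattern failed, error code: ", preg_last_error(), "\n";
} else {
    print_r($words[0]); // a single element: the whole word
}
```

The sketch drops the {1,16} bound used in the question. Without /u the quantifier counts bytes rather than characters, and "Rechtschreibprü" is 16 bytes long in UTF-8, so the bound may well have contributed to the split shown in the question.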

To match whole words only, use word boundaries:

'/\b\p{L}+\b/u'
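The difference the boundaries make shows up when letters sit directly next to digits. The sketch below uses a made-up sample sentence and assumes a PHP build in which /u also makes \b Unicode-aware, which, as far as I know, holds for current PHP versions.

```php
<?php
// Sketch: effect of \b around \p{L}+. The sample text is invented for illustration.
// Assumes this source file is saved as UTF-8.
$text = "Prüfung2016 ist grün";

// Without \b, the letters in front of the digits are still picked up.
preg_match_all('/\p{L}+/u', $text, $loose);

// With \b, "Prüfung" is skipped: a digit follows it directly,
// so there is no word boundary after the final "g".
preg_match_all('/\b\p{L}+\b/u', $text, $strict);

print_r($loose[0]);  // Prüfung, ist, grün
print_r($strict[0]); // ist, grün
```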

If your text contains diacritics (combining marks), add \p{M} as well:

'/\b[\p{M}\p{L}]+\b/u'
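To tie this back to the loop in the question, here is a rough sketch that extracts words with the pattern above and passes each one to a checker. The names collectMisspelled() and $isCorrect are hypothetical helpers, not part of pspell, and the pspell part only runs if the extension and a German dictionary happen to be installed.

```php
<?php
// Sketch: collect the words that a checker rejects.
// collectMisspelled() and $isCorrect are hypothetical names, not part of pspell.
function collectMisspelled($text, callable $isCorrect)
{
    $misspelled = array();
    // [\p{M}\p{L}]+ keeps combining marks attached to their base letters.
    if (preg_match_all('/\b[\p{M}\p{L}]+\b/u', $text, $words)) {
        foreach ($words[0] as $word) {
            if (!$isCorrect($word)) {
                $misspelled[] = $word;
            }
        }
    }
    return $misspelled;
}

// "u" followed by U+0308 (combining diaeresis), i.e. a decomposed "ü".
// The "\u{...}" escape needs PHP 7.0 or later.
$decomposed = "Rechtschreibpru\u{0308}fung";

// \p{L}+ alone stops at the combining mark, because a mark is not a letter.
preg_match_all('/\p{L}+/u', $decomposed, $lettersOnly);

// Stand-in checker that rejects every word, so all extracted words come back.
$rejectAll = function ($word) {
    return false;
};
$result = collectMisspelled($decomposed, $rejectAll);

// Optional: plug in pspell when it is available (assumes a "de" dictionary).
if (function_exists('pspell_new')) {
    $link = @pspell_new('de', '', '', 'utf-8');
    if ($link) {
        $check = function ($word) use ($link) {
            return pspell_check($link, $word);
        };
        print_r(collectMisspelled("Rechtschreibprüfung", $check));
    }
}
```

Passing the checker in as a callable is just a design choice for this sketch: it keeps the word extraction testable without a dictionary installed.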

Source

2016-06-07 08:46:45
