2011-03-21 155 views
1

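Regular expression: matching a word that contains special characters

I'm trying to find a regular expression that matches a specific (exact) word in a string. The problem is that the word may contain special characters such as '#' or others. The special characters can be any UTF-8 characters (for example "áéíóúñ#@"), and the match must ignore punctuation.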

Here are some examples of what I'm looking for:

Searching:#myword 

Sentence: "I like the elephants when they say #myword" <- MATCH 
Sentence: "I like the elephants when they say #mywords" <- NO MATCH 
Sentence: "I like the elephants when they say myword" <-NO MATCH 
Sentence: "I don't like #mywords. its silly" <- NO MATCH 
Sentence: "I like #myword!! It's awesome" <- MATCH 
Sentence: "I like #myword It's awesome" <- MATCH 

PHP sample code:

$regexp= "#myword"; 
    if (preg_match("/(\w$regexp)/", "I like #myword!! It's awesome")) { 
     echo "YES YES YES"; 
    } else { 
     echo "NO NO NO "; 
    } 

Thanks!

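Update: if I search for "myword", the word has to start with "w" and not with any other character; that is, it must not be preceded by something like '#'.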

Sentence: "I like myword!! It's awesome" <- MATCH 
Sentence: "I like #myword It's awesome" <-NO MATCH 
+5

How do the second and the fourth examples end up as a match and a no match? – alex 2011-03-21 13:47:08

+0

It is followed by an alphabetic character, and the 4th is not – LDK 2011-03-21 13:51:57

+0

Try escaping it with \ – Yaronius 2011-03-21 13:57:47

Answers

2

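The solution below comes from treating the word's characters and its boundaries separately. There may also be a workable approach that uses word boundaries directly.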

Code:

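// Print MATCH / NO MATCH for each sentence, looking for $search as a whole word.
function search($strings, $search) {
    // Left boundary: whitespace or start of string.
    // Right boundary: a non-word character or end of string.
    // preg_quote() escapes regex metacharacters (and the "/" delimiter) in $search.
    $regexp = "/(?:[[:space:]]|^)" . preg_quote($search, "/") . "(?:[^\w]|$)/i";
    foreach ($strings as $string) {
        echo "Sentence: \"$string\" <- " .
            (preg_match($regexp, $string) ? "MATCH" : "NO MATCH") . "\n";
    }
}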

$strings = array(
"I like the elephants when they say #myword", 
"I like the elephants when they say #mywords", 
"I like the elephants when they say myword", 
"I don't like #mywords. its silly", 
"I like #myword!! It's awesome", 
"I like #mywOrd It's awesome", 
); 
echo "Example 1:\n"; 
search($strings,"#myword"); 

$strings = array(
"I like myword!! It's awesome", 
"I like #myword It's awesome", 
); 
echo "Example 2:\n"; 
search($strings,"myword"); 

Output:

Example 1: 
Sentence: "I like the elephants when they say #myword" <- MATCH 
Sentence: "I like the elephants when they say #mywords" <- NO MATCH 
Sentence: "I like the elephants when they say myword" <- NO MATCH 
Sentence: "I don't like #mywords. its silly" <- NO MATCH 
Sentence: "I like #myword!! It's awesome" <- MATCH 
Sentence: "I like #mywOrd It's awesome" <- MATCH 
Example 2: 
Sentence: "I like myword!! It's awesome" <- MATCH 
Sentence: "I like #myword It's awesome" <- NO MATCH 
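
The remark above about handling the boundaries more directly can be sketched with lookarounds instead of consuming the surrounding characters. This is only an illustration under assumptions: the helper name `containsExactWord` is hypothetical, and the `u` modifier (UTF-8 mode) is an addition that is not part of the answer's code.

```php
<?php
// Hypothetical helper (not from the answer): true if $search occurs as a whole token.
function containsExactWord(string $sentence, string $search): bool
{
    // (?<!\S) : not preceded by a non-space character (start of string or whitespace).
    // (?!\w)  : not followed by a word character (punctuation or end of string is fine).
    // i = case-insensitive, u = treat pattern and subject as UTF-8 (an assumption here).
    $pattern = '/(?<!\S)' . preg_quote($search, '/') . '(?!\w)/iu';
    return preg_match($pattern, $sentence) === 1;
}

var_dump(containsExactWord("I like #myword!! It's awesome", "#myword"));    // bool(true)
var_dump(containsExactWord("I don't like #mywords. its silly", "#myword")); // bool(false)
var_dump(containsExactWord("I like #myword It's awesome", "myword"));       // bool(false)
```

Because lookarounds do not consume characters, the match itself is exactly the searched word, which can be convenient if the match is later used for replacement.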
+0

Wow! Thanks Peter :) – LDK 2011-03-21 14:31:00

+0

NP, at first I was completely confused, but once you cleaned up the question and the examples it became solvable. :) – 2011-03-21 14:31:58

+0

This works great :) How can I make it case-insensitive? – LDK 2011-03-21 15:00:54

0

This should do the trick (replace "myword" with whatever you want to find):

^.*#myword[^\w].*$ 

If the match succeeds, your word was found; otherwise it was not.
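
In PHP, `preg_match()` expects the pattern to be wrapped in delimiters. If the bare expression is passed, the leading `^` is taken as the delimiter, the pattern ends at the `^` inside `[^\w]`, and the `\` that follows is reported as an unknown modifier. A minimal sketch, assuming `/` as the delimiter and adding `|$` so that a sentence ending in the word also matches (both are adjustments, not part of the expression above):

```php
<?php
// The expression wrapped in "/" delimiters; single quotes keep \w intact.
// (?:[^\w]|$) also accepts the end of the string after the word (an added assumption).
$pattern = '/^.*#myword(?:[^\w]|$).*$/';

$sentences = array(
    "I like #myword!! It's awesome",
    "I like the elephants when they say #myword",
    "I don't like #mywords. its silly",
);

foreach ($sentences as $sentence) {
    echo $sentence . " <- " . (preg_match($pattern, $sentence) ? "MATCH" : "NO MATCH") . "\n";
}
```

Note that this pattern does not check what comes before the `#`, so it is looser on the left side than the whitespace-or-start check used in the other answer.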

+0

This expression gives an error :( 'preg_match(): Unknown modifier '\'' – LDK 2011-03-21 14:14:00

+0

Well, it works fine for me (Expresso - .NET), so you could replace "\w" with the characters: [a-z][A-Z][0-9] – 2011-03-21 14:17:44

+0

Maybe you just need to escape the backslash ("\\" instead of "\"; I don't know how it is in PHP). – 2011-03-21 14:18:22

1

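You should search for myword with word boundaries, like this: /\bmyword\b/.
# is itself a non-word character, so \b in front of it only matches when a word character comes right before the #; that is why /\b#myword\b/ doesn't work.
One idea was to escape Unicode characters with \X, but that creates other problems.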

/ #myword\b/ 
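
A small sketch of how `\b` behaves around `#`, using made-up test strings (they are illustrative assumptions, not taken from the question):

```php
<?php
// \b only matches between a word character (\w) and a non-word character.
// "#" is a non-word character, so "\b#" needs a word character right before the "#".
var_dump(preg_match('/\b#myword\b/', "I like #myword!!")); // int(0): space then "#", no boundary
var_dump(preg_match('/\b#myword\b/', "abc#myword!!"));     // int(1): boundary between "c" and "#"
var_dump(preg_match('/ #myword\b/', "I like #myword!!"));  // int(1): literal space instead of \b
var_dump(preg_match('/ #myword\b/', "#myword is first"));  // int(0): no leading space at string start
```

The last case shows the limit of relying on a literal leading space: a word at the very start of the string is missed unless the space is replaced by something like `(?:^|\s)`.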
+0

This expression matches the third example, but it shouldn't – LDK 2011-03-21 14:28:36

+0

@LDK true, \X is not a good idea for escaping Unicode characters – 2011-03-21 14:59:24

+0

+1 It works! I missed the leading space when I tried it. – 2011-03-21 15:56:43
