2012-02-17 56 views
2

I want to escape matching quotes, except those inside tag attributes. For example:

Input:

xyz <test foo='123 abc' bar="def 456"> f00 'escape me' b4r "me too" but not this </tEsT> blah 'escape " me' 

Expected output:

xyz <test foo='123 abc' bar="def 456"> f00 \'escape me\' b4r \"me too\" but not this </tEsT> blah \'escape " me\' 

I have the following regex:

$result = preg_replace('/(([\'"])((\\\2|.)*?)\2)/', "\\\\$2$3\\\\$2", $input); 

It returns:

xyz <test foo=\'123 abc\' bar=\"def 456\"> f00 \'escape me\' b4r \"me too\" but not this </tEsT> blah \'escape " me\' 

Now I want to use a zero-width negative lookbehind to skip matched quotes that are preceded by an equals sign:

$result = preg_replace('/((?<=[^=])([\'"])((\\\2|.)*?)\2)/', "\\\\$2$3\\\\$2", $input); 

But the result is still not what I expect:

xyz <test foo='123 abc\' bar="def 456"> f00 \'escape me\' b4r "me too" but not this </tEsT> blah \'escape " me' 

Could you advise me how to skip the whole unwanted block (="blah blah blah") rather than just the first quote?
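
To see why skipping only the first quote is not enough, here is a minimal sketch. The pattern is a simplified stand-in and the sample string is made up; neither is the original regex or input. The lookbehind rejects the quote right after `=`, but the engine then moves on and treats the attribute's closing quote as an opening one:

```php
<?php
// Simplified illustration (assumption: not the original pattern).
// The lookbehind only inspects the single character before a candidate
// opening quote.
$pattern = '/(?<=[^=])([\'"])(.*?)\1/';
$sample  = "a='x' 'y'";

preg_match_all($pattern, $sample, $matches);

// The quote after "=" is rejected, so the quote that closes the attribute
// is taken as an opening quote and paired with the next quote in the text.
print_r($matches[0]);
```

The only match is the closing quote of the attribute paired with the first quote of the following text, which is the kind of mispairing visible in the output above.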

Don't use a regex for this. You'll regret it. – Jon 2012-02-17 10:40:11

Answers

2

Instead of looking behind, look ahead. It's usually much easier.

$result = preg_replace('/([\'"])(?![^<>]*>)((?:(?!\1).)*)\1/',
                       '\\\\$1$2\\\\$1',
                       $subject);

(['"])            # capture the opening quote
(?![^<>]*>)       # make sure it's not inside a tag
(                 # capture everything up to the next quote
  (?:             # ...after testing each character
    (?!\1|[<>]).  # ...to be sure it's not the opening quote
  )*              # ...or an angle bracket
)
\1                # match another quote of the same type as the first one

I'm assuming there won't be any angle brackets in the attribute values.
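
As a rough check, the sketch below wraps the one-line pattern exactly as posted (the commented breakdown additionally excludes `<` and `>` from the quoted text) and runs it on the question's input, minus the trailing space. The function name `escape_quotes_outside_tags` and the two edge-case strings are made up for illustration; they show where the assumption about angle brackets matters.

```php
<?php
// Hypothetical helper name; the pattern and replacement are the ones posted above.
function escape_quotes_outside_tags($subject)
{
    return preg_replace(
        '/([\'"])(?![^<>]*>)((?:(?!\1).)*)\1/',
        '\\\\$1$2\\\\$1',
        $subject
    );
}

// Input and expected output taken from the question.
$input = <<<'TXT'
xyz <test foo='123 abc' bar="def 456"> f00 'escape me' b4r "me too" but not this </tEsT> blah 'escape " me'
TXT;
$expected = <<<'TXT'
xyz <test foo='123 abc' bar="def 456"> f00 \'escape me\' b4r \"me too\" but not this </tEsT> blah \'escape " me\'
TXT;

$result = escape_quotes_outside_tags($input);
var_dump($result === $expected);

// Made-up edge cases (assumptions, not from the question).
// A "<" inside an attribute value hides the closing ">" from the lookahead,
// so the attribute's own quotes get escaped as well.
$ltInAttribute = escape_quotes_outside_tags("<a title='1 < 2'>");

// A stray ">" in plain text makes the quotes before it look as if they were
// inside a tag, so they are left unescaped.
$strayGt = escape_quotes_outside_tags("say 'a' > b");
```

With unescaped angle brackets in attribute values or in the text, the lookahead can no longer tell "inside a tag" from "outside a tag", which is why the assumption is needed.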

It works for me! Thanks for the detailed explanation :) Where did you learn regex? – Artur 2012-02-17 12:49:45

@Artur: Mostly from reading [Mastering Regular Expressions](http://shop.oreilly.com/product/9780596528126.do), practicing, and hanging out in forums like this one. :D – 2012-02-18 23:30:46

1

Here's another one.

$str = "xyz <test foo='123 abc' bar=\"def 456\"> f00 'escape me' b4r \"me too\" but not this <br/> <br/></tEsT> blah 'escape \" me'"; 

$str_escaped = preg_replace_callback('/(?<!\<)[^<>]+(?![^<]*\>)/','escape_quotes',$str); 
// check all the strings outside every possible tag 
// and replace each by the return value of the function below 

function escape_quotes($str) { 
    if (is_array($str)) $str = $str[0]; 
    return preg_replace('/(?<!\\\)(\'|")/','\\\$1',$str); 
    // escape all the non-escaped single and double quotes 
    // and return the escaped block 
} 
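
A small sketch of how this approach behaves on a made-up sample. The wrapper name `escape_text_quotes` and the sample string are assumptions; the two patterns are the same as above, with the callback written as a closure. Every unescaped quote in the text between tags is escaped, including a quote nested inside another quoted phrase, so the result differs slightly from the expected output in the question.

```php
<?php
// Hypothetical wrapper; same two patterns as in the answer above.
function escape_text_quotes($html)
{
    return preg_replace_callback(
        '/(?<!\<)[^<>]+(?![^<]*\>)/',
        function ($m) {
            // $m[0] is one run of text lying outside any tag.
            // Escape every quote that is not already preceded by a backslash.
            return preg_replace('/(?<!\\\\)(\'|")/', '\\\\$1', $m[0]);
        },
        $html
    );
}

$out = escape_text_quotes("<i a='1'>blah 'escape \" me'</i>");
echo $out, "\n";
```

Here the attribute `a='1'` is left alone, while the inner double quote in the text is escaped along with the surrounding single quotes.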

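Can someone verify whether this works in all cases? I'm assuming all **<** and **>** symbols are escaped (as "&lt;" and "&gt;" respectively), other than the ones that delimit the tags. – inhan 2012-02-17 14:20:36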