2010-11-23 71 views

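Using a regular expression to put the URLs from a string into an array

I am trying to write a function that extracts all URLs from a string and removes any trailing slash from the end of each one.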

function getUrls($string) { 
    $regex = '/https?\:\/\/[^\" ]+/i'; 
    preg_match_all($regex, $string, $matches); 
    return ($matches[0]); 
} 

However, this returns http://test.com. (with the trailing period) if I have:

$string = "Hi I am sharing http://test.com."; 
$urls = getUrls($string); 

It returns the URL with the trailing period included.

Answers

1

This one seems to work (taken from here):

$regex="/(https?:\/\/+[\w\-]+\.[\w\-]+)/i"; 
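For anyone trying this out, here is a minimal sketch of how that pattern might be dropped into the function from the question. The names `getUrlsStrict` and `getUrlsTrimmed` are hypothetical and not from the original posts. Note that the pattern only matches a scheme plus two dot-separated labels, so it seems to cut `http://www.test.com/page` down to `http://www.test`. The second function is an assumed alternative: it keeps the question's broader pattern and strips trailing punctuation and slashes afterwards.

```php
<?php
// Hypothetical helper: the question's function with the answer's stricter pattern.
// It matches only "scheme://label.label", so paths and extra labels are cut off.
function getUrlsStrict(string $string): array {
    $regex = '/(https?:\/\/+[\w\-]+\.[\w\-]+)/i';
    preg_match_all($regex, $string, $matches);
    return $matches[0];
}

// Hypothetical alternative: keep the question's broad pattern, then trim
// trailing punctuation and slashes from every match.
function getUrlsTrimmed(string $string): array {
    preg_match_all('/https?:\/\/[^" ]+/i', $string, $matches);
    return array_map(function ($url) {
        // The character list is an assumption; extend it as needed.
        return rtrim($url, '.,;:!?/');
    }, $matches[0]);
}

print_r(getUrlsStrict("Hi I am sharing http://test.com."));
print_r(getUrlsTrimmed("Hi I am sharing http://www.test.com/page/."));
```

The trimming variant keeps the path but would also strip a period or question mark that really belongs to the URL, so which trade-off is acceptable depends on the input.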

Thanks! That does work. Any idea how to make it match with or without the http://? – 2010-11-23 04:52:07
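One possible way to handle the optional scheme, offered as a sketch rather than a tested answer: wrap the scheme in an optional non-capturing group. The function name `getUrlsOptionalScheme` is hypothetical, and the pattern below also allows more than two dot-separated labels, which is a change from the pattern above.

```php
<?php
// Hypothetical helper: the scheme is optional, and one or more ".label"
// parts must follow the first label.
function getUrlsOptionalScheme(string $string): array {
    $regex = '/(?:https?:\/\/)?[\w\-]+(?:\.[\w\-]+)+/i';
    preg_match_all($regex, $string, $matches);
    return $matches[0];
}

print_r(getUrlsOptionalScheme("Visit test.com or http://example.org."));
```

Be aware that without the scheme as an anchor, anything shaped like `word.word` matches, including file names such as `notes.txt` and abbreviations such as `e.g`, so some filtering afterwards is probably needed.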

0

Depending on how strict you want to be, consider the regex pattern that Daring Fireball discusses in "Liberal, Accurate Regex Pattern for Matching URLs". The full pattern is:

\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/))) 

If you are interested in how it works, Alan Storm has a great explanation.
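As a rough sketch of how the pattern above could be used from PHP: the function name `getUrlsLiberal` is hypothetical, and `~` is chosen as the delimiter only so that the slashes in the pattern need no escaping. Because the final character of a match must be a slash, a balanced parenthesised group, or a non-punctuation character, a trailing period is left out.

```php
<?php
// Hypothetical helper wrapping the Daring Fireball pattern quoted above.
function getUrlsLiberal(string $string): array {
    $regex = '~\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))~i';
    preg_match_all($regex, $string, $matches);
    return $matches[0]; // index 0 holds the full matches
}

print_r(getUrlsLiberal("Hi I am sharing http://test.com."));
print_r(getUrlsLiberal("Try www.example.org/path/ and see"));
```

Note that this pattern keeps a trailing slash, so stripping it (for example with `rtrim($url, '/')`) would be a separate step.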


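@David: He has updated his pattern [here](http://daringfireball.net/2010/07/improved_regex_for_matching_urls). He also points out that not every regex engine supports `[[:punct:]]`. Me, I prefer to use `[\pP\pS]` instead. He also includes a version that matches only http and https. – tchrist 2010-11-23 11:12:58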
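A sketch of what that substitution might look like in PHP, assuming the original (not the updated) pattern as the starting point: `[^[:punct:]\s]` becomes `[^\pP\pS\s]`, and the `u` modifier is added so the subject is treated as UTF-8. The function name `getUrlsUnicode` is hypothetical.

```php
<?php
// Hypothetical helper: the pattern from the answer with the POSIX class
// replaced by Unicode punctuation (\pP) and symbol (\pS) properties.
function getUrlsUnicode(string $string): array {
    $regex = '~\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^\pP\pS\s]|/)))~iu';
    preg_match_all($regex, $string, $matches);
    return $matches[0];
}

print_r(getUrlsUnicode("Hi I am sharing http://test.com."));
print_r(getUrlsUnicode("See http://test.com。")); // ideographic full stop
```

With the Unicode properties, non-ASCII punctuation such as the ideographic full stop is also treated as trailing punctuation and left out of the match.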

0

In case anyone comes across this, here is what I put together:

$aProtocols = array('http:\/\/', 'https:\/\/', 'ftp:\/\/', 'news:\/\/', 'nntp:\/\/', 'telnet:\/\/', 'irc:\/\/', 'mms:\/\/', 'ed2k:\/\/', 'xmpp:', 'mailto:'); 
$aSubdomains = array('www'=>'http://', 'ftp'=>'ftp://', 'irc'=>'irc://', 'jabber'=>'xmpp:'); 
$sRELinks = '/(?:(' . implode('|', $aProtocols) . ')[^\^\[\]{}|\\"\'<>`\s]*[^[email protected]\^()\[\]{}|\\:;"\',.?<>`\s])|(?:(?:(?:(?:[^@:<>(){}`\'"\/\[\]\s]+:)?[^@:<>(){}`\'"\/\[\]\s][email protected])?(' . implode('|', array_keys($aSubdomains)) . ')\.(?:[^`[email protected]#$%^&*()_=+\[{\]}\\|;:\'",<.>\/?\s]+\.)+[a-z]{2,6}(?:[\/#?](?:[^\^\[\]{}|\\"\'<>`\s]*[^[email protected]\^()\[\]{}|\\:;"\',.?<>`\s])?)?)|(?:(?:[^@:<>(){}`\'"\/\[\]\s][email protected])?((?:(?:(?:(?:[0-1]?[0-9]?[0-9])|(?:2[0-4][0-9])|(?:25[0-5]))(?:\.(?:(?:[0-1]?[0-9]?[0-9])|(?:2[0-4][0-9])|(?:25[0-5]))){3})|(?:[A-Fa-f0-9:]{16,39}))|(?:(?:[^`[email protected]#$%^&*()_=+\[{\]}\\|;:\'",<.>\/?\s]+\.)+[a-z]{2,6}))\/(?:[^\^\[\]{}|\\"\'<>`\s]*[^[email protected]\^()\[\]{}|\\:;"\',.?<>`\s](?:[#?](?:[^\^\[\]{}|\\"\'<>`\s]*[^[email protected]\^()\[\]{}|\\:;"\',.?<>`\s])?)?)?)|(?:[^@:<>(){}`\'"\/\[\]\s]+:[^@:<>(){}`\'"\/\[\]\s][email protected]((?:(?:(?:(?:[0-1]?[0-9]?[0-9])|(?:2[0-4][0-9])|(?:25[0-5]))(?:\.(?:(?:[0-1]?[0-9]?[0-9])|(?:2[0-4][0-9])|(?:25[0-5]))){3})|(?:[A-Fa-f0-9:]{16,39}))|(?:(?:[^`[email protected]#$%^&*()_=+\[{\]}\\|;:\'",<.>\/?\s]+\.)+[a-z]{2,6}))(?:\/(?:(?:[^\^\[\]{}|\\"\'<>`\s]*[^[email protected]\^()\[\]{}|\\:;"\',.?<>`\s])?)?)?(?:[#?](?:[^\^\[\]{}|\\"\'<>`\s]*[^[email protected]\^()\[\]{}|\\:;"\',.?<>`\s])?)?))|([^@:<>(){}`\'"\/\[\]\s][email protected](?:(?:(?:[^`[email protected]#$%^&*()_=+\[{\]}\\|;:\'",<.>\/?\s]+\.)+[a-z]{2,6})|(?:(?:(?:(?:(?:[0-1]?[0-9]?[0-9])|(?:2[0-4][0-9])|(?:25[0-5]))(?:\.(?:(?:[0-1]?[0-9]?[0-9])|(?:2[0-4][0-9])|(?:25[0-5]))){3})|(?:[A-Fa-f0-9:]{16,39}))))(?:[^\^*\[\]{}|\\"<>\/`\s]+[^[email protected]\^()\[\]{}|\\:;"\',.?<>`\s])?)/i'; 

function getUrls($string) { 
    global $sRELinks; 
    preg_match_all($sRELinks, $string, $matches); 
    return ($matches[0]); 
} 

http://yellow5.us/journal/server_side_text_linkification/
