How do I extract a URL from a string when there is a space between the protocol and the rest of the address?

Say I have the following string (in Python):

myString = "For further information please visit http:// somewebpage.com and please do not hesitate to contact us"

I want to extract the following URL:

http:// somewebpage.com

I have found solutions that use regular expressions, but none for the case where there is a space before the address.


2017-08-04 Patrick Balada

Which protocols do you expect? – marvel308

@marvel308 Only http. –

Like this:

myString = myString.split() 
index = myString.index('http://') 
url = ''.join(myString[index:index+2])

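Note that I split the sentence into individual words, but only join the http part with the word that follows it.

If you really do need the space (I can't imagine why), replace '' with ' '.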
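The split-and-join idea above can be wrapped in a small helper so that a string without a standalone 'http://' word does not raise a ValueError. This is only a sketch: the function name extract_split_url and the keep_space flag are made up for illustration and are not part of the original answer.

```python
def extract_split_url(text, scheme="http://", keep_space=False):
    """Join the standalone scheme word with the word after it.

    Returns None when the scheme is not a word of its own or is the
    last word in the text. (Hypothetical helper, for illustration.)
    """
    words = text.split()
    if scheme not in words:
        return None
    index = words.index(scheme)
    if index + 1 >= len(words):
        return None
    # '' glues the two parts together; ' ' keeps the original gap.
    separator = " " if keep_space else ""
    return separator.join(words[index:index + 2])


my_string = ("For further information please visit http:// somewebpage.com "
             "and please do not hesitate to contact us")
print(extract_split_url(my_string))
print(extract_split_url(my_string, keep_space=True))
```

Because the text is split on whitespace, this only works when 'http://' stands alone as a word, which is exactly the situation in the question.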


2017-08-04 12:17:06

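This one returns http://somewebpage.com rather than 'http:// somewebpage.com'. – GLR

No, it doesn't. Test it again; there is an 'http://' at the start. –

Sorry, that was a Stack Overflow formatting issue. I meant that there is no space between 'http://' and the rest of the URL string. – GLR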

A pure regex solution:

http://\s[\w\.]+

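\s matches the single whitespace character after http://
[\w\.] matches any word character (letter, digit, or underscore) or a period
+ matches the preceding character class one or more times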
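For reference, here is one way the pattern might be used from Python with the standard re module. The names SPACED_URL, find_spaced_url, and remove_gap are invented for this sketch; the second helper shows one possible way to drop the space from the match afterwards.

```python
import re

# Same pattern as above; inside a character class the dot needs no escape.
SPACED_URL = re.compile(r"http://\s[\w.]+")


def find_spaced_url(text):
    """Return the first 'http:// host' match in text, or None."""
    match = SPACED_URL.search(text)
    return match.group(0) if match else None


def remove_gap(url):
    """Drop the whitespace that follows the scheme, if there is any."""
    return re.sub(r"^(http://)\s+", r"\1", url)


my_string = ("For further information please visit http:// somewebpage.com "
             "and please do not hesitate to contact us")
found = find_spaced_url(my_string)
print(found)
print(remove_gap(found))
```

Keep in mind that [\w.] stops at the first character that is not a word character or a period, so a path such as /test would be cut off.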


2017-08-04 12:17:47 LukeBalizet

@LukeBalizet Thanks. This works well. Replacing the space in the regex output gave me the solution. –

Try this regex:

>>> import re
>>> mystring = "For further information please visit http:// somewebpage.com and please do not hesitate to contact us"
>>> url = re.findall(r'http[s]?:// (?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', mystring)[0]
>>> url
'http:// somewebpage.com'


2017-08-04 12:32:17

Try what? What are we looking for here? – rick112358

@rick112358 A regex that returns the URL from the string. –

/https?:\/\/\s\S+/g

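http - matches the literal sequence http
s? - matches zero or one s (for https)
: - matches :
\/\/ - matches the two slashes //
\s - matches one whitespace character
\S+ - matches any non-whitespace character one or more times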

The regex will match:

http:// somewebpage.com 
https:// somewebpage.com 
http:// 1234.com/test

But not:

ftp:// www.test.com.xx 
http://www.google.com 
http://

http://www.regexpal.com/?fam=98273
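The pattern above is written in JavaScript-style /.../g notation. A rough Python equivalent, assuming the standard re module, could look like the sketch below; re.findall plays the role of the g flag, and the name find_all_spaced_urls is made up for illustration.

```python
import re

# In Python the forward slashes need no escaping.
PATTERN = re.compile(r"https?://\s\S+")


def find_all_spaced_urls(text):
    """Return every 'http(s):// rest' occurrence found in text."""
    return PATTERN.findall(text)


should_match = [
    "http:// somewebpage.com",
    "https:// somewebpage.com",
    "http:// 1234.com/test",
]
should_not_match = [
    "ftp:// www.test.com.xx",
    "http://www.google.com",
    "http://",
]

for sample in should_match + should_not_match:
    print(repr(sample), find_all_spaced_urls(sample))
```

Unlike the [\w\.]+ variant, \S+ keeps going until the next whitespace, so paths such as /test are included, but so is any punctuation that directly follows the URL.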


2017-08-04 13:00:27 Paschoal
