Python equivalent of PHP's preg_match

I am planning to move one of my scrapers to Python. I like using preg_match and preg_match_all in PHP, but I cannot find a suitable Python function similar to preg_match. Can anyone help me with this?

For example, if I want to get the content between <a class="title" and </a>, I use the following function in PHP:

preg_match_all('/a class="title"(.*?)<\/a>/si',$input,$output);

In Python I have not been able to find a similar function.


2012-01-30 funnyguy

Here is the Python regular expression documentation: http://docs.python.org/howto/regex.html – 2012-01-30 09:39:24

In Python, we don't use regular expressions to parse HTML; we use [BeautifulSoup](http://www.crummy.com/software/BeautifulSoup/). See http://stackoverflow.com/a/1732454/78845 – Johnsyweb 2012-01-30 09:44:12

You are looking for Python's re module.

Take a look at re.findall and re.search.
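As a rough sketch of how these two functions line up with the PHP calls in the question: re.search plays the role of preg_match (first match only), and re.findall plays the role of preg_match_all. The sample HTML and the variable names below are made up for illustration.

```python
import re

# Hypothetical sample input; in a real scraper this would be the downloaded page.
html = '''<a class="title" href="/one">First</a>
<a class="title" href="/two">Second
link</a>'''

# re.S (DOTALL) and re.I (IGNORECASE) correspond to PHP's /s and /i modifiers.
pattern = re.compile(r'<a class="title"(.*?)</a>', re.S | re.I)

# Like preg_match: only the first match, returned as a match object (or None).
first = pattern.search(html)
first_capture = first.group(1) if first else None

# Like preg_match_all: every non-overlapping match. With a single capture
# group, findall returns a list of the captured strings.
all_captures = pattern.findall(html)

print(first_capture)
print(all_captures)
```

Unlike preg_match_all, which fills an output array passed by reference, re.findall simply returns the list.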

Since, as you mentioned, you are trying to parse HTML, use an HTML parser for that. There are several options available in Python, such as lxml or BeautifulSoup.

Take a look at this: Why you should not parse HTML with regex
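To give an idea of what the parser-based approach looks like, here is a minimal sketch. It deliberately uses the standard library's html.parser instead of lxml or BeautifulSoup so that it runs without installing anything; the class name TitleLinkParser and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class TitleLinkParser(HTMLParser):
    """Collects the text inside every <a class="title"> element."""

    def __init__(self):
        super().__init__()
        self.titles = []      # finished link texts
        self._buffer = None   # text pieces of the link being read, or None

    def handle_starttag(self, tag, attrs):
        # Simplification: only matches a class attribute exactly equal to "title".
        if tag == "a" and dict(attrs).get("class") == "title":
            self._buffer = []

    def handle_data(self, data):
        # Only keep text while we are inside a matching link.
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._buffer is not None:
            self.titles.append("".join(self._buffer).strip())
            self._buffer = None


# Hypothetical sample input.
html = ('<p><a class="title" href="/one">First</a> '
        '<a href="/x">skip</a> '
        '<a class="title" href="/two">Second <b>bold</b></a></p>')

parser = TitleLinkParser()
parser.feed(html)
parser.close()
print(parser.titles)
```

The same extraction is usually shorter with lxml or BeautifulSoup; the point here is only that a parser tracks tags and nesting for you, which a regular expression cannot do reliably.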


2012-01-30 09:39:32 RanRag

Thank you for the replies, gentlemen. I have started using BeautifulSoup and ran into a problem with it. I passed the HTML data to BeautifulSoup and I am getting this error: soup = BeautifulSoup(data) print soup.prettify() line 52, in soup = BeautifulSoup(data) File "/home/infoken-user/Desktop/lin/BeautifulSoup.py", line 1519, in __init__ BeautifulStoneSoup.__init__(self, *args, **kwargs) File "/home/infoken-user/Desktop/lin/BeautifulSoup.py", line 1144, .. '^<\?.*encoding=[\'"](.*?)[\'"].*\?>').match(xml_data) TypeError: expected string or buffer – funnyguy 2012-01-30 12:54:10
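The traceback in the comment above ends inside a compiled pattern's .match(xml_data) call, and the re module raises TypeError whenever it is handed something that is not a string. One possible cause (an assumption, since the comment does not show how data was produced) is that data is None or some other non-string object. Below is a minimal Python 3 sketch of checking for that before parsing; the helper name ensure_text is hypothetical.

```python
import re


def ensure_text(data):
    """Return data as a str, or raise a clear error if it cannot be one."""
    if isinstance(data, str):
        return data
    if isinstance(data, (bytes, bytearray)):
        # Assumption: the page is UTF-8; undecodable bytes are replaced.
        return bytes(data).decode("utf-8", errors="replace")
    raise TypeError("expected page text, got %s" % type(data).__name__)


# The re module rejects non-string input with a TypeError, as in the traceback.
try:
    re.compile(r"^<\?.*\?>").match(None)
    rejected = False
except TypeError:
    rejected = True

print(rejected)
print(ensure_text(b"<a class='title'>x</a>"))
```

Calling ensure_text(data) just before BeautifulSoup(data) would at least turn the failure into a message that names the actual type of data.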

You may be interested in reading Python Regular Expression Operations


2012-01-30 09:40:28

I think you need something like this:

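import re

# input holds the HTML text; re.IGNORECASE mirrors PHP's /i modifier
output = re.search(r'a class="title"(.*?)</a>', input, flags=re.IGNORECASE)
if output is not None:
    output = output.group(0)  # whole match; use group(1) for the captured part only
    print(output)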

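You can add (?s) to the start of the regular expression to enable DOTALL mode, so that . also matches newlines (the equivalent of PHP's /s modifier):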

import re

# (?s) lets . match newlines, like PHP's /s modifier
output = re.search(r'(?s)a class="title"(.*?)</a>', input, flags=re.IGNORECASE)
if output is not None:
    output = output.group(0)
    print(output)
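For comparison with the original preg_match_all call, here is a small sketch that passes both PHP modifiers as flags instead of using inline (?s), and shows the difference between group(0) (the whole match) and group(1) (only the captured part). The sample text is invented for illustration.

```python
import re

# Hypothetical sample input spanning two lines.
text = '<A CLASS="title">Hello\nworld</A>'

# PHP /s -> re.DOTALL (dot matches newlines), PHP /i -> re.IGNORECASE.
match = re.search(r'a class="title"(.*?)</a>', text,
                  flags=re.DOTALL | re.IGNORECASE)

whole = match.group(0) if match else None      # entire matched text
captured = match.group(1) if match else None   # only what (.*?) matched

print(repr(whole))
print(repr(captured))

# Without DOTALL the dot stops at the newline, so nothing matches here.
no_dotall = re.search(r'a class="title"(.*?)</a>', text, flags=re.IGNORECASE)
print(no_dotall)
```

In PHP the captured part would be found in $output[1]; in Python it is group(1) on the match object.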


2016-07-22 07:07:53
