Multi-line regular expression

-1

<a href="http://english317.ning.com/profiles/blogs/bad-business-writing-487">Continue</a> 
             </div> 
       <p class="small"> 

                Added by <a href="/profile/KemberleyRamirez">Kemberley Ramirez</a> on September 2, 2010 at 11:38pm

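I want to capture the text after /blogs/ (e.g. "bad-business-writing-487") and also the "Added by" string (student name and submission date), e.g. "Kemberley Ramirez on September 2, 2010 at 11:38pm".

I am using UltraEdit with Perl regular expressions.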

Source

2010-09-03 Caveatrob

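You may find this site useful: regexlib.com/ – vlood 2010-09-03 08:17:19

Friends don't let friends parse HTML with regular expressions (http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454). – Ether 2010-09-03 14:44:53

I didn't ask whether I should; I asked how. It is entirely feasible in this case, because these tags generally sit in the same place, so a regex can pick them out. – Caveatrob 2010-09-04 07:54:21

I'm not sure exactly what you want to match, but you would be better off using a proper HTML parser: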

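#!/usr/bin/perl

use strict;
use warnings;

use HTML::TokeParser::Simple;

# Read the HTML from the __DATA__ section below.
my $parser = HTML::TokeParser::Simple->new(\*DATA);

# Capture the blog slug and the profile name from the href values.
# The dots in the host name are escaped so they match literally.
my $blog_re    = qr{^http://english317\.ning\.com/profiles/blogs/(.+)\z};
my $profile_re = qr{^/profile/(\w+)\z};

# Visit every <a> start tag and print the captured part of matching hrefs.
while (my $tag = $parser->get_tag('a')) {
    next unless my ($href) = $tag->get_attr('href');
    if ($href =~ $blog_re or $href =~ $profile_re) {
        print "[$1]\n";
    }
}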

__DATA__ 
<a href="http://english317.ning.com/profiles/blogs/bad-business-writing-487">Continue</a> 
             </div> 
       <p class="small"> 

                Added by <a href="/profile/KemberleyRamirez">Kemberley Ramirez</a> on September 2, 2010 at 11:38pm

Source

2010-09-03 15:51:13

-1

The /s and /m modifiers control how multiple lines are handled. See perlretut.
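As a small illustration of the two modifiers, here is a sketch using a made-up two-line string (the sample text and variable names are assumptions, not from the question). It only shows that /s changes what "." matches and /m changes where "^" and "$" match.

```perl
use strict;
use warnings;

# Hypothetical two-line sample, not taken from the question.
my $text = "first line\nsecond line";

# Without /s, "." does not match a newline, so the match cannot span both lines.
my $plain = $text =~ /first.*second/ ? 1 : 0;

# With /s, "." also matches "\n".
my $dotall = $text =~ /first.*second/s ? 1 : 0;

# Without /m, "^" only matches at the very start of the string.
my $no_m = $text =~ /^second/ ? 1 : 0;

# With /m, "^" and "$" also match at the start and end of every line.
my $with_m = $text =~ /^second/m ? 1 : 0;

print "plain=$plain dotall=$dotall no_m=$no_m with_m=$with_m\n";
# prints: plain=0 dotall=1 no_m=0 with_m=1
```

For the question's input, /s is the relevant one, since the two pieces of text sit on different lines.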

You probably want a regular expression with the /s modifier, something like this (untested):

$foo =~ m|blogs/([^"]+).*Added by <[^>]+>([^<]+)</a>|s

Using | | as the delimiters instead of / / avoids all the escaping.
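A runnable sketch of that idea against the sample from the question might look like the following. It is a variation rather than the exact expression above: the non-greedy ".*?" and the third capture group for the date are additions, and the variable names are assumptions.

```perl
use strict;
use warnings;

# Sample input copied from the question.
my $html = <<'HTML';
<a href="http://english317.ning.com/profiles/blogs/bad-business-writing-487">Continue</a>
             </div>
       <p class="small">

                Added by <a href="/profile/KemberleyRamirez">Kemberley Ramirez</a> on September 2, 2010 at 11:38pm
HTML

# /s lets "." cross line breaks; ".*?" stops at the first "Added by".
# $1 = blog slug, $2 = student name, $3 = remainder of the "Added by" line.
my ($slug, $name, $date);
if ($html =~ m|blogs/([^"]+)".*?Added by <[^>]+>([^<]+)</a>\s*([^\n<]+)|s) {
    ($slug, $name, $date) = ($1, $2, $3);
    $date =~ s/\s+\z//;    # drop any trailing whitespace
    print "$slug | $name | $date\n";
}
```

With a file that contains many such entries, the same pattern could be applied in a while loop with the /g modifier; the non-greedy quantifier keeps each match within one entry as long as every blog link is followed by its own "Added by" line.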

Source

2010-09-03 09:18:41

-2

The following should work across multiple lines:

.*blogs\/(\S+)".*\(\n.*\)*<a.*>(.*)<\/a>(.*)

Source

2010-09-03 10:19:14

Using PowerGrep in "dot matches newline" mode, I came up with this:

(?>profiles/blogs/(.*?)").*?added by(.*?)</a>(.*?2010.*?\d{2}[ap]m)

(followed by one extra search pass for <a*>)
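For anyone trying the same expression outside PowerGrep, here is a sketch of how it might be run in Perl. It assumes that /s stands in for "dot matches newline" and that the search was case-insensitive (the pattern says "added by" while the page says "Added by"), hence /i. The second substitution is one guess at what the extra pass does, namely stripping the opening anchor tag from the name.

```perl
use strict;
use warnings;

# Sample input copied from the question.
my $html = <<'HTML';
<a href="http://english317.ning.com/profiles/blogs/bad-business-writing-487">Continue</a>
             </div>
       <p class="small">

                Added by <a href="/profile/KemberleyRamirez">Kemberley Ramirez</a> on September 2, 2010 at 11:38pm
HTML

my ($slug, $name, $when);
# "#" is used as the delimiter so the slashes need no escaping.
if ($html =~ m#(?>profiles/blogs/(.*?)").*?added by(.*?)</a>(.*?2010.*?\d{2}[ap]m)#si) {
    ($slug, $name, $when) = ($1, $2, $3);
    $name =~ s/<a[^>]*>//;              # extra pass: remove the opening <a ...> tag
    s/^\s+|\s+$//g for $name, $when;    # trim surrounding whitespace
    print "$slug | $name | $when\n";
}
```

Note that the literal 2010 in the pattern ties it to entries from that year.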

Source

2010-09-05 06:46:01 Caveatrob
