2012-08-07 36 views
0

Symfony 2 DomCrawler: getting text that is not wrapped in a tag

I am scraping a page that contains this markup:

<br/> 

<td class="PropertyBody"> 
<b>Category:</b> 
Miscellanea: Soft Skill 
<br> 
<b>Owner:</b> 
<a href="mailto:">blabla</a> 
<br> 
<b>Location:</b> 
bla bla 
<br> 
<b>Duration:</b> 
6:00 
<br> 
<b>Max attendees:</b> 
15 
<br> 
<b>Start at:</b> 
7/19/2012 10:00:00 AM 
<br> 
<b>Your status:</b> 
<br> 
</td> 

How can I extract, for example, '7/19/2012 10:00:00 AM' from this markup with the Symfony crawler? $crawler->filter('.PropertyBody > b')->eq(5)->text(); only returns 'Start at:'.

Thanks, I have solved it like this:

$bigPiece = $crawler->filter('.PropertyBody')->text();
// getting CATEGORY
$pos = strpos($bigPiece, ':') + 1;
$pos2 = strpos($bigPiece, 'Owner:');
$category = trim(substr($bigPiece, $pos, $pos2 - $pos));
$this->category = $category;
// getting OWNER
$pos = strpos($bigPiece, 'Owner:') + 6;
$pos2 = strpos($bigPiece, 'Location:');
$owner = trim(substr($bigPiece, $pos, $pos2 - $pos));
$training->setOwner($owner);
// getting LOCATION
$pos = strpos($bigPiece, 'Location:') + 9;
$pos2 = strpos($bigPiece, 'Duration:');
$location = trim(substr($bigPiece, $pos, $pos2 - $pos));
$training->setLocation($location);
// getting DURATION
$pos = strpos($bigPiece, 'Duration:') + 9;
$pos2 = strpos($bigPiece, 'Max attendees:');
$duration = trim(substr($bigPiece, $pos, $pos2 - $pos));
$training->setDuration($duration);
// getting MAXATTENDEES
$pos = strpos($bigPiece, 'Max attendees:') + 14;
$pos2 = strpos($bigPiece, 'Start at:');
$maxattendees = trim(substr($bigPiece, $pos, $pos2 - $pos));
$training->setMaxattendies($maxattendees);
// getting START AT
$pos = strpos($bigPiece, 'Start at:') + 9;
$pos2 = strpos($bigPiece, 'Your status:');
$start = trim(substr($bigPiece, $pos, $pos2 - $pos));
$training->setStarts($start);
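
The repeated strpos()/substr() pairs above can be folded into one loop over the ordered list of labels. This is only a minimal sketch, assuming the flattened cell text looks roughly like the sample string below; extractFields() and $labels are hypothetical names, not part of the original code:

```php
<?php
// Hypothetical helper: slices flattened "Label: value" text using a known,
// ordered list of labels. Each value runs from the end of its label to the
// start of the next label (or to the end of the text).
function extractFields(string $text, array $labels): array
{
    $fields = [];
    $count = count($labels);
    for ($i = 0; $i < $count; $i++) {
        $start = strpos($text, $labels[$i]);
        if ($start === false) {
            continue; // label not present: skip it instead of slicing from offset 0
        }
        $start += strlen($labels[$i]);
        // Search for the next label only after the current one.
        $end = ($i + 1 < $count) ? strpos($text, $labels[$i + 1], $start) : false;
        $value = ($end === false)
            ? substr($text, $start)
            : substr($text, $start, $end - $start);
        // Key the result by the label without its trailing colon.
        $fields[rtrim($labels[$i], ':')] = trim((string) $value);
    }
    return $fields;
}

// Sample text standing in for $crawler->filter('.PropertyBody')->text() (assumed shape).
$bigPiece = "Category: Miscellanea: Soft Skill Owner: blabla Location: bla bla "
    . "Duration: 6:00 Max attendees: 15 Start at: 7/19/2012 10:00:00 AM Your status:";

$labels = ['Category:', 'Owner:', 'Location:', 'Duration:', 'Max attendees:', 'Start at:', 'Your status:'];
$fields = extractFields($bigPiece, $labels);

echo $fields['Start at'], PHP_EOL;
```

Like the original, this still depends on the labels appearing in a fixed order and never occurring inside a value.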

Answers

1

Add a span tag, like this:

<b>Start at:</b> 
<span class="wantthis">7/19/2012 10:00:00 AM</span> 

Then select it with:

$crawler->filter('.wantthis')->text(); 

I am crawling someone else's site with this code, so I cannot add a class. – AlOpal19 2012-08-07 10:46:15
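
When the markup cannot be changed, the value can still be reached as the text node that directly follows the <b>Start at:</b> label. The following is a minimal sketch that uses PHP's built-in DOM extension (DOMDocument and DOMXPath) rather than the crawler; the trimmed-down sample HTML and the variable names are assumptions for illustration only:

```php
<?php
// Shortened copy of the cell from the question, wrapped in a table so the
// <td> parses in its normal context (the wrapper is an assumption).
$html = <<<'HTML'
<table><tr>
<td class="PropertyBody">
<b>Category:</b>
Miscellanea: Soft Skill
<br>
<b>Start at:</b>
7/19/2012 10:00:00 AM
<br>
<b>Your status:</b>
<br>
</td>
</tr></table>
HTML;

// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Find the <b> whose text is "Start at:" and take the first text node after it.
$nodes = $xpath->query(
    '//td[@class="PropertyBody"]/b[normalize-space()="Start at:"]/following-sibling::text()[1]'
);

// The text node carries surrounding newlines, so trim it.
$startAt = $nodes->length > 0 ? trim($nodes->item(0)->nodeValue) : null;

echo $startAt, PHP_EOL;
```

The same pattern should work for the other labels by changing the string compared in normalize-space().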

1

If you need to test this particular case and you are not able to add an enclosing tag, then you should probably consider using PHPUnit's assertContains():

// Read the whole cell: text() on '.PropertyBody > b' would return only the first <b> ('Category:').
$text = $crawler->filter('.PropertyBody')->text();
$this->assertContains('7/19/2012 10:00:00 AM', $text);