2011-05-03 137 views
2

I want to parse the page at http://dizli.com/dizli/db.html using PHP, but I am running into problems because the HTML itself contains errors.

However, when I run this code,

$url = "http://dizli.com/dizli/db.html"; 
$dom = new DOMDocument(); 
$html = $dom->loadHTMLFile($url); 
$dom->preserveWhiteSpace = false; 
$tables = $dom->getElementsByTagName('table'); 
$tr = $tables->item(2)->getElementsByTagName('tr'); 
$rows = $tables->item(0)->getElementsByTagName('td'); 

foreach ($rows as $row) 
{ 
    $movie = $row->getElementsByTagName('b'); 
    echo $movie;
} 

I get a string of errors:

Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: font and td in http://dizli.com/dizli/db.html, line: 54 in C:\development\app_server\C7\Lib\Tools\News.php on line 93 

Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: font and b in http://dizli.com/dizli/db.html, line: 81 in C:\development\app_server\C7\Lib\Tools\News.php on line 93 

Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: font and b in http://dizli.com/dizli/db.html, line: 106 in C:\development\app_server\C7\Lib\Tools\News.php on line 93 

Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: htmlParseEntityRef: no name in http://dizli.com/dizli/db.html, line: 115 in C:\development\app_server\C7\Lib\Tools\News.php on line 93 

Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: td and b in http://dizli.com/dizli/db.html, line: 126 in C:\development\app_server\C7\Lib\Tools\News.php on line 93 

Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: td and font in http://dizli.com/dizli/db.html, line: 126 in C:\development\app_server\C7\Lib\Tools\News.php on line 93 

Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: font and b in http://dizli.com/dizli/db.html, line: 128 in C:\development\app_server\C7\Lib\Tools\News.php on line 93 

Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: htmlParseEntityRef: no name in http://dizli.com/dizli/db.html, line: 1575 in C:\development\app_server\C7\Lib\Tools\News.php on line 93 

Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Tag blink invalid in http://dizli.com/dizli/db.html, line: 2190 in C:\development\app_server\C7\Lib\Tools\News.php on line 93 

Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: td and b in http://dizli.com/dizli/db.html, line: 2200 in C:\development\app_server\C7\Lib\Tools\News.php on line 93 

Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: td and font in http://dizli.com/dizli/db.html, line: 2200 in C:\development\app_server\C7\Lib\Tools\News.php on line 93 

Warning: DOMDocument::loadHTMLFile() [domdocument.loadhtmlfile]: Opening and ending tag mismatch: body and center in http://dizli.com/dizli/db.html, line: 2225 in C:\development\app_server\C7\Lib\Tools\News.php on line 93 

Catchable fatal error: Object of class DOMNodeList could not be converted to string in C:\development\app_server\C7\Lib\Tools\News.php on line 102 

Can anyone help me parse this page so that I can save the movie names and the directors' names?

Thanks in advance. Zeeshan

+0

Somewhat related - http://stackoverflow.com/questions/1148928/disable-warnings-when-loading-non-well-formed-html-by-domdocument-php – Phil 2011-05-03 23:23:46

Answers

1

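The page is written in very old HTML (you can tell from the FONT tags, the uppercase tag names, and so on), so tags, and probably paragraphs and other elements, are left unclosed. In this case I would suggest using regular expressions to find what you need.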
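A minimal sketch of the regular-expression approach, assuming the movie names sit inside `<b>` elements as the question's code implies. The sample markup below is made up for illustration and is not taken from the real page.

```php
<?php
// Hypothetical sample standing in for the old-style markup on the page.
$html = '<TD><FONT SIZE=2><B>Movie One</B></FONT></TD>'
      . '<TD><FONT><b>Movie Two</b></TD>';

// Case-insensitive, non-greedy match of the text between <b> and </b>.
// The \b after "b" keeps the pattern from matching <body> or <blink>.
preg_match_all('~<b\b[^>]*>(.*?)</b>~is', $html, $matches);

// $matches[1] holds the captured text of every match.
$titles = array_map('trim', $matches[1]);
print_r($titles);
```

Note that this only finds `<b>` elements that are actually closed. Where the page leaves a `<b>` open, the pattern will either miss it or run on to the next `</b>`, so the results need checking against the real markup.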

1

Your main problem is the last line:

echo $movie; 

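$movie is an instance of DOMNodeList, so you cannot simply echo it; you need to get at its elements, for example with $movie->item(0).

You can also run var_dump($movie) to see what you actually get.

The warnings can probably be ignored, depending on the output you get.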
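As a rough sketch of reading the text out of the node list, something like the following should work. The sample markup is hypothetical and well-formed, so it shows only the DOMNodeList handling, not the warnings from the real page.

```php
<?php
// Hypothetical sample markup standing in for the real page.
$html = '<table><tr><td><b>Movie One</b></td><td><b>Movie Two</b></td></tr></table>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$titles = [];
foreach ($dom->getElementsByTagName('td') as $cell) {
    // getElementsByTagName() returns a DOMNodeList, never a string.
    $bold = $cell->getElementsByTagName('b');
    if ($bold->length > 0) {
        // item(0) is a DOMElement; textContent holds the text inside it.
        $titles[] = trim($bold->item(0)->textContent);
    }
}

echo implode("\n", $titles), "\n";
```

Checking `$bold->length` first avoids calling `textContent` on null for cells that contain no `<b>` element.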

2

To hide the errors and keep the code working, just add @ before $dom, like this:

$html = @$dom->loadHTMLFile($url); 
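For context, `@` is PHP's error control operator: it suppresses the diagnostics raised by the expression it prefixes. An alternative that keeps the parser messages available for inspection is `libxml_use_internal_errors()`. The sketch below assumes a small, deliberately malformed sample string rather than the real page.

```php
<?php
// Deliberately malformed sample markup (hypothetical, not the real page).
$html = '<table><tr><td><font><b>Movie One</td></tr></table>';

// Collect libxml parse errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$loaded = $dom->loadHTML($html);

// Read whatever the parser complained about, then restore the old setting.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

foreach ($errors as $error) {
    // Each entry is a LibXMLError with message, line, and level properties.
    echo 'line ', $error->line, ': ', trim($error->message), "\n";
}

// The document is still usable even though the markup was malformed.
$bold = $dom->getElementsByTagName('b');
echo $bold->length > 0 ? trim($bold->item(0)->textContent) : '(no <b> found)', "\n";
```

Unlike `@`, this only silences libxml's parse messages, so unrelated errors, such as a failed download in `loadHTMLFile()`, still surface.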
+0

Why does this work? What is the @ operator? Could you explain? – 2015-08-20 09:17:44