2010-10-05 115 views
1

PHP cURL, CURLOPT_HEADER and DOM

I have the following code:

curl_setopt($ch, CURLOPT_URL, $host);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
$html = curl_exec($ch);

preg_match_all('|Set-Cookie: (.*);|U', $html, $results);
$cookies = implode(';', $results[1]);

$dom = new DOMDocument();
$dom->loadHTML($html);

On the line $dom->loadHTML($html); I get the following errors:

 
Warning: DOMDocument::loadHTML() [function.DOMDocument-loadHTML]: 
Misplaced DOCTYPE declaration in 
Entity, line: 12 in 
D:\Programs\xampp\xampp\htdocs\ip\megafonmoscow.php 
on line 39 

Warning: DOMDocument::loadHTML() 
[function.DOMDocument-loadHTML]: 
htmlParseStartTag: misplaced 
tag in Entity, line: 13 in 
D:\Programs\xampp\xampp\htdocs\ip\megafonmoscow.php 
on line 39 

Warning: DOMDocument::loadHTML() 
[function.DOMDocument-loadHTML]: 
htmlParseStartTag: misplaced 
tag in Entity, line: 14 in 
D:\Programs\xampp\xampp\htdocs\ip\megafonmoscow.php 
on line 39 

Warning: DOMDocument::loadHTML() 
[function.DOMDocument-loadHTML]: 
Unexpected end tag : head in Entity, 
line: 32 in 
D:\Programs\xampp\xampp\htdocs\ip\megafonmoscow.php 
on line 39 

Warning: DOMDocument::loadHTML() 
[function.DOMDocument-loadHTML]: 
htmlParseStartTag: misplaced 
tag in Entity, line: 34 in 
D:\Programs\xampp\xampp\htdocs\ip\megafonmoscow.php 
on line 39 

Is the line curl_setopt($ch, CURLOPT_HEADER, 1); the cause of these errors? I need it because of the cookies. Any ideas on how to solve this?

Answers

2

Try removing that line so that the headers are not returned with the body, then use the get_headers function to fetch them after the cURL request.

curl_setopt($ch, CURLOPT_URL, $host);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
$html = curl_exec($ch);
$headers = get_headers($host, 1);
+0

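Unless I'm mistaken, get_headers() makes a separate request to the URL, so depending on what you are doing with SSL you may get different results, and you will certainly get different results if you send any POST data with cURL (not relevant to this example, but worth noting) – richplane 2017-03-08 15:39:09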
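
One way to keep the headers without a second request is to let cURL hand them to a callback through CURLOPT_HEADERFUNCTION while CURLOPT_HEADER stays off, so curl_exec() returns only the body. The following is a minimal sketch, not taken from the thread; make_header_collector is a hypothetical helper name, and $ch is assumed to be the handle configured above.

```php
<?php
// Hypothetical helper: returns a closure that cURL can call once per header line.
// Non-empty lines are appended to $lines, which the caller owns.
function make_header_collector(array &$lines): callable
{
    return function ($ch, string $line) use (&$lines): int {
        $trimmed = trim($line);
        if ($trimmed !== '') {
            $lines[] = $trimmed;
        }
        // cURL expects the number of bytes handled; a different value aborts the transfer.
        return strlen($line);
    };
}

// Usage with a real handle (assumed, not executed here):
// $headerLines = [];
// curl_setopt($ch, CURLOPT_HEADERFUNCTION, make_header_collector($headerLines));
// curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// $html = curl_exec($ch);   // body only; $headerLines now holds the response headers
```

Because the headers come from the same transfer, they reflect the same SSL settings and any POST data sent with the request.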

2

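An alternative to mck89's approach is to download the headers and the body together, but split them apart before you try to parse the HTML: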

$html = curl_exec($ch); 

[snip] 

$html = preg_replace('/^.*?\r?\n\r?\n/s', '', $html, 1); // strip everything up to and including the first blank line between headers and body (HTTP uses CRLF line endings)

$dom = new DOMDocument(); 
$dom->loadHTML($html); 

This saves one HTTP request, and therefore some time.
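
Cutting at a blank line can be fragile when the response carries more than one header block, for example when redirects are followed. A possible alternative, sketched here as an assumption rather than as part of the answer above, is to ask cURL for the total header length with curl_getinfo($ch, CURLINFO_HEADER_SIZE) and split at that offset. The helper names split_response and extract_cookies are hypothetical.

```php
<?php
// Hypothetical helper: splits a raw response (headers followed by body)
// at the byte offset reported by CURLINFO_HEADER_SIZE.
function split_response(string $raw, int $headerSize): array
{
    $headers = (string) substr($raw, 0, $headerSize);
    $body    = (string) substr($raw, $headerSize);
    return [$headers, $body];
}

// Hypothetical helper: collects the "name=value" part of every Set-Cookie line.
function extract_cookies(string $headers): string
{
    preg_match_all('/^Set-Cookie:\s*([^;\r\n]+)/mi', $headers, $m);
    return implode('; ', $m[1]);
}

// Usage with a real handle (assumed, not executed here):
// $raw = curl_exec($ch);   // with CURLOPT_HEADER and CURLOPT_RETURNTRANSFER set to 1
// [$headers, $body] = split_response($raw, curl_getinfo($ch, CURLINFO_HEADER_SIZE));
// $cookies = extract_cookies($headers);
// $dom = new DOMDocument();
// $dom->loadHTML($body);
```

Only the body reaches loadHTML(), so the header text no longer appears ahead of the DOCTYPE.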