Extracting media tags from HTML content

-3

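I am using cURL to fetch content from some web pages, and I need to extract the media tags from that content.

Is there a library available for this? Any ideas would be much appreciated.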

2012-08-13 Stranger

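*sigh* (http://stackoverflow.com/search?q=[PHP]+parse+html) – 2012-08-13 00:24:18

Have you thought about trying to work it out yourself? If you can't even be bothered to try a Google search, you shouldn't be stopping by this site. – 2012-08-13 00:40:39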

Does this help?

function file_get_contents_curl($url) 
{ 
    $ch = curl_init(); 

    curl_setopt($ch, CURLOPT_HEADER, 0); 
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
    curl_setopt($ch, CURLOPT_URL, $url); 
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); 

    $data = curl_exec($ch); 
    curl_close($ch); 

    return $data; 
} 

$html = file_get_contents_curl("http://example.com/"); 

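// Parsing begins here. Collect libxml warnings about malformed HTML
// internally instead of silencing them with @.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
$nodes = $doc->getElementsByTagName('title');

// Get and display what you need; the page may have no <title> at all.
$title = $nodes->length > 0 ? $nodes->item(0)->nodeValue : '';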

$metas = $doc->getElementsByTagName('meta');

// Default to empty strings so the echo lines below never read an undefined variable.
$description = '';
$keywords = '';

for ($i = 0; $i < $metas->length; $i++)
{
    $meta = $metas->item($i);
    if ($meta->getAttribute('name') == 'description')
        $description = $meta->getAttribute('content');
    if ($meta->getAttribute('name') == 'keywords')
        $keywords = $meta->getAttribute('content');
}

echo "Title: $title". '<br/><br/>'; 
echo "Description: $description". '<br/><br/>'; 
echo "Keywords: $keywords";
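
The loop above checks only two names. As a rough sketch of a more general variant, the helper below gathers every meta tag into an array keyed by its name. The function name extract_meta_tags and the sample markup are made up for illustration, and falling back to the "property" attribute (used by Open Graph tags) is an assumption about what you might want, not something the answer itself does.

```php
<?php
// Hypothetical helper: collect every <meta> tag that has a name or property attribute.
function extract_meta_tags(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings for malformed markup internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $result = [];
    foreach ($doc->getElementsByTagName('meta') as $meta) {
        // Assumption: fall back to "property", which Open Graph tags use instead of "name".
        $key = $meta->getAttribute('name') ?: $meta->getAttribute('property');
        if ($key !== '') {
            // Lower-case the key so "Keywords" and "keywords" land in the same slot.
            $result[strtolower($key)] = $meta->getAttribute('content');
        }
    }
    return $result;
}

// Inline sample markup so the sketch runs without a network request.
$sample = '<html><head><title>Demo</title>'
    . '<meta name="description" content="A sample page">'
    . '<meta name="Keywords" content="php, dom">'
    . '<meta property="og:image" content="http://example.com/a.jpg">'
    . '</head><body></body></html>';

$tags = extract_meta_tags($sample);
echo $tags['description'] ?? '(none)', "\n";
```

In practice you would pass in the string returned by file_get_contents_curl() rather than the inline sample.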

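Or, if you need to save the images...

// Note: no trailing space inside the URL string.
$remote_img = 'http://www.example.com/images/image.jpg';
// Reading a remote URL this way requires the fopen wrappers (allow_url_fopen) to be enabled.
$img = imagecreatefromjpeg($remote_img);
// imagejpeg() needs a file name, not just a directory.
$path = 'images/image.jpg';
imagejpeg($img, $path);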

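function save_image($img, $fullpath){
    $ch = curl_init($img);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    // Return the raw bytes as a string. CURLOPT_BINARYTRANSFER has had
    // no effect since PHP 5.1.3, so it is omitted here.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $rawdata = curl_exec($ch);
    curl_close($ch);
    // curl_exec() returns false on failure; keep any existing file in that case.
    if ($rawdata === false) {
        return false;
    }
    // Overwrites an existing file, replacing the unlink/fopen('x') sequence.
    return file_put_contents($fullpath, $rawdata) !== false;
}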
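
save_image() expects a full local path, so you need some way to pick a file name for each remote image. One possible sketch is shown here; local_path_for, the default 'images/' directory, and the 'download.bin' fallback are all hypothetical choices for illustration.

```php
<?php
// Hypothetical helper: build a local file path from a remote image URL.
function local_path_for(string $url, string $dir = 'images/'): string
{
    // PHP_URL_PATH returns only the path component, so a query string is ignored.
    $path = parse_url(trim($url), PHP_URL_PATH);
    $name = is_string($path) ? basename($path) : '';
    if ($name === '') {
        // Assumed fallback name for URLs without a file component.
        $name = 'download.bin';
    }
    return rtrim($dir, '/') . '/' . $name;
}

echo local_path_for('http://www.example.com/images/image.jpg?v=2'), "\n";
```

With this in place, a call might look like save_image($url, local_path_for($url)). Note that two URLs ending in the same file name would map to the same local path.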


2012-08-13 00:25:35 themis

You can change the tag name to whatever data you want to extract – themis 2012-08-13 00:29:43
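
As a minimal sketch of what changing the tag name could look like for media elements, the helper below loops over a list of tag names and collects each element's src attribute. The name extract_media_urls, the default tag list, and the sample markup are assumptions for illustration; adjust the list to whichever tags you actually care about.

```php
<?php
// Hypothetical helper: list the URLs referenced by media elements in an HTML string.
function extract_media_urls(
    string $html,
    array $tagNames = ['img', 'video', 'audio', 'source', 'embed']
): array {
    $doc = new DOMDocument();
    // libxml may warn about HTML5 tags such as <video>; collect those warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $found = [];
    foreach ($tagNames as $tagName) {
        foreach ($doc->getElementsByTagName($tagName) as $node) {
            $src = $node->getAttribute('src');
            // Skip elements without a src, e.g. an <audio> that only wraps <source> children.
            if ($src !== '') {
                $found[] = ['tag' => $tagName, 'src' => $src];
            }
        }
    }
    return $found;
}

// Inline sample markup so the sketch runs without a network request.
$sample = '<html><body>'
    . '<img src="/images/logo.png">'
    . '<video src="movie.mp4"></video>'
    . '<audio><source src="song.ogg"></audio>'
    . '</body></html>';

$media = extract_media_urls($sample);
foreach ($media as $item) {
    echo $item['tag'], ': ', $item['src'], "\n";
}
```

Results are grouped by tag name in the order of the list, not in document order. Relative URLs such as /images/logo.png are returned as written, so they would still need to be resolved against the page URL before downloading.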
