2012-02-25

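PHP not parsing RSS correctly using cURL

I just want to get the name of the 'channel' tag, i.e. CHANNEL. When I use this script to parse RSS from Google it works fine, but when I use it with some other providers it outputs '#text' instead of 'channel', which is the intended output. My script is below; please help.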

$url = 'http://ibnlive.in.com/ibnrss/rss/sports/cricket.xml'; 
    $get = perform_curl($url); 
    $xml = new DOMDocument(); 
    $xml -> loadXML($get['remote_content']); 
    $fetch = $xml -> documentElement; 
    $gettitle = $fetch -> firstChild -> nodeName; 
    echo $gettitle; 
    function perform_curl($rss_feed_provider_url){ 

     $url = $rss_feed_provider_url; 
     $curl_handle = curl_init(); 

     // Do we have a cURL session? 
     if ($curl_handle) { 
      // Set the required CURL options that we need. 
      // Set the URL option. 
      curl_setopt($curl_handle, CURLOPT_URL, $url); 
      // Set the HEADER option. We don't want the HTTP headers in the output. 
      curl_setopt($curl_handle, CURLOPT_HEADER, false); 
      // Set the FOLLOWLOCATION option. We will follow if location header is present. 
      curl_setopt($curl_handle, CURLOPT_FOLLOWLOCATION, true); 
      // Instead of using WRITEFUNCTION callbacks, we are going to receive the remote contents as a return value for the curl_exec function. 
      curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, true); 

      // Try to fetch the remote URL contents. 
      // This function will block until the contents are received. 
      $remote_contents = curl_exec($curl_handle); 

      // Do the cleanup of CURL. 
      curl_close($curl_handle); 

      $remote_contents = utf8_encode($remote_contents); 

      $handle = @simplexml_load_string($remote_contents); 
      $return_result = array(); 
      if(is_object($handle)){ 
       $return_result['handle'] = true; 
       $return_result['remote_content'] = $remote_contents; 
       return $return_result; 
      } 
      else{ 
       $return_result['handle'] = false; 
       $return_result['content_error'] = 'INVALID RSS SOURCE, PLEASE CHECK IF THE SOURCE IS A VALID XML DOCUMENT.'; 
       return $return_result; 
      } 

     } // End of if ($curl_handle) 
     else{ 
     $return_result['curl_error'] = 'CURL INITIALIZATION FAILED.'; 
     return false; 
     } 
    } 

Answers

2

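"it gives an output '#text' instead of giving 'channel' which is the intended output"

This happens because $fetch -> firstChild -> nodeType is 3, which is a TEXT_NODE (XML_TEXT_NODE in PHP), i.e. just some text. You can select the channel element by name instead:

echo $fetch->getElementsByTagName('channel')->item(0)->nodeName; 

Meanwhile, dumping the value of the first child:

$gettitle = $fetch -> firstChild -> nodeValue; 
var_dump($gettitle); 

gives you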

string(5) " 
    " 

that is, the spaces and the newline character that happen to sit between the XML tags because of the formatting.
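
If the goal is "the first child that is actually an element", one way to get there is to skip whitespace text nodes explicitly, or to ask the parser to drop them. The following is only a minimal sketch: the $sample string is a made-up stand-in for the fetched feed, and first_element_child() is a hypothetical helper name, not part of the DOM extension.

```php
<?php
// Hypothetical sample feed standing in for the downloaded content.
// It is indented on purpose, so whitespace sits between <rss> and <channel>.
$sample = "<rss version=\"2.0\">\n    <channel>\n        <title>Example feed</title>\n    </channel>\n</rss>";

// Hypothetical helper: return the first child node that is an element,
// or null when the node has no element children.
function first_element_child($node)
{
    foreach ($node->childNodes as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE) {
            return $child;
        }
    }
    return null;
}

$doc = new DOMDocument();
$doc->loadXML($sample);
$root = $doc->documentElement;

// firstChild is the whitespace text node that precedes <channel>.
$rawFirstName = $root->firstChild->nodeName;

// Skipping non-element nodes reaches <channel>.
$elementName = first_element_child($root)->nodeName;

// Alternative: have the parser discard whitespace-only text nodes.
// preserveWhiteSpace has to be set before loadXML() is called.
$stripped = new DOMDocument();
$stripped->preserveWhiteSpace = false;
$stripped->loadXML($sample);
$strippedFirstName = $stripped->documentElement->firstChild->nodeName;

echo $rawFirstName, "\n", $elementName, "\n", $strippedFirstName, "\n";
```

The explicit loop works regardless of how the document was loaded, while preserveWhiteSpace = false keeps the original firstChild code unchanged but only affects documents loaded after the flag is set.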

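PS: the RSS feed at your link also fails validation at http://validator.w3.org/feed/

0

Take a look at the XML: it has been pretty-printed with whitespace, so it is being parsed correctly. The first child of the root node is a text node. I suggest using SimpleXML if you want an easier time, or using an XPath query on your DOMDocument to get the tags you are interested in.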

Here is how you would do it with SimpleXML:

$xml = new SimpleXMLElement($get['remote_content']); 
print $xml->channel[0]->title;
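
The XPath alternative mentioned in this answer could look roughly like the sketch below. It assumes a plain RSS 2.0 document with no XML namespace on rss/channel; the $sample string is invented for illustration and stands in for $get['remote_content'].

```php
<?php
// Hypothetical sample feed; in the question this would be the fetched content.
$sample = "<rss version=\"2.0\">\n    <channel>\n        <title>Example feed</title>\n    </channel>\n</rss>";

$doc = new DOMDocument();
$doc->loadXML($sample);

$xpath = new DOMXPath($doc);

// query() returns a DOMNodeList of matching nodes; whitespace text nodes
// never match an element path, so they do not get in the way.
$channels = $xpath->query('/rss/channel');
$channelName = $channels->length > 0 ? $channels->item(0)->nodeName : null;

// evaluate() can return a typed result; string() yields the title text.
$title = $xpath->evaluate('string(/rss/channel/title)');

echo $channelName, "\n", $title, "\n";
```

Feeds that declare a default namespace (RSS 1.0/RDF or Atom, for example) would need the namespace registered on the DOMXPath object before a path like this matches anything.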