2012-07-10 176 views
0

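I'm using this code, http://martinsikora.com/how-to-steal-google-s-did-you-mean-feature, to add a "did you mean" feature to my search, but my hosting provider has open_basedir set and won't let me change it. I've seen some workarounds, but I don't know how to apply them to this snippet. How do I implement a workaround for CURLOPT_RETURNTRANSFER?

Here's the snippet: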

$ch = curl_init($url); 
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10); 
curl_setopt($ch, CURLOPT_TIMEOUT, 10); 
curl_setopt($ch, CURLOPT_HEADER, true); 
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); 
curl_setopt($ch, CURLOPT_USERAGENT, $agents[rand(0, count($agents) - 1)]); 
$data = curl_exec($ch); 
curl_close($ch); 
+1

I don't see what the code above has to do with open_basedir? – 2012-07-10 21:01:37

+0

Sorry, in my code I had taken out the line that is the main problem: curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); – kezi 2012-07-10 21:03:46

+0

You can't use CURLOPT_FOLLOWLOCATION when open_basedir is set – kezi 2012-07-10 21:23:01

Answers

1

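What a strange and annoying (and basically undocumented) restriction, especially when it can be worked around so easily. You just need to check for a 3xx response code and then inspect the contents of the Location: header to find the URL to redirect to.

This isn't quite as trivial as I'd like, because plenty of applications violate the RFC and don't put a full URL in the Location header, so you need to do a bit of fudging to get the right location.

Something like this should work for your code (untested):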
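As a minimal sketch of that check, the snippet below pulls the status code and the Location value out of a raw header block such as the one cURL returns when CURLOPT_HEADER is on. The function name parse_redirect and the sample header are made up for illustration, and it assumes a single header block with no "100 Continue" or proxy headers in front of it.

```php
<?php
// Hypothetical helper (not part of cURL): extract the status code and
// Location header value from one raw HTTP response header block.
function parse_redirect($rawHeader) {
    $result = array('status' => null, 'location' => null);

    // Status line, e.g. "HTTP/1.1 302 Found"
    if (preg_match('/^HTTP\/\d(?:\.\d)?\s+(\d{3})/', $rawHeader, $m)) {
        $result['status'] = (int) $m[1];
    }

    // Header names are case-insensitive, hence the "i" flag;
    // "m" lets ^ and $ match at each header line
    if (preg_match('/^Location:[ \t]*([^\r\n]+?)[ \t]*\r?$/mi', $rawHeader, $m)) {
        $result['location'] = $m[1];
    }

    return $result;
}

// Made-up sample response header
$sample = "HTTP/1.1 302 Found\r\nLocation: /search?q=test\r\nContent-Length: 0";
$info = parse_redirect($sample);

$isRedirect = $info['status'] >= 300 && $info['status'] < 400;
var_dump($isRedirect, $info['location']);
```

Note that the Location value here is a relative path, which is exactly the case that needs resolving against the URL that was requested.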

function make_url_from_location ($oldUrl, $locationHeader) {
    // Takes a URL and a location header and calculates the new URL
    // This takes relative paths (which RFC 2616 does not permit) into
    // account, which most browsers will do. Requires $oldUrl to be
    // a full URL

    // First check if $locationHeader is a full URL
    $newParts = parse_url($locationHeader);
    if (!empty($newParts['scheme'])) {
        return $locationHeader;
    }

    // We need a path at a minimum. If not, return the old URL.
    if (empty($newParts['path'])) {
        return $oldUrl;
    }

    // Construct the start of the new URL
    $oldParts = parse_url($oldUrl);
    $newUrl = $oldParts['scheme'].'://'.$oldParts['host'];
    if (!empty($oldParts['port'])) {
        $newUrl .= ':'.$oldParts['port'];
    }

    // Build new path
    if ($newParts['path'][0] == '/') {
        $newUrl .= $newParts['path'];
    } else {
        // The old URL may have no path at all (e.g. http://example.com)
        $oldPath = isset($oldParts['path']) ? $oldParts['path'] : '/';
        // str_replace() to work around (buggy?) Windows behaviour where one level
        // paths cause dirname to return a \ instead of a /
        $oldDir = str_replace('\\', '/', dirname($oldPath));
        // Make sure exactly one slash separates the directory and the new path
        $newUrl .= rtrim($oldDir, '/').'/'.$newParts['path'];
    }

    // Add a query string
    if (!empty($newParts['query'])) {
        $newUrl .= '?'.$newParts['query'];
    }

    return $newUrl;

}

$maxRedirects = 30; 

$redirectCount = 0; 
$complete = FALSE; 

// Get user agent string once at start - array_rand() is tidier 
// For these purposes, a single static string will probably be fine 
$userAgent = $agents[array_rand($agents)]; 

do {

    // Make the request
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    curl_setopt($ch, CURLOPT_HEADER, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    $response = curl_exec($ch);

    // Get the response code (easier than parsing it from the headers)
    // and the URL that was requested, then close this iteration's handle
    $responseCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    $effectiveUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    curl_close($ch);

    if ($response === false) {
        // The request itself failed (timeout, DNS failure, etc.).
        // You need to handle that error here.
        $data = false;
        break;
    }

    // Split header from body
    $parts = explode("\r\n\r\n", $response, 2);
    $header = $parts[0];
    $data = isset($parts[1]) ? $parts[1] : '';

    // Check for redirect response codes
    if ($responseCode >= 300 && $responseCode < 400) {

        if (preg_match('/^location:\s*(.+?)$/mi', $header, $matches)) {
            // Get URL for next iteration
            $url = make_url_from_location($effectiveUrl, trim($matches[1]));
        } else {
            // This is an error. If you get here the response was a 3xx code and
            // no location header was set. You need to handle that error here.
            $complete = TRUE;
        }

    } else {

        // Non redirect response code (might still be an error code though!)
        $complete = TRUE;

    }

// Loop until no more redirects or $maxRedirects is reached
} while (!$complete && ++$redirectCount < $maxRedirects);

// Perform whatever error checking is necessary here
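To make the two arguments of make_url_from_location concrete: $oldUrl is the URL that was just requested and $locationHeader is the raw value of the Location header in the response. The sketch below only illustrates how parse_url() distinguishes the kinds of Location value that the function has to deal with. The helper name classify_location and the example values are hypothetical, and protocol-relative values such as //host/path are not covered.

```php
<?php
// Hypothetical helper: report which kind of Location value we were given,
// using the same parse_url() checks that decide how the URL is rebuilt.
function classify_location($location) {
    $parts = parse_url($location);

    if ($parts === false) {
        return 'unparseable';     // parse_url() gave up on the value
    }
    if (!empty($parts['scheme'])) {
        return 'absolute';        // usable as-is
    }
    if (empty($parts['path'])) {
        return 'none';            // nothing to redirect to
    }
    if ($parts['path'][0] === '/') {
        return 'root-relative';   // reuse scheme, host and port of the old URL
    }
    return 'relative';            // resolve against the old URL's directory
}

// Made-up example values
$examples = array('http://example.com/a?b=1', '/search?q=x', 'results.php', '');
foreach ($examples as $location) {
    echo var_export($location, true), ' => ', classify_location($location), "\n";
}
```

Only the first kind can be requested directly; the other kinds need the scheme and host of the previous request, which is why the loop passes CURLINFO_EFFECTIVE_URL as the old URL.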
+0

What is $locationHeader? I have a URL, which is `$url = 'http://www.google.com/search?client=firefox-a&hl=' . $lang . '&q=' . urlencode($query);` – kezi 2012-07-11 03:06:00

+0

@KesiMaduka Sorry, I don't understand what you're asking there. Can you elaborate? – DaveRandom 2012-07-11 08:52:34

+0

Looking at http://martinsikora.com/how-to-steal-google-s-did-you-mean-feature, which variables would I pass to the function, e.g. make_url_from_location($oldUrl, $locationHeader)? Based on that code, which variables would you put in $oldUrl and $locationHeader? – kezi 2012-07-11 22:12:22