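WinHTTP: multiple asynchronous requests

I need to extract data from about 6,000 web pages. After some research, I decided to give WinHTTP a shot. I got it working, but I was doing everything synchronously, so it takes a long time to finish. I am now trying to use WinHTTP asynchronously, but I've hit a wall. I searched for tutorials and examples, but all I could find was the MSDN documentation, which seems overly complex for what I'm doing. Since I couldn't find many resources, I went ahead and gave it a shot: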
std::string theSource = "";
char * httpBuffer;
DWORD dwSize = 1;
DWORD dwRecv = 1;

HINTERNET hOpen =
    WinHttpOpen
    (
        L"Example Agent",
        WINHTTP_ACCESS_TYPE_NO_PROXY,
        NULL,
        NULL,
        WINHTTP_FLAG_ASYNC
    );

WINHTTP_STATUS_CALLBACK theCallback =
    WinHttpSetStatusCallback
    (
        hOpen,
        (WINHTTP_STATUS_CALLBACK) HttpCallback,
        WINHTTP_CALLBACK_FLAG_ALL_NOTIFICATIONS,
        NULL
    );

HINTERNET hConnect =
    WinHttpConnect
    (
        hOpen,
        L"example.org",
        INTERNET_DEFAULT_HTTPS_PORT,
        0
    );

HINTERNET hRequest = NULL;
BOOL allComplete = false;
int theRequest = 1;
while (!allComplete)
{
    if (theRequest == 1)
    {
        // Step 1: open the request and send it.
        hRequest = WinHttpOpenRequest
        (
            hConnect,
            L"GET",
            L"example.html",
            0,
            WINHTTP_NO_REFERER,
            WINHTTP_DEFAULT_ACCEPT_TYPES,
            WINHTTP_FLAG_SECURE
        );
        WinHttpSendRequest
        (
            hRequest,
            WINHTTP_NO_ADDITIONAL_HEADERS,
            0,
            WINHTTP_NO_REQUEST_DATA,
            0,
            0,
            0
        );
    }
    else if (theRequest == 2)
    {
        // Step 2: ask for the response.
        WinHttpReceiveResponse(hRequest, NULL);
    }
    else if (theRequest == 3)
    {
        // Step 3: read the headers. The first call only reports the required size.
        WinHttpQueryHeaders
        (
            hRequest,
            WINHTTP_QUERY_RAW_HEADERS_CRLF,
            WINHTTP_HEADER_NAME_BY_INDEX,
            NULL,
            &dwSize,
            WINHTTP_NO_HEADER_INDEX
        );
        WCHAR * headerBuffer = new WCHAR[dwSize/sizeof(WCHAR)];
        WinHttpQueryHeaders
        (
            hRequest,
            WINHTTP_QUERY_RAW_HEADERS_CRLF,
            WINHTTP_HEADER_NAME_BY_INDEX,
            headerBuffer,
            &dwSize,
            WINHTTP_NO_HEADER_INDEX
        );
        delete [] headerBuffer;

        // Then read the body.
        dwSize = 1;
        while (dwSize > 0)
        {
            if (!WinHttpQueryDataAvailable(hRequest, &dwSize))
            {
                break;
            }
            httpBuffer = new char[dwSize + 1];
            ZeroMemory(httpBuffer, dwSize + 1);
            if (!WinHttpReadData(hRequest, httpBuffer, dwSize, &dwRecv))
            {
                std::cout << "WinHttpReadData() - Error Code: " << GetLastError() << "\n";
            }
            else
            {
                theSource = theSource + httpBuffer;
            }
            delete [] httpBuffer;
            // Parse the source for the data I'm looking for.
            break;
        }
    }
} // end of while (!allComplete)
Here is my callback function:
// The context parameter is a DWORD_PTR value (not a DWORD pointer), matching WINHTTP_STATUS_CALLBACK.
void CALLBACK HttpCallback(HINTERNET hInternet, DWORD_PTR dwContext, DWORD dwInternetStatus, LPVOID lpvStatusInfo, DWORD dwStatusInfoLength)
{
    switch (dwInternetStatus)
    {
        case WINHTTP_CALLBACK_STATUS_HANDLE_CREATED:
            std::cout << "Handle created.\n";
            theRequest = 1;
            break;
        case WINHTTP_CALLBACK_STATUS_REQUEST_SENT:
            std::cout << "Request sent.\n";
            theRequest = 2;
            break;
        case WINHTTP_CALLBACK_STATUS_RESPONSE_RECEIVED:
            std::cout << "Response received.\n";
            theRequest = 3;
            break;
        default:
            // Log any notification that is not handled above.
            std::cout << dwInternetStatus << "\n";
            break;
    }
}
Note: I have only included the part of my code that relates to my problem. I apologize if a variable declaration is missing.
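The code above works for me and does get me the information I'm looking for, but only for a single page. Having got this far, I realized that I don't know how to issue multiple requests with this approach. Again, searching turned up nothing beyond the MSDN articles, which, as far as I can tell, don't include an example of issuing several requests at once. Also, the while loop I use to open, send, and so on, depending on the value of theRequest, seems like a terrible way to do this. I would appreciate any other suggestions for improving my code.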
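One way to avoid polling theRequest in a loop might be to let each notification decide the next step for the request it belongs to. The following is a minimal sketch in portable C++, not real WinHTTP code: Event, Action, RequestContext, and onEvent are hypothetical names, and the comments only indicate which WinHTTP notification or call each step would roughly correspond to. It assumes one RequestContext per request, whose address would be passed as the dwContext argument of WinHttpSendRequest and handed back to the callback.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-ins for the notifications one request passes through.
enum class Event {
    SendComplete,      // ~ WINHTTP_CALLBACK_STATUS_SENDREQUEST_COMPLETE
    HeadersAvailable,  // ~ WINHTTP_CALLBACK_STATUS_HEADERS_AVAILABLE
    DataAvailable,     // ~ WINHTTP_CALLBACK_STATUS_DATA_AVAILABLE
    ReadComplete       // ~ WINHTTP_CALLBACK_STATUS_READ_COMPLETE
};

// What the callback should do next for this request.
enum class Action {
    ReceiveResponse,   // would call WinHttpReceiveResponse
    QueryData,         // would call WinHttpQueryDataAvailable
    ReadData,          // would call WinHttpReadData
    Finish             // would close the request handle
};

// Per-request state (hypothetical). Each request owns its own buffer,
// so no global theSource or theRequest is shared between requests.
struct RequestContext {
    std::string url;
    std::string body;
    bool done = false;
};

// Advances one request by one step.
// For DataAvailable, len is the number of bytes waiting (data is unused).
// For ReadComplete, data/len describe the bytes that were just read.
Action onEvent(RequestContext& ctx, Event ev, const char* data, std::size_t len) {
    assert(!ctx.done);  // no notifications are expected after completion
    switch (ev) {
    case Event::SendComplete:
        return Action::ReceiveResponse;
    case Event::HeadersAvailable:
        return Action::QueryData;
    case Event::DataAvailable:
        if (len == 0) {          // zero bytes available: the body is complete
            ctx.done = true;
            return Action::Finish;
        }
        return Action::ReadData;
    case Event::ReadComplete:
        ctx.body.append(data, len);  // append by length, not as a C string
        return Action::QueryData;    // ask whether more data is waiting
    }
    return Action::Finish;
}
```

In a real callback, the context value would be cast back to a RequestContext pointer and the returned Action translated into the matching WinHTTP call on that request's handle. Because the state lives in the context rather than in globals, many requests could be in progress at once.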
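To summarize my question: I need to make about 6,000 GET requests using WinHTTP asynchronously. I'm not entirely sure how to go about this, since I'm new to WinHTTP, so I'm looking for the most basic (or, ideally, most efficient) way to handle multiple asynchronous requests.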
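For roughly 6,000 pages, starting every request at once is probably not desirable, so one possible approach is to keep only a fixed number in flight and start a new one whenever one finishes. The sketch below shows just that bookkeeping in portable C++. RequestWindow and its methods are hypothetical names, the limit of in-flight requests is an arbitrary assumption, and no WinHTTP call is made here.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical scheduler that keeps at most maxInFlight requests active.
// Note: WinHTTP delivers asynchronous callbacks on worker threads, so in
// real code calls to fill() and complete() would need to be guarded by a lock.
class RequestWindow {
public:
    RequestWindow(const std::vector<std::string>& paths, std::size_t maxInFlight)
        : pending_(paths.begin(), paths.end()), maxInFlight_(maxInFlight) {}

    // Returns the paths that should be started now to fill the window.
    // The caller would open and send one request per returned path.
    std::vector<std::string> fill() {
        std::vector<std::string> toStart;
        while (inFlight_ < maxInFlight_ && !pending_.empty()) {
            toStart.push_back(std::move(pending_.front()));
            pending_.pop_front();
            ++inFlight_;
        }
        return toStart;
    }

    // Called once per finished request, whether it succeeded or failed.
    void complete() {
        assert(inFlight_ > 0);
        --inFlight_;
        ++completed_;
    }

    bool allDone() const { return pending_.empty() && inFlight_ == 0; }
    std::size_t inFlight() const { return inFlight_; }
    std::size_t completed() const { return completed_; }

private:
    std::deque<std::string> pending_;  // paths not yet started
    std::size_t maxInFlight_;          // upper bound on concurrent requests
    std::size_t inFlight_ = 0;         // requests currently active
    std::size_t completed_ = 0;        // requests finished so far
};
```

The idea would be to call fill() once at startup, then call complete() followed by fill() each time a request finishes, until allDone() returns true. That final condition could replace a flag like allComplete.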
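Have you seen this MSDN article: Asynchronous WinHTTP (http://msdn.microsoft.com/en-us/magazine/cc716528.aspx)? Perhaps this single-threaded, multi-socket, and easy-to-understand [Perl API `HTTP::Async`](https://metacpan.org/module/HTTP::Async) can give you some inspiration on how to proceed. – Lumi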