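Crawling an application using Selenium
I want to visit every link on a website and check its status (HTTP 200, 500, and so on). I run into trouble handling the new window that opens when certain links are clicked. A few links open a new window, and a few of those then open further pages within that same window. How can I detect the new window, switch to it, and then return to the main window? Here is my code so far: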
import java.util.ArrayList;
import java.util.List;

import org.openqa.selenium.By;
import org.openqa.selenium.StaleElementReferenceException;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.htmlunit.HtmlUnitDriver;

public class TestLink {
    // link texts that have already been visited (shared across all instances)
    static List<String> links = new ArrayList<String>();
    WebDriver driver;

    public TestLink(WebDriver driver) {
        this.driver = driver;
    }

    public void linkTest() {
        try {
            // loop over all the <a> elements in the page
            for (WebElement link : driver.findElements(By.tagName("a"))) {
                // check that the link is displayed and not previously visited
                if (link.isDisplayed() && !links.contains(link.getText())) {
                    // add the link to the list of links already visited
                    links.add(link.getText());
                    System.out.println(link.getText());
                    // click on the link; this opens a new page
                    link.click();
                    // call linkTest on the new page
                    new TestLink(driver).linkTest();
                }
            }
            driver.navigate().back();
        } catch (StaleElementReferenceException e) {
            // elements found before a navigation are stale afterwards
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        WebDriver driver = new HtmlUnitDriver();
        driver.get("http://www.flipkart.com/");
        // start the recursive linkTest
        new TestLink(driver).linkTest();
    }
}
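For the window question above, one common approach is to record the set of window handles before the click and compare it with the set afterwards. The sketch below shows only that comparison in plain Java; `WindowHandles` and `newHandles` are hypothetical names, not part of Selenium. The Selenium calls it is meant to sit between (`getWindowHandle()`, `getWindowHandles()`, `switchTo().window(...)`, `close()`) appear in the comments, assuming a `WebDriver` named `driver` and a `WebElement` named `link` as in the code above.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration; not part of the Selenium API.
final class WindowHandles {

    // Returns the handles that are in 'after' but not in 'before',
    // i.e. the windows opened between the two snapshots.
    static Set<String> newHandles(Set<String> before, Set<String> after) {
        Set<String> opened = new LinkedHashSet<>(after);
        opened.removeAll(before);
        return opened;
    }

    // Intended use inside the crawl loop (assumes 'driver' and 'link'):
    //
    //   String main = driver.getWindowHandle();
    //   Set<String> before = driver.getWindowHandles();
    //   link.click();
    //   for (String handle : newHandles(before, driver.getWindowHandles())) {
    //       driver.switchTo().window(handle); // inspect the new window here
    //       driver.close();                   // close it when finished
    //   }
    //   driver.switchTo().window(main);       // back to the main window
}
```

If the click opens no new window, `newHandles` returns an empty set and the loop body is skipped, so the same code path covers links that load in the current window.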
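Edit
The code below works fine for a URL passed in as a string, but I want the status code of every link on the site. How can I build the URL for each link dynamically?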
import java.io.IOException;

import com.gargoylesoftware.htmlunit.WebClient;

// Returns the HTTP status code for the given URL, or 0 if url is null.
public static int getResponseCode(String url) {
    if (url == null) {
        return 0;
    }
    try {
        WebClient client = new WebClient();
        // report failing status codes (4xx/5xx) instead of throwing an exception
        client.getOptions().setThrowExceptionOnFailingStatusCode(false);
        return client.getPage(url).getWebResponse().getStatusCode();
    } catch (IOException ioe) {
        throw new RuntimeException(ioe);
    }
}
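One way to build the URL for each link, sketched here under assumptions: read the `href` of every `a` element (for example with `link.getAttribute("href")`), resolve it against the current page URL, and pass each result to `getResponseCode`. WebDriver implementations generally return `href` already resolved, but the helper below does not rely on that. `LinkUrls` and `toAbsoluteUrls` are hypothetical names; only `java.net.URI` from the standard library is used.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of Selenium or HtmlUnit.
final class LinkUrls {

    // Resolves raw href values against the page URL and returns the
    // unique http/https URLs in first-seen order, with fragments removed.
    static List<String> toAbsoluteUrls(String pageUrl, List<String> hrefs) {
        URI base = URI.create(pageUrl);
        Set<String> unique = new LinkedHashSet<>();
        for (String href : hrefs) {
            if (href == null || href.trim().isEmpty()) {
                continue; // anchor without a usable href
            }
            URI resolved;
            try {
                resolved = base.resolve(href.trim());
            } catch (IllegalArgumentException e) {
                continue; // href is not a syntactically valid URI
            }
            String scheme = resolved.getScheme();
            if (!"http".equalsIgnoreCase(scheme) && !"https".equalsIgnoreCase(scheme)) {
                continue; // skip javascript:, mailto:, and similar
            }
            String url = resolved.toString();
            int hash = url.indexOf('#');
            if (hash >= 0) {
                url = url.substring(0, hash); // the fragment does not change the request
            }
            unique.add(url);
        }
        return new ArrayList<>(unique);
    }
}
```

Checking status codes from a collected list of URLs, rather than clicking each link, also avoids the stale-element and new-window handling altogether, since the browser never leaves the page being scanned.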
Thanks @Erki M. Please take a look at my edited question and let me know if you have any thoughts on it. – 2014-09-03 04:14:49