Subscript out of bounds error when scraping with XPath and the rvest package

I am trying to scrape a table from a website using the rvest package:

library("rvest") 
uci_html <- read_html("http://archive.ics.uci.edu/ml/datasets.html") 
uci_data <- uci_html %>% 
    html_nodes(xpath="/html/body/table[2]/tbody/tr/td[2]/table[2]") %>% 
    html_table() 
uci_data <- uci_data[[1]]

As far as I can tell from the examples I have seen, the format I am using should work, but R does not scrape any data, so I get this error:

Error in uci_data[[1]] : subscript out of bounds

Do you know why this happens and what I can do to scrape the data?

Source

2017-04-17 Neil Barmecha

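I don't fully understand why, but it looks like the tbody step in the XPath is unnecessary. With it removed, the table node is found: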

library("rvest")
uci_html <- read_html("http://archive.ics.uci.edu/ml/datasets.html")
# Same XPath as in the question, minus the tbody step
uci_data <- uci_html %>%
    html_nodes(xpath="/html/body/table[2]/tr/td[2]/table[2]") %>%
    html_table(fill=TRUE)
uci_data <- uci_data[[1]]
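
To illustrate why the extra tbody step can lead to this error, here is a minimal sketch on a small inline HTML string (a made-up page, not the UCI site). It assumes the usual situation where the raw HTML has no tbody tag: browsers add one when they build the DOM, so an XPath copied from browser developer tools may contain a step that the parser used by rvest never sees.

```r
library(rvest)

# Hypothetical page source: the table has no <tbody> tag in the raw HTML
page <- read_html(paste0(
  "<html><body><table>",
  "<tr><td>a</td><td>b</td></tr>",
  "<tr><td>1</td><td>2</td></tr>",
  "</table></body></html>"
))

# XPath with the tbody step versus without it
with_tbody <- html_nodes(page, xpath = "/html/body/table/tbody/tr")
without_tbody <- html_nodes(page, xpath = "/html/body/table/tr")

length(with_tbody)     # number of rows matched with tbody in the path
length(without_tbody)  # number of rows matched without it

# html_table() on an empty node set gives an empty list,
# so indexing it with [[1]] raises "subscript out of bounds"
tables <- html_table(html_nodes(page, xpath = "/html/body/table/tbody"))
length(tables)
```

Checking length() of the node set before calling html_table() is a quick way to tell whether the XPath matched anything at all.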

Another approach is to select by HTML tag:

tables<-uci_html %>% html_nodes("table") 
html_table(tables[6], fill=TRUE)[[1]]

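Identifying the sixth table as the one of interest took some trial and error, but I find selecting by HTML tag easier than writing XPath expressions.

Source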
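
One way to cut down on the trial and error is to parse every table and compare their dimensions. The sketch below uses a made-up inline page with two tables (the table contents are illustrative assumptions, not data from the UCI site) and picks the table with the most rows. Row count is only a heuristic: on a page with nested layout tables, an outer table may come out larger than the one you want, so it is worth looking at the dimensions before trusting the pick.

```r
library(rvest)

# Hypothetical page with a small layout table and a larger data table
page <- read_html(paste0(
  "<html><body>",
  "<table><tr><td>menu</td></tr></table>",
  "<table>",
  "<tr><th>Name</th><th>Instances</th></tr>",
  "<tr><td>Iris</td><td>150</td></tr>",
  "<tr><td>Wine</td><td>178</td></tr>",
  "</table>",
  "</body></html>"
))

# Parse all tables, then list rows and columns for each one
tables <- html_nodes(page, "table")
parsed <- html_table(tables)
dims <- t(sapply(parsed, dim))  # one row per table: (rows, columns)
dims

# Pick the table with the most rows as the likely data table
biggest <- which.max(dims[, 1])
parsed[[biggest]]
```

On the real page, the index found this way can then be used in place of the hard-coded 6.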

2017-05-27 18:18:06 Dave2e
