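httr POST with hidden fields
I am trying to get a list of document delivery protocol numbers so that I can scrape some financial statements.
The URL below links to every document category for the given company.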
u1 <- "http://siteempresas.bovespa.com.br/consbov/ExibeTodosDocumentosCVM.asp?CCVM=22446&CNPJ=09.414.761/0001-64&TipoDoc=C"
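Clicking DFP redirects me to a page that lists the different protocol numbers. The problem is that I cannot get the same result in R.
I tried httr::POST without success.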
library(httr)
# Fetch the listing page and read back the cookies it set
page <- GET(u1, encoding = "ISO-8859-1")
key <- cookies(page)
# Post the hidden field and the form action back to the same URL
pgpost <- POST(u1,
               body = list(hdnCategoria = "IDI2",
                           action = "ExibeTodosDocumentosCVM.asp?CNPJ=09.414.761/0001-64&CCVM=22446&TipoDoc=C&QtLinks=10"),
               set_cookies(ASPSESSIONIDQATQCCSC = key$value[1],
                           TS01871345 = key$value[2],
                           ASPSESSIONIDSQQTABSC = key$value[3],
                           ASPSESSIONIDSCDSBADC = key$value[4]))
# Decode the response and split it into lines for inspection
pgcont <- content(pgpost, "text", encoding = "ISO-8859-1")
pgcont <- strsplit(pgcont, "\r")[[1]]
pgcont <- gsub('[\n\t]', "", pgcont)
pgcont
This returns the same content as u1.
I also tried clicking the link with rvest:
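One variation that may be worth trying is sketched below. It rests on two assumptions that have not been checked against the site: that the form expects a URL-encoded body rather than the multipart body httr sends by default for a list, and that `action` is the URL the form submits to rather than a field in the body. `fetch_category()` is a hypothetical helper name. Because httr keeps cookies between requests to the same host, the sketch does not copy them by hand.

```r
library(httr)

# Assumption: the form posts the hidden field hdnCategoria to the URL that
# the attempt above passed as `action`; the site's real form was not checked.
base_url <- "http://siteempresas.bovespa.com.br/consbov/"
action <- "ExibeTodosDocumentosCVM.asp?CNPJ=09.414.761/0001-64&CCVM=22446&TipoDoc=C&QtLinks=10"
target <- paste0(base_url, action)

# fetch_category() is a hypothetical helper, not part of httr.
fetch_category <- function(target, category) {
  # httr reuses one handle per host, so cookies set by this GET are sent
  # again with the POST without copying them via set_cookies().
  GET(target)
  resp <- POST(target,
               body = list(hdnCategoria = category),
               encode = "form")  # URL-encoded body instead of multipart
  stop_for_status(resp)
  content(resp, "text", encoding = "ISO-8859-1")
}

# Example call (needs network access):
# html <- fetch_category(target, "IDI2")
```

If the returned page still matches u1, comparing this request with the one the browser sends (developer tools, network tab) would show which field or header differs.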
library(rvest)
s <- html_session(u1)
s %>% follow_link("DFP")
but that ends with this error message:
[1] Navigating to javascript:fVisualizaDocumentos('C','IDI2')
Error in curl::curl_fetch_memory(url, handle = handle) :
Couldn't resolve host name
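The error appears to come from the link's href being a `javascript:` call rather than a URL, so there is no host to resolve. A minimal sketch of pulling the arguments out of that href with base R follows; treating the second argument as the value for `hdnCategoria` is an assumption based on it matching the "IDI2" used in the POST attempt.

```r
# The href that follow_link() tried to navigate to, copied from the output above.
href <- "javascript:fVisualizaDocumentos('C','IDI2')"

# Pull out every single-quoted argument of the JavaScript call.
js_args <- regmatches(href, gregexpr("'[^']*'", href))[[1]]
js_args <- gsub("'", "", js_args, fixed = TRUE)

# Assumption: the second argument is the category code the form submits.
category <- js_args[2]
category
```

The extracted code could then be sent in a form POST instead of following the link.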
Any ideas on how to solve this? Thanks in advance!
Here is a picture of the information I'm looking for