Hi, I'm new to R. Working from a couple of web tutorials, I've put together a data-mining script, but I'm trying to figure out how to automate it so that it appends new data on each run rather than my having to rerun the code by hand every time. I'd appreciate it if anyone could point me in the right direction. It's a simple R project.

Here is the script:
# loading the package is required once each session
require(XML)
# initialize a storage variable for Twitter tweets
mydata.vectors <- character(0)
# paginate to get more tweets
for (page in c(1:15))
{
# search parameter
twitter_q <- URLencode('#google OR #apple')
# construct a URL
twitter_url <- paste('http://search.twitter.com/search.atom?q=', twitter_q, '&rpp=100&page=', page, sep='')
# fetch remote URL and parse
mydata.xml <- xmlParseDoc(twitter_url, asText=F)
# extract the titles
mydata.vector <- xpathSApply(mydata.xml, '//s:entry/s:title', xmlValue, namespaces =c('s'='http://www.w3.org/2005/Atom'))
# aggregate new tweets with previous tweets
mydata.vectors <- c(mydata.vector, mydata.vectors)
}
# how many tweets did we get?
length(mydata.vectors)
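To have the script accumulate results across separate runs (rather than only within one session), one option is to persist the vector to disk and merge on each run. Below is a minimal sketch of that idea; the file name `tweets.rds` and the use of `unique()` to drop duplicate tweets are my own assumptions, not part of the original script.

```r
# Hypothetical persistence step: the file name 'tweets.rds' is arbitrary.
store <- 'tweets.rds'

# Load any tweets saved by a previous run (empty vector on first run).
old_tweets <- if (file.exists(store)) readRDS(store) else character(0)

# Merge the freshly scraped tweets with the stored ones,
# dropping exact duplicates (an assumption about what you want).
all_tweets <- unique(c(old_tweets, mydata.vectors))

# Save the combined vector back for the next run.
saveRDS(all_tweets, store)
length(all_tweets)
```

Running the whole script from the command line (e.g. via `Rscript` on a scheduler such as cron) would then grow `tweets.rds` over time without any manual steps.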
Which bit exactly is being 'written'? From your code, 'mydata.vectors' will already contain all of your results so far. – 2012-03-27 02:45:07