Selecting specified rows when importing a CSV

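I have a very large CSV file and I only want to import selected rows from it. First, I create an index of the rows to be imported; then I want to pass those row ids to sqldf and get back the complete records for the specified rows.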

#create the random rows ids that will be sampled 
library(dplyr) 
#range for the values 
index<-c(1:20) 
index<-as.data.frame(as.matrix(index)) 
#number of values to be returned 
number<-5 
ids<-sample_n(index,number) 

#sample the data 
library(sqldf) 
#filepath 
f<-file("/Users/.../filename.csv") 
#select data  
df<-sqldf("select * from f")

How can I import a selection of rows from a CSV file by specifying the row numbers?

2015-05-29 SharkSandwich

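What is the question? – zx8754

How to import selected rows from a CSV file by specifying the row numbers. – SharkSandwich

[Related post: fastest way to read a subset of rows of a CSV](http://stackoverflow.com/a/25244592/680068) – zx8754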

Try this example:

library(sqldf) 

#dummy csv 
write.csv(data.frame(myid=1:10,var=runif(10)),"temp.csv") 

#define ids 
ids <- c(1,3,4) 
ids <- paste(ids,collapse = ",") 

f <- file("temp.csv") 

#query with subset 
fn$sqldf("select * 
      from f 
      where myid in ($ids)", 
      file.format = list(header = TRUE, sep = ",")) 

#output 
#     X myid       var 
# 1 "1"    1 0.2310945 
# 2 "3"    3 0.8825055 
# 3 "4"    4 0.6655517 

close(f)
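
To connect this with the sampling step in the question, here is a minimal sketch of building the same query text explicitly with `sprintf` instead of the `$ids` interpolation. `build_query` and its arguments are hypothetical names made up for illustration, and it assumes the ids are whole numbers. The resulting string could then be passed to `sqldf()` together with the same `file.format` argument.

```r
# Hypothetical helper (not part of sqldf): build the SQL text for a vector of ids.
# Assumes the ids are whole numbers; as.integer() also avoids "1e+05"-style text.
build_query <- function(ids, table = "f", column = "myid") {
  stopifnot(is.numeric(ids), length(ids) > 0)
  sprintf("select * from %s where %s in (%s)",
          table, column, paste(as.integer(ids), collapse = ","))
}

# Sample 5 ids out of 1..20, as in the question (base R instead of dplyr::sample_n)
ids <- sample(1:20, 5)
query <- build_query(ids)
print(query)
```

Keeping the ids as a plain vector until the last moment means they can still be reused elsewhere, e.g. to check that every requested row came back.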

2015-05-29 11:20:58 zx8754

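@G.Grothendieck, for some reason I couldn't get an external variable to work with the IN operator, so I used 'paste' instead. – zx8754

Thanks. I was trying something similar but couldn't get the 'where' clause to work. – SharkSandwich

@G.Grothendieck yes, this works (I updated the post); what I meant was avoiding collapsing the ids into a single string. – zx8754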

Or perhaps something in base R like this...

# dummy csv 
write.csv(data.frame(myid=1:10, var=runif(10)),"temp.csv") 

# define ids 
ids <- c(1,3,4) 

# skip 2 lines, then read 2 rows (rows 3 and 4); 
# the line for row 2 is consumed as the header, hence the odd column names below 
read.table("temp.csv", header=T, sep=",", skip=2, nrows=2) 

##   X2 X2.1 X0.406697876984254 
## 1  3    3          0.6199803 
## 2  4    4          0.0271722 


# selective line retrieval function 
dummy <- function(file, ids){ 
    # read one row per id; skip = id works because the header occupies line 1 
    tmp <- 
    mapply(
     read.table, 
     skip=ids, 
     MoreArgs= list(nrows=1, file=file, sep=",") , 
     SIMPLIFY = FALSE 
    ) 
    tmp_df <- do.call(rbind.data.frame, tmp) 
    # take the column names from the header of the same file 
    names(tmp_df) <- names(read.table(file, header=TRUE, sep=",", nrows=1)) 
    return(tmp_df) 
} 

# et voila 
dummy("temp.csv", ids) 

##   X myid       var 
## 1 1    1 0.9040861 
## 2 3    3 0.6027502 
## 3 4    4 0.6829611
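
The function above re-opens the file once per id, which may get slow for many ids on a large file. As a rough sketch of an alternative, the file can be read once through a connection in chunks, keeping only the wanted lines. `read_selected_rows` and `chunk_size` are hypothetical names, and the sketch assumes each record sits on exactly one physical line (no line breaks inside quoted fields) and that ids are positive row numbers counted after the header.

```r
# Hypothetical helper: single pass over the file, keeping only the requested rows.
read_selected_rows <- function(path, ids, chunk_size = 1000L) {
  ids <- sort(unique(as.integer(ids)))
  con <- file(path, open = "r")
  on.exit(close(con))
  header <- readLines(con, n = 1L)     # first line holds the column names
  kept <- character(0)
  offset <- 0L                         # number of data lines read so far
  while (length(ids) > 0L) {
    chunk <- readLines(con, n = chunk_size)
    if (length(chunk) == 0L) break     # end of file reached
    last <- offset + length(chunk)
    hit <- ids[ids <= last]            # requested rows that fall in this chunk
    kept <- c(kept, chunk[hit - offset])
    ids <- ids[ids > last]             # rows still to be found
    offset <- last
  }
  # parse the header plus the kept lines as a small CSV
  read.csv(text = c(header, kept), header = TRUE)
}

# usage on a dummy file like the one above (written to a temporary path)
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(myid = 1:10, var = runif(10)), tmp)
read_selected_rows(tmp, c(1, 3, 4))
```

Rows come back in file order rather than in the order the ids were given, and reading stops as soon as the last requested row has been found.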

2015-05-29 11:13:40 petermeissner
