R - Using the name of a list in a for loop

I am working in R with 10 lists (files1, files2, files3, ... files10). Each list contains multiple data frames.

Now I want to extract some values from each data frame in each list.

I was planning to use a for loop:

nt = c("A", "C", "G", "T") 
for (i in files1) { 
    for (j in nt) { 
     name = paste(j, i, sep = "-") # here I want as output name = "files1-A". However this doesn't work. How can I get the name of the list "files1"? 
     colname = paste("percentage", j, sep = "") # here I want as output colname = "percentageA". This works 
     assign(name, unlist(lapply(i, function(x) x[here I want to use the column with the name "percentageA", so 'colname'][x$position==1000]))) 
    } 
}

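So I am having trouble getting the name of the list and using it to build a variable name.

I know this only loops over the first list. Is it also possible to loop over all my lists at once?

In other words: how can I put the code below in a for loop?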

A_files1 = unlist(lapply(files1, function(x) x$percentageA[x$position==1000])) 
C_files1 = unlist(lapply(files1, function(x) x$percentageC[x$position==1000])) 
G_files1 = unlist(lapply(files1, function(x) x$percentageG[x$position==1000])) 
T_files1 = unlist(lapply(files1, function(x) x$percentageT[x$position==1000])) 

A_files2 = unlist(lapply(files2, function(x) x$percentageA[x$position==1000])) 
C_files2 = unlist(lapply(files2, function(x) x$percentageC[x$position==1000])) 
G_files2 = unlist(lapply(files2, function(x) x$percentageG[x$position==1000])) 
T_files2 = unlist(lapply(files2, function(x) x$percentageT[x$position==1000])) 

.... 

A_files10 = unlist(lapply(files10, function(x) x$percentageA[x$position==1000])) 
C_files10 = unlist(lapply(files10, function(x) x$percentageC[x$position==1000])) 
G_files10 = unlist(lapply(files10, function(x) x$percentageG[x$position==1000])) 
T_files10 = unlist(lapply(files10, function(x) x$percentageT[x$position==1000]))

2016-12-29 user1987607

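Does 'names(files1)' return 'NULL'?

@joel.wilson: Yes, it does. – user1987607

Posting sample data, for example 2-3 files, would be great for getting a working example. See [how to make a reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5965451#5965451). In general, to read multiple files, I create a function(variable1, variable2) that returns a data frame from a single file. Then I use the 'dplyr' package with 'group_by(variable1, variable2)' and 'do(myfunction(.$variable1, .$variable2))' to read multiple files. This matters because it gets all the data into a single data frame.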

To answer your question, I created a fake list containing data frames:

n = data.frame(andrea=c(1983, 11, 8),paja=c(1985, 4, 3)) 
s = data.frame(col1=c("aa", "bb", "cc", "dd", "ee")) 
b = data.frame(col1=c(TRUE, FALSE, TRUE, FALSE, FALSE)) 
x = list(n, s, b, 3) # x contains copies of n, s, b 
names(x) <- c("dataframe1","dataframe2","dataframe3","dataframe4") 
files1 = x

Now, here is what happens inside your loop:

i = files1 
j = "A"

If you want to use the data frame names with the suffix taken from nt (in this example j = "A"), you have to use names(i):

name_wrong = paste(j, i, sep = "-") 
name  = paste(names(i),j,sep = "-")

So you get:

> name 
[1] "dataframe1-A" "dataframe2-A" "dataframe3-A" "dataframe4-A"

I hope this is what you need.

2016-12-29 11:39:46

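This is not exactly what I want. I don't want to list all the data frames, I only want to use the name of my list. – user1987607

How about putting all your lists inside one list: 'biglist <- list(files1 = files1)'. Then 'names(biglist)' will return '[1] "files1"'.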
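The suggestion in the comment above could be sketched roughly as follows. This is only an illustration under assumptions: the helper make_df and its numbers are made-up stand-in data, and results is a hypothetical named list used here instead of assign(). The column is picked by its string name with x[[colname]].

```r
# Hypothetical stand-in data: two lists, each holding two data frames.
make_df <- function(a) {
  data.frame(position    = c(999, 1000),
             percentageA = c(0, a),
             percentageC = c(0, a + 1),
             percentageG = c(0, a + 2),
             percentageT = c(0, a + 3))
}
files1 <- list(make_df(10), make_df(20))
files2 <- list(make_df(30), make_df(40))

# Collect the lists in one named list so their names are available as strings.
# With ten lists already in the workspace, mget(paste0("files", 1:10))
# would build the same kind of named list.
biglist <- list(files1 = files1, files2 = files2)

nt <- c("A", "C", "G", "T")
results <- list()
for (listname in names(biglist)) {
  for (j in nt) {
    colname <- paste0("percentage", j)
    name <- paste(j, listname, sep = "_")  # e.g. "A_files1"
    # x[[colname]] selects the column whose name is stored in colname
    results[[name]] <- unlist(lapply(biglist[[listname]],
                                     function(x) x[[colname]][x$position == 1000]))
  }
}

results$A_files1
```

Keeping the vectors in one named list avoids creating 40 separate variables. If individual variables such as A_files1 are really needed, list2env(results, envir = globalenv()) would create them from the list.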

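I think this data would be easier to manipulate if you flatten the data structure. Instead of 10 lists of data frames, you could use a single data frame in which every observation is indexed by its item name and file name.

Generate sample data and use the code from the question

Simplified data with only 10 or 11 points per item. I assume the items in your list have different numbers of rows?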

files1 <- list(item1 = data.frame(position = 1:10, 
            percentageA = 1:10/10, 
            percentageC = 1:10/10, 
            percentageG = 1:10/10, 
            percentageT = 1:10/10), 
       item2 = data.frame(position = 1:11, 
            percentageA = 1:11/20, 
            percentageC = 1:11/20, 
            percentageG = 1:11/20, 
            percentageT = 1:11/20)) 
str(files1) 

# Select the 9th position using your code 
A_files1 = unlist(lapply(files1, function(x) x$percentageA[x$position==9])) 
C_files1 = unlist(lapply(files1, function(x) x$percentageC[x$position==9])) 
G_files1 = unlist(lapply(files1, function(x) x$percentageG[x$position==9])) 
T_files1 = unlist(lapply(files1, function(x) x$percentageT[x$position==9]))

Flatten the list of data frames into a single data frame

# Add name to each data frame 
# Inspired by this answer 
# http://stackoverflow.com/a/18434780/2641825 


# For information l[1] creates a single list item 
# l[[1]] extracts the data frame from the list 
#' @param i index 
#' @param listoffiles list of data frames 
addname <- function(i, listoffiles){ 
    dtf <- listoffiles[[i]] # Extract the dataframe from the list 
    dtf$name <- names(listoffiles[i]) # Add the name inside the data frame 
    return(dtf) 
} 
# Add the name inside each data frame 
files1 <- lapply(seq_along(files1), addname, files1) 
str(files1) # look at the structure of the list 
files1table <- Reduce(rbind,files1) 

# Get the values of interest with 
files1table$percentageA[files1table$position == 9] 
# [1] 0.90 0.45 

# Get all Letters of interest with 
subset(files1table,position==9) 

#    position percentageA percentageC percentageG percentageT  name
# 9         9        0.90        0.90        0.90        0.90 item1
# 19        9        0.45        0.45        0.45        0.45 item2
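As a possible alternative to the index-based addname helper, the same flattening might be written with Map() and do.call(rbind, ...). This is a minimal sketch, not part of the original answer, and it uses a reduced version of the sample data with only two columns.

```r
# Reduced sample data (assumption: only position and percentageA are kept)
files1 <- list(item1 = data.frame(position = 1:10, percentageA = 1:10/10),
               item2 = data.frame(position = 1:11, percentageA = 1:11/20))

# Map() walks over the data frames and their names in parallel
named <- Map(function(dtf, nm) {
  dtf$name <- nm  # store the list name inside the data frame
  dtf
}, files1, names(files1))

# Bind all data frames by row and drop the generated row names
files1table <- do.call(rbind, named)
rownames(files1table) <- NULL

files1table$percentageA[files1table$position == 9]
```

Unlike the lapply(seq_along(...)) version, this does not overwrite files1, so the original named list stays available.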

Flatten the list of lists of data frames into a single data frame

# Now create another list, files2, a duplicate just for the sake of the example 
files2 <- files1 
# files1 and files2 both have a name column inside their data frames already 
# Create a list of list of dataframes 
lolod <- list(files1 = files1, files2 = files2) 
str(lolod) # a list of lists 
# Flatten to a list of dataframes 
# Use sapply to keep names based on this answer http://stackoverflow.com/a/9469981/2641825 
lod <- sapply(lolod, Reduce, f=rbind, simplify = FALSE, USE.NAMES = TRUE) 
# Add the name inside each data frame again 
addfilename <- function(i, listoffiles){ 
    dtf <- listoffiles[[i]] # Extract the dataframe from the list 
    dtf$filename <- names(listoffiles[i]) # Add the name inside the data frame 
    return(dtf) 
} 
lod <- lapply(seq_along(lod), addfilename, lod) 


# Flatten to a dataframe 
d <- Reduce(rbind, lod) 
# Now the data structure is flattened and much easier to deal with 

subset(d,position==9) 
#    position percentageA percentageC percentageG percentageT  name filename
# 9         9        0.90        0.90        0.90        0.90 item1   files1
# 19        9        0.45        0.45        0.45        0.45 item2   files1
# 30        9        0.90        0.90        0.90        0.90 item1   files2
# 40        9        0.45        0.45        0.45        0.45 item2   files2
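Once the data is in one flat table, vectors like A_files1 from the question can be recovered by splitting on the filename column. The sketch below assumes a small hand-written table with the same column names as d (only percentageA is included), so it runs on its own. A_by_file and at9 are hypothetical names.

```r
# Hypothetical flattened table shaped like d above (percentageA only)
d <- data.frame(position    = c(9, 9, 9, 9, 10),
                percentageA = c(0.90, 0.45, 0.90, 0.45, 1.00),
                name        = c("item1", "item2", "item1", "item2", "item1"),
                filename    = c("files1", "files1", "files2", "files2", "files1"))

# Keep the rows at the position of interest
at9 <- d[d$position == 9, ]

# One vector of percentageA values per original list
A_by_file <- split(at9$percentageA, at9$filename)

A_by_file$files1
```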

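This answer became longer than I expected. I hope I didn't scare you. Simplifying the data structure, in the spirit of tidy data, will help you later on. If names had been provided in your original data, this elaborate list renaming might not be necessary.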

2016-12-29 23:58:12
