2017-04-26

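I wrote some code to sort my large data set. I tested it on a single file (e.g. "K_7-1_0H.TXT") and it worked fine, with no warnings or error messages, and I got the correct .csv file. However, when I run it in a loop, I get an error saying the number of dimensions is incorrect: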

Error in `[.default`(data, 1,) : incorrect number of dimensions 

Here is my code:

kcl <- list.files(pattern = "K_.*\\.TXT",recursive = TRUE) 
c <- c(0, 0.001,0.01,0.05,0.25,0.5,1.0,1.5,2.0,2.5,3.0,3.4) 
Kconc. <- rep(c,each=8) 
for (i in kcl) { 
    data <- read.csv(i,header=FALSE,sep = ",") 
    data <- data[-c(1:14),] #delete noise info 
    colnames(data) <- as.character(unlist(data[1,])) #add column name 
    data <- data[-1,] 
    data <- data[,-1] 
    #change cell number and sort 
    data$`Well Label` <- as.character(data$`Well Label`) 
    data[1,1] <- "T01" 
    data[12,1] <- "T02" 
    ... 
    data[89,1] <- "T09" 
    data <- arrange(data, `Well Label`) #sort according to table number 
    data <- data[-1,] #delete noise info 
    data <- cbind(data,Kconc.) 
    j <- sub("\\.[[:alnum:]]+$","",i) #get the isolate name without the extension
    write.csv(data, paste0(j,".csv")) 
} 

Here is the content of the list:

> kcl 
[1] "K_10-1_0.TXT" "K_10-3_0.TXT" 
[9] "K_10-3_6.TXT" "K_10-3_7.TXT" "K_11-1_8.TXT" 
[17] "K_11-2_0.TXT" "K_11-3_8.TXT" 
[25] "K_7-1_0H.TXT" "K_7-3_82.TXT" "K_8-1_0H.TXT" "K_8-1_60.TXT" "K_8-1_72.TXT" "K_8-1_84.TXT" 
[49] "K_9-1_0Z.TXT" "K_9-1_60.TXT" "K_9-1_72.TXT" "K_9-1_84.TXT" "K_9-2-84.TXT" 

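When I checked my files, files like "K_10*.csv" and "K_11*.csv" had been created and contained what I wanted, but files like "K_7*.TXT", "K_8*.TXT" and "K_9*.TXT" did not work at all: no .csv was created for them.

I really don't understand the error message, or why the code works for only some of the files. Can anyone help me?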


Edit: input and expected output
The input is a .txt file like this:

[Assay],C:\REVEL\650-S.ASY 
"Assay title",Untitled Assay 
"Read Time",11.04.17,13:04:00 
"Operator", 
"Comments", 
"Kit Lot Number",, 
"Wells",A1 - H12 
OD RESULTS 
"Units",OD 

[Results],Results are sorted on Sample ID,in ascending order 

"Sample ID","Well Label","OD Results" 
"T1","T1",0.045 
"T10","T10",0.044 
"T11","T11",0.045 
"T2","T2",0.045 

Expected output:

  Well Label OD Results Hconc.
2        T01      0.189      0
3        T02       0.11      0
4        T03      0.151      0
5        T04      0.053      0
+2

Presumably at some point `data` has been converted to a vector. You need to add `drop = FALSE` when subsetting to avoid this. – Cath

+2

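Simplify your post, and provide example input and expected output. Also, a few other minor points: avoid using `c` as a variable name, use header = TRUE, use stringsAsFactors = FALSE, and maybe change the for loop to lapply. – zx8754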

+0

@Cath At which step should I add `drop = FALSE`? In the `list.files` step? Thanks! – Ziming
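The comment about `drop = FALSE` can be illustrated with a small sketch. It assumes, without confirmation from the thread, that some of the failing files are parsed by `read.csv` into a single column; `one_col` is a hypothetical stand-in for such a file. Under that assumption the flag belongs on the row/column subsetting steps such as `data[-c(1:14), ]`, not on `list.files`.

```r
# Hypothetical stand-in for a file that read.csv parsed into a single column.
one_col <- data.frame(V1 = c("noise", "Sample ID", "T1", "T2"),
                      stringsAsFactors = FALSE)

# Row subsetting on a one-column data frame silently drops to a plain vector.
dropped <- one_col[-1, ]
is.data.frame(dropped)   # FALSE

# Two-index subsetting on that vector then fails with a dimension error.
err <- tryCatch(dropped[1, ], error = function(e) e)
inherits(err, "error")   # TRUE

# drop = FALSE keeps the data frame, so later [i, j] calls keep working.
kept <- one_col[-1, , drop = FALSE]
is.data.frame(kept)      # TRUE
```

Checking `ncol(data)` right after `read.csv` for one of the failing files would show whether this is what actually happens.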

Answers

0

Try this example:

library(dplyr) 

# skip info rows 
df1 <- read.csv("test.txt", skip = 12, stringsAsFactors = FALSE) 
Kconc <- c(0, 0.001,0.01,0.05,0.25,0.5,1.0,1.5,2.0,2.5,3.0,3.4) 

# Prefix with zero, e.g. T1 to T01, then sort
res <- 
    df1 %>% 
    transmute(
    `Well Label` = if_else(nchar(df1$Sample.ID) == 2, 
          paste0(substr(df1$Sample.ID, 1, 1), 
            0, 
            substr(df1$Sample.ID, 2, 3)), 
          df1$Sample.ID), 
    `OD Results` = OD.Results) %>% 
    arrange(`Well Label`) 
res 
# Well Label OD Results 
# 1  T01  0.045 
# 2  T02  0.045 
# 3  T10  0.044 
# 4  T11  0.045 

Then cbind Kconc; because it is recycled, there is no need for rep. In this example we have only 4 rows, so to get the correct result here we need to use res <- cbind(res, Kconc[1:4])

res <- cbind(res, Kconc) 
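A short sketch of how recycling behaves here may help when choosing between plain `cbind` and `rep`. `res96` is a hypothetical 96-row stand-in for a full plate, not data from the question. Recycling repeats the whole vector end to end (like `rep(Kconc, times = 8)`), which is a different order from the `rep(c, each = 8)` used in the question, so which one is right depends on how the wells are laid out.

```r
Kconc <- c(0, 0.001, 0.01, 0.05, 0.25, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.4)

# Hypothetical 96-row result standing in for a full plate.
res96 <- data.frame(`Well Label` = sprintf("T%02d", 1:96),
                    check.names = FALSE, stringsAsFactors = FALSE)

# Recycling: the 12 values are repeated as a whole, 8 times over.
recycled <- cbind(res96, Kconc)

# Explicit rep(each = 8): every value is repeated 8 times in a row.
blocked <- cbind(res96, Kconc = rep(Kconc, each = 8))

# A 12-element vector cannot be recycled into 4 rows, so this is an error.
short <- tryCatch(cbind(res96[1:4, , drop = FALSE], Kconc),
                  error = function(e) e)
```

Comparing `head(recycled$Kconc, 8)` with `head(blocked$Kconc, 8)` shows the difference in ordering directly.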

Alternatively, read about natural ordering with mixedorder from gtools:

df1[ gtools::mixedorder(df1$Sample.ID), ] 
# Sample.ID Well.Label OD.Results 
# 1  T1   T1  0.045 
# 4  T2   T2  0.045 
# 2  T10  T10  0.044 
# 3  T11  T11  0.045 
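If adding a package is not an option, a base R sketch of the same natural ordering is to sort on the numeric part of the label. This assumes every label is the letter "T" followed only by digits, as in the sample input; the data frame below is typed in by hand from that sample.

```r
# Sample rows copied from the test input.
df1 <- data.frame(
  Sample.ID  = c("T1", "T10", "T11", "T2"),
  OD.Results = c(0.045, 0.044, 0.045, 0.045),
  stringsAsFactors = FALSE
)

# Strip the leading "T" and order by the remaining number.
num <- as.integer(sub("^T", "", df1$Sample.ID))
sorted <- df1[order(num), ]
sorted$Sample.ID   # "T1" "T2" "T10" "T11"
```

Labels that do not match the assumed pattern would become `NA` (with a warning) and sort last, so this is worth checking on real files.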

test.txt:
[Assay],C:\REVEL\650-S.ASY 
"Assay title",Untitled Assay 
"Read Time",11.04.17,13:04:00 
"Operator", 
"Comments", 
"Kit Lot Number",, 
"Wells",A1 - H12 
OD RESULTS 
"Units",OD 

[Results],Results are sorted on Sample ID,in ascending order 

"Sample ID","Well Label","OD Results" 
"T1","T1",0.045 
"T10","T10",0.044 
"T11","T11",0.045 
"T2","T2",0.045 
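To apply the same steps to many files, as the `lapply` suggestion in the comments hints, one possible sketch wraps them in a function. `process_file` is a hypothetical helper, it uses base R `sprintf` for the zero padding instead of dplyr, and it assumes the header row is always line 13 and labels are "T" plus digits. A temporary copy of test.txt is written first so the sketch runs on its own.

```r
# Hypothetical helper: read one export and return it sorted by padded label.
process_file <- function(path) {
  # Assumes the column header is always on line 13, as in test.txt.
  df <- read.csv(path, skip = 12, stringsAsFactors = FALSE)
  num <- as.integer(sub("^T", "", df$Sample.ID))
  out <- data.frame(`Well Label` = sprintf("T%02d", num),  # T1 -> T01
                    `OD Results` = df$OD.Results,
                    check.names = FALSE, stringsAsFactors = FALSE)
  out[order(out$`Well Label`), , drop = FALSE]
}

# Throwaway copy of test.txt so the example is self-contained.
tmp <- tempfile(pattern = "K_demo_", fileext = ".TXT")
writeLines(c(
  '[Assay],C:\\REVEL\\650-S.ASY',
  '"Assay title",Untitled Assay',
  '"Read Time",11.04.17,13:04:00',
  '"Operator",',
  '"Comments",',
  '"Kit Lot Number",,',
  '"Wells",A1 - H12',
  'OD RESULTS',
  '"Units",OD',
  '',
  '[Results],Results are sorted on Sample ID,in ascending order',
  '',
  '"Sample ID","Well Label","OD Results"',
  '"T1","T1",0.045',
  '"T10","T10",0.044',
  '"T11","T11",0.045',
  '"T2","T2",0.045'
), tmp)

# With real data, files would come from list.files() as in the question.
files <- tmp
results <- lapply(files, process_file)
names(results) <- basename(files)

# Write each result next to its input, swapping the extension for .csv.
out_path <- sub("\\.[[:alnum:]]+$", ".csv", files[1])
write.csv(results[[1]], out_path, row.names = FALSE)
```

Wrapping the call in `tryCatch` inside `lapply` would let a single malformed file report its name instead of stopping the whole run.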
+0

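Thank you very much! Your code is great and works perfectly for the test file! Just one thing: when I try this code on the real files to prefix the zero, an error message always shows up: Error in mutate_impl(.data, dots) : wrong result size (96), expected 98 or 1. I don't understand why the expected result size would be 98 or 1. What does this mean? Thanks! – Ziming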

+0

What error? Please also provide the code and an input file that reproduce it. – zx8754

+0

Sorry. It turned out to be a mistake in my file name. I have fixed it now, and the code you provided works perfectly. Thank you very much! – Ziming