2016-09-13 196 views
1

Type conversion: is coercion forced only inside the loop? Building on the following working call:

sum(sapply(DNAStringSet(seq_set[, 1]), function(s) 
    countPWM(motifs[[1]], reverseComplement(s), min.score = "75%"))) 

I wrote this loop:

percentages <- as.character(seq(0, 100, 5)) 

for (i in 1:length(percentages)) { 
    sum(sapply(DNAStringSet(seq_set[, 1]), function(s) 
    countPWM(
     motifs[[1]], 
     reverseComplement(s), 
     min.score = as.character(cat('"', percentages[i], "%" , '"', sep = "") 
    )))) 
} 

and it returns the following error:

Error in .normargMinScore(min.score, pwm) : 
    'min.score' must be a single number or string 

I figured there was a problem with the data type of

min.score

but when I check:

test <- as.character(cat('"', percentages[1], "%" , '"', sep = "")) 
typeof(test) 


> typeof(test) 
[1] "character" 

it seems to be in order.

I thought this might be related to type coercion, as described on R-bloggers, because the sapply function is used. But that does not seem to be the cause.

Any help would be greatly appreciated, as I am still new to R and to programming.

My sessionInfo():

R version 3.2.5 (2016-04-14) 
Platform: x86_64-pc-linux-gnu (64-bit) 
Running under: Ubuntu 14.04.4 LTS 

locale: 
[1] LC_CTYPE=en_US.UTF-8  LC_NUMERIC=C    
[3] LC_TIME=en_US.UTF-8  LC_COLLATE=en_US.UTF-8  
[5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=en_US.UTF-8 
[7] LC_PAPER=de_DE.UTF-8  LC_NAME=C     
[9] LC_ADDRESS=C    LC_TELEPHONE=C    
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C  

attached base packages: 
[1] stats4 parallel stats  graphics grDevices utils  
[7] datasets methods base  

other attached packages: 
[1] Biostrings_2.38.4 XVector_0.10.0  IRanges_2.4.8  
[4] S4Vectors_0.8.11 BiocGenerics_0.16.1 

loaded via a namespace (and not attached): 
[1] zlibbioc_1.16.0 tools_3.2.5 

and this is how I built my data:

seq_set <- matrix(1:2000, 1000, 2) 
seq_set[, 1] <- 
    sapply(seq_set[, 1], function(s) 
    paste(sample(
     c('A', 'C', 'G', 'T'), 
     size = ncol(motifs[[1]]), 
     replace = T 
    ), collapse = '')) 
seq_set[, 2] <- 
    sapply(seq_set[, 2], function(s) 
    paste(sample(
     c('A', 'C', 'G', 'T'), 
     size = ncol(motifs[[2]]), 
     replace = T 
    ), collapse = '')) 

These are the packages in my library:

AnnotationDbi     Annotation Database Interface 
Biobase      Biobase: Base functions for Bioconductor 
BiocGenerics     S4 generic functions for Bioconductor 
BiocInstaller     Install/Update Bioconductor, CRAN, and github Packages 
BiocParallel     Bioconductor facilities for parallel evaluation 
Biostrings     String objects representing biological sequences, and 
           matching algorithms 
bitops      Bitwise Operations 
BSgenome      Infrastructure for Biostrings-based genome data packages and 
           support for efficient SNP representation 
caTools      Tools: moving window statistics, GIF, Base64, ROC AUC, etc. 
CNEr       CNE Detection and Visualization 
DBI       R Database Interface 
DirichletMultinomial   Dirichlet-Multinomial Mixture Model Machine Learning for 
           Microbiome Data 
futile.logger     A Logging Utility for R 
futile.options    Futile options management 
GenomeInfoDb     Utilities for manipulating chromosome and other 'seqname' 
           identifiers 
GenomicAlignments    Representation and manipulation of short genomic alignments 
GenomicRanges     Representation and manipulation of genomic intervals and 
           variables defined along a genome 
gtools      Various R Programming Tools 
IRanges      Infrastructure for manipulating intervals on sequences 
lambda.r      Modeling Data with Functional Programming 
Rcpp       Seamless R and C++ Integration 
RCurl       General Network (HTTP/FTP/...) Client Interface for R 
Rsamtools      Binary alignment (BAM), FASTA, variant call (BCF), and tabix 
           file import 
RSQLite      SQLite Interface for R 
rtracklayer     R interface to genome browsers and their annotation tracks 
S4Vectors      S4 implementation of vectors and lists 
seqLogo      Sequence logos for DNA sequence alignments 
snow       Simple Network of Workstations 
SummarizedExperiment   SummarizedExperiment container 
TFBSTools      Software Package for Transcription Factor Binding Site 
           (TFBS) Analysis 
TFMPvalue      Efficient and Accurate P-Value Computation for Position 
           Weight Matrices 
XML       Tools for Parsing and Generating XML Within R and S-Plus 
XVector      Representation and manpulation of external sequences 
zlibbioc      An R packaged zlib-1.2.5 

Packages in library ‘/usr/lib/R/library’: 

base       The R Base Package 
boot       Bootstrap Functions (Originally by Angelo Canty for S) 
class       Functions for Classification 
cluster      "Finding Groups in Data": Cluster Analysis Extended 
           Rousseeuw et al. 
codetools      Code Analysis Tools for R 
compiler      The R Compiler Package 
datasets      The R Datasets Package 
foreign      Read Data Stored by Minitab, S, SAS, SPSS, Stata, Systat, 
           Weka, dBase, ... 
graphics      The R Graphics Package 
grDevices      The R Graphics Devices and Support for Colours and Fonts 
grid       The Grid Graphics Package 
KernSmooth     Functions for Kernel Smoothing Supporting Wand & Jones 
           (1995) 
lattice      Trellis Graphics for R 
MASS       Support Functions and Datasets for Venables and Ripley's 
           MASS 
Matrix      Sparse and Dense Matrix Classes and Methods 
methods      Formal Methods and Classes 
mgcv       Mixed GAM Computation Vehicle with GCV/AIC/REML Smoothness 
           Estimation 
nlme       Linear and Nonlinear Mixed Effects Models 
nnet       Feed-Forward Neural Networks and Multinomial Log-Linear 
           Models 
parallel      Support for Parallel computation in R 
rpart       Recursive Partitioning and Regression Trees 
spatial      Functions for Kriging and Point Pattern Analysis 
splines      Regression Spline Functions and Classes 
stats       The R Stats Package 
stats4      Statistical Functions using S4 Classes 
survival      Survival Analysis 
tcltk       Tcl/Tk Interface 
tools       Tools for Package Development 
utils       The R Utils Package 
+1

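It is good to have the sessionInfo like this, but it would also be best to make the example fully reproducible (with 'library()' calls, a small dataset, etc.). See http://stackoverflow.com/a/28481250/ – Frank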

+1

Most likely you want to use 'paste' instead of 'cat'. My guess is that it should be: min.score <- paste('"', percentages[i], "%", '"', sep = "") – nicola

+0

Thanks Frank, you are right. I made the changes – piderotrema

Answer

0

nicola's comment did the trick.

Like this:

seq_set_matches <- matrix(1:42, 21, 2) 
percentages <- as.character(seq(0, 100, 5)) 
for (i in 1:length(percentages)) { 
    seq_set_matches[i,1]<- sum(sapply(DNAStringSet(seq_set[, 1]), function(s) 
    countPWM(
     motifs[[1]], 
     reverseComplement(s), 
     min.score = paste(percentages[i], "%" , sep = "") 
    ))) 
} 

works. Dear nicola, if you like, I would gladly accept your help as the official answer. Thanks again.
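For anyone wondering why the typeof() check in the question looked fine, here is a minimal base-R sketch that needs no Biostrings. cat() prints its arguments and returns NULL, so wrapping it in as.character() gives a character vector of length zero. That would be consistent with the "must be a single number or string" error, although the internals of .normargMinScore are not shown here. The names from_cat, from_paste and thresholds are made up for this illustration.

```r
percentages <- as.character(seq(0, 100, 5))

# cat() writes "0%" (with literal quote characters) to the console
# and returns NULL, so nothing useful is stored.
from_cat <- as.character(cat('"', percentages[1], "%", '"', sep = ""))
typeof(from_cat)   # "character", which is why the check looked fine
length(from_cat)   # 0, so this is not a single string

# paste() returns the string instead of printing it.
from_paste <- paste(percentages[1], "%", sep = "")
length(from_paste) # 1
from_paste         # "0%"

# All thresholds can also be built in one vectorised call.
thresholds <- paste0(seq(0, 100, 5), "%")
thresholds[16]     # "75%"
```

Checking length() alongside typeof() makes the empty vector visible. The extra '"' pieces are not needed either, since they would put literal quote characters inside the string; the working loop above leaves them out.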