2013-02-14 50 views

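I wonder whether anyone is familiar with the Bioconductor RankProd package for ranking genes and identifying differentially expressed ones. Some information about the software can be found in the paper, manual, and documentation.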

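I ran into some problems using the program, probably because I know very little R. I tried to replicate the steps in the PDF files above with my own data. Unlike the example, my dataset is not in Affymetrix .cel files; it is just rows and columns in a tab-delimited file. I have two conditions (1 and 2, with 4 replicates each).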

Here is my code:

library(RankProd) 
library(preprocessCore) 

#Read expression data 
#gdata <- read.table(file="data2.txt", sep="\t", header=T) #9000 rows of genes X 8 columns of chips 
gdata <- read.table(file="data2.txt", sep="\t", header=T, row.names=1) #9000 rows of genes X 8 columns of chips 

#colnames(gdata) 

# This vector contains the microarray sample names 
SampleNames= names(data.frame(gdata[,-1])) 
#names(datExpr)=gdata[,1] 

# This vector contains the gene names 
datExpr.gnames= gdata$GeneName 

# Since the first column contains the gene names, exclude it. 
# datExpr is then the matrix required 
datExpr=data.frame(gdata[,-1]) 

#convert data into matrix form 
datExpr <- as.matrix(datExpr) 

#data normalization - quantile normalization 
#datExpr.log.norm <- normalize.quantiles((log2(datExpr)),copy=TRUE) #with logged data 
datExpr <- datExpr.log.norm 
#datExpr.norm <- normalize.quantiles(datExpr,copy=TRUE) #without logged data 
#datExpr <- datExpr.norm 


# Identify two class data - control/treatment (or condition 1/condition2) 
nl <- 4 
n2 <- 4 
cl <- rep(c(0,1), c(nl, n2)) 

datExpr.cl <- cl 

# data were generated under identical or very similar conditions except the 
# factor of interest (e.g., control and treatment), 
origin <- rep(1, nl + n2) 

datExpr.origin <- origin 

# Data analysis 
datExpr.sub <- datExpr[,which(datExpr.origin == 1)] 
datExpr.cl.sub <- datExpr.cl[which(datExpr.origin == 1)] 
datExpr.origin.sub <- datExpr.origin[which(datExpr.origin == 1)] 

#Rank product analysis and output 
#RP.out <- RP(datExpr.sub, datExpr.cl.sub, num.perm = 100, logged = TRUE,na.rm = FALSE, plot = FALSE, rand = 123) 

RP.out <- RPadvance(datExpr.sub, datExpr.cl.sub, datExpr.origin.sub, num.perm = 100,logged = TRUE, 
       na.rm = FALSE, gene.names = datExpr.gnames, plot = FALSE,rand = 123) 



# Output a table of the identified genes based on user-specified selection criteria 
topGene(RP.out, cutoff = 0.05, method = "pfp", logged = TRUE,logbase = 2, gene.names = datExpr.gnames) 

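I ran this code, but the fold changes of the differentially expressed genes in one condition versus the other are all either 0 or infinite. I wonder whether anyone with experience in this could help me.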

Answers


The first thing I notice at a glance is:

#datExpr.log.norm <- normalize.quantiles((log2(datExpr)),copy=TRUE) #with logged data 
datExpr <- datExpr.log.norm 

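Because the first line here is commented out, datExpr.log.norm is never created, so datExpr ends up NULL (in a fresh R session the assignment fails outright with an "object not found" error).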
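A minimal sketch of the fix this implies, assuming the expression values are positive raw intensities: run the normalization line instead of leaving it commented out, then copy the result into datExpr. The simulated matrix and its gene and chip names are made up for illustration; normalize.quantiles() comes from preprocessCore, which the question already loads.

```r
library(preprocessCore)

# Hypothetical stand-in for the asker's data: 5 genes x 8 chips of
# positive raw intensities (values and names are invented).
set.seed(1)
datExpr <- matrix(runif(40, min = 50, max = 5000), nrow = 5,
                  dimnames = list(paste0("gene", 1:5), paste0("chip", 1:8)))

# Actually run the log2 + quantile normalization so the object exists.
datExpr.log.norm <- normalize.quantiles(log2(datExpr), copy = TRUE)

# The normalized matrix may come back without row/column names,
# so copy them over from the input.
dimnames(datExpr.log.norm) <- dimnames(datExpr)

# Only now is it safe to overwrite datExpr.
datExpr <- datExpr.log.norm
```

The later RPadvance() and topGene() calls pass logged = TRUE, which tells RankProd the data are already log-transformed, so log-scale values like these are what those calls expect. Whether that accounts for the 0 or infinite fold changes is not something the thread confirms.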


Thanks, Csgillespie. I decided to switch to RankProdIt (an interactive program) instead, which works well for my needs. – 2013-02-15 03:51:15
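Separately from the commented-out normalization line, the way the question reads the file may be worth checking. The sketch below assumes data2.txt has a GeneName column followed by eight chip columns, which is a guess based on the code comments. Under that assumption, row.names=1 already moves the gene names into the row names, so gdata$GeneName is NULL and gdata[,-1] drops a chip rather than the gene names. The miniature table is invented for illustration.

```r
# Hypothetical miniature of data2.txt: a GeneName column plus 8 chips.
txt <- paste(
  paste(c("GeneName", paste0("chip", 1:8)), collapse = "\t"),
  paste(c("geneA", 1:8), collapse = "\t"),
  paste(c("geneB", 11:18), collapse = "\t"),
  sep = "\n"
)

gdata <- read.table(text = txt, sep = "\t", header = TRUE, row.names = 1)

ncol(gdata)              # only the chip columns remain as data columns
is.null(gdata$GeneName)  # the gene names are no longer a column
ncol(gdata[, -1])        # dropping "the first column" now removes a chip

# Take the gene names from the row names and keep every chip.
datExpr.gnames <- rownames(gdata)
datExpr <- as.matrix(gdata)
```

With this layout the class vector built from nl and n2 stays aligned with the eight columns of datExpr.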