2016-03-02 159 views

lda returns only LD1

I tried the lda-related code below and cannot understand why it returns only LD1 and not LD2.

library(MASS) 
library(ggplot2) 

Load the function ggplotLDAPrep() from here:

ggplotLDAPrep <- function(x){
    # Rebuild the predictor matrix and the class labels from the fitted model
    if (!is.null(Terms <- x$terms)) {
        data <- model.frame(x)
        X <- model.matrix(delete.response(Terms), data)
        g <- model.response(data)
        # Drop the intercept column, if present
        xint <- match("(Intercept)", colnames(X), nomatch = 0L)
        if (xint > 0L)
            X <- X[, -xint, drop = FALSE]
    }
    # Center on the average of the group means, then project onto the discriminants
    means <- colMeans(x$means)
    X <- scale(X, center = means, scale = FALSE) %*% x$scaling
    # One column per discriminant, plus the class labels
    rtrn <- data.frame(X, labels = as.character(g))
    return(rtrn)
}


test<-iris[grep("setosa|virginica", iris$Species),1:5] 
ldaobject <- lda(Species ~ ., data=test) 
fitGraph <- ggplotLDAPrep(ldaobject) 
ggplot(fitGraph, aes(LD1,LD2, color=labels))+geom_point() 
ldaobject 

Any insights?


If you only have two groups, perhaps only one discriminant can be found? – user20650

Answer


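As @user20650 mentioned, you need at least 3 groups to get both LD1 and LD2: LDA yields at most min(p, k - 1) discriminants for k groups and p predictors, so two groups give only LD1. See this example: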

library(MASS) 
library(ggplot2) 

ggplotLDAPrep <- function(x){
    # Rebuild the predictor matrix and the class labels from the fitted model
    if (!is.null(Terms <- x$terms)) {
        data <- model.frame(x)
        X <- model.matrix(delete.response(Terms), data)
        g <- model.response(data)
        # Drop the intercept column, if present
        xint <- match("(Intercept)", colnames(X), nomatch = 0L)
        if (xint > 0L)
            X <- X[, -xint, drop = FALSE]
    }
    # Center on the average of the group means, then project onto the discriminants
    means <- colMeans(x$means)
    X <- scale(X, center = means, scale = FALSE) %*% x$scaling
    # One column per discriminant, plus the class labels
    rtrn <- data.frame(X, labels = as.character(g))
    return(rtrn)
}


test<-iris[grep("setosa|virginica|versicolor", iris$Species),1:5] 
ldaobject <- lda(Species ~ ., data=test) 
fitGraph <- ggplotLDAPrep(ldaobject) 
ggplot(fitGraph, aes(LD1,LD2, color=labels))+geom_point() 

[Figure: scatter plot of LD1 vs. LD2, colored by species]
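As a cross-check on the helper function, the same kind of discriminant scores can be obtained from MASS's own predict() method, whose x component holds one column per discriminant. A minimal sketch, assuming the full three-species iris data; the names pred and scores are illustrative only, and the scores should agree with ggplotLDAPrep() here because the three groups are equally sized:

```r
library(MASS)

# Fit on all three iris species, as in the example above
ldaobject <- lda(Species ~ ., data = iris)

# predict() returns a list; $x is the matrix of discriminant scores
pred <- predict(ldaobject)

# 'scores' is an illustrative name: one column per discriminant plus the labels
scores <- data.frame(pred$x, labels = iris$Species)

colnames(pred$x)   # names of the discriminants that were found
head(scores)
```

If only one discriminant was found, pred$x has a single column, which is a quick way to see why aes(LD1, LD2) cannot work for a two-group fit.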

ldaobject 

> ldaobject
Call:
lda(Species ~ ., data = test)

Prior probabilities of groups:
    setosa versicolor  virginica
 0.3333333  0.3333333  0.3333333

Group means:
           Sepal.Length Sepal.Width Petal.Length Petal.Width
setosa            5.006       3.428        1.462       0.246
versicolor        5.936       2.770        4.260       1.326
virginica         6.588       2.974        5.552       2.026

Coefficients of linear discriminants:
                    LD1         LD2
Sepal.Length  0.8293776  0.02410215
Sepal.Width   1.5344731  2.16452123
Petal.Length -2.2012117 -0.93192121
Petal.Width  -2.8104603  2.83918785

Proportion of trace:
   LD1    LD2
0.9912 0.0088
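To see the effect of the number of groups directly, the fits for two and three species can be compared by counting the columns of the scaling matrix. This is only a sketch: the names two, fit2, fit3 and scores2 are made up for illustration, and the density plot is one possible way to show a single discriminant, not something taken from the original code.

```r
library(MASS)
library(ggplot2)

# Two groups: drop the unused factor level so lda() sees exactly two classes
two <- droplevels(subset(iris, Species != "versicolor"))
fit2 <- lda(Species ~ ., data = two)

# Three groups: the full iris data
fit3 <- lda(Species ~ ., data = iris)

# Each column of $scaling is one discriminant
ncol(fit2$scaling)   # number of discriminants with 2 groups
ncol(fit3$scaling)   # number of discriminants with 3 groups

# With a single discriminant there is no LD2 axis, so plot LD1 on its own,
# here as one density curve per group
scores2 <- data.frame(predict(fit2)$x, labels = two$Species)
p <- ggplot(scores2, aes(LD1, fill = labels)) +
    geom_density(alpha = 0.4)
print(p)
```

Checking ncol() on the scaling matrix before plotting avoids the "object 'LD2' not found" style of error that aes(LD1, LD2) produces on a two-group fit.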

Edit: adding ellipses

This code is mostly taken from here:

ggplot(fitGraph, aes(LD1, LD2, color = labels)) +
    geom_point() +
    stat_ellipse(aes(x = LD1, y = LD2, fill = labels), alpha = 0.2, geom = "polygon")

[Plot result: LD1 vs. LD2 scatter plot with a shaded ellipse per species]
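For a different point symbol per class, one option is to map the shape aesthetic to the labels inside geom_point(). A minimal sketch, assuming the three-species iris fit and building fitGraph from predict() rather than from ggplotLDAPrep() so that the block runs on its own:

```r
library(MASS)
library(ggplot2)

# Three-species fit; the scores come from predict() to keep the block self-contained
ldaobject <- lda(Species ~ ., data = iris)
fitGraph <- data.frame(predict(ldaobject)$x, labels = iris$Species)

# shape = labels gives each class its own point symbol;
# fill = labels shades each ellipse polygon in the class color
p <- ggplot(fitGraph, aes(LD1, LD2, color = labels)) +
    geom_point(aes(shape = labels)) +
    stat_ellipse(aes(fill = labels), alpha = 0.2, geom = "polygon")
print(p)
```

The shape mapping is kept local to geom_point() because the ellipse polygons have no use for it.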


Very nice example. Thanks, loki and @20650. One more question: how can we use a different symbol for each class and add ellipses (if it is not too hard)? – user4178184


I edited my answer and added the ellipses. – loki


Great, thank you very much, loki. – user4178184