R - Limiting the output of summary.princomp

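I am running a principal component analysis on a dataset with more than 1000 variables. I am using R Studio, and when I run summary to see the cumulative variance of the components, I can only see the last few hundred components. How can I limit the summary so that it shows only the first 100 components?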

2012-04-07 user1209675

Can you provide a small reproducible example? – digEmAll 2012-04-07 15:30:31

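@digemall Not really, the dataset is huge. I am running: prin <- princomp(train[c(2:1777)]) followed by summary(prin). When I do this, it shows information for all 1776 principal components. I only need the first 100 or so. – user1209675 2012-04-07 16:07:41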

Yes, of course not the full code. I meant a small example to understand your steps. Anyway, @Joran got the point ;) – digEmAll 2012-04-07 16:44:08

It is easy to modify print.summary.princomp to do this (you can see the original code by typing stats:::print.summary.princomp):

pcaPrint <- function(x, digits = 3, loadings = x$print.loadings,
                     cutoff = x$cutoff, n, ...)
{
    # Check for a sensible value of n; default to full output
    if (missing(n) || n > length(x$sdev) || n < 1) {
        n <- length(x$sdev)
    }
    # Proportion of total variance explained by each component
    vars <- x$sdev^2
    vars <- vars / sum(vars)
    cat("Importance of components:\n")
    print(rbind(`Standard deviation` = x$sdev[1:n],
                `Proportion of Variance` = vars[1:n],
                `Cumulative Proportion` = cumsum(vars)[1:n]))
    if (loadings) {
        cat("\nLoadings:\n")
        cx <- format(round(x$loadings, digits = digits))
        # Blank out loadings whose absolute value is below the cutoff
        cx[abs(x$loadings) < cutoff] <- paste(rep(" ", nchar(cx[1, 1], type = "w")),
                                              collapse = "")
        # drop = FALSE keeps the matrix layout when n is 1
        print(cx[, 1:n, drop = FALSE], quote = FALSE, ...)
    }
    invisible(x)
}

pcaPrint(summary(princomp(USArrests, cor = TRUE),
                 loadings = TRUE, cutoff = 0.2),
         digits = 2, n = 2)

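Edited to include a basic check for a sensible value of n. Now that I have done this, I wonder whether it is worth proposing to R Core as a permanent addition. It seems simple and potentially useful.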
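
If only the variance table is needed, the same numbers that pcaPrint prints can also be assembled by hand and subset by column, with no custom print function. This is a minimal sketch under that assumption, not a replacement for the function above; the names pca, n_keep, importance and first_n are illustrative only.

```r
# Sketch: rebuild the "Importance of components" table and keep the first n columns.
pca <- princomp(USArrests, cor = TRUE)
n_keep <- 2                                # illustrative: number of components to show

vars <- pca$sdev^2 / sum(pca$sdev^2)       # proportion of variance per component
importance <- rbind(
  "Standard deviation"     = pca$sdev,
  "Proportion of Variance" = vars,
  "Cumulative Proportion"  = cumsum(vars)
)

# drop = FALSE keeps the result a matrix even when n_keep is 1
first_n <- importance[, seq_len(n_keep), drop = FALSE]
print(round(first_n, 3))
```

Because first_n is an ordinary numeric matrix, it can be stored, written to a file or plotted, which a print method alone does not allow.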


2012-04-07 16:28:34 joran

Thank you very much. This is exactly what I needed. It makes data mining applications much easier. – user1209675 2012-04-07 16:41:43

@joran: Yes, IMO this is a feature worth submitting to the R-Core team. – digEmAll 2012-04-07 16:45:53

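You can put the loadings into matrix form, save the matrix to a variable, and then subset it (a la matrix[,1:100]) to see the first/middle/last n. In this example I used head(), which returns the first 100 rows (one row per variable). Each column is a principal component.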

head(
    matrix(
        prin$loadings,
        ncol = length(dimnames(prin$loadings)[[2]]),
        nrow = length(dimnames(prin$loadings)[[1]])
    ),
    100)
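
To make the row/column distinction concrete, here is a small sketch on the built-in USArrests data rather than the asker's data. It assumes unclass() is an acceptable way to get a plain matrix from the loadings; the names pca, load_mat, n_keep, first_rows and first_comps are illustrative only.

```r
# Sketch: rows of the loadings matrix are variables, columns are components.
pca <- princomp(USArrests, cor = TRUE)
load_mat <- unclass(loadings(pca))   # plain numeric matrix, dimnames are kept

n_keep <- 2                          # illustrative: how many to keep

# head() keeps the first n ROWS: the first n variables across all components
first_rows <- head(load_mat, n_keep)

# Column subsetting keeps the first n COMPONENTS for every variable
first_comps <- load_mat[, seq_len(n_keep), drop = FALSE]

print(dim(first_rows))
print(dim(first_comps))
```

Unlike matrix(prin$loadings, ...), this keeps the variable and component names on the result.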


2012-04-07 16:23:40

I tried this and it seems to work: L = loadings(prin) then L[1:100]
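
One thing to keep in mind when indexing a loadings matrix: a single index and a row/column index pair return different things. The sketch below uses USArrests as a stand-in, and the names pca, L, by_element and by_column are illustrative only.

```r
# Sketch: single-index versus two-index subsetting of a loadings matrix.
pca <- princomp(USArrests, cor = TRUE)
L <- loadings(pca)

# One index walks the matrix element by element, column by column,
# so this is the first two loadings of component 1 as a bare vector.
by_element <- L[1:2]

# Two indices with an empty row slot select whole columns (components).
by_column <- L[, 1:2]

print(length(by_element))
print(dim(by_column))
```

On a wide dataset, the single-index form therefore stays inside the first component until the index exceeds the number of variables.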


2012-07-27 23:04:02 wj4f
