2014-11-05
2

Bubble chart of frequencies including NAs

I have a data frame that looks like this:

Data<- data.frame(item1=c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, NA, 5, NA, NA), 
        item2=c(1, 2, 2, 4, 1, 1, 2, 3, 5, 5, NA, NA, NA, NA), 
        item3=c(1, 2, 2, 4, 1, 1, 2, 3, 5, 5, NA, NA, NA, NA), 
        item4=c(1, 2, 2, 4, 1, 1, 4, 3, 1, 5, NA, 3, NA, NA), 
        item5=c(1, 5, 2, 4, 2, 1, 2, 3, 5, 5, NA, NA, 1, NA)) 

I have also defined a function that extracts the frequencies of each column and plots them, without the NAs:

frequencies <- function(x,K=5) 
{ 
    p <- length(x) # items 
    n <- nrow(x) # observations 
    r <- 5 # number of distinct values (1 to 5)
    myf <- function(y) # extract frequencies 
    { 
    y <- y[!is.na(y)] 
    y <- as.factor(y) 
    aux <- summary(y) 
    res <- rep(0, r) 
    res[1:r %in% names(aux)] <- aux 
    100 * res/sum(res) 
    } 

    freqs <- apply(x, 2, FUN = myf) # apply myf by columns 
    df2 <- expand.grid(vals = 1:r, item = 1:p) # all possible combinations 
    df2$freq <- as.numeric(freqs) # add frequencies 

    # graph 
    plot(df2$item,df2$vals,type="n",xlim=c(1,p),ylim=c(1,r),xaxt = "n", 
     xlab="", ylab="", ann=FALSE) 


    axis(1, labels=FALSE) 
    labs <- paste(names(x)) ##labels=c("v1", "v2", ...) 
    text(1:p, par("usr")[3], srt = 60, adj = 0.5, pos = 1, 
     labels = labs, xpd = TRUE, offset = 1) # item labels below the x axis 



    points(df2$item,df2$vals,pch=22,col="black", bg="gray", cex=(df2$freq/n)*K) 
} 

I would like the NAs to be plotted as a "value" (on the y axis), so that my plot looks similar to this one (edited with an image editor, not with R): [image]

Thanks in advance,

Angulo

Answers

2

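Another possibility: melt your data to long format, then use table with exclude = NULL so that NA is counted as well. If you want the frequency to be proportional to the area of the squares rather than to their width, check scale_size_area.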

library(reshape2) 
library(ggplot2) 

Data2 <- melt(Data) 
Data3 <- with(Data2, as.data.frame(table(variable, value, exclude = NULL))) 
Data3 <- Data3[!is.na(Data3$variable), ] 

ggplot(data = Data3, aes(x = variable, y = value, size = Freq)) + 
    geom_point(shape = 0) 
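The scale_size_area suggestion above could be sketched roughly as follows. This is an illustration rather than the answerer's code: it swaps in base R's stack() for melt(), relabels the missing category as the string "NA" so it becomes an ordinary level on the y axis, and uses an arbitrary max_size of 12. The names long and counts are made up for the example.

```r
library(ggplot2)

# Example data copied from the question
Data <- data.frame(
  item1 = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, NA, 5, NA, NA),
  item2 = c(1, 2, 2, 4, 1, 1, 2, 3, 5, 5, NA, NA, NA, NA),
  item3 = c(1, 2, 2, 4, 1, 1, 2, 3, 5, 5, NA, NA, NA, NA),
  item4 = c(1, 2, 2, 4, 1, 1, 4, 3, 1, 5, NA, 3, NA, NA),
  item5 = c(1, 5, 2, 4, 2, 1, 2, 3, 5, 5, NA, NA, 1, NA)
)

# Long format with base R: column `values` holds the data, `ind` the item name
long <- stack(Data)

# Count every item/value combination; useNA = "ifany" keeps NA as a category
counts <- as.data.frame(
  table(item = long$ind, value = long$values, useNA = "ifany"),
  stringsAsFactors = FALSE
)

# Relabel the missing category (assumption: the label "NA" is what is wanted)
counts$value[is.na(counts$value)] <- "NA"
counts$value <- factor(counts$value, levels = c(1:5, "NA"))

# scale_size_area() makes the area of each square proportional to Freq;
# combinations that never occur are dropped so nothing is drawn for them
p <- ggplot(counts[counts$Freq > 0, ], aes(x = item, y = value, size = Freq)) +
  geom_point(shape = 15) +
  scale_size_area(max_size = 12)
print(p)
```

With the default size scale the point size grows linearly with Freq, which visually exaggerates large counts; mapping to area is usually easier to read for this kind of chart.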


Thank you, this is a good solution for what I wanted. Do you know whether this kind of plot has a specific name? – 2014-11-05 10:31:26

I think it is called a [**bubble chart/plot**](http://en.wikipedia.org/wiki/Bubble_chart) – Henrik 2014-11-05 10:47:25

@AriadnaAngulo, I changed the title of your question from "frequency plot" to "bubble chart", which I think is a more common way to refer to this kind of plot. – Henrik 2014-11-05 12:00:43

1

Try something like this:

# Useful packages: 
library(plyr) 
library(ggplot2) 

# Loop over variables getting the counts of each value 
counts <- lapply(Data, count) 

# Combine the list of counts into a single data frame 
all_counts <- do.call(rbind, counts) 

# A bit of fixing. Make x into a factor, and get the variable name 
all_counts <- within(
    all_counts, 
    { 
    Value <- factor(x) 
    Variable <- rep(names(counts), vapply(counts, nrow, integer(1))) 
    } 
) 

# Remove NAs (it isn't very clear from the question whether you want NAs or not) 
all_counts <- subset(all_counts, !is.na(x)) 

# Draw the plot. sqrt is to scale area by freq rather than width by freq 
(p <- ggplot(all_counts, aes(Variable, Value, size = sqrt(freq))) + 
    geom_point(shape = 15) # shape 15 is a square. See ?points. 
) 
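If the NAs should be kept rather than removed, one option that needs no extra packages is to count them as a sixth category and draw the squares with base graphics, closer to the function in the question. The following is only a sketch under assumptions: the response scale is taken to be 1 to 5, the scaling factor 5 in cex is arbitrary, and the names vals, pct and grid are invented for the example.

```r
# Example data copied from the question
Data <- data.frame(
  item1 = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, NA, 5, NA, NA),
  item2 = c(1, 2, 2, 4, 1, 1, 2, 3, 5, 5, NA, NA, NA, NA),
  item3 = c(1, 2, 2, 4, 1, 1, 2, 3, 5, 5, NA, NA, NA, NA),
  item4 = c(1, 2, 2, 4, 1, 1, 4, 3, 1, 5, NA, 3, NA, NA),
  item5 = c(1, 5, 2, 4, 2, 1, 2, 3, 5, 5, NA, NA, 1, NA)
)

vals <- 1:5  # assumed response scale

# Percentage of each value per item, with NA counted as its own category
pct <- sapply(Data, function(y) {
  tab <- table(factor(y, levels = vals), useNA = "always")
  100 * as.numeric(tab) / length(y)
})
rownames(pct) <- c(vals, "NA")

# One row per cell of pct; `row` varies fastest, matching column-major order
grid <- expand.grid(row = seq_len(nrow(pct)), col = seq_len(ncol(pct)))
grid$pct <- as.numeric(pct)

# Empty plot, then custom axes so that "NA" appears as the top y value
plot(grid$col, grid$row, type = "n", xaxt = "n", yaxt = "n",
     xlab = "", ylab = "",
     xlim = c(0.5, ncol(pct) + 0.5), ylim = c(0.5, nrow(pct) + 0.5))
axis(1, at = seq_len(ncol(pct)), labels = colnames(pct))
axis(2, at = seq_len(nrow(pct)), labels = rownames(pct), las = 1)

# sqrt() makes the area, not the width, of each square follow the percentage
keep <- grid$pct > 0
points(grid$col[keep], grid$row[keep], pch = 22, col = "black", bg = "gray",
       cex = 5 * sqrt(grid$pct[keep] / 100))
```

Each column of pct sums to 100 because the NA share is included in the denominator, unlike the function in the question, which drops the NAs before computing percentages.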
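
I could not run this code because of an error when plotting: Don't know how to automatically pick scale for object of type function. Defaulting to continuous. Error in data.frame(x = function (x, y = NULL, na.rm = FALSE, use) : arguments imply differing number of rows: 0, 25 – 2014-11-05 10:39:45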

At a guess, you have not converted the 'x' or 'y' axis variable to a factor – 2014-11-06 10:35:33