Heatmap-style plot showing the modal value of grid regions (via stat_summary_2d?)

I have a set of 2D points with class labels, and I want to show which class dominates each cell of a grid laid over the 2D plane.

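I thought I could use stat_summary_2d to pick the most common value, as in the code below, but I get three different plots that should be identical apart from the legend labels.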

Am I misusing stat_summary_2d? Is there a better way to produce this kind of plot?

library(ggplot2) 
set.seed(12345) 
x = runif(1000) 
y = runif(1000) 
lab = rep(c("red", "blue", "green", "yellow"), 250) 

# levels= keeps each point's own label; labels= would rename the alphabetically sorted levels
df = data.frame(x=x, y=y, lab=factor(lab, levels=c("red", "blue", "green", "yellow"))) 
df$val = as.numeric(df$lab) 

#Attempt 1 
ggplot(df, aes(x=x, y=y)) + 
    stat_summary_2d(aes(z=lab), 
        fun=function(z) names(which.max(table(z))), 
        binwidth=.1) 

#Attempt 2 
ggplot(df, aes(x=x, y=y)) + 
    stat_summary_2d(aes(z=val), 
        fun=function(z) names(which.max(table(z))), 
        binwidth=.1) 

#Attempt 3 
ggplot(df, aes(x=x, y=y)) + 
    stat_summary_2d(aes(z=as.numeric(lab)), 
         fun=function(z) names(which.max(table(z))), 
         binwidth=.1)

2017-09-17 mr_kitty

Add group = 1 to Attempt 1 and you will see the same distribution of tiles as in the two subsequent attempts.

Specify an appropriate fill palette and all three plots look identical:

library(ggplot2) 

#Attempt 1 
p1 <- ggplot(df, aes(x=x, y=y, group = 1)) + 
    stat_summary_2d(aes(z=lab), 
        fun=function(z) names(which.max(table(z))), 
        binwidth=.1) + 
    scale_fill_manual(values = c("red" = "red", 
           "blue" = "blue", 
           "green" = "green", 
           "yellow" = "yellow"), 
        breaks = c("red", "blue", "green", "yellow")) + 
    ggtitle("Attempt 1") + theme(legend.position = "bottom") 

#Attempt 2 
p2 <- ggplot(df, aes(x=x, y=y)) + 
    stat_summary_2d(aes(z=val), 
        fun=function(z) names(which.max(table(z))), 
        binwidth=.1) + 
    scale_fill_manual(values = c("red", "blue", "green", "yellow")) + 
    ggtitle("Attempt 2") + theme(legend.position = "bottom") 

#Attempt 3 
p3 <- ggplot(df, aes(x=x, y=y)) + 
    stat_summary_2d(aes(z=as.numeric(lab)), 
        fun=function(z) names(which.max(table(z))), 
        binwidth=.1) + 
    scale_fill_manual(values = c("red", "blue", "green", "yellow")) + 
    ggtitle("Attempt 3") + theme(legend.position = "bottom") 

gridExtra::grid.arrange(p1, p2, p3, nrow = 1)
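
All three attempts pass the same summary function to stat_summary_2d, so it can help to look at what that function returns on its own. The following is a minimal base-R sketch; the helper name `modal` is introduced here for illustration and does not appear in the original code.

```r
# Same summary function as in the attempts above, given a name for convenience
modal <- function(z) names(which.max(table(z)))

# Numeric input: the result is still a character string, not a number
num_mode <- modal(c(3, 1, 3, 2))

# Factor input: table() counts every level, including levels with zero count
fac_mode <- modal(factor(c("blue", "red", "blue"),
                         levels = c("red", "blue", "green", "yellow")))

# Tie: which.max() returns the first maximum in the table's order
tie_mode <- modal(c("a", "b", "b", "a"))

print(c(num_mode, fac_mode, tie_mode))
```

Because the result is always a character string, even when z is numeric as in Attempts 2 and 3, the fill is discrete in every case, which is why scale_fill_manual() can be applied to all three plots. Ties go to whichever value comes first in the table's order.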

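Explanation: if you inspect the underlying data of the first plot, you will find 379 rows, each corresponding to one tile in the heatmap. If we total the number of distinct colors within each bin, we also get 379, so there are actually multiple tiles at each bin position. (By contrast, the underlying data of the second and third plots have 100 rows each.)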

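From this we know that ggplot has interpreted each factor level in "lab" as a separate group and run stat_summary_2d() separately for each level. Adding group = 1 to the aesthetic mapping forces all levels to be considered together.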
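
The effect of that per-level grouping can be imitated in base R without ggplot2. This is only a sketch under stated assumptions: the cells are built with cut() on a 0.1 grid, which need not coincide with the bins ggplot2 picks for binwidth = .1, so the counts it prints may differ from the 379 and 100 quoted above. The names `modal`, `cell`, `one_group` and `per_label` are illustrative.

```r
set.seed(12345)
labs <- c("red", "blue", "green", "yellow")
df <- data.frame(x = runif(1000), y = runif(1000),
                 lab = factor(rep(labs, 250), levels = labs))

# Assign each point to a 0.1 x 0.1 cell (assumed grid, see the note above)
breaks <- seq(0, 1, by = 0.1)
cell <- interaction(cut(df$x, breaks, include.lowest = TRUE),
                    cut(df$y, breaks, include.lowest = TRUE),
                    drop = TRUE)

modal <- function(z) names(which.max(table(z)))

# All labels considered together: one modal label per cell (like group = 1)
one_group <- tapply(as.character(df$lab), cell, modal)

# Each label treated as its own group: one entry per (cell, label) pair,
# NA where a label has no points in a cell
per_label <- tapply(as.character(df$lab), list(cell, df$lab), modal)

cat("tiles with a single group:     ", length(one_group), "\n")
cat("tiles with one group per label:", sum(!is.na(per_label)), "\n")
```

When every label is its own group, the "mode" of each group is trivially that label, so several tiles are stacked on the same cell and whichever is drawn last determines the visible color.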

# Rebuild the original Attempt 1 (without group = 1) and inspect the data computed for its layer
p1.original <- ggplot(df, aes(x=x, y=y)) + 
    stat_summary_2d(aes(z=lab), 
        fun=function(z) names(which.max(table(z))), 
        binwidth=.1) 

View(layer_data(p1.original))
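
As for whether there is a better way: one option that sidesteps the implicit grouping entirely is to compute the modal label per cell before plotting and then draw the result with geom_tile(). The sketch below is an alternative to the stat_summary_2d approach, not part of the original answer. The column names `cx` and `cy`, the helper `modal`, and the cell alignment (cells starting at 0) are assumptions made for illustration. The plotting step runs only if ggplot2 is installed.

```r
set.seed(12345)
labs <- c("red", "blue", "green", "yellow")
df <- data.frame(x = runif(1000), y = runif(1000),
                 lab = factor(rep(labs, 250), levels = labs))

binwidth <- 0.1
modal <- function(z) names(which.max(table(z)))

# Center of the cell each point falls into (cells assumed to start at 0)
df$cx <- (floor(df$x / binwidth) + 0.5) * binwidth
df$cy <- (floor(df$y / binwidth) + 0.5) * binwidth

# One row per non-empty cell, holding the most frequent label in that cell
cells <- aggregate(lab ~ cx + cy, data = df, FUN = modal)
cells$lab <- factor(cells$lab, levels = labs)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(cells, aes(x = cx, y = cy, fill = lab)) +
    geom_tile(width = binwidth, height = binwidth) +
    scale_fill_manual(values = c(red = "red", blue = "blue",
                                 green = "green", yellow = "yellow"))
  print(p)
}
```

Since the aggregation happens before ggplot2 sees the data, each cell yields exactly one tile and no group = 1 workaround is needed. The trade-off is that the binning is now done by hand.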

2017-09-17 04:10:19
