ggplot2: reordering a heatmap based on hierarchical clustering

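Although I found quite similar questions, I am still struggling with ggplot2 and have not managed to get this working. I would like to reorder my heatmap, by row and by column, according to a hierarchical clustering.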

Here is my actual code:

# import 
library("ggplot2") 
library("scales") 
library("reshape2") 

# data loading 
data_frame = read.csv(file=input_file, header=TRUE, row.names=1, sep='\t') 

# clustering with hclust on rows and on columns
# dist() computes distances between rows, so transpose to cluster the columns
dd.row <- as.dendrogram(hclust(dist(data_frame)))
dd.col <- as.dendrogram(hclust(dist(t(data_frame))))

# ordering based on clustering
row.ord <- order.dendrogram(dd.row)
col.ord <- order.dendrogram(dd.col)

# making a new, reordered data frame
new_df = as.data.frame(data_frame[row.ord, col.ord])
print(new_df) # when looking at new_df manually, it seems to work

# get the row name 
name = as.factor(row.names(new_df)) 

# reshape 
melte_df = melt(cbind(name, new_df)) 

# the solution is here: reorder the levels of the name factor to follow the row clustering
melte_df$name = factor(melte_df$name, levels = row.names(data_frame)[row.ord])

# ggplot2 dark magic
(p <- ggplot(melte_df, aes(variable, name)) +
    geom_tile(aes(fill = value), colour = "white") +
    scale_fill_gradient(low = "white", high = "steelblue") +
    theme(text = element_text(size = 12),
          axis.text.y = element_text(size = 3)))

# save fig
ggsave(filename = "test.pdf", plot = p)

# The result is ordered only by column. What have I missed?

I am quite new to R, so a detailed answer would be very welcome.

2017-08-04 RomainL.

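Without an example dataset to reproduce the problem, I am not 100% sure this is the cause, but I would guess that your problem lies in this line: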

name = as.factor(row.names(new_df))

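When you use a factor, ordering is based on the order of that factor's levels. You can reorder the rows of the data frame however you like; the order used when plotting will still be the order of the levels.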

Here is an example:

data_frame <- data.frame(x = c("apple", "banana", "peach"), y = c(50, 30, 70))
data_frame
#>        x  y
#> 1  apple 50
#> 2 banana 30
#> 3  peach 70

data_frame$x <- as.factor(data_frame$x) # Make the x column a factor

levels(data_frame$x) # This shows the levels of your factor
#> [1] "apple"  "banana" "peach"

data_frame <- data_frame[order(data_frame$y),] # Order by value of y
data_frame
#>        x  y
#> 2 banana 30
#> 1  apple 50
#> 3  peach 70

# Now let's plot it:
p <- ggplot(data_frame, aes(x)) + geom_bar(aes(weight=y))
p

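Here is the result:

See? The bars are drawn in the order apple, banana, peach, not in the order of the y values that we wanted. They follow the levels of the factor. Now, if that is indeed where your problem lies, there is a solution here: R - Order a factor based on value in one or more other columns.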

Applying the dplyr solution to the example:

library(dplyr)
data_frame <- data_frame %>%
  arrange(y) %>%                    # sort your data frame
  mutate(x = factor(x, levels = x)) # reset your factor column based on that order

data_frame
#>        x  y
#> 1 banana 30
#> 2  apple 50
#> 3  peach 70

levels(data_frame$x) # Levels of the factor are reordered!
#> [1] "banana" "apple"  "peach"

p <- ggplot(data_frame, aes(x)) + geom_bar(aes(weight=y))
p

This is the result now, with the bars in the order banana, apple, peach:

I hope this helps. Otherwise, you might want to provide an example of your original dataset!
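For readers who would rather not load dplyr, the same reordering can probably be done in base R. The snippet below is only a sketch on the toy data from this answer; the names `df` and `x2` are made up for illustration. It shows two options: passing the levels explicitly, and using `reorder()` from the stats package, which sorts the levels by a summary of another variable (the mean by default).

```r
# Toy data, same values as in the example above.
df <- data.frame(x = c("apple", "banana", "peach"), y = c(50, 30, 70))

# Option 1: give the levels explicitly, in increasing order of y.
df$x <- factor(df$x, levels = as.character(df$x)[order(df$y)])
levels(df$x)
#> [1] "banana" "apple"  "peach"

# Option 2: reorder() sorts the levels by the mean of y within each level.
df$x2 <- reorder(df$x, df$y)
levels(df$x2)
#> [1] "banana" "apple"  "peach"
```

Either way, the rows of the data frame do not need to be sorted; only the factor levels decide the order on the axis.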

2017-08-04 13:13:17 agatheblues

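Your answer was really useful for pinpointing the problem. In the end, though, I found a more convenient way: reordering the factor levels. I will edit my question to add what made it work. Thanks again for your help.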
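To make the idea of reordering factor levels concrete, here is a minimal, self-contained sketch. It is not the asker's exact code: the built-in `mtcars` dataset stands in for the original file (an assumption), the values are scaled only so the columns are comparable, and `as.table()` is used instead of reshape2 to get the long format. The names `mat`, `row_hc`, `col_hc`, `row_levels`, `col_levels`, and `long_df` are illustrative.

```r
library(ggplot2)

# Stand-in data (assumption): a numeric matrix with row and column names.
mat <- scale(as.matrix(mtcars))

# Cluster rows and columns separately.
row_hc <- hclust(dist(mat))     # dist() measures distances between rows
col_hc <- hclust(dist(t(mat)))  # transpose to cluster the columns

# Row and column names in dendrogram (leaf) order.
row_levels <- rownames(mat)[row_hc$order]
col_levels <- colnames(mat)[col_hc$order]

# Long format: one line per cell of the matrix.
long_df <- as.data.frame(as.table(mat), stringsAsFactors = FALSE)
names(long_df) <- c("name", "variable", "value")

# ggplot2 orders discrete axes by factor levels, so set both explicitly.
long_df$name <- factor(long_df$name, levels = row_levels)
long_df$variable <- factor(long_df$variable, levels = col_levels)

p <- ggplot(long_df, aes(variable, name)) +
  geom_tile(aes(fill = value), colour = "white") +
  scale_fill_gradient(low = "white", high = "steelblue")
```

Printing `p` should then show both axes in clustering order. The point of the design is that the order of the rows in `long_df` does not matter; only the levels of `name` and `variable` do.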
