我試圖從兩個不同的距離矩陣創建樹形圖並對它們進行比較。我使用代碼here作爲起點,但問題是因爲我使用了兩個不同的矩陣,但使用了相同的聚類方法,所以我需要將兩個不同的矩陣一起繪製以進行比較分析。我想知道是否有可能將每個正方形/節點的兩半對角線分開以顯示兩個不同的距離矩陣。在同一圖上繪製兩個距離矩陣?
這裏是我的代碼:
from sklearn import preprocessing
from sklearn.neighbors import DistanceMetric
import pandas as pd
import numpy as np
from ete3 import Tree
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.metrics.pairwise import cosine_distances
import scipy
import pylab
import scipy.cluster.hierarchy as sch
import scipy.spatial.distance as sd
import random
#g[n] is a one dimensional array containing datapoints
g1 = random.sample(range(30), 5)
g2 = random.sample(range(30), 5)
g3 = random.sample(range(30), 5)
g4 = random.sample(range(30), 5)
g5 = random.sample(range(30), 5)
g1 = np.array(g1)
g2 = np.array(g2)
g3 = np.array(g3)
g4 = np.array(g4)
g5 = np.array(g5)
X = (g1,g2,g3,g4,g5)
#Comparing between euclidean and cosine###########################################
distanceC = cosine_distances(X)
dist = DistanceMetric.get_metric('euclidean')
distanceE = dist.pairwise(X)
##################################################################################
#Plots############################################################################
# Compute and plot first dendrogram.
fig = pylab.figure(figsize=(8,8))
ax1 = fig.add_axes([0.09,0.1,0.2,0.6])
Y = sch.average(sd.squareform(distanceC))
Z1 = sch.dendrogram(Y, orientation='right')
ax1.set_xticks([])
ax1.set_yticks([])
# Compute and plot second dendrogram.
ax2 = fig.add_axes([0.3,0.71,0.6,0.2])
Y = sch.average(sd.squareform(distanceE))
Z2 = sch.dendrogram(Y)
ax2.set_xticks([])
ax2.set_yticks([])
# Plot distance matrix.
axmatrix = fig.add_axes([0.3,0.1,0.6,0.6])
idx1 = Z1['leaves']
idx2 = Z2['leaves']
distance = distance[idx1,:]
distance = distance[:,idx2]
im = axmatrix.matshow(distance, aspect='auto', origin='lower', cmap=pylab.cm.YlGnBu)
axmatrix.set_xticks([])
axmatrix.set_yticks([])
# Plot colorbar.
axcolor = fig.add_axes([0.91,0.1,0.02,0.6])
pylab.colorbar(im, cax=axcolor)
fig.show()
fig.savefig('dendrogram.png')
##################################################################################
我已經刪除了第二個問題。雖然我明白這裏的代碼示例有點「破碎」,但問題在於生成列表g1,g2 ... g5的代碼有很多文件IO和處理操作,這些操作並不真正相關,但我仍然綁定用一個隨機列表生成器代替它,它應該完成這項工作。 – Siddharth