2017-08-17 70 views

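Unable to extract factor loadings from sklearn PCA

I want the factor loadings so that I can see which factor loads onto which variable. I am referring to the following link: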

Factor Loadings using sklearn

Here is my code, where input_data is master_data.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import scale

X=master_data_predictors.values

#Scaling the values 
X = scale(X) 

#taking as many components as there are variables
#initially we have 9 variables
pca = PCA(n_components=9) 

pca.fit(X) 

#The amount of variance that each PC explains 
var= pca.explained_variance_ratio_ 

#Cumulative Variance explains 
var1=np.cumsum(np.round(pca.explained_variance_ratio_, decimals=4)*100) 

print(var1)
[ 74.75 85.85 94.1 97.8 98.87 99.4 99.75 100. 100. ] 

#Retaining 4 components as they explain about 98% of variance
pca = PCA(n_components=4)
#fit_transform fits the model and projects X in one call
X1=pca.fit_transform(X)

print(pca.components_)

array([[ 0.38454129, 0.37344315, 0.2640267 , 0.36079567, 0.38070046, 
     0.37690887, 0.32949014, 0.34213449, 0.01310333], 
     [ 0.00308052, 0.00762985, -0.00556496, -0.00185015, 0.00300425, 
     0.00169865, 0.01380971, 0.0142307 , -0.99974635], 
     [ 0.0136128 , 0.04651786, 0.76405944, 0.10212738, 0.04236969, 
     0.05690046, -0.47599931, -0.41419841, -0.01629199], 
     [-0.09045103, -0.27641087, 0.53709146, -0.55429524, 0.058524 , 
     -0.19038107, 0.4397584 , 0.29430344, 0.00576399]]) 

import math 
loadings = pca.components_.T * math.sqrt(pca.explained_variance_) 

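It gives me the following error: "only length-1 arrays can be converted to Python scalars".

I understand the problem. I have to iterate over the pca.components_ and pca.explained_variance_ arrays, for example: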

##just a thought 
Loading=np.empty((8,4)) 

for i,j in (pca.components_, pca.explained_variance_): 
    loading=i*math.sqrt(j) 
    Loading=Loading.append(loading) 
##unable to proceed further 
##something wrong here 

Answer

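This is just a matter of mixing modules. For NumPy arrays, use np.sqrt rather than math.sqrt, which only works on single values and not on arrays. Your last line should therefore read: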

loadings = pca.components_.T * np.sqrt(pca.explained_variance_) 

This was a mistake in the original answers you linked to. I have edited them accordingly.
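
To illustrate the difference, here is a minimal sketch that runs without the original data. It assumes a random 100 x 9 matrix in place of master_data_predictors, and it builds `components` and `explained_variance` with a plain NumPy SVD as stand-ins for pca.components_ and pca.explained_variance_; those two names and the data are assumptions made for the example, not part of the question. It shows that math.sqrt rejects the array, that np.sqrt broadcasts across the columns, and that the loop idea from the question gives the same result once it is written with zip.

```python
import math

import numpy as np

# Hypothetical stand-in data: 100 samples, 9 variables, standardized per column
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 9))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Stand-ins for pca.components_ (4, 9) and pca.explained_variance_ (4,),
# computed here via SVD of the centered data
_, singular_values, Vt = np.linalg.svd(X, full_matrices=False)
components = Vt[:4]
explained_variance = singular_values[:4] ** 2 / (X.shape[0] - 1)

# math.sqrt only accepts a scalar, so a length-4 array raises TypeError
try:
    math.sqrt(explained_variance)
    math_sqrt_failed = False
except TypeError:
    math_sqrt_failed = True

# np.sqrt works element-wise; (9, 4) * (4,) scales each column by its own factor
loadings = components.T * np.sqrt(explained_variance)

# Equivalent explicit loop: one (component, variance) pair per column
loop_loadings = np.column_stack([
    component * math.sqrt(variance)
    for component, variance in zip(components, explained_variance)
])

print(math_sqrt_failed)                       # True
print(loadings.shape)                         # (9, 4)
print(np.allclose(loadings, loop_loadings))   # True
```

Each row of `loadings` corresponds to one variable and each column to one retained component, so the largest absolute values in a column indicate which variables that component loads on most.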