2016-09-27 70 views
2

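Keras model creates a linear classification for make_moons data

I am trying to reproduce the model from the WildML tutorial "Implementing a Neural Network From Scratch", but using Keras instead. I tried to use the same configuration as the tutorial, but even after adjusting the number of epochs, the batch size, the activation function, and the number of units in the hidden layer, I still get a linear classification: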

classification graph

Here is my code:

from keras.models import Sequential 
from keras.layers import Dense, Activation 
from keras.utils.visualize_util import plot 
from keras.utils.np_utils import to_categorical 

import numpy as np 
import matplotlib.pyplot as plt 

import sklearn 
from sklearn import datasets, linear_model 

# Build model 
model = Sequential() 
model.add(Dense(input_dim=2, output_dim=3, activation="tanh", init="normal")) 
model.add(Dense(output_dim=2, activation="softmax")) 
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy']) 

# Train 
np.random.seed(0) 
X, y = sklearn.datasets.make_moons(200, noise=0.20) 
y_binary = to_categorical(y) 
model.fit(X, y_binary, nb_epoch=100) 

# Helper function to plot a decision boundary. 
# If you don't fully understand this function don't worry, it just generates the contour plot below. 
def plot_decision_boundary(pred_func): 
    # Set min and max values and give it some padding 
    x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5 
    y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5 
    h = 0.01 
    # Generate a grid of points with distance h between them 
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h)) 
    # Predict the function value for the whole grid
    Z = pred_func(np.c_[xx.ravel(), yy.ravel()]) 
    Z = Z.reshape(xx.shape) 
    # Plot the contour and training examples 
    plt.contourf(xx, yy, Z, cmap=plt.cm.Spectral) 
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Spectral) 

# Predict and plot 
plot_decision_boundary(lambda x: model.predict_classes(x, batch_size=200)) 
plt.title("Decision Boundary for hidden layer size 3") 
plt.show() 

Answers

1

I believe I figured out the problem. If I remove np.random.seed(0) and train for 2000 epochs, the output is not always linear, and it occasionally reaches a higher accuracy of 90%+.

It must be that np.random.seed(0) causes the parameters to be seeded badly. Because I was fixing the random seed, I saw the same graph every time.
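To illustrate what fixing the seed does, here is a minimal sketch that covers only the data side: with the global NumPy seed reset to 0, make_moons returns exactly the same points on every run. Whether the Keras weight initialization is also tied to that seed depends on the backend and version, so that part is an assumption and is not shown here. The variable names are made up for the example.

```python
import numpy as np
from sklearn.datasets import make_moons

# Seed the global NumPy RNG, then draw the dataset as in the question.
np.random.seed(0)
X1, y1 = make_moons(200, noise=0.20)

# Re-seeding with the same value reproduces the identical dataset.
np.random.seed(0)
X2, y2 = make_moons(200, noise=0.20)

# Without re-seeding, the RNG state has advanced and the noise differs.
X3, y3 = make_moons(200, noise=0.20)

print(np.array_equal(X1, X2))  # True
print(np.array_equal(X1, X3))  # False
```

So with a fixed seed, every run starts from the same sample, and a bad outcome will be repeated rather than averaged out across runs.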

+2

2000 epochs?? Why so many? Is there a threshold beyond which the model starts to exhibit nonlinear behavior? – vgoklani

+1

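This is not a satisfying analysis. You never tuned the learning rate or the batch size. I am pretty sure that a good learning rate, or Adam, would be much more robust. You also skipped the data standardization step, which is very important. – sascha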
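A minimal sketch of the standardization step mentioned in this comment, assuming scikit-learn's StandardScaler is an acceptable way to do it (the comment does not name a specific tool). Each feature is rescaled to zero mean and unit variance before being passed to the network.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.preprocessing import StandardScaler

# Same dataset shape as in the question; random_state is fixed here
# only to make the example reproducible.
X, y = make_moons(200, noise=0.20, random_state=0)

# Fit the scaler on the training data and transform it in one step.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Each column now has (approximately) zero mean and unit standard deviation.
print(X_scaled.mean(axis=0))
print(X_scaled.std(axis=0))
```

X_scaled would then replace X in the call to model.fit, and any grid points used for plotting would need scaler.transform applied before prediction.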

+0

optimizer=Adam(lr=0.1) or optimizer=SGD(lr=0.1), together with kernel_initializer="glorot_normal", seems to work fine. Why SGD has a problem with np.random.seed(0) is still an open question to me. –
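A sketch of the configuration suggested in this comment, written against the Keras API bundled with TensorFlow 2 rather than the Keras 1 API used in the question. That choice is an assumption: in this API the argument is learning_rate instead of lr, epochs replaces nb_epoch, and predict_classes is replaced by an argmax over predict. The batch size of 32 is also an assumption, and no particular accuracy is claimed.

```python
import numpy as np
from sklearn.datasets import make_moons
from tensorflow import keras

# Same data as in the question; random_state fixed for reproducibility.
X, y = make_moons(200, noise=0.20, random_state=0)
y_onehot = keras.utils.to_categorical(y, num_classes=2)

# One hidden tanh layer with 3 units, as in the question,
# but with Glorot-normal weight initialization.
model = keras.Sequential([
    keras.Input(shape=(2,)),
    keras.layers.Dense(3, activation="tanh",
                       kernel_initializer="glorot_normal"),
    keras.layers.Dense(2, activation="softmax"),
])

# Adam with a learning rate of 0.1, as suggested in the comment.
model.compile(loss="categorical_crossentropy",
              optimizer=keras.optimizers.Adam(learning_rate=0.1),
              metrics=["accuracy"])

model.fit(X, y_onehot, epochs=100, batch_size=32, verbose=0)

# Class predictions: index of the largest softmax output per sample.
pred = np.argmax(model.predict(X, verbose=0), axis=1)
print("training accuracy:", np.mean(pred == y))
```

To try the SGD variant from the same comment, keras.optimizers.SGD(learning_rate=0.1) can be swapped in for the Adam optimizer.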