2017-10-18 152 views

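Aggregating output classes into a single class in Theano/Keras

I am building a ConvNet for handwritten digit recognition using the MNIST dataset. My code is written in Keras with the Theano backend.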

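I want to train my ConvNet so that it recognizes only a subset of the classes (for example, only the digits '1' and '2') and outputs every other class as a generic 'unknown' class. I know this can be done in Theano, because it is described in "Distributed Neural Networks for Internet of Things: The Big-Little Approach", but I cannot find any documentation or examples on this topic.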

Answer

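Modify your target outputs (your Y_train) so that the labels 0, 1, 2, 3, 4, 5, and so on are mapped to 0, 1, 2 (where 2 stands for "other"). Note that if you actually want to predict, say, the digits 5 and 6 and treat everything else as one class, you need to re-index your classes so that they start at 0: 0 then stands for digit 5, 1 for digit 6, and 2 for "other".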
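The relabeling described above can also be sketched with a NumPy lookup table, which does not depend on the order in which the assignments are made. This is only an illustration: `remap_labels`, `keep`, and `lut` are hypothetical names, not part of Keras or of the original answer.

```python
import numpy as np

def remap_labels(y, keep):
    """Map each digit in `keep` to 0..len(keep)-1 and every other digit to len(keep)."""
    other = len(keep)                          # index of the shared "other" class
    lut = np.full(10, other, dtype=np.int64)   # default: every digit is "other"
    for new_index, digit in enumerate(keep):
        lut[digit] = new_index                 # kept digits get their own class index
    return lut[np.asarray(y)]                  # fancy indexing applies the table element-wise

y = np.array([0, 5, 6, 9, 5, 2])
print(remap_labels(y, keep=(5, 6)))  # [2 0 1 2 0 2]
```

With this helper, the number of output classes is `len(keep) + 1`, and the same call can be applied to both the training and the test labels.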

Here is a modified version of the Keras MNIST example, taken essentially straight from the repo with just a few extra lines:

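'''Trains a simple convnet on the MNIST dataset, relabeled into three
classes: digit 5, digit 6, and "other".

Adapted from the Keras MNIST convnet example. The figures below come from
the original 10-class example and do not describe this 1-epoch, 3-class run:
99.25% test accuracy after 12 epochs
(there is still a lot of margin for parameter tuning).
16 seconds per epoch on a GRID K520 GPU.
'''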

from __future__ import print_function 
import keras 
from keras.datasets import mnist 
from keras.models import Sequential 
from keras.layers import Dense, Dropout, Flatten 
from keras.layers import Conv2D, MaxPooling2D 
from keras import backend as K 
import numpy as np 

batch_size = 128 
epochs = 1 

# input image dimensions 
img_rows, img_cols = 28, 28 

# the data, shuffled and split between train and test sets 
(x_train, y_train), (x_test, y_test) = mnist.load_data() 

############################### 
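# This is the key step. Order matters: digits 0-4 must be moved to
# class 2 first, otherwise the new labels 0 and 1 (assigned to digits
# 5 and 6) would be overwritten by the "<=4" rule.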
y_train[y_train<=4]=2 
y_train[y_train==5]=0 
y_train[y_train==6]=1 
y_train[y_train>=7]=2 

y_test[y_test<=4]=2 
y_test[y_test==5]=0 
y_test[y_test==6]=1 
y_test[y_test>=7]=2 

num_classes=3 
print(np.unique(y_train)) 
# [0 1 2] 
############################### 

if K.image_data_format() == 'channels_first': 
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols) 
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols) 
    input_shape = (1, img_rows, img_cols) 
else: 
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1) 
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1) 
    input_shape = (img_rows, img_cols, 1) 

x_train = x_train.astype('float32') 
x_test = x_test.astype('float32') 
x_train /= 255 
x_test /= 255 
print('x_train shape:', x_train.shape) 
print(x_train.shape[0], 'train samples') 
print(x_test.shape[0], 'test samples') 

# convert class vectors to binary class matrices 
y_train = keras.utils.to_categorical(y_train, num_classes) 
y_test = keras.utils.to_categorical(y_test, num_classes) 

model = Sequential() 
model.add(Conv2D(32, kernel_size=(3, 3), 
       activation='relu', 
       input_shape=input_shape)) 
model.add(Conv2D(64, (3, 3), activation='relu')) 
model.add(MaxPooling2D(pool_size=(2, 2))) 
model.add(Dropout(0.25)) 
model.add(Flatten()) 
model.add(Dense(128, activation='relu')) 
model.add(Dropout(0.5)) 
model.add(Dense(num_classes, activation='softmax')) 

model.compile(loss=keras.losses.categorical_crossentropy, 
       optimizer=keras.optimizers.Adadelta(), 
       metrics=['accuracy']) 

model.fit(x_train, y_train, 
      batch_size=batch_size, 
      epochs=epochs, 
      verbose=1, 
      validation_data=(x_test, y_test)) 
score = model.evaluate(x_test, y_test, verbose=0) 
print('Test loss:', score[0]) 
print('Test accuracy:', score[1]) 
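
After training, the model outputs three probabilities per image, and the winning index still has to be translated back into a readable label. A minimal sketch, assuming the 5/6/other mapping used above; `CLASS_NAMES` and `decode_predictions` are hypothetical names, and the probabilities are made-up values for illustration, not real model output.

```python
import numpy as np

# Index 0 -> digit 5, index 1 -> digit 6, index 2 -> everything else,
# matching the relabeling applied to y_train and y_test.
CLASS_NAMES = ['5', '6', 'unknown']

def decode_predictions(probs):
    """Turn an (n_samples, 3) array of class probabilities into label strings."""
    indices = np.argmax(probs, axis=1)          # most probable class per row
    return [CLASS_NAMES[i] for i in indices]

# Stand-in for the softmax output of the network on two images.
fake_probs = np.array([[0.90, 0.05, 0.05],
                       [0.10, 0.20, 0.70]])
print(decode_predictions(fake_probs))  # ['5', 'unknown']
```

With the trained network, the same helper could be applied to the result of `model.predict(x_test[:10])`.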