2016-08-23

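Reusing LSTM variables in TensorFlow

I am trying to build an RNN with LSTM cells. I built an LSTM model, followed by two DNN layers and one regression output layer.

I trained it on my data, and the final training loss was about 0.009. However, when I apply the model to the test data, the loss is about 0.5.

The training loss in the first epoch was also about 0.5, so I think the trained variables are not being used by the test model.

The only difference between the training and test models is the batch size. Training batch size = 100~200, test batch size = 1.

In the main function I create the LSTM instances. The model itself is built in the LSTM initializer: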

def __init__(self,config,train_model=None): 
    self.sess = sess = tf.Session() 

    self.num_steps = num_steps = config.num_steps 
    self.lstm_size = lstm_size = config.lstm_size 
    self.num_features = num_features = config.num_features 
    self.num_layers = num_layers = config.num_layers 
    self.num_hiddens = num_hiddens = config.num_hiddens 
    self.batch_size = batch_size = config.batch_size 
    self.train = train = config.train 
    self.epoch = config.epoch 
    self.learning_rate = learning_rate = config.learning_rate 

    with tf.variable_scope('model') as scope:   
     self.lstm_cell = lstm_cell = tf.nn.rnn_cell.LSTMCell(lstm_size,initializer = tf.contrib.layers.xavier_initializer(uniform=False)) 
     self.cell = cell = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * num_layers) 

    with tf.name_scope('placeholders'): 
     self.x = tf.placeholder(tf.float32,[self.batch_size,num_steps,num_features], 
           name='input-x') 
     self.y = tf.placeholder(tf.float32, [self.batch_size,num_features],name='input-y') 
     self.init_state = cell.zero_state(self.batch_size,tf.float32) 
    with tf.variable_scope('model'): 
     self.W1 = tf.Variable(tf.truncated_normal([lstm_size*num_steps,num_hiddens],stddev=0.1),name='W1') 
     self.b1 = tf.Variable(tf.truncated_normal([num_hiddens],stddev=0.1),name='b1') 
     self.W2 = tf.Variable(tf.truncated_normal([num_hiddens,num_hiddens],stddev=0.1),name='W2') 
     self.b2 = tf.Variable(tf.truncated_normal([num_hiddens],stddev=0.1),name='b2') 
     self.W3 = tf.Variable(tf.truncated_normal([num_hiddens,num_features],stddev=0.1),name='W3') 
     self.b3 = tf.Variable(tf.truncated_normal([num_features],stddev=0.1),name='b3') 


    self.output, self.loss = self.inference() 
    tf.initialize_all_variables().run(session=sess)     
    tf.initialize_variables([self.b2]).run(session=sess) 

    if train_model == None: 
     self.train_step = tf.train.GradientDescentOptimizer(self.learning_rate).minimize(self.loss) 

Using the LSTM initializer above, the LSTM instances are created as follows.

with tf.variable_scope("model",reuse=None): 
    train_model = LSTM(main_config) 
with tf.variable_scope("model", reuse=True): 
    predict_model = LSTM(predict_config) 

After creating the two LSTM instances, I train train_model and feed the test set to predict_model.

Why are the variables not being reused?

Answer

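The problem is that if you want to reuse variables within a scope, you should create them with tf.get_variable() rather than tf.Variable(). tf.Variable() always creates a new variable and ignores the reuse flag of the enclosing variable scope, so train_model and predict_model each end up with their own copies of W1 through b3.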
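To illustrate the difference, here is a toy pure-Python model of name-based variable sharing. It is not TensorFlow code and does not reflect TensorFlow's actual implementation: ToyVariableStore and its methods are hypothetical names invented for this sketch, which only mimics the behaviour described above.

```python
# Toy model of name-based variable sharing. This is NOT TensorFlow code;
# ToyVariableStore and its methods are hypothetical names for illustration.


class ToyVariableStore:
    def __init__(self):
        self.variables = {}  # full variable name -> value

    def variable(self, name, value):
        """Mimics tf.Variable: always creates a new entry, renaming on clashes."""
        unique, n = name, 0
        while unique in self.variables:
            n += 1
            unique = "%s_%d" % (name, n)
        self.variables[unique] = value
        return unique

    def get_variable(self, name, value, reuse):
        """Mimics tf.get_variable: looks the name up when reuse is set."""
        if reuse:
            if name not in self.variables:
                raise ValueError("Variable %s does not exist" % name)
            return name  # the existing entry is shared; `value` is ignored
        if name in self.variables:
            raise ValueError("Variable %s already exists" % name)
        self.variables[name] = value
        return name


store = ToyVariableStore()

# Two "models" built with the tf.Variable-like call get separate weights.
train_w = store.variable("model/W1", value=0.1)
predict_w = store.variable("model/W1", value=0.2)
print(train_w, predict_w)  # model/W1 model/W1_1

# Two "models" built with the get_variable-like call share one weight.
train_b = store.get_variable("model/b1", value=0.1, reuse=False)
predict_b = store.get_variable("model/b1", value=0.2, reuse=True)
print(train_b, predict_b)  # model/b1 model/b1
```

In the first case the prediction model silently gets a fresh, untrained weight under a different name, which matches the symptom in the question: the test loss looks like the loss of an untrained model.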

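Take a look at the TensorFlow tutorial on sharing variables and you will understand it better.

Also, you do not need to use a session here, because you do not have to initialize the variables when you define the model. The variables should be initialized when you are about to train your model.

The code that reuses the variables looks like this: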

def __init__(self,config,train_model=None): 
    self.num_steps = num_steps = config.num_steps 
    self.lstm_size = lstm_size = config.lstm_size 
    self.num_features = num_features = config.num_features 
    self.num_layers = num_layers = config.num_layers 
    self.num_hiddens = num_hiddens = config.num_hiddens 
    self.batch_size = batch_size = config.batch_size 
    self.train = train = config.train 
    self.epoch = config.epoch 
    self.learning_rate = learning_rate = config.learning_rate 

    with tf.variable_scope('model') as scope:   
     self.lstm_cell = lstm_cell = tf.nn.rnn_cell.LSTMCell(lstm_size,initializer = tf.contrib.layers.xavier_initializer(uniform=False)) 
     self.cell = cell = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * num_layers) 

    with tf.name_scope('placeholders'): 
     self.x = tf.placeholder(tf.float32,[self.batch_size,num_steps,num_features], 
           name='input-x') 
     self.y = tf.placeholder(tf.float32, [self.batch_size,num_features],name='input-y') 
     self.init_state = cell.zero_state(self.batch_size,tf.float32) 
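    with tf.variable_scope('model'): 
     # tf.get_variable() honours the reuse flag of the enclosing scope, so a
     # second instance built under reuse=True picks up these same weights.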
     self.W1 = tf.get_variable(initializer=tf.truncated_normal([lstm_size*num_steps,num_hiddens],stddev=0.1),name='W1') 
     self.b1 = tf.get_variable(initializer=tf.truncated_normal([num_hiddens],stddev=0.1),name='b1') 
     self.W2 = tf.get_variable(initializer=tf.truncated_normal([num_hiddens,num_hiddens],stddev=0.1),name='W2') 
     self.b2 = tf.get_variable(initializer=tf.truncated_normal([num_hiddens],stddev=0.1),name='b2') 
     self.W3 = tf.get_variable(initializer=tf.truncated_normal([num_hiddens,num_features],stddev=0.1),name='W3') 
     self.b3 = tf.get_variable(initializer=tf.truncated_normal([num_features],stddev=0.1),name='b3') 


    self.output, self.loss = self.inference() 

    # Only the training model needs an optimizer step.
    if train_model is None: 
     self.train_step = tf.train.GradientDescentOptimizer(self.learning_rate).minimize(self.loss) 

To see which variables have been created, run the following code after creating train_model and predict_model:

for v in tf.all_variables(): 
    print(v.name)
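If the printed list is long, a small helper can flag names that look like accidental duplicates. The sketch below is plain Python and assumes that unshared variables show up with `_N` suffixes on a scope or variable name. The function name find_duplicated_variables and the sample names are hypothetical; the names TensorFlow actually prints depend on how the scopes are nested.

```python
import re
from collections import defaultdict


def find_duplicated_variables(names):
    """Group variable names that differ only by `_N` uniquifying suffixes."""
    groups = defaultdict(list)
    for name in names:
        base = name.split(":")[0]  # drop the ":0" output index
        # Strip a trailing "_<digits>" from every path component (heuristic).
        parts = [re.sub(r"_\d+$", "", part) for part in base.split("/")]
        groups["/".join(parts)].append(name)
    # Keep only the base names that occur more than once.
    return {base: found for base, found in groups.items() if len(found) > 1}


# Hypothetical sample output, for illustration only.
sample_names = [
    "model/W1:0",
    "model/b1:0",
    "model_1/W1:0",
    "model_1/b1:0",
    "model/W2:0",
]

dups = find_duplicated_variables(sample_names)
for base, found in sorted(dups.items()):
    print(base, "->", found)
```

With real output, the list of names could be built as `[v.name for v in tf.all_variables()]`. The suffix stripping is only a heuristic: it would also merge intentionally numbered names such as layer_1 and layer_2, so treat the result as a hint rather than proof. When sharing works, each weight should appear once.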