
I want to implement the Match-LSTM from this paper: https://arxiv.org/pdf/1608.07905.pdf, which requires calling a basic LSTM cell inside a custom TensorFlow cell.

I am using TensorFlow. Part of the architecture is an RNN that uses the input and the previous state to compute an attention vector, applies it to the context, then concatenates the result with the input and sends it to an LSTM. To build the first part of this RNN I wrote a custom cell for TensorFlow, but I'm not sure how to send the result into the LSTM. Is it possible to call a basic LSTM cell inside the custom cell I'm writing? I've tried a few different ways, but I keep getting the error "'module' object has no attribute 'rnn_cell'" at the line where the LSTM cell is called. Any help would be much appreciated!
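For reference, that error typically comes down to the TensorFlow version: in TensorFlow 1.0/1.1 the cell classes were moved from tf.nn.rnn_cell to tf.contrib.rnn (they reappeared under tf.nn.rnn_cell in 1.2). Below is a minimal, hedged sketch of one way to call a stock LSTM cell from inside a custom cell on those versions; the class name and the attention placeholder are illustrative, not the actual Match-LSTM computation:

import tensorflow as tf

class AttentionThenLSTMCell(tf.contrib.rnn.RNNCell):
    """Illustrative wrapper: compute an attention-weighted input, then feed a stock LSTM."""

    def __init__(self, state_size):
        self._state_size = state_size
        # Create the inner cell once, not inside __call__, so its variables
        # are created a single time and reused across time steps.
        self._lstm = tf.contrib.rnn.BasicLSTMCell(state_size, state_is_tuple=True)

    @property
    def state_size(self):
        return self._lstm.state_size

    @property
    def output_size(self):
        return self._state_size

    def __call__(self, inputs, state, scope=None):
        with tf.variable_scope(scope or type(self).__name__):
            # ...compute the attention-weighted context and concatenate it
            # with the inputs here; z = inputs is only a stand-in...
            z = inputs
            return self._lstm(z, state)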

Edit: code added below:

import numpy as np
import tensorflow as tf

class MatchLSTMCell(tf.contrib.rnn.RNNCell):

    def __init__(self, state_size, question_tensor, encoded_questions, batch_size):
        self._state_size = state_size
        self.question_tensor = question_tensor
        self.encoded_questions = encoded_questions
        self.batch_size = batch_size

    @property
    def state_size(self):
        return self._state_size

    @property
    def output_size(self):
        return self._state_size

    def __call__(self, inputs, state, scope=None):
        scope = scope or type(self).__name__

        with tf.variable_scope(scope):

            W_p = tf.get_variable("W_p", dtype=tf.float64, shape=[self.state_size, self.state_size], initializer=tf.contrib.layers.xavier_initializer())
            W_r = tf.get_variable("W_r", dtype=tf.float64, shape=[self.state_size, self.state_size], initializer=tf.contrib.layers.xavier_initializer())
            b_p = tf.get_variable("b_p", dtype=tf.float64, shape=[self.state_size])
            w = tf.get_variable("w", dtype=tf.float64, shape=[1, self.state_size])
            b = tf.get_variable("b", dtype=tf.float64, shape=[])

            #print 'question tensor', np.shape(self.question_tensor)
            #print 'inputs', np.shape(inputs)
            #print 'insides', np.shape(tf.matmul(inputs, W_p) + tf.matmul(state, W_r) + b_p)
            G = tf.nn.tanh(
                tf.transpose(tf.transpose(self.question_tensor, perm=[1,0,2]) +
                             (tf.matmul(inputs, W_p) + tf.matmul(state, W_r) + b_p), perm=[1,0,2])
                )
            #print 'big G', np.shape(G)

            attention_list = []
            for i in range(self.batch_size):
                attention_matrix = tf.matmul(G[i,:,:], tf.transpose(w))
                attention_list.append(attention_matrix)
            attention_scores = tf.stack(attention_list)
            a = tf.nn.softmax(attention_scores + b)
            a = tf.reshape(a, [self.batch_size, -1])
            #print 'a shape is', np.shape(a)

            weighted_question_list = []
            for i in range(self.batch_size):
                attention_vector = tf.matmul(tf.reshape(a[i], [1,-1]), self.encoded_questions[i])
                weighted_question_list.append(attention_vector)
            weighted_questions = tf.stack(weighted_question_list)
            weighted_questions = tf.reshape(weighted_questions, [32, -1])
            #print 'weighted questions', np.shape(weighted_questions)

            z = tf.concat([inputs, weighted_questions], 1)
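            # Note: in TensorFlow 1.0/1.1 the RNN cell classes live under
            # tf.contrib.rnn rather than tf.nn.rnn_cell, so the next line is
            # where "'module' object has no attribute 'rnn_cell'" is raised on
            # those versions; tf.contrib.rnn.LSTMCell is the equivalent there.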
            lstm_cell = tf.nn.rnn_cell.LSTMCell(self.state_size)
            output, new_state = lstm_cell.__call__(z, state)

        return output, new_state

It's hard to help without any code to look at, and I'd like to help. What would help is if you create a minimal test program that shows whether your custom RNN works, plus another simple test showing how you use the LSTM and whether that works or not. Those two programs would help others (including me) on Stack Overflow help you debug your problem. – Wontonimo


Thank you! I'll go write those programs. I've added my code to the post in case it helps before I can get those written. Thanks for your help! –

Answer


I also wanted to re-implement Match-LSTM for SQuAD to experiment with it. I used MurtyShikhar's implementation as a reference. It works! However, he had to customize AttentionWrapper and use the existing BasicLSTM cell.
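For anyone taking the same route, here is a rough sketch of wrapping a stock LSTM cell with the stock tf.contrib.seq2seq.AttentionWrapper (assuming TensorFlow 1.2+; the function name, hidden_size parameter, and encoded_questions tensor are illustrative, not taken from the referenced code). The stock wrapper's attention scoring differs from the paper's Match-LSTM formula, which is why the referenced implementation customizes it:

import tensorflow as tf

def match_layer(encoded_questions, hidden_size):
    # Attention mechanism over the encoded question
    # (encoded_questions: [batch, question_len, hidden_size]).
    attention = tf.contrib.seq2seq.BahdanauAttention(
        num_units=hidden_size, memory=encoded_questions)
    cell = tf.contrib.rnn.BasicLSTMCell(hidden_size)
    # By default AttentionWrapper concatenates the attention context with the
    # cell input at every step, which plays roughly the role of z in Match-LSTM.
    return tf.contrib.seq2seq.AttentionWrapper(cell, attention)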

I also tried to create a Match_LSTM_cell by feeding the z and state (inputs, state) pair into a Basic_LSTM:

# Assumed imports for the TF-internal helpers used below (module paths vary
# across TensorFlow 1.x versions, so treat these as a guess):
#   from tensorflow.python.ops import variable_scope as vs
#   from tensorflow.python.ops import math_ops, array_ops
#   from tensorflow.python.ops.rnn_cell_impl import _linear
def __call__(self, inputs, state):
    # c is not an output; it is the LSTM's "memory" component.
    # It is necessary to update new_c and pass it on through the LSTM.
    c, h = state

    # ...calculate your z
    # ...inputs will be each token of the context (passage) respectively
    # ...calculate alpha_Q
    z = tf.concat([inputs, alpha_Q], axis=1)

    # This part re-implements the internals of BasicLSTM
    with vs.variable_scope("LSTM_core"):
        sigmoid = math_ops.sigmoid
        concat = _linear([z, h], dimension * 4, bias=True)
        i, j, f, o = array_ops.split(concat, num_or_size_splits=4, axis=1)
        new_c = (c * sigmoid(f + self._forget_bias) + sigmoid(i) * self._activation(j))

        new_h = self._activation(new_c) * sigmoid(o)
        new_state = (new_c, new_h)
    return new_h, new_state
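One caveat on the snippet above: when such a cell is driven by tf.nn.dynamic_rnn, the (new_c, new_h) pair is usually returned as a tf.contrib.rnn.LSTMStateTuple (or the cell's state_size property must advertise a matching plain tuple), so that the state structure the cell returns agrees with what the RNN loop expects.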