2015-11-30 34 views

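Performing batch normalization with TensorFlow

I am trying to implement a batch normalization layer in TensorFlow. The training step works fine: I use tf.nn.moments to get the mean and variance.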

For test time, I would like to set up an exponential moving average to track the mean and variance. I am trying to do it like this:

def batch_normalized_linear_layer(state_below, scope_name, n_inputs, n_outputs, stddev, wd, eps=.0001):
    with tf.variable_scope(scope_name) as scope:
        weight = _variable_with_weight_decay(
            "weights", shape=[n_inputs, n_outputs],
            stddev=stddev, wd=wd
        )
        act = tf.matmul(state_below, weight)
        # get moments
        act_mean, act_variance = tf.nn.moments(act, [0])
        # get mean and variance variables
        mean = _variable_on_cpu('bn_mean', [n_outputs], tf.constant_initializer(0.0))
        variance = _variable_on_cpu('bn_variance', [n_outputs], tf.constant_initializer(1.0))
        # assign the moments
        assign_mean = mean.assign(act_mean)
        assign_variance = variance.assign(act_variance)

        act_bn = tf.mul((act - mean), tf.rsqrt(variance + eps), name=scope.name+"_bn")

        beta = _variable_on_cpu("beta", [n_outputs], tf.constant_initializer(0.0))
        gamma = _variable_on_cpu("gamma", [n_outputs], tf.constant_initializer(1.0))
        bn = tf.add(tf.mul(act_bn, gamma), beta)
        output = tf.nn.relu(bn, name=scope.name)
        _activation_summary(output)
    return output, mean, variance

where _variable_on_cpu is defined as:

def _variable_on_cpu(name, shape, initializer):
    """Helper to create a Variable stored on CPU memory.

    Args:
        name: name of the variable
        shape: list of ints
        initializer: initializer for Variable

    Returns:
        Variable Tensor
    """
    with tf.device('/cpu:0'):
        var = tf.get_variable(name, shape, initializer=initializer)
    return var

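I believe the way I am setting up

assign_mean = mean.assign(act_mean)
assign_variance = variance.assign(act_variance)

is wrong, but I am not sure how. When I use TensorBoard to track these mean and variance variables, they just stay flat at their initial values.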
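For reference, the test-time statistics described above can be sketched in plain Python, independent of TensorFlow. This is only an illustration of the arithmetic, not the code from the question: the helper names `batch_moments` and `ema_update` and the decay of 0.9 are assumptions made for the example.

```python
# Plain-Python sketch of tracking batch statistics with an exponential
# moving average. All names and the decay value are illustrative assumptions.

def batch_moments(rows):
    """Per-feature mean and biased variance over the batch axis (axis 0)."""
    n = len(rows)
    cols = list(zip(*rows))  # transpose: one tuple per feature
    means = [sum(c) / n for c in cols]
    variances = [sum((x - m) ** 2 for x in c) / n for c, m in zip(cols, means)]
    return means, variances

def ema_update(running, batch_value, decay=0.9):
    """running <- decay * running + (1 - decay) * batch_value, per feature."""
    return [decay * r + (1.0 - decay) * b for r, b in zip(running, batch_value)]

# Same initial values as bn_mean (0.0) and bn_variance (1.0) in the question.
running_mean = [0.0, 0.0]
running_var = [1.0, 1.0]

batch = [[1.0, 2.0], [3.0, 6.0]]           # 2 examples, 2 features
batch_mean, batch_var = batch_moments(batch)

# One training step: fold the batch statistics into the running averages.
running_mean = ema_update(running_mean, batch_mean)
running_var = ema_update(running_var, batch_var)
```

The running values only move if the update is actually executed on every training step, which is exactly what is not happening in the graph above.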

Try adding: `output = tf.with_dependencies(dependencies=[assign_mean, assign_variance], output_tensor=output)` just before the return.

Answer

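Rafal's comment gets at the core of the problem: you are not running the assign nodes. You could try the batchnorm helper I posted in another answer, "How could I use Batch Normalization in TensorFlow?", or you can force the assignments to happen by adding with_dependencies, as he suggests.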

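The general principle is that you should only count on a node being run if data or control dependencies flow "through" it. with_dependencies ensures that the specified dependencies have completed before the output op is used.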
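To make the principle concrete, here is a toy dataflow graph in plain Python. It is not TensorFlow and does not use its API: `Node`, `Variable`, `run`, `assign`, and this `with_dependencies` are hypothetical stand-ins written only to mimic the behaviour described above, where an assign node that nothing depends on is never executed.

```python
# Toy lazy dataflow graph (NOT TensorFlow). A node executes only when a
# fetched node reaches it through a data or control dependency.

class Node:
    def __init__(self, fn, inputs=(), control_inputs=()):
        self.fn = fn                                 # computes this node's value
        self.inputs = list(inputs)                   # data dependencies
        self.control_inputs = list(control_inputs)   # must run first, value unused

class Variable:
    def __init__(self, value):
        self.value = value

def run(fetch, cache=None):
    """Evaluate `fetch`, executing only the nodes it depends on."""
    cache = {} if cache is None else cache
    if fetch in cache:
        return cache[fetch]
    for dep in fetch.control_inputs:   # control dependencies run before the node
        run(dep, cache)
    args = [run(dep, cache) for dep in fetch.inputs]
    cache[fetch] = fetch.fn(*args)
    return cache[fetch]

def assign(var, node):
    """Node that writes `node`'s value into `var` only when it is executed."""
    def _write(value):
        var.value = value
        return value
    return Node(_write, inputs=[node])

def with_dependencies(dependencies, output_node):
    """Identity on `output_node` that forces `dependencies` to run first."""
    return Node(lambda v: v, inputs=[output_node], control_inputs=dependencies)

# Mirror the question: the assign node exists, but the output ignores it.
batch_mean = Node(lambda: 0.5)
mean = Variable(0.0)
assign_mean = assign(mean, batch_mean)
output = Node(lambda m: m * 2.0, inputs=[batch_mean])

run(output)
value_without_dependency = mean.value   # assign_mean never ran

forced = with_dependencies([assign_mean], output)
forced_output = run(forced)
value_with_dependency = mean.value      # assign_mean ran before the output
```

Fetching `output` alone leaves `mean` at its initial value, which matches the flat curves seen in TensorBoard; fetching the wrapped node updates it, because the assign node now sits on a control dependency path of what is being run.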