I want to feed a CNN with a tensor "images". When the placeholder is_training is True, I want this tensor to contain images from the training set (which have a FIXED size); otherwise, I want it to contain images from the test set (which do not have a fixed size). How can I use TensorFlow's tf.cond() with two different dataset iterators without advancing both of them?

This is necessary because during training I take random fixed-size crops from the training images, while at test time I want to perform dense evaluation and feed the whole image into the network (it is fully convolutional, so it will accept it).

My current (non-working) approach is to create two different iterators and try to select the training/test input with tf.cond() at session.run(images, {is_training: True/False}).

The problem is that both iterators are evaluated. The training and test datasets also have different sizes, so I cannot simply iterate both of them to the end. Is there a way to make this work, or a smarter way to rewrite it?

I have seen some answers to this kind of problem, but they always use tf.assign, which takes a numpy array and assigns it to a tensor. In this case I cannot use tf.assign, because I already have tensors coming from the iterators.

My current code is the following. For now it simply checks the shape of the tensor "images":

import tensorflow as tf

train_filenames, train_labels = list_images(args.train_dir)
val_filenames, val_labels = list_images(args.val_dir)

graph = tf.Graph()
with graph.as_default():

    # Preprocessing (for both training and validation):
    def _parse_function(filename, label):
        image_string = tf.read_file(filename)
        image_decoded = tf.image.decode_jpeg(image_string, channels=3)
        image = tf.cast(image_decoded, tf.float32)

        return image, label

    # Preprocessing (for training)
    def training_preprocess(image, label):

        # Random flip and crop
        image = tf.image.random_flip_left_right(image)
        image = tf.random_crop(image, [args.crop, args.crop, 3])

        return image, label

    # Preprocessing (for validation)
    def val_preprocess(image, label):

        flipped_image = tf.image.flip_left_right(image)
        batch = tf.stack([image, flipped_image], axis=0)

        return batch, label

    # Training dataset
    train_filenames = tf.constant(train_filenames)
    train_labels = tf.constant(train_labels)
    train_dataset = tf.contrib.data.Dataset.from_tensor_slices((train_filenames, train_labels))
    train_dataset = train_dataset.map(_parse_function, num_threads=args.num_workers, output_buffer_size=args.batch_size)
    train_dataset = train_dataset.map(training_preprocess, num_threads=args.num_workers, output_buffer_size=args.batch_size)
    train_dataset = train_dataset.shuffle(buffer_size=10000)
    batched_train_dataset = train_dataset.batch(args.batch_size)

    # Validation dataset
    val_filenames = tf.constant(val_filenames)
    val_labels = tf.constant(val_labels)
    val_dataset = tf.contrib.data.Dataset.from_tensor_slices((val_filenames, val_labels))
    val_dataset = val_dataset.map(_parse_function, num_threads=1, output_buffer_size=1)
    val_dataset = val_dataset.map(val_preprocess, num_threads=1, output_buffer_size=1)

    train_iterator = tf.contrib.data.Iterator.from_structure(batched_train_dataset.output_types, batched_train_dataset.output_shapes)
    val_iterator = tf.contrib.data.Iterator.from_structure(val_dataset.output_types, val_dataset.output_shapes)

    train_images, train_labels = train_iterator.get_next()
    val_images, val_labels = val_iterator.get_next()

    train_init_op = train_iterator.make_initializer(batched_train_dataset)
    val_init_op = val_iterator.make_initializer(val_dataset)

    # Indicates whether we are in training or in test mode
    is_training = tf.placeholder(tf.bool)

    def f_true():
        with tf.control_dependencies([tf.identity(train_images)]):
            return tf.identity(train_images)

    def f_false():
        return val_images

    images = tf.cond(is_training, f_true, f_false)

    num_images = images.shape

    with tf.Session(graph=graph) as sess:

        sess.run(train_init_op)
        #sess.run(val_init_op)

        img = sess.run(images, {is_training: True})
        print(img.shape)

The problem is that when I want to use only the training iterator and comment out the line that initializes val_init_op, I get the following error:

FailedPreconditionError (see above for traceback): GetNext() failed because the iterator has not been initialized. Ensure that you have run the initializer operation for this iterator before getting the next element. 
[[Node: IteratorGetNext_1 = IteratorGetNext[output_shapes=[[2,?,?,3], []], output_types=[DT_FLOAT, DT_INT32], _device="/job:localhost/replica:0/task:0/cpu:0"](Iterator_1)]] 

If I do not comment out that line, everything works: when is_training is True I get training images, and when is_training is False I get validation images. The issue is that both iterators need to be initialized, and when I evaluate one of them, the other one is advanced as well. Since, as I said, they have different sizes, this causes a problem.

I hope there is a way to solve this! Thanks in advance.

Answer


The trick is to call iterator.get_next() inside the f_true() and f_false() functions:

def f_true(): 
    train_images, _ = train_iterator.get_next() 
    return train_images 

def f_false(): 
    val_images, _ = val_iterator.get_next() 
    return val_images 

images = tf.cond(is_training, f_true, f_false) 

The same advice applies to any TensorFlow op that has a side effect, such as assigning to a variable: if you want the side effect to happen conditionally, you must create the op inside the appropriate branch function that you pass to tf.cond().
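
As a minimal sketch of that rule, using a hypothetical pair of counter variables (not part of the original code): because each tf.assign_add op is created inside its branch function, only the counter selected by is_training is incremented when the result is evaluated.

# Hypothetical counters, only used to illustrate conditional side effects.
train_counter = tf.Variable(0, name='train_counter')
val_counter = tf.Variable(0, name='val_counter')

def increment_train_counter():
    # Created inside the branch, so it only runs when is_training is True.
    return tf.assign_add(train_counter, 1)

def increment_val_counter():
    # Created inside the branch, so it only runs when is_training is False.
    return tf.assign_add(val_counter, 1)

step_count = tf.cond(is_training, increment_train_counter, increment_val_counter)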


It works! Thank you very much! Just one small (silly) correction to make it actually work: add return statements to the two functions. – simo23


Is there a better way to feed these validation images of different sizes? Right now I need to feed each validation image separately (together with its flipped version), which, as you can imagine, is very slow. – simo23


Thanks, I've corrected the code sample. There isn't much support for batching images with different sizes. One option with the tf.data API is to move all of the validation computation into a parallel Dataset.map() transformation. By setting the num_parallel_calls argument to N, you can process N images in parallel. – mrry
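
A rough sketch of that suggestion, assuming the newer tf.data API (TF 1.4+), where Dataset.map() accepts num_parallel_calls; validate_one_image is a hypothetical stand-in for the real per-image validation computation:

# Rough sketch, assuming the tf.data API (TF 1.4+); `validate_one_image`
# is a hypothetical stand-in for the actual per-image validation step.
def validate_one_image(image, label):
    flipped = tf.image.flip_left_right(image)
    batch = tf.stack([image, flipped], axis=0)  # shape [2, H, W, 3]
    # ... run the fully convolutional network on `batch` here ...
    return batch, label

val_dataset = tf.data.Dataset.from_tensor_slices((val_filenames, val_labels))
val_dataset = val_dataset.map(_parse_function, num_parallel_calls=args.num_workers)
val_dataset = val_dataset.map(validate_one_image, num_parallel_calls=args.num_workers)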