TF slice_input_producer not keeping tensors in sync

I'm reading images into my TF network, but I also need the associated labels along with them.

So I tried to follow this answer, but the labels that are output don't actually match the images I'm getting in each batch.

My image names are in the format dir/3.jpg, so I just extract the label from the image file name.
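A minimal sketch of the extraction step on its own, in plain Python. The helper name `label_from_path` and the example paths are made up for illustration. Note that `rsplit("/", 1)[1]` as used in this question keeps the extension (`3.jpg`); stripping it with `os.path.splitext` is only an assumption about what the final label might look like.

```python
import os

def label_from_path(path):
    """Return the file name without directory or extension, e.g. 'dir/3.jpg' -> '3'."""
    # Same split as in the question; [-1] also tolerates paths without a '/'.
    filename = path.rsplit("/", 1)[-1]
    stem, _ext = os.path.splitext(filename)
    return stem

paths = ["dir/3.jpg", "dir/10.jpg"]  # hypothetical example paths
labels = [label_from_path(p) for p in paths]
print(labels)  # ['3', '10']
```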

truth_filenames_np = ... 
truth_filenames_tf = tf.convert_to_tensor(truth_filenames_np) 

# get the labels 
labels = [f.rsplit("/", 1)[1] for f in truth_filenames_np] 

labels_tf = tf.convert_to_tensor(labels) 

# *** This line should make sure both input tensors are synced (from my limited understanding) 
# My list is also already shuffled, so I set shuffle=False 
truth_image_name, truth_label = tf.train.slice_input_producer([truth_filenames_tf, labels_tf], shuffle=False) 


truth_image_value = tf.read_file(truth_image_name) 
truth_image = tf.image.decode_jpeg(truth_image_value) 
truth_image.set_shape([IMAGE_DIM, IMAGE_DIM, 3]) 
truth_image = tf.cast(truth_image, tf.float32) 
truth_image = truth_image/255.0 

# Another key step, where I batch them together 
truth_images_batch, truth_label_batch = tf.train.batch([truth_image, truth_label], batch_size=mb_size) 


with tf.Session() as sess: 
    sess.run(tf.global_variables_initializer()) 

    coord = tf.train.Coordinator() 
    threads = tf.train.start_queue_runners(coord=coord) 

    for i in range(epochs):
        print("Epoch ", i)
        X_truth_batch = truth_images_batch.eval()
        X_label_batch = truth_label_batch.eval()

        # Here I display all the images in this batch, and then I check which file numbers they actually are.
        # BUT, the images that are displayed don't correspond with what is printed by X_label_batch!
        print(X_label_batch)
        plot_batch(X_truth_batch)



    coord.request_stop() 
    coord.join(threads)

Am I doing something wrong, or does slice_input_producer not actually ensure that its input tensors are synced?

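Aside:

I also noticed that when I get a batch from tf.train.batch, the elements within the batch are adjacent to each other in the original list I passed in, but the batches do not come out in the original order. For example, if my data is ["dir/1.jpg", "dir/2.jpg", "dir/3.jpg", "dir/4.jpg", "dir/5.jpg", "dir/6.jpg"], then with batch_size=2 I may get the batch ["dir/3.jpg", "dir/4.jpg"], then the batch ["dir/1.jpg", "dir/2.jpg"], and then the last one. This makes it hard to even just use a FIFO queue for the labels, since the order won't match the batch order.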

2017-04-23 rasen58

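Could you please edit the code into a minimal reproduction of the problem with just the enqueue attempts? That is, remove all the image processing and see whether the images/labels get shuffled. As it stands we can't run this code unless we have the files.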

Here is a complete runnable example that reproduces the problem:

import tensorflow as tf 

truth_filenames_np = ['dir/%d.jpg' % j for j in range(66)] 
truth_filenames_tf = tf.convert_to_tensor(truth_filenames_np) 
# get the labels 
labels = [f.rsplit("/", 1)[1] for f in truth_filenames_np] 
labels_tf = tf.convert_to_tensor(labels) 

# My list is also already shuffled, so I set shuffle=False 
truth_image_name, truth_label = tf.train.slice_input_producer(
    [truth_filenames_tf, labels_tf], shuffle=False) 

# # Another key step, where I batch them together 
# truth_images_batch, truth_label_batch = tf.train.batch(
#  [truth_image_name, truth_label], batch_size=11) 

epochs = 7 

with tf.Session() as sess: 
    sess.run(tf.global_variables_initializer()) 
    coord = tf.train.Coordinator() 
    threads = tf.train.start_queue_runners(coord=coord) 
    for i in range(epochs):
        print("Epoch ", i)
        X_truth_batch = truth_image_name.eval()
        X_label_batch = truth_label.eval()
        # Here I display all the images in this batch, and then I check
        # which file numbers they actually are.
        # BUT, the images that are displayed don't correspond with what is
        # printed by X_label_batch!
        print(X_truth_batch)
        print(X_label_batch)
    coord.request_stop() 
    coord.join(threads)

This is what it prints:

Epoch 0 
b'dir/0.jpg' 
b'1.jpg' 
Epoch 1 
b'dir/2.jpg' 
b'3.jpg' 
Epoch 2 
b'dir/4.jpg' 
b'5.jpg' 
Epoch 3 
b'dir/6.jpg' 
b'7.jpg' 
Epoch 4 
b'dir/8.jpg' 
b'9.jpg' 
Epoch 5 
b'dir/10.jpg' 
b'11.jpg' 
Epoch 6 
b'dir/12.jpg' 
b'13.jpg'

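So basically each eval call runs the operation again! The second call dequeues the next element, so the file name and the label come from different slices. Adding batching makes no difference: it just prints batches (the first 11 file names, followed by the next 11 labels, and so on).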
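A TensorFlow-free sketch of the effect described above, assuming the producer behaves like a FIFO of (file name, label) pairs. The function names are made up for illustration, and a `deque` only stands in for the real queue.

```python
from collections import deque

def make_queue(n=6):
    """Build a FIFO of (filename, label) pairs, like the sliced producer output."""
    return deque(("dir/%d.jpg" % j, "%d.jpg" % j) for j in range(n))

def fetch_separately(q):
    """Mimic name.eval() followed by label.eval(): two dequeues."""
    name, _ = q.popleft()    # first eval consumes one pair, keeps only the name
    _, label = q.popleft()   # second eval consumes the NEXT pair, keeps only the label
    return name, label

def fetch_together(q):
    """Mimic a single fetch of both tensors: one dequeue yields both values."""
    return q.popleft()

print(fetch_separately(make_queue()))  # ('dir/0.jpg', '1.jpg')
print(fetch_together(make_queue()))    # ('dir/0.jpg', '0.jpg')
```

The mismatched pair from `fetch_separately` has the same shape as the output printed above: every fetch of one tensor advances the queue for both.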

The workaround I see is:

for i in range(epochs): 
    print("Epoch ", i) 
    pair = tf.convert_to_tensor([truth_image_name, truth_label]).eval() 
    print(pair[0]) 
    print(pair[1])

which prints correctly:

Epoch 0 
b'dir/0.jpg' 
b'0.jpg' 
Epoch 1 
b'dir/1.jpg' 
b'1.jpg' 
# ...

but this does nothing about the violation of the principle of least surprise.

EDIT: yet another way of doing it:

import tensorflow as tf 

truth_filenames_np = ['dir/%d.jpg' % j for j in range(66)] 
truth_filenames_tf = tf.convert_to_tensor(truth_filenames_np) 
labels = [f.rsplit("/", 1)[1] for f in truth_filenames_np] 
labels_tf = tf.convert_to_tensor(labels) 
truth_image_name, truth_label = tf.train.slice_input_producer(
    [truth_filenames_tf, labels_tf], shuffle=False) 
epochs = 7 
with tf.Session() as sess: 
    sess.run(tf.global_variables_initializer()) 
    tf.train.start_queue_runners(sess=sess) 
    for i in range(epochs):
        print("Epoch ", i)
        X_truth_batch, X_label_batch = sess.run(
            [truth_image_name, truth_label])
        print(X_truth_batch)
        print(X_label_batch)

This is a better way, because tf.convert_to_tensor and co. only accept tensors of the same type, shape, etc.

Note that I removed the coordinator for simplicity, which however results in a warning:

W c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\kernels\queue_base.cc:294] _0_input_producer/input_producer/fraction_of_32_full/fraction_of_32_full: Skipping cancelled enqueue attempt with queue not closed

See this.
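For intuition about what the coordinator contributes, here is a TensorFlow-free analogy using only the standard library. It assumes a queue runner behaves like a background thread that keeps feeding a bounded queue; `producer` and `run_with_clean_shutdown` are hypothetical names, and the `Event` only plays the role that `coord.request_stop()` and `coord.join(threads)` play in the earlier code.

```python
import queue
import threading

def producer(q, stop_event, items):
    """Feed items into q until told to stop (stands in for a queue runner thread)."""
    i = 0
    while not stop_event.is_set():
        try:
            q.put(items[i % len(items)], timeout=0.05)
            i += 1
        except queue.Full:
            continue  # queue is full; loop around and re-check the stop flag

def run_with_clean_shutdown(n_reads=3):
    q = queue.Queue(maxsize=2)
    stop_event = threading.Event()
    t = threading.Thread(target=producer, args=(q, stop_event, ["a", "b", "c"]))
    t.start()
    try:
        got = [q.get(timeout=1.0) for _ in range(n_reads)]
    finally:
        stop_event.set()  # analogous to coord.request_stop()
        t.join()          # analogous to coord.join(threads)
    return got, t.is_alive()

got, still_running = run_with_clean_shutdown()
print(got, still_running)  # ['a', 'b', 'c'] False
```

Without the stop request and join, the feeding thread would still be trying to enqueue when the consumer goes away, which is roughly the situation the warning above reports.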

2017-05-11 18:25:01