Using TensorFlow TFRecords for datasets with different image sizes

In the TensorFlow tutorial, the MNIST dataset is used as an example of how to work with TFRecords. The MNIST dataset is converted to a TFRecords file like this:

# FLAGS, _int64_feature and _bytes_feature are defined elsewhere in the tutorial script.
def convert_to(data_set, name):
    images = data_set.images
    labels = data_set.labels
    num_examples = data_set.num_examples

    if images.shape[0] != num_examples:
        raise ValueError('Images size %d does not match label size %d.' %
                         (images.shape[0], num_examples))
    rows = images.shape[1]
    cols = images.shape[2]
    depth = images.shape[3]

    filename = os.path.join(FLAGS.directory, name + '.tfrecords')
    print('Writing', filename)
    writer = tf.python_io.TFRecordWriter(filename)
    for index in range(num_examples):
        # Each image is flattened to raw bytes; its shape is stored alongside it.
        image_raw = images[index].tostring()
        example = tf.train.Example(features=tf.train.Features(feature={
            'height': _int64_feature(rows),
            'width': _int64_feature(cols),
            'depth': _int64_feature(depth),
            'label': _int64_feature(int(labels[index])),
            'image_raw': _bytes_feature(image_raw)}))
        writer.write(example.SerializeToString())
    writer.close()

and then it is read and decoded like this:

def read_and_decode(filename_queue): 
    reader = tf.TFRecordReader() 
    _, serialized_example = reader.read(filename_queue) 
    features = tf.parse_single_example(
        serialized_example,
        # Defaults are not specified since both keys are required.
        features={
            'image_raw': tf.FixedLenFeature([], tf.string),
            'label': tf.FixedLenFeature([], tf.int64),
        })

    # Convert from a scalar string tensor (whose single string has 
    # length mnist.IMAGE_PIXELS) to a uint8 tensor with shape 
    # [mnist.IMAGE_PIXELS]. 
    image = tf.decode_raw(features['image_raw'], tf.uint8) 
    image.set_shape([mnist.IMAGE_PIXELS]) 

    # OPTIONAL: Could reshape into a 28x28 image and apply distortions 
    # here. Since we are not applying any distortions in this 
    # example, and the next step expects the image to be flattened 
    # into a vector, we don't bother. 

    # Convert from [0, 255] -> [-0.5, 0.5] floats. 
    image = tf.cast(image, tf.float32) * (1./255) - 0.5 

    # Convert label from a scalar uint8 tensor to an int32 scalar. 
    label = tf.cast(features['label'], tf.int32) 

    return image, label

Question: is there a way to read images of different sizes from TFRecords? Because at this point

image.set_shape([mnist.IMAGE_PIXELS])

the dimensions of all tensors need to be known. This means I cannot do something like

width = tf.cast(features['width'], tf.int32) 
height = tf.cast(features['height'], tf.int32) 
tf.reshape(image, [width, height, 3])

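So how can I use TFRecords in this case? I also don't understand why the tutorial's authors store the height and width in the TFRecords file if they are not used afterwards, and instead use predefined constants when reading and decoding the image.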

Source

2016-08-04 sergekondrat

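For this particular case of training there is no reason to keep the width and height around. However, since the images are serialized to a single byte stream, your future self might wonder what shape the data originally had, instead of just seeing 784 bytes. In essence, they are simply creating self-contained examples.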
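To illustrate why a self-contained record is useful, here is a minimal NumPy-only sketch. The helper names `to_record` and `from_record` are hypothetical and a plain dict stands in for a `tf.train.Example`; the point is only that the raw byte stream carries no shape information of its own.

```python
import numpy as np

def to_record(image):
    """Flatten an HxWxC uint8 image to bytes and keep its shape alongside."""
    height, width, depth = image.shape
    return {'height': height, 'width': width, 'depth': depth,
            'image_raw': image.tobytes()}

def from_record(record):
    """Rebuild the image from the raw bytes using the stored shape."""
    flat = np.frombuffer(record['image_raw'], dtype=np.uint8)
    return flat.reshape(record['height'], record['width'], record['depth'])

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28, 1), dtype=np.uint8)

record = to_record(image)
print(len(record['image_raw']))          # 784 bytes, shape is not recoverable from this alone
restored = from_record(record)
print(np.array_equal(restored, image))   # True
```

Without the stored height, width and depth, the same 784 bytes could just as well be read as a 14x56x1 image.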

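For images of different sizes, keep in mind that at some point you need to map the feature tensors to weights, and since the number of weights is fixed for a given network, the dimensions of the feature tensors must be fixed as well. Another point to consider is data normalization: if you are using differently shaped images, do they have the same mean and variance? You might choose to ignore this, but if you don't, you have to come up with a solution for that as well.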
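One possible answer to the normalization question is to standardize every image on its own, so that images of any shape end up with roughly zero mean and unit variance. The sketch below is a NumPy illustration under that assumption; the function name `standardize` is made up here, and the lower bound on the standard deviation follows the formula that `tf.image.per_image_standardization` documents.

```python
import numpy as np

def standardize(image):
    """Scale one image to zero mean and unit variance, whatever its shape."""
    image = image.astype(np.float32)
    # Guard against division by zero for constant images.
    std = max(float(image.std()), 1.0 / np.sqrt(image.size))
    return (image - image.mean()) / std

rng = np.random.default_rng(0)
small = rng.integers(0, 256, size=(28, 28, 1), dtype=np.uint8)
large = rng.integers(0, 256, size=(100, 100, 3), dtype=np.uint8)

for img in (small, large):
    out = standardize(img)
    print(out.shape, round(float(out.mean()), 3), round(float(out.std()), 3))
```

This removes per-image brightness and contrast differences, which may or may not be what you want for your data.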

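If you are just asking about using images of a different size, i.e. 100x100x3 instead of 28x28x1, you can of course use

image = tf.reshape(image, [100, 100, 3])

to reshape the flat tensor of 30000 "elements" in total into a single rank-3 tensor. (image.set_shape only annotates the static shape and cannot change the rank of the flat tensor returned by tf.decode_raw.) Or, if you are working with batches (of a size yet to be determined), you can use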

image_batch.set_shape([None, 100, 100, 3])

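Note that this is not a list of tensors but a single rank-4 tensor, so all images in that batch must have the same dimensions; i.e. it is not possible to have a 100x100x3 image followed by a 28x28x1 image in the same batch.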
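The same constraint can be seen with plain NumPy arrays. In this sketch, `pad_to` is a hypothetical helper that zero-pads (or crops) at the bottom and right, and both images are assumed to have three channels to keep it short:

```python
import numpy as np

def pad_to(image, height, width):
    """Zero-pad or crop an HxWxC image at the bottom/right to height x width."""
    out = np.zeros((height, width, image.shape[2]), dtype=image.dtype)
    h = min(height, image.shape[0])
    w = min(width, image.shape[1])
    out[:h, :w] = image[:h, :w]
    return out

a = np.ones((100, 100, 3), dtype=np.uint8)
b = np.ones((28, 28, 3), dtype=np.uint8)

try:
    np.stack([a, b])            # differently shaped images cannot form one tensor
except ValueError as err:
    print('cannot batch:', err)

batch = np.stack([pad_to(a, 100, 100), pad_to(b, 100, 100)])
print(batch.shape)              # (2, 100, 100, 3)
```

Padding is only one option; resizing or cropping to a common size before batching works the same way.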

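Before batching, though, you are free to have any size and shape you want, and you can load the shape from the records as well, which they did not do in the MNIST example. For example, you might apply any of the image processing operations in order to obtain an augmented image of fixed size for further processing.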
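Loading the shape from the record and then bringing every image to a fixed size could look roughly like the sketch below. It assumes TensorFlow 2.x with eager execution (so `tf.io.*` instead of the older names used in the question); the helper names `make_example` and `parse_example`, the 32x32 target size, and the choice of `tf.image.resize_with_crop_or_pad` are illustrative assumptions, not part of the tutorial.

```python
import numpy as np
import tensorflow as tf

def _int64(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def make_example(image):
    """Serialize one HxWxC uint8 image together with its shape."""
    height, width, depth = image.shape
    feature = {
        'height': _int64(height),
        'width': _int64(width),
        'depth': _int64(depth),
        'image_raw': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image.tobytes()])),
    }
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    return example.SerializeToString()

def parse_example(serialized, target_height=32, target_width=32):
    """Decode one record, restore its own shape, then crop/pad to a fixed size."""
    features = tf.io.parse_single_example(serialized, {
        'height': tf.io.FixedLenFeature([], tf.int64),
        'width': tf.io.FixedLenFeature([], tf.int64),
        'depth': tf.io.FixedLenFeature([], tf.int64),
        'image_raw': tf.io.FixedLenFeature([], tf.string),
    })
    image = tf.io.decode_raw(features['image_raw'], tf.uint8)
    # The shape is a tensor read from the record, not a Python constant.
    shape = tf.cast(tf.stack([features['height'],
                              features['width'],
                              features['depth']]), tf.int32)
    image = tf.reshape(image, shape)
    return tf.image.resize_with_crop_or_pad(image, target_height, target_width)

small = parse_example(make_example(np.zeros((10, 12, 3), dtype=np.uint8)))
large = parse_example(make_example(np.ones((50, 40, 3), dtype=np.uint8)))
print(small.shape, large.shape)
```

After this step every image has the same height and width, so the results can be batched into a single rank-4 tensor.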

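Also note that the serialized representations of the images may indeed have different lengths and shapes. For example, you might decide to store JPEG or PNG bytes instead of raw pixel values; they would obviously have different sizes.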
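As a small sketch of the encoded-bytes variant, again assuming TensorFlow 2.x with eager execution: a PNG byte string has a length that depends on the image content, and it carries its own height, width and channel count, so the decoder recovers the shape without extra features. The helper names `encode` and `decode` are made up for this example.

```python
import numpy as np
import tensorflow as tf

def encode(image):
    """Encode an HxWxC uint8 image as PNG and return the raw bytes."""
    return tf.io.encode_png(image).numpy()

def decode(png_bytes):
    """Decode PNG bytes; the shape comes from the PNG data itself."""
    return tf.io.decode_png(png_bytes)

rng = np.random.default_rng(0)
flat_image = np.zeros((10, 12, 3), dtype=np.uint8)
noisy_image = rng.integers(0, 256, size=(50, 40, 3), dtype=np.uint8)

for img in (flat_image, noisy_image):
    data = encode(img)
    print(len(data), decode(data).shape)
```

In a TFRecords file such bytes would still be stored as a single string feature, just like image_raw in the tutorial.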

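Finally, there is also tf.VarLenFeature(), but it creates SparseTensor representations (tf.FixedLenFeature(), by contrast, yields dense tensors). That is usually not relevant for non-binary images.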

Source

2017-02-12 22:17:00 sunside
