
Image processing: how to process images for use with Inception

I want to use TensorFlow's Inception v3 model to assign labels to an image. My goal is to convert a JPEG image into the input that the Inception neural network accepts. I don't know how to process the images first so that they can be run against Google's Inception v3 model. The original TensorFlow project is here: https://github.com/tensorflow/models/tree/master/inception

Originally, all the images are in a dataset, and the entire dataset is first passed to inputs() or distorted_inputs() in ImageProcessing.py. The images from the dataset are processed and then handed to the train() or eval() methods (both of these work fine). The problem is that I want a function that prints out the label for one specific image, not for a whole dataset.

Below is the code for the inference function that generates the labels in Google's Inception. The inception_v4 function is a convolutional neural network implemented in TensorFlow.

def inference(images, num_classes, for_training=False, restore_logits=True,
              scope=None):
    """Build Inception v3 model architecture.

    See here for reference: http://arxiv.org/abs/1512.00567

    Args:
      images: Images returned from inputs() or distorted_inputs().
      num_classes: number of classes
      for_training: If set to `True`, build the inference model for training.
        Kernels that operate differently for inference during training
        e.g. dropout, are appropriately configured.
      restore_logits: whether or not the logits layers should be restored.
        Useful for fine-tuning a model with different num_classes.
      scope: optional prefix string identifying the ImageNet tower.

    Returns:
      Logits. 2-D float Tensor.
      Auxiliary Logits. 2-D float Tensor of side-head. Used for training only.
    """
    # Parameters for BatchNorm.
    batch_norm_params = {
        # Decay for the moving averages.
        'decay': BATCHNORM_MOVING_AVERAGE_DECAY,
        # epsilon to prevent 0s in variance.
        'epsilon': 0.001,
    }
    # Set weight_decay for weights in Conv and FC layers.
    with slim.arg_scope([slim.ops.conv2d, slim.ops.fc], weight_decay=0.00004):
        with slim.arg_scope([slim.ops.conv2d],
                            stddev=0.1,
                            activation=tf.nn.relu,
                            batch_norm_params=batch_norm_params):
            logits, endpoints = inception_v4(
                images,
                dropout_keep_prob=0.8,
                num_classes=num_classes,
                is_training=for_training,
                scope=scope)

    # Add summaries for viewing model statistics on TensorBoard.
    _activation_summaries(endpoints)

    # Grab the logits associated with the side head. Employed during training.
    auxiliary_logits = endpoints['AuxLogits']

    return logits, auxiliary_logits
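
To make the goal concrete, this is roughly the kind of helper I am after once inference() has produced logits for a single preprocessed image. It is only a sketch under my assumptions: logits is the [1, num_classes] tensor returned above, a session with restored weights is already available, and class_names is a hypothetical Python list mapping class indices to readable labels.

def print_label(sess, logits, class_names):
    # logits: 2-D float Tensor of shape [1, num_classes] from inference().
    # class_names: hypothetical list mapping class index -> readable label.
    probs = tf.nn.softmax(logits)
    idx, p = sess.run([tf.argmax(probs, 1), tf.reduce_max(probs, 1)])
    print('%s (probability %.3f)' % (class_names[idx[0]], p[0]))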

Here is my attempt at processing the image before it is passed to the inference function.

def process_image(self, image_path): 
    filename_queue = tf.train.string_input_producer(image_path) 
    reader = tf.WholeFileReader() 
    key, value = reader.read(filename_queue) 

    img = tf.image.decode_jpeg(value) 
    height = self.image_size 
    width = self.image_size 
    image_data = tf.cast(img, tf.float32) 
    image_data = tf.reshape(image_data, shape=[1, height, width, 3]) 
    return image_data 

I simply want to process a single image file so that I can pass it to the inference function, which would then print out the labels. The code above did not work and printed this error:

ValueError: Shape () must have rank at least 1
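
From the message, the failure points at the very first line of process_image: tf.train.string_input_producer expects a 1-D list of filename strings, so passing a bare string (rank 0) raises exactly this error. Below is a minimal sketch of that change, assuming tensorflow is imported as tf; it only removes the rank error, and the decoded JPEG also has to be resized, not just reshaped, to the fixed input size.

def process_image(self, image_path):
    # string_input_producer wants a 1-D list of filenames, not a bare string;
    # wrapping the path in a list avoids the "rank at least 1" error.
    filename_queue = tf.train.string_input_producer([image_path])
    reader = tf.WholeFileReader()
    key, value = reader.read(filename_queue)

    img = tf.image.decode_jpeg(value, channels=3)
    # Resize (resample) to image_size x image_size instead of reshaping,
    # since the decoded JPEG will usually not already have that shape.
    img = tf.image.resize_images(tf.cast(img, tf.float32),
                                 [self.image_size, self.image_size])
    return tf.expand_dims(img, 0)  # batch of one: [1, height, width, 3]

Note that reader.read() only returns once tf.train.start_queue_runners() is running in the session.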

I would appreciate it if anyone could offer any insight into this problem.

Answer


Inception simply needs (299, 299, 3) images with the inputs scaled to between -1 and 1. See the code below. I just transform the images with it and put them into a TFRecord (and then a queue) to run my stuff.

from PIL import Image
import PIL
import numpy as np

def load_image(self, image_path):
    # Resize to the 299x299 input Inception expects and force three RGB channels.
    img = Image.open(image_path)
    newImg = img.resize((299, 299), PIL.Image.BILINEAR).convert("RGB")
    data = np.array(newImg.getdata())
    # Reshape to (299, 299, 3) and rescale pixel values from [0, 255] to [-1, 1].
    return 2 * (data.reshape((newImg.size[0], newImg.size[1], 3)).astype(np.float32) / 255) - 1
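
A possible way to wire this into the inference() function from the question is to feed the single image through a placeholder with a leading batch dimension of one. The placeholder, the num_classes value, the omitted checkpoint restore, and the file name below are all illustrative, not part of the original project.

import numpy as np
import tensorflow as tf

image_input = tf.placeholder(tf.float32, shape=[1, 299, 299, 3])
logits, _ = inference(image_input, num_classes=1001, for_training=False)

with tf.Session() as sess:
    # Restoring the pretrained variables (e.g. with tf.train.Saver) is omitted here.
    img = load_image(None, 'example.jpg')  # (299, 299, 3), values in [-1, 1]; self is unused
    out = sess.run(logits, {image_input: img[np.newaxis, ...]})
    print('top class index:', out.argmax())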