
I am currently working on a medical image classification problem. The images are multiple slices of each patient's brain, and the data has already been cleaned. I have 150 AD patients, 150 MCI (mild cognitive impairment) patients and 150 NC (normal control) patients. Each patient has 96 DICOM files, that is, 96 slices, and each slice is 160*160. When I train, I get a resource exhausted error: OOM when allocating tensor with shape [384,192].
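
Just to put the raw data volume in perspective, here is a back-of-envelope sketch (assuming the usual 16-bit DICOM pixels):

patients = 150 * 3                              # AD + MCI + NC
dataset_bytes = patients * 96 * 160 * 160 * 2   # 96 slices per patient, 2 bytes per pixel
print(dataset_bytes / 2.0**30)                  # about 2.1 GiB for the whole data set

So the images themselves only amount to a couple of gigabytes in total.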

I am using the TensorFlow cifar10 code as my template, and to make it work I had to change its read_cifar10 part. I changed the code following the suggestions in this question:

Attach a queue to a numpy array in tensorflow for data fetch instead of files?

First, I convert my data into binary files with my own Python module, load_img_M.py. To reduce the amount of data, I only select the middle slices, from slice 30 to slice 60:

import numpy as np
import dicom
import os

ad_path = '/home/zmz/Pictures/AD'
mci_path = '/home/zmz/Pictures/MCI'
nc_path = '/home/zmz/Pictures/NC'
data_path = ['/home/zmz/Pictures/%s' % i for i in ['NC', 'MCI', 'AD']]
KDOWN = 29
KUP = 61
SLICESNUM = KUP - KDOWN + 1
RECORDBYTES = 160 * 160 * SLICESNUM + 1


# load the images from the directories and save them to binary files
def img2binary():
    train_arr = np.zeros([100, SLICESNUM * 160 * 160 + 1])
    test_arr = np.zeros([50, SLICESNUM * 160 * 160 + 1])
    for p in range(len(data_path)):
        Patientlist = os.listdir(data_path[p])
        for q in range(len(Patientlist)):
            Dicompath = os.path.join(data_path[p], Patientlist[q])
            Dicomlist = os.listdir(Dicompath)
            if q < 100:
                train_arr[q, 0] = p  # assign the label of the patient
            else:
                test_arr[q - 100, 0] = p
            for k in range(len(Dicomlist)):
                if k > KDOWN and k < KUP:  # select the middle slices, which carry the most information
                    Picturepath = os.path.join(Dicompath, Dicomlist[k])
                    img = dicom.read_file(Picturepath)
                    # print(type(img.pixel_array))
                    imgpixel = img.pixel_array.reshape(25600)
                    if q < 100:
                        # assign the pixels of this slice
                        train_arr[q, (1 + (k - KDOWN - 1) * 25600):(1 + (k - KDOWN) * 25600)] = imgpixel
                    else:
                        test_arr[q - 100, (1 + (k - KDOWN - 1) * 25600):(1 + (k - KDOWN) * 25600)] = imgpixel
        train_arr.tofile("/home/zmz/Pictures/tmp/images/train%s.bin" % p)
        test_arr.tofile("/home/zmz/Pictures/tmp/images/test%s.bin" % p)

The binary files look like this:

What the binary files look like
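
For reference, one patient record in these files can be inspected with NumPy like this (a minimal sketch; the dtype has to match what tofile wrote, and since np.zeros creates float64 arrays by default, every value here occupies 8 bytes):

import numpy as np

SLICESNUM = 33                                 # KUP - KDOWN + 1, as in load_img_M.py
RECORD_LEN = 1 + SLICESNUM * 160 * 160         # one label value plus the slice pixels

# img2binary wrote the arrays as float64, so read them back with the same dtype.
data = np.fromfile('/home/zmz/Pictures/tmp/images/train0.bin',
                   dtype=np.float64).reshape(-1, RECORD_LEN)

label = int(data[0, 0])                        # 0 = NC, 1 = MCI, 2 = AD
slices = data[0, 1:].reshape(SLICESNUM, 160, 160)
print(label, slices.shape)

Each training file therefore holds 100 records of 844,801 values each (and each test file holds 50).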

Next, I changed the cifar10_input module:

"""Routine for decoding the Alzeheimer dicom format""" 

from __future__ import absolute_import 
from __future__ import division 
from __future__ import print_function 
import load_img_M 
import os 
import numpy as np 
import dicom 

from six.moves import xrange # pylint: disable=redefined-builtin 
import tensorflow as tf 

# Process images of this size. Note that this differs from the original CIFAR 
# image size of 32 x 32. If one alters this number, then the entire model 
# architecture will change and any model would need to be retrained. 
IMAGE_SIZE = 160 

# Global constants describing the ADNI data set. 
IMAGE_HEIGHT = 160 
IMAGE_WIDTH = 160 
IMAGE_CHANNEL = 1 
SLICES_NUM = load_img_M.SLICESNUM 
NUM_CLASSES = 3 
NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN = 300 
NUM_EXAMPLES_PER_EPOCH_FOR_EVAL = 150 
#define a dicom reader to read a record 


def read_ADNI(filename_queue): 
    """Reads and parses examples from ADNI data files. 

    Recommendation: if you want N-way read parallelism, call this function 
    N times. This will give you N independent Readers reading different 
    files & positions within those files, which will give better mixing of 
    examples. 

    Args: 
    filename_queue: A queue of strings with the filenames to read from. 

    Returns: 
    An object representing a single example, with the following fields: 
     height: number of rows in the result (160) 
     width: number of columns in the result (160) 
     channels: number of color channels in the result (1) 
     key: a scalar string Tensor describing the filename & record number 
     for this example. 
     label: an int32 Tensor with the label in the range 0,1,2. 
     uint8image: a [slice, height, width, channels] uint8 Tensor with the image data 
    """ 

    class ADNIRecord(object):
        pass  # the class body is intentionally empty; it only holds the result fields
    result = ADNIRecord()


    label_bytes = 1 
    result.height = IMAGE_HEIGHT 
    result.width = IMAGE_WIDTH 
    result.depth = IMAGE_CHANNEL 
    result.slice = SLICES_NUM 
    image_bytes = result.height * result.width * result.depth * result.slice 
    record_bytes = label_bytes + image_bytes 

    assert record_bytes == load_img_M.RECORDBYTES 

    reader = tf.FixedLengthRecordReader(record_bytes=record_bytes) 
    result.key, value = reader.read(filename_queue) 

    # Convert from a string to a vector of uint8 that is record_bytes long. 
    record_bytes = tf.decode_raw(value, tf.uint8) 

    # The first bytes represent the label, which we convert from uint8->int32. 
    result.label = tf.cast(
     tf.strided_slice(record_bytes, [0], [label_bytes]), tf.int32) 

    # The remaining bytes after the label represent the image, which we reshape
    # from [slice * height * width * depth] to [slice, height, width, depth].
    depth_major = tf.reshape(
        tf.strided_slice(record_bytes, [label_bytes],
                         [label_bytes + image_bytes]),
        [result.slice, result.height, result.width, result.depth])
    # The CIFAR-10 transpose from [depth, height, width] to [height, width, depth]
    # is not needed here, so the slice-major tensor is used directly.
    # result.uint8image = tf.transpose(depth_major, [1, 2, 0])
    result.uint8image = depth_major
    return result 

Finally, I changed distorted_inputs. I removed the blocks that do image preprocessing, such as cropping and flipping:

def distorted_inputs(data_dir, batch_size): 
    """Construct distorted input for ADNI training using the Reader ops. 

    Args: 
    data_dir: Path to the ADNI data directory. 
    batch_size: Number of images per batch. 

    Returns: 
    images: Images. 5D tensor of [batch_size, slices , IMAGE_SIZE , IMAGE_SIZE, 1] size. 
    labels: Labels. 1D tensor of [batch_size] size. 
    """ 
    #filenames = [os.path.join(data_dir, 'data_batch_%d.bin' % i) 
    #   for i in xrange(1, 6)]#data_batch_1,2,3.bin 
    filenames = [os.path.join(data_dir,'tmp/images/train%s.bin' % i) for i in [0,1,2]] 
    for f in filenames:
        if not tf.gfile.Exists(f):
            raise ValueError('Failed to find file: ' + f)

    # Create a queue that produces the filenames to read. 
    filename_queue = tf.train.string_input_producer(filenames) 
    # Call the read_ADNI function defined at the very beginning.
    # Read examples from files in the filename queue. 
    read_input = read_ADNI(filename_queue) 
    reshaped_image = tf.cast(read_input.uint8image, tf.float32) 
    # Set the shapes of the tensors.
    reshaped_image.set_shape([SLICES_NUM, IMAGE_HEIGHT, IMAGE_WIDTH, IMAGE_CHANNEL])
    read_input.label.set_shape([1])

    # Ensure that the random shuffling has good mixing properties.
    min_fraction_of_examples_in_queue = 0.4
    min_queue_examples = int(NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN *
                             min_fraction_of_examples_in_queue)
    print('Filling queue with %d ADNI images before starting to train. '
          'This will take a few minutes.' % min_queue_examples)

    return _generate_image_and_label_batch(reshaped_image, read_input.label,
                                           min_queue_examples, batch_size,
                                           shuffle=True)
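
For a rough sense of what this queue costs in memory, here is a back-of-envelope sketch (it assumes the float32 examples above and the capacity formula used by the stock cifar10 _generate_image_and_label_batch, namely min_queue_examples + 3 * batch_size):

example_bytes = 33 * 160 * 160 * 1 * 4      # one float32 example: ~3.2 MiB
min_queue_examples = int(300 * 0.4)         # 120, as computed above
capacity = min_queue_examples + 3 * 1       # with batch_size = 1
print(capacity * example_bytes / 2.0**20)   # at most roughly 400 MiB buffered

If the input pipeline is placed on the CPU, as the stock cifar10 training script does, this buffer lives in host RAM rather than on the GPU.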

This is a 3D convolutional network problem, so I use tf.nn.conv3d and modified the model so that the code works:

def inference(images):#core code 
    """Build the ADNI model. 

    Args: 
    images: Images returned from distorted_inputs() or inputs(). 

    Returns: 
    Logits. 
    """ 
    # We instantiate all variables using tf.get_variable() instead of 
    # tf.Variable() in order to share variables across multiple GPU training runs. 
    # If we only ran this model on a single GPU, we could simplify this function 
    # by replacing all instances of tf.get_variable() with tf.Variable(). 
    # 
    # conv1 
    # conv1
    with tf.variable_scope('conv1') as scope:
        kernel = _variable_with_weight_decay('weights',
                                             shape=[3, 3, 3, 1, 64],
                                             stddev=5e-2,
                                             wd=0.0)
        conv = tf.nn.conv3d(images, kernel, [1, 1, 1, 1, 1], padding='SAME')
        biases = _variable_on_cpu('biases', [64], tf.constant_initializer(0.0))
        pre_activation = tf.nn.bias_add(conv, biases)
        conv1 = tf.nn.relu(pre_activation, name=scope.name)
        _activation_summary(conv1)

    # pool1
    pool1 = tf.nn.max_pool3d(conv1, ksize=[1, 3, 3, 3, 1], strides=[1, 1, 2, 2, 1],
                             padding='SAME', name='pool1')
    # norm1
    # norm1 = tf.nn.lrn3d(pool1, 4, bias=1.0, alpha=0.001/9.0, beta=0.75, name='norm1')
    norm1 = pool1

    # conv2
    with tf.variable_scope('conv2') as scope:
        kernel = _variable_with_weight_decay('weights',
                                             shape=[3, 3, 3, 64, 64],
                                             stddev=5e-2,
                                             wd=0.0)
        conv = tf.nn.conv3d(norm1, kernel, [1, 1, 1, 1, 1], padding='SAME')
        biases = _variable_on_cpu('biases', [64], tf.constant_initializer(0.1))
        pre_activation = tf.nn.bias_add(conv, biases)
        conv2 = tf.nn.relu(pre_activation, name=scope.name)
        _activation_summary(conv2)

    # norm2
    # norm2 = tf.nn.lrn3d(conv2, 4, bias=1.0, alpha=0.001/9.0, beta=0.75, name='norm2')
    norm2 = conv2
    # pool2
    pool2 = tf.nn.max_pool3d(norm2, ksize=[1, 3, 3, 3, 1],
                             strides=[1, 1, 2, 2, 1], padding='SAME', name='pool2')

    # local3
    with tf.variable_scope('local3') as scope:
        # Move everything into depth so we can perform a single matrix multiply.
        reshape = tf.reshape(pool2, [FLAGS.batch_size, -1])
        dim = reshape.get_shape()[1].value
        weights = _variable_with_weight_decay('weights', shape=[dim, 384],
                                              stddev=0.04, wd=0.004)
        biases = _variable_on_cpu('biases', [384], tf.constant_initializer(0.1))
        local3 = tf.nn.relu(tf.matmul(reshape, weights) + biases, name=scope.name)
        _activation_summary(local3)

    # local4
    with tf.variable_scope('local4') as scope:
        weights = _variable_with_weight_decay('weights', shape=[384, 192],
                                              stddev=0.04, wd=0.004)
        biases = _variable_on_cpu('biases', [192], tf.constant_initializer(0.1))
        local4 = tf.nn.relu(tf.matmul(local3, weights) + biases, name=scope.name)
        _activation_summary(local4)

    # linear layer (WX + b)
    # We don't apply softmax here because
    # tf.nn.sparse_softmax_cross_entropy_with_logits accepts the unscaled logits
    # and performs the softmax internally for efficiency.
    with tf.variable_scope('softmax_linear') as scope:
        weights = _variable_with_weight_decay('weights', [192, NUM_CLASSES],
                                              stddev=1/192.0, wd=0.0)
        biases = _variable_on_cpu('biases', [NUM_CLASSES],
                                  tf.constant_initializer(0.0))
        softmax_linear = tf.add(tf.matmul(local4, weights), biases, name=scope.name)
        _activation_summary(softmax_linear)

    return softmax_linear

Because I was afraid the computer could not handle this large amount of data, I chose a batch size of 1 at first, hoping the code could at least run. Even then I get the OOM error above. I actually have a quite good Linux workstation: a 12 GB Titan Xp GPU and 64 GB of RAM. But I share it with classmates, so the resources allocated to my account may be limited.

My Linux GPU parameters

It would be even better if you could show with some calculation why the resources are exhausted.

Answer


The network is too big even for a batch size of 1. The problem lies in the number of weights it needs, not in the batch. Try removing a couple of layers, or reduce the resolution of the 3D images, and see where the limit is.
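
A rough calculation shows where the weights go (a sketch based on the inference code in the question, assuming SLICESNUM = 33 and 160*160 slices; both pooling layers halve only the height and width, since their stride in the slice dimension is 1):

# pool1: 160x160 -> 80x80, pool2: 80x80 -> 40x40; 33 slices and 64 channels remain.
dim = 33 * 40 * 40 * 64             # flattened pool2: 3,379,200 values per example
local3_params = dim * 384           # local3 weight matrix: about 1.3 billion parameters
print(local3_params * 4 / 2.0**30)  # about 4.8 GiB in float32 for this one matrix

Add the matching gradient tensor and the moving-average copy of every trainable variable that the cifar10 template keeps, and this single fully connected layer alone asks for roughly three times that, well beyond 12 GB, before any activations are stored. The [384, 192] tensor in the error message (the local4 weights) is simply the allocation that happened to fail last. Halving the slice resolution cuts dim, and with it the local3 weight matrix, by a factor of four.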


Is there a better way to feed in DICOM data? Does queuing these binary files with 16 threads take a lot of RAM? – oliverzzm


I don't think your problem is the data. The memory is simply too small to store the weights of the convolutional network; feeding the data in differently won't make a significant difference in your case. –


I resized the slices to 80*80 and there are no more resource exhaustion problems. – oliverzzm