
Looking through the prototxt of GoogLeNet, I noticed that each inception module ends with a Concat layer that takes several bottom inputs. What is the output dimension after the Concat layer in GoogLeNet?

For example:

layer { 
    name: "inception_3a/output" 
    type: "Concat" 
    bottom: "inception_3a/1x1" 
    bottom: "inception_3a/3x3" 
    bottom: "inception_3a/5x5" 
    bottom: "inception_3a/pool_proj" 
    top: "inception_3a/output" 
} 

As can be seen, there is a 1x1 conv layer, a 3x3 conv layer, a 5x5 conv layer, and finally a pooling branch. These layers are defined as follows:

layer {
    name: "inception_3a/1x1"
    type: "Convolution"
    bottom: "pool2/3x3_s2"
    top: "inception_3a/1x1"
    param {
        lr_mult: 1
        decay_mult: 1
    }
    param {
        lr_mult: 2
        decay_mult: 0
    }
    convolution_param {
        num_output: 64
        kernel_size: 1
        weight_filler {
            type: "xavier"
            std: 0.03
        }
        bias_filler {
            type: "constant"
            value: 0.2
        }
    }
}
layer {
    name: "inception_3a/relu_1x1"
    type: "ReLU"
    bottom: "inception_3a/1x1"
    top: "inception_3a/1x1"
}
layer {
    name: "inception_3a/3x3_reduce"
    type: "Convolution"
    bottom: "pool2/3x3_s2"
    top: "inception_3a/3x3_reduce"
    param {
        lr_mult: 1
        decay_mult: 1
    }
    param {
        lr_mult: 2
        decay_mult: 0
    }
    convolution_param {
        num_output: 96
        kernel_size: 1
        weight_filler {
            type: "xavier"
            std: 0.09
        }
        bias_filler {
            type: "constant"
            value: 0.2
        }
    }
}
layer {
    name: "inception_3a/relu_3x3_reduce"
    type: "ReLU"
    bottom: "inception_3a/3x3_reduce"
    top: "inception_3a/3x3_reduce"
}
layer {
    name: "inception_3a/3x3"
    type: "Convolution"
    bottom: "inception_3a/3x3_reduce"
    top: "inception_3a/3x3"
    param {
        lr_mult: 1
        decay_mult: 1
    }
    param {
        lr_mult: 2
        decay_mult: 0
    }
    convolution_param {
        num_output: 128
        pad: 1
        kernel_size: 3
        weight_filler {
            type: "xavier"
            std: 0.03
        }
        bias_filler {
            type: "constant"
            value: 0.2
        }
    }
}
layer {
    name: "inception_3a/relu_3x3"
    type: "ReLU"
    bottom: "inception_3a/3x3"
    top: "inception_3a/3x3"
}
layer {
    name: "inception_3a/5x5_reduce"
    type: "Convolution"
    bottom: "pool2/3x3_s2"
    top: "inception_3a/5x5_reduce"
    param {
        lr_mult: 1
        decay_mult: 1
    }
    param {
        lr_mult: 2
        decay_mult: 0
    }
    convolution_param {
        num_output: 16
        kernel_size: 1
        weight_filler {
            type: "xavier"
            std: 0.2
        }
        bias_filler {
            type: "constant"
            value: 0.2
        }
    }
}
layer {
    name: "inception_3a/relu_5x5_reduce"
    type: "ReLU"
    bottom: "inception_3a/5x5_reduce"
    top: "inception_3a/5x5_reduce"
}
layer {
    name: "inception_3a/5x5"
    type: "Convolution"
    bottom: "inception_3a/5x5_reduce"
    top: "inception_3a/5x5"
    param {
        lr_mult: 1
        decay_mult: 1
    }
    param {
        lr_mult: 2
        decay_mult: 0
    }
    convolution_param {
        num_output: 32
        pad: 2
        kernel_size: 5
        weight_filler {
            type: "xavier"
            std: 0.03
        }
        bias_filler {
            type: "constant"
            value: 0.2
        }
    }
}
layer {
    name: "inception_3a/relu_5x5"
    type: "ReLU"
    bottom: "inception_3a/5x5"
    top: "inception_3a/5x5"
}
layer {
    name: "inception_3a/pool"
    type: "Pooling"
    bottom: "pool2/3x3_s2"
    top: "inception_3a/pool"
    pooling_param {
        pool: MAX
        kernel_size: 3
        stride: 1
        pad: 1
    }
}
layer {
    name: "inception_3a/pool_proj"
    type: "Convolution"
    bottom: "inception_3a/pool"
    top: "inception_3a/pool_proj"
    param {
        lr_mult: 1
        decay_mult: 1
    }
    param {
        lr_mult: 2
        decay_mult: 0
    }
    convolution_param {
        num_output: 32
        kernel_size: 1
        weight_filler {
            type: "xavier"
            std: 0.1
        }
        bias_filler {
            type: "constant"
            value: 0.2
        }
    }
}

As can be seen, these layers have different numbers of outputs and different filter sizes. The documentation on the Concat layer states:

Input:

n_i * c_i * h * w for each input blob i from 1 to K.

Output:

If axis = 0: (n_1 + n_2 + ... + n_K) * c_1 * h * w, and all input c_i should be the same.

If axis = 1: n_1 * (c_1 + c_2 + ... + c_K) * h * w, and all input n_i should be the same.
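
To make the two cases concrete, here is a minimal NumPy sketch (NumPy's concatenate follows the same shape rules as Caffe's Concat; the shapes below are arbitrary examples, not taken from GoogLeNet):

import numpy as np

a = np.zeros((4, 3, 28, 28))                  # n=4, c=3
b = np.zeros((2, 3, 28, 28))                  # n=2, same c, h, w
print(np.concatenate([a, b], axis=0).shape)   # (6, 3, 28, 28): batch sizes add up

c = np.zeros((4, 5, 28, 28))                  # same n, c=5
print(np.concatenate([a, c], axis=1).shape)   # (4, 8, 28, 28): channels add up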

First of all, I am not sure what the default setting is, and secondly I am not sure which dimensions the output volume will have, since width and height should stay the same but all three conv layers produce different numbers of outputs. Any pointers would be really appreciated.

Answer

The default value of the Concat axis is 1, so it concatenates along the channel dimension. For this to work, all concatenated layers must have the same height and width. Looking at the log, the dimensions are (assuming a batch size of 32):

inception_3a/1x1       -> [32, 64, 28, 28]
inception_3a/3x3       -> [32, 128, 28, 28]
inception_3a/5x5       -> [32, 32, 28, 28]
inception_3a/pool_proj -> [32, 32, 28, 28]

Thus the final output will have the dimensions:

inception_3a/output -> [32, (64 + 128 + 32 + 32), 28, 28] -> [32, 256, 28, 28]
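
This arithmetic is easy to check with a quick NumPy sketch (the shapes are the branch outputs listed above; np.concatenate with axis=1 mirrors Caffe's channel-wise Concat):

import numpy as np

# Branch outputs of inception_3a, assuming batch size 32
b_1x1  = np.zeros((32, 64, 28, 28))
b_3x3  = np.zeros((32, 128, 28, 28))
b_5x5  = np.zeros((32, 32, 28, 28))
b_pool = np.zeros((32, 32, 28, 28))

# Concatenate along axis 1 (channels), as Caffe's Concat does by default
out = np.concatenate([b_1x1, b_3x3, b_5x5, b_pool], axis=1)
print(out.shape)   # (32, 256, 28, 28)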

And indeed, the Caffe log confirms this:

Creating Layer inception_3a/output 
inception_3a/output <- inception_3a/1x1 
inception_3a/output <- inception_3a/3x3 
inception_3a/output <- inception_3a/5x5 
inception_3a/output <- inception_3a/pool_proj 
inception_3a/output -> inception_3a/output 
Setting up inception_3a/output 
Top shape: 32 256 28 28 (6422528) 
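
If you want to inspect the shape yourself from Python, a short pycaffe sketch along these lines should work (the prototxt path is a placeholder for your local copy of the GoogLeNet deploy definition, and the batch size will be whatever that file declares):

import caffe

# Load the network definition only; no weights are needed to inspect blob shapes
net = caffe.Net('path/to/googlenet/deploy.prototxt', caffe.TEST)  # placeholder path
print(net.blobs['inception_3a/output'].data.shape)
# Expect 256 channels at 28x28 spatial resolution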

And if the axis were 2 or 3, it would then concatenate along the corresponding dimension? – Kev1n91


Yes, but note that in this case it would not work, since not all layers have the same number of channels. In fact, all inputs must have the same dimensions in every axis except the concat axis, as can be seen in the code: [concat_layer.cpp](https://github.com/BVLC/caffe/blob/master/src/caffe/layers/concat_layer.cpp) –
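
For completeness, that constraint is easy to reproduce in NumPy (a sketch using two of the branch shapes from the answer; concatenating along axis 2 fails because the channel counts differ):

import numpy as np

b_1x1 = np.zeros((32, 64, 28, 28))
b_3x3 = np.zeros((32, 128, 28, 28))

try:
    # axis=2 would require all other dims (including channels) to match
    np.concatenate([b_1x1, b_3x3], axis=2)
except ValueError as e:
    print(e)   # reports the mismatch on axis 1 (64 vs 128)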