
I am doing regression in Caffe. The dataset consists of 400 RGB images of size 128x128, and the labels are floating-point numbers in the range (-1, 1). The only transformation I apply to the dataset is normalization (dividing each RGB pixel value by 255), but the loss does not seem to converge.

What could be causing this? Can anyone advise me?
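For reference, I build the HDF5 input roughly along these lines (a minimal sketch with assumed file names and random stand-in arrays, not my actual data); the dataset keys have to match the top names "data" and "label" in the prototxt:

import h5py
import numpy as np

# 400 RGB images, 128x128, in Caffe's N x C x H x W layout, scaled to [0, 1]
X = (np.random.randint(0, 256, size=(400, 3, 128, 128)) / 255.0).astype(np.float32)
# one float label per image in (-1, 1)
y = np.random.uniform(-1, 1, size=(400, 1)).astype(np.float32)

with h5py.File('train.h5', 'w') as f:
    f.create_dataset('data', data=X)    # matches top: "data"
    f.create_dataset('label', data=y)   # matches top: "label"

# train_hdf5file.txt then lists the path to train.h5, one HDF5 file per line.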

Here is my training log:

Training.. 
Using solver: solver_hdf5.prototxt 
I0929 21:50:21.657784 13779 caffe.cpp:112] Use CPU. 
I0929 21:50:21.658033 13779 caffe.cpp:174] Starting Optimization 
I0929 21:50:21.658107 13779 solver.cpp:34] Initializing solver from parameters: 
test_iter: 100 
test_interval: 500 
base_lr: 0.0001 
display: 25 
max_iter: 10000 
lr_policy: "inv" 
gamma: 0.0001 
power: 0.75 
momentum: 0.9 
weight_decay: 0.0005 
snapshot: 5000 
snapshot_prefix: "lenet_hdf5" 
solver_mode: CPU 
net: "train_test_hdf5.prototxt" 
I0929 21:50:21.658143 13779 solver.cpp:75] Creating training net from net file: train_test_hdf5.prototxt 
I0929 21:50:21.658567 13779 net.cpp:334] The NetState phase (0) differed from the phase (1) specified by a rule in layer data 
I0929 21:50:21.658709 13779 net.cpp:46] Initializing net from parameters: 
name: "MSE regression" 
state { 
    phase: TRAIN 
} 
layer { 
    name: "data" 
    type: "HDF5Data" 
    top: "data" 
    top: "label" 
    include { 
    phase: TRAIN 
    } 
    hdf5_data_param { 
    source: "train_hdf5file.txt" 
    batch_size: 64 
    shuffle: true 
    } 
} 
layer { 
    name: "conv1" 
    type: "Convolution" 
    bottom: "data" 
    top: "conv1" 
    param { 
    lr_mult: 1 
    } 
    param { 
    lr_mult: 2 
    } 
    convolution_param { 
    num_output: 20 
    kernel_size: 5 
    stride: 1 
    weight_filler { 
     type: "xavier" 
    } 
    bias_filler { 
     type: "constant" 
     value: 0 
    } 
    } 
} 
layer { 
    name: "relu1" 
    type: "ReLU" 
    bottom: "conv1" 
    top: "conv1" 
} 
layer { 
    name: "pool1" 
    type: "Pooling" 
    bottom: "conv1" 
    top: "pool1" 
    pooling_param { 
    pool: MAX 
    kernel_size: 2 
    stride: 2 
    } 
} 
layer { 
    name: "dropout1" 
    type: "Dropout" 
    bottom: "pool1" 
    top: "pool1" 
    dropout_param { 
    dropout_ratio: 0.1 
    } 
} 
layer { 
    name: "fc1" 
    type: "InnerProduct" 
    bottom: "pool1" 
    top: "fc1" 
    param { 
    lr_mult: 1 
    decay_mult: 1 
    } 
    param { 
    lr_mult: 2 
    decay_mult: 0 
    } 
    inner_product_param { 
    num_output: 500 
    weight_filler { 
     type: "xavier" 
    } 
    bias_filler { 
     type: "constant" 
     value: 0 
    } 
    } 
} 
layer { 
    name: "dropout2" 
    type: "Dropout" 
    bottom: "fc1" 
    top: "fc1" 
    dropout_param { 
    dropout_ratio: 0.5 
    } 
} 
layer { 
    name: "fc2" 
    type: "InnerProduct" 
    bottom: "fc1" 
    top: "fc2" 
    param { 
    lr_mult: 1 
    decay_mult: 1 
    } 
    param { 
    lr_mult: 2 
    decay_mult: 0 
    } 
    inner_product_param { 
    num_output: 1 
    weight_filler { 
     type: "xavier" 
    } 
    bias_filler { 
     type: "constant" 
     value: 0 
    } 
    } 
} 
layer { 
    name: "loss" 
    type: "EuclideanLoss" 
    bottom: "fc2" 
    bottom: "label" 
    top: "loss" 
} 
I0929 21:50:21.658833 13779 layer_factory.hpp:74] Creating layer data 
I0929 21:50:21.658859 13779 net.cpp:96] Creating Layer data 
I0929 21:50:21.658871 13779 net.cpp:415] data -> data 
I0929 21:50:21.658902 13779 net.cpp:415] data -> label 
I0929 21:50:21.658926 13779 net.cpp:160] Setting up data 
I0929 21:50:21.658936 13779 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: train_hdf5file.txt 
I0929 21:50:21.659220 13779 hdf5_data_layer.cpp:94] Number of HDF5 files: 1 
I0929 21:50:21.920578 13779 net.cpp:167] Top shape: 64 3 128 128 (3145728) 
I0929 21:50:21.920656 13779 net.cpp:167] Top shape: 64 1 (64) 
I0929 21:50:21.920686 13779 layer_factory.hpp:74] Creating layer conv1 
I0929 21:50:21.920740 13779 net.cpp:96] Creating Layer conv1 
I0929 21:50:21.920774 13779 net.cpp:459] conv1 <- data 
I0929 21:50:21.920825 13779 net.cpp:415] conv1 -> conv1 
I0929 21:50:21.920877 13779 net.cpp:160] Setting up conv1 
I0929 21:50:21.921985 13779 net.cpp:167] Top shape: 64 20 124 124 (19681280) 
I0929 21:50:21.922050 13779 layer_factory.hpp:74] Creating layer relu1 
I0929 21:50:21.922085 13779 net.cpp:96] Creating Layer relu1 
I0929 21:50:21.922108 13779 net.cpp:459] relu1 <- conv1 
I0929 21:50:21.922137 13779 net.cpp:404] relu1 -> conv1 (in-place) 
I0929 21:50:21.922185 13779 net.cpp:160] Setting up relu1 
I0929 21:50:21.922227 13779 net.cpp:167] Top shape: 64 20 124 124 (19681280) 
I0929 21:50:21.922250 13779 layer_factory.hpp:74] Creating layer pool1 
I0929 21:50:21.922277 13779 net.cpp:96] Creating Layer pool1 
I0929 21:50:21.922298 13779 net.cpp:459] pool1 <- conv1 
I0929 21:50:21.922323 13779 net.cpp:415] pool1 -> pool1 
I0929 21:50:21.922418 13779 net.cpp:160] Setting up pool1 
I0929 21:50:21.922472 13779 net.cpp:167] Top shape: 64 20 62 62 (4920320) 
I0929 21:50:21.922495 13779 layer_factory.hpp:74] Creating layer dropout1 
I0929 21:50:21.922534 13779 net.cpp:96] Creating Layer dropout1 
I0929 21:50:21.922555 13779 net.cpp:459] dropout1 <- pool1 
I0929 21:50:21.922582 13779 net.cpp:404] dropout1 -> pool1 (in-place) 
I0929 21:50:21.922613 13779 net.cpp:160] Setting up dropout1 
I0929 21:50:21.922652 13779 net.cpp:167] Top shape: 64 20 62 62 (4920320) 
I0929 21:50:21.922672 13779 layer_factory.hpp:74] Creating layer fc1 
I0929 21:50:21.922709 13779 net.cpp:96] Creating Layer fc1 
I0929 21:50:21.922729 13779 net.cpp:459] fc1 <- pool1 
I0929 21:50:21.922757 13779 net.cpp:415] fc1 -> fc1 
I0929 21:50:21.922801 13779 net.cpp:160] Setting up fc1 
I0929 21:50:22.301134 13779 net.cpp:167] Top shape: 64 500 (32000) 
I0929 21:50:22.301193 13779 layer_factory.hpp:74] Creating layer dropout2 
I0929 21:50:22.301210 13779 net.cpp:96] Creating Layer dropout2 
I0929 21:50:22.301218 13779 net.cpp:459] dropout2 <- fc1 
I0929 21:50:22.3... 13779 net.cpp:404] dropout2 -> fc1 (in-place) 
I0929 21:50:22.301244 13779 net.cpp:160] Setting up dropout2 
I0929 21:50:22.301254 13779 net.cpp:167] Top shape: 64 500 (32000) 
I0929 21:50:22.301259 13779 layer_factory.hpp:74] Creating layer fc2 
I0929 21:50:22.301270 13779 net.cpp:96] Creating Layer fc2 
I0929 21:50:22.301275 13779 net.cpp:459] fc2 <- fc1 
I0929 21:50:22.301285 13779 net.cpp:415] fc2 -> fc2 
I0929 21:50:22.301295 13779 net.cpp:160] Setting up fc2 
I0929 21:50:22.301317 13779 net.cpp:167] Top shape: 64 1 (64) 
I0929 21:50:22.301328 13779 layer_factory.hpp:74] Creating layer loss 
I0929 21:50:22.301338 13779 net.cpp:96] Creating Layer loss 
I0929 21:50:22.301343 13779 net.cpp:459] loss <- fc2 
I0929 21:50:22.301350 13779 net.cpp:459] loss <- label 
I0929 21:50:22.301360 13779 net.cpp:415] loss -> loss 
I0929 21:50:22.301374 13779 net.cpp:160] Setting up loss 
I0929 21:50:22.301385 13779 net.cpp:167] Top shape: (1) 
I0929 21:50:22.301391 13779 net.cpp:169]  with loss weight 1 
I0929 21:50:22.301419 13779 net.cpp:239] loss needs backward computation. 
I0929 21:50:22.301425 13779 net.cpp:239] fc2 needs backward computation. 
I0929 21:50:22.301430 13779 net.cpp:239] dropout2 needs backward computation. 
I0929 21:50:22.301436 13779 net.cpp:239] fc1 needs backward computation. 
I0929 21:50:22.301441 13779 net.cpp:239] dropout1 needs backward computation. 
I0929 21:50:22.301446 13779 net.cpp:239] pool1 needs backward computation. 
I0929 21:50:22.301452 13779 net.cpp:239] relu1 needs backward computation. 
I0929 21:50:22.301457 13779 net.cpp:239] conv1 needs backward computation. 
I0929 21:50:22.301463 13779 net.cpp:241] data does not need backward computation. 
I0929 21:50:22.301468 13779 net.cpp:282] This network produces output loss 
I0929 21:50:22.301482 13779 net.cpp:531] Collecting Learning Rate and Weight Decay. 
I0929 21:50:22.301491 13779 net.cpp:294] Network initialization done. 
I0929 21:50:22.301496 13779 net.cpp:295] Memory required for data: 209652228 
I0929 21:50:22.301908 13779 solver.cpp:159] Creating test net (#0) specified by net file: train_test_hdf5.prototxt 
I0929 21:50:22.301935 13779 net.cpp:334] The NetState phase (1) differed from the phase (0) specified by a rule in layer data 
I0929 21:50:22.302028 13779 net.cpp:46] Initializing net from parameters: 
name: "MSE regression" 
state { 
    phase: TEST 
} 
layer { 
    name: "data" 
    type: "HDF5Data" 
    top: "data" 
    top: "label" 
    include { 
    phase: TEST 
    } 
    hdf5_data_param { 
    source: "test_hdf5file.txt" 
    batch_size: 30 
    } 
} 
layer { 
    name: "conv1" 
    type: "Convolution" 
    bottom: "data" 
    top: "conv1" 
    param { 
    lr_mult: 1 
    } 
    param { 
    lr_mult: 2 
    } 
    convolution_param { 
    num_output: 20 
    kernel_size: 5 
    stride: 1 
    weight_filler { 
     type: "xavier" 
    } 
    bias_filler { 
     type: "constant" 
     value: 0 
    } 
    } 
} 
layer { 
    name: "relu1" 
    type: "ReLU" 
    bottom: "conv1" 
    top: "conv1" 
} 
layer { 
    name: "pool1" 
    type: "Pooling" 
    bottom: "conv1" 
    top: "pool1" 
    pooling_param { 
    pool: MAX 
    kernel_size: 2 
    stride: 2 
    } 
} 
layer { 
    name: "dropout1" 
    type: "Dropout" 
    bottom: "pool1" 
    top: "pool1" 
    dropout_param { 
    dropout_ratio: 0.1 
    } 
} 
layer { 
    name: "fc1" 
    type: "InnerProduct" 
    bottom: "pool1" 
    top: "fc1" 
    param { 
    lr_mult: 1 
    decay_mult: 1 
    } 
    param { 
    lr_mult: 2 
    decay_mult: 0 
    } 
    inner_product_param { 
    num_output: 500 
    weight_filler { 
     type: "xavier" 
    } 
    bias_filler { 
     type: "constant" 
     value: 0 
    } 
    } 
} 
layer { 
    name: "dropout2" 
    type: "Dropout" 
    bottom: "fc1" 
    top: "fc1" 
    dropout_param { 
    dropout_ratio: 0.5 
    } 
} 
layer { 
    name: "fc2" 
    type: "InnerProduct" 
    bottom: "fc1" 
    top: "fc2" 
    param { 
    lr_mult: 1 
    decay_mult: 1 
    } 
    param { 
    lr_mult: 2 
    decay_mult: 0 
    } 
    inner_product_param { 
    num_output: 1 
    weight_filler { 
     type: "xavier" 
    } 
    bias_filler { 
     type: "constant" 
     value: 0 
    } 
    } 
} 
layer { 
    name: "loss" 
    type: "EuclideanLoss" 
    bottom: "fc2" 
    bottom: "label" 
    top: "loss" 
} 
I0929 21:50:22.302146 13779 layer_factory.hpp:74] Creating layer data 
I0929 21:50:22.302158 13779 net.cpp:96] Creating Layer data 
I0929 21:50:22.302165 13779 net.cpp:415] data -> data 
I0929 21:50:22.302176 13779 net.cpp:415] data -> label 
I0929 21:50:22.302186 13779 net.cpp:160] Setting up data 
I0929 21:50:22.302191 13779 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: test_hdf5file.txt 
I0929 21:50:22.302305 13779 hdf5_data_layer.cpp:94] Number of HDF5 files: 1 
I0929 21:50:22.434798 13779 net.cpp:167] Top shape: 30 3 128 128 (1474560) 
I0929 21:50:22.434849 13779 net.cpp:167] Top shape: 30 1 (30) 
I0929 21:50:22.434864 13779 layer_factory.hpp:74] Creating layer conv1 
I0929 21:50:22.434895 13779 net.cpp:96] Creating Layer conv1 
I0929 21:50:22.434914 13779 net.cpp:459] conv1 <- data 
I0929 21:50:22.434944 13779 net.cpp:415] conv1 -> conv1 
I0929 21:50:22.434996 13779 net.cpp:160] Setting up conv1 
I0929 21:50:22.435084 13779 net.cpp:167] Top shape: 30 20 124 124 (9225600) 
I0929 21:50:22.435119 13779 layer_factory.hpp:74] Creating layer relu1 
I0929 21:50:22.435205 13779 net.cpp:96] Creating Layer relu1 
I0929 21:50:22.435237 13779 net.cpp:459] relu1 <- conv1 
I0929 21:50:22.435292 13779 net.cpp:404] relu1 -> conv1 (in-place) 
I0929 21:50:22.435328 13779 net.cpp:160] Setting up relu1 
I0929 21:50:22.435371 13779 net.cpp:167] Top shape: 30 20 124 124 (9225600) 
I0929 21:50:22.435400 13779 layer_factory.hpp:74] Creating layer pool1 
I0929 21:50:22.435443 13779 net.cpp:96] Creating Layer pool1 
I0929 21:50:22.435470 13779 net.cpp:459] pool1 <- conv1 
I0929 21:50:22.435511 13779 net.cpp:415] pool1 -> pool1 
I0929 21:50:22.435550 13779 net.cpp:160] Setting up pool1 
I0929 21:50:22.435597 13779 net.cpp:167] Top shape: 30 20 62 62 (2306400) 
I0929 21:50:22.435626 13779 layer_factory.hpp:74] Creating layer dropout1 
I0929 21:50:22.435669 13779 net.cpp:96] Creating Layer dropout1 
I0929 21:50:22.435698 13779 net.cpp:459] dropout1 <- pool1 
I0929 21:50:22.435739 13779 net.cpp:404] dropout1 -> pool1 (in-place) 
I0929 21:50:22.435780 13779 net.cpp:160] Setting up dropout1 
I0929 21:50:22.435823 13779 net.cpp:167] Top shape: 30 20 62 62 (2306400) 
I0929 21:50:22.435853 13779 layer_factory.hpp:74] Creating layer fc1 
I0929 21:50:22.435899 13779 net.cpp:96] Creating Layer fc1 
I0929 21:50:22.435926 13779 net.cpp:459] fc1 <- pool1 
I0929 21:50:22.435971 13779 net.cpp:415] fc1 -> fc1 
I0929 21:50:22.436018 13779 net.cpp:160] Setting up fc1 
I0929 21:50:22.816076 13779 net.cpp:167] Top shape: 30 500 (15000) 
I0929 21:50:22.816138 13779 layer_factory.hpp:74] Creating layer dropout2 
I0929 21:50:22.816154 13779 net.cpp:96] Creating Layer dropout2 
I0929 21:50:22.816160 13779 net.cpp:459] dropout2 <- fc1 
I0929 21:50:22.816170 13779 net.cpp:404] dropout2 -> fc1 (in-place) 
I0929 21:50:22.816182 13779 net.cpp:160] Setting up dropout2 
I0929 21:50:22.816192 13779 net.cpp:167] Top shape: 30 500 (15000) 
I0929 21:50:22.816197 13779 layer_factory.hpp:74] Creating layer fc2 
I0929 21:50:22.816208 13779 net.cpp:96] Creating Layer fc2 
I0929 21:50:22.816249 13779 net.cpp:459] fc2 <- fc1 
I0929 21:50:22.816262 13779 net.cpp:415] fc2 -> fc2 
I0929 21:50:22.816277 13779 net.cpp:160] Setting up fc2 
I0929 21:50:22.816301 13779 net.cpp:167] Top shape: 30 1 (30) 
I0929 21:50:22.816316 13779 layer_factory.hpp:74] Creating layer loss 
I0929 21:50:22.816329 13779 net.cpp:96] Creating Layer loss 
I0929 21:50:22.816337 13779 net.cpp:459] loss <- fc2 
I0929 21:50:22.816347 13779 net.cpp:459] loss <- label 
I0929 21:50:22.816359 13779 net.cpp:415] loss -> loss 
I0929 21:50:22.816370 13779 net.cpp:160] Setting up loss 
I0929 21:50:22.816381 13779 net.cpp:167] Top shape: (1) 
I0929 21:50:22.816388 13779 net.cpp:169]  with loss weight 1 
I0929 21:50:22.816407 13779 net.cpp:239] loss needs backward computation. 
I0929 21:50:22.816416 13779 net.cpp:239] fc2 needs backward computation. 
I0929 21:50:22.816426 13779 net.cpp:239] dropout2 needs backward computation. 
I0929 21:50:22.816433 13779 net.cpp:239] fc1 needs backward computation. 
I0929 21:50:22.816442 13779 net.cpp:239] dropout1 needs backward computation. 
I0929 21:50:22.816452 13779 net.cpp:239] pool1 needs backward computation. 
I0929 21:50:22.816460 13779 net.cpp:239] relu1 needs backward computation. 
I0929 21:50:22.816468 13779 net.cpp:239] conv1 needs backward computation. 
I0929 21:50:22.816478 13779 net.cpp:241] data does not need backward computation. 
I0929 21:50:22.816486 13779 net.cpp:282] This network produces output loss 
I0929 21:50:22.816500 13779 net.cpp:531] Collecting Learning Rate and Weight Decay. 
I0929 21:50:22.816510 13779 net.cpp:294] Network initialization done. 
I0929 21:50:22.816517 13779 net.cpp:295] Memory required for data: 98274484 
I0929 21:50:22.816565 13779 solver.cpp:47] Solver scaffolding done. 
I0929 21:50:22.816587 13779 solver.cpp:363] Solving MSE regression 
I0929 21:50:22.816596 13779 solver.cpp:364] Learning Rate Policy: inv 
I0929 21:50:22.870337 13779 solver.cpp:424] Iteration 0, Testing net (#0) 

[Plots: training loss at the beginning of training and after some time (BeginTrain, AfterSomeTime)]

UPDATE (following @lejlot's answer below):

My training plots after fixing the data:

[Plots: updated training loss (Train1Update, Train2Update)]

Answer

It seems to be learning: the loss is decreasing. However, there is clearly something wrong with your data. Before any learning (iteration 0) you already have a loss of 0.0006, which is an extremely small loss for a random model, so your data looks suspicious. Look at your target values: are they really well distributed between -1 and 1, or are 99% of them "0" with just a few other values? There is nothing wrong with the approach itself; you simply need to analyze your data more and make sure it actually spans the [-1, 1] interval. Once you fix that, there will be more small things to tune, but this is the biggest problem right now: you can get a tiny error with a random model, so the issue is the data, not the algorithm/method/parameters. To make things faster you could also increase the learning rate from the 0.0001 you are currently using, but as said before, fix the data first.
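For example, you can sanity-check the label distribution with a few lines of h5py (assuming the file listed in train_hdf5file.txt is named train.h5; adjust the name to yours):

import h5py
import numpy as np

with h5py.File('train.h5', 'r') as f:
    labels = f['label'][:].flatten()

print('min/max :', labels.min(), labels.max())
print('mean/std:', labels.mean(), labels.std())

# bucket the labels to see whether they really spread across [-1, 1]
hist, edges = np.histogram(labels, bins=10, range=(-1, 1))
for count, lo, hi in zip(hist, edges[:-1], edges[1:]):
    print('[%5.2f, %5.2f): %d' % (lo, hi, count))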


Thanks @lejlot for the reply. I just noticed that I had divided the image data by 255 twice: caffe.io.load_image already does the scaling itself, so I don't need to do it explicitly. I couldn't raise the learning rate because of -nan errors. With a learning rate of 0.0001 I now get a loss of 0.08 at iteration 0, and I hope to see a clear decrease in loss from here; I think it is fine now. Also, could you tell me what loss I should be satisfied with (i.e., when should I consider my training good enough)? I have attached the new training plots. – magneto
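For anyone who runs into the same thing, here is a small illustration of why the double division hurts so much (the image path is hypothetical): caffe.io.load_image already returns float pixels in [0, 1], so dividing by 255 again squeezes every value into [0, 0.004] and the network sees almost no signal.

import caffe

img = caffe.io.load_image('example.png')  # hypothetical path; returns floats in [0, 1]
print(img.min(), img.max())               # roughly 0.0 ... 1.0

wrong = img / 255.0                       # the accidental second division
print(wrong.max())                        # at most ~0.0039, nearly no dynamic range left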


You cannot say what a "satisfying loss" is; no such thing exists. You have to test the model and analyze the problem from the perspective of your data; the numbers by themselves don't mean much. In particular, the training loss means almost nothing: you can usually train down to zero training error (even though that is not recommended). – lejlot
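As a concrete sketch of what "test it" means here: run the trained net on held-out data and compare predictions with labels. The deploy prototxt, snapshot name, and test file below are assumptions for illustration, not files from the question.

import caffe
import h5py
import numpy as np

net = caffe.Net('deploy.prototxt', 'snapshot_iter_10000.caffemodel', caffe.TEST)

with h5py.File('test.h5', 'r') as f:
    X, y = f['data'][:], f['label'][:].flatten()

preds = []
for x in X:
    net.blobs['data'].data[0] = x                 # assumes batch size 1 in the deploy net
    out = net.forward()
    preds.append(float(out['fc2'].flatten()[0]))  # 'fc2' is the network's output blob

print('held-out MSE:', np.mean((np.array(preds) - y) ** 2))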