What is the best technique for comparing images for similarity?


I have an image, master.png, and more than 10,000 other images (slave_1.png, slave_2.png, ...). They all have:

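The same dimensions (e.g. 100x50 pixels)
The same format (PNG)
The same image background

98% of the slaves are identical to the master, but 2% of the slaves have slightly different content: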

New colors appear
New, small shapes appear in the middle of the image

I need to spot those different slaves. I'm using Ruby, but I have no problem using a different technology.

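I tried reading both images with File.binread and comparing them with ==. It worked for 80% of the slaves. For the other slaves it reported changes even though the images were visually identical, so it doesn't work.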
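One plausible explanation, which the question does not confirm, is that two PNG files can hold the same pixels yet differ byte for byte, for example when they were written with different compression settings or metadata. The sketch below illustrates only the compression part, using Python's standard `zlib` module on made-up pixel bytes (PNG stores its image data as a zlib stream):

```python
import zlib

# Hypothetical raw pixel data: a 100x50 image, 3 bytes (RGB) per pixel.
raw_pixels = bytes([200, 30, 30]) * (100 * 50)

# Compress the same pixels with two different effort levels.
fast = zlib.compress(raw_pixels, 1)
small = zlib.compress(raw_pixels, 9)

# A byte-for-byte comparison of the compressed streams...
bytes_equal = fast == small

# ...versus a comparison of what they decode to.
pixels_equal = zlib.decompress(fast) == zlib.decompress(small)

print(bytes_equal, pixels_equal)
```

If this is what happened, any reliable check has to compare decoded pixels (or something derived from them) rather than file contents.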

The alternatives are:

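Count the number of colors present in each slave and compare it with the master. This would work 100% of the time, but I don't know how to do it in Ruby in a "light" way.
Use an image processing library such as RMagick or ruby-vips8 to compare by histogram. This should also work, but I need to use as little CPU and memory as possible.
Write a C++/Go/Crystal program that reads pixel by pixel and returns the number of colors. I think we could get good performance this way, but it is certainly the hard way.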
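The first alternative, counting colors, can be sketched in a few lines once the pixels are decoded. This is a minimal illustration in Python rather than Ruby; it assumes the PNG has already been decoded into a sequence of RGB tuples by some image library (that step is not shown), and the function names are hypothetical:

```python
def count_colors(pixels):
    """Return the number of distinct colors in an iterable of RGB tuples."""
    return len(set(pixels))


def color_count_differs(master_pixels, slave_pixels):
    """True when the slave has a different number of colors than the master."""
    return count_colors(slave_pixels) != count_colors(master_pixels)


# Made-up 100x50 images: white background with a few black pixels.
master = [(255, 255, 255)] * 4990 + [(0, 0, 0)] * 10
same_slave = list(master)
# One pixel replaced by a color the master does not contain.
odd_slave = [(255, 0, 0)] + master[1:]

print(color_count_differs(master, same_slave))
print(color_count_differs(master, odd_slave))
```

Note that a color count only detects new colors. A new shape drawn in a color the master already uses would leave the count unchanged.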

Any insight? Suggestions?

Source

2016-04-14 fschuindt

Take a look at [this question](http://stackoverflow.com/questions/4196453/simple-and-fast-method-to-compare-images-for-similarity). Many options have already been discussed there. – Uzbekjon

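Another note about comparing with `File.binread`: since you are only comparing file contents, and resources and performance matter, it would be better to simply do this in bash. Have a look at `diff`, `cmp` or `md5`. – Uzbekjon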
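The same byte-level check can be used as a cheap first pass that sets aside exact copies, so that only the remaining files need a pixel-level comparison. A minimal sketch in Python, with hypothetical function names, using SHA-256 from the standard library where the comment mentions `md5` (any digest would do):

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_by_bytes(master_path, slave_paths):
    """Return (identical, remaining): byte-identical copies and everything else."""
    master = file_digest(master_path)
    identical, remaining = [], []
    for path in slave_paths:
        if file_digest(path) == master:
            identical.append(path)
        else:
            remaining.append(path)
    return identical, remaining
```

Files in `remaining` are not necessarily different images; they only failed the byte comparison and still need to be checked some other way.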

If you need a classifier, this might be a job for [TensorFlow](https://www.tensorflow.org). – tadman

With ruby-vips, you can do something like this:

require 'vips' 

# find normalised histogram of reference image 
ref = VIPS::Image.new ARGV[0], :sequential => true 
ref_hist = ref.hist.histnorm 

# trigger a GC every few loops to keep memuse down 
loop = 0 

ARGV[1..-1].each do |filename| 
    # find sample hist 
    sample = VIPS::Image.new filename, :sequential => true 
    sample_hist = sample.hist.histnorm 

    # calculate sum of squares of differences, if it's over a threshold, print 
    # the filename 
    diff_hist = ref_hist.subtract(sample_hist).pow(2) 
    diff = diff_hist.avg * diff_hist.x_size * diff_hist.y_size 

    if diff > 100
        puts "#{filename}, #{diff}"
    end

    loop += 1
    if loop % 100 == 0
        GC.start
    end
end

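The occasional GC.start is necessary to make Ruby free things and stop memory from filling up. Even though it only runs once every 100 images, it sadly still spends a lot of time garbage collecting.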

$ vips crop ~/pics/k2.jpg ref.png 0 0 100 50 
$ for i in {1..10000}; do cp ref.png $i.png; done 
$ time ../similarity.rb ref.png *.png 
real 2m44.294s 
user 7m30.696s 
sys 0m20.780s 
peak mem 270mb

If you are willing to consider Python, it is faster, because it is reference counted and does not need to scan all the time.

import sys 
from gi.repository import Vips 

# find normalised histogram of reference image 
ref = Vips.Image.new_from_file(sys.argv[1], access = Vips.Access.SEQUENTIAL) 
ref_hist = ref.hist_find().hist_norm() 

for filename in sys.argv[2:]: 
    # find sample hist 
    sample = Vips.Image.new_from_file(filename, access = Vips.Access.SEQUENTIAL) 
    sample_hist = sample.hist_find().hist_norm() 

    # calculate sum of squares of difference, if it's over a threshold, print 
    # the filename 
    diff_hist = (ref_hist - sample_hist) ** 2 
    diff = diff_hist.avg() * diff_hist.width * diff_hist.height 

    if diff > 100:
        print("%s, %s" % (filename, diff))

I see:

$ time ../similarity.py ref.png *.png 
real 1m4.001s 
user 1m3.508s 
sys 0m10.060s 
peak mem 58mb
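Both scripts use the same measure: the sum of squared differences between two normalised histograms. For readers without libvips, here is a rough pure-Python sketch of that measure for a single 8-bit channel. The normalisation used here (scaling the largest bin to the maximum index) is an assumption about what the library's histogram normalisation does, and the function names are hypothetical:

```python
def histogram(values, bins=256):
    """Count how many times each value in 0..bins-1 occurs."""
    counts = [0] * bins
    for value in values:
        counts[value] += 1
    return counts


def normalise(hist):
    """Scale the histogram so its largest bin equals the maximum index (assumed)."""
    peak = max(hist)
    if peak == 0:
        return [0.0] * len(hist)
    scale = (len(hist) - 1) / peak
    return [count * scale for count in hist]


def hist_distance(ref_values, sample_values, bins=256):
    """Sum of squared differences between the two normalised histograms."""
    ref = normalise(histogram(ref_values, bins))
    sample = normalise(histogram(sample_values, bins))
    return sum((r - s) ** 2 for r, s in zip(ref, sample))


# Made-up single-channel 100x50 images.
ref = [10] * 4900 + [200] * 100
same = list(ref)
# 100 background pixels replaced by a value not present in the reference.
changed = [10] * 4800 + [200] * 100 + [77] * 100

print(hist_distance(ref, same))
print(hist_distance(ref, changed))
```

An identical image gives a distance of zero, and the distance grows as pixels move to new values. A suitable threshold depends on the images and would need tuning on real data.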

Source

2016-04-17 20:35:28 user894763
