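I know this is a very late reply, but I think future readers may find it helpful.
Below is the answer as I understand it from the passage above (all code is in OpenCV-Python v2.4-beta):
I took this as the input image. It is a simple image, chosen to make things easy to follow.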
First we generate the binary image of the given image by thresholding it at 80% of its intensity and inverting the resulting image.
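import cv2
import numpy as np

# Load the input image and convert it to grayscale.
img = cv2.imread('doc4.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Threshold at 80% of the maximum intensity and invert (type 1 is THRESH_BINARY_INV).
ret, thresh = cv2.threshold(gray, 0.8 * gray.max(), 255, cv2.THRESH_BINARY_INV)
# Assumption: thresh2, used for drawing below, is a BGR copy of thresh (the original never defines it).
thresh2 = cv2.cvtColor(thresh, cv2.COLOR_GRAY2BGR)
contours, hier = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)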
Thresholded image:
We considered simple 8-neighborhood connectivity and performed connected component (contour) analysis of the binary image leading to the segmentation of the textual components.
This is just contour finding in OpenCV, also known as connected-component labelling. It selects all the white blobs (components) in the image.
contours, hier = cv2.findContours(thresh,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
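To make the 8-neighbourhood idea concrete, here is a minimal pure-Python sketch of connected-component labelling on a tiny 0/1 grid. The function name label_components and the sample grid are my own illustrations. cv2.findContours traces region borders rather than labelling pixels like this, so treat it only as a picture of what counts as one component, not as what OpenCV does internally.

```python
from collections import deque

def label_components(binary):
    """Label 8-connected regions of non-zero cells in a 2D list."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            # Skip background cells and cells that already have a label.
            if binary[r][c] == 0 or labels[r][c] != 0:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # Look at all 8 neighbours, diagonals included.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return count, labels

grid = [
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
count, labels = label_components(grid)
```

The two pixels at the top left touch only diagonally, yet they end up in the same component. That is the difference between 8-connectivity and 4-connectivity.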
Contours:
For next part of algorithm we use the minimum bounding rectangle of contours.
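Now we find the bounding rectangle around each detected contour. Then we remove the contours with small areas to get rid of commas etc. See the statement: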
Smaller connected patterns were discarded based on the assumption that they may have originated due to noise dependent on image acquisition system and does not in any way contribute to the final layout. Also punctuation marks were neglected using smaller size criterion e.g. comma, full-stop etc.
We also find the average height, avgh.
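# thresh2 is assumed to be a 3-channel (BGR) copy of thresh; the original never defines it.
height = 0
num = 0
letters = []
ht = []
for (i, cnt) in enumerate(contours):
    (x, y, w, h) = cv2.boundingRect(cnt)
    if w * h < 200:
        # Small blob (noise, comma, full stop): paint it black.
        cv2.drawContours(thresh2, [cnt], 0, (0, 0, 0), -1)
    else:
        # Keep the letter: draw its bounding box in green and record its height.
        cv2.rectangle(thresh2, (x, y), (x + w, y + h), (0, 255, 0), 1)
        height = height + h
        num = num + 1
        letters.append(cnt)
        ht.append(h)
avgh = height / num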
So after this, all the commas etc. are removed, and green rectangles are drawn around the selected letters:
At this level we also segregate the fonts based on the height of the bounding rect using avgh (average height) as threshold. Two thresholds are used to classify fonts into three categories - small, normal and large
(the formula is given in the passage).
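The average height, avgh, obtained here is 40. So a letter is small if its height is less than 26.66 (i.e. 40 x 2/3), normal if 26.66 < height <= 60, and large if height > 60. But in the given image all the heights fall in the range (28, 58), so everything is normal and you can't see any difference.
So I just made a small modification to make it easy to see: small if height <= 30, normal if 30 < height <= 50, and large if height > 50.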
for (cnt, h) in zip(letters, ht):
    print(h)
    if h <= 30:
        # small: filled in blue (colours are BGR)
        cv2.drawContours(thresh2, [cnt], 0, (255, 0, 0), -1)
    elif 30 < h <= 50:
        # normal: filled in green
        cv2.drawContours(thresh2, [cnt], 0, (0, 255, 0), -1)
    else:
        # large: filled in red
        cv2.drawContours(thresh2, [cnt], 0, (0, 0, 255), -1)
cv2.imshow('img', thresh2)
cv2.waitKey(0)
cv2.destroyAllWindows()
Now you get the letters classified as small, normal and large:
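The code above uses hard-coded cut-offs of 30 and 50 so that the colours are visible on this image. A rough sketch of the avgh-relative version might look like the following. The helper name classify_height is mine, and the upper factor of 3/2 is inferred from the numbers above (60 for avgh = 40), not taken from the paper's formula.

```python
def classify_height(h, avgh):
    """Return 'small', 'normal' or 'large' for a bounding-box height h."""
    low = avgh * 2.0 / 3.0   # 26.66 when avgh is 40
    high = avgh * 3.0 / 2.0  # 60 when avgh is 40 (factor inferred, see above)
    if h < low:
        return 'small'
    if h > high:
        return 'large'
    return 'normal'

# Sample heights chosen for illustration, with avgh = 40 as in this image.
heights = [20, 28, 40, 58, 61]
classes = [classify_height(h, 40) for h in heights]
```

With avgh = 40, the heights 28 and 58 at the edges of this image's range both come out as normal, which matches what is described above.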
These rectangles were then sorted top-to-bottom and left-to-right order, using 2D point information of leftmost-topmost corner.
I have omitted this part. It just sorts all the bounding rectangles by their top-left corner.
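For anyone who wants a starting point, here is one possible sketch of that sort, working on (x, y, w, h) tuples like those returned by cv2.boundingRect. The function name sort_reading_order and the row_tol parameter are my own assumptions: letters on the same text line rarely share exactly the same top y, so I group rectangles into rows with a tolerance before sorting each row by x. The paper may well do it differently.

```python
def sort_reading_order(rects, row_tol):
    """Sort (x, y, w, h) rectangles top-to-bottom, then left-to-right.

    A rectangle joins the current row when its top edge is within
    row_tol pixels of the top edge of the first rectangle in that row.
    """
    rows = []
    for rect in sorted(rects, key=lambda r: r[1]):  # order by top edge y
        if rows and abs(rect[1] - rows[-1][0][1]) <= row_tol:
            rows[-1].append(rect)
        else:
            rows.append([rect])
    ordered = []
    for row in rows:
        ordered.extend(sorted(row, key=lambda r: r[0]))  # order by left edge x
    return ordered

# Made-up rectangles: two rows of two letters each, given out of order.
rects = [(50, 12, 10, 30), (10, 10, 10, 32), (30, 60, 10, 30), (5, 62, 10, 28)]
ordered = sort_reading_order(rects, row_tol=10)
```

A tolerance of about half the average letter height could be a reasonable first guess for row_tol, but that is something to tune on your own images.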