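Improving the readability of TessBaseAPI.getUTF8Text()
I am trying to recognize text in an image with Tesseract OCR on Android, using Tess-Two (developed in Android Studio).
In the Gradle file, I added the following line to the dependencies section: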
compile 'com.rmtheis:tess-two:5.4.1'
Then, in the main activity's onCreate(), I have the following code to initialize the library and load the image:
final String lang = "eng";
TessBaseAPI baseAPI = new TessBaseAPI();
// The data path must contain a "tessdata" folder holding eng.traineddata.
boolean initResult = baseAPI.init(Environment.getExternalStorageDirectory().getPath(), lang);
if (initResult) {
    InputStream is = null;
    try {
        // Load the test image from app/src/main/assets/.
        is = getAssets().open("test2.jpg");
        final Drawable drw = Drawable.createFromStream(is, null);
        Bitmap bmp = ((BitmapDrawable) drw).getBitmap();
        baseAPI.setDebug(true);
        baseAPI.setImage(bmp);
        ImageView imageView = (ImageView) findViewById(R.id.imageView);
        imageView.setImageBitmap(bmp);
        // Run recognition and show the result.
        String recognizedText = baseAPI.getUTF8Text().trim();
        Log.d(TAG, recognizedText);
        TextView textView = (TextView) findViewById(R.id.txt_debug);
        textView.setText(recognizedText);
    } catch (FileNotFoundException nfe) {
        Log.d(TAG, "File Not Found");
        nfe.printStackTrace();
    } catch (IOException ioe) {
        Log.d(TAG, "Unable to open the file");
        ioe.printStackTrace();
    } finally {
        // Release native resources and the stream even if recognition fails.
        baseAPI.end();
        if (is != null) {
            try { is.close(); } catch (IOException ignored) { }
        }
    }
} else {
    Log.d("OCR", "Unable to init Base API");
}
Finally, I put the JPEG in the assets folder (app/src/main/assets/). The JPEG is basically a paragraph of text.
However, the OCR result is mostly garbage:
OWW WW ON
R W WWW WK
KW MK
214
3 W5 HE WM
M WW WWW
LFNWW VW QTY
VM ACNL 19 WE NH
5 332152391
HQ W M W
How can I improve the readability of the scan?
I tried the following page segmentation modes, but the result was empty:
// Automatic page segmentation with orientation and script detection
baseAPI.setPageSegMode(TessBaseAPI.PageSegMode.PSM_AUTO_OSD);
// Treat the image as a single text line
baseAPI.setPageSegMode(TessBaseAPI.PageSegMode.PSM_SINGLE_LINE);
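Thanks for your comments. I found that tools such as Scantailor or the ImageMagick textcleaner script can improve the readability of the OCR output. The key is to remove noise and raise the resolution to at least 300 dpi. – Raptor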
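
For anyone who wants to try the noise-removal idea on the device itself instead of in an external tool, here is a minimal sketch of one common cleanup step: converting to grayscale and binarizing with Otsu's global threshold. This is not what Scantailor or textcleaner do internally; it is a simpler stand-in, and it may not be enough for photos with uneven lighting. The class name OcrPreprocess and its methods are hypothetical, and the code assumes the image is available as an array of ARGB ints.

```java
/** Plain-Java preprocessing sketch; OcrPreprocess and its methods are hypothetical names. */
final class OcrPreprocess {

    /** Converts one ARGB pixel to an 8-bit luminance value (BT.601 weights). */
    static int luminance(int argb) {
        int r = (argb >> 16) & 0xFF;
        int g = (argb >> 8) & 0xFF;
        int b = argb & 0xFF;
        return (299 * r + 587 * g + 114 * b) / 1000;
    }

    /** Picks a global threshold with Otsu's method from a 256-bin histogram. */
    static int otsuThreshold(int[] argbPixels) {
        int[] hist = new int[256];
        for (int p : argbPixels) {
            hist[luminance(p)]++;
        }

        long total = argbPixels.length;
        long sumAll = 0;
        for (int i = 0; i < 256; i++) {
            sumAll += (long) i * hist[i];
        }

        long weightDark = 0;   // number of pixels with luminance <= t
        long sumDark = 0;      // sum of their luminance values
        double bestVariance = -1;
        int bestThreshold = 0;
        for (int t = 0; t < 256; t++) {
            weightDark += hist[t];
            sumDark += (long) t * hist[t];
            if (weightDark == 0) continue;      // no dark pixels yet
            long weightLight = total - weightDark;
            if (weightLight == 0) break;        // every pixel is already in the dark class
            double meanDark = (double) sumDark / weightDark;
            double meanLight = (double) (sumAll - sumDark) / weightLight;
            double diff = meanDark - meanLight;
            // Between-class variance (up to a constant factor).
            double variance = (double) weightDark * weightLight * diff * diff;
            if (variance > bestVariance) {
                bestVariance = variance;
                bestThreshold = t;
            }
        }
        return bestThreshold;
    }

    /** Returns a new array where each pixel is opaque black or opaque white. */
    static int[] binarize(int[] argbPixels) {
        int threshold = otsuThreshold(argbPixels);
        int[] out = new int[argbPixels.length];
        for (int i = 0; i < argbPixels.length; i++) {
            out[i] = luminance(argbPixels[i]) <= threshold ? 0xFF000000 : 0xFFFFFFFF;
        }
        return out;
    }
}
```

On Android, the pixel array could be read with Bitmap.getPixels() and the result turned back into a bitmap with Bitmap.createBitmap(pixels, width, height, Bitmap.Config.ARGB_8888) before calling baseAPI.setImage(). To approximate the 300 dpi advice, the bitmap could first be enlarged with Bitmap.createScaledBitmap(); the right scale factor depends on the source image, so treat that as something to experiment with.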