2011-03-21 52 views

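Processing images with Leptonica in an Xcode project

In Xcode, I am trying to preprocess an image before sending it to OCR. The OCR engine, Tesseract, handles images based on the Leptonica library.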

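As an example, take the Leptonica function pixConvertTo8("image.tif"). Is there a way to convert the raw image data from UIImage -> PIX (see pix.h in the Leptonica library), run pixConvertTo8() on it, and convert the result back from PIX -> UIImage? Preferably all in memory, without saving to a file for the conversion.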

- (void) processImage:(UIImage *) uiImage 
{ 
    NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init]; 

    // preprocess UIImage here with fx: pixConvertTo8(); 

    CGSize imageSize = [uiImage size]; 
    int bytes_per_line = (int)CGImageGetBytesPerRow([uiImage CGImage]); 
    // integer division: bits per pixel -> whole bytes per pixel 
    int bytes_per_pixel = (int)(CGImageGetBitsPerPixel([uiImage CGImage]) / 8); 

    CFDataRef data = CGDataProviderCopyData(CGImageGetDataProvider([uiImage CGImage])); 
    const UInt8 *imageData = CFDataGetBytePtr(data); 

    // this could take a while. 
    char* text = tess->TesseractRect(imageData, 
                                     bytes_per_pixel, 
                                     bytes_per_line, 
                                     0, 0, 
                                     (int)imageSize.width, (int)imageSize.height); 

    // ... use text here ... 

    delete [] text;   // TesseractRect returns a new[]-allocated string 
    CFRelease(data);  // balance CGDataProviderCopyData 
    [pool release]; 
} 

Answers

2

These two functions will do the job:

- (void) startTesseract 
{ 
    //code from http://robertcarlsen.net/2009/12/06/ocr-on-iphone-demo-1043 

    NSString *dataPath = 
        [[self applicationDocumentsDirectory] stringByAppendingPathComponent:@"tessdata"]; 
    /* 
    Set up the data in the docs dir 
    want to copy the data to the documents folder if it doesn't already exist 
    */ 
    NSFileManager *fileManager = [NSFileManager defaultManager]; 
    // If the expected store doesn't exist, copy the default store. 
    if (![fileManager fileExistsAtPath:dataPath]) { 
        // get the path to the app bundle (with the tessdata dir) 
        NSString *bundlePath = [[NSBundle mainBundle] bundlePath]; 
        NSString *tessdataPath = [bundlePath stringByAppendingPathComponent:@"tessdata"]; 
        if (tessdataPath) { 
            [fileManager copyItemAtPath:tessdataPath toPath:dataPath error:NULL]; 
        } 
    } 

    NSString *dataPathWithSlash = [[self applicationDocumentsDirectory] stringByAppendingString:@"/"]; 
    setenv("TESSDATA_PREFIX", [dataPathWithSlash UTF8String], 1); 

    // init the tesseract engine. 
    tess = new tesseract::TessBaseAPI(); 
    tess->Init([dataPath cStringUsingEncoding:NSUTF8StringEncoding], "eng"); 
} 

- (NSString *) ocrImage: (UIImage *) uiImage 
{ 
    //code from http://robertcarlsen.net/2009/12/06/ocr-on-iphone-demo-1043 
    CGImageRef cgImage = [uiImage CGImage]; 
    // Use the CGImage's pixel dimensions; [uiImage size] is measured in points. 
    int width = (int)CGImageGetWidth(cgImage); 
    int height = (int)CGImageGetHeight(cgImage); 
    int bytes_per_line = (int)CGImageGetBytesPerRow(cgImage); 
    int bytes_per_pixel = (int)(CGImageGetBitsPerPixel(cgImage) / 8); 

    CFDataRef data = CGDataProviderCopyData(CGImageGetDataProvider(cgImage)); 
    const UInt8 *imageData = CFDataGetBytePtr(data); 

    delete imageThresholder; // drop the thresholder from a previous call, if any 
    imageThresholder = new tesseract::ImageThresholder(); 
    imageThresholder->SetImage(imageData, width, height, bytes_per_pixel, bytes_per_line); 

    // this could take a while. maybe needs to happen asynchronously. 
    tess->SetImage(imageThresholder->GetPixRect()); 

    char* text = tess->GetUTF8Text(); 
    // Do something useful with the text! 
    NSString *result = [NSString stringWithCString:text encoding:NSUTF8StringEncoding]; 
    NSLog(@"Converted text: %@", result); 

    delete [] text;   // GetUTF8Text returns a new[]-allocated string 
    CFRelease(data);  // balance CGDataProviderCopyData 
    return result; 
} 

You will need to declare both tess and imageThresholder in the .h file:

tesseract::TessBaseAPI *tess; 
tesseract::ImageThresholder *imageThresholder; 
0

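I found some excellent code snippets in the Tesseract OCR engine that show how to do this, notably in the ImageThresholder class in thresholder.cpp (see the link below). I have not tested it yet, but here are some brief notes:

The most interesting part for me is the else block, where the depth is 32. There, pixCreate(), pixGetData() and pixGetWpl() do the actual work.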

The thresholder.cpp file in the Tesseract engine uses the method mentioned above.
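
To make the depth-32 case more concrete, here is a minimal, self-contained sketch of that kind of copy, written without Leptonica so it can be compiled on its own. PackedImage, packRows and unpackRows are hypothetical names made up for this illustration. With the real library, the word buffer would come from pixCreate() and pixGetData(), and the row stride from pixGetWpl(). The byte order used here (first byte of each pixel in the most significant 8 bits) is an assumption, so check it against your Tesseract and Leptonica versions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a 32 bpp PIX: rows of 32-bit words with
// `wpl` words per line (at 32 bpp that is one word per pixel).
struct PackedImage {
    int width;
    int height;
    int wpl;
    std::vector<uint32_t> words;
};

// Pack a row-strided, 4-bytes-per-pixel buffer (for example the bytes returned
// by CGDataProviderCopyData) into 32-bit words. bytesPerLine may be larger
// than width * 4 when rows are padded.
PackedImage packRows(const uint8_t *bytes, int width, int height, int bytesPerLine) {
    assert(bytesPerLine >= width * 4);
    PackedImage img{width, height, width,
                    std::vector<uint32_t>(static_cast<size_t>(width) * height)};
    for (int y = 0; y < height; ++y) {
        const uint8_t *src = bytes + static_cast<size_t>(y) * bytesPerLine;
        uint32_t *dst = img.words.data() + static_cast<size_t>(y) * img.wpl;
        for (int x = 0; x < width; ++x) {
            // Assumed order: first byte ends up in the top 8 bits of the word.
            dst[x] = (static_cast<uint32_t>(src[4 * x]) << 24) |
                     (static_cast<uint32_t>(src[4 * x + 1]) << 16) |
                     (static_cast<uint32_t>(src[4 * x + 2]) << 8) |
                     static_cast<uint32_t>(src[4 * x + 3]);
        }
    }
    return img;
}

// Reverse direction: unpack the words into a tightly packed byte buffer
// (width * 4 bytes per row), which is the layout needed to rebuild a bitmap.
std::vector<uint8_t> unpackRows(const PackedImage &img) {
    std::vector<uint8_t> out(static_cast<size_t>(img.width) * img.height * 4);
    for (int y = 0; y < img.height; ++y) {
        const uint32_t *src = img.words.data() + static_cast<size_t>(y) * img.wpl;
        uint8_t *dst = out.data() + static_cast<size_t>(y) * img.width * 4;
        for (int x = 0; x < img.width; ++x) {
            dst[4 * x]     = static_cast<uint8_t>(src[x] >> 24);
            dst[4 * x + 1] = static_cast<uint8_t>(src[x] >> 16);
            dst[4 * x + 2] = static_cast<uint8_t>(src[x] >> 8);
            dst[4 * x + 3] = static_cast<uint8_t>(src[x]);
        }
    }
    return out;
}
```

In an Objective-C++ file, packRows could be fed the imageData pointer together with the CGImage's pixel width, pixel height and bytes per row. The output of unpackRows could then be handed to a CGBitmapContext to build a new UIImage, so the whole round trip stays in memory.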