Improving Tesseract OCR results on the iOS platform

June 5, 2012 1 minute read

If you’ve found yourself using Tesseract on iOS and you’re scratching your head as to why the OCR results are so terribly incorrect, you might be interested in the following. Most of the Tesseract iOS tutorials talk about compiling the libraries, but don’t really cover how to use them.

There’s always an app for that, but how do you understand how it works?

Are you using something like this to interface with the Tesseract API?

char* text = tess->TesseractRect(imageData,(int)bytes_per_pixel,(int)bytes_per_line, 0, 0,(int) imageSize.height,(int) imageSize.width);

NSLog(@"Converted text: %@", [NSString stringWithCString:text encoding:NSUTF8StringEncoding]);

I was using this to start with, and the results were terrible. When it was able to read anything at all, it mostly returned special characters or just utter nonsense.

Looking closer at the API documentation, you can see this:

/**
   * Recognize a rectangle from an image and return the result as a string.
   * May be called many times for a single Init.
   * Currently has no error checking.
   * Greyscale of 8 and color of 24 or 32 bits per pixel may be given.
   * Palette color images will not work properly and must be converted to
   * 24 bit.
   * Binary images of 1 bit per pixel may also be given but they must be
   * byte packed with the MSB of the first byte being the first pixel, and a
   * 1 represents WHITE. For binary images set bytes_per_pixel=0.
   * The recognized text is returned as a char* which is coded
   * as UTF8 and must be freed with the delete [] operator.
   *
   * Note that TesseractRect is the simplified convenience interface.
   * For advanced uses, use SetImage, (optionally) SetRectangle, Recognize,
   * and one or more of the Get*Text functions below.
   */
  char* TesseractRect(const unsigned char* imagedata,
                      int bytes_per_pixel, int bytes_per_line,
                      int left, int top, int width, int height);
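Since the comment above says the call "currently has no error checking", it can be worth validating the buffer layout yourself before handing it over. The following is a minimal sketch in plain C++, not part of Tesseract: the `PixelBuffer` struct and `checkLayout` function are hypothetical names, and the rule that a binary row needs at least `ceil(width / 8)` bytes is an assumption drawn from the "byte packed" wording.

```cpp
#include <cstddef>
#include <string>

// Hypothetical description of a raw pixel buffer; not a Tesseract type.
struct PixelBuffer {
    std::size_t size_in_bytes;  // total length of the buffer
    int width;                  // pixels per row
    int height;                 // number of rows
    int bytes_per_pixel;        // 0 = 1-bit binary, 1 = 8-bit grey, 3 or 4 = colour
    int bytes_per_line;         // row stride, including any padding
};

// Returns an empty string when the layout looks usable, otherwise a reason.
std::string checkLayout(const PixelBuffer& b) {
    if (b.width <= 0 || b.height <= 0) {
        return "width and height must be positive";
    }
    const int bpp = b.bytes_per_pixel;
    if (bpp != 0 && bpp != 1 && bpp != 3 && bpp != 4) {
        return "unsupported bytes_per_pixel";
    }
    // Assumption: binary rows are byte packed, so they need ceil(width / 8) bytes.
    const long min_line = (bpp == 0) ? (b.width + 7L) / 8
                                     : static_cast<long>(b.width) * bpp;
    if (b.bytes_per_line < min_line) {
        return "bytes_per_line is too small for the width";
    }
    const unsigned long long needed =
        static_cast<unsigned long long>(b.bytes_per_line) * b.height;
    if (b.size_in_bytes < needed) {
        return "buffer is shorter than bytes_per_line * height";
    }
    return "";
}
```

A check like this will not catch every mistake, but it does flag a non-square image whose width and height have been passed the wrong way round, because the implied buffer size no longer matches.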

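The note at the end is the important part: TesseractRect is only the simplified convenience interface, and the documentation points to SetImage and the Get*Text functions for anything more. It is also worth noticing that the signature takes width before height, whereas the call above passes height first, which may not have helped. Therefore, swap over and use this implementation: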

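    // Hand the raw pixel buffer to Tesseract, then ask for the recognised text.
    tess->SetImage(imageData, (int)imageSize.width, (int)imageSize.height, (int)bytes_per_pixel, (int)bytes_per_line);
    char* someChars = tess->GetUTF8Text();
    NSString *someString = [NSString stringWithCString:someChars encoding:NSUTF8StringEncoding];
    NSLog(@"Better results this way %@", someString);
    delete [] someChars; // the returned buffer must be freed with delete []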
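The documentation quoted earlier says the returned `char*` must be freed with `delete []`, which is easy to forget. One way to make that automatic, sketched here in plain C++, is to take ownership of the buffer straight away and copy it into a `std::string`. The helper name `takeText` is hypothetical, and the null check is a defensive assumption rather than something the quoted documentation promises.

```cpp
#include <memory>
#include <string>

// Takes ownership of a buffer allocated with new[] and copies it into a
// std::string. The buffer is released with delete[] when `owner` goes out
// of scope, even if constructing the string throws.
std::string takeText(char* raw) {
    std::unique_ptr<char[]> owner(raw);
    // Assumption: treat a null pointer as "no text" instead of crashing.
    return raw ? std::string(raw) : std::string();
}
```

With the API used in this post, that would look something like `std::string text = takeText(tess->GetUTF8Text());`, after which there is no raw pointer left to clean up.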

Nothing groundbreaking here, just pointing it out!

Don’t forget to use blacklisting and whitelisting for character sets. Restricting what Tesseract is allowed to return helps improve results tremendously.
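As a rough sketch of how the character lists can be set: Tesseract exposes them as the variables `tessedit_char_whitelist` and `tessedit_char_blacklist`, set through `SetVariable(name, value)` on the API object. The helper below is hypothetical, as is the `RecordingApi` stand-in, which exists only so the example compiles without the Tesseract headers. With the real library you would pass your `tesseract::TessBaseAPI` instead, typically after `Init` and before recognition.

```cpp
#include <map>
#include <string>

// Works with any object exposing SetVariable(const char*, const char*) -> bool,
// which is the shape of tesseract::TessBaseAPI::SetVariable.
// Empty strings mean "leave that list alone".
template <typename Api>
bool restrictCharacters(Api& api, const std::string& allowed,
                        const std::string& banned) {
    bool ok = true;
    if (!allowed.empty()) {
        ok = api.SetVariable("tessedit_char_whitelist", allowed.c_str()) && ok;
    }
    if (!banned.empty()) {
        ok = api.SetVariable("tessedit_char_blacklist", banned.c_str()) && ok;
    }
    return ok;
}

// Hypothetical stand-in that just records what was set; not a Tesseract type.
struct RecordingApi {
    std::map<std::string, std::string> vars;
    bool SetVariable(const char* name, const char* value) {
        vars[name] = value;
        return true;
    }
};
```

For example, if you only expect to read numbers, calling `restrictCharacters(*tess, "0123456789", "")` would stop letters and punctuation from appearing in the output.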

Thanks
