Text recognition from image in Swift

2 min read · Jun 25, 2024

OCR (Optical Character Recognition) and image recognition are two related but distinct technologies.

OCR converts images of printed or handwritten text into machine-readable text. It works by analyzing the image and identifying the individual characters in it. The process involves several steps:

Preprocessing: The image is preprocessed to remove noise, enhance contrast, and improve the quality of the image.
Text detection: The software identifies areas of the image that contain text.
Character segmentation: The text is segmented into individual characters.
Character recognition: Each character is analyzed and identified using machine learning algorithms.
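
To make the preprocessing step a bit more concrete, here is a minimal sketch in plain Swift. It assumes a grayscale image stored as a flat array of 0...255 intensities, and the helper names stretchContrast and binarize are made up for illustration. Real OCR engines do far more than this (and Vision handles preprocessing internally), so treat it as a toy model of "enhance contrast, then clean up".

```swift
// Toy illustration of OCR preprocessing on a grayscale image,
// represented as a flat array of 0...255 intensities.
// `stretchContrast` and `binarize` are hypothetical helper names.

/// Linearly rescales intensities so the darkest pixel becomes 0
/// and the brightest becomes 255.
func stretchContrast(_ pixels: [UInt8]) -> [UInt8] {
    guard let lo = pixels.min(), let hi = pixels.max(), hi > lo else { return pixels }
    let range = Int(hi) - Int(lo)
    return pixels.map { UInt8((Int($0) - Int(lo)) * 255 / range) }
}

/// Maps each pixel to black (0) or white (255) around a fixed threshold.
func binarize(_ pixels: [UInt8], threshold: UInt8 = 128) -> [UInt8] {
    pixels.map { $0 >= threshold ? 255 : 0 }
}

// A "faded" scan where ink and paper are close in brightness.
let faded: [UInt8] = [100, 110, 150, 160]
let cleaned = binarize(stretchContrast(faded))
print(cleaned) // prints [0, 0, 255, 255]
```

After stretching, the dark "ink" pixels and the light "paper" pixels fall clearly on either side of the threshold, which is what makes the later detection and segmentation steps easier.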

In Swift, OCR can be done with libraries such as Tesseract or with Apple's Vision framework. Here’s an example that uses Vision:

import UIKit
import Vision

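func recognizeText(image: UIImage) {
    // Vision works on a CGImage, so bail out if the UIImage has none.
    guard let cgImage = image.cgImage else { return }
    
    let request = VNRecognizeTextRequest { (request, error) in
        // Report recognition errors instead of silently returning.
        if let error = error {
            print(error)
            return
        }
        guard let observations = request.results as? [VNRecognizedTextObservation]
        else { return }
        
        // Keep the best candidate string for each detected piece of text.
        let recognizedStrings = observations.compactMap { observation in
            observation.topCandidates(1).first?.string
        }
        
        print(recognizedStrings)
    }
    
    // perform(_:) runs synchronously, so call this off the main thread for large images.
    let requestHandler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    do {
        try requestHandler.perform([request])
    } catch {
        print(error)
    }
}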
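
The Vision request can also be tuned, and each candidate comes with a confidence score. Here is a rough sketch of how that might look. The RecognizedLine type, the joinConfidentLines helper, and the 0.5 cutoff are my own assumptions for illustration, and "en-US" assumes English input. The Vision part is wrapped in canImport so the plain Swift helper can be tried on its own.

```swift
#if canImport(Vision)
import Vision
#endif

/// One recognized line with its confidence score (0...1).
/// Hypothetical type, not part of Vision.
struct RecognizedLine {
    let text: String
    let confidence: Float
}

/// Keeps lines at or above `minimumConfidence` and joins them with newlines.
/// Hypothetical helper; the 0.5 default is an arbitrary starting point.
func joinConfidentLines(_ lines: [RecognizedLine], minimumConfidence: Float = 0.5) -> String {
    lines.filter { $0.confidence >= minimumConfidence }
         .map { $0.text }
         .joined(separator: "\n")
}

#if canImport(Vision)
/// Builds a text request with explicit settings and passes the
/// filtered text to `completion`.
func makeAccurateTextRequest(completion: @escaping (String) -> Void) -> VNRecognizeTextRequest {
    let request = VNRecognizeTextRequest { request, _ in
        let observations = request.results as? [VNRecognizedTextObservation] ?? []
        let lines = observations.compactMap { observation -> RecognizedLine? in
            guard let best = observation.topCandidates(1).first else { return nil }
            return RecognizedLine(text: best.string, confidence: best.confidence)
        }
        completion(joinConfidentLines(lines))
    }
    request.recognitionLevel = .accurate      // use .fast to trade accuracy for speed
    request.usesLanguageCorrection = true     // apply language-based correction
    request.recognitionLanguages = ["en-US"]  // assumption: English input
    return request
}
#endif
```

The returned request can be passed to requestHandler.perform([request]) exactly like the one in recognizeText above. Whether 0.5 is a sensible cutoff depends on your images, so it is worth experimenting.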

Image recognition, on the other hand, is the process of identifying and classifying objects or patterns within an image. Image recognition involves the following steps:

Preprocessing: The image is preprocessed to enhance features and reduce noise.
Feature extraction: The software identifies key features within the image, such as shapes, colors, or textures.
Classification: The image is classified into categories based on the identified features.
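
As a toy illustration of the last two steps, here is a minimal sketch in plain Swift. It assumes the only feature is the image's mean color and that classification means picking the closest reference color. The Pixel type, the function names, and the reference values are all invented for this example. A real Core ML model learns much richer features, so this only shows the shape of the idea.

```swift
/// An RGB pixel with components in 0...1. Hypothetical type for this sketch.
struct Pixel {
    let r: Double
    let g: Double
    let b: Double
}

/// Feature extraction: reduce the image to its mean color [r, g, b].
func meanColor(of pixels: [Pixel]) -> [Double] {
    guard !pixels.isEmpty else { return [0, 0, 0] }
    let n = Double(pixels.count)
    return [
        pixels.map { $0.r }.reduce(0, +) / n,
        pixels.map { $0.g }.reduce(0, +) / n,
        pixels.map { $0.b }.reduce(0, +) / n
    ]
}

/// Classification: pick the label whose reference feature is closest
/// to `feature`, using squared Euclidean distance.
func classify(_ feature: [Double], references: [String: [Double]]) -> String? {
    func distance(_ a: [Double], _ b: [Double]) -> Double {
        zip(a, b).map { ($0 - $1) * ($0 - $1) }.reduce(0, +)
    }
    return references.min { distance(feature, $0.value) < distance(feature, $1.value) }?.key
}

// Made-up reference colors for two categories.
let references = ["sky": [0.4, 0.6, 0.9], "grass": [0.2, 0.7, 0.2]]
let image = [Pixel(r: 0.3, g: 0.8, b: 0.1), Pixel(r: 0.1, g: 0.6, b: 0.3)]
let label = classify(meanColor(of: image), references: references)
print(label ?? "unknown") // prints grass
```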

In Swift, image recognition can be done with frameworks such as Core ML and TensorFlow Lite. Here’s an example that runs a Core ML model through Vision:

import UIKit
import CoreML
import Vision

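func recognizeImage(image: UIImage) {
    // MyImageClassifier stands in for the class Xcode generates from your .mlmodel file.
    // CIImage(image:) can fail, so unwrap it safely instead of force-unwrapping.
    guard let model = try? VNCoreMLModel(for: MyImageClassifier().model),
          let ciImage = CIImage(image: image) else { return }
    
    let request = VNCoreMLRequest(model: model) { (request, error) in
        // Report errors instead of silently returning.
        if let error = error {
            print(error)
            return
        }
        // Take the first classification returned for the image.
        guard let results = request.results as? [VNClassificationObservation],
              let topResult = results.first else { return }
        
        print(topResult.identifier)
    }
    
    let requestHandler = VNImageRequestHandler(ciImage: ciImage, options: [:])
    do {
        try requestHandler.perform([request])
    } catch {
        print(error)
    }
}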
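
Printing only the first identifier hides how sure the model is. Here is a small sketch of one way to keep the top few labels together with their confidence. The Prediction type, the topPredictions helper, and the default limit and cutoff are assumptions of mine, not part of Vision or Core ML. Only identifier and confidence come from VNClassificationObservation.

```swift
#if canImport(Vision)
import Vision
#endif

/// A label with its confidence (0...1). Hypothetical type that mirrors the
/// `identifier` and `confidence` properties of `VNClassificationObservation`.
struct Prediction: Equatable {
    let label: String
    let confidence: Float
}

/// Returns up to `limit` predictions at or above `minimumConfidence`, best first.
/// The defaults are arbitrary starting points.
func topPredictions(_ predictions: [Prediction],
                    limit: Int = 3,
                    minimumConfidence: Float = 0.1) -> [Prediction] {
    Array(
        predictions
            .filter { $0.confidence >= minimumConfidence }
            .sorted { $0.confidence > $1.confidence }
            .prefix(limit)
    )
}

#if canImport(Vision)
/// Converts a finished request's classification results into `Prediction` values.
func predictions(from request: VNRequest) -> [Prediction] {
    let observations = request.results as? [VNClassificationObservation] ?? []
    return observations.map { Prediction(label: $0.identifier, confidence: $0.confidence) }
}
#endif
```

Inside the VNCoreMLRequest completion handler you could then call topPredictions(predictions(from: request)) and show the labels that pass the cutoff, or a fallback message when none do.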

Note that these are just basic examples, and there are many ways to perform OCR and image recognition in Swift depending on your specific requirements and preferences.

Written by Nayana N P