Part 2: Building Mobile AI: A Developer’s Guide to On-Device Intelligence

Side-by-side implementation of Secure AI on Android (Kotlin) and iOS (Swift).

In Part 1, we discussed why we need to move away from slow, cloud-dependent chatbots. Now, let’s look at how to build instant, on-device intelligence. While native code is powerful, managing two separate AI stacks can be overwhelming.

Before we jump into platform-specific code, we need to talk about the “Bridge” that connects them: Google ML Kit.

The Cross-Platform Solution: Google ML Kit

If you don’t want to maintain separate Core ML (iOS) and TensorFlow Lite (Android) model pipelines, Google ML Kit is your best friend. It acts as a unified wrapper for on-device machine learning, supporting both Android and iOS.

It offers two massive advantages:

  1. Turnkey Solutions: Instant APIs for Face Detection, Barcode Scanning, and Text Recognition that work identically on both platforms.
  2. Custom Model Support: You can train a single TensorFlow Lite (.tflite) model and deploy it to both your Android and iOS apps using ML Kit’s custom model APIs.

For a deep dive on setting this up, bookmark the official ML Kit guide.
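
To make the custom-model path concrete, here is a minimal Kotlin sketch that loads a bundled .tflite classifier through ML Kit’s custom image labeling API. Treat it as a sketch: the file name, model name, confidence threshold, and callback wiring are illustrative assumptions, not code from the sample project.

// CustomModelClassifier.kt (illustrative)
// build.gradle: implementation("com.google.mlkit:image-labeling-custom:<latest-version>")
import android.graphics.Bitmap
import com.google.mlkit.common.model.LocalModel
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.label.ImageLabeling
import com.google.mlkit.vision.label.custom.CustomImageLabelerOptions

// The same .tflite file can ship in the Android assets folder and the iOS bundle
private val localModel = LocalModel.Builder()
    .setAssetFilePath("custom_classifier.tflite")
    .build()

private val labeler = ImageLabeling.getClient(
    CustomImageLabelerOptions.Builder(localModel)
        .setConfidenceThreshold(0.6f) // drop low-confidence guesses
        .setMaxResultCount(3)
        .build()
)

fun classify(bitmap: Bitmap, onResult: (String) -> Unit) {
    labeler.process(InputImage.fromBitmap(bitmap, /* rotationDegrees = */ 0))
        .addOnSuccessListener { labels ->
            // Each label carries the model's class name and a confidence score
            onResult(labels.joinToString { "${it.text} (${"%.2f".format(it.confidence)})" })
        }
        .addOnFailureListener { e -> onResult("Inference failed: ${e.message}") }
}

The iOS app consumes the same .tflite file through ML Kit’s custom model APIs, which is exactly the “train once, deploy twice” promise of point 2 above.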


The Code: Side-by-Side Implementation

Below, we compare the implementation of two core features: Visual Intelligence (Generative AI) and Real-Time Inference (Computer Vision). You will see that despite the language differences, the architecture for the “One AI” future is remarkably similar.

Feature 1: The “Brain” (Generative AI & Inference)

On Android, we leverage Gemini Nano (via ML Kit’s Generative AI features). On iOS, we use a similar asynchronous pattern to feed data to the Neural Engine.

Android (Kotlin)

We check the model status and then run inference. The system manages the NPU access for us.

// GenAIImageDescriptionScreen.kt
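// "imageDescriber": the ML Kit GenAI image-description client, backed by Gemini Nano on supported devices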
val featureStatus = imageDescriber.checkFeatureStatus().await()

when (featureStatus) {
    FeatureStatus.AVAILABLE -> {
        // The model is ready on-device
        val request = ImageDescriptionRequest.builder(bitmap).build()
        val result = imageDescriber.runInference(request).await()
        onResult(result.description)
    }
    FeatureStatus.DOWNLOADABLE -> {
        // Silently download the model in the background
        imageDescriber.downloadFeature(callback).await()
    }
}

iOS (Swift)

We use an asynchronous loop to continuously pull frames and feed them to the Core ML model.

// DataModel.swift
func runModel() async {
    try! loadModel()
    
    while !Task.isCancelled {
        // Thread-safe access to the latest camera frame
        let image = lastImage.withLock({ $0 })
        
        if let pixelBuffer = image?.pixelBuffer {
            // Run inference on the Neural Engine
            try? await performInference(pixelBuffer)
        }
        // Yield to prevent UI freeze
        try? await Task.sleep(for: .milliseconds(50))
    }
}

Feature 2: The “Eyes” (Real-Time Vision)

For tasks like Face Detection or Object Tracking, speed is everything. We need 30+ frames per second to ensure the app feels responsive.

Android (Kotlin)

We use FaceDetection from ML Kit. The FaceAnalyzer runs on every frame, calculating probabilities for “liveness” (smiling, eyes open) instantly.

// FacialRecognitionScreen.kt
FaceInfo(
    confidence = 1.0f,
    // Detect micro-expressions for liveness check
    isSmiling = face.smilingProbability?.let { it > 0.5f } ?: false,
    eyesOpen = face.leftEyeOpenProbability?.let { left -> 
        face.rightEyeOpenProbability?.let { right ->
            left > 0.5f && right > 0.5f 
        }
    } ?: true
)
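
One detail the mapping above depends on: ML Kit only populates smilingProbability and the eye-open probabilities when classification is enabled on the detector. Here is a minimal setup sketch; the function shape and the CameraX note are assumptions rather than the sample’s actual FaceAnalyzer.

// FaceAnalyzerSetup.kt (illustrative)
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.face.Face
import com.google.mlkit.vision.face.FaceDetection
import com.google.mlkit.vision.face.FaceDetectorOptions

// CLASSIFICATION_MODE_ALL is what makes the smiling / eye-open probabilities non-null
private val detector = FaceDetection.getClient(
    FaceDetectorOptions.Builder()
        .setPerformanceMode(FaceDetectorOptions.PERFORMANCE_MODE_FAST) // favor frame rate over landmark precision
        .setClassificationMode(FaceDetectorOptions.CLASSIFICATION_MODE_ALL)
        .build()
)

// Called for every camera frame, e.g. from a CameraX ImageAnalysis analyzer configured
// with STRATEGY_KEEP_ONLY_LATEST so stale frames are dropped instead of queued.
fun analyze(frame: InputImage, onFaces: (List<Face>) -> Unit) {
    detector.process(frame)
        .addOnSuccessListener { faces -> onFaces(faces) } // map each Face to FaceInfo as shown above
        .addOnFailureListener { /* log and move on; never block the camera loop */ }
}

FAST mode keeps per-frame latency low enough for the 30+ fps target; ACCURATE trades frame rate for more precise detection, which matters less for a simple liveness signal.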

iOS (Swift)

We process the prediction result and update the UI immediately. Here, we even visualize the confidence level using color, providing instant feedback to the user.

// ViewfinderView.swift
private func updatePredictionLabel() {
    for result in prediction {
        // Dynamic feedback based on confidence
        let probability = result.probability
        let color = getColorForProbability(probability) // Red to Green transition
        
        let text = "\(result.label): \(String(format: "%.2f", probability))"
        // Update UI layer...
    }
}

Feature 3: Secure Document Scanning

Sometimes you just need a perfect scan without the cloud risk. Android provides a system-level intent that handles edge detection and perspective correction automatically.

Android (Kotlin)

// DocumentScanningScreen.kt
val options = GmsDocumentScannerOptions.Builder()
    .setGalleryImportAllowed(false) // Force live camera for security
    .setPageLimit(5)
    .setResultFormats(RESULT_FORMAT_PDF)
    .build()

scanner.getStartScanIntent(activity).addOnSuccessListener { intentSender ->
    scannerLauncher.launch(IntentSenderRequest.Builder(intentSender).build())
}
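
Two pieces the snippet leans on but doesn’t show: the scanner object comes from GmsDocumentScanning.getClient(options), and the scan comes back through a normal activity-result contract. Below is a hedged sketch of the receiving end, written as plain Activity plumbing for brevity (the Compose screen in the sample would wire the same contract through rememberLauncherForActivityResult, and handleScannedPdf is a hypothetical sink).

// ScanResultHandling.kt (illustrative)
import android.app.Activity
import android.net.Uri
import androidx.activity.ComponentActivity
import androidx.activity.result.contract.ActivityResultContracts
import com.google.mlkit.vision.documentscanner.GmsDocumentScanningResult

class ScanActivity : ComponentActivity() {

    private val scannerLauncher = registerForActivityResult(
        ActivityResultContracts.StartIntentSenderForResult()
    ) { result ->
        if (result.resultCode == Activity.RESULT_OK) {
            val scan = GmsDocumentScanningResult.fromActivityResultIntent(result.data)
            // The scanned PDF never leaves the device; hand its URI to local storage or a viewer
            scan?.pdf?.uri?.let { pdfUri -> handleScannedPdf(pdfUri) }
        }
    }

    private fun handleScannedPdf(pdfUri: Uri) {
        // Hypothetical sink: persist or render the local file; nothing is uploaded
    }
}

Because setGalleryImportAllowed(false) forces the live camera, every page in that PDF was captured on this device, in this session.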

Conclusion: One Logic, Two Platforms

Whether you are writing Swift for an iPhone 17 Pro or Kotlin for a medical Android tablet, the paradigm has shifted.

  1. Capture locally.
  2. Infer on the NPU.
  3. React instantly.

By building this architecture now, you are preparing your codebase for Spring 2026, when on-device intelligence will likely become the standard across both ecosystems.

Reference: Google ML Kit Documentation
