Inference setup

Integration choice

When building an iOS app around an LLM, the developer first needs to choose how the model will be integrated into the app. In our case, there were four standard ways of running a model on-device: coremltools, llama.cpp, MediaPipe, and ONNX.

After working with all of these methods for some time, we arrived at the following pros and cons for each:

| | coremltools | llama.cpp | MediaPipe | ONNX |
|---|---|---|---|---|
| Pros | Easily integrated via Apple's CoreML | Gives the developer access to lower-level settings | Standard way of integrating Google's LLMs | Can be used with coremltools by running just one command |
| Cons | Not supported for now [08/03/2025] | Too hard for beginners | Google Gemma 3n is not supported for now [08/03/2025] | Requires a high-performance Mac: 16+ GB of RAM and an Apple Silicon Pro or better processor |

Unfortunately, we couldn't use coremltools or ONNX, which are considered the best tools for running LLMs on iOS, so we narrowed the options down to llama.cpp and MediaPipe. As often happens, MediaPipe then turned out to be unsuitable for us: we found no way to convert Google Gemma 3n into a .task file. Hence, the only option left to try was llama.cpp.

Below we go through each LLM integration step in the Gemergency iOS app. We start with the llama.cpp setup and finish with building our own SwiftUI iOS app.

llama.cpp setup

First things first, we had to set up llama.cpp for inference on macOS. To do this, clone the official repo onto the Mac:

$ git clone --recursive https://github.com/ggml-org/llama.cpp.git && cd llama.cpp

This command clones the repository and moves into the root directory of llama.cpp. There we can find the examples/llama.swiftui subdirectory, which is what we need. Before going there, however, we have to build the Xcode framework (XCFramework) that the SwiftUI iOS app will use later. Run this command in the root directory of llama.cpp:

$ ./build-xcframework.sh
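
The build can take a while, so it may be worth confirming afterwards that everything the later steps rely on is in place. The following is a minimal sketch, not part of llama.cpp itself: check_llama_setup is a hypothetical helper name, and the build-apple/llama.xcframework output location is an assumption about where the script writes its result, so adjust it if your checkout differs.

```shell
#!/bin/sh
# Hypothetical helper: sanity-check a llama.cpp checkout after the build.
# Usage: check_llama_setup /path/to/llama.cpp   (defaults to the current directory)
check_llama_setup() {
    repo_root="${1:-.}"
    rc=0
    # Paths are relative to the repository root.
    # build-apple/llama.xcframework is an ASSUMED output location of
    # build-xcframework.sh; the other two come from the steps in the text.
    for rel in \
        "build-xcframework.sh" \
        "examples/llama.swiftui" \
        "build-apple/llama.xcframework"
    do
        if [ -e "$repo_root/$rel" ]; then
            echo "ok       $rel"
        else
            echo "missing  $rel"
            rc=1
        fi
    done
    # Non-zero exit status if anything was missing.
    return "$rc"
}
```

After defining the function in the current shell, calling check_llama_setup . from the llama.cpp root prints one line per path and returns a non-zero status if any of them is missing, so it can be chained with && before moving on.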

And that's it! We can now proceed to integrating Google Gemma 3n into the iOS app.