Inference setup
Integration choice
When building an iOS app around an LLM, the first thing a developer has to choose is how the model will be integrated into the app. In our case, these were the standard ways of running a model on-device:
- coremltools (installed from pip)
- llama.cpp inference with .gguf model files
- Google's MediaPipe
- ONNX
After spending some time with all of these methods, we arrived at the following pros and cons for each:
| | coremltools | llama.cpp | MediaPipe | ONNX |
|---|---|---|---|---|
| Pros | Easily integrated via Apple's CoreML | Gives the developer access to lower-level settings | Standard way of integrating Google's LLMs | Works with coremltools by running a single command |
| Cons | Not supported yet (as of 08/03/2025) | Steep learning curve for beginners | Google Gemma 3n is not supported yet (as of 08/03/2025) | Requires a high-performance Mac (16+ GB of RAM and an Apple Silicon Pro or better processor) |
Unfortunately, we couldn't use coremltools or ONNX, which are considered the best tools for running LLMs on iOS, so we narrowed the options down to llama.cpp and MediaPipe. MediaPipe then turned out to be unsuitable as well, because we found no way to convert Google Gemma 3n into the .task file format. Hence, llama.cpp was the only option left to try.
Below, we go through each LLM integration step in the Gemergency iOS app. We start with the llama.cpp setup and finish by building our own SwiftUI iOS app.
llama.cpp setup
First, we had to install llama.cpp on macOS. To do this, clone the official repo on the Mac:
$ git clone --recursive https://github.com/ggml-org/llama.cpp.git && cd llama.cpp
This command clones the repository and moves into the root directory of llama.cpp. There we can find the examples/llama.swiftui subdirectory, which is what we need. But before going there, we have to build the XCFramework for later use in the SwiftUI iOS app. Run this command in the root directory of llama.cpp:
$ ./build-xcframework.sh
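As an optional sanity check after the build script finishes, a small helper like the one below can confirm that an XCFramework bundle was actually produced. This is only a sketch: the function name `check_xcframework` is hypothetical, and the default path `build-apple/llama.xcframework` is an assumption about where the script writes its output, so adjust it if your checkout differs.

```shell
# Sketch: verify that an XCFramework bundle exists at the given path.
# Assumption: the build script writes to build-apple/llama.xcframework;
# pass a different path as the first argument if that is not the case.
check_xcframework() {
  framework_dir="${1:-build-apple/llama.xcframework}"
  # An XCFramework bundle is a directory with an Info.plist at its root.
  if [ -d "$framework_dir" ] && [ -f "$framework_dir/Info.plist" ]; then
    echo "found: $framework_dir"
  else
    echo "missing: $framework_dir" >&2
    return 1
  fi
}
```

Calling `check_xcframework` from the root directory of llama.cpp prints the bundle path on success and returns a non-zero status otherwise.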
And that's it! We can now proceed to integrating Google Gemma 3n into the iOS app.
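Since llama.cpp loads models from .gguf files, it can be worth checking a downloaded model file before adding it to the Xcode project. The sketch below relies on the fact that GGUF files begin with the four ASCII bytes `GGUF`; the function name `is_gguf` is hypothetical and not part of llama.cpp.

```shell
# Sketch: return success if the file starts with the GGUF magic bytes.
# is_gguf is a hypothetical helper, not a llama.cpp tool.
is_gguf() {
  # Reject paths that are not regular files.
  [ -f "$1" ] || return 1
  # Compare the first four bytes of the file with the ASCII string "GGUF".
  [ "$(head -c 4 "$1")" = "GGUF" ]
}
```

For example, `is_gguf path/to/model.gguf && echo "looks like GGUF"` prints the message only when the header matches. This checks the header only, not whether the model itself is complete or supported.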