Setting up an iOS app with Gemma 3n
Preparation
We were ready to build a brand-new SwiftUI iOS app. Before we began, we had to set up the XCFramework within the app. To start, we created a new SwiftUI project in Xcode by navigating to Xcode → File → New → Project, selecting SwiftUI as the primary UI framework, and creating a new app.
Next, we had to add the XCFramework we built earlier to the project. This can be done by dragging and dropping the framework into the project in Xcode. Once that was done, we could move on to integrating the necessary controllers into the app.
App setup
To work successfully with Gemma 3n, our SwiftUI iOS app requires two key controllers: LlamaState and LibLlama. Both can be found in llama.cpp/examples/llama.swiftui:
- LlamaState - acts as a bridge between the SwiftUI app and llama.cpp, using LibLlama
- LibLlama - serves as the core engine, wrapping the llama.cpp API to handle model loading and inference within the SwiftUI app
After adding these controllers to our SwiftUI project, we were ready to begin designing the app's user interface.
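To illustrate how the two controllers fit together, here is a minimal sketch of a SwiftUI view driven by LlamaState. The property and method names (messageLog, complete(text:)) follow the upstream llama.swiftui example and may differ slightly in your copy:

import SwiftUI

struct ChatView: View {
    // LlamaState is an ObservableObject, so the view refreshes as tokens stream in
    @StateObject private var llamaState = LlamaState()
    @State private var prompt = ""

    var body: some View {
        VStack {
            ScrollView {
                Text(llamaState.messageLog) // generated text published by LlamaState
            }
            TextField("Ask Gemma 3n...", text: $prompt)
                .textFieldStyle(.roundedBorder)
            Button("Send") {
                Task {
                    // complete(text:) forwards the prompt to LibLlama and streams the reply back
                    await llamaState.complete(text: prompt)
                }
            }
        }
        .padding()
    }
}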
Additional changes to the controllers
In addition to adding these controllers to the project, we also needed to modify them to ensure they functioned correctly.
First things first, we had to add these lines of code to the clear() method in LibLlama:
func clear() {
    tokens_list.removeAll()
    temporary_invalid_cchars.removeAll()
    llama_memory_clear(llama_get_memory(context), true)
    self.n_cur = 0       // <- add this line
    self.is_done = false // <- add this line
}
Without these lines of code, Gemma 3n won't respond to a second prompt. To clarify: the first prompt works as expected and receives a response, but the second prompt fails because the session cache isn't cleared between prompts.
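As a rough sketch, this is how a hypothetical send(prompt:) helper in LlamaState could reset the session before each new prompt. The completion_init(text:), completion_loop(), and is_done names follow the upstream llama.swiftui example and may differ in your copy:

func send(prompt: String) async {
    guard let llamaContext else { return }

    // Reset token buffers and the llama.cpp memory so the new prompt
    // starts from a clean state (this calls the clear() shown above).
    await llamaContext.clear()

    await llamaContext.completion_init(text: prompt)
    while !(await llamaContext.is_done) {
        let piece = await llamaContext.completion_loop()
        messageLog += piece
    }
}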
Next, we had to modify the create_context method in LibLlama:

static func create_context(path: String) throws -> LlamaContext {
    llama_backend_init()
    var model_params = llama_model_default_params() // <- add this line

#if targetEnvironment(simulator)
    model_params.n_gpu_layers = 0
    print("Running on simulator, force use n_gpu_layers = 0")
#endif
    model_params.n_gpu_layers = 0 // <- add this line

    let model = llama_model_load_from_file(path, model_params)
    guard let model else {
        print("Could not load model at \(path)")
        throw LlamaError.couldNotInitializeContext
    }

    let n_threads = max(1, min(8, ProcessInfo.processInfo.processorCount - 2))
    print("Using \(n_threads) threads")

    var ctx_params = llama_context_default_params()
    ctx_params.n_ctx = 2048
    ctx_params.n_threads = Int32(n_threads)
    ctx_params.n_threads_batch = Int32(n_threads)

    let context = llama_init_from_model(model, ctx_params)
    guard let context else {
        print("Could not load context!")
        throw LlamaError.couldNotInitializeContext
    }

    return LlamaContext(model: model, context: context)
}
Without these lines of code, the model won't load on physical devices. It may run successfully when launched from Xcode, but it will fail on an actual device when the app is distributed outside of Xcode, for example via TestFlight. Setting n_gpu_layers = 0 disables GPU offload, so inference runs entirely on the CPU.
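For completeness, here is a hedged sketch of how a model shipped inside the app bundle could be loaded through this factory method. The file name gemma-3n-E2B-it-Q4_K_M.gguf is only a placeholder for whatever GGUF file you bundle:

// Hypothetical example: load a GGUF model bundled with the app.
if let modelPath = Bundle.main.path(forResource: "gemma-3n-E2B-it-Q4_K_M", ofType: "gguf") {
    do {
        let llamaContext = try LlamaContext.create_context(path: modelPath)
        // llamaContext is now ready for completion_init / completion_loop calls
    } catch {
        print("Failed to initialize llama context: \(error)")
    }
} else {
    print("Model file not found in the app bundle")
}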
Other necessary settings and methods can be found in the GitHub repo of Gemergency.
Further steps
With that completed, we proceeded to develop the Gemergency iOS app. The next steps involved designing the UI with SwiftUI, integrating iOS system features, and implementing other core functionalities.