Conclusion and Final Model Choice

The evaluation shows that quantization is not a lossless process. Three of the quantizations tested (Q4_0, Q3_K_S, Q2_K) catastrophically degraded model safety and reliability, producing dangerously incorrect information.

Unsafe Models: Q4_0, Q3_K_S, and Q2_K are unsafe and must never be deployed in a real-world application.
Viable Models: Q3_K_M and Q3_K_L offer a strong balance of safety and efficiency, making them suitable for environments with limited resources.
Gold Standard: Q4_K_M provides the most comprehensive and safest response.
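
The three tiers above can be written down as a small deployment guard. The following is a minimal sketch, not part of the project's code: the names `Verdict`, `VERDICTS`, `is_deployable`, and `candidates` are hypothetical, and rejecting variants that were never evaluated is an assumption made here for safety.

```python
from enum import Enum


class Verdict(Enum):
    UNSAFE = "unsafe"
    VIABLE = "viable"
    GOLD_STANDARD = "gold standard"


# Verdicts copied from the conclusion above; the structure is illustrative.
VERDICTS = {
    "Q2_K": Verdict.UNSAFE,
    "Q3_K_S": Verdict.UNSAFE,
    "Q4_0": Verdict.UNSAFE,
    "Q3_K_M": Verdict.VIABLE,
    "Q3_K_L": Verdict.VIABLE,
    "Q4_K_M": Verdict.GOLD_STANDARD,
}


def is_deployable(variant: str) -> bool:
    """Return True only for variants the evaluation did not mark unsafe."""
    # Assumption: a variant that was never evaluated is rejected (fail closed).
    return VERDICTS.get(variant, Verdict.UNSAFE) is not Verdict.UNSAFE


def candidates(safety_critical: bool) -> list[str]:
    """List acceptable variants for a deployment, sorted by name."""
    allowed = {Verdict.GOLD_STANDARD}
    if not safety_critical:
        # Resource-limited, non-critical deployments may also use viable variants.
        allowed.add(Verdict.VIABLE)
    return sorted(name for name, verdict in VERDICTS.items() if verdict in allowed)
```

With this sketch, `candidates(safety_critical=True)` leaves only Q4_K_M, while a resource-limited, non-critical deployment would also see Q3_K_L and Q3_K_M.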

For this project, user safety in an emergency is the highest priority, so we selected the Q4_K_M model as our final production choice. Its marginally larger file size, compared with the viable Q3_K_M and Q3_K_L variants, is a small price to pay for the significant improvement in the detail, clarity, and trustworthiness of its guidance. Our fine-tuning and evaluation pipeline produced a model that is demonstrably more reliable and fit for the critical purpose of emergency assistance.
