LOMA: Mobile Offline Medical AI Assistant
Motivation #
Billions of people need reliable medical answers where connectivity is limited or privacy is paramount.
LOMA (Local Offline Medical Assistant) delivers a zero-cloud experience: the entire assistant, from embeddings to language model responses, runs on the user’s phone.
System Design #
- Model – Gemma 3n converted to a 4.79 GB GGUF checkpoint, served via `llama.rn` with GPU offload for up to 99 layers (see the loading sketch after this list).
- Retrieval – A doc2query-enhanced RAG stack indexes 5 million Q&A-style medical documents so answers are grounded and cite exact sources.
- Embeddings – ExecuTorch runs `all-MiniLM-L6-v2` locally, generating 384-d vectors in ~70 ms while using only 150–190 MB of RAM.
- Database – Turso (SQLite + vector extensions) ships as a pre-built bundle synced through Cloudflare R2; a cosine search over the full store completes in ~45 s without ballooning storage (see the query sketch after this list).
- Frontend – React Native application with shared abstractions for storage, queue-based inference, and lazy loading to keep both iOS and Android responsive.
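A minimal sketch of the model-loading step, assuming `llama.rn`'s `initLlama` API; the model path, context size, and option values shown here are illustrative, not LOMA's exact configuration.

```typescript
// Sketch: loading the Gemma 3n GGUF checkpoint with llama.rn.
// Path and context size are assumptions for illustration.
import { initLlama, LlamaContext } from 'llama.rn';

export async function loadGemma(modelPath: string): Promise<LlamaContext> {
  // n_gpu_layers: 99 asks llama.cpp to offload as many layers as the
  // device GPU can hold; layers that don't fit fall back to the CPU.
  return initLlama({
    model: modelPath,   // e.g. the 4.79 GB gemma-3n .gguf in app storage
    n_ctx: 4096,        // context window (assumption)
    n_gpu_layers: 99,   // offload up to 99 layers to the GPU
  });
}
```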
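And a sketch of the cosine search against the bundled database, assuming libSQL's `vector32()` / `vector_distance_cos()` SQL functions; the `@libsql/client` call stands in for whatever SQLite binding the React Native app actually uses, and the table and column names are illustrative.

```typescript
// Sketch: brute-force cosine search over the pre-built libSQL bundle.
// Table/column names (documents, embedding, body, source) are assumptions.
import { createClient } from '@libsql/client';

const db = createClient({ url: 'file:loma.db' }); // local bundle synced via R2

export async function searchDocuments(queryEmbedding: number[], k = 5) {
  const result = await db.execute({
    // vector32() parses a JSON array into a 32-bit float vector;
    // vector_distance_cos() returns cosine distance (lower = more similar).
    sql: `SELECT id, body, source,
                 vector_distance_cos(embedding, vector32(?)) AS dist
          FROM documents
          ORDER BY dist ASC
          LIMIT ?`,
    args: [JSON.stringify(queryEmbedding), k],
  });
  return result.rows;
}
```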
Workflow #
- A user question is normalized into Gemma’s conversation format.
- The query embedding searches both long-form documents and FAQ-style pairs.
- Retrieved passages are assembled with structured citations.
- Gemma 3n generates an answer entirely on-device, never sending data to external servers (a condensed sketch of these steps follows).
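A condensed sketch of steps 1, 3, and 4, assuming `llama.rn`'s chat-style `completion` call; the passage shape, citation format, and sampling parameters are assumptions, and retrieval (step 2) is treated as already done, e.g. by a search like the one sketched above.

```typescript
// Sketch: assemble retrieved passages with citations and generate on-device.
// Passage shape, prompt wording, and sampling values are assumptions.
import type { LlamaContext } from 'llama.rn';

interface Passage { body: string; source: string }

export async function answer(ctx: LlamaContext, question: string, passages: Passage[]) {
  // Step 3: number the passages so the model can cite them as [1], [2], ...
  const context = passages
    .map((p, i) => `[${i + 1}] (${p.source}) ${p.body}`)
    .join('\n\n');

  // Steps 1 & 4: wrap everything in Gemma's conversation format and run locally.
  const result = await ctx.completion({
    messages: [
      {
        role: 'user',
        content:
          `Answer the question using only the passages below and cite them as [n].\n\n` +
          `${context}\n\nQuestion: ${question}`,
      },
    ],
    n_predict: 512,    // cap the answer length (assumption)
    temperature: 0.2,  // keep medical answers conservative (assumption)
  });
  return result.text;
}
```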
Impact & Metrics #
- Privacy-preserving responses with verifiable citations improve trust for clinical decision support.
- Works offline after the initial 4.79 GB download; model + DB fit comfortably on mid-range phones.
- Vector search takes 94 ms over 50k vectors and ~45 s over the full 5M-document store, an acceptable tradeoff for a 250 MB footprint.
- Response latency stays under one minute even on modest hardware, broadening device eligibility.