Saathi: Building an AI App for People Who Still Use Button Phones
The hard part of Saathi wasn't getting an LLM to answer questions. The hard part was refusing to build the kind of AI app developers love making for each other.
The SNPSU prompt was clear enough to kill most lazy ideas on contact. Build an AI-first interface for low-digital-literacy users in India. Reduce typing. Reduce reading. Reduce navigation. If you've spent enough time around AI demos, you know how unnatural that brief feels to most teams. The default instinct is to add features, add menus, add cleverness. We started there too, at least mentally. Then we looked at the target user and most of those instincts stopped making sense.
Our first MVP was intentionally boring. Phone number login with Firebase and OTP because anything more complicated would have been engineering vanity. Speech input through Sarvam, a Llama 70B model running through Groq as the core brain, Bulbul V3 for voice output. The first version was basically an AI wrapper with placeholder UI, nothing you'd write home about. That was the point. We wanted the core loop working before we let ourselves get attached to aesthetics. Once it spoke, answered, and survived a basic interaction, we redesigned the interface with visual LLM tooling, converted the good parts into code, and swapped out the placeholder screens. MVP 1 took about an hour.
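The core loop described above is a three-stage chain: speech in, model reply, speech out. Here is a minimal sketch of that shape in Python. It is not our actual code. The stage signatures and the stub functions are hypothetical stand-ins, and real speech-to-text, chat, and text-to-speech clients would each need wrapping to fit them.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures. None of these names come from a vendor SDK;
# real STT, LLM, and TTS clients would be adapted to match these shapes.
SpeechToText = Callable[[bytes], str]
ChatModel = Callable[[str], str]
TextToSpeech = Callable[[str], bytes]


@dataclass
class VoiceLoop:
    transcribe: SpeechToText
    answer: ChatModel
    speak: TextToSpeech

    def handle_turn(self, audio_in: bytes) -> bytes:
        """Run one spoken turn: audio in, audio out, no typing or reading."""
        question = self.transcribe(audio_in)
        if not question.strip():
            # Nothing intelligible was heard, so ask the user to repeat.
            return self.speak("Please say that again.")
        reply = self.answer(question)
        return self.speak(reply)


# Stub stages so the loop can be exercised without any network access.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_answer(question: str) -> str:
    return f"You asked: {question}"


def fake_speak(text: str) -> bytes:
    return text.encode("utf-8")


loop = VoiceLoop(fake_transcribe, fake_answer, fake_speak)
```

Keeping each stage behind a plain callable is what makes a placeholder UI viable: the loop can be tested with stubs first, and any one provider can be swapped later without touching the other two.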
Then we hit the first real problem. The model could answer, sure, but it had no memory of the user's situation unless the user kept re-explaining everything. For low-literacy users, that kills the product. We added a lightweight local vector store and basic RAG so the system could hold context without forcing more typing. A quick check-in with the judges told us we were roughly on the right track, which was useful because we'd only solved one piece of the brief.
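To show what "lightweight local vector store and basic RAG" can mean at its smallest, here is a toy sketch in Python. It makes several assumptions: the word-count "embedding" stands in for a real embedding model, the class and function names are invented for illustration, and the stored facts are made-up examples rather than real user data.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real build would use a model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class LocalMemory:
    """In-memory store of facts the user has already said once."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def remember(self, fact: str) -> None:
        self._items.append((fact, embed(fact)))

    def recall(self, query: str, k: int = 2) -> list[str]:
        """Return up to k stored facts that share vocabulary with the query."""
        q = embed(query)
        scored = [(cosine(q, vec), fact) for fact, vec in self._items]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [fact for _, fact in scored[:k]]


def build_prompt(memory: LocalMemory, question: str) -> str:
    """Prepend recalled facts so the user does not have to repeat them."""
    context = "\n".join(f"- {fact}" for fact in memory.recall(question))
    return f"Known about the user:\n{context}\n\nQuestion: {question}"


# Made-up example facts, for illustration only.
memory = LocalMemory()
memory.remember("The user lives in Pune and farms sugarcane")
memory.remember("The user's daughter studies in class 8")
```

Calling `build_prompt(memory, "what fertilizer for sugarcane")` pulls in only the farming fact, which is the point: the context comes along without the user saying it again.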
The bigger shift came a little later, when we realized we were still thinking too much like smartphone users. A lot of the audience we cared about still uses button phones. Once that clicked, the product changed direction fast. If we could make this work on KaiOS, we wouldn't just be polishing accessibility. We'd be meeting people where they already were. Our Android prototype was in Kotlin, functional enough, and structured well enough to port. So we duplicated the codebase and rebuilt the interface in plain HTML, CSS, and JavaScript for KaiOS. With good prompting and a willingness to let AI handle the boring conversion work, that rewrite was much less painful than it sounds.
That decision did more for the product than any model upgrade would have. Suddenly every UI decision had a ruthless filter. Does this need text? Does this need another screen? Does this need a settings page at all? Most of the time the answer was no. We ended up with essentially two screens: login and the main voice chat. No maze of chats, no decorative profile area, no settings graveyard full of toggles nobody was going to touch. Just the core action.
We still added a couple of features that made sense in context. Live screen sharing let a user ask the assistant about what was already on the display. Auto-triggered image generation gave the app a way to explain visually instead of throwing more words at the user. None of this was unprecedented. Google, OpenAI, Anthropic, plenty of others have explored similar interaction patterns. We weren't pretending to invent a new genre. We were trying to package the right ideas for a very different user base, on devices most product people barely think about.
The funny part is that the technical challenge wasn't complexity. It was restraint. Our team could have built something much more elaborate. That's not false modesty. It would've been easy to stuff the demo with extra flows, more screens, richer settings, maybe a few features the judges would clap for and nobody in the actual target audience would use. But if your imagined user is somebody's 80-year-old grandmother holding a button phone, complexity stops looking impressive. It starts looking hostile.
Cost mattered too. Student hackathon. Tiny budget. Users unlikely to pay much for AI even if they liked the product. So every tool choice had to survive the cheapest-possible test. We got a ridiculous amount done without spending much by being shamelessly resourceful: Opus 4.6, GPT-5.4, DeepSeek V4, free Nvidia API credits, Hugging Face freebies, too many Google accounts, carefully structured prompts so we didn't waste calls, and a lot of judgment about when not to invoke the expensive model. People talk about AI app development like it's automatically a money pit. It can be. It can also be surprisingly cheap if you stop treating tokens like confetti.
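One way to picture "when not to invoke the expensive model" is a small router that checks a cache first, tries a cheap model for short prompts, and only escalates for long ones. The sketch below is illustrative Python, not what we shipped. The `FrugalRouter` name, the word-count threshold, and the idea that prompt length is a good proxy for difficulty are all assumptions.

```python
from typing import Callable

# Hypothetical model signature: prompt text in, reply text out.
Model = Callable[[str], str]


class FrugalRouter:
    """Cache first, cheap model second, expensive model only when needed."""

    def __init__(self, cheap: Model, expensive: Model, max_cheap_words: int = 12) -> None:
        self.cheap = cheap
        self.expensive = expensive
        # Assumption: short prompts are simple enough for the cheap model.
        self.max_cheap_words = max_cheap_words
        self._cache: dict[str, str] = {}
        self.expensive_calls = 0

    def ask(self, prompt: str) -> str:
        # Normalize case and whitespace so near-identical prompts share a key.
        key = " ".join(prompt.lower().split())
        if key in self._cache:
            return self._cache[key]  # Repeated question: zero tokens spent.
        if len(key.split()) <= self.max_cheap_words:
            reply = self.cheap(prompt)
        else:
            self.expensive_calls += 1
            reply = self.expensive(prompt)
        self._cache[key] = reply
        return reply
```

A real router would want a better difficulty signal than word count, but even this crude version keeps repeated questions from costing anything and makes expensive calls easy to count.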
Most of the actual building happened in under six hours. The rest of the time went into the stuff people summarize too casually: thinking, testing, questioning whether a feature was helping or just flattering us, and figuring out how to make the whole thing feel obvious to someone who does not want to read a wall of UI text. That last part matters more than most technical postmortems admit.
Saathi taught me something I keep relearning in different forms: simplicity is not the absence of ambition. Sometimes it's the proof that you understood the problem. Anyone can make an AI app feel powerful for a judge with a laptop. Making it usable for someone on a low-end device, with low patience, low literacy, and no interest in learning your interface from scratch, is harder. And more interesting.