knowing the unknown
The cruel paradox of current assistive technology is that you have to know something is there before the app will tell you about it. That's the catch-22 that breaks most of these tools: how do you ask about something you don't know exists?
Picture this: You're in a new restaurant, menu in hand (or more likely, trying to get someone to read it to you). Current apps can help if you know what to ask: "What does this menu say?" "What's in front of me?" "Read this text."
But what about everything else? What about the daily specials written on a chalkboard behind you? The dessert display case you walked past without knowing it existed? The fact that the waiter has been trying to get your attention for the past minute? The interesting art on the walls that might be a conversation starter? The emergency exit that's right next to your table?
This is the fundamental flaw in most assistive technology: it's reactive, not proactive. It waits for you to ask the right question rather than anticipating what you might need to know. It puts the burden on the user to somehow intuit what questions to ask about a world they can't see.
“If your product doesn’t solve a real problem for real people, it’s just a solution in search of a problem.”
Modern AI models were largely designed without a specific use case in mind; taken alone, they are little more than proofs of concept. Their true capabilities only show when they become one cog in a larger agentic system, and, most importantly, when they start serving real people. In this case, those people are the visually impaired.
If we want to build an agent tailored to them, one that conveys as much of the visual experience as possible, we first need to understand how that experience works for sighted people.
Sighted people don't get their information on demand. Their vision constantly feeds them things they never asked for: environmental awareness that happens automatically, subconsciously. The person walking up behind them. The change in lighting that hints at the weather outside. The body language of everyone in the room that helps them gauge the social atmosphere.
Imagine if your best friend worked the same way as current assistive technology. Instead of saying, "Hey, your favourite player just walked in," they'd wait for you to specifically ask, "Is my favourite player here?" Instead of mentioning, "There's a really cool sculpture behind you," they'd only describe it if you happened to ask about decorative objects in your vicinity.
That's not how humans help each other, and it's not how technology should work either.
The future of assistive technology isn't about building better tools that wait for commands. It's about creating AI companions that actively share the world with you, that notice what you might want to know and offer it freely, that understand context and timing, that act less like a search engine and more like a thoughtful friend who happens to have perfect vision.
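To make that contrast concrete, here is a minimal sketch in Python. It is purely illustrative: Observation, describe_scene (a stand-in for a call to a vision-language model on a camera frame), companion_step, and the salience threshold are all hypothetical names, not an existing API.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    description: str   # e.g. "the waiter is trying to get your attention"
    salience: float    # 0..1, how much the user probably wants to hear this now


def describe_scene(frame) -> list[Observation]:
    """Stand-in for a vision-language model call on a camera frame."""
    # A real system would send the frame to a model and parse its output;
    # here we return fixed examples so the sketch runs on its own.
    return [
        Observation("daily specials on a chalkboard behind you", 0.6),
        Observation("the waiter is waving at you", 0.9),
    ]


# Reactive pattern: nothing happens until the user asks the right question.
def answer(question: str, frame) -> str:
    matches = [o for o in describe_scene(frame)
               if question.lower() in o.description.lower()]
    return matches[0].description if matches else "I don't see that."


# Proactive pattern: the agent watches continuously and volunteers
# anything salient enough, with no question required.
def companion_step(frame, speak, threshold: float = 0.8) -> None:
    for obs in describe_scene(frame):
        if obs.salience >= threshold:
            speak(obs.description)


if __name__ == "__main__":
    frame = None                       # a real camera frame in practice
    print(answer("waiter", frame))     # user had to know to ask about the waiter
    companion_step(frame, print)       # the agent speaks up on its own
```

The only real difference between the two patterns is who initiates: answer does nothing until the user guesses the right question, while companion_step looks at every frame and decides for itself whether something is worth saying out loud.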