The Next Frontier For Smart Glasses: Empowering Vision With On-Device AI


Wei Duan built speech recognition and machine learning systems at Google and Snapchat and is a cofounder of 8glabs AI.


In 2024, Meta achieved a remarkable milestone by selling over one million units of its smart glasses, signaling a pivotal shift in the evolution of this technology. The definition of smart glasses has changed significantly. Previously, the closest analog was augmented reality (AR) glasses, but progress in AR has been slowed by technical challenges.

With the emergence of large language models (LLMs) and large speech models, smart glasses are being repositioned as critical components of the AI ecosystem. Their role is shifting from standalone devices to essential accessories for AI assistants, acting as both sensors and displays for artificial intelligence.

The Case For On-Device AI Models

The rise of edge models underscores the importance of local computation in the future of smart glasses. Current devices rely heavily on remote servers for AI computations. For instance, when a smart glasses user asks, “What’s the make and model of this car?” the device typically sends data over Wi-Fi or cellular networks to remote servers. These servers then perform the necessary AI computations, calling APIs such as those from OpenAI or Google’s Gemini to deliver an answer. While this setup works, every query incurs a network round trip and a per-request API cost, so it is both expensive and slow, which limits the overall user experience.

This reliance on server-side AI is poised for disruption. With advancements in hardware and smaller, more efficient AI models, many computations can occur directly on the device. This hybrid approach—a combination of edge and server-side AI—promises significant cost savings and an enhanced user experience.

Imagine a pair of smart glasses equipped with an AI processing unit (APU) that can locally process video streams, perform eye-tracking and run small language models to answer straightforward queries. Only complex tasks requiring heavy computational resources would be sent to remote servers. This division of labor would create a more seamless, cost-efficient interaction for users.
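The division of labor described above can be sketched as a small routing function. This is only an illustration: the word-count threshold, the keyword stems and the `Query` type are hypothetical choices made for this example and do not come from any real smart glasses SDK.

```python
from dataclasses import dataclass

# Hypothetical routing rules, chosen only for illustration.
LOCAL_MAX_WORDS = 12
SERVER_STEMS = ("recommend", "analy", "summar")


@dataclass
class Query:
    text: str
    needs_history: bool = False  # True if the answer depends on long-term user data


def route(query: Query) -> str:
    """Return "device" for simple queries and "server" for heavier ones."""
    if query.needs_history:
        return "server"
    # Lowercase, split on whitespace and drop surrounding punctuation.
    words = [w.strip("?.,!\"'") for w in query.text.lower().split()]
    if len(words) > LOCAL_MAX_WORDS:
        return "server"
    if any(w.startswith(stem) for w in words for stem in SERVER_STEMS):
        return "server"
    return "device"


if __name__ == "__main__":
    print(route(Query("What's the make and model of this car?")))
    print(route(Query("Recommend nearby events", needs_history=True)))
```

A production system would more likely route on model confidence or a learned classifier than on keywords, but the shape of the decision is the same: answer locally when possible and fall back to the server otherwise.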

A Glimpse Into The Future

Consider a scenario where you’re walking down the street and see a car you don’t recognize. With a simple wink, you ask your glasses, “What’s the make and model of this car?” Using eye-tracking, the glasses pinpoint the object of interest and process video frames locally. A lightweight, multimodal edge model identifies the car and delivers the answer in real time.

For more complex queries, such as analyzing user behavior and recommending nearby events, the glasses would defer to server-side AI. This hybrid approach balances computational efficiency with user experience, addressing the limitations of current smart glasses.

The Role Of Small Language Models (SLMs)

The development of small language models for mobile and wearable devices is advancing rapidly. While speech models remain slightly behind in their development, the techniques powering SLMs are highly transferable. These models, designed for efficient sequence processing and generation, are crucial for the next generation of smart glasses.

Key innovations in SLMs include:

Architectural Advancements

• Meta’s MobileLLM demonstrates a “deep and thin” architecture that uses embedding sharing and grouped-query attention to achieve high performance with fewer than 1 billion parameters.

• Models like MiniCPM employ a mixture of experts (MoE) approach, enabling larger parameter counts (1 to 4 billion) while keeping active parameters low during inference.
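To make the parameter savings behind these techniques concrete, here is a rough back-of-the-envelope sketch. All sizes (vocabulary, hidden dimension, head counts, expert sizes) are illustrative assumptions, not the published configurations of MobileLLM or MiniCPM, and the function names are hypothetical.

```python
def embedding_params(vocab_size: int, dim: int, shared: bool) -> int:
    """Parameters in the input and output embedding tables.

    With embedding sharing, one table serves both roles.
    """
    per_table = vocab_size * dim
    return per_table if shared else 2 * per_table


def kv_projection_params(dim: int, n_heads: int, n_kv_heads: int) -> int:
    """Key and value projection weights for one attention layer.

    Grouped-query attention uses fewer key/value heads than query heads.
    """
    head_dim = dim // n_heads
    return 2 * dim * (n_kv_heads * head_dim)


def moe_params(shared: int, per_expert: int, n_experts: int, top_k: int) -> tuple[int, int]:
    """Return (total, active) parameters when top_k experts run per token."""
    total = shared + n_experts * per_expert
    active = shared + top_k * per_expert
    return total, active


if __name__ == "__main__":
    # Illustrative sizes only.
    print(embedding_params(32_000, 512, shared=False))  # two separate tables
    print(embedding_params(32_000, 512, shared=True))   # one shared table
    print(kv_projection_params(512, n_heads=8, n_kv_heads=8))  # standard multi-head
    print(kv_projection_params(512, n_heads=8, n_kv_heads=2))  # grouped-query
    print(moe_params(500_000_000, 250_000_000, n_experts=8, top_k=2))
```

Under these assumed sizes, sharing halves the embedding parameters, cutting key/value heads from 8 to 2 shrinks those projections to a quarter, and the mixture-of-experts model stores 2.5 billion parameters while using only 1 billion per token.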

Device-Specific Optimization

• Apple’s OpenELM suite, with models ranging from 270 million to 3 billion parameters, is optimized for iOS devices. While this ensures exceptional performance, it limits cross-platform compatibility.

• Microsoft’s Phi-3-mini can be quantized to 4 bits, enabling efficient operation on consumer devices like the iPhone 14.
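Why quantization matters on a phone or a pair of glasses comes down to simple arithmetic on weight storage. The sketch below estimates the memory needed for model weights alone at different bit widths. The 3-billion-parameter figure is used purely as an example size, and the estimate ignores activations, caches and runtime overhead.

```python
def weight_memory_gb(n_params: int, bits_per_weight: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    n_params = 3_000_000_000  # example model size, an assumption for illustration
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: {weight_memory_gb(n_params, bits):.1f} GB")
```

Going from 16-bit to 4-bit weights cuts storage by a factor of four, which is often the difference between a model that fits in a mobile device’s memory and one that does not.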

Ecosystem And Adoption

• Open-source initiatives such as TinyLlama and Gemma foster vibrant developer communities, accelerating innovation and adoption.

• Proprietary models like OpenELM benefit from deep integration within their ecosystems but face challenges in reaching a broader audience.

While SLMs excel in efficiency, they still lag behind larger models in complex reasoning tasks. Continued innovation will bridge this gap, making mobile devices more capable without sacrificing efficiency.

A Billion-Dollar Opportunity

The future of smart glasses is bright, with the potential to generate a market worth over a billion dollars. Edge machine learning will be a cornerstone of this transformation, enabling devices that are not only more efficient but also more responsive and cost-effective. As edge and server-side AI work in harmony, the dream of truly smart glasses—seamless, intuitive and indispensable—will become a reality.

Smart glasses are no longer just a tech accessory; they are evolving into a vital interface for AI-driven interactions. By harnessing the power of on-device AI models, we are taking a significant step toward a more connected and intelligent future.

Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.
