Your next earbuds could translate text and identify objects

Researchers at the University of Washington have developed a new prototype system that could change the way people interact with artificial intelligence in everyday life. Called VueBuds, the system integrates tiny cameras into wireless earbuds, allowing users to ask an AI model questions about the world around them in near real-time.
The concept is simple but powerful. A user can look at an item, such as a food package in a foreign language, and ask the AI to translate it. Within about a second, the system answers through the earbuds, creating a seamless, hands-free interaction.
A Different Approach to AI Wearables
Unlike smart glasses, which have faced adoption difficulties due to privacy issues and design limitations, VueBuds take a more subtle approach. The system uses low-resolution, black-and-white cameras embedded in the earbuds to capture still images instead of continuous video.
These images are transmitted via Bluetooth to a connected device, where a small AI model processes them locally. This on-device processing ensures that data does not need to be sent to the cloud, addressing a major concern with wearable cameras.
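To make the described pipeline concrete, here is a minimal sketch of a capture-and-query loop under that design: a single grayscale still is pulled over Bluetooth and handed to a small model on the paired device. Every name and value here (capture_still, run_local_model, the 240x240 resolution) is a hypothetical stand-in, not the team's published code.

```python
# Illustrative sketch of a capture -> transmit -> local-inference loop.
# All names and values are assumptions; the VueBuds code is not public.

from dataclasses import dataclass

@dataclass
class Frame:
    pixels: bytes   # raw 8-bit grayscale pixel data
    width: int
    height: int

def capture_still(earbud: str, width: int = 240, height: int = 240) -> Frame:
    """Stand-in for requesting one low-resolution grayscale still over Bluetooth."""
    return Frame(pixels=bytes(width * height), width=width, height=height)

def run_local_model(frame: Frame, question: str) -> str:
    """Stand-in for a small vision-language model on the paired device.
    Running inference locally means the image never leaves the user's hardware."""
    return f"(answer about a {frame.width}x{frame.height} image: {question!r})"

def handle_query(question: str) -> str:
    frame = capture_still("left")             # one still image, not a video stream
    return run_local_model(frame, question)   # on-device inference, no cloud upload

print(handle_query("Translate the label on this package."))
```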
To further protect privacy, the earbuds include an indicator light that signals when the camera is active, and users can quickly delete captured images.
Engineering Around Power and Performance Limits
One of the biggest challenges the research team faced was power consumption. Cameras draw far more power than microphones, ruling out the high-resolution sensors found in smart glasses.
To solve this, the team used a camera the size of a grain of rice that captures low-resolution grayscale images. The approach cuts battery consumption and keeps Bluetooth transmission efficient without compromising responsiveness.
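A quick back-of-the-envelope calculation shows why a low-resolution grayscale still suits a Bluetooth link where a full-color, high-resolution frame would not. The resolution and link rate below are assumptions for illustration, not published VueBuds specifications.

```python
# Back-of-the-envelope: why a low-res grayscale still fits a Bluetooth budget.
# Resolution and link rate are assumed values, not VueBuds specifications.

WIDTH, HEIGHT = 240, 240        # assumed sensor resolution (pixels)
BITS_PER_PIXEL = 8              # 8-bit grayscale, no color channels
LINK_RATE_BPS = 1_000_000       # assumed usable Bluetooth throughput (~1 Mbit/s)

image_bits = WIDTH * HEIGHT * BITS_PER_PIXEL
print(f"Grayscale still: {image_bits / 8 / 1024:.1f} KiB")           # ~56.2 KiB
print(f"Transfer time: {image_bits / LINK_RATE_BPS * 1000:.0f} ms")  # ~461 ms

# For contrast, a 1080p RGB frame like a smart-glasses camera might capture:
color_bits = 1920 * 1080 * 24
print(f"1080p RGB frame: {color_bits / 8 / 1024 / 1024:.1f} MiB")    # ~5.9 MiB
```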
Placement was another important factor. By tilting the cameras slightly outward, the system achieves a combined field of view of between 98 and 108 degrees. Although this leaves a small blind spot for close objects, the researchers found it does not interfere with typical use.
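The reported numbers are consistent with simple camera geometry: tilting each camera outward by an angle t widens combined horizontal coverage by roughly 2t, at the cost of a near-field wedge directly ahead that neither camera sees. The sketch below picks an assumed per-camera field of view, tilt, and ear spacing purely to illustrate the trade-off; none of these values come from the paper.

```python
import math

# Illustrative geometry only: per-camera FOV, tilt, and ear spacing are
# assumptions chosen so combined coverage lands in the reported 98-108 range.

CAMERA_FOV_DEG = 80.0   # assumed horizontal FOV of each tiny camera
TILT_DEG = 14.0         # assumed outward tilt of each earbud camera
EAR_SPACING_M = 0.18    # assumed distance between the two earbuds

# Tilting each camera outward by t widens total coverage by roughly 2*t...
combined_fov = CAMERA_FOV_DEG + 2 * TILT_DEG
print(f"Combined coverage: ~{combined_fov:.0f} degrees")   # ~108

# ...but opens a wedge straight ahead that neither camera covers up close.
# Each view's inner edge sits (FOV/2 - tilt) inward of straight ahead; the
# two views first meet at roughly this distance in front of the face:
inner_edge_deg = CAMERA_FOV_DEG / 2 - TILT_DEG
blind_depth = (EAR_SPACING_M / 2) / math.tan(math.radians(inner_edge_deg))
print(f"Blind spot extends to ~{blind_depth * 100:.0f} cm ahead")  # ~18 cm
```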
The system also stitches the images from both earbuds into a single frame, so the AI model runs once instead of twice. This lets the VueBuds respond in about one second, compared with two seconds when the images are processed separately.
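One way to picture the single-frame trick: paste the two stills side by side so the vision model makes one inference pass instead of two. The sketch below uses NumPy arrays with assumed dimensions; it illustrates the idea, not the team's implementation.

```python
import numpy as np

# Sketch of combining two grayscale stills into one frame for a single
# model call. Shapes are assumed; this is not the published implementation.

def combine_frames(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Concatenate two grayscale images of shape (H, W) into one (H, 2W) frame."""
    assert left.shape == right.shape, "both earbuds send the same resolution"
    return np.concatenate([left, right], axis=1)

left = np.zeros((240, 240), dtype=np.uint8)   # still from the left earbud
right = np.zeros((240, 240), dtype=np.uint8)  # still from the right earbud

frame = combine_frames(left, right)
print(frame.shape)  # (240, 480): one image, one model call, roughly half the latency
```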
Performance Compared to Smart Glasses
In a user study, 74 participants compared the VueBuds with commercial smart glasses such as Meta’s Ray-Ban models. Despite relying on low-resolution grayscale images and a simpler processing pipeline, the VueBuds performed comparably overall.

The results showed that participants preferred the VueBuds for translation tasks, while the smart glasses fared better on calculation tasks. In separate tests, the VueBuds achieved accuracy of around 83–84% for translation and object identification, and up to 93% for identifying book titles and authors.
Why This Matters and What’s Next
The research highlights a potential shift in how AI-powered wearables are designed. By embedding visual intelligence into a device people already use every day, the system sidesteps many of the obstacles that have held back smart glasses.
However, limitations remain. Because its cameras are grayscale, the current system cannot perceive color, and its capabilities are still at an early stage. The team plans to explore adding color sensors and developing specialized AI models for functions such as translation and accessibility support.
The researchers will present their findings at the ACM CHI Conference on Human Factors in Computing Systems in Barcelona, offering a glimpse of a future where everyday devices become quietly intelligent assistants.