Meta’s Ray-Ban Glasses Added AI That Can See What You’re Seeing

“Hey, Meta. Take a look at this and tell me which of these teas is caffeine-free.”

I spoke these words as I wore a pair of Meta Ray-Bans at the tech giant’s New York headquarters, while I stared at a table with four tea packets with their caffeine labels blacked out with a Magic Marker. A little click sound in my ears was followed by Meta’s AI voice telling me that the chamomile tea was likely caffeine-free. It was reading the labels and making judgments using generative AI.

I was demoing a feature that’s rolling out to Meta’s second-generation Ray-Ban glasses starting today, a feature that Meta CEO Mark Zuckerberg had already promised in September when the new glasses were announced. The AI features, which can access Meta’s on-glasses cameras to look at images and interpret them with generative AI, were supposed to launch in 2024. Meta has moved to introduce these features a lot faster than I expected, although the early-access mode is still very much a beta. A new update also adds Bing-powered search to the Ray-Bans, which boosts the glasses’ existing voice-enabled capabilities. Between the two, Meta’s glasses are gaining new abilities fast.

I was pretty wowed by the demo because I had never seen anything like it. I have in parts: Google Lens and other on-phone tools use cameras and AI together already, and Google Glass — a decade ago — had some translation tools. That said, the easy-access way that Meta’s glasses have of invoking AI to identify things in the world around me feels pretty advanced. I’m excited to try it a lot more.

A restaurant sign in Italian, with captions above and below asking an AI assistant to translate. The glasses don’t have a display; they only speak the responses back. But the Meta View phone app saves the photos and AI responses for later.

Meta

Multimodal AI: How it works right now

The feature has limits right now. It can only recognize what you see by taking a photo, which the AI then analyzes. You can hear the shutter snap after making a voice request, and there’s a pause of a few seconds before a response comes in. The voice prompts are also wordy: Every voice request on the Meta glasses needs to start with “Hey, Meta,” and then you need to follow with “Take a look at this” to trigger the photo-taking, immediately followed with whatever you want to request the AI to do. “Hey, Meta, take a look at this and tell me a recipe with these ingredients.” “Hey, Meta, take a look at this and make a funny caption.” “Hey, Meta, take a look at this. What plant is it?”
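The three-part prompt structure described above (wake phrase, photo trigger, then the actual request) can be sketched in a few lines of Python. This is only an illustration of the phrasing pattern, not Meta's implementation: the function and type names (`parse_prompt`, `ParsedPrompt`) are hypothetical, and it assumes the speech has already been transcribed to text.

```python
import re
from typing import NamedTuple, Optional

# Phrases taken from the article; everything else here is a hypothetical sketch.
WAKE_PHRASE = "hey meta"
PHOTO_TRIGGER = "take a look at this"


class ParsedPrompt(NamedTuple):
    takes_photo: bool  # True when the photo trigger phrase was spoken
    request: str       # the words left over after the wake phrase and trigger


def _normalize(transcript: str) -> str:
    """Lowercase, turn punctuation into spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s']", " ", transcript.lower())
    return " ".join(cleaned.split())


def _strip_prefix(text: str, phrase: str) -> Optional[str]:
    """Return text without a leading whole-word phrase, or None if absent."""
    if text == phrase:
        return ""
    if text.startswith(phrase + " "):
        return text[len(phrase) + 1:]
    return None


def parse_prompt(transcript: str) -> Optional[ParsedPrompt]:
    """Split a transcript into its parts; None means no wake phrase was heard."""
    rest = _strip_prefix(_normalize(transcript), WAKE_PHRASE)
    if rest is None:
        return None
    request = _strip_prefix(rest, PHOTO_TRIGGER)
    if request is None:
        # Wake phrase only: an ordinary voice request with no photo.
        return ParsedPrompt(False, rest)
    # Drop a connecting "and" ("...take a look at this and tell me...").
    after_and = _strip_prefix(request, "and")
    return ParsedPrompt(True, request if after_and is None else after_and)


if __name__ == "__main__":
    print(parse_prompt("Hey, Meta, take a look at this. What plant is it?"))
```

The sketch makes the wordiness concrete: every image question pays for both fixed phrases before the request itself begins.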

Every AI response, and the photo it looked at, are stored in the Meta View phone app that pairs with the glasses. I like this, because it’s a visual/written record for later, like memory-jogging notes. I could see wandering somewhere and asking it questions, using this as some form of head-worn Google search for my eyes, while shopping or who knows what.

A photo of grilling, with captions asking an AI assistant for cooking help — I didn’t try Meta’s glasses while cooking — yet.

Meta

It could also have assistive uses. I wore a test pair of Meta glasses that didn’t have my prescription, and I asked it what I was looking at. Answers can vary in detail and accuracy, but it can give a heads-up. It knew I was showing it my glasses, which it said had bluish-tinted lenses (blue-black frame, pretty close).

Sometimes it can hallucinate. I asked the glasses about fruit in a bowl in front of me, and it said there were oranges, bananas, dragonfruit, apples and pomegranates. It was correct, except for the pomegranates. (There were none of those.) I was asked to have it make a caption for a big stuffed panda in front of a window. It made some cute ones, but one was about someone being lonely and looking at a phone, which didn’t match.

I looked at a menu in Spanish and asked the glasses to show me spicy dishes. It read off some dishes and translated some key ingredients for me, but I asked again about dishes with meat and it read everything back in Spanish.

The possibilities here are wild and fascinating, and possibly incredibly useful. Meta admits that this early launch will be about discovering bugs and helping evolve the way the on-glasses AI works. I found there were too many “Hey, Meta, look at this” moments. But that process might change, who knows. When engaged in immediate image analysis, asking direct follow-up questions can work without saying “Look at this” again, but I’m sure my success will vary.

A hand pointing to a mountain, with bubbles asking AI to help caption a photo — When will the captions be helpful and when will they hallucinate, though?

Meta

The future of wearable AI is getting interesting

This AI, which Meta calls “multimodal AI” because it uses cameras and voice chat together, is a precursor to future AI that the company plans to feed with many forms of input, including more sensory data. Qualcomm’s AI-focused chipset on Meta’s new Ray-Bans already seems ready to take on more. It’s also a process that Meta plans to make more seamless over time.

Meta CTO Andrew Bosworth told me in September that while the glasses now need a voice prompt to activate and “see” so that they don’t burn through battery life, eventually they’ll “have sensors that are low power enough that they’re able to detect an event that triggers an awareness that triggers the AI. That’s really the dream we’re working towards.” Meta is also already researching AI tools that blend multiple forms of sensory data together, ahead of more advanced future wearables.

Right now, know that it’s an early-access beta. Meta is using anonymized query data to help improve its AI services during the early access phase, which may concern people wanting more privacy. I don’t know the specific opt-in details yet, but more discrete controls over sharing data look like they may be in place once the final AI features launch, likely next year.

It all reminds me of exactly what Humane is aiming for with its wearable AI Pin, a device I haven’t even seen in person yet. While Humane’s product is expensive and needs to be worn on clothing, Meta’s glasses are $300 and are already on store shelves. As watches, VR headsets and smart glasses all evolve their AI capabilities, things could get very different for the future of wearable tech and its level of assistive awareness.

It’s becoming clear that a new frontier of wearable AI products is already underway, and Meta’s glasses are getting here first.

Editors’ note: CNET is using an AI engine to help create some stories. For more, see this post.

Eugen Boglaru

Eugen Boglaru is an AI aficionado covering the fascinating and rapidly advancing field of Artificial Intelligence. From machine learning breakthroughs to ethical considerations, Eugen provides readers with a deep dive into the world of AI, demystifying complex concepts and exploring the transformative impact of intelligent technologies.
