Apple is building a speech recognition model that runs entirely on your device as part of the new accessibility features it previewed this week, and one of its first jobs will be captioning the shaky birthday party footage sitting in your camera roll.
That on-device detail is the part that keeps getting buried in the announcements, but it changes what this feature actually means.
The audio from your home videos never leaves your phone. Apple has specifically designed the captioning engine to generate text locally, which means a clip of your kid’s school play gets the same privacy treatment as your banking password.
Where It Actually Works
The scope is wider than you might assume. When iOS 27 lands later this year, the automatic caption feature will cover videos you shoot yourself, clips friends text you, and videos you stream from the web.
That last category is the interesting one because it suggests the system is working at a layer beneath individual apps, captioning video wherever it plays rather than waiting for each app to build its own solution.
Caption appearance can be adjusted either through the playback controls while a video is running or through the Settings app for a permanent setup.
The feature spans iPhone, iPad, Mac, Apple TV, and Vision Pro, all arriving as part of the iOS 27 family of updates, which Apple will formally show off at WWDC 2026 on June 8.
Who Is This For?
Captioning tools have existed for years on professional platforms, but they have always assumed a certain kind of content: edited, uploaded, and intentional.
What Apple is doing here is applying that kind of captioning to the completely unpolished end of the video spectrum.
At launch, the feature will support only English in the United States and Canada, so its reach is limited for now. More languages will presumably follow, though Apple has not committed to a timeline.
The practical audience here is anyone who has ever struggled to follow dialogue in a noisy environment, or handed their phone to someone hard of hearing and felt a little helpless.
Captions for personal videos have always been technically possible. Making them automatic and private on a device most people already own is genuinely new.