The idea of an AI handing out letter grades for your heart health sounds like science fiction.
But that’s exactly what happened when a Washington Post reporter, Geoffrey Fowler, connected ChatGPT Health to ten years of Apple Watch data, and the results were wildly inconsistent and alarmingly confident.
Fowler’s experience is entertaining in a “wait, really?” kind of way, but it also exposes the limitations of AI masquerading as a health advisor.
His doctor dismissed the assessment outright, pointing out his cardiac risk was so low that insurance wouldn’t even cover extra testing to disprove the bot’s findings.
Cardiologist Eric Topol called it “baseless.” Yet the AI didn’t stop there. On repeated queries, Fowler’s score bounced from an F to a B. The bot even forgot basic facts, such as his age and gender, despite having full access to his records.
ChatGPT Health treats fuzzy metrics like VO2 max and heart-rate variability as gospel. Anyone who has used an Apple Watch knows these numbers are estimates.
Swings in readings are common when devices are upgraded or recalibrated. The AI doesn’t contextualize that. It just spits out grades with all the confidence of a doctor, which can make users anxious or give a false sense of security.
Sleep scores get the same treatment. A late night can tank your Apple Watch rating, and ChatGPT will amplify that with an authoritative stamp of approval or failure.
The privacy implications are worth noting. OpenAI promises encryption and says data isn’t used for training, but ChatGPT Health isn’t HIPAA-covered. You’re sharing highly personal information with a company whose primary mission isn’t healthcare.
Fowler’s experiment shows the risk is more than theoretical. Anthropic’s rival chatbot, Claude, didn’t fare much better: it graded Fowler’s heart health a C while ignoring the same nuances in the data.
This is the kind of hype Apple users should be skeptical of. AI-driven personal health insights make for an amazing pitch, but right now the technology is still experimental.
Tools like ChatGPT and Claude can produce neat visualizations and identify general trends, but turning that into a meaningful health grade is premature.
And with the FDA saying it will “get out of the way” to promote innovation, there’s little regulatory pushback. That makes the stakes higher for anyone who trusts a bot over a doctor.
Fowler’s test is a cautionary tale. Your Apple Watch can track activity and trends reliably over time, but letting an AI assign letter grades to your heart health is not ready for prime time.
These bots are fascinating and hint at the future of digital health, but right now, they are best treated as experimental tools.
Users should enjoy the curiosity factor, not make decisions about their health based on a fluctuating grade from a machine that can’t even remember their age.
Doctors make misdiagnoses too, sometimes fatal ones, but they at least work from context and clinical judgment that these bots plainly lack.