One transcription product that relies on an AI model deletes the original audio, leaving doctors no way to check the transcriptions.
An AI-powered transcription tool widely used in the medical field has been found to hallucinate text, posing potential risks to patient safety, according to a recent academic study.
And that tool is being used in a commercial medical transcription product that, worryingly, deletes the underlying audio from which transcriptions are generated, leaving medical staff no way to verify their accuracy, AP News reported on Saturday.
OpenAI’s Whisper, the underlying AI tool, is integrated into medical transcription services from Nabla, which the company says are used by over 30,000 clinicians at more than 70 organizations. Nabla told AP its product had been used to transcribe around 7 million medical visits.
Whisper is also embedded in Microsoft’s and Oracle’s cloud computing platforms and integrated with certain versions of ChatGPT. Despite its wide adoption, researchers are now raising serious concerns about its accuracy.
Researchers from Cornell University, the University of Washington, and others found in a study that Whisper “hallucinated” in about 1.4% of its transcriptions, sometimes inventing entire sentences, nonsensical phrases, or even dangerous content, including violent and racially charged remarks.
The study, Careless Whisper: Speech-to-Text Hallucination Harms, found that Whisper often inserted phrases during moments of silence in medical conversations, particularly when transcribing patients with aphasia, a condition that affects language and speech patterns.
In these cases, the AI sometimes fabricated unrelated phrases, such as “Thank you for watching!” — likely due to its training on a large dataset of YouTube videos. In more concerning instances, it invented fictional medications like “hyperactivated antibiotics” and even injected racial commentary into transcripts, AP reported.
For example, Whisper correctly transcribed a speaker’s reference to “two other girls and one lady” but added “which were Black,” despite no such racial context in the original conversation.
Whisper is not the only AI model that generates such errors. In a separate study, researchers found that AI models used to help programmers were also prone to hallucinations.
Harmful hallucinations
Whisper’s errors are a […]
Patients may suffer from hallucinations of AI medical transcription tools