How voice recognition technology has evolved

During the Nuance Healthcare Partnership Event in Berlin, senior director of healthcare solutions, marketing, Jonathon Dreyer gave a presentation on the innovation behind the Dragon voice recognition software. He explained to Ian Bolland afterwards how the technology has evolved over the course of two decades.

At the heart of Dreyer’s presentation was a video of a live demonstration showing how AI-powered voice recognition software works in a doctor-patient consultation.

This involved the doctor addressing a patient who had a form of arthritis. The notes from the exchange were compiled as the consultation took place – and categorised accordingly – so there was health history, the diagnosis from the consultation and any medication. It was noticeable how the medical professional in the video knew how to ask follow-up questions that would prompt the technology to assign certain terms to the relevant part of the document.

Summarising the demonstration, Dreyer said: “You didn’t see the physician distracted by technology, you didn’t see him take his attention off the patient or turn his back to use the mouse or a keyboard. You did see this entirely new, enhanced patient-physician interaction where it put the conversation at the forefront, where information retrieval was simplified, and clinical documentation wrote itself.”

Speaking to Digital Health Age, Dreyer gave a brief history of how the technology has evolved, while suggesting that the next stage of development is so-called Ambient Clinical Intelligence.

“It depends on the specialty areas. Radiology was the mid-90s with speech recognition. The technology had shown promise especially in very specific medical specialties; radiology was an area that was pretty big. Small, finite vocabulary of use so the technology had pretty good recognition even in the early days because of that small library of words.

“It was really the evolution from transcription into speech recognition and in that transition period you had the speech recognition technology helping the transcriptionist. For example, what was a workflow of the past was that the physician would dictate into a telephone or a handheld device, get their audio recording, it would go to a transcriptionist who would type up their report.

“So, they’re straight dictating their note, someone is typing it up. Over time the speech recognition technology played a role for the physician dictating directly into a computer, or now into mobile devices, but it also played a role to create draft notes for the transcriptionist.”

Providing the technology is one challenge, but the more important one is for there to be sufficient uptake for it to have an impact on individuals and the healthcare workplace as a whole. So, what have Nuance seen in terms of uptake?

“When we rolled out the cloud-based version, Dragon Medical One, in 2016 in the US first, we looked at the first year of usage for the entire physician population which now are upwards of a couple of hundred thousand users on the latest version of the technology.

“We were tracking and measuring different metrics to see adoption usage and just how people in the care team are doing documentation. But we looked at how much they are using the system, how much effort they are putting into creating their notes.”

Dreyer noted the difficulty of measuring accuracy rates because physicians may want to change the wording of their notes – like anyone would do with a work email.

He added: “So, we track a change rate and change ratio that looks at how much effort goes into creating documentation. If you look at something like that over time it gives you an opportunity to see if someone is struggling with the technology. Are they getting good recognition results? Is that going up or going down? Which direction is that trending?”

Nuance measured the change rate, usage and how much physicians using the technology were documenting per month. Dreyer suggested there were promising trends with usage reportedly up 23%. He feels this indicates the medical professionals were having positive experiences.

Furthermore, they saw a decline in change rates, dropping from 7% to 4% – indicating better accuracy and only minor changes to detail.
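Dreyer did not spell out how the change rate is calculated. As a rough illustration only, one way to measure it is to compare the recognised draft with the note the physician finally signs off and count the share of draft words that did not survive. The function name `change_rate` and the word-level definition below are assumptions made for this sketch, not Nuance's method.

```python
from difflib import SequenceMatcher


def change_rate(recognised: str, final: str) -> float:
    """Share of words in the recognised draft that were edited before sign-off.

    Hypothetical definition for illustration: 1 minus the fraction of
    draft words that appear unchanged, in order, in the final note.
    """
    draft_words = recognised.split()
    final_words = final.split()
    if not draft_words:
        return 0.0  # nothing was recognised, so nothing could be changed

    # Word-level alignment between the draft and the final note.
    matcher = SequenceMatcher(a=draft_words, b=final_words, autojunk=False)
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return 1.0 - kept / len(draft_words)


draft = "patient reports pain in the left knee"
signed = "patient reports pain in the right knee"
print(f"{change_rate(draft, signed):.1%}")  # one of seven draft words was changed
```

Averaged over a clinician's notes month by month, a figure like this would show whether the editing effort is trending up or down, which is the kind of signal Dreyer describes.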

“All of those things factored in you get a really high adoption rate because the technology works, it’s responsive.

“At the end of the day if you have something that works people will be drawn to it.”

Given how readily the technology transfers from industry to industry, further developments should be expected. One aspect mentioned in Dreyer’s presentation was the potential for truly conversational AI solutions to be used in the motor industry, for instance combining voice recognition with gaze and emotion recognition to detect whether a driver is feeling dizzy or unwell in their vehicle. Or in healthcare applications, where it can detect if you are depressed, for instance to automate clinical trials or give instant feedback to doctors and nurses.