Machine Learning for Medical Transcription – Current Applications

December 20, 2018

The Current State of Machine Learning for Medical Transcription

There are several companies claiming to offer AI-based medical transcription software, specifically speech recognition software, to hospitals and healthcare companies. We found that these solutions are intended to help hospitals and healthcare companies with medical transcription in different forms, transcribing speech into text in order to fill out and update patient medical records in electronic health record (EHR) and electronic medical record (EMR) databases.

Machine Learning for Medical Transcription – Insights Up Front

With the exception of Nuance Communications, we could not find evidence that the prominent companies offering speech recognition software for medical transcription employ the kind of AI talent we would expect. It isn’t clear how exactly their solutions could work without natural language processing, a kind of artificial intelligence. Nuance employs many data scientists with PhDs and Master’s degrees in computer science and hard sciences like physics. This is generally what we look for when vetting a company’s claims of leveraging artificial intelligence and cutting through the marketing hype we so often see on AI vendor websites.

That said, Nuance Communications offers natural language processing software to a variety of industries, not just healthcare companies. This makes them stand out from the other companies listed in this report. Ideally, a company sells into one or two niches, tailoring their software to specific use cases.

Natural language processing algorithms often require specificity in the way they’re trained. If a machine learning model built for a voice recognition system is fed only audio data from people with Boston accents, for example, the voice recognition system might have trouble picking up commands when they’re said by someone with a different accent. We explain this in further detail in our report on Crowdsourced Natural Language or Speech Training – Use Cases and Explanation.

Similarly, natural language processing models need to be trained on specific words and phrases, technical terms, and jargon. Machine learning engineers looking to build a voice recognition system for use in hospital and clinical settings will likely need to recruit subject-matter experts to provide audio data for the algorithm that involves jargon and commonly used phrases in those settings. In addition, these subject-matter experts would be required to correct the software when it transcribes the jargon incorrectly or fails to transcribe it altogether, and to feed those corrections back into the natural language processing algorithm. Only in doing this can the machine learning model behind the voice recognition system “learn” to transcribe medical jargon.
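To make the correction loop described above more concrete, here is a minimal Python sketch of how a machine transcript might be scored against a subject-matter expert’s correction using word error rate (WER), a standard speech recognition metric. The function names, the sample sentences, and the idea of queuing corrected pairs for retraining are our own illustrative assumptions; this is not how any vendor in this report is known to implement its system.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between the reference words seen so far
    # (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # substitution or exact match
        prev = curr
    return prev[-1] / len(ref)


def collect_corrections(pairs, threshold=0.0):
    """Keep (machine, expert) pairs whose WER exceeds the threshold.

    Hypothetical helper: the kept pairs stand in for the corrections that
    experts would feed back into training.
    """
    kept = []
    for machine, expert in pairs:
        wer = word_error_rate(expert, machine)
        if wer > threshold:
            kept.append({"machine": machine, "expert": expert, "wer": wer})
    return kept


# Invented example: the system splits two medical terms into common words.
machine_text = "patient reports a cute dyspnea and takes met form in daily"
expert_text = "patient reports acute dyspnea and takes metformin daily"

print(word_error_rate(expert_text, machine_text))  # 0.625
```

In this invented example, two mangled terms account for five word-level edits against an eight-word reference, which illustrates why jargon-heavy speech needs expert correction before a general-purpose model becomes usable in a clinic.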

What this means is that training voice recognition systems is both resource- and time-intensive, making it difficult for a company to offer such software across a variety of industries. This isn’t to say that Nuance’s voice recognition system is worse than other systems from companies that focus on the healthcare domain. In fact, the talent at the company, $2 billion in revenue, and 8,000 employees likely mean that they have the resources to train and update their voice recognition systems. Nuance is a rare case, however. Startups can’t often focus on more than one niche when it comes to building machine learning models, and business leaders should be skeptical of AI vendors claiming to offer robust machine learning software for more than three verticals.

Suki AI does employ a Senior Machine Learning Engineer who seems to have business experience in machine learning, but we were unable to find employees with similar credentials on the company’s LinkedIn page. Suki AI’s CTO used to work at Salesforce, which is certainly doing AI with its Einstein product, but he does not seem to have worked on Einstein. He was also Vice President of Big Data Services at Oracle from 2014 to 2016. In addition, he holds a Master’s in Engineering from Carnegie Mellon, which boasts one of the most renowned machine learning programs in the world.

As such, it seems as though AI’s medical transcription use case is relatively nascent. Many of the companies listed in this report offer transcription services that do not use AI and are not done in real time. Users can upload audio involving medical jargon, and employees at the companies who understand the technical jargon can transcribe the audio into text. Speech recognition software would theoretically allow this in real time, but again, it seems like such a use case is relatively untested given the state of talent at these companies.

Nuance and in some ways Suki AI seem the most likely to offer machine learning-based speech recognition software. Business leaders in healthcare may try working with these companies, but they may also want to wait for the use case to gain a little more traction in the coming years before spending money on transcription software.

The ML Medical Transcription Vendor Landscape

Nuance Communications

Nuance Communications offers software called Dragon Medical One, which it claims can help doctors and healthcare providers transcribe a doctor’s speech into an EHR using natural language processing. It’s likely that healthcare providers can integrate the software into an existing EHR system.

We can infer the machine learning model behind the software was trained first on requests to transcribe speech with word processing, in various accents, inflections, and pitches. The machine learning algorithm behind the voice recognition system would then transcribe those speech requests into text. Specialists that understand the field would then correct the transcription and upload the edited transcription into the machine learning algorithm again. This would have trained the algorithm to recognize medical jargon and better be able to transcribe what doctors and patients are saying when they discuss symptoms and treatments.

Below is a short 3-minute video demonstrating how Dragon Medical One works:

Nuance Communications claims to have helped Nebraska Medicine improve the efficiency of their healthcare providers when they updated patient medical records and wrote up documents. Nebraska Medicine integrated Nuance Communications’ software into its EHR. Prior to integrating the software, Nebraska Medicine was paying for transcription services in order to transcribe doctor audio notes. According to the case study, Nebraska Medicine saw a 23% decrease in transcription costs. In response to a survey of all the company’s physicians, “71% stated that the quality of their documentation has improved, and 50% stated that they have saved at least 30 minutes every day with Dragon Medical.” Nuance Communications also lists Allina Health and Baptist Health South Florida as some of their past clients.

Joe Petro is CTO at Nuance Communications. He holds an MSME in Computer Aided Engineering from Kettering University. Previously, Petro served as SVP of Research and Development at Eclipsys.

Suki.ai

Suki.ai offers a namesake software, which it claims can help doctors and healthcare providers take notes during appointments and update electronic health records with their voice using natural language processing. Suki.ai claims hospital staff can integrate the software into their existing EHR databases.

We can infer the machine learning model behind the software was trained on hundreds of thousands of relevant speech snippets. These snippets would be from recordings of conversations between a doctor and patient, updating the patient’s EHR during their appointment, and making small notes or reminders in various accents and inflections from various types of people. The machine learning algorithm behind the voice recognition system would then transcribe those speech requests into text. Human editors would then correct the transcription and feed the edited text back into the machine learning algorithm. This would have trained the algorithm to recognize and correctly transcribe these speech snippets.

That said, we could not find a demonstration video showing how Suki.ai’s software works specifically. In addition, Suki.ai does not make available any case studies reporting success with their software, but they do list OrthoAtlanta and Piedmont Fayette Hospital as some of their past clients.

Karthik Rajan is CTO at Suki.ai. He holds a Master’s Degree in Engineering from Carnegie Mellon University. Previously, Rajan served as Vice President of Big Data Services at Oracle.

M*Modal

M*Modal offers software called Fluency Direct for Transcription, which it claims can help doctors and healthcare providers transcribe what is said during patient visits into EHR documents using natural language processing. M*Modal claims healthcare providers can integrate the software into their existing EHR databases, such as Epic, Cerner, and Athenahealth.

We can infer the machine learning model behind the software was trained first on hundreds of thousands of relevant speech snippets, such as a request to begin transcribing what a doctor says into an EHR or to navigate the EHR dashboard. These speech snippets would be in various accents and inflections from various types of people to ensure that the software could be used in a variety of places. The machine learning algorithm behind the voice recognition system would then transcribe those speech requests into text. Human editors would then correct the transcription and feed the edited text back into the machine learning algorithm. This would have trained the algorithm to recognize and correctly transcribe speech as a doctor and patient would say it in a clinical setting.

Below is a short 3-minute video demonstrating how Fluency Direct works:

M*Modal claims to have helped Floyd Memorial Hospital increase the number of medical documents they wrote up per day. Floyd Memorial Hospital integrated M*Modal’s software into its EHR system. According to the case study, Floyd Memorial Hospital saw their total lines of documentation rise from an estimated 113,000-124,000 to 235,000 per pay period as a result. M*Modal also lists Flagler Hospital and Thomas Jefferson University Hospital as some of their past clients.
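To put the Floyd Memorial figures in relative terms, the short Python sketch below converts the reported line counts into a percentage increase. The numbers are taken as reported in the case study, and because the starting figure is an estimated range, the result is a range as well. The function name is our own.

```python
def percent_increase(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100


# Lines of documentation per pay period, as reported in the case study.
BEFORE_LOW, BEFORE_HIGH = 113_000, 124_000
AFTER = 235_000

# The smaller increase corresponds to the higher starting estimate.
smallest = percent_increase(BEFORE_HIGH, AFTER)
largest = percent_increase(BEFORE_LOW, AFTER)

print(f"{smallest:.1f}% to {largest:.1f}%")  # 89.5% to 108.0%
```

In other words, if the reported figures hold, documentation volume roughly doubled. The case study does not say how much of that change is attributable to the software alone.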

Detlef Koll is CTO at M*Modal. He holds an MS in computer science from the University of Karlsruhe. Previously, Koll served as Director of Speech Research at Lernout and Hauspie Speech Products.

Medical Transcription Billing, Corp. (MTBC)

Medical Transcription Billing, Corp. (MTBC) offers software called TalkEHR, which comes with a voice assistant feature called Allison. The company claims the software helps healthcare providers and doctors transcribe notes for EHR databases and navigate the EHR dashboard using natural language processing. MTBC claims healthcare providers can integrate the software into their database where electronic health records are stored.

The company states the machine learning model behind the software was trained first on hundreds of thousands of speech requests that a doctor might make, such as to start recording, to transcribe speech that might include medical jargon, or to fill in a predetermined set of information in lieu of a doctor repeating themselves. For example, a doctor could verbally request the Allison assistant to refill a patient’s prescription, and the software would fill in the form with the same information as the last time the prescription was filled.

These requests would be in various accents and inflections from various types of people, allowing the voice recognition system to see use in various English-speaking parts of the world. The machine learning algorithm behind the voice recognition system would then transcribe those speech requests into text. In order to further train the algorithm, subject-matter experts would edit the transcription to make sure all of the medical jargon was transcribed correctly. They would then upload the corrected version back into the algorithm so that the software could in the future correctly transcribe the jargon.
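The prescription refill example above can be sketched in a few lines of Python. Everything here is hypothetical: the `Prescription` record, its fields, and the `draft_refill` function are our own inventions for illustration and do not reflect TalkEHR’s or Allison’s actual data model or API. The sketch assumes speech recognition has already turned the doctor’s request into a patient identifier and a drug name.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class Prescription:
    """Hypothetical record of one filled prescription."""
    patient_id: str
    drug: str
    dose: str
    quantity: int
    filled_on: date


def draft_refill(history: Sequence[Prescription], patient_id: str,
                 drug: str, today: date) -> Optional[Prescription]:
    """Copy the most recent matching fill into a new draft dated `today`.

    Returns None when the patient has no earlier fill of that drug, in
    which case the doctor would have to dictate the details in full.
    """
    matches = [p for p in history
               if p.patient_id == patient_id and p.drug.lower() == drug.lower()]
    if not matches:
        return None
    latest = max(matches, key=lambda p: p.filled_on)
    return replace(latest, filled_on=today)


# Invented sample history for a single patient.
history = [
    Prescription("pt-001", "Lisinopril", "10 mg", 30, date(2018, 9, 1)),
    Prescription("pt-001", "Lisinopril", "20 mg", 30, date(2018, 11, 1)),
]

draft = draft_refill(history, "pt-001", "lisinopril", date(2018, 12, 20))
print(draft)
```

The draft reuses the dose and quantity from the latest fill, which is the time saving the paragraph above describes. A real system would presumably still require the doctor to review and sign the form.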

We were unable to find a demonstration video showing how TalkEHR and Allison work, and MTBC does not make available any case studies reporting success with their software. MTBC does list Metropolitan Foot and Ankle Specialists as one of their past clients, however.

Felix Cirillo is CTO at MTBC. He holds a Master’s Degree in Business Administration and Information Technology from Dowling College. Previously, Cirillo served as CIO at Advantedge Healthcare Solutions.

Header Image Credit: Compass Professional Health Services
