Artificial Intelligence for Medical Billing and Coding

Raghav Bharadwaj

Raghav serves as Analyst at Emerj, covering AI trends across major industry updates and conducting qualitative and quantitative research. He previously worked for Frost & Sullivan and Infiniti Research.

Medical billing and coding is an integral component of healthcare. The medical billing outsourcing market alone is projected to reach $16.9 billion by 2021. The coding and billing process translates patient record information into standard codes, which are used for billing patients and third-party payers such as Medicare and insurance companies.

However, coding accuracy is an ongoing challenge. According to the Centers for Medicare & Medicaid Services (CMS), errors resulted in $36.21 billion in improper payments in FY2017. For example, one major category to which CMS attributes these errors is “insufficient documentation for home health claims.”

Home healthcare is defined as medical services provided by health care professionals in the home of a patient. Medicare coverage requires physician certification, and improper payment errors occur when that documentation is missing or incomplete.

In an effort to improve the accuracy and efficiency of the billing and coding process, companies are testing the possibilities of AI applications.

Answering the following questions helps clarify the role of artificial intelligence in efforts to improve medical billing and coding:

  • What types of AI applications are emerging to improve medical billing and coding?
  • How is the healthcare market responding to these AI applications?

Medical Billing and Coding AI Applications Overview

The majority of AI use-cases and emerging applications for medical billing and coding appear to fall into the category of Computer Assisted Coding. Specifically, companies are using machine learning and Natural Language Processing (NLP) to automatically recognize and extract data from medical documents for proper coding and billing.

In the article below, we present four representative examples as well as the current progress of each. We have organized the companies using a set of 7 quantifiable factors (e.g. funds raised and target user) that we thought would be of particular interest to readers.

The companies are ranked based on estimated company revenue or total funds raised depending on the company type (e.g. private vs. startup). Please note that the ranking strictly serves as a method of understanding the level of traction these medical billing and coding applications are gaining in the healthcare industry.

We’ll conclude by discussing the potential value and future implications of these applications.


3M

    • Estimated Company Revenue: $31.7 billion
    • Year founded: 1902
    • HQ location: St. Paul, Minnesota
    • Number of employees: 54,000+
    • Target user: Hospital
    • Type(s) of data processed: Text
    • Estimated number of current users: 1,700 Hospitals

3M claims that it utilizes NLP to deliver automated medical coding through its 360 Encompass™ Professional System.

Trained on large datasets of medical terminology, the company’s web-based application automates the coding process by analyzing physician documentation in the text of clinical records and automatically identifying the proper billing codes.

The company reports that more than 1,700 hospitals nationwide currently use its software to streamline their workflows and improve coding accuracy and revenue potential. In a compilation of 8 brief case studies, the company describes its impact on healthcare system clients.

In one example, after integrating 3M’s coding software into their operational workflow, a hospital achieved 98 percent coding accuracy. However, it is not stated what the accuracy rate was prior to using the software. Additionally, no specific names of hospitals are provided.

In the 1:16-minute video below, Richard Wolniewicz of 3M’s Technology Leadership team discusses the relevance of NLP for coder productivity:

In the 1:55-minute client testimonial below, Brenda Nott, Coding and Operations Manager for Washington University, St. Louis, provides details on the impact of 3M’s software on hospital operations.

According to 3M’s LinkedIn page, over 54,000 professionals are currently associated with the company, including data scientists. For example, Brian Stankiewicz, PhD, a principal data scientist in 3M’s healthcare division, lists machine learning and computer vision among his specialties.


A2iA

    • Estimated Company Revenue: $14.8 million
    • Year founded: 1991
    • HQ location(s): Paris, France/New York, NY
    • Number of employees: 79
    • Target user: Hospital
    • Type(s) of data processed: Text, Print and Cursive handwriting
    • Estimated number of current users: Unspecified

A2iA claims that AI drives its automated text extraction and document classification software products. The company claims that it has used automation and data analysis since its inception; however, it’s unclear to what degree AI was implemented in A2iA’s initial products.

Trained on large amounts of medical keywords and phrases, the system learns to recognize them in various document formats and then can feed them into a computer-assisted coding system. For example, the company’s RIMES Database is composed of 12,723 documents from 1,300 volunteers, corresponding to 5,605 handwritten letters.

Documents are scanned as image files and categories include medical records, physician notes and billing documents. For example, the company markets its a2ia DocumentReader™ platform as a complement to NLP coding software by expanding data extraction capabilities.

Specifically, the software automatically identifies keywords or phrases in all types of documents, including those containing machine print, handprint and cursive handwriting.

In a case study, A2iA claims it helped a healthcare company extract data from over 15 categories of medical documents and classify over 1.2 million medical documents on a daily basis. It is not specified how the software may have contributed to cost savings for the client.

The 1:44 video below provides a walkthrough of the key features of the a2ia DocumentReader software:

According to A2iA’s LinkedIn page, 79 professionals are currently associated with the company, including data scientists. For example, Christopher Kermorvant, PhD, is an R&D manager who lists machine learning and deep learning among his specialties.


Artificial Medical Intelligence (AMI)

    • Estimated Company Revenue: $1.6 million
    • Year founded: 2002
    • HQ location: Eatontown, New Jersey
    • Number of employees: 11
    • Target user: Hospital
    • Type(s) of data processed: Text
    • Estimated number of current users: Unspecified

Artificial Medical Intelligence, Inc. (AMI) claims that its software platform, EMscribe®, integrates Natural Language Processing to automate medical coding.

The company’s solution is trained on high volumes of medical terminology and contextual data relevant to the coding process. The software analyzes healthcare documents and then generates medical codes by recognizing specific terms and phrases within the document.

For example, to assist clinicians in tracking and managing patient medications, EMscribe can abstract details from patient records such as the name of a drug, its dosage and its form. The text can be abstracted from multiple file formats, including a PDF file or Excel spreadsheet.
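To make the idea of abstracting a drug’s name, dosage and form more concrete, here is a minimal sketch using a regular expression. It is not EMscribe’s method, which the company does not describe in detail; the pattern, the list of units and forms, and the names `Medication` and `extract_medications` are all assumptions made for illustration, and a production NLP system would handle far more variation.

```python
import re
from typing import NamedTuple, Optional


class Medication(NamedTuple):
    name: str
    dosage: str
    form: Optional[str]


# Hypothetical pattern: "<drug name> <number> <unit> [form]".
# The unit and form lists are illustrative, not exhaustive.
MED_PATTERN = re.compile(
    r"\b(?P<name>[A-Za-z]+)\s+"
    r"(?P<dosage>\d+(?:\.\d+)?\s?(?:mg|mcg|g|mL|units))\b"
    r"(?:\s+(?P<form>tablet|capsule|solution|injection|cream)s?)?",
    re.IGNORECASE,
)


def extract_medications(text: str) -> list[Medication]:
    """Return every name/dosage/form triple the pattern finds in the text."""
    return [
        Medication(m.group("name"), m.group("dosage"), m.group("form"))
        for m in MED_PATTERN.finditer(text)
    ]


note = "Patient takes Metformin 500 mg tablet twice daily and Lisinopril 10 mg."
for med in extract_medications(note):
    print(med)
```

A rule like this only works when the note follows the expected word order, which is one reason vendors in this space describe training on large volumes of terminology rather than relying on hand-written patterns.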

In a case study, the company reports that it assisted Robert Wood Johnson University Hospital in reducing coding time from 60-90 seconds (manual) to 10 seconds. This translates into 44 hours saved per week, or one FTE (full-time equivalent) per year. As the system learned the hospital’s unique operational needs, the processing time was further reduced from 10 seconds to 0.5 seconds.
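The case study figures can be sanity-checked with simple arithmetic. The sketch below relates per-record time savings to weekly hours saved and works backward to the weekly record volume the 44-hour figure would imply. That volume is our inference, not a number reported in the case study, and the function names are hypothetical.

```python
SECONDS_PER_HOUR = 3600


def weekly_hours_saved(records_per_week: float,
                       manual_seconds: float,
                       automated_seconds: float) -> float:
    """Hours saved per week when each record is coded faster."""
    return records_per_week * (manual_seconds - automated_seconds) / SECONDS_PER_HOUR


def implied_weekly_volume(hours_saved: float,
                          manual_seconds: float,
                          automated_seconds: float) -> float:
    """Records per week needed to reach a given number of hours saved."""
    return hours_saved * SECONDS_PER_HOUR / (manual_seconds - automated_seconds)


# Reported figures: 60-90 seconds manually, 10 seconds automated, 44 hours saved.
for manual in (60, 90):
    volume = implied_weekly_volume(44, manual, 10)
    print(f"manual={manual}s -> about {volume:.0f} records per week")
```

Under these assumptions, the reported savings would correspond to somewhere between roughly 1,980 and 3,168 coded records per week.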

The 13-minute in-depth training demo below walks viewers through the features and process of running the EMscribe Computer Assisted Coding software in the clinical setting.

According to the company’s LinkedIn page, currently 11 professionals are associated with AMI including a number of software engineers. While no data scientists are listed, it is important to note that LinkedIn staff listings may not encompass the total employee roster.  


Pulse8

    • Total funds raised: $1.4 million
    • Year founded: 2012
    • HQ location: Annapolis, MD
    • Number of employees: 66
    • Target user: Hospital
    • Type(s) of data processed: Text
    • Estimated number of current users: Unspecified

Healthcare analytics company Pulse8 claims that its Popul8 coding technology integrates machine learning and NLP to optimize client workflow.

The platform’s algorithms are trained on large volumes of data, learn to recognize specific text, and then match it with corresponding codes. The Popul8 coding engine first performs a thorough analysis to identify all relevant medical conditions from patient charts and physician notes. NLP is leveraged to minimize false positives and to maximize detection accuracy.
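One common source of false positives in this kind of matching is negation: a note that says a patient “denies asthma” should not produce an asthma code. The sketch below shows a deliberately simple, rule-based way to handle that. It is not Pulse8’s algorithm, which is not public; the term list, the negation cues and the function name `suggest_codes` are assumptions for illustration, and the ICD-10-CM codes are included only as examples.

```python
import re

# Illustrative term-to-code table (ICD-10-CM codes used as examples only).
TERM_TO_CODE = {
    "type 2 diabetes": "E11.9",
    "hypertension": "I10",
    "asthma": "J45.909",
}

# Hypothetical negation cues; real systems use much richer rules or models.
NEGATION = re.compile(r"\b(?:no|denies|without|negative for)\b")


def suggest_codes(note: str) -> list[str]:
    """Suggest codes for terms that are mentioned and not negated."""
    codes: list[str] = []
    # Split into sentences so a negation only affects its own sentence.
    for sentence in re.split(r"(?<=[.!?])\s+", note.lower()):
        for term, code in TERM_TO_CODE.items():
            idx = sentence.find(term)
            if idx == -1:
                continue
            # Skip the term if a negation cue appears earlier in the sentence.
            if NEGATION.search(sentence[:idx]):
                continue
            if code not in codes:
                codes.append(code)
    return codes


print(suggest_codes("Patient has type 2 diabetes and hypertension. Denies asthma."))
```

Plain keyword matching would suggest all three codes for this note. The negation check drops the asthma code, at the cost of occasionally skipping a valid term that follows an unrelated negation in the same sentence.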

In an effort to provide an additional accuracy checkpoint, the coding engine pre-populates charts to give clients a head start on the coding process, helping to streamline their workflow.

While the company claims that its platform can increase productivity by 40 percent, there is currently no evidence on the company’s website to substantiate this specific claim.

Pulse8’s LinkedIn page lists 66 professionals associated with the company, including a number of healthcare data analysts and software engineers. The company’s chief data scientist, Scott Stratton, leads Pulse8’s product development strategies and provides predictive modeling and data mining expertise.

Concluding Thoughts

Medical billing and coding is a core component of how healthcare is delivered and received in the U.S. The risks of inaccurate billing are still a challenge in this field, and the large amount of data involved is prime territory for AI applications.

To provide additional context, the 10th revision of the International Classification of Diseases (commonly known as ICD-10) is an industry standard for medical coding. The transition from ICD-9 to ICD-10 increased the number of codes for medical procedures from 3,824 to 71,924. Similarly, the number of codes for medical diagnoses increased from 14,025 to 69,823.
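To put the scale of that change in perspective, the short calculation below computes the growth factor for each category using only the code counts quoted above. The function name `growth_factor` is ours.

```python
# Code counts as quoted in the article text.
ICD9 = {"procedures": 3_824, "diagnoses": 14_025}
ICD10 = {"procedures": 71_924, "diagnoses": 69_823}


def growth_factor(before: int, after: int) -> float:
    """How many times larger the code set became."""
    return after / before


for category in ICD9:
    factor = growth_factor(ICD9[category], ICD10[category])
    print(f"{category}: {factor:.1f}x more codes")

total_factor = growth_factor(sum(ICD9.values()), sum(ICD10.values()))
print(f"combined: {total_factor:.1f}x more codes")
```

By this count, procedure codes grew roughly 19-fold and diagnosis codes about fivefold, for a combined increase of nearly eightfold.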

Thus, automation is poised to help reduce risk for medical institutions and allow medical personnel to spend more time focusing on patient care and more complex coding scenarios.

Companies working to expand the character and phrase recognition capabilities of NLP, particularly for unstructured data, will have an advantage against competitors whose products may be more limited. A2iA is a good example of this with the ability of its software to recognize cursive handwriting.

Beyond processing high volumes of data, the potential of AI to significantly reduce human error is another advantage of implementing AI-driven solutions. Additionally, reducing the amount of company time required to manually complete the medical coding and billing processes is also important to consider.

With the increased role of AI in this sector, job security could be a potential area of concern. However, it’s important to note that the Bureau of Labor Statistics projects a 13 percent boost in employment for health information technicians between 2016 and 2026, which exceeds the average growth rate for all occupations.

Comparing the companies profiled in this article, it appears that R&D time invested in developing Computer Assisted Coding software correlates with increased company revenue. Companies with longer histories would most likely have more time to build databases and resources to train algorithms compared to newer companies.

One way a newer company may gain a competitive edge more quickly is by expanding the types of data that its software can process. For example, we could see more companies processing handwritten documents as well as offering voice-command options in the near future.

We can anticipate increased competition among companies offering AI applications for medical coding in the coming years. Potential contributing factors include the growing aging population, the increasing use of technology in healthcare and rising healthcare expenditures, which are projected to equal nearly 20 percent of the GDP by 2025.


