Artificial Intelligence in Epidemiology – Current Use-Cases

Ayn de Jesus

Ayn serves as AI Analyst at Emerj - covering artificial intelligence use-cases and trends across industries. She previously held various roles at Accenture.


Artificial intelligence is changing the way healthcare networks do business and the way physicians perform their routine activities, from medical transcription to robot-assisted surgery. Although the more mature use-cases for AI in healthcare are those built on algorithms that have applications in various other industries (namely white-collar automation), we believe that in the coming three to five years, AI solutions for healthcare will become increasingly specialized to individual use-cases.

In phase two of the AI zeitgeist, leadership teams at healthcare networks will have familiarized themselves with basic AI concepts, and they’ll be able to work with vendors that offer AI solutions for very specific healthcare concerns.

In this article, we look at the state of AI for epidemiology, the study of the incidence and spread of disease. The large majority of AI applications for epidemiology are predictive analytics applications, perhaps unsurprisingly.

Predictive analytics involves AI algorithms that use historical data to predict future outcomes. As such, there’s some evidence that they can help government bodies, community health organizations, and researchers figure out how a disease might originate in a population and how it might spread based on these predictions.
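To make the idea of predicting from historical data concrete, here is a minimal sketch in Python. It fits a straight-line trend to past weekly case counts and extends it forward. The function names and the case counts are hypothetical and invented for illustration; commercial products such as those covered below almost certainly use far richer models and data.

```python
from statistics import mean


def fit_linear_trend(counts):
    """Fit a least-squares line through (week index, case count) points."""
    weeks = list(range(len(counts)))
    week_avg, count_avg = mean(weeks), mean(counts)
    sxx = sum((w - week_avg) ** 2 for w in weeks)
    sxy = sum((w - week_avg) * (c - count_avg) for w, c in zip(weeks, counts))
    slope = sxy / sxx
    intercept = count_avg - slope * week_avg
    return slope, intercept


def forecast(counts, weeks_ahead):
    """Extend the fitted line to the next `weeks_ahead` weeks."""
    slope, intercept = fit_linear_trend(counts)
    start = len(counts)
    return [intercept + slope * (start + k) for k in range(weeks_ahead)]


# Hypothetical weekly case counts, invented for illustration only.
weekly_cases = [10, 12, 14, 16, 18]
print(forecast(weekly_cases, 2))  # [20.0, 22.0]
```

A straight line is the simplest possible "model"; it is used here only because it shows the pattern of learning from past observations and projecting forward.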

Predictive analytics applications for epidemiology almost always require clients to supply very large quantities of anonymized patient data, which may prove challenging for research institutions that do not have robust partnerships with healthcare organizations.

Leaders interested in using AI for epidemiology should consider the data requirements of a vendor’s AI solution well before they choose to do business with a vendor.

We cover four AI vendors offering artificial intelligence solutions for epidemiology: IBM, Saama, SAS, and Orion Health. We begin our analysis of the space with IBM Watson Health.

IBM Watson Health

IBM Watson Health offers the Explorys data set and analytics solution, which the company claims can provide life sciences companies and epidemiologists a better understanding of disease history, epidemiology, and disease progression, and determine the economic impact for select populations.

The company claims that knowing this will also enable organizations to identify efforts for deeper study and identify populations most likely to benefit from a treatment.

The company states the machine learning model behind the analytics software was trained on ambulatory, inpatient, and adjudicated claims data of 50 million anonymized patients sourced from electronic medical record systems. The data would then be run through the software’s machine learning algorithm.

This would have trained the algorithm to discern which data points correlate to the rates of health events in a population, the history of disease, and their most successful treatments.

The software would then be able to predict the incidence and prevalence of a disease in populations, disease treatment patterns, and treatment-related risks.
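Incidence and prevalence are distinct measures: prevalence is the share of a population that has a disease at a given time, while incidence counts new cases arising over a period among those at risk. The short Python sketch below shows the standard textbook arithmetic. The numbers are made up, and nothing here reflects how Explorys itself computes these figures.

```python
def prevalence(existing_cases, population):
    """Proportion of the population that has the disease at a point in time."""
    return existing_cases / population


def cumulative_incidence(new_cases, population_at_risk):
    """Proportion of at-risk people who develop the disease during a period."""
    return new_cases / population_at_risk


# Hypothetical figures, invented for illustration only.
population = 100_000
existing_cases = 2_000          # people who already have the disease
new_cases_this_year = 490       # diagnosed during the year
at_risk = population - existing_cases  # existing cases cannot become new cases

print(f"Prevalence: {prevalence(existing_cases, population):.1%}")
print(f"One-year incidence: {cumulative_incidence(new_cases_this_year, at_risk):.2%}")
```

Subtracting existing cases from the denominator matters: people who already have the disease are not at risk of becoming new cases.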

A 5-minute video on IBM Explorys explains how it integrates data from multiple care settings. This includes 3 to 4 years’ worth of individual patient data, encompassing ambulatory, in-patient, specialty care, and post-acute care. Other types of data include lab results, vitals, biometrics, and patient-reported outcomes. The video reports that data is updated daily and weekly.

IBM Watson Health claims to have helped SmartAnalyst study the treatment journey of more than 6,500 psoriasis patients using IBM Explorys. One of SmartAnalyst’s customers, a pharmaceutical company, wanted to know how long it took for psoriasis patients to transition from topical to oral treatments and finally to injectable treatments.

SmartAnalyst turned to IBM Watson Health and used Explorys to discover that over the course of three years, patients tended to skip oral medications and immediately transition from topical remedies to injectables.

The Explorys dataset revealed that a large number of patients transitioned from topical therapy to injectables within 206 days, not enough time for some topical treatments to take effect. However, patients who tried oral treatments before switching to injectables took an average of 488 days.
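Figures like these come down to measuring the gap between two treatment start dates and averaging it by group. Below is a minimal Python sketch of that calculation. The record layout, field names, and dates are all hypothetical, and this is not a description of how SmartAnalyst or Explorys performed the analysis.

```python
from datetime import date
from statistics import mean

# Hypothetical patient records, invented for illustration only.
records = [
    {"patient": "A", "topical_start": date(2015, 1, 1),
     "injectable_start": date(2015, 7, 20), "tried_oral": False},
    {"patient": "B", "topical_start": date(2015, 3, 1),
     "injectable_start": date(2015, 9, 9), "tried_oral": False},
    {"patient": "C", "topical_start": date(2014, 1, 1),
     "injectable_start": date(2015, 5, 1), "tried_oral": True},
    {"patient": "D", "topical_start": date(2014, 6, 1),
     "injectable_start": date(2015, 9, 14), "tried_oral": True},
]


def mean_days_to_injectable(records, tried_oral):
    """Average days from first topical to first injectable for one group."""
    gaps = [
        (r["injectable_start"] - r["topical_start"]).days
        for r in records
        if r["tried_oral"] is tried_oral
    ]
    return mean(gaps)


print(mean_days_to_injectable(records, tried_oral=False))  # 196
print(mean_days_to_injectable(records, tried_oral=True))   # 477.5
```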

As a result of having this information, SmartAnalyst’s customer developed a communication plan to educate patients and doctors on the importance of giving oral medications time to take effect, and on reserving the more expensive injectables as a last-resort treatment.

IBM Watson Health also lists AHMC Healthcare, Schneck Medical Center, SmartAnalyst, Edward-Elmhurst Healthcare, Harrow Council, Floyd Health Care System, and Hallmark Health Medical Associates as some of its past clients.

Scott Spangler is the Chief Data Scientist and a Distinguished Engineer at IBM Watson Health. He holds an MS in Computer Science from the University of Texas at Austin. Previously, Spangler served first as Sr Tech. Staff Member and then as Principal Data Scientist, Distinguished Engineer at IBM Watson Innovations for 19 years.

Saama Technologies

Saama offers Real World Analytics, which it claims can help life sciences companies mine data to monitor populations numbering in the millions during clinical trials and forecast disease incidence or prevalence using machine learning.

Saama claims that the application resides in the cloud. The company adds that the machine learning model behind the software was trained on data consisting of billions of patients’ electronic medical and health records. The data would then be run through the software’s machine learning algorithm. This would have trained the algorithm to discern which data points correlate to the efficacy of a drug.

The software would be able to predict treatment patterns such as shifts in the type of drug, the ideal duration of a treatment, and the prevalence and incidence rates of a disease.

A short 3-minute video demonstrates how Saama’s Real World Analytics uncovers data about the treatment pathways of patients, in this case those with lung cancer. Users select the data source, diagnosis, and therapies to show the number of patients by gender and age, and reveal the popular treatments administered to them.

The results also show the duration of therapy as well as the period when patients change from one treatment to another.

Saama claims to have helped Pharmacyclics, a company that developed and marketed small-molecule medicines to treat cancers and autoimmune diseases, aggregate all its clinical operations data into a single, central view. As its clinical operations grew, Pharmacyclics needed to make better use of its clinical data. However, data silos made it difficult to generate automated and accurate reports.

Pharmacyclics turned to Saama, which deployed its Clinical Development Optimizer (CDO), part of the Life Science Analytics Cloud.

The case study reports that the implementation of the CDO resulted in:

  • A comprehensive view of clinical operations data
  • Consolidated data and automated, error-free reports
  • More transparency and consistency in clinical operations data
  • Enhanced collaboration among the business team members
  • Standardized data and business definitions
  • Improved data flow, data management, and governance

Saama also lists Actelion, Astellas, Bill and Melinda Gates Foundation, Brocade, Broadcom, Cisco, CSAA Insurance Group, Dignity Health, GoPro, Motorists Insurance Group, Otsuka, PayPal, Roche, and Unilever as some of its past clients.

Rajeev Dadia is the CTO at Saama. He holds an MS in Computer Science from the California State University-Chico. Previously, Dadia served as IT Manager.


SAS

SAS offers Real World Evidence, which it claims can help healthcare providers and life sciences companies better understand a population and improve population health and treatments by providing data from a wide variety of sources.

These sources include:

  • The environment
  • Electronic medical and health records
  • Genomics
  • Socioeconomic data
  • Clinical trials
  • Case reports
  • Healthcare insurance claims
  • Public health investigations

The data will then be analyzed using machine learning and predictive analytics. The company states that the machine learning model behind the software was trained on point-of-care systems, electronic medical records, insurance claims, patient-reported outcomes, and third-party data.

The data would then be run through the software’s machine learning algorithm. This would have trained the algorithm to discern which data points correlate to, for instance, a drug’s utilization and performance, as well as the patient’s adherence to the treatment and treatment preferences.

The software would be able to predict how the therapeutic value of an existing drug can be expanded by identifying new illnesses it can treat and new customers. This may or may not require the user to upload information about their new customer segments or plans for a marketing campaign into the software beforehand.

SAS claims to be helping Renown Institute for Health Innovation (Renown IHI) better understand how genetic, clinical, environmental, and socioeconomic factors affect population health. In September 2016, Renown IHI embarked on the Healthy Nevada Project and needed to develop an application that would reveal population health risks of patients based on gender, age, and personal or family health history.

The application will also be used to uncover public health risks of diseases, illnesses, and environmental factors such as air quality.

According to the case study, the pilot phase of the project enlisted 10,000 participants, whose DNA samples were collected in 60 working days. The project’s second phase opened to 40,000 more Nevadans in March 2018.

SAS reported that its application could potentially predict how environmental factors contribute to the Nevada population’s health and help researchers understand what role age, gender, or genetics play. The results could then be used to advance precision medicine and other health innovations and research.

SAS also lists Honda, Nestle, HSBC, Lufthansa, Health Data Essentials, OhioHealth, Stockholm County Council, University of New Hampshire, and Western Australia Department of Health as some of its past clients.

Orion Health

Orion Health offers Amadeus, a population health management and precision medicine software, which it claims can help healthcare organizations handle large volumes of data to predict and differentiate the health risks in a population using machine learning. The company explains that having this capability will allow healthcare organizations to make quick and informed decisions.

The company states the machine learning model behind the software was trained on claims, clinical, and non-traditional data such as omics, social, and behavioral data. The data would then be run through the software’s machine learning algorithm. This would have trained the algorithm to discern which data points correlate to the health risks of a population.

The software would be able to predict which patients are at risk, the treatments that can be administered, as well as the accompanying costs. This may or may not require the user to upload information about their care-coordination programs or personalized care plans into the software beforehand.

A short 2-minute video demonstrates how Amadeus integrates health data such as claims, clinical, behavioural, social, genomic, and device data to ensure that physicians have complete patient records. The application then enables users to analyze this data to identify at-risk patients and provide specific treatments.

Orion Health claims to have helped the Scottsdale Health Partners (SHP), a physician-led clinical network that covers more than 60,000 patients, set up a health information exchange (HIE) that would provide complete, accurate patient data.

The SHP turned to Orion Health to establish the HIE that would allow its more than 700 participating clinicians using disparate EMR software to share health information. The HIE also enabled the clinicians to identify at-risk patients, find data related to appropriate treatments, and simplify reporting.

Using the application, the SHP was able to predict which patients were at risk of readmission after being discharged, along with the contributing factors. This enabled them to develop programs to reduce the risks, resulting in a 9% decrease in readmissions within one year of the application’s deployment.

The case study also claims that within the same period, SHP was able to reduce costs by 10%, attributed to the predictions provided by the application.

The case study also claims that SHP is the only Arizona Medicare Shared Savings Program to achieve almost $3.75 million in savings, attributed to the application’s predictions.
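A figure such as a "9% decrease in readmissions" can be read as a relative change in the readmission rate or as a drop in percentage points, and the case study as summarized here does not say which. The Python sketch below shows how the two readings differ, using made-up discharge and readmission counts that are not drawn from SHP’s data.

```python
def readmission_rate(readmitted, discharged):
    """Share of discharged patients who were readmitted."""
    return readmitted / discharged


def relative_reduction(before, after):
    """Decrease expressed as a fraction of the starting rate."""
    return (before - after) / before


def absolute_reduction(before, after):
    """Decrease expressed as a difference between the two rates."""
    return before - after


# Hypothetical counts, invented for illustration only.
rate_before = readmission_rate(readmitted=200, discharged=1_000)  # 20.0%
rate_after = readmission_rate(readmitted=182, discharged=1_000)   # 18.2%

print(f"Relative reduction: {relative_reduction(rate_before, rate_after):.1%}")
print(f"Absolute reduction: {absolute_reduction(rate_before, rate_after) * 100:.1f} percentage points")
```

With these invented numbers, the same change is a 9.0% relative reduction but only a 1.8 percentage-point absolute reduction, which is why it helps to ask a vendor which measure a reported result uses.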

Orion Health also lists New Mexico Health Information Exchange, University of Louisville Hospital, Oregon Health Authority, Penn State Health Milton S. Hershey Medical Center, and Maine HealthInfoNet as some of its past clients.

Ian McCrae is CEO at Orion Health. He holds an MS in Electrical Engineering from the University of Auckland. Previously, McCrae served as telecommunications consultant at Ernst & Young, Senior Business Analyst at the London Stock Exchange, and as a scientist at the Department of Scientific and Industrial Research.

