AI for Health Insurance Fraud Detection – Current Applications

Niccolo Mejia covers AI applications across industries at Emerj. He holds a bachelor's degree in Writing, Literature, and Publishing from Emerson College.


Health insurance fraud may prove especially pressing due to the opioid epidemic, and thus rooting out fraud could be of greater importance in the coming years.

In this article, we’ll explore the AI-based fraud detection software available to health insurers by covering the products of four vendors and assessing their teams for AI experience and their case studies for evidence of success.

  • Friss uses anomaly detection to offer claims fraud detection to insurance providers. They claim their software allows for a transparent view of its processes; that said, the level of transparency is unclear.
  • SAS also offers claims fraud detection software, but their enterprise data mining solution can also leverage data for business intelligence in various departments. SAS claims the software can create interactive predictive models based on that data. SAS uses predictive analytics to detect fraudulent claims as they appear in client systems.
  • H2O.ai offers an open source machine learning platform that they claim can help health insurers create fraud detection software for healthcare claims.
  • LexisNexis, like SAS, uses predictive analytics to detect health insurance fraud. However, their solution is purportedly able to comb through data from various and often unexplored sources. These include social media interactions and past acquaintances between healthcare staff and patients. They claim the software can detect higher risk if a provider and patient know each other and are suspected of, or have a history of, fraud.

We’ll start with some insights about the state of fraud detection for health insurers:

AI for Health Insurance Fraud Detection – Insights Up Front

Next to SAS, H2O.ai has the highest density of AI talent on their team. It also has more venture funding than Friss. These are good indicators they are actually using machine learning technology. They do not list any health insurance case studies, but they do emphasize a focus on healthcare and the technology requirements for the industry.

LexisNexis’s solution offers unique benefits from data analytics, such as finding relationships between healthcare personnel, and they employ data scientists and staff with an AI background.

Friss has less funding than H2O.ai, and we could only find five data scientists with an AI background on its LinkedIn profile. Each data scientist was hired in 2016 or later, and they do not list previous professional experience with AI.

SAS and H2O.ai are the most likely to be leveraging machine learning. This is evident from H2O.ai’s venture funding and from both companies’ robust lists of AI talent. LexisNexis is likely to be using AI as well, as they have many educated AI staff. However, they do not have many staff members with a PhD in AI or computer science. Friss is the only company in this report with a lower chance of actually using AI in their software. Though they have been in business since 2006, their data scientists are all very new to working with AI and may have moved into their positions after the company’s software was developed.

We begin our exploration of AI-based fraud detection for health insurers with Friss:


Friss offers a namesake software which it claims can help health insurers detect fraudulent claims using predictive analytics.

Friss’ solution collects data using various techniques to detect new fraud methods. These include text mining, image screening, geo-mapping, and social media analysis.

The company claims their software provides a detailed view of why a claim might be marked as fraud. Friss calls this “explainable artificial intelligence” because machine learning software is not usually transparent about how it generates its outputs. This is known as the “black box” of AI, and it’s a particular challenge in healthcare. We discuss the “black box” further in our article on the difference between AI and machine learning.

It’s likely that Friss’ machine learning model was trained on data from hundreds of thousands of insurance claims, in this case health insurance claims. The fraudulent claims would need to have been labeled as such. Developers would have then run the data through the machine learning algorithm, effectively training the software to discern the data points that correlate to a fraudulent health insurance claim.

A client company could then deploy Friss’ software, and the algorithm would be able to mark claims as fraudulent as they enter the client’s system. These claims would continue to train the machine learning algorithm because employees can accept or reject each flag the software raises. This way, the software could adapt to new fraud methods as it runs.
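The feedback loop described above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a small logistic model that is updated one reviewed claim at a time; the class name, the two features, and the learning rate are hypothetical and are not taken from Friss’ product.

```python
import math


class OnlineFraudModel:
    """Tiny logistic model updated one reviewed claim at a time (illustrative only)."""

    def __init__(self, n_features, learning_rate=0.5):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def score(self, features):
        """Return an estimated fraud probability between 0 and 1."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, is_fraud):
        """Take one gradient step, using the reviewer's decision as the label."""
        error = self.score(features) - (1.0 if is_fraud else 0.0)
        self.weights = [
            w - self.learning_rate * error * x
            for w, x in zip(self.weights, features)
        ]
        self.bias -= self.learning_rate * error


# Hypothetical features: [amount in thousands of dollars, 1.0 if out of network]
model = OnlineFraudModel(n_features=2)

# Each tuple is (features, reviewer decision); True means fraud was confirmed.
reviewed_claims = [
    ([0.2, 0.0], False),
    ([4.5, 1.0], True),
    ([0.3, 0.0], False),
    ([5.0, 1.0], True),
] * 50  # repeat the small sample so the weights have time to settle

for features, confirmed_fraud in reviewed_claims:
    model.update(features, confirmed_fraud)

high_risk = model.score([4.8, 1.0])   # resembles the confirmed fraud cases
low_risk = model.score([0.25, 0.0])   # resembles the rejected flags
```

Every accepted or rejected flag becomes one more labeled example, which is why a model of this kind could keep adjusting as reviewers work through new claims.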

Friss claims to have helped Anadolu Sigorta save money by reducing fraudulent claims. Anadolu Sigorta integrated Friss’ software into its system for incoming claims so they could be run through the software.

According to the case study, Anadolu Sigorta was able to save $5.6 million by detecting fraudulent claims before they were paid. It also states that they were able to automate their claims process and implement a model called straight-through processing. This is when claims are processed by auto-filling repeated data points so that the claim does not need to go back and forth between parties.
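As a rough sketch of straight-through processing, the routine below fills in missing claim fields from a policy record and only sends a claim to a person when data is still missing or the fraud score is high. The field names, the 0.7 threshold, and the routing labels are assumptions made for illustration; the case study does not describe Anadolu Sigorta’s actual rules.

```python
REQUIRED_FIELDS = ("policy_id", "member_name", "address", "amount")


def auto_fill(claim, policy_records):
    """Copy known member details from the policy record into missing claim fields."""
    record = policy_records.get(claim.get("policy_id"), {})
    filled = dict(claim)
    for field in REQUIRED_FIELDS:
        if not filled.get(field) and field in record:
            filled[field] = record[field]
    return filled


def route_claim(claim, policy_records, fraud_score, threshold=0.7):
    """Return the completed claim and the next step in the process."""
    filled = auto_fill(claim, policy_records)
    missing = [field for field in REQUIRED_FIELDS if not filled.get(field)]
    if missing:
        return filled, "request_information"
    if fraud_score >= threshold:
        return filled, "manual_review"
    return filled, "auto_approve"


# Hypothetical data for one policyholder
policy_records = {
    "P-100": {"member_name": "A. Yilmaz", "address": "12 Example St"},
}
claim = {"policy_id": "P-100", "amount": 250.0}

filled, decision = route_claim(claim, policy_records, fraud_score=0.1)
flagged_decision = route_claim(claim, policy_records, fraud_score=0.9)[1]
incomplete_decision = route_claim({"policy_id": "P-999"}, policy_records, 0.1)[1]
```

In this sketch, only the low-risk, complete claim is paid without anyone touching it, which is the part of the process that saves back-and-forth between parties.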

Below is a short video from Friss detailing Anadolu Sigorta’s success with the software:

Friss also lists Folksam as one of their past clients.

Jeroen Morrenhof is CEO at Friss. He holds a PhD in Business Information Systems from University of Amsterdam. Previously, Morrenhof served as an investor at Wonderflow.


SAS offers software called SAS Enterprise Miner, which it claims can help health insurers detect fraudulent claims and find information that may be useful in their enterprise data using predictive analytics.

The company claims Enterprise Miner uses a client company’s data storage to find examples of insurance claims, fraudulent claims, and fraud tactics in order to detect them as they enter the system. SAS also claims the software can use enterprise data to make models of possible fraud cases based on variables attached to the given data points. For example, if a user wanted to test how often fraudulent claims are for amounts below $100, they could test the “fraud” variable against the “price” variable.
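The kind of variable test described in that example can be illustrated with a few lines of plain Python. This is not Enterprise Miner’s interface; the claim records and the `fraud_rate` helper are made up for the illustration.

```python
def fraud_rate(claims, predicate):
    """Share of the claims matching the predicate that are labeled as fraud."""
    selected = [claim for claim in claims if predicate(claim)]
    if not selected:
        return 0.0
    return sum(1 for claim in selected if claim["fraud"]) / len(selected)


# Hypothetical labeled claims
claims = [
    {"price": 40.0, "fraud": True},
    {"price": 75.0, "fraud": False},
    {"price": 90.0, "fraud": True},
    {"price": 60.0, "fraud": False},
    {"price": 250.0, "fraud": False},
    {"price": 900.0, "fraud": False},
    {"price": 1200.0, "fraud": True},
    {"price": 430.0, "fraud": False},
]

rate_below_100 = fraud_rate(claims, lambda claim: claim["price"] < 100)
rate_100_or_more = fraud_rate(claims, lambda claim: claim["price"] >= 100)

print(f"Fraud rate below $100: {rate_below_100:.2f}")
print(f"Fraud rate at $100 or more: {rate_100_or_more:.2f}")
```

With this toy data, half of the claims under $100 are fraudulent compared to a quarter of the larger ones, which is the sort of relationship a user would look for before building a predictive model on the “price” variable.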

The machine learning model behind Enterprise Miner would likely need to be trained on data from thousands of health insurance claims. Each fraudulent or suspicious claim would be labeled as such, and then an employee could expose the machine learning algorithm to these labeled claims. This would train the algorithm to discern which data points correlate to a legitimate, suspicious, or fraudulent claim.

The software could then make predictions about which claims will turn out to be fraudulent, or at least suspicious. However, if new fraud methods are discovered outside of the enterprise such as online or from word of mouth, users may need to upload information regarding those new methods into the software beforehand.

Below is a 7-minute video demonstrating how SAS Enterprise Miner works. The video explains the process of creating and using predictive models. The models in the video use an example topic, but the same principles can be applied to fraud detection. The video is structured as follows:

  • 0:00 – Selecting data points and variables and reviewing data
  • 3:00 – Checking target variables against others to find correlations, and model creation
  • 4:05 – Results and how to read predictive models
  • 5:19 – Using multiple models concurrently

SAS claims to have helped the dental insurer DentaQuest reduce fraud and increase ROI from their marketing campaigns. According to the case study, DentaQuest was able to identify over 50 customer actions or behavioral patterns linked with fraud within their claims data.

SAS also lists CZ and Highmark Health as some of their past clients.

Jim Goodnight is the CEO of SAS. He holds a PhD in statistics from North Carolina State University. Goodnight has worked with SAS since 1976.

H2O.ai offers a namesake machine learning platform which it claims can help tech companies and healthcare providers create their own artificial intelligence software using machine learning. They also offer an automated machine learning platform called Driverless AI. This version of the platform trains itself and develops features on its own. H2O.ai claims that the healthcare industry is one of their chief focuses and that they have goals for helping develop AI healthcare solutions.

The company lists health insurance fraud detection as one of the solutions their platform can provide with some development. H2O.ai also claims there is an opportunity for insurance providers to detect fraud before claims are paid, similar to Friss’ software.

Clients creating AI software with a machine learning model from H2O.ai can program the mechanical process of combing through claims into the software while they train the model on past claims data. A client can use either of H2O.ai’s platforms to build a machine learning model for their specific use case.

In order for a user to build a machine learning model for their business’ use case, which in this case would be claims fraud detection, the model will likely need to be trained on a large number of data points. These data points could come from hundreds of thousands of health insurance claims, for example. Some of these claims would be labeled as fraudulent or suspicious, and the model would learn to associate these labels with patterns in the claims data as training proceeds.

Then, the user would have to expose the machine learning model to these labeled claims. This process would train the algorithm to a certain extent to determine which claims have a high probability of being fraudulent.
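To make the idea of a fraud probability concrete, here is a minimal sketch that estimates it from labeled claims with a k-nearest-neighbors vote in plain Python. It does not use the vendor’s platform or API, and a production model would be far more sophisticated; the two features and the sample claims are hypothetical.

```python
import math


def knn_fraud_probability(labeled_claims, new_claim, k=3):
    """Estimate fraud probability as the fraud share among the k closest labeled claims."""
    by_distance = sorted(
        labeled_claims,
        key=lambda item: math.dist(item[0], new_claim),
    )
    nearest = by_distance[:k]
    return sum(label for _, label in nearest) / len(nearest)


# Hypothetical features: [claim amount in thousands, claims filed in the past year]
# Label 1 means the claim was marked fraudulent, 0 means legitimate.
labeled_claims = [
    ([0.2, 1], 0),
    ([0.4, 2], 0),
    ([0.3, 1], 0),
    ([0.5, 3], 0),
    ([4.0, 9], 1),
    ([5.5, 12], 1),
    ([4.8, 10], 1),
]

p_high = knn_fraud_probability(labeled_claims, [5.0, 11])
p_low = knn_fraud_probability(labeled_claims, [0.3, 2])
```

The output is a share between 0 and 1 rather than a yes-or-no answer, so a client could choose how high the probability must be before a claim is flagged.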

A client company could then deploy its H2O.ai-built software, and the algorithm would be able to flag fraudulent claims as they appear in the company’s system. These claims would keep training the software’s algorithm as employees give it feedback on certain flags. If an employee could not find anything wrong with a claim the software flagged as fraud, they could reject the software’s notification. This would allow the software to pick up on new fraud methods and to learn which anomalies do not warrant suspicion as it runs.

Below is a screencap from H2O.ai’s website. It shows the features of their AI platform alongside an example data science experiment on a laptop. H2O.ai’s Driverless AI platform is depicted here, which can also be used to develop fraud detection software.

Features of H2O's AI platform

H2O.ai does not list any case studies showing a healthcare insurance provider’s success with the software. However, H2O.ai lists Kaiser Permanente as one of their past clients.

Arno Candel is the CTO of H2O.ai. He holds a PhD in Computational Physics from ETH Zurich. Previously, Candel served as a Senior Member of Technical Staff at Skytree Inc.


LexisNexis offers software called Relationship Mapping, which it claims can help healthcare plan networks discover and investigate fraud within a healthcare provider using predictive analytics.

Relationship Mapping purportedly gathers data from various sources in order to find behavioral patterns associated with fraud and identify risky people within client companies. LexisNexis claims that the software also uses what they call “relationship data,” or data about suspicious social groups, connections, or affiliations associated with customers, patients, or employees within the client healthcare plan network. This allows clients to identify relationships that could be the source of fraudulent behavior.

For example, if a customer that an employee knows personally has a criminal history involving prescription drugs, the software may be able to identify that personal relationship using social media and customer profile data. Once the relationship is confirmed, the software will mark these individuals as higher risk for fraud, and more so when they are near each other.
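A simple way to picture this is a lookup of known relationships combined with a fraud history list. The sketch below is an assumption-laden illustration in plain Python, not LexisNexis’ method: the names, the base risk, and the score increments are all hypothetical.

```python
def pair_risk(provider, patient, relationships, fraud_history, base_risk=0.1):
    """Raise the risk score for a provider-patient pair that is linked and has prior fraud."""
    risk = base_risk
    linked = frozenset((provider, patient)) in relationships
    if linked:
        risk += 0.2  # hypothetical increment for a confirmed personal relationship
        if provider in fraud_history or patient in fraud_history:
            risk += 0.4  # hypothetical increment for a fraud history on either side
    return round(risk, 2)


# Hypothetical relationship data, stored as unordered pairs
relationships = {
    frozenset(("dr_lee", "patient_a")),
    frozenset(("dr_kim", "patient_b")),
}
fraud_history = {"patient_a"}

linked_with_history = pair_risk("dr_lee", "patient_a", relationships, fraud_history)
linked_no_history = pair_risk("dr_kim", "patient_b", relationships, fraud_history)
not_linked = pair_risk("dr_lee", "patient_b", relationships, fraud_history)
```

In this toy version, a relationship alone raises the score slightly, and a relationship combined with a fraud history raises it much more, mirroring the example in the text.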

We can infer LexisNexis’ machine learning model behind their Relationship Mapping software was trained on large amounts of data from various sources. These include healthcare claims, data about individual healthcare providers within a healthcare plan network, individual profiles of customers and employees, business operational data, and relationship data.

Fraudulent or suspicious points within all of this data would be labeled as such, and then all data would be run through the machine learning algorithm. When the algorithm is exposed to large amounts of data, it is effectively being trained to discern which data points correlate to higher risk of fraud within a healthcare plan network.

Specific data points for all of these types of data may not come to mind immediately. For example, repeated healthcare claims for similar accidents may be fraudulent or a precursor to fraud. If a healthcare provider within a client’s network has a history of successful fraud attempts or employees who have committed fraud, those training the system would need to also label that as suspicious.
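The repeated-claims signal mentioned above is straightforward to compute. The following sketch counts how often the same patient files a claim for the same kind of incident; the field names and the threshold of three repeats are assumptions chosen for the example.

```python
from collections import Counter


def repeated_claims(claims, min_repeats=3):
    """Return (patient, incident type) pairs that appear at least min_repeats times."""
    counts = Counter(
        (claim["patient_id"], claim["incident_type"]) for claim in claims
    )
    return {key: n for key, n in counts.items() if n >= min_repeats}


# Hypothetical claims history
claims = [
    {"patient_id": "p1", "incident_type": "back_injury"},
    {"patient_id": "p1", "incident_type": "back_injury"},
    {"patient_id": "p1", "incident_type": "back_injury"},
    {"patient_id": "p2", "incident_type": "back_injury"},
    {"patient_id": "p2", "incident_type": "fracture"},
    {"patient_id": "p1", "incident_type": "fracture"},
]

suspicious = repeated_claims(claims)
```

Pairs returned by a check like this would not prove fraud on their own, but they could be labeled as suspicious in the training data.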

Individual employee and customer profiles will have some background information on the people listed, which could illuminate a criminal past or a problem with prescription drugs. Business operational data could range from profit and loss figures to recorded amount of waste within a given time period. Finally, relationship data could come from social media, email data, or information about where they work or went to school.

The software could then predict in which parts of the client’s network fraud is most likely to happen. It would compound risk factors onto each other organized by a provider, and within high-risk providers, could show the individuals or business areas at the highest risk. However, this may mean management or data scientists would have to upload data involving recently discovered frauds or new employees and customers into the system before running the software.
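One plausible way to compound risk factors by provider is a noisy-OR combination, which treats each factor as an independent chance of fraud. The sketch below uses that assumption along with hypothetical provider names and risk values; the source does not say which formula Relationship Mapping actually uses.

```python
from collections import defaultdict


def combine(risks):
    """Noisy-OR: the chance that at least one independent risk factor applies."""
    remaining = 1.0
    for risk in risks:
        remaining *= 1.0 - risk
    return 1.0 - remaining


def rank(groups):
    """Sort group names from highest to lowest combined risk."""
    scored = {name: combine(risks) for name, risks in groups.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical risk factors: (provider, individual, risk between 0 and 1)
risk_factors = [
    ("clinic_north", "employee_1", 0.5),
    ("clinic_north", "employee_2", 0.2),
    ("clinic_south", "employee_3", 0.1),
    ("clinic_north", "employee_1", 0.5),
]

by_provider = defaultdict(list)
by_individual = defaultdict(lambda: defaultdict(list))
for provider, individual, risk in risk_factors:
    by_provider[provider].append(risk)
    by_individual[provider][individual].append(risk)

ranked_providers = rank(by_provider)
top_provider = ranked_providers[0][0]
ranked_individuals = rank(by_individual[top_provider])
```

Ranking providers first and then individuals within the riskiest provider matches the drill-down the paragraph describes, from network to provider to person.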

Below is a 2-minute video explaining how LexisNexis Relationship Mapping works, with a focus on the Suspect Address module of the software:

LexisNexis claims to have helped a large health plan discover provider fraud from previously unrecognized relationships between disparate data categories. The client company added LexisNexis’ Relationship Mapping software to their fraud detection process, which allowed them to aggregate their data from multiple sources.

These could include healthcare providers, healthcare claims, and any information they had on the relationships between individuals at the pharmacy level. According to the case study, the client company identified provider fraud and more specific cases needing in-depth investigation.

The company’s special investigations unit (SIU) found providers whose licenses had been revoked still prescribing medicines, and providers writing fake prescriptions to keep for themselves. However, it should be noted that the client is not mentioned by name.

LexisNexis does not list any major health insurers as clients, but they have raised $30 million in venture capital and are backed by Schoolhouse Partners and WR Hambrecht.

Jeff Reihl is Executive VP at LexisNexis. He holds an MS in computer science from the Johns Hopkins University. Previously, Reihl served as Executive VP of Solutions at Truven Health, formerly Thomson Healthcare.

Interested readers may also want to read our report, AI for Affordable Care Act (ACA) Management.


