Information Extraction in Insurance – Claims and Underwriting

Dylan Azulay

Dylan is Senior Analyst of Financial Services at Emerj, conducting research on AI use-cases across banking, insurance, and wealth management.

Customer data is essential for insurance firms to stay competitive in the coming decade. Insurance companies at present have backlogs of data on past and existing customers in the form of policy agreements, applications, and claims forms. They’ve also collected millions of images showing car damage, property damage, and personal injuries.

Patterns exist within this data that could inform the decisions of various insurance departments. Discovering these patterns, however, is a challenge. People are generally very good at finding patterns within datasets, but this ability dulls as we’re presented with more and more data. A team of chief claims officers, for all intents and purposes experts when it comes to dealing with claims data, might still spend months sifting through millions of claims forms to garner any reasonably accurate insights from them.

This challenge is compounded because large insurance enterprises are still not entirely digital. In other words, this backlog of claims forms and policy agreements is still partly a collection of paper documents. Older documents are likely stored off-site in various locations across the region the insurance firm is operating in. Global firms may even store these documents in other countries.

What this means is that there are entire time periods of insurance data that are difficult to access at any given moment. Most insurance firms also still accept paper claims forms and applications, and they take payment and send claims payouts via check.

Not only that, but even digital information can be stored in systems that don’t communicate with one another. The claims department at a large division of a global insurance enterprise might use a completely separate system for dealing with claims forms than the underwriting department at another division of the same enterprise. As a result, insurance firms struggle to keep all of their customer data in the same location.

For example, if an employee at a nation-wide auto insurance enterprise wanted to figure out the optimal premium that a customer should pay, they would need to find patterns across similar customers. Perhaps the customer is in their 40s, puts 300 miles on their car every week, and lives in a high-crime area. How much is this customer worth to the insurer?

That isn’t something one can accurately determine without aggregating the lifetime value of every customer of a similar demographic. This would require underwriters to sift through thousands of past customer records, including the claims that customers of this demographic tend to file, how long they stay on the policy, and how much their premiums have been historically (which could vary wildly for a number of reasons).
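As a rough illustration of the aggregation described above, the following Python sketch averages a simplified lifetime-value figure across past customers who match a demographic profile. The record fields, the sample figures, and the definition of lifetime value as premiums collected minus claims paid are all assumptions made for the example, not an insurer's actual method.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class CustomerRecord:
    # Hypothetical fields; a real policy system would hold many more.
    age: int
    weekly_miles: int
    high_crime_area: bool
    premiums_paid: float  # total premiums collected over the policy's life
    claims_paid: float    # total claims paid out over the policy's life


def lifetime_value(record: CustomerRecord) -> float:
    """Simplified lifetime value: premiums collected minus claims paid."""
    return record.premiums_paid - record.claims_paid


def cohort_average_value(records, min_age, max_age, min_weekly_miles, high_crime_area):
    """Average lifetime value of past customers matching a demographic profile."""
    cohort = [
        r for r in records
        if min_age <= r.age <= max_age
        and r.weekly_miles >= min_weekly_miles
        and r.high_crime_area == high_crime_area
    ]
    if not cohort:
        return None  # no comparable customers on record
    return mean(lifetime_value(r) for r in cohort)


# Made-up records for illustration only.
records = [
    CustomerRecord(42, 310, True, 12000.0, 9000.0),
    CustomerRecord(47, 290, True, 15000.0, 4000.0),
    CustomerRecord(45, 350, True, 10000.0, 11000.0),
    CustomerRecord(29, 120, False, 8000.0, 1000.0),
]

# Customers in their 40s who drive at least 250 miles a week in a high-crime area.
average = cohort_average_value(records, 40, 49, 250, True)
print(average)
```

The calculation itself is trivial once the records sit in one place. The hard part, as the article notes, is that the records are often scattered across paper files and disconnected systems.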

These documents may or may not be digitized, and so underwriters may in some cases need to look through boxes of paper documents in order to find policy agreements, claims forms, and other documents belonging to customers that fit the demographic. This is a rigorous and time-consuming task, and so underwriters tend to settle for historical precedent that’s easily accessible to them when determining premiums.

Artificial Intelligence, on the other hand, is quite good at dealing with large volumes of data. Whether or not AI upends the insurance industry remains to be seen, but some of the largest insurance enterprises in the US are already implementing AI solutions for functions such as customer service.

Information extraction, otherwise known as document search or “document understanding,” as Iron Mountain calls it, is a more nascent use-case for AI in insurance. That said, we suspect that this use-case will become more common in the insurance industry over the next few years, because information extraction software promises to reduce the time that underwriters and other insurance employees spend searching through documents.

The ability to search through digitized documents is made possible with natural language processing (NLP); the ability to digitize paper documents so that they’re searchable with NLP software is made possible with machine vision. More specifically, optical character recognition (OCR) serves to read printed and handwritten letters and transcribe them into digital text.

We spoke with Anke Conzelmann, Director of Product Management at Iron Mountain, about where AI-based information extraction and document search could prove useful in insurance. In this article, we discuss several use-cases for AI-based document digitization and information extraction in insurance, such as claims processing, underwriting, and human resources.

For a more in-depth analysis of AI search applications in insurance, download the Iron Mountain-Emerj co-branded white paper on the topic.

Digitizing Paper Claims Forms and Images

Insurance enterprises struggle to answer simple questions about how to price their policies for maximum profit and how to accurately adjust claims for minimal claims leakage. This is due in part to the inability to access historical customer data that in many cases is stored in physical documents.

Digitizing these documents is the first step in extracting information from them, and it’s a necessary step for feeding the data in these documents into an artificial intelligence algorithm.

At present, a claims adjuster who wants to determine the optimal payout to a customer whose home is partially flooded may need to search through past paper claims forms to get a sense of what customers were paid historically for similar damages.

The key is that “similar damages” is subjective and requires discretion on the part of the claims adjuster. Adjusters often need to look at the images customers provide and make an assessment about how much repairs might cost based on a variety of factors.

Two different adjusters might look through the same claims form and the same images and come up with different payout amounts. Both of these amounts might be more than what the damage actually costs to repair, and the insurance company won’t find this out until later.

Artificial intelligence could help claims adjusters reduce claims leakage, but only if the claims forms and images attached to them are digitized. Employees at the insurance firm could scan physical documents and photographs, turning them into PDFs or image files.

When we spoke with Conzelmann, she pointed to another feature that robust platforms may offer: the ability to find similar images. According to Conzelmann, “Adjusters can simply ask for similar images to the one showing the damage for the claim they are working on and quickly find relevant claims that had similar damage.”
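One common way to build this kind of similar-image lookup is to represent each image as a numeric feature vector and rank stored images by cosine similarity to the query image. The Python sketch below assumes such vectors have already been produced by an image model. The claim IDs and vector values are invented for illustration, and nothing here describes how Iron Mountain's platform works internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_claims(query_vector, indexed_images, top_k=2):
    """Return the claim IDs whose image vectors are closest to the query."""
    ranked = sorted(
        indexed_images.items(),
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [claim_id for claim_id, _ in ranked[:top_k]]


# Hypothetical three-number feature vectors; real image embeddings are much longer.
indexed_images = {
    "claim-001": [0.9, 0.1, 0.0],
    "claim-002": [0.1, 0.9, 0.2],
    "claim-003": [0.7, 0.3, 0.1],
}

# Feature vector of the damage photo attached to the claim being worked on.
matches = most_similar_claims([0.85, 0.15, 0.05], indexed_images)
print(matches)
```

An adjuster would then open the returned claims to see what was paid out for comparable damage.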

Once the documents are scanned, OCR software could transcribe the letters on them into digital text, thus making the text “machine readable,” or ready for feeding into a machine learning algorithm. After training the algorithm to suit the insurance firm’s purposes, an employee would in theory be able to search for specific information within these documents.

For example, they might be able to pull up historical claims forms for property damage of a certain amount. This would reduce the time adjusters spend searching through paper documents for the same information.
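A minimal sketch of this kind of lookup, assuming the OCR step has already produced plain text for each claims form, might look like the following. The document IDs, the sample text, and the keyword-plus-amount matching rule are illustrative assumptions. A production system would more likely rely on trained NLP models than on a regular expression.

```python
import re

# Matches dollar figures such as "$12,500" or "$8,200.50".
AMOUNT_PATTERN = re.compile(r"\$(\d[\d,]*(?:\.\d+)?)")


def extract_amounts(text):
    """Return every dollar amount found in the text as a float."""
    return [float(m.replace(",", "")) for m in AMOUNT_PATTERN.findall(text)]


def search_claims(documents, keyword, min_amount, max_amount):
    """Return IDs of documents that mention the keyword and an amount in range."""
    hits = []
    for doc_id, text in documents.items():
        if keyword.lower() not in text.lower():
            continue
        if any(min_amount <= a <= max_amount for a in extract_amounts(text)):
            hits.append(doc_id)
    return hits


# Hypothetical OCR output for three scanned claims forms.
documents = {
    "claim-101": "Property damage claim. Kitchen flooding. Paid $12,500 for repairs.",
    "claim-102": "Auto claim. Rear bumper collision. Paid $3,200.",
    "claim-103": "Property damage claim. Roof leak. Paid $48,000 for repairs.",
}

# Property damage claims with a payout between $10,000 and $20,000.
results = search_claims(documents, "property damage", 10000, 20000)
print(results)
```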

Machine vision software for image recognition could also classify images of damage by damage severity and by the amount that was paid out to the customer for that damage. This classification could be used as a factor for determining the optimal payout on a claim.

This would entail a prescriptive analytics capability that would use a customer’s demographics, the text information on their claims form, and the images attached to their claims to suggest the optimal payout for that customer’s claim. This is also why claims processing and adjustment are underdeveloped use-cases for AI in insurance. They require a robust network of machine learning capabilities involving natural language processing, optical character recognition, machine vision for image recognition, and prescriptive analytics.

We’ve been researching AI in insurance for years, and we can count on one hand the number of vendors that claim to offer AI solutions for claims adjustment and have the talent to back it up. Even those companies can only offer their solution to very specific types of insurers. In other words, a legitimate AI vendor selling a solution for auto insurance claims generally doesn’t market its software for property insurance or health insurance.

Information Extraction for Underwriting

Although prescriptive analytics capabilities are rare in insurance due to the varied types of data (text, image, numeric), claims adjusters and underwriters can still use natural language processing software to search through their stores of documents once they’re digitized. This could prove beneficial because even digital documents can be unorganized.

Many exist in a variety of different systems across an insurance enterprise’s divisions and branches. They may even exist in different folders and organizational structures within the same department at the same branch. Conzelmann spoke to us about how AI could help search through these disparate data sources, emphasizing the value of AI for this scenario:

In addition to enriching the metadata by extracting information from the documents, there could be metadata that you have in a repository already, it could be metadata that’s available out in the market for purchase, it could be publicly available information…the key is to be able to create the relationship between all of these different bits and pieces and making it all part of the metadata that’s attached to an asset.

This “asset” in this case could be a particular insurance customer or an insured property.
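The idea of attaching metadata from several sources to a single asset can be sketched as a simple merge keyed on an asset identifier. In the Python example below, the source names, field names, and values are hypothetical. The rule that earlier sources take priority is a design choice made for the example, not something Conzelmann specified.

```python
def build_asset_metadata(asset_id, *sources):
    """Merge every source's entry for one asset into a single metadata record.

    Each source is a (source_name, {asset_id: {field: value}}) pair. The origin
    of each field is recorded so a reviewer can trace where a value came from.
    """
    merged = {}
    provenance = {}
    for source_name, data in sources:
        for field, value in data.get(asset_id, {}).items():
            # Earlier sources take priority; later ones only fill gaps.
            if field not in merged:
                merged[field] = value
                provenance[field] = source_name
    return {"asset_id": asset_id, "metadata": merged, "provenance": provenance}


# Hypothetical metadata about one insured property from three sources.
extracted = {"property-77": {"roof_type": "tile", "year_built": 1984}}
repository = {"property-77": {"policy_number": "HX-2291", "year_built": 1985}}
public_records = {"property-77": {"flood_zone": "AE"}}

record = build_asset_metadata(
    "property-77",
    ("extracted_from_documents", extracted),
    ("internal_repository", repository),
    ("public_records", public_records),
)
print(record["metadata"])
```

Keeping a provenance entry for each field lets a reviewer see which source supplied a value when two sources disagree, as they do here on the year the property was built.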

An information extraction and document search application could prove useful for searching through digital documents across the insurance firm’s numerous branches if those documents are stored in the cloud or some file-share program.

For example, an underwriter might be able to answer the question “Should I onboard this customer?” much faster than they would if they had to manually search through digital documents one by one for information that might help them answer that question. Instead, the underwriter could pull up records from past customers similar to the customer they’re looking to onboard.

The underwriter could then search through these records for information about claims the customer has made and customer lifetime value, and this could give them a better idea of whether or not to onboard the potential new customer. It might also inform the premiums they offer that customer.

An underwriter could make their decision about the customer in a matter of minutes as opposed to the hours or days it may take them to do so manually. This has clear savings benefits for the insurance company, as well as customer experience benefits. It could allow an insurance firm to move closer to offering “on demand insurance,” the ability for an insurance company to onboard a customer when the customer needs insurance (such as the day they’re diagnosed with an illness).

Insurance firms are scrambling to cater to millennial customers, who more than any other generation expect a level of speed congruent with their experience growing up with the internet. They don’t find it necessary to show up at a physical location and discuss their insurance policies. They want to be able to apply via chatbot or email, and they want to start their policies very shortly thereafter. AI-based information extraction software could help with this, potentially giving insurance firms that integrate it an edge over their competitors.

The Bottom Line – What Insurance Firms Need to Know

Claims processing and underwriting are two areas of insurance that could benefit from AI-based information extraction/document search software. That said, neither is a developed use-case for AI in insurance right now. This will likely change over time as AI becomes more accessible to businesses, perhaps with autoML or a shift in the culture of innovation at older enterprises. At that point, AI use-cases in insurance will likely move from the cost-saving benefits of document search applications to more complex machine learning systems that involve document search, machine vision, and prescriptive analytics, allowing for capabilities that drive growth, such as tailor-made insurance policies.

For now, information extraction and document digitization software could reduce the time underwriters and claims adjusters spend searching through the paper and digital documents that they regularly use to make decisions about premiums and claims payouts. A less laborious and more organized search process could result in more profitable premiums and less claims leakage, although without a prescriptive analytics function, the premium and payout amounts are still left up to underwriters and adjusters and therefore remain subject to human error.

That said, artificial intelligence is difficult to integrate into existing businesses, as we’ve covered extensively in our executive guides, including one of our most fundamental: Enterprise Adoption of Artificial Intelligence – When it Does and Doesn’t Make Sense. Bringing AI into the enterprise requires several challenging and resource-intensive steps, including feature engineering and a melding of minds between subject-matter experts and data scientists.

At the same time, there are ways to mitigate spend and achieve a quicker time to market. Currently there aren’t many AI vendors that offer products clients can use “out of the box” or that are “plug and play,” so to speak. Those that offer something close to this are often in customer service or similar horizontals that don’t differ much from company to company, although it’s very likely that these products still require training on the part of the client.

In most cases, however, the AI vendor will work with the client to train the software, and the client may not require a team of in-house data scientists.

As such, working with an AI vendor like Iron Mountain will often require less from the client than building an AI application in house. Iron Mountain specifically claims its information extraction software comes with built-in AI capabilities.

In summary, insurance firms might want to consider AI-based document search and document digitization solutions, especially older firms that have legacy systems and stores of physical documents in a variety of disparate locations. But in doing so they should consider their business needs and the time- and resource-intensive nature of an AI project before rushing to work with an AI vendor. If they decide that AI is right for them, they’ll want to read our guide on cutting through the AI hype before buying.


This article was sponsored by Iron Mountain, and was written, edited and published in alignment with our transparent Emerj sponsored content guidelines. Learn more about reaching our AI-focused executive audience on our Emerj advertising page.