Text Mining in Banking – A Brief Overview of Capabilities

Raghav Bharadwaj

Raghav serves as an Analyst at Emerj, covering AI trends across major industry updates and conducting qualitative and quantitative research. He previously worked for Frost & Sullivan and Infiniti Research.

Banks are starting to deploy natural language processing (NLP) to make use of enterprise and customer data in text mining applications ranging from gauging customer sentiments to enterprise search.

Historically, banks have collected large volumes of data about customers and their own internal operations. A significant portion of these documents is textual information recorded mostly for compliance reasons.

This data might be siloed for different business functions inside a bank, and effective information search and discovery is challenging in its traditional form. AI systems could help navigate through the vast data stores to gain valuable business insights.

The banking industry was one of the early adopters of text mining technology with traditional systems like the Automatic Processing of Money Transfer Messages (ATRANS).

This was possible because the information that these early systems had to “read” and “understand” was relatively simple and highly structured. With NLP, the scope of information that can be made searchable has grown to include both semi-structured and unstructured free-form text.

In our AI in Banking Vendor Scorecard and Capability Map 2019 report, we categorized 77 banking AI vendor products across AI capabilities, including data collection and analysis of text data.

We discuss this category of AI vendor product offerings in this article. Readers can download the executive brief of the report to learn more.

The graphic below shows the number of AI vendor product offering functions in our banking research. The column labeled Data Collection and Analysis (Text Data) shows the number of vendor products that could be applied to free-form natural language text data.

This data is taken from Emerj’s AI in Banking Vendor Scorecard and Capability Map report.

Data Collection and Analysis (Text Data) accounted for 13.48% of all the AI vendor product offering functions we categorized in our report and 11.49% of the total funding raised by AI vendors in banking.

Text and customer data analysis was the most common among all the AI product offering functions we categorized, accounting for nearly 27% of all vendor product offering capabilities.

In this article, we delve deeper into text mining use-cases in banking by covering the following topics:

  • Leveraging Enterprise Text Data
  • Leveraging Customer Text Data
  • Adoption Best Practices For AI in Text Analysis

We begin our brief overview with how text mining applications can allow a bank better access to its enterprise data.

Text Mining in Banking – Enterprise Data

As an example, banks could use NLP-based software to search for specific information from internal legal documents. With an AI solution, users across the bank could search for only finance-related or fraud-related excerpts from these documents.

Integrating such contextual search systems could reduce the cost and time associated with information search inside the enterprise.

To deploy such an NLP solution, a bank would first need to upload its existing enterprise documents to the AI system. The algorithms are then tuned to recognize the attributes in these documents that are marked as relevant for extraction, summarization, or abstraction.

The subject-matter experts, in this case the bank's legal and compliance teams, would need to train the algorithms to recognize these patterns by manually tagging each document first.

Once the system is tested and deployed, employees at the bank could contextually search for information on a dashboard. The NLP algorithms can judge which documents are relevant to a search query based on the tags previously applied to them (usually called metadata).

The search application might then return more informed search results to users at the bank.
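
The tagging and search workflow described above can be sketched in a few lines of Python. This is a deliberately minimal illustration, not how any vendor's product works: it assumes reviewers have already attached tags to each excerpt, and it ranks results by simple keyword overlap rather than by a trained NLP model. The `TaggedExcerpt` class, the `search` function, and the sample documents are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedExcerpt:
    """An excerpt from an internal document plus tags added by reviewers."""
    doc_id: str
    text: str
    tags: set = field(default_factory=set)


def search(excerpts, query, required_tag=None):
    """Rank excerpts by how many query words they contain.

    If required_tag is given, only excerpts carrying that tag are considered.
    """
    query_words = set(query.lower().split())
    scored = []
    for excerpt in excerpts:
        if required_tag is not None and required_tag not in excerpt.tags:
            continue
        words = set(excerpt.text.lower().split())
        score = len(query_words & words)
        if score > 0:
            scored.append((score, excerpt))
    # Highest overlap first; the sort is stable, so ties keep corpus order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [excerpt for _, excerpt in scored]


# Hypothetical excerpts, manually tagged by legal and compliance staff.
corpus = [
    TaggedExcerpt("contract-001",
                  "The borrower shall report suspected fraud within five days",
                  {"fraud", "legal"}),
    TaggedExcerpt("contract-002",
                  "Interest is calculated monthly on the outstanding balance",
                  {"finance"}),
    TaggedExcerpt("policy-007",
                  "Staff must escalate suspected fraud to compliance",
                  {"fraud"}),
]

results = search(corpus, "suspected fraud report", required_tag="fraud")
for hit in results:
    print(hit.doc_id, "-", hit.text)
```

Restricting the search to a tag such as "fraud" is what lets a user retrieve only fraud-related excerpts; a production system would replace the keyword overlap with a model trained on the tagged documents.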

Text Mining in Banking – Customer Data

AI vendors offer NLP-based software that can help analyze incoming customer feedback from surveys, feedback forms, and social media to identify customer sentiment.

For example, if a bank has several million customer responses to its open-ended, text-based feedback form, NLP might help cut down the time taken to review these messages and unearth new insights about what customers really want.

Banks would first need to upload existing customer messages to the NLP software so that it can categorize this data. In the case of sentiment analysis, the software’s algorithms are designed to parse out sentences, phrases, and other significant parts of each customer’s message and automatically tag them as expressing positive, negative, or neutral sentiment.
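
As a rough illustration of the tagging step, the sketch below splits a message into sentences and labels each one as positive, negative, or neutral. It assumes a tiny hand-written word list, which is a simplification chosen for clarity; commercial sentiment tools typically rely on trained models. The word lists, function names, and sample message are hypothetical.

```python
import re

# Hypothetical word lists for illustration only; a real system would learn
# these signals from labeled data or use a far larger lexicon.
POSITIVE = {"great", "helpful", "easy", "fast", "love"}
NEGATIVE = {"slow", "locked", "failed", "confusing", "rude"}


def split_sentences(message):
    """Split a message wherever ., ! or ? is followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", message.strip())
    return [part for part in parts if part]


def tag_sentence(sentence):
    """Label a sentence by counting positive and negative words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def tag_message(message):
    """Return (sentence, label) pairs for every sentence in the message."""
    return [(s, tag_sentence(s)) for s in split_sentences(message)]


message = ("The mobile app is fast and easy to use. "
           "My card payment failed twice. "
           "I visited the branch on Monday.")

for sentence, label in tag_message(message):
    print(f"{label:>8}: {sentence}")
```

Tagging at the sentence level rather than the message level matters because a single piece of feedback often mixes praise and complaints.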

With this information, the software can then be designed to output insights for several different objectives.

For instance, the software can be designed to identify top customer issues and complaints or what products customers are talking about and how.  

For example, using NLP software to read through customer feedback on social media, a bank might find that the volume of customer posts from one particular geographical region is unusually high. The software might then help identify that the top customer issue in those posts is password and login problems. The bank can then alert the IT team in that region to take action to resolve the issue.
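
The regional example above could be approximated as follows. This is a sketch under stated assumptions: each post already carries a region label, issues are detected with hypothetical keyword sets rather than a trained classifier, and "unusually high" is simplified to "the region with the most posts" instead of a comparison against a historical baseline.

```python
import re
from collections import Counter

# Hypothetical issue categories and trigger words, for illustration only.
ISSUE_KEYWORDS = {
    "login": {"password", "login", "locked"},
    "fees": {"fee", "fees", "charge"},
    "cards": {"card", "atm"},
}


def detect_issue(text):
    """Return the first issue category whose keywords appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    for issue, keywords in ISSUE_KEYWORDS.items():
        if words & keywords:
            return issue
    return "other"


def busiest_region(posts):
    """Return (region, post_count) for the region with the most posts."""
    counts = Counter(region for region, _ in posts)
    return counts.most_common(1)[0]


def top_issue(posts, region):
    """Return the most frequent issue category among one region's posts."""
    issues = Counter(detect_issue(text) for r, text in posts if r == region)
    return issues.most_common(1)[0][0]


# Hypothetical (region, post text) pairs.
posts = [
    ("north", "Cannot login since this morning"),
    ("north", "Password reset link never arrives"),
    ("north", "Why was I charged a fee?"),
    ("south", "My card was declined at the ATM"),
    ("north", "Account locked after one attempt"),
]

region, count = busiest_region(posts)
print(f"{region}: {count} posts, top issue = {top_issue(posts, region)}")
```

The output of a step like this is what would be routed to the regional IT team as an alert.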

Vendor Spotlight – Expert System

Expert System offers its AI-based Cogito software to banks and other financial industries and markets it with numerous capabilities. One such capability is enterprise search, which the company calls Cogito Discover.

The product also has a lesser focus on external search for investment research. Cogito Discover can purportedly help compliance officers search for relevant customer information or contracts that they can use to prove compliance in various business areas.

Expert System specifies that Cogito Discover is capable of running through text data in a variety of sources, including enterprise documents, email communications, and web pages.

In addition, the company highlights Cogito’s purported ability to classify text data and documents, as well as enrich that text data and its metadata. For example, a user could upload a new document into the system, and Cogito Discover seems able to organize that document so that it’s searchable with the Cogito Discover search function. It’s unclear how the product does this, but it isn’t outside the range of capabilities for machine learning-based software.

Adoption Considerations for Text Mining Applications

Nishant Chandra, PhD, Sr. Director of Data Products at Visa, believes the best way to approach such an AI project is to first look at what academic use-cases exist for NLP and which of those can be repurposed for banking applications. He says:

What we want to do is take academic use cases and make it significantly useful in the production of AI in industry. For example, in natural language processing document summarization, the user can find the keywords and summarize it. The hierarchical approach to this is to take that document and create context. This can be applied to banking documents to allow search and retrieval of contextual information from within documents.

Bankers need to understand that NLP systems require a lot of maintenance and servicing before they can be accurate.

In enterprise applications, such software might need to understand banking jargon. In customer-facing applications, the software might need to be tuned to identify sentiment in ambiguous customer messages.

Training software to overcome these hurdles requires data feedback loops and ongoing iteration of algorithms with input from both data scientists and subject-matter experts.

Banks might find that even after a solution is deployed, alterations to the software are inevitable. For instance, customer preferences might change, or customers on social media might start using a new abbreviation or symbol that needs to be given context for the software to “understand” it.

Banks might also acquire a new communication channel, such as a chatbot, or restructure their organizations and their internal data flows. All of these would require further upgrades and tweaks to the NLP algorithms. 

NLP solutions are fairly well-established in banking. NLP products accounted for the largest share in our report in terms of vendor product offering functions. AI-based information search and discovery, text mining, and natural language generation (NLG) are applications of this technology which are now possible due to AI and machine learning techniques. 


Header image credit: Amvik Solutions