Big Data in Finance – Current Applications and Trends

Raghav Bharadwaj

Raghav serves as Analyst at Emerj, covering AI trends across major industry updates and conducting qualitative and quantitative research. He previously worked for Frost & Sullivan and Infiniti Research.

International Data Corporation (IDC) reported in its Worldwide Semiannual Big Data and Analytics Spending Guide that global investment in big data and business analytics (BDA) will grow from $130.1 billion in 2016 to more than $203 billion in 2020. In a previous report, we covered machine learning in the finance sector; in this report, we dive deeper into big data solutions and data management platforms for financial institutions.

This report covers vendors offering software across three applications:

  • Business Intelligence
  • Cybersecurity
  • Regulatory Compliance

Business Intelligence


Qlik offers software called Qlik Analytics Platform, which it claims can help banks and financial institutions gain business intelligence insights using big data analytics, such as identifying which of their products are not selling well or running what-if scenarios for events such as natural disasters.

Qlik claims financial institutions can work with them to first create a big data repository where all the company’s data is collected. This includes unstructured and structured data such as data streams, server logs, and RFID logs from enterprise computers and machines, website activity and point-of-sale data, transaction records, and social media feeds.

Once all the data has been collected in the repository, most finance businesses might choose to use an Enterprise Data Warehouse software which can store, extract, organize and load data into the analytical tool in the platform. Users can then search and discover patterns in the data through a dashboard to gain business insights.

For example, a bank could use the Qlik Analytics Platform to draw on its enterprise data and better understand its sales metrics. The user could upload customer transaction data and sales revenue records to the platform. The bank would work with developers from Qlik to create a data repository and an enterprise data warehouse, which in turn feeds the organized data to the analytics tool.

The algorithm behind the software would then be able to find relevant and contextual insights for specific departments or roles in the bank. For instance, local branch managers might be able to access the sales, customer intelligence, and market dynamics specific to their branch. The system then provides other contextual information related to the branch manager’s requirements, such as graphs and charts correlating sales and products or a list of categories including date, location, customer, and sales history that can be used to discover ways to personalize products for customers.
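As a rough illustration of the branch-level view described above, the sketch below filters transaction records to one branch, totals sales per product, and lists products that fall under a sales threshold. It is not Qlik's implementation; the record fields (`branch`, `product`, `amount`), the sample data, and the threshold are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical transaction records; the field names are illustrative assumptions.
TRANSACTIONS = [
    {"branch": "north", "product": "mortgage", "amount": 1200.0},
    {"branch": "north", "product": "credit_card", "amount": 300.0},
    {"branch": "north", "product": "mortgage", "amount": 800.0},
    {"branch": "south", "product": "savings", "amount": 500.0},
    {"branch": "north", "product": "savings", "amount": 50.0},
]


def sales_by_product(transactions, branch):
    """Total sales amount per product for a single branch."""
    totals = defaultdict(float)
    for row in transactions:
        if row["branch"] == branch:
            totals[row["product"]] += row["amount"]
    return dict(totals)


def underperformers(totals, threshold):
    """Products whose total sales fall below the threshold, sorted by name."""
    return sorted(name for name, amount in totals.items() if amount < threshold)


north_totals = sales_by_product(TRANSACTIONS, "north")
print(north_totals)
print(underperformers(north_totals, threshold=100.0))
```

A production platform would read from the data warehouse instead of an in-memory list, but the grouping and filtering steps would look broadly similar.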

Below is a short 4-minute video demonstrating how Qlik Analytics Platform works:

Qlik claims to have helped ANZ extend business intelligence to ‘the average employee’ in the organization. ANZ worked with Qlik to organize its structured and unstructured data, such as transaction records, social media feeds, and customer feedback, into ‘data lakes’. Qlik’s analytics platform was then used to let ANZ employees without technical or data science backgrounds search and discover information in this data through a simple search interface in a dashboard.

According to the case study, ANZ employees were then able to query the data lakes using simple search phrases like “mortgage risk” or “customer transaction data”. For each search query, the employees would automatically be shown links to all relevant and contextual data sources in the data lakes. We could find no comparable results or ROI information for the project.
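The search behavior described in the case study can be pictured with a very small keyword lookup. The sketch below is only an illustration under stated assumptions: the catalog, its file names, and its tags are hypothetical, and ranking by the number of matching words is a stand-in for whatever relevance logic the real platform uses.

```python
# Hypothetical catalog of data-lake sources, each described by a set of tags.
CATALOG = {
    "loans/mortgage_book.parquet": {"mortgage", "risk", "loans"},
    "cards/transactions_2017.csv": {"customer", "transaction", "data"},
    "social/brand_mentions.json": {"social", "media", "customer"},
}


def search(catalog, query):
    """Return source names ranked by how many query words match their tags."""
    words = set(query.lower().split())
    scored = [(len(words & tags), name) for name, tags in catalog.items()]
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
    return [name for score, name in ranked if score > 0]


print(search(CATALOG, "mortgage risk"))
print(search(CATALOG, "customer transaction data"))
```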

Qlik also lists Citigroup, Westpac and Teachers Mutual Bank as some of their past clients.

Charles Potter is CTO at Qlik. He holds an MS and a Bachelor’s degree in computer science from the University of Ottawa. Previously, Potter served as Director of Development, Business Intelligence and Cognos Platform at IBM and Senior Vice President of Engineering at CA Technologies.

Cybersecurity

Versive offers software called Versive Security Engine, which it claims can help banks and financial institutions analyze large datasets of transactions and cybersecurity-related data using machine learning.

Versive claims financial institutions can integrate the software into standard infrastructure in cloud, hybrid, or on-premises environments. Clients can use their computer network data (NetFlow, proxy, and DNS data) as inputs to the Versive Security Engine. Banks or financial institutions work with a team of developers from Versive to integrate the security platform in two phases.

Initially, the Versive Security Engine software is installed on the cloud, hybrid or on-premises enterprise network of the company, and then the software ingests the company’s internal data (mentioned above). In the second phase, the software uses machine learning algorithms to identify patterns in the data for the networks that indicate “normal” network characteristics. These patterns are correlated with a “baseline of operations” from periods of data where the bank does not face any cybersecurity events.

Then, over a brief training period using human analysts, the software learns to identify abnormal network characteristics that could indicate a cybersecurity threat. The software also builds a daily “Threat Case” list identifying patterns in the data which might indicate future intrusion attempts.
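The idea of learning a “normal” baseline and then flagging departures from it can be sketched with a simple z-score check. This is a deliberately basic statistical stand-in, not Versive's machine learning method; the metric (DNS queries per hour), the host names, and the threshold of three standard deviations are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def fit_baseline(values):
    """Summarize a quiet period as (mean, sample standard deviation)."""
    return mean(values), stdev(values)


def flag_anomalies(observations, baseline, threshold=3.0):
    """Return (host, z_score) pairs whose value deviates strongly from baseline."""
    mu, sigma = baseline
    flagged = []
    for host, value in observations.items():
        z = (value - mu) / sigma if sigma > 0 else 0.0
        if abs(z) > threshold:
            flagged.append((host, round(z, 2)))
    # Most extreme deviations first, similar in spirit to a prioritized case list.
    return sorted(flagged, key=lambda item: -abs(item[1]))


# Hypothetical DNS queries per hour during a period with no security events.
baseline = fit_baseline([100, 110, 90, 105, 95])

# Hypothetical current readings per host.
today = {"host-a": 102, "host-b": 180, "host-c": 97}
print(flag_anomalies(today, baseline))
```

In practice an analyst would review each flagged host, and that feedback would be used to tune the threshold or retrain the model.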

The system then provides a dashboard showing the threat level for each potential cybersecurity event with a map of the highlighted host computers that can be reviewed by the security analysts at the bank or financial institution. The software also generates a report on the key findings with an executive summary for the security heads in the client firm.

Below is a short 2-minute video demonstrating how Versive Security Engine works:

Versive claims to have helped Riaz Invest Limited improve its customer data security. Riaz integrated Versive’s software into its enterprise security network and used its proxy, flow, and DNS data to generate Threat Cases for review by Riaz security analysts. Versive claims its software reduced the number of false positives in the threat identification process for Riaz. We were unable to find any other measurable results for this case study.

Versive also lists Thomson Reuters, Komatsu and Deutsche Telekom as some of their past clients.

Brigham Anderson and Majid Alkaee Taleghan serve as machine learning scientists at Versive. Anderson holds a PhD in Robotics from Carnegie Mellon and previously served as a data scientist at eBay, Microsoft, and Maana. Taleghan holds a PhD in Machine Learning from Oregon State University and previously served as a Deep Learning Intern at NASA.

Regulatory Compliance


Ayasdi offers big data analytics and artificial intelligence services through its software, Ayasdi’s Model Accelerator (AMA), which it claims can help enterprises in financial services predict and model regulatory risk using machine learning. Ayasdi claims its software can help banks with applications such as regulatory compliance for anti-money laundering (AML), automatically monitoring customer transaction data to identify anomalies and reducing false positive rates in fraud detection compared to traditional rule-based methods. Ayasdi also reports that its platform uses Topological Data Analysis (TDA), which was developed for a project funded by DARPA.

Ayasdi claims banks and financial institutions can integrate the software into their enterprise data networks. The user could then upload customer transaction data or sales revenue records into the AMA. The algorithm behind the software would then be able to comb through the data to test and compare several different risk models, such as loss given default (LGD), probability of default (PD), and other regulatory models. The system then gives users the option of viewing insights from the data on a dashboard that allows them to search, discover, and predict risks.
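To make the idea of testing and comparing risk models more concrete, the sketch below scores two candidate probability-of-default models against observed outcomes using the Brier score (the mean squared error of the predicted probabilities) and then applies the standard expected-loss formula, PD × LGD × exposure. It is not a description of the AMA; the model names, predictions, outcomes, and loan figures are hypothetical.

```python
def brier_score(predicted, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(predicted, outcomes)) / len(outcomes)


def expected_loss(prob_default, loss_given_default, exposure):
    """Standard expected-loss formula: PD x LGD x exposure at default."""
    return prob_default * loss_given_default * exposure


# Hypothetical held-out outcomes (1 = the borrower defaulted).
OUTCOMES = [0, 0, 1, 0, 1]

# Hypothetical predictions from two candidate PD models.
CANDIDATES = {
    "model_a": [0.1, 0.2, 0.8, 0.1, 0.7],
    "model_b": [0.4, 0.5, 0.5, 0.4, 0.5],
}

scores = {name: brier_score(preds, OUTCOMES) for name, preds in CANDIDATES.items()}
best_model = min(scores, key=scores.get)  # a lower Brier score is better
print(scores, best_model)

# Expected loss on a hypothetical 1,000,000 loan with PD = 2% and LGD = 45%.
print(expected_loss(0.02, 0.45, 1_000_000))
```

Regulatory model selection also weighs explainability and stability, which a single accuracy score does not capture.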

Below is a short 2-minute video demonstrating how AMA software can be used for fraud detection:

Ayasdi claims in a 2017 case study that they were chosen by Citigroup to help create justifiable models of Citi’s revenue and capital reserve forecast to pass the Federal Reserve’s Comprehensive Capital Analysis and Review (CCAR) process.

The CCAR process was initiated after the 2008 financial recession to assess the financial standing of banks, and Citi had failed the first two out of three annual CCAR stress tests conducted by the Federal Reserve. A team of developers from Ayasdi worked alongside subject matter experts in the bank’s business units to understand and collect data on macroeconomic variables such as revenue and capital reserves, as stipulated by the Federal Reserve.

Ayasdi’s machine learning platform was then used to correlate the impact of the increase or decrease in these variables on each business unit’s monthly revenue performance over a six-month period.
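A first-pass way to relate a macroeconomic variable to a business unit's monthly revenue is a plain Pearson correlation over the six-month window. The sketch below is an assumption-laden illustration, not Ayasdi's method: the unemployment and revenue figures are invented, and a real stress-testing model would use many variables and far more history.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Hypothetical six months of data for one business unit.
UNEMPLOYMENT_RATE = [5.0, 5.2, 5.5, 5.9, 6.1, 6.4]   # percent
MONTHLY_REVENUE = [10.2, 10.0, 9.6, 9.1, 9.0, 8.5]   # millions

r = pearson(UNEMPLOYMENT_RATE, MONTHLY_REVENUE)
print(round(r, 3))  # a value near -1 suggests revenue falls as unemployment rises
```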

The company claims to have developed several models to predict the future performance of these business units under different market conditions using the AMA. The feedback learning component of the project was in the form of insights from the business unit heads, who were once again roped in to evaluate the predictive model’s final performance.

According to Ayasdi, before the integration, the regulatory methodology that Citi followed was a nine-month process involving hundreds of employees. After the project, this time was cut down to three months, using fewer than 100 employees.

Ayasdi also names HSBC as a client for an anti-money laundering application.

Ayasdi was co-founded by Gunnar Carlsson, Professor Emeritus in the Department of Mathematics at Stanford University, and CEO Gurjeet Singh, who earned a PhD from Stanford in computational and mathematical engineering.

Takeaways for Business Leaders in Finance

We found that AI solutions for compliance and fraud detection automation have the most traction. With data privacy becoming a hot-button topic, financial institutions have been forced to upgrade cybersecurity measures to ensure that they don’t suffer data breaches (such as the Equifax data breach in 2017). Because this application bears directly on a company’s image and on customer concerns, finance companies seem to have prioritized it for big data analytics.

Neither Qlik nor Versive employs C-level executives with robust AI backgrounds on its team. Ayasdi has raised the most venture capital, around $106 million so far, and Qlik has raised the least, with $12.5 million. Ayasdi lists the most case studies with marquee clients, including HSBC and Citigroup. All the companies listed in this report state that their software does not require clients to have data science talent on staff to help with integration. Search and discovery applications seem to take the longest to integrate.

Companies like Ayasdi that offer big data analytics solutions are led by C-level executives with extensive academic backgrounds in AI, which lends credence to the legitimacy of their software.

Financial businesses looking to adopt big data practices might want to start by working through the checklist of data requirements below, which applies to most such applications:

  • Data Volume: data might be considered “big data” when a company generates several terabytes or petabytes of it that need to be analyzed. The financial industry produces a huge volume of quotes, market data, and historical trade data, making it ripe for big data analytics.
  • Data Velocity represents the speed of incoming data to an enterprise or the speed at which the data is being generated. All of this data needs storage or processing, and for financial markets, what this means in real-world terms is that faster trade data processing leads to faster trading.
  • Data Variety is the existence of various formats and sources of data. In banking and finance, data such as reference data, trade and market data, and transaction data can be structured or unstructured. Big data solutions of today can help businesses manage all their data in one place.


Header Image Credit: The Conversation