Machine Vision in Banking – Facial Recognition and OCR

Niccolo Mejia

Niccolo is a content writer and Junior Analyst at Emerj, developing both web content and helping with quantitative research. He holds a bachelor's degree in Writing, Literature, and Publishing from Emerson College.


The conversation surrounding machine vision can sometimes feel dominated by facial recognition, especially for banking. Banks are necessarily concerned with security and customer service, so it follows that facial recognition, which could help with both, would prove of interest to them. 

It’s true that many of the machine vision solutions marketed to banks are used for facial recognition, but some offer optical character recognition (OCR) for digitizing paper documents.

In this article, we’ll provide an overview of machine vision in banking from facial recognition to OCR by exploring companies that offer these capabilities to banks.

SenseTime offers facial recognition, facial comparison, and live body detection for security at ATMs. This allows for detecting attempts at tricking the camera with a still image. Yitu Technology also offers facial recognition for ATMs, but their solution can also scan the faces of passersby and compare moving clips of customers or pedestrians with static images to verify a person’s face.

ABBYY is a provider of optical character recognition (OCR) software that can extract data from physical documents and organize data points to be stored in appropriate databases. Banks could use a solution like this to digitize important information from customers’ physical documents like tax forms and certificates showing good standing with taxes. Ocrolus also offers OCR software, but it is mainly marketed for bank statement review and application approval.

The companies we found that claim to offer machine vision solutions to banks advertise their software to handle at least one of two problems: facial recognition-based security at ATMs, or OCR for data mining and document search.

We will start with SenseTime, which offers facial recognition software, and we will go one by one through four companies that offer machine vision software to banks.

Facial Recognition-based Security at ATMs


SenseTime

SenseTime offers software called SenseID, along with SenseID authentication services, which it claims can help banks confirm that a user’s identity is authentic by combining facial recognition with other identity authentication tools. The company says this improves security at ATMs and on mobile apps.

SenseID purportedly lowers risks for banks and financial institutions by offering identity authentication powered by facial recognition. The software can also detect the presence of a live body to catch attempts to trick it with a still image. It can recognize bank cards and can likely match them to the faces associated with the attached bank account. SenseTime also claims the software can compare faces to ones already in its system for more efficient verification.

The software is meant to ensure that a customer at an ATM or on a mobile app is not trying to “trick” it with a falsified image of someone’s face. If it detects that this may be the case, it can deny the user access to the account. To do this, the software requires the customer to stand before the ATM camera and look into it. On a mobile app, the customer would need to hold their face level with their device’s camera. The software then scans the customer’s face, checks for signs of liveness, and compares specific facial features with those of the face in the photo ID associated with the account being accessed.

It’s likely that the machine learning model behind SenseTime’s software was trained on tens of thousands of images and video clips of human faces at multiple angles and in multiple lighting conditions. Client companies would also need to upload images and video of their customers’ faces as training data. A data scientist would then expose the software to this data, training it to discern the 1’s and 0’s that, to humans, form the image of a person’s face in a picture or in live footage from a camera. In this case, the camera is one attached to the front of an ATM.

A bank could then outfit an ATM with the software and show it a new person’s face through the ATM camera. The algorithm behind the software could then recognize whether that person is a customer and, if not, signal the ATM to deny access to any accounts. For example, even if the ATM recognized that a customer’s bank card was inserted, the machine could still deny access if it does not detect the customer’s face in the camera’s view.

A bank could also test the software by attempting to trick it with a high-quality printed or screen-displayed image of someone’s face. The algorithm behind the software could then determine whether the face in the camera’s view belongs to a customer and whether it is actually that customer, in person, trying to access their account. If the software determines that the face is not moving enough to indicate liveness, or that it is not the face of an active customer, it will deny the user access.
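
The two checks described above (is the face live, and does it match the account holder) can be sketched as a simple decision function. This is a minimal illustration and not SenseTime's implementation: the embedding vectors, the liveness score, the thresholds, and every name below are hypothetical stand-ins for whatever a real face model would produce.

```python
import math
from dataclasses import dataclass


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class CameraCapture:
    # Hypothetical outputs of a face model run on the ATM camera feed.
    face_embedding: list      # numeric vector describing the face
    liveness_score: float     # 0.0 (still image) to 1.0 (live person)


def allow_access(capture, enrolled_embedding,
                 match_threshold=0.8, liveness_threshold=0.5):
    """Grant access only if the face is live AND matches the enrolled face.

    Both thresholds are illustrative assumptions, not vendor values.
    """
    if capture.liveness_score < liveness_threshold:
        return False  # likely a printed or screen-displayed image
    similarity = cosine_similarity(capture.face_embedding, enrolled_embedding)
    return similarity >= match_threshold


# Usage: the same face passes when live and fails when shown as a still image.
enrolled = [0.1, 0.9, 0.4]
live_customer = CameraCapture([0.1, 0.9, 0.4], liveness_score=0.9)
printed_photo = CameraCapture([0.1, 0.9, 0.4], liveness_score=0.1)
stranger = CameraCapture([0.9, -0.1, 0.0], liveness_score=0.9)
```

Checking liveness first means a spoofed image is rejected before any identity comparison is made, which mirrors the order of checks the article describes.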

SenseTime does not make any case studies available that show a bank’s success with the software.

SenseTime does not list any major banks as clients, but the company has raised $2.6 billion in venture capital and is backed by Silver Lake Partners, Tiger Global Management, and Alibaba Group.

Li Xu is CEO at SenseTime. He holds a PhD in Computer Science and Engineering from the Chinese University of Hong Kong. Previously, Xu served as Manager and Advisory Researcher of imaging and sensing at Lenovo.

Yitu Technology

Yitu Technology offers software called YITU Dragonfly Eye Intelligent Security System, which it claims can help banks offer heightened security with facial recognition for ATMs using machine vision.

Dragonfly Eye is the company’s chief solution for AI-based security. Banks can generate a database of faces from the urban population surrounding their ATMs to more easily identify customers as they approach. This would allow them to record the faces of passersby and use that data to identify them should they use the ATM in the future.

We can infer that Yitu Technology’s machine learning model for Dragonfly Eye was trained on tens of thousands of images and clips from ATM cameras showing customers’ faces. These would also need to be from multiple angles and in multiple lighting conditions to prepare the software for the range of conditions, such as time of day, that occur during a public transaction. Each face would then be labeled with the associated name and account number. A Yitu Technology employee or the client company’s AI staff would then run this data through the software’s algorithm. This would have trained the software to recognize the 1’s and 0’s that appear to people as someone’s face in footage from the camera of one ATM or another.

A customer could then approach an ATM fitted with a camera powered by Dragonfly Eye and try to make a cash withdrawal. The software’s algorithm would then be able to recognize the customer’s face and determine the bank account associated with it. It can then accept or reject the request based on whether the person is in the bank’s customer database.
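
Looking up a face in a customer database, as described above, is a one-to-many search. The sketch below shows one simple way to do it, assuming each enrolled customer is stored as a numeric face vector. It is not Yitu Technology's method; the account IDs, vectors, and distance cutoff are hypothetical.

```python
import math


def identify_customer(probe, database, max_distance=0.6):
    """Return the account whose enrolled face vector is closest to `probe`.

    `database` maps account IDs to face vectors. Returns None when no
    enrolled face is within `max_distance` (an illustrative cutoff),
    which corresponds to rejecting the withdrawal request.
    """
    best_account = None
    best_distance = max_distance
    for account_id, enrolled in database.items():
        distance = math.dist(probe, enrolled)  # Euclidean distance
        if distance <= best_distance:
            best_account, best_distance = account_id, distance
    return best_account


# Usage with made-up vectors for two enrolled customers.
customer_db = {
    "acct-001": [0.10, 0.90, 0.40],
    "acct-002": [0.80, 0.20, 0.10],
}
known_face = identify_customer([0.12, 0.88, 0.41], customer_db)
unknown_face = identify_customer([5.0, 5.0, 5.0], customer_db)
```

A linear scan like this is only practical for small databases. A system covering a whole urban population would need an indexed nearest-neighbor search, which is outside the scope of this sketch.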

Below is a short 2-minute video demonstrating how Yitu Technology’s software works with cardless ATMs that rely solely on the customer’s face and PIN for verification:

Yitu Technology claims to have helped JD Finance transform password verification for their banking and finance clients. JD Finance utilized Yitu Technology’s Dragonfly Eye system to replace passwords and signatures with facial recognition. According to the case study, JD Finance incorporated this solution into their internet banking and financial services. However, it should be noted that the case study offers no statistics on JD Finance’s success. We caution readers against taking case studies like this as an accurate measure of the actual value the solutions provide.

Yitu Technology also lists Shanghai Pudong Development Bank and China Merchants Bank as some of their past clients.

Leo Zhu is the CEO of Yitu Technology. He holds a PhD in Statistics specializing in statistical modeling of computer vision and AI from UCLA. Before founding Yitu Technology with Lin Chenxi in 2012, Zhu was a postdoctoral fellow at MIT’s AI Laboratory.

OCR for Data Mining and Document Search


ABBYY

ABBYY offers a namesake forms processing software, which it claims can help banks extract and utilize data from physical documents using machine vision.

ABBYY’s solution can purportedly digitize forms and documents of various levels of complexity into an electronic format ready for validation or processing. Information from physical documents can be formatted electronically for use in other departments. The company states the software can also organize data points from digitized documents into categories, such as dates, names, or a payment amount.

ABBYY’s machine learning model likely needs to be trained on the client company’s data and on the forms it wants to extract information from. For banks, this could be documents such as bank statements, customer IDs, or certificates showing a good standing with taxes. Each document would be labeled by the type of document it is. A data scientist would then run this data through the software. This would train the algorithm to discern the 1’s and 0’s that appear to humans as letters and chains of text in banking and legal documents such as contracts.

A user could then run new, unlabeled documents through the software. The algorithm behind it could then recognize all the written characters on the given pages and transcribe them into a digital database. For example, a bank could speed up opening a new account by using the software to digitize customers’ photo ID or proof of residency for approval by an employee.
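
Once a page has been transcribed, organizing the text into categories such as dates, names, or payment amounts can be illustrated with a few pattern matches. The sketch below assumes the OCR step has already produced plain text and that the document uses ISO-style dates, a "Name:" label, and dollar amounts. Those formats and all names are assumptions for illustration, and ABBYY's software is not claimed to work this way.

```python
import re

# Illustrative patterns; real documents vary far more than this.
DATE_PATTERN = re.compile(r"\b(\d{4}-\d{2}-\d{2})\b")
NAME_PATTERN = re.compile(r"^Name:\s*(.+)$", re.MULTILINE)
AMOUNT_PATTERN = re.compile(r"\$\s?(\d+(?:,\d{3})*(?:\.\d{2})?)")


def extract_fields(ocr_text):
    """Sort recognized text into dates, names, and payment amounts."""
    return {
        "dates": DATE_PATTERN.findall(ocr_text),
        "names": [name.strip() for name in NAME_PATTERN.findall(ocr_text)],
        "amounts": [
            float(amount.replace(",", ""))
            for amount in AMOUNT_PATTERN.findall(ocr_text)
        ],
    }


# Usage on a made-up transcription of a payment form.
sample_text = (
    "Name: Jane Doe\n"
    "Statement date: 2019-03-31\n"
    "Payment amount: $1,250.00\n"
)
fields = extract_fields(sample_text)
```

The resulting dictionary could then be written to whichever database the bank uses for the corresponding document type.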

Below is a short 3-minute video demonstrating how ABBYY’s software works for digitizing and extracting data from physical financial documents:

ABBYY claims to have helped Finansbank automate the input of credit card application forms. When applying for a credit card, physical documents such as a passport photocopy or a revenue certificate are required. Finansbank needed an OCR solution for collecting data from these documents to verify and process applications. According to the case study, Finansbank increased their daily credit card application processing from 1,000 to 10,000 applications. At the same time, they were able to cut their number of human operators by 87.5%.
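
Taken together, the two figures in the case study imply a large change in applications handled per operator. The short calculation below works this out, assuming both figures describe the same team over comparable periods, which the case study does not state explicitly.

```python
# Figures reported in the case study.
daily_before = 1_000          # applications per day before the OCR rollout
daily_after = 10_000          # applications per day after the OCR rollout
operator_reduction = 0.875    # 87.5% fewer human operators

# Throughput grew tenfold while the team shrank to one eighth its size.
throughput_multiplier = daily_after / daily_before
remaining_operator_fraction = 1 - operator_reduction

# Applications handled per remaining operator, relative to before.
per_operator_multiplier = throughput_multiplier / remaining_operator_fraction

print(f"Throughput: {throughput_multiplier:.0f}x")
print(f"Operators remaining: {remaining_operator_fraction:.1%}")
print(f"Per-operator throughput: {per_operator_multiplier:.0f}x")
```

Under that assumption, each remaining operator would be associated with roughly 80 times as many processed applications as before.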

ABBYY also lists Bankstream, Banque Populaire de l’Ouest, and Caisse d’Epargne Bretagne Pays de Loire as some of their past clients.

Andrey Isaev is Chief Product Officer at ABBYY. He holds an MS in Management and Applied Mathematics from the Moscow Institute of Physics and Technology. Previously, Isaev served as Chief of Mobile Software Development Division at Paragon Software.


Ocrolus

Ocrolus offers software called PerfectAudit, which it claims can help bank lenders and financial institutions digitize bank statements and financial documents so that employees can audit them digitally.

The company claims the software can use data gathered from digitized documents to produce analytics for the client company to use for business insights. For example, a client company could use document data to find the most and least often submitted document types and which demographics tend to submit them. This could help with customer service or segmentation.
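
The document-type example above amounts to a frequency count. A minimal sketch, assuming each digitized document has already been tagged with a type label (the labels here are made up), might look like this:

```python
from collections import Counter


def most_and_least_submitted(document_types):
    """Return ((type, count), (type, count)) for the most and least
    frequently submitted document types, or None for an empty input."""
    counts = Counter(document_types)
    if not counts:
        return None
    ranked = counts.most_common()  # sorted from most to least frequent
    return ranked[0], ranked[-1]


# Usage with hypothetical labels for six submitted documents.
submitted = [
    "bank_statement", "invoice", "bank_statement",
    "pay_stub", "bank_statement", "invoice",
]
most, least = most_and_least_submitted(submitted)
```

Breaking the same counts down by customer demographic would only require grouping the labels by segment before counting.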

The machine learning model behind PerfectAudit was likely trained on tens of thousands of bank statements and other financial documents like invoices. Each document would have its data points labeled according to the type of data they are. This could include customer profile data like date of birth, or financial data like the amount deposited into a customer’s account across a given period of time. Ocrolus would then run this labeled data through PerfectAudit’s machine learning algorithm. This would train the software to discern the words and phrases that people perceive as information written or printed on a financial document.

A client could then run new, unlabeled documents through the software. The algorithm could then recognize all the text characters on each document, transcribe them to a digital format, and sort them into categories. These categories could be dates of birth, people’s names, credit scores, or certifications of tax standing. The software could also mark areas that are left blank or have missing information.
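
Marking blank or missing areas can be pictured as a completeness check over the extracted categories. The sketch below is illustrative only: the field names are hypothetical, and it does not represent how PerfectAudit detects missing information.

```python
# Hypothetical set of fields a lender might require on every application.
REQUIRED_FIELDS = ("date_of_birth", "name", "credit_score", "tax_standing")


def find_missing_fields(extracted, required=REQUIRED_FIELDS):
    """List required fields that are absent or blank in `extracted`."""
    missing = []
    for field in required:
        value = extracted.get(field)
        # Treat None, empty strings, and whitespace-only strings as blank.
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing


# Usage: the name was left blank and the tax standing was never captured.
extracted_document = {
    "date_of_birth": "1980-01-01",
    "name": "   ",
    "credit_score": 710,
}
flagged = find_missing_fields(extracted_document)
```

A reviewer could then be shown only the flagged fields rather than re-reading the whole document.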

The company states PerfectAudit can be integrated into Salesforce and other large CRMs, as well as a client’s existing underwriting models.

Below is a 9-minute video demonstrating how Ocrolus’ PerfectAudit works:

Ocrolus does not make available any case studies reporting a bank’s success with their software, but they list Lendio and Pearl Capital as some of their past clients.

Vikas Dua is COO at Ocrolus. He holds a PhD in Computer Science and Engineering from the Chinese University of Hong Kong. Previously, Dua served as Director of Special Operations at Handy HQ.


