Traditionally, responsible AI management practices are not just a matter of theoretical ethics debates but a practical matter of compliance. The financial sector is heavily regulated and monitored to ensure the safety and security of consumers and the overall economy, and many of those regulations are written so that responsible management practices amount to avoiding penalties.
However, tools that use natural language processing to generate text – like ChatGPT – require immense resources and infrastructure to operate without harming consumers and their information. ChatGPT is limited in its ability to make reliable decisions in customer-facing situations, as its decision-making process is not always transparent or easily explainable.
This lack of transparency can lead to biases or errors in decision-making, with significant consequences for customers and financial institutions alike. Now more than ever, responsible AI management is not just a compliance concern but a safeguard against the direct impact on bottom lines when AI-enhanced systems are mismanaged.
In the following analysis of their conversation, we examine two key insights:
- ChatGPT’s limitations in financial services and other sectors: ChatGPT offers significant value as a research tool, but it lacks the controls and responsible AI safeguards needed for customer-facing decisions.
- Integrating AI and human workflows: ChatGPT can provide compelling alternatives to aid human decision-making, but it must work with – not instead of – humans to achieve the best results.
Listen to the full episode below:
Guest: Scott Zoldi, Chief Analytics Officer at FICO
Expertise: Fraud analytics, cybersecurity, explainable and ethical AI, unstructured data analytics, unsupervised machine learning, and utility analytics
Brief Recognition: A seasoned technology executive with over 25 years of experience in AI, machine learning, and advanced analytics, Scott Zoldi leads the development of innovative analytics solutions for FICO’s clients. He has authored over 120 patents, with 80 granted and 47 in progress. He holds a Ph.D. in computer science from Duke University and is a member of the Forbes Technology Council.
ChatGPT’s Limitations in Financial Services and Other Sectors
Scott begins by setting the record straight: although ChatGPT is an excellent tool for gathering certain kinds of information, much remains to be explored and researched to understand how the tool makes its decisions. He raises concerns about whether the data it draws on has been fact-checked and whether its responses carry potential bias.
While he concedes ChatGPT has tremendous, objective value in financial services, he cautions that current iterations are still far too primitive for direct, customer-facing operations. Other large language model-driven tools that resemble ChatGPT have potential, but there’s still a long way to go before they can be relied upon for crucial decision-making.
Emerj Senior Editor Matthew DeMello asks Zoldi about the comparison he draws in his FICO blog post between ChatGPT and the sensationalized benefits of Ozempic, the newly released diabetes and weight loss wonder drug.
Zoldi is quick to clarify that the substance of the comparison is that responsible systems are explainable systems: “What generally happens with AI is that we get to a point where we’re calloused because the technology told us so, and we can’t figure it out,” Zoldi tells Emerj. “That is dangerous, and we need controls on top.”
In other words, Scott describes how “auditable, safe, and ethical” AI practices in business decisions are downstream from transparency and other values built into systems to ensure users and administrators understand those systems.
Noting that technology affecting as many people as LLMs do takes work and time to make optimally safe for mass use, Zoldi draws a larger analogy: AI built on transformers is good at taking information and putting it in a format we can understand and ingest.
He acknowledges that AI has advantages, such as enhancing human understanding and efficiency, but notes that it’s important to consider use cases carefully.
For example, it may be safe to use AI for research rather than for tasks such as customer interactions or writing term papers. He cautions that it becomes a problem when we allow the technology to talk to a customer who is not satisfied with a decision another AI model made for them.
By contrast, the automated responses that currently go to customers are carefully reviewed and approved. He cites recent statements by Sam Altman, the CEO of OpenAI, cautioning against overexcitement about LLMs and other AI capabilities due to reliability issues.
Integrating AI and Human Efforts
Scott later discusses the limitations and potential dangers of artificial intelligence and machine learning. He cautions against blindly trusting AI, especially in wide-ranging or customer-facing operations, where mishaps can reverberate through the public sphere.
While large language models (LLMs) have great potential to enhance customer experience pipelines, Scott emphasizes the importance of other use cases, particularly in compliance.
He underscores how even the user interface can help amplify the ability of machine learning and other AI capabilities to detect anomalies in transaction records and other financial databases:
“Think of it almost like a detective; gathering all the facts will be Sherlock Holmes. Then you put all those pieces of information that this ChatGPT surface for you into your final conclusion,” he tells Emerj of the ostensible workflow.
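To make the “detective” workflow concrete, below is a minimal sketch of the kind of anomaly detection Zoldi alludes to. It is not FICO’s method: the model choice (scikit-learn’s IsolationForest), the transaction features, and the contamination rate are all illustrative assumptions, with flagged records surfaced for a human analyst who draws the final conclusion.

```python
# A minimal sketch of the anomaly detection workflow Zoldi describes.
# The features, data, and contamination rate are hypothetical; real
# transaction-scoring pipelines are far more sophisticated.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(seed=42)

# Hypothetical transaction features: amount (USD) and hour of day.
normal = np.column_stack([
    rng.lognormal(mean=3.5, sigma=0.5, size=1000),  # typical purchase amounts
    rng.normal(loc=14, scale=3, size=1000),         # mostly daytime activity
])
suspicious = np.array([[9500.0, 3.0], [7200.0, 4.0]])  # large, late-night spends
transactions = np.vstack([normal, suspicious])

# Unsupervised model; `contamination` is an assumed anomaly rate.
model = IsolationForest(contamination=0.01, random_state=42)
flags = model.fit_predict(transactions)  # -1 = anomaly, 1 = normal

# Surface flagged records for a human analyst -- the "Sherlock Holmes" step
# of gathering facts, with the final conclusion left to a person.
for row, flag in zip(transactions, flags):
    if flag == -1:
        print(f"Review: amount=${row[0]:,.2f}, hour={row[1]:.0f}")
```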
For these reasons, Scott insists that ChatGPT’s initial purpose in many financial services use cases is best thought of not as autonomously writing code without human supervision, but as offering human code and sales script writers alternatives to stimulate their creativity.
Part of the reason for keeping the technology at arm’s length from direct customer interactions, with the help of human supervision, is that LLMs remain prone to “hallucinations” – fabricating or mistaking context as a byproduct of predicting the next word in a sentence.
ChatGPT’s capacity for propagating misinformation is already well documented, prompting Zoldi to recommend that business leaders think of ChatGPT’s offerings in these areas as Mad Libs-style templates that need humans to fill in the blanks with concrete, verifiable facts.
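As a hypothetical illustration of that Mad Libs framing, the sketch below shows one way a team could enforce the “humans fill in the blanks” rule. The draft, field names, and values are all invented; the point is simply that nothing ships while an unverified blank remains.

```python
# A hypothetical illustration of Zoldi's "Mad Libs" framing: the model drafts
# the template, and a human fills every verifiable fact before anything ships.
import re

# Imagine an LLM produced this draft with explicit blanks for hard facts.
draft = (
    "Dear {customer_name}, your application dated {application_date} was "
    "declined because your reported income of {verified_income} fell below "
    "our threshold of {policy_threshold}."
)

# Facts supplied and checked by a human reviewer against source systems.
verified_facts = {
    "customer_name": "J. Doe",
    "application_date": "2023-04-12",
    "verified_income": "$38,000",
    "policy_threshold": "$45,000",
}

# Refuse to send anything while a blank remains unfilled.
placeholders = set(re.findall(r"\{(\w+)\}", draft))
missing = placeholders - verified_facts.keys()
if missing:
    raise ValueError(f"Human review incomplete; unverified blanks: {missing}")

print(draft.format(**verified_facts))
```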
As LLMs become more integrated throughout financial services workflows, Scott notes human agents in customer experience and other pipelines will need to be extra discerning (or “jaded”) about what information ChatGPT and other LLM-powered technologies bring to them and their customers.
Adding to the problem, making language models bigger or increasing their training hours does not inherently make them better at following a user’s intent. Because these models are not always aligned with the user, human supervisors play a pivotal role in aligning language models with user intent through fine-tuning.
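As a rough illustration of the supervised fine-tuning step this describes, here is a minimal sketch using the Hugging Face transformers library and a small stand-in model. The instruction/response pairs are invented, and real alignment pipelines (e.g., RLHF) involve far more than this.

```python
# A minimal sketch of supervised fine-tuning toward user intent: pairs of
# prompts and human-approved responses nudge the model's behavior.
# The model name and examples are placeholders for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # small stand-in model; not what production systems use
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical instruction/response pairs vetted by human supervisors.
examples = [
    "Instruction: Summarize the dispute policy.\nResponse: Disputes are reviewed within 10 days.",
    "Instruction: Explain the decline reason.\nResponse: The stated income was below our threshold.",
]

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for text in examples:
    batch = tokenizer(text, return_tensors="pt")
    # With labels == input_ids, the model is trained to reproduce the
    # human-approved response token by token (standard causal LM loss).
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(f"loss: {loss.item():.3f}")
```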
Clarifying the problem, AI21 Labs Co-founder and Co-CEO Ori Goshen noted the following regarding large language models in a separate Emerj interview:
“When you use these language models, they encode some knowledge, but they don’t encode it in an explicitly controllable way. It’s still very limited, and you need to work hard to extract certain knowledge. Also, there is probably no reliability because, at the end of the day, it’s a statistical model.”
– Ori Goshen, Co-founder and Co-CEO of AI21 Labs
Both Scott and Ori emphasize that the objective of language models should not be mistaken for understanding and communicating complex ideas; rather, it is predicting the next word token in a prompt based on the statistical probability that it will produce a positive user response. The distinction is vital for understanding why these models often misstate facts, generate biased text, and fail to produce consistent outcomes from user instructions.
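A toy example makes the distinction tangible. The sketch below greedily picks the likeliest next token from an invented probability distribution; the vocabulary and logits are made up, but the mechanism – choosing by statistical likelihood, with no notion of factual truth – is the point both interviewees are making.

```python
# A toy illustration of next-token prediction: the model selects the
# statistically likeliest token, with no notion of factual correctness.
# The vocabulary and logits here are invented for illustration.
import math

def softmax(logits):
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores a model might assign to candidate next tokens after
# the prompt "The customer's loan was" -- note the plausible-but-wrong options.
candidates = ["approved", "declined", "pending", "purple"]
logits = [2.1, 1.9, 0.4, -3.0]

probs = softmax(logits)
for token, p in sorted(zip(candidates, probs), key=lambda t: -t[1]):
    print(f"{token:>9}: {p:.3f}")

# Greedy decoding takes the argmax -- "approved" wins here whether or not
# the loan was actually approved, which is why outputs need verification.
print("next token:", max(zip(candidates, probs), key=lambda t: t[1])[0])
```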
Scott ends the interview by affirming his belief that, in time, the financial services sector will realize and accept that generative AI tools are not meant for autonomous creative tasks. In areas like verifying facts and fine-tuning anomaly detection efforts, humans and technology must ultimately share responsibility in workflows – at least given current iterations of ChatGPT.