AI for Speech Recognition and Transcription in Law and Legal

March 29, 2019

Have you ever been frustrated with how Alexa or Siri doesn’t always understand your verbal requests? If so, then you already understand the problem that our guest on this episode struggles with. He’s Tom Livne, co-founder and CEO of Verbit.ai.

Verbit is a company that focuses on AI for transcription, specifically for the law and legal space. They use a combination of machine learning and human experts to transcribe audio in different accents, in different noise environments, with different diction, to give people more accurate results and hopefully help the process scale.

In this episode, Livne explains six different factors that go into getting transcription right and getting AI to aid in the process. In addition, he talks about some of the critical factors that determine where transcription can bring value to a business.

Guest: Tom Livne, co-founder and CEO – Verbit.ai

Expertise: Entrepreneurship/tech startup life cycle

Brief Recognition: Livne holds an MBA from Yale

Interview Highlights

(03:00) Give us an understanding of what’s possible with transcription today?

TL: Think about this podcast. We are recording this episode and let’s assume we want to get a professional transcript. When I’m referring to a professional transcript, I mean 100% accuracy. And the way it’s been done today, it’s fully manual, right? People are listening and typing it from scratch and it creates a limited capacity of scale and low gross margin.

On the other hand, speech recognition technology can reach only 70 to 80%. If we’re going to court and give the automated transcript only, this is not good enough. So the way we solve it at Verbit is [with] the approach of the machine-human hybrid.
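The interview does not say how these accuracy percentages are measured. A common convention, assumed here, is word accuracy: one minus the word error rate (WER), where WER is the word-level edit distance between a reference transcript and the machine output, divided by the reference length. The sketch below uses made-up sentences, and the function name `word_error_rate` is illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] is the edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the hypothesis
                curr[j - 1] + 1,     # extra word in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / max(len(ref), 1)


# Invented example: two of the eleven reference words are misrecognized.
reference = "the witness stated that the defendant signed the contract on monday"
hypothesis = "the witness stated that the defendant sign the contact on monday"
wer = word_error_rate(reference, hypothesis)
accuracy = 1.0 - wer
print(f"WER: {wer:.3f}, word accuracy: {accuracy:.3f}")
```

Under this measure, two wrong words in an eleven-word sentence already pull accuracy down to roughly 82%, which shows how quickly small errors add up in a legal record.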

So we have our own speech recognition technology we’ve developed in-house. We have patents registered for our technology. We have a team of nine PhDs working on it. We have the combination of our network and platform of freelance transcribers from all around the globe that take in the automated output of the machine and correct it in order to bring it to 100%.
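The interview does not describe how Verbit's pipeline hands machine output to its transcribers. One plausible way to organize a machine-human hybrid, sketched here purely as an assumption, is to have the engine attach a confidence score to each segment and to put low-confidence segments at the front of the human review queue. The `Segment` class, the `confidence` field, and the 0.9 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    text: str
    confidence: float  # hypothetical engine score between 0.0 and 1.0


def split_for_review(segments, threshold=0.9):
    """Separate segments the machine is confident about from those needing a human first."""
    confident = [s for s in segments if s.confidence >= threshold]
    needs_human = [s for s in segments if s.confidence < threshold]
    return confident, needs_human


# Invented draft transcript; the second segment contains a likely misrecognition.
draft = [
    Segment("The court is now in session.", 0.97),
    Segment("Counsel for the plaintive may proceed.", 0.62),
    Segment("Thank you, Your Honor.", 0.95),
]
confident, review_queue = split_for_review(draft)
for segment in review_queue:
    print("Needs human correction:", segment.text)
```

In a setting that requires a fully accurate transcript, a human would presumably still read every segment; a split like this only decides where their attention goes first.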

So regarding what is possible, I mentioned that the technology is not there, and I’ll explain why. There are a few parameters that affect the accuracy of the speech recognition, and this is the reason why, in my point of view, even 10 years from now, we won’t be able to get to 100% with the machine only.

So the parameters that affect the accuracy of the speech recognition are: one, the language model. So think about if you go to legal transcription or medical transcription, there is a lot of specific jargon and specific words that are relevant for this use case. For the machine, it’s really, really hard to do it, also to get the names of people, also to get specific terminology, so this affects the accuracy.

The second thing is the acoustic model. So if you are talking in an open space or if you’re talking via phone or if you have a courtroom, et cetera, all of these are different acoustic models, and that also affects the accuracy of the speech.

And the third one, as you can hear my terrible Israeli accent, so usually accents affect the accuracy of the voice-to-text. So you need to tune it to train the machine for a specific accent. Then you have the fourth one: background noise. Overlapping of people, all the background noise, is really damaging the quality of the output of the machine.

And the fifth one is the pace of when you talk. You talk really, really fast or you’re talking slowly, then it also affects the accuracy.

And the last one will be the diction. If there are young people or children talking or elderly people talking, this is also really specific diction that affects the accuracy of the speech. So if you combine…all the parameters in a different use case, it’s really, really hard, almost impossible to get all of this correctly unless you have specific data for this specific use case. Combining all these parameters together for this specific customer will enable you to get 90-plus percent accuracy.

Our work in Verbit is not to replace the humans, [but] actually to help the human to do a better job and to make their life easier.

(08:30) These are challenging factors here. I’m wondering which of these is the most insurmountable.

TL: I think each one of those is very tough in its own unique way, but if you ask me, I think the acoustic model and the background noise and the ability to identify different speakers, et cetera, this is very hard to tackle and to make adaptation for different acoustic environments and…controlling the quality of the audio recording.

To be able to adapt the algorithm accordingly, this is something that is very challenging and with all of the neural nets and the ability to train, still it’s having a hard time to understand sometimes when you put to the machine something with bad recording and bad acoustic…I think this is the toughest one.

(10:30) In other words, is that still where human intuition might still have a bastion of specialness, even if algorithms are trained to…take poor audio and fill in the blanks, is that still something where you think humans are gonna have the edge?

TL: I do believe so because they have the ability to hear it again and again and get the input in to understand the context of what has been said.

So I guess a courtroom…will never be satisfied with a machine only because they are required by law to have the 100% [accuracy] and this is going to take a lot of time and a leap of faith until they would be able to believe that the machine would be able to just get the perfect output for them to submit…You have Google, you mention Baidu…they are building something very generic. Something that should be suitable for everyone and…because we are taking more the vertical approach, this allows us to be much more tailor-made for any of the customers and will give us the advantage to get better results.

Because at the end of the day…what is speech recognition technology? Speech recognition is trying to identify what has been said and there are very complex statistical models that give the ranking, showing you the best probability of the best guess for the machine of what has been said. You have a lot of parameters that try to guess in the best way what has been said there. And this [is] actually because you think about Verbit as a contextual [layer] there. When you are in a generic speech recognition engine, you just put the input, which is audio, and the output will be text based on the same algorithm that everyone uses for speech recognition.

If you think about Verbit…you need to use this contextual layer that gives you [information such as] the person that talked, and you have this accent and this is the jargon that he is talking about, legal space, in this acoustic environment. So use all these parameters in order to give better accuracy in the transformation from voice-to-text before you do it. This is something that helps us because we are not trying to be generic, we are trying to be very tailor-made.
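The ranking and contextual layer described above can be illustrated with a toy rescoring step. This is a sketch under stated assumptions, not Verbit's method: it assumes a generic engine returns several candidate transcripts with log-scores, and that domain context is available as a set of expected terms. The function `rescore`, the `boost` value, and all scores are invented for illustration.

```python
def rescore(hypotheses, domain_terms, boost=2.0):
    """Re-rank candidate transcripts, adding a bonus for each expected domain term.

    hypotheses: list of (text, log_score) pairs from a generic engine (assumed format).
    domain_terms: lowercase words expected in this context, e.g. legal jargon.
    """
    rescored = []
    for text, log_score in hypotheses:
        words = text.lower().split()
        bonus = boost * sum(1 for word in words if word in domain_terms)
        rescored.append((text, log_score + bonus))
    # Highest combined score first
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Invented candidates: the generic engine slightly prefers the wrong word "plaintive".
n_best = [
    ("the plaintive filed a tort claim", -4.1),
    ("the plaintiff filed a tort claim", -4.6),
    ("the plaintiff filed a tart claim", -5.0),
]
legal_terms = {"plaintiff", "tort"}

generic_best = max(n_best, key=lambda pair: pair[1])[0]
contextual_best = rescore(n_best, legal_terms)[0][0]
print("Generic engine picks:  ", generic_best)
print("With legal context:    ", contextual_best)
```

Production systems typically apply this kind of bias inside the language model or decoder rather than as a simple post-hoc bonus, but the effect is similar: knowing the domain changes which guess wins.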

(14:30) When you think about what we’ll be able to do five years from now that we can’t do now with transcription, where are you most hopeful that real traction will be made in terms of improvement?

TL: So the way we are thinking about it at Verbit is to be much beyond just transcription. We think that transcription just got much smarter, and what do I mean by that? Think about the use-case of…calls? When you have publicly traded companies…at the end [of the] quarter talking to the analyst about the company results.

Think about having an automated transcription for it, and then you already have the pace data and you can create actionable links and intents. Let’s say Apple is talking about iPhone X, so you can identify in your transcription that this is what has been said and you can…click…and go directly to the website and buy the iPhone X. You can do a comparison, take all the numbers that you just automatically transcribed and create a graph and create a visualization and compare it to past results because you already have the transcription of the past results, and get much more insight from the data.
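As a rough illustration of comparing transcribed numbers to past results, the sketch below pulls dollar figures out of two short, entirely made-up call excerpts and computes the quarter-over-quarter change. The regular expression only covers the phrasing used in this example; the function names and figures are hypothetical.

```python
import re

# Matches phrases such as "revenue of $12.5 billion" (pattern is an assumption for this example).
FIGURE = re.compile(r"(revenue|net income) of \$(\d+(?:\.\d+)?) billion", re.IGNORECASE)


def extract_figures(transcript: str) -> dict:
    """Map each recognized metric name to its value in billions of dollars."""
    return {m.group(1).lower(): float(m.group(2)) for m in FIGURE.finditer(transcript)}


def percent_change(current: dict, previous: dict) -> dict:
    """Percentage change for every metric present in both transcripts."""
    return {
        name: round((current[name] - previous[name]) / previous[name] * 100, 1)
        for name in current
        if name in previous and previous[name] != 0
    }


# Invented excerpts, not taken from any real call.
this_quarter = "We reported revenue of $12.5 billion and net income of $2.4 billion."
last_quarter = "We reported revenue of $10.0 billion and net income of $2.0 billion."

changes = percent_change(extract_figures(this_quarter), extract_figures(last_quarter))
for metric, change in changes.items():
    print(f"{metric}: {change:+.1f}% versus the previous quarter")
```

Real calls phrase numbers in many more ways, so a practical system would need a far more robust extraction step, but once the figures are structured, charts and comparisons follow naturally.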

Because we are allowing people to get more value out of their verbal assets so all this verbal communication and information that has been exchanged we want to allow our customer to get more value.

(17:30) Can you talk about the business value of transcription?

TL: Think about once you have the examination of a witness and then you can see, in his past testimony, does he contradict himself? Maybe he’s lying [so we can] try to analyze in his voice to get some realization of the text. You have many things that you can extract, so the speech and the transcription is the first layer. You can do on top of it many, many things. We think that the transcription market is very, very big. Once we are able to increase the accuracy, we will be able to allow more people to get more value out of their verbal assets.

Header Image Credit: The Globe and Mail
