
Closing Gaps in Natural Language Processing May Help Solve World’s Tough Problems – A Conversation with Dr. Dan Roth


April 17, 2016


Episode Summary: People often mark progress by what they can see, but much of the real progress in any field happens behind the scenes, in work that is still up and coming. The same can be said of natural language processing, and Dr. Dan Roth’s research in this field makes him privy to advancements that most of us are bound to miss.

In this episode, Dr. Dan Roth explains what the last 10 years of progress in natural language processing (NLP) have brought us, which approaches researchers are taking to develop the technology today, and what the next steps might be toward a computer capable of real conversational speech and of understanding language in context.

Guest: Dan Roth

Expertise: Machine learning and natural language processing

Recognition in Brief: Dan Roth is a Founder Professor of Engineering at the University of Illinois at Urbana-Champaign. He is a Professor in the Department of Computer Science and the Beckman Institute, and also holds faculty positions in the Statistics, Linguistics, and ECE Departments and at the Graduate School of Library and Information Science. Prof. Roth received his B.A. summa cum laude in Mathematics from the Technion, Israel, and his Ph.D. in Computer Science from Harvard University. He has published broadly in machine learning, natural language processing, knowledge representation and reasoning, and learning theory, and has developed advanced machine learning based tools for natural language applications that are widely used by the research community.

Current Affiliations: Fellow of the American Association for the Advancement of Science (AAAS), the Association for Computing Machinery (ACM), the Association for the Advancement of Artificial Intelligence (AAAI), and the Association for Computational Linguistics (ACL); Editor-in-Chief of the Journal of Artificial Intelligence Research (JAIR)

The Wild Frontier of Natural Language Processing

Most people see applications like Siri, Google Translate, and Google Search, and know what it’s like to interact with this software using text and voice commands. But there are also a great many burgeoning technologies, including text mining and text analytics tools, that allow us to extract information from free text and use it as structured data in databases. “We can look at collection of millions of media and text documents and process them; there has been lots of progress in shallow natural language processing (NLP) technology which now is common, and there’s a lot more to come,” says Roth.

With all of the progress that AI and machine learning are making today, why are we still in the shallow end of the pool with NLP? “People do not realize how difficult the problem is; any word you want to take in English, take the word table for example…may mean different things and parts of speech in different contexts,” explains Roth. Machines face a complex set of cognitive tasks that require a lot of knowledge, something that we too often take for granted.

Roth suggests that conventional programming (the typical “if-then” statements and variable definitions) is not good enough for NLP, which requires combining learning techniques with reading and listening to lots of data, all of which must be understood in context. That is a monumental task for a machine with no built-in background knowledge.
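To make the limits of if-then programming concrete, here is a minimal sketch of a hand-written rule set that tries to decide whether the word “table” is a noun or a verb. The function name, the rules, and the example sentences are all illustrative assumptions and are not taken from the interview.

```python
# Toy illustration only: hand-written if-then rules for the word "table".
# The rules and sentences below are assumptions made for this sketch.

def guess_table_pos(sentence: str) -> str:
    """Guess the part of speech of 'table' from the word just before it."""
    words = sentence.lower().replace(".", "").split()
    if "table" not in words:
        return "absent"
    i = words.index("table")
    prev = words[i - 1] if i > 0 else ""
    # Rule 1: a determiner before the word suggests a noun.
    if prev in {"the", "a", "this", "that"}:
        return "noun"
    # Rule 2: "to" or a modal verb before the word suggests a verb.
    if prev in {"to", "will", "should", "must"}:
        return "verb"
    # Anything the rules did not anticipate falls through.
    return "unknown"


examples = [
    "Put the book on the table.",
    "The committee voted to table the motion.",
    "They table every proposal.",
    "Water table levels dropped.",
]
for s in examples:
    print(f"{s!r} -> {guess_table_pos(s)}")
```

The first two sentences are handled, but the last two fall through to “unknown” because no rule anticipated them. Each new context would need another rule, which is the kind of brittleness that motivates learning from data instead.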

Much of the technology that we might assume is NLP barely scrapes the surface of the field. Google Translate, for example, can match a word or phrase with a similar one in another language, but it does not understand the text, it won’t answer a question, and it cannot help you beyond simple translations. Humans (more often Americans) can refer to the fourth Thursday of November or use the term Thanksgiving, and both are understood to represent the same national holiday, but Google doesn’t understand this idea. What is trivial for people is still a huge gap that needs to be closed for machines.
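The Thanksgiving example can be sketched in a few lines. In the United States the holiday falls on the fourth Thursday of November, which in some years is not the last Thursday of the month. The snippet below shows the background knowledge a program would need in order to treat the two phrasings as the same date; the function names and the small alias table are hypothetical, chosen only for illustration.

```python
from datetime import date, timedelta
from typing import Optional


def us_thanksgiving(year: int) -> date:
    """Return the fourth Thursday of November for the given year."""
    first = date(year, 11, 1)
    # date.weekday(): Monday is 0, so Thursday is 3.
    days_to_first_thursday = (3 - first.weekday()) % 7
    return first + timedelta(days=days_to_first_thursday + 21)


# Hypothetical alias table: two surface phrases, one underlying concept.
ALIASES = {"thanksgiving", "the fourth thursday of november"}


def resolve(phrase: str, year: int) -> Optional[date]:
    """Map a known phrase to a calendar date, or None if it is unknown."""
    if phrase.strip().lower() in ALIASES:
        return us_thanksgiving(year)
    return None


print(resolve("Thanksgiving", 2016))
print(resolve("the fourth Thursday of November", 2016))
print(resolve("Turkey Day", 2016))
```

The first two calls resolve to the same date, while the third returns None because that phrase was never entered by hand. People make this kind of connection without effort; a program has to be given the knowledge or learn it.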

To solve the problem of context, the approach being taken today by many researchers is two-fold: the first prong is machine learning, which reads through a lot of text and tries to understand abstractions and learn from masses of data, and the second is acquisition of knowledge, a lot of which can be done automatically through well-organized, human-made sources like Wikipedia, for example.

Writing programs that can learn from both methods and better understand the ‘meaning’ of text is a line of work that many are pursuing, says Roth, and it’s called ‘indirect supervision’. “We’re trying to catch signals of supervision wherever we can, minimizing human effort, using annotated information to develop algorithms to better understand data,” he explains.

When we think about computer vision and machine learning, there are programs able to sort through types of images (cars, for example) based on what they know about cars: this is the front, this is the side, these are the wheels, and so on. Humans still manually correct any images that the machines identify falsely, and then feed them back to ‘teach’ the machine.

The cycle of labeling and training to identify is very similar to NLP, only there are many more stages and much more nuanced information to teach. Parts of speech, categories of phrases, inferences – these are just a few of the factors that a thinking machine must take into account. If we’re to achieve a computer that can listen to and understand the meaning of speech, even in simplified language, then there has to be real effort in trying to minimize human supervision.
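The label, train, and correct cycle described above can be sketched with a deliberately tiny word-count classifier. This is a toy written under stated assumptions, not a method described by Roth: the labels, the sentences, and the scoring scheme are all invented for illustration.

```python
from collections import Counter, defaultdict


def train(examples):
    """Count how often each word appears under each label."""
    counts = defaultdict(Counter)
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts


def predict(counts, text):
    """Pick the label whose training words overlap most with the text."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)


# Step 1: a person labels a couple of sentences (hypothetical data).
labeled = [
    ("set the table for dinner", "furniture"),
    ("table the motion until monday", "postpone"),
]
model = train(labeled)

# Step 2: the model guesses on new text and gets it wrong.
new_text = "set aside time to table the bill"
first_guess = predict(model, new_text)

# Step 3: a person corrects the mistake and the model is retrained.
labeled.append((new_text, "postpone"))
model = train(labeled)
second_guess = predict(model, "they voted to table the bill")

print(first_guess, "->", second_guess)
```

Every pass through the loop costs a human correction, which is why reducing the amount of supervision matters once there are many stages and many kinds of labels.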

Blazing the Trail Ahead

In the next 10 years, access to information may be elevated to a new level with significant progress in the field of NLP. “I can see us being able to communicate with computers in a real natural fluent way, much better than Siri,” says Roth. A middle schooler trying to solve a difficult problem in algebra may be able to consult the machine on strategies for solving an equation. Law firms involved in a big legal case, where it often takes many lawyers and thousands of hours to look through millions of documents bearing on a legal issue, may be able to get an intelligent summary much more quickly, with evidence pointing to specific documents and correspondents.

There are many more areas in which NLP may revolutionize the way we interact with each other and our machines, from customer service to smart toys. At present, some of the greatest challenges for NLP involve medical and compliance information, due to the sheer volume of documents that need to be accessed in an intelligent way. “Articles printed in biomedical last year were over a million, that means no one knows what’s happening,” says Roth. But the next decade may see advances that allow physicians to verbally request relevant research articles that pertain to a specific medical issue.

The potential here is endless, but one example might be a medical practitioner trying to determine the influence of a certain drug on a particular gene. Without knowing about existing findings that could help answer the question, he or she may end up running unnecessary experiments in a lab. If physicians and other experts have better, almost real-time access to the information that they need, then humans will be much better poised to answer some of the most difficult questions that still plague our global society.

Image credit: University of Illinois
