Obstacles to Progress in Machine Learning – for NLP, Autonomous Vehicles, and More

Daniel Faggella

Daniel Faggella is Head of Research at Emerj. Called upon by the United Nations, World Bank, INTERPOL, and leading enterprises, Daniel is a globally sought-after expert on the competitive strategy implications of AI for business and government leaders.

Episode summary: Machine learning currently faces a number of obstacles that prevent it from advancing as quickly as it might. How might these obstacles be overcome, and what impact would this have on machine learning across different industries in the coming decade? In this episode we talk to Dr. Hanie Sedghi, Research Scientist at the Allen Institute for Artificial Intelligence, about the developments in core machine learning technology that need to be made, and that researchers and scientists are working on, to further the application of machine learning in autonomous vehicles.

We also touch on the impact that might be made if machine learning is able to overcome its own boundaries, in terms of computational resources and certain algorithms, and what effect that might have in the arena of autonomous driving and in the realm of natural language processing (NLP).

Guest: Dr. Hanie Sedghi

Expertise: Machine learning, high-dimensional statistics, deep neural networks.

Brief recognition: Hanie Sedghi is a Research Scientist at the Allen Institute for Artificial Intelligence, where she works on large-scale machine learning, especially latent variable probabilistic models. She holds a Ph.D. from the Department of Electrical Engineering at the University of Southern California, with a minor in Mathematics, and a B.Sc. and M.Sc. in Electrical Engineering from Sharif University of Technology in Tehran, Iran.

Big idea:  

“3 Barriers to Overcome in Machine Learning”

Much of the progress in machine learning today is held back by three barriers which Dr. Sedghi addresses in the interview:

Barrier 1: The time it takes to train a system with current approaches is lengthy when you are using complex models

Potential Solution: Develop algorithms that train faster

Barrier 2: The computational resources required are vast and costly when the data is processed in the cloud

Potential Solution: Implement localized devices and more computationally efficient hardware, which use fewer resources and less processing power.

Barrier 3: The huge amount of data required: a vast amount of specific data is needed for training on complex patterns.

Potential Solution: Use data reduction techniques such as the addition of machine vision, so machines are able to learn faster using unsupervised learning.

Lessening the burden of any of these factors would have profound impacts on AI's capabilities across industries. Adding machine vision to NLP, for example, would drastically expand the background knowledge and capacity for interaction between the user and the application, and would also allow machines to learn faster, for example by watching a video (in the full interview audio, Dr. Sedghi goes into this use case in detail using Amazon's Alexa as an example).

Interview Highlights on the Obstacles to Progress in Machine Learning

The following is a condensed version of the full audio interview, which is available in the above links on Emerj’s SoundCloud and iTunes stations.

(2.55) How do you articulate the meaning of a neural net to someone who doesn't have an academic background in this field?

Hanie Sedghi: In general, a neural net is a processing system that is loosely modeled after neurons in a human brain, meaning that it's made of these small devices called neurons. They're very simple elements, and what happens with them is very close to firing or not firing; they polarize the system. These neurons are organized in layers, so you have layers and layers. Each of them is made of neurons, and each neuron in a given layer is connected to the layers before and after it, and they have these weighted connections.

The goal is that you model the patterns in your input such that this system of layers can capture them. Neural networks are essentially universal approximators; they capture associations and discover regularities within a set of patterns.

Then you have the volume and the number of your variables, and the diversity there is huge. It can be the case that the relationships you have are much more complex than just a simple non-linearity, and that's where the power of neural networks comes in. The way you train them is that you get feedback from the error you've made, and you adjust your weights so that the network captures the regularities that you may not be aware of.
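
(For readers who want to see these ideas in code, below is a minimal Python sketch of what Dr. Sedghi describes: layers of simple neurons joined by weighted connections, a non-linearity at each neuron, and training by feedback from the error. The toy XOR task, network sizes, and learning rate are our own illustrative choices, not details from the interview.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Two layers of weighted connections, as described above:
# input (2 neurons) -> hidden (8 neurons) -> output (1 neuron).
W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros(8)
W2, b2 = rng.normal(0, 1, (8, 1)), np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Four training examples of the XOR pattern -- a relationship
# more complex than a single simple non-linearity can express.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

for step in range(5000):
    # Forward pass: each layer is a non-linearity applied to a
    # weighted sum of the previous layer's outputs.
    h = np.tanh(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # "Feedback from the error you've made": propagate the error
    # backward and adjust the weights to capture the pattern.
    d_out = out - y                      # error signal at the output
    d_h = (d_out @ W2.T) * (1 - h ** 2)  # error signal at the hidden layer
    W2 -= 0.5 * (h.T @ d_out) / len(X)
    b2 -= 0.5 * d_out.mean(axis=0)
    W1 -= 0.5 * (X.T @ d_h) / len(X)
    b1 -= 0.5 * d_h.mean(axis=0)

print(np.round(out.ravel(), 2))  # approaches [0, 1, 1, 0]
```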

(For readers with additional interest in the topic, MIT News has written a rather in-depth guide for understanding neural networks, found here.)

(4.32) When you say the patterns in play are not a simple non-linearity, what do you mean by that in the context of neural nets?

Hanie Sedghi: Basically you have these different layers, and each neuron has a non-linear functionality, but if you put it in terms of a number of layers, it's not just one of them, it's a big multiplication…which makes it a more complex function and gives it more power to express relationships…

So basically neural nets learn by example. If you give them a number of examples of what a dog looks like, essentially they will understand the pattern of what a dog looks like, so if you give them new samples, they will figure out whether it's a dog or not…When I say they learn layer by layer, it means that at different layers they learn more structures of the patterns.
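
(At the level of an off-the-shelf library, the "learning by example" loop Dr. Sedghi describes, train on labeled examples and then classify new samples, might look like the sketch below. We use scikit-learn's MLPClassifier and invented toy "dog vs. not-dog" data purely for illustration.)

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Toy stand-in for "examples of what a dog looks like": one class of
# feature vectors clusters around one prototype, the other class
# around another (the data is invented for illustration).
dogs = rng.normal(loc=1.0, scale=0.5, size=(100, 10))
not_dogs = rng.normal(loc=-1.0, scale=0.5, size=(100, 10))
X = np.vstack([dogs, not_dogs])
y = np.array([1] * 100 + [0] * 100)

# A small multi-layer network learns the pattern from labeled examples.
clf = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=500, random_state=0)
clf.fit(X, y)

# Given a new sample, it figures out whether it is a "dog" or not.
new_sample = rng.normal(loc=1.0, scale=0.5, size=(1, 10))
print(clf.predict(new_sample))  # expected: [1]
```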

(6.50) You are already getting to see some of the work that you folks do leave the lab and make a difference…earlier you were talking about the application of machine vision in autonomous vehicles. What are you excited about in that area?

Hanie Sedghi: There’s been a lot of progress in machine vision…Right now the amount of training we have is enough for a machine to drive on highways, but if you go to urban places, there are a lot more complex structures, and that’s the part that we currently don’t have. I’ll be excited to see it pass beyond this barrier and actually make this happen.

(Readers with an additional interest in autonomous vehicles may want to read our full article on timelines and progress for self-driving cars.)

(8.12) What are some of the core areas where machine learning hasn’t quite gotten there? Is it mostly a data problem at this point in your opinion, or is there something about the base tech that we really have to crack in order to be able to take your hands off the wheel in the middle of Boston or San Francisco and say “take me home?”

Hanie Sedghi: I think one element that we have is that training a neural network right now takes a lot of time. Of course you need data; I think we have the data, but it’s a matter of timing. So there are two things that are interesting: one is to come up with algorithms that train faster. The other thing is how you change your algorithms so they are implementable on small devices.
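
(One widely used technique for making trained models run on small devices, offered here as an illustration rather than as the specific approach Dr. Sedghi has in mind, is post-training quantization: storing weights as 8-bit integers instead of 32-bit floats. A minimal sketch:)

```python
import numpy as np

rng = np.random.default_rng(0)
# One layer's worth of trained weights, stored as 32-bit floats.
weights = rng.normal(0, 0.1, size=(256, 256)).astype(np.float32)

def quantize(w):
    # Map each float weight onto signed 8-bit integer levels.
    scale = np.abs(w).max() / 127.0
    return np.round(w / scale).astype(np.int8), scale

def dequantize(q, scale):
    # Recover approximate float weights for use at inference time.
    return q.astype(np.float32) * scale

q, scale = quantize(weights)
restored = dequantize(q, scale)

print("memory:", weights.nbytes, "->", q.nbytes, "bytes")  # 4x smaller
print("max rounding error:", float(np.abs(weights - restored).max()))
```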

(9.38) So you’re talking about using machine vision locally on handheld devices instead of using a whole load of GPUs (Graphics Processing Units). You’re talking about using a CPU, which uses far fewer resources, far less power, and far less processing ability, paired with more effective algorithms, to crunch its data and make the same kind of efficient decisions with less processing power. Is that important in vehicles because a lot of that brainpower happens in the car? Is that one of the hindrances here?

Hanie Sedghi: Yes, I agree.

(10.26) Now that seems like an awfully difficult problem…in order to do that you mentioned more effective algorithms. Is this going to be about a lot of trial and error and adjusting and tweaking of algorithms based on what you think is going to distill the same level of insights? How do you eventually get there?

Hanie Sedghi: If you could change your algorithms so that you could come up with a better initialization, or so that they don’t suffer from the initialization, then essentially you have less of that computation.

(11.56) When you mention initialization, are we talking about the neural network attuning itself to the initial test set that you sent through it?

Hanie Sedghi: What we mean by training a neural net is that you need to put the weights in so the structure follows the pattern you have. So you need to start from somewhere; you need to put some numbers in to begin with, and that’s what I mean by initialization.
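
(To make this concrete: "putting some numbers in to begin with" is usually done by drawing random weights at a carefully chosen scale. The sketch below contrasts a naive scale with the standard Xavier/Glorot and He initialization schemes; the layer sizes are arbitrary, and the schemes themselves are textbook methods rather than anything specific from the interview.)

```python
import numpy as np

rng = np.random.default_rng(0)
fan_in, fan_out = 512, 256  # arbitrary layer sizes for illustration

# Naive starting numbers: the signal grows layer after layer,
# which makes training slow or unstable.
W_naive = rng.normal(0, 1.0, (fan_in, fan_out))

# Xavier/Glorot initialization: scale the starting numbers so the
# signal keeps roughly the same size from layer to layer.
W_xavier = rng.normal(0, np.sqrt(2.0 / (fan_in + fan_out)), (fan_in, fan_out))

# He initialization: a similar idea, tuned for ReLU layers.
W_he = rng.normal(0, np.sqrt(2.0 / fan_in), (fan_in, fan_out))

x = rng.normal(size=(1, fan_in))  # a unit-variance input
print("naive output std: ", float((x @ W_naive).std()))   # ~22, exploding
print("xavier output std:", float((x @ W_xavier).std()))  # ~1, well-behaved
```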

(13.56) What are some of those areas where right now, technology’s approaches in NLP (such as Amazon’s Alexa) are failing us?

Hanie Sedghi: Currently we are far from natural language understanding. There are some components that need to be improved so that we get a better experience. One is that we have a scarcity of data, so if you ask Alexa some questions that it hasn’t heard before, it has a hard time understanding and getting back to you. Also, there are some cases where you need the system to do reasoning with background knowledge.

The other thing is the time-bound components. If we’ve chatted before, we have a sense of the data we have. But when you start talking to a system, it needs to have that knowledge somewhere. And when you refer to different objects, the reference clarification needs to be added to the system.

(15.12) You use the term “reference clarification”, and also talk about “background knowledge”. What does it look like to fill in those gaps for Alexa? It seems like it would be so hard to do by just plugging facts into a machine, and that this could be one of the more challenging obstacles to progress in machine learning. Is that right?

Hanie Sedghi: True. Essentially, if you just wanted to ask Alexa for a recipe, that’s very easy to do. What you need to do is model these recipes in some sort of knowledge base that captures these interactions and timings, and it also needs to understand the reasoning: if this element is taken out, what happens?

So those are things that are doable with machine learning, but essentially there are some parts of our general knowledge and background knowledge that come from vision, so what is really needed is to bring natural language processing and vision and other kinds of things together. For example, if you’re talking with different tones, speech analysis is also important, because if you’re asking different questions, it has to understand your tone and how you feel.
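
(As a purely hypothetical illustration of structured recipe knowledge that supports the kind of "if this element is taken out, what happens?" reasoning Dr. Sedghi mentions, consider the toy sketch below. Every name and structure in it is invented for illustration; it reflects nothing about Alexa's actual design.)

```python
# A toy knowledge structure: each recipe is a list of steps with the
# ingredients each step needs (all names are hypothetical).
recipe_kb = {
    "pancakes": [
        {"action": "whisk batter", "needs": ["flour", "milk", "egg"]},
        {"action": "rest batter", "needs": [], "minutes": 10},
        {"action": "fry", "needs": ["butter"]},
    ]
}

def affected_steps(dish, missing):
    """Toy reasoning: which steps break if an ingredient is taken out?"""
    return [step["action"] for step in recipe_kb[dish]
            if missing in step["needs"]]

print(affected_steps("pancakes", "egg"))  # -> ['whisk batter']
```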

(18.55) It seems like…we would have to have so much data…in one way it feels like, for example, data security. We’re taking in a gazillion examples, finding distinct cases and acting on them, and understanding threats and non-threats in the security environment; and in vehicles, we’re assessing all this sensor data and all the GPS data and all this lidar data. Those are important questions where there’s a lot of hardware being put to work to make sure big goals get met.

Hanie Sedghi: I agree with the point that application drives this, but one of the core problems that machine vision people are working on is how you can teach a machine to understand these processes. This is what we call unsupervised learning, because I don’t give you a label for everything. What’s really needed these days, more than ever, is a provable understanding of unsupervised learning. So the machine would understand these processes given a video…so you don’t need, for every single application, to go back and give a lot of data.
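
(For a concrete example of unsupervised learning, finding structure in data with no labels attached, below is a minimal sketch of k-means clustering. k-means is a standard textbook method we have chosen for illustration; it is not a method discussed in the interview.)

```python
import numpy as np

rng = np.random.default_rng(0)
# Unlabeled data: two hidden groups, but no labels are provided.
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])

# k-means clustering: the machine discovers the groups on its own.
# Start the centers from two of the data points (real implementations
# usually pick them at random).
centers = X[[0, len(X) - 1]].copy()
for _ in range(20):
    # Assign each point to its nearest center ...
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # ... then move each center to the mean of its assigned points.
    centers = np.array([X[labels == k].mean(axis=0) for k in range(2)])

print(np.round(centers, 1))  # roughly the two true group centers
```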