Seeing the World through Machine Eyes – with Dr. Irfan Essa

January 10, 2016

Episode Summary: Most of us forget that just about a decade ago, Facebook’s software was incapable of tagging people in a photo, but today it can do so without difficulty, sometimes without us even knowing. Machine vision has progressed to the point where it’s also common for computers to be able to tell dogs from cats in images, another task that was not possible 10 years ago.

In this episode, we talk with Dr. Irfan Essa, an expert in Computer Vision at the Georgia Institute of Technology (GA Tech), about progress made in machine vision over the last 10 years, related projects in the works today, and where machine vision may be headed in the next decade.

Guest: Dr. Irfan Essa

Expertise: Computer Vision, Computational Perception, Robotics and Computer Animation, Machine Learning, Social Computing

Recognition in Brief: Professor Irfan Essa joined the GA Tech faculty after earning his MS and PhD, and teaching, at MIT (Media Lab). In addition to teaching in the School of Interactive Computing, Dr. Essa is also associate dean in the College of Computing. He has published over 150 academic articles, several of which won best paper awards, and has presented at numerous conferences. Dr. Essa has received the NSF CAREER Award and been elected to the grade of IEEE Fellow. Since 2011, he has worked with Google Research as a research consultant.

Current Affiliations: Georgia Institute of Technology (GA Tech)

Building Machine Vision

Much of the progress made in computer vision over the past decade has been in the ability of algorithms to better “understand” the objects at which they’re looking in various images and scenes. We’ve gotten there in part, says Dr. Irfan Essa, by trying to mimic the human vision system, with the ultimate near-term goal of transplanting that technology into more intelligent machines that are aware of their environment and can interact and behave appropriately. He notes that one of the bigger tasks quickly coming down the pipeline is the ability to analyze the sheer number of images and videos available on the Internet.

Bigger strides have also been made in machines being able to identify objects in more dynamic scenes, helped by the auto and tech industries’ push to develop autonomous cars. Such vision systems require cameras and sensors to detect and identify the complex forms and movements of pedestrians, landmarks, and other real-world phenomena.

Irfan states that one of the biggest advances in the past decade involves the overlap of computer vision and machine learning. He notes that scientists spent a long time in an era of aggregating and collecting data, and have recently transitioned to an era of sense making, i.e. using machine learning techniques (in particular deep convolutional neural networks) that scale with large amounts of data and can more carefully distinguish the pieces of an image, for example to identify features in a face or to differentiate between and within species. The next step, he says, is for machines to start asking questions and inferring information from the received data.
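To make the idea of a convolutional network picking out pieces of an image more concrete, here is a minimal sketch of the single operation such networks are built from. It is an illustration only and does not come from the interview: the function name `convolve2d`, the toy 4x4 image, and the hand-written edge filter are all assumptions chosen for clarity. In a real network the filter values are learned from data rather than written by hand, and many filters are stacked in layers.

```python
def convolve2d(image, kernel):
    """Slide a small filter over an image and sum the element-wise products.

    This is the "valid" (no padding) form of the operation used in
    convolutional layers; names and data here are illustrative only.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Toy grayscale image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical hand-written filter that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
```

The resulting feature map is nonzero only in the column where the image changes from dark to bright, which is the sense in which a filter "finds" a feature such as an edge.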

Machines that See the Road Ahead

Where might machine vision evolve in 10 years? “Places like GA Tech are thinking of taking a more diverse, multi-pronged approach,” says Essa. One prong is continuing to develop a theoretical, foundational framework that addresses how computational entities can be used to deal with large amounts of information. The second prong is applying machine learning to computer vision in order to deal with more complex sets of features, and investigating how to use the technology to better understand images. The third prong, explains Irfan, is the availability of application programming interfaces (APIs), tools that make it easier for anyone with access to data in the cloud to use machine vision technology.

Another area of continued development is in prediction, assessment, and analytics to detect temporal aspects, says Irfan. “How do we start taking an unstructured, ad hoc data stream from the population to understand more about the signal itself and the content?” asks Essa. The solution may only be found when we apply machine vision technologies in more dynamic instances, such as robots interacting with objects.

For example, if we want to develop a robot that can cook, the robot needs a sophisticated model of how to pick up an object in space and time. This requires more than taking pictures and showing those images to a machine, although the Internet of Things (IoT) could potentially help in this arena. “If the image of the object at which a machine is looking is available on the cloud, with a community seeing and saying something about (that object), and that information is then brought back to the machine, allowing it to infer…this provides more contextual intelligence in an environment,” says Irfan.

Behavioral imaging is yet another growing domain. The ability for machines to watch and analyze videos of people moving, and to then be able to predict the likelihood of what will happen next, could be of great use in many areas, including healthcare.

For example, a machine that could watch a video of an elderly person getting up from a chair, then assess the kinds of support that person likely needs to avoid falling in the near future, would be a great leap, says Irfan. There are many relevant and pressing needs in healthcare; Kevin Hartnett of the Boston Globe, for instance, writes about an app for the blind made possible by advances in computer vision technology.

A GA Tech project about which Essa sounds particularly enthusiastic applies machine learning and computer vision to observing children on the autism spectrum, with the goal of predicting underlying factors at an earlier age. Almost 20 years ago, Irfan’s PhD thesis was on building a system that would recognize human expression, and it’s an area that many have worked on since, he explains. “Can we actually observe a person in various types of dialogic situations, perhaps with a caregiver, and how do they react in home situations…can we predict how (the individual) is responding to certain signals?” asks Irfan.

He and other researchers are interested in observing children with autism as they work with experts who know the types of behavioral markers that trigger certain behaviors. That knowledge could then be encoded into a machine used to detect how a particular response is likely to trigger a particular reaction. In turn, this information on early warning signs or triggers could be provided to a caregiver for better support. “A bigger aha moment for me was…if a machine could actually hear the speech of a child at a certain age, would it be able to identify the types of support a child might need in the future?” says Essa. Building such a tracking app is another project currently underway at GA Tech.

Is machine vision a necessary factor in better understanding artificial general intelligence (AGI)? “I believe both are connected to the extent that embodiment is part of the paradigm, though the pragmatic part of me says it’s required depending on the task at hand,” says Irfan.

If a robot requires a physical embodiment for its ultimate purpose, then its creators need to build a system that uses vision to react and interpret; but such a machine will likely also need to leverage more targeted abilities, like asking a question at the right time. Forms of embodiment are a practical issue, says Irfan. If an entity’s use is limited to information on the Internet, then such embodiment is probably not required. One uniform aspect Essa does see crossing all areas of future AGI development is experts working together to make advances across domains.
