Artificial Intelligence in Video Marketing – Emotion Recognition, Video Generation, and More

Ayn de Jesus

Ayn serves as AI Analyst at Emerj - covering artificial intelligence use-cases and trends across industries. She previously held various roles at Accenture.


eMarketer estimates that 62% of global internet users accessed digital video in 2017, and that number is expected to rise to 63.4% by 2020. This may be marginal growth, but, essentially, consumers are expected to spend more time watching video content. 

As more and more of the world’s internet users gain access to faster internet, how can marketers best leverage digital video to target prospective customers?

This report aims to give businesses a glimpse of how AI-powered video marketing software could allow for improved marketing capabilities and how businesses could potentially benefit from them in ways such as:

  • Reducing costs
  • Improving sales
  • Acquiring customer information

Readers interested in the broader applications of AI in marketing may enjoy our larger Machine Learning in Marketing – Expert Consensus of 51 Executives and Startups piece.

Some of the software companies we discuss below offer systems that claim to work for video advertising, whereas others use computer vision to provide insight into more traditional modes of marketing, such as in-store marketing or email marketing.

Emotion Recognition – Tracking Responses to Video Content

Affdex for Market Research

Affdex for Market Research, by Affectiva, a spinoff of the MIT Media Lab, is an emotion-detection app that measures and reports user facial expressions through computer vision. The app is intended to be used by content creators, researchers, and digital media specialists to remotely test digital content, advertising, movie trailers, and TV programs.

The remote feature can potentially save businesses time and other resources, as they no longer have to transport local or overseas participants to a studio to view the video ad in person.

The software is trained by continuously feeding it many examples of facial expressions, such as a smile or smirk. The system identifies the key characteristics of each expression and learns to recognize it over time. The company claims that its database has grown to more than 6 million faces analyzed in 87 countries, that its emotion-measuring capabilities are accurate 90% of the time, and that the app works under challenging conditions such as poor lighting and background noise.

Testing participants must first download the app to their personal computers so that the company can capture the data. Once installed, the app accesses the computer’s webcam to detect and interpret the emotions behind the user’s facial expressions. In the 2-minute demo video below, Affectiva outlines how businesses could gather insights from the Affdex emotion analytics dashboard:

The company reports that Affdex measures facial expressions using a webcam. It first identifies a human face in real time. Computer vision algorithms then identify key landmarks on the face such as the corners of eyebrows, the corners of the mouth, and the tip of the nose.

The company also states that through deep learning, the software analyzes pixels in those facial regions to recognize 20 facial expressions and map them to 7 emotions: anger, contempt, disgust, fear, joy, sadness and surprise. Affectiva claims that Affdex also predicts the tester’s age, ethnicity and gender based on their appearance.
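The expression-to-emotion step described above can be pictured with a small sketch. The expression names, weights, and function names below are illustrative assumptions, not Affectiva's actual model or API; the sketch only shows how per-expression intensities might be aggregated into scores for a handful of emotions.

```python
from collections import defaultdict

# Hypothetical expression-to-emotion weights, for illustration only.
# A real system would learn this relationship from labeled data.
EXPRESSION_TO_EMOTION = {
    "smile": {"joy": 1.0},
    "brow_furrow": {"anger": 0.6, "sadness": 0.4},
    "nose_wrinkle": {"disgust": 1.0},
    "brow_raise": {"surprise": 0.7, "fear": 0.3},
    "smirk": {"contempt": 1.0},
}


def emotion_scores(expression_intensity):
    """Combine per-expression intensities (0.0 to 1.0) into per-emotion scores."""
    scores = defaultdict(float)
    for expression, intensity in expression_intensity.items():
        # Expressions missing from the table contribute nothing.
        for emotion, weight in EXPRESSION_TO_EMOTION.get(expression, {}).items():
            scores[emotion] += weight * intensity
    return dict(scores)


def dominant_emotion(expression_intensity):
    """Return the highest-scoring emotion, or None if nothing was detected."""
    scores = emotion_scores(expression_intensity)
    return max(scores, key=scores.get) if scores else None


# One hypothetical video frame: a strong smile with slightly raised brows.
frame = {"smile": 0.9, "brow_raise": 0.2}
print(dominant_emotion(frame))  # joy
```

A production system would presumably replace the fixed weight table with a trained classifier, but the shape of the output, one score per emotion per frame, is what a dashboard would then chart over time.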

One of the company’s clients, Mars Inc., wanted to evaluate whether its advertising evoked the intended emotion from consumers and whether that affected sales. Using Affdex for Market Research, more than 1,500 participants from France, Germany, the UK, and the US were asked to view over 200 ads showcasing Mars’s products: chocolate, gum, pet care, and instant foods. As the participants’ facial reactions and emotional responses were captured via webcam, the app predicted that ads that scored high in evoking strong “positive” emotions would have better short-term sales than those that evoked “negative” emotions.

Aside from watching the video, participants were also asked to respond to a questionnaire related to the products. Responses to this questionnaire were intended to support the emotional responses, enabling the app to better predict short-term sales results with an accuracy of 75%.

From the results, the app can compare responses across products and determine which products evoked strong emotions and therefore might perform better in the market. For example, the results revealed that the client’s chocolate ads evoked the strongest emotions, garnering better sales. Food ads drew the weakest emotional response, but this insight could potentially help the company revise its creative approach and make other marketing decisions.

Another client, EBuzzing, a video advertising network, wanted to know which content elicited the strongest emotions from viewers to figure out which would result in viral ads. For this study, more than 2,600 participants viewed 40 video ads on YouTube, revealing that:

  • Emotionally charged video ads are 4x more likely to be shared
  • Video ads that provoked a smile are 5x more likely to receive 10 million views
  • Video ads with product placements are up to 3x more engaging than the norm
  • Video ads for movie sequels are 17 percent less engaging than the average movie trailer

Businesses considering trying out Affdex can acquire commercial usage licenses starting at $25,000 after the 60-day trial period.

Affectiva, a spinoff of the MIT Media Lab, is headed by Chief Executive Officer (CEO) and Co-founder Rana el Kaliouby, who holds a Ph.D. in computer science. She also holds a degree in Executive Education, Global Leadership from the Harvard Kennedy School.


Kairos

Kairos is computer vision software that claims to identify and verify faces in videos and photos. Kairos explains that face images must first be enrolled into the system’s database before they can be detected. They claim the software can then determine the unique facial features on those faces to establish the identity of the person. When presented with other instances of the person’s face, Kairos claims their software can break the new picture down into key features and then compare them against the face images in the database to find a match with a high level of confidence.
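The enroll-then-compare flow described above can be sketched in a few lines. This is a generic illustration using cosine similarity between feature vectors, a common technique for this kind of matching; it is not Kairos's algorithm or API, and the feature values, names, and threshold are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(probe, gallery, threshold=0.9):
    """Compare a new face's features against every enrolled face.

    Returns (name, score) for the closest enrolled face, or (None, score)
    when even the closest one falls below the confidence threshold.
    """
    best_name, best_score = None, -1.0
    for name, features in gallery.items():
        score = cosine_similarity(probe, features)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Hypothetical enrolled faces, each reduced to a short feature vector.
gallery = {"alice": [0.9, 0.1, 0.3], "bob": [0.1, 0.8, 0.5]}

name, score = best_match([0.88, 0.12, 0.31], gallery)
print(name)  # alice
```

The threshold is the knob that trades false matches against missed matches; real systems typically use much longer feature vectors produced by a trained network.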

Kairos also claims their software can detect emotions. The 1:21-minute demo video below shows Kairos Emotion Analysis mapping a user’s face to a variety of emotions as the user makes different facial expressions. According to the demo, the application can recognize emotions even in poor light and when the user is wearing glasses:

IPG uses Kairos Emotion Analysis to conduct large-scale, consumer-focused qualitative research. The company needed to test advertisements and products across different geographies and demographics, and improve the turnaround time and accuracy of the process.

IPG claims that they processed 18,000,000 facial emotion measurements monthly, helping the client move from small groups to much larger scale qualitative research covering people of different ages and backgrounds.

Legendary Pictures needed a precise understanding of audience responses to their films. Kairos claims Legendary Pictures used their software to gauge audience responses every quarter-second they watched the film. More than 450,000 emotional measurements were recorded per minute during the film screenings, for a total of around 100 million facial measurements processed throughout the course of the film. Kairos claims Legendary Pictures used these measurements in their marketing campaign efforts.
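As a rough sanity check, the reported figures can be related to one another with simple arithmetic. The sketch below assumes that one "measurement" means one reading for one viewer at one quarter-second sample; the source does not define the term, so the implied values are illustrative only.

```python
# Figures reported in the text above.
SAMPLE_INTERVAL_SECONDS = 0.25      # "every quarter-second"
MEASUREMENTS_PER_MINUTE = 450_000   # "more than 450,000 ... per minute"
TOTAL_MEASUREMENTS = 100_000_000    # "around 100 million"

# Assumption: one measurement per viewer per sample.
samples_per_viewer_per_minute = 60 / SAMPLE_INTERVAL_SECONDS

# Faces that would need to be tracked at once to reach the reported rate.
implied_concurrent_streams = MEASUREMENTS_PER_MINUTE / samples_per_viewer_per_minute

# Minutes of footage needed to reach the reported total at that rate.
implied_minutes = TOTAL_MEASUREMENTS / MEASUREMENTS_PER_MINUTE

print(samples_per_viewer_per_minute)  # 240.0
print(implied_concurrent_streams)     # 1875.0
print(round(implied_minutes, 1))      # 222.2
```

If each sample actually yields several readings per face (for example, one per emotion), the implied number of viewers would be correspondingly smaller.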

Enterprise packages carry a perpetual license on the software and are custom-priced. The package includes capabilities such as face detection, face identification, face verification, emotion detection, age detection, gender detection, multi-face detection, attention measurement, facial features, sentiment detection, face grouping and diversity recognition. It also includes email support during business hours and business interruption insurance.

Founder and CEO Brian Brackeen has worked at Apple and IBM.

Insights NOW

Insights NOW, by nViso, is software that can be used to recognize human emotions by analyzing facial expressions and eye movements using 3D imaging technology. Its visual intelligence capabilities, combined with emotion analytics, enable it to detect and predict the target audience’s emotional engagement in real time using standard camera devices on smartphones, tablets, and computers. The application can be used in market research and product development.

According to information on the company website, videos are uploaded to the Insights NOW portal, where viewers can watch them. Viewers are also asked to complete a survey related to the video. During the viewing session, the application captures viewers’ facial expressions and reactions and sends this data to the Insights NOW server, where it is analyzed and interpreted by the software in real time. An online report about the viewer’s emotional engagement is generated immediately.

The technology captures the seven dominant human emotions and interprets them into levels of emotional engagement. Through deep learning, a type of machine learning that requires the system to repeatedly perform calculations to find patterns, the software interprets the behaviors and continuously learns and improves upon itself over time, increasing the accuracy with which it recognizes emotions.

The company claims that their facial expression database grows daily. Currently, they claim their software has analyzed over 25 million faces.

The 1-minute video below featuring nViso’s Head of Marketing, Michael O’Sullivan, quickly describes how Insights NOW’s facial recognition AI works to give marketers the ability to discover which video snippets resonate most with their audience. This could potentially allow them to improve the conversion rate on their advertisements.

Businesses could use Insights NOW to write copy, develop and test products, and improve the quality of their A/B tests.

Advertising agency Fabler Studio wanted to know how a preliminary video would fare in emotional engagement compared with the final production video. The agency asked 200 participants aged 18 to 65 to view both styles of video. Participants used the webcams on their personal computers, and their reactions were analyzed second by second.

Based on the study, both the preliminary and final production versions of the same video evoked strong emotions, but the higher production quality of the finished ad boosted the overall emotional reactions more strongly. The study helped the agency discover how certain images or messages evoked emotions and how these insights could be used, and it could potentially win the agency more projects and clients.
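A second-by-second comparison of two versions of a video, like the one in this study, might look something like the following sketch. The engagement scores are made-up values on a 0 to 1 scale, and the function names are hypothetical; this is not nViso's reporting method.

```python
from statistics import mean


def per_second_mean(sessions):
    """Average engagement across participants for each second of the video.

    `sessions` holds one list of per-second scores per participant.
    zip() stops at the shortest session, so longer sessions are truncated.
    """
    return [mean(second) for second in zip(*sessions)]


def compare_versions(draft_sessions, final_sessions):
    """Per-second difference in mean engagement (final minus draft)."""
    draft_curve = per_second_mean(draft_sessions)
    final_curve = per_second_mean(final_sessions)
    return [round(f - d, 3) for d, f in zip(draft_curve, final_curve)]


# Two hypothetical participants watching three seconds of each version.
draft = [[0.2, 0.4, 0.5], [0.4, 0.4, 0.3]]
final = [[0.3, 0.6, 0.7], [0.5, 0.6, 0.5]]

print(compare_versions(draft, final))  # [0.1, 0.2, 0.2]
```

Positive values mark the seconds where the finished version drew a stronger average reaction, which is the kind of signal an agency could use to see which images or messages land.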

nViso has generated $5 million in revenue. CTO and co-founder Dr. Matteo Sorci earned his PhD in engineering from the École Polytechnique Fédérale de Lausanne in Switzerland.

InSight Software Development Kit

InSight Software Development Kit (SDK), by Sightcorp, is a face analysis application that uses computer vision and deep learning technology to capture and analyze faces as they watch videos, advertisements, or website content in controlled environments.

Sightcorp claims that InSight SDK captures emotion, demographics and eye movement in real-time, one user at a time using a webcam. It can translate tiny facial movements into universal emotions like happiness, surprise, sadness, anger, among others. The faces are then classified by age or gender, potentially giving businesses better insight into their audience’s demographics, preferences, and buying behaviors. This information can help them make marketing decisions.

In the 3-minute video below, Sightcorp explains how the application interprets people’s faces to reveal their demographics, gender, emotional state, and other information.

The CrowdSight SDK is also face-recognition software that is meant to track multiple people at the same time in a commercial setting. The application gathers information about the shopping experiences of the audience through a webcam. Aside from recognizing the seven general facial expressions, it is capable of determining age, gender, ethnicity, head position, and gaze. 

The data is processed and generated in real-time, enabling businesses to dynamically display content according to the audience’s behavior.

The Cameleon Group used Sightcorp’s CrowdSight SDK application (rebranded as “InsightManager” by the client) to track shoppers inside a store in real time and gather data about their age and gender, behavior, and engagement level by tracing their attention span while looking at specific products and advertisement campaigns. The software enabled the client to gather insights about the shoppers and adjust product shelves and marketing campaigns with the help of the application and a camera.

Roberto Valenti is co-founder and CTO at Sightcorp. He holds a PhD in artificial intelligence from Intelligent Systems Lab Amsterdam, University of Amsterdam. Previously, Valenti served as CTO at EmoVision and ThirdSight and was CEO and co-founder of Aixiom.

Generating Video Advertisements


Magisto

Magisto is online video creation software that uses AI to create an entirely new video for marketing purposes using uploaded footage and photos, according to the company website. (Readers interested in AI for content creation might find our interview with Tomás Ratia García-Oliveros useful.)

The application’s business package includes a component that claims to provide businesses insight into an audience’s viewing behavior. The company claims that this data will help businesses understand, among other things, where and why viewers stop watching business videos and which content moves viewers to make purchases.

To create the new video with the existing footage and images, users will be asked by the app to select an editing style (such as nostalgic, fun, romantic, upbeat, etc.) and in-app music. Magisto claims that these attributes guide the machine learning software in recognizing, analyzing, and choosing action scenes, camera motion, facial expressions, speech and other elements. The system selects the parts that best represent the editing style and creates the script or storyline. The app then also applies professional effects and transitions that support the story: zooming, panning, and lowering or increasing the volume, among others.

The 2-minute video below shows users how Magisto’s software works. The user can import images from their Google Drive, but the software also offers free-to-use images and royalty-free music. Magisto claims its software will notify the user via email when the movie it creates is ready and that it will provide an embed code for YouTube.

The company claims that the app can publish to various channels such as social media, email marketing, and video advertising platforms.

Hampton Beach Casino Ballroom claims to have used Magisto-made marketing videos with the aim of boosting concert ticket sales and engagement, testing video marketing against image-based marketing techniques, and increasing the speed of creating videos.

Magisto claims Hampton Beach Casino Ballroom did an A/B test on Facebook in which a Magisto-created video ad was tested against traditional image-based ads to see which garnered the greatest increase in ticket sales.

Hampton Beach Casino Ballroom claims it was able to generate 6 video creatives in 48 hours and that it was able to deploy those videos in less than 2 weeks. They also claimed the videos sold 300% more tickets than the image-based ads, although they did not say whether videos sold more tickets than the image-based ads simply because they were videos.
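To make the "300% more" figure concrete, the sketch below computes relative lift from two ticket counts. The counts are hypothetical, since the source reports only the percentage, and the function name is an assumption for illustration.

```python
def relative_lift(variant, baseline):
    """Percentage by which the variant result exceeds the baseline result."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (variant - baseline) / baseline * 100


# Hypothetical ticket counts; the source gives only the percentage.
image_ad_tickets = 50
video_ad_tickets = 200

print(relative_lift(video_ad_tickets, image_ad_tickets))  # 300.0
```

Note that "300% more" means four times the baseline, not three. The calculation also says nothing about why the variant won, which is the confound raised above: a lift figure alone cannot separate the effect of the video format from the effect of the tool that produced it.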

Another client, Millar and Company, a real estate agent based in the San Francisco Bay Area, aimed to drive brand awareness, promote their listings, and attract new prospective clients on Facebook. By creating videos using Magisto, the company garnered these results from their testing:

  • Views: Magisto videos received 220% more views than non-Magisto-created videos
  • Likes: Magisto video posts were liked 160% more than non-Magisto posts and 320% more than non-video posts
  • Shares: Magisto videos were shared 150% more than non-Magisto videos and 300% more than non-video posts
  • Comments: Magisto posts received 100% more comments than non-Magisto videos and 200% more than non-video posts

Enterprise licenses targeted at businesses run $34.99/month. The company claims the included video analytics functionality gives businesses the ability to gather and analyze the results of the video ad campaigns they run.

Dr. Oren Boiman is the founder and CEO of Magisto. He earned his Ph.D. in computer vision.

Concluding Thoughts

Based on our research, AI-powered video marketing software has the potential to aid marketers and advertisers in making videos more personalized and relevant to target audiences. This is largely through facial recognition applications that can provide insight into how viewers might react to a video advertisement.

Video generation software can help marketers garner insights about their potential video advertisements relatively quickly, allowing them to make adjustments accordingly before releasing the completed video on social media.

Machine learning video marketing software could help marketers:

  • Determine which kinds of content generate the strongest positive or negative reactions
  • Reduce the time to deploy campaigns
  • Strategically determine the most effective content for targeted audiences
  • Conduct large-scale reaction surveys of possible content
  • Garner insights on the preferences of prospective customers
  • Create multiple versions of videos in less time, allowing for relatively fast test periods


Header Image Credit: Kairos
