Episode Summary: One facet of business that nearly every industry has in common is the need to stay on top of news in its market, including competitor strategies and changes in news related to the field. Media monitoring is a domain that machine learning (ML) is well suited for, with its ability to coax out headlines, contextual information, and financial data from the seemingly endless stream of social, blog, and other information on the web today. Signal is a company that uses ML specifically for these purposes. In this episode, we speak with Signal’s Chief Data Scientist and Co-founder Dr. Miguel Martinez, who dives into real business use cases illustrating the use of machine learning for media monitoring across industries.
Expertise: Text Mining, Information Retrieval, Machine Learning, and Natural Language Processing
Brief Recognition: Dr. Miguel Martinez is chief data scientist at Signal, where he manages a team of data scientists to transform the best algorithms from the fields of machine learning, information retrieval, and natural language processing into large-scale commercial products. Prior to co-founding Signal, Martinez was a researcher in natural language processing, information retrieval, and machine learning at the University of Essex. He earned his MS in computer science from Universidad de Oviedo and his PhD in computer science, information retrieval from Queen Mary University of London, where his research focused on modeling text classification using descriptive approaches and knowledge exploitation for quality and productivity increase.
Current Affiliations: Chief Data Scientist and Co-founder of Signal; Industry Advisor for University of Essex
Big Ideas:
1 – Machine learning for media is now an invaluable tool to marketers, communications experts, and CEOs alike.
Covering the influx of modern media requires a tool beyond human cognition; machine learning is an ideal fit with its ability to process millions of articles in seconds and extract customized points of data.
2 – Trend discovery and predictions for the future allow for wiser decisions.
Machine learning in media monitoring can coax out otherwise invisible trends and even potential “next right actions”, helping companies and individuals become “wiser” and better-informed decision-makers.
Interview Highlights:
The following is a condensed version of the full audio interview, which is available in the above links on Emerj’s SoundCloud and iTunes stations.
(2:10) Talk to us a little about what machine learning is doing in media monitoring as a field?
Miguel Martinez: I think in order to explain media monitoring and how it’s changed over the past two decades, we have to go back to the beginning, when companies were…monitoring advertising campaigns and how well they did and also understanding the perception of the company itself or the CEO in the process…and it was a very manual process…and that was alright 20 years ago when you had 20 to 30 publications that were crucial from the business point of view…but most importantly the news cycle was 24 hours…that has changed dramatically and at the moment AI and ML are, I believe, the only possible way forward for media monitoring, and that’s mainly because of three challenges: the scale, the speed, and the relevancy…
…once you add ML, you have things like topic classification, in which you can process one of these articles individually and decide if it’s an IPO article, or one about technologies…whatever your topic is that you want to monitor.
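To make the idea of topic classification concrete, here is a minimal sketch of a classifier that labels one article at a time. It is an illustration only, not a description of Signal's system: it assumes a small multinomial Naive Bayes model with add-one smoothing, and the training headlines, the topic labels ("ipo", "technology"), and the class name NaiveBayesTopicClassifier are all invented for the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTopicClassifier:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # topic -> word frequencies
        self.doc_counts = Counter(labels)        # topic -> number of articles
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        # Words never seen in training carry no evidence, so they are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_topic, best_score = None, -math.inf
        for topic, n_docs in self.doc_counts.items():
            counts = self.word_counts[topic]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_topic, best_score = topic, score
        return best_topic


# Hypothetical, hand-written training headlines.
TRAIN = [
    ("Startup files for IPO on the stock exchange", "ipo"),
    ("Shares priced ahead of public listing and IPO", "ipo"),
    ("New smartwatch sensor tracks heart rate", "technology"),
    ("Wearable device maker launches fitness tracker", "technology"),
]
classifier = NaiveBayesTopicClassifier().fit(*zip(*TRAIN))
print(classifier.predict("Company plans IPO listing on the exchange"))
```

A production system would train on far more data and likely use a stronger model; the point here is only that each article can be scored against every topic independently of the others.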
(10:49) What are some of those common cases where the value is really evident and people are eager to jump on?
MM: I like to put them into two different categories; the first one is tracking individual articles from things you know are going to happen, and on the other side the discovering of unknown threads, and each one of these categories has a set of different use cases; so, let me start with tracking individual articles…we’re talking about things like reputation management, when you need to understand your space very well—say I’m getting news about wearables (if I’m in that space) about my company, because I need to understand the perception; related companies that will be my partners, or that could be my distributors; it could be my competitors—that will give you a full understanding of the area you operate in…
…when it becomes a revolution, from my point of view, is when you look at trend discovery, even future prediction using media monitoring; these are the kinds of things you can only see on the aggregate of articles, so it’s not one article at a time. If I give you all the articles about wearable technologies and you had the time to read and memorize all of them at the same time, you would be able to see threads; obviously that’s close to impossible for humans, but for computers it’s something doable, and you can see very exciting use cases here, like the perception of the brand and how it changes for specific publications…
…you can also keep track automatically of new trends in the space, so the system will be able to tell you ‘there’s an anomaly here, I haven’t seen it before, you should check this thing’; it could be a new method, it could be a competitor, which is key…
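One simple way to picture this kind of anomaly alert is a spike detector over daily mention counts. The sketch below is an assumption for illustration, not the method described in the interview: it flags any day whose count sits more than a chosen number of standard deviations above the mean of the preceding week. The function name flag_spikes, the window and threshold values, and the mention counts are all hypothetical.

```python
from statistics import mean, stdev


def flag_spikes(daily_counts, window=7, threshold=3.0):
    """Return indices of days whose count is more than `threshold` standard
    deviations above the mean of the preceding `window` days."""
    flagged = []
    for i in range(window, len(daily_counts)):
        history = daily_counts[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # Floor the deviation at 1.0 so a nearly flat history does not
        # turn every small wobble into an alert.
        limit = mu + threshold * max(sigma, 1.0)
        if daily_counts[i] > limit:
            flagged.append(i)
    return flagged


# Invented daily mention counts for one term; day 8 is the unusual one.
mentions = [4, 5, 3, 6, 4, 5, 4, 5, 21, 6]
spike_days = flag_spikes(mentions)
print(spike_days)
```

A term that has never appeared before is the extreme case of the same idea: its history is all zeros, so any sustained burst of mentions stands out.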
(15:28) When you said how the brand is being represented, my presumption is there’s some degree of sentiment analysis and being able to look at that over a time sequence for a particular publication and for a particular company…is this what you’re referring to?
MM: Sentiment is one of the aspects…but what is sometimes more important is how are the publications describing you, what are the main keywords that appear with your brand. Before, it might have been innovation…and then suddenly you’re becoming more of a robust or a growing company; that’s critical for the folks in communication…(you want to know), is the last company you launched successful enough that the newspapers are changing the way they write about you?
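The shift in how publications describe a brand can be sketched by counting the words that appear near the brand name in two time periods and comparing them. This is a rough illustration under stated assumptions: the brand "Acme", the example sentences, the stopword list, and the four-token window are all made up, and a real pipeline would need entity recognition rather than exact string matching.

```python
import re
from collections import Counter

STOPWORDS = {"a", "an", "and", "the", "is", "its", "of", "for", "as",
             "with", "in", "to", "by"}


def brand_keywords(articles, brand, window=4):
    """Count words appearing within `window` tokens of each brand mention."""
    counts = Counter()
    brand = brand.lower()
    for text in articles:
        tokens = re.findall(r"[a-z]+", text.lower())
        for i, token in enumerate(tokens):
            if token != brand:
                continue
            nearby = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(w for w in nearby
                          if w not in STOPWORDS and w != brand)
    return counts


# "Acme" and both article sets are invented for illustration.
earlier = ["Acme praised for innovation in wearables",
           "Innovation drives Acme, the young startup"]
later = ["Acme posts robust growth",
         "Analysts call Acme a robust, growing company"]

before = brand_keywords(earlier, "Acme")
after = brand_keywords(later, "Acme")
# Words that co-occur with the brand more often in the later period.
rising = [w for w, c in after.most_common() if c > before[w]]
print(before.most_common(1), rising)
```

In this toy data the dominant neighbour of the brand moves from "innovation" to "robust", which is the kind of change in description the answer above refers to.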
(19:57) What you’re referring to, how did that marketing campaign influence how we’re (as a company) being talked about…(can you) walk us through an example of what sort of nuanced details we’d want to be picking up on?
MM: Imagine that a company wants to create a gigantic inflatable balloon in the middle of the city with a logo…and they tell a lot of the newspapers because the balloon has something special; the idea with tracking this would be how many of the publications we tell about this have actually published about it; so the first thing is, who are the publications who mentioned us in the press…but then once those people published, who picked up the story, how did the story develop over time, because there’s a huge degree of syndication based on other publications who wrote them, you can see how a story has a life in a way…so that’s critical to the marketing and communications…and if it was powerful enough, you can basically explain the amount of money you paid for (the service) was really good because you hit these specific publications that have a demographic that you want of potential clients…
(23:53) I imagine almost anybody running a business is interested in what is their share of “voice” compared to their top 10 or 20 direct or indirect competitors; talk about how that’s calculated.
MM: At the moment, because we process every one of the documents individually in our data processing pipeline…we can trace absolute numbers between different companies; some people care about the absolute number, some people care about the absolute number with things specific to a set of publications…the majority of people care about the absolute number.
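As a rough sketch of how share of voice could be computed from per-article counts, the example below counts the articles that mention each company and divides by the total number of mentions, optionally restricted to a set of publications. Everything in it is assumed for illustration: the function share_of_voice, the company names, the publications, and the naive substring matching, which a real system would replace with proper entity detection.

```python
from collections import Counter


def share_of_voice(articles, companies, publications=None):
    """Return each company's fraction of all company mentions.

    `articles` is an iterable of (publication, text) pairs. If
    `publications` is given, only articles from those outlets count.
    """
    counts = Counter({c: 0 for c in companies})
    for publication, text in articles:
        if publications is not None and publication not in publications:
            continue
        lowered = text.lower()
        for company in companies:
            # Naive substring match; each article counts once per company.
            if company.lower() in lowered:
                counts[company] += 1
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in companies}


# Invented articles from two hypothetical publications.
ARTICLES = [
    ("Daily Tech", "Acme unveils a new wearable"),
    ("Daily Tech", "Globex and Acme sign partnership"),
    ("City Paper", "Globex opens new office"),
    ("City Paper", "Acme hires new CFO"),
]
overall = share_of_voice(ARTICLES, ["Acme", "Globex"])
tech_only = share_of_voice(ARTICLES, ["Acme", "Globex"], {"Daily Tech"})
print(overall, tech_only)
```

The absolute mention counts are what most people look at, according to the answer above; the ratio and the publication filter are the two variations it mentions.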
(25:20) What sort of major applications of media monitoring have I maybe missed and that we could talk about in a hypothetical example?
MM: I think prediction for the future is the one linking to the detection of patterns. One of my friends is kind of a foodie, and he was mentioning that because of the technology that we have and the data that we have, we can very easily list the types of food mentioned in recipes in San Francisco newspapers…based on past scenarios, we can see that the trends that happen related to food are really well represented in blogs first, then they (the ingredients) make it into the shops, and then they cross the Atlantic to Europe…you can see the power (in this)—if you can listen to all the news, you can see how things will potentially happen in the future…
…another (longer-term) example…if I’m interested in where there’s been a case of corruption in my company, I can ask the assistant (for the) most recent cases in the last 20 years for when there was something similar to this, and to let me know what happened when people issued a correction, or when the CEO went public, or when suddenly someone got fired; so we can look at the past based on the news of the world to see where there are potential implications, and then potentially decide what to do.