Data Dominance – How Companies and Countries Win with Artificial Intelligence

Daniel Faggella

Daniel Faggella is Head of Research at Emerj. Called upon by the United Nations, World Bank, INTERPOL, and leading enterprises, Daniel is a globally sought-after expert on the competitive strategy implications of AI for business and government leaders.

In 2016 and 2017 I spoke with dozens of venture capitalists, many of whom have a specific and overt focus on artificial intelligence technologies. I wanted to know what made an AI company worth investing in, and what business models were generally the most appealing for investment.

Winner takes all.

It took me almost a year of interviews to come to that conclusion.

Winner takes all.

That’s the potential promise of artificial intelligence. VCs all want to invest in business models with a defendable “moat”. Companies that can acquire more data and more users in a positive feedback loop have the chance to blast beyond the competition and become nearly unassailable. “The next Google”, or “the next Facebook”, it is said, will be a company predicated on taking advantage of this dynamic.

In this edition of AI Power, I’ll explore the winner-take-all AI dynamic that I refer to as “Data Dominance.” I’ve broken the article out into the following sections:

  • What it is and how it works
  • Examples of how companies can use data dominance (Data Dominance in Action)
  • The differing incentives that data dominance creates for large and small companies (Dynamics of Power with Data Dominance)
  • What business leaders can do about data dominance now

I’ll start off with a definition and explanation:

Data Dominance – What it is and How it Works

I’ve heard it called a “self-feeding data ecosystem.” Ben Narasin of Canvas Ventures calls it “a proprietary data plume” – an apt phrase (listen to Ben’s full interview on our AI in Industry podcast).

How Data Dominance Works

  1. Acquire more users, customers or installs 
  2. This leads to more data 
  3. More data leads to more learning and more AI applications 
  4. More learning and more AI applications lead to a better product
  5. A better product that is widely known leads to acquiring more users
  6. This leads to more data
  7. (And on and on and on…)

In a nutshell:

More users, customers, installs >> More data >> More AI capability >> Better product >> More users, customers, installs
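
To make the loop concrete, here is a minimal toy simulation of two competing companies. Everything in it is an illustrative assumption rather than a model of any real market: the function name `simulate_flywheel`, the starting user counts, and especially the rule that a product's appeal equals its accumulated data raised to a `sharpness` exponent.

```python
def simulate_flywheel(initial_users, rounds=20, new_users_per_round=1000.0, sharpness=2.0):
    """Toy model of the users -> data -> better product -> users loop.

    Returns each company's final share of all users.
    All functional forms and parameters are assumptions for illustration.
    """
    users = list(initial_users)
    data = [0.0] * len(users)
    for _ in range(rounds):
        # Steps 1-2: every user generates one unit of data per round.
        data = [d + u for d, u in zip(data, users)]
        # Steps 3-4: more data means a more appealing product
        # (assumed form: appeal = data ** sharpness).
        appeal = [d ** sharpness for d in data]
        total_appeal = sum(appeal)
        if total_appeal == 0:
            break  # nobody has any data yet, so nothing to compete on
        # Step 5: new users choose a product in proportion to its appeal.
        users = [u + new_users_per_round * a / total_appeal
                 for u, a in zip(users, appeal)]
    total_users = sum(users)
    return [u / total_users for u in users]


# Company A starts with 60% of users, company B with 40%.
shares = simulate_flywheel([600.0, 400.0])
print([round(s, 3) for s in shares])
```

With `sharpness` above 1 (data advantages compound), the early leader's share keeps growing past its starting 60%. With `sharpness=1.0` the shares simply stay where they began, which is one way to see that the winner-take-all outcome depends on data actually translating into a disproportionately better product.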

But this isn’t just about acquiring data for data’s sake. Macy’s, Exxon Mobil, and Wells Fargo have access to vastly more data than most businesses that have ever existed – why aren’t they AI innovators?

Valuable, Proprietary Data

The flywheel of data dominance does not spin simply because a company has access to data – it is only specific kinds of data that matter. We might think of “data dominance data” as having two traits:

Valuable – It can enable beneficial outcomes for users or for business processes.

  • Amazon collects data about everything its users do on its site. Which products get clicks? Which viewed products get added to cart? Which products added to cart get bought? Which patterns of purchases correlate (e.g. does buying backpacks lead to more purchases of notebooks)? Users provide Amazon with tons of proxies for user interest, and tons of longitudinal evidence of their purchase behavior. This is valuable data that allows Amazon to better prepare for demand, and allows Amazon to better recommend products to its users.
  • Wal-Mart can collect data about the number of sales (and of what products) at its various stores, but is often unable to tie these purchases to individuals or to families, and it is unable to determine statistics for what products people spent time looking at, but didn’t buy, or for what products people added to their cart, then removed. For this reason, Wal-Mart is incredibly efficient at inventory and stocking its shelves, but it is still catching up to Amazon in terms of product recommendations.
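
The difference between the two bullets above can be sketched in a few lines. The event format, the action names (`view`, `cart`, `buy`), the function `funnel_rates`, and the sample data are all hypothetical, not a description of any retailer's real systems. The point is only that per-user behavioral events let you compute drop-off between steps, which sales totals alone cannot.

```python
from collections import Counter


def funnel_rates(events):
    """Compute view -> cart and cart -> buy rates per product.

    `events` is an iterable of (user_id, product, action) tuples, where
    action is one of 'view', 'cart', 'buy' (a hypothetical schema).
    """
    counts = Counter((product, action) for _, product, action in events)
    products = {product for product, _ in counts}
    rates = {}
    for product in products:
        views = counts[(product, "view")]
        carts = counts[(product, "cart")]
        buys = counts[(product, "buy")]
        rates[product] = {
            # None means there is no data to compute the rate from.
            "view_to_cart": carts / views if views else None,
            "cart_to_buy": buys / carts if carts else None,
        }
    return rates


# Made-up sample events for illustration only.
sample_events = [
    ("u1", "backpack", "view"), ("u2", "backpack", "view"),
    ("u3", "backpack", "view"), ("u4", "backpack", "view"),
    ("u1", "backpack", "cart"), ("u2", "backpack", "cart"),
    ("u1", "backpack", "buy"),
    ("u1", "notebook", "view"),
]
rates = funnel_rates(sample_events)
```

A store that records only completed sales would see just the single backpack `buy` event here, and none of the views or abandoned carts around it.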

Exclusivity and Access – Few other organizations have it, few organizations have access to as much of it.

  • Facebook’s platform is unique, and the data it collects is exclusive to its platform alone. This is not much of a strength for a small company, but for a company Facebook’s size (i.e. the largest social network on Earth), it means a torrent of data that allows Facebook to customize its experiences for its users – allowing it to stay ahead of other social networks in terms of user growth and engagement time on the platform.
  • Exxon Mobil is probably able to collect data from thousands of oil wells. However, it is unclear whether this provides Exxon an advantage over any other oil and gas company, as other firms also have access to many wells. It is possible that an oil and gas company could gain a data dominance advantage with data from specific kinds of wells. For example, Exxon Mobil may be able to gather more data on Arctic oil wells – potentially allowing it to drill and extract more effectively from those specific kinds of locations. Sinopec might be able to do the same with deepwater drilling wells. As of today, these advantages are uncertain.

While it is possible to develop “me too” AI applications (for example, both Spotify and Pandora can recommend music, both Google and Bing can display search results), the more exclusivity, the better. Companies that can collect unique and valuable data, or simply vastly more of that valuable data, will have a distinct advantage in improving their user experience and business processes.

But Macy’s and Wal-Mart aren’t just behind the tech giants (Amazon, Google, Facebook, etc) because of data alone. There are other success factors in enabling the data dominance flywheel – all of which pose major challenges to existing enterprises who have never had a focus on data or AI in the past.

Enabling Factors

Technical AI Skills – Data scientists, data engineers, machine learning engineers, AI-savvy programmers of all kinds.

  • When Facebook wants to build a new AI model (for mobile ads, for encouraging the use of a new messaging app, etc), it can draw upon a huge army of AI talent to bring those ideas to life. Even better, Facebook’s AI programmers have robust experience turning data into value, and their AI teams have the benefit of working with some of the best data science experts on Earth – an advantage that few other companies can match.

Business AI Context – Business leaders and functional team members (in marketing, in customer service, in inventory, etc) who understand roughly what AI can do, where it can be used, and what problems it can be applied to.

  • In many cases, Amazon doesn’t need to rely on data scientists or ML engineers to think up new ways to generate value with AI. Many of Amazon’s business leaders have a robust understanding of AI’s capabilities, and have robust experience bringing such ideas to life along with data science talent. Hence, they can think of more and better ideas for driving business value with AI.

Data Infrastructure – The right data is stored in the right formats. Data from different sources is “harmonized” so that it can be combined, searched, or used to train different kinds of AI models. Important data is treated with care, data is a strategic consideration across business functions.

  • Google often doesn’t need to ask where and how they’ll get access to the data they need (i.e. user engagement with Gmail, eCommerce purchase data, etc) – because they know where and how it is stored. They have an overt focus on data infrastructure – and an ability to access and use data assets quickly. This is a major advantage of being a digitally-native company.

These three critical factors are part of the reason that digitally-native tech firms (like Google, Facebook, etc) have such an AI advantage.

US Tech Giants – The Poster Children of Data Dominance

Think of it as a data, improvement, and acquisition flywheel. You’ve already seen tech giants nail this dynamic:

  • Google gets so many users searching on its platform that it receives the bulk of the data from general online search. Because it has so much more data, its algorithms can serve increasingly better content, giving users even more reason to come back to Google instead of another search engine. The cycle has continued enough times for Google to be unassailable in general online search.
  • Facebook gets so many users on its platform that it can test features, news feed items, and advertisements across its user base at an unprecedented scale. Because it has so much more data, its algorithms can serve increasingly engaging content, giving users even more reason to come back to Facebook instead of another social media platform. The cycle has continued enough times for Facebook to be unassailable in terms of social networking dominance.
  • Amazon gets so many customers on its platform that it can test product recommendations, site layouts, and promotions across its user base at an unprecedented scale. Because it has so much more data, its algorithms can serve increasingly relevant products and prices, giving users even more reason to come back to Amazon instead of another physical or online store. The cycle has continued enough times for Amazon to be unassailable in terms of general eCommerce.

Any AI startup is looking for this flywheel dynamic, and they’re pitching this dynamic to VCs.

Any VC interested in AI (i.e. essentially all of them) is looking for this dynamic.

Many savvy thought leaders in the world of “AI ethics” are aware of – and wary of – the power granted by this data dominance loop.

Enterprise leaders are the last group to the party. Most enterprise leaders see AI as another kind of IT tool, or as a means to automate processes and gain efficiencies. Enterprises in manufacturing, pharma, banking, and other sectors are still mostly oblivious to the kind of “economic moat” that can be created by artificial intelligence in the right application.

Over the next five years, company leadership across enterprise sectors will become increasingly aware of this dynamic, and will begin devising AI strategies to take advantage of it. Much of our enterprise AI strategy work involves helping large companies find pockets of data advantage to build their own flywheel – to focus on market share expansion instead of simply focusing on efficiencies.

Data Dominance in Action

Organizations that leverage the dynamic of data dominance will generally follow a pattern:

Land, Command, Expand

Land – Determine a critical business opportunity or challenge, and the critical data that would help to improve the business processes around that opportunity or challenge, or improve the customer/user experience of those who would potentially benefit from it. Collect as many data streams as possible and test how artificial intelligence can use those streams to improve efficiency or user experience. Learn and iterate, determine what kinds of data, and what kinds of algorithms, can deliver value. Focus on scaling access to data streams (acquiring users, retaining users).

  • Note: A “user” could be the user of an app (such as Facebook Messenger), a customer (Amazon’s customers), or a client who installs software or hardware that collects data (industrial IoT applications, like Uptake).

Command – Using lessons learned from data access, and using increasing access to torrents of new data, own your position in the market by being able to offer a better product, or a more profitable process. Lock in existing users with data-based benefits they couldn’t get elsewhere.

  • Example: InsideSales is a CRM company focused on sales rep activity. If InsideSales can use data from its users to help enable its new customers to have more productive sales staff just by using the software (say, by knowing which opportunities are most likely to close, when to reach out to new clients, etc), then it can potentially dominate the market.

Expand – Using profits, data streams, and locked-in users, expand to solve adjacent business problems or opportunities, by finding new kinds of data to collect from one’s user base, and delivering more value to users in doing so.

  • Amazon went from recommending purchases online to recommending purchases from in-home Alexa devices. Facebook went from owning social networking “stickiness”, to owning the communication channels that friends use outside of the platform (WhatsApp, Facebook Messenger).

These concepts are somewhat abstract by themselves, so we’ll look at how they might be applied to a company or to a nation:

Example: HVAC IoT Sensor Company

Business description: This company provides HVAC parts and sensors, sold mostly to large commercial buildings. The parts and sensors are designed to help predict equipment failures and energy inefficiencies.

AI use-case: The software uses data from sensors and HVAC layouts to determine leaks, blockages, and inefficiencies – based on feedback from operators on the site (recording all fixes and issues as they come in). Historical data can be used to train systems to detect a potential failure of HVAC equipment, or energy inefficiencies.

Land: Sell to a variety of large commercial establishments – use venture money to fund pilots and trials to get as many sensors in as many locations as possible. Potentially partner with other HVAC equipment manufacturers and installers to find more ways to get a wide installation-base for sensors and software.

Command: The initial data from the install-base may not be useful for real-world applications – some portions of the data from sensors or user inputs may be completely useless or messy. However, some portions of the data may be found to have predictive value for different kinds of HVAC problems, or may be able to be used to improve efficiencies by detecting patterns of airflow or temperature. The sensors and user inputs could be continually improved to have more and more predictive value.

Expand: This predictive advantage could allow the company to offer immediate benefits to a client in ways that other sensor companies or HVAC equipment companies can’t do – and using their data advantage, they’d be able to win more business – and with more business, win more data. The cycle of data dominance. With enough profit, with enough in-roads in thousands of locations, and with enough experience turning IoT data and user input into predictive value, the company could eventually expand into other kinds of commercial IoT applications, such as security systems, elevators, etc.

Example: India

Current situation: A strong agricultural economy with a growing manufacturing industry, and a relatively profitable (and quickly expanding) market for IT services and business process outsourcing, or BPO (call centers, accounting, back-office processing, data entry, etc).

AI opportunity: India could focus on refining its processes and data streams around crops where it leads in global production (such as buffalo milk, cow milk, mangoes, potatoes, etc). India could also focus on turning its back-end IT and BPO efficiencies into powerful products that could lock in the value of the manual efforts being done by cheap Indian tech and BPO workers.

Land: India could economically incentivize IT and BPO companies to invest in AI and automation, encouraging them to turn their low-cost services into high-value services, making the economy less vulnerable to automation, and more likely to continue the growth and profitability of its tech and BPO sectors.

Command: Indian IT services firms and BPO firms could use the data streams and experience of their workers to become the must-have IT and automation products for enterprises globally, securing their position as “the back office of the world” into the era of AI.

Expand: Indian firms could take their market share and data streams from IT services and BPO and expand them to own a greater and greater share of IT processes and BPO solutions, widening their net of “best-in-class” solutions with an inherent advantage over other firms.

Not Always a Google-Sized Monopoly

Some data is naturally all-consuming, like general online search (Google), social media (Facebook), and general eCommerce (Amazon).

While there are still some gigantic monopoly-inducing data dominance opportunities in the world (many more than we can now imagine, I’m sure), many other data dominance opportunities will be smaller. The defensibility of these businesses may be high, but they may never grow to the size of Amazon or Google, not even close.

eBay, for example, might get better than any other firm when it comes to selling used goods online. This might create lock-in via AI-enabled capabilities (and a well-known brand), but it may never involve a market size anywhere near the size of Amazon.

Our hypothetical example of an HVAC IoT company is also a good example. While such a company might expand into an industrial IoT super-giant, it might also just master certain kinds of HVAC systems, giving it a secure lock-in in the market, but no global predominance like Google or Amazon.

Data dominance works in small or large markets. What matters for a company is its ability to find a pocket of opportunity, own the data streams, and iterate until it clearly has the best product on the market – and then keep the flywheel turning: users turning into data, data into improvement, and improvement into more users.

Dynamics of Power with Data Dominance

The dynamics of data dominance will not imply the same strategy for all nations, or for all corporations. As is always the case with power (as with governance and regulation), the strategies of the weak are different than the strategies of the strong.

Companies or Nations with AI Power

Legal:

  • Slow down any progress on laws that would change monopoly law to involve data dominance.
    • Facebook and other tech giants are predictably lobbying against European regulation and GDPR, as well as lobbying against data privacy and data access legislation in the United States.
    • China has remarkably little presence in the global AI ethics conversation – because it behooves China’s swelling technology superpowers to be as unhindered as possible. When China does enter the “AI ethics” or “AI regulation” fray, it will be with a united plan that is designed ostensibly to benefit users, but will be entirely geared toward aiding China’s goals of growing in power and dominance.

Perception Management:

  • Do not openly acknowledge the dynamics of data dominance.
  • Partially acknowledge the dynamics of data dominance, but frame it as nothing close to monopoly power.
  • Only admit data dominance as an advantage when there is no other choice, and feign an interest in helping to alleviate any unfair advantage or unfair treatment – but in fact, protect your own interests.
    • After years of being grilled on security and privacy concerns (as well as political meddling through the Facebook platform), Zuckerberg is finally conceding that internet regulation might be beneficial. Any suggestions Zuckerberg makes about such a plan can be expected to help lock in Facebook’s market position, and limit any new openings for competing companies.

Business:

  • Expand user base (data source) rapidly, potentially even at a loss.
  • Improve user experience by leveraging, increasing, and improving data streams.
    • Amazon selling Kindles at dirt-cheap prices is an example of this strategy. Google sent me a Google Home just for being a Google Suite customer. Facebook has built and bought a fleet of apps like Facebook Messenger and like WhatsApp. More ways to collect data, to customize experiences, and to find opportunities to monetize.
    • The United States may be less likely to force tech giants like Amazon and Google to pay their taxes because it may behoove the USA’s general economic growth to keep its biggest tech giants growing unhindered.

Competition:

  • Acquire them, or engage in price wars that smaller competitors can’t win.
    • Amazon has engaged in both strategies, buying companies like Zappos and Kiva Systems, and pricing out companies like Diapers.com.

Companies or Nations without AI Power

Legal:

  • Emphasize the unfairness and dangers of data dominance, paint it as a cause of concern for citizens, not just for smaller companies (companies will be acting for their self-interest, but will leverage the claim to be acting for “the people” whenever possible).
  • Aim to change monopoly law to break up or severely hamper companies with AI power and with data dominance.
    • The United Kingdom wants to lead in AI ethics – or soft power – mostly because it cannot lead in hard power (AI innovation, global technological and economic predominance).
    • In general, a country or company saying “We’re focusing on being leaders in AI ethics” means “We don’t want the more powerful AI companies to get all the profits and benefits, so we will protect our own self-interest by feigning to defend a just cause for ‘the people’.”

Perception:

  • Appear to be the good guy, fighting valiantly for a free marketplace, and for the rights of the less powerful masses.
    • The EU is styling itself as the central hub for AI ethics and regulation, playing the “good guy” when in fact they are merely protecting their own self-interest by ensuring that big tech innovators (few of whom are based in Europe) don’t get to gobble all the benefits and profits in this new AI era.

Business:

  • “Beachheads” – Find niche problems not currently being solved by AI, and find unique ways to develop competence in those use-cases, and conquer them with excellent customer acquisition strategies.
  • Build data openness and transparency into products – and appear progressive and virtuous by doing so. Or, find pockets of data that can be safely owned (quietly), without blowback.
    • Companies like BuildDirect.com can potentially compete with Amazon by focusing on specific niche markets (in BuildDirect’s case, building supplies), and building a specific client base and understanding of a specific set of customers. Companies like Discord can build user bases (data streams) through communications channels that big players like Facebook haven’t focused on (in Discord’s case, gamers and gaming fans).

Competition:

  • Outsmart competitors by finding faster ways to acquire proprietary data streams – by partnerships, by marketing and acquisition strategies, or by simply raising or making more money to spend on aggressive acquisition.
  • Focus on user growth and data stream growth, but avoid letting your data dominance strategy be known widely to competitors. Acquire your advantage slowly and over time.

What This Means for Business Leaders

VCs and startups are almost all aware of the dynamics of data dominance, if not by name. Enterprise leaders and mid-size business leaders are – for the most part – not considering data dominance in their strategic plans. Most see AI as an enabler of efficiencies through automation, rather than an enabler of defensibility and of winning market share.

Executives and leaders might think through some of the frameworks below – which are abbreviated versions of the strategy work we do with clients:

Frameworks for Data Dominance

Framework for determining winning AI areas for a new product or service:

  • Find business problems not being addressed, or not addressed with data dominance
  • Determine the critical data in those processes, needed to solve those problems
  • Determine a flywheel for acquiring said data

Framework for determining winning AI areas for an existing product or service:

  • Determine the problems we already solve
  • Determine which of those problems has:
    • The biggest overall market potential
    • The most current traction within our own business (we have the biggest market share)
    • The most internal competence (we have special, proprietary skills or understanding)
    • The most potential overlap with AI… where proprietary data could make a real difference
    • The most reliance on proprietary data
  • Determine a handful of viable problems that meet the criteria above, then:
    • Determine flywheel data acquisition strategies for each
    • Decide on which of them to commit to / ensure that they align with long-term company goals
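
One simple way to work through the framework above is a weighted scorecard. The sketch below is only an assumption about how a team might do this, not our client methodology: the criterion keys, the 1-to-5 rating scale, the equal default weights, and the two example problems are all hypothetical.

```python
from dataclasses import dataclass

# The five criteria from the list above, as hypothetical dictionary keys.
CRITERIA = (
    "market_potential",
    "current_traction",
    "internal_competence",
    "ai_overlap",
    "proprietary_data_reliance",
)


@dataclass
class Candidate:
    name: str
    scores: dict  # criterion -> rating from 1 (weak) to 5 (strong)


def weighted_score(candidate, weights=None):
    """Sum the candidate's ratings, weighted per criterion (default: equal)."""
    weights = weights or {criterion: 1.0 for criterion in CRITERIA}
    return sum(weights[c] * candidate.scores[c] for c in CRITERIA)


def rank_candidates(candidates, weights=None):
    """Return candidates sorted from highest to lowest weighted score."""
    return sorted(candidates, key=lambda c: weighted_score(c, weights), reverse=True)


# Made-up ratings for two problems a company already solves.
candidates = [
    Candidate("invoice matching", dict(zip(CRITERIA, (2, 3, 3, 2, 2)))),
    Candidate("failure prediction", dict(zip(CRITERIA, (4, 3, 5, 4, 4)))),
]
ranked = rank_candidates(candidates)
print([c.name for c in ranked])
```

The ranking only narrows the list to a handful of viable problems. Choosing a flywheel data acquisition strategy for each, and checking alignment with long-term company goals, remains a judgment call.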

Short-Term Implications

In the short term, more and more startups will be explicitly focused on data dominance – and almost any firm with global ambitions will consider this dynamic early in their business planning process.

As a consequence, more and more enterprises will be thinking about their customer base as an enabler of data streams to build product improvements, with the ultimate goal of having the best product on the market.

Google, Amazon, and Facebook’s AI-enabled data dominance is likely to continue.

New regulation may be introduced to encourage data use that doesn’t allow for overt monopoly power, and “monopoly” may be re-thought altogether, potentially leading to the breakup of existing tech giants, and new strategic and practical considerations for companies aiming for data dominance.

Long-Term Implications

While data dominance will continue in startups, small countries, and small companies – the biggest power game will play out between the world’s largest companies (Google, Facebook, Baidu, Tencent) and the world’s two technology superpowers (USA, China). The ultimate goal is to control the computational substrate that houses the majority of human experience, and/or that houses the powerful AI (read: Substrate monopoly).

Smaller nations and companies will be more-or-less helpless in preventing this dynamic from happening unless the great powers agree to some kind of global transparency and steering of AI technologies.

That’ll have to be the topic of another article…

See you next week on AI Power.
