Methodology: Women in AI

  • Data set: Bloomberg Beta overview

Our data set, including industry and sub-industry categories, is based on the 2016 machine learning landscape, which Bloomberg Beta’s Shivon Zilis and James Cham have published annually since 2014. We looked at top-level management (see definition above) and C-level roles across 287 companies in seven industries that use AI and ML technologies as part of their main product and service delivery. The full set of categories, subcategories, and companies can be found in the infographic at the provided link.

  • How we collected data

For each company included in Bloomberg Beta’s machine learning landscape, we collected the following information: top-level female presence (defined as senior-level women, including senior-level management, founders and co-founders, and chairpersons), tallied as the total number of companies with such a presence; specific C-level roles; and numbers of women in C-level roles. In each case, we first went to the company’s team page. When team information was not available on a company’s website, we used LinkedIn to find company employees and titles. Though LinkedIn is a mainstream application, participation is voluntary, so we cannot be certain that its listings are complete. While there are methods for verifying this information, such as contacting the company directly, limited time precluded us from exercising all options.

In a few cases, companies either did not have explicit team pages or took another approach, such as displaying team pictures with names but no titles. In other cases, companies didn’t use typical C-level titles. While it’s highly likely that these organizations have heads of operations and other key line roles, we looked only at C-level titles in order to keep the data uniform for this study. Companies to which this applied were the exception, and while room for error exists, we believe it’s not significant. It’s also important to clarify that, when looking at C-level positions, we counted all individual roles, even if held by the same person (for example, if a person held two C-level titles, such as CTO and CIO, we counted them as two separate roles). These cases were rare, and the resulting error is not significant.

After collecting this first set of data, we realized that we also needed the total number of C-level executives (both male and female) for each sub-industry in order to find the ratio between male and female C-level executives. In a second round of data collection, we looked at every other company within each sub-industry (taking care to find information for at least half of the companies listed in each category) and tallied the total number of C-level executives. This information was later used to calculate the percentage of female C-level executives across industries by dividing the total number of female C-level roles by the total number of C-level roles counted across industries.
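The calculation described above can be sketched in a few lines of Python. This is an illustration only, not the script used for the study: the `CompanyTally` record, both function names, and the simplification that the female and total counts come from the same set of tallies are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CompanyTally:
    """Hypothetical record for one company (not the study's actual schema)."""
    name: str
    sub_industry: str
    c_level_roles: int          # all C-level roles counted at the company
    female_c_level_roles: int   # C-level roles held by women


def sample_every_other(companies):
    """Take every other company in a sub-industry, starting with the first.

    For a list of n companies this yields ceil(n / 2) entries,
    which is always at least half of the list.
    """
    return companies[::2]


def female_share(tallies):
    """Percentage of C-level roles held by women across the given tallies."""
    total = sum(t.c_level_roles for t in tallies)
    female = sum(t.female_c_level_roles for t in tallies)
    if total == 0:
        return 0.0  # avoid dividing by zero when nothing was counted
    return 100.0 * female / total
```

With made-up tallies of 4, 3, and 3 C-level roles, of which 1, 0, and 1 are held by women, `female_share` returns 20.0.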

It’s worth noting that the C-level roles included in the published study were limited to those filled by a “significant number” of individuals, which we defined as five or more persons. There were almost as many identified C-level roles filled by women as roles in which no women were counted. Out of a total of 33 identified C-level roles, 17 were held by at least one woman and 15 had no women. Many of the roles with no women were not significant positions, in that they were often a single instance or filled by two or three individuals across all industries. There were two exceptions to this rule: we identified 22 male Chief Commercial Officers (CCOs) and six Chief Architects, roles that were not held by any women in the companies for which we collected data.
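As a rough illustration of the counting rules described above, the sketch below tallies roles per title (counting each title separately when one person holds several) and then applies the five-person threshold. The input format and function names are hypothetical, chosen only for this example; the threshold comes from the text.

```python
from collections import Counter

SIGNIFICANT_MIN = 5  # "five or more persons", as defined in the text


def tally_roles(executives):
    """Count roles per title, overall and for women.

    `executives` is an iterable of (titles, is_female) pairs, where
    `titles` lists every C-level title held by one person.
    """
    totals, women = Counter(), Counter()
    for titles, is_female in executives:
        for title in titles:  # each title counts as its own role
            totals[title] += 1
            if is_female:
                women[title] += 1
    return totals, women


def significant_roles_without_women(totals, women, minimum=SIGNIFICANT_MIN):
    """Titles that meet the threshold and have no women counted."""
    return sorted(
        title for title, count in totals.items()
        if count >= minimum and women[title] == 0
    )
```

Fed a list containing 22 male CCOs and six male Chief Architects alongside smaller or mixed groups, `significant_roles_without_women` returns only those two titles.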

  • Industries and sub-industries not included

As we collected the above data, we identified one industry and one sub-industry that we ultimately decided to leave out of this particular study: the industry (or category) of open source libraries and the sub-industry of personal agents (under the broader industry of agents). Our reasons for eliminating open source libraries were twofold: we believe these are better described as programs rather than companies, and several, such as TensorFlow, are created by groups within much larger corporations.

With regard to the sub-industry of personal agents, many of those listed (including Facebook M and Apple Siri) are produced by companies that are far larger in size and profit than the average company included under each industry in our data set (between one and 100 employees). For consistency of data, we decided not to track C-level roles or top-level female presence at these companies and instead focused only on companies that fell under the sub-industry of professional agents.

  • Perspectives of female C-levels/founders

In addition to collecting quantitative data, we thought it important to gather individual women’s perspectives on their industry, professional roles, and the topic of gender diversity in leadership positions in AI and ML. We sent interview requests to 15 women in C-level and/or founding roles, selected randomly from a list generated and maintained by the Women in Machine Learning (WiML) organization. We received five responses: four via email and one in an interview conducted via Skype. We asked each individual the same four questions:

  1. What business opportunities in your field or company are you excited about in the next 5 to 10 years?
  2. What were the driving motivators that led you to start your own company/get involved in your current field and company?
  3. What has your experience been as “a woman” in AI and business i.e. empowering, challenging, more or less equal opportunity? There is no right or wrong answer. Feel free to tell a short story if preferred.
  4. Eschewing the cliché, do you think it’s important for more women to strive for C-level exec/founding roles in AI-related domains? Why or why not?

While not all responses were included in the final study and article, those left out are no less insightful and may be used (with permission) in future articles.

  • Limitations of study

Our method for collecting data has limitations, some of which we note here:

  • The data set used (287 companies total) is relatively small and dependent on those chosen by Zilis and Cham, and hence subject to potential biases outside of our control.
  • While numbers and titles of tallied C-level roles were checked twice, this information was collected and verified by one individual. In addition, information collected from LinkedIn may not have been complete; hence, the data is subject to potential error in the identification of applicable roles.