Episode Summary: When it comes to data science and machine learning, what are the related skills that are getting people jobs and what are the industries that are supplying those in-demand jobs? These are two important questions that we discuss in this week’s episode with CrowdFlower’s CEO Lukas Biewald, whose company is providing a pragmatic perspective of the industry by focusing on assessing job listings and related information in the field of data science. If you’re a company that is interested in finding someone with in-demand data science and related skills, or if you’re in the market to find a position in this field, this episode will likely be very useful!
Expertise: Data science and machine learning
Recognition in Brief: Prior to co-founding CrowdFlower, Lukas Biewald worked as a senior scientist and manager within the Ranking and Management Team at Powerset, Inc., a natural language search technology company later acquired by Microsoft, and also led the Search Relevance Team for Yahoo! Japan. He graduated from Stanford University with a BS in Mathematics and an MS in Computer Science. Lukas is also an expert-level Go player.
Current Affiliations: Co-founder and CEO of CrowdFlower
Business of Data Science: Who Gets Hired
When CrowdFlower assesses the job marketplace for data science, machine learning, and related positions, it casts a wide net, looking at job descriptions and parsing out the technical skills and coding languages that tech companies are seeking most today. In the company's most recent survey, Lukas found that two technologies, SQL and Hadoop, both appeared in more than half of the job descriptions surveyed, with employers across industries looking for candidates experienced in them.
Both technologies center on storing data and getting it back out of data stores, and a top in-demand skill seems to be the ability to get data out of wherever it lives. The top general-purpose language, Python, exhibited a similar pattern, showing up in over half of the job descriptions researched. Python is good at parsing data and moving it around, which points to a bigger takeaway: companies are really looking for data engineering skills, that is, the ability to get data into a form in which analysis is possible.
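For readers curious how this kind of tally might be computed, here is a minimal sketch in Python. It is not CrowdFlower's actual method, which the episode does not describe in detail. The skill list, the `skill_shares` function, and the sample listings are all hypothetical and made up for illustration; the sketch simply counts what share of job descriptions mention each skill at least once.

```python
import re
from collections import Counter

# Hypothetical skill patterns; the survey's real skill list is not given in the episode.
SKILL_PATTERNS = {
    "SQL": re.compile(r"\bsql\b", re.IGNORECASE),
    "Hadoop": re.compile(r"\bhadoop\b", re.IGNORECASE),
    "Python": re.compile(r"\bpython\b", re.IGNORECASE),
}


def skill_shares(descriptions):
    """Return the fraction of descriptions that mention each skill at least once."""
    counts = Counter()
    for text in descriptions:
        for skill, pattern in SKILL_PATTERNS.items():
            if pattern.search(text):
                counts[skill] += 1
    total = len(descriptions)
    return {
        skill: (counts[skill] / total if total else 0.0)
        for skill in SKILL_PATTERNS
    }


# Made-up listings, for illustration only (not survey data).
SAMPLE_LISTINGS = [
    "Data scientist with strong SQL and Python experience.",
    "Data engineer: Hadoop, SQL, and ETL pipelines.",
    "Machine learning engineer comfortable with Python and Hadoop.",
    "Analyst who can write SQL queries and build dashboards.",
]

if __name__ == "__main__":
    for skill, share in skill_shares(SAMPLE_LISTINGS).items():
        print(f"{skill}: {share:.0%}")
```

Whole-word matching keeps the sketch simple, but it would miss variants such as "MySQL" or "PySpark"; a real analysis would likely need a richer skill vocabulary and some normalization of the listing text.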
The above is a sampling of the types of research and insights that Biewald’s company is able to provide. Lukas notes that it’s an exciting time, with more and more data continuing to push advances in machine learning through the use of more advanced techniques – and companies are looking for employees who can keep at the forefront of rapidly evolving skill-sets.
In addition to cutting-edge skills, Biewald has also noticed companies moving away from hiring data scientists straight from academia and toward bringing on those who have stronger business acumen. Data scientists who can figure out what a company needs by bridging communication between the business players who are trying to accomplish long-term goals are particularly valuable. Perhaps not coincidentally, a recent CrowdFlower survey showed that data scientists say their biggest interest at present is learning business skills that can help them bridge the gap between science and corporate objectives.
“There’s so much that data can do that matters, every company has a different set of data problems, and data scientists are becoming the bridge between understanding data and what it can and cannot do,” says Lukas. When you have a person in-house who can both run data science experiments and think about what a business needs, without having to constantly bring in top-level executives, then you have a much more efficient business cycle. As for machine learning, many companies are interested but haven’t yet figured out how to use machine learning applications, though data scientists with a background in this area are also in high demand.
“Almost every company believes it’s important to the future and are looking for those that can help build that future,” notes Biewald. Increasing amounts of data and other technologies, including cloud-based and other data storage services, are helping advance machine learning. It has gone from a fascinating concept relegated to academia to an application that businesses have realized they need to figure out how to apply in order to solve more complex problems. Though the number of business use cases is still small, more companies are using machine learning and seeing successful results.
Business of Data Science: Who Does the Hiring
In general, the industries that are hiring data scientists are those that have already seen the value in using this technology, observes Biewald. “You’ll get a killer use case and you see the industry take off around it,” he says. At present, the biggest industry for data science is eCommerce, which may be because search is directly tied to revenue. “If you can make 1 percent improvement in a search algorithm and probably get 1 percent improvement in revenue, that’s huge.”
Biewald names the consumer Internet space (i.e., the LinkedIns, Facebooks, Googles, etc.) as the second largest space for new hires. “They’re the big use cases around user-generated data,” he notes. The third largest industry is financial services, which was arguably the first to learn how to handle big data sets and largely drove initial advances in big data in the 1980s.
Healthcare is also an up-and-coming industry for data scientists. “I think changing regulations and making more data available will lead to an explosion in data applications,” says Biewald. Companies like 23andMe that provide genetic data and those like Fitbit that provide vitals data will also have a real impact on the industry's future, and Lukas posits that healthcare, with the help of modifications to regulations like HIPAA, could become one of the biggest places for hiring data scientists in the next few years.
One of the things that Lukas sees as necessary for the data science field to continue to accelerate is a more flexible language that acknowledges the fact that the main part of a data scientist’s job (at present) is getting data out of data stores. Biewald believes we’ll see larger data sets at every company five years from now, following a trend similar to what companies like Google are doing today. He also thinks that hiring more data scientists will become commonplace and that new positions, like chief data officer and vice president of data, will become more prevalent. “It’s not unusual to see business school students learning to write Python code, and I think these trends have no real sign of letting up in the next five years,” says Lukas.