Better Than Elasticsearch? How Machine Learning is Improving Online Search

Raghav Bharadwaj

Raghav serves as an Analyst at Emerj, covering AI trends across major industry updates and conducting qualitative and quantitative research. He previously worked for Frost & Sullivan and Infiniti Research.


Episode summary: In this episode of AI in Industry, we speak with Khalifeh Al Jadda, Lead Data Scientist at CareerBuilder, about the applications of machine learning in improving a user’s search experience.

Khalifeh also talks about what the future of search might look like and how AI will continue to make the search experience more intuitive (for search engines, platforms, eCommerce stores, and more).

Business leaders listening in will get a sneak peek into the future of online search – and an understanding of how and where improvements in search features could impact their business.

Subscribe to our AI in Industry Podcast with your favorite podcast service:

Guest: Khalifeh Al Jadda, Lead Data Scientist at CareerBuilder

Expertise: Machine learning, big data, search engines

Brief recognition: Khalifeh earned BSc and Master’s degrees in computer science from the Jordan University of Science and Technology, and a PhD in computer science from the University of Georgia. He also served briefly as a Lecturer in computer science at King Saud University before joining CareerBuilder, where he went on to become Lead Data Scientist. He is the founder of the Southern Data Science Conference.

Big Idea

Online businesses such as recruitment or job sites are primarily ‘search tools’. Khalifeh explains that recruitment is a process involving two sides, employers and job seekers, both of which are searching for potential matches for their requirements.

Many other domains and functions in online businesses today depend on running a search, and improving search so that it returns highly relevant results correlates directly with returning customers.

Khalifeh explains that there are traditional search tools (like Elasticsearch) that can deliver relevant results right away, leveraging aspects of natural language processing and other AI techniques. Yet he also believes that businesses built on search (such as large eCommerce stores, or online platforms like CareerBuilder) should focus on improving and tailoring their search experience beyond open source offerings.

Khalifeh explains what he considers to be the challenges with traditional search tools:

  • Traditional search is usually keyword-based and lacks contextual awareness. For example, if one were to input the search query ‘Java’, a traditional search engine would retrieve every document related to the coffee, the island, or the programming language.
  • Domain jargon is not taken into account in the search. For example, in the healthcare sector, the abbreviation DON stands for director of nursing and is used commonly in the industry. Traditional searches would retrieve a lot of irrelevant information (possibly about someone named Don) in such cases.

Khalifeh explains that AI can now enable semantic search, in which the search engine improves results by understanding the context of a search query.
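To make the jargon problem concrete, here is a minimal Python sketch of expanding an abbreviation only when the search is known to be in a given domain. The `JARGON` table, the domain names, and the `expand_jargon` function are hypothetical and written for illustration; they are not CareerBuilder's method.

```python
# Hypothetical per-domain jargon table; a real one would be much larger
# and would likely be learned or curated rather than hard-coded.
JARGON = {
    "healthcare": {
        "don": "director of nursing",
        "rn": "registered nurse",
    },
}


def expand_jargon(query, domain, jargon=JARGON):
    """Replace known abbreviations with their meaning in the given domain.

    Tokens with no entry for that domain are left untouched.
    """
    table = jargon.get(domain, {})
    return " ".join(table.get(token.lower(), token) for token in query.split())


print(expand_jargon("DON Atlanta", "healthcare"))
print(expand_jargon("DON Atlanta", "retail"))
```

With the healthcare context, "DON" is rewritten before the query reaches the index; without it, the query is passed through unchanged, which is roughly what a purely keyword-based engine does today.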

He elaborates on the steps in which this can be achieved through an example search query – ‘hadoop java developer machine learning Atlanta’:

First, there is the question of meaning. What does “Atlanta” represent? Is that a physical location, a person’s name, or a software language? In this case, the meaning is rather clear and a taxonomy of cities might help a search application quickly come to that conclusion. Similarly, the term “Hadoop” should be understood to mean a collection of open-source software.
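A taxonomy lookup of this kind can be sketched in a few lines of Python. The `TAXONOMY` entries, the type labels, and the greedy longest-match strategy below are all assumptions chosen for illustration, not a description of how CareerBuilder's system works.

```python
# Hypothetical, hand-written taxonomy mapping terms to entity types.
TAXONOMY = {
    "atlanta": "location",
    "hadoop": "skill",
    "java": "skill",
    "machine learning": "skill",
    "developer": "job_title",
}


def tag_query(query, taxonomy=TAXONOMY, max_phrase_len=3):
    """Tag each part of the query with its type, preferring longer phrases."""
    tokens = query.lower().split()
    tagged = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shrink it.
        for n in range(min(max_phrase_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in taxonomy:
                tagged.append((phrase, taxonomy[phrase]))
                i += n
                break
        else:
            # No taxonomy entry starts at this token.
            tagged.append((tokens[i], "unknown"))
            i += 1
    return tagged


for term, kind in tag_query("hadoop java developer machine learning Atlanta"):
    print(term, "->", kind)
```

Trying longer phrases first is what lets "machine learning" be tagged as one skill instead of two unrelated words.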

Khalifeh mentions that since recruiters often aren’t experts in the fields that they hire within, it’s important for the search system to parse out these meanings quickly, rather than counting on the user to use the term “Hadoop” in the proper context – or counting on a recruiter to know that “machine learning” is closely related to “data science”.

CareerBuilder can analyze millions of historical search logs for jobs and applicants and ask: How many times are “java” and “developer” searched together as “java developer”? If the two terms are seen together in both job applications and job listings, the system might conclude that there is a high likelihood that they belong together as a single phrase.
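The counting idea can be illustrated with a small Python sketch. The two toy logs, the `min_count` threshold, and the function names are assumptions made for the example; a production system would work over far more data and would likely use a more robust statistic than a raw count.

```python
from collections import Counter


def bigram_counts(texts):
    """Count adjacent word pairs across a list of queries or documents."""
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()
        counts.update(zip(tokens, tokens[1:]))
    return counts


def likely_phrases(job_listings, job_applications, min_count=2):
    """Return word pairs seen at least min_count times in BOTH sources."""
    listings = bigram_counts(job_listings)
    applications = bigram_counts(job_applications)
    return {
        " ".join(pair)
        for pair in listings.keys() & applications.keys()
        if listings[pair] >= min_count and applications[pair] >= min_count
    }


# Toy data standing in for historical logs.
listing_log = ["java developer atlanta", "senior java developer", "java developer"]
application_log = ["java developer", "java developer remote", "developer java"]

print(likely_phrases(listing_log, application_log))
```

Requiring the pair to appear on both sides of the marketplace mirrors the check described above: a pair that shows up only in listings, or only in applications, is weaker evidence of a real phrase.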

Khalifeh explains that firms with advanced search needs might build a specialized in-house taxonomy of such terms, with the ability to (a) bootstrap those terms with public sources like Wikipedia, and (b) update those taxonomies over time as new terms, new entities, and new meanings arise.

In the future, Khalifeh believes that search will become more interactive, and will involve more inputs than a simple text box. Voice and chat will become more popular, and in many cases the conversational nature of voice and chat will make search better.

For example, if a recruiter is looking for a “Project Manager”, they may mean a job in construction, or they may be looking for someone in IT or software. In this case, a system that can prompt a relevant follow-up question (or present a list of choices based on the context and the search) might be much more useful than a simple text box. Khalifeh believes that this dynamic of “search as conversation” will become more prevalent in the years ahead.
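As a rough illustration of that follow-up step, the Python sketch below returns a clarifying question when a query matches a known ambiguous title. The `AMBIGUOUS_TITLES` table and the wording of the prompt are hypothetical; a real system would presumably derive the choices from context and search history rather than a fixed list.

```python
# Hypothetical mapping from ambiguous job titles to the industries
# a recruiter might have in mind.
AMBIGUOUS_TITLES = {
    "project manager": ["Construction", "IT / Software"],
}


def next_prompt(query, ambiguous=AMBIGUOUS_TITLES):
    """Return a clarifying question for an ambiguous query, else None."""
    options = ambiguous.get(query.strip().lower())
    if not options:
        return None  # Nothing to clarify; run the search as-is.
    return "Which industry do you mean? " + " | ".join(options)


print(next_prompt("Project Manager"))
print(next_prompt("Java Developer"))
```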

Interview Highlights with Khalifeh Al Jadda

The main questions Khalifeh answered on this topic are listed below. Listeners can use the embedded podcast player (at the top of this post) to jump ahead to sections they might be interested in:

  • (2:50) What do you see as the critical business value for online search?
  • (6:20) What are some of the innovative developments in search today (using AI) that go over and above traditional search capabilities?
  • (22:04) What do you see as some of the future developments in search?


Header image credit: Sourcecon
