Airbnb Machine Learning – How Data and Social Science Make it All Work

Expertise: Data science and economics

Brief Recognition: Elena Grewal leads a team of data scientists responsible for the user’s online and offline travel experience at Airbnb. Her team partners with the product team to understand and optimize all parts of the product, using experimentation and machine learning in a wide variety of contexts. Prior to Airbnb, Elena was a doctoral candidate in the Economics of Education program at the Stanford University School of Education. She received a B.A. in Ethics, Politics, and Economics, with distinction, from Yale University, and a master’s degree in Economics from Stanford University. She was also the recipient of the Stanford Interdisciplinary Graduate Fellowship.

Current Affiliations: Interim Leader of Data Science at Airbnb; Illuminating Engineering Society (IES) Fellow

The Logic Behind Airbnb Listings

Wondering how Airbnb sorts and delivers its listings when you search for a place to stay on your next getaway? If you know anything about machine learning, you might expect that a plethora of variables goes into sorting the tens of thousands of listings that are sometimes available in a specific location.

Unlike a machine, no one human being can go through each listing – and if you’re indecisive by nature, this could pose an existential problem. That’s why Airbnb’s machine learning algorithms do the work for you, pulling signals from a variety of data points, depending on whether you’re a host or a guest.

This was all explained by Elena Grewal, Interim Leader of Data Science at Airbnb, who took the time to sit down and shed some light on the signals and information the company uses to make sense of and sort these listings. Recommendation engines are never simple, but the data science behind Airbnb’s algorithms is all the more complex because of the number of variables that go into matching the preferences of both guests and hosts. Search factors include a user’s past searches and clicks, host preferences, and other preferences, all of which go into the algorithmic system before it churns out ideal listings.
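To make the idea of a two-sided ranking concrete, here is a minimal sketch in Python. It is not Airbnb’s model: the `Listing` fields, the example scores, and the choice to multiply the two signals are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    name: str
    guest_affinity: float  # hypothetical 0..1 score from the guest's past searches and clicks
    host_fit: float        # hypothetical 0..1 score for how well the trip suits the host's preferences


def rank_listings(listings):
    """Sort listings so that those scoring well on BOTH sides come first."""
    return sorted(
        listings,
        key=lambda listing: listing.guest_affinity * listing.host_fit,
        reverse=True,
    )


listings = [
    Listing("Downtown loft", guest_affinity=0.9, host_fit=0.2),
    Listing("Lakeside cabin", guest_affinity=0.7, host_fit=0.8),
    Listing("Shared room", guest_affinity=0.3, host_fit=0.9),
]
ranked = [listing.name for listing in rank_listings(listings)]
print(ranked)
```

In this toy example the loft the guest likes most falls to the bottom because it is a poor fit for the host, which is the kind of trade-off a two-sided marketplace has to weigh.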

“Another big piece is pricing,” says Grewal. “You might not see on the guest side, but for host suggestions…there’s lots of information about how many people are looking for your kind of home in that area, comparisons to other homes, and the probability that you’ll get a booking.” All of these variables feed an algorithmic pricing model for hosts. There are also machine learning models that guests and hosts don’t see, like those used to help detect fraud on one side or the other.

What you see versus what your neighbor sees could depend on what kind of guests enjoyed a given space, as well as on featured highlights and reviews. For example, someone who tends to book a lot of small, cabin-like lodgings might see a similar space in a particular area highlighted among their recommendations. But don’t get too comfortable just yet.

A Mission of Inclusion

“We have a mission to make people feel like they belong anywhere…we want to show you something you’re likely to like, but also try to get you out of your comfort zone,” says Grewal. So while you might love the cabins, Airbnb could try to give you a ‘little push’ by throwing in other available listings that fall outside your typical pattern of choice – why not try a treehouse or a canvas wall tent? Novelty can be a valuable commodity, and Airbnb seems to have struck a reasonable balance in this department. As Grewal sums it up, “We’re looking to exceed your expectations for something you didn’t initially have as one.”
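One simple way to picture that ‘little push’ is to reserve a few slots in the results for listings outside a guest’s usual pattern. The Python sketch below is only an illustration under that assumption; the function name `mix_in_novelty` and the every-third-slot rule are hypothetical, and nothing in the interview says Airbnb does it this way.

```python
def mix_in_novelty(familiar, novel, every=3):
    """Interleave results so that every `every`-th slot holds a novel listing.

    `familiar` and `novel` are lists already ordered by relevance.
    Novel listings are used until they run out.
    """
    if every < 2:
        raise ValueError("every must be at least 2")
    result = []
    novel_queue = list(novel)
    for count, item in enumerate(familiar, start=1):
        result.append(item)
        # After each run of (every - 1) familiar items, slot in one novel listing.
        if count % (every - 1) == 0 and novel_queue:
            result.append(novel_queue.pop(0))
    return result


familiar = ["Cabin A", "Cabin B", "Cabin C", "Cabin D"]
novel = ["Treehouse", "Canvas wall tent"]
mixed = mix_in_novelty(familiar, novel)
print(mixed)
```

The cabin lover still sees mostly cabins, with a treehouse and a tent appearing at regular intervals rather than crowding out the familiar choices.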

While Airbnb was able to ditch the machine learning training wheels a while back, Grewal points out that they’re constantly tweaking the algorithms to make a better model. The ranking of the listing, the type of home preferred (private room, whole house, etc.), the kinds of clicks a customer makes most often on the platform – these are all factors that go into revamping working models.

It’s all about keeping the parameters tuned to what guests want or need and pairing that with an optimum location match. For example, if you’re looking to stay in the San Francisco area, does that mean that Airbnb should show results for the East Bay as well? These are the types of related questions Grewal and her team are continuously asking themselves in trying to figure out and incorporate relevant variables.

Fighting Discrimination with Machine Learning

Grewal again points out that Airbnb is a two-sided marketplace, and the host has to say yes to the guest as well. Host preferences also shape a guest’s search results, which favor listings from hosts who are likely to accept you as a guest. For example, if you’re searching for a one-night stay in the middle of the week, there are hosts who won’t have that day available in their calendar. This is interesting and potentially controversial territory (hence the discrimination issue that attracted much attention after Harvard-published claims). We asked Grewal how the technology, in addition to public relations efforts, has played a role in ensuring that hosts are playing a fair game.

“It’s something we take seriously…our mission is for people to belong anywhere,” she emphasized. Grewal was part of a discrimination task force that she says has been conducting a 90-day review over the last three months. At the start, she noted, leaders met every day and came up with a plan for preventing discrimination (subconscious and otherwise), both within the company and among hosts.

Airbnb’s data science team has an important voice in this ongoing conversation, bringing a unique understanding of the problem and how to tackle it from a machine learning-based point of view. Grewal mentioned that Airbnb recently hired a “Marketplace Belonging and Diversity” member, whose goal is to work full-time on a product team dedicated to improving fairness and inclusion on the platform.

The task force’s top-down review, which included advisers like former US Attorney General Eric H. Holder Jr., resulted in a public report of anti-bias policies, which the company introduced in September of this year. One of the ideas that emerges from the report is that while innovation is not immune to biases, however subconscious they may be, machine learning may be able to help advance belonging and inclusion by detecting patterns.

Some have expressed concerns that the report doesn’t shed enough light on how the Airbnb technology responsible for identifying discriminatory behavior actually works, but this may be because the data science team is right in the thick of uncovering these holes and biases in the system. Airbnb’s anti-discrimination team, which to its credit is being hailed as the first of its kind, is dedicated to finding patterns in host behavior, performing “tests” with input from social science experts, examining algorithms, and making ongoing adjustments to the technical underpinnings of Airbnb’s platform.

The report does, however, specify changes to Airbnb policy that will help combat blatant discrimination. For example, starting November 1, 2016, rental hosts must sign a “community commitment” that prevents them from choosing certain guests over others for the same time period; if a host rejects a guest’s request for a certain block of days, they’ll be unable to accept a different guest for those same dates.
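The date-blocking rule is simple enough to sketch in a few lines of Python. This is an illustrative model only, assuming a hypothetical `HostCalendar` class in which a rejected request blocks each night of the stay; the report does not describe how Airbnb actually implements it.

```python
from datetime import date, timedelta


class HostCalendar:
    """Hypothetical calendar that blocks the nights of any rejected request."""

    def __init__(self):
        self.blocked = set()

    @staticmethod
    def _nights(check_in, check_out):
        # A stay covers every night from check-in up to, but not including, check-out.
        return {check_in + timedelta(days=i) for i in range((check_out - check_in).days)}

    def reject(self, check_in, check_out):
        # Rejecting a request makes those nights unavailable to every other guest.
        self.blocked |= self._nights(check_in, check_out)

    def can_accept(self, check_in, check_out):
        # A later request is allowed only if it avoids all blocked nights.
        return self.blocked.isdisjoint(self._nights(check_in, check_out))


calendar = HostCalendar()
calendar.reject(date(2016, 11, 10), date(2016, 11, 13))
overlapping = calendar.can_accept(date(2016, 11, 12), date(2016, 11, 14))
later = calendar.can_accept(date(2016, 11, 13), date(2016, 11, 15))
print(overlapping, later)
```

Here the host who turned down a stay for November 10 to 13 cannot then accept another guest whose stay includes the night of November 12, but remains free to accept one arriving on the 13th.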

The company’s new “Open Doors” program commits to helping guests who feel they’ve been discriminated against find different accommodation at any hour of the day. Other anti-discrimination strategies include efforts to reduce the prominence of user photographs and the acceleration of instant bookings (which don’t need host approval) to 1 million hosts by January 2017.

In dealing with this issue, Airbnb faces an interesting challenge in that it (by choice) doesn’t collect information on sex or race from either guests or hosts; it doesn’t even have labels for photo images. An initiative that was meant to provide free and open choice poses a paradoxical conundrum – how does the data science team measure these aspects of the decision process if there’s no hard data? Grewal explained that they don’t want to ask users, so the team has been busy thinking about how they can use existing data to come up with viable solutions. In the end, machine learning can’t solve everything, but it can provide new insights for data scientists working to create more equitable systems.