The story behind development at Booking.com, which runs 1,000 A/B tests a day [Part 1]

The reservation site Booking.com began as a startup with a handful of employees and has grown to about 13,000 employees worldwide. Many web services and websites run A/B tests to find out which of two options works better for users, but Booking.com runs 1,000 A/B tests every day. To find out what lies behind that development, I spoke with Mr. David Vismans, the company's Chief Product Officer (CPO).

Domestic as well as overseas! Reservations for hotels and inns at Booking.com

The man in the photo is Mr. David Vismans.

What kinds of A/B tests does Booking.com run? First, he showed some samples. The first A/B test asks: when a user clicks a hotel link in the search results, does the final conversion rate change depending on whether the hotel page opens in the same tab or in a new tab?

Does the conversion rate go up, go down, or stay the same depending on whether a new tab is opened?

In the second A/B test, when the hotel the user wants has no availability for the specified dates, links labeled "Previous week" and "Next week" are shown to display room availability for those weeks. Again, the test checked whether the final conversion rate differs between making no suggestion and suggesting other weeks that have vacancies.

The third test concerns wording. It checked whether the conversion rate changes when the label on each hotel's reservation listing is changed from "Free cancellation, pay when you stay" to "Free cancellation - PAY LATER".

Finally, for the display on mobile devices, the test checked whether the conversion rate changes depending on whether the items are ordered, from the left, "price, maximum occupancy, select button" or "maximum occupancy, price, select button".

The correct answers were as follows. For the first question, the conversion rate rises when the page opens in a new tab. For the second, the conversion rate changes when the previous and following weeks are suggested. For the third, the conversion rate is higher with "Free cancellation - PAY LATER". For the fourth, the conversion rate rises with the order "maximum occupancy, price, select button".

Until now, the way to answer the question "How should a website be designed?" was to consult experts, but as technology has advanced, A/B testing has made it possible to find the best option scientifically without relying on experts. However, it is also true that an A/B test does not tell you why. For example, it is a fact that the final conversion rate rises when the order is changed to "maximum occupancy, price, select button", but as for the reason, one can only guess that the conversion rate rose because the price was displayed in the middle and stood out. There is no way to know whether that guess is correct.

It is very difficult to predict how customers will respond to changes made to websites and apps, and the results of A/B tests do not necessarily match the ideas of experts. So unless a change is properly measured through testing, it may end up making the product worse.
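As a rough illustration of what "measuring" a change can mean, the sketch below compares the conversion rates of two variants with a standard two-proportion z-test. This is a generic textbook technique, not necessarily what Booking.com uses, and the counts are hypothetical values chosen only for the example.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates B - A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts, not Booking.com data.
z, p = two_proportion_z_test(conv_a=2000, n_a=100_000, conv_b=2150, n_b=100_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the two displays is unlikely to be chance alone. As the article notes, it still says nothing about why one display performs better.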

Booking.com is said to run 1,000 A/B tests like those described above every day. In the following images, which show the same top page opened in different environments, you can see that the placement and wording of the content differ. So I asked how the company manages to run such a large number of A/B tests.

How do you recruit the subjects for the A/B tests at Booking.com, and are there any requirements?

David Vismans (hereafter, Mr. Vismans):
We do not target specific users. The subjects are all the ordinary users who visit Booking.com. Whether on the website or on mobile, half of the people who access Booking.com are shown one version and the other half are shown a different version.
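A half-and-half split like this is often implemented by hashing a user identifier, so the same user always lands in the same group. The following is a minimal sketch under that assumption. The function name `variant_for` and the `user-N` identifiers are hypothetical, and the interview does not say how Booking.com actually assigns users.

```python
import hashlib

def variant_for(user_id: str) -> str:
    """Deterministically map a user to variant "A" or "B" (roughly 50/50)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer and take its parity.
    bucket = int.from_bytes(digest[:8], "big") % 2
    return "A" if bucket == 0 else "B"

# Check that the split is close to even over many hypothetical users.
counts = {"A": 0, "B": 0}
for i in range(10_000):
    counts[variant_for(f"user-{i}")] += 1
print(counts)
```

Because the assignment depends only on the identifier, no per-user state has to be stored to show a returning visitor the same version.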

So the test subjects are not aware that they are part of an A/B test?

That is exactly right. And we measure which pattern leads to a final booking. Since we run 1,000 tests every day, there are many patterns depending on the combination of displays, so customers are almost never looking at the same site.

Roughly what is the sample size?

Since it is all users who access the site, it is about 2 million to 20 million.

I would have thought only one comparison could be made at a time, so how do you run as many as 1,000 together? Do you stagger them in time?

We have tools that run 1,000 tests at the same time, and some users may be seeing several tests at once.
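One common way to let a single user take part in several tests at once is to hash the user identifier together with the experiment name, so that each test splits users independently of the others. The sketch below assumes that approach. The experiment names are hypothetical labels for the four sample tests described earlier, and nothing here is taken from Booking.com's actual tooling.

```python
import hashlib
from collections import Counter

def assign(user_id: str, experiment: str) -> str:
    """Assign a user to "A" or "B" independently for each experiment."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return "A" if int.from_bytes(digest[:8], "big") % 2 == 0 else "B"

# Hypothetical experiment names, for illustration only.
EXPERIMENTS = [
    "open_in_new_tab",
    "suggest_other_weeks",
    "pay_later_label",
    "mobile_column_order",
]

def assignments(user_id: str) -> dict:
    """Return the variant this user sees in every running experiment."""
    return {name: assign(user_id, name) for name in EXPERIMENTS}

print(assignments("user-42"))

# Count how many distinct combinations of variants appear across users.
combos = Counter(tuple(assignments(f"user-{i}").values()) for i in range(20_000))
print(len(combos), "distinct combinations")
```

With n independent two-way tests there are 2^n possible combinations of displays, so if many tests apply to the same page, two visitors rarely see exactly the same site.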

I think it is very difficult to judge which combination is best when multiple tests are running at the same time. How do you verify the results and turn them into a new experience?

It is a very complicated system, but it can be done (laughs). It would take a lot of time to explain, though.

For example, does a person look at the results by hand and judge that "this one is good", or is the judgment derived automatically by computing power, using something like deep learning?

People analyze the results with the tools and make the final adjustments.

So the ultimate decision is made by humans?

Yes, the final decision is.

With the A/B tests, the messenger service, the reviews users leave after booking a room, and so on, I imagine a huge amount of information comes in every day. How much data is there?

I have stopped counting, but the total amount of data at Booking.com is now about 15 petabytes. Even though there is plenty of data, our systems have not caught up and we may not even be looking at 20% of it, so very important data may be lying dormant there.

How do you sift through that vast amount of data and use it to improve the website?

We read the patterns in the data. By looking at customer trends, we work out what kinds of new features we could add. We can also make use of the reviews and experiences that customers write.

For example, until now our main role was to help customers who already knew where they wanted to go, but now we also have features that support customers who do not yet know where or when they want to go.

This is "Destination search", a feature for finding a destination. If a user has not yet decided on a destination, it can help them find a place they want to go.

Booking.com: Destination search

Destination search uses a method called "Endorsements", which is based on the opinions of travelers who have been there before. If you search for "Where is the best beach?", you get a list built from past users' opinions about which beaches were good.

Recommended by travelers! The best places in the world for "beach" - Booking.com: Destination search

Also, users who have booked accommodation but are wondering "What should I do there?" can look up local information on what they can do and what options are available after arriving at the destination.

Even after arrival at the destination, we make full use of technology in various ways to make the user's whole journey better. We believe this approach to developing new products is a very scientific and correct one.

Also, for cases where the owner of the accommodation speaks a different language, a system called "booking message" was created, which translates in real time and delivers the correct message.

Booking.com Launches New Messenger App

Right now people are handling these exchanges, but we would like to reduce the burden on hotels by having artificial intelligence respond automatically. Before that, however, we need to collect a large amount of question-and-answer data. We also use booking messages to gather information on what kinds of questions are asked and what kinds of answers are given.

Do you ever consider packaging the huge amount of big data gathered from exchanges between hotels and large numbers of guests and selling it to other partners?

No, we do not.

Why is that?

It is because we look at the data in aggregate using machine learning rather than looking at each conversation. And since it is very important information for our company, we do not disclose that data.

· Continued
The story behind development at Booking.com, which runs 1,000 A/B tests a day [Part 2] - GIGAZINE

in Coverage, Posted by logq_fa