What did machine learning reveal about 4 million hotel reviews?
by Jennifer Woodard Maderazo
MonkeyLearn, a machine-learning text analysis service, analyzed 4 million hotel reviews from around the world posted on the travel information site TripAdvisor, and the results brought several things to light.
Machine Learning over 1M hotel reviews finds interesting insights
https://monkeylearn.com/blog/machine-learning-hotel-reviews-insights/
First, when the reviews were classified as 'positive' or 'negative', 82% of them turned out to be positive.
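As a rough illustration of how such a percentage could be computed once each review carries a sentiment label, here is a minimal Python sketch. The function name `positive_share` and the toy labels are assumptions made for this example; they are not taken from MonkeyLearn's code or data.

```python
from collections import Counter


def positive_share(labels):
    """Return the fraction of labels equal to 'positive' (0.0 for empty input)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return counts["positive"] / total if total else 0.0


# Made-up labels for illustration only; not data from the study.
toy_labels = ["positive", "positive", "negative", "positive"]
share = positive_share(toy_labels)
print(f"{share:.0%} of the toy reviews are positive")
```

In a real pipeline the labels would come from a sentiment classifier rather than a hand-written list.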
Breaking these ratings down by city shows that London's hotels clearly received more negative reviews than those in other cities. The graph shows, from left to right, seven cities: New York, Paris, London, Bangkok, Madrid, Beijing, and Rio de Janeiro. The share of positive reviews is above 80% in most cities; only London falls below 80%.
To find the reason, the reviews were further broken down by aspect. For 'comfort and facilities', London was the most negatively rated city.
For 'food', London has a slightly higher share of negative ratings than the other cities.
For 'location', the results are almost the same in every city. In other words, facilities appear to be what is dragging down London's ratings.
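The breakdown by city and aspect described above amounts to grouping labeled reviews and computing the negative share in each group. The sketch below shows one way to do that with the standard library. The record fields (`city`, `aspect`, `sentiment`), the helper `negative_share_by`, and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def negative_share_by(records, *keys):
    """Group records by the given keys and return the negative share per group."""
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for rec in records:
        group = tuple(rec[k] for k in keys)
        totals[group] += 1
        if rec["sentiment"] == "negative":
            negatives[group] += 1
    return {group: negatives[group] / totals[group] for group in totals}


# Made-up records for illustration only; not data from the study.
toy_records = [
    {"city": "City A", "aspect": "comfort and facilities", "sentiment": "negative"},
    {"city": "City A", "aspect": "comfort and facilities", "sentiment": "positive"},
    {"city": "City A", "aspect": "location", "sentiment": "positive"},
    {"city": "City B", "aspect": "comfort and facilities", "sentiment": "positive"},
]
shares = negative_share_by(toy_records, "city", "aspect")
for (city, aspect), share in sorted(shares.items()):
    print(f"{city} / {aspect}: {share:.0%} negative")
```

Passing a different key, such as a hypothetical `stars` field, would give the same kind of breakdown by star rating.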
Registered hotels are rated from 1 star to 5 stars. More stars generally mean better service and facilities, and the reviews reflect this: the percentage of positive ratings increases with the number of stars.
However, for the 'Internet' aspect alone, ratings are about the same for 3-star and 5-star hotels.
Furthermore, for value for money, 3-star hotels had a higher percentage of positive ratings than 5-star hotels. It seems that many guests felt the facilities were acceptable for the price.
Among specific complaints, hygiene problems such as carpets, beds, stray hairs, bedbugs, and dirt were raised in every city, but only reviews of Bangkok hotels complained about cockroaches. New York reviews also included complaints that shared toilets were dirty. The word 'croissant' appears only in reviews of Paris hotels, and always in a negative context; a closer look shows that the complaints stem from hotels serving nothing but croissants for breakfast.
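A keyword check like the 'croissant' example could be approximated by counting, per city, how many positive and negative reviews mention a given word. The following is a minimal sketch under that assumption. The helper `keyword_sentiment_by_city`, the record fields, and the toy reviews are hypothetical and do not reproduce the published analysis.

```python
import re
from collections import defaultdict


def keyword_sentiment_by_city(reviews, keyword):
    """Count, per city, the positive and negative reviews that mention keyword."""
    # Match the keyword as a whole word, allowing a simple plural "s".
    pattern = re.compile(rf"\b{re.escape(keyword)}s?\b", re.IGNORECASE)
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for rev in reviews:
        if pattern.search(rev["text"]):
            bucket = counts[rev["city"]]
            bucket[rev["sentiment"]] = bucket.get(rev["sentiment"], 0) + 1
    return {city: dict(bucket) for city, bucket in counts.items()}


# Made-up reviews for illustration only; not data from the study.
toy_reviews = [
    {"city": "City A", "sentiment": "negative", "text": "Only a croissant for breakfast again."},
    {"city": "City A", "sentiment": "negative", "text": "Croissants every single morning."},
    {"city": "City B", "sentiment": "positive", "text": "Great bed and fast internet."},
]
result = keyword_sentiment_by_city(toy_reviews, "croissant")
print(result)
```

Cities whose reviews never mention the keyword do not appear in the result, which makes it easy to spot words that are specific to one city.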
The code used in this data analysis is published on GitHub.
GitHub - monkeylearn/hotel-review-analysis: Sentiment analysis and aspect classification for hotel reviews using machine learning models with MonkeyLearn.
https://github.com/monkeylearn/hotel-review-analysis
in Note, Posted by logc_nt