Googlebot, the crawler behind Google Search, now "clicks" AJAX and JavaScript and behaves like a real user


By Mechanekton

Googlebot (Google Bot) is Google's web crawling robot: a set of programs and processes that download new and updated pages one after another and add them to Google's search index. It is also known as a crawler, because it fetches ("crawls") billions of pages across the Internet.

But it is, after all, a program, so it has not handled pages built with AJAX and JavaScript very well. It could crawl AJAX content to some extent, but its behavior differed from what happens when a person actually clicks, so Google itself published "AJAX Crawl: Guide for Webmasters and Developers", in which it had to recommend a scheme for making AJAX content easier for Googlebot to crawl.

By Okto

However, Googlebot appears to have been upgraded recently: it now "clicks" on AJAX and JavaScript elements on its own, behaving much like a real user.

Google Bot now crawls arbitrary Javascript sites
http://swapped.tumblr.com/post/23133779276/google-bot-now-crawls-arbitrary-javascript-sites


This was discovered by Alex Pankratov, a professional software developer living in Vancouver, Canada, known for the VPN software "Hamachi" and also one of the developers of Swapped.cc.

According to Alex Pankratov's blog, one day he found an entry like the following in his Apache log.

66.249.67.106 ... "GET /ajax/xr/ready?x=clcgvsgizgxhfzvf HTTP/1.1" ...

This is one of swapped.cc's AJAX requests, which means the bot executed JavaScript on the page at some point. A lookup of the recorded IP address "66.249.67.106" returns "crawl-66-249-67-106.googlebot.com", and since the A record for that hostname points back to the same address, the visitor turned out to be the genuine Googlebot.

Further analysis turned up the following log entry as well.
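The check described above (reverse lookup of the IP, then confirming that the hostname's A record points back to the same IP) can be sketched roughly as follows. This is a minimal illustration, not the method Alex Pankratov used: the function name `is_genuine_googlebot` and the injectable `reverse`/`forward` parameters are assumptions made so the logic can be exercised without network access, and only the `googlebot.com` suffix mentioned in the article is checked.

```python
import socket
from typing import Callable, List


def default_reverse(ip: str) -> str:
    # Reverse (PTR) lookup: IP address -> primary hostname.
    return socket.gethostbyaddr(ip)[0]


def default_forward(host: str) -> List[str]:
    # Forward (A record) lookup: hostname -> list of IPv4 addresses.
    return socket.gethostbyname_ex(host)[2]


def is_genuine_googlebot(
    ip: str,
    reverse: Callable[[str], str] = default_reverse,
    forward: Callable[[str], List[str]] = default_forward,
) -> bool:
    """Hypothetical helper: forward-confirmed reverse DNS check for Googlebot."""
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:
        # No PTR record or lookup failure: cannot be confirmed.
        return False
    # The hostname must live under googlebot.com (the suffix seen in the log).
    if not host.endswith(".googlebot.com"):
        return False
    try:
        # The A record must point back at the IP that made the request.
        return ip in forward(host)
    except OSError:
        return False
```

Calling `is_genuine_googlebot("66.249.67.106")` with the default resolvers performs live DNS lookups, so the result depends on the current DNS records. The forward step matters because anyone who controls the reverse zone for their own IP range can make a PTR record claim any hostname.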

66.249.67.106 ... "GET /content/halloc/index.html?&x=clcgvsgizgxhfzvf ...

This request is issued via AJAX when a menu item is clicked. In other words, Googlebot emulates the clicks a user would actually make while crawling the site, which means it can also crawl pages that are only reachable that way.

As a result, it is no longer necessary to generate URLs for Googlebot using "_escaped_fragment_", the method Google had recommended until now for making AJAX pages known to Googlebot.

Although this update is modest, it means that Googlebot's behavior is finally getting closer to that of a real person, a real user, and it should give site builders more freedom in how they create websites.
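Both log entries quoted above come from the same IP address and carry the same `x` parameter value, which is what ties the menu-click request to the earlier AJAX call. The following sketch shows one way such entries could be pulled out of a log and grouped. The regular expression and the helper names `parse_entry` and `group_by_token` are assumptions for illustration; the elided parts of the log lines ("...") are left exactly as they appear in the article, so the pattern only relies on the leading IP and the quoted request.

```python
import re
from collections import defaultdict
from urllib.parse import parse_qs, urlsplit

# Leading client IP, then the request target inside the quoted request line.
REQUEST_RE = re.compile(r'^(\S+) .*?"(?:GET|POST|HEAD) (\S+)')

# The two entries quoted in the article, elisions kept as-is.
LOG_LINES = [
    '66.249.67.106 ... "GET /ajax/xr/ready?x=clcgvsgizgxhfzvf HTTP/1.1" ...',
    '66.249.67.106 ... "GET /content/halloc/index.html?&x=clcgvsgizgxhfzvf ...',
]


def parse_entry(line):
    """Return (ip, path, query_params) for a log line, or None if no match."""
    match = REQUEST_RE.match(line)
    if match is None:
        return None
    ip, target = match.groups()
    parts = urlsplit(target)
    return ip, parts.path, parse_qs(parts.query)


def group_by_token(lines, param="x"):
    """Group request paths by the value of one query parameter."""
    groups = defaultdict(list)
    for line in lines:
        entry = parse_entry(line)
        if entry is None:
            continue
        _ip, path, params = entry
        for value in params.get(param, []):
            groups[value].append(path)
    return dict(groups)
```

Running `group_by_token(LOG_LINES)` puts `/ajax/xr/ready` and `/content/halloc/index.html` under the same `x` value, mirroring the observation that the second request follows from JavaScript executed on the page.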
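For context, the older scheme asked sites to use "#!" URLs, which the crawler would then request in a rewritten form with the fragment moved into an `_escaped_fragment_` query parameter. The sketch below illustrates that rewrite only roughly: the function name `to_escaped_fragment` and the `example.com` URLs are hypothetical, and the set of characters left unescaped is an approximation rather than a faithful copy of Google's specification.

```python
from urllib.parse import quote

# Characters left unescaped in the fragment value (an approximation;
# '#', '%', '&', '+' and spaces are percent-encoded).
SAFE_CHARS = "=/:,;!*'()$@?"


def to_escaped_fragment(url: str) -> str:
    """Rewrite a '#!' URL into its '_escaped_fragment_' form."""
    base, marker, fragment = url.partition("#!")
    if not marker:
        # Not a hash-bang URL: nothing to rewrite.
        return url
    # Append to an existing query string if there is one.
    joiner = "&" if "?" in base else "?"
    return f"{base}{joiner}_escaped_fragment_={quote(fragment, safe=SAFE_CHARS)}"
```

For example, `to_escaped_fragment("http://example.com/page#!key=value")` yields `http://example.com/page?_escaped_fragment_=key=value`. The site was expected to answer that rewritten URL with a static snapshot of the AJAX state, which is the extra work the article says is no longer needed.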

By Ѕolo

in Note, Web Service, Posted by darkhorse