To comply with the GDPR, the IP addresses in the log files should be anonymized. This can be done, for example, by replacing the last octet with a 0 using a small script.

How do I see what Google is crawling on my site?

Googlebot can be identified via the user agent that is sent with every request. The typical Googlebot identifies itself with a user agent containing "Googlebot/2.1". Since the switch to the mobile-first index, "Googlebot Smartphone" crawls even more frequently.
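The anonymization step described above can be sketched in a few lines of Python. The regex and the sample log line are purely illustrative; the idea is simply to zero out the last octet of every IPv4 address before the line is stored.

```python
import re

# Match an IPv4 address and capture everything except the last octet.
IPV4 = re.compile(r'\b(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}\b')

def anonymize_line(line: str) -> str:
    """Replace the last octet of every IPv4 address in a log line with 0."""
    return IPV4.sub(r'\1.0', line)

line = '66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512'
print(anonymize_line(line))
# → 66.249.66.0 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512
```

In practice you would run this over the log file before archiving it, so the full IP never reaches long-term storage.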
However, the user agent can be spoofed, so not every request with the user agent "Googlebot" actually comes from Google's crawler. It therefore does not make sense to rely on the user agent alone.

How can I verify Googlebot?

Google typically crawls from IP addresses beginning with 66.249. This IP range can be used to verify whether a request really came from Googlebot. To be on the safe side, you can compare the daily Googlebot traffic with the official numbers from Google Search Console (in the old Search Console under Crawling > Crawl Stats).

How can I evaluate the log files?

You could of course start unpacking the huge files, preparing the individual fields as columns in Excel, and filtering out the Googlebot rows.
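The check on the 66.249. range can be sketched like this. The first function is the quick plausibility check from the text; the second sketches the stricter verification Google recommends (reverse DNS to a googlebot.com host, confirmed by a forward lookup) — it needs live network access, so treat it as an illustration rather than something to run in a test suite.

```python
import socket

def in_googlebot_range(ip: str) -> bool:
    # Quick plausibility check: Googlebot typically crawls from 66.249.x.x.
    return ip.startswith("66.249.")

def verify_googlebot(ip: str) -> bool:
    # Stricter check: reverse DNS plus a confirming forward lookup.
    # Requires network access; shown here only as a sketch.
    try:
        host = socket.gethostbyaddr(ip)[0]  # e.g. crawl-66-249-66-1.googlebot.com
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        return ip in socket.gethostbyname_ex(host)[2]
    except OSError:
        return False
```

The prefix check is cheap enough to run on every log line; the DNS check is better saved for spot checks or suspicious traffic.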

You can do that, but it quickly becomes frustrating. That is why resourceful people have developed tools for it. We use the Screaming Frog Log File Analyzer: we simply drag and drop the compressed log files into it, and the tool automatically filters out all search engine bots. The result looks something like this:

[Screenshot: Screaming Frog Log File Analyzer]

There are also various other tools. For continuously processing and evaluating log files, a popular solution is the so-called ELK stack, consisting of the open-source components Elasticsearch, Logstash and Kibana.
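The core of what such tools do — keeping only the lines that both claim to be Googlebot and come from a plausible Google IP — can be sketched in a few lines. The log format and sample lines below are illustrative:

```python
def googlebot_hits(lines):
    """Yield log lines whose user agent mentions Googlebot AND whose
    source IP falls into the 66.249. range, filtering out spoofed bots."""
    for line in lines:
        ip = line.split(" ", 1)[0]  # common log format: IP is the first field
        if "Googlebot" in line and ip.startswith("66.249."):
            yield line

log = [
    '66.249.66.1 - - "GET /about HTTP/1.1" 200 "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '203.0.113.9 - - "GET /about HTTP/1.1" 200 "Mozilla/5.0 (compatible; Googlebot/2.1)"',  # spoofed
    '198.51.100.2 - - "GET / HTTP/1.1" 200 "Mozilla/5.0"',
]
print(list(googlebot_hits(log)))  # only the first line survives
```

A dedicated tool or the ELK stack adds parsing, aggregation and dashboards on top, but the filtering logic is essentially this.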