Log Analysis: A Powerful SEO Lever
As a cornerstone of technical SEO, log analysis provides valuable insights to enhance your organic ranking. But how does it actually work?
What is Log Analysis in SEO?
How do search engine robots perceive your website?
For your SEO strategy, it is essential to understand how search engine robots, commonly known as "bots", interact with your site. Knowing what they observe while crawling is vital, because their exploration of your pages is guided by factors such as server response speed, URL depth, and update frequency. It is not uncommon to find discrepancies between what a search engine knows about your site and the pages that actually appear online and generate traffic. Unfortunately, conventional tools like Search Console, Google Analytics, and SEO crawling software provide only partial answers. The most accurate way to determine what bots see is to examine the server's log files, which is why log analysis is a fundamental part of technical SEO.
Log analysis: a crucial lever in technical SEO
The term "log" refers to a file hosted on the website's server. It acts as a journal that records every event occurring on the server, known as "calls" or "hits": each time an entity (a user or a bot) requests one of the site's resources, a new line is added to the log file. Log analysis in SEO consists of extracting the information in this file, studying it, and deriving insights to optimise web pages. This approach provides a holistic view of the site's performance, how bots behave on it, technical errors, internal linking, and more. Most importantly, this is the only data source that can be relied upon completely, because it records what actually happened on the server. It helps you understand how bots crawl the site and apply the right strategies to keep improving your SEO.
What data can be found through log analysis?
- Timestamp: the date and time of each request
- Request type: the HTTP method used (GET, POST, etc.)
- User-agent: the source of the request (a user's browser or a bot)
- Visited URL: the URL accessed
- Referrer URL: the source URL that led to the request
- Page details: size, response time, HTTP response status codes (200, 301, 404, 500), and more
- User-related information: IP address, search engine used, operating system, and more
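To make this concrete, here is what a single hit might look like in a widely used access log format (the "combined" format), together with a minimal Python sketch that extracts the fields listed above. The sample line, the regular expression, and the field names are illustrative; the exact layout depends on how your server (Apache, NGINX, a CDN, etc.) is configured.

```python
import re

# Example entry in the widely used "combined" access log format
# (the exact layout depends on your server configuration).
sample_line = (
    '66.249.66.1 - - [12/Mar/2024:09:15:32 +0000] '
    '"GET /category/product-42 HTTP/1.1" 200 15320 '
    '"https://www.example.com/category" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)

# One possible regex for this format: IP, timestamp, request line,
# status code, response size, referrer and user-agent.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

match = LOG_PATTERN.match(sample_line)
if match:
    hit = match.groupdict()
    print(hit["timestamp"], hit["method"], hit["url"], hit["status"])
    print("Bot hit:", "Googlebot" in hit["user_agent"])
```

In practice the same parsing step is simply repeated over every line of the file, producing a structured dataset that all of the analyses described below can build on.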
Two types of crawling (to be used in combination)
When it comes to crawling, it’s important to distinguish between search engine bots and SEO crawl tools used by specialists. Both approaches are complementary, and combining their data leads to optimal results.
- Search engine bot crawling: Search engine bots crawl your website's pages by following the links they discover during exploration, both internal and inbound. They also take into account the instructions provided through XML sitemaps and robots.txt files, and they may request previously indexed pages that are no longer accessible, as well as pages visited by users but not linked from the rest of the site. Log analysis allows you to study and understand this type of crawl.
- Crawling with SEO tools: SEO crawl tools analyse a website's pages, but only those they can reach by following internal links. This gives a more limited perspective than that of search engine bots, which can also reach "orphaned" pages that are not linked from the rest of the site. The data obtained from this type of crawl is valuable, but it is not as precise as log analysis when it comes to understanding the full scope of your website's crawling dynamics.
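As a simple illustration of how the two perspectives complement each other, the sketch below compares two URL sets: one representing an SEO crawl tool's export (pages reachable through internal links) and one derived from the logs (pages actually requested by search engine bots). The URL values are purely illustrative assumptions.

```python
# A minimal sketch of combining both perspectives. In practice the two URL sets
# would come from your SEO crawl tool's export and from the parsed log file;
# the values below are purely illustrative.
crawler_urls = {          # URLs reachable by following internal links
    "/", "/category", "/category/product-42", "/about",
}
log_urls = {              # URLs actually requested by search engine bots
    "/", "/category", "/category/product-42", "/old-landing-page",
}

orphan_candidates = log_urls - crawler_urls   # crawled by bots but not linked internally
never_crawled = crawler_urls - log_urls       # linked internally but ignored by bots

print("Orphan candidates:", sorted(orphan_candidates))
print("Linked but never crawled:", sorted(never_crawled))
```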
GOOD TO KNOW
Log analysis is primarily suitable for extensive websites, typically consisting of at least 1,000 pages (although this approach can be applied to sites with fewer pages). However, the larger the site, the higher the likelihood of encountering crawl issues, making log analysis even more critical.
Why is log analysis important in SEO?
Why analyse logs?
Log analysis makes it possible to reconstruct, with great accuracy, the path taken by search engine bots on a website. The objective is to gain precise, reliable insight into the crawl process. By understanding how the bots explore the site, such as which pages are visited or neglected (and why), you can identify obstacles and implement the fixes needed to improve the indexing of your web pages. In essence, this analysis answers a fundamental question: "Does the search engine perceive my website in the same way that I do?" The aim is to bring these two perspectives, the search engine's and your own, as close together as possible.
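For instance, once the log entries have been parsed, a bot's path can be reconstructed simply by filtering its hits and sorting them chronologically. The sketch below uses a few hypothetical, already-parsed entries and a hypothetical list of strategic URLs.

```python
from datetime import datetime

# Illustrative sketch: reconstructing the path followed by Googlebot from
# already-parsed log entries (e.g. the dicts produced by a parser like the one
# shown earlier). The hits and the strategic URL list are hypothetical.
hits = [
    {"timestamp": "12/Mar/2024:09:15:32 +0000", "url": "/", "user_agent": "Googlebot"},
    {"timestamp": "12/Mar/2024:09:15:40 +0000", "url": "/category", "user_agent": "Googlebot"},
    {"timestamp": "12/Mar/2024:09:16:03 +0000", "url": "/category/product-42", "user_agent": "Googlebot"},
]
strategic_urls = {"/", "/category", "/best-sellers"}

googlebot_hits = [h for h in hits if "Googlebot" in h["user_agent"]]
googlebot_hits.sort(key=lambda h: datetime.strptime(h["timestamp"], "%d/%b/%Y:%H:%M:%S %z"))

# The bot's path, in the order the pages were requested.
for h in googlebot_hits:
    print(h["timestamp"], "->", h["url"])

# Strategic pages the bot never requested during the period analysed.
neglected = strategic_urls - {h["url"] for h in googlebot_hits}
print("Strategic pages never crawled in this period:", neglected)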
What is the purpose of log analysis?
- Monitoring bot crawls: Log analysis lets you extract crucial information about the crawl process, in particular how each search engine's crawl budget is spent on your site. Which pages do search engines consider important? Are they the same as your strategic pages? Is any crawl budget being wasted? In short, you can precisely monitor how the bot crawl progresses (a simplified sketch after this list shows one way to measure this spend).
- Understanding what bots see: Studying logs puts you in the shoes of the crawler robots and shows you what they see when they explore your website. This uncovers vital data: gaps in exploration, pages indexed despite not being reachable through internal linking, URL response code errors, excessively long loading times, the types of files crawled, the focus on the mobile version, and more.
- Conducting a comprehensive technical SEO audit: Log analysis is an essential part of any technical SEO audit, as it identifies the indicators that need optimisation. It does, however, require specific skills: extracting data from log files, interpreting it, and cross-referencing it with other SEO information. Only SEO experts can fully leverage its benefits.
- Overcoming obstacles to organic search ranking: Log analysis is a fundamental aspect of SEO. Its goal is to understand what hinders the proper indexing of pages, and even to identify the causes of a sudden drop in SERP rankings. This allows for concrete actions that make crawling easier for bots, such as fixing response code errors, detecting "bot traps", and improving page loading times.
- Quantifying and supporting SEO efforts: The true value of log analysis lies in the precision and reliability of the information it provides. These insights take the form of measurable indicators that serve as clear evidence of the issues the website faces and of the need for corrective action. The findings therefore make it possible to prioritise optimisation work and to win the support of decision-makers by demonstrating the problems identified and proposing practical solutions.
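As an illustration of the crawl-budget monitoring mentioned above, the sketch below groups a list of Googlebot-requested URLs by site section and reports each section's share of bot requests. The URL list and the sectioning rule (grouping by first path segment) are assumptions for the example.

```python
from collections import Counter
from urllib.parse import urlparse

# Illustrative sketch: measuring where the crawl budget is spent, assuming a
# list of URLs requested by Googlebot has already been extracted from the logs.
bot_urls = [
    "/category/shoes/product-1",
    "/category/shoes/product-2",
    "/search?q=shoes&page=42",
    "/search?q=boots&page=7",
    "/blog/how-to-choose-shoes",
]

def section(url: str) -> str:
    """Group URLs by their first path segment (e.g. /category, /search, /blog)."""
    path = urlparse(url).path
    return "/" + path.strip("/").split("/")[0] if path.strip("/") else "/"

hits_per_section = Counter(section(u) for u in bot_urls)
for sec, count in hits_per_section.most_common():
    share = 100 * count / len(bot_urls)
    print(f"{sec}: {count} hits ({share:.0f}% of bot requests)")
```

Run over a full log export, this kind of breakdown quickly reveals whether low-value sections (faceted search, filters, parameters) are absorbing crawl budget that should go to strategic pages.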
When should log analysis be conducted?
Log analysis can be undertaken at any stage of a website’s existence, particularly if it boasts a significant volume (over 10,000 pages). It proves particularly beneficial during technical SEO audits or monthly reviews. For smaller-scale websites (1,000 to 10,000 pages), this approach becomes especially useful in various scenarios: during a website redesign or migration, when introducing a large number of new URLs, or if changes in SEO page rankings are observed. Furthermore, log analysis is particularly well-suited for e-commerce sites, which often feature numerous crucial pages (such as product listings), as well as news sites that require a keen understanding of crawl frequency (ensuring rapid indexing of new content by search engines).
How to analyse data from logs?
How to conduct a log analysis
Log files are huge and contain raw, unstructured information; going through them manually would be overwhelming and time-consuming. You therefore need tools that do the work for you, sifting through the raw data to quickly extract what matters for your SEO. Several solutions are available on the market, such as Screaming Frog or OnCrawl. Their output should, however, be combined with SEO crawling tools and data analysis tools (Search Console, Google Analytics, etc.). While these tools make log interpretation easier, on their own they are not enough to produce a comprehensive analysis or actionable recommendations. Finally, it is vital to know which indicators are relevant and how to use them.
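One preparatory step worth automating is filtering out fake bot traffic, since the user-agent string in a log line can be spoofed. Below is a minimal sketch of the forward-confirmed reverse DNS check that Google documents for verifying Googlebot; the example IP address is illustrative and the check requires network access.

```python
import socket

def is_genuine_googlebot(ip: str) -> bool:
    """Forward-confirmed reverse DNS check for a hit claiming to be Googlebot."""
    try:
        host = socket.gethostbyaddr(ip)[0]             # reverse DNS lookup
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        return ip in socket.gethostbyname_ex(host)[2]  # forward-confirm the hostname
    except OSError:
        return False

# Example usage (the IP below is illustrative):
print(is_genuine_googlebot("66.249.66.1"))
```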
Relevant indicators to consider during log analysis
- Crawl frequency: This indicator shows how often search engine bots explore your website, including the dates of their most recent visits and the number of visits (a single page may be crawled several times a day if it is deemed highly important). It lets you verify whether your strategic pages are being visited regularly (the sketch after this list shows a simple way to compute it from the logs).
- Crawl window: The number of days needed for bots to crawl all of the URLs on your website. The crawl window typically ranges from one to three weeks and represents the average interval between bot visits. It is worth bearing this timeframe in mind when making modifications, since it indicates when they will be indexed and when their impact on rankings will be felt. It also helps you anticipate favourable periods for publishing new content.
- Use of crawl budget: Log analysis shows how search engines allocate their crawl budget, i.e. the limit on the number of pages a bot will crawl on your site. This reveals which pages or sections receive priority during crawling and makes it possible to redirect bots towards the URLs that matter most for SEO.
- Technical SEO errors: Several indicators help identify technical obstacles to indexing, such as duplicated URLs, slow page loading, or URLs returning problematic status codes (301 redirects, 404 and 500 errors). These issues not only impede exploration of the site by search engine bots but also degrade the user experience, which search engines take into account when assessing the relevance of a page.
- Irrelevant content: Log analysis helps identify low-value content that bots keep crawling, with two negative effects: it consumes crawl budget, diverting attention from strategic pages and delaying their discovery, and it can harm your SEO. This includes duplicated content, 404 errors, compromised pages, and low-quality content, among others.
- Orphaned web pages: By combining log analysis with an SEO crawl, you can uncover "orphaned" pages: pages that search engines know about and that may even appear in results and generate traffic, but that receive no internal links from the rest of the site. These pages should be reintegrated into the website's internal linking structure.
- Mobile crawling: Logs let you measure the proportion of mobile crawling compared with desktop crawling. On Google, this indicator helps determine whether a site has moved to the Mobile-First index, in which case it is visited predominantly by the smartphone bot (around 80% of hits), with the desktop bot accounting for the remainder (around 20%) for the traditional index. This tells you whether to focus on optimising the mobile versions of your pages (see the sketch after this list).
- Strength of internal linking: By comparing the URLs crawled by search engines with those found during an SEO crawl, log analysis provides valuable insight into a website's structure and internal linking. The next step is to strengthen this internal linking so that future bot crawls more easily reach important pages that were previously overlooked.
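To illustrate two of these indicators, the sketch below computes crawl frequency (number of visits and most recent visit per URL) and the mobile/desktop split from a few already-parsed bot hits. The dates and the simplified user-agent labels are assumptions for the example; real Googlebot user-agent strings are longer and would be matched rather than compared literally.

```python
from collections import Counter, defaultdict

# Illustrative sketch covering two indicators, crawl frequency and the
# mobile/desktop split, from already-parsed Googlebot hits (hypothetical data).
hits = [
    {"date": "2024-03-12", "url": "/", "user_agent": "Googlebot-Smartphone"},
    {"date": "2024-03-12", "url": "/category", "user_agent": "Googlebot-Smartphone"},
    {"date": "2024-03-14", "url": "/", "user_agent": "Googlebot"},
    {"date": "2024-03-15", "url": "/category/product-42", "user_agent": "Googlebot-Smartphone"},
]

# Crawl frequency: number of visits and most recent visit per URL.
visits = defaultdict(list)
for h in hits:
    visits[h["url"]].append(h["date"])
for url, dates in sorted(visits.items()):
    print(f"{url}: {len(dates)} visit(s), last crawled {max(dates)}")

# Mobile vs desktop crawling: share of smartphone bot hits.
device = Counter("mobile" if "Smartphone" in h["user_agent"] else "desktop" for h in hits)
total = sum(device.values())
for kind, count in device.items():
    print(f"{kind}: {100 * count / total:.0f}% of bot hits")
```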
Our Commitment
- Expertise: Since 2010, we have worked with over 2,000 clients across 90 countries.
- Passion: We are a team of passionate, industry-focused individuals committed to your success.
- Performance: We are committed to implementing data-driven strategies that make a real impact on your bottom line by providing avenues for growth.
Any questions?
In SEO, log analysis involves extracting information from log files (associated with the website’s server) to optimise performance. Logs record all events that occur on the server, including activities from users and search engine bots during crawling. Log analysis provides highly precise and reliable data, offering valuable insights for optimisation purposes.
By analysing logs, we can reconstruct the exact path taken by search engine robots during the crawl of a website. This allows us to “see” the site exactly as the bots “see” it and extract a wealth of essential information for the site’s SEO. Log analysis is an indispensable tool for technical SEO.
Manual log analysis is not feasible due to the complexity of the task and the large volume of raw data to process. Dedicated tools such as Screaming Frog or OnCrawl are therefore used to extract, visualise, and work with the log data. This data should also be cross-referenced with the results of a standard SEO crawl. Finally, it is crucial to identify the indicators that are relevant to your SEO objectives.