
XML Sitemaps: A Comprehensive Guide to Creation and Practical Tips


An XML sitemap is a file on your website that lists the pages you want search engines to index, along with additional details about that content. Sitemaps matter for SEO because they help search engines find new and updated pages more quickly.

“XML” stands for Extensible Markup Language, a text format for structured data. It stores URL data in a form that search engines can process easily.

Here’s an example of what an XML sitemap looks like:

[Image: example XML sitemap. Photo by Semrush]
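
As a rough illustration of the structure, the following Python sketch builds a tiny sitemap with the standard library and prints the resulting XML. The example.com URLs, the dates, and the helper name build_sitemap are placeholders chosen for this illustration, not part of any real site or tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML (bytes) for an iterable of (url, lastmod) pairs."""
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    ET.indent(urlset)  # pretty-print; requires Python 3.9+
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Placeholder pages, for illustration only.
example = build_sitemap([
    ("https://www.example.com/", "2024-01-15"),
    ("https://www.example.com/blog/", "2024-02-01"),
])
print(example.decode("utf-8"))
```

Each page becomes one url entry holding its address (loc) and, optionally, the date it last changed (lastmod).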

Google first introduced XML sitemaps in 2005, followed by other search engines like Yahoo and Microsoft.

This article will explore the necessity of a sitemap, discuss different sitemap types and best practices, and guide you through creating, verifying, and submitting your XML sitemap.

Evaluating the Need for an XML Sitemap

Google outlines situations where a sitemap might be unnecessary. For example, if your website is small (roughly 500 pages or fewer) and has well-organized internal linking, Google can usually discover its pages without one, so you might wonder if a sitemap is essential.

The short answer is no, it’s not required for such sites. However, it’s still advisable to have one. Here’s why:

1. Creating a sitemap requires minimal effort.

2. A correctly built sitemap carries no real downside for your website.

3. It can expedite Google’s process of discovering your pages.

Google’s Gary Illyes highlights that XML sitemaps are the second most significant way for Google to find new URLs.

So, why not leverage this tool?

Moreover, an XML sitemap is crucial for large websites with thousands of pages and for new websites with few inbound links.

Types of XML Sitemaps

XML sitemaps come in different forms, catering to various content types. Google, for instance, recognizes sitemaps specifically for:

1. Video content

2. Images

3. Google News

These specialized media types can be included in a standard sitemap or have their own dedicated sitemaps.

While a single simple sitemap suffices for most websites, certain scenarios may necessitate multiple sitemaps or specialized ones for different content types.

For instance, if your site has a large number of indexable URLs, you might need more than one sitemap. This is because a single XML sitemap is limited to 50,000 URLs and an uncompressed file size of 50MB. If you would exceed either limit, you have to split your sitemap into several files.
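
A minimal sketch of the splitting step, assuming you already have the full list of URLs in memory. It only enforces the 50,000-URL limit; the 50MB size limit would need a separate check on the generated files. The function name split_urls and the example.com URLs are hypothetical.

```python
from itertools import islice

MAX_URLS_PER_SITEMAP = 50_000


def split_urls(urls, limit=MAX_URLS_PER_SITEMAP):
    """Yield successive lists of at most `limit` URLs."""
    iterator = iter(urls)
    while True:
        chunk = list(islice(iterator, limit))
        if not chunk:
            return
        yield chunk


# 120,000 placeholder URLs end up in three sitemap-sized groups.
urls = [f"https://www.example.com/page-{i}" for i in range(120_000)]
chunks = list(split_urls(urls))
print([len(chunk) for chunk in chunks])  # [50000, 50000, 20000]
```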

Additionally, you might opt for separate sitemaps for different page categories, like blog posts or author profiles.

In cases where multiple XML sitemaps are used, it’s best to employ a sitemap index. This is a meta-sitemap that lists all your individual sitemaps, serving as a comprehensive directory.

An example of a sitemap index is as follows:

[Image: example sitemap index. Photo by Semrush]
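
To make the idea concrete, here is a small Python sketch that generates a sitemap index following the sitemaps.org format. The file names (post-sitemap.xml, page-sitemap.xml) and the helper name build_sitemap_index are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML (bytes) listing the given sitemap files."""
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for loc in sitemap_urls:
        entry = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = loc
    ET.indent(index)  # pretty-print; requires Python 3.9+
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


# Hypothetical child sitemaps, one per content type.
index_xml = build_sitemap_index([
    "https://www.example.com/post-sitemap.xml",
    "https://www.example.com/page-sitemap.xml",
])
print(index_xml.decode("utf-8"))
```

The index uses a sitemapindex root with one sitemap entry per file, instead of the urlset and url elements used in a regular sitemap.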

Optimizing XML Sitemaps: Essential Guidelines

Let’s explore Google’s technical guidelines and best practices to ensure your XML sitemap is set up correctly:

1. Include only URLs you want indexed by search engines. This means avoiding multiple versions of the same page and focusing on canonical URLs.

2. Ensure all URLs in the sitemap return a 200 status code, avoiding any that result in redirects or 4xx/5xx errors.

3. Keep each sitemap file within the limits of 50MB (uncompressed) and 50,000 URLs. If your site is larger, split it into multiple sitemaps.

4. If using multiple sitemaps, compile them in an index sitemap for easy navigation.

5. Confirm your sitemap is UTF-8 encoded for proper formatting.

6. Include links to localized versions of URLs, as per Google’s documentation.

7. Regularly update your sitemap to reflect new or changed URLs.

8. Add the “lastmod” tag to each URL to indicate when the page was last updated.

9. Link to your sitemap in your site’s robots.txt file for better discoverability.

10. Submit your sitemap to Google for indexing.
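
Several of the guidelines above can be checked offline. The sketch below is one possible approach, not an official validator: it tests a sitemap file for the size and URL limits, UTF-8 encoding, well-formed XML, a “lastmod” tag on every URL, and duplicate entries. It does not check status codes, which would require crawling the listed URLs. The function name check_sitemap and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50MB, uncompressed


def check_sitemap(xml_bytes):
    """Return a list of human-readable problems found in a sitemap file."""
    problems = []
    if len(xml_bytes) > MAX_BYTES:
        problems.append("file is larger than 50MB")
    try:
        xml_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return problems + ["file is not valid UTF-8"]
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as error:
        return problems + [f"XML is not well-formed: {error}"]

    urls = root.findall(f"{NS}url")
    if len(urls) > MAX_URLS:
        problems.append(f"{len(urls)} URLs listed; the limit is {MAX_URLS}")

    locs = []
    for url in urls:
        loc = url.findtext(f"{NS}loc", default="").strip()
        locs.append(loc)
        if url.find(f"{NS}lastmod") is None:
            problems.append(f"{loc}: missing <lastmod>")
    if len(set(locs)) != len(locs):
        problems.append("duplicate URLs listed")
    return problems


# Placeholder sitemap in which the second URL has no <lastmod>.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc><lastmod>2024-01-15</lastmod></url>
  <url><loc>https://www.example.com/blog/</loc></url>
</urlset>"""
print(check_sitemap(sample))
# ['https://www.example.com/blog/: missing <lastmod>']
```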

Fortunately, most CMS platforms, plugins, or sitemap generators automatically adhere to these guidelines when creating an XML sitemap.

Next, we’ll cover the process of creating an XML sitemap.

Creating an XML Sitemap with a CMS

Many content management systems (CMS) automatically generate and update XML sitemaps for you.

In most cases, these CMS platforms don’t allow for manual edits to the sitemap, which generally isn’t an issue for most users.

If you’re using one of these platforms, there’s usually nothing additional you need to do regarding your sitemap.

WordPress XML Sitemap Plugins

Advanced WordPress users who want complete control over their sitemap.xml file can replace the default version. This is particularly useful for excluding specific pages from the automatically generated WordPress XML sitemap, which is hard to do without modifying PHP code.

Plugins are a practical solution in this scenario.

Broadly, there are two varieties of WordPress plugins for XML sitemap creation:

  • General SEO Plugins: These include popular ones like Yoast SEO, All in One SEO, and Rank Math.
  • Dedicated Sitemap Plugins: Specific to sitemap creation, such as XML Sitemaps or XML Sitemap & Google News.

Let’s take a closer look at Yoast SEO, a widely used WordPress SEO plugin.

Once installed, Yoast SEO creates its own sitemap, which replaces the native WordPress sitemap. To find the setting, open “Yoast SEO” in the WordPress dashboard, go to “General,” and select the “Features” tab. The “XML Sitemap” section is listed there.

 

By default, this feature is activated.

To inspect your XML sitemap, click the question mark icon, then follow the “See the XML sitemap” link.



Yoast SEO generates an index sitemap comprising individual sitemaps for posts, pages, categories, authors, and more.

 

Upon clicking an individual sitemap link, it displays a list of included URLs.


For those pages you wish to exclude from the sitemap, Yoast SEO allows you to disable their indexing, effectively removing them from search results and the sitemap. To do so for a specific URL, edit the page or post and locate the Yoast SEO settings at the bottom, expanding the “Advanced” section. There, under the “Allow search engines to show this Post in search results?” heading, choose “No.”

 

For broader exclusions like entire content types, archives, and taxonomies, visit the “Search Appearance” settings in Yoast SEO. For example, to prevent indexing of author archives, go to the “Archives” tab and toggle the relevant setting to “Off.”



This action also eliminates the author sitemap from your sitemap index.

Options for Generating XML Sitemaps Without a CMS

For those not utilizing a Content Management System (CMS), there are primarily two methods to create an XML sitemap:

  • Manual Creation: This approach is feasible if you have a static website with only a few pages. However, manually creating a sitemap becomes cumbersome and inefficient for larger websites or those with frequently updating content.
  • Using a Sitemap Generator: This is a more efficient method, particularly for dynamic or large websites. A sitemap generator automates the creation of your sitemap.

A downloadable desktop application, such as Inspyder Sitemap Creator or Sitemap Writer Pro, is a good choice. These tools can automatically update your sitemap whenever new pages are added or existing ones are modified.
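
For a static site, the core of what a generator does can be sketched in a few lines of Python. This is a simplified illustration, not how any of the tools named above work internally: it assumes every .html file under a local folder is a public page, maps index.html files to their directory URL, and uses each file's modification time as “lastmod”. It does not URL-encode unusual file names. The function name sitemap_from_directory and the demo files are hypothetical.

```python
import tempfile
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from pathlib import Path

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_from_directory(site_root, base_url):
    """Build sitemap XML (bytes) from the .html files under site_root."""
    root_dir = Path(site_root)
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for path in sorted(root_dir.rglob("*.html")):
        rel = path.relative_to(root_dir).as_posix()
        # Serve index.html files under their directory URL.
        if rel == "index.html":
            rel = ""
        elif rel.endswith("/index.html"):
            rel = rel[: -len("index.html")]
        # Use the file's modification time (UTC) as the lastmod date.
        modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = base_url.rstrip("/") + "/" + rel
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = modified.date().isoformat()
    ET.indent(urlset)  # pretty-print; requires Python 3.9+
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Demo on a throwaway folder containing three placeholder pages.
with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp)
    (site / "index.html").write_text("<h1>Home</h1>")
    (site / "about.html").write_text("<h1>About</h1>")
    (site / "blog").mkdir()
    (site / "blog" / "index.html").write_text("<h1>Blog</h1>")
    demo = sitemap_from_directory(site, "https://www.example.com")

print(demo.decode("utf-8"))
```

Rerunning a script like this after each site update keeps the sitemap in sync, which is the main thing manual creation makes difficult.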

Verifying the Effectiveness of Your XML Sitemap

To ensure your sitemap is functioning correctly, Semrush’s Site Audit tool is an effective resource.

Start by setting up a complimentary account on Semrush (no credit card required) and initiating your first website crawl. You’ll find assistance in getting started with this setup guide.

Once your site audit is complete, your audit overview will be displayed as follows:

[Image: Site Audit overview. Photo by Semrush]

To identify any issues with your XML sitemap, navigate to the “Issues” tab.


In the “Issues” tab, you can filter the results by typing “sitemap” to focus specifically on issues related to your sitemap.xml file.


The tool categorizes the problems into three levels based on severity: errors, warnings, and notices. For issues involving multiple URLs in your sitemap, you can click on the corresponding issue link to view all the affected URLs.


For a deeper understanding of each issue, select the “Why and how to fix it” link. A pop-up window will provide a detailed explanation of the issue and suggest corrective actions.

Example of this modal for the “Sitemap.xml not indicated in robots.txt” warning:

[Image: “Why and how to fix it” pop-up for the “Sitemap.xml not indicated in robots.txt” warning. Photo by Semrush]
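
You can run a similar check for this particular warning yourself. The sketch below uses Python's standard urllib.robotparser module (the site_maps method requires Python 3.8 or later) to list the sitemaps declared in a robots.txt file. The robots.txt content shown is a made-up example, and sitemaps_in_robots is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt):
    """Return the sitemap URLs declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


# Made-up robots.txt content for illustration.
robots = """\
User-agent: *
Disallow: /wp-admin/

Sitemap: https://www.example.com/sitemap_index.xml
"""
print(sitemaps_in_robots(robots))
# ['https://www.example.com/sitemap_index.xml']
```

An empty list means the file has no Sitemap line, which is the situation this warning describes.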

For a comprehensive overview of the Site Audit tool and to explore the various technical SEO aspects it can assess on your website, our detailed Site Audit guide offers extensive information.

Additionally, conducting a full technical SEO audit of your website is recommended to ensure optimal online performance.

How to Submit Your XML Sitemap to Google

Submitting your XML sitemap to Google is a recommended practice, even though Google’s crawlers will eventually find it on their own. There are several advantages to doing this:

  • Speed Up Discovery: It expedites the process of Google discovering your sitemap.
  • Identify Issues: It allows you to spot any potential problems Google might encounter with your sitemap.

To complete this process, follow these steps within Google Search Console:

1. Access Google Search Console: If you don’t have a Google Search Console account yet, check out our guide on setting it up.

2. Open the “Sitemaps” Dashboard: In the left menu, go to the “Indexing” section and click on “Sitemaps.”

3. Submit Your Sitemap URL: Locate the “Add a new sitemap” field, paste your sitemap’s URL, and then click “Submit.”

4. Consider an Index Sitemap: If you have multiple sitemaps, you can simplify the process by submitting your index sitemap that references all the others.

Keep in mind that your sitemap won’t be processed immediately; it may take a couple of days. Once Google has processed your sitemap, it will display a green “Success” status next to it in the “Submitted sitemaps” section.


If there are any issues with your sitemap, Google will indicate the status as “Has errors” or “Couldn’t fetch.” To view the detected issues and access Google’s detailed instructions for addressing them, click on the row corresponding to your sitemap.

By following these steps, you can ensure that your XML sitemap is promptly recognized by Google, enhancing your website’s search visibility.

FAQs

What pages should an XML sitemap include?

Your XML sitemap should exclusively contain the pages you intend for search engines to index and display in their search results. Therefore, it should not include pages you wish to exclude from search results, such as non-canonical duplicate pages, administrative pages, content behind a paywall, or thank you pages.

Additionally, it's crucial that your XML sitemap only lists valid pages, meaning those that return a 200 status code. Ensure that it does not include pages with other status codes, including:
  • 3xx: Pages with redirections (e.g., 301 permanent redirects).
  • 4xx: Pages that are inaccessible (e.g., 404 page not found).
  • 5xx: Pages that generate server errors (e.g., 502 bad gateway).
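
The rule is simple enough to express in code. The following sketch assumes you already have crawl results mapping each URL to the status code it returned; the URLs, the status codes, and the function name sitemap_verdict are invented for illustration.

```python
def sitemap_verdict(status_code):
    """Say whether a URL with this HTTP status belongs in the sitemap."""
    if status_code == 200:
        return "keep"
    if 300 <= status_code < 400:
        return "remove: redirect"
    if 400 <= status_code < 500:
        return "remove: client error"
    if 500 <= status_code < 600:
        return "remove: server error"
    return "review manually"


# Invented crawl results: URL -> HTTP status code.
crawl_results = {
    "https://www.example.com/": 200,
    "https://www.example.com/old-page": 301,
    "https://www.example.com/missing": 404,
    "https://www.example.com/broken": 502,
}

keep = [url for url, status in crawl_results.items() if sitemap_verdict(status) == "keep"]
print(keep)  # ['https://www.example.com/']
```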

What is the difference between HTML and XML sitemaps?

The primary distinction between HTML and XML sitemaps lies in their purpose. HTML sitemaps help website visitors navigate, while XML sitemaps help search engines discover web pages.

For SEO purposes, Google strongly recommends the use of XML sitemaps. As for HTML sitemaps, Google Search Advocate John Mueller advises concentrating on clear navigation and sound site architecture rather than relying on HTML sitemaps, stating that "they should never be needed."

What are the "priority" and "changefreq" tags?

The "priority" and "changefreq" tags are optional elements that can be added to each URL within an XML sitemap. However, Google has repeatedly said that it ignores both values.
  • Priority: Assigns each URL a priority value from 0 to 1. For example, a priority of 0.8 signals that you consider the page fairly important relative to other pages on your site.
  • Changefreq: Informs search engines about the frequency of updates to a page (with values like "always," "daily," "weekly," etc.).

Despite the apparent utility of these tags, the URL of a page and its "lastmod" value are the two elements that matter to Google, as confirmed by John Mueller on Twitter.

How do I check whether my XML sitemap is set up correctly?

To check whether your XML sitemap is correctly configured, you can use Semrush's Site Audit tool, which assesses websites for various XML sitemap-related issues. Here's how to proceed:
  • Create a free Semrush account (no credit card required) and run a Site Audit crawl of your website.
  • The tool will check for the presence of a sitemap.xml file and identify any formatting errors. It will also flag pages that should not be included in the sitemap.
  • Additionally, it will verify whether your sitemap adheres to technical requirements (e.g., size limits) and best practices (e.g., proper linking from your robots.txt file).

By employing this tool, you can ensure that your XML sitemap is in proper working order and aligns with recommended practices.
