🚀 Astro XML Sitemaps #
In this post, we see a couple of ways to set up Astro sitemaps. First, we use the Astro sitemap integration. Then we see how you get more fine-grained control by creating your own sitemaps on resource routes. On top, we see that adding these, and even custom styling, is not at all difficult. Before we get into that, though, we take a look at why XML sitemaps are important and also which fields are needed in 2022. If that sounds like what you came here for, then let’s crack on!
🤷🏽 Astro Sitemaps: why add an XML Sitemap? #
XML sitemaps are great for Search Engine Optimization (SEO) as they provide an easy way for search engines to determine what content is on your site and when it was last updated. This last part is essential as it can save the search engine crawling a site which has not been updated since the last crawl.
Crawling is the process by which search engines discover sites and also attempt to ascertain what a particular page is about. When crawling, the search engine bot looks for anchor tags and uses its existing data about the site linked to, as well as the text between the opening and closing anchor tags. These are used to work out what your site and the linked site are all about. Anyway, the crawl is all about finding links and updating the search engine’s index. Each site gets allocated a crawl budget, which varies based on a number of factors and caps the time a search engine will spend crawling your site. If you don’t have an up-to-date sitemap, you risk the search engine spending that budget recrawling existing content without discovering and indexing your new pages.
    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="https://example.com/sitemap.xsl"?>
    <urlset
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
      xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd http://www.google.com/schemas/sitemap-image/1.1 http://www.google.com/schemas/sitemap-image/1.1/sitemap-image.xsd"
      xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://example.com/best-medium-format-camera-for-starting-out/</loc>
        <lastmod>2022-11-12T09:17:52.000Z</lastmod>
      </url>
      <url>
        <loc>https://example.com/folding-camera/</loc>
        <lastmod>2022-11-13T09:17:52.000Z</lastmod>
      </url>
      <url>
        <loc>https://example.com/twin-lens-reflex-camera/</loc>
        <lastmod>2022-11-14T09:17:52.000Z</lastmod>
      </url>
    </urlset>
In the example sitemap content above, we see three entries. Each has a `loc` tag, which contains the URL of a site page, as well as `lastmod`, the last modification date for the page. Hence, by scanning this sitemap the search engine will be able to work out which content you updated since the last index. That saves it re-crawling unchanged content. On a larger site, bots might discover fresh content quicker.
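To make that saving concrete, here is a minimal sketch of the kind of comparison `lastmod` makes possible. It is only an illustration: how any real search engine schedules its crawls is not public, and the `urlsModifiedSince` function and the last crawl date are hypothetical.

```javascript
// Hypothetical illustration: pick out sitemap entries modified after the last crawl.
// Real crawlers weigh many more signals; this only shows why lastmod is useful.
function urlsModifiedSince(entries, lastCrawl) {
  const lastCrawlTime = Date.parse(lastCrawl);
  return entries
    .filter(({ lastmod }) => Date.parse(lastmod) > lastCrawlTime)
    .map(({ loc }) => loc);
}

// The three entries from the example sitemap above.
const entries = [
  {
    loc: 'https://example.com/best-medium-format-camera-for-starting-out/',
    lastmod: '2022-11-12T09:17:52.000Z',
  },
  { loc: 'https://example.com/folding-camera/', lastmod: '2022-11-13T09:17:52.000Z' },
  { loc: 'https://example.com/twin-lens-reflex-camera/', lastmod: '2022-11-14T09:17:52.000Z' },
];

// Assume the bot last visited at midnight (UTC) on 13 November.
const toCrawl = urlsModifiedSince(entries, '2022-11-13T00:00:00.000Z');
console.log(toCrawl);
```

With that assumed crawl date, only the folding camera and twin-lens reflex pages come back, so the first page can be skipped.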
🤔 Which fields do you need in 2022? #
On older sitemaps, you might see `priority` and `changefreq` tags. Although search engines used these in the past, they no longer matter much to Google. For that reason, we will skip those tags in the rest of this post.
🧱 How to add an XML Astro Sitemap with the Integration #
Astro integrations let you quickly add certain features to your site and typically need little or no configuration. Here we see how to set up the Astro sitemap integration. If you already know you need something more sophisticated, skip on to the next section.
How to Create an Astro Sitemap with the XML Sitemap Integration #
- Like other integrations, the `astro add` command helps you get going quickly on the sitemap integration: `pnpm astro add sitemap`. When prompted, type `Y` to accept installing the integration and also placing necessary config in your `astro.config.mjs` file.
- Update the `site` field in your `astro.config.mjs` to match your site’s domain:

      import sitemap from '@astrojs/sitemap';
      import svelte from '@astrojs/svelte';
      import { defineConfig } from 'astro/config';

      // https://astro.build/config
      export default defineConfig({
        site: 'https://your-site-domain.com',
        integrations: [sitemap(), svelte()]
      });

- As a final step, you can update the HTTP headers for the final Astro sitemap (more on this later). The method will depend on whether you are building your site as an SSG (Static Site Generated) site, which is the Astro default, or as an SSR (Server-Side Rendered) one.
To see the sitemaps, you need to build your site normally (`pnpm run build`), then run the preview server (`pnpm run preview`). You can see the sitemaps at `http://localhost:3001/sitemap-index.xml` and `http://localhost:3001/sitemap-0.xml`. The first is an index which just links to the second. This second one includes dynamic pages like posts (depending on your site structure). You can also inspect these two files in the project `dist` folder.
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset
      xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
      xmlns:news="http://www.google.com/schemas/sitemap-news/0.9"
      xmlns:xhtml="http://www.w3.org/1999/xhtml"
      xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
      xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
      <url><loc>https://example.com/</loc></url>
      <url><loc>https://example.com/best-medium-format-camera-for-starting-out/</loc></url>
      <url><loc>https://example.com/contact/</loc></url>
      <url><loc>https://example.com/folding-camera/</loc></url>
      <url><loc>https://example.com/twin-lens-reflex-camera/</loc></url>
    </urlset>
You will notice there are no dates by default. It is pretty easy to add a `lastmod` date parameter, though this will be the same date for all pages. If you want something more sophisticated, you can supply functions in the configuration, which generate the data you want. My opinion is that for this use case, it makes more sense to add a resource route with your own custom XML sitemap. This should keep the sitemaps more maintainable. We see this in the next section.
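For reference, the integration options mentioned above might be wired up roughly as follows. This is a sketch rather than a tested configuration: the `lastmod` and `serialize` option names are how I understand the `@astrojs/sitemap` options, so check them against the version you have installed. The `lastUpdated` lookup and its single entry are made up for illustration.

```javascript
// astro.config.mjs (sketch, assumes @astrojs/sitemap is installed)
import sitemap from '@astrojs/sitemap';
import { defineConfig } from 'astro/config';

// Hypothetical lookup of last-updated dates, keyed by full page URL.
const lastUpdated = new Map([
  ['https://example.com/contact/', '2022-09-28T08:36:57.000Z'],
]);

export default defineConfig({
  site: 'https://example.com',
  integrations: [
    sitemap({
      // Fallback: a single build-time date applied to every page.
      lastmod: new Date(),
      // Per-page override: return the item with a more specific lastmod when we have one.
      serialize(item) {
        const date = lastUpdated.get(item.url);
        return date ? { ...item, lastmod: date } : item;
      },
    }),
  ],
});
```

Keeping a lookup like this in sync with your content is the part that tends to get unwieldy, which is one reason to prefer the resource route approach.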
🧑🏽🍳 Rolling your own Astro Sitemap for Increased Control #
We will add three sitemaps, though for your own site you might decide to go for more or even fewer. All the sitemaps will include `lastmod` tags to help search engines optimize indexing. The first sitemap will be an index with links to the `pages` and `posts` sitemaps. The index one is easiest with the least dynamic data, so let’s start there.
Astro JS Index Sitemap #
Astro does not just create fast HTML pages; you can also use it to create resource routes. We see how to serve JSON data and even PDFs in the Astro Resource route post, so take a look there for further background if you are interested. Let’s start by creating a `src/pages/sitemap_index.xml.js` file. This will generate the content served when a search engine visits `https://example.com/sitemap_index.xml`.
     1 import website from '~config/website';
     2
     3 const { siteUrl } = website;
     4
     5 export async function get({ request }) {
     6   const { url } = request;
     7   const { hostname, port, protocol } = new URL(url);
     8   const baseUrl = import.meta.env.PROD ? siteUrl : `${protocol}//${hostname}:${port}`;
     9
    10   const postModules = await import.meta.glob('../content/posts/**/index.md');
    11   const posts = await Promise.all(Object.keys(postModules).map((path) => postModules[path]()));
    12   const lastPostUpdate = posts.reduce((accumulator, { frontmatter: { lastUpdated } }) => {
    13     const lastPostUpdatedValue = Date.parse(lastUpdated);
    14     return lastPostUpdatedValue > accumulator ? lastPostUpdatedValue : accumulator;
    15   }, 0);
    16
    17   const lastPostUpdateDate = new Date(lastPostUpdate).toISOString();
    18
    19   const xmlString = `
    20   <?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="${baseUrl}/sitemap.xsl"?>
    21   <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    22     <sitemap>
    23       <loc>${baseUrl}/page-sitemap.xml</loc>
    24       <lastmod>${lastPostUpdateDate}</lastmod>
    25     </sitemap>
    26     <sitemap>
    27       <loc>${baseUrl}/post-sitemap.xml</loc>
    28       <lastmod>${lastPostUpdateDate}</lastmod>
    29     </sitemap>
    30   </sitemapindex>`.trim();
    31
    32   return { body: xmlString };
    33 }
This code is based on the Astro Blog Markdown starter, and we will get information on post modification dates from the Markdown front matter for blog posts. The two entries (pages and posts) have the same last modified date because the home page includes a list of recent posts, so we assume the content there gets updated each time a blog post is updated. Because we are adding the logic ourselves, you can modify this to better suit your own use case if this is a poor assumption.
We add our sitemap to a `get` function, since this is the method the search engine will use to access it. We are assuming static generation here. If your site runs in SSR mode, consider adding an HTTP `content-type` header (see Astro resource routes post for details).
In line 8 above, the `baseUrl` will vary depending on whether we are running the site locally in dev mode or in production mode. We use Astro APIs to get the last date a post was updated, pulling data from post metadata. The full code is on the Rodney Lab GitHub repo (see link further down the page). Most important, for your own project, is the code in lines 20 – 30 with the XML markup. You can even add images and videos in here if you want to.
Styling #
We also included an XSL stylesheet in line 20 just to make the sitemap look a bit nicer for you while debugging! Create a `public/sitemap.xsl` file with this content:
    <?xml version="1.0" encoding="UTF-8"?>
    <!--
    Copyright (c) 2008, Alexander Makarov
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions are met:

    * Redistributions of source code must retain the above copyright notice, this
      list of conditions and the following disclaimer.

    * Redistributions in binary form must reproduce the above copyright notice,
      this list of conditions and the following disclaimer in the documentation
      and/or other materials provided with the distribution.

    * Neither the name of sitemap nor the names of its
      contributors may be used to endorse or promote products derived from
      this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
    AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
    IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
    DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
    FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
    DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
    SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
    CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
    OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
    -->
    <xsl:stylesheet version="2.0"
      xmlns:html="http://www.w3.org/TR/REC-html40"
      xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
      xmlns:sitemap="http://www.sitemaps.org/schemas/sitemap/0.9"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="html" version="1.0" encoding="UTF-8" indent="yes"/>
      <xsl:template match="/">
        <html xmlns="http://www.w3.org/1999/xhtml">
          <head>
            <title>XML Sitemap</title>
            <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
            <style type="text/css">
              body {
                font-family: Helvetica, Arial, sans-serif;
                font-size: 13px;
                color: #545353;
              }
              table {
                border: none;
                border-collapse: collapse;
              }
              #sitemap tr:nth-child(odd) td {
                background-color: #eee !important;
              }
              #sitemap tbody tr:hover td {
                background-color: #ccc;
              }
              #sitemap tbody tr:hover td, #sitemap tbody tr:hover td a {
                color: #000;
              }
              #content {
                margin: 0 auto;
                width: 1000px;
              }
              .expl {
                margin: 18px 3px;
                line-height: 1.2em;
              }
              .expl a {
                color: #da3114;
                font-weight: 600;
              }
              .expl a:visited {
                color: #da3114;
              }
              a {
                color: #000;
                text-decoration: none;
              }
              a:visited {
                color: #777;
              }
              a:hover {
                text-decoration: underline;
              }
              td {
                font-size:11px;
              }
              th {
                text-align:left;
                padding-right:30px;
                font-size:11px;
              }
              thead th {
                border-bottom: 1px solid #000;
              }
            </style>
          </head>
          <body>
            <div id="content">
              <h1>XML Sitemap</h1>
              <p class="expl">
                This is an XML Sitemap, meant for consumption by search engines.<br/>
                You can find more information about XML sitemaps on <a href="http://sitemaps.org" target="_blank" rel="noopener noreferrer">sitemaps.org</a>.
              </p>
              <hr/>
              <xsl:if test="count(sitemap:sitemapindex/sitemap:sitemap) &gt; 0">
                <p class="expl">
                  This XML Sitemap Index file contains <xsl:value-of select="count(sitemap:sitemapindex/sitemap:sitemap)"/> sitemaps.
                </p>
                <table id="sitemap" cellpadding="3">
                  <thead>
                    <tr>
                      <th width="75%">Sitemap</th>
                      <th width="25%">Last Modified</th>
                    </tr>
                  </thead>
                  <tbody>
                    <xsl:for-each select="sitemap:sitemapindex/sitemap:sitemap">
                      <xsl:variable name="sitemapURL">
                        <xsl:value-of select="sitemap:loc"/>
                      </xsl:variable>
                      <tr>
                        <td>
                          <a href="{$sitemapURL}"><xsl:value-of select="sitemap:loc"/></a>
                        </td>
                        <td>
                          <xsl:value-of select="concat(substring(sitemap:lastmod,0,11),concat(' ', substring(sitemap:lastmod,12,5)),concat('', substring(sitemap:lastmod,20,6)))"/>
                        </td>
                      </tr>
                    </xsl:for-each>
                  </tbody>
                </table>
              </xsl:if>
              <xsl:if test="count(sitemap:sitemapindex/sitemap:sitemap) &lt; 1">
                <p class="expl">
                  This XML Sitemap contains <xsl:value-of select="count(sitemap:urlset/sitemap:url)"/> URLs.
                </p>
                <table id="sitemap" cellpadding="3">
                  <thead>
                    <tr>
                      <th width="80%">URL</th>
                      <th width="5%">Images</th>
                      <th title="Last Modification Time" width="15%">Last Mod.</th>
                    </tr>
                  </thead>
                  <tbody>
                    <xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>
                    <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
                    <xsl:for-each select="sitemap:urlset/sitemap:url">
                      <tr>
                        <td>
                          <xsl:variable name="itemURL">
                            <xsl:value-of select="sitemap:loc"/>
                          </xsl:variable>
                          <a href="{$itemURL}"><xsl:value-of select="sitemap:loc"/></a>
                        </td>
                        <td>
                          <xsl:value-of select="count(image:image)"/>
                        </td>
                        <td>
                          <xsl:value-of select="concat(substring(sitemap:lastmod,0,11),concat(' ', substring(sitemap:lastmod,12,5)),concat('', substring(sitemap:lastmod,20,6)))"/>
                        </td>
                      </tr>
                    </xsl:for-each>
                  </tbody>
                </table>
              </xsl:if>
            </div>
          </body>
        </html>
      </xsl:template>
    </xsl:stylesheet>
This is based on code in an Alexander Makarov GitHub repo. Try opening the sitemap in your browser. It should look something like this:
[Image: Astro Sitemaps: Styled XML sitemap shows links to the page and post sitemaps with last modified dates.]
Astro Sitemap: Page XML Route #
Next up, here is the page sitemap code. The update dates are more manual here: you have to remember to change them when you update content. An alternative is using the file modified date, though this can be complicated when using continuous integration to deploy your site. Note that the index sitemap links to this one.
     1 import website from '~config/website';
     2
     3 const { siteUrl } = website;
     4
     5 export async function get({ request }) {
     6   const { url } = request;
     7   const { hostname, port, protocol } = new URL(url);
     8   const baseUrl = import.meta.env.PROD ? siteUrl : `${protocol}//${hostname}:${port}`;
     9
    10   const postModules = await import.meta.glob('../content/posts/**/index.md');
    11   const posts = await Promise.all(Object.keys(postModules).map((path) => postModules[path]()));
    12   const lastPostUpdate = posts.reduce((accumulator, { frontmatter: { lastUpdated } }) => {
    13     const lastPostUpdatedValue = Date.parse(lastUpdated);
    14     return lastPostUpdatedValue > accumulator ? lastPostUpdatedValue : accumulator;
    15   }, 0);
    16
    17   const lastPostUpdateDate = new Date(lastPostUpdate).toISOString();
    18
    19   const pages = [
    20     { path: '', lastModified: lastPostUpdateDate },
    21     { path: '/contact/', lastModified: '2022-09-28T08:36:57.000Z' },
    22   ];
    23
    24   const xmlString = `
    25   <?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="${baseUrl}/sitemap.xsl"?>
    26   <urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd http://www.google.com/schemas/sitemap-image/1.1 http://www.google.com/schemas/sitemap-image/1.1/sitemap-image.xsd" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    27   ${pages.map(
    28     ({ path, lastModified }) => `
    29     <url>
    30       <loc>${baseUrl}${path}</loc>
    31       <lastmod>${lastModified}</lastmod>
    32     </url>
    33   `,
    34   ).join('')}
    35   </urlset>`.trim();
    36
    37   return { body: xmlString };
    38 }
We repeat the logic to get the last post update date. In your own project, you will probably want to move that code to a utility function, since it is now needed in two places. In lines 19 – 22 we have a manually compiled list of site pages (excluding dynamic post routes, which we add to their own sitemap in the next section). For each page, we include the path and `lastModified` date. Then we use this array to generate the output XML.
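One way to factor out the repeated date logic is sketched below. The function name `getLastPostUpdate` is hypothetical, as is wherever you choose to keep it (a `src/utilities` folder, say). The sample data only imitates the shape of the imported Markdown modules.

```javascript
// Sketch of a shared helper (hypothetical name). It takes the loaded Markdown
// modules and returns the most recent lastUpdated value as an ISO 8601 string,
// mirroring the reduce logic used in the sitemap routes.
function getLastPostUpdate(posts) {
  const latest = posts.reduce((accumulator, { frontmatter: { lastUpdated } }) => {
    const value = Date.parse(lastUpdated);
    return value > accumulator ? value : accumulator;
  }, 0);
  return new Date(latest).toISOString();
}

// Usage with stand-in data shaped like the imported post modules.
const samplePosts = [
  { frontmatter: { lastUpdated: '2022-11-12T09:17:52.000Z' } },
  { frontmatter: { lastUpdated: '2022-11-14T09:17:52.000Z' } },
  { frontmatter: { lastUpdated: '2022-11-13T09:17:52.000Z' } },
];
console.log(getLastPostUpdate(samplePosts)); // 2022-11-14T09:17:52.000Z
```

In the project itself you would `export` the function and import it into both routes. Note that, like the original reduce, it falls back to the Unix epoch when there are no posts, so you might want to handle that case separately.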
Astro Sitemap: Post XML Route #
Finally, our posts will have dynamic dates, using logic similar to what we saw earlier, to get last modified fields from post Markdown front matter. Here is the `src/pages/post-sitemap.xml.js` code:
    import website from '~config/website';

    const { siteUrl } = website;

    export async function get({ request }) {
      const { url } = request;
      const { hostname, port, protocol } = new URL(url);

      // Use the configured site URL in production and the local server address in dev
      const baseUrl = import.meta.env.PROD ? siteUrl : `${protocol}//${hostname}:${port}`;
      const postModules = await import.meta.glob('../content/posts/**/index.md');
      const posts = await Promise.all(Object.keys(postModules).map((path) => postModules[path]()));
      // Build one <url> entry per post, taking the slug from the post's folder name
      const postsXmlString = posts.map(({ file, frontmatter: { lastUpdated } }) => {
        const slug = file.split('/').at(-2);
        return `
      <url>
        <loc>${baseUrl}/${slug}/</loc>
        <lastmod>${new Date(lastUpdated).toISOString()}</lastmod>
      </url>`;
      });

      const xmlString = `
      <?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="${baseUrl}/sitemap.xsl"?>
      <urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd http://www.google.com/schemas/sitemap-image/1.1 http://www.google.com/schemas/sitemap-image/1.1/sitemap-image.xsd" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">

      ${postsXmlString.join('\n')}
      </urlset>`.trim();

      return { body: xmlString };
    }
That’s it! Check out the new sitemaps in the browser.
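Earlier we mentioned you can add images to the sitemap markup. The page and post sitemaps already declare the `image` namespace, so a rough sketch of an entry builder with image support could look like the one below. The `urlEntry` and `escapeXml` helpers, the `images` field and the image URL are all hypothetical. The escaping follows the sitemap protocol requirement that data values are entity-escaped.

```javascript
// Escape the five characters the sitemap protocol requires to be entity-encoded.
function escapeXml(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build one <url> entry; `images` is a hypothetical array of absolute image URLs.
function urlEntry({ loc, lastmod, images = [] }) {
  const imageXml = images
    .map((src) => `<image:image><image:loc>${escapeXml(src)}</image:loc></image:image>`)
    .join('');
  return `<url><loc>${escapeXml(loc)}</loc><lastmod>${lastmod}</lastmod>${imageXml}</url>`;
}

// Example: a post with one image whose URL contains a query string.
const entry = urlEntry({
  loc: 'https://example.com/folding-camera/',
  lastmod: '2022-11-13T09:17:52.000Z',
  images: ['https://example.com/assets/folding-camera.jpg?w=800&h=600'],
});
console.log(entry);
```

The ampersand in the image URL comes out as `&amp;`, which keeps the XML well-formed. Image entries belong inside `url` elements, so they fit the page and post sitemaps rather than the index.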
⚽️ HTTP Headers #
Search engines do not need to index the sitemaps, so you can serve a robots `noindex` directive from these routes. There are a few ways to set this up. If you are using SSG (Astro default) and hosting on Cloudflare or Netlify, then to let the host know which headers you want to include, add a `public/_headers` file to the project:
    /page-sitemap.xml
      cache-control: public, max-age=0, must-revalidate
      x-robots-tag: noindex, follow
    /post-sitemap.xml
      cache-control: public, max-age=0, must-revalidate
      x-robots-tag: noindex, follow
    /sitemap_index.xml
      cache-control: public, max-age=0, must-revalidate
      x-robots-tag: noindex, follow
However, if you are running in SSR mode, then you can just include these headers in the `Response` object which your `get` function returns.
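As a rough sketch of the SSR case, the helper below wraps the XML string in a standard Fetch API `Response` carrying the same headers as the `_headers` file, plus a `content-type`. The `sitemapResponse` name is hypothetical and the `application/xml` content type is my assumption for a sensible value, so adjust to taste.

```javascript
// Sketch for SSR mode: wrap the XML string in a Response carrying the headers.
// `Response` is the standard Fetch API class (global in Node 18+ and edge runtimes).
function sitemapResponse(xmlString) {
  return new Response(xmlString, {
    headers: {
      'content-type': 'application/xml; charset=utf-8', // assumed value
      'cache-control': 'public, max-age=0, must-revalidate',
      'x-robots-tag': 'noindex, follow',
    },
  });
}

// Stand-in XML just to show the headers coming through.
const response = sitemapResponse('<?xml version="1.0" encoding="UTF-8"?><urlset></urlset>');
console.log(response.headers.get('x-robots-tag'));
```

Within a sitemap route, the `get` function would then end with `return sitemapResponse(xmlString);` in place of returning the `body` object.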
🙌🏽 Astro Sitemaps: Wrapping Up #
In this post, we saw how to add Astro Sitemaps to your project. In particular, we saw:
- how to use the Astro sitemap integration for hassle-free setup,
- how you gain more control over the sitemap content using Astro Sitemaps XML resource routes,
- serving noindex HTTP headers on sitemap routes.
You can see the full code for the project in the Astro Blog Markdown GitHub repo.
Hope you have found this post useful! I am keen to hear what you are doing with Astro and ideas for future projects. Also, let me know about any possible improvements to the content above.
🏁 Astro Sitemaps: Summary #
Is there an Astro Sitemap plugin or integration? #
- Yes, there is an Astro sitemap integration. It makes it quick and easy to add a basic sitemap to your site. To set it up, just run `pnpm astro add sitemap` from the Terminal. The Astro add tool will prompt you on installing the plugin and updating your config. You can answer yes to both. Next, just update the `site` field in your Astro configuration in the `astro.config.mjs` file. Although it is easy to set up a basic sitemap with the integration, if you want more control, for example over page modified dates, you can consider serving your own custom Astro sitemap from a resource route.
Can you create a custom XML sitemap with Astro? #
- Yes! And although Astro has a sitemap integration, unless you need a fairly basic sitemap, you might find your code becomes more maintainable and intuitive if you create your own custom sitemaps and serve them from resource routes. Doing so, you can add logic to derive the last modified date for each page. As an example, we saw for a Markdown blog, we could pull the last updated date from the post front matter for each post. Doing this, we better reflected the update date, hopefully helping search engines to crawl the site more efficiently.
Do Astro static sites support resource routes? #
- Astro makes it easy for you to add resource routes. These let you serve a file or some data instead of the regular HTML that Astro is famous for serving at speed! You could go for an XML sitemap, an RSS feed, or even a PDF of the company newsletter. You would expect this in Server-Side Rendered mode, but note, Astro lets you create resource routes even in the default, Static Site Generator mode. To get going, create a source file to match the path of the resource (just add .js or .ts on the end). Then export a `get`, `post`, etc. function which returns the resource data.
🙏🏽 Astro Sitemaps: Feedback #
Have you found the post useful? Would you prefer to see posts on another topic instead? Get in touch with ideas for new posts. Also, if you like my writing style, get in touch if I can write some posts for your company site on a consultancy basis. Read on to find ways to get in touch, further below. If you want to support posts similar to this one and can spare a few dollars, euros or pounds, please consider supporting me through Buy me a Coffee.
Rodney (@askRodney), November 16, 2022:
Just dropped a new, free post on adding a sitemap using the integration and also the pro mode resource route alternative. We look at styling, which tags to include in 2022 and also serving noindex directives. Hope you find it useful! #askRodney https://t.co/s7QcicRgQb
Finally, feel free to share the post on your social media accounts for all your followers who will find it useful. As well as leaving a comment below, you can get in touch via @askRodney on Twitter, @rodney@toot.community on Mastodon and also the #rodney Element Matrix room. Also, see further ways to get in touch with Rodney Lab. I post regularly on Astro as well as SEO. Also, subscribe to the newsletter to keep up-to-date with our latest projects.