How to get a list of website pages & metadata

When building a new version of a website, we often need to extract the current pages, their addresses (URLs) and their meta titles and descriptions. This information can then be used to ‘301 redirect’ old pages to their new versions, which helps preserve search engine rankings and avoids ‘page not found’ errors. Doing this by hand can be a time-consuming task!

Luckily there are two free tools that make this quick and easy.

Getting a list of page URLs

xml-sitemaps.com

Go to https://www.xml-sitemaps.com/, paste your website address into the bar and click ‘Start’. Once it’s scanned all your pages, scroll down a bit and download the zip file containing all your sitemaps. Unzip it into a folder and you’ll see your sitemap in a variety of formats.
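If you’d rather script this step, here’s a minimal Python sketch that parses one of the downloaded sitemap files into a plain list of URLs. It assumes the standard sitemap.xml filename and the sitemaps.org XML format that xml-sitemaps.com produces; adjust the filename to match what you unzipped.

```python
# A sketch only: assumes the standard sitemap.xml name and the
# sitemaps.org schema used by xml-sitemaps.com's output.
import xml.etree.ElementTree as ET

# The namespace every standard sitemap declares
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

tree = ET.parse("sitemap.xml")
# Each page lives in a <url><loc>...</loc></url> entry
urls = [loc.text.strip() for loc in tree.getroot().findall("sm:url/sm:loc", NS)]

with open("urllist.txt", "w") as f:
    f.write("\n".join(urls) + "\n")

print(f"Found {len(urls)} URLs")
```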

[Screenshot: list of downloaded sitemaps]

Getting metadata

Open the urllist.txt file (it should open automatically in Notepad on Windows or TextEdit on Mac) and copy all the URLs. Go to http://tools.buzzstream.com/meta-tag-extractor and paste the list of URLs into the box on the left.

[Screenshot: extracting metadata from a list of URLs]

Let it scan all your pages, then click ‘download CSV’ under the list. You now have a list of all the website’s pages with their URLs, meta titles and meta descriptions. This can be opened in Numbers, Excel or Google Sheets so you can work out the redirects, and you can also copy the metadata out to paste into the SEO plugin of the new site.
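As an alternative to the BuzzStream tool, here’s a minimal Python sketch of the same extraction using only the standard library. It assumes the urllist.txt file from the previous step; the pages.csv output filename and the column headings are illustrative, not necessarily what BuzzStream’s CSV uses.

```python
# A sketch only: reads urllist.txt from the previous step, fetches each
# page, and writes the URL, <title> and meta description to pages.csv.
import csv
import urllib.request
from html.parser import HTMLParser

class MetaParser(HTMLParser):
    """Collects the <title> text and the meta description of one page."""
    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

with open("urllist.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

with open("pages.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["URL", "Meta Title", "Meta Description"])
    for url in urls:
        parser = MetaParser()
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                parser.feed(resp.read().decode("utf-8", errors="replace"))
        except OSError as err:
            # Skip pages that time out or return an error
            print(f"Skipping {url}: {err}")
            continue
        writer.writerow([url, parser.title.strip(), parser.description.strip()])
```

Either route gets you the same spreadsheet of URLs, titles and descriptions to work from.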

I love the SEOPress plugin as it’s a lot less aggressive with adverts and upsells than the most common WordPress SEO plugin.

Summary

By using these free tools and following these easy steps, you’ve learned how to extract a list of website pages, their URLs, and their meta titles and descriptions.
