Web Archiving and Web Continuity
Web archiving is the process of capturing content that has been made available via the Web, preserving it permanently as archived content, and making it accessible to users. We use a web crawler to capture and preserve websites. Our crawler visits a selected website and explores it by following its hyperlinks, much as a human user would with a web browser, copying and preserving the web content as it goes. The Web Archive makes this preserved content available as a series of “snapshots”, each showing the date on which the website was crawled, so the public can view the content as it appeared on that date. We revisit sites periodically to make new captures, which allows the public to see how a site and its content have changed over time.
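As an illustration only, the crawl described above can be sketched in a few lines of Python. This is a minimal sketch and not the crawler used for the Web Archive: the names `crawl`, `LinkExtractor` and `FAKE_SITE` are hypothetical, the "website" is an in-memory dictionary rather than a live site, and the crawl is assumed to stay on the host of the starting page.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_url, fetch, capture_date):
    """Breadth-first crawl of one site; returns {url: (capture_date, html)}.

    `fetch` is any callable that returns a page's HTML, or None if the
    page cannot be retrieved (an assumption made for this sketch).
    """
    site_host = urlparse(seed_url).netloc
    snapshot = {}
    queue = deque([seed_url])
    seen = {seed_url}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # nothing to preserve for this address
        snapshot[url] = (capture_date, html)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            target, _fragment = urldefrag(urljoin(url, href))
            # Follow only links that stay on the selected website.
            if urlparse(target).netloc == site_host and target not in seen:
                seen.add(target)
                queue.append(target)
    return snapshot


# A stand-in for a live website: address -> HTML (hypothetical data).
FAKE_SITE = {
    "https://example.org/": '<a href="/about">About</a> '
                            '<a href="https://other.example/">Other</a>',
    "https://example.org/about": '<a href="/">Home</a> <a href="/missing">Gone</a>',
}

snapshot = crawl("https://example.org/", FAKE_SITE.get, "2024-01-15")
```

Running `crawl` again with a later `capture_date` would produce a second snapshot of the same site, which is the idea behind revisiting sites periodically.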
The Web Continuity Service builds on this web archiving model. It gives website owners the option to enable a redirection service on their website(s) that seamlessly takes users from missing pages on the live site(s) into the NRS Web Archive, where a search is made for the latest archived version of the missing page. If an archived version is found, it is shown directly to the user with a banner indicating that the page is an archived version. As a result, users see far fewer ‘404 page not found’ error messages when visiting these live sites.
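The fallback logic described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the service's actual implementation: the function `resolve`, the result labels, the banner wording and the sample data are all hypothetical, and the real service works by redirecting the user's browser to the archive rather than by a single function call.

```python
def resolve(url, live_pages, archive):
    """Return (source, body) for a requested address.

    live_pages: {url: html} for pages that still exist on the live site.
    archive:    {url: [(capture_date, html), ...]} with ISO 8601 dates.
    Both structures are assumptions made for this sketch.
    """
    if url in live_pages:
        return "live", live_pages[url]

    captures = archive.get(url, [])
    if not captures:
        # No archived copy either: the user still gets a 'not found' result.
        return "not_found", "404 page not found"

    # ISO 8601 dates sort chronologically as plain strings,
    # so max() picks the most recent capture.
    capture_date, body = max(captures, key=lambda capture: capture[0])
    banner = f"[Archived version captured on {capture_date}]"
    return "archived", banner + "\n" + body


# Hypothetical sample data.
live_pages = {"https://example.org/": "<h1>Home</h1>"}
archive = {
    "https://example.org/old-report": [
        ("2021-03-01", "<h1>Report (draft)</h1>"),
        ("2023-06-10", "<h1>Report (final)</h1>"),
    ]
}

result = resolve("https://example.org/old-report", live_pages, archive)
```

Here the request for the missing page returns the most recent capture with the banner prepended, while a request for an address that was never captured still ends in a "not found" result.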
NRS has a contract in place with the Internet Archive, who deliver the technical aspects of the Web Continuity Service on our behalf. For further information, see our Web Continuity Service Model (298 KB PDF).