How to prevent or repair broken links
| This page in a nutshell:
Steps may be taken to reduce or repair dead external links. Do not remove citations simply because they have URLs which no longer work.
|
Like most large
websites
, Wikipedia suffers from the phenomenon known as
link rot
,
where external links become
dead
, as the linked web pages or complete websites disappear, change their content, or move without HTTP redirection. This presents a significant threat to Wikipedia's
reliability
policy and its
source citation
guideline.
In general, do not delete cited information
solely
because the URL to the source does not work any longer
. Tools, procedures, and processes are available as outlined in this document.
Preventing link rot
Automatic archiving
Links added by editors to the English Wikipedia mainspace are automatically saved to the
Wayback Machine
within about 24 hours (though in practice not every link is getting saved for various reasons
[
specify
]
). This is done with a program called
"NoMore404"
which Internet Archive runs and maintains; other language wiki sites are included. It monitors
EventStreams API
, extracts new external URLs and adds a snapshot to the Wayback. This system became active sometime after 2015, though previous efforts were also made. Also, sometime after 2012,
archive.today
(aka archive.is) attempted to archive all external links then existing on Wikipedia at that time. This was incomplete but a significant number of links were added to archive.today during this period making it a major archival source filling in gaps of coverage. Archive.today is still making some automated archives as of 2020, though the extent of coverage and frequency is unknown.
As of 2015, there is a Wikipedia bot and tool called
WP:IABOT
that automates fixing link rot. It runs continuously, checking all articles on Wikipedia if a link is dead, adding archives to
Wayback Machine
(if not yet there), and replacing dead links in the wikitext with an archived version. This bot runs automatically but it can also be directed by end users through its web interface. It is available when viewing any page's history, located near the top of the page on the line of "External Tools", with the "Fix dead links" option.
As of 2015, the periodic bot
WP:WAYBACKMEDIC
checks for link rot in the archive links themselves. Archive databases are dynamic: archives move or go missing, new ones are added, etc. This bot maintains existing archive links on English Wikipedia. It also archives resources on request at
WP:URLREQ
. It is a flexible tool that can carry out many custom jobs such as URL migration/move,
usurped domains
, soft-404 discovery and repair.
Manual archiving
Suggestions for ways to manually improve archiving:
- Avoid
bare URLs
. Use citation templates such as
{{
cite web
}}
for citations, and
{{
webarchive
}}
for external links sections.
- Use a
web archiving
service such as
Internet Archive
or
Archive.today
. A complete list is available at
WP:List of web archives on Wikipedia
. Within citation templates, put the archive URL in
|archive-url=
and add an
|archive-date=
. If the link is still valid, include
|url-status=live
, otherwise set
|url-status=dead
.
- To add more than one archive URL, as extra insurance against provider outage,
{{
webarchive
}}
accepts up to 10 archive provider URLs. The
|format=addlarchives
option produces output appropriate for trailing a CS1|2 template. e.g.
{{cite web|archive-url=..}}{{webarchive|format=addlarchive|url1=..|url2=..|url3..}}
will show 4 archive URLs (one from the cite web and three from the webarchive).
- If the link is still live but not yet archived, visit the web site of the archive service of your choice and request that the page be archived.
- Run
WP:IABOT
on pages via its user interface.
Alternative methods
Most
citation templates
have a
|quote=
parameter that can be used to store text quotes of the source material. This can be used to store a limited amount of text from the source within the citation template. This is especially useful for sources that cannot be archived with web archiving services. It can also provide insurance against failure of the chosen web archiving service. Storing the entire text of the source is not appropriate under
fair use policies
, so choose only the most important portions of the text that most support the assertions in the Wikipedia article. Where applicable,
public domain
materials can be copied to
Wikisource
.
Talk page information
To indicate that all used external links in an article have been successfully archived by the date of edit, you can add the template
{{
Archived reflist
}}
at the top of the article's
talk page
, which renders as:
Repairing a usurped link
When a domain on the Internet expires, anyone is allowed to pay for and control that domain. Some organizations actively seek these domains and "usurp" them to create spam and scam sites. To repair an external link to one of these sites from Wikipedia, remove the link and replace it with an archived version of the original as described at
Wikipedia:Link rot/Usurpations
.
There is an automated system for usurping entire domains. See
WP:URLREQ
to register all links in a domain for usurpation treatment.
Repairing a dead link
"WP:DEADLINK" redirects here. For the guideline on what to do when a link is dead (including potential removal of the cited material), see
WP:DEADREF
.
There are several ways to try to repair a dead link, detailed below. In general, avoid removing citations (or cited material) simply because a URL no longer works, especially if the citation is formatted with other information (like a title, author, date and publication name) that could alternatively be used to find the source.
Searching
If the dead link includes enough information (article title, names, etc.) it is often possible to use it to find the web page at a different location, either on the same site or elsewhere.
Often web pages simply move within the same site. A site index or site-specific search feature is a useful place to locate the moved page, searching for the title or other information. If these tools are not available, many Internet search engines allow a search on a specified site. For example, with Google add
site:en.wikipedia.org
to the search string to search English Wikipedia only. Occasionally changing
http://
to
http
s
://
works.
Failing this, searching the Web for the page title can find alternative sites. Searching the Web for the data to support can find a different source.
If you find a suitable new URL, then you can edit the parameters within the citation. If the citation uses one of the common citation templates (e.g. {{
cite web
}}, {{
cite news
}}, {{
Citation
}}), you can:
- Change the
|url=
to point to the new URL;
- Change or add
|access-date=
to refer to the current date.
Internet archives
Check for archived versions at one of the many web archive services. The "Big 3" archive services are
web.archive.org
,
webcitation.org
and
archive.today
. These account for over 90% of all archives on Wikipedia, with
web.archive.org
being over 80% of all archive links. Other archive services are listed at
WP:WEBARCHIVES
.
Add-ons (extensions)
are available for most browsers to search for archived copies, with names such as
Resurrect pages
.
The
Mementos interface
allows one to search multiple archiving services with a single search. The Memento database is cached, meaning results are returned quickly, but the cache also becomes out of date, and should not be relied on as the final word ? it will often incorrectly report that no archives are available. You may still need to check individual archive sites, but Mementos can be a quick first check.
Bookmarklets to check common archive sites for archives of the current page
(all open in a new tab or window)
Archive site
|
Bookmarklet
|
Archive.org
|
javascript
:
void
(
window
.
open
(
'https://web.archive.org/web/*/'
+
location
.
href
))
|
UKGWA
|
javascript
:
void
(
window
.
open
(
'https://webarchive.nationalarchives.gov.uk/ukgwa/*/'
+
location
.
href
))
|
If multiple archive dates are available, use the one that is most likely to be the contents of the page seen by the editor who entered the reference on the
|access-date=
. If that parameter is not specified, a
search of the article's revision history
can be performed to determine when the link was added to the article.
View the archive to verify that it contains valid page information. Usually dates closer to the time the link was placed in the Wikipedia page, or earlier, are more likely to show valid information.
If you find a suitable archive URL, then you can add it to the citation. If the citation uses one of the common templates (e.g. {{
cite web
}}, {{
cite news
}}, {{
Citation
}}), then you can edit as follows:
- Leave the
|url=
unchanged, pointing to the source URL.
- Add
|archive-url=
, pointing to the archive URL.
- Add
|archive-date=
, specifying the date when the archived copy was saved. YYYY-MM-DD format is usually easiest but any format can be used.
- Add or change
|url-status=
. Use
|url-status=dead
if the old URL does not work. Use
|url-status=unfit
or
|url-status=usurped
if the old URL has been usurped for the purposes of spam, advertising, or is otherwise unsuitable (see
WP:USURPURL
). Use
|url-status=live
if
|url=
still works and still gives the correct information, but you want to preemptively add an
|archive-url=
.
- Leave the
|access-date=
unchanged, referring to the date when a previous editor last accessed the
|url=
. Some editors believe
|access-date=
should be removed once a working
|archive-url=
is established since the
|url=
is no longer available, as maintaining an
|access-date=
is redundant clutter.
Mitigating a dead link
At times, all attempts to repair the link will be unsuccessful. In that event, consider finding an alternative source so that the loss of the original does not harm the verifiability of the article. Alternative sources about broad topics are usually easily located. A simple search engine query might locate an appropriate alternative, but be extremely careful to avoid citing
mirrors and forks of Wikipedia
itself, which would violate
Wikipedia:Verifiability
.
Sometimes, finding an appropriate source is not possible, or would require more extensive research techniques, such as a visit to a library or the use of a subscription-based database. If that is the case, consider consulting with Wikipedia editors at
Wikipedia:WikiProject Resource Exchange
, the
Wikipedia:Village pump
, or
Wikipedia:Help desk
. Also, consider contacting experts or other interested editors at a relevant
WikiProject
.
Sometimes a link is dead because the website moved the URL (e.g.
http://example.com
moved to
http://example.co.uk
). If you discover a URL change like this, please submit a request at
WP:URLREQ
for a url move. A bot will make the change.
In general, the fact that a URL is broken does
not
mean that a source has ceased to exist entirely, and
a broken URL in a citation does not mean it must be removed
. See the guidance at
WP:DEADREF
for when it is appropriate to remove citations with dead links. Crucially, books, magazines, journals and other print sources
exist offline
, and continue to do so even if websites go down or change locations; the lack of a functioning URL for a book does nothing to decrease its value as a source for Wikipedia content. Permanently inaccessible convenience links for print sources can be removed, but the reference should be retained. Before removing a citation with a dead URL, consider whether it would be possible to track down the source without using the URL at all; if so, it should probably be kept.
Keeping dead links
A dead, unarchived source URL may still be useful. Such a link indicates that information was (probably) verifiable in the past, and the link might provide another user with greater resources or expertise with enough information to find the reference. It could also return from the dead. With a dead link, it is possible to determine if it has been cited elsewhere, or to contact the person originally responsible for the source. For example, one could contact the Yale Computer Science department if
http://www.cs.yale.edu/~EliYale/Defense-in-Depth-PhD-thesis.pdf
[
dead link
]
were dead.
Place {{
dead link
|date=May 2024}} after the dead citation, immediately before the
</ref>
tag if applicable, leaving the original link intact. Marking dead links signals to editors and to link rot bots that this link needs to be replaced with an archive link. Placing {{
dead link
}} also auto-categorizes the article into
Articles with dead external links
project category, and into specific monthly date range category based on
|date=
parameter.
Do not delete
a citation just because it has been tagged with {{
dead link
}} for a long time.
Link rot on non-Wikimedia sites
Non-Wikimedia sites are also susceptible to link rot. Following a
page move
or
page deletion
, links to Wikipedia pages from other websites may break. In most page moves, a
redirect
will remain at the old page?this won't cause a problem. But if a page is completely deleted or
usurped
(i.e. replaced with other content) then link rot will have been caused on any external websites that link to it.
Replacement of page content with a
disambiguation page
may still cause link rot, but is less harmful because a disambiguation page is essentially a type of
soft redirect
that will lead the reader to the required content. If a page is usurped with content for another subject that shares its name, a
hatnote
may be placed at the top that directs readers to the original content on its new page?this again is a type of soft redirect, but less obvious. In these cases, readers arriving from an external rotten link should be able to find what they're looking for, but the situation is best avoided as they would have to get there via an additional page, potentially giving a poor impression of both Wikipedia and the linking website.
Because the Wikipedia software does not store
Referer
information
, it will be impossible to tell how many external web pages will be affected by a move or deletion, but the risk of link rot will probably be greatest on older and higher profile pages. In truth, there is not a lot that can be done; maintenance of non-Wikimedia websites is not within the scope of being a Wikimedian, nor in most cases within our capability (although if they
can
be fixed, it would be helpful to do so). However, it may be good practice to think about the potential impact on other sites when deleting or moving Wikipedia pages, especially if no redirect or hatnote will remain. If a move or deletion is expected to cause significant damage, then this might be a factor to consider in
WP:RM
,
WP:AFD
and
WP:RFD
discussions, although other factors may carry more weight.
See also
Essays
Tools and how-to guides
Bots
External links
Notes
|
---|
|
|
|
|
---|
- Adminitis
- Akin's Laws of Article Writing
- Alternatives to edit warring
- ANI flu
- Anti-Wikipedian
- Anti-Wikipedianism
- Articlecountitis
- Asshole John rule
- Assume bad faith
- Assume faith
- Assume good wraith
- Assume stupidity
- Assume that everyone's assuming good faith, assuming that you are assuming good faith
- Avoid using preview button
- Avoid using wikilinks
- Bad Jokes and Other Deleted Nonsense
- Barnstaritis
- Before they were notable
- BOLD, revert, revert, revert
- Boston Tea Party
- Butterfly effect
- CaPiTaLiZaTiOn MuCh?
- Complete bollocks
- Counting forks
- Counting juntas
- Crap
- Don't stuff beans up your nose
- Don't-give-a-fuckism
- Don't abbreviate "Wikipedia" as "Wiki"!
- Don't delete the main page
- Editcountitis
- Edits Per Day
- Editsummarisis
- Editing Under the Influence
- Embrace Stop Signs
- Emerson
- Fart
- Five Fs of Wikipedia
- Seven Ages of Editor, by Will E. Spear-Shake
- Go ahead, vandalize
- How many Wikipedians does it take to change a lightbulb?
- How to get away with UPE
- How to put up a straight pole by pushing it at an angle
- How to vandalize correctly
- How to win a citation war
- Ignore all essays
- Ignore every single rule
- Is that even an essay?
- Mess with the templates
- My local pond
- Newcomers are delicious, so go ahead and bite them
- Legal vandalism
- List of jokes about Wikipedia
- LTTAUTMAOK
- No climbing the Reichstag dressed as Spider-Man
- No one cares about your garage band
- No one really cares
- No, really
- No sorcery threats
- Notability is not eternal
- Oops Defense
- Play the game
- Please be a giant dick, so we can ban you
- Please bite the newbies
- Please do not murder the newcomers
- Pledge of Tranquility
- R-e-s-p-e-c-t
- Requests for medication
- Requirements for adminship
- Rouge admin
- Rouge editor
- Sarcasm is really helpful
- Sausages for tasting
- The Night Before Wikimas
- The first rule of Wikipedia
- The Five Pillars of Untruth
- Things that should not be surprising
- The WikiBible
- Watchlistitis
- Wikipedia is an MMORPG
- WTF? OMG! TMD TLA. ARG!
- What Wikipedia is not/Outtakes
- Why not create an account?
- Yes legal threats
- You don't have to be mad to work here, but
- You should not write meaningless lists
|
|
|
|
---|
About essays
| |
---|
Policies and guidelines
| |
---|
|
|