A part whose failure will disrupt the entire system
A
single point of failure
(
SPOF
) is a part of a system that, if it
fails
, will
stop the entire system from working
.
[1]
SPOFs are undesirable in any system with a goal of
high availability
or
reliability
, be it a business practice, software application, or other industrial system.
Overview
[
edit
]
Systems can be made robust by adding
redundancy
in all potential SPOFs. Redundancy can be achieved at various levels.
The assessment of a potential SPOF involves identifying the critical components of a complex system that would provoke a total systems failure in case of
malfunction
. Highly
reliable systems
should not rely on any such individual component.
For instance, the owner of a small
tree care
company may only own one
woodchipper
. If the chipper breaks, they may be unable to complete their current job and may have to cancel future jobs until they can obtain a replacement. The owner of the tree care company may have
spare parts
ready for the repair of the wood chipper, in case it fails. At a higher level, they may have a second wood chipper that they can bring to the job site. Finally, at the highest level, they may have enough equipment available to completely replace everything at the work site in the case of multiple failures.
-
Possible SPOFs in a simple setup
-
Using redundancy to avoid some SPOFs
-
Completely redundant system without SPOFs (note: assumes generator and grid sources are each rated at N, each UPS is rated at N, and "A/C" and "Electrical" are in and of themselves completely fault tolerant systems)
Computing
[
edit
]
| This section needs to be
updated
. The reason given is: Needs updating for public cloud computing.
Please help update this article to reflect recent events or newly available information.
(
May 2022
)
|
A
fault-tolerant computer system
can be achieved at the internal component level, at the system level (multiple machines), or site level (replication).
One would normally deploy a
load balancer
to ensure high availability for a
server cluster
at the system level. In a high-availability server cluster, each individual server may attain internal component redundancy by having multiple power supplies, hard drives, and other components. System-level redundancy could be obtained by having spare servers waiting to take on the work of another server if it fails.
Since a data center is often a support center for other operations such as business logic, it represents a potential SPOF in itself. Thus, at the site level, the entire cluster may be replicated at another location, where it can be accessed in case the primary location becomes unavailable. This is typically addressed as part of an
IT disaster recovery
program.
Paul Baran
and
Donald Davies
developed
packet switching
, a key part of "survivable communications networks". Such networks – including
ARPANET
and the
Internet
– are designed to have no single point of failure. Multiple paths between any two points on the network allow those points to continue communicating with each other, the packets
"routing around" damage
, even after any single failure of any one particular path or any one intermediate node.
Software engineering
[
edit
]
In
software engineering
, a
bottleneck
occurs when the capacity of an
application
or a computer system is limited by a single component. The bottleneck has lowest throughput of all parts of the transaction path.
Performance engineering
[
edit
]
Tracking down bottlenecks (sometimes known as
hot spots
? sections of the code that execute most frequently ? i.e., have the highest execution count) is called
performance analysis
. Reduction is usually achieved with the help of specialized tools, known as performance analyzers or profilers. The objective is to make those particular sections of code perform as fast as possible to improve overall
algorithmic efficiency
.
Computer security
[
edit
]
A vulnerability or security exploit in just one component can compromise an entire system.
Other fields
[
edit
]
The concept of a single point of failure has also been applied to fields outside of engineering, computers, and networking, such as corporate
supply chain
management
[2]
and transportation management.
[3]
Design structures that create single points of failure include
bottlenecks
and
series circuits
(in contrast to
parallel circuits
).
In transportation, some noted recent examples of the concept's recent application have included the
Nipigon River Bridge
in Canada, where a partial bridge failure in January 2016 entirely severed road traffic between
Eastern Canada
and
Western Canada
for several days because it is located along a portion of the
Trans-Canada Highway
where there is no alternate
detour
route for vehicles to take;
[4]
and the
Norwalk River Railroad Bridge
in
Norwalk
,
Connecticut
, an aging
swing bridge
that sometimes gets stuck when opening or closing, disrupting rail traffic on the
Northeast Corridor
line.
[3]
The concept of a single point of failure has also been applied to the fields of intelligence.
Edward Snowden
talked of the dangers of being what he described as "the single point of failure" ? the sole repository of information.
[5]
Life-support systems
[
edit
]
| This section
needs expansion
. You can help by
adding to it
.
(
October 2019
)
|
A component of a
life-support system
that would constitute a single point of failure would be required to be extremely reliable.
See also
[
edit
]
Concepts
[
edit
]
Applications
[
edit
]
- Kill switch
? Safety mechanism to quickly shut down a system
- Jesus nut
? Slang term for the main rotor-retaining nut of some helicopters
- Reliability engineering
? Sub-discipline of systems engineering that emphasizes dependability
- Safety engineering
? Engineering discipline which assures that engineered systems provide acceptable levels of safety
- Dead man's switch
? Equipment that activates or deactivates upon the incapacitation of operator
In literature
[
edit
]
- Achilles' heel
? Critical weakness which can lead to downfall despite overall strength
- Hamartia
? Protagonist's error in Greek dramatic theory
References
[
edit
]
- ^
1:
Designing Large-scale LANs ? Page 31, K. Dooley, O'Reilly, 2002
- ^
Gary S. Lynch (Oct 7, 2009).
Single Point of Failure: The 10 Essential Laws of Supply Chain Risk Management
. Wiley.
ISBN
978-0-470-42496-4
.
- ^
a
b
"Crucial, Century-Old, And Sometimes Stuck: Connecticut Bridge Is Key To Northeast Corridor"
.
Connecticut Public Radio
, August 8, 2017.
- ^
"The Nipigon River Bridge and other Trans-Canada bottlenecks"
.
Global News
, January 11, 2016.
- ^
"Edward Snowden: the true story behind his NSA leaks"
.
Telegraph.co.uk
.
Archived
from the original on 2022-01-12
. Retrieved
2016-12-13
.
|
---|
|
|
|
|
---|
Activities
| |
---|
Competitions
| |
---|
Equipment
| |
---|
Freedivers
| |
---|
Hazards
| |
---|
Historical
| |
---|
Organisations
| |
---|
|
|
|
|
|
|
---|
Diving
disorders
| Pressure
related
| Oxygen
| |
---|
Inert gases
| |
---|
Carbon dioxide
| |
---|
Breathing gas
contaminants
| |
---|
|
---|
Immersion
related
| |
---|
|
---|
Treatment
| |
---|
Personnel
| |
---|
Screening
| |
---|
Research
| |
---|
|
|
|
|
---|
Archeological
sites
| |
---|
Underwater art
and artists
| |
---|
Engineers
and inventors
| |
---|
Historical
equipment
|
|
---|
Military and
covert operations
| |
---|
Scientific projects
| |
---|
Awards and events
| |
---|
Incidents
| |
---|
|
|
Publications
|
---|
Manuals
| |
---|
Standards and
Codes of Practice
| |
---|
General non-fiction
| |
---|
Research
| |
---|
Dive guides
| |
---|
|
|
Training and registration
|
---|
|
|
|
---|
Surface snorkeling
| |
---|
Snorkeling/breath-hold
| |
---|
Breath-hold
| |
---|
Open Circuit Scuba
| |
---|
Rebreather
| |
---|
Sports governing
organisations
and federations
| |
---|
Competitions
| |
---|
|
|
|
---|
Pioneers
of diving
| |
---|
Underwater
scientists
archaeologists and
environmentalists
| |
---|
Scuba record
holders
| |
---|
Underwater
filmmakers
and presenters
| |
---|
Underwater
photographers
| |
---|
Underwater
explorers
| |
---|
Aquanauts
| |
---|
Writers and journalists
| |
---|
Rescuers
| |
---|
Frogmen
| |
---|
Commercial salvors
| |
---|
|
|
|
|
|