[Tfug] Server purchase (!)

Bexley Hall bexley401 at yahoo.com
Thu Oct 12 09:17:44 MST 2006


--- Brian Masur <bcmasur at hotmail.com> wrote:

> All electronics (especially ones with moving parts,
> like hard drives) are subject to failure at any 
> time.

Actually, most "electronics" (as distinct from
electromechanical devices -- like hard drives)
exhibit a "bathtub curve" of failure probability:
a burst of infant-mortality failures soon after
manufacture (one wall of the bathtub), a long, flat
stretch of rare, random failures (the bottom), and
then wearout failures "at maturity" (the other wall).
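
(For the curious: that shape is commonly modeled as
the sum of a decreasing Weibull hazard (infant
mortality), a constant hazard (random failures) and
an increasing Weibull hazard (wearout).  A toy
sketch in Python; every shape/scale number below is
invented purely for illustration:

    # Bathtub-shaped hazard rate: three additive parts.
    # All parameters here are made up, for illustration only.
    def weibull_hazard(t, shape, scale):
        # h(t) = (k/lam) * (t/lam)**(k - 1)
        return (shape / scale) * (t / scale) ** (shape - 1)

    def bathtub_hazard(t):
        infant  = weibull_hazard(t, 0.5, 500.0)    # falls early
        random  = 1e-5                             # flat bottom
        wearout = weibull_hazard(t, 5.0, 50000.0)  # climbs late
        return infant + random + wearout

    for hours in (10, 100, 1000, 10000, 40000, 60000):
        print(f"{hours:>6} hrs: {bathtub_hazard(hours):.2e}")

Run it and the rate starts high, bottoms out
mid-life, then climbs again toward wearout.)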

This assumes they are used in their proper
environment (Vcc, Tambient, etc.) -- something that
isn't always guaranteed in a "PC" case.  You'll
note that a "genuine" server case tends to have
a sh*tload more fan noise associated with it, since
heat is the number one cause of accelerated
failures (I think the rule of thumb is that every
+10C halves MTBF).
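
To put numbers on that rule of thumb: if every +10C
halves MTBF, then MTBF scales as 2^(-deltaT/10).  A
quick back-of-the-envelope in Python (the 500,000
hour baseline is an invented example figure, not
anyone's spec sheet):

    # Rule of thumb from above: each +10C halves MTBF.
    def derated_mtbf(mtbf_baseline_hours, delta_t_c):
        return mtbf_baseline_hours * 2 ** (-delta_t_c / 10.0)

    for delta in (0, 10, 20, 30):
        print(f"+{delta}C: {derated_mtbf(500_000, delta):,.0f} hours")

So a part running 30C over its rated ambient has
(nominally) an eighth of its rated life.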

Some devices (notably, newer CPUs with sub-micron
geometries) will have a more predictable "wearout"
point due to electromigration, etc.  (I've not
done the math, but it intuitively *feels* like
the number of clock cycles you'll get from a CPU
over its lifetime is a constant, K, regardless of
clock speed... this would be an amusing bit of
research to pursue.)
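
If that hunch were right, a part with a fixed budget
of K total cycles would last K/f seconds of
wall-clock time at clock frequency f, so doubling
the clock would halve the calendar lifetime.  A toy
calculation under that (unverified) hypothesis; the
value of K is pure invention:

    # Hypothetical: CPU wears out after K total clock cycles.
    K = 1e18  # made-up cycle budget, for scaling only

    for ghz in (1.0, 2.0, 3.0):
        seconds = K / (ghz * 1e9)   # lifetime = K / f
        years = seconds / (3600 * 24 * 365)
        print(f"{ghz} GHz -> {years:.1f} years")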

Other devices (e.g., CD-R/W drives) have similar
wearout points inherent in their designs (e.g.,
total power-on hours -- PoHrs -- limits usable
life).  Remedies here tend to be either "difficult"
(i.e., independently control power to said device --
more easily done for external devices) or
"practical" (i.e., plan on replacing it!).

> This is the reason for redundancy, 
> especially when you bring "backup" into the picture.

Disks and power supplies are most often made
redundant.  Power supplies fail due to their greater
stresses: they live in a warmer environment (since
many are not well ventilated) and handle "extremes"
(i.e., switching currents deliver lots of
thermal/mechanical shocks to these devices at the
switching frequency... things that their low-mass
components experience more dramatically than would
be evident outside the package, where larger masses
prevail).  Capacitors (and connectors) tend to be
high-failure-rate items as well.  Capacitors can
degrade even when not in service (store a machine
in a hot Tucson garage and you're not doing it any
favors!).

Disks nowadays have very high MTBFs (on the order
of 1,000,000 PoHrs).  But most sit in poorly
ventilated cases where they run a lot warmer than
they would like.  You either invest in loud fans or
redundant data stores (note that a power
supply/controller failure can render a RAID array
useless; and software bugs can hose redundant
drives in an ohnosecond -- I lost two copies of an
archive to a driver bug in an old version of
FreeBSD  :< ).
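
A 1,000,000 hour MTBF doesn't mean any one drive
lasts a century, by the way.  Assuming a constant
failure rate (the flat bottom of the bathtub), it
works out to an annualized failure rate near 1%
across a population, and the thermal derating above
eats into that fast:

    # Approximate annualized failure rate (AFR) implied by
    # a quoted MTBF; valid when hours/MTBF is small.
    HOURS_PER_YEAR = 8766

    def afr(mtbf_hours):
        return HOURS_PER_YEAR / mtbf_hours

    print(f"AFR at 1,000,000 PoHrs: {afr(1_000_000):.2%}")  # ~0.88%
    # With +20C of poor ventilation (two halvings, per above):
    print(f"AFR derated:            {afr(250_000):.2%}")    # ~3.51%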

Bottom line: if you *really* want to save your data,
keep it on redundant *machines* (so a failure of one
doesn't kill "both" copies of your data) -- and in
different locations (so a fire or lightning strike
doesn't do likewise!)

I rely on *lots* of spindles (so I only lose a little
at a time) plus three backups on different media:
CD-R (which is really NOT very reliable, long term),
magnetic tape and MO cartridges.  Thankfully, the
amount of *dynamic* data I have that would be
subject to this sort of backup is much smaller than
the archive I've already built  :>

Last note on connectors... they lurk in often-ignored
locations!  In a modern PC, the connectors most
frequently overlooked (for reliability) are those
for the memory devices (SIMMs/DIMMs/etc.).  The
quality of these varies *immensely* -- and there is
no easy way to determine what you've got!

In addition to tin-vs.-gold plating issues, many of
these have very low rated insertion counts -- some
as few as *6*!  Of course, that doesn't guarantee a
failure on the 7th attempt... but if you are pulling
memory frequently, you have to wonder *when* you
will cross the line from "reliable" to "marginal".

Tinkerers beware  :>
--don
