[Tfug] Advice on building a new machine

Sun Feb 19 23:32:21 MST 2012

Hello again,

After what seems like far too long, I've finally been given the go
ahead (and funds) to replace our aging and crippled server/workstation
in the office.  I can handle choosing most of the parts myself, but I
would very much appreciate some input and advice on aspects of the
hardware that I either have little firsthand knowledge of or that
have changed markedly since I last put a computer together.

* Solid state storage
As we need a decent amount of storage on this machine, I will be
installing between two and four regular hard drives in either a RAID1
or RAID5 configuration.  For the OS (and perhaps /home), though, I am
considering an SSD.  I do not own one, nor have I ever used one.  I
know they're much faster, especially for random I/O, but I've always
been a bit worried about their longevity.  Considering that the drive
would be used for at most five years, is this something I need to be
concerned with?  The machine will be on 24/7, of course.  Would I have
to do anything special with regard to frequently written areas like
/var/log?  Maybe put them on the HDD instead?  It's a solid state
drive, but I suppose I should still put in two in a RAID1 setup, just
to be safe?
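
To put the longevity worry in rough numbers, here is a back-of-the-envelope
sketch of the wear-out arithmetic.  Every figure in it (capacity, rated
program/erase cycles, daily write volume, write amplification) is an
assumption I made up for illustration, not a spec for any real drive, and
the function name is hypothetical.  It also assumes ideal wear leveling,
so treat the result as an optimistic estimate at best.

```python
def ssd_lifetime_years(capacity_gb, pe_cycles, daily_writes_gb,
                       write_amplification=1.0):
    """Rough years until the rated program/erase cycles are used up.

    Assumes writes are spread evenly over all cells (ideal wear leveling).
    """
    if daily_writes_gb <= 0:
        raise ValueError("daily_writes_gb must be positive")
    # Total host data the flash can absorb before reaching its rated cycles.
    total_writable_gb = capacity_gb * pe_cycles / write_amplification
    return total_writable_gb / (daily_writes_gb * 365.0)


# Assumed, illustrative inputs only; substitute real numbers for a real drive.
years = ssd_lifetime_years(capacity_gb=120, pe_cycles=3000,
                           daily_writes_gb=20, write_amplification=2.0)
print(f"Estimated wear-out time: {years:.1f} years")
```

The useful part is the comparison, not the absolute number: plugging in a
measured daily write volume (with and without /var/log on the SSD) would
show whether moving the logs to the HDD changes the estimate in any way
that matters over five years.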

* GPGPU
A bit late to the game, perhaps, but we're finally starting to look
into this.  It couldn't be a better fit for the type of image and data
processing we do, so I don't really know why we've waited so long.
Per dollar, ATI cards have always seemed (to me, anyway) to be the
better buy for raw performance.  But this isn't about pretty pictures.
Would an nVidia card be a better purchase for this purpose?  The
closed source nVidia drivers usually seem to work better with Linux/X
than those from ATI and they support more features, too.  But, the OSS
Radeon driver seems much further along than the OSS Nouveau driver.
However, recent news seems to indicate that the Nouveau driver now has
support for OpenCL, at least for some models.  Which way should I go
here?

* Other hardware
For a CPU I'm looking at an Intel Xeon E5620 2.4 GHz quad-core
"server" CPU with an EVGA 270-WS-W555-A2 LGA 1366 motherboard.  We'd
only be filling it halfway, that is, one CPU and 24 GB of RAM.  The
idea being that we could always expand later on if circumstances
required it.  I don't really have any specific question regarding
this, other than to ask if this jumps out at anybody as being
particularly odd or unwise.

And, while I'm at it, a couple of software/configuration questions:

* File system
I've been using XFS for just about everything since SGI first ported
it to Linux and I had to patch the kernel myself.  It's always worked
remarkably well, always been fast, and I've never had any data loss
using it.  I really don't think btrfs is ready for prime time yet, but
I'm wondering about ZFS.  As storage becomes ever more enormous, I'm
somewhat worried about "silent" data corruption on the drives.  ZFS
has checksums all over the place and can do periodic scrubbing to help
alleviate this concern.  I'm looking at a total usable storage
capacity (after RAID use is accounted for) of between two and four TB.
Is this large enough that I need to worry about this sort of thing?
Unfortunately, ZFS is not in the mainline kernel and I'd hate to go
back to rolling my own kernels again (just one more task I could do
without).  And, the md subsystem (devmapper subsystem too, perhaps?)
can perform a periodic scrubbing/data integrity check on its own, so
maybe I don't need ZFS for that?  Ideally, I'd prefer to just stick
with something I've used for a long time and that I know works very
well, but I also don't want to lose data.  But is the risk high enough
that I realistically need to worry about it?  And, while our
(primarily astronomical) data is certainly important to us, we're not
talking about a life-or-death situation here.
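
For reference, the md scrubbing I mentioned is driven through sysfs: writing
"check" to an array's sync_action file starts a consistency check, and
mismatch_cnt reports what it found.  Below is a minimal sketch of that.  The
array name "md0", the helper names, and the sysfs_root parameter (there only
so the code can be pointed at a scratch directory) are my own assumptions.

```python
from pathlib import Path


def md_dir(array="md0", sysfs_root="/sys"):
    """Directory holding the md attributes for one array."""
    return Path(sysfs_root) / "block" / array / "md"


def start_check(array="md0", sysfs_root="/sys"):
    """Ask md to start a read-only consistency check (needs root)."""
    (md_dir(array, sysfs_root) / "sync_action").write_text("check\n")


def current_action(array="md0", sysfs_root="/sys"):
    """Return the array's current sync action, e.g. 'idle' or 'check'."""
    return (md_dir(array, sysfs_root) / "sync_action").read_text().strip()


def mismatch_count(array="md0", sysfs_root="/sys"):
    """Return the mismatch counter from the most recent check."""
    return int((md_dir(array, sysfs_root) / "mismatch_cnt").read_text())


if __name__ == "__main__":
    # Read-only status report; nothing is written unless start_check() is called.
    if md_dir().exists():
        print("action:", current_action(), "mismatches:", mismatch_count())
    else:
        print("no md0 array found on this machine")
```

As far as I understand it, a check like this only reports that the copies
(or the data and parity) disagree.  It cannot tell which side is correct,
which is exactly where per-block checksums would differ.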

* md (RAID/multiple disk) versus dm (devmapper) subsystems
For a while I have been using LVM on top of an md-built RAID array as
my usual setup for a number of systems, sometimes also throwing in a
crypto layer.  This ties together a number of kernel systems.  It
works, but perhaps there is a better way.  All of these subsystems are
actively maintained, but I've read that the md subsystem can exhibit
throughput bottlenecks.  I haven't seen any hard data, or read a
particularly convincing explanation of this.  Since the dm subsystem
can also handle RAID, is this a better way to organize things?  Does
it really matter or do they all map to the same kernel functionality
in the end?

Okay, I guess that's about it.  Plus I need to let my keyboard cool off...
Thanks in advance for any advice.

-- 
--John Gruenenfelder    Systems Manager, MKS Imaging Technology, LLC.
Try Weasel Reader for Palm OS  --  http://weaselreader.org
"This is the most fun I've had without being drenched in the blood
of my enemies!"
        --Sam of Sam & Max