[Sysadmins] Fwd: re: My Great Linux System Repair Adventure

Sat May 19 15:11:29 MSD 2007

----- Forwarded message from Michael Shigorin -----

Date: Sat, 19 May 2007 14:10:27 +0300
From: Michael Shigorin
To: sjvn@
Subject: re: My Great Linux System Repair Adventure

	Hello Steven,
re http://www.linux-watch.com/news/NS9297647757.html:
[...]

> On the other hand, when things go badly wrong, getting the
> ReiserFS fsck file tree rebuild to work properly can be very
> tricky.

I've seen even worse cases, where consulting the reiserfs folks
was necessary (luckily for the person involved, he knew someone
who had once worked at Namesys).

Basically, you must not recommend that anyone run reiserfsck
(or any r/w fsck in general) on a troubled volume once a serious
hardware disaster is suspected. If the data written since the
last backup are unique and valuable enough, the requisite first
steps are backing up the partition table with at least fdisk -l
(better yet, sfdisk -d /dev/?da, so the dump can be fed to sfdisk
again) and then saving an image of the partition in question to
a larger drive.

I usually recommend dd_rescue or ddrescue for that; one of those
should be on every half-decent rescue livecd.
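
For illustration, imaging with GNU ddrescue might look like the
sketch below. It assumes the GNU tool (dd_rescue takes different
options), the paths are placeholders, and DRY_RUN is again just a
hypothetical switch of this example:

```shell
#!/bin/sh
# Sketch only: image a damaged partition with GNU ddrescue in two passes.
# Partition, image and mapfile names are placeholders.

image_partition() {
    part=$1      # damaged partition, e.g. /dev/sda3 (assumed name)
    img=$2       # image file on a larger, healthy drive
    map=$3       # ddrescue mapfile, lets an interrupted run resume

    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Only show what would be run.
        echo "ddrescue -n $part $img $map"
        echo "ddrescue -d -r3 $part $img $map"
        return 0
    fi

    # Pass 1: copy everything that reads cleanly and skip the slow
    # scraping of bad areas (-n).
    ddrescue -n "$part" "$img" "$map" || return 1
    # Pass 2: go back to the bad areas with direct disc access (-d)
    # and up to three retry passes (-r3).
    ddrescue -d -r3 "$part" "$img" "$map"
}
```

The mapfile matters as much as the image: with it, the second pass
(or a rerun after a crash) only touches the areas still missing.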

Then copying that image (if a drive twice as large as the
partition is available, or another one just as large) *and*
running fsck only on the copy helps a lot when things do go
wrong.
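
Roughly, working on a copy could go like this. The file names are
assumptions, and the loop device step is one possible way to hand
an image file to reiserfsck, not necessarily the only one:

```shell
#!/bin/sh
# Sketch only: keep the rescued image pristine and let fsck loose on a copy.
# File names are placeholders.

make_working_copy() {
    img=$1       # untouched image made from the damaged partition
    work=$2      # scratch copy that fsck is allowed to modify
    # --sparse=always (GNU cp) keeps runs of zeros from eating disk space.
    cp --sparse=always "$img" "$work" || return 1
    # Make sure the copy is byte-identical before trusting it.
    cmp -s "$img" "$work"
}

check_working_copy() {
    work=$1
    # Needs root: attach the copy to a free loop device.
    loop=$(losetup -f --show "$work") || return 1
    # Read-only check first; --fix-fixable or --rebuild-tree come later,
    # and only ever on the copy.
    reiserfsck --check "$loop"
    status=$?
    losetup -d "$loop"
    return "$status"
}
```

If a rebuild on the copy makes things worse, the original image is
still there to copy again and try something else.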

Corollary: one should not store valuable data on extremely large
partitions. In particular, /home is usually better kept separate
from a "wastedump" and sized at perhaps 2x or 4x its current
volume of useful data, not orders of magnitude larger.

> NTFS3

it's ntfs-3g ;-)

> and, with its small memory footprint of 128MB, it appeared to
> load fine. 

It could still use the rest of the available memory for caches.
Passing "mem=128M" (or "mem=256M") at boot would be safer if some
part of memory or a semi-cooked controller were under suspicion.
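
To double-check that such a limit actually reached the kernel, a
small helper along these lines could be used. The function name is
made up for this example; it just looks for "mem=" on the kernel
command line:

```shell
#!/bin/sh
# Sketch only: print the value of any mem= parameter on a kernel command line.

mem_limit() {
    # $1: a command line to inspect; defaults to the running kernel's.
    cmdline=${1:-$(cat /proc/cmdline)}
    for arg in $cmdline; do
        case $arg in
            mem=*) echo "${arg#mem=}" ;;   # strip the "mem=" prefix
        esac
    done
}
```

Run without arguments after booting, it prints e.g. 128M if the
parameter was passed and nothing otherwise; comparing that with
MemTotal in /proc/meminfo shows whether the limit took effect.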

> Even over my 100Mbps Fast Ethernet connection, I really didn't
> want to waste time sending all that data.

Then I wonder why you spent so much time tarring it onto the
same spindle (seeks become the bottleneck then).

If at all possible, I'd set up an NFS server on a LAN system,

service portmap start
mount lanpc:/net /mnt
tar cvzf /mnt/sjvnhomedir.tar.gz /home/sjvn

> I decided once more to go with easy, over other alternatives.

and that would finish way faster.
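
On the server side ("lanpc" above), the export could be set up
along these lines. The client network is a made-up example and the
helper name is hypothetical; only the /net directory comes from
the mount command above:

```shell
#!/bin/sh
# Sketch only: build an /etc/exports line for the NFS server ("lanpc").
# The client network used in examples is an assumption.

export_line() {
    dir=$1       # directory to share, e.g. /net
    clients=$2   # who may mount it, e.g. 192.168.1.0/24 (assumed)
    # rw: the rescue system must be able to write the tarball;
    # sync: do not acknowledge writes before they reach the disk.
    printf '%s %s(rw,sync,no_subtree_check)\n' "$dir" "$clients"
}
```

As root on the server, "export_line /net 192.168.1.0/24 >>
/etc/exports" followed by "exportfs -ra" would publish it. Since
the rescue system writes as root and root is squashed by default,
the directory has to be writable by the squashed user, or the
export needs no_root_squash.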

> Once logged in on the MEPIS PC, I logged into the SLED system's
> SSH server 

Oh, using a KVM to get back in via ssh is funny too, even if one
usually sees less fun in it until the data are back.

PS: 

> Despite no fewer than three power surge protectors, including a
> master power protector for the entire house

As they say, a $300 tube will protect a $.05 fuse by blowing up
first. 8)

Hope you have better luck next time, and if things are that bad,
an online UPS (like Powerware or APC _online_ models) might help.

----- End forwarded message -----

-- 
 ---- WBR, Michael Shigorin <mike at altlinux.ru>
  ------ Linux.Kiev http://www.linux.kiev.ua/