This Space for Rent

Now *that’s* backwards compatibility for you

At work, I use the linux lvm code to “speed up” backups by the simple expedient of building lvm snapshots, then doing the backup from inside a chroot jail while the rest of the system charges ahead blissfully unaware that there’s an rsync chugging along in the background. I built up the structure for doing this with lvm on the 2.4 kernel (back when sistina was a standalone company and not a part of chapeausoft) and was very happy with it, right up until I started rolling systems over to the (new! more bugs! almost as reliable!) 2.6 kernels, where I found myself having to do a bunch of work to get the old functionality back.
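For anyone who hasn’t seen this sort of setup, here is a rough sketch of the sequence of commands, written as a Python function that only builds the command list and runs nothing. It is not the actual script from work: the function name, the snapshot name, the snapshot size, the mountpoint, and the rsync destination are all made up for illustration, and it assumes rsync is available inside the chroot.

```python
def snapshot_backup_plan(vg, lv, dest,
                         snap_name="backup_snap",
                         snap_size="1G",
                         mountpoint="/mnt/backup_snap"):
    """Return the commands for one snapshot-based backup, in order.

    Every name and default here is hypothetical. Nothing is executed;
    the caller decides whether and how to run the commands.
    """
    origin = f"/dev/{vg}/{lv}"
    snap = f"/dev/{vg}/{snap_name}"
    return [
        # Freeze a point-in-time view of the live volume.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap_name, origin],
        # Mount the snapshot read-only so the backup cannot alter it.
        ["mount", "-o", "ro", snap, mountpoint],
        # Run rsync inside a chroot rooted at the snapshot.
        ["chroot", mountpoint, "rsync", "-a", "/", dest],
        # Tear everything down again.
        ["umount", mountpoint],
        ["lvremove", "-f", snap],
    ]


if __name__ == "__main__":
    for command in snapshot_backup_plan("vg0", "home", "backuphost::home"):
        print(" ".join(command))
```

The live filesystem keeps taking writes the whole time; rsync only ever sees the frozen snapshot.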

Just yesterday, though, I found a wonderful failure case with the new! kernel (and the new lvm code, because chapeausoft decided to take previously working code and rewrite the whole thing) when I was using a 2.6 kernel on an unconverted lvm partition that had been built with a 2.4 kernel.

The 2.6 lvm says it’s backwards compatible with the 2.4 lvm, and I can mount an lvm1 volume and read and write to it at will without any sort of complaint, but when I did a simple lvcreate --snapshot[…] on that lvm1 volume, it grunted “invalid lv in extent map” 5 or 6 times, then it irrevocably ate the entire lvm configuration and made 200gb of lvm go away forever.

That’s, um, pretty impressive.

I guess the lvm maintainers just didn’t want to bother to be backwards compatible, and decided that it was just too much trouble to add in the bit of code that says “no, we don’t support lvm1 snapshots. convert to lvm2 and they’ll work for you again, thanks!”
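The check I have in mind is tiny. Here is a sketch of it as a wrapper you could put in front of your own snapshot scripts. This is my guess at a userland workaround, not anything the lvm tools actually do: the function names are invented, and it assumes that `vgs --noheadings -o vg_fmt` reports the metadata format as `lvm1` or `lvm2` on your system, which you should verify before trusting it.

```python
import subprocess


def snapshot_allowed(metadata_format):
    """Only lvm2-format volume groups are considered safe to snapshot."""
    return metadata_format.strip().lower() == "lvm2"


def vg_metadata_format(vg):
    """Ask the lvm tools for the volume group's metadata format.

    Assumption: the vg_fmt report field is available on this system.
    """
    result = subprocess.run(
        ["vgs", "--noheadings", "-o", "vg_fmt", vg],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def guarded_snapshot_command(vg, lv, snap_name, snap_size, metadata_format):
    """Build the lvcreate command, or refuse if the format is not lvm2."""
    if not snapshot_allowed(metadata_format):
        raise RuntimeError(
            f"refusing to snapshot {vg}/{lv}: metadata format is "
            f"{metadata_format.strip()!r}; convert to lvm2 first"
        )
    return ["lvcreate", "--snapshot", "--size", snap_size,
            "--name", snap_name, f"/dev/{vg}/{lv}"]
```

A script would call `vg_metadata_format(vg)` once, pass the result to `guarded_snapshot_command`, and bail out with the error message instead of ever reaching lvcreate on an lvm1 volume.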

It’s not as if you might ever have anything valuable on your old lvm1 volume that you might not want to have just vanish when you take your next snapshot.

It’s not as if Linux is the second most successful Unix out there, or is being used in places where people might not want to have surprises like this happen to production systems (like the poor bastards I spotted when doing web searches for “invalid lv in extent map” who posted, to various technical mailing lists, variations on “I got this error message and when I rebooted my 100gb/500gb/1tb/20tb of data was gone; can anyone tell me how to get it back?”, followed, a few days later, by “hello? Anyone out there? Heeeeeellllp!”).

I didn’t think that anyone could beat the FreeBSD “ho ho, we’re going to pretend to overwrite the good disk on your raid set!” lvm failure case, but this makes the horrible FreeBSD software raid look robust in comparison.

Comments


Been bit hard by this thing myself. Lost so much data I’m thinking of going back to Microsoft products.

Fragula Thu Mar 20 01:35:56 2008
