kpartx to the rescue

Sat 20 April 2013

Filed under Howto

Tags Cool

failed disks

Not every admin is equally intelligent, interested, disciplined or motivated. Those attributes represent the balancing act of professional service in any capacity. However, when both motivation and interest are lacking, it's probably a sign you hate your job. Cleaning up after messes made by indifference and apathy has become something of a pastime for me. Employers discover this the way skiers discover avalanches - a silent but rapidly rising disaster. Most of us will simply be blindsided when a situation hits critical mass. That's why Documentation, Budgets, RAIDs and Backups are the foundation of basic competence.

recovery

The package today was a Case Management server (so-called) which began life as a Compaq Presario CQ5320F desktop. Perhaps not a bad machine, but never in life should it be called a server. When I try to picture the prior admin, all I see is a man spending a great deal of effort to find out how close he can get to a cliff's edge while still doing handstands. It does not matter that the new admin is speedily retrofitting all of the infrastructure while remaining on budget. Virtualization, redundant equipment, central storage and backups can't fix the two old machines which haven't yet been migrated. One of those is our Case Management server, which has failed to back up properly for some weeks now. Second-guessing won't help; suffice it to say that numerous attempts did not yield success. (Which makes it just about identical to everything else the predecessor left us.) This weekend's scheduled power outage seems to have triggered the failure. All other systems restarted and ran properly, but the Windows 2003 Case Management server hung on login. Not even Safe Mode worked. In desperation, the admin attempted a 'Last Known Good Config' boot, which surely guarantees problems if it does not work. It too failed.

The onsite admin spent a good deal of time before calling me in. He wasn't sure what he was dealing with, and we needed Case Management back, if at all possible, by Monday. My offsite thirty-second diagnosis: likely disk failure. I arrived in the role of Troubleshooter. The admin has been onsite for about ninety days and he's been busy. Since day one, his hands have been full with maintaining service, designing solutions, building and implementation. His keen eye and patient effort have resulted in a general migration away from peril with zero downtime. We spent just under an hour debating approaches to the problem, since every option would require hours of data copying, and we didn't want to tarnish his sterling image. Multiple extraction attempts might crash the drives or push us out of our window of opportunity.

careful assembly

The predecessor had taken a Compaq desktop with a 500GB drive, and later added a Barracuda 1TB disk. Since he lacked mounting rails, and didn't want to be bothered to use the other 3.5" slot, he felt it appropriate to mount the second disk at a 35-degree angle, in a full-width slot, with only one screw. As bad luck would have it, both drives began to fail at about the same time. This machine has been slow for months, but then, everything was slow from the beginning. The only change after restart was that sort of permanent slowness most of us call "crashing". I don't know exactly when slow becomes static, but this server was offering us an excellent opportunity to measure.

resources

Resources: NetApp FAS2220, QNAP NAS, VMware ESXi 5.1 cluster, and the failing Server with both disks.

QNAP - set up an NFS export, casemgmt, and a 1.1TB iSCSI Target.

NetApp - Keep on trucking, serving persistence to the VMware Cluster.

We booted the Compaq with SystemRescueCd. We found the disks and ran smartmontools against the drives. They passed basic inspection, but reported "replace immediately". After mounting QNAP:/casemgmt, we started in on dd_rescue.
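The SMART inspection amounts to something like the following - the device name here is an assumption, substitute your own:

smartctl -H /dev/sda    # overall health self-assessment
smartctl -a /dev/sda    # full attribute and error-log dump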

fdisk -l revealed two disks: number one ~500GB, number two ~1TB. In a second virtual terminal, I loaded iftop. I quickly saw the bandwidth usage climb to 95Mb/sec. I didn't know at first whether that was megabits or megabytes, and after a little digging, I sadly determined that it was megabits, which meant our link was only 10/100. The Ethernet chipset was an Nvidia forcedeth, the card was a 10/100, and the unchanging "Estimated Time to Completion" was slightly under 12 hours. We scrambled around, found nothing, and eventually headed to the only Portland computer store to carry gigabit PCI-Express NICs. Computek, just west of I-405 on Jefferson, sold us an Intel PCI-E 1Gb NIC. When we shut down to install the NIC, we relocated the Data Drive to a different workstation (also with a gigabit NIC).

The Case Management server was restarted with SystemRescueCd, and we resumed the data extraction with dd_rescue. The System Drive had two major faults, which I didn't log, but they occurred between the 37GB and 41GB boundaries. After that, the disk was fine. The speed was adequate, beginning near 72MB/sec and ending at just under 19MB/sec, with occasional sustained drops to around 10MB/sec. The average, based on extraction time, was a little more than 40MB/sec - assuming my math and my tools were both correct.

The Data Drive was much slower. Despite the fact that it held only ~385GB of data, it took more than twice as long to copy. Reads started at 20MB/sec and slowly rose to a peak of 32MB/sec, with many, MANY drops to under 10MB/sec. We tried a few multi-threaded copy tools: primarily FastCopy for the initial sweep, and Robocopy (because of the /ZB flag) to clean up. FastCopy ran for more than five hours and was able to reach all but ~5GB of the data. Robocopy ran for a little more than twenty minutes and had no trouble at all.
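The cleanup pass looks something like this - the paths are hypothetical, and /ZB is the flag that copies in restartable mode and falls back to Backup mode when access is denied:

robocopy X:\data E:\data /E /ZB /R:1 /W:1 /LOG:C:\robocopy.log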

Once the copy was complete, we ran a read-only chkdsk to give us an overall view of disk usage, and we also examined Properties in Windows Explorer for the top-level directories. All the numbers tallied, so we archived the failing disk.
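For the record, a read-only pass is simply chkdsk without /f - it reports problems and usage without touching the file system. The drive letter is an assumption:

chkdsk E: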

p2v, manual edition

It's been a long time since I manually P2V'd a Windows server. This one wasn't working properly to begin with, the System Drive had experienced significant failures, and I would be forced to run an FS-altering chkdsk before any real recovery. The QNAP held our image file on an NFS export. Windows 7 would provide Microsoft-written NTFS repair tools. All I needed was a shim layer to provide access to the System Drive image file.

For this, we deployed an Ubuntu 12.04 LTS VM in the VMware Cluster. We gave it 2GB of RAM and 4 vCPUs. We installed the minimal package set, then added htop, iftop, mc, nfs-common, open-iscsi-utils, and iscsitarget (et al.). With our shiny new VM, I mounted the QNAP with "hard,nointr" NFS options and moved on to iSCSI setup. First I set the IQN, then I grabbed an ietd.conf template and changed the path to point at the System Drive image. I left all masking wide open, then used the Microsoft iSCSI Initiator tools on Windows 7 to add the Ubuntu server as a Discovery Portal. The LUN immediately appeared, and I connected it to a drive letter. We ran chkdsk /f, and the file system recovered nicely. Only three files were lost, and I was unable to identify them from their contents, which leads me to suspect they were completely corrupt anyway.
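A minimal sketch of that setup, with the IQN, names and paths as placeholders (the actual template I grabbed is reproduced in the notes below):

mount -t nfs qnap:/casemgmt /nfs/casemgmt -o nfsvers=3,hard,nointr,nolock

# /etc/iet/ietd.conf - export the raw disk image as a LUN
Target iqn.2013-04.local.recovery:casemgmt-sys
    Lun 0 Path=/nfs/casemgmt/casemgmt-disk0.img,Type=fileio

# Ubuntu also wants ISCSITARGET_ENABLE=true in /etc/default/iscsitarget
service iscsitarget restart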

We did not have a licensed NTFS copy tool, so we decided to use the ntfs-3g tools under Linux to accomplish the transfer. After a few futile attempts at a loop-back iSCSI connection, I tried to get a loop-mount going instead. The problem with loop-mounting is that it only works directly if you have a partition image, not a whole-disk image. I needed to locate the start of the partition, and as I examined the problem I realized I was not familiar enough with the math of sectors and offsets to feel comfortable basing a critical system recovery on it.
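For the curious, the manual route looks roughly like this, assuming 512-byte sectors and a partition starting at sector 63 (typical for a 2003-era disk) - exactly the arithmetic I didn't want to bet a recovery on:

fdisk -l -u /nfs/casemgmt/casemgmt-disk0.img    # note the partition's start sector
losetup -o $((63 * 512)) /dev/loop1 /nfs/casemgmt/casemgmt-disk0.img
mount -t ntfs-3g -o ro /dev/loop1 /mnt/sysdrive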

Five minutes of Google turned up kpartx. I don't know who wrote it, but that poor bastard probably received a few direct courses at the School of Hard Knocks. (Despite it being my alma mater, I prefer mail-order courses these days.) I think you can always tell software written at the pointy end of the sharp stick called "Experience". When a utility is VERY easy and requires NO BRAINS to operate, the author is probably either a genius or someone with a lot of practice. I don't want to diminish the author's genius, but given that no one I've heard of did this before him, I'm voting for the latter.

kpartx is very small, it's in the main Ubuntu repo, and it has two easy flags - one to interpret the image and tell you what it's going to do, and another to actually do it. I completely ignored the remaining options. I ran the test and received exactly what I expected. Then I ran the live command, and I had a new loop device in /dev/mapper. The loop device worked smoothly with ntfsclone --save-image, and the resulting image was 9GB.
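In practice the whole dance was on the order of the following - file and device names are assumptions, and the mapper name depends on which loop device kpartx grabs:

kpartx -l /nfs/casemgmt/casemgmt-disk0.img    # dry run: show what would be mapped
kpartx -a /nfs/casemgmt/casemgmt-disk0.img    # create the mappings under /dev/mapper
ntfsclone --save-image -o /nfs/casemgmt/casemgmt-sys.img /dev/mapper/loop0p1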

That's right: 3 hours, a full disk, and 9GB of actual data. I tried very hard not to be deeply annoyed. Even with that in hand, it was still a damaged image of a Windows 2003 Standard SP2 server originally running on SATA disks, and I needed to load it into a new guest machine, on an emulated LSI controller, in the VMware cluster to make it run.

We set up a Windows 2003 guest, "CaseMgmt", and set the virtual drive to 40GB. With the CaseMgmt VM shut off, we attached the virtual disk to the Ubuntu VM and used fdisk to create an aligned partition. This was when we discovered that ntfsclone won't let you restore an image to a partition smaller than the original. After a little monkeying about, we gave up and increased the CaseMgmt virtual drive to 500GB. After a lot of testing, including using Windows 2008 to create an aligned partition on the virtual drive, we gave up on alignment entirely. Given the nature of the problems we were experiencing, it was becoming a waste of time.

Our testing cycle went like this: Shutdown the CaseMgmt VM, boot the Ubuntu VM, attempt an operation, then Shutdown Ubuntu and start CaseMgmt. It was pretty simple, and our VMware Cluster running atop our FAS2220 meant that it took about 30 seconds to start and 10 seconds to stop - for both Ubuntu and the CaseMgmt VMs.

At length, we abandoned aligned partitions and created a new virtual drive. We started the CaseMgmt VM from a Windows 2003 Standard CD ISO image and used the initial phase of Setup to partition and format the C: drive. Then we halted the CaseMgmt VM and restarted Ubuntu. We used ntfsclone to restore the partition; anecdotally, it seemed slower, but since I hadn't planned for this moment, I hadn't timed the previous ntfsclone operations, so it was pointless to attempt any performance comparison.
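The restore direction, assuming the fresh partition shows up as /dev/sdb1 on the Ubuntu VM:

ntfsclone --restore-image --overwrite /dev/sdb1 /nfs/casemgmt/casemgmt-sys.img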

When next I booted the CaseMgmt VM, Windows 2003 did indeed attempt to boot; however, it stopped very quickly with "INACCESSIBLE_BOOT_DEVICE". We booted the Windows 2003 Standard install disk again, and found that the local Administrator password had not been documented by the previous admin. We booted a copy of Hiren's BootCD and used the password reset tools to clear the Administrator password. From there we tried enabling a few drivers, but all that yielded was an unbootable, and unrecognizable, Windows instance.

We restored again with Ubuntu, and this time abandoned all pretense of p2v. I executed a Windows 2003 Standard Repair Install. Everything worked exactly as I had hoped. The resulting image booted, ran, remained joined to the domain, and the Case Management services even operated correctly.

grinding

All the time we were working on the System Drive, we kept our eyes on the Data Drive. Once we had a moment, we loaded HDD Guardian to give us a GUI view of the SMART info for the Data Drive. There were two firmware updates pending for it, and it had just under 600 bad sectors. The recommendation, in orange, was: Back up your data and replace this disk immediately.

As I stated above, we carefully examined the results, and when all was complete, we detached the iSCSI drive from the workstation and attached it to the CaseMgmt VM with the Microsoft iSCSI Initiator. Once we set the drive letter, the CaseMgmt services immediately resumed operation and behaved normally.

The only remaining hitch is Windows Updates. Internet Explorer, and many other services, wouldn't run until we re-installed Service Pack 2. Thankfully, the VMware Tools did not care either way. The first two things we did after our successful boot were to set the local Administrator password and install VMware Tools. I would not attempt to deploy any .Net software on this host: the .Net install is in an uncertain state, and the patches keep trying to apply but fail with errors about mscoree.dll and other .Net-related DLLs. My past with the 1.1 and 2.0 releases of the .Net CLR tells me that it's simpler to rebuild a server than to fix .Net. This server, and its configuration, are hardly suitable for long-term use. But this was not a migration; it was a holding action intended only to circumvent disaster. Long-term services will be deployed in a more suitable fashion.

gratitude

I am grateful to all the fellows of the Open Source movement for providing me with tools which can be assembled into useful, powerful and subtle solutions. This operation was one step shy of genuine data recovery, and none of it would have been possible without Linux, GNU and the massive host of utilities. Thanks, in particular, to kpartx, which takes the complex math out of loop-mounting partitions within hard drive images.

notes

kpartx

I only used -l and -a. The -l flag lists the detected partitions; -a sets up a loop device for each of them.

kpartx -l <image file>
kpartx -a <image file>

(for more info, see the nfolamp article.)

nfs and dd_rescue

For those who might be curious about the exact commands....

Mounting the share

mount -t nfs qnap:/casemgmt /nfs/casemgmt -o nfsvers=3,rsize=8192,wsize=8192,soft,intr,nolock

Listing partitions

fdisk -l

Starting dd_rescue

dd_rescue /dev/sda /nfs/casemgmt/casemgmt-disk0.img -l /nfs/casemgmt/casemgmt-disk0.log

nvidia-forcedeth

The Nvidia chipsets - video or motherboard - have always been a mix of frustration and pleasure. I don't know if forcedeth even supports Gig-E, but in this case it was pretty lame. Also, ethtool is a morass of complexity and obscure options. FFS, can someone please make "ethtool ethX" give a report of the well-known stats for any card? TSO, TX/RX checksum, PHY state, PHY speed, etc, etc. I hate how much I have to read the manual the two times per year that I need to find out the damn link speed.
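For the record, the closest thing I know of splits across two invocations - interface name assumed:

ethtool eth0      # link state, duplex and reported speed
ethtool -k eth0   # offload settings: TSO, TX/RX checksumming, etc.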

aligned partitions

I love Linux tools, but it's painful to use traditional fdisk to align a partition. Now that the heat is off, I think I could have used parted to perform this work - but I didn't, and I don't intend to test it again.
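Untested, but the parted one-liner would presumably look like this - the device name is an assumption:

parted -a optimal /dev/sdb -- mklabel msdos mkpart primary ntfs 1MiB 100%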

nfs

It seems that most NAS solutions dislike locking. In any case, when we first attempted to connect to the QNAP, we had a lot of failures. Finally I used this:

showmount -e xx.xx.xx.xx

That revealed... NOTHING... and we discovered that NFS had never been enabled on the QNAP. After enabling NFS, it was very nice.

NFS Command used during Image Extract:

mount -t nfs xx.xx.xx.xx:/*exportname* /local/path -o nfsvers=3,rsize=8192,wsize=8192,soft,intr,nolock

NFS Command used during iSCSI Target Export:

mount -t nfs xx.xx.xx.xx:/*exportname* /local/path -o nfsvers=3,rsize=8192,wsize=8192,hard,nolock

iSCSI

If you're doing this over Gig-E, it's probably a great thing. It's better if you don't have to do iSCSI across your production edge network, and better still with dedicated interfaces. That said, we can stream ~90MB/sec from our QNAP over a 1500-byte-frame production edge network. It's Good Enough™ for me.

#/etc/iet/ietd.conf
Target iqn.2012-07.com.ashbyte:MomsLT_NTFS
    Lun 1 Path=/media/ExternalVol00/rootSnapshot01/Sunbeam2/MomsLT_drive.img
    Alias MomsLT
    InitialR2T Yes
    ImmediateData Yes
    MaxOutstandingR2T 1
    MaxConnections 1
    MaxSessions 0
    HeaderDigest None,CRC32C
    DataDigest None,CRC32C
    QueuedCommands           32              # Number of queued commands

ntfs-3g

These guys should probably be showered with money from Microsoft and most MS Users. They have provided the only legitimate alternative to Windows Boxes for accessing, managing and munging NTFS volumes. I dislike using it, but I am ALWAYS thankful for their tools.

driver injection

I would love to see a GOOD Windows 2003/XP/etc offline driver injection toolkit. I have experimented with this before, but it's very tedious and unrewarding. If I keep this up, I may write one. I would love it if the injection framework only installed VMware drivers =). But I would similarly love it if there were an offline tool for simply changing partition or other information in a Windows registry, allowing boot-drive adjustments which currently require a re-install. God bless *nix for its love of "/" (root).

