hardware issue weekend

April 14th, 2008

This weekend I relocated myself and my servers to our temporary stomping grounds. Everything went without a hitch. Except of course for my main server, the trusty dual p3 800 machine that has hosted hasno for a long time. Luckily the drive array was not troubled by the loss of its brainier compatriot. I picked up a nice new core2duo e6550 and mobo at a local store and had it up and running later in the day. Things are a lot speedier with current hardware. Rest in peace my 800mhz friend.

In addition to the server hardware troubles, my macbook's hard drive has failed yet again. This is my third hard drive failure. Hitachi's 160gb sata drives are krud. The laptop itself is great, I just wish I didn't have the hard drive dying every 3-6mo. At least the server migration went through without data loss, so I guess I'll just focus on the bright side of the weekend.

As a side note, cablevision can't and won't maintain static ip addresses through a move. So if you're a business user hosting content at your office, expect longer downtime due to dns propagation and ip address setup snafus when moving. Make sure you have them unblock ports 80 and 25, since by default they block them on every new ip setup, even if they were unblocked for your account previously.

My friend was over yesterday and decided that he would upgrade his macbook to osx leopard (10.5). I'm not sure how it happened but it seems that something got on the dvd and caused a read error and a failed upgrade. After cleaning off the disk, the upgrade completed successfully. The only problem was that his user seemed to be gone. All of his attempts with his various passwords failed. He looked fairly dejected at being locked out of his now leopardized laptop.

The first thing we did to troubleshoot was boot the macbook into target disk mode (hold down T when powering it on). Looking around the filesystem revealed that his user folder still existed in /Users, with the username we expected.

Our next attempt was single user mode (hold down apple-S when powering on). Single user mode drops you into a root shell with read-only access to the filesystem. So we did the following:

fsck -yf 
mount -uw /
launchctl load /System/Library/LaunchDaemons/com.apple.DirectoryServices.plist 
launchctl load /System/Library/LaunchDaemons/com.apple.DirectoryServicesLocal.plist 
passwd #this will change root's password, you will be prompted to enter it twice
dscl

That last command will load the directory services command line app (they removed netinfo manager in leopard, otherwise you could easily reboot and fix the problem using a gui from the root account). At the dscl prompt we entered the following:

list Local/Default/Users
create Local/Default/Users/<your_old_username>
create Local/Default/Users/<your_old_username> UniqueID 501
create Local/Default/Users/<your_old_username> PrimaryGroupID 501
create Local/Default/Users/<your_old_username> NFSHomeDirectory /Users/<your_old_username>
create Local/Default/Users/<your_old_username> UserShell /bin/bash
create Local/Default/Users/<your_old_username> RealName 

After that was complete we rebooted the machine by typing shutdown -r now. When the machine booted up again, my friend was able to log into his account. All of his settings had been preserved since they all live in your home directory. The only exceptions were his user account picture, full name and password.
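
The dscl session above was typed at the interactive prompt. A rough sketch of the same record creation as one-shot commands follows, assuming dscl's non-interactive form with . as the local node. To stay on the safe side it only prints the commands instead of running them. The function name print_dscl_commands and the username exampleuser are made up for illustration, the 501 defaults are simply the ids used above, and RealName is left out because its value is not shown above.

```shell
#!/bin/sh
# Dry-run sketch: print one-shot dscl commands that mirror the interactive
# session above. Nothing here touches Directory Services.
# Assumptions: "." addresses the local node; "exampleuser" is a placeholder.
print_dscl_commands() {
    user="$1"
    uid="${2:-501}"   # UniqueID, defaults to the value used in the post
    gid="${3:-501}"   # PrimaryGroupID, defaults to the value used in the post
    record="/Users/$user"

    echo "dscl . -create $record"
    echo "dscl . -create $record UniqueID $uid"
    echo "dscl . -create $record PrimaryGroupID $gid"
    echo "dscl . -create $record NFSHomeDirectory /Users/$user"
    echo "dscl . -create $record UserShell /bin/bash"
}

print_dscl_commands exampleuser
```

Printing first makes it easy to read the commands over (and check the ids against what list shows) before typing or piping them into a root shell.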

I hope someone else finds this useful, it took a good bit of playing around to figure this out.

lemontastic macbook woes

September 26th, 2007

My poor little macbook is once again in the throes of hard drive failure. I returned from lunch and attempted to unlock the screensaver, only to find that bringing up the login dialog took ~2 minutes. The only large processes running at the time were firefox and azureus on a machine with 2gb of ram. When I was finally able to login, I noticed an error message from azureus regarding failed writes. At that point, I craned my head towards where the hard drive is located. That tell-tale click-grind-whir noise was repeating softly above the fan noise. I figure I have maybe a day before this thing kraps itself. Time to make a backup and start calling Apple. Unfortunately this always seems to happen when I am about to travel somewhere. I love this little laptop more than my old pb 12", which was a great little machine. Hopefully apple will just replace it at this point, since I've had it less than one year and have had the hd, logic board and fan fail before this.

<update>

I spoke with apple yesterday, and the tech was adamant about me running Disk Utility's verify. I didn't want to do that before attempting to backup the latest changes to the machine. I do have a backup from 3 weeks ago, but it's missing a vm that I would like to have. So I left the machine attempting to copy that vm and a few other files overnight. When I woke up ~6hrs later, it was still "working", but the UI was not responsive. So I turned it off and on again (The IT Crowd...). The wonderful missing OS screen was what I got in return.

I contacted apple this morning and was told that since the part hadn't failed 3 times, I couldn't just get a laptop replacement. This means that to date I have had 16 days of downtime with this laptop due to hardware issues. Let's assume that it will take 5 days to get me the new part. I think that's a fair assumption since it's Thursday. That will mean 21 days without the laptop, in less than one year. Let's figure out what the availability of the laptop is. Google tells me that 1 month = 30.4368499 days. Ok, so I've had it since January, so that's 9 months or 273 days. So it seems like I have been without the laptop 21 / 273 = 7.69% of the time since I got it. Great, so conversely it's been functional 92.31% of the time. This only ever happens before I travel, it's like the macbook is protesting.
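
The availability arithmetic above is easy to redo for other numbers. Here is a minimal sketch in shell, leaning on awk for the floating point math. The function name availability is made up for illustration; the inputs are the 21 days of downtime and 273 days of ownership from the paragraph above.

```shell
#!/bin/sh
# availability DOWN_DAYS TOTAL_DAYS
# Prints the share of time spent down and up, as percentages.
availability() {
    awk -v down="$1" -v total="$2" 'BEGIN {
        pct_down = 100 * down / total
        printf "down %.2f%% up %.2f%%\n", pct_down, 100 - pct_down
    }'
}

availability 21 273   # prints: down 7.69% up 92.31%
```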

</update>