Friday, February 22, 2013

The dreaded SMART failed email

My NAS emailed me about 3 weeks ago telling me that a drive had failed.

This email was generated by the smartd daemon running on:
host name: shank-nas   
DNS domain: dhcp.theshanks.net   
NIS domain: (none) 

The following warning/error was logged by the smartd daemon: 
Device: /dev/sdc [SAT], Self-Test Log error count increased 
from 0 to 1 For details see host's SYSLOG. You can also use 
the smartctl utility for further investigation. Another 
email message will be sent in 24 hours if the problem 
persists.

I ignored these messages for a while. After all, this drive had just been replaced. It was part of a 2-drive enclosure that I purchased from woot a few years ago. The enclosure started stalling on data reads in November of 2012 and I sent it in for RMA repair. They sent me a new unit with supposedly new drives. Less than 1 month later, this one started failing. Go Hitachi.

The good news is that I had built my NAS for just this purpose. I googled around for instructions on what to do but couldn't find anything, so I ended up figuring it out on my own.

  1. Buy a new 2TB HDD
  2. Identify the drive using blkid
  3. Unmount the mhddfs share (sudo umount /mnt/media)
  4. Unmount the offending drive (sudo umount /mnt/disk1) and comment out its line in /etc/fstab
  5. Install the new drive, partition it, and format it as ext4
  6. Edit /etc/fstab and use the new drive's UUID in place of the bad drive's for /mnt/disk1
  7. Execute sudo snapraid fix
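
Steps 2 and 6 are the easiest ones to get wrong by hand, so here is a minimal Python sketch of them. It is not what I ran at the time, just an illustration. It assumes blkid prints one line per device in its usual `/dev/sdX1: KEY="value" ...` form and that the /mnt/disk1 entry in /etc/fstab is a normal whitespace-separated line that may have been commented out in step 4. The function names and all UUIDs are made up for the example. Both functions only work on text, so nothing touches the real /etc/fstab until you write the result back yourself.

```python
import shlex


def parse_blkid(output):
    """Map each device in blkid output to its attributes (UUID, TYPE, ...)."""
    devices = {}
    for line in output.splitlines():
        device, sep, rest = line.partition(": ")
        if not sep:
            continue  # not a "device: KEY=value" line
        # shlex strips the quotes and keeps values with spaces in one piece
        tokens = [tok for tok in shlex.split(rest) if "=" in tok]
        devices[device] = dict(tok.split("=", 1) for tok in tokens)
    return devices


def replace_fstab_uuid(fstab_text, mount_point, new_uuid):
    """Return fstab text with the entry for mount_point pointing at new_uuid.

    A commented-out entry for mount_point is re-enabled. The rewritten
    line is joined with single spaces; all other lines are left alone.
    """
    result = []
    for line in fstab_text.splitlines():
        fields = line.lstrip("# \t").split()
        is_entry = (
            len(fields) >= 4
            and fields[1] == mount_point
            and fields[0].startswith(("UUID=", "/dev/"))
        )
        if is_entry:
            fields[0] = f"UUID={new_uuid}"
            line = " ".join(fields)
        result.append(line)
    return "\n".join(result) + "\n"
```

To use it, feed the output of sudo blkid to parse_blkid, look up the UUID of the new partition, and pass it to replace_fstab_uuid together with the current contents of /etc/fstab. Review the returned text before saving it.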

You're going to see lots of this:

Reading missing data from file '/mnt/disk1/movies/Big Buck Bunny (2008).mkv' at offset 11825315840.
error:6914817:d1:movies/Big Buck Bunny (2008).mkv: Read error at position 45110
fixed:6914817:d1:movies/Big Buck Bunny (2008).mkv: Fixed data error at position 45110

If all goes well, you should get this:
100% completed, 1934092 MiB processed
 5452810 read/data errors
 5452810 recovered errors
 No unrecoverable errors
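
If you capture the output of the fix to a file, the summary can be checked mechanically. The following is a rough Python sketch that assumes the summary looks exactly like the one above; other SnapRAID versions may word it differently, and the function name is my own invention.

```python
import re


def fix_succeeded(summary):
    """Return True if every read/data error in the summary was recovered."""
    read_errors = re.search(r"(\d+) read/data errors", summary)
    recovered = re.search(r"(\d+) recovered errors", summary)
    if read_errors is None or recovered is None:
        return False  # summary not in the expected shape
    if "No unrecoverable errors" not in summary:
        return False
    return int(read_errors.group(1)) == int(recovered.group(1))
```

It only returns True when the two counts match and the "No unrecoverable errors" line is present, so anything unexpected is treated as a failure.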

You will still need to fix the permissions on these files since SnapRAID doesn't back them up.
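
One way to do that in bulk is sketched below in Python. The modes (644 for files, 755 for directories) are an assumption on my part, so pick whatever your shares need, and run it as a user who owns the files or as root. It does not change ownership.

```python
import os


def reset_permissions(root, file_mode=0o644, dir_mode=0o755):
    """Apply dir_mode to root and every directory under it, and file_mode
    to every regular file. Symlinks are skipped. Returns the number of
    paths changed."""
    os.chmod(root, dir_mode)
    changed = 1
    for dirpath, dirnames, filenames in os.walk(root):
        # Fix subdirectories before os.walk descends into them
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                os.chmod(path, dir_mode)
                changed += 1
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                os.chmod(path, file_mode)
                changed += 1
    return changed
```

For the drive in this post the call would be reset_permissions("/mnt/disk1").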

Now pat yourself on the back. If you hadn't properly set up SnapRAID in the first place, you would be very sad right now.
