I was taking a mid-afternoon nap (yes, at 3 am; I work nights) and came back to my PC to see CPU usage on my server hovering around 15%, not idle like usual. A quick check revealed md0_raid5 and md0_resync running, which is normally not a good sign.
mdadm --detail /dev/md0 showed the following:
    Update Time : Sun Oct 5 03:22:19 2008
          State : clean, recovering
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
         Layout : left-symmetric
     Chunk Size : 64K
 Rebuild Status : 85% complete
Uh oh. Why was the array rebuilding itself? All drives were listed as active and working … but did we experience a drive momentarily dropping from the array or a SATA device reset? Was this a sign of impending hardware failure? Tailing /var/log/messages displayed this useful piece of information:
Oct 5 01:06:01 rigel md: data-check of RAID array md0
Ok, so “data-check” doesn’t sound so worrisome. A quick Google search revealed this nice gem:
root@rigel:~# tail /etc/cron.d/mdadm
# By default, run at 01:06 on every Sunday, but do nothing unless the day of
# the month is less than or equal to 7. Thus, only run on the first Sunday of
# each month. crontab(5) sucks, unfortunately, in this regard; therefore this
# hack (see #380425).
6 1 * * 0 root [ -x /usr/share/mdadm/checkarray ] && [ $(date +\%d) -le 7 ] && /usr/share/mdadm/checkarray --cron --all --quiet
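To convince myself the "first Sunday" trick does what the comment says, here is a rough sketch of the same test as a standalone function. The name is_first_sunday is my own invention, and it assumes GNU date (for -d and %-d). The cron line only needs the day-of-month half, because the 0 in the fifth field already limits it to Sundays. The backslash in \%d is there because crontab treats a bare % as a newline.

```shell
#!/bin/sh
# is_first_sunday DATE: hypothetical helper, not part of mdadm.
# Succeeds only if DATE (YYYY-MM-DD) is a Sunday on day 1..7 of its month.
# Assumes GNU date, which provides -d and the unpadded %-d format.
is_first_sunday() {
    dow=$(date -d "$1" +%w)    # day of week, 0 = Sunday (same as the cron field)
    dom=$(date -d "$1" +%-d)   # day of month without zero padding
    [ "$dow" -eq 0 ] && [ "$dom" -le 7 ]
}
```

For example, is_first_sunday 2008-10-05 && echo "check would run" prints the message, while the following Sunday (2008-10-12) does not.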
Ah, so this is the first Sunday of the month and the check kicked off at 1:06 AM. You tricksy Ubuntu. Apparently a bug has been filed about the check causing performance issues on some boxes. It is a good idea to verify data integrity, although a slightly more obvious notice would be nice.
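For next time, a quicker way to tell a scheduled check from a real rebuild is the sync_action file the md driver exposes in sysfs. The sketch below is only an illustration: describe_sync is a made-up name, the default path assumes the array is md0, and the wording of the messages is mine. States it does not list fall through to the last branch.

```shell
#!/bin/sh
# describe_sync [FILE]: hypothetical helper that prints a one-line verdict
# from an md sync_action file. FILE defaults to md0's sysfs entry (assumed).
describe_sync() {
    file=${1:-/sys/block/md0/md/sync_action}
    action=$(cat "$file" 2>/dev/null) || { echo "cannot read $file"; return 1; }
    case "$action" in
        idle)           echo "idle: nothing running" ;;
        check)          echo "check: scheduled scrub, mismatches are only counted" ;;
        repair)         echo "repair: scrub that also rewrites mismatches" ;;
        resync|recover) echo "$action: redundancy is being rebuilt" ;;
        *)              echo "unrecognised state: $action" ;;
    esac
}
```

Running describe_sync with no argument on the server during the monthly job should report the check state rather than a resync or recover.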