Oct 5

I was taking a mid-afternoon nap (yes, at 3 am; I work nights) and came back to my PC to see CPU usage on my server hovering around 15% instead of idling like usual. A quick check revealed md0_raid5 and md0_resync running, which is normally not a good sign.

mdadm --detail /dev/md0 showed the following:

    Update Time : Sun Oct  5 03:22:19 2008
          State : clean, recovering
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
         Layout : left-symmetric
     Chunk Size : 64K
 Rebuild Status : 85% complete

Uh oh. Why was the array rebuilding itself? All drives were listed as active and working, but had a drive momentarily dropped from the array, or had a SATA device reset? Was this a sign of impending hardware failure? Tailing /var/log/messages turned up this useful piece of information:

Oct  5 01:06:01 rigel md: data-check of RAID array md0

Ok, so “data-check” doesn’t sound so worrisome. A quick Google search revealed this nice gem:

root@rigel:~# tail /etc/cron.d/mdadm
# By default, run at 01:06 on every Sunday, but do nothing unless the day of
# the month is less than or equal to 7. Thus, only run on the first Sunday of
# each month. crontab(5) sucks, unfortunately, in this regard; therefore this
# hack (see #380425).
6 1 * * 0 root [ -x /usr/share/mdadm/checkarray ] && [ $(date +\%d) -le 7 ] && /usr/share/mdadm/checkarray --cron --all --quiet
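
The guard in that cron line is easy to try by hand. Here is a minimal sketch, assuming GNU date (for the -d option); would_run_check is a name I made up for illustration and is not part of mdadm. It mirrors the two conditions: cron's fifth field (0 = Sunday) and the day-of-month test from the cron line.

```shell
#!/bin/sh
# Sketch of the "first Sunday of the month" guard from /etc/cron.d/mdadm.
# would_run_check is a hypothetical helper, not something mdadm ships.
# Assumes GNU date, which accepts -d YYYY-MM-DD.
would_run_check() {
    day="$1"                      # date to test, in YYYY-MM-DD form
    dow=$(date -d "$day" +%w)     # day of week, 0 = Sunday (same as cron's fifth field)
    dom=$(date -d "$day" +%d)     # day of month, zero-padded (01..31)
    # cron supplies the "is it Sunday" part; the -le 7 test narrows it to the first one
    if [ "$dow" -eq 0 ] && [ "$dom" -le 7 ]; then
        echo yes
    else
        echo no
    fi
}

would_run_check 2008-10-05    # prints yes (first Sunday of October 2008)
would_run_check 2008-10-12    # prints no  (a Sunday, but the 12th)
```

The first seven days of any month contain exactly one Sunday, which is why the -le 7 test is enough.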

Ah, so this is the first Sunday of the month, and the check kicked off at 1:06 AM. You tricksy Ubuntu. Apparently a bug has been filed about the check causing performance issues on some boxes. Verifying data integrity is a good idea, although a slightly more obvious notice would be nice.
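
For a more obvious signal than the mdadm output, the md driver reports the current operation in sysfs. The sketch below assumes the usual /sys/block/md0/md/sync_action path for my array; describe_sync_action is a hypothetical helper I wrote for this post, and the descriptions are my own paraphrase of the kernel md documentation, not official wording.

```shell
#!/bin/sh
# Sketch: translate the md sync_action value into a one-line explanation.
# describe_sync_action is a hypothetical helper; it only reads a file.
# Assumption: the array is md0, so the default path is /sys/block/md0/md/sync_action.
describe_sync_action() {
    file="${1:-/sys/block/md0/md/sync_action}"
    if [ ! -r "$file" ]; then
        echo "cannot read $file"
        return 1
    fi
    action=$(cat "$file")
    case "$action" in
        idle)    echo "idle: no sync operation running" ;;
        check)   echo "check: redundancy check in progress" ;;
        repair)  echo "repair: check and repair in progress" ;;
        resync)  echo "resync: redundancy being recalculated" ;;
        recover) echo "recover: rebuilding onto a replacement device" ;;
        *)       echo "$action: see the kernel md documentation" ;;
    esac
}
```

Running describe_sync_action with no argument reads the md0 path; passing a different file (say, the one for md1) checks another array. A value of check here would have told me right away that this was the monthly scrub and not a real rebuild.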
