I work with Linux systems that have large numbers of drives strung together using Linux software RAID (MD devices). Having multiple MD devices on a single system seems to thwart the built-in throttling code. MD RAID devices are supposed to take advantage of unused disk bandwidth when doing rebuilds, checks, or scrubs, and each MD device also has a configurable upper limit for rebuild speed. When there are multiple MD RAID devices on the same system, you effectively multiply that upper limit by the number of devices being scrubbed simultaneously.
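To make the multiplication concrete, here is a minimal sketch of the arithmetic. It assumes the system-wide ceiling is read from /proc/sys/dev/raid/speed_limit_max, which applies to each device separately; the function names are made up for illustration and the 200000 figure in the comment is only an example value.

```python
from pathlib import Path


def read_speed_limit_max(path="/proc/sys/dev/raid/speed_limit_max"):
    """Return the per-device resync ceiling (kilobytes per second)."""
    return int(Path(path).read_text().split()[0])


def aggregate_ceiling(per_device_limit, active_devices):
    """Worst-case combined rate when several devices scrub at once.

    Each device is throttled on its own, so the limits simply add up.
    """
    return per_device_limit * active_devices


# Example: a per-device limit of 200000 with 4 devices scrubbing at
# the same time allows up to 800000 in total.
```

With one scrub at a time, the aggregate ceiling collapses back to the single configured limit, which is the point of serializing.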
To serialize the weekly scrubs I want to do, I wrote a script that is meant to be run from a cron job. Each time it runs, it checks whether a scrub or other RAID check or rebuild is already in progress. If one is, the script exits. Otherwise the script starts a scrub on a single device that has not been checked since a time specified on the command line, defaulting to "7 days ago." The script starts with the most out-of-date device first, in order of increasing size.
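The selection logic described above might look roughly like the following. This is a sketch, not the actual script: the MdArray record, pick_next, and start_check are hypothetical names, and I am assuming the script keeps its own record of when each device was last checked. I also read "most out of date first, in order of increasing size" as "oldest check first, smaller device wins a tie," which is only one interpretation. Writing "check" to a device's sync_action file is the standard way to request a scrub.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path
from typing import List, Optional


@dataclass
class MdArray:
    name: str               # e.g. "md0"
    sync_action: str        # contents of /sys/block/<name>/md/sync_action
    last_checked: datetime  # ASSUMPTION: recorded by the script itself
    size: int               # only the relative ordering matters here


def pick_next(arrays: List[MdArray], cutoff: datetime) -> Optional[MdArray]:
    """Return the one device to scrub now, or None if nothing should start."""
    # Anything other than "idle" means a check, repair, resync or
    # recovery is already running somewhere, so do nothing this time.
    if any(a.sync_action != "idle" for a in arrays):
        return None
    stale = [a for a in arrays if a.last_checked < cutoff]
    if not stale:
        return None
    # Oldest check first; among equally old devices, smallest first.
    return min(stale, key=lambda a: (a.last_checked, a.size))


def start_check(name: str, sys_block: str = "/sys/block") -> None:
    """Ask the kernel to scrub one device (needs root on a real system)."""
    (Path(sys_block) / name / "md" / "sync_action").write_text("check\n")
```

Because each cron run starts at most one scrub and bails out while anything is busy, the devices get checked one after another without any locking between runs.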
Most administrative code I have seen that works with MD RAID uses the file /proc/mdstat. I worried that parsing that text might be fragile and opted instead for plucking values out of the files under /sys/block/md*/md/. That way the data is already all parsed up for me, just like Dan Bernstein would like. While working on this script I noticed that the documentation for the latest kernels is already a little out of sync with the /sys entries on my CentOS 5 systems. Maybe /proc/mdstat is meant to be more of a stable API than the /sys entries. For now I still prefer the /sys approach.
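For a sense of what "already parsed" means in practice, here is a small sketch of reading those files. The helper names are mine, and the sys_block parameter exists only so the functions can be pointed at a test directory. Each attribute file holds a single value, so there is nothing to parse beyond stripping the newline. Since the set of entries differs between kernel versions, a missing file is treated here as "not busy" rather than as an error; that choice is an assumption, not something the kernel documents.

```python
from pathlib import Path


def md_devices(sys_block="/sys/block"):
    """Names of block devices that expose an md/ directory."""
    return sorted(p.name for p in Path(sys_block).glob("md*")
                  if (p / "md").is_dir())


def read_md_attr(device, attr, sys_block="/sys/block"):
    """Read one attribute file, e.g. read_md_attr("md0", "sync_action")."""
    return (Path(sys_block) / device / "md" / attr).read_text().strip()


def busy_devices(sys_block="/sys/block"):
    """Devices whose sync_action is anything other than "idle"."""
    busy = []
    for dev in md_devices(sys_block):
        try:
            action = read_md_attr(dev, "sync_action", sys_block)
        except OSError:
            # The entry may be absent on some kernels or RAID levels.
            continue
        if action != "idle":
            busy.append(dev)
    return busy
```

Compare that with /proc/mdstat, where the same answer means scanning a multi-line, human-oriented report for a progress line.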