📘 Detailed RAID Monitoring & Alerts Guide

✅ Interpreting /proc/mdstat Output

When you run:

cat /proc/mdstat

You’ll see output like:

Personalities : [raid1] 
md0 : active raid1 sda1[0] sdb1[1]
      976630336 blocks [2/2] [UU]

unused devices: <none>

Key things to look for:

  • md0 – the RAID device name.
  • active raid1 – the array state and RAID level.
  • [2/2] – the first number is how many disks the array expects, the second is how many are currently active. E.g., [2/2] = both disks healthy; [2/1] = one disk failed or missing.
  • [UU] – each U represents a disk that is up and healthy. If you see [U_] or [_U], the disk in the underscore position is missing or faulty.
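
The checks above can be automated. The following is a minimal sketch, not a complete monitoring tool: the function name degraded_arrays is hypothetical, and it assumes the status field ([UU], [U_], ...) appears on a line after the array's header line, as in the sample output.

```shell
#!/bin/sh
# List md arrays whose status field contains "_" (a missing or failed member).
# Reads /proc/mdstat by default; pass another path to check a saved copy.
degraded_arrays() {
    awk '
        # Header line such as "md0 : active raid1 ...": remember the array name.
        /^md[^ ]* :/ { name = $1 }
        # Status field such as [U_] or [_U]: at least one underscore present.
        /\[[U_]*_[U_]*\]/ && name != "" { print name; name = "" }
    ' "${1:-/proc/mdstat}"
}
```

Calling degraded_arrays with no arguments prints one array name per line and prints nothing when every array is healthy, so it can be used in a cron job that only sends output when something is wrong.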

During a rebuild or resync, you’ll see progress lines like:

[=>...................]  recovery = 12.4% (12121212/976630336) finish=30.3min speed=54523K/sec
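
To watch a rebuild from a script, the progress line can be reduced to the array name, the operation, and the percentage. This is a sketch under the assumption that the line has the layout shown above (operation name, a standalone "=", then the percentage); the function name rebuild_progress is hypothetical.

```shell
#!/bin/sh
# Print "<array> <operation> <percent>" for each array that is rebuilding.
# Reads /proc/mdstat by default; pass another path to check a saved copy.
rebuild_progress() {
    awk '
        # Header line such as "md0 : active raid1 ...": remember the array name.
        /^md[^ ]* :/ { name = $1 }
        # Progress line: "[=>....]  recovery = 12.4% (...) finish=... speed=..."
        $2 ~ /^(recovery|resync|reshape|check)$/ && $3 == "=" { print name, $2, $4 }
    ' "${1:-/proc/mdstat}"
}
```

For a live view, running watch cat /proc/mdstat in a terminal shows the same information refreshed every two seconds.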

🔔 Setting Up Email Alerts for RAID Failures

If your array is a Linux software RAID managed by mdadm, you can configure mdadm to email you automatically when a disk fails:

1️⃣ Install mdadm (if not installed)
On RHEL/AlmaLinux/CentOS:

dnf install mdadm

On Debian/Ubuntu:

apt install mdadm

2️⃣ Set the email recipient
Edit (or create) the mdadm configuration file, /etc/mdadm.conf on RHEL-family systems or /etc/mdadm/mdadm.conf on Debian/Ubuntu, and add or update the MAILADDR line:

MAILADDR [email protected]

3️⃣ Configure mdadm monitoring service
Enable and start the monitoring service:

systemctl enable --now mdmonitor

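4️⃣ Test email delivery
To confirm that alerts reach you, either send a test message manually or mark a member disk as failed on a non-production array. Forcing a failure degrades the array, so avoid it in production.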
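
One way to test without failing a disk is mdadm's own test mode, which sends a TestMessage alert for each array it finds and then exits. The sketch below wraps it in two helper functions whose names (configured_mailaddr, send_test_alert) are hypothetical. It assumes the recipient is set on a single MAILADDR line and that the host can already deliver mail.

```shell
#!/bin/sh
# Print the MAILADDR value from an mdadm config file.
# Defaults to /etc/mdadm.conf; pass /etc/mdadm/mdadm.conf on Debian/Ubuntu.
configured_mailaddr() {
    awk 'toupper($1) == "MAILADDR" { print $2; exit }' "${1:-/etc/mdadm.conf}"
}

# Send one TestMessage alert per array, then exit (needs root).
# --scan picks up arrays and MAILADDR from the config file,
# --test generates the test alert, --oneshot checks once instead of looping.
send_test_alert() {
    mdadm --monitor --scan --test --oneshot
}
```

Run configured_mailaddr first to confirm the address mdadm will use, then run send_test_alert as root and check the inbox. If nothing arrives, the local mail setup is the usual suspect rather than mdadm itself.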


📢 Important Best Practices

✅ Always test your RAID rebuild/recovery plan; don’t wait for a failure.
✅ Keep good backups; RAID is not a replacement for backups.
✅ Monitor RAID health proactively, not just when things break.
✅ Use smartctl or vendor tools alongside RAID checks to monitor drive health.
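
As a starting point for the smartctl check, the sketch below runs smartctl -H against each device and reports the verdict. The function name smart_health is hypothetical, and the match on "PASSED" assumes the wording smartctl prints for ATA and NVMe drives. Other drive types may word the result differently and would be flagged for a manual look.

```shell
#!/bin/sh
# Report the SMART overall-health verdict for each device given as an argument.
# Needs the smartmontools package and root privileges.
smart_health() {
    for dev in "$@"; do
        # ATA/NVMe drives print "... test result: PASSED" when healthy.
        if smartctl -H "$dev" | grep -q 'PASSED'; then
            echo "$dev: PASSED"
        else
            echo "$dev: needs attention"
        fi
    done
}
```

For the two-disk mirror in the example output, this would be called as smart_health /dev/sda /dev/sdb.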
