Interpreting /proc/mdstat Output
When you run:
cat /proc/mdstat
You'll see output like:
Personalities : [raid1]
md0 : active raid1 sda1[0] sdb1[1]
976630336 blocks [2/2] [UU]
unused devices: <none>
Key things to look for:
- md0: the RAID device name.
- active raid1: the array state (active) and the RAID level (raid1).
- [2/2]: the first number is the total number of disks in the array; the second is how many are currently active. For example, [2/2] means both disks are healthy; [2/1] means one disk has failed or is missing.
- [UU]: each U represents an up, healthy disk. If you see [U_] (or [_U]), one disk is missing or faulty.
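The check described above can be scripted. The following is a minimal sketch, not a complete monitoring tool: degraded_arrays is a hypothetical helper name, and it assumes the status map (for example [UU] or [U_]) is the last field on the line that follows each array name, as in the sample output.

```shell
# Print the name of every md array whose status map contains "_",
# i.e. an array with a missing or faulty member.
# Reads mdstat-formatted text on standard input.
degraded_arrays() {
    awk '
        /^md[^ ]* :/ { name = $1 }          # remember the current array name
        name != "" && /\[[U_]+\]/ {         # the line carrying the [UU] map
            if ($NF ~ /_/) print name       # "_" marks a missing member
            name = ""                       # done with this array
        }
    '
}
```

On a live system it would be called as degraded_arrays < /proc/mdstat; no output means no array was reported as degraded.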
During a rebuild or resync, you'll see a progress line like:
[=>...................] recovery = 12.4% (12121212/976630336) finish=30.3min speed=54523K/sec
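If you only want the headline numbers from that line, a small filter can pull them out. This is a sketch under the assumption that the line follows the layout shown above; rebuild_progress is a hypothetical helper name, and it only recognises the recovery and resync operations mentioned here.

```shell
# Reduce mdstat progress lines to "operation percent finish=ETA".
# Reads mdstat-formatted text on standard input; lines without a
# recovery or resync report produce no output.
rebuild_progress() {
    sed -n -E 's/.*(recovery|resync) *= *([0-9.]+)%.*finish=([^ ]+).*/\1 \2% finish=\3/p'
}
```

On a live system it would be called as rebuild_progress < /proc/mdstat. To follow a rebuild as it runs, watch -n 5 cat /proc/mdstat refreshes the full output every five seconds.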
Setting Up Email Alerts for RAID Failures
If you set up your RAID array with mdadm, you can configure it to email you automatically when a disk fails:
1. Install mdadm (if not installed)
On RHEL/AlmaLinux/CentOS:
dnf install mdadm
On Debian/Ubuntu:
apt install mdadm
2. Set the email recipient
Edit (or create) /etc/mdadm.conf (on Debian/Ubuntu the file is /etc/mdadm/mdadm.conf) and add or update the MAILADDR line:
MAILADDR [email protected]
3. Configure the mdadm monitoring service
Enable and start the monitoring service:
systemctl enable --now mdmonitor
4. Test email delivery
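You can test either by forcing a disk failure (be careful in production) or by sending a manual test email. Either way, mdadm hands the message to the local mail system, so a working sendmail-compatible mailer must be installed.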
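The less risky of the two options is a test alert, which mdadm's monitor mode can generate without touching any disk. A minimal sketch, assuming root privileges and a MAILADDR line already in place; the function name send_raid_test_alert is only illustrative:

```shell
# Send one test alert for every array mdadm can find, then exit.
# --scan    : take the array list and MAILADDR from the config file
# --oneshot : check the arrays once instead of running as a daemon
# --test    : generate a TestMessage alert for each array found
send_raid_test_alert() {
    mdadm --monitor --scan --oneshot --test
}
```

Run send_raid_test_alert as root and check the recipient's inbox. If nothing arrives, the local mail log is usually the first place to look.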
Important Best Practices
- Always test your RAID rebuild/recovery plan; don't wait for a failure.
- Keep good backups; RAID is not a replacement for backups.
- Monitor RAID health proactively, not just when things break.
- Use smartctl or vendor tools alongside RAID checks to monitor drive health.
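As an illustration of the last point, smartctl -H prints an overall health verdict that can be checked from a script. This sketch assumes the usual wording of that verdict ("test result: PASSED" for ATA drives, "SMART Health Status: OK" for SCSI/SAS drives); smart_health_ok is a hypothetical helper name, and other drive types or smartctl versions may word the result differently.

```shell
# Succeed (exit status 0) if the smartctl -H output on standard input
# reports a healthy drive, fail otherwise.
smart_health_ok() {
    grep -Eq 'test result: PASSED|SMART Health Status: OK'
}
```

A typical call would be smartctl -H /dev/sda | smart_health_ok || echo "check /dev/sda", run as root for each member disk of the array.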