Symptoms
Websites may seem sporadically slow even though services report no errors or failures and CPU and memory usage are not elevated. The only performance metric that appears significantly elevated is Disk I/O.
Description
The standard method of identifying high Disk I/O is the sar command. The "%iowait" column shows the percentage of time the CPU was idle while waiting on outstanding Disk I/O. Expected values are below roughly 1.00 to 2.00. Occasional spikes in Disk I/O are expected during planned disk-heavy operations (such as nightly backups). Sustained spikes, or spikes outside of those planned operations, would be considered unusual.
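As a minimal sketch of the check described above, the following helper filters "sar -u" output down to the samples whose "%iowait" exceeds a threshold. The function name high_iowait and the default threshold of 2.00 are assumptions made for illustration; the column is located from the header row because the time column may or may not include AM/PM depending on locale.

```shell
# Print sar CPU samples whose %iowait exceeds a threshold (default 2.00).
# Reads `sar -u` output on stdin. high_iowait is a hypothetical helper name.
high_iowait() {
    awk -v limit="${1:-2.00}" '
        # Header row: remember which column holds %iowait.
        /%iowait/ { for (i = 1; i <= NF; i++) if ($i == "%iowait") col = i; next }
        # Skip summary rows; their columns may not line up with the header.
        /^Average/ { next }
        # Data row: print it when the %iowait value is above the limit.
        col && NF >= col && ($col + 0) > (limit + 0) { print $0 }
    '
}

# Typical use (requires the sysstat package):
#   sar -u | high_iowait 2.00
```

Lines printed by the helper are the sampling intervals worth comparing against known disk-heavy jobs such as backups.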
Workaround
One of the few reliable methods of identifying the source of high Disk I/O is to be present on the system while that high Disk I/O is occurring. You can use tools such as the "iostat" and "iotop" commands. If they are not installed, both are available via yum: "iostat" is part of the "sysstat" RPM, and "iotop" is provided by its own "iotop" RPM.
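Both tools can also be run non-interactively for a few samples so the output can be saved and reviewed afterwards. The sketch below assumes both commands are installed and that it is run as root; the function name io_snapshot and its default interval and sample count are hypothetical.

```shell
# Take a short, non-interactive snapshot of disk activity.
# io_snapshot is a hypothetical helper; run as root.
io_snapshot() {
    interval="${1:-5}"   # seconds between samples
    count="${2:-3}"      # number of samples
    for tool in iostat iotop; do
        command -v "$tool" >/dev/null 2>&1 || { echo "$tool not found" >&2; return 1; }
    done
    # -d: device report, -x: extended statistics per device
    iostat -dx "$interval" "$count"
    # -o: only processes doing I/O, -b: batch mode, -n: iterations, -d: delay
    iotop -o -b -n "$count" -d "$interval"
}

# Typical use:
#   io_snapshot 5 3 > /root/io_snapshot.txt
```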
"iostat" gives you live statistics of Disk I/O to identify if high I/O is actively occurring and on what disks.
"iotop" is similar to the standard "top" command and will show you which process(es) are reading/writing to the disks in near real-time.
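In one example, "iotop" was used to identify that processes with names similar to "[jbd2/md4-8]" were spiking I/O. jbd2 is the kernel's filesystem journaling thread (used by ext4), and the "md4" portion of the name indicates that the filesystem resides on a Software RAID (md) device.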
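To follow the thread name back to a device, the name can be split into its parts. This sketch assumes the usual "jbd2/<device>-<journal inode>" naming pattern; the function name jbd2_device is hypothetical.

```shell
# Extract the block device name from a jbd2 thread name such as "[jbd2/md4-8]".
# jbd2_device is a hypothetical helper; assumes the form jbd2/<device>-<number>.
jbd2_device() {
    name=${1#\[}          # drop a leading "[" if present
    name=${name%\]}       # drop a trailing "]" if present
    name=${name#jbd2/}    # drop the "jbd2/" prefix
    printf '%s\n' "${name%-*}"   # drop the trailing "-<number>"
}

# Typical use:
#   dev=$(jbd2_device '[jbd2/md4-8]')
#   [ -d "/sys/block/$dev/md" ] && echo "$dev is a Software RAID device"
```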
From there, the status of the Software RAID was checked by reviewing the contents of /proc/mdstat. In the example case, each 2-drive RAID mirror was running with only one member: "[2/1]" means one of two expected devices is active, and "[U_]" marks the failed or missing drive with an underscore. Any failures or issues with Software RAID configurations can substantially affect Disk I/O.
root@server # cat /proc/mdstat
Personalities : [raid1]
md4 : active raid1 sda4[0]
      1645260736 blocks [2/1] [U_]
      bitmap: 13/13 pages [52KB], 65536KB chunk

md2 : active raid1 sda2[0]
      307198912 blocks [2/1] [U_]
      bitmap: 3/3 pages [12KB], 65536KB chunk

unused devices: <none>
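The same check can be scripted. The sketch below reads text in /proc/mdstat format and prints each array whose member-status string contains an underscore. The function name degraded_arrays is hypothetical, and the parsing assumes the status string (for example "[U_]") is the last field on the "blocks" line, as in the output above.

```shell
# Report md arrays whose member-status string (e.g. [U_]) shows a missing device.
# Reads /proc/mdstat-formatted text on stdin. degraded_arrays is a hypothetical helper.
degraded_arrays() {
    awk '
        # Array header, e.g. "md4 : active raid1 sda4[0]"
        $1 ~ /^md/ && $2 == ":" { name = $1; next }
        # Status line, e.g. "1645260736 blocks [2/1] [U_]"
        name != "" && $NF ~ /^\[[U_]+\]$/ {
            if ($NF ~ /_/) print name, "degraded", $(NF - 1)
            name = ""
        }
    '
}

# Typical use:
#   degraded_arrays < /proc/mdstat
```

Run against the example output above, this reports both md4 and md2 as degraded with "[2/1]"; a healthy mirror showing "[2/2] [UU]" produces no output.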
As cPanel & WHM does not configure, deploy, or manage Software RAID configurations, the next steps would be to seek a system administrator's guidance to investigate and repair the Software RAID configuration.
If you do not have one, please see System Administration Services.
Other articles that are helpful for I/O wait troubleshooting are: