Yoshinori Matsunobu's blog: July 2009

iostat -x is very useful for checking disk i/o activity. It is sometimes said that you should "check that %util is less than 100%" or "check that svctm is less than 50ms", but please do not fully trust these numbers. For example, the following two cases (DBT-2 load on MySQL) used the same disks (two HDDs, RAID1) and both reached almost 100% util, but the performance numbers were very different (no.2 was about twice as fast as no.1).

# iostat -xm 10
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
       21.16    0.00    6.14   29.77    0.00   42.93

Device:         rrqm/s   wrqm/s    r/s    w/s    rMB/s    wMB/s
sdb               2.60   389.01 283.12  47.35     4.86     2.19
avgrq-sz avgqu-sz   await  svctm  %util
43.67     4.89   14.76   3.02  99.83

# iostat -xm 10
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
       40.03    0.00   16.51   16.52    0.00   26.94

Device:         rrqm/s   wrqm/s   r/s   w/s    rMB/s    wMB/s
sdb              6.39   368.53 543.06 490.41     6.71     3.90
avgrq-sz avgqu-sz   await  svctm  %util
21.02     3.29    3.20   0.90  92.66

100% util does not mean the disks cannot go any faster. For example, command queuing (TCQ/NCQ) or a battery-backed write cache can often boost performance significantly. For random i/o oriented applications (most cases), I pay attention to r/s and w/s. According to the man page, r/s is the number of read requests that were issued to the device per second, and w/s is the number of write requests that were issued to the device per second. r/s + w/s is the total number of i/o requests per second (IOPS), so it is easy to check whether the disks are working as expected. For example, a few thousand IOPS can be expected from a single Intel SSD drive.

For sequential i/o, r/s and w/s can be significantly affected by Linux parameters such as max_sectors_kb even when throughput does not change, so in that case I check other iostat columns such as rrqm/s and rMB/s.

What about svctm? Actually, Linux's iostat does not measure svctm directly; it derives it from r/s, w/s and %util. Here is an excerpt from iostat.c.

...
nr_ios = sdev.rd_ios + sdev.wr_ios;
tput   = ((double) nr_ios) * HZ / itv;
util   = ((double) sdev.tot_ticks) / itv * HZ;
svctm  = tput ? util / tput : 0.0;
...
/*       rrq/s wrq/s   r/s   w/s  rsec  wsec   rkB   wkB  rqsz  qusz await svctm %util */
printf(" %6.2f %6.2f %5.2f %5.2f %7.2f %7.2f %8.2f %8.2f %8.2f %8.2f %7.2f %6.2f %6.2f\n",
   ((double) sdev.rd_merges) / itv * HZ,
   ((double) sdev.wr_merges) / itv * HZ,
   ((double) sdev.rd_ios) / itv * HZ,
   ((double) sdev.wr_ios) / itv * HZ,
...

The latter part (the printf arguments) means the following.

r/s = sdev.rd_ios / itv * HZ
w/s = sdev.wr_ios / itv * HZ

The former part (the svctm calculation) means the following.

svctm = util / ((sdev.rd_ios + sdev.wr_ios) * HZ / itv)
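The calculation in the excerpt can be replayed on hand-made counter values. The following is only a minimal sketch, not the real iostat code: the structs and `compute_rates` are hypothetical, `HZ` is assumed to be 100, `itv` is assumed to be the interval length in jiffies, and `tot_ticks` is assumed to count the milliseconds the device was busy (so `util` is busy milliseconds per second and %util is `util / 10`).

```c
#include <assert.h>

#define HZ 100   /* assumed jiffies per second; the real value depends on the system */

/* Hypothetical stand-in for the per-interval counters used in the excerpt.
 * Field names follow the excerpt; the struct itself is made up. */
struct disk_delta {
    unsigned long rd_ios;     /* read requests counted during the interval */
    unsigned long wr_ios;     /* write requests counted during the interval */
    unsigned long tot_ticks;  /* assumed: milliseconds the device was busy */
};

struct disk_rates {
    double r_per_s;       /* r/s */
    double w_per_s;       /* w/s */
    double svctm_ms;      /* svctm in milliseconds */
    double util_percent;  /* %util */
};

/* Same arithmetic as the excerpt; itv is the interval in jiffies. */
struct disk_rates compute_rates(struct disk_delta d, double itv)
{
    struct disk_rates out;

    assert(itv > 0.0);

    double nr_ios = (double) d.rd_ios + (double) d.wr_ios;
    double tput   = nr_ios * HZ / itv;                /* requests per second */
    double util   = (double) d.tot_ticks / itv * HZ;  /* busy ms per second */

    out.r_per_s      = (double) d.rd_ios / itv * HZ;
    out.w_per_s      = (double) d.wr_ios / itv * HZ;
    out.svctm_ms     = tput ? util / tput : 0.0;      /* no requests, no svctm */
    out.util_percent = util / 10.0;                   /* assumed: 1000 ms busy per second = 100% */
    return out;
}
```

For example, with a 10 second interval (`itv` = 1000 under the assumed `HZ`), 2831 reads, 474 writes and 9983 busy milliseconds, this gives roughly 283 r/s, 47 w/s, 99.8 %util and an svctm of about 3 ms, which is close to the first iostat sample in this post. No separate timing of individual requests is involved.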

If %util is 100%, svctm is just 1 / (r/s + w/s) seconds, that is, 1000 / (r/s + w/s) milliseconds. This is the inverse of IOPS. In other words, svctm * (r/s + w/s) is always 1000 if %util is 100%. So checking svctm is practically the same as checking r/s and w/s (as long as %util is close to 100%). The latter (IOPS) is much easier, isn't it?
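This relationship can be checked against the two iostat samples at the top of this post. The helper below is a hypothetical sketch (it is not part of iostat) and assumes that svctm is reported in milliseconds and that X% util means X * 10 busy milliseconds per second.

```c
#include <assert.h>

/* Hypothetical helper, not part of iostat: rebuild svctm (in milliseconds)
 * from nothing but the r/s, w/s and %util columns. */
double svctm_from_columns(double r_per_s, double w_per_s, double util_percent)
{
    double iops = r_per_s + w_per_s;

    assert(util_percent >= 0.0);
    if (iops <= 0.0)
        return 0.0;   /* same guard as iostat: no requests, svctm is 0 */

    /* util_percent% of each second is busy time, i.e. util_percent * 10 ms,
     * shared among iops requests. */
    return util_percent * 10.0 / iops;
}
```

With the numbers from case no.1 (283.12 r/s, 47.35 w/s, 99.83 %util) this returns about 3.02, and with case no.2 (543.06 r/s, 490.41 w/s, 92.66 %util) about 0.90. Both agree with the svctm column iostat printed, so svctm carries no information beyond IOPS and %util.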

Tuesday, July 28, 2009

iostat: (r/s + w/s) * svctm = %util on Linux
