{"id":25,"date":"2012-01-24T19:30:33","date_gmt":"2012-01-24T19:30:33","guid":{"rendered":"http:\/\/garysieling.com\/blog\/?p=25"},"modified":"2012-01-24T19:30:33","modified_gmt":"2012-01-24T19:30:33","slug":"vps-io-diagnosis","status":"publish","type":"post","link":"https:\/\/www.garysieling.com\/blog\/vps-io-diagnosis\/","title":{"rendered":"Diagnosing Disk I\/O issues in a VPS"},"content":{"rendered":"<p>Every so often, my Linode goes into a state of apparent frantic I\/O. Page loads slow down a bit, and I get regular email alerts indicating a potential problem:<\/p>\n<pre>Subject: Linode Alert - disk io rate\n\nYour Linode, linode90147, has exceeded the notification threshold (800) for disk io rate by averaging 2146.05 for the last 2 hours. The dashboard for this Linode is located at: ...<\/pre>\n<p>This is the first time this has happened since I switched entirely to nginx. My first step was to install sysstat, which provides iostat and sar, to see what was going on.<\/p>\n<pre>\napt-get install sysstat\n<\/pre>\n<p>The initial output of iostat looks like this:<\/p>\n<pre>\navg-cpu:  %user   %nice %system %iowait  %steal   %idle\n           0.75    0.28    0.19    1.21    0.01   97.56\n\nDevice:            tps   Blk_read\/s   Blk_wrtn\/s   Blk_read   Blk_wrtn\nxvda             12.53       168.70        73.94  250258986  109688664\nxvdb             23.49       127.81        86.97  189603528  129015512\n<\/pre>\n<p>The first report from iostat shows averages since boot rather than current rates, so the read\/write rate doesn&#8217;t look nearly as high as Linode is reporting. For continuous reporting at two-second intervals, run the following:<\/p>\n<pre>\niostat -d 2\n<\/pre>\n<p>This showed the read\/write rates running anywhere from 0 to 5500 blocks\/second, i.e. up to about 2.8 MB\/s (512 bytes\/block). Note that xvda is a Xen virtual disk. 
Watching the usage for a while, both disks are busy at about the same time, but most of the writes go to xvdb, which may indicate a lot of data being moved between memory and swap space.<\/p>\n<p>To find out which process(es) are responsible for the disk use, I ran the following:<\/p>\n<pre>\npidstat -d 2 300\n<\/pre>\n<p>This takes 300 I\/O samples at two-second intervals (i.e. for ten minutes). It prints out each sample and an average summary. Running this, I got the following output:<\/p>\n<pre>\nAverage:          PID   kB_rd\/s   kB_wr\/s kB_ccwr\/s  Command\nAverage:          996      0.01      2.31      0.00  kjournald\nAverage:         1930      2.43      0.07      0.00  rsyslogd\nAverage:         1958      0.10      0.00      0.00  atd\nAverage:         1959    109.13     14.20      0.00  cron\nAverage:         1971      0.26      0.00      0.00  memcached\nAverage:         1978     47.54      0.48      0.00  mysqld\nAverage:         2045     26.32      0.11      0.03  munin-node\nAverage:         2131     10.21      0.01      0.00  sendmail-mta\nAverage:         2234      2.62      0.01      0.00  ntpd\nAverage:         2397     10.71      0.00      0.00  fail2ban-server\nAverage:        13689      0.29      0.00      0.00  pidstat\nAverage:        14427      0.23      0.00      0.00  cron\nAverage:        14428      0.18      0.00      0.00  sh\nAverage:        14431      0.01      0.00      0.00  munin-cron\nAverage:        14432      9.95      0.01      0.00  munin-update\nAverage:        14433      6.95      5.61      0.00  munin-update\nAverage:        14434     10.14      0.04      0.01  munin-node\nAverage:        14813      0.04      0.00      0.00  vmstat\nAverage:        14814      0.15      0.00      0.00  vmstat\nAverage:        15685     12.96      0.00      0.00  php-cgi\nAverage:        15686     13.79      0.00      0.00  php-cgi\nAverage:        15687     20.47      0.59      0.15  php-cgi\nAverage:        15688     15.01      0.00      0.00  php-cgi\nAverage:  
      15689     33.72      0.00      0.00  php-cgi\nAverage:        15690     15.53      0.00      0.00  php-cgi\nAverage:        15691      9.60      0.00      0.00  php-cgi\nAverage:        15692     19.13      0.00      0.00  php-cgi\nAverage:        15693     12.64      0.00      0.00  php-cgi\nAverage:        15694     15.76      0.00      0.00  php-cgi\nAverage:        15695     16.30      0.01      0.00  php-cgi\nAverage:        15696     18.60      0.00      0.00  php-cgi\nAverage:        15697     12.65      0.00      0.00  php-cgi\nAverage:        16334     30.81      0.72      0.15  php-cgi\nAverage:        16338     14.99      0.00      0.00  php-cgi\nAverage:        21209      1.21      0.00      0.00  sshd\nAverage:        31579      1.03      0.86      0.79  nginx\nAverage:        31580      0.67      0.03      0.00  nginx\nAverage:        31581      1.11      0.07      0.00  nginx\nAverage:        31582      1.82      0.17      0.00  nginx\n<\/pre>\n<p>There are three areas to research: PHP, MySQL, and cron. I know I added some cron jobs, so I tested that first. To list every user&#8217;s crontab (run as root):<\/p>\n<pre>\n for user in $(cut -f1 -d: \/etc\/passwd); do crontab -u $user -l; done\n<\/pre>\n<p>From this output, I removed two obsolete hourly tasks I had created. For good measure, I also decreased the frequency of man-db lookups from daily to monthly, and removed the apache2 cleanup (no longer used) and popularity-contest jobs. Everything remaining appears to be important to system maintenance. The following is a second performance log, taken after these changes. 
Very little has changed.<\/p>\n<pre>\nAverage:            1      1.22      0.02      0.01  init\nAverage:          996      0.03      3.51      0.00  kjournald\nAverage:         1930      2.74      0.26      0.00  rsyslogd\nAverage:         1959    131.20     14.24      0.00  cron\nAverage:         1971      0.01      0.00      0.00  memcached\nAverage:         1978     78.06      0.63      0.11  mysqld\nAverage:         2045     27.10      0.09      0.03  munin-node\nAverage:         2131     23.92      0.03      0.00  sendmail-mta\nAverage:         2234      2.77      0.00      0.00  ntpd\nAverage:         2397     13.69      0.00      0.00  fail2ban-server\nAverage:        15685     24.56      0.01      0.00  php-cgi\nAverage:        15686     17.49      0.00      0.00  php-cgi\nAverage:        15687     40.37      0.00      0.00  php-cgi\nAverage:        15688     21.09      0.01      0.00  php-cgi\nAverage:        15689     29.78      0.00      0.00  php-cgi\nAverage:        15690     10.94      0.00      0.00  php-cgi\nAverage:        15691     23.94      0.00      0.00  php-cgi\nAverage:        15692     16.95      0.01      0.00  php-cgi\nAverage:        15693      8.02      0.01      0.00  php-cgi\nAverage:        15694      7.47      0.01      0.00  php-cgi\nAverage:        15695     23.33      0.00      0.00  php-cgi\nAverage:        15696      8.47      0.01      0.00  php-cgi\nAverage:        15697     13.05      0.01      0.00  php-cgi\nAverage:        16334     13.13      0.01      0.00  php-cgi\nAverage:        16338     11.27      0.00      0.00  php-cgi\nAverage:        21209      0.60      0.00      0.00  sshd\nAverage:        22998      0.41      0.00      0.00  pidstat\nAverage:        31579      0.51      0.03      0.00  nginx\nAverage:        31580      1.37      3.08      1.43  nginx\nAverage:        31581      1.32      1.55      1.44  nginx\nAverage:        31582      2.55      1.77      0.00  nginx\n<\/pre>\n<p>The php work is a little 
lower, but likely not enough to be significant. Next up: PHP. I had APC working when I was running Apache, but perhaps it&#8217;s not working now, with Nginx as the primary server. <\/p>\n<p>I rebuilt APC from scratch, in case there was a newer version. The lynchpin of this was discovering multiple php.ini files on the VPS. The instructions for building APC are as follows:<\/p>\n<pre>\nwget http:\/\/pecl.php.net\/get\/APC-3.1.9.tgz\ntar -xzf APC-3.1.9.tgz\ncd APC-3.1.9\nphpize\n# --with-php-config takes the php-config script, not php.ini.\n# The path below is an assumption; check yours with: which php-config\n.\/configure --enable-apc --enable-apc-mmap --with-apxs --with-php-config=\/usr\/bin\/php-config\nmake\nmake test\nmake install\n\nvi \/etc\/php5\/cgi\/php.ini\n<\/pre>\n<p>Add this line at the end:<\/p>\n<pre>\nextension=apc.so\n<\/pre>\n<p>Then restart phpd\/php-cgi. E.g. if you installed nginx\/fast_cgi as an init.d service, do something like this:<\/p>\n<pre>\n \/etc\/init.d\/php-fastcgi restart\n<\/pre>\n<p>I re-ran the performance test. PHP activity is pretty much gone. It looks like traffic is lower at the moment as well, but <a href=\"http:\/\/garysieling.com\/blog\/where-is-apc-php-on-ubuntu\">apc.php<\/a> shows about 80% cache hits. For memory&#8217;s sake, it would be nice to share WordPress installations, but this has some significant challenges (e.g. handling upgrades). 
For now, disk use has slowed, so I will leave mysql tuning for another day.<\/p>\n<pre>\nAverage:            8    0.00    0.01    0.00    0.01     -  kworker\/1:0\nAverage:          271    0.00    0.00    0.00    0.00     -  kswapd0\nAverage:          996    0.00    0.00    0.00    0.00     -  kjournald\nAverage:         1730    0.00    0.01    0.00    0.01     -  kworker\/3:1\nAverage:         1864    0.00    0.00    0.00    0.00     -  kworker\/2:1\nAverage:         1930    0.00    0.00    0.00    0.00     -  rsyslogd\nAverage:         1959    0.00    0.00    0.00    0.00     -  cron\nAverage:         1971    0.00    0.00    0.00    0.00     -  memcached\nAverage:         1978    0.19    0.09    0.00    0.28     -  mysqld\nAverage:         2045    0.00    0.00    0.00    0.01     -  munin-node\nAverage:         2131    0.00    0.00    0.00    0.00     -  sendmail-mta\nAverage:         2234    0.00    0.00    0.00    0.01     -  ntpd\nAverage:         2280    0.00    0.00    0.00    0.00     -  flush-202:0\nAverage:         2397    0.02    0.00    0.00    0.02     -  fail2ban-server\nAverage:         8895    0.25    0.02    0.00    0.27     -  php-cgi\nAverage:         8896    0.23    0.03    0.00    0.26     -  php-cgi\nAverage:         8897    0.20    0.02    0.00    0.23     -  php-cgi\nAverage:         8898    3.89    0.01    0.00    3.90     -  php-cgi\nAverage:        10837    0.16    0.41    0.00    0.57     -  pidstat\nAverage:        23460    0.00    0.01    0.00    0.02     -  sshd\nAverage:        23599    0.00    0.01    0.00    0.01     -  kworker\/0:1\nAverage:        25025    0.01    0.02    0.00    0.03     -  nginx\nAverage:        25026    0.00    0.00    0.00    0.01     -  nginx\nAverage:        25027    0.00    0.00    0.00    0.01     -  nginx\nAverage:        25028    0.00    0.00    0.00    0.01     -  nginx\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Every so often, my Linode goes into a state of apparent frantic I\/O. 
Page loads slow down a bit, and I get regular email alerts indicating a potential problem: Subject: Linode Alert &#8211; disk io rate Your Linode, linode90147, has exceeded the notification threshold (800) for disk io rate by averaging 2146.05 for the last &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/www.garysieling.com\/blog\/vps-io-diagnosis\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Diagnosing Disk I\/O issues in a VPS&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[14],"tags":[383,421,432,580],"aioseo_notices":[],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/25"}],"collection":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/comments?post=25"}],"version-history":[{"count":0,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/posts\/25\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/media?parent=25"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/categories?post=25"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.garysieling.com\/blog\/wp-json\/wp\/v2\/tags?post=25"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}