Centos5

High server load, unable to figure out the cause

  • December 19, 2012

My server currently runs CentOS 5.2 with WHM 11.34.

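Currently our load average is between 6.43 and 12. The sites we host take a long time to respond and resolve. top doesn't show anything unusual, and iftop doesn't show much traffic either.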

We have a lot of resellers, and some of them are not good at writing code. How can we find the culprit?

vmstat output

vmstat
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
0  2     84  78684 154916 1021080    0    0    72   274    0   14  6  3 80 12  0

top output (sorted by %CPU)

top - 21:44:43 up 5 days, 10:39,  3 users,  load average: 3.36, 4.18, 4.73
Tasks: 222 total,   3 running, 219 sleeping,   0 stopped,   0 zombie
Cpu(s):  5.8%us,  2.3%sy,  0.2%ni, 79.6%id, 11.8%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:   2074580k total,  1863044k used,   211536k free,   174828k buffers
Swap:  2040212k total,       84k used,  2040128k free,   987604k cached

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
15930 mysql     15   0  138m  46m 4380 S    4  2.3   1:45.87 mysqld
21772 igniteth  17   0 23200 7152 3932 R    4  0.3   0:00.02 php
1586 root      10  -5     0    0    0 S    2  0.0  11:45.19 kjournald
21759 root      15   0  2416 1024  732 R    2  0.0   0:00.01 top
   1 root      15   0  2156  648  560 S    0  0.0   0:26.31 init
   2 root      RT   0     0    0    0 S    0  0.0   0:00.35 migration/0
   3 root      34  19     0    0    0 S    0  0.0   0:00.32 ksoftirqd/0
   4 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/0
   5 root      RT   0     0    0    0 S    0  0.0   0:02.00 migration/1
   6 root      34  19     0    0    0 S    0  0.0   0:00.11 ksoftirqd/1
   7 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/1
   8 root      RT   0     0    0    0 S    0  0.0   0:01.29 migration/2
   9 root      34  19     0    0    0 S    0  0.0   0:00.26 ksoftirqd/2
  10 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/2
  11 root      RT   0     0    0    0 S    0  0.0   0:00.90 migration/3
  12 root      34  19     0    0    0 R    0  0.0   0:00.20 ksoftirqd/3
  13 root      RT   0     0    0    0 S    0  0.0   0:00.00 watchdog/3

top output (sorted by CPU time)

top - 21:46:12 up 5 days, 10:41,  3 users,  load average: 2.88, 3.82, 4.55
Tasks: 217 total,   1 running, 216 sleeping,   0 stopped,   0 zombie
Cpu(s):  3.7%us,  2.0%sy,  2.0%ni, 67.2%id, 25.0%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   2074580k total,  1959516k used,   115064k free,   183116k buffers
Swap:  2040212k total,       84k used,  2040128k free,  1090308k cached

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+    TIME COMMAND
32367 root      16   0  215m 212m 1548 S    0 10.5  62:03.63  62:03 tailwatchd
1586 root      10  -5     0    0    0 S    0  0.0  11:45.27  11:45 kjournald
1576 root      10  -5     0    0    0 S    0  0.0   2:37.86   2:37 kjournald
27722 root      16   0  2556 1184  800 S    0  0.1   1:48.94   1:48 top
15930 mysql     15   0  138m  46m 4380 S    4  2.3   1:48.63   1:48 mysqld
2932 root      34  19     0    0    0 S    0  0.0   1:41.05   1:41 kipmi0
 226 root      10  -5     0    0    0 S    0  0.0   1:34.33   1:34 kswapd0
2671 named     25   0 74688 7400 2116 S    0  0.4   1:23.58   1:23 named
3229 root      15   0 10300 3348 2724 S    0  0.2   0:40.85   0:40 sshd
1580 root      10  -5     0    0    0 S    0  0.0   0:30.62   0:30 kjournald
   1 root      17   0  2156  648  560 S    0  0.0   0:26.32   0:26 init
2616 root      15   0  1816  576  480 S    0  0.0   0:23.50   0:23 syslogd
1584 root      10  -5     0    0    0 S    0  0.0   0:18.67   0:18 kjournald
4342 root      34  19 27692  11m 2116 S    0  0.5   0:18.23   0:18 yum-updatesd
8044 bollingp  15   0  3456 2036  740 S    1  0.1   0:15.56   0:15 imapd
  26 root      10  -5     0    0    0 S    0  0.0   0:14.18   0:14 kblockd/1
7989 gmailsit  16   0  3196 1748  736 S    0  0.1   0:10.43   0:10 imapd

iostat -xtk 1 10 output

[root@server1 tmp]# iostat -xtk 1 10
Linux 2.6.18-53.el5    12/18/2012

Time: 09:51:06 PM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          5.83    0.19    2.53   11.85    0.00   79.60

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               1.37   118.83 18.70 54.27   131.47   692.72    22.59     4.90   67.19   3.10  22.59
sdb               0.35    39.33 20.33 61.43   158.79   403.22    13.75     5.23   63.93   3.77  30.80

Time: 09:51:07 PM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          1.50    0.00    0.50   24.00    0.00   74.00

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00    25.00  2.00  2.00   128.00   108.00   118.00     0.03    7.25   4.00   1.60
sdb               0.00    16.00 41.00 145.00   200.00   668.00     9.33   107.92  272.72   5.38 100.10

Time: 09:51:08 PM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          2.00    0.00    1.50   29.50    0.00   67.00

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00    95.00  3.00 33.00    12.00   480.00    27.33     0.07    1.72   1.31   4.70
sdb               0.00    14.00  1.00 228.00     4.00   960.00     8.42   143.49  568.01   4.37 100.10

Time: 09:51:09 PM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
         13.28    0.00    2.76   21.30    0.00   62.66

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00    21.00  1.00 19.00    16.00   192.00    20.80     0.06    3.55   1.30   2.60
sdb               0.00    36.00 28.00 181.00   124.00   884.00     9.65   121.16  617.31   4.79 100.10

Time: 09:51:10 PM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          4.74    0.00    1.50   25.19    0.00   68.58

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00    20.00  3.00 15.00    12.00   136.00    16.44     0.17    7.11   3.11   5.60
sdb               0.00     0.00 103.00 60.00   544.00   248.00     9.72    52.35  545.23   6.14 100.10

Time: 09:51:11 PM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          1.24    0.00    1.24   25.31    0.00   72.21

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00    75.00  4.00 28.00    16.00   416.00    27.00     0.08    3.72   2.03   6.50
sdb               2.00     9.00 124.00 17.00   616.00   104.00    10.21     3.73  213.73   7.10 100.10

Time: 09:51:12 PM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          1.00    0.00    0.75   24.31    0.00   73.93

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00    24.00  1.00  9.00     4.00   132.00    27.20     0.01    1.20   1.10   1.10
sdb               4.00    40.00 103.00 48.00   528.00   212.00     9.80   105.21  104.32   6.64 100.20

Time: 09:51:13 PM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          2.50    0.00    1.75   23.25    0.00   72.50

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   125.74  3.96 46.53    15.84   689.11    27.92     0.20    4.06   2.41  12.18
sdb               2.97     0.00 91.09 84.16   419.80   471.29    10.17    85.85  590.78   5.66  99.11

Time: 09:51:14 PM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          0.75    0.00    0.50   24.94    0.00   73.82

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00    88.00  1.00  7.00     4.00   380.00    96.00     0.04    4.38   3.00   2.40
sdb               3.00     7.00 111.00 44.00   540.00   208.00     9.65    18.58  581.79   6.46 100.10

Time: 09:51:15 PM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
         11.03    0.00    3.26   26.57    0.00   59.15

Device:         rrqm/s   wrqm/s   r/s   w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00   145.00  7.00 53.00    28.00   792.00    27.33     0.15    2.50   1.55   9.30
sdb               1.00     0.00 155.00  0.00   800.00     0.00    10.32     2.85   18.63   6.46 100.10

[root@server1 tmp]#

MySQL show full processlist

mysql> show full processlist;
+------+---------------+-----------+-----------------------+----------------+------+----------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Id   | User          | Host      | db                    | Command        | Time | State                      | Info                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
+------+---------------+-----------+-----------------------+----------------+------+----------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|    1 | DB_USER_ONE   | localhost | DB_ONE                | Query          |    3 | waiting for handler insert | INSERT DELAYED INTO defers (mailtime,msgid,email,transport_method,message,host,ip,router,deliveryuser,deliverydomain) VALUES(FROM_UNIXTIME('1355879748'),'1TivwL-0003y8-8l','xxxxxxxxxxxxxxxxxxxx@yahoo.com.tw','remote_smtp','SMTP error from remote mail server after initial connection: host mx1.mail.tw.yahoo.com [203.188.197.119]: 421 4.7.0 [TS01] Messages from 75.125.90.146 temporarily deferred due to user complaints - 4.16.55.1; see http://postmaster.yahoo.com/421-ts01.html','mx1.mail.tw.yahoo.com','203.188.197.119','lookuphost','','') |
|    2 | DELAYED       | localhost | DB_ONE                | Delayed insert |   52 | insert                     |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
|    3 | DELAYED       | localhost | DB_ONE                | Delayed insert |   68 | insert                     |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
|  911 | DELAYED       | localhost | DB_ONE                | Delayed insert |   99 | Waiting for INSERT         |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
|  993 | DB_USER_TWO   | localhost | DB_TWO                | Sleep          |  832 |                            | NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
|  994 | DB_USER_ONE   | localhost | DB_ONE                | Query          |  185 | Locked                     | delete from failures where FROM_UNIXTIME(UNIX_TIMESTAMP(NOW())-1296000) > mailtime                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| 1102 | DB_USER_THREE | localhost | DB_THREE              | Query          |   29 | NULL                       | commit                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| 1249 | DB_USER_FOUR  | localhost | DB_FOUR               | Query          |   13 | NULL                       | commit                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| 1263 | root          | localhost | DB_FIVE               | Query          |    0 | NULL                       | show full  processlist                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| 1264 | DB_USER_SIX   | localhost | DB_SIX                | Query          |    3 | NULL                       | commit                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
+------+---------------+-----------+-----------------------+----------------+------+----------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
10 rows in set (0.00 sec)

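It is clear that your disk has hit its limit. Normally %wa (iowait) should be very low (<1% for a typical website), and you want %util (from iostat -x) to be as low as possible (0 is possible). In the iostat samples above, sdb sits at roughly 100% util with await reaching several hundred milliseconds in most samples, while sda is nearly idle.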
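To make saturated devices stand out in a long iostat run, the device rows can be filtered by their %util column. The following is only a sketch: the function name busy_devices and the 90% threshold are made up for illustration, and it assumes the sysstat column layout shown in the question (a "Device:" header row, %util last, await third from the end). Newer sysstat releases order the columns differently.

```shell
#!/bin/sh
# busy_devices THRESHOLD: read `iostat -x` style output on stdin and print
# the name, await and %util of every device row whose %util exceeds THRESHOLD.
# Assumes %util is the last column and await is the third column from the end.
busy_devices() {
    awk -v limit="$1" '
        $1 == "Device:" { in_table = 1; next }   # header row starts a device table
        NF == 0         { in_table = 0; next }   # blank line ends the table
        in_table && $NF + 0 > limit {
            printf "%s await=%s util=%s\n", $1, $(NF - 2), $NF
        }
    '
}

# Usage (hypothetical threshold of 90%):
#   iostat -xtk 1 10 | busy_devices 90
```

A device that shows up in nearly every sample, as sdb would here, is the one to investigate first.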

You can use iotop to find out which process is responsible for all the disk usage.
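iotop relies on per-process I/O accounting in the kernel, and whether the kernel shown in the question provides it is something to verify rather than assume. As a rough substitute, processes stuck in uninterruptible sleep (state D) are usually the ones waiting on the disk. The helper name d_state_procs below is hypothetical; it only filters standard ps output.

```shell
#!/bin/sh
# d_state_procs: read `ps -eo state,pid,user,comm` output on stdin and print
# "pid user command" for every process whose state starts with D
# (uninterruptible sleep, typically a process blocked on disk I/O).
d_state_procs() {
    awk 'NR > 1 && $1 ~ /^D/ { print $2, $3, $4 }'
}

# Sample several times, since a process is often in D state only briefly:
#   for i in 1 2 3 4 5; do ps -eo state,pid,user,comm | d_state_procs; sleep 1; done
#
# With a working iotop, a batch run limited to processes actually doing I/O:
#   iotop -o -b -n 5
```

Kernel threads such as kjournald appearing here usually reflect writes issued by some other process, so the user-owned entries tend to be the more useful lead.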

If it turns out to be mysql, enable the slow query log in my.cnf (and restart mysql). You will then be able to find the specific queries causing it.
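A minimal sketch of what that my.cnf change could look like, printed by a small helper so it can be reviewed before merging. The helper name, log path and 2-second threshold are placeholders, and the option names assume MySQL 5.1 or later; the MySQL version on this server is not stated, and MySQL 5.0 uses a single `log-slow-queries = LOGFILE` line instead.

```shell
#!/bin/sh
# slow_log_fragment LOGFILE SECONDS: print a [mysqld] fragment that enables
# the slow query log. Option names assume MySQL 5.1 or later; on MySQL 5.0
# the equivalent is a single `log-slow-queries = LOGFILE` line.
slow_log_fragment() {
    cat <<EOF
[mysqld]
slow_query_log      = 1
slow_query_log_file = $1
long_query_time     = $2
EOF
}

# Hypothetical path and threshold; review the output, merge it into the
# existing [mysqld] section of my.cnf, then restart mysqld:
#   slow_log_fragment /var/lib/mysql/slow.log 2
#
# After the log has collected entries, summarize the ten slowest by query time:
#   mysqldumpslow -s t -t 10 /var/lib/mysql/slow.log
```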

**Or.** I think your sdb is failing. Try checking the hardware.
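One cheap way to start that hardware check is to look for I/O errors against sdb in the kernel log, then ask the drive for its own SMART verdict. This is a sketch under assumptions: count_io_errors is a made-up helper, smartmontools may not be installed, and if sdb is a volume behind a hardware RAID controller smartctl will need a controller-specific `-d` option.

```shell
#!/bin/sh
# count_io_errors DEVICE: read kernel log text on stdin and print how many
# lines mention both "I/O error" and DEVICE. A nonzero count is only a hint
# that the disk deserves a closer look, not proof of failure.
count_io_errors() {
    # `|| true` keeps the exit status zero when grep -c prints 0
    grep -i 'I/O error' | grep -c -- "$1" || true
}

# Usage:
#   dmesg | count_io_errors sdb
#
# Overall SMART health assessment (requires smartmontools and root):
#   smartctl -H /dev/sdb
```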

Edit: iotop (available from EPEL) is a great tool that shows you which process is causing the iowait.

Source: https://serverfault.com/questions/459217