Redhat
Finding the root cause of 99.99% iowait with 0% SWAPIN
Users and DBAs are complaining that "Oracle is slow" on our OEL server. From the operating system's point of view, the only thing I have found is some odd IOWAIT statistics in iotop.

Output of iotop:

Total DISK READ: 27.24 M/s | Total DISK WRITE: 2.32 M/s
  TID  PRIO  USER          DISK READ   DISK WRITE  SWAPIN      IO>  COMMAND
10374  be/4  root         190.28 K/s     0.00 B/s  0.00 %  99.99 %  clBackup -child 22862 -j ~jt 202777:7:1 -cn xxxxxx
12844  be/4  xxxxxx         0.00 B/s   303.15 K/s  0.00 %  99.99 %  ora_dbw0_oaprod
14460  be/4  oracleuser   251.55 K/s     0.00 B/s  0.00 %  99.99 %  oracleoaprod (LOCAL=NO)
 6795  be/4  oracleuser  1012.65 K/s     0.00 B/s  0.00 %  99.99 %  oracleoaprod (LOCAL=NO)
 4336  be/4  oracleuser   812.70 K/s     0.00 B/s  0.00 %  99.99 %  oracleoaprod (LOCAL=NO)
17725  be/4  oracleuser   193.50 K/s     0.00 B/s  0.00 %  99.99 %  oracleoaprod (LOCAL=NO)
14456  be/4  oracleuser   109.65 K/s     0.00 B/s  0.00 %  99.99 %  oracleoaprod (LOCAL=NO)
12831  be/4  oracleuser    51.60 K/s     0.00 B/s  0.00 %  99.99 %  oracleoaprod (LOCAL=NO)
 9756  be/4  oracleuser    83.85 K/s     0.00 B/s  0.00 %  99.99 %  oracleoaprod (LOCAL=NO)
24916  be/4  oracleuser  1128.75 K/s     0.00 B/s  0.00 %  99.99 %  oracleoaprod (LOCAL=NO)
19701  be/4  oracleuser   361.20 K/s     0.00 B/s  0.00 %  99.99 %  oracleoaprod (LOCAL=NO)
27920  be/4  oracleuser   432.15 K/s     0.00 B/s  0.00 %  99.99 %  oracleoaprod (LOCAL=NO)
16132  be/4  oracleuser    90.30 K/s     0.00 B/s  0.00 %  99.99 %  oracleoaprod (LOCAL=NO)
27967  be/4  oracleuser    64.50 K/s     0.00 B/s  0.00 %  97.87 %  oracleoaprod (LOCAL=NO)
16615  be/4  oracleuser    64.50 K/s     0.00 B/s  0.00 %  97.17 %  oracleoaprod (LOCAL=NO)
 4465  be/4  oracleuser     7.46 M/s     0.00 B/s  0.00 %  97.15 %  oracleoaprod (LOCAL=NO)
28044  be/4  oracleuser    14.51 M/s     0.00 B/s  0.00 %  97.02 %  oracleoaprod (DESCRIPTION~(ADDRESS=(PROTOCOL=beq)))
32283  be/4  oracleuser    77.40 K/s     0.00 B/s  0.00 %  95.48 %  oracleoaprod (LOCAL=NO)
12851  be/4  oracleuser    19.35 K/s   590.18 K/s  0.00 %  91.77 %  ora_lgwr_oaprod
12846  be/4  oracleuser     0.00 B/s  1077.15 K/s  0.00 %  91.41 %  ora_dbw1_oaprod
23153  be/4  oracleuser    96.75 K/s     0.00 B/s  0.00 %  72.37 %  oracleoaprod (LOCAL=NO)
27710  be/4  oracleuser    19.35 K/s     0.00 B/s  0.00 %  41.50 %  oracleoaprod (LOCAL=NO)
25775  be/4  oracleuser    51.60 K/s     0.00 B/s  0.00 %  30.11 %  oracleoaprod (LOCAL=NO)
13323  be/4  oracleuser    19.35 K/s    51.60 K/s  0.00 %  21.98 %  oracleoaprod (LOCAL=NO)
24345  be/4  oracleuser    12.90 K/s     0.00 B/s  0.00 %  19.34 %  oracleoaprod (LOCAL=NO)
12853  be/4  oracleuser     0.00 B/s    38.70 K/s  0.00 %  11.72 %  ora_ckpt_oaprod
 7234  be/4  oracleuser     6.45 K/s     0.00 B/s  0.00 %   7.52 %  oracleoaprod (LOCAL=NO)
17820  be/4  apps           0.00 B/s     9.68 K/s  0.00 %   0.00 %  rwrun P_CONC_REQUEST_ID=8~2211170.out desformat=XML
20562  be/4  apps           0.00 B/s     3.23 K/s  0.00 %   0.00 %  java -DCLIENT_PROCESSID=2~.GSMSvcComponentContainer
 5849  be/4  apps           3.23 K/s     0.00 B/s  0.00 %   0.00 %  FNDLIBR
 7232  be/4  apps           0.00 B/s     3.23 K/s  0.00 %   0.00 %  RVCTP
    1  be/4  root           0.00 B/s     0.00 B/s  0.00 %   0.00 %  init [5]
    2  rt/3  root           0.00 B/s     0.00 B/s  0.00 %   0.00 %  [migration/0]
    3  be/7  root           0.00 B/s     0.00 B/s  0.00 %   0.00 %  [ksoftirqd/0]
    4  rt/3  root           0.00 B/s     0.00 B/s  0.00 %   0.00 %  [watchdog/0]
    5  rt/3  root           0.00 B/s     0.00 B/s  0.00 %   0.00 %  [migration/1]
    6  be/7  root           0.00 B/s     0.00 B/s  0.00 %   0.00 %  [ksoftirqd/1]
    7  rt/3  root           0.00 B/s     0.00 B/s  0.00 %   0.00 %  [watchdog/1]
    8  rt/3  root           0.00 B/s     0.00 B/s  0.00 %   0.00 %  [migration/2]
    9  be/7  root           0.00 B/s     0.00 B/s  0.00 %   0.00 %  [ksoftirqd/2]
   10  rt/3  root           0.00 B/s     0.00 B/s  0.00 %   0.00 %  [watchdog/2]
   11  rt/3  root           0.00 B/s     0.00 B/s  0.00 %   0.00 %  [migration/3]
   12  be/7  root           0.00 B/s     0.00 B/s  0.00 %   0.00 %  [ksoftirqd/3]
   13  rt/3  root           0.00 B/s     0.00 B/s  0.00 %   0.00 %  [watchdog/3]
   14  rt/3  root           0.00 B/s     0.00 B/s  0.00 %   0.00 %  [migration/4]
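If it helps to tally output like this instead of reading it row by row, here is a minimal Python sketch. It assumes rows in the same column layout as the iotop output above (for example captured with iotop's batch mode, `iotop -b -n 1`) and assumes the K/M/G suffixes are 1024-based. `parse_iotop` and `summarize` are hypothetical helper names, not part of iotop. Keep in mind that the IO> column is per thread (the share of time that thread spent waiting on I/O), so many rows at 99.99 % do not by themselves mean the whole machine is 99.99 % in iowait.

```python
import re

# One iotop row: TID PRIO USER <read> <unit>/s <write> <unit>/s <swapin> % <io> % COMMAND
ROW = re.compile(
    r"^\s*(?P<tid>\d+)\s+(?P<prio>\S+)\s+(?P<user>\S+)\s+"
    r"(?P<read>[\d.]+)\s+(?P<read_unit>[BKMGT])/s\s+"
    r"(?P<write>[\d.]+)\s+(?P<write_unit>[BKMGT])/s\s+"
    r"(?P<swapin>[\d.]+)\s+%\s+(?P<io>[\d.]+)\s+%\s+(?P<command>.*)$"
)

# Assumption: iotop's size suffixes are binary (1024-based).
SCALE = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_iotop(text):
    """Return one dict per thread row; header and 'Total' lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue
        rows.append({
            "tid": int(m["tid"]),
            "user": m["user"],
            "read_bps": float(m["read"]) * SCALE[m["read_unit"]],
            "write_bps": float(m["write"]) * SCALE[m["write_unit"]],
            "swapin": float(m["swapin"]),
            "io": float(m["io"]),
            "command": m["command"].strip(),
        })
    return rows


def summarize(rows, io_threshold=90.0):
    """Count threads at or above io_threshold in the IO> column and sum throughput."""
    blocked = [r for r in rows if r["io"] >= io_threshold]
    return {
        "threads": len(rows),
        "blocked": len(blocked),
        "read_bps": sum(r["read_bps"] for r in rows),
        "write_bps": sum(r["write_bps"] for r in rows),
    }
```

Feeding the captured text to `summarize(parse_iotop(text))` gives a quick count of how many threads spend most of their time waiting on I/O and how much throughput they actually achieve while doing so.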
Output of sar:

# sar 1 7
Linux 2.6.18-371.3.1.0.1.el5    01/16/2014

10:13:41 AM  CPU  %user  %nice  %system  %iowait  %steal  %idle
10:13:42 AM  all  65.32   0.00     2.56    22.08    0.00  10.04
10:13:43 AM  all  65.94   0.00     2.50    23.02    0.00   8.55
10:13:44 AM  all  65.15   0.00     2.06    24.17    0.00   8.62
10:13:45 AM  all  62.16   0.00     2.06    26.06    0.00   9.73
10:13:46 AM  all  54.00   0.00     1.81    31.96    0.00  12.23
10:13:47 AM  all  51.03   0.00     1.62    35.17    0.00  12.18
10:13:48 AM  all  51.97   0.00     1.25    27.61    0.00  19.18
Average:     all  59.37   0.00     1.98    27.15    0.00  11.50
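As a quick sanity check on the sar figures, this small Python sketch recomputes the averages from the seven samples. The numbers are copied from the sar output above; `column_means` is just an illustrative helper name. The point is that system-wide %iowait (CPU time spent idle while I/O was outstanding) averages around 27 %, which is a different measure from the per-thread 99.99 % that iotop reports.

```python
from statistics import mean

# (%user, %iowait, %idle) for each one-second sample in the sar output
SAMPLES = [
    (65.32, 22.08, 10.04),
    (65.94, 23.02, 8.55),
    (65.15, 24.17, 8.62),
    (62.16, 26.06, 9.73),
    (54.00, 31.96, 12.23),
    (51.03, 35.17, 12.18),
    (51.97, 27.61, 19.18),
]


def column_means(samples):
    """Average each column across all samples, rounded to two decimals like sar."""
    return tuple(round(mean(col), 2) for col in zip(*samples))


user, iowait, idle = column_means(SAMPLES)
print(f"%user={user} %iowait={iowait} %idle={idle}")
```

The result agrees with sar's own "Average:" line (59.37, 27.15 and 11.50).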
All disks come from the NetApp except LogVol00:

Filesystem                       Size  Used  Avail  Use%  Mounted on
/dev/mapper/VolGroup01-LogVol00   97G   76G    17G   83%  /
/dev/cciss/c0d0p1                 99M   32M    63M   34%  /boot
tmpfs                            127G  500M   126G    1%  /dev/shm
/dev/mapper/mpath4p1             5.4T  3.2T   2.0T   62%  /oracle/x1
/dev/mapper/mpath6p1             6.3T  4.3T   1.7T   72%  /oracle/x2
/dev/mapper/mpath1p1             184G  188M   174G    1%  /oracle/x1/db/apps_st/redo
/dev/mapper/mpath2p1             184G  188M   174G    1%  /oracle/x1/db/apps_st/redo02
I think this is simply a lack of available IOPS. For large database servers I always recommend SSD storage or a larger SAS array (preferably local storage).
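The "not enough IOPS" guess can be checked rather than assumed by measuring how many I/O operations per second each block device actually completes and comparing that with what the storage is expected to deliver. Below is a rough Python sketch that reads two snapshots of `/proc/diskstats` and takes the difference. The function names are hypothetical, and it assumes the usual layout where each whole-disk line has at least 14 fields (major, minor, name, reads completed, ..., writes completed, ...). On older kernels such as the 2.6.18 one shown in the question, partition lines are shorter and are simply skipped here.

```python
import time


def parse_diskstats(text):
    """Return {device: (reads_completed, writes_completed)} from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # short partition lines on old kernels, or blank lines
        # fields[2] = device name, fields[3] = reads completed, fields[7] = writes completed
        stats[fields[2]] = (int(fields[3]), int(fields[7]))
    return stats


def iops(before, after, interval_s):
    """Per-device (read IOPS, write IOPS) between two parsed snapshots."""
    result = {}
    for dev, (reads_after, writes_after) in after.items():
        if dev not in before:
            continue
        reads_before, writes_before = before[dev]
        result[dev] = (
            (reads_after - reads_before) / interval_s,
            (writes_after - writes_before) / interval_s,
        )
    return result


def sample_iops(interval_s=5.0, path="/proc/diskstats"):
    """Take two snapshots interval_s apart on a live Linux system."""
    with open(path) as fh:
        before = parse_diskstats(fh.read())
    time.sleep(interval_s)
    with open(path) as fh:
        after = parse_diskstats(fh.read())
    return iops(before, after, interval_s)
```

If the multipath devices behind /oracle/x1 and /oracle/x2 sit at a roughly constant IOPS figure while the Oracle processes keep waiting, that would support the IOPS-bound explanation. If they are far below what the array should handle, the bottleneck is more likely elsewhere in the path.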