Cpu-Usage
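How many interrupts is too many?
On an AWS x1.32xlarge instance (128 cores), we are seeing a lot of interrupts per second.
Here are the CPUs with the highest interrupts/sec: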
Interrupts Top CPUs
  CPU0:  140838.0
  CPU1:   77867.0
  CPU4:   66495.0
  CPU6:   59941.0
  CPU3:   39096.0
  CPU2:   31532.0
  CPU7:   30861.0
  CPU5:   26042.0
  CPU8:    4168.0
  CPU12:   3026.0
  CPU10:   2793.0
Here are the highest interrupt rates per source and CPU:
Interrupts above 10k/s
  HYP [Hypervisor callback interrupts] [CPU0] = 46902.0/sec
  49 [xen-percpu-ipi resched0] [CPU0] = 43437.0/sec
  RES [Rescheduling interrupts] [CPU0] = 41512.0/sec
  HYP [Hypervisor callback interrupts] [CPU2] = 26638.0/sec
  HYP [Hypervisor callback interrupts] [CPU8] = 22875.0/sec
  HYP [Hypervisor callback interrupts] [CPU12] = 20813.0/sec
  55 [xen-percpu-ipi resched1] [CPU2] = 20749.0/sec
  RES [Rescheduling interrupts] [CPU2] = 19568.0/sec
  73 [xen-percpu-ipi resched4] [CPU8] = 16400.0/sec
  RES [Rescheduling interrupts] [CPU8] = 15677.0/sec
  HYP [Hypervisor callback interrupts] [CPU6] = 14226.0/sec
  85 [xen-percpu-ipi resched6] [CPU12] = 14060.0/sec
  RES [Rescheduling interrupts] [CPU12] = 13271.0/sec
  HYP [Hypervisor callback interrupts] [CPU14] = 12173.0/sec
  HYP [Hypervisor callback interrupts] [CPU4] = 11887.0/sec
  HYP [Hypervisor callback interrupts] [CPU10] = 10500.0/sec
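This happens when the application running on that machine is under significant load. Network traffic is fairly heavy and there are many threads.
My question is: are 50K/150K interrupts/sec too many? How should we interpret this number? Is there a maximum number of interrupts per second?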
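For anyone who wants to reproduce per-CPU figures like the ones above, here is a minimal Python sketch. It assumes the usual Linux /proc/interrupts layout (a header row of CPU names, then one row per interrupt source with one counter per CPU). The function names parse_interrupts, rates and sample_rates are made up for this illustration and do not come from the tool that produced the listing above.

```python
import time


def parse_interrupts(text):
    """Sum the per-CPU counters in /proc/interrupts-style text.

    Returns a dict such as {"CPU0": 1100, "CPU1": 2050}.
    """
    lines = text.strip().splitlines()
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    totals = dict.fromkeys(cpus, 0)
    for line in lines[1:]:
        _label, _, rest = line.partition(":")
        counts = []
        for field in rest.split():
            # Stop at the first non-numeric token (the description)
            # or once every CPU column has been read.
            if not field.isdigit() or len(counts) == len(cpus):
                break
            counts.append(int(field))
        if len(counts) != len(cpus):
            continue  # rows like ERR/MIS hold a single global counter
        for cpu, count in zip(cpus, counts):
            totals[cpu] += count
    return totals


def rates(before, after, seconds):
    """Interrupts per second for each CPU between two snapshots."""
    return {cpu: (after[cpu] - before.get(cpu, 0)) / seconds for cpu in after}


def sample_rates(interval=1.0, path="/proc/interrupts"):
    """Take two snapshots `interval` seconds apart and return per-CPU rates."""
    with open(path) as f:
        first = parse_interrupts(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_interrupts(f.read())
    return rates(first, second, interval)
```

On a Linux host, sorting the result of sample_rates() by value gives a "top CPUs" list similar to the one above. As a rough yardstick for reading such numbers, 140838 interrupts/sec on a single CPU works out to an average of about 7 microseconds between interrupts (1 s / 140838).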
Update:
Here is a snapshot of the top output:
Tasks: 825 total, 3 running, 822 sleeping, 0 stopped, 0 zombie
Cpu(s): 10.6%us, 3.4%sy, 0.0%ni, 83.6%id, 0.0%wa, 0.0%hi, 2.3%si, 0.0%st
Mem: 2014742856k total, 40059184k used, 1974683672k free, 162036k buffers
Swap: 0k total, 0k used, 0k free, 3159112k cached

  PID USER      PR  NI  VIRT  RES  SHR S   %CPU %MEM     TIME+ COMMAND
32936 ec2-user  20   0 77.3g  11g  29m S 1759.7  0.6   1780:36 java
32118 ec2-user  20   0 64.2g  10g  26m S 1036.9  0.6  62:31.08 java
    3 root      20   0     0    0    0 R   70.4  0.0  14:54.84 ksoftirqd/0
   12 root      20   0     0    0    0 S   21.2  0.0   6:06.47 ksoftirqd/1
   16 root      20   0     0    0    0 S   15.2  0.0   4:33.28 ksoftirqd/2
   20 root      20   0     0    0    0 S   12.2  0.0   3:34.12 ksoftirqd/3
   28 root      20   0     0    0    0 S   11.9  0.0   3:24.96 ksoftirqd/5
   24 root      20   0     0    0    0 S   11.6  0.0   3:26.54 ksoftirqd/4
   32 root      20   0     0    0    0 S   10.2  0.0   3:23.56 ksoftirqd/6
   36 root      20   0     0    0    0 S   10.2  0.0   3:28.80 ksoftirqd/7
Most of the interrupts come from the NIC queues, and those can be spread across other cores: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-cpu-irq.html
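Spreading interrupts across cores is usually done through the /proc/irq/<N>/smp_affinity files, which take a hexadecimal CPU bitmask written as comma-separated 32-bit groups. The Python sketch below only builds such a mask string. The function name cpus_to_affinity_mask and its total_cpus parameter are hypothetical, and whether a given IRQ can be moved at all depends on its type (per-CPU IPIs, for example, are bound to their CPU).

```python
def cpus_to_affinity_mask(cpus, total_cpus=None):
    """Build an smp_affinity-style hex mask for the given CPU numbers.

    The result is a string of comma-separated 32-bit hex groups, most
    significant group first. `total_cpus` (optional) pads the mask to
    cover that many CPUs.
    """
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu  # bit N selects CPU N

    width = max(mask.bit_length(), total_cpus or 0)
    group_count = max(1, (width + 31) // 32)
    groups = [
        f"{(mask >> (32 * i)) & 0xFFFFFFFF:08x}" for i in range(group_count)
    ]
    return ",".join(reversed(groups))
```

For example, cpus_to_affinity_mask([0, 1, 2, 3]) yields "0000000f". Writing that string to an IRQ's smp_affinity file requires root, and a running irqbalance daemon may overwrite it later.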
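Without knowing what your application does and what load it generates, it is impossible to say whether your system has "too many interrupts" going on.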
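You can use top to check the system load value. If it is high, a large share of your CPU load is happening in kernel context. That, in turn, may be a sign of an interrupt storm.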
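The same breakdown that top shows (us, sy, hi, si and so on) can be computed from two readings of /proc/stat. The following is a minimal Python sketch, assuming the standard field order of the aggregate "cpu" line (user, nice, system, idle, iowait, irq, softirq, steal). The function names read_cpu_times and cpu_percentages are illustrative only.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def read_cpu_times(stat_text):
    """Extract the aggregate CPU time counters from /proc/stat-style text."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":  # aggregate line, not cpu0, cpu1, ...
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(before, after):
    """Share of CPU time per category between two snapshots, in percent."""
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("snapshots must be taken at different times")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}
```

A combined system + irq + softirq share that stays high while user time is low would point toward kernel or interrupt overhead. What counts as "high" still depends on the workload.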