Proxmox kills the VM when RAM usage increases
I don't know whether this Stack Exchange site is the best fit for my question; if it isn't, please flag it so the mods can move it elsewhere.
Background and problem
I installed Proxmox on a physical machine (a home server), chose a ZFS mirror during the installation, and created one big Ubuntu Server VM that takes up almost all of the resources.
Whenever I do heavy disk I/O inside the VM (reading or writing lots of data), RAM usage climbs, and once it reaches 100% the VM crashes. Looking at the output of `htop`, it is the orange portion of the RAM bar that grows, which as far as I know is buffers/cache used by the kernel and should be reclaimable.
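To check whether that memory really is reclaimable, the page cache can be dropped manually inside the VM; a diagnostic sketch (`drop_caches` only discards clean caches, so it is safe, if heavy-handed):

```
# Inside the VM: flush dirty pages first, then drop clean page cache,
# dentries and inodes. If the orange bar in htop shrinks, that memory
# really was reclaimable cache.
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches
free -h
```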
Setup
My current setup is as follows:
- 24 GB of RAM
- 2 × 2 TB hard drives
- Proxmox installed on both drives, with ZFS mirroring selected during installation.
- Ubuntu 20.04 in a VM on Proxmox, taking up 1.5 TB of space and limited to 20 GB of RAM. It uses LVM with ext4, the VirtIO SCSI controller, and mostly Proxmox defaults (the full VM configuration can be dumped as shown below).
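For reference, Proxmox can print a VM's configuration on the host; a quick sketch, assuming the VM ID is 100 (the ID that shows up in the logs further down):

```
# On the Proxmox host: dump the configuration of VM 100
qm config 100
```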
Research
Searching with my favorite search engine didn't really turn up anything like this, but maybe I just wasn't using the right keywords 🤔
I read that poor disk performance on ZFS can be related to an inappropriate `ashift` value. Mine seems to be 12, which as far as I understand it (2^12 = 4096) is right for disks with a 4096-byte sector size; see the code block below. I also tried reducing the RAM allocated to the VM to 16 GB, in case ZFS needed more headroom to handle all the data flowing through, but that only made it crash faster.
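A quick way to cross-check the `ashift` against the drives' physical sector size (both commands run on the host; `-d` tells `lsblk` to skip the partitions):

```
# Physical vs. logical sector size of the pool members
lsblk -d -o NAME,PHY-SEC,LOG-SEC /dev/sda /dev/sdb

# ashift of the pool: 2^12 = 4096 matches the 4K physical sectors
zpool get ashift rpool
```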
As far as I can tell, the zpool seems healthy (see the code block below).
Initially, the Ubuntu installation came with an 8 GB swap file; I disabled it, but that didn't help.
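For completeness, this is roughly how the swap file was disabled (a sketch, assuming Ubuntu 20.04's default `/swap.img` entry; check `/etc/fstab` for the actual path):

```
# Turn off all active swap immediately
sudo swapoff -a

# Comment out the swap file entry so it stays off after a reboot
sudo sed -i '/\/swap.img/ s/^/#/' /etc/fstab
```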
I'm pretty much out of ideas at this point. Feel free to point out obvious mistakes or oversights!
I guess my questions are: why does it crash? Why does RAM usage keep growing?
```
gaugendre@ubuntu:~$ free -h   # the "buff/cache" grows
              total        used        free      shared  buff/cache   available
Mem:           19Gi       1.4Gi       4.7Gi       119Mi        13Gi        17Gi
Swap:            0B          0B          0B

root@pm:~# zpool get all | grep ashift
rpool  ashift  12  local

root@pm:~# fdisk -l
Disk /dev/sda: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: WDC WD20EFRX-68E
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: D4515220-E091-40B3-B6CC-2FE98C69A5F2

Device       Start        End    Sectors  Size Type
/dev/sda1       34       2047       2014 1007K BIOS boot
/dev/sda2     2048    1050623    1048576  512M EFI System
/dev/sda3  1050624 3907029134 3905978511  1.8T Solaris /usr & Apple ZFS

Partition 1 does not start on physical sector boundary.

Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: WDC WD20EFRX-68E
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 1A09CBBB-F15C-47BD-A5E6-BC616AA46EB6

Device       Start        End    Sectors  Size Type
/dev/sdb1       34       2047       2014 1007K BIOS boot
/dev/sdb2     2048    1050623    1048576  512M EFI System
/dev/sdb3  1050624 3907029134 3905978511  1.8T Solaris /usr & Apple ZFS

Partition 1 does not start on physical sector boundary.

Disk /dev/zd0: 1.5 TiB, 1610612736000 bytes, 3145728000 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 8192 bytes
I/O size (minimum/optimal): 8192 bytes / 8192 bytes
Disklabel type: gpt
Disk identifier: 8F028008-A5B1-4D8A-BF51-825BCC73949F

Device        Start        End    Sectors  Size Type
/dev/zd0p1     2048       4095       2048    1M BIOS boot
/dev/zd0p2     4096    2101247    2097152    1G Linux filesystem
/dev/zd0p3  2101248 3145725951 3143624704  1.5T Linux filesystem

root@pm:~# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
rpool  1.81T   157G  1.66T        -         -     0%     8%  1.00x    ONLINE  -

root@pm:~# zpool status -v
  pool: rpool
 state: ONLINE
  scan: none requested
config:

        NAME                                                STATE     READ WRITE CKSUM
        rpool                                               ONLINE       0     0     0
          mirror-0                                          ONLINE       0     0     0
            ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M7KVECCZ-part3  ONLINE       0     0     0
            ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M3FASL3F-part3  ONLINE       0     0     0

errors: No known data errors
```
Proxmox OOM kill log
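These entries come from the host's journal; similar events can be located with something like:

```
# On the host: kernel messages mentioning the OOM killer
journalctl -k | grep -i -E "oom|out of memory"
```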
```
Feb 02 15:51:39 pm kernel: pve-firewall invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Feb 02 15:51:39 pm kernel: CPU: 1 PID: 1496 Comm: pve-firewall Tainted: P O 5.4.73-1-pve #1
Feb 02 15:51:39 pm kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./A320M Pro4-F, BIOS P2.20 07/27/2020
Feb 02 15:51:39 pm kernel: Call Trace:
Feb 02 15:51:39 pm kernel:  dump_stack+0x6d/0x9a
Feb 02 15:51:39 pm kernel:  dump_header+0x4f/0x1e1
Feb 02 15:51:39 pm kernel:  oom_kill_process.cold.33+0xb/0x10
Feb 02 15:51:39 pm kernel:  out_of_memory+0x1ad/0x490
Feb 02 15:51:39 pm kernel:  __alloc_pages_slowpath+0xd40/0xe30
Feb 02 15:51:39 pm kernel:  ? __wake_up_common_lock+0x8c/0xc0
Feb 02 15:51:39 pm kernel:  __alloc_pages_nodemask+0x2df/0x330
Feb 02 15:51:39 pm kernel:  alloc_pages_current+0x81/0xe0
Feb 02 15:51:39 pm kernel:  __page_cache_alloc+0x6a/0xa0
Feb 02 15:51:39 pm kernel:  pagecache_get_page+0xbe/0x2e0
Feb 02 15:51:39 pm kernel:  ? __switch_to_asm+0x34/0x70
Feb 02 15:51:39 pm kernel:  filemap_fault+0x887/0xa70
Feb 02 15:51:39 pm kernel:  ? __switch_to_asm+0x34/0x70
Feb 02 15:51:39 pm kernel:  ? __switch_to_asm+0x40/0x70
Feb 02 15:51:39 pm kernel:  ? __switch_to_asm+0x34/0x70
Feb 02 15:51:39 pm kernel:  ? __switch_to_asm+0x40/0x70
Feb 02 15:51:39 pm kernel:  ? __switch_to_asm+0x34/0x70
Feb 02 15:51:39 pm kernel:  ? __switch_to_asm+0x40/0x70
Feb 02 15:51:39 pm kernel:  ? __switch_to_asm+0x34/0x70
Feb 02 15:51:39 pm kernel:  ? xas_load+0xc/0x80
Feb 02 15:51:39 pm kernel:  ? xas_find+0x17e/0x1b0
Feb 02 15:51:39 pm kernel:  ? filemap_map_pages+0x28d/0x3b0
Feb 02 15:51:39 pm kernel:  __do_fault+0x3c/0x130
Feb 02 15:51:39 pm kernel:  __handle_mm_fault+0xe75/0x12a0
Feb 02 15:51:39 pm kernel:  handle_mm_fault+0xc9/0x1f0
Feb 02 15:51:39 pm kernel:  __do_page_fault+0x233/0x4c0
Feb 02 15:51:39 pm kernel:  do_page_fault+0x2c/0xe0
Feb 02 15:51:39 pm kernel:  page_fault+0x34/0x40
Feb 02 15:51:39 pm kernel: RIP: 0033:0x55e4e4e79b18
Feb 02 15:51:39 pm kernel: Code: Bad RIP value.
Feb 02 15:51:39 pm kernel: RSP: 002b:00007ffdaad052a0 EFLAGS: 00010206
Feb 02 15:51:39 pm kernel: RAX: 0000000000000000 RBX: 000055e4e6577260 RCX: 00007f073b0286f4
Feb 02 15:51:39 pm kernel: RDX: 0000000000000000 RSI: 00007ffdaad05260 RDI: 00007ffdaad05260
Feb 02 15:51:39 pm kernel: RBP: 000055e4e657ca78 R08: 0000000000000e02 R09: 0000000000000602
Feb 02 15:51:39 pm kernel: R10: 000000000517e000 R11: 0000000000000246 R12: 000055e4e89ef7b0
Feb 02 15:51:39 pm kernel: R13: 000055e4e657ca70 R14: 0000000000000000 R15: 0000000000000000
Feb 02 15:51:39 pm kernel: Mem-Info:
Feb 02 15:51:39 pm kernel: active_anon:5184091 inactive_anon:16558 isolated_anon:0 active_file:84 inactive_file:0 isolated_file:0 unevictable:1330 dirty:0 writeback:10 unstable:0 slab_reclaimable:13066 slab_unreclaimable:144860 mapped:14964 shmem:17028 pagetables:11872 bounce:0 free:40335 free_pcp:762 free_cma:0
Feb 02 15:51:39 pm kernel: Node 0 active_anon:20736364kB inactive_anon:66232kB active_file:628kB inactive_file:0kB unevictable:5320kB isolated(anon):0kB isolated(file):0kB mapped:59856kB dirty:0kB writeback:40kB shmem:68112kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 7663616kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
Feb 02 15:51:39 pm kernel: Node 0 DMA free:15884kB min:44kB low:56kB high:68kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15996kB managed:15884kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Feb 02 15:51:39 pm kernel: lowmem_reserve[]: 0 1357 21897 21897 21897
Feb 02 15:51:39 pm kernel: Node 0 DMA32 free:86332kB min:4184kB low:5572kB high:6960kB active_anon:1333188kB inactive_anon:0kB active_file:0kB inactive_file:4kB unevictable:0kB writepending:0kB present:1497008kB managed:1431132kB mlocked:0kB kernel_stack:0kB pagetables:8kB bounce:0kB free_pcp:248kB local_pcp:0kB free_cma:0kB
Feb 02 15:51:39 pm kernel: lowmem_reserve[]: 0 0 20539 20539 20539
Feb 02 15:51:39 pm kernel: Node 0 Normal free:59628kB min:63348kB low:84380kB high:105412kB active_anon:19403176kB inactive_anon:66232kB active_file:0kB inactive_file:400kB unevictable:5320kB writepending:40kB present:21482752kB managed:21040620kB mlocked:5320kB kernel_stack:6304kB pagetables:47480kB bounce:0kB free_pcp:2980kB local_pcp:28kB free_cma:0kB
Feb 02 15:51:39 pm kernel: lowmem_reserve[]: 0 0 0 0 0
Feb 02 15:51:39 pm kernel: Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15884kB
Feb 02 15:51:39 pm kernel: Node 0 DMA32: 3*4kB (UM) 0*8kB 0*16kB 6*32kB (UM) 8*64kB (UM) 70*128kB (UM) 60*256kB (U) 42*512kB (UM) 39*1024kB (U) 0*2048kB 0*4096kB = 86476kB
Feb 02 15:51:39 pm kernel: Node 0 Normal: 5350*4kB (UME) 460*8kB (UME) 177*16kB (UME) 718*32kB (UME) 112*64kB (UM) 9*128kB (U) 4*256kB (UM) 1*512kB (U) 0*1024kB 0*2048kB 0*4096kB = 60744kB
Feb 02 15:51:39 pm kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Feb 02 15:51:39 pm kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Feb 02 15:51:39 pm kernel: 18053 total pagecache pages
Feb 02 15:51:39 pm kernel: 0 pages in swap cache
Feb 02 15:51:39 pm kernel: Swap cache stats: add 0, delete 0, find 0/0
Feb 02 15:51:39 pm kernel: Free swap = 0kB
Feb 02 15:51:39 pm kernel: Total swap = 0kB
Feb 02 15:51:39 pm kernel: 5748939 pages RAM
Feb 02 15:51:39 pm kernel: 0 pages HighMem/MovableOnly
Feb 02 15:51:39 pm kernel: 127030 pages reserved
Feb 02 15:51:39 pm kernel: 0 pages cma reserved
Feb 02 15:51:39 pm kernel: 0 pages hwpoisoned
Feb 02 15:51:39 pm kernel: Tasks state (memory values in pages):
Feb 02 15:51:39 pm kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Feb 02 15:51:39 pm kernel: [    683]     0   683    19753     9110   172032        0             0 systemd-journal
Feb 02 15:51:39 pm kernel: [    755]     0   755     5806      650    69632        0         -1000 systemd-udevd
Feb 02 15:51:39 pm kernel: [   1005]   106  1005     1705      385    57344        0             0 rpcbind
Feb 02 15:51:39 pm kernel: [   1006]   100  1006    23270      573    86016        0             0 systemd-timesyn
Feb 02 15:51:39 pm kernel: [   1020]     0  1020     4878      611    77824        0             0 systemd-logind
Feb 02 15:51:39 pm kernel: [   1021]     0  1021    93118      287    86016        0             0 lxcfs
Feb 02 15:51:39 pm kernel: [   1022]     0  1022    41689      511    81920        0             0 zed
Feb 02 15:51:39 pm kernel: [   1023]   104  1023     2264      444    57344        0          -900 dbus-daemon
Feb 02 15:51:39 pm kernel: [   1024]     0  1024    68958      296    77824        0             0 pve-lxc-syscall
Feb 02 15:51:39 pm kernel: [   1025]     0  1025     1022      326    49152        0             0 qmeventd
Feb 02 15:51:39 pm kernel: [   1026]     0  1026    56455      662    90112        0             0 rsyslogd
Feb 02 15:51:39 pm kernel: [   1029]     0  1029     3064      656    61440        0             0 smartd
Feb 02 15:51:39 pm kernel: [   1030]     0  1030      535      240    40960        0         -1000 watchdog-mux
Feb 02 15:51:39 pm kernel: [   1071]     0  1071     1681      330    49152        0             0 ksmtuned
Feb 02 15:51:39 pm kernel: [   1137]     0  1137      954      288    45056        0             0 lxc-monitord
Feb 02 15:51:39 pm kernel: [   1182]     0  1182     1722       61    49152        0             0 iscsid
Feb 02 15:51:39 pm kernel: [   1187]     0  1187     1848     1257    49152        0           -17 iscsid
Feb 02 15:51:39 pm kernel: [   1212]     0  1212     1402      350    49152        0             0 agetty
Feb 02 15:51:39 pm kernel: [   1230]     0  1230     3962      630    69632        0         -1000 sshd
Feb 02 15:51:39 pm kernel: [   1391]     0  1391   218005      697   208896        0             0 rrdcached
Feb 02 15:51:39 pm kernel: [   1461]     0  1461    10868      651    86016        0             0 master
Feb 02 15:51:39 pm kernel: [   1463]   107  1463    10968      597    86016        0             0 qmgr
Feb 02 15:51:39 pm kernel: [   1465]     0  1465   378386    12276   450560        0             0 pmxcfs
Feb 02 15:51:39 pm kernel: [   1473]     0  1473     2125      565    53248        0             0 cron
Feb 02 15:51:39 pm kernel: [   1496]     0  1496    75361    20384   307200        0             0 pve-firewall
Feb 02 15:51:39 pm kernel: [   1500]     0  1500      568      187    40960        0             0 none
Feb 02 15:51:39 pm kernel: [   1505]     0  1505    74951    19974   307200        0             0 pvestatd
Feb 02 15:51:39 pm kernel: [   1541]     0  1541    88255    29812   405504        0             0 pvedaemon
Feb 02 15:51:39 pm kernel: [   1555]     0  1555    83335    23049   364544        0             0 pve-ha-crm
Feb 02 15:51:39 pm kernel: [   1632]    33  1632    88663    30378   434176        0             0 pveproxy
Feb 02 15:51:39 pm kernel: [   1638]    33  1638    17578    12613   184320        0             0 spiceproxy
Feb 02 15:51:39 pm kernel: [   1640]     0  1640    83246    22949   368640        0             0 pve-ha-lrm
Feb 02 15:51:39 pm kernel: [   1483]     0  1483  5418704  5111805 41844736        0             0 kvm
Feb 02 15:51:39 pm kernel: [  15884]     0 15884    90381    30381   430080        0             0 pvedaemon worke
Feb 02 15:51:39 pm kernel: [  30602]     0 30602    90429    30408   434176        0             0 pvedaemon worke
Feb 02 15:51:39 pm kernel: [  10126]     0 10126    90380    30339   434176        0             0 pvedaemon worke
Feb 02 15:51:39 pm kernel: [  21058]    33 21058    17631    12591   176128        0             0 spiceproxy work
Feb 02 15:51:39 pm kernel: [  21085]     0 21085    21543      371    69632        0             0 pvefw-logger
Feb 02 15:51:39 pm kernel: [  21097]    33 21097    88702    30316   417792        0             0 pveproxy worker
Feb 02 15:51:39 pm kernel: [  21098]    33 21098    88663    30272   417792        0             0 pveproxy worker
Feb 02 15:51:39 pm kernel: [  21099]    33 21099    88702    30278   417792        0             0 pveproxy worker
Feb 02 15:51:39 pm kernel: [   7252]   107  7252    10956      616    86016        0             0 pickup
Feb 02 15:51:39 pm kernel: [  32252]     0 32252     1314      153    53248        0             0 sleep
Feb 02 15:51:39 pm kernel: [   2329]     0  2329     5806      544    65536        0             0 systemd-udevd
Feb 02 15:51:39 pm kernel: [   2330]     0  2330     5806      542    65536        0             0 systemd-udevd
Feb 02 15:51:39 pm kernel: [   2331]     0  2331     5806      542    65536        0             0 systemd-udevd
Feb 02 15:51:39 pm kernel: [   2332]     0  2332     5806      542    65536        0             0 systemd-udevd
Feb 02 15:51:39 pm kernel: [   4195]     0  4195     5806      653    65536        0             0 systemd-udevd
Feb 02 15:51:39 pm kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/qemu.slice/100.scope,task=kvm,pid=1483,uid=0
Feb 02 15:51:39 pm kernel: Out of memory: Killed process 1483 (kvm) total-vm:21674816kB, anon-rss:20445684kB, file-rss:1536kB, shmem-rss:0kB, UID:0 pgtables:40864kB oom_score_adj:0
Feb 02 15:51:39 pm kernel: oom_reaper: reaped process 1483 (kvm), now anon-rss:0kB, file-rss:68kB, shmem-rss:0kB
Feb 02 15:51:40 pm kernel: fwbr100i0: port 2(tap100i0) entered disabled state
Feb 02 15:51:40 pm kernel: fwbr100i0: port 2(tap100i0) entered disabled state
Feb 02 15:51:40 pm systemd[1]: 100.scope: Succeeded.
Feb 02 15:51:43 pm qmeventd[1019]: Starting cleanup for 100
Feb 02 15:51:43 pm kernel: fwbr100i0: port 1(fwln100i0) entered disabled state
Feb 02 15:51:43 pm kernel: vmbr0: port 2(fwpr100p0) entered disabled state
Feb 02 15:51:43 pm kernel: device fwln100i0 left promiscuous mode
Feb 02 15:51:43 pm kernel: fwbr100i0: port 1(fwln100i0) entered disabled state
Feb 02 15:51:43 pm kernel: device fwpr100p0 left promiscuous mode
Feb 02 15:51:43 pm kernel: vmbr0: port 2(fwpr100p0) entered disabled state
Feb 02 15:51:44 pm qmeventd[1019]: Finished cleanup for 100
```
My problem was most likely caused by a combination of factors that drove the host out of memory, at which point the VM was OOM-killed so the host could reclaim memory.
Contributing factors:
- The host has 24 GB of physical RAM, but for some reason it only reports ~22 GB. I haven't investigated this further.
- 20 GB of RAM allocated to a single VM, leaving only about 2 GB for the host and ZFS.
- Heavy RAM usage and disk I/O inside the VM.
- Disk I/O inside the VM causing the ZFS ARC cache to grow without bound.
In the end, the host killed the VM process to reclaim RAM and keep itself running.
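This adds up: by default, OpenZFS on Linux lets the ARC grow to roughly half of physical RAM, which on this host is ~11 GB on top of the 20 GB promised to the VM. The current ARC size and cap can be read from the kernel stats on the host:

```
# Print the current ARC size and its configured maximum in GiB
# (the third field of each matching arcstats line is the value in bytes)
awk '/^size|^c_max/ { printf "%-6s %.1f GiB\n", $1, $3/1024/1024/1024 }' /proc/spl/kstat/zfs/arcstats
```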
I limited the ZFS ARC cache on the host to 4 GB and reduced the RAM allocated to the VM to 12 GB, to give the host some breathing room.
Limiting the ZFS ARC cache (on the host):
```
$ cat /etc/modprobe.d/zfs.conf
options zfs zfs_arc_min=2147483648
options zfs zfs_arc_max=4294967296
```
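The same limits can also be applied at runtime through the module parameters, without waiting for a reboot (the ARC may take a while to shrink down to the new maximum):

```
# On the host, as root; values are in bytes: 2 GiB min, 4 GiB max
echo 2147483648 > /sys/module/zfs/parameters/zfs_arc_min
echo 4294967296 > /sys/module/zfs/parameters/zfs_arc_max
```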
Making the new options take effect (on the host):
```
# update-initramfs -u
```
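Since the root pool itself is ZFS, the module options have to be baked into the initramfs, hence the `update-initramfs` call. After rebooting, the limits can be verified with:

```
# Both values are reported in bytes
cat /sys/module/zfs/parameters/zfs_arc_min
cat /sys/module/zfs/parameters/zfs_arc_max
```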
Thanks to @MichaelHampton for guiding me to this conclusion.