Linux

LVM-related processes hang completely

  • April 28, 2017

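This concerns KVM nodes whose VMs get their storage through LVM, so each VM has its own logical volume. Every night some of the VMs are backed up (snapshot, dd [..] | ssh [..] - nothing special). Last night, however, this somehow broke the LVM system. Two to three minutes after the second backup started, the kernel began logging "hung tasks": in short, it reported three hung qemu-kvm processes and the dd process. At least one VM (a managed server that we monitor) failed. More precisely, it was still running, but its services no longer responded. VNC showed hung tasks inside the VM. After a hard reset (and a migration, see below) the VM was fine, but the dd process never terminated (kill -9 did nothing), and commands such as lvdisplay no longer worked: they returned nothing at all. lvmetad could not be restarted either, and none of the processes belonging to LVM could be killed. They just hang forever in disk-sleep state, while the node otherwise runs fine. The failed VM had to be migrated to another node, because virsh shutdown no longer worked on it either ("Device or resource busy"). The other VMs, however, kept working.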

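We had this problem on another node a few weeks ago, where the VM being snapshotted also failed. We upgraded the kernel there from 4.4 to 4.9 (since we had to reboot the machine anyway) and have not seen the problem on that node again. But since the node that failed today already had two months of uptime, that does not prove the issue is really fixed. So: can anyone read more out of this log than we can? It would be much appreciated. Thanks for reading!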

Apr 28 00:37:15 vnode19 kernel: INFO: task qemu-kvm:32970 blocked for more than 120 seconds.
Apr 28 00:37:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:37:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:37:15 vnode19 kernel: qemu-kvm        D ffff88734767f908     0 32970      1 0x00000080
Apr 28 00:37:15 vnode19 kernel: ffff88734767f908 ffff880166d65900 ffff887048ef0000 ffff887347680000
Apr 28 00:37:15 vnode19 kernel: 0000000000000000 7fffffffffffffff 0000000000000000 ffff88492b5b8a00
Apr 28 00:37:15 vnode19 kernel: ffff88734767f920 ffffffff816b2425 ffff887f7f116cc0 ffff88734767f9d0
Apr 28 00:37:15 vnode19 kernel: Call Trace:
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5137>] schedule_timeout+0x237/0x2d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309826>] ? generic_make_request+0x106/0x1d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b1b96>] io_schedule_timeout+0xa6/0x110
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124b537>] do_blockdev_direct_IO+0xca7/0x2d20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247c40>] ? bd_unlink_disk_holder+0xe0/0xe0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d8a1>] do_readv_writev+0x1f1/0x2b0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d9e9>] vfs_writev+0x39/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120e9e8>] SyS_pwritev+0xb8/0xe0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:37:15 vnode19 kernel: INFO: task qemu-kvm:33655 blocked for more than 120 seconds.
Apr 28 00:37:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:37:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:37:15 vnode19 kernel: qemu-kvm        D ffff886a1dd23908     0 33655      1 0x00000080
Apr 28 00:37:15 vnode19 kernel: ffff886a1dd23908 ffff8875c6e442c0 ffff88582127ac80 ffff886a1dd24000
Apr 28 00:37:15 vnode19 kernel: 0000000000000000 7fffffffffffffff 0000000000000000 ffff886d0d021e00
Apr 28 00:37:15 vnode19 kernel: ffff886a1dd23920 ffffffff816b2425 ffff887f7f496cc0 ffff886a1dd239d0
Apr 28 00:37:15 vnode19 kernel: Call Trace:
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5137>] schedule_timeout+0x237/0x2d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309826>] ? generic_make_request+0x106/0x1d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b1b96>] io_schedule_timeout+0xa6/0x110
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124b537>] do_blockdev_direct_IO+0xca7/0x2d20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247c40>] ? bd_unlink_disk_holder+0xe0/0xe0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d8a1>] do_readv_writev+0x1f1/0x2b0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d9e9>] vfs_writev+0x39/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120e9e8>] SyS_pwritev+0xb8/0xe0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:37:15 vnode19 kernel: INFO: task qemu-kvm:33661 blocked for more than 120 seconds.
Apr 28 00:37:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:37:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:37:15 vnode19 kernel: qemu-kvm        D ffff8855341f3728     0 33661      1 0x00000080
Apr 28 00:37:15 vnode19 kernel: ffff8855341f3728 ffff880166d642c0 ffff886916a4c2c0 ffff8855341f4000
Apr 28 00:37:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:37:15 vnode19 kernel: ffff8855341f3740 ffffffff816b2425 ffff886916a4c2c0 ffff8855341f37d0
Apr 28 00:37:15 vnode19 kernel: Call Trace:
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa06dfdfe>] __origin_write+0x6e/0x210 [dm_snapshot]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff811918ae>] ? mempool_alloc+0x6e/0x170
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa06e0007>] do_origin.isra.14+0x67/0x90 [dm_snapshot]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa06e0092>] origin_map+0x62/0x80 [dm_snapshot]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04acf8a>] __map_bio+0x3a/0x110 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04ae73f>] __split_and_process_bio+0x24f/0x3f0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04ae94a>] dm_make_request+0x6a/0xd0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309826>] generic_make_request+0x106/0x1d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309967>] submit_bio+0x77/0x150
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81300deb>] ? bio_alloc_bioset+0x1ab/0x2d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124ccb7>] do_blockdev_direct_IO+0x2427/0x2d20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120cf59>] __vfs_write+0xc9/0x110
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d5b2>] vfs_write+0xa2/0x1a0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120e537>] SyS_pwrite64+0x87/0xb0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:37:15 vnode19 kernel: INFO: task dmeventd:33781 blocked for more than 120 seconds.
Apr 28 00:37:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:37:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:37:15 vnode19 kernel: dmeventd        D ffff8803493b7af8     0 33781      1 0x00000080
Apr 28 00:37:15 vnode19 kernel: ffff8803493b7af8 ffff880166da1640 ffff880b15a50000 ffff8803493b8000
Apr 28 00:37:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:37:15 vnode19 kernel: ffff8803493b7b10 ffffffff816b2425 ffff880b15a50000 ffff8803493b7b98
Apr 28 00:37:15 vnode19 kernel: Call Trace:
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa06df172>] snapshot_status+0x82/0x1a0 [dm_snapshot]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04b51a6>] retrieve_status+0xa6/0x1b0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04b6363>] table_status+0x63/0xa0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04b6300>] ? dm_get_live_or_inactive_table.isra.3+0x30/0x30 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04b6015>] ctl_ioctl+0x255/0x4d0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81095806>] ? __dequeue_signal+0x106/0x1b0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81095a1b>] ? recalc_sigpending+0x1b/0x50
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04b62a3>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81220872>] do_vfs_ioctl+0x2d2/0x4b0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81220ac9>] SyS_ioctl+0x79/0x90
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:37:15 vnode19 kernel: INFO: task dd:33790 blocked for more than 120 seconds.
Apr 28 00:37:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:37:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:37:15 vnode19 kernel: dd              D ffff885238e1f828     0 33790  33746 0x00000080
Apr 28 00:37:15 vnode19 kernel: ffff885238e1f828 ffff883f77ce42c0 ffff884a64088000 ffff885238e20000
Apr 28 00:37:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:37:15 vnode19 kernel: ffff885238e1f840 ffffffff816b2425 ffff884a64088000 ffff885238e1f8d0
Apr 28 00:37:15 vnode19 kernel: Call Trace:
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa06e0d32>] snapshot_map+0x62/0x390 [dm_snapshot]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04acf8a>] __map_bio+0x3a/0x110 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04ae73f>] __split_and_process_bio+0x24f/0x3f0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04ae94a>] dm_make_request+0x6a/0xd0 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309826>] generic_make_request+0x106/0x1d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81309967>] submit_bio+0x77/0x150
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124d6ba>] mpage_bio_submit+0x2a/0x40
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8124e0b0>] mpage_readpages+0x130/0x160
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff811e0428>] ? alloc_pages_current+0x88/0x120
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247add>] blkdev_readpages+0x1d/0x20
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8119bfbc>] __do_page_cache_readahead+0x19c/0x220
Apr 28 00:37:15 vnode19 kernel: [<ffffffff810b4c39>] ? try_to_wake_up+0x49/0x3d0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8119c175>] ondemand_readahead+0x135/0x260
Apr 28 00:37:15 vnode19 kernel: [<ffffffffa04ae0aa>] ? dm_any_congested+0x4a/0x50 [dm_mod]
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8119c30c>] page_cache_async_readahead+0x6c/0x70
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81190748>] generic_file_read_iter+0x438/0x680
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81215e79>] ? pipe_write+0x3d9/0x430
Apr 28 00:37:15 vnode19 kernel: [<ffffffff81247da7>] blkdev_read_iter+0x37/0x40
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120ce56>] __vfs_read+0xc6/0x100
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120d45f>] vfs_read+0x7f/0x130
Apr 28 00:37:15 vnode19 kernel: [<ffffffff8120e2d5>] SyS_read+0x55/0xc0
Apr 28 00:37:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:39:15 vnode19 kernel: INFO: task qemu-kvm:32970 blocked for more than 120 seconds.
Apr 28 00:39:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:39:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:39:15 vnode19 kernel: qemu-kvm        D ffff88734767f908     0 32970      1 0x00000080
Apr 28 00:39:15 vnode19 kernel: ffff88734767f908 ffff880166d65900 ffff887048ef0000 ffff887347680000
Apr 28 00:39:15 vnode19 kernel: 0000000000000000 7fffffffffffffff 0000000000000000 ffff88492b5b8a00
Apr 28 00:39:15 vnode19 kernel: ffff88734767f920 ffffffff816b2425 ffff887f7f116cc0 ffff88734767f9d0
Apr 28 00:39:15 vnode19 kernel: Call Trace:
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5137>] schedule_timeout+0x237/0x2d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309826>] ? generic_make_request+0x106/0x1d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b1b96>] io_schedule_timeout+0xa6/0x110
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124b537>] do_blockdev_direct_IO+0xca7/0x2d20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247c40>] ? bd_unlink_disk_holder+0xe0/0xe0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d8a1>] do_readv_writev+0x1f1/0x2b0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d9e9>] vfs_writev+0x39/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120e9e8>] SyS_pwritev+0xb8/0xe0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:39:15 vnode19 kernel: INFO: task qemu-kvm:33655 blocked for more than 120 seconds.
Apr 28 00:39:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:39:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:39:15 vnode19 kernel: qemu-kvm        D ffff886a1dd23908     0 33655      1 0x00000080
Apr 28 00:39:15 vnode19 kernel: ffff886a1dd23908 ffff8875c6e442c0 ffff88582127ac80 ffff886a1dd24000
Apr 28 00:39:15 vnode19 kernel: 0000000000000000 7fffffffffffffff 0000000000000000 ffff886d0d021e00
Apr 28 00:39:15 vnode19 kernel: ffff886a1dd23920 ffffffff816b2425 ffff887f7f496cc0 ffff886a1dd239d0
Apr 28 00:39:15 vnode19 kernel: Call Trace:
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5137>] schedule_timeout+0x237/0x2d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309826>] ? generic_make_request+0x106/0x1d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b1b96>] io_schedule_timeout+0xa6/0x110
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124b537>] do_blockdev_direct_IO+0xca7/0x2d20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247c40>] ? bd_unlink_disk_holder+0xe0/0xe0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d8a1>] do_readv_writev+0x1f1/0x2b0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d9e9>] vfs_writev+0x39/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120e9e8>] SyS_pwritev+0xb8/0xe0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:39:15 vnode19 kernel: INFO: task qemu-kvm:33661 blocked for more than 120 seconds.
Apr 28 00:39:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:39:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:39:15 vnode19 kernel: qemu-kvm        D ffff8855341f3728     0 33661      1 0x00000080
Apr 28 00:39:15 vnode19 kernel: ffff8855341f3728 ffff880166d642c0 ffff886916a4c2c0 ffff8855341f4000
Apr 28 00:39:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:39:15 vnode19 kernel: ffff8855341f3740 ffffffff816b2425 ffff886916a4c2c0 ffff8855341f37d0
Apr 28 00:39:15 vnode19 kernel: Call Trace:
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa06dfdfe>] __origin_write+0x6e/0x210 [dm_snapshot]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff811918ae>] ? mempool_alloc+0x6e/0x170
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa06e0007>] do_origin.isra.14+0x67/0x90 [dm_snapshot]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa06e0092>] origin_map+0x62/0x80 [dm_snapshot]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04acf8a>] __map_bio+0x3a/0x110 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04ae73f>] __split_and_process_bio+0x24f/0x3f0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04ae94a>] dm_make_request+0x6a/0xd0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309826>] generic_make_request+0x106/0x1d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309967>] submit_bio+0x77/0x150
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81300deb>] ? bio_alloc_bioset+0x1ab/0x2d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124ccb7>] do_blockdev_direct_IO+0x2427/0x2d20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124d5f3>] __blockdev_direct_IO+0x43/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffff812479d8>] blkdev_direct_IO+0x58/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190a3d>] generic_file_direct_write+0xad/0x170
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190bc2>] __generic_file_write_iter+0xc2/0x1e0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247cd0>] blkdev_write_iter+0x90/0x130
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120cf59>] __vfs_write+0xc9/0x110
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d5b2>] vfs_write+0xa2/0x1a0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120e537>] SyS_pwrite64+0x87/0xb0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:39:15 vnode19 kernel: INFO: task dmeventd:33781 blocked for more than 120 seconds.
Apr 28 00:39:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:39:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:39:15 vnode19 kernel: dmeventd        D ffff8803493b7af8     0 33781      1 0x00000080
Apr 28 00:39:15 vnode19 kernel: ffff8803493b7af8 ffff880166da1640 ffff880b15a50000 ffff8803493b8000
Apr 28 00:39:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:39:15 vnode19 kernel: ffff8803493b7b10 ffffffff816b2425 ffff880b15a50000 ffff8803493b7b98
Apr 28 00:39:15 vnode19 kernel: Call Trace:
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa06df172>] snapshot_status+0x82/0x1a0 [dm_snapshot]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04b51a6>] retrieve_status+0xa6/0x1b0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04b6363>] table_status+0x63/0xa0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04b6300>] ? dm_get_live_or_inactive_table.isra.3+0x30/0x30 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04b6015>] ctl_ioctl+0x255/0x4d0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81095806>] ? __dequeue_signal+0x106/0x1b0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81095a1b>] ? recalc_sigpending+0x1b/0x50
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04b62a3>] dm_ctl_ioctl+0x13/0x20 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81220872>] do_vfs_ioctl+0x2d2/0x4b0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff811308cf>] ? __audit_syscall_entry+0xaf/0x100
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81220ac9>] SyS_ioctl+0x79/0x90
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
Apr 28 00:39:15 vnode19 kernel: INFO: task dd:33790 blocked for more than 120 seconds.
Apr 28 00:39:15 vnode19 kernel:      Not tainted 4.4.51 #1
Apr 28 00:39:15 vnode19 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 28 00:39:15 vnode19 kernel: dd              D ffff885238e1f828     0 33790  33746 0x00000080
Apr 28 00:39:15 vnode19 kernel: ffff885238e1f828 ffff883f77ce42c0 ffff884a64088000 ffff885238e20000
Apr 28 00:39:15 vnode19 kernel: ffff880d40fc8c18 ffff880d40fc8c00 ffffffff00000000 fffffffe00000001
Apr 28 00:39:15 vnode19 kernel: ffff885238e1f840 ffffffff816b2425 ffff884a64088000 ffff885238e1f8d0
Apr 28 00:39:15 vnode19 kernel: Call Trace:
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b2425>] schedule+0x35/0x80
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b4c05>] rwsem_down_write_failed+0x1f5/0x320
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81343233>] call_rwsem_down_write_failed+0x13/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b44ad>] ? down_write+0x2d/0x40
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa06e0d32>] snapshot_map+0x62/0x390 [dm_snapshot]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04acf8a>] __map_bio+0x3a/0x110 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04ae73f>] __split_and_process_bio+0x24f/0x3f0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04ae94a>] dm_make_request+0x6a/0xd0 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309826>] generic_make_request+0x106/0x1d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81309967>] submit_bio+0x77/0x150
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124d6ba>] mpage_bio_submit+0x2a/0x40
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8124e0b0>] mpage_readpages+0x130/0x160
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247290>] ? I_BDEV+0x20/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff811e0428>] ? alloc_pages_current+0x88/0x120
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247add>] blkdev_readpages+0x1d/0x20
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8119bfbc>] __do_page_cache_readahead+0x19c/0x220
Apr 28 00:39:15 vnode19 kernel: [<ffffffff810b4c39>] ? try_to_wake_up+0x49/0x3d0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8119c175>] ondemand_readahead+0x135/0x260
Apr 28 00:39:15 vnode19 kernel: [<ffffffffa04ae0aa>] ? dm_any_congested+0x4a/0x50 [dm_mod]
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8119c30c>] page_cache_async_readahead+0x6c/0x70
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81190748>] generic_file_read_iter+0x438/0x680
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81215e79>] ? pipe_write+0x3d9/0x430
Apr 28 00:39:15 vnode19 kernel: [<ffffffff81247da7>] blkdev_read_iter+0x37/0x40
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120ce56>] __vfs_read+0xc6/0x100
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120d45f>] vfs_read+0x7f/0x130
Apr 28 00:39:15 vnode19 kernel: [<ffffffff8120e2d5>] SyS_read+0x55/0xc0
Apr 28 00:39:15 vnode19 kernel: [<ffffffff816b5fee>] entry_SYSCALL_64_fastpath+0x12/0x71
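
A log like the one above is easier to read once it is reduced to one line per hung task. The following is a minimal sketch, not a definitive tool: it assumes the 4.4-style trace format shown here, and the function name summarize_hung_tasks and the set of "generic wait" frames it skips are choices made for this illustration. For each task it reports the first reliable frame (no leading "?") that is not a scheduler or rwsem wait helper, together with the kernel module if one is named.

```python
import re

# "INFO: task <comm>:<pid> blocked for more than <n> seconds."
TASK_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than \d+ seconds")

# "[<addr>] [? ]func+0xoff/0xsize [module]"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] (\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)

# Frames that only say "the task is sleeping"; this list is an assumption
# based on the traces in this log, not an exhaustive catalogue.
WAIT_FRAMES = {
    "schedule",
    "schedule_timeout",
    "io_schedule_timeout",
    "rwsem_down_write_failed",
    "call_rwsem_down_write_failed",
}


def summarize_hung_tasks(lines):
    """Map (comm, pid) -> (function, module or None) for each hung task."""
    summary = {}
    current = None
    for line in lines:
        task = TASK_RE.search(line)
        if task:
            current = (task.group(1), int(task.group(2)))
            continue
        frame = FRAME_RE.search(line)
        if frame is None or current is None or current in summary:
            continue
        unreliable, func, module = frame.groups()
        if unreliable or func in WAIT_FRAMES:
            continue  # skip "?" frames and generic wait helpers
        summary[current] = (func, module)
    return summary
```

Fed the lines of this log, the sketch would place qemu-kvm 32970 and 33655 in do_blockdev_direct_IO (waiting for direct IO to complete), and qemu-kvm 33661, dmeventd and dd in __origin_write, snapshot_status and snapshot_map of dm_snapshot respectively, each of them waiting to take a lock for writing.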

I assume you have ruled out actual physical disk problems.

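I also assume you have made sure that the host and all of the VMs have no overlapping VG names. That could cause the kind of craziness you describe.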

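What you are seeing sounds like "uninterruptible sleep": the box thinks it is waiting for IO, and nothing will change that. Not even kill -9 will do it. I used to see this with tape backups. More recently I have seen it when doing something dumb, like mounting a VM's LVM on the host and forgetting to unmount it before starting the VM. That's always fun.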
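
To see which processes are currently in uninterruptible sleep without relying on tools that may themselves hang, the state field of /proc/<pid>/stat can be read directly. This is a minimal sketch, assuming a Linux /proc filesystem. The helper names parse_stat and uninterruptible_tasks are made up for this illustration, and it only looks at one entry per process, so individual threads (listed under /proc/<pid>/task) are not covered.

```python
import os


def parse_stat(text):
    """Parse the contents of /proc/<pid>/stat into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ")",
    # so split on the first "(" and the last ")".
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for processes in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError):
            continue  # process exited meanwhile, or entry is unreadable
        if state == "D":
            found.append((pid, comm))
    return sorted(found)
```

Calling uninterruptible_tasks() on an affected node should list the stuck dd and LVM processes, assuming they are still in state D, as the "D" column in the task lines of the log indicates.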

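The most useful tool I have found for situations like the one you describe is dmsetup. It lets you undo LVM by hand. I don't know whether it will get you out of the uninterruptible sleep state.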

Another possibility is that you are on slow disks and the IO really does take longer than 120 seconds.

I use disk files a la qemu-img rather than LVM. I used to use LVM the way you describe, under Xen, and never had any problems that weren't of my own making.

-Dylan

Source: https://serverfault.com/questions/847205