ZFS: Does anyone know how to roll a zvol back in time?

March 11, 2021

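I have a ZFS-based iSCSI SAN that serves ZVOLs to a group of VM servers. Today a network failure caused every iSCSI volume mounted on the clients to go read-only (RO). The only way to recover was to shut all the clients down and restart them, which usually involves running fsck to bring the iSCSI volumes back online. Well, fsck decided to completely destroy one of those volumes, and it looks like I cannot repair the mess fsck made.
I have read a lot about recovering files on ZFS, but in this case I am dealing with a ZVOL. That is in some sense much simpler, yet I have not seen anything that deals with rolling back a block device. Any suggestions?
-TIA-
Some dataset details: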
Dataset zpool1/vm3 [ZVOL], ID 59, cr_txg 12078, 44.6G, 2 objects, rootbp DVA[0]=<6:6c2c4b1e00:200> DVA[1]=<7:487aa4b200:200> [L0 DMU objset] fletcher4 lz4 LE contiguous unique double size=800L/200P birth=7736596L/7736596P fill=2 cksum=4c78779ec:2049fb2de6c:6f2f6c4a44e9:1042484aee3ded

   Deadlist: 1K (512/512 comp)

mintxg 0 -> obj 48
mintxg 1 -> obj 4157

   Object  lvl   iblk   dblk  dsize  lsize   %full  type
        0    7    16K    16K  7.00K    16K    6.25  DMU dnode
       dnode flags: USED_BYTES
       dnode maxblkid: 0

   Object  lvl   iblk   dblk  dsize  lsize   %full  type
        1    5    16K     8K  44.6G   200G   36.45  zvol object
       dnode flags: USED_BYTES
       dnode maxblkid: 26214399

   Object  lvl   iblk   dblk  dsize  lsize   %full  type
        2    1    16K    512      0    512  100.00  zvol prop
       dnode flags: USED_BYTES
       dnode maxblkid: 0
       microzap: 512 bytes, 1 entries

               size = 214748364800
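
As a quick sanity check on the zdb output above, the zvol's logical size can be recomputed from the data block size and the highest block id. This is only a sketch in Python 3. The variable names are made up for illustration and the numbers are copied from the dump above.

```python
# Values copied from the zdb dump above; the variable names are illustrative.
dblk = 8 * 1024            # "dblk 8K": data block size of the zvol object
maxblkid = 26214399        # "dnode maxblkid" of the zvol object (object 1)
size_prop = 214748364800   # "size" stored in the zvol prop object (object 2)

total_blocks = maxblkid + 1          # block ids start at 0
logical_size = total_blocks * dblk   # bytes addressable through the zvol
logical_gib = logical_size / 2**30

print("blocks:", total_blocks)
print("logical size in bytes:", logical_size)
print("logical size in GiB:", logical_gib)
print("matches size property:", logical_size == size_prop)
```

The result agrees with the 200G shown in the lsize column, so the zvol object itself still describes the full 200 GiB device.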
The system is CentOS 7.1:
Linux san1srvp01 3.10.0-514.6.1.el7.x86_64 #1 SMP Wed Jan 18 13:06:36 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
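There are no relevant snapshots; I suppose that goes without saying.
The reason I ask, and what I am after, relates to this article by Max Burning, which digs into object recovery using forensic techniques. That of course relies on knowledge of ZFS internals. However, most of what I have seen deals with walking back file objects, which is very different from the objects that implement raw block storage, and I have seen almost nothing on the internal structures related to ZVOLs.
Even if I cannot technically "roll back" the changes fsck made, it would help to at least go back and find some of the key original blocks. Given ZFS's COW (copy-on-write) behavior this should be possible. What I lack is sufficient knowledge, but I usually do not let that stop me.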

Yes, this can be done. It will be messy.

https://gist.github.com/jshoward/5685757

The link is now dead, so the original file is included here.

zfs_revert-0.1.py


#!/usr/bin/python
# -*- coding: utf-8 -*-

#Script for reverting ZFS changes by destroying uberblocks
#Author: Martin Vool
#E-mail: mardicas@gmail.com
#Version: 0.1
#Date: 16 November 2009


import time
import subprocess
import sys
import os
#Default blocksize
bs=512
#default total blocks (sorry programming in estonian :-/)
suurus=None

if len(sys.argv) > 2:
   for arg in sys.argv:
       arg=arg.split('=')
       if len(arg) == 1:
           file=arg[0]
       elif arg[0] == '-bs':
           bs=int(arg[1])
       elif arg[0] == '-tb':
           suurus=int(arg[1])
else:
   print 'Usage: zfs_revert.py [-bs=n default:n=512 blocksize] \n [-tb=n total block size in blocks] [file/device] You have to set -tb'
   exit(1)
print int(bs)
if suurus == None:
   print 'Total block size in blocks is undefined'
   exit(1)
#make solaris use gnu grep.
if os.uname()[0] == 'SunOS':
   grep_cmd='ggrep'
else:
   grep_cmd='grep'


#to format program output
def formatstd(inp):
   inp=inp.split('\n')
   ret=[]
   for line in inp:
       columns=line.split(' ')
       nc=[]
       for c in columns:
           if c != '':
               nc.append(c)
       ret.append(nc)
   return ret


#read blocks from beginning(64mb)
a_count=(256*bs)
#read blocks from end (64mb)
l_skip=suurus-(256*bs)


print 'Total of %s blocks'%suurus
print 'Reading from the beginning to %s blocks'%a_count
print 'Reading from %s blocks to the end'%l_skip

#get the uberblocks from the beginning and end
yberblocks_a=formatstd(subprocess.Popen('sync && dd bs=%s if=%s count=%s | od -A x -x | %s -A 2 "b10c 00ba" | %s -v "\-\-"'%(bs,file, a_count,grep_cmd,grep_cmd), shell=True, stdout=subprocess.PIPE).communicate()[0])
yberblocks_l=formatstd(subprocess.Popen('sync && dd bs=%s if=%s skip=%s | od -A x -x | %s -A 2 "b10c 00ba" | %s -v "\-\-"'%(bs,file, l_skip,grep_cmd,grep_cmd), shell=True, stdout=subprocess.PIPE).communicate()[0])


yberblocks=[]

for p in yberblocks_a:
   if len(p) > 0:
       #format the hex address to decimal so dd would eat it.
       p[0]=(int(p[0], 16)/bs)
       yberblocks.append(p)

for p in yberblocks_l:
   if len(p) > 0:
       #format the hex address to decimal so dd would eat it and add the skipped part.
       p[0]=((int(p[0], 16)/bs)+int(l_skip)) #we have to add the offset of the place we skipped to so the addresses would match.
       yberblocks.append(p)
print '----'
#here will be kept the output that you will see later (TXG, timestamp and the addresses, should be 4, might be less)
koik={}
i=0
for p in yberblocks:
   if len(p) > 0:
       if i == 0:#the first output line
           address=p[0]
       elif i == 1:#second output line
           #this is the output of od that is in hex and needs to be reversed
           txg=int(p[4]+p[3]+p[2]+p[1], 16)
       elif i == 2:#third output line
           timestamp=int(p[4]+p[3]+p[2]+p[1], 16)
           try:
               aeg=time.strftime("%d %b %Y %H:%M:%S", time.localtime(timestamp))
           except:
               aeg='none'
           if koik.has_key(txg):
               koik[txg]['addresses'].append(address)
           else:
               koik[txg]={
                   'txg':txg,
                   'timestamp':timestamp,
                   'htime': aeg,
                   'addresses':[address]
               }
       if i == 2:
           i=0
       else:
           i+=1
   keys = koik.keys()
   keys.sort()
   
while True:
   keys = koik.keys()
   keys.sort()
   print 'TXG\tTIME\tTIMESTAMP\tBLOCK ADDRESSES'
   for k in keys:
       print '%s\t%s\t%s\t%s'%(k, koik[k]['htime'],koik[k]['timestamp'],koik[k]['addresses'])
   try:
       #raw_input reads the answer as text; Python 2 input() would evaluate it as code
       save_txg=int(raw_input('What is the last TXG you wish to keep?\n'))
       keys = koik.keys()
       keys.sort()
       for k in keys:
           if k > save_txg:
               for adress in koik[k]['addresses']:
                   #write zeroes to the unwanted uberblocks
                   format=formatstd(subprocess.Popen('dd bs=%s if=/dev/zero of=%s seek=%s count=1 conv=notrunc'%(bs, file, adress), shell=True, stdout=subprocess.PIPE).communicate()[0])
               del(koik[k])
       #sync changes to disc!
       sync=formatstd(subprocess.Popen('sync', shell=True, stdout=subprocess.PIPE).communicate()[0])
   except:
       print ''
       break
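
The od/grep pipeline in the script looks for the uberblock magic number and then reads the TXG and timestamp from the two lines that follow it. The same fields can be decoded directly with `struct`, without writing anything to the device. The sketch below is Python 3, unlike the Python 2 script above. The names `parse_uberblock` and `find_uberblocks` are hypothetical. It assumes the uberblock starts with five consecutive 64-bit integers (magic, version, txg, guid_sum, timestamp) and that uberblock slots are aligned to at least 1 KiB. The sample buffer is synthetic, not data from a real pool.

```python
import struct

UBERBLOCK_MAGIC = 0x00BAB10C   # shows up as "b10c 00ba" in little-endian od -x output
HEADER = "5Q"                  # magic, version, txg, guid_sum, timestamp (uint64 each)
HEADER_SIZE = struct.calcsize("<" + HEADER)   # 40 bytes


def parse_uberblock(buf):
    """Return the header fields if buf starts with an uberblock, else None."""
    if len(buf) < HEADER_SIZE:
        return None
    # A pool may have been written on a little- or big-endian host; try both.
    for endian in ("<", ">"):
        magic, version, txg, guid_sum, timestamp = struct.unpack_from(endian + HEADER, buf)
        if magic == UBERBLOCK_MAGIC:
            return {"version": version, "txg": txg,
                    "guid_sum": guid_sum, "timestamp": timestamp}
    return None


def find_uberblocks(data, step=1024):
    """Scan data at step-byte alignment and list every uberblock header found."""
    found = []
    for offset in range(0, len(data), step):
        ub = parse_uberblock(data[offset:offset + HEADER_SIZE])
        if ub is not None:
            ub["offset"] = offset
            found.append(ub)
    return found


# Synthetic demo: one fake uberblock header placed 1 KiB into an empty buffer.
sample = bytearray(4096)
struct.pack_into("<" + HEADER, sample, 1024,
                 UBERBLOCK_MAGIC, 5000, 7736596, 0, 1489000000)
hits = find_uberblocks(bytes(sample))
for ub in hits:
    print(ub["offset"], ub["txg"], ub["timestamp"])
```

On a real pool the buffer would come from reading the start and end of the device opened read-only, which is where the script above looks for the labels. Listing the TXGs this way before running anything destructive shows whether any uberblock older than the damage still exists.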

Quoted from: https://serverfault.com/questions/844061
