Data Hoarder @selfhosted.forum

Unrepairable data corruption on a raidz2 when all drives show zero errors -- HOW???

I'm just getting around to checking the logs on my backup server, and ZFS says I have a permanently damaged file that can't be repaired.

How is this even possible on a raidz2 volume where every member shows zero errors and no drive is dead? Isn't that the whole point of raidz2, that if one (er, two) drives have a problem the data is still recoverable? How can I figure out why this happened and why it was unrecoverable, and most importantly, how do I prevent it in the future?

It's only my backup server and the original file is still A-OK, but I'm really concerned here!

 
        zpool status -v:

    3-2-1-backup@BackupServer:~$ sudo zpool status -v
    pool: data_pool3
    state: ONLINE
    status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
    action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
     see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
      scan: scrub repaired 0B in 06:59:59 with 1 errors on Sun Nov 12 07:24:00 2023
    config:

        NAME                        STATE     READ WRITE CKSUM
        data_pool3                  ONLINE       0     0     0
          raidz2-0                  ONLINE       0     0     0
            wwn-0x5000ccaxxxxxxxx1  ONLINE       0     0     0
            wwn-0x5000ccaxxxxxxxx2  ONLINE       0     0     0
            wwn-0x5000ccaxxxxxxxx3  ONLINE       0     0     0
            wwn-0x5000ccaxxxxxxxx4  ONLINE       0     0     0
            wwn-0x5000ccaxxxxxxxx5  ONLINE       0     0     0
            wwn-0x5000ccaxxxxxxxx6  ONLINE       0     0     0
            wwn-0x5000ccaxxxxxxxx7  ONLINE       0     0     0
            wwn-0x5000ccaxxxxxxxx8  ONLINE       0     0     0

    errors: Permanent errors have been detected in the following files:

        data_pool3/(redacted)/(redacted)@backup_script:/Documentaries/(redacted)
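
Since this only turned up while reading logs after the fact, one way to catch it sooner is to check the `scan:` line of `zpool status` automatically. The following is a minimal sketch, assuming the line looks like the one in the output above; the function name `parse_scrub_line` and the regular expression are illustrative, not part of any ZFS tooling, and other OpenZFS versions may word the line differently.

```python
import re

# Pattern for the "scan:" line as it appears in the output above.
# Assumption: the wording "scrub repaired X in T with N errors on DATE".
SCAN_RE = re.compile(
    r"scan:\s+scrub repaired (?P<repaired>\S+) in (?P<duration>.+?) "
    r"with (?P<errors>\d+) errors on (?P<when>.+)"
)


def parse_scrub_line(status_text):
    """Return a dict describing the last scrub, or None if no scrub line is found."""
    match = SCAN_RE.search(status_text)
    if match is None:
        return None
    return {
        "repaired": match.group("repaired"),
        "duration": match.group("duration"),
        "errors": int(match.group("errors")),
        "when": match.group("when").strip(),
    }


# Sample taken from the status output in this post.
sample = "  scan: scrub repaired 0B in 06:59:59 with 1 errors on Sun Nov 12 07:24:00 2023"
result = parse_scrub_line(sample)
if result and result["errors"] > 0:
    print(f"scrub finished {result['when']} with {result['errors']} unrepaired error(s)")
```

In practice the text would come from running `zpool status -v` (for example from a cron job) rather than from a hard-coded string, and the print would be replaced by whatever alerting is already in place.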
  
7 comments
  • Well, two steps forward, one step back. The scrub I ran yesterday at least showed some errors, but I'm having trouble identifying what the actual problem is. I think I'll sleep on it and form a new plan in the morning.

    Controller failure? RAM failure? Dmesg shows absolutely nothing, no panics, nothing at all, so I don't think it's RAM. Hmmmm... maybe I'll run memtest after I get some sleep.

     
        3-2-1-backup@BackupServer:~$ sudo zpool status -vx
        pool: data_pool3
        state: ONLINE
        status: One or more devices has experienced an error resulting in data
            corruption.  Applications may be affected.
        action: Restore the file in question if possible.  Otherwise restore the
            entire pool from backup.
         see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
         scan: scrub repaired 40K in 07:07:07 with 4 errors on Tue Nov 28 22:39:33 2023
        config:
    
            NAME                        STATE     READ WRITE CKSUM
            data_pool3                  ONLINE       0     0     0
              raidz2-0                  ONLINE       0     0     0
                wwn-0x5000ccax1         ONLINE       0     0     8
                wwn-0x5000ccax2         ONLINE       0     0    10
                wwn-0x5000ccax3         ONLINE       0     0     8
                wwn-0x5000ccax4         ONLINE       0     0     8
                wwn-0x5000ccax5         ONLINE       0     0     8
                wwn-0x5000ccax6         ONLINE       0     0     8
                wwn-0x5000ccax7         ONLINE       0     0     8
                wwn-0x5000ccax8         ONLINE       0     0     8
    
        errors: Permanent errors have been detected in the following files:
    
            data_pool3/(redacted)/downloads@backup_script-2023-11-28-0901:/(redacted).mkv
            data_pool3/(redacted)@backup_script-2023-11-28-2001:/ISOs/Ubuntu/23.10/ubuntu-23.10.1-desktop-amd64.iso
            data_pool3/(redacted)@backup_script-2023-11-07-0901:/(redacted).mkv
    
    
    
      

    Hey wow, even though my problem is getting worse (maybe), an actual honest-to-god ISO showed up in the problem file list!
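
The second status output shows checksum errors on every member at once, with almost identical counts. A rough sketch for pulling the CKSUM column out of the config table and checking for that pattern is below. The function name `cksum_counts` and the `wwn-` prefix filter are assumptions made for illustration, and the parser only handles plain integer counters as shown in this post. Errors on all members at the same time would be consistent with the controller or RAM suspicion above, though it does not prove either.

```python
def cksum_counts(status_text):
    """Map each name in the zpool status config table to its CKSUM counter."""
    counts = {}
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_table = True  # header row: the table starts on the next line
            continue
        if not in_table:
            continue
        if len(fields) < 5:
            if counts:
                break  # blank line after the last device ends the table
            continue
        name, _state, _read, _write, cksum = fields[:5]
        if cksum.isdigit():  # skips abbreviated counters; a known limitation
            counts[name] = int(cksum)
    return counts


# Config table copied from the second status output in this thread.
STATUS_SAMPLE = """
config:

    NAME                 STATE     READ WRITE CKSUM
    data_pool3           ONLINE       0     0     0
      raidz2-0           ONLINE       0     0     0
        wwn-0x5000ccax1  ONLINE       0     0     8
        wwn-0x5000ccax2  ONLINE       0     0    10
        wwn-0x5000ccax3  ONLINE       0     0     8
        wwn-0x5000ccax4  ONLINE       0     0     8
        wwn-0x5000ccax5  ONLINE       0     0     8
        wwn-0x5000ccax6  ONLINE       0     0     8
        wwn-0x5000ccax7  ONLINE       0     0     8
        wwn-0x5000ccax8  ONLINE       0     0     8

errors: Permanent errors have been detected in the following files:
"""

counts = cksum_counts(STATUS_SAMPLE)
# Assumption: member disks are the entries whose names start with "wwn-".
leaves = {name: c for name, c in counts.items() if name.startswith("wwn-")}
affected = [name for name, c in leaves.items() if c > 0]
if leaves and len(affected) == len(leaves):
    print("every member reports checksum errors:", sorted(set(leaves.values())))
```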
