Repairing QFS file systems with samfsck

QFS file systems write validation data in the following records that are critical to file system operations: directories, indirect blocks, and inodes.

If the file system detects corruption while searching a directory, it issues an EDOM error, and the directory is not processed. If an indirect block is not valid, it issues an ENOCSI error, and the file is not processed. In addition, inodes are validated and cross-checked with directories.

Monitoring

As part of general maintenance and system administration, you should monitor the following files for error conditions:

  • The log file specified in /etc/syslog.conf for any EDOM or ENOCSI errors.
  • The /var/adm/messages file for any device errors.
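
This check can be scripted. The following is a minimal sketch, assuming the errors appear in the log as the literal strings EDOM or ENOCSI. The function name scan_qfs_log is hypothetical and not part of QFS, and the log path you pass depends on your /etc/syslog.conf.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of a log file that mention EDOM or ENOCSI.
# Assumption: the errors are logged under these literal names; adjust the
# pattern if your system reports them differently.
# Note: on some older Solaris releases, grep -E may require /usr/xpg4/bin/grep.
scan_qfs_log() {
    logfile=$1
    if [ ! -r "$logfile" ]; then
        echo "cannot read $logfile" >&2
        return 2
    fi
    # grep exits 0 when at least one line matches and 1 when none match.
    grep -E 'EDOM|ENOCSI' "$logfile"
}

# Example call (path is site-specific):
#   scan_qfs_log /var/adm/messages
```

The exit status makes the function usable from cron or a monitoring wrapper: 0 means at least one matching line was printed, 1 means no matches.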

Checking a file system

If a discrepancy is noted, you should unmount the file system and check it using the samfsck command.

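To check a file system without repairing it, run the samfsck command without the -F option. The command then reports errors but leaves the file system unchanged.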

In samfsck output, nonfatal errors are preceded by NOTICE. Nonfatal errors are lost blocks and orphans. The file system is still consistent if only NOTICE errors are returned. You can repair these nonfatal errors during a convenient, scheduled maintenance outage.

Fatal errors are preceded by ALERT. These errors include duplicate blocks, invalid directories, and invalid indirect blocks. The file system is not consistent if these errors occur. Notify Sun if the ALERT errors cannot be explained by a hardware malfunction.

If the samfsck command detects file system corruption and returns ALERT messages, you should determine the reason for the corruption. If hardware is faulty, repair it before repairing the file system.
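
To decide quickly whether a run needs immediate attention, the NOTICE and ALERT lines can be counted in saved samfsck output. This is a sketch only: summarize_samfsck is a hypothetical name, and it assumes each message starts at the beginning of a line with NOTICE: or ALERT:.

```shell
#!/bin/sh
# Hypothetical helper: count NOTICE and ALERT lines in saved samfsck output.
# Assumption: messages begin at column 1 with "NOTICE:" or "ALERT:".
summarize_samfsck() {
    outfile=$1
    if [ ! -r "$outfile" ]; then
        echo "cannot read $outfile" >&2
        return 2
    fi
    # grep -c prints the count but exits 1 when the count is 0, so the
    # exit status is ignored here.
    notices=$(grep -c '^NOTICE:' "$outfile" || true)
    alerts=$(grep -c '^ALERT:' "$outfile" || true)
    echo "NOTICE=$notices ALERT=$alerts"
    # Return 0 when no fatal (ALERT) errors were reported, 1 otherwise.
    [ "$alerts" -eq 0 ]
}

# Example call (file name is illustrative):
#   samfsck -V samfs1 > /tmp/samfsck.out 2>&1
#   summarize_samfsck /tmp/samfsck.out
```

A nonzero return from the function indicates ALERT lines, which is the case where the cause of the corruption should be investigated before any repair.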

To repair a file system

  1. Use the umount command to unmount the file system.

    Whilst the samfsck command can be run against a mounted file system, the results cannot be guaranteed. Because of this, you are encouraged to run the command on unmounted file systems only.

  2. Use the samfsck command to repair the file system. If you are repairing a shared file system, issue the command from the metadata server.

    samfsck -F -V <family-set-name>

    The -F option repairs the file system. To check it without making repairs, omit -F:

    samfsck -V <family-set-name>

For the family-set-name, specify the name of the file system as specified in the mcf file.
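
The two steps above can be sketched as a small wrapper. This is an illustration under stated assumptions, not a supported tool: repair_qfs and run_step are hypothetical names, the mount point argument is site-specific, and the wrapper defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Hypothetical wrapper around the unmount and repair steps.
# With DRY_RUN=1 (the default in this sketch) commands are printed, not run.
run_step() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Arguments: mount point and family set name (as specified in the mcf file).
repair_qfs() {
    mount_point=$1
    family_set=$2
    if [ -z "$mount_point" ] || [ -z "$family_set" ]; then
        echo "usage: repair_qfs <mount-point> <family-set-name>" >&2
        return 2
    fi
    # Step 1: unmount first; stop if the unmount fails.
    run_step umount "$mount_point" || return 1
    # Step 2: repair (-F) with verbose output (-V).
    run_step samfsck -F -V "$family_set"
}

# Example call (mount point /samfs1 is an assumption):
#   repair_qfs /samfs1 samfs1
#   DRY_RUN=0 repair_qfs /samfs1 samfs1
```

Defaulting to a dry run is a deliberate choice here, so that the commands can be reviewed before anything is unmounted or repaired.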

The following examples show samfsck repair runs on two file systems:

# samfsck -F samfs1
name:     samfs1        version:     2          
First pass
Second pass
ALERT:  ino 4911015.8,  Invalid base inode
ALERT:  ino 4911155.8,  Invalid base inode
ALERT:  ino 5028636.6,  Invalid base inode
ALERT:  ino 5036411.6,  Invalid base inode
ALERT:  ino 5112542.12, Invalid base inode
Third pass
ALERT:  Invalid inode:        ino 4911015 marked free
ALERT:  Invalid inode:        ino 4911155 marked free
ALERT:  Invalid inode:        ino 5028636 marked free
ALERT:  Invalid inode:        ino 5036411 marked free
ALERT:  Invalid inode:        ino 5112542 marked free
 
Inodes processed: 5140992
total data kilobytes       = 18710419456
total data kilobytes free  = 6521135104
total meta kilobytes       = 1161239104
total meta kilobytes free  = 1157836832
NOTICE: Reclaimed 6291456 bytes

# samfsck -F -V samfs2
name:     samfs2        version:     2           
First pass
Second pass
Third pass
NOTICE: ino 2.2,  Repaired link count from 8 to 14
Inodes processed: 123392
total data kilobytes       = 1965952
total data kilobytes free  = 1047680
total meta kilobytes       = 131040
total meta kilobytes free  = 65568
INFO:  FS samfs2 repaired:
        start:  Jun 13, 2008 12:10:04 AM BST
        finish: Jun 13, 2008 12:10:28 AM BST
NOTICE: Reclaimed 70057984 bytes
NOTICE: Reclaimed 9519104 meta bytes