Troubleshooting Veritas File System (VxFS)

Following on from my article Veritas File System (VxFS) Cheat Sheet, I have put together this article to share some troubleshooting tips that I have come across whilst supporting the Veritas File System software.

Troubleshooting a VxFS is pretty simple. We have the following tools available:

fsck

The fsck utility (/usr/sbin/fsck) is simply a wrapper that, when it receives the -F vxfs option, calls /usr/lib/fs/vxfs/fsck. This of course means that /usr must be mounted before a VxFS filesystem can be checked. If an fsck fails, odds are that the filesystem is not recoverable.

# fsck -F vxfs -o full,nolog /dev/vx/rdsk/vol01
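The wrapper behaviour described above can be pictured with a small sketch. This is only an illustration of the path convention, not how /usr/sbin/fsck is actually implemented; fs_helper_path is a hypothetical helper name.

```shell
# Illustrative only: fs_helper_path is a hypothetical helper, not part of
# VxFS or Solaris. It builds the per-filesystem helper path that the
# generic wrapper hands off to when given -F <fstype>.
fs_helper_path() {
    fstype=$1
    cmd=$2
    echo "/usr/lib/fs/$fstype/$cmd"
}

helper=$(fs_helper_path vxfs fsck)
echo "$helper"

# If the helper is not there, /usr is probably not mounted (or VxFS is
# not installed), and fsck -F vxfs cannot work.
if [ -x "$helper" ]; then
    echo "helper present"
else
    echo "helper missing: is /usr mounted and VxFS installed?"
fi
```

Running this before an fsck on a system in single-user mode is a quick way to confirm that the VxFS-specific binary is reachable.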

fsadm

fsadm is a great tool for keeping VxFS filesystems in good shape and for diagnosing problems. For an in-depth understanding of the fsadm command, review SunSolve InfoDoc #24067.

Usage:
   -D Reports on directory fragmentation.
   -d Reorganizes directories.
   -E Reports on extent fragmentation.
   -e Extent reorganization.

How to defrag a VxFS

For example:

# /opt/VRTSvxfs/sbin/fsadm -D -d -E -e -s /dev/vx/rdsk/vol01

When a defrag alone won't help, you can shrink the VxFS to force a reorganization of data and directories. You need free space to do this. For example, if vol01 is 10 GB in size and only 6 GB is in use, first try shrinking it to 7 GB.

NOTE: The file system must be mounted.

# /opt/VRTSvxfs/sbin/fsadm -b 7g /vol01mountpoint
vxfs fsadm: /dev/vx/rdsk/vol01 is currently 2097152 sectors - size will be reduced
vxfs fsadm: allocations found in shrink range, moving data

fsadm errno 16

An attempt to grow the file system fails as follows:

# /usr/lib/fs/vxfs/fsadm -F vxfs -b 28041216 -r /dev/vx/rdsk/mydg/home /export/home
/usr/lib/fs/vxfs/fsadm: /dev/vx/rdsk/mydg/home is currently 19849216 sectors - size will
be increased
UX:vxfs fsadm: ERROR: attempt to resize /dev/vx/rdsk/mydg/home failed with
errno 16
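The errno values that fsadm prints are ordinary UNIX error numbers, so it helps to translate them to their symbolic names. A minimal sketch covering only the two values discussed in this article; errno_name is a hypothetical helper, and any other value should be looked up in /usr/include/sys/errno.h.

```shell
# Hypothetical helper: maps the two errno values seen in this article
# to their standard symbolic names.
errno_name() {
    case "$1" in
        16) echo "EBUSY" ;;    # resource busy
        28) echo "ENOSPC" ;;   # no space left on device
        *)  echo "unknown" ;;  # look it up in /usr/include/sys/errno.h
    esac
}

errno_name 16
errno_name 28
```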

There are three primary causes for this error:

  • The file system is too busy to be resized. Although resizing requires that the file system be mounted, fsadm must "freeze" the file system to actually perform the resize. Freezing temporarily prevents new accesses to the file system and must wait for pending I/Os to complete. If fsadm is unable to freeze the file system quickly, it gives up, stating that the file system is too busy.
  • The file system has a snapshot file system mounted from it. If a snapshot file system is mounted that is a "snapof" the file system being resized, the resize will fail. File systems that have snapshots mounted from them cannot be resized.
  • The file system may be corrupt and need to be fsck'd. A file system that has experienced structural damage and is marked for a full fsck cannot be resized.

Possible resolutions:

  • Make sure that the file system does not have any snapshot file system mounted from it.
  • Attempt the resize when the file system has less of a load on it.
  • If it continues to fail with "errno 16", unmount the file system, run fsck, remount, and try again.
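Retrying the resize at a quieter moment can be scripted. The following is a minimal sketch of a generic retry wrapper; retry is a hypothetical helper, and the attempt count and delay are arbitrary choices rather than recommended values.

```shell
# Hypothetical helper: run a command up to <attempts> times, sleeping
# <delay> seconds between failed attempts. Returns 0 on the first
# success and 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        "$@" && return 0
        echo "attempt $n of $attempts failed" >&2
        n=$(( n + 1 ))
        # Only sleep if another attempt is coming.
        [ "$n" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}

# Example (not run here): try the resize from above 5 times, 60 seconds apart.
# retry 5 60 /usr/lib/fs/vxfs/fsadm -F vxfs -b 28041216 -r /dev/vx/rdsk/mydg/home /export/home
```

If every attempt still fails, fall back to the unmount, fsck, and remount route.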

fsadm errno 28

An attempt to grow the file system fails as follows:

# /usr/lib/fs/vxfs/fsadm -b 2093056 -r /dev/vx/rdsk/mydg/data01 /data01
vxfs fsadm: /dev/vx/rdsk/mydg/data01 is currently 20480 sectors -
size will be increased
vxfs fsadm: attempt to resize /dev/vx/rdsk/mydg/data01 failed with errno 28
# df -k /data01
Filesystem kbytes used avail capacity Mounted on
/dev/vx/dsk/mydg/data01 10240 10240 0 100% /data01

This error occurs because a completely full file system cannot be grown:

  • Growing a file system generally requires that the current meta-data structures be expanded so that they can keep track of the new space.
  • The meta-data structures must be created before the new space is allocated so that there is something to track that space.
  • If there is no space left in the file system, the meta-data structures cannot be grown, so the file system cannot be expanded.
  • This can happen on file systems that are 100% full, but it can also happen on file systems that are 70-90% full and heavily fragmented.
  • Most meta-data allocations must be made with 8k extents. If the file system only has very small extents available, it might not be able to allocate an 8k extent for the meta-data.
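The sizes in the transcript above can be cross-checked: fsadm reports sectors while df -k reports kilobytes. A small sketch of the conversion, assuming 512-byte sectors (an assumption, although it agrees with the 20480 sectors and 10240 kbytes shown above); the function names are made up for illustration.

```shell
# Assumption: fsadm sizes are in 512-byte sectors.
SECTOR_BYTES=512

# Hypothetical helpers for converting between fsadm and df -k units.
sectors_to_kb() { echo $(( $1 * SECTOR_BYTES / 1024 )); }
kb_to_sectors() { echo $(( $1 * 1024 / SECTOR_BYTES )); }

sectors_to_kb 20480     # current size reported by fsadm
sectors_to_kb 2093056   # requested size passed to -b
kb_to_sectors 10240     # kbytes column from df -k
```

Under that assumption the current size works out to 10240 KB, matching df -k, and the requested size to 1046528 KB, or roughly 1 GB.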

Resolution:

There are two solutions to this problem and they both can be used in conjunction:

  • Attempt to grow the file system by a smaller amount. Instead of growing the file system by 10 GB, try 100 MB or 10 MB.
    • Frequently there is some unused space in the current meta-data structures that can deal with small amounts of growth.
    • If the resize does not require any new meta-data allocations, it should succeed. After "inching" it once or twice, try the full resize again with the new space that was just added.
  • Delete or move some large files off of the file system
    • VxFS can make use of the holes that the files leave when they are removed to store meta-data information
    • A few large files work better than many small files because they are more likely to have 8k or larger extents
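The "inching" approach can be planned out before touching the file system. This sketch only prints the fsadm commands, it does not run them. plan_inching is a hypothetical helper, sizes are assumed to be in 512-byte sectors, and the step of 20480 sectors (10 MB under that assumption) is an arbitrary choice.

```shell
# Hypothetical helper: print (do not run) up to two small growth steps
# followed by the full resize. Arguments: current size, target size,
# step size (all in sectors), raw device, mount point.
plan_inching() {
    current=$1
    target=$2
    step=$3
    rawdev=$4
    mnt=$5
    size=$current
    for attempt in 1 2; do
        size=$(( size + step ))
        # Stop inching once a step would reach or pass the target.
        [ "$size" -ge "$target" ] && break
        echo "/usr/lib/fs/vxfs/fsadm -b $size -r $rawdev $mnt"
    done
    echo "/usr/lib/fs/vxfs/fsadm -b $target -r $rawdev $mnt"
}

plan_inching 20480 2093056 20480 /dev/vx/rdsk/mydg/data01 /data01
```

Review the printed commands and run them one at a time, checking that each small step succeeds before attempting the full size.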

fsck and fsadm fail

If both fsck and fsadm fail, rebuilding the filesystem and restoring its data is the way to minimize downtime.

The last thing to try is a metasave. This takes a skeleton snapshot of the VxFS, from which Veritas engineers can try to rebuild the filesystem.

metasave does not come with the vxfs software. It is available from ftp.veritas.com:/pub/support/metasave.tar.Z

Note: If the metasave fails, the filesystem is unrecoverable.