Troubleshooting Veritas File System (VxFS)
Following on from my article Veritas File System (VxFS) Cheat Sheet, I have put together this article to share some troubleshooting tips that I have come across whilst supporting the Veritas File System software.
Troubleshooting a VxFS is pretty simple. We have the following tools available:
fsck
The fsck utility (/usr/sbin/fsck) is simply a wrapper that, when it receives the -F vxfs option, calls /usr/lib/fs/vxfs/fsck. This of course means that /usr must be mounted to check vxfs filesystems. If an fsck fails, odds are that the filesystem is not recoverable.
# fsck -F vxfs -o full,nolog /dev/vx/rdsk/vol01
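Because the wrapper depends on the VxFS-specific binary under /usr, it can be worth confirming that binary is reachable before running a check. The following is a minimal sketch; the function name vxfs_fsck_available is hypothetical, and the default path is simply the one named above.

```shell
# Sketch: confirm the VxFS-specific fsck binary can be executed before
# relying on the /usr/sbin/fsck wrapper.
# vxfs_fsck_available is a hypothetical helper, not part of VxFS.
vxfs_fsck_available() {
    # Default to the path named in the text; an argument overrides it.
    fsck_bin="${1:-/usr/lib/fs/vxfs/fsck}"
    if [ -x "$fsck_bin" ]; then
        return 0
    fi
    echo "cannot execute $fsck_bin (is /usr mounted?)" >&2
    return 1
}
```

Calling vxfs_fsck_available with no argument checks the default path and returns non-zero if the binary is missing or not executable.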
fsadm
fsadm is a great tool for keeping VxFS file systems in good shape and diagnosing problems. For an in-depth understanding of the fsadm command, review SunSolve InfoDoc #24067.
Usage:
-D Reports on directory fragmentation.
-d Reorganizes directories.
-E Reports on extent fragmentation.
-e Extent reorganization.
How to defrag a VxFS
For example:
# /opt/VRTSvxfs/sbin/fsadm -D -d -E -e -s /dev/vx/rdsk/vol01
When defragmenting alone does not help, you can shrink a VxFS to force a reorganization of data and directories. This requires free space in the file system. For example, vol01 is 10 GB in size and only 6 GB is in use. First try to shrink it to 7 GB.
NOTE: The file system must be mounted.
# /opt/VRTSvxfs/sbin/fsadm -b 7g /vol01mountpoint
vxfs fsadm: /dev/vx/rdsk/vol01 is currently 2097152 sectors - size will be reduced
vxfs fsadm: allocations found in shrink range, moving data
fsadm errno 16
An attempt to grow the file system fails as follows:
# /usr/lib/fs/vxfs/fsadm -F vxfs -b 28041216 -r /dev/vx/rdsk/mydg/home /export/home
/usr/lib/fs/vxfs/fsadm: /dev/vx/rdsk/mydg/home is currently 19849216 sectors - size will be increased
UX:vxfs fsadm: ERROR: attempt to resize /dev/vx/rdsk/mydg/home failed with errno 16
There are three primary causes for this error:
- The file system is too busy to be resized. Although resizing requires the file system to be mounted, fsadm must "freeze" the file system to actually perform the resize. Freezing temporarily prevents new accesses to the file system and has to wait for pending I/Os to complete. If fsadm cannot freeze the file system quickly, it gives up and reports that the file system is too busy.
- The file system has a snapshot file system mounted from it. If a mounted snapshot file system is a "snapof" the file system being resized, the resize will fail. File systems that have snapshots mounted from them cannot be resized.
- The file system may be corrupt and need to be fsck'd. A file system that has experienced structural damage and is marked for a full fsck cannot be resized.
Possible resolutions:
- Make sure that the file system does not have any snapshot file system mounted from it.
- Attempt the resize when the file system has less of a load on it.
- If it continues to fail with "errno 16", unmount the file system, run fsck, remount, and try again.
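Since the "too busy" case is transient (errno 16 is EBUSY), one way to act on the second resolution is to retry the resize a few times with a pause in between. The following is a minimal sketch; retry_cmd is a hypothetical helper and the attempt count and delay are arbitrary choices, not values recommended by Veritas.

```shell
# Sketch: retry a command that may fail transiently, for example a resize
# that returns errno 16 (EBUSY) while the file system is under load.
# retry_cmd is a hypothetical helper, not part of VxFS.
# Usage: retry_cmd MAX_ATTEMPTS DELAY_SECONDS command [args...]
retry_cmd() {
    attempts="$1"   # maximum number of tries
    delay="$2"      # seconds to wait between tries
    shift 2
    try=1
    while [ "$try" -le "$attempts" ]; do
        # Run the remaining arguments as the command; stop on first success.
        if "$@"; then
            return 0
        fi
        echo "attempt $try of $attempts failed" >&2
        # Only sleep if another attempt is still to come.
        if [ "$try" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        try=$((try + 1))
    done
    return 1
}
```

With the failing command from the example above, this might be invoked as retry_cmd 5 60 /usr/lib/fs/vxfs/fsadm -F vxfs -b 28041216 -r /dev/vx/rdsk/mydg/home /export/home. Retrying will not help if the cause is a mounted snapshot or a file system flagged for a full fsck.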
fsadm errno 28
An attempt to grow the file system fails as follows:
# /usr/lib/fs/vxfs/fsadm -b 2093056 -r /dev/vx/rdsk/mydg/data01 /data01
vxfs fsadm: /dev/vx/rdsk/mydg/data01 is currently 20480 sectors - size will be increased
vxfs fsadm: attempt to resize /dev/vx/rdsk/mydg/data01 failed with errno 28
# df -k /data01
Filesystem               kbytes  used   avail  capacity  Mounted on
/dev/vx/dsk/mydg/data01  10240   10240  0      100%      /data01
This error occurs because a completely full file system cannot be grown:
- Growing a file system generally requires that the current meta-data structures be expanded so that they can keep track of the new space.
- The meta-data structures must be created before the new space is allocated so that there is something to track that space.
- If there is no space left in the file system, the meta-data structures cannot be grown, so the file system cannot be expanded.
- This can happen on file systems that are 100% full, but it can also happen on file systems that are 70-90% full and heavily fragmented.
- Most meta-data allocations must be made with 8k extents. If the file system only has very small extents available, it might not be able to allocate an 8k extent for the meta-data.
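The points above suggest checking how full a file system is before attempting to grow it. The following is a minimal sketch that reads df -k output and flags a file system at or above a threshold. Both function names are hypothetical, the default of 70 is just the low end of the range mentioned above, and the parsing assumes the single-line column layout shown in the df -k example (it would not cope with a long device name that wraps onto a second line).

```shell
# Sketch: read `df -k` output on stdin and print the capacity percentage
# of the first file system listed. Assumes capacity is the fifth field of
# the second line. Hypothetical helper, not part of VxFS.
df_capacity() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Return 0 (at risk) when the capacity is at or above the threshold.
# The threshold defaults to 70; this is an assumption, not a Veritas limit.
# Usage: resize_at_risk CAPACITY_PERCENT [THRESHOLD_PERCENT]
resize_at_risk() {
    capacity="$1"
    threshold="${2:-70}"
    [ "$capacity" -ge "$threshold" ]
}
```

For example, df -k /data01 | df_capacity prints the capacity figure, which can then be passed to resize_at_risk. A low figure does not guarantee success, since fragmentation is not visible in df output.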
Resolution:
There are two solutions to this problem and they both can be used in conjunction:
- Attempt to grow the file system by a smaller amount. Instead of growing the file system by 10 GB, try 100 MB or 10 MB.
- Frequently there is some unused space in the current meta-data structures that can deal with small amounts of growth.
- If the resize does not require any new meta-data allocations, it should succeed. After "inching" it once or twice, try the full resize again with the new space that was just added.
- Delete or move some large files off of the file system.
- VxFS can make use of the holes that the files leave when they are removed to store meta-data information.
- A few large files work better than many small files because they are more likely to have 8k or larger extents.
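To "inch" the file system, the new size passed to -b has to be worked out in sectors. The following is a minimal sketch of that arithmetic. It assumes 512-byte sectors, which is consistent with the example above (20480 sectors reported for a 10240 KB file system); the function names are hypothetical.

```shell
# Sketch: convert megabytes to sectors, assuming 512-byte sectors.
# Hypothetical helper, not part of VxFS.
mb_to_sectors() {
    echo $(( $1 * 1024 * 1024 / 512 ))
}

# Print the new total size in sectors after growing by a number of megabytes.
# Usage: next_size_sectors CURRENT_SECTORS GROW_MB
next_size_sectors() {
    echo $(( $1 + $(mb_to_sectors "$2") ))
}
```

For the 20480-sector file system in the example, next_size_sectors 20480 10 prints 40960, which would be the value to try with -b for a 10 MB step before attempting the full resize again.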
fsck and fsadm fail
If fsck and fsadm fail, the filesystem will need to be rebuilt and restored to minimize downtime.
The last thing to do is get a metasave. This captures a skeleton snapshot of the VxFS, and Veritas engineers can try to rebuild the filesystem from that.
metasave does not come with the vxfs software. It is available from ftp.veritas.com:/pub/support/metasave.tar.Z
Note: If the metasave fails, the filesystem is unrecoverable.