Troubleshooting the SAM-FS Recycler
Only a limited number of problems occur with sam-recycler. They fall into the following three areas:
- Waiting for VSN to drain of active archive copies
- Archive copies are not associated with real files
- Robot high water mark set too high; too many files marked for re-archiving
The most frequent issue seen with sam-recycler is the following message, which appears each time the recycler is invoked:
"Waiting for VSN mo:OPT000 to drain, it still has 89 active archive copies."
This message is caused by one of two things:
- The archiver is failing to re-archive the 89 archive copies on the volume. This, in turn, can be caused by any of several conditions:
  - The files that need to be re-archived are marked "no archive".
  - The files that need to be re-archived match the no_archive archive set.
  - The files cannot be archived because there are no VSNs available.
  - The archiver.cmd file contains a "wait" statement.
- The 89 archive copies are not really associated with files in the file system. The inodes appear to be valid and the files appear to have archive copies on the volume, but in reality the inodes are not part of the directory tree. Therefore, the files are not accessible by any file name. In this case, the archive copies are not really active.
To determine which of these conditions applies, run sam-recycler with the -v option. This causes the pathnames of the files associated with the 89 archive copies to be written to the recycler log file.
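When the recycler runs repeatedly, it can help to check whether the count in the "Waiting for VSN" message is actually dropping between runs. The following is a minimal sketch, assuming the recycler log contains the message in exactly the form quoted above. The function names, the second volume (lt:TAPE01), and its counts are hypothetical and used only for illustration.

```python
import re

# Pattern for the "waiting to drain" message as quoted in this document.
# Assumption: the real log uses exactly this wording.
DRAIN_RE = re.compile(
    r"Waiting for VSN (?P<media>\w+):(?P<vsn>\S+) to drain, "
    r"it still has (?P<count>\d+) active archive copies"
)


def drain_counts(log_lines):
    """Return {(media, vsn): [count, ...]} with counts in log order."""
    history = {}
    for line in log_lines:
        m = DRAIN_RE.search(line)
        if m:
            key = (m.group("media"), m.group("vsn"))
            history.setdefault(key, []).append(int(m.group("count")))
    return history


def stuck_volumes(history):
    """Return volumes whose count did not drop between the last two runs."""
    return [
        key
        for key, counts in history.items()
        if len(counts) >= 2 and counts[-1] >= counts[-2]
    ]


# Sample input: the first two lines repeat the message from this document;
# the lt:TAPE01 lines are made up to show a volume that is draining.
sample = [
    "Waiting for VSN mo:OPT000 to drain, it still has 89 active archive copies.",
    "Waiting for VSN mo:OPT000 to drain, it still has 89 active archive copies.",
    "Waiting for VSN lt:TAPE01 to drain, it still has 40 active archive copies.",
    "Waiting for VSN lt:TAPE01 to drain, it still has 12 active archive copies.",
]

history = drain_counts(sample)
print(stuck_volumes(history))
```

A volume reported as stuck is a candidate for the checks listed above; a volume whose count keeps falling is probably just slow to drain.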
To completely resolve the condition where the archive copies are not associated with files in the file system (which should occur only as a result of a system crash that partially corrupted the .inodes file), the site needs to run samfsrestore to re-create the file system cleanly.
sam-recycler allows more than one VSN to be recycled in a single pass. In cases where the robot high water mark was set very low (less than 50 percent), this sometimes results in several VSNs being flagged for recycling and in too many files being marked for re-archiving.
- To address the situation where too many files are marked with the rearch flag, SAM-QFS has an unrearch command. This command allows you to clear the rearch flag on individual files, on all archive copies with a specific copy number, or on all archive copies on a specific VSN.
- Another possibility is that the total number of stages for a VSN has exceeded the maxactive value, which limits the number of active stages in the stager queue. Normally a message appears in the sam-log, warning that maxactive has been exceeded:
Feb 5 08:21:53 schlumpf sam-stagerd: [ID 992764 local7.info] info Maximum number of active stages was exceeded.
- Increase the maxactive value in the /etc/opt/SUNWsamfs/stager.cmd file. The default value is 1000. Increasing this value should allow the recycler to work normally again.
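Before changing the setting, it may be useful to confirm how often the warning occurs and what value is currently in effect. The sketch below is illustrative only. It assumes the warning text matches the sam-log line shown above, and that stager.cmd uses a "maxactive = N" directive with "#" starting a comment; verify both against the stager.cmd man page on your system. The function names and the sample stager.cmd contents are hypothetical.

```python
import re

# Warning text taken from the sam-log line quoted in this document.
EXCEEDED_MSG = "Maximum number of active stages was exceeded"
# Default stated in this document.
DEFAULT_MAXACTIVE = 1000


def count_exceeded(log_lines):
    """Count sam-stagerd warnings that maxactive was exceeded."""
    return sum(
        1
        for line in log_lines
        if "sam-stagerd" in line and EXCEEDED_MSG in line
    )


def current_maxactive(stager_cmd_text):
    """Return the maxactive value set in stager.cmd text, else the default.

    Assumption: the directive is written as 'maxactive = N' on its own line
    and '#' introduces a comment. The last matching line wins.
    """
    value = DEFAULT_MAXACTIVE
    for raw in stager_cmd_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        m = re.fullmatch(r"maxactive\s*=\s*(\d+)", line)
        if m:
            value = int(m.group(1))
    return value


# The log line is the one shown in this document.
sample_log = [
    "Feb 5 08:21:53 schlumpf sam-stagerd: [ID 992764 local7.info] "
    "info Maximum number of active stages was exceeded.",
]
# Hypothetical stager.cmd contents, for illustration only.
sample_cmd = "# stager settings\nmaxactive = 2000\n"

print(count_exceeded(sample_log))
print(current_maxactive(sample_cmd))
print(current_maxactive(""))
```

In practice the inputs would come from reading the sam-log file and /etc/opt/SUNWsamfs/stager.cmd rather than from inline strings.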