Introduction to Troubleshooting SAM-FS/QFS
This article provides a basic overview of SAM-FS/QFS components, basic SAM-FS/QFS troubleshooting, and a list of SAM-FS/QFS resources.
It is not an end-to-end tutorial on the SAM-FS/QFS product suite.
Areas covered are:
- Introduction
- Product Overview
- SAM-FS/QFS Components
- Basic Troubleshooting
- Resources
Introduction
What is SAM-FS/QFS?
The Sun Storage Archive Manager (SAM-FS/QFS) product suite enables you to archive file system data. The SAM-FS/QFS environment includes a storage and archive manager along with the Sun QFS file system software. The SAM-QFS software enables data to be archived to automated libraries at device-rated speeds. In addition, data can be archived to files in another file system through a process known as disk archiving. You can archive data on an as-needed basis, define policies that determine when data should be archived, or set specific schedules for archiving. You are presented with a standard file system interface and can read and write files as though they were all on primary disk storage.
Brief history of SAM-FS/QFS
- SAM-FS/QFS was originally developed by LSC Inc.
- LSC was acquired by Sun Microsystems in May 2001.
- The latest version, SAM-FS/QFS 4.0, was released in August 2002.
Product Overview
SAM-FS/QFS has three base product configurations:
- SAM-FS — Storage Archive Manager integrated with a base file system.
- QFS — High performance full featured file system
- SAM-QFS — Storage Archive Manager software which is integrated with the QFS filesystem.
QFS
QFS is an enhancement over the basic filesystem. It provides:
- Separate partition for metadata
- Shared filesystem (distributed) in a SAN environment.
- Multi-reader capability
- Advanced tunables
- Available standalone or with integrated Storage Archive Manager (SAM)
The Storage Archive Manager (SAM)
The three primary components of SAM:
- Archiving: creating tape or disk tar copies (not data migration!)
- Releasing: freeing "disk cache" space (not file deletion!)
- Staging: retrieving file data from an archive copy (uses the existing inode)
Components
Where are the SAM-FS/QFS components?
- Executables:
- /opt/SUNWsamfs/sbin
- /opt/SUNWsamfs/bin
- Configuration — /etc/opt/SUNWsamfs
- Log — /var/opt/SUNWsamfs & /var/adm
- License — /etc/opt/SUNWsamfs
- Example Configuration — /opt/SUNWsamfs/examples
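As a quick first check, the locations listed above can be tested for existence before digging further. The following is a minimal sketch, not a supported tool: the function name missing_paths and the dictionary labels are made up for illustration, and only the paths themselves come from the list above.

```python
import os

# Paths taken from the component list above; the labels are illustrative.
SAM_PATHS = {
    "executables (sbin)": "/opt/SUNWsamfs/sbin",
    "executables (bin)": "/opt/SUNWsamfs/bin",
    "configuration and license": "/etc/opt/SUNWsamfs",
    "logs": "/var/opt/SUNWsamfs",
    "example configuration": "/opt/SUNWsamfs/examples",
}


def missing_paths(paths):
    """Return the labels whose path does not exist on this host."""
    return [label for label, path in paths.items() if not os.path.exists(path)]


def report(paths=SAM_PATHS):
    """Print one line per missing location; print nothing if all exist."""
    for label in missing_paths(paths):
        print("missing:", label, "->", paths[label])
```

Calling report() on the SAM-FS/QFS host prints any location that is absent, which usually points to an incomplete package installation.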
Configuration & Installation Troubleshooting
Are configuration files properly modified?
- Configuration files:
- /kernel/drv/st.conf
- /kernel/drv/samst.conf
- /etc/opt/SUNWsamfs/inquiry.conf
- Master configuration file (mcf) errors: use the sam-fsd command to find problems.
Are tape drives “matched” properly?
- Tape drives must be defined in the mcf file in the same order as the robot controller defines them.
- The tape drive order seen by the robot's controller may not be the same as the order of the SCSI targets.
- Tape drives will not "come ready" if they are defined incorrectly in the mcf file.
- Tape drives may "fail to unload" if they are defined incorrectly in the mcf file.
License related problems?
SAM-FS/QFS 4.0 requires a new license key:
- A license for SAM-FS/QFS 3.3 or 3.5 will not work on 4.0.
- License key information must be stored in the /etc/opt/SUNWsamfs/LICENSE.4.0 file.
- The license key information must match the correct robot and media type in the configuration.
- The license file must NOT contain any extra lines or control characters.
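The rule about extra lines and control characters can be checked mechanically. The sketch below is one possible check, under stated assumptions: it treats any blank line and any ASCII control character (including tabs and the carriage returns left by DOS-style line endings) as suspicious. Whether the license reader actually rejects each of these is not confirmed here, and the function names are illustrative.

```python
def license_problems(text):
    """Return (line_number, description) pairs for suspicious content.

    Assumption: blank lines and ASCII control characters are what the
    article means by "extra lines or control characters".
    """
    problems = []
    lines = text.split("\n")
    # A single trailing newline is normal; drop the empty string it produces.
    if lines and lines[-1] == "":
        lines.pop()
    for number, line in enumerate(lines, start=1):
        if line.strip() == "":
            problems.append((number, "blank line"))
        bad = [ch for ch in line if ord(ch) < 32 or ord(ch) == 127]
        if bad:
            shown = ", ".join(repr(ch) for ch in bad)
            problems.append((number, "control character(s): " + shown))
    return problems


def check_license_file(path="/etc/opt/SUNWsamfs/LICENSE.4.0"):
    """Read the license file without newline translation and check it."""
    # newline="" keeps carriage returns visible; latin-1 never fails to decode.
    with open(path, "r", newline="", encoding="latin-1") as handle:
        return license_problems(handle.read())
```

An empty list from check_license_file() means nothing suspicious was found by this check; it does not prove that the key itself is valid.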
Troubleshooting SAM-FS Operations
- Identifying the problem(s):
- Log files:
- /var/opt/SUNWsamfs
- /var/adm/messages
- samu display (CLI) or samcmd
- Are there any debug options in SAM-FS that could assist in problem determination?
- Is the license key expired?
- Are there any hardware errors?
- Troubleshooting daemon operations:
- What daemons are running?
- sam-initd, sam-archiverd, sam-arfind, sam-stagerd, sam-genericd, sam-robotsd, sam-ftpd, sam-fsd, sam-scannerd, and sam-stagealld
- Are there any duplicate or defunct processes?
- Enable daemon tracing:
- /etc/opt/SUNWsamfs/defaults.conf
- /var/opt/SUNWsamfs/trace
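The question about duplicate or defunct processes can be answered from a process listing. The sketch below assumes the listing has three columns (PID, state, command name), such as the output of something like ps -e -o pid,s,comm; check ps(1) on your release for the exact format names. The function name is illustrative, and the sketch only counts processes: some daemons may legitimately run more than one copy, so a count above one is a prompt to investigate, not proof of a fault.

```python
import os
from collections import Counter

# Daemon names taken from the list above.
SAM_DAEMONS = {
    "sam-initd", "sam-archiverd", "sam-arfind", "sam-stagerd",
    "sam-genericd", "sam-robotsd", "sam-ftpd", "sam-fsd",
    "sam-scannerd", "sam-stagealld",
}


def check_processes(ps_text):
    """Return (duplicates, defunct_pids) from 'PID STATE COMMAND' lines."""
    counts = Counter()
    defunct = []
    for line in ps_text.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # header line or malformed entry
        pid, state, command = fields
        if state == "Z":  # zombie (defunct) process
            defunct.append(int(pid))
        # The command column may hold a full path; compare the base name.
        name = os.path.basename(command.strip())
        if name in SAM_DAEMONS:
            counts[name] += 1
    duplicates = {name: n for name, n in counts.items() if n > 1}
    return duplicates, defunct
```

A defunct process normally no longer shows its original command name, so the sketch reports defunct PIDs for the whole system rather than for SAM-FS alone; use the parent PID to trace them back to a daemon.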
Troubleshooting the Archiver
For archiving issues, ask the following:
- Are there errors in the archiver.cmd file?
- Is media available?
- Is the archiver running?
- Does the site have a valid license?
- Are there files in the file system that need to be archived?
- Do files have to be staged before archive copies can be made?
- Is the file system full?
- Are any of the drives/robot/media reporting problems?
- Run the command archiver -lv to check the /etc/opt/SUNWsamfs/archiver.cmd file for syntax errors or misconfiguration.
- Check the sam-log for a report of errors. The default location of the file is /var/adm/sam-log.
- Monitor via the samu 'a' display.
- Truss the sam-archiverd process.
- Monitor via the samu 'l' display for possible license errors.
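When checking the sam-log, it can help to filter the file down to archiver-related lines first. The following is a rough sketch only: it assumes that log lines mention the daemon name and that problems show up with words such as "error" or "fail". Both the tag and the keywords are assumptions to adjust against what your log actually contains, and the function name is illustrative.

```python
def matching_lines(lines, tag="sam-archiverd", keywords=("error", "fail")):
    """Return lines that mention the tag and at least one keyword.

    Matching is case-insensitive. The default tag and keywords are
    assumptions, not a documented sam-log format.
    """
    hits = []
    wanted_tag = tag.lower()
    wanted_words = [word.lower() for word in keywords]
    for line in lines:
        lowered = line.lower()
        if wanted_tag in lowered and any(word in lowered for word in wanted_words):
            hits.append(line.rstrip("\n"))
    return hits


def scan_log(path="/var/adm/sam-log", **options):
    """Apply matching_lines to a log file, tolerating odd bytes."""
    with open(path, "r", encoding="latin-1") as handle:
        return matching_lines(handle, **options)
```

The same filter can be pointed at /var/adm/messages, or at another daemon by passing a different tag, for example scan_log(tag="sam-stagerd").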
Troubleshooting the Releaser
- Are files being archived?
- Is the file system large enough to accommodate active files?
- Is associative staging set?
- Are the high/low watermarks set correctly?
- Are large files busy?
- Use the sfind command to find files.
- Enable releaser logging in /etc/opt/SUNWsamfs/releaser.cmd.
- Check samu displays:
- samcmd a
- samcmd m
- Check the archiver trace file /var/opt/SUNWsamfs/sam-archiverd
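The high/low watermark question is easier to reason about as simple hysteresis: the releaser starts freeing disk cache once usage reaches the high watermark and keeps going until usage falls to the low watermark. The sketch below models only that decision. The function name and the exact boundary comparisons are assumptions for illustration, not the releaser's actual implementation.

```python
def releaser_should_run(used_pct, high, low, currently_releasing):
    """Model the watermark decision as hysteresis.

    used_pct: current disk cache usage as a percentage.
    high/low: watermark settings as percentages.
    currently_releasing: True if a release pass is already in progress.
    """
    if not 0 <= low < high <= 100:
        raise ValueError("expected 0 <= low < high <= 100")
    if currently_releasing:
        # Keep releasing until usage has dropped to the low watermark.
        return used_pct > low
    # Start releasing only once usage reaches the high watermark.
    return used_pct >= high
```

With hypothetical settings of high=80 and low=70, a file system at 75% does not trigger the releaser, but a release pass already under way continues at 75% until usage reaches 70%. A low watermark set too close to the high one makes the releaser run often for little gain, while a high watermark set too high leaves little room before the file system fills.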
Troubleshooting the Stager
- Are the stager daemons running?
- Have archived copies been marked damaged?
- Is the VSN present in the robot?
- Can file be accessed using disaster recovery techniques?
- Are any of the drives/robot/media reporting problems?
- Has the VSN been drained of all archived copies?
- Is the archived copy associated with an orphan file?
- Is an archive copy marked damaged?
- Check the recycler log file for possible errors.
- Use the unrearch command.
- Is the stage queue large enough (maxactive in /etc/opt/SUNWsamfs/stager.cmd)?
Troubleshooting the File System
- What if the filesystem(s) become corrupt?
- What are the steps to recover your data?
- How do you restore files and/or directories?
- How do you correct damaged files?
File system tools:
- samfsck — file system checking/repair
- samncheck — find a filename for inode #
- samfsinfo — basic file system info
- samfsconfig — search for superblocks
- samtrace — file system tracing
Disaster Recovery
Recommended Disaster Recovery techniques:
- Obtain the position from the sls -D display.
- Use the archiver log to obtain the position.
- To retrieve the file, use the needed SAM-FS commands (for example: dd, star, recover.sh, restore.sh, tarback.sh, samfsdump, samfsrestore, qfsdump, qfsrestore).
- Retrieve the file(s) into a SAM-FS file system.
- Place the file in a temporary directory.
Additional Resources
- Check the official documentation at http://www.sun.com/lsc/documentation.html
- Installation & Configuration
- SAM-FS Administrator Guide
- Storage and Archive Management Guide
- Disaster Recovery Guide
- Check on-line man pages
- Search Sunsolve
Summary
In this article, we gave a brief history of SAM-FS/QFS, described the basic product components, and covered basic troubleshooting and available resources.