Repairing a Corrupt SMF Repository

When the repository daemon (svc.configd) starts, it performs an integrity check of the repository database. If this check fails, the system cannot boot until you restore a working database.

This article describes how to replace a corrupt repository with a backup copy or, as a last resort, with the default (seed) repository.

The most common error seen is:

svcs: Could not bind to repository server: repository server unavailable. Exiting.

This error indicates that your SMF repository might be corrupt. The repository can become corrupted for any of the following reasons:

  • Disk failure
  • Hardware or software bug
  • Accidental overwrite of the file

If the integrity check fails, the svc.configd daemon writes a message to the console similar to the following:

svc.configd: smf(5) database integrity check of:

    /etc/svc/repository.db

  failed. The database might be damaged or a media error might have
  prevented it from being verified.  Additional information useful to
  your service provider is in:

    /etc/svc/volatile/db_errors

  The system will not be able to boot until you have restored a working
  database.  svc.startd(1M) will provide a sulogin(1M) prompt for recovery
  purposes.  The command:

    /lib/svc/bin/restore_repository

  can be run to restore a backup version of your repository.  See
  http://sun.com/msg/SMF-8000-MY for more information.

The svc.startd daemon then exits and starts sulogin to enable you to perform maintenance. sulogin then prompts for the root password, which lets the root user enter system maintenance mode to repair the system.
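
Once you are at the maintenance prompt, it can be useful to look at the error file named in the console message before restoring anything. The following is a minimal sketch, not part of SMF: show_db_errors is a hypothetical helper name, and the syntax assumes a POSIX-style shell such as ksh or bash (the legacy /sbin/sh may not accept it).

```shell
# Hypothetical helper: print the integrity-check error file if it has content.
# The default path is the one quoted in the svc.configd console message.
show_db_errors() {
    errfile=${1:-/etc/svc/volatile/db_errors}
    if [ -s "$errfile" ]; then
        # File exists and is non-empty: show what the check reported.
        cat "$errfile"
    else
        echo "No integrity-check errors recorded in $errfile"
    fi
}
```

Calling show_db_errors with no argument reads the default path; passing a path lets you inspect a saved copy such as one of the repository.db_old_*_errors files.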

Whatever caused the corruption, SMF keeps backups of the repository, and you can restore one by running the following command:

/lib/svc/bin/restore_repository

The script presents a menu asking which restore point you want to go back to. The default is boot, which restores the most recent boot backup; press ENTER to accept it, or type the name of a specific backup. The following session lists all of the restore points and explains each option; here a specific backup name is entered.

root@t5-8:~# /lib/svc/bin/restore_repository
See http://sun.com/msg/SMF-8000-MY for more information on the use of
this script to restore backup copies of the smf(5) repository.

If there are any problems which need human intervention, this script will
give instructions and then exit back to your shell. 


Note that upon full completion of this script, the system will be
rebooted using reboot(1M), which will interrupt any active services.

The following backups of /etc/svc/repository.db exist, from
oldest to newest:

manifest_import-20070611_114818
manifest_import-20090124_172757
manifest_import-20100804_101919
boot-20100804_131912
boot-20120501_133156
boot-20120517_121216
manifest_import-20120517_121218
boot-20120601_090313

The backups are named based on their type and the time what they were taken.
Backups beginning with "boot" are made before the first change is made to
the repository after system boot.  Backups beginning with "manifest_import"
are made after svc:/system/manifest-import:default finishes its processing.
The time of backup is given in YYYYMMDD_HHMMSS format.

Please enter either a specific backup repository from the above list to
restore it, or one of the following choices:

CHOICE            ACTION
----------------  ----------------------------------------------
boot              restore the most recent post-boot backup
manifest_import   restore the most recent manifest_import backup
-seed-            restore the initial starting repository  (All
                  customizations will be lost, including those
                  made by the install/upgrade process.)
-quit-            cancel script and quit

Enter response [boot]: boot-20120601_090313
After confirmation, the following steps will be taken:
svc.startd(1M) and svc.configd(1M) will be quiesced, if running.
/etc/svc/repository.db
     -- renamed --> /etc/svc/repository.db_old_20120818_034915
/etc/svc/volatile/db_errors
     -- copied --> /etc/svc/repository.db_old_20120818_034915_errors
/etc/svc/repository-boot-20120601_090313
     -- copied --> /etc/svc/repository.db
and the system will be rebooted with reboot(1M).

Proceed [yes/no]? yes

Quiescing svc.startd(1M) and svc.configd(1M): done.
/etc/svc/repository.db
     -- renamed --> /etc/svc/repository.db_old_20120818_034915
/etc/svc/volatile/db_errors
     -- copied --> /etc/svc/repository.db_old_20120818_034915_errors
/etc/svc/repository-boot-20120601_090313
     -- copied --> /etc/svc/repository.db

The backup repository has been successfully restored.
Rebooting in 5 seconds.
syncing file systems... done
rebooting...
Resetting ... 

Rebooting with command: boot
Boot device: /pci@9,600000/SUNW,qlc@2/fp@0,0/disk@w21000004cf96cd2b,0:a File and args:
SunOS Release 5.10 Version Generic_144488-14 64-bit
Copyright (c) 1983, 2011, Oracle and/or its affiliates. All rights reserved.
Hostname: t5-8
Loading smf(5) service descriptions: 2/2
Reading ZFS config: done.
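
The backup names in the session above follow the pattern type-YYYYMMDD_HHMMSS, and the script copies them from files named /etc/svc/repository-<name>. The sketch below lists such files and decodes a name into its type and a readable timestamp. The function names list_backups and describe_backup are hypothetical, and the code assumes a POSIX-style shell such as ksh or bash rather than the legacy Bourne shell.

```shell
# Hypothetical helper: list backup names found in a directory
# (default /etc/svc), with the "repository-" prefix stripped.
list_backups() {
    dir=${1:-/etc/svc}
    for f in "$dir"/repository-*-[0-9]*_[0-9]*; do
        [ -e "$f" ] || continue        # the glob matched nothing
        base=${f##*/}                  # drop the directory part
        echo "${base#repository-}"
    done
}

# Hypothetical helper: turn "boot-20120601_090313" into
# "boot 2012-06-01 09:03:13".
describe_backup() {
    name=$1
    kind=${name%-*}                    # text before the last "-"
    stamp=${name##*-}                  # YYYYMMDD_HHMMSS
    when=$(printf '%s\n' "$stamp" |
        sed 's/^\(....\)\(..\)\(..\)_\(..\)\(..\)\(..\)$/\1-\2-\3 \4:\5:\6/')
    printf '%s %s\n' "$kind" "$when"
}
```

For example, describe_backup manifest_import-20120517_121218 separates the manifest_import type from the time the backup was taken, which makes it easier to pick a restore point that predates the problem.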

Warning: The system automatically reboots after the restore_repository command executes all of the listed actions.

If restore_repository complains that the root file system is mounted read-only, remount root read-write and then rerun the command:

# mount -m -o remount,rw /

The options used with mount:

-m: Mount the file system without making an entry in /etc/mnttab.

-o remount,rw: Remount an already mounted file system with read-write access.
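
To avoid remounting when it is not needed, you can first probe whether the file system is writable. The sketch below does this by trying to create a small file; can_write is a hypothetical helper name, the probe file name is arbitrary, and a POSIX-style shell such as ksh or bash is assumed.

```shell
# Hypothetical helper: return 0 if a file can be created in the given
# directory, 1 otherwise (read-only file system, missing directory, ...).
can_write() {
    probe="$1/.rw_probe_$$"
    # Run the redirection in a subshell so that a failure is only
    # reported through the exit status, with the error text discarded.
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}
```

At the maintenance prompt this could be combined with the command above as: can_write /etc/svc || mount -m -o remount,rw /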