Sun Cluster 3.x cheat sheet

This cheat sheet covers common commands and information for Sun Cluster versions 3.0, 3.1, 3.2 and 3.3. Each entry lists the 3.0/3.1 command (sc* style) first and the 3.2/3.3 equivalent (object-oriented cl* style) second; entries marked "all versions" are unchanged across releases.
File Locations
man pages
  all versions: /usr/cluster/man
log files
  all versions: /var/cluster/logs and /var/adm/messages
sccheck logs
  all versions: /var/cluster/sccheck/report.<date>
cluster check logs
  3.0/3.1: N/A
  3.2/3.3: /var/cluster/logs/cluster_check/<date and time>/ (3.2U2)
CCR files
  3.0/3.1: /etc/cluster/ccr
  3.2/3.3: /etc/cluster/ccr/<zone name>
Cluster infrastructure file
  3.0/3.1: /etc/cluster/ccr/infrastructure
  3.2/3.3: /etc/cluster/ccr/global/infrastructure (3.2U2)
SCSI Reservations
Display reservation keys (all versions)
  scsi2: /usr/cluster/lib/sc/pgre -c pgre_inkeys -d /dev/did/rdsk/d4s2
  scsi3: /usr/cluster/lib/sc/scsi -c inkeys -d /dev/did/rdsk/d4s2
Determine the device owner (all versions)
  scsi2: /usr/cluster/lib/sc/pgre -c pgre_inresv -d /dev/did/rdsk/d4s2
  scsi3: /usr/cluster/lib/sc/scsi -c inresv -d /dev/did/rdsk/d4s2
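The two reservation utilities differ only in the binary and subcommand used for SCSI-2 versus SCSI-3. A minimal wrapper sketch (the helper name `show_keys_cmd` is hypothetical; it only prints the command so it can be reviewed before running on a real cluster node):

```shell
#!/bin/sh
# Hypothetical helper: print the reservation-key command for a DID device.
# MODE is scsi2 or scsi3; DEV is the raw DID device path.
show_keys_cmd() {
    MODE=$1
    DEV=$2
    case $MODE in
        scsi2) echo "/usr/cluster/lib/sc/pgre -c pgre_inkeys -d $DEV" ;;
        scsi3) echo "/usr/cluster/lib/sc/scsi -c inkeys -d $DEV" ;;
        *)     echo "usage: show_keys_cmd scsi2|scsi3 <device>" >&2; return 1 ;;
    esac
}

show_keys_cmd scsi3 /dev/did/rdsk/d4s2
# → /usr/cluster/lib/sc/scsi -c inkeys -d /dev/did/rdsk/d4s2
```

Pipe the printed command to a shell (or drop the echo) once the device path has been double-checked against scdidadm -L output.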
Cluster Information
Quorum info
  3.0/3.1: scstat -q
  3.2/3.3: clquorum show
Cluster components
  3.0/3.1: scstat -pv
  3.2/3.3: cluster show
Resource/resource group status
  3.0/3.1: scstat -g
  3.2/3.3: clrg show / clrs show
IP Network Multipathing status
  3.0/3.1: scstat -i
  3.2/3.3: clnode status -m
Status of all nodes
  3.0/3.1: scstat -n
  3.2/3.3: clnode show
Disk device groups
  3.0/3.1: scstat -D
  3.2/3.3: cldg show
Transport info
  3.0/3.1: scstat -W
  3.2/3.3: clintr show
Detailed resource/resource group info
  3.0/3.1: scrgadm -pv
  3.2/3.3: clrs show -v / clrg show -v
Cluster configuration info
  3.0/3.1: scconf -p
  3.2/3.3: cluster show -v
Installation info (prints packages and version)
  all versions: scinstall -pv
Cluster Configuration
Integrity check
  3.0/3.1: sccheck
  3.2/3.3: cluster check (3.2U2)
Configure the cluster (add nodes, add data services, etc.)
  all versions: scinstall
Cluster configuration utility (quorum, data services, resource groups, etc.)
  3.0/3.1: scsetup
  3.2/3.3: clsetup
Add a node
  3.0/3.1: scconf -a -T node=<host>
Remove a node
  3.0/3.1: scconf -r -T node=<host>
Prevent new nodes from entering
  3.0/3.1: scconf -a -T node=.
Put a node into maintenance state
  3.0/3.1: scconf -c -q node=<node>,maintstate
    Note: use scstat -q to verify that the node is in maintenance mode; the vote count should be zero for that node.
  3.2/3.3: clnode evacuate <node>
    Note: use clquorum status to verify that the node is in maintenance mode; the vote count should be zero for that node.
Get a node out of maintenance state
  3.0/3.1: scconf -c -q node=<node>,reset
    Note: use scstat -q to verify that the node is out of maintenance mode; the vote count should be one for that node.
  3.2/3.3: clquorum reset
    Note: use clquorum status to verify that the node is out of maintenance mode; the vote count should be one for that node.
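The maintenance-state entries above form a short round trip. This sketch strings the 3.0/3.1 commands together (the node name sc-node2 is hypothetical, and `run` only echoes each command so the sequence can be previewed off-cluster; drop the echo on a live cluster node):

```shell
#!/bin/sh
# Preview the 3.0/3.1 node maintenance-state sequence.
NODE=sc-node2            # hypothetical node name
run() { echo "# $*"; }   # echoes only; replace with real execution on a node

run scconf -c -q node=$NODE,maintstate   # put node in maintenance state
run scstat -q                            # verify: node vote count should be 0
run scconf -c -q node=$NODE,reset        # take node out of maintenance state
run scstat -q                            # verify: node vote count should be 1
```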
Admin Quorum Device
Quorum votes come from nodes, disk devices, and quorum servers, so the total quorum is all node and device votes added together.
Adding a device to the quorum
  3.0/3.1: scconf -a -q globaldev=d1
    Note: if you get the error message "unable to scrub device", use scgdevs to add the device to the global device namespace.
  3.2/3.3: clquorum add d1
    Note: if you get the error message "unable to scrub device", use cldevice populate to add the device to the global device namespace.
Removing a device from the quorum
  3.0/3.1: scconf -r -q globaldev=d1
  3.2/3.3: clquorum remove d1
Remove the last quorum device
  3.0/3.1:
    1. Evacuate all nodes
    2. Put the cluster into install (maintenance) mode
       # scconf -c -q installmode
    3. Remove the quorum device
       # scconf -r -q globaldev=d1
    4. Check the quorum devices
       # scstat -q
  3.2/3.3:
    1. cluster set -p installmode=enabled
    2. clquorum remove d1
    3. Check the quorum devices
       # clquorum show
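The 3.2/3.3 steps above can be previewed as one sequence (here `run` only echoes each command so the order can be reviewed off-cluster; remove the echo on a live node, and substitute your own device name for d1):

```shell
#!/bin/sh
# Preview the 3.2/3.3 sequence for removing the last quorum device.
run() { echo "# $*"; }   # echoes only; replace with real execution on a node

run cluster set -p installmode=enabled   # put the cluster into install mode
run clquorum remove d1                   # remove the final quorum device
run clquorum show                        # confirm no quorum devices remain
```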
Resetting quorum info
  3.0/3.1: scconf -c -q reset
  3.2/3.3: clquorum reset
  Note: this will bring all offline quorum devices online.
Bring a quorum device into maintenance mode
  3.0/3.1:
    1. Obtain the device number
       # scdidadm -L
    2. # scconf -c -q globaldev=<device>,maintstate
  3.2/3.3: clquorum disable <device>
Bring a quorum device out of maintenance mode
  3.0/3.1: scconf -c -q globaldev=<device>,reset
  3.2/3.3: clquorum enable <device>
Device Configuration
List all configured devices, including paths, across all nodes
  3.0/3.1: scdidadm -L
  3.2/3.3: cldevice list -v
List all configured devices, including paths, on this node only
  3.0/3.1: scdidadm -l
  3.2/3.3: cldevice list -v -n <nodename>
Reconfigure the device database, creating new instance numbers if required
  all versions: scdidadm -r
List all configured devices, including paths and fencing
  3.0/3.1: N/A
  3.2/3.3: cldevice show -v
Rename a DID instance
  3.0/3.1: N/A
  3.2/3.3: cldevice rename -d <destination device> <device>
Clear DID instances that are no longer used
  3.0/3.1: scdidadm -C
  3.2/3.3: cldevice clear
Perform the repair procedure for a particular path (use this when a disk has been replaced)
  3.0/3.1: scdidadm -R <c0t0d0s0> (by device) or scdidadm -R 2 (by device ID)
  3.2/3.3: cldevice repair <device>
Configure the global device namespace
  3.0/3.1: scgdevs
  3.2/3.3: cldevice populate
Status of all disk paths
  3.0/3.1: scdpm -p all:all (output format <host>:<disk>)
  3.2/3.3: cldevice status
Monitor a device path
  3.0/3.1: scdpm -m <node>:<disk path>
  3.2/3.3: cldevice monitor -n <node> <disk>
Unmonitor a device path
  3.0/3.1: scdpm -u <node>:<disk path>
  3.2/3.3: cldevice unmonitor -n <node> <disk>
Device group
Adding/registering
  3.0/3.1: scconf -a -D type=vxvm,name=appdg,nodelist=<host>:<host>,preferenced=true
  3.2/3.3: cldg create -t <devicegroup-type> -n <node> -d <device> <devicegroup>
Removing
  3.0/3.1: scconf -r -D name=<device group>
  3.2/3.3: cldg remove-node -t <devicegroup-type> -n <node> <devicegroup>
           cldg remove-device -d <device> <devicegroup>
Adding a single node
  3.0/3.1: scconf -a -D type=vxvm,name=appdg,nodelist=<host>
  3.2/3.3: cldg add-node -t <devicegroup-type> -n <node> <devicegroup>
Removing a single node
  3.0/3.1: scconf -r -D name=<device group>,nodelist=<host>
  3.2/3.3: cldg remove-node -t <devicegroup-type> -n <node> <devicegroup>
Switch
  3.0/3.1: scswitch -z -D <device group> -h <host>
  3.2/3.3: cldg switch -t <devicegroup-type> -n <host> <devicegroup>
Put into maintenance mode
  3.0/3.1: scswitch -m -D <device group>
  3.2/3.3: cldg disable -t <devicegroup-type> <devicegroup>
Take out of maintenance mode
  3.0/3.1: scswitch -z -D <device group> -h <host>
  3.2/3.3: cldg enable -t <devicegroup-type> <devicegroup>
Online a device group
  3.0/3.1: scswitch -z -D <device group> -h <host>
  3.2/3.3: cldg online -t <devicegroup-type> -n <node> <devicegroup>
Offline a device group
  3.0/3.1: scswitch -F -D <device group>
  3.2/3.3: cldg offline -t <devicegroup-type> <devicegroup>
Resync a device group
  3.0/3.1: scconf -c -D name=<device group>,sync
  3.2/3.3: cldg sync -t <devicegroup-type> <devicegroup>
Transport cable
Enable
  3.0/3.1: scconf -c -m endpoint=<host>:qfe1,state=enabled
  3.2/3.3: clintr enable <host>:<interface>
Disable
  3.0/3.1: scconf -c -m endpoint=<host>:qfe1,state=disabled
    Note: it gets deleted
  3.2/3.3: clintr disable <host>:<interface>
Resource Groups
Adding
  3.0/3.1: scrgadm -a -g <res_group> -h <host>,<host>
  3.2/3.3: clrg create -n <host>,<host> <res_group>
Removing
  3.0/3.1: scrgadm -r -g <res_group>
  3.2/3.3: clrg delete <res_group>
Changing properties
  3.0/3.1: scrgadm -c -g <res_group> -y <property=value>
  3.2/3.3: clrg set -p <name=value> <res_group>
Listing
  3.0/3.1: scstat -g
  3.2/3.3: clrg show
Detailed list
  3.0/3.1: scrgadm -pv -g <res_group>
  3.2/3.3: clrg show -v <res_group>
Display mode type (failover or scalable)
  3.0/3.1: scrgadm -pv -g <res_group> | grep 'Res Group mode'
  3.2/3.3: clrg show -v <res_group>
Offlining
  3.0/3.1: scswitch -F -g <res_group>
  3.2/3.3: clrg offline <res_group>
Onlining
  3.0/3.1: scswitch -Z -g <res_group>
  3.2/3.3: clrg online <res_group>
Unmanaging
  3.0/3.1: scswitch -u -g <res_group>
  3.2/3.3: clrg unmanage <res_group>
  Note: all resources in the group must be disabled first.
Managing
  3.0/3.1: scswitch -o -g <res_group>
  3.2/3.3: clrg manage <res_group>
Suspending
  3.0/3.1: N/A
  3.2/3.3: clrg suspend <res_group>
Resuming
  3.0/3.1: N/A
  3.2/3.3: clrg resume <res_group>
Switching
  3.0/3.1: scswitch -z -g <res_group> -h <host>
  3.2/3.3: clrg switch -n <host> <res_group>
Resources
Adding a failover network resource
  3.0/3.1: scrgadm -a -L -g <res_group> -l <logicalhost>
  3.2/3.3: clreslogicalhostname create -g <res_group>
Adding a shared network resource
  3.0/3.1: scrgadm -a -S -g <res_group> -l <logicalhost>
  3.2/3.3: clressharedaddress create -g <res_group>
Adding a failover Apache application and attaching the network resource
  3.0/3.1: scrgadm -a -j apache_res -g <res_group> \
             -t SUNW.apache -y Network_resources_used=<logicalhost> \
             -y Scalable=False -y Port_list=80/tcp \
             -x Bin_dir=/usr/apache/bin
Adding a shared (scalable) Apache application and attaching the network resource
  3.0/3.1: scrgadm -a -j apache_res -g <res_group> \
             -t SUNW.apache -y Network_resources_used=<logicalhost> \
             -y Scalable=True -y Port_list=80/tcp \
             -x Bin_dir=/usr/apache/bin
Create an HAStoragePlus failover resource
  3.0/3.1: scrgadm -a -g <res_group> -j <resource> -t SUNW.HAStoragePlus \
             -x FileSystemMountPoints=/oracle/data01 -x AffinityOn=true
  3.2/3.3: clresource create -g <res_group> -t SUNW.HAStoragePlus \
             -p FileSystemMountPoints=/test2 -p AffinityOn=true <resource>
Removing
  3.0/3.1: scrgadm -r -j <resource>
  3.2/3.3: clresource delete <resource>
  Note: the resource must be disabled first.
Changing properties
  3.0/3.1: scrgadm -c -j <resource> -y <property=value>
  3.2/3.3: clresource set -p <property=value> <resource>
List
  3.0/3.1: scstat -g
  3.2/3.3: clresource list
Detailed list
  3.0/3.1: scrgadm -pv -j <resource> or scrgadm -pvv -j <resource>
  3.2/3.3: clresource list -v
Disable resource monitor
  3.0/3.1: scrgadm -n -M -j <resource>
  3.2/3.3: clresource unmonitor <resource>
Enable resource monitor
  3.0/3.1: scrgadm -e -M -j <resource>
  3.2/3.3: clresource monitor <resource>
Disabling
  3.0/3.1: scswitch -n -j <resource>
  3.2/3.3: clresource disable <resource>
Enabling
  3.0/3.1: scswitch -e -j <resource>
  3.2/3.3: clresource enable <resource>
Clearing a failed resource
  3.0/3.1: scswitch -c -h <host>,<host> -j <resource> -f STOP_FAILED
  3.2/3.3: clrs clear -f STOP_FAILED <resource>
Find the network of a resource
  all versions: scrgadm -pvv -j <resource> | grep -i network
Removing a resource and resource group
  3.0/3.1:
    1. Offline the group
       # scswitch -F -g <res_group>
    2. Remove the resource
       # scrgadm -r -j <resource>
    3. Remove the resource group
       # scrgadm -r -g <res_group>
  3.2/3.3:
    1. clrg offline <res_group>
    2. clrs delete <resource>
    3. clrg delete <res_group>
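The 3.2/3.3 teardown above can be previewed as one sequence. The group and resource names app-rg and app-rs are hypothetical, a disable step is included since resources must be disabled before deletion (per the Removing entry above), and `run` only echoes each command so the order can be reviewed off-cluster:

```shell
#!/bin/sh
# Preview the 3.2/3.3 resource and resource-group teardown order.
RG=app-rg                # hypothetical resource group name
RS=app-rs                # hypothetical resource name
run() { echo "# $*"; }   # echoes only; replace with real execution on a node

run clrg offline $RG     # take the group offline first
run clrs disable $RS     # resources must be disabled before deletion
run clrs delete $RS      # remove the resource
run clrg delete $RG      # then remove the now-empty group
```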
Resource Types
Adding
  3.0/3.1: scrgadm -a -t <resource type> (e.g. SUNW.HAStoragePlus)
  3.2/3.3: clrt register <resource type>
Deleting
  3.0/3.1: scrgadm -r -t <resource type>
  3.2/3.3: clrt unregister <resource type>
  Note: first set the RT_SYSTEM property on the resource type to false.
Listing
  3.0/3.1: scrgadm -pv | grep 'Res Type name'
  3.2/3.3: clrt list