Solaris FMA Demo Kit

What is the Solaris FMA Demo Kit?

The Solaris FMA (Fault Management Architecture) Demo Kit is a set of Perl and Korn shell scripts that implement an automated harness for running FMA demos. The kit also includes example demos that show how Solaris handles and diagnoses CPU, memory, and PCI I/O errors. It is designed to run on stock Solaris systems out of the box: no custom error-injection hardware or drivers are required.

Installing the Solaris FMA Demo Kit

The Solaris FMA Demo Kit is currently available as a downloadable tarball here. Extract the tarball anywhere on the system on which you wish to run the demo. Once extracted, the demo kit occupies less than 1 MB of disk space.
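The extraction step can be sketched as a small shell function. The tarball name and destination directory are placeholders, since this page does not fix either, and the sketch assumes an uncompressed tar archive; `extract_kit` is a hypothetical helper, not part of the kit.

```shell
# Sketch: unpack the demo kit into a directory of your choice.
# Both arguments are placeholders; pass an absolute path for the tarball.
extract_kit() {
    tarball=$1
    destdir=$2
    mkdir -p "$destdir" || return 1
    # cd inside a subshell so the caller's working directory is unchanged.
    (cd "$destdir" && tar xf "$tarball") || return 1
    # Report the extracted size in kilobytes.
    du -sk "$destdir"
}
```

If the download turns out to be compressed, decompress it first (for example with gzip -dc) before handing it to tar.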

Running the Solaris FMA Demo Kit

While the Solaris FMA Demo Kit is designed to run out of the box on a wide range of SPARC and x86 platforms, not all diagnosis capabilities are supported on all platforms. When a demo is not supported on the current platform, it exits with an appropriate error message. The following table shows which demos are expected to work on which platforms:

Demo Type    Supported Platforms
cpu
  • All Opteron/Athlon64/Turion64 based platforms
  • All UltraSPARC-III/IIIi/IIIplus/IV/IVplus based platforms
  • All UltraSPARC-T1/T2/T2plus based platforms
mem
  • All Opteron/Athlon64/Turion64 based platforms
  • All UltraSPARC-III/IIIi/IIIplus/IV/IVplus based platforms
  • All UltraSPARC-T1/T2/T2plus based platforms
pci
  • All PCI-based x86 platforms
  • All PCI-based UltraSPARC platforms

This demo is designed to be run under the X Window System and must be run as root. To run the demo, do the following:

  1. ssh to the system on which to run the demo.
     NOTE: You must configure ssh to forward X11 connections. You can enable this temporarily by passing the -X option to ssh, or permanently by adding the following entry to ~/.ssh/config:

     ForwardX11 yes

  2. su to root.
     NOTE: Do not pass the - option to su(1M). Passing the - option discards the environment variables that ssh needs to forward X11.

  3. Run the demo:

     # ./run_fmdemo -d demo_dir

     where demo_dir can be cpu, mem, or pci.
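The preconditions in the steps above (root privileges, a forwarded X11 display, and a valid demo directory) can be checked before starting a demo. The following is a minimal sketch; `fmdemo_preflight` is a hypothetical helper and not part of the Demo Kit, and it takes the user ID and DISPLAY value as arguments so that it can be exercised without root.

```shell
# Sketch: verify the preconditions for running a demo.
# fmdemo_preflight is a hypothetical helper, not part of the Demo Kit.
#   $1 = numeric user ID, $2 = value of DISPLAY, $3 = demo directory
fmdemo_preflight() {
    uid=$1
    display=$2
    demo=$3
    if [ "$uid" != "0" ]; then
        echo "must be run as root" >&2
        return 1
    fi
    if [ -z "$display" ]; then
        echo "DISPLAY is not set; check X11 forwarding and that su was run without '-'" >&2
        return 1
    fi
    case "$demo" in
        cpu|mem|pci) ;;
        *) echo "demo_dir must be cpu, mem, or pci" >&2; return 1 ;;
    esac
    # All checks passed: print the command to run.
    echo "./run_fmdemo -d $demo"
}
```

A call might look like `fmdemo_preflight "$(id -u)" "$DISPLAY" cpu`. On Solaris 10 the stock /usr/bin/id may not accept -u; using /usr/xpg4/bin/id instead is an assumption to verify on your system.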

Upon startup, the demo harness will pop up the following four xterm windows:

  • Error Log Monitor: This window monitors the Solaris Fault Manager's error log. As hardened drivers detect errors, the telemetry is captured in special events called ereports, which are then dispatched to automated diagnosis engines for analysis.
  • Fault Log Monitor: This window monitors the Solaris Fault Manager's fault log. When a diagnosis engine determines, based on incoming error telemetry, that a fault has occurred, it produces a special event called a suspect list, which lists one or more components that are believed to be faulty.
  • Resource Cache Monitor: This window monitors the resource cache, which holds the state of hardware resources that have been the subject of a fault diagnosis. The contents of the cache are persistent and, among other things, allow the fault manager to re-offline ASRUs (Automated System Reconfiguration Units) after a system restart.
  • Console Log Monitor: This window monitors messages as they are logged to the console. When the Solaris Fault Manager diagnoses a fault, it prints a localized message to the console. The message summarizes the fault condition and provides a URL to a knowledge article on the Predictive Self-Healing Knowledge Web with detailed instructions on how to resolve the fault.
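The same information can usually be inspected by hand with the standard Solaris fault management commands fmdump(1M) and fmadm(1M). The mapping below is a sketch under that assumption: this page does not state which commands the harness's windows actually run, the /var/adm/messages path is an assumed location for logged console messages, and `monitor_command` is a hypothetical helper.

```shell
# Sketch: map each monitor window to a command that shows similar data.
# This is an assumed mapping, not necessarily what the demo harness runs.
monitor_command() {
    case "$1" in
        error)    echo "fmdump -e" ;;                  # error log (ereports)
        fault)    echo "fmdump" ;;                     # fault log (diagnoses)
        resource) echo "fmadm faulty" ;;               # resources believed to be faulty
        console)  echo "tail -f /var/adm/messages" ;;  # assumed console message log
        *)        echo "unknown monitor: $1" >&2; return 1 ;;
    esac
}
```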

Advanced Options

The Solaris FMA Demo Kit is designed to be safe and completely non-disruptive when run with the default options. This section describes command-line options that can be passed to run_fmdemo for a more authentic demonstration. Enabling these options can make the demo disruptive, so use them at your own risk.

Allowing Destructive Actions

Under normal operating conditions, retire agents may respond to fault events by offlining faulty components, if possible. By default, these actions are disabled in the demo. To allow the retire agents to operate normally, pass the -w option to run_fmdemo. When this option is set, the demo pops up a fifth window that monitors the processor states.
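Processor states can also be checked by hand with psrinfo(1M). The sketch below counts processors that are not on-line; it assumes the usual psrinfo layout (processor ID in the first column, state in the second), and `count_not_online` is a hypothetical helper. Whether the fifth window itself uses psrinfo is not stated on this page.

```shell
# Sketch: count processors whose state is anything other than "on-line".
# Reads psrinfo-style lines on standard input, e.g.:
#   0       on-line   since 10/29/2009 00:28:01
count_not_online() {
    awk '$2 != "on-line" { n++ } END { print n + 0 }'
}
```

Typical use would be `psrinfo | count_not_online`, run before and after a cpu demo with -w to see whether a processor was taken offline.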

Running on the Live System

By default, the demos are run in an isolated chroot'd environment using a separate instance of the Solaris Fault Manager. To run the demo on the live system, pass the -L option to run_fmdemo.
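Because -w and -L can be disruptive, a wrapper can require explicit opt-in before adding them to the command line. This is a sketch only: `fmdemo_advanced_cmd` and the `FMDEMO_ALLOW_DISRUPTIVE` variable are hypothetical, and the sketch assumes the options can be combined with -d on one command line, which this page implies but does not state.

```shell
# Sketch: build a run_fmdemo command line, refusing -w and -L unless
# the caller has opted in. FMDEMO_ALLOW_DISRUPTIVE is a hypothetical
# variable used only by this sketch.
fmdemo_advanced_cmd() {
    demo=$1
    shift
    cmd="./run_fmdemo -d $demo"
    for opt in "$@"; do
        case "$opt" in
            -w|-L)
                if [ "${FMDEMO_ALLOW_DISRUPTIVE:-no}" != "yes" ]; then
                    echo "refusing $opt: set FMDEMO_ALLOW_DISRUPTIVE=yes to accept the risk" >&2
                    return 1
                fi
                cmd="$cmd $opt"
                ;;
            *)
                echo "unsupported option: $opt" >&2
                return 1
                ;;
        esac
    done
    # Print the assembled command instead of running it.
    echo "$cmd"
}
```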

Known Issues

The Solaris FMA Demo Kit is susceptible to the following bug:

6451677 lock usage error detected in libtopo rtld_init()

As a result, the demos must be run on Solaris Nevada build 51 or later. The demo will also run on Solaris 10 08/07 (Update 4).

UltraSPARC-T1/T2 demos are supported only on Solaris Nevada build 58 or later. On Solaris 10 systems, the cpu demos require patch 125369-05 or later, and the mem demos require Solaris 10 Update 4.
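The Nevada build requirements can be checked with a short function. The sketch assumes the "snv_NN" version string that `uname -v` reports on Nevada builds; `nevada_build_at_least` is a hypothetical helper, and it does not cover the Solaris 10 patch or update checks.

```shell
# Sketch: succeed if a Nevada build string meets a minimum build number.
#   $1 = version string (e.g. from `uname -v`), $2 = minimum build
# Returns 2 if the string is not in the assumed "snv_NN" form.
nevada_build_at_least() {
    version=$1
    minimum=$2
    case "$version" in
        snv_*) ;;
        *) return 2 ;;   # not a Nevada build string
    esac
    # Keep only the leading digits after "snv_" (drops suffixes such as "b").
    build=$(echo "$version" | sed -e 's/^snv_//' -e 's/[^0-9].*$//')
    [ -n "$build" ] && [ "$build" -ge "$minimum" ]
}
```

For example, `nevada_build_at_least "$(uname -v)" 51` checks the general requirement, and a minimum of 58 checks the UltraSPARC-T1/T2 requirement.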

Questions and Comments

Questions and comments regarding the Solaris FMA Demo Kit can be sent to the fm-discuss discussion group.


Copyright 2008 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, California 95054, U.S.A. All rights reserved.
U.S. Government Rights - Commercial software. Government users are subject to the Sun Microsystems, Inc. standard license agreement and applicable provisions of the FAR and its supplements.
Use is subject to license terms.
Sun, Sun Microsystems, the Sun logo, Solaris and Solaris FMA Demo Kit are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and other countries.
All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and other countries. Products bearing SPARC trademarks are based upon architecture developed by Sun Microsystems, Inc.

last modified by alta on 2009/10/29 00:28