Open HA Cluster 2009.06 Release Notes

These release notes contain additions, changes, and corrections to information in the Open HA Cluster Installation Guide and Sun Cluster 3.2 documentation as they apply to OpenSolaris and Open HA Cluster software. Information in the Sun Cluster 3.2 documentation set is valid with code that is provided in Open HA Cluster software, unless otherwise stated here.

Support Changes

OpenSolaris - Open HA Cluster 2009.06 software runs only on OpenSolaris 2009.06 (build 111) software.

Platforms - Open HA Cluster 2009.06 software runs on either SPARC based platforms or x86 based platforms. All nodes in a cluster must run on the same platform.

Data Services - The following Sun Cluster data services are fully supported in an Open HA Cluster 2009.06 configuration:

  • HA-Apache
  • HA-Apache Tomcat
  • HA-Containers
  • HA-DHCP
  • HA-DNS
  • HA-Glassfish (but not HA-Sun Java Application Server)
  • HA-Kerberos
  • HA-MySQL
  • HA-NFS
  • HA-Samba

New Features

 Open HA Cluster 2009.06 software introduces the following new features.

Virtual Private Interconnect

 As an alternative to using physical cables between cluster nodes to handle private-network communication, you can now instead configure virtual network interfaces (VNICs) for this purpose. You create VNICs over public-network adapters to transmit the private-network traffic. This configuration also creates an abstraction of having physical adapters handling the public-network traffic to the rest of the cluster. You can optionally configure IPsec for the private-interconnect interface to provide secure TCP/IP communication.

COMSTAR based iSCSI Storage

 As an alternative to sharing storage between cluster nodes, you can configure a failover ZFS file system on local storage using COMSTAR.

Restrictions and Requirements

Feature restrictions - The following features that are normally available in the Sun Cluster software product are not available in an Open HA Cluster 2009.06 configuration:

  • Campus clusters and replication
  • Closed-source binaries, such as HA-Oracle, Oracle RAC, and HA-Sybase
  • Cluster Control Panel including cconsole (use the pconsole IPS package instead)
  • Geographic Edition
  • NAS quorum devices
  • scsnapshot
  • Solaris Containers non-global zones
  • Sun Cluster Manager GUI, including the data-service wizards (use the clsetup utility or the command line instead)
  • Sun Cluster module to Sun Management Center
  • Sun Storage Archive Manager (SAM-QFS)
  • Telemetry attributes, or system resource monitoring
  • Upgrade to future releases
  • Veritas Volume Manager and Veritas File System

Supported hardware - The following servers and storage arrays are supported in an Open HA Cluster 2009.06 configuration:

  • SPARC based platforms
    • Sun SPARC Enterprise M3000 Server
    • Sun SPARC Enterprise T5120 Server
  • x86 based platforms
    • Sun Fire x4140 Server
    • Sun Fire x4170 Server
  • Storage arrays
    • StorageTek 2540 Array
    • Sun Storage J4400 Array (SAS)

Memory Requirements - Open HA Cluster 2009.06 software has the following memory and disk-space requirements for every cluster node:

  • Minimum of 1 Gbyte of physical RAM (2 Gbytes typical)
  • Minimum of 6 Gbytes of available hard drive space

Actual physical memory and hard drive requirements are determined by the applications that are installed. Consult the application's documentation or contact the application vendor to calculate additional memory and hard drive requirements.

Upgrade of OpenSolaris - A system that is installed with Open HA Cluster software cannot be upgraded to another version of OpenSolaris software while the Open HA Cluster software is still installed. Specifically, you cannot switch the opensolaris.org publisher to pkg.opensolaris.org/dev/ and run the pkg image-update command. This is because the cluster packages are tied to a specific OpenSolaris version. To switch to pkg.opensolaris.org/dev, you must uninstall Open HA Cluster software first.

 This restriction does not apply to SRUs. You can successfully apply OpenSolaris SRUs without first uninstalling Open HA Cluster software.

Compatibility Issues

 This section describes Open HA Cluster 2009.06 compatibility issues.

iSCSI Initiator Is Not Generating Events When Target Device Is Reconnected (6822867)

Problem Summary: The iSCSI initiator code generates events on the first connection to the target device. When the target device node is dead, access from the initiator fails with ENOENT. When the target device node becomes available again, access succeeds, but the iSCSI initiator code does not generate an event to report that the device is available. Without these events, ZFS cannot automatically detect the device, and manual intervention is required to inform ZFS that the device is online.

Workaround: Issue an explicit online request for the device.

phys-schost# **zpool online //iscsi_pool_name// //rejoiner_iscsi_device//**

 For example, the following shows the status where ZFS has not detected the device c7t31d0s0, which is not accessible:

phys-schost# **zpool status**
  pool: ha_iscsi_pool
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: none requested
config:

        NAME                     STATE     READ WRITE CKSUM
        ha_iscsi_pool            DEGRADED     0     0     0
          mirror                 DEGRADED     0     0     0
            1930536541678814621  UNAVAIL      0     0     0  was /dev/dsk/c7t31d0s0
            c7t32d0              ONLINE       0     0     0

 The following command would explicitly request that the device c7t31d0s0 be made accessible by ZFS:

phys-schost# **zpool online ha_iscsi_pool c7t31d0s0**
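
 The check-then-online sequence above can be sketched as a small filter. This is a minimal sketch and not part of Open HA Cluster: suggest_zpool_online is a hypothetical helper name, and it assumes zpool status output in the layout shown above, where an unavailable device is listed with a trailing "was /dev/dsk/..." note. It only prints the suggested commands so that you can review them before running anything.

```shell
#!/bin/sh
# Sketch only: suggest_zpool_online is a hypothetical helper, not a ZFS or
# Open HA Cluster command. It reads `zpool status` output on stdin and prints
# one `zpool online` command per device that is reported as UNAVAIL.
# Assumption: the UNAVAIL line ends with "was /dev/dsk/<device>".
suggest_zpool_online() {
    awk '
        # Remember the pool that the following config section belongs to.
        $1 == "pool:" { pool = $2 }
        # An unavailable device line: <guid> UNAVAIL ... was /dev/dsk/<device>
        $2 == "UNAVAIL" && $(NF-1) == "was" {
            n = split($NF, parts, "/")
            print "zpool online " pool " " parts[n]
        }
    '
}
```

 A possible use is to pipe the live status into the helper, for example zpool status ha_iscsi_pool | suggest_zpool_online, and then run the printed command after checking it. Devices that are UNAVAIL but listed without a "was" note are not covered by this sketch.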

cacaocsc Core Dump Under IPS Packaging (6754753)

Problem Summary: Problems with common agent container (cacaocsc) installation might cause a core dump if a command that relies on cacaocsc is issued.

Workaround: Check whether a fix to OpenSolaris 2009.06 for this bug is available. If not, do not run the following commands:

  • cluster check or cluster list-checks (However, if the node is in noncluster mode, it is safe to use these commands.)
  • sceventmib (Note that this feature is not officially supported in this release.)
  • sctelemetry (Note that this feature is not officially supported in this release.)

Known Issues and Workarounds

The following entries describe workarounds to unresolved Open HA Cluster 2009.06 bugs.

Cannot Boot a Node After Running clnode remove on x86 (6844029)

Problem Summary: After removing an x86 based node from a cluster by using the clnode remove -F command, the node cannot be rebooted. The reboot operation fails with the following errors:

Error: /etc/cluster/ccr/global/infrastructure does not exist
UNRECOVERABLE ERROR: /etc/cluster/ccr/infrastructure file is corrupted
Please reboot in noncluster mode(boot -x) and Repair
Press any key to reboot.

Workaround: Reboot the node in noncluster mode by using the GRUB menu to add -x to the kernel boot parameter command. See Step 4 in "How to Uninstall Open HA Cluster Software" in the Open HA Cluster Installation Guide for instructions to boot an x86 machine into noncluster mode.

 This change does not persist over the system boot. The next time you need to reboot the node, it will attempt to boot into cluster mode unless you again add the -x option to the kernel boot parameter command.

Node Panicked at Install Time Due to Wrong netmask When Non-Default Network Is Selected (6842823)

Problem Summary: If during cluster establishment a private IP address other than the default address is specified, an invalid netmask is configured. This causes the node to panic.

Workaround: Use the default IP address for the private interconnect. If you cannot use the default IP address, choose an IP address whose third byte is a number from 0 to 191. For example, the IP address 192.168.191.0 would be acceptable, but 192.168.208.0 would not be acceptable and would trigger this bug.
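
 The third-byte rule can be checked before you start cluster establishment. The following is a minimal sketch under stated assumptions: private_net_ok is a hypothetical helper name, it expects a dotted-decimal IPv4 address, and it tests only the 0 to 191 range described in this workaround, nothing else about the address.

```shell
#!/bin/sh
# Sketch only: private_net_ok is a hypothetical helper, not part of
# Open HA Cluster. It returns 0 when the third byte of the given
# dotted-decimal address is from 0 to 191, 1 when it is larger,
# and 2 when the third byte is missing or not a number.
private_net_ok() {
    third=$(printf '%s\n' "$1" | cut -d. -f3)
    case $third in
        ''|*[!0-9]*) return 2 ;;   # not a plain decimal number
    esac
    [ "$third" -le 191 ]
}
```

 For example, private_net_ok 192.168.191.0 succeeds, while private_net_ok 192.168.208.0 fails, which matches the two addresses given in the workaround.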

Part of Localization Messages Are Not Translated (6842548)

Problem Summary: The Open HA Cluster 2009.06 release uses localized user-visible text that was translated for the Sun Cluster 3.2 1/09 release. Some new or modified text for the Open HA Cluster release is not localized and is therefore available only in English. In some cases, displayed text might be a mixture of translated text and English text.

Workaround: To view all text in English, set the LC_MESSAGES environment variable to C.

 In sh, ksh, bash:

# **LC_MESSAGES=C; export LC_MESSAGES**

 In csh:

# **setenv LC_MESSAGES C**
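
 If you prefer not to change the setting for the whole session, the variable can be set for a single command instead. The following is a minimal sketch: run_in_english is a hypothetical wrapper name, and it relies only on the standard env utility. Note that if LC_ALL is set in your environment, it takes precedence over LC_MESSAGES.

```shell
#!/bin/sh
# Sketch only: run_in_english is a hypothetical wrapper, not part of
# Open HA Cluster. It runs one command with LC_MESSAGES=C and leaves
# the locale of the calling shell unchanged.
run_in_english() {
    env LC_MESSAGES=C "$@"
}

# Example (not executed here): run_in_english cluster status
```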

scinstall Fails on rsh Following "clnode remove" of All Nodes of the Cluster (6841982)

Problem Summary: After the clnode remove command is run on all installed nodes of a cluster and the software is reinstalled, the /etc/cluster/remoteconfiguration file does not get reinstalled. The absence of this file prevents the scinstall utility from being able to remotely access other nodes to configure them as cluster nodes.

Workaround: Before you start the scinstall utility, run the following command on each node where you previously ran the clnode remove command:

phys-schost# **touch /etc/cluster/remoteconfiguration**

HAStoragePlus Unable to Import zpool After Pool Is Re-Created (6836718)

Problem Summary: When a ZFS storage pool is destroyed and re-created with the same name but with different devices, HAStoragePlus might fail to bring online a resource that contains that ZFS pool. The following failure message is displayed:


The pool //pool_name// failed to import using cachefile //pool_cachefile_location//

Workaround: Use the following command to delete the CCR cache file, then retry the operation:

# **/usr/cluster/lib/sc/hasp_util delete_pool_ccr_cachefile -f ** //pool_name//

cfgchk Core Dumps With cluster check (6835179)

Problem Summary: Attempts to run the cluster check or cluster list-checks command fail and dump core. This problem is attributable to OpenSolaris CR 6754753, cacaocsc core dump under IPS packaging.

Workaround: There is no workaround. When CR 6754753 is fixed, apply the SRU that contains the code fix.

ha-cluster-agent-builder Pulls In Too Many depend Packages (6835118)

Problem Summary: When the Agent Builder ha-cluster-agent-builder package group is installed on a non-cluster machine, many of the SMF services that are started on the machine expect the machine to be a cluster node. Error messages such as the following are displayed:

NOTICE: Can't open /etc/cluster/nodeid
NOTICE: BOOTING IN NON CLUSTER MODE

Workaround: The error messages are safe to ignore. Alternatively, install the Agent Builder group package on a cluster node to use as a development environment.

colorado: Prevent Cluster Package Contents Installation in Non-Global Zones (6829537)

Problem Summary: Open HA Cluster software does not work in ipkg brand non-global zones on OpenSolaris.

Workaround: There is no workaround. Do not install Open HA Cluster software in ipkg brand non-global zones. Install Open HA Cluster software only in the global zone.

NOTICE: softmac: received DL_ERROR_ACK to DL_BIND_ACK; DLPI errno 0x7fffec94, UNIX errno 1 (6822706)

Problem Summary: When clprivnet receives probes from softmac, error messages are generated.

Workaround: The error messages are safe to ignore. However, clprivnet does not currently work with softmac.

Slow HAStoragePlus Failover Time Following Hard Node Death in HA-ZFS Over iSCSI Configuration (6817564)

Problem Summary: An HAStoragePlus resource that contains a ZFS storage pool created on iSCSI devices takes longer than usual to come online during failover. If an iSCSI target is not accessible during a zpool import operation, the import waits for the iSCSI device access-retry timeout, which is around 3 minutes.

Workaround: There is no workaround to avoid the problem. OpenSolaris CR 6497777 is filed to request code changes that would allow the user to tune the iSCSI device access-retry timeout.

Problem Summary: During a cluster node reboot, error messages similar to the following might appear:


Mar  9 11:11:35 phys-schost-1 rcm_daemon[1040]: IP: get_link_resource(clprivnet0) failed
Mar  9 11:11:35 phys-schost-1 rcm_daemon[1040]: IP: get_link_resource for clprivnet0 error(object not found)

Workaround: The error messages are safe to ignore.

Created by admin on 2009/10/26 12:08
Last modified by admin on 2009/10/26 12:08
