Failover on Solaris might take a long time

On Solaris, when a path fails, it can take up to two minutes before HDLM fails over to one of the remaining paths.

HDLM relies on the Fibre Channel driver for path status notification. Until the Fibre Channel driver propagates the path failure to HDLM, HDLM does not initiate a path failover, so the host sees read and/or write errors in the volume layer as well as the filesystem layer. If the applications do not tolerate I/O errors, this can lead to application failure and/or data corruption.

To reduce the time the Fibre Channel driver takes to propagate its status notification, modify the following settings.

In /kernel/drv/fp.conf add:
fp_offline_ticker=15;

The fp driver is the Sun Fibre Channel port driver, and fp.conf is its configuration file. With this setting, when a path goes offline the fp driver waits 15 seconds before it marks the path offline.

In /kernel/drv/fcp.conf add:
fcp_offline_delay=10;

With this setting, the fcp driver waits the configured number of seconds before it propagates the offline status to HDLM.
The fcp driver is the upper layer protocol driver that transports SCSI-3 commands over Fibre Channel. It interfaces with the Sun Fibre Channel transport library and supports the standard functions provided by the SCSA interface.

With the values above configured, HDLM triggers a failover about 25 seconds after the actual path failure (15 seconds in fp plus 10 seconds in fcp).
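
The arithmetic behind that figure can be checked with a small script. The following is only a sketch in Python, not part of HDLM or Solaris: the function names are hypothetical, the parser handles only simple integer "name=value;" lines, and it assumes the two delays simply add up, as described above.

```python
import re

# Matches simple integer properties such as "fp_offline_ticker=15;"
# (illustrative only; this is not a full driver.conf parser).
_SETTING = re.compile(r"^\s*([A-Za-z_][\w-]*)\s*=\s*(\d+)\s*;")


def parse_settings(conf_text):
    """Return {name: int_value} for 'name=value;' lines, skipping '#' comments."""
    settings = {}
    for line in conf_text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        match = _SETTING.match(line)
        if match:
            settings[match.group(1)] = int(match.group(2))
    return settings


def expected_failover_delay(fp_conf_text, fcp_conf_text):
    """Sum the fp and fcp delays (assumes they elapse one after the other)."""
    fp = parse_settings(fp_conf_text)
    fcp = parse_settings(fcp_conf_text)
    return fp["fp_offline_ticker"] + fcp["fcp_offline_delay"]


# Sample contents matching the lines added to fp.conf and fcp.conf above.
fp_conf = "fp_offline_ticker=15;\n"
fcp_conf = "fcp_offline_delay=10;\n"
total = expected_failover_delay(fp_conf, fcp_conf)
print(total)
```

On a real host the two strings would be read from the configuration files instead of being hard-coded. The result is only an estimate of when the failover starts; measure the actual behaviour on a test system.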

Be careful not to set these values too low, since short delays can trigger numerous failover and failback actions on brief link interruptions. Test these settings on a test system in your environment to determine what works for you.

After modifying these settings, restart the system to activate them.