Reply To: What happens when management connection lost between nodes

Telium Support Group

I’ll start by answering this question in the context of ANY cluster (e.g. gateway cluster, router cluster, file server cluster, etc.). If the nodes that make up the cluster cannot talk to one another, they have no way of knowing whether the other node is dead or alive. As such, the correct action for any isolated cluster node is to promote itself to active and assume the other node is dead. Once the nodes can contact each other again, they discover that multiple nodes are active (a situation called “dual-active contention”). The nodes should then negotiate which one remains active and which one demotes itself.

This is exactly what happens with HAAst. If the management connection between nodes is lost, there is no way for either node to know that the other is alive, so both nodes try to take over telephony service. Once the nodes reconnect, one node will automatically demote itself.
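To make the sequence concrete, here is a minimal sketch of the promote / detect / demote cycle in Python. It is not HAAst's actual code: the `Node` class, its method names, and the tie-break rule (lowest `priority` value stays active) are all assumptions made for illustration, since how the negotiation is decided is not covered here.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    priority: int              # hypothetical tie-breaker: lower value wins
    active: bool = False
    peer_reachable: bool = True

    def on_peer_lost(self) -> None:
        # Isolated: the peer may be dead, so take over the service.
        self.peer_reachable = False
        self.active = True

    def on_peer_reconnected(self, peer: "Node") -> None:
        self.peer_reachable = True
        # Dual-active contention: both nodes believe they own the service.
        # The losing side of the (assumed) tie-break demotes itself.
        if self.active and peer.active and not self._wins_over(peer):
            self.active = False

    def _wins_over(self, peer: "Node") -> bool:
        # Name is a secondary key so two equal priorities still resolve.
        return (self.priority, self.name) < (peer.priority, peer.name)


if __name__ == "__main__":
    node1 = Node("node1", priority=1, active=True)
    node2 = Node("node2", priority=2)

    # Management link drops: each node promotes itself.
    node1.on_peer_lost()
    node2.on_peer_lost()

    # Link returns: each node re-evaluates against its peer.
    node1.on_peer_reconnected(node2)
    node2.on_peer_reconnected(node1)
    print(node1.active, node2.active)
```

Whichever node handles the reconnect first, only the node that loses the tie-break demotes, so the cluster ends up with exactly one active node.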

You will find this scenario plays out identically with any commercial HA product (e.g. Cisco routers with HSRP). Dual-active contention is the worst-case scenario for any cluster because the two nodes are competing, and both will contend for the same resources / traffic / data / etc.

There is a workaround called STONITH, available using event handlers in the Commercial Unlimited edition of HAAst. STONITH is an acronym for “Shoot The Other Node In The Head”, which basically tells one node to power off the other node. Although HAAst supports STONITH, this functionality is disabled by default because the concept of STONITH is hotly debated as risky (a failing node may mistakenly shoot the healthy node). There are also many scenarios where STONITH does not work (e.g. two isolated nodes) without another out-of-band connection (e.g. serial, 3rd network connection, etc.).
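The two caveats above can be expressed as a simple guard that a fencing script might run before powering off its peer. This is only a sketch under stated assumptions: `may_fence_peer` and its three inputs are hypothetical names, not part of HAAst or its event handler interface, and how each input is actually measured is left open.

```python
def may_fence_peer(mgmt_link_up: bool, oob_link_up: bool, self_healthy: bool) -> bool:
    """Decide whether powering off the peer is both possible and defensible.

    All three inputs are hypothetical health signals supplied by the caller.
    """
    if mgmt_link_up:
        # The peer is reachable, so negotiate instead of fencing.
        return False
    if not oob_link_up:
        # Fully isolated: there is no path to deliver a power-off command.
        return False
    # A failing node must not shoot a healthy peer.
    return self_healthy


if __name__ == "__main__":
    # Management link down, serial/3rd network up, local node healthy.
    print(may_fence_peer(mgmt_link_up=False, oob_link_up=True, self_healthy=True))
    # Two isolated nodes with no out-of-band path.
    print(may_fence_peer(mgmt_link_up=False, oob_link_up=False, self_healthy=True))
```

Note that the guard can only be as reliable as the node's own health check, which is exactly why STONITH remains contested.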