Home Forums HAast (High Availability for Asterisk) Installation & Upgrade AMI connection issue at startup

Viewing 5 posts - 1 through 5 (of 5 total)
  • Author
    Posts
  • teliumcustomer6
    Participant
    Post count: 3

    I installed FreePBX 14 a few weeks ago and so far it is working great. I am now looking into the HA part of the setup and I picked HAAst.

    I am having a problem at the system startup, where HAAst takes some time to switch on Asterisk and in the HAAst log I see the following:


    Fri Aug 10 06:46:55 2018, 00001258, I, Asterisk Controller, Starting
    Fri Aug 10 06:46:55 2018, 00010012, I, Cluster Controller, Setting initial state of local peer to standby due to local peer haast service start
    Fri Aug 10 06:46:55 2018, 00000107, I, Cluster Controller, Local peer HAAst state changing to standby
    Fri Aug 10 06:46:56 2018, 00010013, I, Cluster Controller, Initial state of local peer sucessfully set to standby
    Fri Aug 10 06:46:56 2018, 00001901, I, Peer Link Server, PeerLinkServer listening on 12.34.56.78:3002
    Fri Aug 10 06:46:56 2018, 00001800, E, Peer Link, Connection attempt to remote node failed with error 0. Attempting reconnection in 5 seconds
    Fri Aug 10 06:46:56 2018, 00001836, W, Peer Link, Connection to remote node failed. Not requesting local node promotion due to local node snoozing
    Fri Aug 10 06:47:00 2018, 00001836, W, Peer Link, Connection to remote node failed. Not requesting local node promotion due to local node snoozing
    Fri Aug 10 06:47:05 2018, 00001836, W, Peer Link, Connection to remote node failed. Not requesting local node promotion due to local node snoozing
    Fri Aug 10 06:47:10 2018, 00001805, W, Peer Link, Connection to remote node not established within 15 seconds of first attempt. Requesting local node promotion
    Fri Aug 10 06:47:10 2018, 00001836, W, Peer Link, Connection to remote node failed. Not requesting local node promotion due to local node snoozing
    Fri Aug 10 06:47:10 2018, 00010018, I, Cluster Controller, Promotion of local peer is beginning due to local peer PeerLink connection failure
    Fri Aug 10 06:47:10 2018, 00000108, I, Cluster Controller, Local peer HAAst state changing to active
    Fri Aug 10 06:47:10 2018, 00003327, I, Sensors, Resetting local peer sensor scores and state
    Fri Aug 10 06:47:10 2018, 00010106, I, NIC, IP address ‘12.34.56.78’ added to physical NIC device ‘eth0’ as new virtual NIC device ‘eth0:haast’
    Fri Aug 10 06:47:15 2018, 00001836, W, Peer Link, Connection to remote node failed. Not requesting local node promotion due to local node snoozing
    Fri Aug 10 06:47:22 2018, 00001836, W, Peer Link, Connection to remote node failed. Not requesting local node promotion due to local node snoozing
    Fri Aug 10 06:47:27 2018, 00001836, W, Peer Link, Connection to remote node failed. Not requesting local node promotion due to local node snoozing
    Fri Aug 10 06:47:32 2018, 00001836, W, Peer Link, Connection to remote node failed. Not requesting local node promotion due to local node snoozing
    Fri Aug 10 06:47:37 2018, 00001836, W, Peer Link, Connection to remote node failed. Not requesting local node promotion due to local node snoozing
    Fri Aug 10 06:47:42 2018, 00000721, E, System Command, Failed to start Asterisk (start) wrapper. Run result 3; exitcode 0
    Fri Aug 10 06:47:42 2018, 00010019, I, Cluster Controller, Promotion of local peer is complete
    Fri Aug 10 06:47:42 2018, 00001256, W, Asterisk Controller, Connection attempt to AMI failed with error 0. Requesting local peer demotion
    Fri Aug 10 06:47:42 2018, 00010012, I, Cluster Controller, Demotion of local peer is beginning due to local peer lost/failed AMI connection

    So I am sure that the password for AMI is correct, I cannot understand why there is such error and I cannot find any reference in the guide about the “error 0” at the same time I don;t get why the system is “snoozing” for so long.
    Another thing I did is to add some delay to firewalld at the startup as sometimes I saw a “firewall rules corrupted” but in the log the only thing wrong I saw was that asterisk was not running.
    About the peer link error, I didn’t setup the backup server yet, I am going to do it after I figure this AMI error

    Avatar photoTelium Support Group
    Participant
    Post count: 262

    You have a number of issues:

    1. The following error in your event log indicates that HAAst is unable to successfully start Asterisk (and/or FreePBX services).


    Fri Aug 10 06:47:42 2018, 00000721, E, System Command, Failed to start Asterisk (start) wrapper. Run result 3; exitcode 0

    What have you set your ‘distribution’ key to in the [asterisk] stanza of your haast.conf file? Are you running the fully open source version of FreePBX or proprietary (FreePBX ‘distro’)? If open source, on what Linux distro and architecture?
    This error indicates Asterisk is not being started (properly), and also explains why you aren’t getting an AMI connection error. (Error 0 means HAAst could not establish a connection to the AMI)

    2. The follow errors in your event log suggest your PeerLinkServer is listening on the same IP that you are using for the dynamic HAAst created interface:


    Fri Aug 10 06:46:56 2018, 00001901, I, Peer Link Server, PeerLinkServer listening on 12.34.56.78:3002
    Fri Aug 10 06:47:10 2018, 00010106, I, NIC, IP address ‘12.34.56.78’ added to physical NIC device ‘eth0’ as new virtual NIC device ‘eth0:haast’

    Have you obfuscated the IP addresses in your posting? If not you will need to fix your IP addresses (which may also prevent Asterisk from Starting depending on the FreePBX settings).

    3. The reason you see snoozing messages relates to the promotesnoozetime and demotesnoozetime keys in the [cluster] stanza of haast.conf. This has nothing to do with your problem – in fact I suggest you leave those settings at default until you understand HAAst a little better. As background, these settings prevent the cluster from flipping back and forth in the event that it takes Asterisk some time to stabilize. (See https://telium.io/topic/manual-failover-leads-to-brief-dual-active-contention/ for more info).

    4. HAAst does not interact with firewalld or iptables, so that symptom isn’t directly related to HAAst. However, beware that Songoma calls fail2ban a ‘security system’ – when in fact it’s not. Actually Digium warns companies NOT to use fail2ban as a security system (see https://telium.io/topic/i-use-fail2ban-why-do-i-need-secast/ for more info). I suspect FreePBX is trying to start fail2ban and that is giving you the error messages. If you are anything but a small/SOHO business have a look at the SecAst pages on this site to understand the different between SecAst and fail2ban.

    With more info we can provide more suggestions but hopefully that clarifies the meaning of the messages. I suspect all of the above relate to starting FreePBX.

    • This reply was modified 4 years, 4 months ago by WebMaster.
    teliumcustomer6
    Participant
    Post count: 3

    1. I am using the proprietary “FreePBX Distro” version 14. In HAAst I am using option 11. I tried option 8 with the same result, option 2 doesn’t generate any error but obviously I have other kind of problem with my distro choice.
    2. I have obfuscated the ip address , I am using two different ip for Management and voip.
    3. This is clear now. I didn’t realize that “snooze” was related to the [cluster] stanza.
    4.I will look into this as soon as I have HAAst running smoothly.

    At this point I am not sure why there is this error

    Fri Aug 10 06:47:42 2018, 00000721, E, System Command, Failed to start Asterisk (start) wrapper. Run result 3; exitcode 0

    As per the install guide the correct option is 11.

    Avatar photoTelium Support Group
    Participant
    Post count: 262

    Distribution type 11 would be correct for your setup, so that’s not the problem. I would focus solely on ensuring HAAst can start your FreePBX instance properly. The problem could even be that FreePBX is starting but shutting down with an error code, or is getting severely delayed (see Sangoma notes on very slow FreePBX startup) causing HAAst to treat the startup as failed.

    I would suggest you look at:

    1. Try to start/stop the FreePBX service using systemctl from the commandline. Does Asterisk start/stop properly with it?
    2. Confirm permissions on FreePBX and Asterisk resources, relative to who you are logged in as (and relative to root)
    3. Start FreePBX using the commands within the systemd service script directly to look for errors
    4. Look for FreePBX startup errors
    5. Look for severe FreePBX startup delays (and follow Sangoma wiki instructions on correcting very slow startup if necessary)
    teliumcustomer6
    Participant
    Post count: 3

    The problem was a custom rule that I had in the firewall. It was creating a long delay at the startup of FreePBX, and HAAst gave up waiting for FreePBX.

    I fixed my firewall and now all starts perfectly.

    Time to look into SecAst for security

    Thank you

Viewing 5 posts - 1 through 5 (of 5 total)
  • You must be logged in to reply to this topic.