Forum Replies Created

Viewing 15 posts - 151 through 165 (of 262 total)
  • Telium Support Group
    Participant
    Post count: 268

    There are several possible causes, all network configuration related. However, the most common cause has to do with the default route on your PBX causing an asymmetric network path.

    In other words, packets come in ethernet1 and (try to) go out ethernet2. For security reasons some Linux versions prohibit this, and some routers/gateways can’t handle this. The first thing to do is ensure that your Linux configuration allows this type of asymmetrical path. Type the following from a Bash prompt:



    echo "0" > /proc/sys/net/ipv4/conf/eth0/rp_filter
    echo "0" > /proc/sys/net/ipv4/conf/eth1/rp_filter

    and replace eth0 and eth1 with the names of your network interfaces. This should take effect immediately (no restart required).

    For more details of what the above commands do (and what problem they address) please visit https://access.redhat.com/solutions/53031 . Note that allowing asymmetric routes is sometimes the best solution, but other times it’s best to adjust your routing tables to cause symmetry in packet flow.
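    To confirm the values after running the commands above, a minimal Python sketch (not part of any Telium product) can read rp_filter back from /proc. The function name and the `base` parameter are illustrative assumptions. Note that the kernel applies the larger of the `all` value and the per-interface value, so a non-zero `all` entry still enables filtering.

```python
import os


def effective_rp_filter(interface, base="/proc/sys/net/ipv4/conf"):
    """Return the rp_filter value the kernel applies to `interface`.

    The kernel uses the larger of conf/all/rp_filter and
    conf/<interface>/rp_filter, so both files are read.
    """
    values = []
    for name in ("all", interface):
        with open(os.path.join(base, name, "rp_filter")) as handle:
            values.append(int(handle.read().strip()))
    return max(values)


if __name__ == "__main__":
    # "eth0" and "eth1" are placeholders; use your own interface names.
    for nic in ("eth0", "eth1"):
        try:
            print(nic, effective_rp_filter(nic))
        except OSError as err:
            print(nic, "could not be read:", err)
```

    A result of 0 means no source validation, 1 means strict mode, and 2 means loose mode. Values written to /proc do not survive a reboot.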

    Telium Support Group
    Participant
    Post count: 268

    If FreePBX is not too badly damaged (i.e. it has not messed up its own settings) you may be able to recover by just copying the full database with schema from the working node to the defective node. This assumes you have followed all of the other advice in the installation guide regarding FreePBX modules/versions/etc.

    If not, the Telium support team has tools to attempt to realign the two FreePBX installations. These tools move files/directories/databases/links etc. to attempt to make FreePBX identical on both peers. Because of the potential to make a real mess of the peers, Telium does not offer these tools to the public. Instead you would need to purchase 2 hours of service (from the Buy tab) and grant direct SSH access to each peer. That’s usually the quickest route to recovery, but it doesn’t guarantee success (depending on how badly the second peer is damaged).

    If you want to recover the systems on your own, the next quickest way to recover your cluster is to mirror the primary PBX disk to the secondary PBX disk, and then adjust settings on the secondary to turn it into a unique peer. (Network settings, host name, and HAAst settings). Using ‘dd’ (or Ghost4Linux) is the easiest way to mirror the disk. Keep the secondary PBX unplugged from the network throughout this recovery, and resume at step 2 of the link you posted above.

    After that your cluster will be up and running again!

    In the future, I suggest you follow the HAAst Maintenance Guide before you apply any updates, or enable any FreePBX modules. Fortunately this problem appears to be unique to FreePBX, as all other Asterisk based PBXs we have encountered which use MySQL databases seem to detect any code-database mismatches and allow the user to simply UPDATE the configuration generator to recover.

    Treat the FreePBX program as very fragile – so follow the upgrade instructions. As well, be sure to disable Automatic Updates in FreePBX as this too can cause problems.

    Telium Support Group
    Participant
    Post count: 268

    Regardless of how you update FreePBX(TM) (command line or GUI), you must follow the procedure listed in the HAAst maintenance guide – or the shortcut above.

    By ignoring the instructions you have synced new database contents to old FreePBX code. FreePBX on the standby will be confused and refuse to start. The only solution is to bring the FreePBX code and database back into alignment.

    I really hope you heeded our warning to BACKUP YOUR SYSTEM before applying any FreePBX updates, enabling modules, etc. The quickest solution is to unplug the standby and restore from the backup. After that resume at step 2 in the link you posted, and apply the same updates/changes you made to your active peer.

    We see this problem a couple of times per year when a user doesn’t follow the upgrade instructions. The solution is simple: just restore your system level backup and resume at step 2 in the link you posted above.

    Some users have described the FreePBX PHP code as ‘a tangled and fragile mess’. We have seen FreePBX systems implode because one (FreePBX) module changed the schema in a way that other modules (or core FreePBX) could not handle. You must be very careful with FreePBX changes/updates. (This does not apply to other configuration generators or Asterisk itself).

    Telium Support Group
    Participant
    Post count: 268

    The call handling limit of the Free Edition means that exceeding 3 simultaneous calls will shut down HAAst (and implicitly Asterisk with it).

    The Free Edition allows you to test compatibility, features, usability, etc. which is usually sufficient for a trial, or even for a small office wishing to create a cluster. The Free Edition is not designed to handle the full call load of a mid to large size business, nor offer all features (of the Commercial Unlimited Edition).

    If you wish to conduct a pilot with all features enabled and capacity limits removed we would be happy to work with you on the pilot (see https://telium.io/faq1002) and provide a temporary Commercial Unlimited Edition license.

    Telium Support Group
    Participant
    Post count: 268

    To understand what is going wrong you need to understand a bit more about the HAAst CC module. The CC module acts like an SBC or proxy, in that all external UAs (or services/devices) connect to the CC module, and the CC module connects to Asterisk. Each “session” is actually split into two sessions.

    In the event of cluster failover the UA sessions move to the other node, while the Asterisk sessions are rebuilt. The CC module rebuilds conferences, queues, calls in progress, etc. on Asterisk and then bridges it back to the matching outside UA session.

    With the Unlimited edition the call UniqueID found in Asterisk matches what was seen at your gateway. With the OEM edition the call UniqueID found in Asterisk will be different from what is seen at your gateway (the CC module is bridging two separate sessions). In order for your app to get the UniqueID (or any Asterisk variable) you must use the CC module API and ask it to provide a mapping between the external and internal session variables.

    [Diagram: HAAst OEM uniqueid session]

    Telium Support Group
    Participant
    Post count: 268

    As customers become more sophisticated they often outgrow configuration generators and move to pure Asterisk(TM) from Digium. This has no impact on HAAst.

    HAAst operates at a layer beneath Asterisk, so it doesn’t care if you change configuration generators, Asterisk versions, etc. Your license will continue to operate just fine after your switch.

    Telium Support Group
    Participant
    Post count: 268

    To help customers trying to extract status information, here is a sample Python script that retrieves and prints the local status:


    # Example Python 3 script to retrieve local HAAst status
    import socket
    import sys

    # End-of-packet marker
    READYPROMPT = b'ready>'

    # Create a Unix domain socket (UDS)
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)

    # Connect to the socket file where the server is listening
    server_address = '/run/haast.sock'
    try:
        sock.connect(server_address)
    except OSError as msg:
        print(msg, file=sys.stderr)
        sys.exit(1)

    # Wait for a packet (everything up to the ready prompt)
    def receivepacket():
        total_data = []
        while True:
            data = sock.recv(8192)
            if not data:
                # Connection closed before the prompt arrived
                break
            if READYPROMPT in data:
                total_data.append(data[:data.find(READYPROMPT)])
                break
            total_data.append(data)
            if len(total_data) > 1:
                # Check whether the prompt was split across two reads
                last_pair = total_data[-2] + total_data[-1]
                if READYPROMPT in last_pair:
                    total_data[-2] = last_pair[:last_pair.find(READYPROMPT)]
                    total_data.pop()
                    break
        text = b''.join(total_data).decode(errors='replace')
        return text.replace('\r', '').replace('\n\n', '\n')

    # Send a packet
    def sendpacket(message):
        # Append the two newlines that end the message
        data = (message + '\n\n').encode()
        sock.sendall(data)
        # As in the original script: read back as many bytes as were sent
        amount_received = 0
        amount_expected = len(data)
        while amount_received < amount_expected:
            chunk = sock.recv(16)
            if not chunk:
                break
            amount_received += len(chunk)

    # Read up to the first prompt, request the status, then read the reply
    got = receivepacket()
    sendpacket("id:123\ncommand:getstatus")
    got = receivepacket()
    for item in got.split("\n"):
        if "local haast state formatted:" in item:
            print(item.strip())
    sock.close()

    Telium Support Group
    Participant
    Post count: 268

    We realize that some of our users are new to Linux, and that Asterisk may be the only reason they are working with Linux at all. Many long-time Windows admins have never used (or wanted to use) a command line and find the provided instructions frustrating and/or confusing.

    The problem is that we offer a very technical product (with deep integration into Asterisk and the operating system, etc), requiring a level of Linux expertise that some admins don’t have. If we offered detailed explanations for each Linux command, networking concept, operating system parameter, etc. our installation guide would be huge. Sadly this is not an option for us.

    If you are a commercial user we recommend purchasing installation assistance allowing our professional services group to perform the installation for/with you. We can offer a turnkey solution (from design through implementation) or just step in where you need help.

    If you are a home user you may be better off using a ‘built-in’ option provided by one of the many configuration generators out there. You will be losing a substantial number of features and capabilities, but at least you can install a ‘built-in’ option by clicking BUY/INSTALL on a GUI. Please note that HAAst, SecAst, and SecData are targeted at large and critical commercial call centers – not home office / small office environments (Telium products are on a different scale than ‘built-in’ products/modules). You are welcome to use Telium products in home office / small office environments but realize that it’s like using an 18-wheel truck instead of a bicycle. As a home user the bicycle may be a better fit.

    • This reply was modified 5 years ago by WebMaster.
    Telium Support Group
    Participant
    Post count: 268

    The most likely causes of your failures are as follows:

    • FTP Pull: Your firewall is blocking incoming FTP data connections (TCP port 20). You can either enable incoming data connections on your firewall (not ideal), or preferably, set your FTP client to use ‘passive mode’.
    • FTP Push: Your firewall is blocking incoming FTP control/data connections (TCP ports 21 and 20), or your FTP server is not running.
    • wget: You forgot to add the ‘--content-disposition’ parameter to wget when using the HTTP URL. Either rename the downloaded file to match the package name, or add the --content-disposition parameter when using wget.
    • browser: It sounds like your browser connection is being interrupted/corrupted, and the file you downloaded is corrupt. Verify the md5 checksum of the file and try again, or switch to one of the above (more reliable) transfer methods. Browser (HTTP) download is known to be unreliable.
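    The checksum verification mentioned in the last item can be sketched in Python as follows. This is a minimal illustration, not a Telium tool; the function names are made up for the example, and the published checksum is whatever value accompanies the download.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, published_md5):
    """Compare the file's digest with the published one, ignoring case and spaces."""
    return md5_of_file(path) == published_md5.strip().lower()
```

    Reading in chunks keeps memory use flat for large packages. If the comparison fails, download the file again before trying to untar it.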
    Telium Support Group
    Participant
    Post count: 268

    Glad you are up and running. If you need SecAst to recreate its iptables rules just restart the SecAst service (it will restore all banned IPs since it keeps those in a recovery file). We’ll have to think about how/if SecAst should monitor the iptables rules. It’s unusual for the iptables rules to be lost (so SecAst shouldn’t have to check that) – but it’s on our discussion list.

    Regarding downloading, what error exactly are you experiencing? (Corrupt download, or download won’t start, etc). Downloading by browser is often unreliable for large files, but FTP normally works perfectly. We just tried FTP (pull) and the file downloaded perfectly (no corruption, etc). We also tried downloading with Firefox version 53 (32 bit) and browser download worked fine 2 of 3 times (one time the download was corrupt so it would not untar). Similarly downloading by Chrome worked 3 of 4 times. You can see why we offer FTP…browsers aren’t great for this kind of thing. (Since this is a different topic feel free to email support@telium.io if you have more details on the file transfer issue.)

    Telium Support Group
    Participant
    Post count: 268

    Problem 1: iptables rules not being created

    When SecAst starts it creates a SECAST chain linked into your iptables’ INPUT chain like this:


    Chain INPUT (policy ACCEPT)
    target prot opt source destination
    SECAST all -- anywhere anywhere

    The SECAST chain is where attackers’ IPs are dropped. I see from your iptables list that the above rule is missing – and that’s why you are not able to block attacker IPs. So the question is why the SECAST chain rule is being refused/lost. Are you updating/flushing your iptables rules (eg: regenerating using FireHOL) after SecAst starts? Is there an error in the SecAst log upon service start indicating any iptables related errors?
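    To check for the missing rule repeatedly (for example after regenerating your firewall rules), a small Python sketch along these lines may help. It is only an illustration: the function names are invented here, and it simply looks for a rule line in the INPUT listing whose target is SECAST.

```python
import subprocess


def input_chain_jumps_to(listing, chain="SECAST"):
    """Return True if any rule line in an `iptables -L INPUT` listing targets `chain`."""
    for line in listing.splitlines():
        fields = line.split()
        # Rule lines start with the target name; the "Chain ..." and
        # "target prot opt ..." header lines start with other words.
        if fields and fields[0] == chain:
            return True
    return False


def read_input_chain():
    """Run `iptables -L INPUT -n` (requires root) and return its text output."""
    result = subprocess.run(
        ["iptables", "-L", "INPUT", "-n"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

    Calling `input_chain_jumps_to(read_input_chain())` as root right after SecAst starts, and again after any firewall reload, would show at which point the rule disappears.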

    Problem 2: Attackers not detected

    You did not include the [asterisk] stanza of your secast.conf, so ensure the securityevents key is blank (use the AMI), or points to a valid /var/log/asterisk/messages file. That’s usually the cause.

    I suggest you stop SecAst, delete the SecAst log file, and restart SecAst, then manually ban 1 IP. Then either post the SecAst log here or send it to support@telium.io (if you are concerned about making content public), and we can look there for further clues.

    If this is a commercial environment keep in mind that we recommend blocking attackers at the network edge (firewall) – letting SecAst add rules to your firewall.

    Telium Support Group
    Participant
    Post count: 268

    By design HAAst does not use disk mirroring (e.g. DRBD) or disk sharing (e.g. NFS, SMB, iSCSI). The reason is that with these protocols file corruption on a failing peer would immediately corrupt files on the other peer, and thereby destroy the entire cluster.

    Instead, HAAst brings the peers into sync at regular intervals. These intervals should leave enough time for HAAst to detect if a peer is failing, and then prevent synchronization if a peer is unhealthy. Do not use short sync intervals to simulate a mirrored disk (that defeats the benefit of this design).

    So your sensor intervals and sync intervals should work hand-in-hand. Keep in mind that HAAst’s internal sensors run at 0.5 second intervals, but external sensors (which you define) can run anywhere from seconds to hours apart. If you are only using internal sensors then synchronization no less than 15 seconds apart is usually sufficient (but 15 seconds is unusually low/short). If you are using external sensors then set your synchronization intervals to no less than 1/2 the sensor interval time. As well, set your sync intervals to non-multiples of one another; for example, 1,2,4 minute intervals are sub-optimal (as syncs will overlap), while 2,5,7 minute intervals are better (less chance of sync overlap).

    As well, set your synchronization intervals to match the value of the file/data being synchronized. For example, it doesn’t make sense to synchronize a MySQL database every 10 seconds if it only holds configuration data that might be changed once per day. Or, if you have a very large AstDB file (eg: 10,000 FreePBX users and devices on the host) then ensure your sync duty cycle is no more than 30%; for example, if it takes 30 seconds to sync your AstDB then set your sync interval to no less than 100 seconds (30 / 100 = 30%).
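    The two rules of thumb above (a duty cycle ceiling and avoiding intervals that are multiples of one another) can be checked with a short Python sketch. The function names are illustrative and not part of HAAst. With a 30% ceiling, a 30 second sync works out to an interval of at least 100 seconds.

```python
from itertools import combinations


def min_interval_for_duty_cycle(sync_seconds, max_duty_percent=30):
    """Smallest interval (seconds) that keeps sync time at or below the given percentage."""
    return sync_seconds * 100 / max_duty_percent


def overlapping_pairs(intervals):
    """Return pairs of intervals where the larger is an exact multiple of the smaller."""
    return [
        (a, b) for a, b in combinations(sorted(intervals), 2)
        if b % a == 0
    ]
```

    For instance, `overlapping_pairs([1, 2, 4])` flags every pair, while `overlapping_pairs([2, 5, 7])` returns an empty list, which matches the guidance about 2,5,7 minute intervals being the better choice.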

    Your interval settings need to balance the benefit of keeping both peers in sync quickly, with avoiding cluster failure in the event of file corruption, and adding too much load to a host. There are of course exceptions to every rule, but the above should serve as a guideline.

    The bottom line is NO – don’t make the interval as short as possible. When Telium is engaged to set up a cluster we usually set intervals in minutes (not seconds), other than for unusual circumstances.

    Telium Support Group
    Participant
    Post count: 268

    Exit code 156 is defined as “Failure to setup NIC control” (see the Detailed Installation Guide for the meaning of all exit codes). This means that the NIC you have told HAAst to use (in the [voipnic] stanza of haast.conf) is not responding as expected or not available.

    If you set the log level to DEBUG and restart HAAst, you will see more details of what is going wrong in the haast log file. Most often this relates to configuring HAAst to use an IP address that is already in use somewhere on the network, or use an interface that is not present in the system (possibly due to a typo in the interface name). If you post the [voipnic] stanza of your haast.conf, and the output of ifconfig we can offer more specific advice.
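    As a quick way to rule out a typo in the interface name, the following Python sketch lists the interfaces the operating system knows about and checks for a given name. It is an illustration only (the function name is invented, and it does not read haast.conf or detect IP address conflicts).

```python
import socket


def interface_exists(name):
    """Return True if the operating system reports an interface called `name`."""
    return any(if_name == name for _index, if_name in socket.if_nameindex())


if __name__ == "__main__":
    # Print every interface name so it can be compared with the [voipnic] stanza.
    for _index, if_name in socket.if_nameindex():
        print(if_name)
```

    If the name configured in the [voipnic] stanza does not appear in the printed list, that mismatch is the first thing to fix.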

    Telium Support Group
    Participant
    Post count: 268

    The automatic upgrade feature of FreePBX poses a number of risks to your system, including changing the cluster members to incompatible states. Telium recommends that you disable the automatic upgrades and only perform upgrades following the steps outlined in the HAAst operations guide (see https://telium.io/topic/upgrade-to-configuration-generator-freepbx/).

    Aside from risks to the cluster, configuration generators sometimes release updates that break the PBX (only to be fixed by updates a week later). Many commercial installations using configuration generators avoid updates altogether, unless absolutely necessary (for a feature or security update). Even then, updates should be carefully tested before being promoted to production servers.

    In our experience most mid to large Asterisk(TM) installations do not use configuration generators, so this problem will primarily affect home office and small office users.

    • This reply was modified 5 years ago by WebMaster.
    Telium Support Group
    Participant
    Post count: 268

    If you are willing to accept the risks of placing new single points of failure in your call path, and you are not using the OEM edition of HAAst (which includes call survival features), then yes you still have options. The key to this solution is to ensure directmedia (RTP flowing directly between endpoints). It’s also quite likely that your endpoints will expect to see the SIP channel responsive as well (or they may drop the call).

    Establishing directmedia involves:

    • Ensuring the media anchor points are accessible to one another without NAT.
    • Ensuring Asterisk is configured to use re-invites/directmedia
    • Ensuring your Asterisk dialplan does not force Asterisk to remain in the RTP stream
    • Ensuring your endpoints do not require transcoding (performed by Asterisk)

    Optional: ensuring the SIP endpoints continue to see active SIP connections involves:

    • Placing a B2BUA (or gateway/proxy/SBC) between endpoints and the cluster – this device must place itself into the SIP stream and optionally allow NAT traversal
    • Configuring the B2BUA to allow the interior leg of the SIP call to drop, while keeping the outer leg of the SIP call active
    • Configuring the B2BUA to use UDP for SIP (at least for cluster facing leg). This is not always required

    For example (this shows two B2BUAs for clarity, but you can adjust to fit your need):
    [Diagram: Keeping calls up]

    There are open source B2BUA products which might be modifiable to do what you want (eg: the SIPpy project available at: https://github.com/sippy/b2bua). Keep in mind that you would be creating your own free version of the commercial solution, an approach we do not recommend. If this is a critical call center you may be better off developing a proper B2BUA from scratch to do what you want, including moving calls through the new active HAAst node, etc., but that is a large undertaking.

    HAAst OEM edition creates a call anchor on the PBX, so that even if Asterisk fails the calls don’t drop. HAAst will move the calls to the other node in an orderly fashion (move by IP or SIP redirect), or HAAst will grab the calls by force should the entire PBX server fail.

    • This reply was modified 5 years ago by WebMaster.
    • This reply was modified 5 years ago by WebMaster.
    • This reply was modified 3 weeks, 3 days ago by WebMaster.