Forum Replies Created
in reply to: Complete list of features #6767
The “FEATURES” page for HAAst on the Telium website is far from complete. I would compare our “FEATURES” page to the window sticker of a new car at a dealership. It shows the highlights of a car but is not an exhaustive list of all features.
in reply to: Asterisk beta compatibility #6766
We can definitely help.
Although we don’t list Asterisk betas as officially certified, our engineering group regularly tests and certifies newer Asterisk versions (though probably not betas).
If you can provide SSH access to your system our engineering group will ensure HAAst is working with your specific code. Since you are already running commercial editions of HAAst, we will ensure HAAst works with your Beta at no charge.
The root cause of your problem is a regression (bug) in your Linux distro. There have been similar Linux regressions noted by Digium in the past (eg: https://issues.asterisk.org/jira/browse/ASTERISK-15603)
The solution is to use the HAAst event handler to create the missing directory for Asterisk, before Asterisk even starts.
Create the file /usr/local/haast/events/asterisk.start.pre with the following contents:
#!/bin/bash
# Recreate the Asterisk runtime directory if the distro did not create it
if [ ! -d /var/run/asterisk ] ; then
    mkdir -p /var/run/asterisk
    chown asterisk:asterisk /var/run/asterisk
fi
and set the permissions on /usr/local/haast/events/asterisk.start.pre to 550:
chmod 550 /usr/local/haast/events/asterisk.start.pre
Note: The event handler system (including use of the asterisk.start.pre file) is restricted to the Commercial Unlimited edition of HAAst only.
in reply to: License file rejected #6758
Please note that in mid-2018 Telium is launching an updated license option, which no longer requires activation or ties your installation to your hardware. You will have several license options available at the time of generating your license request.
See http://telium.io/activation for activation options.
in reply to: Lost AMI connection caused failover #6759
Based on the Asterisk full message log received, it appears that your Asterisk process was hung for almost 30 seconds. As proof, you have a number of plug-ins/dialplan add-ons that trigger log messages at least once per second. Notice that at 2:38am all messages stopped for almost 30 seconds? Something was blocking IO/CPU to the Asterisk process.
Five seconds after the Asterisk process hung HAAst deemed the peer to be non-responsive and initiated a failover. (This is correct behavior on the part of HAAst – something was going wrong on your PBX).
You need to track down the root cause of Asterisk hanging for almost 30 seconds. Look for badly written backup scripts, IO- or CPU-intensive jobs scheduled for this time, etc. Run a grep search through all of your system logs around that time for clues as to what else was happening on your system.
(Hint: Looking at your Asterisk log file you appear to have added a new plug-in in the last 2 days)
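A silence like the one described above can also be located mechanically. The following is a minimal sketch, not a Telium tool: the function name find_log_gaps is made up for illustration, and it assumes that every log line of interest carries an HH:MM:SS timestamp. It ignores dates, so a gap that spans midnight is not detected.

```shell
#!/bin/bash
# find_log_gaps MIN_GAP_SECONDS   (hypothetical helper, reads a log on stdin)
# Prints each pair of consecutive timestamps that are at least
# MIN_GAP_SECONDS apart. Lines without an HH:MM:SS time are skipped.
find_log_gaps() {
    awk -v min_gap="$1" '
        match($0, /[0-9][0-9]:[0-9][0-9]:[0-9][0-9]/) {
            stamp = substr($0, RSTART, RLENGTH)
            split(stamp, t, ":")
            now = t[1] * 3600 + t[2] * 60 + t[3]   # seconds since midnight
            if (seen && now - prev >= min_gap)
                printf "gap of %d s between %s and %s\n", now - prev, prev_stamp, stamp
            prev = now; prev_stamp = stamp; seen = 1
        }'
}
```

For example, assuming the Asterisk full log lives at /var/log/asterisk/full on your system, running find_log_gaps 10 < /var/log/asterisk/full lists every silence of 10 seconds or more. The same function can be pointed at other system logs to see whether they went quiet at the same moment.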
in reply to: All HA for PBX based on open source? #6764
I should also point out that during the past 10 years three other companies have released HA products for Asterisk. The products didn’t work well, and once their customers realized all of the conditions / use cases which their HA software could not handle (and the resulting outages) they switched to HAAst. All three of those companies are now out of business / disappeared, or their HA products have been discontinued.
You’ll see new HA products pop up on occasion, because you can add ‘HA’ type open source packages in just a matter of hours. But there are no open source packages for application level HA (that’s up to the application developer). Adding HA level code at the application level (or trying to add it to file level HA packages) is a massive undertaking (measured in person-years of development, not hours).
HAAst has been on the market since 2005 and has been growing in capabilities ever since!
in reply to: Lose contact with PBX when node is active #6763
I think you are missing some basic concepts of networking. Your description above shows two NICs on the same subnet. That will confuse Linux, as it cannot be sure which NIC to use for incoming/outgoing traffic.
Carefully review section 2.4 of the HAAst installation guide before going any further. If you are new to networking/multihoming/routing then you will find all HA products difficult to install or get working properly.
If this is for a commercial environment I would recommend you engage Telium professional services to perform the installation for you (or at least the network/VoIP NIC portion). If you wish, our engineers can also walk you through the key concepts of networking and multihoming, to help you understand what they configured and why.
in reply to: No option to configure gateway for VoIP NIC #6762
Some Linux distributions allow you to configure routing information along with the NIC (in the same NIC configuration file); however, this is misleading – in fact, newer Linux distros no longer allow this practice. Routing information has nothing to do with the NIC. Routing information (eg: default gateway / subnet gateway) is added to the routing table, not the NIC.
If you would like to add a route to the routing table from a terminal window then enter a command like:
route add -net 10.10.0.0 netmask 255.255.255.0 gw 10.10.0.1
If you would like this route to persist between reboots you would have to enter this information into a config file/script. If you are using the Commercial Unlimited edition of HAAst then you can add the above line to the pre-start event handler /usr/local/haast/events/asterisk.start.pre. This event handler will run before the vNIC is brought up, so the route will be present when needed. If you like, you can also delete this route by adding a line to the post-stop event handler /usr/local/haast/events/asterisk.stop.post.
If you are not using the Commercial Unlimited Edition of HAAst, then you will need to find the most suitable place for these routing rules (which varies considerably between Linux distributions). In your case I suspect you are using a RedHat flavour of Linux, so you could create a file called /etc/sysconfig/network-scripts/route-eth0 with:
default via 192.168.0.1 dev eth0
10.10.0.0/24 via 10.10.0.1 dev eth0
172.16.1.0/24 via 192.168.0.1 dev eth0
but you should check the guides for your Linux distro to confirm the best location for ‘persistent routes’. If you are looking for routes to be added/deleted based on HAAst events then you should use the HAAst event handler instead.
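Note that the route command shown earlier takes a dotted netmask, while the route-eth0 entries use CIDR notation (for example /24). A minimal sketch of converting from one form to the other follows. The function names netmask_to_prefix and route_line are made up for illustration and are not part of HAAst or of any distro tooling.

```shell
#!/bin/bash
# netmask_to_prefix NETMASK   (hypothetical helper)
# Prints the CIDR prefix length for a dotted-quad netmask.
# Returns non-zero if the input is not a valid contiguous netmask.
netmask_to_prefix() {
    local IFS=. octet n bits=0 partial=0
    set -- $1                      # split the mask on dots
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        case "$octet" in
            255) n=8 ;; 254) n=7 ;; 252) n=6 ;; 248) n=5 ;; 240) n=4 ;;
            224) n=3 ;; 192) n=2 ;; 128) n=1 ;; 0) n=0 ;;
            *) return 1 ;;
        esac
        # Once an octet is not fully set, every later octet must be 0.
        if [ "$partial" -eq 1 ] && [ "$n" -ne 0 ]; then
            return 1
        fi
        [ "$n" -eq 8 ] || partial=1
        bits=$((bits + n))
    done
    echo "$bits"
}

# route_line NETWORK NETMASK GATEWAY DEVICE   (hypothetical helper)
# Prints one line in the "NETWORK/PREFIX via GATEWAY dev DEVICE" form.
route_line() {
    local prefix
    prefix=$(netmask_to_prefix "$2") || return 1
    echo "$1/$prefix via $3 dev $4"
}
```

Calling route_line 10.10.0.0 255.255.255.0 10.10.0.1 eth0 prints 10.10.0.0/24 via 10.10.0.1 dev eth0, which is the CIDR form of the route add example.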
in reply to: OEM edition option during license request #6761
Telium has partnered with various PBX manufacturers and integrators to allow for purchasing Telium products with a predefined set of features/capacity limits at discounted prices. This is known as an OEM edition. The OEM edition is usually preinstalled by an integrator, but may be purchased by resellers/end-users for aftermarket installation as well. (Be aware that the exact features which are made available in OEM editions depend on pre-existing agreements, so you will likely not get every feature you want if you are an end-user or low-volume reseller).
If you are an integrator and wish to set up an OEM edition volume purchase please contact support@telium.io.
If you are an end-user and wish to see if you qualify for a price discounted OEM edition based on the hardware you already have, please contact support@telium.io. We offer over 30 different OEM editions (vendor specific) as of March 2018, and the list keeps growing.
in reply to: Software activation options #6760
Telium software uses “activation” to turn on the features you have paid for, and to prevent software theft (or pirating). In order to offer the greatest possible flexibility to customers, and minimize inconvenience, Telium offers four different activation types:
- Hardware Fingerprint: This type of activation is used by Microsoft Windows and is now the industry standard. Telium’s software computes a fingerprint of your computer at the time of licensing, and that fingerprint is used to uniquely identify your computer. If someone copies the software and tries to run it on another computer then it will fail to start (since the fingerprints won’t match). This type of activation does not require any special hardware or access to the internet, and is suitable for emergency / public safety phone service operators, PBXs on secure networks, stand-alone physical servers, etc. Note that if your computer is not fingerprintable (eg: if you run the software inside a container or virtual machine) then this type of activation is not available to you. However, if you are running the Telium software in a commercial hosting environment (eg: AWS, Azure) then this option works perfectly (even though the cloud provider uses virtualization), and this option is ideal as Telium has worked with major hosting providers to ensure fingerprint consistency as your instance moves from host to host.
- USB Dongle: This type of activation uses a USB device (known as a “dongle”) which we courier to you after completing your purchase. Telium’s software will recognize the dongle plugged into a USB port and then start normally. If someone copies the software and tries to run it on another computer without the USB dongle present then it will fail to start. This type of activation is suitable for computing environments where hardware is not fingerprintable (eg: if you run the software inside a container or virtual machine), or where hardware changes frequently. You can still move your VM guest across different hosts by sharing your USB dongle over the network (eg: using usbip) – you do not need to plug the dongle into a particular host, but you can.
- Cloud: This type of activation does not have any dependence on computing hardware or USB devices. Upon start the software will contact Telium’s servers in the cloud to request permission to run. So long as you are running only the number of licensed copies you have purchased then Telium’s servers will give permission to the software to start. This type of activation is suitable for computing environments where hardware is not fingerprintable (eg: if you run the software inside a container or virtual machine), or where hardware changes frequently. This type of activation is not suitable for emergency / public safety phone service operators since such environments are not permitted to depend on the internet/cloud in order to operate.
- Volume License Server: This type of activation does not have any dependence on computing hardware, USB devices, or the cloud. Upon start the software will contact a Volume License Server (VLS) running in your data center to request permission to run. So long as you are running only the number of licensed copies you have purchased then the VLS will give permission to the software to start. This type of activation is best suited to customers running a large number of Telium products, such as Internet Telephone Service Providers (ITSP). Note that a Telium VLS must be purchased and installed separately.
Telium licenses are perpetual; in other words, they never expire. As a result you cannot change activation types after we have issued your license. (If we were to issue a second license with a different activation type then the second license could be used to run a second copy of the software.) Please be sure to select the right activation type when you initially request your license.
If you lose or damage the USB Dongle then you would have to purchase an entirely new license at full price. Be sure that software loss is included in your office fire/damage/theft insurance policy, so that you can replace your software in case of loss. The USB key contains a CPU with hacking detection (EAL 5+) and will lock itself after repeated hack attempts. A USB dongle which has been locked due to hacking is considered damaged and can only be replaced through purchase of an entirely new license at full price.
There are products which allow a Linux OS to connect to a USB dongle over a network/internet. Although such products (e.g. http://usbip.sourceforge.net/, https://www.eltima.com/share-usb-dongle-over-network.html) are not officially supported by Telium, we have been told these work with our USB dongles.
- This reply was modified 5 years ago by WebMaster.
in reply to: Node fail over if unrelated devices go down #6757
It sounds like you want the local HAAst node to monitor the health of an external device. The way to accomplish this is with a HAAst health sensor.
At the simplest level I would suggest you create a sensor to monitor whether or not the device is responding to pings. For example, paste the following code into /etc/xdg/telium/haast.conf.d/mydevice.sensors.conf:
; Test if mydevice is reachable and responsive
network-connection/description=Ensure mydevice is reachable and responsive
network-connection/type=ping
network-connection/input=received
network-connection/parameters=count:3 | interface:ens1 | host:1.2.3.4
network-connection/scoring= =3:0 | =2:30 | =1:50 | :70
network-connection/warningscore=30
network-connection/errorscore=70
network-connection/resetcumulativescore=0
network-connection/interval=10
Restart HAAst and the sensor will become live. You should now see the local peer health score change based on the result of the above sensor.
I’ll explain the example more:
- The sensor pings the device 3 times through interface ens1.
- If 3 responses are received, then the health score for this sensor is 0; if 2 are received the health score is 30; if 1 is received the health score is 50; if none are received the health score is 70.
- If there is a warning from the ping (eg: no route) then a score of 30 is used. If there is an error (eg: ping can’t run) then a score of 70 is used.
- The sensor runs every 10 seconds. (Since the ping command runs for 3-4 seconds, that is a duty cycle of up to 40%.)
- If the score ever returns 0, then the cumulative score for this sensor is reset to 0. Remember that each sensor’s score is cumulative, so it grows over time if the sensor keeps failing.
To make the above work, you also have to adjust your sensor settings in haast.conf. If this is the only sensor in use then it is quite simple, but you may wish to create many such sensors to monitor other critical devices. In this case you could set your haast.conf settings as follows:
- criticallevel=70
- criticalresetlevel=40
- failurelevel=100
Note that the level settings in haast.conf check total scores (across all sensors), and that individual sensor scores are cumulative. So in the example above, missing 1 ping out of 3 each time the sensor runs will cause the sensor’s cumulative score to go 0 -> 30 -> 60 -> 90 -> 120. When it reaches 70 a critical alert can be sent (an event handler can run), and when it reaches 100 the local peer will declare failure and initiate a fail over.
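The cumulative arithmetic can be tried out with a small simulation. This is only a sketch of the scoring rules as described in this reply, not HAAst code. The function name simulate_sensor is made up for illustration, and it assumes this is the only sensor, that a level is reached when the total is greater than or equal to it, and it leaves out criticalresetlevel and the warning/error scores.

```shell
#!/bin/bash
# simulate_sensor RECEIVED...   (hypothetical helper)
# Each argument is the number of ping replies (0-3) received in one run.
# Prints one line per run with the run's score, the cumulative score and
# the resulting state, using the values from the example:
#   scoring =3:0 | =2:30 | =1:50 | :70, criticallevel=70, failurelevel=100
simulate_sensor() {
    local received score cumulative=0 run=0 state
    for received in "$@"; do
        run=$((run + 1))
        case "$received" in
            3) score=0 ;;
            2) score=30 ;;
            1) score=50 ;;
            *) score=70 ;;
        esac
        if [ "$score" -eq 0 ]; then
            cumulative=0    # resetcumulativescore=0: a clean run clears the total
        else
            cumulative=$((cumulative + score))
        fi
        if [ "$cumulative" -ge 100 ]; then
            state=failure
        elif [ "$cumulative" -ge 70 ]; then
            state=critical
        else
            state=ok
        fi
        echo "run=$run score=$score cumulative=$cumulative state=$state"
    done
}
```

Running simulate_sensor 2 2 2 2 reproduces the 30 -> 60 -> 90 -> 120 progression: the third run is reported as critical and the fourth as failure. Running simulate_sensor 2 2 3 shows the total dropping back to 0 as soon as all three replies arrive.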
As scoring tends to be more difficult to understand, here is another example. The following graph shows the health score of the sensor over time (seconds). You can see from the blue boxes when a sensor run starts (every 10 seconds), and when it ends (approx 4 seconds later). Based on the result of the ping responses (pongs), the number of packets received will cause a score to be calculated. The cumulative health score (in yellow) grows over time as the sensor detects missed pongs.
You can see at time 34s that all pongs were received and the cumulative score returns to 0. At time 64s the cumulative score has reached the critical level (70) and the critical event handler is triggered. At time 74s the cumulative score has reached the failure level and the node triggers the fail over process to the peer. Once the other node is active the local health score returns to 0 around time 84s.
This example is a simple ping test, but if your device offers health information through a REST API, telnet interface, etc. then you could create an even more sophisticated sensor.
I would suggest you experiment with disabling the ens1 NIC on your host, and watch the health score increase and then failover in HAAst. (Through the HAAst GUI or through the HAAst telnet interface). Once it works perfectly you could create similar sensors for HAAst on the other peer.
in reply to: All HA for PBX based on open source? #6755Telium wrote HAAst from top to bottom in C++, and we agree with your Linux admins that the open source packages are not well suited to application level HA. The open source packages are great if you want to create a HA print server, file server, etc. However, they have no understanding of, or visibility into, the PBX, the environment, trunks, etc. The open source packages work at the OS level, not the application level and as such don’t create an HA PBX which can survive real world failures. For example, a route becoming unavailable at the ITSP, the PBX running out of file handles, a data center router going down, etc. would all trigger a failover with HAAst, but open source packages would likely leave you with a non-functional PBX.
Aside from HAAst, every other HA solution for PBXs is based on essentially the same open source packages. HAAst does not use open source packages for any detection, heartbeat, failover control, etc. But you are welcome to use open source packages with HAAst if you like. (There are some things open source does well and we don’t want to reinvent the wheel). The HAAst product tabs do a good job of explaining the differences between HAAst and the open source / commercial products, but here are a few highlights:
Heartbeat & Health: The open source heartbeat/health package does not take into account the health of Asterisk, status of trunk & route availability, available file handles, available memory, latency between devices, calls successfully bridging, etc. Open source packages do dead/alive detection of the box/process (and are not Asterisk operations aware). HAAst has its own proprietary heartbeat and health detection, written exclusively to monitor and detect Asterisk PBXs and their environments. By default HAAst detects 18 different factors which degrade node health (including talking directly to Asterisk through the AMI to assess health), and can also use an unlimited number of customizable sensors.
Synchronization: Open source packages like DRBD (or other shared disk solutions like NFS/Samba/iSCSI/Corosync/Rsync/etc.) put your data at risk since file corruption by one failing peer immediately corrupts files of the other peer. With those open source solutions a failing process on one peer may destroy your entire cluster. As well, network loss during data sharing may leave your files corrupt and SQL databases in invalid states (the database might not start with corrupt tables, and Asterisk might not start with corrupt config files). HAAst synchronizes data between peers, but only if the peers are healthy. HAAst synchronizes databases (MySQL/PostgreSQL/SQLite) at the SQL transaction level, so you are never left with corrupt tables (and a PBX that won’t start). And to top it off, HAAst can maintain snapshots of healthy critical files, stepping backwards through previous snapshots to find a system state that allows the PBX to start after failure.
Security: HAAst has an encrypted link between peers so there is no risk of Man In The Middle attack, and no risk of hackers gaining control of the PBX. Open source products have well published and unencrypted protocols that are easy to tap into, and a novice/script kiddie can bring down the cluster using this information.
Other: There are lots of other functional differences as well – have a look at the features tab of the HAAst web pages for an overview of what HAAst does, and then look at the comparison tab to see what generic tools can’t do.
Performance: The bottom line is: how do these HA solutions perform in real life. There’s a reason that emergency service gateways, hospitals, airline call centers, etc. choose HAAst. From complete detection, to avoiding false positive failovers, to speed of failover, to moving resources, etc. nothing comes close to HAAst.
All of the open source packages are wonderful products in their own right, each with a specific purpose. If you spend enough time adding your own code on top of these then you can start to add application level HA functionality – but it’s up to you to code it. After 10 years of continuous development we have created a very sophisticated product which can detect and recover gracefully from an enormous number of failure scenarios, building our own heartbeat/synchronization/health detection/etc. tailored to telephony environments.
So unlike some other HA ‘solutions’, HAAst is not a collection of open source packages relabeled as an application level HA product. Check out the HAAst web page tabs to see why HAAst is the only solution for hospitals, police/fire/911 call centers, mission critical call centers, etc. So your Linux admins are right – open source packages are NOT suitable for application level HA (as would be the case with a PBX), and that’s why HAAst is the choice of large call centers/fire departments/emergency service gateways/etc.
in reply to: Send SNMP trap on failover #6754
You can have HAAst issue SNMP traps using HAAst’s built-in event handler system. The event handlers are simply executable files (binary / BASH / etc) or symlinks placed in the directory:
/usr/local/haast/events
The event handlers are named to reflect when they are triggered. The files you most likely want to create are:
- asterisk.start.pre
- asterisk.stop.post
The event handler (files) are launched automatically by HAAst – there is nothing you need to do to execute them. For example, the “asterisk.start.pre” is launched by HAAst before Asterisk is started upon node promotion. If you wanted the event handler to run immediately after Asterisk is started, create a file called “asterisk.start.post”.
In your case create these two files with 550 permissions in the above folder. The content of the file would be similar to:
#!/bin/bash
# Issue SNMP v3 trap
# (each option below takes a site-specific value, which is not shown here)
snmptrap -v -e -u -a -A -x -X
logger "SNMP trap issued for demotion of local peer"
There are also numerous environment variables set by HAAst before executing the event handlers; so if you want information related to the peer/failover/etc you can use these environment variables in your BASH script. (See the installation guide for more information on the environment variables, or use the set command in an event handler to redirect all environment variables to a temporary file for further examination).
The snmptrap commands may require a change to system configuration files, but that would be specific to your setup. If you have configured other traps then this is likely already done.
Please note that use of SNMP is an advanced topic, and its configuration requires advanced Linux admin skills.
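The suggestion above about capturing the environment variables for examination can be sketched as follows. This is only an illustration: the function name dump_environment and the temporary file name are made up, and it uses env rather than set so that only exported variables are written (set would also list shell-internal variables and functions).

```shell
#!/bin/bash
# dump_environment FILE   (hypothetical helper)
# Writes the current environment, sorted, one NAME=value per line, to FILE
# so that the variables visible to an event handler can be inspected later.
dump_environment() {
    env | sort > "$1"
}
```

Inside an event handler you could then call, for example, dump_environment "$(mktemp /tmp/haast-env.XXXXXX)" and read the resulting file after the next failover to see which variables were available to the handler.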
in reply to: Where can I find an installation manual #6599
Look in the /docs folder of the package you downloaded. In there you will find a PDF document called Detailed_Installation_Guide.pdf which will take you through all of the steps involved in installation (getting the program installed), and key steps of configuration (making the program work the way you want).
For example, if you copy the text below and place it into the file /etc/xdg/telium/haast.conf.d/network_cable.sensors.conf then you can test the sensors graph on the GUI. This sensor checks if the network connection on ‘eth1’ is down. If the connection is down the health score increases by 5 points. If the connection is up the health score remains unchanged for that cycle (i.e. 0 health points).
If HAAst can’t read the NIC, then an error score of 20 is used. If HAAst can talk to the NIC but not extract a state then the warning score of 10 is used. If the sensor reports 0, then the cumulative score for this sensor is reset to 0. This sensor runs every 5 seconds.
So restart HAAst with this sensor in place and open up the graph on the HAAst GUI. Unplug the network cable and watch your score, then replug the cable and see it reset to 0. If you leave the cable unplugged long enough and the health score reaches the critical level, then your cluster will fail-over to the peer.
; Test to ensure a network cable is plugged into the NIC, and that the cable is live (i.e. other end is plugged in)
network-cable/description=Cable plugged in
network-cable/type=nic
network-cable/input=state
network-cable/parameters=state:ethernet | interface:eth1
network-cable/scoring= =up:0 | :5
network-cable/warningscore=10
network-cable/errorscore=20
network-cable/resetcumulativescore=0
network-cable/interval=5
network-cable/debug=false
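While testing, it can help to cross-check the link state by hand. The sketch below is not how HAAst reads the NIC (that is not described here). It simply reads the operstate file that Linux exposes under /sys/class/net and maps it to the same numbers as the sensor above: 0 for up, 5 for anything else, 20 if the state cannot be read. The function name link_score is made up for illustration, and the warning score of 10 is not modelled.

```shell
#!/bin/bash
# link_score INTERFACE [SYSFS_ROOT]   (hypothetical helper)
# Prints 0 when the interface reports "up", 5 for any other state,
# and 20 when the state file cannot be read.
# SYSFS_ROOT defaults to /sys/class/net and exists only to allow testing.
link_score() {
    local state_file="${2:-/sys/class/net}/$1/operstate" state
    if ! state=$(cat "$state_file" 2>/dev/null); then
        echo 20
        return
    fi
    if [ "$state" = "up" ]; then
        echo 0
    else
        echo 5
    fi
}
```

Running link_score eth1 before and after unplugging the cable should show the value change from 0 to 5, which you can compare with what the HAAst graph reports for the same interval.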
- This reply was modified 5 years ago by WebMaster.