Forum Replies Created
in reply to: Optimal configuration and synchronization interval #6700

By design HAAst does not use disk mirroring (e.g. DRBD) or disk sharing (e.g. NFS, SMB, iSCSI). The reason is that with these protocols file corruption on a failing peer would immediately corrupt files on the other peer, and thereby destroy the entire cluster. Instead, HAAst brings the peers into sync at regular intervals. These intervals should leave enough time for HAAst to detect that a peer is failing, and then prevent synchronization while that peer is unhealthy. Do not use short sync intervals to simulate a mirrored disk (that defeats the benefit of this design).

Your sensor intervals and sync intervals should therefore work hand-in-hand. Keep in mind that HAAst’s internal sensors run at 0.5 second intervals, but external sensors (which you define) can run anywhere from seconds to hours apart. If you are only using internal sensors then synchronization no less than 15 seconds apart is usually sufficient (but 15 seconds is unusually low/short). If you are using external sensors then set your synchronization intervals to no less than 1/2 the sensor interval time.

As well, set your sync intervals to non-multiples of one another; for example, 1,2,4 minute intervals are sub-optimal (as syncs will overlap), while 2,5,7 minute intervals are better (less chance of sync overlap).

As well, set your synchronization intervals to match the value (and rate of change) of the file/data being synchronized. For example, it doesn’t make sense to synchronize a MySQL database every 10 seconds if it only holds configuration data that might be changed once per day. Or, if you have a very large AstDB file (e.g. 10,000 FreePBX users and devices on the host) then ensure your sync duty cycle is less than 30%; for example, if it takes 30 seconds to sync your AstDB then set your sync interval to no less than 90 seconds.
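The rules of thumb above can be sketched as a few small checks. This is only an illustration: the function names are hypothetical and not part of HAAst, and it assumes "duty cycle" means sync duration divided by the start-to-start sync interval. Under that reading a 30 second sync needs an interval of 100 seconds to sit at exactly 30%, and the 90 second figure corresponds to roughly a one-third duty cycle.

```python
from itertools import combinations


def min_sync_interval(sensor_interval_s=None):
    """Shortest suggested sync interval, in seconds (hypothetical helper).

    With only internal sensors the post suggests no less than 15 seconds;
    with external sensors, no less than half the sensor interval.
    """
    if sensor_interval_s is None:
        return 15
    return sensor_interval_s / 2


def min_interval_for_duty_cycle(sync_duration_s, max_duty_percent=30):
    """Smallest whole-second interval keeping duration/interval at or
    below max_duty_percent (integer ceiling division, no float rounding)."""
    return -((-sync_duration_s * 100) // max_duty_percent)


def overlapping_pairs(intervals):
    """Pairs of intervals where one is an exact multiple of the other,
    which means those two syncs will regularly fire at the same time."""
    return [(a, b) for a, b in combinations(sorted(intervals), 2) if b % a == 0]


if __name__ == "__main__":
    print(min_sync_interval())              # internal sensors only
    print(min_sync_interval(600))           # external sensor every 10 minutes
    print(min_interval_for_duty_cycle(30))  # 30 second AstDB sync
    print(overlapping_pairs([1, 2, 4]))     # sub-optimal set from the post
    print(overlapping_pairs([2, 5, 7]))     # better set from the post
```

The multiple check is deliberately crude: intervals such as 2 and 5 minutes still coincide every 10 minutes, just far less often than 1, 2 and 4 do.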
Your interval settings need to balance the benefit of keeping both peers in sync quickly against the risk of cluster failure in the event of file corruption and the cost of adding too much load to a host. There are of course exceptions to every rule, but the above should serve as a guideline. The bottom line is NO: don’t make the interval as short as possible. When Telium is engaged to set up a cluster we usually set intervals in minutes (not seconds), other than for unusual circumstances.

in reply to: Terminate with exit code 156 #6699

Exit code 156 is defined as “Failure to setup NIC control” (see the Detailed Installation Guide for the meaning of all exit codes). This means that the NIC you have told HAAst to use (in the [voipnic] stanza of haast.conf) is not responding as expected or is not available. If you set the log level to DEBUG and restart HAAst, you will see more details of what is going wrong in the haast log file. Most often this relates to configuring HAAst to use an IP address that is already in use somewhere on the network, or to use an interface that is not present in the system (possibly due to a typo in the interface name). If you post the [voipnic] stanza of your haast.conf and the output of ifconfig, we can offer more specific advice.

in reply to: Automatic upgrade of FreePBX #6701

The automatic upgrade feature of FreePBX poses a number of risks to your system, including changing the cluster members to incompatible states. Telium recommends that you disable automatic upgrades and only perform upgrades following the steps outlined in the HAAst operations guide (see https://telium.io/topic/upgrade-to-configuration-generator-freepbx/). Aside from risks to the cluster, configuration generators sometimes release updates that break the PBX (only to be fixed by updates a week later). Many commercial installations using configuration generators avoid updates altogether, unless absolutely necessary (for a feature or security update). Even then, updates should be carefully tested before being promoted to production servers. In our experience most mid to large Asterisk(TM) installations do not use configuration generators, so this problem will primarily affect home office and small office users.
		This reply was modified 5 years, 9 months ago by WebMaster. 
in reply to: Patented call survival add-on #6698

You’re welcome. I forgot to answer the second part of your question: Yes, you can use such a call survival device in combination with HAAst. But we do not recommend it. Such a product does nothing for real-life failure scenarios, but adds new single points of failure. (Nothing to gain, lots to lose.) You are better off using one of the techniques described above to keep calls up. If you want to build your own open source version of the commercial product above, have a look at https://telium.io/topic/keeping-calls-up-when-cluster-switches-to-backup-node/. If you qualify for the OEM edition of HAAst then let HAAst perform the full call continuity function for you.
		This reply was modified 5 years, 9 months ago by WebMaster. 
in reply to: License for preproduction / testing server #6694

Although it’s not listed on the website, we do offer a special Commercial Unlimited (CU) edition license to be used only for preproduction / testing environments. This license includes all functionality of the CU edition, but is limited to 5 simultaneous calls. This license is 1/2 the price of the regular CU edition, and the optional maintenance agreement (for updates) is also 1/2 price. Please contact support@telium.io with your current customer number to purchase this license. (This version is only available to customers with at least 1 CU license.)

If your customer wants to run a high volume of calls over the testing system then they would have to buy a full CU license. We don’t distinguish between testing and production systems if both need full features and full call volume. (We also have to prevent license fraud, so a full license has to be full price.)

in reply to: Upgrade to configuration generator (FreePBX) #6693

When you perform an upgrade/update to any module in FreePBX (even a minor one) there is the possibility that FreePBX will change the structure of the tables in MySQL. Since HAAst will (intentionally) not sync metadata (SQL structures), you must ensure that the peers do not attempt to synchronize data during such an upgrade/update. The Maintenance and Operations Guide shows the complete upgrade procedure (see section 6). But if you are very experienced with Linux & FreePBX, you can follow this short-cut:

- Upgrade A
  - Unplug the network connection from A
  - Upgrade FreePBX on A
- Upgrade B
  - Unplug the network connection from B
  - Replug the network connection to A
  - Upgrade FreePBX on B
- Re-establish cluster
  - Replug the network connection to B
  - Wait for HAAst to re-establish the cluster automatically
  - Use the telnet/web interface to make the preferred peer active. (Or wait for automatic fallback during the maintenance window if enabled in the haast.conf file)

The key concept here is that a standby peer must NOT be able to see an active peer which is running a different version of the configuration generator (or has different modules installed/enabled). Note that this applies only to FreePBX. Other configuration generators do a much better job of managing settings and keeping settings and code aligned.

in reply to: HAAst upgrade procedure (major version upgrade) #6692

Since the Peerlink protocol version has changed, the peers will not be able to talk to each other (over Peerlink) until both peers are upgraded. So if both peers are online at the same time they will not be able to communicate, and both peers will try to take over as active. Consequently, your upgrade procedure must ensure both peers are NOT online at the same time. Since the license version has changed, you will need to request new licenses from Telium. To avoid bringing down the entire cluster you should upgrade and re-license one peer at a time. The overall steps to perform such an upgrade are:

- Upgrade A
  - Stop HAAst on peer A, wait for stop
  - Run the install_files/updatefiles.sh script from the newly downloaded package
  - Unplug the network connection from A
  - Restart HAAst on peer A
  - Request and apply new license to A if required, then restart A
- Switchover
  - Stop HAAst on peer B, wait for stop
  - Replug the network connection to A, ensure A takes over as active
- Upgrade B
  - Run the install_files/updatefiles.sh script from the newly downloaded package
  - Restart HAAst on peer B, ensure cluster forms
  - Request and apply new license to B if required, then restart B
- Fallback to preferred peer (optional)
  - Use the telnet/web interface to make the preferred peer active. (Or wait for automatic fallback during the maintenance window if enabled in the haast.conf file)

The answer depends on the location of your two PBXs. If the two PBXs are located on the same subnet, then:

- Move IP: Use the VoIPNIC option of HAAst to move a single IP between peers. This will allow for rapid reconnection of downstream (user agents) and upstream (trunks) devices.

If the two PBXs are located on different subnets (from each other):

- SRV records: Assuming your user agents (phone sets) support SRV records (which most do), then you should create SRV records for your two PBXs. Most user agents will perform a DNS lookup for SRV records to find available PBXs, and try them in order of priority until they successfully register with a PBX. For example, if you have PBXs located in data centers dc1 and dc2, create two DNS entries (in your internal DNS server) as follows:

 type=srv
 name=_sip._udp.mydomain.com
 priority=10
 weight=0
 port=5060
 hostname=pbx1.local

 and

 type=srv
 name=_sip._udp.mydomain.com
 priority=20
 weight=0
 port=5060
 hostname=pbx2.local

- Route Change: Use the pre/post Asterisk start/stop event handlers of HAAst to update routes in your router(s). Set the updated routes to point to the new PBX address.
- DNS update: Use the pre/post Asterisk start/stop event handlers to update a public DNS entry. Be sure to set the TTL value low enough that phones will look up the new IP in a reasonable timeframe.

Note: Using SRV records or DNS entries makes it easy for users with softphones to move on and off LAN and resume a PBX connection without manual intervention.

The answer depends on the location of your two PBXs. If the PBXs are located in the same data center (i.e. using the same external IP address), then no change is necessary as they will connect to the same IP address. If you need to modify your firewall/router internally to direct traffic to the active peer then see the answer to the question on locating the PBX for internal phones. On the other hand, if the PBXs are located in different data centers (i.e. accessible using different public IP addresses) then your options are:

- SRV records: Assuming your user agents (phone sets) support SRV records (which most do), then you should create SRV records for your two PBXs. Most user agents will perform a DNS lookup for SRV records to find available PBXs, and try them in order of priority until they successfully register with a PBX. For example, if you have PBXs located in data centers dc1 and dc2, then create two DNS entries (in your public DNS server) as follows:
 type=srv
 name=_sip._udp.mydomain.com
 priority=10
 weight=0
 port=5060
 hostname=dc1.mydomain.com

 and

 type=srv
 name=_sip._udp.mydomain.com
 priority=20
 weight=0
 port=5060
 hostname=dc2.mydomain.com
- DNS update: Use the pre/post asterisk start/stop event handlers to update a public DNS entry. Be sure to set the TTL value low enough that phones will lookup the new IP in a reasonable timeframe.
- MPLS: If you use MPLS then you can simply move the label (to move IP between routers of your two DC’s). We don’t provide any further detail on this option (i.e. if you don’t understand how to do this with MPLS, then there’s too much to explain in one post)
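
As a rough illustration of how a user agent walks the two SRV records shown above, the sketch below orders targets by priority and picks the first one that answers. The `SrvRecord` class, `failover_order`, `first_reachable` and the `is_reachable` callback are hypothetical names for this example only. Real phones implement this in firmware, and the weighted random selection among equal-priority records described in RFC 2782 is simplified here to "higher weight first".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    priority: int  # lower value is tried first
    weight: int    # tie-breaker among equal priorities (simplified here)
    port: int
    target: str


def failover_order(records):
    """Order SRV targets the way a client would try them: lowest priority
    value first. Equal priorities are sorted by descending weight, which is
    a simplification of the weighted random choice in RFC 2782."""
    return sorted(records, key=lambda r: (r.priority, -r.weight))


def first_reachable(records, is_reachable):
    """Return the first target for which is_reachable(record) is true,
    or None if no PBX responds."""
    for record in failover_order(records):
        if is_reachable(record):
            return record
    return None


# The two records from the example above (hostnames taken from the post).
RECORDS = [
    SrvRecord(priority=20, weight=0, port=5060, target="dc2.mydomain.com"),
    SrvRecord(priority=10, weight=0, port=5060, target="dc1.mydomain.com"),
]

if __name__ == "__main__":
    # Simulate dc1 being down: the client falls through to dc2.
    chosen = first_reachable(RECORDS, lambda r: r.target != "dc1.mydomain.com")
    print(chosen.target if chosen else "no PBX reachable")
```

With both data centers up, registration goes to dc1 (priority 10); dc2 is only tried when dc1 fails to respond.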

Note: Using SRV records or DNS entries makes it easy for users with softphones to move on and off LAN and resume a PBX connection without manual intervention.

in reply to: Asterisk 14 compatability #6689

Yes, HAAst was certified Asterisk 14 compatible in November 2016. Not a lot of companies are running Asterisk 14 yet (in production) as of Jan 2017, so you will be on the leading edge. But I assume you need some Asterisk 14 features. You didn’t mention your customer name/number but based on the username (and 4500 phone sets) I think I found you in our CRM system. It looks like your maintenance agreement expired last year so you will need to contact admin@telium.io to upgrade your license (otherwise upgrading HAAst will cause it to run as the ‘free edition’).

in reply to: OK to use rsync or NFS share for cluster data #6688

You are welcome to use rsync, NFS, Samba, etc. in your cluster. However, we generally recommend keeping data on each peer and allowing HAAst to control all synchronization, and here’s why:

- HAAst only synchronizes data between peers if the peers have passed a health check. That means if one node is failing and starts to accidentally corrupt data, the corrupt data will not be copied to the other peer! Tools like rsync, NFS, DRBD, etc. will immediately share/mirror all data, including corrupt data.
- By allowing HAAst to control synchronization, the HAAst event handler system will allow you to customize inbound data following a synchronization (e.g. update trunk information, modify the dialplan, customize TFTP files for the local network, etc.)

You should not place databases on a shared file system (NFS/Samba), or on a block-level shared or mirrored device (iSCSI, DRBD), as corruption by one peer will destroy the database for the other peer! Even worse, a failure midway through a write will corrupt both peers! Note that HAAst performs SQL transactions (not block level access) for database synchronization, so even if a peer fails midway through a database write neither peer will be left with an invalid database state.

The one exception to this rule is if you need to archive a high volume of files, or very large files, that are written once and thereafter only read. A perfect example of this is call center call recordings or logs. A call center can easily generate gigabytes of recordings every minute, to be referenced in the future in case of dispute or for quality assurance. Since these are large files written once and then archived, they are the perfect example of data that should be written to a server share, common iSCSI device, etc. It would not make sense to generate the high network load and disk load required to continually create a second copy of this data.

in reply to: License violation but no calls in progress #6687

The fact that the license violation occurs close to the time of a log rotation is a red herring (no relationship). SecAst does not track calls in progress; it asks Asterisk to report the number of calls in progress. You can perform the same query from the command line:
 
 asterisk -vx 'core show calls'
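
If you would rather script this query than run it by hand, a minimal sketch follows. It assumes the command prints lines of the form "N active calls" and "N calls processed" (the usual Asterisk CLI output, but check your version), and it reuses the exact flags from the command above. The function names are hypothetical.

```python
import re
import subprocess


def parse_core_show_calls(output):
    """Extract (active_calls, calls_processed) from 'core show calls' text.

    Assumes lines such as '8 active calls' and '1234 calls processed';
    a value is None when its line is not found.
    """
    active = re.search(r"(\d+)\s+active calls?", output)
    processed = re.search(r"(\d+)\s+calls? processed", output)
    return (
        int(active.group(1)) if active else None,
        int(processed.group(1)) if processed else None,
    )


def query_asterisk():
    """Run the same command as above on the local host and parse the result.

    Needs the asterisk binary on PATH and permission to reach the running
    Asterisk instance; raises CalledProcessError if the command fails.
    """
    result = subprocess.run(
        ["asterisk", "-vx", "core show calls"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_core_show_calls(result.stdout)


if __name__ == "__main__":
    # Parse a canned sample so the sketch runs without Asterisk installed.
    sample = "8 active calls\n1234 calls processed\n"
    print(parse_core_show_calls(sample))
```

Calling `query_asterisk()` periodically and comparing successive "calls processed" values shows whether calls are being placed when no users are on the phone.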
 So the question is why your Asterisk installation is reporting 8 calls in progress. This can be due to:

- Valid users making calls in or out
- Valid users starting the conference feature
- Incoming callers leaving a voicemail
- Automated calls
- Hackers calling in to probe your dialplan
- Asterisk incorrectly not releasing channels
- Dialplan errors
 If the number of calls reported is higher than you expect, you can delve deeper into the calls in progress using a command like:

 asterisk -vx 'core show channels'

If you are using FreePBX then Sangoma recently started making automatic calls in the background to set ‘time condition’ variables. In essence FreePBX is making invisible calls, and Asterisk will report these as calls in progress; there is nothing we can do about it, and that won’t explain 8 calls in progress.

So…in a nutshell SecAst does not count calls; it gets that number from Asterisk. Something else is going on with your Asterisk setup. Repeat the first command above once every 30 seconds and watch if your ‘calls processed’ count is increasing even when users aren’t making calls. That should help you figure out why Asterisk is reporting a count you don’t expect!

And now the bad news…it sounds like you’re struggling with some basic Linux admin and Asterisk admin tasks. If this is a commercial installation I would recommend purchasing 2 hours of support so we can help you through setup. If this is a home installation you probably have a big learning curve ahead of you in terms of Ubuntu and Asterisk. I’m not sure if it’s worthwhile for you to continue, but we can’t really offer free support for Asterisk (or Ubuntu). I’m not sure if you are using a configuration generator either (you don’t offer any details of your system), but if this is a commercial installation you may want to move up to a package like xCALLY which provides a very professional turnkey solution without many of the headaches involved with many smaller packages (you don’t need to know anything about Linux or Asterisk).

in reply to: Thank you! #6686

You’re very welcome. It was a great project, and I’m glad we were your partner for this important project.

in reply to: Which peers takes over when cluster reassembles #6685

When the peers are in a state of dual-active contention one of the peers is considered improperly active (invalid state). In other words, it should not be processing calls at all. Usually trunks (E1/T1/SIP/IAX/etc.) are forced to one peer or the other, which means one peer will not be getting (more) calls. If your configuration allows both peers to handle calls simultaneously then you are in the minority (this is not typical). For this reason the number of active calls is not a criterion in determining which peer to demote.

in reply to: Which peers takes over when cluster reassembles #6683

When the cluster reassembles HAAst will discover 2 peers active (called ‘dual-active contention’). HAAst will then try to pick the peer with the lowest likelihood of long-term success (probability of staying active) and demote it. The determination of which peer is least likely to succeed considers:

- Which peer caused the previous failover
- How many failures has each peer had
- How long has each peer been running
- What was the last health score of each peer
- And more…
 This works well when the peers are configured as equals (primary/primary), which implies that the cluster would be happy with either peer running. However, when the peers are not configured as equals (primary/backup), the determination of which peer should demote may result in the backup server remaining active. This describes the situation you encountered, and this is normal behavior (by design). As of version 2.3.2.14 the administrator can override the demotion decision, ignoring the criteria listed above. If the ‘autodemote’ key is set to true in the [backuprole] stanza of either peer, then HAAst will always demote that peer. 
- 
		This reply was modified 5 years, 9 months ago by 