High Availability

rayoelkins
Level 4

Looking for a document that goes into the specifics of high availability clustering in more detail. For example, something that describes the expected behavior of collaboration applications and IP phones during a hardware failover event, along with any caveats to the transparency of the failover process. I've checked the PAs, CVDs, and SRNDs, but none go into great detail. I can already see a Business Edition client asking "what happens if there is no hardware failure but just a single VM failure?"

Accepted Solutions

dakeller
Cisco Employee

Ray,

High availability was a huge topic in the early days of Cisco UCM. It was a key topic that PBX manufacturers used as FUD when defending their installed base with their customers. But it's no longer a topic, because customers are satisfied with the highly available services that the Cisco UC solution provides.

Multiple levels of high availability are built into not only the UCM core product but also the clients. First off, a UCM cluster is designed to be a highly resilient and highly available platform. Devices (phones and MGCP gateways) maintain two TCP connections into a cluster: one to a primary call processing node (a subscriber) and one to a backup node (another subscriber). If one of the call processing nodes fails, the device will 'activate' the backup link. Calls stay active on the devices, but the only operation that can be performed on a call that went through the failover is to end it. The same holds true for SIP and H323 trunks, except that their redundancy is achieved through multiple references to different nodes in the cluster.
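To make the behavior above concrete, here is a minimal Python sketch of it. It is only a toy model, not UCM code: the `Device` class, its fields, and the operation names such as "hold" are all hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Toy model of a phone with a primary and a backup subscriber (hypothetical)."""
    primary: str
    backup: str
    active_node: str = ""
    in_call: bool = False
    call_preserved: bool = False  # True once an active call has survived a failover

    def __post_init__(self):
        # The device starts out using its primary call processing node.
        self.active_node = self.primary

    def node_failed(self, node: str) -> None:
        """React to the failure of a call processing node."""
        if node == self.primary and self.active_node == self.primary:
            # 'Activate' the backup link.
            self.active_node = self.backup
            if self.in_call:
                # The call stays up, but it is now restricted.
                self.call_preserved = True

    def request(self, operation: str) -> bool:
        """Return True if the operation is allowed on the current call."""
        if self.in_call and self.call_preserved and operation != "end":
            # On a call that went through a failover, only 'end' is allowed.
            return False
        if operation == "end":
            self.in_call = False
            self.call_preserved = False
        return True
```

In this model, a device that is on a call when its primary node fails moves to the backup node, rejects a request such as "hold", and accepts "end".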

Cisco recommends at least two call processing servers to provide highly available call processing services. If the two servers (or VMs) are on different compute hardware, then the loss of a single VM or a single ESXi host leaves the other VM/ESXi host running to take on the load. Devices activate their backup link, and users will rarely notice that the failure occurred. Failback occurs in a similar fashion once the primary call control node has been stable and available for 3-5 min.
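The failback rule can be sketched the same way. This is an illustrative model only: the function name, its parameters, and the 180-second default (the low end of the 3-5 min window mentioned above) are assumptions, not actual UCM settings.

```python
from typing import Optional

# Assumption for illustration: use the low end of the 3-5 min stability window.
STABLE_SECONDS = 180.0


def choose_node(currently_on: str,
                primary_up_since: Optional[float],
                now: float,
                stable_seconds: float = STABLE_SECONDS) -> str:
    """Return 'primary' or 'backup' for a device (hypothetical helper).

    primary_up_since is the time (in seconds) at which the primary node came
    back up, or None while the primary is down.
    """
    if primary_up_since is None:
        # Primary is down: use the backup link.
        return "backup"
    if currently_on == "primary":
        # Primary is up and already in use: nothing to do.
        return "primary"
    if now - primary_up_since >= stable_seconds:
        # Primary has been stable long enough: fail back.
        return "primary"
    # Primary is up but not yet proven stable: stay on the backup.
    return "backup"
```

For example, a device on its backup node stays there while the primary has been up for only 100 seconds, and moves back once the primary has been up for 180 seconds or more.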

This failover mechanism is the same for CTI applications, SIP/H323 trunks, SIP/SCCP phone devices, VM ports, CSF clients, etc. 

You can read more about redundancy at Cisco Unified Communications Manager System Guide, Release 10.0(1) - Redundancy [Cisco Unified Communications Manager (C…

It's in the 10.0 System Guide, but applies to every version back to 6.x.

Thanks,

Dan Keller

Technical Marketing Engineer

2 Replies

Thanks for the great info, Daniel. Exactly what I needed.