CUCM Call-Processing Redundancy
A cluster is a set of networked servers, each of which can be configured to provide specific services. Some servers in the cluster can provide CUCM call-processing services, while others provide Computer Telephony Integration (CTI), Trivial File Transfer Protocol (TFTP), and media services such as conferencing or music on hold (MOH). These services can be provided by the subscribers and the publisher and can be shared by all servers.
Clustering provides several benefits. It allows the network to scale up to 40,000 endpoints, provides redundancy in case of network or server failures, and provides a central point of administration. CUCM also supports clusters for load sharing. Database redundancy is provided by sharing a common database, whereas call-processing redundancy is provided by CUCM groups.
A cluster consists of one publisher and a total maximum of 20 servers (nodes) running various services, including TFTP, media resources, conferencing, and call processing. You can have a maximum of eight nodes for call processing (running the Cisco CallManager service).
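The sizing limits above can be expressed as a simple sanity check on a planned cluster layout. The following is a minimal sketch, not a Cisco tool: the function name, the node names, and the dictionary layout are hypothetical, and only the three rules stated in the text (one publisher, at most 20 servers, at most eight nodes running the Cisco CallManager service) are checked.

```python
# Limits taken from the text; everything else here is illustrative.
MAX_CLUSTER_NODES = 20          # publisher plus all other servers
MAX_CALL_PROCESSING_NODES = 8   # nodes running the Cisco CallManager service


def validate_cluster(nodes):
    """Return a list of problems found in a planned cluster layout.

    `nodes` maps a (hypothetical) node name to a dict with the keys
    'role' ('publisher' or 'subscriber') and 'services' (a set of names).
    """
    problems = []

    publishers = [name for name, cfg in nodes.items() if cfg["role"] == "publisher"]
    if len(publishers) != 1:
        problems.append(f"expected exactly 1 publisher, found {len(publishers)}")

    if len(nodes) > MAX_CLUSTER_NODES:
        problems.append(f"{len(nodes)} servers exceeds the limit of {MAX_CLUSTER_NODES}")

    call_nodes = [name for name, cfg in nodes.items() if "CallManager" in cfg["services"]]
    if len(call_nodes) > MAX_CALL_PROCESSING_NODES:
        problems.append(
            f"{len(call_nodes)} call-processing nodes exceeds the limit of "
            f"{MAX_CALL_PROCESSING_NODES}"
        )
    return problems


# Example: one publisher/TFTP server and nine call-processing subscribers.
plan = {"pub": {"role": "publisher", "services": {"TFTP"}}}
for i in range(1, 10):
    plan[f"sub{i}"] = {"role": "subscriber", "services": {"CallManager"}}

for problem in validate_cluster(plan):
    print(problem)
```

In this example the plan is rejected only because nine nodes run the call-processing service; the total of ten servers is well within the cluster limit.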
For a quick recap, a CUCM cluster has one publisher server, which is responsible for replicating the database to the subscriber nodes in the cluster. The publisher stores the call detail records and is typically where most configuration changes are made. The exception, starting with CUCM 8.0, is that database modifications for user-facing call-processing features are made on the subscriber servers. The subscriber servers replicate the publisher’s database to maintain configuration consistency across the members of the cluster and to provide spatial redundancy of the database.
To process calls correctly, CUCM needs to retrieve configuration settings for all devices. These settings are stored in an IBM Informix Dynamic Server (IDS) database. The database is the repository for information such as service parameters, features, device configurations, and the dial plan.
The database replicates nearly all information in a star topology (one publisher, many subscribers). However, CUCM nodes also use a second communication method to replicate run-time data in a mesh topology (every node updates every other node), as shown in Figure 2-9. This type of communication is used for dynamic information that changes more frequently than the database does. The primary use of this replication is to communicate newly registered phones, gateways, and DSP resources, so that calls are routed optimally between members of the cluster and the associated gateways.
Figure 2-9 Cisco Unified Communications Manager Database Replication Overview
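To make the difference between the two replication patterns concrete, the sketch below enumerates the node-to-node relationships in each topology. It is purely illustrative: the node names are hypothetical, and it models only which pairs of nodes exchange data, not how CUCM actually carries that traffic.

```python
from itertools import combinations


def star_links(publisher, subscribers):
    """Star (hub-and-spoke): the publisher pairs with each subscriber."""
    return [(publisher, sub) for sub in subscribers]


def mesh_links(nodes):
    """Full mesh: every node pairs with every other node."""
    return list(combinations(nodes, 2))


publisher = "pub"
subscribers = ["sub1", "sub2", "sub3", "sub4"]

star = star_links(publisher, subscribers)
mesh = mesh_links([publisher] + subscribers)

print(len(star), "star relationships")   # n - 1 for n nodes
print(len(mesh), "mesh relationships")   # n * (n - 1) / 2 for n nodes
```

For a five-node cluster, the star has 4 relationships and the mesh has 10, which shows why the mesh grows much faster as nodes are added.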
Database replication is fully meshed between all servers within a cluster. Static configuration data, because it is created through moves, adds, and changes, is always stored on the publisher and replicated one way from the publisher to each subscriber in the cluster. However, user-facing feature data (for example, data for Cisco Extension Mobility features) is writable on a subscriber and is replicated from the updated subscriber to all other servers. All nonuser-facing feature data can be written only to the publisher database and is replicated from the publisher to all subscribers.
User-facing features are typically characterized by the fact that a user can enable or disable the feature directly on their phone by pressing one or more buttons, as opposed to changing a feature through a web-based GUI.
As illustrated in Figure 2-10, the user-facing features listed below do not rely on the availability of the publisher. The dynamic user-facing feature data can be written to the subscriber to which the device is registered. The data is then replicated to all other servers within the cluster. Because the data can be written to the subscriber, the user-facing features continue to function in the event of a publisher failure.
Figure 2-10 User-Facing Feature Processing
User-facing features are any features that can be enabled or disabled by pressing buttons on the phone and include the following:
Call Forward All (CFA)
Message Waiting Indicator (MWI)
Privacy Enable/Disable
Do Not Disturb (DND) Enable/Disable
Cisco Extension Mobility Login
Hunt-Group Logout
Device Mobility
CTI CAPF status for end users and application users
Therefore, most data (all nonuser-facing feature data) is still replicated in hub-and-spoke style (publisher to subscribers), while user-facing feature data is replicated bidirectionally between all servers.
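The write rules described above can be summarized as a small decision function. This is a minimal sketch under stated assumptions: the function name, the node names, and the example setting "Route Pattern" are hypothetical, and the feature set simply mirrors the list in the text.

```python
# User-facing features as listed in the text (names abbreviated for lookup).
USER_FACING_FEATURES = {
    "Call Forward All",
    "Message Waiting Indicator",
    "Privacy",
    "Do Not Disturb",
    "Extension Mobility Login",
    "Hunt-Group Logout",
    "Device Mobility",
    "CTI CAPF status",
}


def write_target(feature, registered_node, publisher, publisher_up=True):
    """Return the node that accepts the write, or None if it cannot be made.

    User-facing feature data is written to the subscriber the device is
    registered with; all other data can be written only to the publisher.
    """
    if feature in USER_FACING_FEATURES:
        return registered_node
    return publisher if publisher_up else None


# With the publisher down, a user can still toggle Do Not Disturb...
print(write_target("Do Not Disturb", "sub1", "pub", publisher_up=False))
# ...but a configuration change such as a new route pattern has to wait.
print(write_target("Route Pattern", "sub1", "pub", publisher_up=False))
```

After a user-facing write lands on a subscriber, that subscriber replicates the change to all other servers; that replication step is not modeled here.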
Cisco Unified Communications Manager Groups: 1:1 Design
A 1:1 CUCM redundancy deployment design, as illustrated in Figure 2-11, guarantees that Cisco IP phone registrations never overwhelm the backup servers, even if multiple primary servers fail concurrently. This design provides high availability and simplifies the configuration. However, the 1:1 redundancy design has an increased server count compared with other redundancy designs and may not be cost-effective.
Figure 2-11 1:1 Redundancy Design
The other services (dedicated database publisher, dedicated TFTP server, or MOH servers) and media-streaming applications (conference bridge or MTP) may also be enabled on a separate server that registers with the cluster.
Each cluster must also provide the TFTP service, which is responsible for delivering IP phone configuration files to telephones, along with streamed media files, such as MOH and ring files. Therefore, the server that is running the TFTP service can experience a considerable network and processor load.
Depending on the number of devices that a server supports, you can run the TFTP service on a dedicated server, on the database publisher server, or on any other server in the cluster.
In Figure 2-11, a virtual machine deployed from the Open Virtualization Archive (OVA) template that supports the maximum number of users functions as the dedicated database publisher and TFTP server. In addition, there are two call-processing servers supporting a maximum of 10,000 Cisco IP phones. One of these two servers is the primary server; the other is a dedicated backup server. In a smaller IP telephony deployment (fewer than 1000 IP phones), the functions of the database publisher and the TFTP server can be provided by the primary or secondary call-processing server. In this case, only two servers are needed in total.
When you increase the number of IP phones, you must increase the number of CUCM servers to support them. Some network engineers may consider the 1:1 redundancy design excessive because a well-designed network is unlikely to lose more than one primary server at a time. Given the low probability of server loss and the increased server cost, many network engineers choose the 2:1 redundancy design, which is explained in the following section.
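The server counts for the 1:1 design follow from simple arithmetic, sketched below. The figures of 10,000 phones per call-processing server, 40,000 phones per cluster, and the 1000-phone threshold for co-locating the publisher and TFTP roles come from the text; the function name and the assumption that a single dedicated publisher/TFTP server is used above that threshold are simplifications made for illustration.

```python
import math

PHONES_PER_SERVER = 10_000    # per call-processing server (from the text)
MAX_CLUSTER_PHONES = 40_000   # cluster-wide endpoint limit (from the text)
SMALL_DEPLOYMENT = 1_000      # below this, publisher/TFTP can be co-located


def servers_1_to_1(phones):
    """Estimate the server count for a 1:1 redundancy design."""
    if not 0 < phones <= MAX_CLUSTER_PHONES:
        raise ValueError("phone count must be between 1 and 40,000")

    primaries = math.ceil(phones / PHONES_PER_SERVER)
    backups = primaries  # 1:1 means one dedicated backup per primary
    # Assumption: one dedicated publisher/TFTP server unless the deployment
    # is small enough to co-locate those roles on a call-processing server.
    publisher_tftp = 0 if phones < SMALL_DEPLOYMENT else 1

    return {
        "primary": primaries,
        "backup": backups,
        "publisher_tftp": publisher_tftp,
        "total": primaries + backups + publisher_tftp,
    }


for phones in (800, 10_000, 40_000):
    print(phones, servers_1_to_1(phones))
```

At 40,000 phones the design uses four primary and four backup servers, which is exactly the limit of eight call-processing nodes per cluster.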
Cisco Unified Communications Manager Groups: 2:1 Design
Figure 2-12 shows a basic 2:1 redundancy design. While the 2:1 redundancy design offers some redundancy, there is the risk of overwhelming the backup server if multiple primary servers fail. In addition, upgrading the CUCM servers can cause a temporary loss of some services, such as TFTP or DHCP, because a reboot of the CUCM servers is needed after the upgrade is complete.
Figure 2-12 2:1 Redundancy Design
Network engineers use this 2:1 redundancy model in most IP telephony deployments because of the reduced server costs. If a virtual machine with the largest OVA template is used (shown in Figure 2-11), and the host server is equipped with redundant, hot-swappable power supplies and hard drives and is properly connected and configured, it is unlikely that multiple primary servers will fail at the same time. This makes the 2:1 redundancy model a viable option for most businesses.
As shown in the first scenario in Figure 2-12, when no more than 10,000 IP phones are used, there are no savings in the 2:1 redundancy design compared with the 1:1 redundancy design, simply because there is only a single primary server.
In the scenario with up to 20,000 IP phones, there are two primary servers (each serving 10,000 IP phones) and one secondary server. As long as only one primary server fails, the backup server can provide complete support. If both primary servers failed, the backup server would be able to serve only half of the IP phones.
The third scenario shows a deployment with 40,000 IP phones. Four primary servers are required to support this number of IP phones. For each pair of primary servers, there is one backup server. As long as no more than one primary server in each pair fails (at most two failures in total), the backup servers can provide complete support, and all IP phones will operate normally.
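The three 2:1 scenarios can be reproduced with a short calculation. This is a minimal sketch that assumes every primary server is fully loaded with 10,000 phones and that a backup server has the same capacity as a primary; the function names are hypothetical.

```python
import math

PHONES_PER_SERVER = 10_000  # capacity of each primary or backup server


def servers_2_to_1(phones):
    """Return (primaries, backups) for a 2:1 redundancy design."""
    primaries = math.ceil(phones / PHONES_PER_SERVER)
    backups = math.ceil(primaries / 2)  # one backup per pair of primaries
    return primaries, backups


def phones_without_service(failed_per_pair):
    """Count phones left unregistered after primary failures.

    `failed_per_pair` lists, for each pair of primaries that shares one
    backup, how many of its primaries have failed (0, 1, or 2).
    """
    lost = 0
    for failed in failed_per_pair:
        displaced = failed * PHONES_PER_SERVER
        # The shared backup absorbs at most one server's worth of phones.
        lost += max(0, displaced - PHONES_PER_SERVER)
    return lost


for phones in (10_000, 20_000, 40_000):
    print(phones, servers_2_to_1(phones))

# 40,000-phone cluster: one failure in each pair is fully covered...
print(phones_without_service([1, 1]))
# ...but two failures in the same pair strand one server's worth of phones.
print(phones_without_service([2, 0]))
```

The second function shows why the location of the failures matters: two failed primaries are tolerated only when they belong to different pairs.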