Top 10 Features in Windows Server 2019 Failover Clustering: Full Overview

As the demand for uninterrupted performance grows, modern businesses are looking for new ways to ensure 99.999% availability of their services. Most organizations cannot tolerate even minimal downtime because the cost of lost productivity is too high. The potential consequences of an unexpected system failure include loss of revenue, business opportunities, productivity, and customer trust. Even if you manage to recover from such serious repercussions, downtime can still slow your business growth and negatively shape the future of your organization.
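To put the 99.999% figure in perspective, the short calculation below converts an availability target into a yearly downtime budget. It is only an arithmetic sketch: the function name is illustrative and a 365-day year is assumed. Under that assumption, "five nines" leaves roughly 5.26 minutes of downtime per year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year (525,600 minutes)


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


# Compare a few common availability targets.
for target in (99.0, 99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_minutes_per_year(target):.2f} minutes/year")
```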

To mitigate the risks of downtime, you need to ensure that your business can still provide its services even when the system or any of its components goes down. The most reliable approach is to build a highly available environment by making every system component redundant. There are various ways to improve the availability of your environment, for example with the help of Windows Server backup. Another popular option is Failover Clustering.

In this blog post, we describe how Failover Clustering works in Windows Server. We also discuss how the Failover Clustering functionality has changed with the release of Windows Server 2019 and provide an overview of the top 10 features in Windows Server 2019 Failover Clustering.

NAKIVO for Hyper-V Replication

Business continuity in any failure scenario with robust replication for Hyper-V VMs. Verify replicas in seconds and automate VM failover to minimize downtime.

The Basics of Windows Server Failover Clustering

A failover cluster is a group of two or more servers (nodes) that work together to keep clustered roles and services highly available and scalable. The clustered nodes share network and storage resources and can be connected by physical cables, by software, or at the application level. If a cluster node fails, its services are taken over by the remaining nodes. This process is known as failover, and it helps minimize service disruption, reduce downtime, and respond to host failure quickly and efficiently.
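The failover step described above can be sketched as a small function that moves clustered roles off a failed node. This is a simplified illustration, not how Windows Server decides placement: the role and node names are hypothetical, and the round-robin choice of target node is an assumption made only to keep the example short.

```python
def fail_over(placement: dict, failed_node: str, healthy_nodes: list) -> dict:
    """Return a new role -> node mapping with roles moved off the failed node."""
    if not healthy_nodes:
        raise RuntimeError("no healthy node left to take over")
    new_placement = {}
    next_target = 0
    for role, node in placement.items():
        if node == failed_node:
            # Assumption: spread orphaned roles round-robin over the healthy nodes.
            new_placement[role] = healthy_nodes[next_target % len(healthy_nodes)]
            next_target += 1
        else:
            # Roles on healthy nodes stay where they are.
            new_placement[role] = node
    return new_placement


# Hypothetical three-node cluster in which node1 has just failed.
placement = {"sql": "node1", "file-share": "node1", "vm-web": "node2"}
print(fail_over(placement, failed_node="node1", healthy_nodes=["node2", "node3"]))
```

Roles that were already running on healthy nodes are left untouched, which mirrors the goal of keeping service disruption to a minimum.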

Moreover, it is critical that you are able to control the state of nodes in each failover cluster. Using the internal monitoring tool, you can verify that all nodes in the failover cluster are functional and can perform all the required functions. This way, you can identify any unhealthy nodes in the cluster and reduce the risk of cluster node failure.

How Windows Server 2019 Failover Clustering Works

With Windows Server Failover Clustering, you can build multiple failover clusters to ensure high availability for your applications and services. For this functionality to work, you need at least two servers (for example, one active and one passive) that share the same storage and networks and meet specific hardware requirements.

The two machines can communicate using the heartbeat functionality, whereby they send ‘heartbeat’ signals to one another via a dedicated network. Two types of signals are differentiated: push and pull heartbeats. A push heartbeat is sent from an active server to a passive one, whereas a pull heartbeat is sent from a passive server to an active one. These communication signals are sent/received at regular intervals. Thus, if a ‘heartbeat’ fails to reach a server at the expected time, server failure is identified and the workloads of the failed machine are taken over by the standby server.
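The timeout logic behind heartbeat-based failure detection can be sketched as follows. This is a minimal model under stated assumptions: the class name, the one-second interval, and the threshold of five missed heartbeats are illustrative values, not the settings Windows Server uses.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    interval_s: float = 1.0    # assumption: one heartbeat expected per second
    missed_threshold: int = 5  # assumption: missed heartbeats tolerated before failover
    last_seen_s: float = 0.0   # timestamp of the most recent heartbeat

    def record_heartbeat(self, now_s: float) -> None:
        """Remember when the peer was last heard from."""
        self.last_seen_s = now_s

    def peer_is_down(self, now_s: float) -> bool:
        """Treat the peer as failed once enough intervals have passed in silence."""
        missed = (now_s - self.last_seen_s) / self.interval_s
        return missed >= self.missed_threshold


monitor = HeartbeatMonitor()
monitor.record_heartbeat(now_s=10.0)
print(monitor.peer_is_down(now_s=12.0))  # 2 silent intervals: still considered up
print(monitor.peer_is_down(now_s=15.0))  # 5 silent intervals: treated as failed
```

Tolerating a few missed heartbeats before declaring failure is a common design choice, since a single delayed packet should not trigger a failover.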

To learn how to make your environment highly available and resilient to system failures, you can read our blog post on how to enable Hyper-V High Availability using failover clustering. For a more detailed overview of this technology, you can download our eBook which describes how to deploy a failover cluster, which requirements should be met to create a failover cluster, and how NAKIVO Backup & Replication can ensure continuous protection of Hyper-V clusters.

Top New Features of Failover Clustering for Windows Server 2019

With each release, Microsoft developers add new features and improve existing functionality, and Windows Server 2019 is no exception. Alongside enhancements such as hybrid cloud integration, additional layers of security, and hyperconvergence, this operating system also takes the Failover Clustering functionality to another level. Below, you can see the new Windows Server 2019 features and how they have transformed the Failover Clustering functionality.

Cross-domain cluster migration

The process of cross-domain cluster migration used to be a complex and time-consuming endeavor. Nodes and clusters couldn’t be simply moved between different domains. This process required complete reconfiguration of failover clusters, which resulted in unwanted service disruptions and considerable downtime. With Windows Server 2019, you can finally migrate failover clusters from one Active Directory domain to another. By ensuring fast and simple domain consolidation, you can save time, effort, and resources.

Cluster Shared Volumes enhancements

The Cluster Shared Volume (CSV) cache allocates system memory as a write-through cache for read-only unbuffered I/O. This improves the performance of Hyper-V virtual machines, which use unbuffered I/O when accessing virtual hard disks. In Windows Server 2019, the CSV cache is enabled by default, which means faster performance for VMs running on top of Cluster Shared Volumes. Other enhancements include improved logic for detecting issues in the cluster and repairing them promptly. This covers both the detection of cluster network routes and the detection of partitioned nodes.

Azure-aware clusters

Windows Server 2019 was designed for seamless integration of hybrid capabilities in your datacenter. Windows failover clusters are now Azure-aware, meaning that they can automatically detect when they are running inside Azure. As a result, they can optimize themselves automatically, providing proactive failover and logging of Azure planned maintenance events. In addition, you no longer have to configure a load balancer for the cluster name, because the cluster can use a Distributed Network Name.

USB file share witness for quorum

The heartbeat functionality mentioned above allows you to check the status of each node in the cluster. However, in case of an unexpected network failure, the cluster nodes cannot communicate with each other. This results in a split-brain scenario, whereby each node assumes that it is the only functioning instance in the cluster and all of them start running the same workloads at the same time. This may cause data corruption or various types of data conflicts.

The quorum technology has been designed to solve this issue. Each node has a vote, and only the part of the cluster that holds the majority of votes keeps running; the other nodes are forced to stop. However, if there is an even number of nodes in the cluster (e.g., a two-node cluster), the votes can be tied, so cluster members might fail to achieve quorum and determine which of the nodes should continue running. As a result, the cluster stops working altogether.

With Windows Server 2019, you can use a USB drive attached to a commodity network device as a witness for the failover cluster quorum. In this case, the USB witness also has a vote and it can provide a casting vote to avoid the split-brain scenario.
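The effect of a witness vote can be illustrated with a simple majority check. This is a deliberately simplified model that assumes one vote per node plus one for the witness; it ignores vote weighting and any dynamic adjustments a real cluster may apply, and the function name is illustrative.

```python
def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """A partition keeps running only if it holds a strict majority of all votes."""
    return reachable_votes > total_votes // 2


# Two-node cluster without a witness: after a network split,
# each node sees only its own vote, so neither side has a majority.
print(has_quorum(reachable_votes=1, total_votes=2))  # False

# Same cluster with a witness (3 votes in total): the node that can
# still reach the witness holds 2 of 3 votes and keeps running.
print(has_quorum(reachable_votes=2, total_votes=3))  # True
print(has_quorum(reachable_votes=1, total_votes=3))  # False
```

Because only one side of a split can hold a strict majority, at most one partition continues to run, which is what prevents the split-brain scenario.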

Upgraded file share witness for quorum scenarios

With the Windows Server 2019 release, the quorum voting mechanism has become even more failure-tolerant. The upgraded File Share Witness can benefit you in the following cases:

  • When you cannot access a cloud witness due to slow or absent internet connection.
  • When there are no available shared drives for a disk witness.
  • When the failover cluster is running in a demilitarized zone (DMZ), where the domain controller connection is unavailable.
  • When you have a workgroup or mixed-domain cluster without an Active Directory cluster name object (CNO).

In all of these scenarios, the quorum voting procedure can fail, resulting in a shutdown of the failover cluster. With Windows Server 2019, these potential risks have been addressed, allowing you to use the file share witness under almost any scenario.

Cluster sets

Another newly added functionality of Windows Server 2019 is cluster sets. A cluster set groups multiple Windows Server failover clusters (compute, storage, and hyper-converged) into a single logical set of clusters. Cluster sets can significantly simplify failover cluster management in your infrastructure. For example, you can easily migrate VMs between failover clusters running in a single cluster set. Additionally, this feature can make your environment more failure-resilient, as you can now fail over across clusters with minimum service disruption.

Cluster-Aware Updating for Storage Spaces Direct

The Cluster-Aware Updating feature was first introduced with Windows Server 2012. With Cluster-Aware Updating, you can automatically update clustered servers with minimal loss of availability. With the release of Windows Server 2019, this functionality is integrated with Storage Spaces Direct (S2D), allowing for automated data resynchronization on each node during the updating process. What’s more, Cluster-Aware Updating can detect which updates require a system restart. Restarts are therefore performed only when necessary, significantly reducing business downtime.
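The idea of updating one node at a time and restarting only when an update requires it can be sketched as a small simulation. This is a conceptual model rather than the actual Cluster-Aware Updating implementation: the function, the node names, and the log messages are hypothetical, and storage resynchronization is not modeled.

```python
def rolling_update(pending: dict) -> list:
    """Update nodes one at a time.

    `pending` maps a node name to a list of booleans, one per pending update,
    where True means that update requires a restart.
    """
    log = []
    for node, updates in pending.items():
        if not updates:
            log.append(f"{node}: nothing to install")
            continue
        log.append(f"{node}: drain roles")  # move workloads to other nodes first
        log.append(f"{node}: install {len(updates)} update(s)")
        if any(updates):
            # Restart only if at least one installed update requires it.
            log.append(f"{node}: restart")
        log.append(f"{node}: resume and fail roles back")
    return log


# Hypothetical two-node cluster: only node1 has an update that needs a restart.
for line in rolling_update({"node1": [True, False], "node2": [False]}):
    print(line)
```

Finishing one node completely before starting the next keeps the remaining nodes available to host the drained roles throughout the run.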

Windows Server 2019 failover cluster authentication

Failover clusters are also exposed to various security threats. In previous Windows Server releases, cluster nodes relied on NTLM authentication. With the release of Windows Server 2019, the Microsoft team has once again improved its approach to security. Instead of NTLM authentication, cluster nodes communicate with each other using Kerberos and certificate-based authentication. This way, you can prevent network traffic spoofing and make failover clusters more resilient to security attacks.

Self-healing failover clusters

Windows Server 2019 strengthens the resilience and availability of the cluster network by adding self-healing functionality. A self-healing cluster regularly checks the state of its nodes and promptly repairs (heals) them if any issues are detected. For example, if a node fails and is unable to communicate with the rest of the cluster, the cluster automatically detects the issue, attempts to repair the failed node, and reconnects it to the cluster. This functionality can significantly reduce the management overhead for system administrators while also improving availability.

Cluster hardening

Another security feature available in Windows Server 2019 is cluster hardening. The nodes within the cluster can communicate over Server Message Block (SMB) for Cluster Shared Volumes and Storage Spaces Direct using certificate-based authentication. This allows for higher security levels of intra-cluster communication.

Data Protection with NAKIVO Backup & Replication

The main focus of failover clustering is to ensure the highest level of infrastructure availability. Windows Failover Clustering can be rightfully considered an essential technology for modern datacenters which are expected to provide continuous service delivery. With this functionality, you can avoid unplanned downtime, and maintain the same level of business productivity under almost any circumstances.

However, you still need to build a comprehensive data protection strategy capable of responding to security risks and preventing potential disasters from occurring. NAKIVO Backup & Replication is a reliable and affordable solution which can ensure robust data protection in a number of ways.

  • With the NAKIVO backup solution, you can perform native, image-based, application-aware backup of VMware, Hyper-V, and Nutanix AHV VMs, AWS EC2 instances, and Windows and Linux physical servers.
  • The Backup Copy functionality can add an extra level of protection against unexpected data corruption, system failure, or disaster. You can create copies of existing backups and send them offsite or to public clouds. Additionally, you can build a mirrored copy of your backup repository or streamline the entire backup copying process.
  • Put your data protection activities on autopilot using Policy-Based Data Protection. You can create multiple data protection rules based on the VM name, size, location, configuration, power state, tag, or a combination of these parameters. These policy rules can regularly scan your infrastructure, identify the VMs matching the set rules, and automatically add them to corresponding data protection jobs.
  • Automate and orchestrate the disaster recovery process from start to finish with site recovery workflows. By combining various actions and conditions into an automated algorithm, you can create multiple site recovery jobs to address various disaster scenarios. What’s more, you can test and update your site recovery jobs when needed without disrupting your production environment.
  • NAKIVO Backup & Replication offers multiple recovery options, allowing you to instantly restore VMs, files, and application objects directly from compressed and deduplicated backups. You can also recover VMware VMs to a Hyper-V environment and vice versa using Cross-Platform Recovery. What’s more, NAKIVO Backup & Replication lets you recover physical machines to VMware or Hyper-V VMs, allowing you to recover under almost any circumstances.

1 Year of Free Data Protection: NAKIVO Backup & Replication

Deploy in 2 minutes and protect virtual, cloud, physical and SaaS data. Backup, replication, instant recovery options.
