Hyper-V Replication and Failover Types: In-Depth Overview

Hyper-V Replication is a feature included with Microsoft Hyper-V at no additional cost. It lets users implement a business continuity (BC) and disaster recovery (DR) plan built on replication to a remote host. Introduced with Windows Server 2012, Hyper-V Replication is popular with both SMBs and large enterprises.

Hyper-V Replica is sometimes misunderstood, both in how best to use it and in the purposes it serves. It can also cause confusion when you weigh it against other availability features such as checkpoints and clustering.

NAKIVO for Hyper-V Replication

Business continuity in any failure scenario with robust replication for Hyper-V VMs. Verify replicas in seconds and automate VM failover to minimize downtime.

What Is Hyper-V Replica?

A Hyper-V replica is an identical standby copy of a Hyper-V virtual machine (VM) that is saved in a powered-off state on another Hyper-V host. The host that keeps the replicas is called the secondary host (sometimes target host or replica host). This secondary host with the standby VMs takes over when a disaster strikes the production site or when the primary host is down.

[Figure: Hyper-V replication and Hyper-V failover]

Hyper-V replication is a disaster recovery feature available as part of Microsoft Hyper-V. The main role of Hyper-V replication is creating replicas of primary virtual machines to be stored on remote hosts for VM recovery when needed.

The primary host and the hosts with the replica VMs can reside on the same site or on different sites. An organization can set up and maintain its own replica site. Smaller organizations with limited budgets can instead subscribe to disaster recovery as a service (DRaaS) from a managed service provider (MSP). In that case, disaster recovery based on Hyper-V replication remains an affordable option thanks to its modest requirements and ease of configuration.

How Does Hyper-V Replication Work?

Hyper-V replication is asynchronous data replication based on the intervals set by the administrator, and, as a result, zero data loss cannot be guaranteed. These intervals are set based on the recovery point objectives (RPO) for the VMs and the available options with this replication feature.

Note: To learn more about recovery metrics, download our white papers about RPO and RTO. You can also read the blog post about the difference between RPO and RTO.

Recovery points

By default, Hyper-V replication keeps only one recovery point for a VM replica and updates the data of this recovery point at the configured replication interval. You can keep additional recovery points for a Hyper-V replica if needed. This does not shorten the replication interval, but it lets you recover the VM to an earlier point in time. Additional recovery points are created hourly, and you can keep up to 24 of them. The available Hyper-V replication intervals are 30 seconds, 5 minutes, and 15 minutes (Windows Server 2012 supports only the 5-minute interval; the other options were added in Windows Server 2012 R2).
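The relationship between the replication interval and the number of recovery points can be illustrated with a small Python sketch. It is not part of any Hyper-V API: the `ReplicaPolicy` class and its field names are hypothetical, and the model assumes that extra recovery points are spaced one hour apart and that the limit of 24 counts the points kept in addition to the latest one.

```python
from dataclasses import dataclass

MAX_EXTRA_RECOVERY_POINTS = 24      # limit mentioned in the text
RECOVERY_POINT_SPACING_SEC = 3600   # assumption: extra points are one hour apart


@dataclass(frozen=True)
class ReplicaPolicy:
    """Hypothetical container for two replication settings of one VM."""
    replication_interval_sec: int
    extra_recovery_points: int = 0  # 0 means only the latest recovery point

    def __post_init__(self) -> None:
        if self.replication_interval_sec <= 0:
            raise ValueError("replication interval must be positive")
        if not 0 <= self.extra_recovery_points <= MAX_EXTRA_RECOVERY_POINTS:
            raise ValueError("extra recovery points must be between 0 and 24")

    def worst_case_data_loss_sec(self) -> int:
        # Approximation: changes made since the last cycle are lost,
        # so the loss is bounded by one replication interval.
        return self.replication_interval_sec

    def history_window_sec(self) -> int:
        # How far back in time the retained recovery points reach.
        return self.extra_recovery_points * RECOVERY_POINT_SPACING_SEC


policy = ReplicaPolicy(replication_interval_sec=300, extra_recovery_points=24)
print(policy.worst_case_data_loss_sec())     # 300
print(policy.history_window_sec() // 3600)   # 24
```

Keeping more recovery points widens the history window but leaves the worst-case data loss unchanged, because that value depends only on the replication interval.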

Network

Replication data is transferred over the network from the host running the source VM to the host on which the VM replica is stored. For this reason, you need enough network bandwidth to transfer the changed data within each replication interval, which can be a challenge when using an internet connection between two geographically distributed sites. To avoid conflicts and split-brain situations, you should not run a source Hyper-V VM and its VM replica simultaneously.
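A back-of-the-envelope estimate of the bandwidth requirement can be sketched in Python. The function name and the `utilization` parameter are assumptions made for illustration; the calculation only expresses that the data changed during one interval has to be transferred before the next interval ends.

```python
def required_bandwidth_mbps(changed_mb_per_interval: float,
                            interval_sec: float,
                            utilization: float = 0.7) -> float:
    """Estimate the link speed (megabits per second) needed so that each
    incremental transfer finishes within one replication interval.

    utilization is the assumed share of the link usable for replication.
    """
    if changed_mb_per_interval < 0:
        raise ValueError("changed data cannot be negative")
    if interval_sec <= 0:
        raise ValueError("interval must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    megabits = changed_mb_per_interval * 8  # megabytes -> megabits
    return megabits / (interval_sec * utilization)


# 300 MB of changes every 5 minutes, with half of the link available
print(round(required_bandwidth_mbps(300, 300, 0.5), 1))  # 16.0
```

The estimate ignores compression and protocol overhead, so treat the result as a starting point for sizing rather than a guarantee.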

VM replicas are usually connected to other networks and have IP addresses different from those used by the original VM.

Hyper-V replication process

  • You can configure Hyper-V replication in Hyper-V Manager or System Center Virtual Machine Manager (SCVMM).
  • When you enable Hyper-V replication for a VM, a VM replica is created on the secondary host, and a Hyper-V Replica Log (.HRL) file to track changes is created.
  • When you replicate a VM for the first time, all VM data is copied from the source host to the target host.
  • During each subsequent replication cycle, only the changed VHDX (or VHD) virtual disk data (increments) is copied, which reduces replication time and the amount of transferred data.
  • For each replication cycle after the initial one, a Hyper-V checkpoint (.AVHDX) is created when replication starts.
  • When a new recovery point is created and the oldest recovery point for the VM replica has expired, the oldest one is merged into the main virtual disk.
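The process described in the list above can be sketched as a toy simulation in Python. This is not how Hyper-V is implemented: the `ReplicaSimulator` class is hypothetical, virtual disks are modeled as dictionaries of blocks, a set of changed block numbers stands in for the .HRL log, and recovery points are modeled as a queue of deltas.

```python
from collections import deque


class ReplicaSimulator:
    """Toy model of initial plus incremental replication with retention."""

    def __init__(self, max_recovery_points: int = 1) -> None:
        self.primary = {}                # block number -> data on the source VM
        self.change_log = set()          # stands in for the .HRL change log
        self.replica_base = {}           # stands in for the replica's main disk
        self.recovery_points = deque()   # deltas, oldest first
        self.max_recovery_points = max_recovery_points
        self.initialized = False

    def write(self, block: int, data: str) -> None:
        """Write on the primary VM and track the changed block."""
        self.primary[block] = data
        self.change_log.add(block)

    def replicate(self) -> int:
        """Run one replication cycle and return the number of blocks sent."""
        if not self.initialized:
            # Initial replication: all data is copied.
            self.replica_base = dict(self.primary)
            sent = len(self.primary)
            self.initialized = True
        else:
            # Later cycles: only changed blocks are copied.
            delta = {b: self.primary[b] for b in self.change_log}
            self.recovery_points.append(delta)
            sent = len(delta)
            # Expired recovery points are merged into the main disk.
            while len(self.recovery_points) > self.max_recovery_points:
                self.replica_base.update(self.recovery_points.popleft())
        self.change_log.clear()
        return sent

    def replica_view(self) -> dict:
        """Latest replica state: main disk with all deltas applied."""
        view = dict(self.replica_base)
        for delta in self.recovery_points:
            view.update(delta)
        return view


sim = ReplicaSimulator(max_recovery_points=2)
for block, data in enumerate(["a", "b", "c"]):
    sim.write(block, data)
print(sim.replicate())   # 3 (initial full copy)
sim.write(1, "B")
print(sim.replicate())   # 1 (only the changed block)
```

The model does not cover deleted blocks or failed transfers; it only shows why incremental cycles move far less data than the initial copy.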

Recovery from a VM replica is performed manually when using native Hyper-V functionality.

When Is Hyper-V Replication Used?

VM replication is used to prepare for situations in which you need to recover a VM in a very short time, shorter than the time it would take to recover the VM from a backup. With a backup alone, you can perform VM recovery and restore operations (operational recovery), but not VM failover for disaster recovery.

Unlike Hyper-V clustering, where one running VM is located on shared storage and accessed by two Hyper-V hosts, Hyper-V replication uses two VM instances (a primary running VM and a VM replica that is in a powered-off state during normal operation), each located on its host's own storage (local storage, SAN, or NAS).

Note: Download the ebook about Hyper-V clustering to learn how clustering works.

When not to use Hyper-V replication

You may not need to use a Hyper-V replica for VM failover if you run the following services on VMs:

  • Active Directory Domain Controller. Choose native Active Directory replication options rather than using Hyper-V replication.
  • MS SQL Server. You can use Hyper-V replicas for SQL Server protection. However, there is a native alternative solution to replicate SQL databases. Read the blog post about MS SQL Server Replication to learn more about the native replication feature. Selecting the right SQL replication method depends on your tasks and requirements.
  • Microsoft Exchange. You may have problems if you use Hyper-V replication for VMs running Exchange. Choose the native Exchange replication technology.

Hyper-V replication flexibility

Hyper-V replication is flexible in terms of the multiple deployment variations it supports. It can be deployed between:

  • Two standalone hosts
  • A standalone host and a Hyper-V failover cluster
  • Two Hyper-V failover clusters

Hyper-V replication is also flexible in terms of hardware requirements. The primary and secondary hosts do not require matching hardware components. Besides, extended replication is supported. This means that a secondary host can be the source of another replication to a third host, thus forming a daisy chain.

Hyper-V replication provides flexible granular protection. You can select specific VMs to be replicated and even select individual VHDX virtual disks of a VM.

What Is Hyper-V Replica Failover?

Hyper-V replica failover is the operation of switching from the original VM on a source Hyper-V host to the VM replica on a remote host (the replica or target Hyper-V host) to restore VM workloads and data. The failover operation helps you keep systems operationally available with minimal downtime. You can start VM failover manually in Hyper-V Manager or in SCVMM.

Hyper-V Replica Failover Types

There are three Hyper-V failover types that you can use depending on the scenario for initiating this operation:

  • Test failovers
  • Planned failovers
  • Unplanned failovers

Each failover type is intended to meet specific needs.

Type 1: Test failovers

A test failover is used to validate replica VMs and test a disaster recovery plan, and it should be carried out regularly. A test failover does not interrupt the running primary VM, production workloads, or ongoing replication to the replica VM. A test VM is created so that it can be examined in an isolated environment, including an isolated network. Once the IT admin stops the test failover for a replica VM, the test VM is cleaned up.

Test failovers use the internal VM Export/Import feature of Hyper-V to create a new VM copy and then rename this VM. The test Hyper-V failover includes the following operations:

  1. A VM replica including the VHDX, XML and other files is exported to a temporary location.
  2. The XML file of the exported VM is modified to use a unique GUID.
  3. The host registers the newly created VM with Hyper-V through the Virtual Machine Management Service (VMMS.exe) process.
  4. The VM is renamed.
  5. The VM is imported to the same Hyper-V host.

The test VM remains in the powered-off state after the test failover, and you need to start the test VM manually.
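The outcome of the test failover steps above can be modeled with a short Python sketch. It does not call Hyper-V: the `VmRecord` class and `create_test_vm` function are hypothetical, and the " - Test" name suffix and the isolated network name are assumptions chosen for illustration.

```python
import copy
import uuid
from dataclasses import dataclass


@dataclass
class VmRecord:
    """Hypothetical, simplified description of a VM."""
    name: str
    vm_id: str
    disks: list
    network: str
    state: str = "Off"


def create_test_vm(replica: VmRecord,
                   isolated_network: str = "isolated-test") -> VmRecord:
    """Build a test VM from a replica without modifying the replica."""
    test_vm = copy.deepcopy(replica)           # stands in for export/import
    test_vm.vm_id = str(uuid.uuid4())          # unique GUID for the copy
    test_vm.name = f"{replica.name} - Test"    # renamed copy
    test_vm.network = isolated_network         # keep it away from production
    test_vm.state = "Off"                      # must be started manually
    return test_vm


replica = VmRecord("web01", str(uuid.uuid4()), ["web01.vhdx"], "production")
test_vm = create_test_vm(replica)
print(test_vm.name)    # web01 - Test
print(test_vm.state)   # Off
```

Because the test VM is a separate copy with its own identifier, it can be examined and then discarded while the replica keeps receiving replication data.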

Type 2: Planned failovers

Planned failovers are used to keep services available ahead of an anticipated disruption, such as a hurricane or planned power outage, or to smoothly fail over from primary VMs to replicas during maintenance or datacenter migrations. Compliance requirements can be another reason to run a planned failover.

During a planned failover, the primary VM is shut down, and the replica VM is started on the secondary host. Traffic is directed to the secondary host, and VM workloads move to that host. There is no data loss when you use planned failover: the RPO is zero, and the RTO is limited to the time needed to replicate the remaining data and boot the replica VM.

A planned Hyper-V failover consists of the following actions:

  1. A system administrator or user initiates failover.
  2. The VMMS.exe (Virtual Machine Management Service) Hyper-V process is notified about this action.
  3. VMMS.exe requests Hyper-V VSS Writer to create a snapshot of the primary VM.
  4. The VSS Writer creates a standard Hyper-V replica VM.
  5. The Hyper-V Replica server is notified about this event.
  6. The standard replica VM is copied to the Hyper-V replica server via the network.
  7. The replica server registers the received VM replica and starts this VM replica.
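The ordering that makes a planned failover lossless can be sketched in Python. The `Vm` class and `planned_failover` function are hypothetical and greatly simplified: VM data is a dictionary, and the changes not yet replicated are passed in explicitly.

```python
from dataclasses import dataclass, field


@dataclass
class Vm:
    """Hypothetical, simplified VM with a running flag and its data."""
    name: str
    running: bool
    data: dict = field(default_factory=dict)


def planned_failover(primary: Vm, replica: Vm, unreplicated: dict) -> list:
    """Shut down the primary, send the last changes, then start the replica."""
    steps = []
    primary.running = False                 # no new writes from here on
    steps.append("primary shut down")
    replica.data.update(unreplicated)       # final replication cycle
    unreplicated.clear()
    steps.append("remaining changes replicated")
    if replica.data != primary.data:        # zero RPO must hold before start
        raise RuntimeError("replica is not in sync with the primary")
    replica.running = True
    steps.append("replica started")
    return steps


primary = Vm("db01", True, {"orders": 120, "users": 45})
replica = Vm("db01-replica", False, {"orders": 118, "users": 45})
print(planned_failover(primary, replica, {"orders": 120}))
# ['primary shut down', 'remaining changes replicated', 'replica started']
```

Shutting down the primary VM before the final replication cycle is what guarantees that no further changes can be missed.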

Type 3: Unplanned failovers

An unplanned failover is started on the secondary server or site when an unexpected disaster brings down VMs on your primary server or site (power loss, hardware failure, ransomware attack, etc.). This Hyper-V replica failover type is also used to fail over a single failed VM to a secondary host. As in the case of a planned failover, the RTO is the time it takes to boot the VMs. However, when it comes to the RPO, the data since the last replication is lost. The maximum RPO is the configured replication interval, ranging from 30 seconds to 15 minutes.
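The data loss after an unplanned failover is simply the time between the last successful replication and the failure. A minimal Python sketch, with hypothetical function names and example timestamps, makes the arithmetic explicit:

```python
from datetime import datetime, timedelta


def unplanned_data_loss(last_replication: datetime,
                        failure_time: datetime) -> timedelta:
    """Window of changes lost: everything written after the last replication."""
    if failure_time < last_replication:
        raise ValueError("failure cannot precede the last replication")
    return failure_time - last_replication


def within_rpo(loss: timedelta, rpo: timedelta) -> bool:
    """Check the actual loss against the recovery point objective."""
    return loss <= rpo


last_ok = datetime(2024, 1, 1, 12, 0, 0)
failure = datetime(2024, 1, 1, 12, 3, 20)
loss = unplanned_data_loss(last_ok, failure)
print(loss.total_seconds())                     # 200.0
print(within_rpo(loss, timedelta(minutes=5)))   # True
print(within_rpo(loss, timedelta(seconds=30)))  # False
```

If replication cycles are being skipped or delayed, the actual loss can exceed the configured interval, which is why checking the time of the last successful replication matters.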

After failing over to a Hyper-V replica, you have the option of running a failback operation once the primary server is back online. Failback starts reverse replication to copy the latest data from the replica server to the original server and then moves workloads back to the original server.

Alternative Hyper-V Replication Solutions

Third-party data protection solutions are an alternative to the native Microsoft Hyper-V replication and disaster recovery features and can offer comprehensive backup and DR for Hyper-V and other infrastructures.

NAKIVO Backup & Replication is a universal data protection solution that supports Hyper-V, VMware vSphere, and Nutanix AHV virtual environments, as well as Amazon EC2 and Linux/Windows physical machines. You can use the NAKIVO solution to manage replication of Hyper-V VMs, automatic failover, and disaster recovery orchestration using Site Recovery. Advanced features allow you to improve replication speed, reduce replication time and automate data protection operations.

1 Year of Free Data Protection: NAKIVO Backup & Replication

Deploy in 2 minutes and protect virtual, cloud, physical and SaaS data. Backup, replication, instant recovery options.