An auto-failover group is an Azure SQL database feature that replicates one, or a group of databases to the secondary Azure SQL server in a different region. This feature is used to failover all the databases in the failover group in case of disaster and the failover is automatic.
High availability is a crucial component for data professionals, even when operating in a cloud environment such as Azure. The architecture of the Microsoft Azure Platform as a Service (PaaS) is built in a way that offers high availability for service right out of the gate. This helps to ensure that your databases, such as Azure SQL Database and Azure SQL Managed Instances, are always available for use without having to lift a finger. Even more, Microsoft now offers the ability to manually control a failover over for these resources, which gives data professionals more granular control.
Why You Should Choose an Auto-Failover Group
The Azure SQL Database by default offers a 99.99% availability SLA across all its service tiers, meaning any database’s downtime will not exceed 52 minutes per year. Using “zone redundancy” increases availability to 99.995%, which is about 26 minutes per year.
These impressive numbers are achievable through in-region redundancy of the compute and storage resources and automatic failover within the region. Some disruptive events may impact the region’s availability like a datacentre outage, possibly caused by a natural disaster, network failure, hardware or software malfunction, and others. To protect against a region disaster, auto-failover groups can be configured to automatically failover one or multiple databases to another region.
The Difference Between Geo-Replication and Auto-Failover Group
With geo-replication, the former service provided by Azure, you can designate up to four readable secondary databases (each residing on a separate logical server in any region supported by your Azure subscription, including the region hosting the primary database), which are automatically configured as replication partners. Once the replication is established, you can initiate a planned or unplanned failover between the current primary and one of the secondaries.
While geo-replication is very straightforward to configure (it can be set up directly from the Azure portal by simply designating the desired location of secondary replicas), there are a few drawbacks that limit its usefulness, such as:
- The lack of support for simultaneous replication and failover of multiple databases
- The need for manual failover
These limitations have been addressed by auto-failover groups. One of the primary strengths of this technology is the support for automatic failover of multiple databases based on custom groupings that you define. All databases in the same group must be part of the same logical server with the source and target servers residing in two different Azure regions.
Another distinction between geo-replication and auto-failover groups is the auto-failover groups support “read-write” and “read-only” listener endpoints that remain the same following a failover. Meaning, client applications will continue to function without any additional configuration regardless of the location of the primary or secondary replica.
Auto-failover group uses the same underlying technology as geo-replication. The following are some of the differences between auto-failover groups and active geo-replication.
How a Failover Group Works
Failover groups, just like geo-replication, use the SQL Server’s “Always-on” feature with asynchronous data movement to provide business continuity. Failover groups are built on top of Geo-replication and provide automatic failovers for your applications. This is achievable by having listeners created for the failover groups.
The Auto-Failover provides two listeners created by Azure. The first one for “read-write” OLTP type workload, and the second one for “read-only” connection. The Domain Name System (DNS) records update automatically to redirect to the correct server after a role change.
The auto-failover group must be configured on the primary server, which will then connect it to the secondary server in a different Azure region. The groups can include all or some databases in these servers. The following diagram illustrates a typical configuration of a geo-redundant cloud application using multiple databases and auto-failover group:
As you can see in the image above, two listeners were created for each failover group. The “read-only” listener can be used for applications or connections, which only want to read the data. In this case, connections will be directed to the secondary logical SQL server. The other listener is the “read-write” listener, which can be used to direct the connections to the primary logical SQL server where “read” and “writes” can both happen.
Another option besides an auto-failover is the option to create groups of databases. Similar to “availability groups” on your “on-premise Always-On,” the failover-groups in the Azure SQL database behave the same way, meaning all the databases inside the failover group will failover together to the other secondary node.
Configuring Auto-Failover Group on the Azure SQL server:
- Log in to the Azure portal and go to the SQL Server page. Next, search for “SQL Server” in the search box. Please refer to the below image:
- Click on the server where you will configure the failover groups. On the server details page, click on “failover groups.”
- Click on the “add group.” Enter the failover group name that you’ll use to connect the current primary server. Select the secondary server if you already have one in a different region. If you don’t, you’ll need to create a new SQL server in a different region. The secondary server in the same region isn’t supported in failover groups in Azure SQL Server.
- Select the “Read/Write” failover policy. Select “automatic” if you need automatic failover when the primary goes down. Select the “read/write” grace period.
- Add the databases to the group that you want to failover to the secondary and click on “create.”
Note: As the data replicates asynchronously, the immediate failover may result in data loss. We must set the grace period accordingly to reduce data loss. The grace period is the time the SQL service waits before triggering automatic failover when an outage occurs.
Once the failover group is created and the replication of the databases is complete, two endpoints are then created. The provided failover group name, which is the “read-write” listener endpoint, will always point to the current primary server in the Azure SQL Server failover group. The other one is a “read-only” listener endpoint, which always points to the secondary server in the failover group.
To know the endpoints, navigate to failover groups and click on the name of the failover group. Under the map, you can see the primary, secondary server details, the “read-write,” and the “read-only” listener endpoints.
When a planned failover is imitated, the secondary databases are fully synchronized with Azure SQL primary databases before switching the roles. In this case, there is no data loss.
Once the failover is successful, the existing primary becomes secondary, and the existing secondary becomes primary. The pointing of “read-write” and “read-only” listener endpoints are changed automatically. “Forced Failover” is used to do a manual forced failover, which immediately switches the roles without synchronizing with primary. This may result in data loss.
Azure Knowledge at Your Fingertips
We explored how to create auto-failover groups in Azure SQL Server and the differences between auto-failover groups and geo-replication. Auto-failover groups provide you the option for automatic failover for your Azure SQL databases group without the need of changing the application connection string. For more information on auto-failover groups or Azure, contact our commerce experts today.