Tuesday, August 6, 2019

Azure SQL Managed Instance High Availability and Disaster Recovery options




The below blog is to summarize every High Availability and Disaster Recovery options available in Azure SQL Database Managed Instance and a few methods to make use of them.

Before we proceed, I would like to refresh your knowledge about the below two key parameters which are often discussed in every DR Strategy meeting between system administrators and other stake holders.


RPO (Recovery Point Objective) : It decides the maximum amount of data loss the IT system or the business can afford in an event of disaster. For an example, if your log backup is set to run every 30 min, you are practically having an option to recover your data to any point in time, at least 30 min before the time a disaster bring down your system.


RTO (Recovery Time Objective): Time in hours/Days which indicates the maximum downtime, you can have in your system as agreed with your client or business .

The better your Disaster Recovery setups, the lower will be your systems' RPO and RTO.


Below are the Disaster Recovery and High availability options we have in Azure SQL database managed instance.


1. Read Scale-out


By Default each Azure SQL database in managed instance of business Critical service tier is given with three replicas , one of which is provisioned as Read-Only. None of these extra replicas are sharing the compute resources you have allocated to your Managed instance . The read only replica can be used for offloading some of your read only work load from your Primary instance and can be connected by mentioning 'Application intent' as 'Readonly' in an OLeDB connection String. You can also create a linked server which is again built on an Oledb provider by mentioning the same property value.

Other two replicas are maintained for High Availability during events such as service patching .



T-SQL query to fetch the synchronization status and health of your replicas.


Use master
go
SELECT
db.name as [db_name],
rpl.replica_id,
rpl.group_database_id,
rpl.is_local,
rpl.is_primary_replica,
rpl.synchronization_state_desc AS syncro_state,
rpl.synchronization_health_desc AS synchro_health,
rpl.database_state_desc AS db_state,
rpl.log_send_queue_size,
rpl.redo_queue_size
FROM sys.dm_hadr_database_replica_states AS rpl
join sys.databases AS db ON db.database_id = rpl.database_id



Sample Output








T-SQL Script to create a linked server for your read only replica






EXEC sp_addlinkedserver @server=N'ReadOnlyReplica', --Change the name as you need
@srvproduct=N'',
@provider=N'SQLNCLI',
@provstr = N'ApplicationIntent=ReadOnly',
@datasrc= @@SERVERNAME;












Another important point to note is that there will only be Tables and views present here in this replica and not procedures and functions.


2. Failover Groups

It is a Disaster Recovery setup built based on always-on Feature of SQL servers and made available in azure SQL space, in addition to the availability replicas described above .The reason why we might need it, is that the replicas that come free with the instance can not guarantee safeguarding against a region/data center wise outage. Azure Managed instances support single fail-over group consisting of all the databases in the instances and it is created with automatic fail-over plan by default (Can be turned off). To begin with the configuration, one must create a secondary empty instance in another region entirely and work for enabling two directional traffic between the VNets hosting both the primary and secondary instance.


If you are using automatic failover option, make sure you understand and configure the 'GracePeriodWithDataLossHours' Property correctly as it decides how many hours the system will wait for a graceful failover with no data loss. Its minimum allowed value is one hour at present. You can also perform manual fail-over with the failover groups.


Azure SQL DBaaS offerings other than managed instance supports a similar feature called Geo-Replication and the below table can be used to differentiate between the two.






3. Geo-Restore


The automated backups that are taken and retained based on the retention period you define (Between 7-35 days) are made geo-redundant by default. Hence these backups can be used to restore the databases even when they are dropped by a mistake or on purpose. Managed instances does not have long term retention plan as they support Copy-only backup already which could be taken and maintained as per users' choice. Other two DBaaS offerings support Long term retention policies yet do not have copy only backups.


Below is a table depicting the RPO RTOs for the various DR methods discussed till now.








4. Database Replication


Database replication is supported in managed instances , but with certain restrictions. On-premise databases can be made as both source and target for the Transactional replication in managed instances. Managed instances do not support merge replication , peer to peer replication and transactional replication with update-able subscribers. And another limitation for Managed instance to Managed instance replication is,it supports only push subscription and  the distributor and the publisher has to be in the same managed instance.

But I am hopeful that there are improvements coming in future days for the replication in managed instance as its only in public preview for managed instances.


I will be writing a different blog on configuring steps with a demo for a thorough analysis.


































No comments:

Post a Comment