Sakthi's Blogs

Tag: azure

  • SQL Server Always On Availability Groups in Azure VMs – Configuration Guide

    Overview

    SQL Server Always On Availability Groups offer HA for multiple databases using Windows Server Failover Clustering (WSFC). When deployed in Azure VMs, the environment mimics on-premises architecture with additional Azure-specific components.


    🧱 1. Prerequisites

    Requirement | Details
    Azure VMs | At least 2 (recommended: DS-series or higher) with SQL Server installed
    Windows Server | 2016 or later
    SQL Server Edition | Enterprise (Standard supports only Basic Availability Groups: one database, two replicas)
    Domain Controller | Required for WSFC (can be an Azure VM or AD DS)
    Virtual Network | All VMs in the same region and VNet; VNet peering enabled if required
    Static IPs | Assign static private IPs to the cluster nodes
    Load Balancer | Needed for the listener IP configuration in Azure
    Quorum Witness | Optional but recommended (Cloud Witness or File Share Witness)

    🖥 2. Environment Setup

    🔹 Virtual Machines

    • Deploy 2+ Azure VMs with SQL Server (Enterprise)
    • Join all VMs to your Active Directory domain

    🔹 Virtual Network

    • Ensure VMs are in the same VNet and subnet
    • Confirm that the nodes can resolve each other and the domain by name, through internal DNS or a custom DNS server

    ⚙️ 3. Configure Windows Server Failover Cluster (WSFC)

    Step-by-Step:

    1. Install the Failover Clustering feature on each node:
       Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools
    2. Validate Cluster Configuration
      • Open Failover Cluster Manager
      • Validate with both nodes and all required tests
    3. Create the Cluster
      • Use a static IP address (do not register with Azure DNS)
      • E.g., New-Cluster -Name SQLCluster -Node SQL1,SQL2 -StaticAddress 10.0.0.100
    4. Configure Cluster Quorum
      • Use a witness hosted outside the two SQL nodes: a File Share Witness on a 3rd VM, or a Cloud Witness backed by an Azure Storage account
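
    The four steps above could be scripted roughly as follows. This is a sketch rather than a tested runbook: the node names SQL1 and SQL2 and the address 10.0.0.100 come from the example in step 3, the storage account name and key are placeholders, and choosing a Cloud Witness is an assumption (use -FileShareWitness with a UNC path if the witness lives on a VM).

    ```powershell
    # Assumed node names and cluster IP, taken from the example in step 3.
    $nodes     = 'SQL1', 'SQL2'
    $clusterIp = '10.0.0.100'

    # Step 1: install the clustering feature on every node.
    foreach ($node in $nodes) {
        Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools -ComputerName $node
    }

    # Step 2: run cluster validation; review the generated report before continuing.
    Test-Cluster -Node $nodes

    # Step 3: create the cluster with a static IP and without claiming any disks.
    New-Cluster -Name SQLCluster -Node $nodes -StaticAddress $clusterIp -NoStorage

    # Step 4: configure the witness (placeholders, not real credentials).
    Set-ClusterQuorum -Cluster SQLCluster -CloudWitness `
        -AccountName '<storage-account-name>' -AccessKey '<storage-account-key>'
    ```

    Run it from a domain-joined machine with the FailoverClusters module and administrative rights on both nodes.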

    🛠 4. Enable Always On in SQL Server

    On each SQL Server VM:

    1. Open SQL Server Configuration Manager
    2. Enable Always On Availability Groups in the SQL Server instance properties
    3. Restart the SQL Server service
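
    The same three steps can also be done without the GUI. A minimal sketch, assuming the SqlServer PowerShell module is installed and that both nodes run a default instance (SQL1 and SQL2 are the node names from the cluster example):

    ```powershell
    # Requires the SqlServer module: Install-Module SqlServer
    Import-Module SqlServer

    foreach ($instance in 'SQL1', 'SQL2') {
        # Enables the Always On feature and restarts the SQL Server service.
        # -Force suppresses the confirmation prompt for the restart.
        Enable-SqlAlwaysOn -ServerInstance $instance -Force
    }
    ```

    Because the service restarts, run this in a maintenance window if the instances already host databases.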

    📦 5. Create and Configure Availability Group

    1. Open SQL Server Management Studio (SSMS)
    2. Create (or choose) a database and set it to the full recovery model
    3. Take a full backup followed by a log backup
    4. Launch New Availability Group Wizard
      • Add replicas (SQL instances)
      • Enable automatic failover (if synchronous)
      • Add databases
      • Choose backup preferences
      • Skip the listener for now; it is added in section 6 once the load balancer exists
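
    Steps 2 and 3 are the ones most often scripted before running the wizard. A sketch under stated assumptions: AppDb and the backup share \\FILESRV\AGBackup are hypothetical names, SQL1 is assumed to be the future primary, and newer SqlServer module versions may need an extra -TrustServerCertificate switch on each call.

    ```powershell
    Import-Module SqlServer

    $primary  = 'SQL1'                # assumed primary replica
    $database = 'AppDb'               # hypothetical database name
    $share    = '\\FILESRV\AGBackup'  # hypothetical share reachable from both replicas

    # Step 2: availability group databases must use the full recovery model.
    Invoke-Sqlcmd -ServerInstance $primary -Query "ALTER DATABASE [$database] SET RECOVERY FULL;"

    # Step 3: full backup first, then a log backup.
    Backup-SqlDatabase -ServerInstance $primary -Database $database `
        -BackupFile "$share\$database.bak"
    Backup-SqlDatabase -ServerInstance $primary -Database $database `
        -BackupFile "$share\$database.trn" -BackupAction Log
    ```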

    🌐 6. Configure Azure Load Balancer for Listener

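    Azure virtual networks do not support the gratuitous ARP updates that a failover cluster IP relies on to move between nodes, so when all replicas share one subnet a Load Balancer is required to route listener traffic to the current primary.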

    🧩 Create Load Balancer

    • Type: Internal
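    • Frontend IP: The listener IP, a free static address in the same subnet as the SQL nodes
    • Backend Pool: Add SQL VMs
    • Health Probe: TCP on port 59999 (any unused port works; it must match the probe port set on the cluster's listener IP resource)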
    • Load Balancing Rule:
      • Port: 1433 (SQL)
      • Backend port: 1433
      • Floating IP: Enabled
      • Session persistence: None
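
    The settings listed above map onto the Az.Network cmdlets roughly like this. It is a sketch, not the author's script: the resource group, region, VNet, subnet, and resource names are all hypothetical, and 10.0.0.200 is the listener address used in the SQL example of this section.

    ```powershell
    # Requires the Az.Network module and a signed-in session (Connect-AzAccount).
    $rg       = 'rg-sql-ha'   # hypothetical resource group
    $location = 'eastus'      # hypothetical region

    # Look up the subnet that hosts the SQL nodes (hypothetical names).
    $vnet   = Get-AzVirtualNetwork -ResourceGroupName $rg -Name 'vnet-sql'
    $subnet = Get-AzVirtualNetworkSubnetConfig -VirtualNetwork $vnet -Name 'snet-sql'

    # Internal frontend carrying the listener IP.
    $frontend = New-AzLoadBalancerFrontendIpConfig -Name 'fe-aglistener' `
        -PrivateIpAddress '10.0.0.200' -SubnetId $subnet.Id

    $pool  = New-AzLoadBalancerBackendAddressPoolConfig -Name 'be-sqlnodes'
    $probe = New-AzLoadBalancerProbeConfig -Name 'probe-59999' -Protocol Tcp `
        -Port 59999 -IntervalInSeconds 5 -ProbeCount 2

    # Port 1433 in and out, floating IP on, no session persistence (Default).
    $rule = New-AzLoadBalancerRuleConfig -Name 'rule-sql-1433' `
        -FrontendIpConfiguration $frontend -BackendAddressPool $pool -Probe $probe `
        -Protocol Tcp -FrontendPort 1433 -BackendPort 1433 `
        -EnableFloatingIP -LoadDistribution Default

    New-AzLoadBalancer -ResourceGroupName $rg -Name 'ilb-sql-ag' -Location $location `
        -Sku Standard -FrontendIpConfiguration $frontend -BackendAddressPool $pool `
        -Probe $probe -LoadBalancingRule $rule
    ```

    The sketch creates an empty backend pool; the SQL VMs' network interfaces still have to be added to it, in the portal or with further cmdlets.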

    🔧 Add Listener in SQL

    • Run in SQL on the primary replica:
      ALTER AVAILABILITY GROUP [YourAG] ADD LISTENER 'AGListener' (WITH IP ((N'10.0.0.200', N'255.255.255.0')), PORT=1433);
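
    For the health probe on port 59999 to get an answer, the listener's cluster IP resource usually needs its probe port set as well. A sketch of that step, assuming the names shown: the cluster network name and the IP resource name are guesses and should be looked up with Get-ClusterNetwork and Get-ClusterResource.

    ```powershell
    Import-Module FailoverClusters

    $clusterNetwork = 'Cluster Network 1'      # assumed; check with Get-ClusterNetwork
    $ipResource     = 'AGListener_10.0.0.200'  # assumed; check with Get-ClusterResource
    $listenerIp     = '10.0.0.200'             # load balancer frontend / listener IP
    [int]$probePort = 59999                    # must match the load balancer health probe

    # Bind the listener IP resource to the probe port so that only the node
    # currently owning the resource answers the load balancer's probe.
    Get-ClusterResource $ipResource | Set-ClusterParameter -Multiple @{
        Address    = $listenerIp
        ProbePort  = $probePort
        SubnetMask = '255.255.255.255'
        Network    = $clusterNetwork
        EnableDhcp = 0
    }
    ```

    The new parameters take effect only after the IP resource is taken offline and brought online again.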

    🔄 7. Test Failover

    • Open Failover Cluster Manager or use SSMS
    • Perform a manual failover to test behavior
    • Verify listener redirection and database accessibility
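
    A scripted version of this test might look like the following. It assumes SQL2 is a synchronized secondary, uses the availability group and listener names from the listener example, and relies on the SqlServer module.

    ```powershell
    Import-Module SqlServer

    # A manual failover is issued on the secondary that should become the new primary.
    Invoke-Sqlcmd -ServerInstance 'SQL2' -Query 'ALTER AVAILABILITY GROUP [YourAG] FAILOVER;'

    # Connect through the listener and check which node is answering now.
    Invoke-Sqlcmd -ServerInstance 'AGListener,1433' -Query 'SELECT @@SERVERNAME AS CurrentPrimary;'
    ```

    If the second query times out while a direct connection to SQL2 works, the load balancer probe or the cluster probe port is the first place to look.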

    🧪 8. Monitoring and Maintenance

    • Use SQL Agent Alerts, Azure Monitor, and Log Analytics
    • Enable email notifications for failover events
    • Regularly check:
      • Health of replicas
      • Cluster events
      • Quorum state
      • Azure LB health probes
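
    The first three checks can be collected from a single script. A sketch, assuming the listener name AGListener from section 6 and a session on one of the cluster nodes:

    ```powershell
    Import-Module SqlServer, FailoverClusters

    # Replica health: role, synchronization health, and connection state per replica.
    $query = @'
    SELECT ar.replica_server_name,
           rs.role_desc,
           rs.synchronization_health_desc,
           rs.connected_state_desc
    FROM sys.dm_hadr_availability_replica_states AS rs
    JOIN sys.availability_replicas AS ar
      ON ar.replica_id = rs.replica_id;
    '@
    Invoke-Sqlcmd -ServerInstance 'AGListener' -Query $query

    # Cluster node state and current quorum configuration.
    Get-ClusterNode | Select-Object Name, State
    Get-ClusterQuorum
    ```

    Complete replica state is reported by the primary, which is why the query goes through the listener.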

    ✅ Best Practices

    • Use Accelerated Networking on all SQL VMs
    • Use Premium SSD for data/log disks
    • Configure automatic backups to Azure Blob
    • Use NSG rules to control access to SQL ports
    • Document your failover/failback procedure

    📎 Optional Enhancements

    Feature | Purpose
    Cloud Witness (Azure Storage) | Avoid a 3rd VM just for quorum
    Azure Bastion | Secure RDP access without public IPs
    Azure Recovery Services Vault | Protect databases with point-in-time restore
    SQL Managed Instance | Consider for PaaS-like HA features
  • Azure Site Recovery (ASR) and Availability Zones – Configuration & Operations Guide

    Part 1: Azure Site Recovery (ASR)

    📘 What is Azure Site Recovery?

    Azure Site Recovery (ASR) is Microsoft's disaster recovery-as-a-service (DRaaS) solution. It helps you replicate, fail over, and recover workloads (on-premises VMs, physical servers, and Azure VMs) to a secondary location during outages.


    🛠️ ASR Key Components

    Component | Purpose
    Source Environment | Where the protected workloads reside
    Recovery Services Vault | Central hub for managing backup and replication
    Replication Policy | Defines RPO, recovery points, and retention
    Process Server | Handles replication traffic for on-premises VMware VMs and physical servers
    Configuration Server | Coordinates replication (on-premises to Azure)

    ✅ ASR Supported Scenarios

    • On-premises → Azure
    • Azure region → Azure region
    • VMware/Hyper-V → Azure
    • Physical servers → Azure

    📋 ASR Configuration Steps

    🔹 Scenario: On-premises to Azure (VMware/Physical)

    1. Create a Recovery Services Vault
      • Azure Portal → Search "Recovery Services Vault" → Create
    2. Set up Site Recovery
      • In the vault → Site Recovery → Choose source (On-prem) and target (Azure)
    3. Download & Install Configuration Server
      • Install on a dedicated Windows server (must be domain-joined)
    4. Register Configuration Server
      • Use vault credentials to register it to Azure
    5. Install Mobility Agent
      • Install on each source machine to replicate
    6. Create Replication Policy
      • Define RPO (Recovery Point Objective), app-consistent snapshots, and retention
    7. Enable Replication
      • Map source to target resource group, subnet, and VM size
    8. Test Failover
      • Perform a test failover to validate replication (no production impact)
    9. Planned / Unplanned Failover
      • Switch to Azure in case of disaster, choose direction of failback
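
    Step 1 is the part of this list that is easiest to script. A minimal sketch, assuming the Az.RecoveryServices module and hypothetical resource group, region, and vault names; the remaining steps (Configuration Server, Mobility Agent, policies) are not covered here.

    ```powershell
    # Requires the Az.Resources and Az.RecoveryServices modules and Connect-AzAccount.
    $rg       = 'rg-dr'    # hypothetical resource group
    $location = 'eastus'   # hypothetical target region

    New-AzResourceGroup -Name $rg -Location $location

    # Step 1: create the Recovery Services vault.
    $vault = New-AzRecoveryServicesVault -Name 'rsv-dr' -ResourceGroupName $rg -Location $location

    # Point later Site Recovery cmdlets in this session at the new vault.
    Set-AzRecoveryServicesAsrVaultContext -Vault $vault
    ```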

    πŸ” Azure-to-Azure Replication

    1. Select the source Azure VM
    2. Choose target region
    3. Configure network mapping, disks, and VM sizes
    4. Enable replication, monitor health and perform test failovers

    📊 ASR Monitoring and Operations

    • Recovery Services Vault Dashboard → See replication health, events, jobs
    • Azure Monitor + Log Analytics → Alerts and automation
    • Cost optimization → Use Reserved Instances for secondary region

    🔹 Part 2: Azure Availability Zones

    📘 What are Availability Zones?

    Availability Zones are physically separate locations within an Azure region, each made up of one or more datacenters. Each zone has independent power, cooling, and networking to ensure high availability.


    🛡️ Benefits of Using Availability Zones

    • Protect against datacenter-level failures
    • Provide a 99.99% uptime SLA for VMs deployed across two or more zones
    • Ensure resiliency and fault isolation

    πŸ—οΈ Availability Zones Architecture

    • Zone 1, Zone 2, Zone 3 β†’ Each with isolated infra
    • Services like VMs, managed disks, load balancers, databases can be spread across zones

    🔧 Configuring Availability Zones

    For Azure VMs:

    1. During VM creation → Choose a region that supports Availability Zones
    2. Select a specific zone (1, 2, or 3) or use zone balancing
    3. Use Availability Sets (fault and update domains) only where zones are not an option; a VM cannot belong to both an availability set and an availability zone
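
    Pinning each VM to a zone comes down to one parameter at creation time. A sketch, assuming hypothetical resource group, region, size, and image values; the plain Windows Server image used here does not include SQL Server.

    ```powershell
    # Requires the Az.Compute module and Connect-AzAccount.
    $rg       = 'rg-sql-ha'   # hypothetical resource group
    $location = 'eastus'      # hypothetical region; it must support Availability Zones
    $cred     = Get-Credential -Message 'Local administrator for the new VMs'

    # One VM per zone, so a single zone outage leaves the other node running.
    $placement = [ordered]@{ SQL1 = '1'; SQL2 = '2' }

    foreach ($name in $placement.Keys) {
        New-AzVM -ResourceGroupName $rg -Location $location -Name $name `
            -Image 'Win2019Datacenter' -Size 'Standard_D4s_v3' `
            -Zone $placement[$name] -Credential $cred
    }
    ```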

    For Load Balancing:

    • Use Standard Load Balancer to direct traffic across multiple zones
    • Zone-aware frontend and backend pool

    For Data:

    • Use Zone-Redundant Storage (ZRS) for blob storage to replicate data across zones
    • Use Azure SQL zone-redundant deployments for HA
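
    For the blob storage case, zone redundancy is chosen through the storage account SKU. A short sketch with hypothetical names (storage account names must be globally unique):

    ```powershell
    # Requires the Az.Storage module and Connect-AzAccount.
    New-AzStorageAccount -ResourceGroupName 'rg-sql-ha' -Name 'stsqlhazrs001' `
        -Location 'eastus' -SkuName Standard_ZRS -Kind StorageV2
    ```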

    🔄 ASR + Availability Zones – Combined Resilience

    • ASR replicates across regions (Geo-resilience)
    • AZs provide intra-region redundancy
    • For full DR: Deploy VMs across zones + enable ASR to replicate to another region

    🧠 Best Practices

    • Use Proximity Placement Groups (PPG) for low latency when needed
    • Schedule test failovers quarterly
    • Monitor RTO/RPO compliance
    • Tag all resources for DR drill tracking