SQL Always On Issues and Resolutions: SQL Server

The document outlines common issues and resolutions related to SQL Server AlwaysOn Availability Groups, a high-availability and disaster recovery solution. Key issues include failures to come online, synchronization lag, prolonged failover times, listener failures, node drops from the cluster, data loss after forced failovers, and replicas in a 'NOT SYNCHRONIZED' state. Each issue is accompanied by potential causes and recommended resolutions to ensure optimal performance and reliability of the Availability Groups.

https://www.sqldbachamps.com Praveen Madupu +91 98661 30093
Sr SQL Server DBA, Dubai
praveensqldba12@gmail.com

SQL Server AlwaysOn Availability Groups is a high-availability and disaster recovery solution introduced
in SQL Server 2012. Despite its robustness, it comes with its own set of common issues that can occur during
deployment, operation, or failover processes.

Below are the most frequent issues encountered in AlwaysOn Availability Groups and their resolutions.

1. Availability Group Fails to Come Online

Issue: The Availability Group or one of its replicas fails to come online after a failover or restart.

Causes:

● Insufficient permissions for the SQL Server service accounts.

● Cluster service or the Windows Server Failover Clustering (WSFC) is not running.
● Incorrect network configuration.
● Replicas are in a non-synchronized state, preventing failover.

Resolution:

● Check SQL Server Service Account Permissions: Ensure that the SQL Server service accounts have
the necessary permissions on the failover cluster nodes and on the data and log file locations. Each
replica's service account also needs CONNECT permission on the database mirroring endpoints of the
other replicas.

● Start the Cluster Service: Verify that the WSFC service is running on all participating nodes. Use the
following PowerShell command to check the status:

Get-Service clussvc

If it’s not running, start it manually or investigate the root cause.

● Network Configuration: Ensure the network configuration, especially the AlwaysOn listener, is correctly set up
and that DNS resolves the listener name properly. Test network connectivity between the nodes.

● Check Replica Synchronization: Make sure all replicas are in a synchronized state. If not, check for network
issues or log transport failures between the replicas. Use the following query to check the synchronization state:

-- synchronization_state_desc is reported per database,
-- so it is read from sys.dm_hadr_database_replica_states.
SELECT
    ag.name AS AvailabilityGroupName,
    ar.replica_server_name AS ReplicaName,
    DB_NAME(drs.database_id) AS DatabaseName,
    drs.synchronization_state_desc
FROM sys.dm_hadr_database_replica_states drs
JOIN sys.availability_replicas ar
    ON drs.replica_id = ar.replica_id
JOIN sys.availability_groups ag
    ON drs.group_id = ag.group_id;

2. Synchronization Lag Between Primary and Secondary Replicas

Issue: The secondary replicas are experiencing significant synchronization lag, which may lead to
increased recovery time during a failover.

Causes:

● High transaction log generation rate on the primary replica.

● Insufficient network bandwidth between the replicas.
● Disk I/O bottlenecks on the secondary replica.

Resolution:

● Monitor Log Generation: Use the sys.dm_db_log_stats dynamic management function (SQL Server 2016 SP2
and later) to track log size and growth, and the Log Bytes Flushed/sec performance counter to measure the
log generation rate. If the rate is excessively high, review and optimize the queries or batch processes
causing it.
● Increase Network Bandwidth: Ensure that there is enough bandwidth between the primary and
secondary replicas. If you are running replicas across different geographical locations, consider upgrading
your network links or using technologies like WAN accelerators to reduce network latency.
● Improve Disk Performance: Ensure that the secondary replicas have sufficient I/O throughput. If disk
performance is the bottleneck, consider upgrading to faster disk types (e.g., SSDs) or optimizing storage
configuration.

Verification:

● Use the following query to monitor synchronization lag:

-- The queue sizes and last_commit_time are per-database columns,
-- so they are read from sys.dm_hadr_database_replica_states.
SELECT
    ag.name AS AvailabilityGroupName,
    ar.replica_server_name AS ReplicaName,
    DB_NAME(drs.database_id) AS DatabaseName,
    drs.log_send_queue_size,   -- KB of log not yet sent to the secondary
    drs.redo_queue_size,       -- KB of log received but not yet redone
    drs.last_commit_time
FROM sys.dm_hadr_database_replica_states drs
JOIN sys.availability_replicas ar
    ON drs.replica_id = ar.replica_id
JOIN sys.availability_groups ag
    ON drs.group_id = ag.group_id;
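
The queue sizes are easier to act on when converted into an approximate catch-up time. The sketch below is a rough illustration, not a definitive method: it assumes you also read the log_send_rate and redo_rate columns of sys.dm_hadr_database_replica_states (both in KB per second, and not part of the query above) and that the rates stay roughly constant. The function name and the sample numbers are hypothetical.

```python
from typing import Optional


def estimate_catch_up_seconds(queue_kb: float, rate_kb_per_sec: float) -> Optional[float]:
    """Return the seconds needed to drain a queue, or None if it is not draining."""
    if queue_kb <= 0:
        return 0.0  # nothing is waiting
    if rate_kb_per_sec <= 0:
        return None  # no usable rate, so no finite estimate
    return queue_kb / rate_kb_per_sec


# Hypothetical sample values, not taken from a real server.
send_eta = estimate_catch_up_seconds(queue_kb=512_000, rate_kb_per_sec=25_600)
redo_eta = estimate_catch_up_seconds(queue_kb=102_400, rate_kb_per_sec=51_200)
print(f"log send catch-up: {send_eta:.0f} s, redo catch-up: {redo_eta:.0f} s")
```

A result of None means the queue is not draining at all, which usually points to a suspended or disconnected replica rather than plain lag.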

3. Failover Takes Too Long or Fails

Issue: During a planned or unplanned failover, it takes too long to switch to a secondary replica, or the
failover fails altogether.

Causes:

● The secondary replica is not fully synchronized with the primary replica.
● The transaction log is too large, causing delays in recovery.
● The SQL Server service on the secondary replica is not running or has crashed.
● Incomplete or incorrect failover cluster settings.

Resolution:

● Ensure Synchronization: Before initiating a failover, ensure that the secondary replica is in a
SYNCHRONIZED state, which is only reached in synchronous-commit mode. You can check this using the
synchronization-state query in section 1. If the secondary is not synchronized, perform a manual log
backup and restore, or allow time for the secondary to catch up.
● Monitor Transaction Log Size: If a large transaction log is causing delays, you may need to back up the
transaction log more frequently to keep it from growing too large.
● Check SQL Server Service: Ensure that the SQL Server services are running on the secondary replicas.
If not, investigate the Windows Event Logs and SQL Server logs for errors.
● Review Cluster Settings: Check the AlwaysOn configuration settings, especially the failure-condition
level and health-check timeout that control when a failover is triggered, and ensure that they are
appropriate for your workload. Consider increasing the timeout settings if failover frequently fails.
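
The synchronization check before a planned failover can be sketched as a small pre-flight test. This is a minimal illustration under stated assumptions: the rows are expected to come from sys.dm_hadr_database_replica_states for the target secondary, and the dictionary keys, database names, and function name are hypothetical.

```python
from typing import Iterable, List, Mapping


def databases_blocking_failover(rows: Iterable[Mapping[str, str]]) -> List[str]:
    """Return the databases on the target replica that are not SYNCHRONIZED."""
    return [
        row["database_name"]
        for row in rows
        if row["synchronization_state_desc"] != "SYNCHRONIZED"
    ]


# Hypothetical rows for one secondary replica.
rows = [
    {"database_name": "Sales", "synchronization_state_desc": "SYNCHRONIZED"},
    {"database_name": "Orders", "synchronization_state_desc": "SYNCHRONIZING"},
]
blocking = databases_blocking_failover(rows)
if blocking:
    print("Wait before failing over; not synchronized:", ", ".join(blocking))
else:
    print("All databases are synchronized on the target replica.")
```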
4. Availability Group Listener Fails to Come Online

Issue: The Availability Group listener fails to come online, preventing client connections to the
Availability Group.

Causes:

● DNS registration issues for the listener name.

● Insufficient permissions for the cluster name object (CNO) to create the listener's computer object or register it in DNS.
● Network configuration issues, such as port conflicts or firewall settings blocking the listener.

Resolution:

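● DNS Registration: Ensure that the listener name is correctly registered in DNS. The cluster registers the
listener's record through its network name resource, so first confirm that resource is online in Failover
Cluster Manager. To refresh the node's own DNS records, run the following command on the node:

ipconfig /registerdns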

● Check Permissions: Ensure the account that registers the listener (the cluster name object, CNO, rather than
the SQL Server service account) has the required permissions to create the listener's computer object and DNS
entry. You can review the DNS registration status using the following PowerShell command:

Get-DnsServerResourceRecord -Name "<ListenerName>" -ZoneName "<DNSZone>"

● Firewall and Port Configuration: Verify that the listener's port (default 1433) is open on all participating
nodes and allowed by firewalls. Also, ensure that no other services are using the same port as the listener.
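
The DNS and port checks above can be combined into one quick client-side probe. The sketch below uses only the Python standard library; the function name and the shape of the returned dictionary are hypothetical, and a successful TCP connection only shows that something is listening on the port, not that it is SQL Server.

```python
import socket


def check_listener(name: str, port: int, timeout: float = 3.0) -> dict:
    """Report whether a name resolves and whether a TCP connection to the port succeeds."""
    result = {"resolved": [], "port_open": False}
    try:
        infos = socket.getaddrinfo(name, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return result  # DNS lookup failed, so the port is not tested
    # Collect the distinct IP addresses the name resolved to.
    result["resolved"] = sorted({info[4][0] for info in infos})
    try:
        with socket.create_connection((name, port), timeout=timeout):
            result["port_open"] = True
    except OSError:
        pass  # refused, filtered by a firewall, or timed out
    return result
```

Calling check_listener("<ListenerName>", 1433) from a client machine returns an empty "resolved" list when DNS registration is the problem, and a populated list with "port_open" set to False when the name resolves but a firewall or port conflict blocks the connection.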

5. Cluster Node Drops from the WSFC Cluster

Issue: One or more nodes unexpectedly drop from the Windows Server Failover Clustering (WSFC)
cluster, causing the Availability Group to lose quorum.

Causes:

● Network connectivity issues between nodes.

● Misconfigured quorum settings or insufficient quorum votes.
● Hardware or operating system failure on the node.

Resolution:

● Check Network Connectivity: Ensure that all cluster nodes have consistent network connectivity. You
can use the Failover Cluster Manager or the following PowerShell command to check the status of the
cluster nodes:

Get-ClusterNode

● Review Quorum Settings: Ensure that your quorum configuration is appropriate for the number of nodes
in your WSFC cluster. For a multi-site setup, Node and File Share Majority is usually the better fit,
because Node and Disk Majority requires a shared witness disk that every node can reach.
● Investigate Hardware Issues: Check the Windows Event Logs for any hardware or OS failures that may
have caused the node to drop from the cluster. Resolve hardware issues as necessary.
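
The arithmetic behind "insufficient quorum votes" is a strict majority of all configured votes. The sketch below illustrates only that static rule; it does not model dynamic quorum or dynamic witness, which adjust votes at run time on newer Windows Server versions, and the function names are hypothetical.

```python
def votes_needed(total_votes: int) -> int:
    """Return the smallest number of votes that forms a strict majority."""
    if total_votes < 1:
        raise ValueError("total_votes must be at least 1")
    return total_votes // 2 + 1


def has_quorum(votes_online: int, total_votes: int) -> bool:
    """Return True when the online votes form a strict majority."""
    return votes_online >= votes_needed(total_votes)


# Four nodes without a witness: losing two nodes loses quorum.
print(has_quorum(votes_online=2, total_votes=4))
# Four nodes plus a file share witness (five votes): two node failures are survivable.
print(has_quorum(votes_online=3, total_votes=5))
```

This is why an even number of nodes is normally paired with a witness: the extra vote turns a tie into a majority.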

Verification:

● Ensure that the cluster nodes are back online by running:

Get-ClusterNode | Where-Object { $_.State -eq "Up" }

6. Data Loss After a Forced Failover

Issue: After a forced failover, data loss occurs due to unsent transaction logs between the primary and
secondary replicas.

Causes:

● Forced failover to a SECONDARY replica that was not fully synchronized.

● Transactions committed on the primary whose log records had not yet been sent to the secondary, as can happen in asynchronous-commit mode.

Resolution:

● Avoid Forced Failovers: As a general rule, avoid forced failovers (the FORCE_FAILOVER_ALLOW_DATA_LOSS
option) unless absolutely necessary. This option should only be used when the primary replica is permanently offline.
● Monitor Synchronization: Regularly monitor the synchronization state of your replicas to ensure that
secondary replicas are always synchronized and capable of performing a failover without data loss. Use
this query to check for unsent logs:

SELECT * FROM sys.dm_hadr_database_replica_states

WHERE synchronization_state_desc <> 'SYNCHRONIZED';

● Restore from Backup: If data loss occurs, you may need to restore from a recent backup to recover the
lost data.
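
To judge how much data a forced failover would put at risk, you can compare the last_commit_time reported for the primary database with the value reported for the same database on the secondary. The sketch below is an approximation under that assumption: the gap is only a rough indicator of exposure, and the function name and timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def commit_gap(primary_last_commit: datetime, secondary_last_commit: datetime) -> timedelta:
    """Return how far the secondary's last commit trails the primary's (never negative)."""
    gap = primary_last_commit - secondary_last_commit
    return max(gap, timedelta(0))


# Hypothetical timestamps for one database.
gap = commit_gap(
    primary_last_commit=datetime(2024, 1, 1, 12, 0, 30),
    secondary_last_commit=datetime(2024, 1, 1, 12, 0, 0),
)
print(f"Approximate exposure if forced now: {gap.total_seconds():.0f} seconds of commits")
```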

7. Availability Group Replica in "NOT SYNCHRONIZED" State

Issue: One or more replicas in the Availability Group remain in a "NOT SYNCHRONIZED" state,
preventing automatic failover and data protection.

Causes:

● Network issues between the primary and secondary replicas.

● The transaction log on the primary has grown too large and is not being transmitted quickly enough.
● Disk I/O bottlenecks on the secondary replica prevent it from catching up.

Resolution:

● Check Network Latency: Monitor the network traffic between the replicas and ensure that there are no
high latencies or dropped packets. Use tools like Ping and Tracert to diagnose connectivity issues.
● Transaction Log Backups: Ensure that regular transaction log backups are occurring on the primary
replica to keep the log size manageable.
● Disk Performance on Secondary: If the secondary replica is experiencing I/O bottlenecks, consider
upgrading the disks or optimizing the I/O path.
