SQL Server Backup Restore 2012
ISBN: 978-1-906434-74-8
By Shawn McGehee
The right of Shawn McGehee to be identified as the author of this work has been asserted by him in accordance with the Copyright, Designs and Patents Act 1988. All rights reserved. No part of this publication may be reproduced, stored or introduced into a retrieval system, or transmitted, in any form, or by any means (electronic, mechanical, photocopying, recording or otherwise) without the prior written consent of the publisher. Any person who does any unauthorized act in relation to this publication may be liable to criminal prosecution and civil claims for damages. This book is sold subject to the condition that it shall not, by way of trade or otherwise, be lent, re-sold, hired out, or otherwise circulated without the publisher's prior consent in any form other than that in which it is published and without a similar condition including this condition being imposed on the subsequent publisher.
Technical Review by Eric Wisdahl
Cover Image by Andy Martin
Edited by Tony Davis
Typeset & Designed by Peter Woodhouse & Gower Associates
Table of Contents
Introduction
Software Requirements and Code Examples
Chapter 1: Basics of Backup and Restore
Components of a SQL Server Database
    Data files
    Filegroups
    Transaction log
SQL Server Backup Categories and Types
    SQL Server database backups
    SQL Server transaction log backups
    File backups
Restoring Databases
    Restoring system databases
    Restoring single pages from backup
Summary
Backup Storage
    Local disk (DAS or SAN)
    Network device
    Tape
Backup Tools
    Maintenance plan backups
    Custom backup scripts
    Third-party tools
Backup and Restore Planning
    Backup requirements
    Restore requirements
    An SLA template
    Example restore requirements and backup schemes
    Backup scheduling
Backup Verification and Test Restores
    Back up WITH CHECKSUM
    Verifying restores
    DBCC CHECKDB
Documenting Critical Backup Information
Summary
Taking Full Backups
    Native SSMS GUI method
    Native T-SQL method
Verifying Backups
Summary
Full Restores in the Backup and Restore SLA
Possible Issues with Full Database Restores
    Large data volumes
    Restoring databases containing sensitive data
    Too much permission
Performing Full Restores
    Native SSMS GUI full backup restore
    Native T-SQL full restore
Forcing Restore Failures for Fun
Considerations When Restoring to a Different Location
Restoring System Databases
    Restoring the msdb database
    Restoring the master database
Summary
A Brief Peek Inside a Transaction Log
Three uses for transaction log backups
    Performing database restores
    Large database migrations
    Log shipping
Preparing for Log Backups
    Choosing the recovery model
    Creating the database
    Creating and populating tables
    Taking a base full database backup
Taking Log Backups
    The GUI way: native SSMS log backups
    T-SQL log backups
Forcing Log Backup Failures for Fun
Troubleshooting Log Issues
    Failure to take log backups
    Other factors preventing log truncation
    Excessive logging activity
    Handling the 9002 Transaction Log Full error
    Log fragmentation
Forcing Restore Failures for Fun
Summary
Taking Differential Backups
    Native GUI differential backup
    Native T-SQL differential backup
    Compressed differential backups
Performing Differential Backup Restores
    Native GUI differential restore
    Native T-SQL differential restore
    Restoring compressed differential backups
Forcing Failures for Fun
    Missing the base
    Running to the wrong base
    Recovered, already
Chapter 8: Database Backup and Restore with SQL Backup Pro
Preparing for Backups
Full Backups
    SQL Backup Pro full backup GUI method
    SQL Backup Pro full backup using T-SQL
Log Backups
    Preparing for log backups
    SQL Backup Pro log backups
Differential Backups
Building a reusable and schedulable backup script
Restoring Database Backups with SQL Backup Pro
    Preparing for restore
    SQL Backup Pro GUI restore to the end of a log backup
    SQL Backup Pro T-SQL complete restore
    SQL Backup Pro point-in-time restore to standby
    Restore metrics: native vs. SQL Backup Pro
Verifying Backups
Backup Optimization
Summary
Advantages of File Backup and Restore
Common Filegroup Architectures
Common Issues with File Backup and Restore
File Backup and Restore SLA
Forcing Failures for Fun
Summary
Why Partial Backups?
Possible Issues with Partial Backup and Restore
Partial Backups and Restores in the SLA
Forcing Failures for Fun
Summary
SQL Backup Pro GUI Installation
Acknowledgements
I would like to thank everyone who helped and supported me through the writing of this book. I would especially like to thank my editor, Tony Davis, for sticking with me during what was a long and occasionally daunting process, and helping me make my first single-author book a reality. I also need to give a special thank you to all of my close friends and family for always being there for me during all of life's adventures, good and bad.
Shawn McGehee
Introduction
My first encounter with SQL Server, at least from an administrative perspective, came while I was still at college, working in a small web development shop. We ran a single SQL Server 6.5 instance, on Windows NT, and it hosted every database for every client that the company serviced. There was no dedicated administration team; just a few developers and the owner. One day, I was watching and learning from a fellow developer while he made code changes to one of our backend administrative functions. Suddenly, the boss stormed into the room and demanded everyone's immediate attention. Whatever vital news he had to impart is lost in the sands of time, but what I do remember is that when the boss departed, my friend returned his attention to the modified query and hit Execute, an action that was followed almost immediately by a string of expletives so loud they could surely have been heard several blocks away. Before being distracted by the boss, he'd written the DELETE portion of a SQL statement, but not the necessary WHERE clause and, upon hitting Execute, he had wiped out all the data in a table. Fortunately, at least, he was working on a test setup, with test data. An hour later we'd replaced all the lost test data, no real harm was done and we were able to laugh about it. As the laughter subsided, I asked him how we would have gotten that data back if it had been a live production database for one of the clients or, come to think of it, what we would do if the whole server went down, with all our client databases on board. He had no real answer, beyond "Luckily it's never happened." There was no disaster recovery plan; probably because there were no database backups that could be restored! It occurred to me that if disaster ever did strike, we would be in a heap of trouble, to the point where I wondered if the company as a whole could even survive such an event. It was a sobering thought. 
That evening I did some online research on database backups, and the very next day performed a full database backup of every database on our server. A few days later, I had jobs scheduled to back up the databases on a regular basis, to one of the local hard drives on that machine, which I then manually copied to another location. I told the boss what I'd done, and so began my stint as the company's "accidental DBA."
Over the coming weeks and months, I researched various database restore strategies, and documented a basic "crash recovery" plan for our databases. Even though I moved on before we needed to use even one of those backup files, I felt a lot better knowing that, with the plan that I'd put in place, I left the company in a situation where they could recover from a server-related disaster, and continue to thrive as a business. This, in essence, is the critical importance of database backup and restore: it can mean the difference between life and death for a business, and for the career of a DBA.
Such a plan needs to be developed for each and every user database in your care, as well as supporting system databases, and it should be tailored to the specific requirements of each database, based on the type of data being stored (financial, departmental, personal, and so on), the maximum acceptable risk of potential data loss (day? hour? minute?), and the maximum acceptable down-time in the event of a disaster. Each of these factors will help decide the types of backup required, how often they need to be taken, how many days' worth of backup files need to be stored locally, and so on. All of this should be clearly documented so that all parties, both the DBAs and application/database owners, understand the level of service that is expected for each database, and what's required in the plan to achieve it.
At one end of the scale, for a non-frontline, infrequently-modified database, the backup and recovery scheme may be simplicity itself, involving a nightly full database backup, containing a complete copy of all data files, which can be restored if and when necessary. At the opposite end of the scale, a financial database with more or less zero tolerance to data loss will require a complex scheme consisting of regular (daily) full database backups, probably interspersed with differential database backups, capturing all changes since the last full database backup, as well as very regular transaction log backups, capturing the contents added to the database log file since the last log backup.
For very large databases (VLDBs), where it may not be possible to back up the entire database in one go, the backup and restore scheme may become more complex still, involving backup of individual data files, or filegroups, as well as transaction logs.
All of these backups will need to be carefully planned and scheduled, the files stored securely, and then restored in the correct sequence, to allow the database to be restored to the exact state in which it existed at any point in time in its history, such as the point just before a disaster occurred. It sounds like a daunting task, and if you are not well prepared and well practiced, it will be. However, with the tools, scripts, and techniques provided in this book, and with the requisite planning and practice, you will be prepared to respond quickly and efficiently to a disaster, whether it's caused by disk failure, malicious damage, database corruption or the accidental deletion of data.
This book will walk you step by step through the process of capturing all types of backup, from basic full database backups, to transaction log backups, to file and even partial backups. It will demonstrate how to perform all of the most common types of restore operation, from single backup file restores, to complex point-in-time restores, to recovering a database by restoring just a subset of the files that make up the database.
As well as allowing you to recover a database smoothly and efficiently in the face of one of the various "doomsday scenarios," your well-rounded backup and recovery plan, developed with the help of this book, will also save you time and trouble in a lot of other situations including, but not limited to, those below.
• Refreshing development environments: periodically, developers will request that their development environments be refreshed with current production data and objects.
• Recovering from partial data loss: occasionally, a database has data "mysteriously disappear from it."
• Migrating databases to different servers: you will eventually need to move databases permanently to other servers, for a variety of reasons. The techniques in this book can be used for this purpose, and we go over some ways that different backup types can cut down on the down-time which this process may cause.
• Offloading reporting needs: reporting on data is becoming more and more of a high priority in most IT shops. With techniques like log shipping, you can create cheap and quick reporting solutions that can provide only slightly older reporting data than High Availability solutions.
I learned a lot of what I know about backup and restore the hard way, digging through innumerable articles on Books Online, and various community sites. I hope my book will serve as a place for the newly-minted and accidental DBA to get a jump on backups and restores. It can be a daunting task to start planning a Backup and Restore SLA from scratch, even in a moderately-sized environment, and I hope this book helps you get a good start.
Chapters 5 and 6 cover how to take transaction log backups, and then use them in conjunction with a full database backup to restore a database to a particular point in time. They also cover common transaction log problems and how to resolve them.
Chapter 7 covers standard and compressed differential database backup and restore.
Basic backup and restore with SQL Backup: how to capture and restore all basic backup types using Red Gate SQL Backup.
• Chapter 8: third-party tools such as Red Gate SQL Backup aren't free, but they do offer numerous advantages in terms of the ease with which all the basic backups can be captured, automated, and then restored. Many organizations, including my own, rely on such tools for their overall backup and restore strategy.
Advanced backup and restore: how to capture and restore file and filegroup backups, and partial database backups.
• Chapter 9: arguably the most advanced chapter in the book, explaining the filegroup architectures that enable file-based backup and restore, and the complex process of capturing the necessary file backups and transaction log backups, and using them in various restore operations.
• Chapter 10: a brief chapter on partial database backups, suitable for large databases with a sizeable portion of read-only data.
Finally, Appendix A provides a quick reference on how to download, install, and configure the SQL Backup tool from Red Gate Software, so that you can work through any examples in the book that use this tool.
Data files
Data files in a SQL Server database refer to the individual data containers that are used to store the system and user-defined data and objects. In other words, they contain the data, tables, views, stored procedures, triggers and everything else that is accessed by you, and your end-users and applications. These files also include most of the system information about your database, including permission information, although not including anything that is stored in the master system database.
Each database must have one, and only one, primary data file, typically denoted by the .MDF extension, which will be stored in the PRIMARY filegroup. It may also have some secondary data files, typically denoted by the .NDF extension. Note that use of the .MDF and .NDF extensions is a matter of convention rather than necessity; if you enjoy confusing your fellow DBAs, you can apply any extensions you wish to these files.
The primary data file will contain:
• all system objects and data
• by default, all user-defined objects and data (assuming that only the MDF file exists in the PRIMARY filegroup)
• the location of any secondary data files.
Many of the databases we'll create in this book will contain just a primary data file, in the PRIMARY filegroup, although in later chapters we'll also create some secondary data files to store user-defined objects and data.
Writes to data files occur in a random fashion, as data changes affect random pages stored in the database. As such, there is a potential performance advantage to be had from being able to write simultaneously to multiple data files. Any secondary data files are typically denoted with the NDF extension, and can be created in the PRIMARY filegroup or in separate user-defined filegroups (discussed in more detail in the next section). When multiple data files exist within a single filegroup, SQL Server writes to these files using a proportional fill algorithm, where the amount of data written to a file is proportionate to the amount of free space in that file, compared to other files in the filegroup.
Collectively, the data files for a given database are the cornerstone of your backup and recovery plan. If you have not backed up your live data files, and the database becomes corrupted or inaccessible in the event of a disaster, you will almost certainly have lost some or all of your data.
As a final point, it's important to remember that data files will need to grow, as more data is added to the database. The manner in which this growth is managed is often a point of contention among DBAs. You can either manage the growth of the files manually, adding space as the data grows, or allow SQL Server to auto-grow the files, by a certain value or percentage each time the data file needs more space.
Personally, I advocate leaving auto-growth enabled, but on the understanding that files are sized initially to cope with current data and predicted data growth (over a year, say) without undergoing an excessive number of auto-growth events. We'll cover this topic more thoroughly in the Database creation section of Chapter 3, but for the rest of the discussion here, we are going to assume that the data and log files are using auto-growth.
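As a minimal sketch of the sizing advice above, the script below creates a database with explicit initial file sizes and fixed growth increments, then reports the resulting settings. The database name (SalesDemo), logical file names, paths and sizes are all illustrative assumptions; pick values that suit your own data volumes and storage.

```sql
-- Hypothetical names, paths and sizes; adjust for your own server.
CREATE DATABASE SalesDemo
ON PRIMARY
(
    NAME = N'SalesDemo_Data',
    FILENAME = N'C:\SQLData\SalesDemo_Data.mdf',
    SIZE = 10GB,        -- sized up front for current data plus predicted growth
    FILEGROWTH = 1GB    -- fixed increment, so each auto-growth event is predictable
)
LOG ON
(
    NAME = N'SalesDemo_Log',
    FILENAME = N'C:\SQLData\SalesDemo_Log.ldf',
    SIZE = 2GB,
    FILEGROWTH = 512MB
);
GO

-- Review the size and growth settings of each file.
-- size is reported in 8 KB pages; growth is in pages unless is_percent_growth = 1.
SELECT name,
       type_desc,
       size * 8 / 1024 AS size_mb,
       growth,
       is_percent_growth
FROM SalesDemo.sys.database_files;
```

A fixed growth increment is used here rather than a percentage simply because it keeps the size of each growth event constant as the file gets larger; that is a design preference, not a requirement.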
Filegroups
A filegroup is simply a logical collection of one or more data files. When data is inserted into an object that is stored in a given filegroup, SQL Server will distribute that data across all data files in that filegroup, using the proportional fill algorithm described in the previous section.
For example, let's consider the PRIMARY filegroup, which in many respects is a "special case." The PRIMARY filegroup will always be created when you create a new database, and it must always hold your primary data file, which will always contain the pages allocated for your system objects, plus "pointers" to any secondary data files. By default, the PRIMARY filegroup is the DEFAULT filegroup for the database and so will also store all user objects and data, distributed between the data files in that filegroup.
However, it is possible to store some or all of the user objects and data in a separate filegroup. For example, one commonly cited best practice with regard to filegroup architecture is to store system data separately from user data. In order to follow this practice, we might create a database with both a PRIMARY and a secondary, or user-defined, filegroup, holding one or more secondary data files. All system objects would automatically be stored in the PRIMARY data file. We would then ALTER the database to set the secondary filegroup as the DEFAULT filegroup for that database. Thereafter, any user objects will, by default, be stored in that secondary filegroup, separately from the system objects.
There may also be occasions when we want to store just certain, specific user objects separately, outside the PRIMARY filegroup. To store an object in a secondary, rather than the PRIMARY, filegroup, we simply specify this during object creation, via the ON clause.
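The filegroup operations just described might look something like the sketch below. It assumes a hypothetical database called SalesDemo already exists; the filegroup name (UserData), file name, path and table definitions are likewise invented for illustration.

```sql
-- All names and paths are hypothetical.
-- 1. Add a user-defined filegroup and give it one secondary data file.
ALTER DATABASE SalesDemo ADD FILEGROUP UserData;
GO
ALTER DATABASE SalesDemo
ADD FILE
(
    NAME = N'SalesDemo_UserData1',
    FILENAME = N'D:\SQLData\SalesDemo_UserData1.ndf',
    SIZE = 5GB,
    FILEGROWTH = 1GB
)
TO FILEGROUP UserData;
GO

USE SalesDemo;
GO

-- 2. Place one specific table in the secondary filegroup via the ON clause.
CREATE TABLE dbo.OrderHistory
(
    OrderID   INT      NOT NULL PRIMARY KEY,
    OrderDate DATETIME NOT NULL
)
ON [UserData];
GO

-- 3. Alternatively, make the secondary filegroup the DEFAULT, so that any
--    user object created without an ON clause is stored there.
ALTER DATABASE SalesDemo MODIFY FILEGROUP UserData DEFAULT;
GO
CREATE TABLE dbo.Customers
(
    CustomerID   INT           NOT NULL PRIMARY KEY,
    CustomerName NVARCHAR(100) NOT NULL
);  -- no ON clause: stored in UserData, the new DEFAULT filegroup
GO
```

Note that a filegroup must contain at least one file before it can be made the DEFAULT, which is why the ADD FILE step comes first.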
Any data files in the secondary filegroup can, and typically will, be stored on separate physical storage from those in the PRIMARY filegroup. When a BACKUP DATABASE command is issued it will, by default, back up all objects and data in all data files in all filegroups. However, it's possible to specify that only certain filegroups, or specific files within a filegroup are backed up, using file or filegroup backups (covered in more detail later in this chapter, and in full detail in Chapter 9, File and Filegroup Backup and Restore). It's also possible to perform a partial backup (Chapter 10, Partial Backup and Restore), excluding any read-only filegroups. Given these facts, there's a potential for both performance and administrative benefits, from separating your data across filegroups. For example, if we have certain tables that are exclusively read-only then we can, by storing this data in a separate filegroup, exclude this data from the normal backup schedule. After all, performing repeated backups of data that is never going to change is simply a waste of disk space. If we have tables that store data that is very different in nature from the rest of the tables, or that is subject to very different access patterns (e.g. heavily modified), then there can be performance advantages to storing that data on separate physical disks, configured optimally for storing and accessing that particular data. Nevertheless, it's my experience that, in general, RAID (Redundant Array of Inexpensive Disks) technology and SAN (Storage Area Network) devices (covered in Chapter 2) automatically do a much better job of optimizing disk access performance than the DBA can achieve by manual placement of data files. Also, while carefully designed filegroup architecture can add considerable flexibility to your backup and recovery scheme, it will also add administrative burden. 
There are certainly valid reasons for using secondary files and filegroups, such as separating system and user data, and there are certainly cases where they might be a necessity, for example, for databases that are simply too large to back up in a single operation. However, they are not required on every database you manage. Unless you have experience with them, or know definitively that you will gain significant performance with their use, then sticking to a single data file database will work for you most of the time (with the data being automatically striped across physical storage, via RAID).
Finally, before we move on, it's important to note that SQL Server transaction log files are never members of a filegroup. Log files are always managed separately from the SQL Server data files.
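To make the filegroup, file and partial backups mentioned in this section a little more concrete, here is a hedged sketch of the three commands. The database (SalesDemo), filegroup (UserData), logical file name and backup paths are assumptions, and the example presumes the database uses the FULL recovery model; Chapters 9 and 10 are the place for the full treatment.

```sql
-- Hypothetical database, filegroup, file and path names throughout.

-- Back up every file in a single filegroup.
BACKUP DATABASE SalesDemo
    FILEGROUP = N'UserData'
TO DISK = N'C:\SQLBackups\SalesDemo_FG_UserData.bak';
GO

-- Back up one specific data file, identified by its logical name.
BACKUP DATABASE SalesDemo
    FILE = N'SalesDemo_UserData1'
TO DISK = N'C:\SQLBackups\SalesDemo_File_UserData1.bak';
GO

-- Partial backup: the PRIMARY filegroup plus all read-write filegroups,
-- skipping any filegroups that are marked read-only.
BACKUP DATABASE SalesDemo
    READ_WRITE_FILEGROUPS
TO DISK = N'C:\SQLBackups\SalesDemo_Partial.bak';
GO
```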
Transaction log
A transaction log file contains a historical account of all the actions that have been performed on your database. All databases have a transaction log file, which is created automatically, along with the data files, on creation of the database and is conventionally denoted with the LDF extension. It is possible to have multiple log files per database but only one is required. Unlike data files, where writes occur in a random fashion, SQL Server always writes to the transaction log file sequentially, never in parallel. This means that it will only ever write to one log file at a time, and having more than one file will not boost write-throughput or speed. In fact, having multiple files could result in performance degradation, if each file is not correctly sized or differs in size and growth settings from the others.
Some inexperienced DBAs don't fully appreciate the importance of the transaction log file, both to their backup and recovery plan and to the general day-to-day operation of SQL Server, so it's worth taking a little time out to understand how SQL Server uses the transaction log (and it's a topic we'll revisit in more detail in Chapter 5, Log Backups).
Whenever a modification is made to a database object (via Data Definition Language, DDL), or the data it contains (Data Manipulation Language, DML), the details of the change are recorded as a log record in the transaction log. Each log record contains details of a specific action within the database (for example, starting a transaction, or inserting a row, or modifying a row, and so on). Every log record will record the identity of the transaction that performed the change, which pages were changed, and the data changes that were made. Certain log records will record additional information. For example, the log record recording the start of a new transaction (the LOP_BEGIN_XACT log record) will contain the time the transaction started, and the LOP_COMMIT_XACT (or LOP_ABORT_XACT) log records will record the time the transaction was committed (or aborted).
From the point of view of SQL Server and the DBA looking after it, the transaction log performs the following critical functions:
• ensures transactional durability and consistency
• enables, via log backups, point-in-time restore of databases.
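If you are curious to see the log records described above, one way to peek at them is the sys.fn_dblog function. Be aware that this function is undocumented and unsupported by Microsoft, so treat the following as an exploratory sketch for a test system only, on the assumption that the column names shown are present in your version of SQL Server.

```sql
-- sys.fn_dblog is undocumented and unsupported: test systems only.
-- Run in the context of the database whose active log you want to inspect.
SELECT [Current LSN],
       Operation,
       [Transaction ID],
       [Begin Time],   -- populated for LOP_BEGIN_XACT records
       [End Time]      -- populated for LOP_COMMIT_XACT / LOP_ABORT_XACT records
FROM sys.fn_dblog(NULL, NULL)   -- NULL, NULL = no start or end LSN filter
WHERE Operation IN (N'LOP_BEGIN_XACT', N'LOP_COMMIT_XACT', N'LOP_ABORT_XACT');
```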
... modified in non-volatile storage (i.e. on disk), the description of the change must first be "hardened" to stable storage. SQL Server or, more specifically, the buffer manager, makes sure that the change descriptions (log records) are written to the physical transaction log file before the data pages are written to the physical data files.
The Lazy Writer
Another process that scans the data cache, the Lazy Writer, may also write dirty data pages to disk, outside of a checkpoint, if forced to do so by memory pressures.
By always writing changes to the log file first, SQL Server can guarantee that the effects of all committed transactions will ultimately be reflected in the data files, and that any data modifications on disk that originate from incomplete transactions, i.e. those for which neither a COMMIT nor a ROLLBACK has been issued, are ultimately not reflected in the data files.
This process of reconciling the contents of the data and log files occurs during the database recovery process (sometimes called Crash Recovery), which is initiated automatically whenever SQL Server restarts, or as part of the RESTORE command. Say, for example, a database crashes after a certain transaction (T1) is "hardened" to the transaction log file, but before the actual data is written from memory to disk. When the database restarts, a recovery process is initiated, which reconciles the data file and log file. All of the operations that comprise transaction T1, recorded in the log file, will be "rolled forward" (redone) so that they are reflected in the data files.
During this same recovery process, any data modifications on disk that originate from incomplete transactions, i.e. those for which neither a COMMIT nor a ROLLBACK has been issued, are "rolled back" (undone), by reading the relevant operations from the log file, and performing the reverse physical operation on the data. More generally, this rollback process occurs if a ROLLBACK command is issued for an explicit transaction, or if an error occurs and XACT_ABORT is turned on, or if the database detects that communication has been broken between the database and the client that instigated the transactions. In such circumstances, the log records pertaining to an interrupted transaction, or one for which the ROLLBACK command is explicitly issued, are read and the changes rolled back.
In these ways, SQL Server ensures that, either all the actions associated with a transaction succeed as a unit, or that they all fail, and so guarantees data consistency and integrity during normal day-to-day operation.
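The all-or-nothing behavior described above is easy to see with an explicit transaction. The following is a small, self-contained sketch using a temporary table (the table and its contents are invented for the example): two updates are made inside a transaction, which is then rolled back, and SQL Server uses the log records to undo both changes.

```sql
-- Illustrative temporary table; names and values are made up.
CREATE TABLE #Accounts
(
    AccountID INT   NOT NULL PRIMARY KEY,
    Balance   MONEY NOT NULL
);
INSERT INTO #Accounts (AccountID, Balance) VALUES (1, 100), (2, 50);

BEGIN TRANSACTION;
    -- Both changes are logged before any data page is written to disk.
    UPDATE #Accounts SET Balance = Balance - 30 WHERE AccountID = 1;
    UPDATE #Accounts SET Balance = Balance + 30 WHERE AccountID = 2;
ROLLBACK TRANSACTION;   -- the log records are read and both updates are undone

-- The balances are back at their original values of 100 and 50.
SELECT AccountID, Balance FROM #Accounts;

DROP TABLE #Accounts;
```

Replacing ROLLBACK TRANSACTION with COMMIT TRANSACTION would make both updates permanent as a unit.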
27
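The all-or-nothing behavior described above can be sketched in T-SQL. This is a minimal illustration rather than an example from the book: the dbo.Accounts table and its columns are hypothetical, and the TRY...CATCH pattern is just one common way of making sure a failed transaction is explicitly rolled back.

```sql
-- Hypothetical table: dbo.Accounts (AccountID INT, Balance MONEY)
SET XACT_ABORT ON;  -- a run-time error now aborts the whole transaction

BEGIN TRY
    BEGIN TRANSACTION;

    -- Both updates belong to one unit of work
    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
    UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountID = 2;

    COMMIT TRANSACTION;  -- the commit is recorded in the transaction log
END TRY
BEGIN CATCH
    -- If a transaction is still open, undo every change it made
    IF XACT_STATE() <> 0
        ROLLBACK TRANSACTION;

    THROW;  -- re-raise the original error (available from SQL Server 2012)
END CATCH;
```

If either UPDATE fails, neither change survives, which is the same guarantee that crash recovery provides for transactions that were still incomplete when the server stopped.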
(Chapters 3 to 8) will focus on database backups (full and differential) and transaction log backups. However, we do cover file backups in Chapters 9 and 10. Note that the exact types of backup that can be performed, and to some extent the restore options that are available, depend on the recovery model in which the database is operating (SIMPLE, FULL or BULK_LOGGED). We'll be discussing this topic in more detail shortly, in the Recovery Models section, but for the time being perhaps the most notable point to remember is that it is not possible to perform transaction log backups for a database operating in SIMPLE recovery model, and so log backups play no part in a database RESTORE operation for these databases. Now we'll take a look at each of these types of backup in a little more detail.
If that size of potential loss is unacceptable, then you'll need to either take more frequent full backups (often not logistically viable, especially for large databases) or take transaction log backups and, optionally, some differential database backups, in order to minimize the risk of data loss. A full database backup serves as the base for any subsequent differential database backup.
Copy-only full backups
There is a special type of full backup, known as a copy-only full backup, which exists independently of the sequence of backup files required to restore a database, and cannot act as the base for differential database backups. This topic is discussed in more detail in Chapter 3, Full Database Backups.
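As a rough sketch of the difference between the two, the statements below take a regular full backup and then a copy-only full backup. The database name SalesDB and the C:\SQLBackups path are assumptions made purely for illustration; the WITH options shown are standard BACKUP options, but your own strategy may call for different ones.

```sql
-- Hypothetical database name and backup path
-- Regular full backup: becomes the base for later differential backups
BACKUP DATABASE SalesDB
TO DISK = N'C:\SQLBackups\SalesDB_Full.bak'
WITH INIT, CHECKSUM;   -- INIT overwrites any backup sets already in the file

-- Copy-only full backup: does not reset the differential base
BACKUP DATABASE SalesDB
TO DISK = N'C:\SQLBackups\SalesDB_CopyOnly.bak'
WITH COPY_ONLY, INIT, CHECKSUM;
```

A copy-only backup is handy for an ad hoc copy (for a test server, say) that should leave the scheduled backup chain untouched.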
Some DBAs avoid taking differential backups where possible, due to the perceived administrative complexity they add to the backup and restore strategy; they prefer instead to rely solely on a mix of full and regular transaction log backups. Personally, however, I find differential backups to be an invaluable component of the backup strategy for many of my databases. Furthermore, for VLDBs, with a very large full backup footprint, differential backups may become a necessity. Even so, it is still important, when using differential backups, to update the base backup file at regular intervals. Otherwise, if the database is large and the data changes frequently, our differential backup files will eventually grow so large that they give us little value. We will discuss differential backups further in Chapter 7, where we will dive much deeper into best practices for their use as part of a backup and recovery strategy.
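For illustration, a differential backup and a simple check on how its size compares with the full backup might look like the sketch below. SalesDB and the backup path are hypothetical names; msdb.dbo.backupset is the standard backup history table, in which type 'D' marks a full database backup and 'I' a differential.

```sql
-- Hypothetical database name and backup path
BACKUP DATABASE SalesDB
TO DISK = N'C:\SQLBackups\SalesDB_Diff.bak'
WITH DIFFERENTIAL, INIT, CHECKSUM;

-- Compare the sizes of recent full ('D') and differential ('I') backups
SELECT backup_finish_date,
       type,
       backup_size / 1048576.0 AS backup_size_mb
FROM msdb.dbo.backupset
WHERE database_name = N'SalesDB'
  AND type IN ('D', 'I')
ORDER BY backup_finish_date;
```

When the differential sizes start to approach the size of the full backup, that is a reasonable signal that a new base backup is due.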
Therefore, the full transaction "history" can be captured into a backup file by backing up the transaction log. These log backups can then be used as part of a database RESTORE operation, in order to roll the database forward to a point in time at, or very close to, when some "disaster" occurred.
as a result of the disaster. A tail log backup using NO_TRUNCATE may "succeed" (although with reported errors) in these circumstances but a subsequent attempt to restore that tail log backup will fail. This is discussed in more detail in the Minimally logged operations section of Chapter 6.
In the FULL (or BULK_LOGGED) recovery model, once a full backup of the database has been taken, the inactive portion of the log is no longer marked as reusable on CHECKPOINT, so records in the inactive VLFs are retained alongside those in the active VLFs. Thus we maintain a complete, unbroken series of log records, which can be captured in log backups, for use in point-in-time restore operations. Each time a BACKUP LOG operation occurs, it marks any VLFs that are no longer necessary as inactive and hence reusable. This explains why it's vital to back up the log of any database running in the FULL (or BULK_LOGGED) recovery model; it's the only operation that makes space in the log available for reuse. In the absence of log backups, the log file will simply continue to grow (and grow) in size, unchecked.
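A minimal sketch of this in practice: the query below asks SQL Server why log space cannot currently be reused, and the BACKUP LOG statement then takes the log backup that frees it. The database name and path are assumptions; sys.databases and its log_reuse_wait_desc column are standard.

```sql
-- Why can't log space be reused right now?
-- A value of LOG_BACKUP means the log is waiting for a log backup.
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'SalesDB';   -- hypothetical database name

-- Back up the log so that inactive VLFs can be marked as reusable
BACKUP LOG SalesDB
TO DISK = N'C:\SQLBackups\SalesDB_Log.trn'   -- hypothetical path
WITH INIT, CHECKSUM;
```

In a real schedule, each log backup would normally go to its own uniquely named file rather than overwriting a single file with INIT.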
File backups
In addition to the database backups discussed previously, it's also possible to take file backups. Whereas database backups back up all data files for a given database, with file backups we can back up just a single, specific data file, or a specific group of data files (for example, all the data files in a particular filegroup). For a VLDB that has been "broken down" into multiple filegroups, file backups (see Chapter 9) can decrease the time and disk space needed for the backup strategy and also, in certain circumstances, make disaster recovery much quicker. For example, let's assume that a database's architecture consists of three filegroups: a primary filegroup holding only system data, a secondary filegroup holding recent business data and a third filegroup holding archive data, which has been specifically designated as a READONLY filegroup. If we were to perform database backups, then each full backup file would contain a lot of data that we know will never be updated, which would simply be wasting disk space. Instead, we can take frequent, scheduled file backups of just the system and business data.
Furthermore, if a database suffers corruption that is limited to a single filegroup, we may be able to restore just the filegroup that was damaged, rather than the entire database. For instance, let's say we placed our read-only filegroup on a separate drive and that drive died. Not only would we save time by only having to restore the read-only filegroup, but also the database could remain online and just that read-only data would be unavailable until after the restore. This latter advantage only holds true for user-defined filegroups; if the primary filegroup goes down, the whole ship goes down as well. Likewise, if the disk holding the file storing recent business data goes down then, again, we may be able to restore just that filegroup; in this case, we would also have to restore any transaction log backups taken after the file backup to ensure that the database as a whole could be restored to a consistent state.
Finally, if a catastrophe occurs that takes the database completely offline, and we're using SQL Server Enterprise Edition, then we may be able to perform an online restore, restoring the primary data file and bringing the database back online before we've restored the other data files. We'll cover all this in a lot more detail, with examples, in Chapter 9.
The downside of file backups is the significant complexity and administrative burden that they can add to the backup strategy. Firstly, it means that a "full backup" will consist of capturing several backup files, rather than just a single one. Secondly, we will have to take transaction log backups to cover the time between file backups of different filegroups. We'll discuss this in fuller detail in Chapter 9 but, briefly, the reason for this is that while the data is stored in separate physical files it will still be relationally connected; changes made to data stored in one file will affect related data in other files, and since the individual file backups are taken at different times, SQL Server needs any subsequent log backup files to ensure that it can restore a self-consistent version of the database.
Keeping track of all of the different backup jobs and files can become a daunting task. This is the primary reason why, despite the potential benefits, most people prefer to deal with the longer backup times and larger file sizes that accompany full database backups.
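To make the three-filegroup scenario a little more concrete, the sketch below backs up only the primary and business-data filegroups, followed by a log backup. The database name, the filegroup name BusinessData and the paths are all hypothetical, and the example assumes the database is in FULL recovery model; Chapter 9 covers the real examples.

```sql
-- Hypothetical names: PRIMARY holds system data, BusinessData holds
-- recent business data; the READONLY archive filegroup is left out.
BACKUP DATABASE SalesDB
    FILEGROUP = N'PRIMARY',
    FILEGROUP = N'BusinessData'
TO DISK = N'C:\SQLBackups\SalesDB_Filegroups.bak'
WITH INIT, CHECKSUM;

-- Log backups are still needed so that files backed up at different
-- times can be brought to a consistent point during a restore
BACKUP LOG SalesDB
TO DISK = N'C:\SQLBackups\SalesDB_Log.trn'
WITH INIT, CHECKSUM;
```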
So, a partial full backup is akin to a full database backup, but omits all READONLY filegroups. Likewise, a partial differential backup is akin to a differential database backup, in that it only backs up data that has been modified since the base partial backup and, again, does not explicitly back up the READONLY filegroups within the database. Differential partial backups use the last partial backup as the base for any restore operations, so be sure to keep the base partial on hand. It is recommended to take frequent base partial backups to keep the differential partial backup file size small and manageable. Again, a good rule of thumb is to take a new base partial backup at least once per week, although possibly more frequently than that if the read/write filegroups are frequently modified. Finally, note that we can only perform partial backups via T-SQL. Neither SQL Server Management Studio nor the Maintenance Plan Wizard supports either type of partial backup.
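Since partial backups are T-SQL only, a short sketch may help. The READ_WRITE_FILEGROUPS keyword is the standard way to request a partial backup; the database name and paths are assumptions for illustration.

```sql
-- Hypothetical database name and backup paths
-- Base partial backup: the primary filegroup plus all read/write filegroups
BACKUP DATABASE SalesDB READ_WRITE_FILEGROUPS
TO DISK = N'C:\SQLBackups\SalesDB_Partial.bak'
WITH INIT, CHECKSUM;

-- Differential partial backup: changes since the base partial backup
BACKUP DATABASE SalesDB READ_WRITE_FILEGROUPS
TO DISK = N'C:\SQLBackups\SalesDB_PartialDiff.bak'
WITH DIFFERENTIAL, INIT, CHECKSUM;
```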
Recovery Models
A recovery model is a database configuration option, chosen when creating a new database, which determines whether or not you need to (or even can) back up the transaction log, how transaction activity is logged, and whether or not you can perform the more granular restore types, such as file and page restores. All SQL Server database backup and restore operations occur within the context of one of three available recovery models for that database.
• SIMPLE recovery model: certain operations can be minimally logged. Log backups are not supported. Point-in-time restore and page restore are not supported. File restore support is limited to secondary data files that are designated as READONLY.
• FULL recovery model: all operations are fully logged. Log backups are supported. All restore operations are supported, including point-in-time restore, page restore and file restore.
• BULK_LOGGED recovery model: similar to FULL except that certain bulk operations can be minimally logged. Support for restore operations is as for FULL, except that it's not possible to restore to a specific point in time within a log backup that contains log records relating to minimally logged operations.
Each model has its own set of requirements and caveats, so we need to choose the appropriate one for our needs, as it will dramatically affect the log file growth and level of recoverability. In general operation, a database will be using either the SIMPLE or FULL recovery model.
Can we restore just a single table?
Since we mentioned the granularity of page and file restores, the next logical question is whether we can restore individual tables. This is not possible with native SQL Server tools; you would have to restore an entire database in order to extract the required table or other object. However, certain third-party tools, including Red Gate's SQL Compare, do support object-level restores of many different object types, from native backups or from Red Gate SQL Backup files.
By default, any new database will inherit the recovery model of the model system database. In the majority of SQL Server editions, the model database will operate with the FULL recovery model, and so all new databases will also adopt use of this recovery model. This may be appropriate for the database in question, for example if it must support point-in-time restore. However, if this sort of support is not required, then it may be more appropriate to switch the database to SIMPLE recovery model after creation. This will remove the need to perform log maintenance in order to control the size of the log. Let's take a look at each of the three recovery models in a little more detail.
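As a quick sketch, the recovery model of the model database and of a user database can be checked, and changed, as follows. SalesDB is a hypothetical database name; sys.databases and ALTER DATABASE ... SET RECOVERY are standard.

```sql
-- Check the current recovery model of model and of a user database
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name IN (N'model', N'SalesDB');   -- SalesDB is hypothetical

-- Switch a database that does not need point-in-time restore to SIMPLE
ALTER DATABASE SalesDB SET RECOVERY SIMPLE;
```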
Simple
Of the three recovery models for a SQL Server database, SIMPLE recovery model databases are the easiest to manage. In the SIMPLE recovery model, we can take full database backups, differential backups and file backups. The one backup we cannot take, however, is the transaction log backup. As discussed earlier, in the Log space reuse section, whenever a CHECKPOINT operation occurs, the space in any inactive portions of the log file belonging to any database operating in SIMPLE recovery model becomes available for reuse. This space can be overwritten by new log records. The log file does not and cannot maintain a complete, unbroken series of log records since the last full (or differential) backup, which would be a requirement for any log backup to be used in a point-in-time restore operation, so a log backup would be essentially worthless and is a disallowed operation.
Truncation and the size of the transaction log
There is a misconception that truncating the log file means that log records are deleted and the file reduces in size. It does not; truncation of a log file is merely the act of making space available for reuse.
This process of making log space available for reuse is known as truncation, and databases using the SIMPLE recovery model are referred to as being in auto-truncate mode. In many respects, use of the SIMPLE recovery model greatly simplifies log file management. The log file is truncated automatically, so we don't have to worry about log file growth unless caused, for example, by some large and/or long-running batch operation. If a huge number of operations are run as a single batch, then the log file can grow in size rapidly, even for databases running in SIMPLE recovery (it's better to run a series of smaller batches).
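One way to see that truncation frees space inside the log without shrinking the file is to compare log usage before and after a CHECKPOINT. This is only a sketch, assuming a hypothetical SalesDB database running in SIMPLE recovery model; DBCC SQLPERF(LOGSPACE) reports the log size and the percentage in use for every database on the instance.

```sql
-- Log size and percentage used, for every database on the instance
DBCC SQLPERF(LOGSPACE);

USE SalesDB;    -- hypothetical database in SIMPLE recovery model
CHECKPOINT;     -- inactive log space becomes reusable

-- The log size should be unchanged; only the space in use can fall
DBCC SQLPERF(LOGSPACE);
```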
We also avoid the administrative burden of scheduling and testing the log backups, and the storage overhead required for all of the log backup files, as well as the CPU and disk I/O burden placed on the server while performing the log backups. The most obvious and significant limitation of working in SIMPLE model, however, is that we lose the ability to perform point-in-time restores. As discussed earlier, if the exposure to potential data loss in a given database needs to be measured in minutes rather than hours, then transaction log backups are essential and the SIMPLE model should be avoided for that database.
However, not every database in your environment needs this level of recoverability, and in such cases the SIMPLE model can be a perfectly sensible choice. For example, a Quality Assurance (QA) server is generally subject to a very strict change policy and if any changes are lost for some reason, they can easily be recovered by redeploying the relevant data and objects from a development server to the QA machine. As such, most QA servers can afford to operate in SIMPLE model. Likewise, if a database gets queried for information millions of times per day, but only receives new information, in a batch, once per night, then it probably makes sense to simply run in SIMPLE model and take a full backup immediately after each batch update.
Ultimately, the choice of recovery model is a business decision, based on tolerable levels of data loss, and one that needs to be made on a database-by-database basis. If the business requires full point-in-time recovery for a given database, SIMPLE model is not appropriate. However, neither is it appropriate to use FULL model for every database, and so take transaction log backups, "just in case," as it represents a considerable resource and administrative burden. If, for example, a database is read-heavy and a potential 12-hours' loss of data is considered bearable, then it may make sense to run in SIMPLE model and use midday differential backups to supplement nightly full backups.
Full
In FULL recovery model, all operations are fully logged in the transaction log file. This means all INSERT, UPDATE and DELETE operations, as well as the full details for all rows inserted during a bulk data load or index creation operations. Furthermore, unlike in SIMPLE model, the transaction log file is not auto-truncated during CHECKPOINT operations and so an unbroken series of log records can be captured in log backup files. As such, FULL recovery model supports restoring a database to any point in time within an available log backup and, assuming a tail log backup can be made, right up to the time of the last committed transaction before the failure occurred. If someone accidentally deletes some data at 2:30 p.m., and we have a full backup, plus valid log backups spanning the entire time from the full backup completion until 3:00 p.m., then we can restore the database to the point in time directly before that data was removed. We will be looking at performing point-in-time restores in Chapter 6, Log Restores, where we will focus on transaction log restoration. The other important point to reiterate here is that inactive VLFs are not truncated during a CHECKPOINT. The only action that can cause the log file to be truncated is to perform a backup of that log file; it is only once a log backup is completed that the inactive log records captured in that backup become eligible for truncation. This means that log backups play a vital dual role: firstly in allowing point-in-time recovery, and secondly in controlling the size of the transaction log file. In the FULL model, the log file will hold a full and complete record of the transactions performed against the database since the last time the transaction log was backed up. The more transactions your database is logging, the faster it will fill up. 
If your log file is not set to auto-grow (see Chapter 3 for further details), then this database will cease to function correctly at the point when no further space is available in the log. If auto-grow is enabled, the log file will grow and grow until either you take a transaction log backup or the disk runs out of space; I would recommend the first of these two options.
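To find out how a given log file is configured to grow, one option is to query sys.master_files, as in the sketch below. The database name is hypothetical; the columns used are standard, with size and growth stored as numbers of 8 KB pages unless is_percent_growth is set.

```sql
-- Auto-grow settings for the log file of a hypothetical database
SELECT DB_NAME(database_id)         AS database_name,
       name                         AS logical_file_name,
       CAST(size AS bigint) * 8     AS size_kb,
       CASE
           WHEN growth = 0 THEN 'auto-grow disabled'
           WHEN is_percent_growth = 1
               THEN CAST(growth AS varchar(20)) + ' percent'
           ELSE CAST(CAST(growth AS bigint) * 8 AS varchar(20)) + ' KB'
       END                          AS growth_setting,
       max_size                     -- -1 means grow until the disk is full
FROM sys.master_files
WHERE type_desc = N'LOG'
  AND database_id = DB_ID(N'SalesDB');
```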
In short, when operating in FULL recovery model, you must be taking transaction log backups to manage the growth of data in the transaction log; a full database backup does not cause the log file to be truncated. Once you take a transaction log backup, space in inactive VLFs will be made available for new transactions (except in rare cases where you specify a copy-only log backup, or use the NO_TRUNCATE option, which will not truncate the log).
Bulk Logged
The third, and least-frequently used, recovery model is BULK_LOGGED. It operates in a very similar manner to FULL model, except in the extent to which bulk operations are logged, and the implications this can have for point-in-time restores. All standard operations (INSERT, UPDATE, DELETE, and so on) are fully logged, just as they would be in the FULL recovery model, but many bulk operations, such as the following, will be minimally logged:
• bulk imports using the BCP utility
• BULK INSERT
• INSERT ... SELECT * FROM OPENROWSET(BULK ...)
• SELECT INTO
• inserting or appending data using WRITETEXT or UPDATETEXT
• index rebuilds (ALTER INDEX REBUILD).
In FULL recovery model, every change is fully logged. For example, if we were to use the BULK INSERT command to load several million records into a database operating in FULL recovery model, each of the INSERTs would be individually and fully logged. This puts a tremendous overhead onto the log file, using CPU and disk I/O to write each of the transaction records into the log file, and would also cause the log file to grow at a tremendous rate, slowing down the bulk load operation and possibly causing disk usage issues that could halt your entire operation.
In BULK_LOGGED model, SQL Server uses a bitmap image to capture only the extents that have been modified by the minimally logged operations. This keeps the space required to record these operations in the log to a minimum, while still (unlike in SIMPLE model) allowing backup of the log file, and use of those logs to restore the database in case of failure. Note, however, that the size of the log backup files will not be reduced, since SQL Server must copy into the log backup file all the actual extents (i.e. the data) that were modified by the bulk operation, as well as the transaction log records.
Tail log backups and minimally logged operations
If the data files are unavailable as a result of a database failure, and the tail of the log contains minimally logged operations recorded while the database was operating in BULK_LOGGED recovery model, then it will not be possible to do a tail log backup, as this would require access to the changed data extents in the data file.
The main drawback of switching to BULK_LOGGED model to perform bulk operations, and so ease the burden on the transaction log, is that it can affect your ability to perform point-in-time restores. The series of log records is always maintained but, if a log file contains details of minimally logged operations, it is not possible to restore to a specific point in time represented within that log file. It is only possible to restore the database to the point in time represented by the final transaction in that log file, or to a specific point in time in a previous, or subsequent, log file that does not contain any minimally logged transactions. We'll discuss this in a little more detail in Chapter 6, Log Restores.
There is a time and place for use of the BULK_LOGGED recovery model. It is not recommended that this model be used for the day-to-day operation of any of your databases. What is recommended is that you switch from FULL recovery model to BULK_LOGGED recovery model only when you are using bulk operations. After you have completed these operations, you can switch back to FULL recovery. You should make the switch in a way that minimizes your exposure to data loss; this means taking an extra log backup immediately before you switch to BULK_LOGGED, and then another one immediately after you switch the database back to FULL recovery.
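The recommended switch can be sketched as the sequence below. All object names, file paths and the data file are hypothetical, and the BULK INSERT is only a stand-in for whatever bulk operation you need to run; whether it is actually minimally logged depends on further conditions, such as the table's indexes and the use of TABLOCK.

```sql
-- 1. Log backup immediately before the switch (hypothetical names and paths)
BACKUP LOG SalesDB
TO DISK = N'C:\SQLBackups\SalesDB_Log_BeforeBulk.trn'
WITH INIT, CHECKSUM;

-- 2. Switch to BULK_LOGGED for the duration of the bulk operation
ALTER DATABASE SalesDB SET RECOVERY BULK_LOGGED;

-- 3. Run the bulk operation (hypothetical staging table and data file)
BULK INSERT SalesDB.dbo.SalesStaging
FROM 'C:\Import\sales.dat'
WITH (TABLOCK);

-- 4. Switch back to FULL recovery model
ALTER DATABASE SalesDB SET RECOVERY FULL;

-- 5. Log backup immediately after switching back
BACKUP LOG SalesDB
TO DISK = N'C:\SQLBackups\SalesDB_Log_AfterBulk.trn'
WITH INIT, CHECKSUM;
```

Only the log backup taken in step 5 contains minimally logged operations, so the window in which point-in-time restore is unavailable is kept as small as possible.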
Restoring Databases
Of course, the ultimate goal of our entire SQL Server backup strategy is to prepare ourselves for the, hopefully rare, cases where we need to respond quickly to an emergency situation, for example restoring a database over one that has been damaged, or creating a second copy of a database (see http://msdn.microsoft.com/en-us/library/ms190436.aspx) in order to retrieve some data that was accidentally lost from that database. In non-emergency scenarios, we may simply want to restore a copy of a database to a development or test server.
For a user database operating in FULL recovery model, we have the widest range of restore options available to us. As noted throughout the chapter, we can take transaction log backups and use them, in conjunction with full and differential backups, to restore a database to a specific point within a log file. In fact, the RESTORE LOG command supports several different ways to do this. We can:
• recover to a specific point in time: we can stop the recovery at a specific point in time within a log backup file, recovering the database to the point it was in when the last transaction committed, before the specified STOPAT time
• recover to a marked transaction: if a log backup file contains a marked transaction (defined using BEGIN TRAN TransactionName WITH MARK 'Description') then we can recover the database to the point that this transaction starts (STOPBEFOREMARK) or completes (STOPATMARK)
• recover to a Log Sequence Number: stop the recovery at a specific log record, identified by its LSN (see Chapter 6, Log Restores).
We'll cover several examples of the first option (which is by far the most common) in this book. In addition, we can perform more "granular" restores. For example, in certain cases, we can recover a database by restoring only a single data file (plus transaction logs), rather than the whole database. We'll cover these options in Chapters 9 and 10.
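As a rough preview of the first option, a point-in-time restore might look like the sketch below. The database name, file paths and timestamp are all invented for illustration, and the sequence assumes a single full backup followed by a single log backup; the worked examples in later chapters are the ones to follow.

```sql
-- Restore the full backup but leave the database in the restoring state
-- so that log backups can still be applied (hypothetical names and paths)
RESTORE DATABASE SalesDB
FROM DISK = N'C:\SQLBackups\SalesDB_Full.bak'
WITH NORECOVERY, REPLACE;   -- REPLACE overwrites the existing database

-- Roll forward through the log backup, stopping just before the mistake
RESTORE LOG SalesDB
FROM DISK = N'C:\SQLBackups\SalesDB_Log.trn'
WITH STOPAT = N'2012-06-01T14:29:00',   -- hypothetical point in time
     RECOVERY;                          -- bring the database online

-- For a marked transaction, the last step would instead use, for example:
--   WITH STOPBEFOREMARK = N'TransactionName', RECOVERY;
```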
For databases in BULK_LOGGED model, we have similar restore options, except that none of the point-in-time restore options listed previously can be applied to a log file that contains minimally logged transactions. For SIMPLE recovery model databases, our restore options are more limited. In the main, we'll be performing straightforward restores of the full and differential database backup files. In many cases, certainly for a development database, for example, and possibly for other "non-frontline" systems, this will be perfectly adequate, and will greatly simplify, and reduce the time required for, the backup and restore strategies for these databases. Finally, there are a couple of important "special restore" scenarios that we may run into from time to time. Firstly, we may need to restore one of the system databases. Secondly, if only a single data page is damaged, it may be possible to perform a page restore, rather than restoring the whole database.
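For a SIMPLE recovery model database, the "straightforward" restore mentioned above usually amounts to a full restore followed by the most recent differential. A minimal sketch, with a hypothetical database name and file paths:

```sql
-- Restore the base full backup, leaving the database ready for more restores
RESTORE DATABASE SalesDB
FROM DISK = N'C:\SQLBackups\SalesDB_Full.bak'
WITH NORECOVERY, REPLACE;

-- Apply the most recent differential and bring the database online
RESTORE DATABASE SalesDB
FROM DISK = N'C:\SQLBackups\SalesDB_Diff.bak'
WITH RECOVERY;
```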
master after a significant RDBMS update (such as a major Service Pack). If you find yourself in a situation where your master database has to be rebuilt, as in the case where you do not have a backup, you would also be rebuilding the msdb and model databases, unless you had good backups of msdb and model, in which case you could simply restore them. The msdb database contains SQL Agent jobs, schedules and operators as well as historical data regarding the backup and restore operations for databases on that instance. A full backup of this database should be taken whenever the database is updated. That way, if a SQL Server Agent Job is deleted by accident, and no other changes have been made, you can simply restore the msdb database and regain that job information. Finally, the model database is a "template" database for an instance; all user databases created on that instance will inherit configuration settings, such as recovery model, initial file sizes, file growth settings, collation settings and so on, from those stipulated for the model database. By default, this database operates in the FULL recovery model. It should rarely be modified, but will need a full backup whenever it is updated. Personally, I like to back it up on a similar rotation to the other system databases, so that it doesn't get overlooked. We'll walk through examples of how to restore the master and the msdb databases in Chapter 4, Restoring From Full Backup.
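Backing up the system databases uses the same BACKUP DATABASE command as for user databases. The sketch below assumes a hypothetical backup folder, and finishes with a query against the backup history that msdb maintains (in msdb.dbo.backupset, type 'D' is a full backup, 'I' a differential and 'L' a log backup).

```sql
-- Full backups of the system databases (hypothetical backup path)
BACKUP DATABASE master
TO DISK = N'C:\SQLBackups\master_Full.bak' WITH INIT, CHECKSUM;

BACKUP DATABASE msdb
TO DISK = N'C:\SQLBackups\msdb_Full.bak' WITH INIT, CHECKSUM;

BACKUP DATABASE model
TO DISK = N'C:\SQLBackups\model_Full.bak' WITH INIT, CHECKSUM;

-- Most recent backups recorded in the msdb backup history
SELECT TOP (10)
       database_name, type, backup_start_date, backup_finish_date
FROM msdb.dbo.backupset
ORDER BY backup_finish_date DESC;
```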
Summary
In this chapter, we've covered a lot of necessary ground, discussing the files that comprise a SQL Server database, the critical role played by each, why it's essential that they are backed up, the types of backup that can be performed on each, and how this is impacted by the recovery model chosen for the database. We're now ready to start planning, verifying and documenting our whole backup strategy, answering questions such as:
• Where will the backups be stored?
• What tools will be used to take the backups?
• How do I plan and implement an appropriate backup strategy for each database?
• How do I verify that the backups are "good"?
• What documentation do I need?
To find out the answers to these questions, and more, move on to Chapter 2.
Backup Storage
Hopefully, the previous chapter impressed on you the need to take database backups for all user and system databases, and transaction log backups for any user databases that are not operating in SIMPLE recovery model. One of our basic goals, as a DBA, is to create an environment where these backups are stored safely and securely, and where the required backup operations are going to run as smoothly and quickly as possible.
Chapter 2: Planning, Storage and Documentation
The single biggest factor in ensuring that this can be achieved (alongside such issues as careful backup scheduling, which we'll discuss later in the chapter) is your backup storage architecture. In the examples in this book, we back up our databases to the same disk drive that stores the live data and log files; of course, this is purely a convenience, designed to make the examples easy to practice on your laptop. In reality, we'd never back up to the same local disk; after all, if you simply store the backups on the same drive as the live files, and that drive becomes corrupt, then not only have you lost the live files, but the backup files too!
There are three basic options, which we'll discuss in turn: local disk storage, network storage, and tape storage. Each of these media types has its pros and cons so, ultimately, it is a matter of preference which you use and how. In many cases, a mixture of all three may be the best solution for your environment. For example, you might adopt the scheme below.
1. Back up the data and log files to local disk storage: either Direct Attached Storage (DAS) or a Storage Area Network (SAN). In either case, the disks should be in a RAID configuration. This will be quicker for the backup to complete, but you want to make sure your backups are being moved immediately to a separate location, so that a server crash won't affect your ability to restore those files.
2. Copy the backup files to a redundant network storage location: again, this space should be driven by some sort of robust storage solution such as SAN, or a RAID of local physical drives. This will take a bit longer than the local drive because of network overhead, but you are certain that the backups are in a separate/secure location in case of emergency.
3. Copy the files from the final location to a tape backup library for storage in an offsite facility. I recommend keeping the files on disk for at least three days for daily backups and a full seven days for weekly (or until the next weekly backup has been taken). If you need files older than that, you can retrieve them from the tape library.
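The first two steps of this scheme might be sketched as follows. The drive letter, server name and share are hypothetical, and the copy itself happens outside T-SQL (using whatever file copy tool or job you prefer); the point of the sketch is simply that a backup taken WITH CHECKSUM can be checked again, with RESTORE VERIFYONLY, once it has arrived at the network location.

```sql
-- Step 1: back up to local (DAS or SAN) storage, recording page checksums
BACKUP DATABASE SalesDB
TO DISK = N'D:\SQLBackups\SalesDB_Full.bak'   -- hypothetical local path
WITH INIT, CHECKSUM;

-- Step 2 happens outside T-SQL: copy the file to the network share.

-- Afterwards, confirm that the copied file is complete and readable
RESTORE VERIFYONLY
FROM DISK = N'\\BackupServer\SQLBackups\SalesDB_Full.bak'   -- hypothetical share
WITH CHECKSUM;
```

RESTORE VERIFYONLY checks that the backup set is complete and readable, and with CHECKSUM it also revalidates the checksums; it is not a substitute for a real test restore.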
The reason I offer the option to write the backups to local storage initially, instead of straight to network storage, is that it avoids the bottleneck of pushing data through the network. Generally speaking, it's possible to get faster write speeds, and so faster backups, to a local drive than to a drive mapped from another network device, or through a drive space shared out through a distributed file system (DFS). However, with storage networks becoming ever faster, it is becoming increasingly viable to skip Step 1, and back up the data and log files directly to network storage. Whether you write first to locally attached storage, or straight to a network share, you'll want that disk storage to be as fast and efficient as possible, and this means that we want to write, not to a single disk, but to a RAID unit, provided either as DAS, or by a SAN. We also want, wherever possible, to use dedicated backup storage. For example, if a particular drive on a file server, attached from the SAN, is designated as the destination for our SQL Server backup files, we don't want any other process storing their data in that location, competing with our backups for space and disk I/O.
These RAID-configured disks are made available to SQL Server either as part of Directly Attached Storage, where the disks (which could be SATA, SCSI or SAS) are built into the server or housed in external expansion bays that are attached to the server using a RAID controller, or as a Storage Area Network. In layman's terms, a SAN is a big box of hard drives, available via a dedicated, high performance network, with a controller that instructs individual "volumes of data" known as Logical Unit Numbers (LUNs) to interact with certain computers. These LUNs appear as local drives to the operating system and SQL Server. Generally, the files for many databases will be stored on a single SAN.
RAID configuration
The RAID technology allows a collection of disks to perform as one. For our data and log files RAID, depending on the exact RAID configuration, can offer some or all of the advantages below.
• Redundancy: if one of the drives happens to go bad, we know that, depending on the RAID configuration, either the data on that drive will have been mirrored to a second drive, or it will be able to be reconstructed, and so will still be accessible, while the damaged drive is replaced.
• Improved read and write I/O performance: reading from and writing to multiple disk spindles in a RAID array can dramatically increase I/O performance, compared to reading and writing from a single (larger) disk.
• Higher storage capacity: by combining multiple smaller disks in a RAID array, we overcome the single-disk capacity limitations (while also improving I/O performance).
For our data files we would, broadly speaking, want a configuration optimized for maximum read performance and, for our log file, maximum write performance. For backup files, the simplest backup storage, if you're using DAS, may just be a separate, single physical drive.
However, of course, if that drive were to become corrupted, we would lose the backup files on that drive, and there isn't much to be done, beyond sending the drive to a recovery company, which can be both time consuming and expensive, with no guarantee of success. Therefore, for backup files it's just as important to take advantage of the redundancy advantages offered by RAID storage. Let's take just a brief look at the more popular of the available RAID configurations, as each one provides different levels of protection and performance.
RAID 0 (striping)
This level of RAID is the simplest and provides only performance benefits. A RAID 0 configuration uses multiple disk drives and stripes the data across these disks. Striping is simply a method of distributing data across multiple disks, whereby each block of data is written to the next disk in the stripe set. This also means that I/O requests to read and write data will be distributed across multiple disks, so improving performance. There is, however, a major drawback in a RAID 0 configuration. There is no fault tolerance in a RAID 0 setup. If one of the disks in the array is lost, for some reason, the entire array will break and the data will become unusable.
RAID 1 (mirroring)
In a RAID 1 configuration we use multiple disks and write the same data to each disk in the array. This is called mirroring. This configuration offers read performance benefits (since the data could be read from multiple disks) but no write benefits, since the write speed is still limited to the speed of a single disk. However, since each disk in the array has a mirror image (containing the exact same data) RAID 1 does provide redundancy and fault tolerance. One drive in the mirror set can be lost without losing data, or that data becoming inaccessible. As long as one of the disks in the mirror stays online, a RAID 1 system will remain in working order but will take a hit in read performance while one of the disks is offline.
RAID 10 gives us the performance benefits of data striping, allowing us to read and write data faster than single drive applications. RAID 10 also gives us the added security that losing a single drive will not bring our entire disk array down. In fact, with RAID 10, as long as at least one of the mirrored drives from each set is still online, it's possible that more than one disk can be lost while the array remains online with all data accessible. However, loss of both drives from any one mirrored set will result in a hard failure. With RAID 10 we get excellent write performance, since we have redundancy without the need to deal with parity data. However, read performance will generally be lower than a RAID 5 configuration with the same number of disks, since data can be read simultaneously from only half the disks in the array.
For the added cost and complexity of SAN storage, you have access to storage space far in excess of what a traditional DAS system could offer. This space is easily expandable (up to the SAN limit) simply by adding more disk array enclosures (DAEs), and doesn't take up any room in the physical server. Multiple database servers can share a single SAN, and most SANs offer many additional features (multiple RAID configurations, dynamic snapshots, and so on). SAN storage is typically provided over a fiber optic network that is separated from your other network traffic in order to minimize any network performance or latency issues; you don't have to worry about any other type of network activity interfering with your disks.
My experience suggests that advances in controller architecture along with increases in disk speed and cache storage have "leveled the playing field." In other words, don't worry overly if your read-heavy database is on RAID 10, or a reasonably write-heavy database is on RAID 5; chances are it will still perform reliably, and well.
Network device
The last option for storing our backup files is the network device. Having each server backing up to a separate folder on a network drive is a great way to organize all of the backups in one convenient location, which also happens to be easily accessible when dumping those files to tape media for offsite storage. We don't really care what form this network storage takes, as long as it is as stable and fault tolerant as possible, which basically means RAID storage. We can achieve this via specialized Network Attached Storage (NAS), or simply a file server, backed by physical disks or SAN-attached space. However, as discussed earlier, backing up directly to a network storage device, across a highly utilized network, can lead to latency and network outage problems. That's why I generally still recommend backing up to direct storage (DAS or SAN) and then copying the completed backup files to the network storage device. A good solution is to use a scheduled job, schedulable utility or, in some cases, a third-party backup tool to back up the databases to a local drive and then copy the results to a network share. This way, we only have to worry about latency issues when copying the backup file, and at this stage we don't put any additional load on the SQL Server service; if a file copy fails, we just restart it. Plus, with utilities such as robocopy, we have the additional safety net of knowing the copy will automatically restart if any outage occurs.
Tape
Firstly, I will state that tape backups should be part of any SQL Server backup and recovery plan and, secondly, that I have never in my career backed up a database or log file directly to tape. The scheme to go for is to back up to disk, and then archive to tape. There are several reasons to avoid backing up directly to tape, the primary one being that writing to tape is slow, with the added complication that the tape is likely to be attached via some sort of relatively high-latency network device. This is a big issue when dealing with backup processes, especially for large databases. If we have a network issue after our backup is 90% completed, we have wasted a lot of time and resources that are going to have to be used again. Writing to a modern, single, physical disk, or to a RAID device, will be much faster than writing to tape. Some years ago, tape storage might still have had a cost advantage over disk but, since disk space has become relatively cheap, the cost of LTO-4 tape media storage is about the same as comparable disk storage. Finally, when backing up directly to tape, we'll need to support a very large tape library in order to handle the daily, weekly and monthly backups. Someone is going to have to swap out, label and manage those tapes and that is a duty that most DBAs either do not have time for, or are just not experienced enough to do. Losing, damaging, or overwriting a tape by mistake could cost you your job. Hopefully, I've convinced you not to take SQL Server backups directly to tape, and instead to use some sort of physical disk for initial storage. However, and despite their other shortcomings, tape backups certainly do have a role to play in most SQL Server recovery plans. The major benefit to tape media is portability. Tapes are small, take up relatively little space and so are ideal for offsite storage. Tape backup is the last and best line of defense for you and your data. 
There will come a time when a restore operation relies on a backup file that is many months old, and you will be glad you have a copy stored on tape somewhere.
With tape backups stored offsite, we also have the security of knowing that we can recover that data even in the event of a total loss of our onsite server infrastructure. In locations where the threat of natural disasters is very real and very dangerous, offsite storage is essential (I have direct experience of this, living in Florida). Without it, one hurricane, flood, or tornado can wipe out all the hard work everyone put into backing up your database files. Storing backup file archives on tape, in a secure and structurally reinforced location, can mean the difference between being back online in a matter of hours and not getting back online at all. Most DBA teams let their server administration teams handle the task of copying backups to tape; I know mine does. The server admins will probably already be copying other important system backups to tape, so there is no reason they cannot also point a backup-to-tape process at the disk location of your database and log backup files. Finally, there is the prosaic matter of who handles all the arrangements for the physical offsite storage of the tapes. Some smaller companies can handle this in-house, but I recommend that you let a third-party company that specializes in data archiving handle the long-term secure storage for you.
Backup Tools
Having discussed the storage options for our data, log, and backup files, the next step is to configure and schedule the SQL Server backup jobs. There are several tools available to do this, and we'll consider the following:
• maintenance plans: the simplest, but also the most limited solution, offering ease of use but limited options, and lacking flexibility
• custom backup scripts: offer full control over how your backup jobs execute, but require considerable time to build, implement, and maintain
• third-party backup tools: many third-party vendors offer powerful, highly configurable backup tools that offer backup compression and encryption, as well as well-designed interfaces for ease of scheduling and monitoring.
All environments are different and the choice you make must be dictated by your specific needs. The goal is to get your databases backed up, so whichever one you decide on and use consistently is going to be the right choice.
Under the covers, maintenance plans are simply SSIS packages that define a number of maintenance tasks, and are scheduled for execution via SQL Server Agent jobs. The Maintenance Plan Wizard and Designer make it easy to build these packages, while removing a lot of the power and flexibility available when writing such packages directly in SSIS. For maintenance tasks of any reasonable complexity, it is better to use Business Intelligence Development Studio to design the maintenance packages that suit your specific environment and schedule them through SQL Server Agent. It may not be a traditional maintenance plan in the same sense as one that the wizard would have built, but it is a maintenance package nonetheless.
If a script runs on three servers, this is no big deal; just update the code on each server and carry on. What if it must run on 40 servers? Now, every minor improvement to the backup script, or bug fix, will entail a major effort to ensure that this is reflected consistently on all servers. In such cases, we need a way to centralize and distribute the code so that we have consistency throughout the enterprise, and a quick and repeatable way to make updates to each machine, as needed. Many DBA teams maintain on each server a "DBA database" that holds the stored procedures for all sorts of maintenance tasks, such as backups, index maintenance and more. For example, my team maintains a "master" maintenance script, which will create this database on a server, or update the objects within the database if the version on the server is older than what exists in our code repository. Whenever the script is modified, we have a custom .NET tool that will run the script on every machine, and automatically upgrade all of the maintenance code.
Third-party tools
The final option available to the DBA is to create and schedule the backup jobs using a third-party tool. Several major vendors supply backup tools but the one used in my team, and in this book, is Red Gate SQL Backup (www.red-gate.com/products/dba/sql-backup/). Details of how to install this tool can be found in Appendix A, and backup examples can be found in Chapter 8. With SQL Backup, we can create a backup job just as easily as we could with the maintenance plan wizard, and with a lot more flexibility. We can create SQL Server Agent jobs that take full, differential, transaction log or file and filegroup backups from a GUI wizard. We can set up a custom schedule for the backups. We can configure numerous options for the backup files, including location, retention, dynamic naming convention, compression, encryption, network resilience, and more.
Be aware, though, that a tool that offers flexibility and ease of use can lead down the road of complex backup jobs. Modifying individual steps within such jobs requires T-SQL proficiency or, alternatively, you'll need to simply drop the job and build it again from scratch (of course, a similar argument applies to custom scripts and maintenance plan jobs).
So how do we get started, when devising an appropriate backup and restore plan for a new database? As DBAs, we'd ideally be intimately familiar with the inner and outer workings of every database that is in our care. However, this is not always feasible. Some DBAs administer too many servers to know exactly what is on each one, or even what sort of data they contain. In such cases, a quick 15-minute discussion with the application owner can provide a great deal of insight into the sort of database that we are dealing with. Over the coming sections, we'll take a look at factors that affect each side of the backup-restore coin, and the sorts of questions we need to ask of our database owners and users.
Backup requirements
The overriding criterion in determining the types of backup we need to take, and how frequently, is the maximum tolerable data loss for a particular database. However, there are a few other factors to consider as well.
• On what level of server does this database reside? For example, is it a development box, a production server or a QA machine? We may be able to handle losing a week's worth of development changes, but losing a week's worth of production data changes could cost someone their job, especially if the database supports a business-critical, front-line application.
• Do we need to back up this database at all? Not all data loss is career-ending and, as a DBA, you will run into plenty of situations where a database doesn't need to be backed up at all. For example, you may have a development system that gets refreshed with production data on a set schedule. If you and your development team are comfortable not taking backups of data that is refreshed every few days anyway, then go for it. Unless there is a good reason to do so (perhaps the data is heavily modified after each refresh) then you don't need to waste resources on taking backups of databases that are just copies of data from another server, which does have backups being taken.
• How much data loss is acceptable? Assuming this is a system that has limits on its toleration of data loss, then this question will determine the need to take supplemental backups (transaction log, differential) in addition to full database (or file) backups, and the frequency at which they need to be taken. Now, the application owner needs to be reasonable here. If they state that they cannot tolerate any down-time, and cannot lose any data at all, then this implies the need for a very high availability solution for that database, and a very rigorous backup regime, both of which are going to cost a lot of design, implementation and administrative effort, as well as a lot of money. If they offer more reasonable numbers, such as one hour's potential data loss, then this is something that can be supported as part of a normal backup regime, taking hourly transaction log backups. Do we need to take these hourly log backups all day, every day, though? Perhaps, yes, but it really depends on the answer to the next question.
• At what times of the day is this database heavily used? What we're trying to find out here is when, and how often, backups need to be taken. Full backups of large databases should be carried out at times when the database is least used, and supplemental backups need to be fitted in around our full backup schedule. We'll need to start our log backup schedules well in advance of the normal database use schedules, in order to capture the details of all data changes, and end them after the traffic subsides. Alternatively, we may need to run these log file backups all day, which is not a bad idea, since then we will never have large gaps in time between the transaction log backup files.
• What are the basic characteristics of the database? Here, we're interested in other details that may impact backup logistics. We'll want to find out, for example:
  • How much data is stored: the size of the database will impact backup times, backup scheduling (to avoid normal database operations being unduly affected by backups), amount of storage space required, and so on.
  • How quickly data is likely to grow: this may impact backup frequency and scheduling, since we'll want to control log file size, as well as support data loss requirements. We will also want to plan for the future data growth to make sure our SQL Server backup space doesn't get eaten up.
  • The nature of the workload: is it OLTP, with a high number of both reads and writes? Or mainly read-only? Or mixed and, if so, are certain tables exclusively read-only?
Planning for backup and restore starts, ideally, right at the very beginning of a database's life, when we are planning to create it, and define the data and log file properties and architecture for the new database. Answers to questions such as these will not only help define our backup requirements, but also the appropriate file architecture (number of filegroups and data files) for the database, initial file sizes and growth characteristics, as well as the required hardware capacity and configuration (e.g. RAID level).
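For a database that already exists and has a backup history, one rough way to answer the size and growth questions is to look at how the size of its full backups has changed over time. The query below is only a sketch: it reads the backup history that SQL Server records in msdb.dbo.backupset, and the database name salesDB is a placeholder chosen for illustration.

```sql
-- Sketch: monthly trend of full backup sizes for one database,
-- read from the backup history that SQL Server keeps in msdb.
-- N'salesDB' is a placeholder database name.
SELECT  bs.database_name ,
        YEAR(bs.backup_start_date) AS backup_year ,
        MONTH(bs.backup_start_date) AS backup_month ,
        COUNT(*) AS full_backups ,
        AVG(bs.backup_size) / 1048576.0 AS avg_full_backup_mb  -- bytes to MB
FROM    msdb.dbo.backupset AS bs
WHERE   bs.type = 'D'                      -- 'D' = full database backup
        AND bs.database_name = N'salesDB'
GROUP BY bs.database_name ,
        YEAR(bs.backup_start_date) ,
        MONTH(bs.backup_start_date)
ORDER BY backup_year ,
        backup_month;
```

The result is only as good as the history retained in msdb; if old history rows are purged regularly, the trend will cover a correspondingly short period.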
Data and log file sizing and growth
We'll discuss this in more detail in Chapter 3, but it's worth noting that the initial size, and subsequent auto-growth characteristics, of the data and log files will be inherited from the properties of the model database for that instance, and there is a strong chance that these will not be appropriate for your database.
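To see what a new database on a given instance would inherit, we can inspect the file settings of the model database. This is a minimal sketch using the sys.master_files catalog view; the column aliases are my own choice.

```sql
-- Sketch: show the data and log file settings of the model database,
-- which a new database inherits unless CREATE DATABASE overrides them.
SELECT  mf.name AS logical_name ,
        mf.type_desc ,                          -- ROWS (data) or LOG
        mf.size * 8 / 1024 AS size_mb ,         -- size is stored in 8 KB pages
        CASE WHEN mf.is_percent_growth = 1
             THEN CAST(mf.growth AS VARCHAR(10)) + ' %'
             ELSE CAST(mf.growth * 8 / 1024 AS VARCHAR(10)) + ' MB'
        END AS auto_growth
FROM    sys.master_files AS mf
WHERE   mf.database_id = DB_ID(N'model');
```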
Restore requirements
As well as ensuring we have the appropriate backups, we need a plan in place that will allow us to perform "crash recovery." In other words, we need to be sure that we can restore our backups in such a way that we meet the data loss requirements, and complete the operation within an acceptable period of down-time. An acceptable recovery time will vary from database to database depending on a number of factors, including:
• the size of the database: much as we would like to, we cannot magically restore a full backup of a 500 GB database, and have it back online with all data recovered in 15 minutes
• where the backup files are stored: if they are on site, we only need to account for the time needed for the restore and recovery process; if the files are in offsite storage, we'll need to plan extra time to retrieve them first
• the complexity of the restore process: if we just need to restore a full database backup, this is a fairly straightforward process; but if we need to perform a complex point-in-time restore involving full, differential, and log files, this is more complex and may require more time, or at least we'll need to practice this type of restore more often, to ensure that all members of our team can complete it successfully and within the required time.
With regard to backup file locations, an important, related question to ask your database owners is something along the lines of: How quickly, in general, will problems be reported? That may sound like a strange question, but its intent is to find out how long it's necessary to retain database backup files on network storage before deleting them to make room for more backup files (after, of course, transferring these files to tape for offsite storage).
The location of the backup file will often affect how quickly a problem is solved. For example, let's say the company policy is to keep backup files on site for three days, then archive them to tape, in offsite storage. If a data loss occurs and the error is caught quickly, the necessary files will be at hand. If, however, it's only spotted five days later, the process of getting files back from the offsite tape backups will push the recovery time out, and this extra time should be clearly accounted for in the SLA. This will save the headache of having to politely explain to an angry manager why a database or missing data is not yet back online.
An SLA template
Having asked all of these questions, and more, it's time to draft the SLA for that database. This document is a formal agreement regarding the backup regime that is appropriate for that database, and also offers a form of insurance to both the owners and the DBA. You do not, as a DBA, want to be in a position of having a database owner demanding to know why you can't perform a log restore to get a database back how it was an hour before it went down, when you know that they told you that only weekly full backups were required for that database, but you have no documented proof. Figure 2-1 offers an SLA template, which will hopefully provide a good starting point for your Backup SLA contract. It might not have everything you need for your environment, but you can download the template from the supplemental material and modify it, or just create your own from scratch.
Database Name: salesDB
Data Loss: 2 Hours
Recovery Time: 4 Hours
Full Backups: Daily / Weekly / Monthly
Diff Backups: Daily / Weekly
Log Backups: Daily @ _______ Hour Intervals
File Backups: Daily / Weekly / Monthly
File Differentials: Daily / Weekly

Database Name: salesArchives
Data Loss: 4 Hours
Recovery Time: 6 Hours
Full Backups: Daily / Weekly / Monthly
Diff Backups: Daily / Weekly
Log Backups: Daily @ _______ Hour Intervals
File Backups: Daily / Weekly / Monthly
File Differentials: Daily / Weekly

Database Name: resourceDB
Data Loss: 2 Hours
Recovery Time: 3 Hours
Full Backups: Daily / Weekly / Monthly
Diff Backups: Daily / Weekly
Log Backups: Daily @ _______ Hour Intervals
File Backups: Daily / Weekly / Monthly
File Differentials: Daily / Weekly

Database Administrator:
Application Owner:
Date of Agreement:
Figure 2-1:
2. Perform a full weekly database backup for the VLDB, for example on Sunday night.
3. Perform a differential database backup for the VLDB on the nights where you do not take the full database backups. In this example, we would perform these backups on Monday through Saturday night.
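The weekly full and nightly differential steps might look something like the sketch below. The database name VLDB and the D:\SQLBackups path are placeholders, and a real job would normally build a date-stamped file name rather than overwriting one file with INIT.

```sql
-- Sunday night: full database backup (the base for the week's differentials).
-- [VLDB] and the backup path are placeholders for illustration.
BACKUP DATABASE [VLDB]
TO DISK = N'D:\SQLBackups\VLDB_Full.bak'
WITH INIT, CHECKSUM;

-- Monday to Saturday night: differential backup, containing only the
-- data changed since the most recent full backup.
BACKUP DATABASE [VLDB]
TO DISK = N'D:\SQLBackups\VLDB_Diff.bak'
WITH DIFFERENTIAL, INIT, CHECKSUM;
```

Each differential is based on the last full backup, so a restore needs only the Sunday full backup plus the most recent differential.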
Scenario 2: Production server, 3 databases, simple file architecture, 2 hours' data loss
In the second scenario, we have a production server containing three actively-used databases. The application owner informs us that no more than two hours of data loss can be tolerated, in the event of corruption or any other disaster. None of the databases are complex structurally, each containing just one data file and one log file. With each database operating in FULL recovery model, an appropriate backup scheme might be as below.
1. Perform full nightly database backups for every database (plus the system databases).
2. Perform log backups on the user databases every 2 hours, on a schedule starting after the full backups are complete and ending before the full backup job starts.
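As a sketch of the two commands behind this scheme, for one of the three user databases (salesDB and the backup path are placeholder names, and the scheduling itself would be handled by whichever backup tool is in use):

```sql
-- Nightly: full database backup. The database must be in FULL (or
-- BULK_LOGGED) recovery model for the log backups below to be possible.
BACKUP DATABASE [salesDB]
TO DISK = N'D:\SQLBackups\salesDB_Full.bak'
WITH INIT, CHECKSUM;

-- Every 2 hours between full backups: transaction log backup.
-- NOINIT appends each log backup to the same file; writing one file per
-- log backup is an equally valid choice.
BACKUP LOG [salesDB]
TO DISK = N'D:\SQLBackups\salesDB_Log.trn'
WITH NOINIT, CHECKSUM;
```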
Scenario 3: Production server, 3 databases, complex file architecture, 1 hour's data loss
In this final scenario, we have a production database system that contains three databases with complex data structures. Each database comprises multiple data files split into two filegroups, one read-only and one writable. The read-only filegroup is updated once per week with newly archived records. The writable filegroups have an acceptable data loss of 1 hour. Most database activity on this server will take place during the day. With the database operating in FULL recovery model, the backup scheme below might work well.
1. Perform nightly full database backups for all system databases.
2. Perform a weekly full file backup of the read-only filegroups on each user database, after the archived data has been loaded.
3. Perform nightly full file backups of the writable filegroups on each user database.
4. Perform hourly log backups for each user database; the log backup schedule should start after the nightly full file backups are complete, and finish one hour before the full file backup processes start again.
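A rough sketch of steps 2 to 4 for one user database follows. The database name salesDB, the filegroup names Archive (read-only) and Active (writable), and the backup paths are all assumptions made for illustration.

```sql
-- Weekly, after the archive load: back up the read-only filegroup.
BACKUP DATABASE [salesDB]
    FILEGROUP = N'Archive'
TO DISK = N'D:\SQLBackups\salesDB_FG_Archive.bak'
WITH INIT, CHECKSUM;

-- Nightly: back up the writable filegroup.
BACKUP DATABASE [salesDB]
    FILEGROUP = N'Active'
TO DISK = N'D:\SQLBackups\salesDB_FG_Active.bak'
WITH INIT, CHECKSUM;

-- Hourly during the day: transaction log backup, which is what allows
-- the separately restored filegroups to be brought to a consistent point.
BACKUP LOG [salesDB]
TO DISK = N'D:\SQLBackups\salesDB_Log.trn'
WITH NOINIT, CHECKSUM;
```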
Backup scheduling
It can be a tricky process to organize the backup schedule such that all the backups that are required to support the Backup and Restore SLA fit into the available maintenance windows, don't overlap, and don't cause undue stress on the server. Full database and file backups, especially of large databases, can be CPU- and disk I/O-intensive processes, and so have the propensity to cause disruption, if they are run at times when the database is operating under its normal business workload. Ideally, we need to schedule these backups to run at times when the database is not being accessed, or at least is operating under greatly reduced load, since we don't want our backups to suffer because they are fighting with other database processes or other loads on the system (and vice versa). This is especially true when using compressed backups, since a lot of the load that would be done on disk is moved to the CPU in the compression phase. Midnight is usually a popular time to run large backup jobs, and if your shop consists of just a few machines, by all means schedule all your full nightly backups to run at this time. However, if you administer 20, 30, or more servers, then you may want to consider staggering your backups throughout the night, to avoid any possible disk or network contention issues. This is especially true when backing up directly to a network storage device. These devices are very robust and can perform a lot of operations per second, but there is a limit to how much traffic any device can handle. By staggering the backup jobs, you can help alleviate any network congestion.
The scheduling of differential backups will vary widely, depending on their role in the backup strategy. For a VLDB, we may be taking differential backups every night, except for on the night of the weekly full backup. At other times, we may run differential backups at random times during the day, for example as a way to safeguard data before performing a large modification or update. Transaction log backups are, in general, much less CPU- and I/O-intensive operations and can be safely run during the day, alongside the normal database workload. In fact, there isn't much point having a transactional backup of your database if no one is actually performing any transactions! The scheduling of log backups may be entirely dictated by the agreed SLA; if no more than five minutes of data can be lost, then take log backups every five minutes! If there is some flexibility, then try to schedule consecutive backups close enough so that the log file does not grow too much between backups, but far enough apart that it does not put undue stress on the server and hardware. As a general rule, don't take log backups much more frequently than is necessary to satisfy the SLA. Remember, the more log backups you take, the more chance there is that one will fail and possibly break your log chain. However, what happens when you have two databases on a server that both require log backups to be taken, but at different intervals? For example, Database A requires a 30-minute schedule, and Database B, a 60-minute schedule. You have two choices:
1. create two separate log backup jobs, one for DB_A running every 30 minutes and one for DB_B, every 60 minutes; this means multiple SQL Agent / scheduled jobs and each job brings with it a little more maintenance and management workload
2. take log backups of both databases using a single job that runs every 30 minutes; you'll have fewer jobs to schedule and run, but more log backup files to manage, heightening the risk of a file being lost or corrupted.
My advice in this case would be to create one log backup job, taking log backups every 30 minutes; it satisfies the SLA for both databases and is simpler to manage. The slight downside is that the time between log backups for databases other than the first one in the list might be slightly longer than 30 minutes, since the log backup for a given database in the queue can't start till the previous one finishes. However, since the backups are frequent and so the backup times short, any discrepancy is likely to be very small.
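The single-job approach could be sketched as one T-SQL job step that works through the databases in turn. This is an illustration under stated assumptions rather than a finished script: DB_A and DB_B are the example names from the text, the D:\SQLBackups folder is a placeholder, and error handling is omitted.

```sql
-- Sketch: one job step that takes a log backup of each listed database,
-- one after another, writing a time-stamped file per backup.
DECLARE @name SYSNAME ,
        @sql NVARCHAR(MAX) ,
        @stamp NVARCHAR(20);

DECLARE db_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT  name
    FROM    sys.databases
    WHERE   name IN ( N'DB_A', N'DB_B' )
            AND recovery_model_desc <> 'SIMPLE'  -- no log backups in SIMPLE
            AND state_desc = 'ONLINE';

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @name;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Build a yyyymmdd_hhmmss stamp for the file name.
    SET @stamp = CONVERT(NVARCHAR(8), GETDATE(), 112) + N'_'
        + REPLACE(CONVERT(NVARCHAR(8), GETDATE(), 108), N':', N'');

    SET @sql = N'BACKUP LOG ' + QUOTENAME(@name)
        + N' TO DISK = N''D:\SQLBackups\' + @name + N'_Log_' + @stamp + N'.trn'''
        + N' WITH CHECKSUM;';

    EXEC sys.sp_executesql @sql;

    FETCH NEXT FROM db_cursor INTO @name;
END

CLOSE db_cursor;
DEALLOCATE db_cursor;
```

Because the databases are processed sequentially, the second backup starts only when the first has finished, which is the small timing discrepancy described above.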
If it finds a page that fails this test, the backup will fail. If the backup succeeds then the backup is valid... or maybe not. In fact, this type of validation has gotten many DBAs into trouble. It does not guarantee that a database backup is corruption free. The CHECKSUM only verifies that we are not backing up a database that was already corrupt in some way; if the corruption occurs in memory or somehow else during the backup operation, then it will not be detected. As a final note, my experience, and that of many others, suggests that, depending on the size of the database, enabling checksums (and other checks such as torn page detection) will bring with it a small CPU overhead and may slow down your backups (often minimally). However, use of the WITH CHECKSUM option during backups is a valuable safeguard and, if you can spare the few extra CPU cycles and the extra time the backups will take, go ahead. These checks are especially valuable when used in conjunction with restore verification.
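As a brief sketch of the option in use (the database name and path are placeholders), the first query checks which page verification setting each database uses, since the backup can only validate page checksums that were written in the first place:

```sql
-- Check the page verification setting of each database
-- (CHECKSUM, TORN_PAGE_DETECTION or NONE).
SELECT  name ,
        page_verify_option_desc
FROM    sys.databases;

-- Full backup that validates page checksums as pages are read and
-- writes a checksum over the backup itself.
-- [salesDB] and the path are placeholders for illustration.
BACKUP DATABASE [salesDB]
TO DISK = N'D:\SQLBackups\salesDB_Full.bak'
WITH CHECKSUM, INIT;
```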
Verifying restores
Since we cannot rely entirely on page checksums during backups, we should also be performing some restore verifications to make sure our backups are valid and restorable. As noted earlier, the surest way to do this is by performing test restores. However, a good additional safety net is to use the RESTORE VERIFYONLY command. This command will verify that the structure of a backup file is complete and readable. It attempts to mimic an actual restore operation as closely as possible without actually restoring the data. As such, this operation only verifies the backup header; it does not verify that the data contained in the file is valid and not corrupt. However, for databases where we've performed BACKUP...WITH CHECKSUM, we can then re-verify these checksums as part of the restore verification process.
RESTORE VERIFYONLY FROM DISK= '<Backup_location>' WITH CHECKSUM
This will recalculate the checksum on the data pages contained in the backup file and compare it against the checksum values generated during the backup. If they match, it's a good indication that the data wasn't corrupted during the backup process.
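For the test restores mentioned above, a minimal sketch might restore the backup under a different database name on a test server. Everything named here is an assumption: the backup path, the salesDB_verify database name, the D:\SQLVerify folder, and the logical file names salesDB and salesDB_log, which should be replaced with the names that RESTORE FILELISTONLY reports.

```sql
-- List the logical and physical files contained in the backup, so the
-- MOVE clauses below can use the correct logical names.
RESTORE FILELISTONLY
FROM DISK = N'D:\SQLBackups\salesDB_Full.bak';

-- Test restore under a different name, re-verifying checksums on the way.
-- The logical names N'salesDB' and N'salesDB_log' are placeholders.
RESTORE DATABASE [salesDB_verify]
FROM DISK = N'D:\SQLBackups\salesDB_Full.bak'
WITH MOVE N'salesDB' TO N'D:\SQLVerify\salesDB_verify.mdf',
     MOVE N'salesDB_log' TO N'D:\SQLVerify\salesDB_verify.ldf',
     CHECKSUM,
     RECOVERY;
```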
DBCC CHECKDB
One of the best ways to ensure that databases remain free of corruption, so that this corruption does not creep into backup files, making mincemeat of our backup and restore planning, is to run DBCC CHECKDB on a regular basis, to check the logical and physical integrity of all the objects in the specified database, and so catch corruption as early as possible. 77
Chapter 2: Planning, Storage and Documentation We will not discuss this topic in detail in this book, but check out the information in Books Online (http://msdn.microsoft.com/en-us/library/ms176064.aspx) and if you are not already performing these checks regularly, you should research and start a DBCC CHECKDB regimen immediately.
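As a minimal sketch of what a single integrity check might look like, the following command runs DBCC CHECKDB against one database. The database name is a placeholder, and the two options shown simply suppress informational messages and report all errors found; treat it as a starting point for a regimen, not a complete one.

```sql
-- Hypothetical database name; substitute one of your own databases.
-- NO_INFOMSGS hides informational output; ALL_ERRORMSGS reports every error found.
DBCC CHECKDB (N'YourDatabaseName') WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO
```

In practice, a check like this would typically be scheduled (for example, as a SQL Agent job) so that it runs before the backups you intend to rely on.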
Log File Information
Physical File Name:
LDF Location:
Log Size:
Number of Virtual Log Files:

Backup Information
Types of Backups Performed (Full, Differential, Log):
Last Full Database Backup:
Last Differential Database Backup:
Last Transaction Log Backup:
How Often are Transaction Logs Backed Up:
Average Database Full Backup Time:
Database Full Backup Size:
Average Transaction Log Backup Size:
Number of Full Database Backup Copies Retained:
Backups Encrypted:
Backups Compressed:
Backup To Location:
Offsite Backup Location:
Backup Software/Agent Used:
This information can be harvested in a number of different ways, but ideally will be scripted and automated. Listing 2-3 shows two scripts that will capture just some of this information; please feel free to adapt and amend as is suitable for your environment.
SELECT  d.name ,
        MAX(d.recovery_model) ,
        is_Password_Protected , --Backups Encrypted:
        --Last Full Database Backup:
        MAX(CASE WHEN type = 'D' THEN backup_start_date
                 ELSE NULL
            END) AS [Last Full Database Backup] ,
        --Last Transaction Log Backup:
        MAX(CASE WHEN type = 'L' THEN backup_start_date
                 ELSE NULL
            END) AS [Last Transaction Log Backup] ,
        --Last Differential Backup:
        MAX(CASE WHEN type = 'I' THEN backup_start_date
                 ELSE NULL
            END) AS [Last Differential Backup] ,
        --How Often are Transaction Logs Backed Up:
        --(ELSE NULL, so that non-log rows are ignored by MIN and MAX)
        DATEDIFF(Day,
                 MIN(CASE WHEN type = 'L' THEN backup_start_date
                          ELSE NULL
                     END),
                 MAX(CASE WHEN type = 'L' THEN backup_start_date
                          ELSE NULL
                     END))
            / NULLIF(SUM(CASE WHEN type = 'L' THEN 1
                              ELSE 0
                         END), 0) [Logs BackUp count] ,
        --Average backup times:
        SUM(CASE WHEN type = 'D'
                 THEN DATEDIFF(second, backup_start_date, Backup_finish_date)
                 ELSE 0
            END)
            / NULLIF(SUM(CASE WHEN type = 'D' THEN 1
                              ELSE 0
                         END), 0) AS [Average Database Full Backup Time] ,
        SUM(CASE WHEN type = 'I'
                 THEN DATEDIFF(second, backup_start_date, Backup_finish_date)
                 ELSE 0
            END)
            / NULLIF(SUM(CASE WHEN type = 'I' THEN 1
Summary
With the first two chapters complete, the foundation is laid; we've covered database file architecture, the types of backup that are available, and how this, and the overall backup strategy, is affected by the recovery model of the database. We've also considered hardware storage requirements for our database and backup files, the tools available to capture the backups, and how to develop an appropriate Service Level Agreement for a given database, depending on factors such as tolerance for data loss, database size, workload, and so on. We are now ready to move on to the real thing: actually taking and restoring different types of backup. Over the coming chapters, we'll build some sample databases, and demonstrate all the different types of backup and subsequent restore, using both native scripting and a third-party tool (Red Gate SQL Backup).
Chapter 3: Full Database Backups
It's useful to know exactly what is contained in this "archive"; a full backup contains:
• a copy of the database at the time of backup creation
• all user objects and data
• system information pertinent to the database
• user information
• permissions information
• system tables and views
• enough of the transaction log to be able to bring the database back online in a consistent state, in the event of a failure.
point in time, where the data existed, and then transfer that data back into the live database, using a tool such as SSIS or T-SQL (we cover this in detail in Chapter 6, Log Restores). It is important to stress that these methods, i.e. recovering from potential data loss by restoring backup files, represent the only sure way to recover all, or very nearly all, of the lost data. The alternatives, such as use of specialized log recovery tools, or attempting to recover data from secondary or replica databases, offer relatively slim chances of success.
Aside from disaster recovery scenarios, there are also a few day-to-day operations where full backups will be used. Any time that we want to replace an entire database or create a new database containing the entire contents of the backup, we will perform a full backup restore. For example:
• moving a development project database into production for the first time; we can restore the full backup to the production server to create a brand new database, complete with any data that is required.
• refreshing a development or quality assurance system with production data for use in testing new processes or process changes on a different server; this is a common occurrence in development infrastructures and regular full backup restores are often performed on an automated schedule.
subject to a moderate level of data modification, but the flexibility of full point-in-time restore, via log backups, is not required, then the Backup SLA can stipulate simply that nightly full database backups should be taken. The majority of the development and testing databases that I look after receive only a nightly full database backup. In the event of corruption or data loss, I can get the developers and testers back to a good working state by restoring the previous night's full backup. In theory, these databases are exposed to a maximum risk of losing just less than 24 hours of data changes. However, in reality, the risk is much lower since most development happens during a much narrower daytime window. This risk is usually acceptable in development environments, but don't just assume this to be the case; make sure you get sign-off from the project owners. For a database subject only to very infrequent changes, it may be acceptable to take only a weekly full backup. Here the risk of loss is just under seven days, but if the database really is only rarely modified then the overall risk is still quite low. Remember that the whole point of a Backup SLA is to get everyone with a vested interest to "sign off" on acceptable levels of data loss for a given database. Work with the database owners to determine the backup strategy that works best for their databases and for you as the DBA. You don't ever want to be caught in a situation where you assumed a certain level of data loss was acceptable and it turned out you were wrong.
Before we get started taking full backups, however, we need to do a bit of preparatory work, namely choosing an appropriate recovery model for our example database, and then creating that database along with some populated sample tables.
Database creation
The sample database for this chapter will be about as simple as it's possible to get. It will consist of a single data (mdf) file contained in a single filegroup; there will be no secondary data files or filegroups. This one data file will contain just a handful of tables where we will store a million rows of data. Listing 3-1 shows our fairly simple database creation script. Note that in a production database the data and log files would be placed on separate drives.
Listing 3-1:
This is a relatively simple CREATE DATABASE statement, though even it is not quite as minimal as it could be; CREATE DATABASE [DatabaseForFullBackups] would work, since all the arguments are optional in the sense that, if we don't provide explicit values for them, they will take their default values from whatever is specified in the model database. Nevertheless, it's instructive, and usually advisable, to explicitly supply values for at least those parameters shown here. Firstly, we have named the database DatabaseForFullBackups, which is a clear statement of the purpose of this database. Secondly, via the NAME argument, we assign logical names to the physical files. We are adopting the default naming convention for SQL Server 2008, which is to use the database name for the logical name of the data file, and append _log to the database name for the logical name of the log file.
The optional SIZE and FILEGROWTH arguments are the only cases where we use some non-default settings. The default initial SIZE settings for the data and log files, inherited from the model database properties, are too small (typically, 3 MB and 1 MB respectively) for most databases. Likewise the default FILEGROWTH settings (typically 1 MB increments for the data files and 10% increments for the log file) are also inappropriate. In busy databases, they can lead to fragmentation issues, as the data and log files grow in many small increments. The first problem is physical file fragmentation, which occurs when a file's data is written to non-contiguous sectors of the physical hard disk (SQL Server has no knowledge of this). This physical fragmentation is greatly exacerbated if the data and log files are allowed to grow in lots of small auto-growth increments, and it can have a big impact on the performance of the database, especially for sequential write operations. As a best practice it's wise, when creating a new database, to defragment the disk drive (if necessary) and then create the data and log files pre-sized so that they can accommodate, without further growth in file size, the current data plus estimated data growth over a reasonable period. In a production database, we may want to size the files to accommodate, say, a year's worth of data growth. There are other reasons to avoid allowing your database files to grow in multiple small increments. Each growth event will incur a CPU penalty. This penalty can be mitigated for data files by instant file initialization (enabled by granting the perform volume maintenance tasks right to the SQL Server service account). However, the same optimization does not apply to log files.
Furthermore, growing the log file in many small increments can cause log fragmentation, which is essentially the creation of a very large number of small VLFs, which can deteriorate the performance of crash recovery, restores, and log backups (in other words, operations that read the log file). We'll discuss this in more detail in Chapter 5.
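To see which databases on an instance are still running with small initial sizes or growth increments, a query along the following lines can help. It is only a sketch: it reads the sys.master_files catalog view, where size and growth are stored in 8 KB pages (unless is_percent_growth is 1), and the column aliases are simply illustrative names.

```sql
-- size and growth are held in 8 KB pages, so multiply by 8 and divide by 1024 for MB.
-- When is_percent_growth = 1, growth is a percentage rather than a page count.
SELECT  DB_NAME(database_id) AS DatabaseName ,
        name AS LogicalFileName ,
        type_desc AS FileType ,
        size * 8 / 1024 AS SizeMB ,
        CASE WHEN is_percent_growth = 1
             THEN CAST(growth AS VARCHAR(10)) + ' %'
             ELSE CAST(growth * 8 / 1024 AS VARCHAR(10)) + ' MB'
        END AS GrowthIncrement
FROM    sys.master_files
ORDER BY DatabaseName ,
        FileType;
```

Files reporting a growth increment of 1 MB or 10 % are likely still using the settings inherited from the model database.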
In any event, in our case, we're just setting the SIZE and FILEGROWTH settings such that SQL Server doesn't have to grow the files while we pump in our test data. We've used an initial data file size of 500 MB, growing in 100 MB increments, and an initial size for the log file of 100 MB, growing in 10 MB increments. When you're ready, execute the script in Listing 3-1, and the database will be created. Alternatively, if you prefer to create the database via the SSMS GUI, rather than using a script, simply right-click on the Databases node in SSMS, select New Database, and fill out the General tab so it looks like that shown in Figure 3-1.
Figure 3-1:
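As a rough sketch of a creation script matching the description above (a 500 MB data file growing in 100 MB increments, and a 100 MB log file growing in 10 MB increments), something like the following should work. The logical file names follow the naming convention described earlier; the physical file paths are assumptions and will need to point to folders that exist on your server.

```sql
USE [master]
GO
-- The FILENAME paths are hypothetical; change them to suit your environment.
CREATE DATABASE [DatabaseForFullBackups]
ON PRIMARY
(   NAME = N'DatabaseForFullBackups' ,
    FILENAME = N'C:\SQLData\DatabaseForFullBackups.mdf' ,
    SIZE = 500MB ,
    FILEGROWTH = 100MB )
LOG ON
(   NAME = N'DatabaseForFullBackups_log' ,
    FILENAME = N'C:\SQLData\DatabaseForFullBackups_log.ldf' ,
    SIZE = 100MB ,
    FILEGROWTH = 10MB )
GO
```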
The meaning of each of these options is as follows:
• COMPATIBILITY_LEVEL: This lets SQL Server know with which version of SQL Server to make the database compatible. In all of our examples, we will be using 100, which signifies SQL Server 2008.
• AUTO_SHRINK: This option either turns on or off the feature that will automatically shrink your database files when free space is available. In almost all cases, this should be set to OFF.
• AUTO_UPDATE_STATISTICS: When turned ON, as it should be in most cases, the optimizer will automatically keep statistics updated, in response to data modifications.
• READ_WRITE: This is the default option and the one to use if you want users to be able to update the database. We could also set the database to READ_ONLY to prevent any users making updates to the database.
• RECOVERY SIMPLE: This tells SQL Server to set the recovery model of the database to SIMPLE. Other options are FULL (the usual default) and BULK_LOGGED.
• MULTI_USER: For the database to allow connections for multiple users, we need to set this option. Our other choice is SINGLE_USER, which allows only one connection to the database at a time.
The only case where we are changing the usual default value is the command to set the recovery model to SIMPLE; in most cases, the model database will, by default, be operating in the FULL recovery model and so this is the recovery model that will be conferred on all user databases. If the default recovery model for the model database is already set to SIMPLE, for your instance, then you won't need to execute this portion of the ALTER script. If you do need to change the recovery model, just make sure you are in the correct database before running this command, to avoid changing another database's recovery model. Alternatively, simply pull up the Properties for our newly created database and change the recovery model manually, on the Options page, as shown in Figure 3-2.
Figure 3-2:
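For reference, the options described above could be applied with a series of ALTER DATABASE statements such as the following. This is a sketch assembled from the option descriptions, not necessarily the exact script used for the examples; the final query is an optional check of the result.

```sql
-- Each statement sets one of the options discussed above.
ALTER DATABASE [DatabaseForFullBackups] SET COMPATIBILITY_LEVEL = 100
GO
ALTER DATABASE [DatabaseForFullBackups] SET AUTO_SHRINK OFF
GO
ALTER DATABASE [DatabaseForFullBackups] SET AUTO_UPDATE_STATISTICS ON
GO
ALTER DATABASE [DatabaseForFullBackups] SET READ_WRITE
GO
ALTER DATABASE [DatabaseForFullBackups] SET RECOVERY SIMPLE
GO
ALTER DATABASE [DatabaseForFullBackups] SET MULTI_USER
GO

-- Confirm the recovery model and compatibility level now in force.
SELECT  name ,
        recovery_model_desc ,
        compatibility_level
FROM    sys.databases
WHERE   name = N'DatabaseForFullBackups'
GO
```

Because each statement names the database explicitly, it does not matter which database the query window is connected to when the script runs.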
MessageTable1 and MessageTable2 are both very simple tables, comprised of only two columns each. The MessageData column will contain a static character string, mainly to fill up data space, and the MessageDate will hold the date and time that the message was inserted.
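A table definition consistent with this description might look like the sketch below. The dbo schema, the column data types and the nullability are assumptions; NVARCHAR(200) is chosen only because it matches the variable used to populate the table later in the chapter.

```sql
USE [DatabaseForFullBackups]
GO
-- Data types and NOT NULL constraints are assumed for illustration.
CREATE TABLE dbo.MessageTable1
    (
      MessageData NVARCHAR(200) NOT NULL ,
      MessageDate DATETIME2 NOT NULL
    )
GO
CREATE TABLE dbo.MessageTable2
    (
      MessageData NVARCHAR(200) NOT NULL ,
      MessageDate DATETIME2 NOT NULL
    )
GO
```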
Now that we have our tables set up, we need to populate them with data. We want to pack in a few hundred thousand rows so that the database will have a substantial size. However, we won't make it so large that it will risk filling up your desktop/laptop drive. Normally, a DBA tasked with pumping several hundred thousand rows of data into a table would reach for the BCP tool and a flat file, which would be the fastest way to achieve this goal. However, since coverage of BCP is out of scope for this chapter, we'll settle for a simpler, but much slower, T-SQL method, as shown in Listing 3-4.
USE [DatabaseForFullBackups]
GO

DECLARE @messageData NVARCHAR(200)

SET @messageData = 'This is the message we are going to use to fill up the first table for now. We want to get this as close to 200 characters as we can to fill up the database as close to our initial size as we can!!'
The code uses a neat GO trick that allows us to INSERT the same data into the MessageTable1 table multiple times, without using a looping mechanism. The GO statement is normally used as a batch separator, but in this case we pass it a parameter defining the number of times to run the code in the batch. So, our database is now at a decent size, somewhere around 500 MB. This is not a large database by any stretch of the imagination, but it is large enough that it will take more than a few seconds to back up.
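The pattern being described can be sketched as follows. The message text, the column list and the repeat count of 1000 are placeholders chosen for illustration. Note that GO, and the count passed to it, are interpreted by client tools such as SSMS and sqlcmd and are not T-SQL statements sent to the server.

```sql
USE [DatabaseForFullBackups]
GO
-- Everything between the previous GO and "GO 1000" is one batch, so the
-- DECLARE, SET and INSERT are all repeated on each of the 1000 executions.
DECLARE @messageData NVARCHAR(200)

SET @messageData = N'Sample message text'

INSERT  INTO dbo.MessageTable1
        ( MessageData, MessageDate )
VALUES  ( @messageData, GETDATE() )
GO 1000
```

Each repetition is a separate batch with its own implicit transaction, which is one reason why this approach is much slower than a bulk load.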
Alternatively, using the SSMS GUI, simply right-click on the database, and select Delete; by default, the option to Delete backup and restore history information for databases will be checked and this will clear out the msdb historical information.
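A scripted equivalent of those two GUI actions might look like the sketch below, which uses the msdb stored procedure sp_delete_database_backuphistory to clear the history and then drops the database. It assumes no other connections are using the database, and it is destructive, so run it only against a test database.

```sql
USE [master]
GO
-- Remove the backup and restore history held in msdb for this database.
EXEC msdb.dbo.sp_delete_database_backuphistory
    @database_name = N'DatabaseForFullBackups'
GO
-- Drop the database itself; this fails if the database is in use.
DROP DATABASE [DatabaseForFullBackups]
GO
```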
Figure 3-3:
This will start the backup wizard and bring up a dialog box titled Back Up Database - DatabaseForFullBackups, shown in Figure 3-4, with several configuration options that are available to the T-SQL BACKUP DATABASE command. Don't forget that all we're really doing here is using a graphical interface to build and run a T-SQL command.
Figure 3-4:
The General page comprises three major sections: Source, Backup set and Destination. In the Source section, we specify the database to be backed up and what type of backup to perform. The Backup type drop-down list shows the types of backup that are available to your database. In our example we are only presented with two options, Full and Differential, since our database is in SIMPLE recovery model.
You will also notice a check box with the label Copy-only backup. A copy-only full backup is one that does not affect the normal backup operations of a database and is used when a full backup is needed outside of a normal scheduled backup plan. When a normal full database backup is taken, SQL Server modifies some internal archive points in the database, to indicate that a new base file has been created, for use when restoring subsequent differential backups. Copy-only full backups preserve these internal archive points and so cannot be used as a differential base. We're not concerned with copy-only backups at this point. The Backup component section is where we specify either a database or file/filegroup backup. The latter option is only available for databases with more than one filegroup, so it is deactivated in this case. We will, however, talk more about this option when we get to Chapter 9, on file and filegroup backups. In the Backup set section, there are name and description fields used to identify the backup set, which is simply the set of data that was chosen to be backed up. The information provided here will be used to tag the backup set created, and record its creation in the MSDB backup history tables. There is also an option to set an expiration date on our backup set. When taking SQL Server backups, it is entirely possible to store multiple copies of a database backup in the same file or media. SQL Server will just append the next backup to the end of the backup file. This expiration date lets SQL Server know how long it should keep this backup set in that file before overwriting it. Most DBAs do not use this "multiple backups per file" feature. There are only a few benefits, primarily a smaller number of files to manage, and many more drawbacks: larger backup files and single points of failure, to name only two.
For simplicity and manageability, throughout this book, we will only deal with backups that house a single backup per file. The Destination section is where we specify the backup media and, in the case of disk, the location of the file on this disk. The Tape option button will be disabled unless a tape drive device is attached to the server.
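Although we won't use either feature in this chapter, both the copy-only option and the contents of a backup file can be expressed in T-SQL. The sketch below takes a copy-only full backup and then lists the backup sets stored in that file; the file name is a hypothetical one that follows the chapter's naming pattern.

```sql
-- COPY_ONLY leaves the differential base untouched.
BACKUP DATABASE [DatabaseForFullBackups]
TO DISK = N'C:\SQLBackups\Chapter3\DatabaseForFullBackups_Full_CopyOnly.bak'
WITH COPY_ONLY
GO

-- Returns one row per backup set stored in the file.
RESTORE HEADERONLY
FROM DISK = N'C:\SQLBackups\Chapter3\DatabaseForFullBackups_Full_CopyOnly.bak'
GO
```

With one backup per file, as used throughout, RESTORE HEADERONLY should return a single row.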
As discussed in Chapter 1, even if you still use tape media, as many do, you will almost never back up directly to tape; instead you'll back up to disk and then transfer older backups to tape. When using disk media to store backups, we are offered three buttons to the right of the file listing window, for adding and removing disk destinations, as well as looking at the different backup sets that are already stored in a particular file. The box will be pre-populated with a default file name and destination for the default SQL Server backup folder. We are not going to be using that folder (simply because we'll be storing our backups in separate folders, according to chapter) so go ahead and use the Remove button on that file to take it out of the list. Now, use the only available button, the Add button, to bring up the Select Backup Destination window. Make sure you have the File name option selected, and click the browse (...) button to bring up the Locate Database Files window. Locate the SQLBackups\Chapter3 directory that you created earlier on the machine, and then enter a name for the file, DatabaseForFullBackups_Full_Native_1.bak, as shown in Figure 3-5.
Figure 3-5:
Once this has been configured, click OK to finalize the new file configuration and click OK again on the Select Backup Destination dialog box to bring you back to the Back Up Database page. Now that we are done with the General page of this wizard, let's take a look at the Options Page, shown in Figure 3-6.
The Overwrite media section is used in cases where a single file stores multiple backups and backup sets. We can set the new backup to append to the existing backup set or to overwrite the specifically named set that already exists in the file. We can also use this section to overwrite an existing backup set and start afresh. We'll use the option of Overwrite all existing backup sets since we are only storing one backup per file. This will make sure that, if we were to run the same command again in the event of an issue, we would wind up with just one backup set in the file.
The Reliability section provides various options that can be used to validate the backup, as follows:
• Verify backup when finished: Validates that the backup set is complete, after the backup operation has completed. It will make sure that each backup in the set is readable and ready for use.
• Perform checksum before writing to media: SQL Server performs a checksum operation on the backup data before writing it to the storage media. As discussed in Chapter 2, a checksum is a special function used to make sure that the data being written to the disk/tape matches what was pulled from the database or log file. This option makes sure your backup data is being written correctly, but might also slow down your backup operation.
• Continue on error: Instructs SQL Server to continue with all backup operations even after an error has been raised during the backup operation.
The Transaction log section offers two important configuration options for transaction log backups, and will be covered in Chapter 5. The Tape Drive section of the configuration is only applicable if you are writing your backups directly to tape media. We have previously discussed why this is not the best way for backups to be taken in most circumstances, so we will not be using these options (and they aren't available here anyway, since we've already elected to back up to disk).
The final Compression configuration section deals with SQL Server native backup compression, which is an option that we'll ignore for now, but come back to later in the chapter. Having reviewed all of the configuration options, go ahead and click OK at the bottom of the page to begin taking a full database backup of the DatabaseForFullBackups database. You will notice the progress section begin counting up in percentage. Once this reaches 100%, you should receive a dialog box notifying you that your backup has completed. Click OK on this notification and that should close both the dialog box and the Back Up Database wizard.
at Listing 3-6 for an example of how to pull this information from your system. I'm only returning a very small subset of the available columns, so examine the table more closely, to find more information that you might find useful.
USE msdb
GO

SELECT  database_name ,
        DATEDIFF(SS, backup_start_date, backup_finish_date) AS [RunTImeSec] ,
        database_creation_date
FROM    dbo.backupset
ORDER BY database_creation_date DESC
Figure 3-7:
On my machine the backup takes 49 seconds. That seems fairly good, but we have to consider the size of the test database. It is only 500 MB, which is not a typical size for most production databases. In the next section, we'll pump more data into the database and take another full backup (this time using T-SQL directly), and we'll get to see how the backup execution time varies with database size.
This script should take a few minutes to fill the secondary table. Once it is populated with another million rows of the same size and structure as the first, our database file should be hovering somewhere around 1 GB in size, most likely just slightly under. Now that we have some more data to work with, let's move on to taking native SQL Server backups using T-SQL only.
This script may look like it has some extra parameters, compared to the native GUI backup that we did earlier, but this is the scripted output of that same backup, with only the output file name modified, and a few other very minor tweaks. So what does each of these parameters mean? The meaning of the first line should be fairly obvious; it is instructing SQL Server to perform a full backup of the DatabaseForFullBackups database. This leads into the second line where we have chosen to back up to disk, and given a complete file path to the resulting backup file. The remainder of the parameters are new, so let's go through each one.
• FORMAT: This option tells SQL Server whether or not to overwrite the media header information. The FORMAT option will erase any information in a backup set that already exists when the backup is initialized (NOFORMAT will preserve it).
• INIT: By default, when scripting a backup generated by the Backup wizard, this parameter will be set to NOINIT, which lets SQL Server know not to initialize a media set when taking the backup and instead append any new backup data to the existing backup set. However, since we adopt the rule of one backup per backup set, it's useful to use INIT instead, to make sure that, if a command gets run twice, we overwrite the existing set and still end up with only one backup in the set.
• NAME: The NAME parameter is simply used to identify the backup set. If it is not supplied, the set will not record a name.
• SKIP: Using the SKIP parameter will cause SQL Server to skip the expiration check that it normally does on the backup set. It doesn't care if any backups existing in the backup set have been marked for availability to be overwritten.
• NOREWIND: This parameter will cause SQL Server to keep a tape device open and ready for use when the backup operation is complete. This is a performance boost to users of tape drives since the tape is already at the next writing point instead of having to search for the correct position. This is obviously a tape-only option.
• NOUNLOAD: When backing up to a tape drive, this parameter instructs SQL Server not to unload the tape from the drive when the backup operation is completed.
• STATS: This option may prove useful to you when performing query-based backups. The STATS parameter defines the percentage intervals at which SQL Server should update the "backup progress" messages. For example, using STATS = 10 will cause SQL Server to send a status message to the query output for each 10 percent of the backup completion.
As noted, if we wished to overwrite an existing backup set, we'd want to specify the INIT parameter but, beyond that, none of these secondary parameters, including the backup set NAME descriptor, is required. As such, we can actually use a much simplified BACKUP command, as shown in Listing 3-9.
BACKUP DATABASE [DatabaseForFullBackups]
TO DISK = N'C:\SQLBackups\Chapter3\DatabaseForFullBackups_Full_Native_2.bak'
GO
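For comparison, a fuller form of the same command that spells out the parameters described above might look like the following sketch. The backup set name is an illustrative value, and NOREWIND and NOUNLOAD are included only because the wizard scripts them; they have no effect on a disk backup.

```sql
BACKUP DATABASE [DatabaseForFullBackups]
TO DISK = N'C:\SQLBackups\Chapter3\DatabaseForFullBackups_Full_Native_2.bak'
WITH FORMAT ,      -- write a new media header
     INIT ,        -- overwrite any existing backup sets in the file
     NAME = N'DatabaseForFullBackups-Full Database Backup' ,  -- hypothetical name
     SKIP ,        -- skip the expiration check
     NOREWIND ,    -- tape-only option
     NOUNLOAD ,    -- tape-only option
     STATS = 10    -- progress message every 10 percent
GO
```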
Go ahead and start Management Studio and connect to your test server. Once you have connected, open a new query window and use either Listing 3-8 or 3-9 to perform this backup in SSMS. Once it is done executing, do not close the query, as the query output contains some metrics that we want to record.
Figure 3-8:
This status output shows how many database pages the backup processed as well as how quickly the backup was completed. On my machine, the backup operation completed in just under 80 seconds. Notice here that the backup processes all the pages in the data file, plus two pages in the log file; the latter is required because a full backup needs to include enough of the log that the backup can produce a consistent database, upon restore. When we ran our first full backup, we had 500 MB in our database and the backup process took 49 seconds to complete. Why didn't it take twice as long this time, now that we just about doubled the amount of data? The fact is that the central process of writing data to the backup file probably did take roughly twice as long, but there are other "overhead processes" associated with the backup task that take roughly the same amount of time regardless of how much data is being backed up. As such, the time to take backups will not increase linearly with increasing database size. But does the size of the resulting backup file increase linearly? Navigating to our SQL Server backup files directory, we can see clearly that the size is nearly double that of the first backup file (see Figure 3-9). The file size of your native SQL Server backups will grow at nearly the same rate as the database data files grow.
Figure 3-9:
We will compare these metrics against the file sizes and speeds we get from backing up the same files using Red Gate's SQL Backup in Chapter 8.
The only difference between this and our backup script in Listing 3-9, is the use here of the COMPRESSION keyword, which instructs SQL Server to make sure this database is compressed when written to disk. If you prefer to run the compressed backup using the GUI method, simply locate the Compression section, on the Options page of the Backup Wizard, and change the setting from Use the default server setting to Compress backup. Note that, if desired, we can use the sp_configure stored procedure to make backup compression the default behavior for a SQL Server instance. On completion of the backup operation, the query output window will display output similar to that shown in Figure 3-10.
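A compressed backup of this kind, together with the instance-wide default mentioned above, can be sketched as follows. The backup file name is an assumption that follows the chapter's naming pattern, and the sp_configure step is optional; it changes behavior for every backup on the instance that does not specify compression explicitly.

```sql
-- Take a compressed full backup (file name is hypothetical).
BACKUP DATABASE [DatabaseForFullBackups]
TO DISK = N'C:\SQLBackups\Chapter3\DatabaseForFullBackups_Full_Native_Compressed.bak'
WITH COMPRESSION
GO

-- Optional: make compression the default for backups on this instance.
EXEC sp_configure 'backup compression default', 1
RECONFIGURE
GO
```

Even with the instance default switched on, an individual backup can still opt out by specifying WITH NO_COMPRESSION.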
If you recall, a non-compressed backup of the same database took close to 80 seconds and resulted in a backup file size of just over 1 GB. Here, we can see that use of compression has reduced the backup time to about 32 seconds, and it results in a backup file size, shown in Figure 3-11, of only 13 KB!
These results represent a considerable saving, in both storage space and processing time, over non-compressed backups. If you're wondering whether or not the compression rates should be roughly consistent across all your databases, then the short answer is no. Character data, such as that stored in our DatabaseForFullBackups database, compresses very well. However, some databases may contain data that doesn't compress as readily, such as FILESTREAM and image data, and so space savings will be less.
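One way to check the compression achieved for your own databases is to compare the backup_size and compressed_backup_size columns in the msdb backupset table. The query below is a sketch of that comparison for full backups of our sample database; the column aliases are illustrative.

```sql
USE msdb
GO
-- Both size columns are in bytes; for uncompressed backups the two values match.
SELECT  database_name ,
        backup_finish_date ,
        backup_size / 1048576.0 AS BackupSizeMB ,
        compressed_backup_size / 1048576.0 AS CompressedSizeMB ,
        backup_size / NULLIF(compressed_backup_size, 0) AS CompressionRatio
FROM    dbo.backupset
WHERE   database_name = N'DatabaseForFullBackups'
        AND type = 'D'
ORDER BY backup_finish_date DESC
GO
```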
Verifying Backups
Having discussed the basic concepts of backup verification in Chapter 2, Listing 3-11 shows a simple script to perform a checksum during a backup of our DatabaseForFullBackups database, followed by a RESTORE VERIFYONLY, recalculating the checksum.
BACKUP DATABASE [DatabaseForFullBackups]
TO DISK = N'C:\SQLBackups\Chapter3\DatabaseForFullBackups_Full_Native_Checksum.bak'
WITH CHECKSUM
GO

RESTORE VERIFYONLY
FROM DISK = N'C:\SQLBackups\Chapter3\DatabaseForFullBackups_Full_Native_Checksum.bak'
WITH CHECKSUM
Hopefully you'll get output to the effect that the backup is valid!
In my role as a DBA, I use a third-party backup tool, namely SQL Backup, to manage and schedule all of my backups. Chapter 8 will show how to use this tool to build a script that can be used in a SQL Agent job to take scheduled backups of databases.
Summary
This chapter explained in detail how to capture full database backups using either the SSMS Backup Wizard or T-SQL scripts. We are now ready to move on to the restoration piece of the backup and restore jigsaw. Do not remove any of the backup files we have captured; we are going to use each of these in the next chapter to restore our DatabaseForFullBackups database.
Chapter 4: Restoring From Full Backup
needs to stipulate file retention for as long as is reasonably necessary for such data loss or data integrity issues to be discovered. Just don't go overboard; there is no need to keep backup files for 2 weeks on local disk, when 3 to 5 days will do the trick 99.9% of the time.
It would only take one rogue employee to steal a list of all clients and their sensitive information to sell to a competitor or, worse, a black market party. If you work at a financial institution, you may be dealing on a daily basis with account numbers, passwords and financial transactions, as well as sensitive user information such as social security numbers and addresses. Not only will this data be subject to strict security measures in order to keep customers' information safe, it will also be the target of government agencies and their compliance audits. More generally, while the production servers receive the full focus of attempts to deter and foil hackers, security can be a little lacking in non-production environments. This is why development and QA servers are a favorite target of malicious users, and why having complete customer records on such servers can cause big problems, if a compromise occurs. So, what's the solution? Obviously, for development purposes, we need the database schemas in our development and test servers to be initially identical to the schema that exists in production, so it's common practice, in such situations, to copy the schema but not the data. There are several ways to do this.
• Restore the full database backup, but immediately truncate all tables, purging all sensitive data. You may then need to shrink the development copy of your database; you don't want to have a 100 GB database shell if that space is never going to be needed. Note that, after a database shrink, you should always rebuild your indexes, as they will get fragmented as a result of such an operation.
• Use a schema comparison tool to synch only the objects of the production and development databases.
• Wipe the database of all user tables and use SSIS to perform a database object transfer of all required user objects. This can be set up to transfer objects only and to ignore any data included in the production system.
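The first of these approaches (truncate, shrink, then rebuild indexes) can be sketched in T-SQL as below. The database and table names are hypothetical, and the TRUNCATE and ALTER INDEX statements would need repeating for each table. Bear in mind that TRUNCATE TABLE fails on tables referenced by foreign key constraints, in which case the constraints must be dropped first or DELETE used instead.

```sql
USE [DevCopyOfProduction]   -- hypothetical restored copy of production
GO
-- Purge the data from each table holding sensitive information.
TRUNCATE TABLE dbo.Customers   -- hypothetical table name
GO
-- Release the space that is now unused.
DBCC SHRINKDATABASE (N'DevCopyOfProduction')
GO
-- Rebuild indexes afterwards, since shrinking fragments them.
ALTER INDEX ALL ON dbo.Customers REBUILD
GO
```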
Of course, in each case, we will still need a complete, or at least partial, set of data in the development database, so we'll need to write some scripts, or use a data generation tool, such as SQL Data Generator, to establish a set of test data that is realistic but doesn't flout regulations for the protection of sensitive data.
production. Even if a user doesn't have a login on the production database server, the restored database still holds the permissions. If that user is eventually given access to the production machine, he or she will automatically have that level of access, even if it wasn't explicitly given by the DBA team. The only case when this may not happen is when the user is using SQL Server authentication and the internal SID, a unique identifying value, doesn't match on the original and target server. If two SQL logins with the same name are created on different machines, the underlying SIDs will be different. So, when we move a database from Server A to Server B, a database user mapped to a SQL login on Server A will move with the database to Server B, but its SID will not match any login there and the database user will be "orphaned." This database user will need to be "de-orphaned" (see below) before the permissions will be valid. This will never happen for matching Active Directory accounts since the SID is always the same across a domain.
In order to prevent this from happening in our environments, every time we restore a database from one environment to another we should:
• audit each and every login: never assume that if a user has certain permissions in one environment they need the same in another; fix any internal user mappings for logins that exist on both servers, to ensure no one gets elevated permissions
• perform orphaned user maintenance: remove permissions for any users that do not have a login on the server to which we are moving the database; the sp_change_users_login stored procedure can help with this process, reporting all orphans, linking a user to its correct login, or creating a new login to which to link:
EXEC sp_change_users_login 'Report'
EXEC sp_change_users_login 'Auto_Fix', 'user'
-- With Auto_Fix the login argument must be NULL; the password is used if a new login has to be created
EXEC sp_change_users_login 'Auto_Fix', 'user', NULL, 'password'
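Since sp_change_users_login is marked as deprecated in later versions of SQL Server, it may be worth knowing an alternative. The following is a minimal sketch, assuming a restored database called RestoredDatabase and an orphaned user called AppUser (both hypothetical names), that lists SQL-authenticated users with no matching server login and then re-maps one of them using ALTER USER.

```sql
USE [RestoredDatabase]  -- hypothetical name of the restored database
GO
-- List SQL-authenticated database users whose SID matches no server login
SELECT dp.name AS orphaned_user, dp.sid
FROM sys.database_principals AS dp
LEFT JOIN sys.server_principals AS sp ON dp.sid = sp.sid
WHERE dp.type = 'S'          -- SQL users only
  AND dp.principal_id > 4    -- skip dbo, guest, INFORMATION_SCHEMA and sys
  AND sp.sid IS NULL
GO
-- Re-map one orphaned user (hypothetical name) to the server login of the same name
ALTER USER [AppUser] WITH LOGIN = [AppUser]
GO
```

The query is deliberately simple; users that were intentionally created without a login will also appear in its output, so review the list before re-mapping or removing anything.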
Don't let these issues dissuade you from performing full restores as and when necessary. Diligence is a great trait in a DBA, especially in regard to security. If you apply this diligence, keeping a keen eye out when restoring databases between mismatched environments, or when dealing with highly sensitive data of any kind, then you'll be fine.
Figure 4-1:
To start the restore process, right-click on the database in question, DatabaseForFullBackups, and navigate Tasks | Restore | Database..., as shown in Figure 4-2. This will initiate the Restore wizard.
Figure 4-2:
The Restore Database window appears, with some options auto-populated. For example, the name of the database we're restoring to is auto-filled to be the same as the source database that was backed up. Perhaps more surprisingly, the backup set to restore is also auto-populated, as shown in Figure 4-3. What's happened is that SQL Server has inspected some of the system tables in the msdb database and located the backups that have already been taken for this database. Depending on how long ago you completed the backups in Chapter 3, the window will be populated with the backup sets taken in that chapter, letting us choose which set to restore.
Figure 4-3:
We are not going to be using this pre-populated form, but will instead configure the restore process by hand, so that we restore our first full backup file. In the Source for restore section, choose the From device option and then click the ellipsis button (...). In the Specify Backup window, make sure that the media type shows File, and click the Add button. In the Locate Backup File window, navigate to the C:\SQLBackups\Chapter3 folder and click on the DatabaseForFullBackups_Full_Native_1.bak backup file. Click OK twice to get back to the Restore Database window. We will now be able to see which backups are contained in the selected backup set. Since we only ever stored one backup per file, we only see one backup. Tick the box under the Restore column to select that backup file as the basis for the restore process, as shown in Figure 4-4.
Figure 4-4:
Next, click on the Options page on the left side of the restore configuration window. This will bring us to a whole new section of options to modify and validate (see Figure 4-5).
Figure 4-5:
The top of the screen shows four Restore options, as shown below.
• Overwrite the existing database: the generated T-SQL command will include the REPLACE option, instructing SQL Server to overwrite the currently existing database information. Since we are overwriting an existing database, in this example we want to check this box. Note that it is advised to use the REPLACE option with care, due to the potential for overwriting a database with a backup of a different database. See http://msdn.microsoft.com/en-us/library/ms191315.aspx.
• Preserve the replication settings: only for use in a replication-enabled environment. Basically allows you to re-initialize replication, after a restore, without having to reconfigure all the replication settings.
• Prompt before restoring each backup: receive a prompt before each of the backup files is processed. We only have one backup file here, so leave this unchecked.
• Restrict access to the restored database: restricts access to the restored database to only members of the database role db_owner and the two server roles sysadmin and dbcreator. Again, don't select this option here.
The next portion of the Options page is the Restore the database files as: section, where we specify the location for the data (mdf) and log (ldf) files for the restored database. This will be auto-populated with the location of the original files, on which the backup was based. We have the option to move them to a new location on our drives but, for now, let's leave them in their original location, although it's wise to double-check that this location is correct for your system.
Finally, we have the Recovery state section, where we specify the state in which the database should be left once the current backup file has been restored. If there are no further files to restore, and we wish to return the database to a useable state, we pick the first option, RESTORE WITH RECOVERY. When the restore process is run, the backup file will be restored and then the final step of the restore process, database recovery (see Chapter 1), will be carried out. This is the option to choose here, since we're restoring just a single full database backup. We'll cover the other two options later in the book, so we won't consider them further here.
Our restore is configured and ready to go, so click OK and wait for the progress section of the restore window to notify us that the operation has successfully completed.
If the restore operation doesn't show any progress, the probable reason is that there is another active connection to the database, which will prevent the restore operation from starting. Stop the restore, close any other connections and try again. A convenience of scripting, as we'll see a little later, is that we can check for, and close, any other connections before we attempt the restore operation.
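One way to check for such connections is to query the sys.sysprocesses compatibility view, as in the sketch below. This is just one possible approach, not necessarily the one used elsewhere in this book, and the choice of columns is ours.

```sql
-- Sessions currently connected to the database we want to restore
SELECT spid, loginame, hostname, program_name, status
FROM sys.sysprocesses
WHERE dbid = DB_ID(N'DatabaseForFullBackups')
  AND spid <> @@SPID   -- ignore our own session
```

If this returns any rows, those sessions will need to be closed before the restore can start.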
The first query should return a million rows, each containing the same message. The second query, if everything worked in the way we intended, should return no rows.
re-create the Restore Database pages as we had them configured in Figures 4-3 to 4-5, and then click the Script drop-down button, and select Script Action to New Query Window. This will generate the T-SQL RESTORE command that will be the exact equivalent of what would be generated under the covers when running the process through the GUI. When I ran this T-SQL command on my test system, the restore took just under 29 seconds and processed 62,689 pages of data, as shown in Figure 4-6.
For this reason, the second and most common way is to place the database into OFFLINE mode for a short period. This will drop all connections and terminate any queries currently processing against the database, which can then immediately be switched back to ONLINE mode, for the restore process to begin. Just be sure not to kill any connections that are processing important data. Even in development, we need to let users know before we just go wiping out currently running queries.
Here, we'll be employing the second technique and so, in Listing 4-2, you'll see that we set the database to OFFLINE mode and use the option WITH ROLLBACK IMMEDIATE, which instructs SQL Server to roll those processes back immediately, without waiting for them to COMMIT. Alternatively, we could have specified WITH ROLLBACK AFTER XX SECONDS, where XX is the number of seconds SQL Server will wait before it will automatically start rollback procedures. We can then return the database to ONLINE mode, free of connections and ready to start the restore process.
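USE [master]
GO
-- Take the database offline, rolling back any open transactions immediately
ALTER DATABASE [DatabaseForFullBackups]
SET OFFLINE WITH ROLLBACK IMMEDIATE
GO
-- Bring it straight back online, now free of connections
ALTER DATABASE [DatabaseForFullBackups]
SET ONLINE
GO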
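For comparison, a gentler variant of the same script, using the ROLLBACK AFTER option, might look like this. The 30-second grace period is an arbitrary value chosen for illustration.

```sql
USE [master]
GO
-- Give running transactions 30 seconds to finish before rolling them back
ALTER DATABASE [DatabaseForFullBackups]
SET OFFLINE WITH ROLLBACK AFTER 30 SECONDS
GO
ALTER DATABASE [DatabaseForFullBackups]
SET ONLINE
GO
```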
Go ahead and give this a try. Open two query windows in SSMS; in one of them, start the long-running query shown in Listing 4-3; then, in the second window, run Listing 4-2.
USE [DatabaseForFullBackups]
GO
WAITFOR DELAY '00:10:00'
GO
You'll see that the session with the long-running query was terminated, reporting a severe error and advising that any results returned be discarded. In the absence of a third-party tool, which will automatically take care of existing sessions before performing a restore, this is a handy script. You can include it in any backup scripts you use or, perhaps, convert it into a stored procedure, which is always a good idea for reusable code. Now that we no longer have to worry about pesky user connections interfering with our restore process, we can go ahead and run the T-SQL RESTORE command in Listing 4-4.
USE [master]
GO
RESTORE DATABASE [DatabaseForFullBackups]
FROM DISK = N'C:\SQLBackups\Chapter3\DatabaseForFullBackups_Full_Native_2.bak'
WITH FILE = 1, STATS = 25
GO
The RESTORE DATABASE command denotes that we wish to restore a full backup file for the DatabaseForFullBackups database. The next portion of the script configures the name and location of the backup file to be restored. If you chose a different name or location for this file, you'll need to amend this line accordingly.
Finally, we specify a number of WITH options. The FILE argument identifies the backup set to be restored, within our backup file. As discussed in Chapter 2, backup files can hold more than one backup set, in which case we need to explicitly identify the number of the backup set within the file. Our policy in this book is "one backup set per file," so we'll always set FILE to a value of 1. The STATS argument is also one we've seen before, and specifies the percentage intervals at which SQL Server should update the restore progress messages. Here, we specify a message at 25% completion intervals.
Notice that even though we are overwriting an existing database without starting with a tail log backup, we do not specify the REPLACE option here, since DatabaseForFullBackups is a SIMPLE recovery model database, so the tail log backup is not possible. SQL Server will still overwrite any existing database on the server called DatabaseForFullBackups, using the same logical file names for the data and log files that are recorded within the backup file. In such cases, we don't need to specify any of the file names or paths for the data or log files.
Note, though, that this only works if the file structure is the same! The backup file contains the data and log file information, including the location to which to restore the data and log files so, if we are restoring a database to a different machine from the original, and the drive letters, for instance, don't match up, we will need to use the WITH MOVE argument to point SQL Server to a new location for the data and log files. This will also be a necessity if we need to restore the database on the same server with a different name. Of course, SQL Server won't be able to overwrite any data or log files if they are still in use by the original database. We'll cover this topic in more detail later in this chapter, and again in Chapter 6.
Go ahead and run the RESTORE command. Once it is done executing, do not close the query session, as the query output contains some metrics that we want to record. We can verify that the restore process worked, at this point, by simply opening a new query window and executing the code from Listing 4-1. This time the first query should return a million rows containing the same message, and the second query should also return a million rows containing the same, slightly different, message.
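If you are unsure which logical file names and physical paths are recorded in a backup file, RESTORE FILELISTONLY will report them without restoring anything. A minimal sketch, assuming the same backup file and location as used in Listing 4-4:

```sql
-- Show the logical names and original physical paths stored in the backup file
RESTORE FILELISTONLY
FROM DISK = N'C:\SQLBackups\Chapter3\DatabaseForFullBackups_Full_Native_2.bak'
WITH FILE = 1
GO
```

The LogicalName and PhysicalName columns in the output are the values we would need when deciding whether a WITH MOVE clause is required.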
Figure 4-7:
Before we move on, you may be wondering whether any special options or commands are necessary if restoring a native SQL Server backup file that is compressed. The answer is "No;" it is exactly the same process as restoring a normal backup file.
Listing 4-5
Assuming you have not deleted the DatabaseForFullBackups database, attempting to run Listing 4-5 will result in the following error message (truncated for brevity; basically the same messages are repeated for the log file):
Msg 1834, Level 16, State 1, Line 2
The file 'C:\SQLData\DatabaseForFullBackups.mdf' cannot be overwritten. It is being used by database 'DatabaseForFullBackups'.
Msg 3156, Level 16, State 4, Line 2
File 'DatabaseForFullBackups' cannot be restored to 'C:\SQLData\DatabaseForFullBackups.mdf'. Use WITH MOVE to identify a valid location for the file.
The problem we have here, and even the solution, is clearly stated by the error messages. In Listing 4-5, SQL Server attempts to use, for the DatabaseForFullBackups2 database being restored, the same file names and paths for the data and log files as are being used for the existing DatabaseForFullBackups database, which was the source of the backup file. In other words, it's trying to create data and log files for the DatabaseForFullBackups2 database, by overwriting data and log files that are being used by the DatabaseForFullBackups database. We obviously can't do that without causing the DatabaseForFullBackups database to fail. We will have to either drop the first database to free those file names or, more likely, and as the second part of the error message suggests, identify a valid location for the log and data files for the new database, using WITH MOVE, as shown in Listing 4-6.
Listing 4-6: A RESTORE command that renames the data and log files for the new database.
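As an illustration only, a RESTORE command of this kind might take the following shape. The choice of backup file, the log file's logical name (DatabaseForFullBackups_log) and the new physical file names are assumptions on our part; RESTORE FILELISTONLY will report the actual logical names held in your backup file.

```sql
USE [master]
GO
RESTORE DATABASE [DatabaseForFullBackups2]
FROM DISK = N'C:\SQLBackups\Chapter3\DatabaseForFullBackups_Full_Native_2.bak'
WITH FILE = 1, STATS = 25,
     -- Logical names on the left (assumed), new physical files on the right
     MOVE N'DatabaseForFullBackups' TO N'C:\SQLData\DatabaseForFullBackups2.mdf',
     MOVE N'DatabaseForFullBackups_log' TO N'C:\SQLData\DatabaseForFullBackups2_log.ldf'
GO
```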
We had two choices to fix the script; we could either rename the files and keep them in the same directory or keep the file names the same but put them in a different directory. It can get very confusing if we have a database with the same physical file name as another database, so renaming the files to match the database name seems like the best solution. Let's take a look at a somewhat subtler error. For this example, imagine that we wish to replace an existing copy of the DatabaseForFullBackups2 test database with a production backup of DatabaseForFullBackups. At the same time, we wish to move the data and log files for the DatabaseForFullBackups2 test database over to a new drive, with more space.
USE master
GO
RESTORE DATABASE [DatabaseForFullBackups2]
FROM DISK = 'C:\SQLBackups\DatabaseForFileBackups_Full_Native_1.bak'
WITH RECOVERY, REPLACE,
MOVE 'DatabaseForFileBackups' TO 'D:\SQLData\DatabaseForFileBackups2.mdf',
MOVE 'DatabaseForFileBackups_log' TO 'D:\SQLData\DatabaseForFileBackups2_log.ldf'
GO
In fact, no error message at all will result from running this code; it will succeed. Nevertheless, a serious mistake has occurred here: we have inadvertently chosen a backup file for the wrong database, DatabaseForFileBackups instead of DatabaseForFullBackups, and used it to overwrite our existing DatabaseForFullBackups2 database! This highlights the potential issue with misuse of the REPLACE option. We can presume a DBA has used it here because the existing database is being replaced, without performing a tail log backup (see Chapter 6 for more details). However, there are two problems with this, in this case. Firstly, DatabaseForFullBackups2 is a SIMPLE recovery model database and so REPLACE is not required from the point of view of bypassing a tail log backup, since log backups are not possible. Secondly, use of REPLACE has bypassed the normal safety check that SQL Server would perform to ensure the database in the backup matches the database over which we are restoring. If we had run the exact same code as shown in Listing 4-7, but without the REPLACE option, we'd have received the following, very useful error message:
Msg 3154, Level 16, State 4, Line 1
The backup set holds a backup of a database other than the existing 'DatabaseForFullBackups2' database.
Msg 3013, Level 16, State 1, Line 1
RESTORE DATABASE is terminating abnormally.
Note that we don't have any further use for the DatabaseForFullBackups2 database, so once you've completed the example, you can go ahead and delete it.
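A simple habit that can help avoid this sort of mistake, when REPLACE really is needed, is to inspect the backup file first. The sketch below uses RESTORE HEADERONLY against the backup file from the example above; the idea of checking it before every REPLACE is our suggestion rather than a required step.

```sql
-- Inspect the backup sets in the file before restoring over an existing database
RESTORE HEADERONLY
FROM DISK = N'C:\SQLBackups\DatabaseForFileBackups_Full_Native_1.bak'
GO
```

The DatabaseName column of the output shows which database each backup set was taken from, which makes a mismatch easy to spot.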
• Version/edition of SQL Server used in the source and destination: you may receive a request to restore a SQL Server 2008 R2 database backup to a SQL Server 2005 server, which is not a possibility. Likewise, it is not possible to restore a backup of a database that is using enterprise-only options (CDC, transparent data encryption, data compression, partitioning) to a SQL Server Standard Edition instance.
• What SQL Server agent jobs or DTS/DTSX packages might be affected? If you are moving the database permanently to a new server, you need to find which jobs and packages that use this database will be affected and adjust them accordingly. Also, depending on how you configure your database maintenance jobs, you may need to add the new database to the list of databases to be maintained.
• What orphaned users will need to be fixed? What permissions should be removed? There may be SQL logins with differing SIDs that we need to fix. There may be SQL logins and Active Directory users that don't need access to the new server. You need to be sure to comb the permissions and security of the new location before signing off the restore as complete.
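The version and edition questions can be checked up front with a couple of queries. This is a minimal sketch of one way to do so; run the first query on both the source and destination instances, and the second in the source database.

```sql
-- Run on both instances and compare the results
SELECT SERVERPROPERTY('ProductVersion') AS product_version,
       SERVERPROPERTY('Edition') AS edition
GO
-- Run in the source database: lists edition-restricted features in use
SELECT feature_name
FROM sys.dm_db_persisted_sku_features
GO
```

If the second query returns no rows, the database is not using any features that would tie it to a particular edition.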
In this section, we'll look at how to perform a restore of both the master and the msdb system databases, so the first thing we need to do is make sure we have valid backups of these databases, as shown in Listing 4-8.
USE [master]
GO
BACKUP DATABASE [master]
TO DISK = N'C:\SQLBackups\Chapter4\master_full.bak'
WITH INIT
GO
BACKUP DATABASE [msdb]
TO DISK = N'C:\SQLBackups\Chapter4\msdb_full.bak'
WITH INIT
GO
BACKUP DATABASE [model]
TO DISK = N'C:\SQLBackups\Chapter4\model_full.bak'
WITH INIT
GO
We'll choose the latter option, since we can use the services snap-in tool to view and control all of the services running on our test machine. To start up this tool, simply pull up the Run prompt and type in services.msc while connected locally or through RDP to the test SQL Server machine. This will bring up the services snap-in within the Microsoft Management Console (MMC). Scroll down until you locate any services labeled SQL Server Agent (instance); the instance portion will either contain the unique instance name, or contain MSSQLSERVER, if it is the default instance. Highlight the agent service, right-click and select Stop from the control menu to bring the SQL Server Agent to a halt, as shown in Figure 4-8.
Figure 4-8:
With the service stopped (the status column should now be blank), the agent is offline and we can proceed with the full database backup restore, as shown in Listing 4-9.
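A restore of msdb along these lines might look like the following. This is a sketch only, assuming the backup file created earlier in C:\SQLBackups\Chapter4; the WITH options are our choice and may differ from what you need on your system.

```sql
USE [master]
GO
-- Restore msdb from the full backup taken earlier (SQL Server Agent must be stopped)
RESTORE DATABASE [msdb]
FROM DISK = N'C:\SQLBackups\Chapter4\msdb_full.bak'
WITH FILE = 1, STATS = 25
GO
```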
With the restore complete, restart the SQL Server Agent service from the services MMC snap-in tool and you'll find that all jobs, schedules, operators, and everything else stored in the msdb database, are all back and ready for use. This is a very simple task, with the only small change being that we need to shut down a service before performing the restore. Don't close the services tool yet, though, as we will need it to restore the master database.
To start SQL Server in single-user mode, open a command prompt and browse to the SQL Server installation folder, which contains the sqlservr.exe file. Here are the default locations for both SQL Server 2008 and 2008 R2:
<Installation Path>\MSSQL10.MSSQLSERVER\MSSQL\Binn
<Installation Path>\MSSQL10_50.MSSQLSERVER\MSSQL\Binn
From that location, issue the command sqlservr.exe -m. SQL Server will begin the startup process, and you'll see a number of messages to this effect, culminating (hopefully) in a Recovery is complete message, as shown in Figure 4-9.
Figure 4-9: Recovery is complete and SQL Server is ready for admin connection.
Once SQL Server is ready for a connection, open a second command prompt and connect to your test SQL Server with sqlcmd. Two examples of how to do this are given below, the first when using a trusted connection and the second for a SQL Login authenticated connection.
sqlcmd -SYOURSERVER -E
sqlcmd -SYOURSERVER -UloginName -Ppassword
At the sqlcmd prompt, we'll perform a standard restore to the default location for the master database, as shown in Listing 4-10 (if required, we could have used the MOVE option to change the master database location or physical file).
RESTORE DATABASE [master]
FROM DISK = 'C:\SQLBackups\Chapter4\master_full.bak'
GO
In the first sqlcmd prompt, you should see a standard restore output message noting the number of pages processed, notification of the success of the operation, and a message stating that SQL Server is being shut down, as shown in Figure 4-10.
Since we just restored the master database, we need the server to start normally to pick up and process all of the internal changes, so we can now start the SQL Server in normal mode to verify that everything is back online and working fine. You have now successfully restored the master database!
Summary
Full database backups are the cornerstone of a DBA's backup and recovery strategy. However, these backups are only useful if they can be used successfully to restore a database to the required state in the event of data loss, hardware failure, or some other disaster. Hopefully, as a DBA, the need to restore a database to recover from disaster will be a rare event, but when it happens, you need to be 100% sure that it's going to work; your organization, and your career as a DBA, may depend on it. Practice test restores for your critical databases on a regular schedule! Of course, many restore processes won't be as simple as restoring the latest full backup. Log backups will likely be involved, for restoring a database to a specific point in time, and this is where things get more interesting.
Figure 5-1:
The concept of the active log is an important one. A VLF can either be "active," if it contains any part of what is termed the active log, or "inactive," if it doesn't. Any log record relating to an open transaction is required for possible rollback and so must be part of the active log. In addition, there are various other activities in the database, including replication, mirroring and CDC (Change Data Capture) that use the transaction log and need transaction log records to remain in the log until they have been processed. These records will also be part of the active log. The log record with the MinLSN, shown in Figure 5-1, is defined as the "oldest log record that is required for a successful database-wide rollback or by another activity or operation in the database." This record marks the start of the active log and is sometimes referred to as the "head" of the log. Any more recent log record, regardless of whether it is still open or required, is also part of the active log; this is an important point as it explains why it's a misconception to think of the active portion of the log as containing only records relating to uncommitted transactions. The log record with the highest LSN (i.e. the most recent record added) marks the end of the active log.
Chapter 5: Log Backups
Therefore, we can see that a log record is no longer part of the active log only when each of the following three conditions is met.
1. It relates to a transaction that is committed and so is no longer required for rollback.
2. It is no longer required by any other database process, including a transaction log backup when using FULL or BULK_LOGGED recovery models.
3. It is older (i.e. has a lower LSN) than the MinLSN record.
Any VLF that contains any part of the active log is considered active and can never be truncated. For example, VLF3, in Figure 5-1, is an active VLF, even though most of the log records it contains are not part of the active log; it cannot be truncated until the head of the log moves forward into VLF4. The operations that will cause the head of the log to move forward vary depending on the recovery model of the database. For databases in the SIMPLE recovery model, the head of the log can move forward upon CHECKPOINT, when pages are flushed from cache to disk, after first being written to the transaction log. As a result of this operation, many log records would now satisfy the first requirement listed above, for no longer being part of the active log. We can imagine that if, as a result, the MinLSN record in Figure 5-1, and all subsequent records in VLF3, satisfied both the first and second criteria, then the head would move forward and VLF3 could now be truncated. Therefore, generally, space inside the log is made available for reuse at regular intervals.
Truncation does not reduce the size of the log file
It's worth reiterating that truncation does not affect the physical size of the log; it will still take up the same physical space on the drive. Truncation is merely the act of marking VLFs in the log file as available for reuse, in the recording of subsequent transactions.
For databases using FULL or BULK_LOGGED recovery, the head can only move forward as a result of a log backup. Any log record that has not been previously backed up is considered to be still "required" by a log backup operation, and so will never satisfy the second requirement above, and will remain part of the active log. If we imagine that the MinLSN record in Figure 5-1 is the first record added to the log after the previous log backup, then the head will remain in that position until the next log backup, at which point it can move forward (assuming the first requirement is also satisfied). I've stressed this many times, but I'll say it once more for good measure: this is the other reason, in addition to enabling point-in-time restore, why it's so important to back up the log for any database operating in FULL (or BULK_LOGGED) recovery; if you don't, the head of the log is essentially "pinned," space will not be reused, and the log will simply grow and grow in size.
The final question to consider is what happens when the active log reaches the end of VLF8. Simplistically, it is easiest to think of space in the log file as being reused in a circular fashion. Once the logical end of the log reaches the end of a VLF, SQL Server will start to reuse the next sequential VLF that is inactive, or the next, so far unused, VLF. In Figure 5-1, this could be VLF8, followed by VLFs 1 and 2, and so on. If no further VLFs were available at all, the log would need to auto-grow and add more VLFs. If this is not possible, due to auto-growth being disabled or the disk housing the log file being full, then the logical end of the active log will meet the physical end of the log file, the transaction log is full, and the 9002 error will be issued.
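When a log does keep growing, SQL Server will tell us what is preventing space from being reused. The following is a minimal sketch, using the DatabaseForLogBackups example database from this chapter; substitute the name of whichever database you are investigating.

```sql
-- What, if anything, is currently preventing log truncation?
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'DatabaseForLogBackups'
GO
-- Log file size and percentage in use, for every database on the instance
DBCC SQLPERF (LOGSPACE)
GO
```

A log_reuse_wait_desc value of LOG_BACKUP indicates that the log is waiting on a log backup, which is the "pinned" situation described above.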
However, the log backups, and subsequent restores, can also be very useful in reducing the time required for database migrations, and for offloading reporting from the Production environment, via log shipping.
The frequency with which log backups are taken will depend on the tolerable exposure to data loss, as expressed in your Backup and Restore SLA (discussed shortly).
At the point the migration window opens, we can disconnect all users from the original database, take a final log backup, transfer that final file to the target server, and apply it to the restoring database, specifying WITH RECOVERY so that the new database is recovered, and comes online in the same state it was in when you disconnected users from the original. We still need to bear in mind potential complicating factors related to moving databases to different locations, as discussed in Chapter 4. Orphaned users, elevated permissions and connectivity issues would still need to be addressed after the final log was applied to the new database location.
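The final step of such a migration can be sketched as follows. The database name MigratingDB and the backup file path are hypothetical, and the sketch assumes the target copy was restored WITH NORECOVERY and has already had all earlier log backups applied.

```sql
-- On the source server, after disconnecting users: take the final log backup
BACKUP LOG [MigratingDB]
TO DISK = N'C:\SQLBackups\MigratingDB_Log_Final.trn'
GO
-- On the target server, after copying the file across: apply it and recover
RESTORE LOG [MigratingDB]
FROM DISK = N'C:\SQLBackups\MigratingDB_Log_Final.trn'
WITH RECOVERY
GO
```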
Log shipping
Almost every DBA has to make provision for business reporting. Often, the reports produced have to be as close to real time as possible, i.e. they must reflect as closely as possible the data as it currently exists in the production databases. However, running reports on a production machine is never a best practice, and the use of High Availability solutions (real-time replication, CDC solutions, log reading solutions, and so on) to get that data to a reporting instance can be expensive and time consuming. Log shipping is an easy and cheap way to get near real-time data to a reporting server. The essence of log shipping is to restore a full database to the reporting server using the WITH STANDBY option, then regularly ship log backup files from the production to the reporting server and apply them to the standby database to update its contents. The STANDBY option will keep the database in a state where more log files can be applied, but will put the database in a read-only state, so that it always reflects the data in the source database at the point when the last applied log backup was taken.
This means that the reporting database will generally lag behind the production database by 15-30 minutes or more. This sort of lag is usually not a big problem and, in many cases, log shipping is an easy way to satisfy, not only the production users, but the reporting users as well.
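To make the STANDBY idea a little more concrete, here is a minimal sketch of the restores involved on the reporting server. All database names, file names and paths are hypothetical, and a real setup on a different server would probably also need WITH MOVE to relocate the data and log files.

```sql
-- Initial restore of the full backup, leaving the database read-only
-- but still able to accept further log restores
RESTORE DATABASE [ReportingCopy]
FROM DISK = N'C:\SQLBackups\ProdDB_Full.bak'
WITH STANDBY = N'C:\SQLBackups\ReportingCopy_undo.dat'
GO
-- Repeated for each shipped log backup, in sequence
RESTORE LOG [ReportingCopy]
FROM DISK = N'C:\SQLBackups\ProdDB_Log_1.trn'
WITH STANDBY = N'C:\SQLBackups\ReportingCopy_undo.dat'
GO
```

Note that each log restore needs exclusive access to the database, so reporting users have to be disconnected while a log backup is applied.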
Practical log shipping
It is out of scope to get into the full details of log shipping here, but the following article offers a practical guide to the process: www.simple-talk.com/sql/backup-and-recovery/pop-rivetts-sql-server-faq-no.4-pop-does-log-shipping/.
If it's a database that's rarely, if ever, modified by end-users, but is subject to daily, well-defined data loads, then it's also unlikely that we'll need to perform log backups, so we can operate the database in SIMPLE recovery model. We can take a full database backup after each data load, or simply take a nightly full backup and then, if necessary, restore it, then replay any data load processes that occurred subsequently.
If a database is modified frequently by ad hoc end-user processes, and toleration of data loss is low, then it's very likely that transaction log backups will be required. Again, talk with the project team and find out the acceptable level of data loss. You will find that, in most cases, taking log backups once per hour will be sufficient, meaning that the database could lose up to 60 minutes of transactional data. For some databases, an exposure to the risk of data loss of more than 30 or 15 minutes might be unacceptable. The only difference here is that we will have to take, store, and manage many more log backups, and more backup files means more chance of something going wrong; losing a file or having a backup file become corrupted. Refer back to the Backup scheduling section of Chapter 2 for considerations when attempting to schedule all the required backup jobs for a given SQL Server instance.
Whichever route is the best for you, the most important thing is that you are taking transaction log backups for databases that require them, and only for those that require them.
Before we get started taking log backups, however, we need to do a bit of prep work, namely choosing an appropriate recovery model for our example database, and then creating that database along with some populated sample tables.
However, use of BULK_LOGGED has implications that make it unsuitable for long-term use in a database where point-in-time restore is required, since it is not possible to restore a database to a point in time within a log file that contains minimally logged operations. We'll discuss this in more detail in the next chapter, along with the best approach to minimizing risk when a database does need to be temporarily switched to BULK_LOGGED model. For now, however, we're going to choose the FULL recovery model for our database.
Listing 5-1:
This script will create for us a new DatabaseForLogBackups database, with the data and log files for this database stored in the C:\SQLData directory. Note that, if we didn't specify the FILENAME option, then the files would be auto-named and placed in the default directory for that version of SQL Server (for example, in SQL Server 2008, this is \Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\DATA).
We have assigned some appropriate values for the initial sizes of these files and their file growth characteristics. As discussed in Chapter 3, it's generally not appropriate to accept the default file size and growth settings for a database, and later in this chapter we'll take a deeper look at the specific problem that can arise with a log file that grows frequently in small increments. For each database, we should size the data and log files as appropriate for their current data requirements, plus predicted growth over a set period.
In the case of our simple example database, I know exactly how much data I plan to load into our tables (you'll find out shortly!), so I have chosen an initial data file size that makes sense for that purpose, 500 MB. We don't want the file to grow too much but if it does, we want it to grow in reasonably sized chunks, so I've chosen a growth step of 50 MB. Each time the data file needs to grow, it will grow by 50 MB, which provides enough space for extra data, but not so much that we will have a crazy amount of free space after each growth. For the log file, I've chosen an initial size of 50 MB, and I am allowing it to grow by an additional 50 MB whenever it needs more room to store transactions.
Immediately after creating the database, we run an ALTER DATABASE command to ensure that our database is set up to use our chosen recovery model, namely FULL. This is very important, especially if the model database on the SQL Server instance is set to a different recovery model, since all user databases will inherit that setting.
Now that we have a new database, set to use the FULL recovery model, we can go ahead and start creating and populating the tables we need for our log backup tests.
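The body of Listing 5-1 is not reproduced above, so the following is only a sketch reconstructed from its description: a 500 MB data file and a 50 MB log file in C:\SQLData, each growing in 50 MB steps, followed by the switch to FULL recovery. The logical file names match those used by the restore script in Chapter 6; the physical file names are assumptions.

```sql
USE [master]
GO
-- Create the database with explicit sizes and growth steps,
-- rather than inheriting the defaults from the model database.
CREATE DATABASE [DatabaseForLogBackups]
ON PRIMARY
(   NAME = N'DatabaseForLogBackups',
    FILENAME = N'C:\SQLData\DatabaseForLogBackups.mdf',      -- assumed file name
    SIZE = 500MB,
    FILEGROWTH = 50MB )
LOG ON
(   NAME = N'DatabaseForLogBackups_log',
    FILENAME = N'C:\SQLData\DatabaseForLogBackups_log.ldf',  -- assumed file name
    SIZE = 50MB,
    FILEGROWTH = 50MB )
GO
-- Make the recovery model explicit, whatever the model database dictates.
ALTER DATABASE [DatabaseForLogBackups] SET RECOVERY FULL
GO
```

The C:\SQLData folder must already exist, and the SQL Server service account needs write access to it.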
Listing 5-2 creates three simple message tables, each of which will store simple text messages and a time stamp so that we can see exactly when each message was inserted into the table.
Having created our three tables, we can now pump a bit of data into them. We'll use the same technique as in Chapter 3, i.e. a series of INSERT commands, each with the GO X batch separator, to insert ten rows into each of the three tables, as shown in Listing 5-3.
USE [DatabaseForLogBackups]
GO
INSERT INTO dbo.MessageTable1
VALUES ('This is the initial data for MessageTable1', GETDATE())
GO 10
INSERT INTO dbo.MessageTable2
VALUES ('This is the initial data for MessageTable2', GETDATE())
GO 10
INSERT INTO dbo.MessageTable3
VALUES ('This is the initial data for MessageTable3', GETDATE())
GO 10
Listing 5-3:
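The inserts in Listing 5-3 assume that the three message tables already exist. The table definitions themselves are not reproduced above, so the following is a hypothetical sketch only: the column names and data types are assumptions, chosen so that a two-value INSERT without a column list succeeds.

```sql
USE [DatabaseForLogBackups]
GO
-- Column names and types are assumptions, not the book's definitions.
CREATE TABLE dbo.MessageTable1
(   Message       NVARCHAR(100) NOT NULL,
    DateTimeStamp DATETIME2     NOT NULL )
GO
CREATE TABLE dbo.MessageTable2
(   Message       NVARCHAR(100) NOT NULL,
    DateTimeStamp DATETIME2     NOT NULL )
GO
CREATE TABLE dbo.MessageTable3
(   Message       NVARCHAR(100) NOT NULL,
    DateTimeStamp DATETIME2     NOT NULL )
GO
```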
We'll be performing a subsequent data load shortly, but for now we have a good base of data from which to work, and we have a very important step to perform before we can even think about taking log backups. Even though we set the recovery model to FULL, we won't be able to take log backups (in other words, the database is still effectively in auto-truncate mode) until we've first performed a full backup. Before we do that, however, let's take a quick look at the current size of our log file, and its space utilization, using the DBCC SQLPERF (LOGSPACE); command. This will return results for all databases on the instance, but for the DatabaseForLogBackups database we should see results similar to those shown in Figure 5-2.
Log Size    Space Used
50 MB       0.65 %
This shows us that we have a 50 MB log file and that it is only using a little more than one half of one percent of that space.
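Because DBCC SQLPERF (LOGSPACE) returns a row for every database on the instance, it can be convenient to capture its output and filter it. The following is a minimal sketch; the temporary table and its column names are my own, chosen to match the four columns the command returns.

```sql
-- Capture the DBCC SQLPERF(LOGSPACE) output so it can be filtered.
CREATE TABLE #LogSpace
(   DatabaseName    SYSNAME,
    LogSizeMB       FLOAT,
    LogSpaceUsedPct FLOAT,
    Status          INT )

INSERT INTO #LogSpace
EXEC ('DBCC SQLPERF(LOGSPACE)')

-- Show only the database we are interested in.
SELECT  DatabaseName, LogSizeMB, LogSpaceUsedPct
FROM    #LogSpace
WHERE   DatabaseName = N'DatabaseForLogBackups'

DROP TABLE #LogSpace
```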
So, at this stage, we've captured a full backup of our new database, containing three tables, each with ten rows of data. We're ready to start taking log backups now, but let's run the DBCC SQLPERF(LOGSPACE) command again, and see what happened to our log space.
Backup Stage         Log Size    Space Used
Initial stats        50 MB       0.65 %
After full backup    50 MB       0.73 %
What's actually happened here isn't immediately apparent from these figures, so it needs a little explanation. We discussed earlier how, for a FULL recovery model database, only a log backup can free up log space for reuse. This is true, but the point to remember is that such a database is actually operating in auto-truncate mode until the first full backup of that database completes. The log is truncated as a result of this first-ever full backup and, from that point on, the database is truly in FULL recovery, and a full backup will never cause log truncation. So, hidden in our figures is the fact that the log was truncated as a result of our first full backup, and any space taken up by the rows we added was made available for reuse. Some space in the log would have been required to record the fact that a full backup took place, but overall the space used shows very little change.
Later in the chapter, when taking log backups with T-SQL, we'll track what happens to these log space statistics as we load a large amount of data into our tables, and then take a subsequent log backup.
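Whether a FULL recovery model database is still operating in auto-truncate mode can be checked from the catalog views. The query below is a sketch of one commonly used check: it relies on the last_log_backup_lsn column of sys.database_recovery_status being NULL until a log chain has been established by a full backup.

```sql
-- A NULL last_log_backup_lsn for a FULL (or BULK_LOGGED) database suggests
-- that no log chain exists yet, so the log is still being auto-truncated.
SELECT  d.name,
        d.recovery_model_desc,
        rs.last_log_backup_lsn
FROM    sys.databases AS d
        JOIN sys.database_recovery_status AS rs
          ON rs.database_id = d.database_id
WHERE   d.name = N'DatabaseForLogBackups'
```

Running it before and after the first full backup should show the column change from NULL to an LSN value.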
Figure 5-4:
Notice that we've selected the DatabaseForLogBackups database and, this time, we've changed the Backup type option to Transaction Log, since we want to take log backups. Once again, we leave the Copy-only backup option unchecked.
COPY_ONLY backups of the log file
When the COPY_ONLY option is used for a transaction log backup, the log archiving point is not affected, so the backup does not disturb the rest of our sequential log backup files. The transactions contained within the log file are backed up, but the log file is not truncated. This means that the special COPY_ONLY log backup can be used independently of conventional, scheduled log backups, and would not be needed when performing a restore that included the time span where this log backup was taken.
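In T-SQL, the equivalent of ticking the Copy-only backup box is the COPY_ONLY option of the BACKUP LOG command. The following is a minimal sketch; the file name and backup set name are hypothetical.

```sql
USE [master]
GO
-- Copy-only log backup: the log is backed up but not truncated,
-- and the regular log backup sequence is left undisturbed.
BACKUP LOG [DatabaseForLogBackups]
TO DISK = N'C:\SQLBackups\Chapter5\DatabaseForLogBackups_CopyOnly_Log.trn'  -- hypothetical file name
WITH COPY_ONLY,
     NAME = N'DatabaseForLogBackups-Copy-Only Log Backup',
     STATS = 10
GO
```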
The Backup component portion of the configuration is used to specify full database backup versus file or filegroup backup; we're backing up the transaction log here, not the data files, so these options are disabled.
The second set of configurable options for our log backup, shown in Figure 5-5, is titled Backup set and deals with descriptions and expiration dates. The name field is auto-filled and the description will be left blank. The backup set expiration date can also be left with the default values. As discussed in Chapter 3, since we are only storing one log backup set per file, we do not need to worry about making sure backup sets expire at a given time. Storing multiple log backups in one file would reduce the number of backup files that we need to manage, but it would cause that single file to grow considerably larger with each backup stored in it. We would also run the risk of losing more than just one backup if the file were to become corrupted or was lost.
Figure 5-5:
The final section of the configuration options is titled Destination, where we specify where to store the log backup file and what it will be called. If there is a file already selected for use, click the Remove button because we want to choose a fresh file and location. Now, click the Add button to bring up the backup destination selection window. Click the browse (...) button and navigate to our chosen backup file location (C:\SQLBackups\Chapter5) and enter the file name DatabaseForLogBackups_Native_Log_1.trn at the bottom, as shown in Figure 5-6.
Figure 5-6: Selecting the path and filename for your log backup file.
Note that, while a full database backup file is identified conventionally by the .BAK extension, most DBAs identify log backup files with the .TRN extension. You can use whatever extension you like, but this is the standard extension for native log backups and it makes things much less confusing if everyone sticks to a familiar convention. When you're done, click the OK buttons on both the Locate Database Files and Select Backup Destination windows to bring you back to the main backup configuration window. Once here, select the Options menu on the upper left-hand side of the window to bring up the second page of backup options. We are going to focus here on just the Transaction log section of this Options page, shown in Figure 5-7, as all the other options on this page were covered in detail in Chapter 3.
Figure 5-7:
The Transaction log section offers two important configuration options. For all routine, daily log backups, the default option of Truncate the transaction log is the one you want; on completion of the log backup the log file will be truncated, if possible (i.e. space inside the file will be made available for reuse).
The second option, Back up the tail of the log, is used exclusively in response to a disaster scenario; you've lost data, or the database has become corrupt in some way, and you need to restore it to a previous point in time. Your last action before attempting a RESTORE operation should be to back up the tail of the log, i.e. capture a backup of the remaining contents (those records added since the last log backup) of the live transaction log file, assuming it is still available. This will put the database into a restoring state, and assumes that the next action you wish to perform is a RESTORE. This is a vital option in database recovery, and we'll demonstrate it in Chapter 6, but it won't be a regular maintenance task.
Having reviewed all of the configuration options, go ahead and click OK, and the log backup will begin. A progress meter will appear in the lower left-hand side of the window but, since we don't have many transactions to back up, the operation will probably complete very quickly and you'll see a notification that your backup has completed and was successful. If, instead, you receive an error message, you'll need to check your configuration settings and attempt to find out what went wrong.
We have successfully taken a transaction log backup! As discussed in Chapter 3, various metrics are available regarding these log backups, such as the time it took to complete, and the size of the log backup file. In my test, the backup time was negligible (measured in milliseconds). However, for busy databases, handling hundreds or thousands of transactions per second, the speed of these log backups can be a very important consideration and, as was demonstrated for full backups in Chapter 3, use of backup compression can be beneficial. In Chapter 8, we'll compare the backup speeds and compression ratios obtained for native log backups, versus log backups with native compression, versus log backups with SQL Backup compression.
Right now, though, we'll just look at the log backup size. Browse to the C:\SQLBackups\Chapter5\ folder, or wherever it is that you stored the log backup file, and simply note the size of the file. In my case, it was 85 KB.
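As an alternative to browsing the file system, the size and duration of each log backup can be read from the backup history tables in msdb. The query below is a sketch, assuming the history has not been purged; durations are reported in whole seconds, so very fast backups may show as zero.

```sql
-- Recent log backups for the example database, newest first.
SELECT TOP (10)
        b.backup_start_date,
        b.backup_finish_date,
        DATEDIFF(SECOND, b.backup_start_date, b.backup_finish_date) AS duration_sec,
        CAST(b.backup_size / 1024.0 AS DECIMAL(18, 1))              AS backup_size_kb,
        CAST(b.compressed_backup_size / 1024.0 AS DECIMAL(18, 1))   AS compressed_size_kb,
        mf.physical_device_name
FROM    msdb.dbo.backupset AS b
        JOIN msdb.dbo.backupmediafamily AS mf
          ON mf.media_set_id = b.media_set_id
WHERE   b.database_name = N'DatabaseForLogBackups'
        AND b.type = 'L'            -- 'L' = transaction log backup
ORDER BY b.backup_finish_date DESC
```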
Listing 5-6: Third data insert for DatabaseForLogBackups (with 2-minute delay).
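The body of Listing 5-6 is not reproduced above. The sketch below is a reconstruction that is merely consistent with the row counts quoted in the next chapter (1,000 extra rows in MessageTable1, 10,000 in MessageTable2 and, after the delay, 100,000 in MessageTable3); the message text and the exact form of the delay are assumptions.

```sql
USE [DatabaseForLogBackups]
GO
INSERT INTO dbo.MessageTable1
VALUES ('Third data load for MessageTable1', GETDATE())  -- message text is an assumption
GO 1000
INSERT INTO dbo.MessageTable2
VALUES ('Third data load for MessageTable2', GETDATE())
GO 10000
-- Note the current time, then pause for two minutes so that there is a
-- clear point in time between the second and third sets of inserts.
SELECT GETDATE() AS TimeBeforeDelay
WAITFOR DELAY '00:02:00'
GO
INSERT INTO dbo.MessageTable3
VALUES ('Third data load for MessageTable3', GETDATE())
GO 100000
```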
Go ahead and run this script in SSMS (it will take several minutes). Before we perform a T-SQL log backup, let's check once again on the size of the log file and space usage for our DatabaseForLogBackups database.
Backup Stage             Log Size    Space Used
Initial stats            50 MB       0.8 %
After third data load    100 MB      55.4 %
Figure 5-8:
We now see a more significant portion of our log being used. The log actually needed to grow above its initial size of 50 MB in order to accommodate this data load, so it jumped in size by 50 MB, which is the growth rate we set when we created the database. 55% of that space is currently in use. When we used the SSMS GUI to build our first transaction log backup, a T-SQL BACKUP LOG command was constructed and executed under the covers. Listing 5-7 simply creates directly in T-SQL a simplified version of the backup command that the GUI would have generated.
USE [master]
GO
BACKUP LOG [DatabaseForLogBackups]
TO DISK = N'C:\SQLBackups\Chapter5\DatabaseForLogBackups_Native_Log_2.trn'
WITH NAME = N'DatabaseForLogBackups-Transaction Log Backup', STATS = 10
GO
The major difference between this script, and the one we saw for a full database backup in Chapter 3, is that the BACKUP DATABASE command is replaced by the BACKUP LOG command, since we want SQL Server to back up the transaction log instead of the data files. We use the TO DISK parameter to specify the backup location and name of the file. We specify our central SQLBackups folder and, once again, apply the convention of using .TRN to identify our log backup. The WITH NAME option allows us to give the backup set an appropriate set name. This is informational only, especially since we are only taking one backup per file/set. Finally, we see the familiar STATS option, which dictates how often the console messages will be updated during the operation. With the value set to 10, a console message should display when the backup is (roughly) 10%, 20%, 30% complete, and so on, as shown in Figure 5-9, along with the rest of the statistics from the successful backup.
Figure 5-9:
Notice that this log backup took about 3.5 seconds to complete; still not long, but much longer than our first backup took. Also, Figure 5-10 shows that our second log backup file weighs in at 56 MB. This is even larger than our full backup, which is to be expected since we just pumped 3,700 times more records into the database than were there during the full backup. The log backup file size is also broadly consistent with what we would have predicted, given the stats we got from DBCC SQLPERF (LOGSPACE), which indicated a 100 MB log which was about 55% full.
Let's check the log utilization one last time with the DBCC SQLPERF (LOGSPACE) command.
Backup Stage              Log Size    Space Used
Initial stats             50 MB       0.8 %
After third data load     100 MB      55.4 %
After T-SQL log backup    100 MB      5.55 %
Figure 5-11: DBCC SQLPERF (LOGSPACE) output after big data load.
This is exactly the behavior we expect; as a result of the backup operation, the log has been truncated, and space in inactive VLFs made available for reuse, so only about 5% of the space is now in use. As discussed earlier, truncation does not affect the physical size of the log, so it remains at 100 MB.
We can see that the database was successfully created, the BACKUP DATABASE operation was fine, but then we hit a snag with the log backup operation:
Msg 4208, Level 16, State 1, Line 1
The statement BACKUP LOG is not allowed while the recovery model is SIMPLE. Use BACKUP DATABASE or change the recovery model using ALTER DATABASE.
The problem is very clear: we are trying to perform a log backup on a database that is operating in the SIMPLE recovery model, which is not allowed. The exact course of action, on seeing an error like this, depends on the reason why you were trying to perform the log backup. If you simply did not realize that log backups were not possible in this case, then lesson learned. If log backups are required in the SLA for this database, then the fact that the database is in SIMPLE recovery is a serious problem. First, you should switch it to FULL recovery model immediately, and take another full database backup, to restart the log chain. Second, you should find out when and why the database got switched to SIMPLE, and report what the implications are for point-in-time recovery over that period.
An interesting case where a DBA might see this error is upon spotting that a log file for a certain database is growing very large, and assuming that the cause is the lack of a log backup. Upon running the BACKUP LOG command, the DBA is surprised to see the database is in fact in SIMPLE recovery. So, why would the log file be growing so large? Isn't log space supposed to be reused after each CHECKPOINT, in this case? We'll discuss possible reasons why you still might see log growth problems, even for databases in SIMPLE recovery, in the next section.
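The corrective steps described above (confirm the recovery model, switch to FULL, and restart the log chain with a full backup) can be sketched as follows. The database name ForceFailure is the one used for this chapter's failure example; the backup file names are hypothetical.

```sql
USE [master]
GO
-- 1. Confirm the current recovery model.
SELECT  name, recovery_model_desc
FROM    sys.databases
WHERE   name = N'ForceFailure'
GO
-- 2. Switch to FULL recovery.
ALTER DATABASE [ForceFailure] SET RECOVERY FULL
GO
-- 3. Take a full backup to start a new log chain.
BACKUP DATABASE [ForceFailure]
TO DISK = N'C:\SQLBackups\Chapter5\ForceFailure_Full.bak'    -- hypothetical file name
WITH STATS = 10
GO
-- 4. Log backups are now permitted.
BACKUP LOG [ForceFailure]
TO DISK = N'C:\SQLBackups\Chapter5\ForceFailure_Log_1.trn'   -- hypothetical file name
WITH STATS = 10
GO
```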
As a log file grows and grows, a related problem is that the log can become "internally fragmented," if the growth occurs in lots of small increments. This can affect the performance of any operation that needs to read the transaction log (such as database restores).
If the value returned for the log_reuse_wait_desc column is LOG_BACKUP, then the reason for the log growth is the lack of a log backup. If this database requires log backups, start taking them, at a frequency that will satisfy the terms of the SLA, and control the growth of the log from here on. If the database doesn't require point-in-time recovery, switch the database to SIMPLE recovery.
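The log_reuse_wait_desc value is exposed by the sys.databases catalog view. A minimal sketch of the query, listing every database on the instance alongside its recovery model:

```sql
-- Why can't log space be reused? One row per database.
SELECT  name,
        recovery_model_desc,
        log_reuse_wait_desc
FROM    sys.databases
ORDER BY name
```

A value of NOTHING indicates that nothing is currently preventing log space reuse.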
In either case, if the log has grown unacceptably large (or even full) in the meantime, refer to the forthcoming section on Handling the 9002 Transaction Log Full error.
There are several other issues that can be revealed through the log_reuse_wait_desc column, mainly relating to various processes, such as replication, which require log records to remain in the log until they have been processed. We haven't got room to cover them here, but Gail Shaw offers a detailed description of these issues in her article, Why is my transaction log full? at www.sqlservercentral.com/articles/Transaction+Log/72488/.
Finally, before moving on, it's worth noting that there is quite a bit of bad advice out there regarding how to respond to this transaction log issue. The most frequent offenders are suggestions to force log truncation using BACKUP LOG WITH TRUNCATE_ONLY (deprecated in SQL Server 2005 and removed in SQL Server 2008) or its even more disturbing counterpart, BACKUP LOG TO DISK='NUL', which takes a log backup and discards the contents, without SQL Server having any knowledge that the log chain is broken. Don't use these techniques. The only correct way to force log truncation is to temporarily switch the database to SIMPLE recovery. Likewise, you should never schedule regular DBCC SHRINKFILE tasks as a means of controlling the size of the log as it can cause terrible log fragmentation, as discussed in the next section.
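Temporarily switching to SIMPLE recovery in order to force truncation can be sketched as follows. SampleDB is a placeholder database name and the backup path is hypothetical. Note that the switch breaks the log chain, so a full (or differential) backup is needed afterwards before log backups can resume.

```sql
USE [master]
GO
-- Switch to SIMPLE so that a checkpoint can truncate the log.
ALTER DATABASE [SampleDB] SET RECOVERY SIMPLE
GO
USE [SampleDB]
GO
CHECKPOINT
GO
USE [master]
GO
-- Switch back to FULL and restart the log chain with a full backup.
ALTER DATABASE [SampleDB] SET RECOVERY FULL
GO
BACKUP DATABASE [SampleDB]
TO DISK = N'C:\SQLBackups\SampleDB_Full.bak'   -- hypothetical path
WITH STATS = 10
GO
```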
Log fragmentation
A fragmented log file can dramatically slow down any operation that needs to read the log file. For example, it can cause slow startup times (since SQL Server reads the log during the database recovery process), slow RESTORE operations, and more. Log size and growth should be planned and managed to avoid excessive numbers of growth events, which can lead to this fragmentation.
A log is fragmented if it contains a very high number of VLFs. In general, SQL Server decides the optimum size and number of VLFs to allocate. However, a transaction log that auto-grows frequently in small increments will suffer from log fragmentation. To see this in action, let's simply re-create our previous ForceFailure database, with all its configuration settings set at whatever the model database dictates, and then run the DBCC LogInfo command, which is an undocumented and unsupported command (at least there is very little written about it by Microsoft) but which will allow us to interrogate the VLF architecture.
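The listing for this step is not reproduced here; a minimal sketch of what it might look like follows. It drops any existing ForceFailure database, re-creates it with no options so that every setting is inherited from model, and then runs DBCC LogInfo in the context of the new database.

```sql
USE [master]
GO
-- Drop the earlier copy, if present (this fails if the database is in use).
IF DB_ID(N'ForceFailure') IS NOT NULL
    DROP DATABASE [ForceFailure]
GO
-- No file options: sizes and growth settings come from the model database.
CREATE DATABASE [ForceFailure]
GO
USE [ForceFailure]
GO
-- Undocumented command: returns one row per VLF in the current database's log.
DBCC LOGINFO
GO
```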
The results are shown in Figure 5-13. The DBCC LogInfo command returns one row per VLF and, among other things, indicates the Status of that VLF. A Status value of 2 indicates a VLF is active and cannot be truncated; a Status value of 0 indicates an inactive VLF.
Five rows are returned, indicating five VLFs (two of which are currently active). We are not going to delve any deeper here into the meaning of any of the other columns returned. Now let's insert a large number of rows (one million) into a VLFTest table, in the ForceFailure database, using a script reproduced by kind permission of Jeff Moden (www.sqlservercentral.com/Authors/Articles/Jeff_Moden/80567/), and then rerun the DBCC LogInfo command, as shown in Listing 5-11.
Listing 5-11: Inserting one million rows and interrogating DBCC Loginfo.
This time, the DBCC LogInfo command returns 131 rows, indicating 131 VLFs, as shown in Figure 5-14.
Figure 5-14: 131 VLFs for our ForceFailure database, with one million rows.
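Rather than counting rows by eye, the DBCC LogInfo output can be captured in a temporary table and summarized. The sketch below assumes the column layout returned by SQL Server 2012 (earlier versions do not return the RecoveryUnitId column); because the command is undocumented, that layout may differ in other versions, and the table and column aliases are my own.

```sql
USE [ForceFailure]
GO
-- Column list assumes the SQL Server 2012 output of DBCC LOGINFO.
CREATE TABLE #VLFs
(   RecoveryUnitId INT,
    FileId         INT,
    FileSize       BIGINT,
    StartOffset    BIGINT,
    FSeqNo         BIGINT,
    Status         INT,
    Parity         INT,
    CreateLSN      NUMERIC(38, 0) )

INSERT INTO #VLFs
EXEC ('DBCC LOGINFO')

-- Status = 2 marks an active VLF; Status = 0 marks an inactive one.
SELECT  COUNT(*) AS TotalVLFs,
        SUM(CASE WHEN Status = 2 THEN 1 ELSE 0 END) AS ActiveVLFs
FROM    #VLFs

DROP TABLE #VLFs
GO
```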
The growth properties inherited from the model database dictate a small initial size for the log files, then growth in relatively small increments. These properties are inappropriate for a database subject to this sort of activity and lead to the creation of a large number of VLFs. By comparison, try re-creating ForceFailure, but this time with some sensible initial size and growth settings (such as those shown in Listing 5-1). In my test, this resulted in an initial 4 VLFs, expanding to 8 VLFs after inserting a million rows.
The "right" number of VLFs for a given database depends on a number of factors, including, of course, the size of the database. Clearly, it is not appropriate to start with a very small log and grow in small increments, as this leads to fragmentation. However, it might also cause problems to go to the opposite end of the scale and start with a huge (tens of GB) log, as then SQL Server would create very few VLFs, and this could affect log space reuse. Further advice on how to achieve a reasonable number of VLFs can be found in Paul Randal's TechNet article Top Tips for Effective Database Maintenance, at http://technet.microsoft.com/en-us/magazine/2008.08.database.aspx.
If you do diagnose a very fragmented log, you can remedy the situation using DBCC SHRINKFILE, as described in the previous section. Again, never use this as a general means of controlling log size; instead, ensure that your log is sized, and grows, appropriately.
Summary
This chapter explained in detail how to capture log backups using either the SSMS Backup Wizard or T-SQL scripts. We also explored, and discussed, how to avoid certain log-related issues such as explosive log growth and log fragmentation. Do not remove any of the backup files we have captured; we are going to use each of these in the next chapter to perform various types of restore operation on our DatabaseForLogBackups database.
Figure 6-1:
During the time span of the fifth log backup of the day, a BULK INSERT command was performed on each database, in order to load a set of data. This bulk data load completed without a hitch but, in an unrelated incident, within the time span of the fifth log backup, a user ran a "rogue" transaction and crucial data was lost. The project manager informs the DBA team and requests that the database be restored to the point just before the transaction that resulted in data loss started.
In the FULL recovery model database this is not an issue. The bulk data load was fully logged and we can restore the database to any point in time within that log file. We simply restore the last full database backup, without recovery, and apply the log files to the point in time right before the unfortunate data loss incident occurred (we'll see how to do this later in the chapter).
In the BULK_LOGGED database, we have a problem. We can restore to any point in time within the first four log backups, but not to any point in time within the fifth log backup, which contains the minimally logged operations. For that log file, we are in an "all or nothing" situation; we must either apply none of the operations in this log file, stopping the restore at the end of the fourth file, or apply all of them, after which we can proceed to restore to any point in time within the sixth log backup.
In other words, we can restore the full database backup, again without recovery, and apply the first four log backups to the database. Unfortunately, we will not have the option to restore to any point in time within the fifth log. If we apply the whole of the fifth log file backup, this would defeat the purpose of the recovery, since the errant process committed its changes somewhere inside of that log backup file, so we'd simply be removing the data we were trying to get back! We have little choice but to restore up to the end of the fourth log, enter database recovery, and report the loss of any data changes that were made after this time.
Hopefully, this will never happen to you and, unless your SLA adopts a completely "zero tolerance" attitude towards any risk of data loss, it is not a reason to avoid BULK_LOGGED recovery model altogether. There are valid reasons for using this recovery model in order to reduce the load on the transaction log, and if we follow best practices, we should not find ourselves in this type of situation.
USE [master]
GO
BACKUP LOG [SampleDB] TO DISK = '\\path\example\filename.trn'
GO
ALTER DATABASE [SampleDB] SET RECOVERY BULK_LOGGED WITH NO_WAIT
GO
-- Perform minimally logged transactions here
-- Stop minimally logged transactions here
ALTER DATABASE [SampleDB] SET RECOVERY FULL WITH NO_WAIT
GO
BACKUP LOG [SampleDB] TO DISK = '\\path\example\filename.trn'
GO
Listing 6-1: A template for temporarily switching a database to BULK_LOGGED recovery model.
If we do need to perform maintenance operations that can be minimally logged and we wish to switch to BULK_LOGGED model, the recommended practice is to take a log backup immediately before switching to BULK_LOGGED, and immediately after switching the database back to FULL recovery, as demonstrated in Listing 6-1. This will, as far as possible, isolate the minimally logged transactions in a single log backup file.
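To see which log backups contain minimally logged operations, and therefore cannot be used for a point-in-time restore, we can consult the has_bulk_logged_data column of the backup history in msdb. The query below is a sketch, using the placeholder database name SampleDB from Listing 6-1 and assuming the backup history has not been purged.

```sql
-- has_bulk_logged_data = 1 marks a log backup that must be restored in full.
SELECT  b.backup_start_date,
        b.backup_finish_date,
        b.has_bulk_logged_data,
        mf.physical_device_name
FROM    msdb.dbo.backupset AS b
        JOIN msdb.dbo.backupmediafamily AS mf
          ON mf.media_set_id = b.media_set_id
WHERE   b.database_name = N'SampleDB'
        AND b.type = 'L'            -- 'L' = transaction log backup
ORDER BY b.backup_finish_date
```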
Figure 6-2:
In our second example, using a T-SQL script, we'll restore to a specific point in time within the second log backup, just before the last 100,000 records were inserted into MessageTable3. This will leave the first two tables with their final row counts (1,020 and 10,020, respectively), but the third with just 20 rows. As discussed in Chapter 1, there are several different ways to restore to a particular point inside a log backup; in this case, we'll demonstrate the most common means, which is to use the STOPAT parameter in the RESTORE LOG command to restore the database to a state that reflects all transactions that were committed at the specified time.
Right-click on DatabaseForLogBackups in the object explorer and select Tasks | Restore | Database to begin the restore process and you'll see the Restore Database screen, which we examined in detail in Chapter 4. This time, rather than restore directly over an existing database, we'll restore to a new database, basically a copy of the existing DatabaseForLogBackups but as it existed at the end of the first log backup. So, in the To database: section, enter a new database name, such as DatabaseForLogBackups_RestoreCopy. In the Source for restore section of the screen, you should see that the required backup files are auto-populated (SQL Server can interrogate the msdb database for the backup history). This will only be the case if all the backup files are in their original location (C:\SQLBackups\Chapter5, if you followed through the examples). Therefore, as configured in Figure 6-3, our new copy database would be restored to the end of the second log backup, in a single restore operation. Alternatively, by simply deselecting the second log file, we could restore the database to the end of the first log file.
Figure 6-3:
However, if the backup files have been moved to a new location, we'd need to manually locate each of the required files for the restore process, and perform the restore process in several steps (one operation to restore the full backup and another the log file). Since it is not uncommon that the required backups won't still be in their original local folders, and since performing the restore in steps better illustrates the process, we'll ignore this useful auto-population feature for the backup files, and perform the restore manually. Click on From device: and choose the browse button to the right of that option, navigate to C:\SQLBackups\Chapter5 and choose the full backup file. Having done so, the relevant section of the Restore Database page should look as shown in Figure 6-4.
Next, click through to the Options page. We know that restoring the full backup is only the first step in our restore process, so once the full backup is restored we need the new DatabaseForLogBackups_RestoreCopy database to remain in a restoring state, ready to accept further log files. Therefore, we want to override the default restore state (RESTORE WITH RECOVERY) and choose instead RESTORE WITH NORECOVERY, as shown in Figure 6-5.
Figure 6-5: Restore the full backup file while leaving the database in a restoring state.
Note that SQL Server has automatically renamed the data and log files for the new database so as to avoid clashing with the existing DatabaseForLogBackups database, on which the restore is based. Having done this, we're ready to go. First, however, you might want to select the Script menu option, from the top of the General Page, and take a quick look at the script that has been generated under the covers. I won't show it here, as we'll get to these details in the next example, but you'll notice use of the MOVE parameter, to rename the data and log files, and the NORECOVERY parameter, to leave the database in a restoring state. Once the restore is complete, you should see the new DatabaseForLogBackups_RestoreCopy database in your object explorer, but with a green arrow on the database icon, and the word "Restoring" after its name.
We're now ready to perform the second step, and restore our first transaction log. Right-click on the new DatabaseForLogBackups_RestoreCopy database and select Tasks | Restore | Transaction Log. In the Restore source section, we can click on From previous backups of database:, and then select DatabaseForLogBackups database. SQL Server will then retrieve the available log backups for us to select. Alternatively, we can manually select the required log backup, which is the route we'll choose here, so click on From file or tape: and select the first log backup file from its folder location (C:\SQLBackups\Chapter5\DatabaseForLogBackups_Native_Log_1.trn). The screen should now look as shown in Figure 6-6.
Now switch to the Options page and you should see something similar to Figure 6-7.
Figure 6-7: Configuring the data and log files for the target database.
This time we want to complete the restore and bring the database back online, so we can leave the default option, RESTORE WITH RECOVERY, selected. Once you're ready, click OK and the log backup restore should complete and the DatabaseForLogBackups_RestoreCopy database should be online and usable. As healthy, paranoid DBAs, our final job is to confirm that we have the right rows in each of our tables, which is easily accomplished using the simple script shown in Listing 6-2.
USE DatabaseForLogBackups_RestoreCopy SELECT COUNT(*) FROM dbo.MessageTable1 SELECT COUNT(*) FROM dbo.MessageTable2 SELECT COUNT(*) FROM dbo.MessageTable3
If everything ran correctly, there should be 20 rows of data in each table. We can also run a query to return the columns from each table, to make sure that the data matches what we originally inserted. As is hopefully clear, GUI-based restores, when all the required backup files are still available in their original local disk folders, can be quick, convenient, and easy. However, if they are not, for example due to the backup files being moved, after initial backup, from local disk to network space, or after a backup has been brought back from long-term storage on tape media, then these GUI restores can be quite clunky, and the process is best accomplished by script.
Don't worry; point-in-time restores are not as complicated as they may sound! To prove it, let's jump right into the script in Listing 6-3. The overall intent of this script is to restore the DatabaseForLogBackups full backup file over the top of the DatabaseForLogBackups_RestoreCopy database, created in the previous GUI restore, apply the entire contents of the first log backup, and then the contents of the second log backup, up to the point just before we inserted 100,000 rows into MessageTable3.
USE [master]
GO
--STEP 1: Restore the full backup. Leave database in restoring state
RESTORE DATABASE [DatabaseForLogBackups_RestoreCopy]
FROM DISK = N'C:\SQLBackups\Chapter5\DatabaseForLogBackups_Native_Full.bak'
WITH FILE = 1,
    MOVE N'DatabaseForLogBackups' TO N'C:\SQLData\DatabaseForLogBackups_RestoreCopy.mdf',
    MOVE N'DatabaseForLogBackups_log' TO N'C:\SQLData\DatabaseForLogBackups_RestoreCopy_1.ldf',
    NORECOVERY, REPLACE, STATS = 10
GO
--STEP 2: Completely restore 1st log backup. Leave database in restoring state
RESTORE LOG [DatabaseForLogBackups_RestoreCopy]
FROM DISK = N'C:\SQLBackups\Chapter5\DatabaseForLogBackups_Native_Log_1.trn'
WITH FILE = 1, NORECOVERY, STATS = 10
GO
The first section of the script restores the full backup file to the restore copy database. We use the MOVE parameter for each file to indicate that, rather than use the data and log files for DatabaseForLogBackups as the target for the restore, we should use those for a different database, in this case DatabaseForLogBackups_RestoreCopy. The NORECOVERY parameter indicates that we wish to leave the target database, DatabaseForLogBackups_RestoreCopy, in a restoring state, ready to accept further log backup files. Finally, we use the REPLACE parameter, since we are overwriting the data and log files that are currently being used by the DatabaseForLogBackups_RestoreCopy database.
The second step of the restore script applies the first transaction log backup to the database. This is a much shorter command, mainly because we do not have to specify a MOVE parameter for each data file; we already specified the target data and log files for the restore, and those files will have been placed in the correct location before this RESTORE LOG command executes. Notice that we again use the NORECOVERY parameter in order to leave the database in a non-usable state so we can move on to the next log restore and apply more transactional data.
The second and final RESTORE LOG command is where you'll spot the brand new STOPAT parameter. We supply our specific timestamp value to this parameter in order to instruct SQL Server to stop applying log records at that point. The supplied timestamp value is important since we are instructing SQL Server to restore the database to the state it was in at the point of the last committed transaction at that specific time. We need to use the date and time that was output when we ran the script in Chapter 5 (Listing 5-6). In my case, the time portion of the output was 3:33 p.m.
You'll notice that in Listing 6-3 I added one minute to this time, the reason being that the time output does not include seconds, and the transactions we want to include could have committed at, for example, 3:33:45. By adding a minute to the output and rounding up to 3:34:00, we will capture all the rows we want, but not the larger set of rows that were inserted next, after the delay. Note, of course, that the exact format of the timestamp, and its actual value, will be different for you! This time, we specify the RECOVERY parameter, so that when we execute the command the database will enter recovery mode, and the database will be restored to the point of the last committed transaction at the specified timestamp. When you run Listing 6-3 as a whole, you should see output similar to that shown in Figure 6-8.
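As a reminder of what that final step involves, here is a minimal sketch of a point-in-time RESTORE LOG command. It assumes the second log backup file is named DatabaseForLogBackups_Native_Log_2.trn, the name used elsewhere in this chapter. The STOPAT value is purely a placeholder, so replace it with the time from your own run of Listing 5-6, plus one minute.

```sql
--STEP 3 (sketch): point-in-time restore of the 2nd log backup, then recover.
--The STOPAT value is a hypothetical placeholder, not a value from this
--example; use the time output by your own run of Listing 5-6, plus one minute.
RESTORE LOG [DatabaseForLogBackups_RestoreCopy]
FROM DISK = N'C:\SQLBackups\Chapter5\DatabaseForLogBackups_Native_Log_2.trn'
WITH FILE = 1,
    STOPAT = N'2012-01-30T15:34:00', -- placeholder timestamp
    RECOVERY,
    STATS = 10
GO
```

Writing the timestamp in the yyyy-mm-ddThh:mm:ss form avoids any ambiguity arising from the language and date format settings of the session.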
54 percent processed.
100 percent processed.
Processed 232 pages for database 'DatabaseForLogBackups_RestoreCopy', file 'DatabaseForLogBackups' on file 1.
Processed 5 pages for database 'DatabaseForLogBackups_RestoreCopy', file 'DatabaseForLogBackups_log' on file 1.
RESTORE DATABASE successfully processed 237 pages in 0.549 seconds (3.369 MB/sec).
100 percent processed.
Processed 0 pages for database 'DatabaseForLogBackups_RestoreCopy', file 'DatabaseForLogBackups' on file 1.
Processed 9 pages for database 'DatabaseForLogBackups_RestoreCopy', file 'DatabaseForLogBackups_log' on file 1.
RESTORE LOG successfully processed 9 pages in 0.007 seconds (9.556 MB/sec).
10 percent processed.
20 percent processed.
31 percent processed.
40 percent processed.
50 percent processed.
61 percent processed.
71 percent processed.
80 percent processed.
90 percent processed.
100 percent processed.
We see the typical percentage completion messages as well as the total restore operation metrics after each file is completed. What we might like to see in the message output, but cannot, is some indication that we performed a point-in-time restore with the STOPAT parameter. There is no obvious way to tell if we successfully did that other than to double-check our database to see if we did indeed only get part of the data changes that are stored in our second log backup file. All we have to do is rerun Listing 6-2 and this time, if everything went as planned, we should have 1,020 rows in MessageTable1, 10,020 rows in MessageTable2, but only 20 rows in MessageTable3, since we stopped the restore just before the final 100,000 rows were added to that table.
Aside from third-party log readers (very few of which offer support beyond SQL Server 2005), there are a couple of undocumented and unsupported functions that can be used to interrogate the contents of log files (fn_dblog) and log backups (fn_dump_dblog). So, for example, we can look at the contents of our second log backup file as shown in Listing 6-4.
SELECT *
FROM fn_dump_dblog(DEFAULT, DEFAULT, DEFAULT, DEFAULT,
    'C:\SQLBackups\Chapter5\DatabaseForLogBackups_Native_Log_2.trn',
    DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
    DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
    DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
    DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
    DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
    DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
    DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
    DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
    DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT);
It's not pretty and it's not supported (so use with caution); it accepts a whole host of parameters, the only one we've defined being the path to the log backup, and it returns a vast array of information that we're not going to begin to get into here. However, it does return the Begin Time for each of the transactions contained in the file, and it may give you some help in working out where you need to stop. An alternative technique to point-in-time restores using STOPAT is to try to work out the LSN value associated with, for example, the start of the rogue transaction that deleted your data. We're not going to walk through an LSN-based restore here, but a good explanation of some of the practicalities involved can be found here: http://janiceclee.com/2010/07/25/alternative-to-restoring-to-a-point-in-time/.
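To give a flavor of how the Begin Time information can be narrowed down, the sketch below queries the live log of the source database with fn_dblog, rather than a log backup, and returns only the rows that mark the start of a transaction. Treat it as an illustration only: the function is undocumented, and the column and operation names used here are those commonly seen in its output, so they may differ between versions of SQL Server.

```sql
-- Sketch only: fn_dblog is undocumented and unsupported, so column names
-- and behavior may differ between SQL Server versions.
USE DatabaseForLogBackups
GO
SELECT  [Current LSN] ,
        [Transaction ID] ,
        [Transaction Name] ,
        [Begin Time]
FROM    fn_dblog(NULL, NULL)            -- NULL, NULL = no start or end LSN filter
WHERE   [Operation] = N'LOP_BEGIN_XACT' -- one row per transaction start
ORDER BY [Current LSN]
GO
```

The same column list and WHERE clause could, in principle, be applied to the fn_dump_dblog query in Listing 6-4 in place of SELECT *.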
Do you spot the mistake? It's quite subtle, so if you don't manage to, simply run the script and examine the error message:
Msg 3159, Level 16, State 1, Line 1
The tail of the log for the database "DatabaseForLogBackups_RestoreCopy" has not been backed up. Use BACKUP LOG WITH NORECOVERY to backup the log if it contains work you do not want to lose. Use the WITH REPLACE or WITH STOPAT clause of the RESTORE statement to just overwrite the contents of the log.
Msg 3013, Level 16, State 1, Line 1
RESTORE DATABASE is terminating abnormally.
The script in Listing 6-5 is identical to what we saw in Step 1 of Listing 6-3, except that here we are restoring over an existing database, and receive an error which is pretty descriptive. The problem is that we are about to overwrite the existing log file for the DatabaseForLogBackups_RestoreCopy database, which is a FULL recovery model database, and we have not backed up the tail of the log, so we would lose any transactions that were not previously captured in a backup. This is a very useful warning message to get in cases where we needed to perform crash recovery and had, in fact, forgotten to do a tail log backup. In such cases, we could start the restore process with the tail log backup, as shown in Listing 6-6, and then proceed.
USE master
GO
BACKUP LOG DatabaseForLogBackups_RestoreCopy
TO DISK = 'D:\SQLBackups\Chapter5\DatabaseForLogBackups_RestoreCopy_log_tail.trn'
WITH NORECOVERY
In cases where we're certain that our restore operation does not require a tail log backup, we can use WITH REPLACE or WITH STOPAT. In this case, the error can be removed, without backing up the tail of the log, by adding the WITH REPLACE clause to Listing 6-5. Let's take a look at a second example failure. Examine the script in Listing 6-7 and see if you can spot the problem.
--STEP 1: Restore the full backup
USE [master]
GO
RESTORE DATABASE [DatabaseForLogBackups_RestoreCopy]
FROM DISK = N'C:\SQLBackups\Chapter5\DatabaseForLogBackups_Native_Full.bak'
WITH FILE = 1,
    MOVE N'DatabaseForLogBackups' TO N'C:\SQLData\DatabaseForLogBackups_RestoreCopy.mdf',
    MOVE N'DatabaseForLogBackups_log' TO N'C:\SQLData\DatabaseForLogBackups__RestoreCopy_1.ldf',
    STATS = 10, REPLACE
GO
Look over each of the commands carefully and then execute this script; you should see results similar to those shown in Figure 6-9.
54 percent processed.
100 percent processed.
Processed 232 pages for database 'DatabaseForLogBackups_RestoreCopy', file 'DatabaseForLogBackups' on file 1.
Processed 5 pages for database 'DatabaseForLogBackups_RestoreCopy', file 'DatabaseForLogBackups_log' on file 1.
RESTORE DATABASE successfully processed 237 pages in 1.216 seconds (1.521 MB/sec).
Msg 3117, Level 16, State 1, Line 2
The log or differential backup cannot be restored because no files are ready to rollforward.
Msg 3013, Level 16, State 1, Line 2
RESTORE LOG is terminating abnormally.
We can see that this time the full database backup restore, over the top of the existing database, was successful (note that we remembered to use REPLACE). It processed all of the data in what looks to be the correct amount of time. Since that operation completed, it must be the log restore that caused the error. Let's look at the error messages in the second part of the output, which will be in red. The first error we see in the output is the statement "The log or differential backup cannot be restored because no files are ready to rollforward." What does that mean? If you didn't catch the mistake in the script, it was that we left out an important parameter in the full database restore operation. Take a look again, and you will see that we don't have the NORECOVERY option in the command. Therefore, the first restore command finalized the restore and placed the database in a recovered state, ready for user access (with only ten rows in each table!); no log backup files can then be applied as part of the current restore operation. Always specify NORECOVERY if you need to continue further with a log backup restore operation.
Of course, there are many other possible errors that can arise if you're not fully paying attention during the restore process, and we can't cover them all. However, as one final example, take a look at Listing 6-8 and see if you can spot the problem.
USE [master]
GO
RESTORE DATABASE [DatabaseForLogBackups_RestoreCopy]
FROM DISK = N'C:\SQLBackups\Chapter5\DatabaseForLogBackups_Native_Full.bak'
WITH FILE = 1,
    MOVE N'DatabaseForLogBackups' TO N'C:\SQLData\DatabaseForLogBackups_RestoreCopy.mdf',
    MOVE N'DatabaseForLogBackups_log' TO N'C:\SQLData\DatabaseForLogBackups__RestoreCopy_1.ldf',
    NORECOVERY, STATS = 10, REPLACE
GO
RESTORE LOG [DatabaseForLogBackups]
FROM DISK = N'C:\SQLBackups\Chapter5\DatabaseForLogBackups_Native_Log_2.trn'
WITH FILE = 1, NORECOVERY, STATS = 10
GO
RESTORE LOG [DatabaseForLogBackups]
FROM DISK = N'C:\SQLBackups\Chapter5\DatabaseForLogBackups_Native_Log_1.trn'
WITH FILE = 1, RECOVERY, STATS = 10
GO
Did you catch the problem before you ran the script? If not, take a look at your output and examine the error message you get when the first log file restore is attempted. I'm sure you'll be able to figure out what's wrong in short order!
Summary
After reading and working through this chapter, you should also be fairly comfortable with the basics of log restore operations, and in particular with point-in-time restores. The key to successful restores is to be well organized and well drilled. You should know exactly where the required backup files are stored; you should have confidence that the operation will succeed, based on your backup validations, and regular, random "spot-check" restores. You may be under pressure to retrieve critical lost data, or bring a stricken database back online, as quickly as possible, but it's vital not to rush or panic. Proceed carefully and methodically, and your chances of success are high.
Let's assume that the same moderately-sized production database, mentioned above, has a backup strategy consisting of a weekly full backup, hourly transaction log backups and daily differential backups. I need to retain on disk, locally, the backups required to restore the database to any point in the last three days, so that means storing locally the last three differential backups, the last three days' transaction log backups, plus the base full backup. The full backup size is about 22 GB, the three days' worth of log backups total about 3 GB, and three days' worth of differential backups takes up another 3 GB, giving a total of about 28 GB. If I simply took full backups and log backups, I'd need almost 70 GB of space at any time for one database (three 22 GB full backups, plus the log backups).
Deciding exactly the right backup strategy for a database is a complex process. We want to strive as far as possible for simplicity, short backup times, and smaller disk space requirements, but at the same time we should never allow such goals to compromise the overall quality and reliability of our backup regime.
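If you want to run the same sort of calculation for one of your own databases, the backup history in msdb already holds the figures. The following is a minimal sketch that totals the size of the backups taken over the last three days, by backup type. The database name MyProductionDB is hypothetical.

```sql
-- Sketch: total backup size, by backup type, for one database over the
-- last three days. N'MyProductionDB' is a hypothetical database name.
SELECT  bs.type ,   -- D = full, I = differential, L = log
        COUNT(*) AS backup_count ,
        SUM(bs.backup_size) / 1073741824.0 AS total_size_gb -- bytes to GB
FROM    msdb.dbo.backupset bs
WHERE   bs.database_name = N'MyProductionDB'
        AND bs.backup_finish_date >= DATEADD(DAY, -3, GETDATE())
GROUP BY bs.type
ORDER BY bs.type
```

Note that backup_size reports the uncompressed size; for compressed backups, the compressed_backup_size column is the better guide to the disk space actually used.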
Figure 7-1:
This is a somewhat dangerous situation since, as we have discussed, the more backup files we have to take, store, and manage, the greater the chance of one of those files being unusable. This can occur for reasons ranging from disk corruption to backup failure. Also, if any of these transaction log backup files is not usable, we cannot restore past that point in the database's history. If, instead, our strategy included an additional differential backup at midday each day, then we'd only need to restore eight files: the full backup, the differential backup, and six transaction log backups (11-16), plus a tail log backup, as shown in Figure 7-2. We would also be safe in the event of a corrupted differential backup, because we would still have all of the log backups since the full backup was taken.
Figure 7-2: Using a differential backup can shorten the number of backup files to restore.
In any situation that requires a quick turnaround time for restoration, a differential backup is our friend. The more files there are to process, the more time it will take to set up the restore scripts, the more complex the restore operation will be, and so (potentially) the longer the database will be down. In this particular situation, the savings might not be too dramatic, but for mission-critical systems, transaction log backups may be taken every 15 minutes. If we're able to take one or two differential backups during the day, it can cut down dramatically the number of files involved in any restore process.
Consider, also, the case of a VLDB where full backups take over 15 hours and so nightly full backups cannot be supported. The agreed restore strategy must support a maximum data loss of one hour, so management has decided that weekly full backups, taken on Sunday, will be supplemented with transaction log backups taken every hour during the day. Everything is running fine until one Friday evening the disk subsystem goes out on that machine and renders the database lost and unrecoverable. We are now going to have to restore the large full backup from the previous Sunday, plus well over 100 transaction log files. This is a long and tedious process. Fortunately, we now know of a better way to get this done, saving a lot of time and without sacrificing too much extra disk space: we take differential database backups each night except Sunday as a supplemental backup to the weekly full one. Now, we'd only need to restore the full, differential and around 20 log backups.
Database migrations
In Chapters 3 and 5, we discussed the role of full and log backups in database migrations in various scenarios. Differential restores give us another great way to perform this common task and can save a lot of time when the final database move takes place. Imagine, in this example, that we are moving a large database, operating in SIMPLE recovery model, from Server A to Server B, using full backups. We obviously don't want to lose any data during the transition, so we kill any connections to the database and place it into single-user mode before we take the backup. We then start our backup, knowing full well that no new changes can be made to our database or data. After completion, we take the source database offline and begin restoring the database to Server B. The whole process takes 12 hours, during which time our database is down and whatever front-end application it is that uses that database is also offline. No one is happy about this length of down-time. What could we do to speed the process up a bit?
A better approach would be to incorporate differential database backups. Sixteen hours before the planned migration (allowing a few hours' leeway on the 12 hours needed to perform the full backup and restore), we take a full database backup, not kicking out any users and so not worrying about any subsequent changes made to the database. We restore to Server B the full database backup, using the NORECOVERY option, to leave the database in a state where we can apply more backup files. Once the time has come to migrate the source database in its final state, we kill any connections, place it in single-user mode and perform a differential backup. We have also been very careful not to allow any other full backups to happen via scheduled jobs or other DBA activity. This is important, so that we don't alter the base backup for our newly created differential backup.
Taking the differential backup is a fast process (10-15 minutes), and the resulting backup file is small, since it only holds data changes made in the last 12 to 16 hours. Once the backup has completed, we take the source database offline and immediately start the differential restore on the target server, which also takes only 10-15 minutes to complete, and we are back online and running. We have successfully completed the migration, and the down-time has decreased from a miserable 12 hours to a scant 30 minutes. There is a bit more preparation work to be done using this method, but the results are the same and the uptime of the application doesn't need to take a significant hit.
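The migration steps described above can be sketched in T-SQL as follows. This is an outline under stated assumptions rather than a finished migration script: the database name BigDatabase and all file paths are hypothetical, the backup files are assumed to be reachable from both servers, and the restores on Server B would need MOVE clauses if its drive layout differs from that of Server A.

```sql
-- All database names and paths in this sketch are hypothetical.
-- 1. On Server A, about 16 hours ahead: full backup, users still connected.
BACKUP DATABASE [BigDatabase]
TO DISK = N'C:\SQLBackups\BigDatabase_Full.bak'
WITH STATS = 10
GO
-- 2. On Server B: restore the full backup, leaving the database restoring.
RESTORE DATABASE [BigDatabase]
FROM DISK = N'C:\SQLBackups\BigDatabase_Full.bak'
WITH NORECOVERY, STATS = 10
GO
-- 3. On Server A, at migration time: remove users, then differential backup.
ALTER DATABASE [BigDatabase] SET SINGLE_USER WITH ROLLBACK IMMEDIATE
GO
BACKUP DATABASE [BigDatabase]
TO DISK = N'C:\SQLBackups\BigDatabase_Diff.bak'
WITH DIFFERENTIAL, STATS = 10
GO
-- 4. On Server B: restore the differential and bring the database online.
RESTORE DATABASE [BigDatabase]
FROM DISK = N'C:\SQLBackups\BigDatabase_Diff.bak'
WITH RECOVERY, STATS = 10
GO
-- 5. On Server B: make sure the migrated database accepts multiple users.
ALTER DATABASE [BigDatabase] SET MULTI_USER
GO
```

Step 5 is included as a precaution, since the source database was in single-user mode when the differential backup was taken; it is harmless if the database is already in multi-user mode.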
However, if this sort of disaster does strike, all we can do is look through the system tables in msdb to identify the new base full backup file and hope that it will still be on disk or tape and so can be retrieved for use in our restore operation. If it is not, we are out of luck in terms of restoring the differential file. We would need to switch to using the last full backup and subsequent transaction log backups, assuming that we were taking log backups of that database. Otherwise, our only course of action would be to use the last available full database backup. Bottom line: make sure that a) only those people who need to take backups have the permissions granted to do so, and b) your DBA team and certain administrative users know how and when to use a copy-only backup operation.
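For reference, a copy-only full backup is simply a normal full backup with the COPY_ONLY option added. In this minimal sketch, the database name and file path are hypothetical.

```sql
-- Sketch: an ad hoc full backup that does not reset the differential base.
-- The database name and file path are hypothetical.
BACKUP DATABASE [MyProductionDB]
TO DISK = N'C:\SQLBackups\MyProductionDB_CopyOnly_Full.bak'
WITH COPY_ONLY, STATS = 10
GO
```

Because SQL Server does not treat a copy-only backup as a differential base, any differential backups taken afterwards still refer to the last regular full backup.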
backup is a differential, then we'll always need to restore the midnight full, followed by the midday differential. In the event of corruption or loss of a single backup file, the maximum exposure to data loss, in either case, is 24 hours. This really is a small risk to take for the great rewards of having differential backups in your rotation. I have taken many thousands of database backups and restored thousands of files as well. I have only run into corrupt files a few times and they were never caused by SQL Server's backup routines. To help alleviate this concern, we should do two things. Firstly, always make sure backups are being written to some sort of robust storage solution. We discussed this in Chapter 2, but I can't stress enough how important it is to have backup files stored on a redundant SAN or NAS system. These types of systems can cut the risk of physical disk corruptions down to almost nothing. Secondly, as discussed in Chapter 4, we should also perform spot-check restores of files. I like to perform at least one randomly-chosen test restore per week. This gives me even more confidence that my backup files are in good shape without having to perform CHECKSUM tests on each and every backup file.
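For those backup files where a checksum test is wanted, the general pattern is sketched below, using a hypothetical database name and file path. The backup must have been taken WITH CHECKSUM in the first place for the verification step to recheck the checksums.

```sql
-- Sketch, using a hypothetical database name and file path.
-- Write page checksums into the backup as it is taken...
BACKUP DATABASE [MyProductionDB]
TO DISK = N'C:\SQLBackups\MyProductionDB_Full.bak'
WITH CHECKSUM, STATS = 10
GO
-- ...then verify that the file is complete and readable, rechecking the
-- checksums. No data is restored, so this complements test restores
-- rather than replacing them.
RESTORE VERIFYONLY
FROM DISK = N'C:\SQLBackups\MyProductionDB_Full.bak'
WITH CHECKSUM
GO
```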
Recovery model
There are no recovery model limitations for differential backups; with the exception of the master database, we can take a differential backup of any database in any recovery model. The only type of backup that is valid for the master database is a full backup. For our example database in this chapter, we're going to assume that, in our Backup SLA, we have a maximum tolerance of 24 hours' potential data loss. However, unlike in Chapter 3, this time, we'll satisfy this requirement using a weekly full backup and nightly differential backups. Since we don't need to perform log backups, it makes sense, for ease of log management, to operate the database in SIMPLE recovery model.
Listing 7-1: DatabaseForDiffBackups database, MessageTable table, initial data load (100,000 rows).
Base backup
As discussed earlier, just as we can't take log backups without first taking a full backup of a database, so we also can't take a differential backup without taking a full base backup. Any differential backup is useless without a base. Listing 7-2 performs the base full backup for our DatabaseForDiffBackups database and stores it in the C:\SQLBackups\Chapter7 folder. Again, feel free to modify this path as appropriate for your system.
USE [master]
GO
BACKUP DATABASE [DatabaseForDiffBackups]
TO DISK = N'C:\SQLBackups\Chapter7\DatabaseForDiffBackups_Full_Native.bak'
WITH NAME = N'DatabaseForDiffBackups-Full Database Backup', STATS = 10
GO
We are not going to worry about the execution time for this full backup, so once the backup has completed successfully, you can close this script without worrying about the output. We will, however, take note of the backup size, which should come out to just over 20 MB, so that we can see how our differential backup file sizes compare as we pump more data into the database.
Before taking the first differential backup, we'll INSERT 10,000 more rows into MessageTable, as shown in Listing 7-3. This is representative of the sort of data load that we might expect to capture in a differential backup.
USE [DatabaseForDiffBackups]
GO
INSERT INTO MessageTable
VALUES ( 'Second Data Load: This is the second set of data we are populating the table with' )
GO 10000
Figure 7-3:
There's nothing to change on the Options page, so we're done. Click OK and the differential backup will be performed. Figure 7-4 summarizes the storage requirements and execution time metrics for our first differential backup. The execution time was obtained by scripting out the equivalent BACKUP command and running the T-SQL, as described in Chapter 3. The alternative method, querying the backupset table in msdb (see Listing 3-5), does not provide sufficient granularity for a backup that ran in under a second.
Number of Rows    Execution Time    Storage Required
10000             0.311 Seconds     3.16 MB
We're now ready to perform our second differential backup, and Listing 7-5 shows the code to do it. If you compare this script to the one for our base full backup (Listing 7-2), you'll see they are almost identical in structure, with the exception of the WITH DIFFERENTIAL argument that we use in Listing 7-5 to let SQL Server know that we are not going to be taking a full backup, but instead a differential backup of the changes made since the last full backup was taken. Double-check that the path to the backup file is correct, and then execute the script.
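As a minimal sketch of the kind of command being described, a differential backup script might look like the following. The file name is the one used for the second differential backup in the restore examples later in this chapter; the backup set NAME is an assumption, added only for illustration.

```sql
USE [master]
GO
-- Sketch of a native differential backup. WITH DIFFERENTIAL captures only
-- the changes made since the last full (base) backup.
BACKUP DATABASE [DatabaseForDiffBackups]
TO DISK = N'C:\SQLBackups\Chapter7\DatabaseForDiffBackups_Diff_Native_2.bak'
WITH DIFFERENTIAL,
    NAME = N'DatabaseForDiffBackups-Differential Database Backup', -- assumed name
    STATS = 10
GO
```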
Having run the command, you should see results similar to those shown in Figure 7-5.
Figure 7-5:
Figure 7-6 summarizes the storage requirements and execution time metrics for our second differential backup, compared to the first differential backup.
So, compared to the first differential backup, the second one contains 11 times more rows, took about five times longer to execute, and takes up over six times more space. This all seems to make sense; don't forget that backup times and sizes don't grow linearly based on the number of records, since every backup includes in it, besides data, headers, backup information, database information, and other structures.
In my test, the compressed backup offered very little advantage in terms of execution time, but substantial savings in disk space, as shown in Figure 7-7.
Differential Backup Name            Number of Rows    Execution Time (S)    Storage Required (MB)
First differential                  10000             0.31                  3.2
Second differential                 110000            1.57                  20.6
Second differential (compressed)    110000            1.43                  0.5
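For readers who want to repeat the comparison, a compressed differential backup only requires the addition of the COMPRESSION option. The following is a minimal sketch; the file name is hypothetical, and native backup compression must be available in your edition of SQL Server.

```sql
USE [master]
GO
-- Sketch: the same differential backup, compressed as it is written.
-- The file name below is hypothetical.
BACKUP DATABASE [DatabaseForDiffBackups]
TO DISK = N'C:\SQLBackups\Chapter7\DatabaseForDiffBackups_Diff_Native_Compressed.bak'
WITH DIFFERENTIAL, COMPRESSION, STATS = 10
GO
```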
Figure 7-8:
Notice that the base full backup and the second differential backup have been auto-selected as the backups to restore. Remember that, to return the database to the most recent state possible, we only have to restore the base, plus the most recent differential; we do not have to restore both differentials. However, in this example we do want to restore the first rather than the second differential backup, so deselect the second differential in the list in favor of the first. Now, click over to the Options screen; in the Restore options section, select Overwrite the existing database. That done, you're ready to go; click OK, and the base full backup will be restored, followed by the first differential backup. Let's query our newly restored database in order to confirm that it worked as expected; if so, we should see a count of 100,000 rows with a message of Initial Data Load, and 10,000 rows with a message of Second Data Load, as confirmed by the output of Listing 7-7 (the message text is cropped for formatting purposes).
USE [DatabaseForDiffBackups]
GO
SELECT  Message ,
        COUNT(Message) AS Row_Count
FROM    MessageTable
GROUP BY Message
Message                                             Row_Count
--------------------------------------------------  ---------
Initial Data Load: This is the first set of data    100000
Second Data Load: This is the second set of data    10000
(2 row(s) affected)
Listing 7-7: Query to confirm that the differential restore worked as expected.
Notice the WITH NORECOVERY argument, meaning that we wish to leave the database in a restoring state, ready to receive further backup files. Also note that we did not specify REPLACE; it is not needed here because DatabaseForDiffBackups is a SIMPLE recovery model database. Having executed the script, refresh your object explorer window and you should see that the DatabaseForDiffBackups database now displays a Restoring message, as shown in Figure 7-9.
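For reference, a base restore of the kind just described can be sketched as follows, using the full backup file created in Listing 7-2. Treat it as an illustration of the options discussed rather than a definitive script.

```sql
USE [master]
GO
-- Sketch of the base restore: no REPLACE, and NORECOVERY so that a
-- differential backup can be applied afterwards.
RESTORE DATABASE [DatabaseForDiffBackups]
FROM DISK = N'C:\SQLBackups\Chapter7\DatabaseForDiffBackups_Full_Native.bak'
WITH NORECOVERY, STATS = 10
GO
```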
Figure 7-9:
We're now ready to run the second RESTORE command, shown in Listing 7-9, to get back all the data in the second differential backup (i.e. all data that has changed since the base full backup was taken) and bring our database back online.
USE [master]
GO
RESTORE DATABASE [DatabaseForDiffBackups]
FROM DISK = N'C:\SQLBackups\Chapter7\DatabaseForDiffBackups_Diff_Native_2.bak'
WITH STATS = 10
GO
We don't explicitly state the WITH RECOVERY option here, since RECOVERY is the default action. By leaving it out of our differential restore we let SQL Server know to recover the database for regular use.
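If you prefer your scripts to state their intent, the default can be spelled out. This sketch is equivalent to Listing 7-9, with only the RECOVERY keyword added.

```sql
USE [master]
GO
-- Equivalent to Listing 7-9, with the default recovery behavior made explicit.
RESTORE DATABASE [DatabaseForDiffBackups]
FROM DISK = N'C:\SQLBackups\Chapter7\DatabaseForDiffBackups_Diff_Native_2.bak'
WITH RECOVERY, STATS = 10
GO
```

Stating RECOVERY or NORECOVERY on every RESTORE command makes it easier to spot, on review, a step that would bring the database online too early.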
Once this command has successfully completed, our database will be returned to the state it was in at the time we started the second differential backup. As noted earlier, there is actually no reason, other than a desire here to show each step, to run Listings 7-8 and 7-9 separately. We can simply combine them and complete the restore process in a single script. Give it a try, and you should see a Messages output screen similar to that shown in Figure 7-10.
Remember that the differential backup file contained slightly more data than the base full backup, so it makes sense that a few more pages are processed in the second RESTORE. The faster restore time for the differential backup compared to the base full backup, even though the former had more data to process, can be explained by the higher overhead attached to a full restore; it must prepare the data and log files and create the space for the restore to take place, whereas the subsequent differential restore only needs to make sure there is room to restore, and start restoring data.
Everything looks good, but let's put on our "paranoid DBA" hat and double-check. Rerunning Listing 7-7 should result in the output shown in Figure 7-11.
Figure 7-11: Data verification results from native T-SQL differential restore.
As you can see, it's no different than a normal differential restore. You don't have to include any special options to let SQL Server know the backup is compressed. That information is stored in the backup file headers, and SQL Server will know how to handle the compressed data without any special instructions from you in the script! You will undoubtedly see that the restore took just a bit longer than the uncompressed backup would take to restore. This is simply because more work is going on to decompress the data, and that extra step costs CPU time, just as an encrypted backup file would also take slightly longer to restore.
If you hadn't already guessed why this won't work, the error message below will leave you in no doubt.
Msg 3035, Level 16, State 1, Line 2
Cannot perform a differential backup for database "ForcingFailures", because a current database backup does not exist. Perform a full database backup by reissuing BACKUP DATABASE, omitting the WITH DIFFERENTIAL option.
Msg 3013, Level 16, State 1, Line 2
BACKUP DATABASE is terminating abnormally.
We can't take a differential backup of this database without first taking a full database backup as the base from which to track subsequent changes!
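One way to check for this condition before attempting the differential backup is to look at the differential base recorded for each data file. The query below is a minimal sketch, assuming the ForcingFailures database from this example; a NULL differential_base_time suggests that no full backup has yet been taken to act as the base.

```sql
-- Check whether each data file of the database has a differential base.
-- A NULL differential_base_time means no base full backup exists yet,
-- in which case BACKUP ... WITH DIFFERENTIAL fails with Msg 3035.
SELECT  DB_NAME(mf.database_id) AS database_name ,
        mf.name AS logical_file_name ,
        mf.differential_base_lsn ,
        mf.differential_base_time
FROM    sys.master_files mf
WHERE   mf.database_id = DB_ID(N'ForcingFailures')
        AND mf.type_desc = N'ROWS';
```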
We are now fully prepared for some subsequent differential backups. However, unbeknown to us, someone sneaks in and performs a second full backup of the database, in order to restore it to a development server.
USE [master]
GO
BACKUP DATABASE [ForcingFailures]
TO DISK = N'C:\SQLBackups\ForcingFailures_DEV_Full.bak'
WITH STATS = 10
GO
Some time later, we need to perform a restore process, over the top of the existing (FULL recovery model) database, so we prepare and run the appropriate script, only to get a nasty surprise.
USE [master]
GO
RESTORE DATABASE [ForcingFailures]
FROM DISK = N'C:\SQLBackups\Chapter7\ForcingFailures_Full.bak'
WITH NORECOVERY, REPLACE, STATS = 10
GO
RESTORE DATABASE successfully processed 177 pages in 0.035 seconds (39.508 MB/sec).
Msg 3136, Level 16, State 1, Line 2
This differential backup cannot be restored because the database has not been restored to the correct earlier state.
Msg 3013, Level 16, State 1, Line 2
RESTORE DATABASE is terminating abnormally.
Due to the "rogue" second full backup, our differential backup does not match our base full backup. As a result, the differential restore operation fails and the database is left in a restoring state. This whole mess could have been averted if that non-scheduled full backup had been taken as a copy-only backup, since this would have prevented SQL Server assigning it as the new base backup for any subsequent differentials. However, what can we do at this point? Well, the first step is to examine the backup history in the msdb database to see if we can track down the rogue backup, as shown in Listing 7-16.
USE [MSDB]
GO
SELECT  bs.type ,
        bmf.physical_device_name ,
        bs.backup_start_date ,
        bs.user_name
FROM    dbo.backupset bs
        INNER JOIN dbo.backupmediafamily bmf
            ON bs.media_set_id = bmf.media_set_id
WHERE   bs.database_name = 'ForcingFailures'
ORDER BY bs.backup_start_date ASC
234
This query will tell us the type of backup taken (D = full database, I = differential database, somewhat confusingly), the name and location of the backup file, when the backup was started, and who took it. We can check to see if that file still exists in the designated directory and, if so, use it as the base for restoring our differential backup (we can also roundly castigate whoever was responsible, and give them a comprehensive tutorial on use of copy-only backups). If the non-scheduled full backup file is no longer in that location and you are unable to track it down, then there is not a lot you can do at this point, unless you are also taking transaction log backups for the database. If not, you'll simply have to recover the database as it exists, as shown in Listing 7-17, and deal with the data loss.
USE [master]
GO
RESTORE DATABASE [ForcingFailures]
WITH RECOVERY
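For reference, the copy-only backup that would have avoided this problem might look something like the following sketch. The backup file name is purely illustrative and is not taken from the chapter's scripts; the second statement uses the is_copy_only column of msdb.dbo.backupset to confirm how each backup was recorded.

```sql
USE [master]
GO
-- COPY_ONLY leaves the differential base untouched, so later differential
-- backups still match the scheduled full backup.
-- The file name below is a hypothetical example.
BACKUP DATABASE [ForcingFailures]
TO DISK = N'C:\SQLBackups\Chapter7\ForcingFailures_DEV_CopyOnly.bak'
WITH COPY_ONLY, STATS = 10
GO
-- msdb records whether each backup was taken as copy-only (1) or not (0).
SELECT  bs.backup_start_date ,
        bs.type ,
        bs.is_copy_only ,
        bs.user_name
FROM    msdb.dbo.backupset bs
WHERE   bs.database_name = N'ForcingFailures'
ORDER BY bs.backup_start_date ASC
GO
```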
Recovered, already
For our final example, we'll give our beleaguered ForcingFailures database a rest and attempt a differential restore on DatabaseForDiffBackups, as shown in Listing 7-18. See if you can figure out what is going to happen before executing the command.
USE [master]
GO
RESTORE DATABASE [DatabaseForDiffBackups]
FROM DISK = N'C:\SQLBackups\Chapter7\DatabaseForDiffBackups_Full_Native.bak'
WITH REPLACE, STATS = 10
GO
RESTORE DATABASE [DatabaseForDiffBackups]
FROM DISK = N'C:\SQLBackups\Chapter7\DatabaseForDiffBackups_Diff_Native_2.bak'
WITH STATS = 10
GO
This script should look very familiar to you but there is one small omission from this version which will prove to be very important. Whether or not you spotted the error, go ahead and execute it, and you should see output similar to that shown in Figure 7-12.
The first RESTORE completes successfully, but the second one fails with the error message "The log or differential backup cannot be restored because no files are ready to rollforward." The problem is that we forgot to include the NORECOVERY argument in the first RESTORE statement. Therefore, the full backup was restored and the database recovery process proceeded as normal, to return the database to an online and usable state. At this point, the database is not in a state where it can accept any further backup files. If you see this type of error when performing a restore that takes more than one backup file, differential, or log, you now know that there is a possibility that a previous RESTORE statement on the database didn't include the NORECOVERY argument that would allow for more backup files to be processed.
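A corrected version of the sequence, using the same backup files as Listing 7-18, would restore the full backup WITH NORECOVERY and recover the database only on the final step. The following is a sketch of that fix rather than a listing from the chapter:

```sql
USE [master]
GO
-- NORECOVERY leaves the database in a restoring state,
-- ready to accept the differential backup.
RESTORE DATABASE [DatabaseForDiffBackups]
FROM DISK = N'C:\SQLBackups\Chapter7\DatabaseForDiffBackups_Full_Native.bak'
WITH NORECOVERY, REPLACE, STATS = 10
GO
-- The last restore in the sequence runs recovery and brings the database online.
RESTORE DATABASE [DatabaseForDiffBackups]
FROM DISK = N'C:\SQLBackups\Chapter7\DatabaseForDiffBackups_Diff_Native_2.bak'
WITH RECOVERY, STATS = 10
GO
```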
Summary
We have discussed when and why you should be performing differential backups. Differential backups, used properly, can help a DBA make database restores a much simpler and faster process than they would be with other backup types. In my opinion, differential backups should form an integral part of the daily backup arsenal. If you would like more practice with these types of backups, please feel free to modify the database creation, data population and backup scripts provided throughout this chapter. Perhaps you can try a differential restore of a database and move the physical data and log files to different locations. Finally, we explored some of the errors that can afflict differential backup and restore; if you know, up front, the sort of errors you might see, you'll be better armed to deal with them when they pop up in the real world. We couldn't cover every possible situation, of course, but knowing how to read and react to error messages will save you time and headaches in the future.
Notice that we set the database, initially at least, to SIMPLE recovery model. Later we'll want to perform both differential database backups and log backups, so our main, operational model for the database will be FULL. However, as we'll soon be performing two successive data loads of one million rows each, and we don't want to run into the problem of bloating the log file (as described in the Troubleshooting Log Issues section of Chapter 5), we're going to start off in SIMPLE model, where the log will be auto-truncated, and only switch to FULL once these initial "bulk loads" have been completed. Listing 8-2 shows the script to create our two, familiar, sample tables, and then load MessageTable1 with 1 million rows.
USE [DatabaseForSQLBackups]
GO
CREATE TABLE [dbo].[MessageTable1]
    (
      [MessageData] [nvarchar](200) NOT NULL ,
      [MessageDate] [datetime2] NOT NULL
    )
ON  [PRIMARY]
GO
CREATE TABLE [dbo].[MessageTable2]
    (
      [MessageData] [nvarchar](200) NOT NULL ,
      [MessageDate] [datetime2] NOT NULL
    )
ON  [PRIMARY]
GO
USE [DatabaseForSQLBackups]
Listing 8-2: Creating sample tables and initial million row data load for MessageTable1.
Take a look in the C:\SQLData folder, and you should see that our data and log files are still at their initial sizes; the data file is approximately 500 MB in size (and is pretty much full), and the log file is still 100 MB in size. Therefore we have a total database size of about 600 MB. It's worth noting that, even if we'd set the database to FULL recovery model, the observed behavior would have been the same, up to this point. Don't forget that a database is only fully engaged in FULL recovery model after the first full backup, so the log may still have been truncated during the initial data load. If we'd performed a full backup before the data load, the log would not have been truncated and would now be double the size of our data file, just over 1 GB. Of course, this means that the log file would have undergone many auto-growth events, since we only set it to an initial size of 100 MB and to grow in 10 MB steps.
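If you would rather confirm these file sizes from a query window than from the file system, something along the following lines should work. It is a minimal sketch that relies only on the sys.database_files catalog view and DBCC SQLPERF; the column aliases are my own.

```sql
USE [DatabaseForSQLBackups]
GO
-- size is reported in 8 KB pages, so convert it to MB.
-- growth is in 8 KB pages unless is_percent_growth = 1.
SELECT  name AS logical_file_name ,
        type_desc ,
        size * 8 / 1024 AS size_mb ,
        growth ,
        is_percent_growth
FROM    sys.database_files;
GO
-- Log size and percentage of log space in use, for every database on the instance.
DBCC SQLPERF(LOGSPACE);
GO
```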
Full Backups
In this section, we'll walk through the process of taking a full backup of the example DatabaseForSQLBackups database, using both the SQL Backup GUI, and SQL Backup T-SQL scripts. In order to follow through with the examples, you'll need to have installed the Red Gate SQL Backup Pro GUI on your client machine, registered your test SQL Server instances, and installed the server-side components of SQL Backup. If you've not yet completed any part of this, please refer to Appendix A, for installation and configuration instructions.
Figure 8-1:
A few additional features to note on this screen are as follows:
• we can take backups of more than one database at a time; a useful feature when you need one-time backups of multiple databases on a system
• if we wish to back up most, but not all, databases on a server, we can select the databases we don't wish to back up, and select Exclude these from the top drop-down
• the Filter list check box allows you to screen out any databases that are not available for the current type of backup; for example, this would ensure that you don't attempt to take a log backup of a database that is running under the SIMPLE recovery model.
Step 3 is where we will configure settings to be used during the backup. For this first run through, we're going to focus on just the central Backup location portion of this screen, and the only two changes we are going to make are to the backup file location and name. Adhering closely to the convention used throughout, we'll place the backup file in the C:\SQLBackups\Chapter8\ folder and call it DatabaseForSQLBackups_Full_1.sqb. Notice the .sqb extension that denotes this as a SQL Backup-generated backup file. Once done, the screen will look as shown in Figure 8-2.
Figure 8-2:
There are two options offered below the path and name settings in the Backup location section that allow us to "clean up" our backup files, depending on preferences.
• Overwrite existing backup files: Overwrite any backup files in that location that share the same name.
• Delete existing backup files: Remove any files that are more than x days old, or remove all but the latest x files. We also have the option of cleaning these files up before we start a new backup. This is helpful if the database backup files are quite large and wouldn't fit on disk if room was not cleared beforehand. However, be sure that any backups targeted for removal have first been safely copied to another location.
The top drop-down box of this screen offers the options to split or mirror a backup file (which we will cover at the end of this chapter). The Network copy location section allows us to copy the finished backup to a second network location, after it has completed. This is a good practice, and you also get the same options of cleaning up the files on your network storage. What you choose here doesn't have to match what you chose for your initial backup location; for example, you can store just one day of backups on a local machine, but three days on your network location.
Figure 8-3:
Backup compression is enabled by default, and there are four levels of compression, offering progressively higher compression and slower backup speeds. Although the compressed data requires lower levels of disk I/O to write to the backup file, the overriding factor is the increased CPU time required in compressing the data in the first place. As you can probably guess, picking the compression level is a balancing act; the better the compression, the more disk space will be saved, but the longer the backups run, the more likely are issues with the backup operation. Ultimately, the choice of compression level should be guided by the size of the database and the nature of the data (i.e. its compressibility). For instance, binary data does not compress well, so don't spend CPU time attempting to compress a database full of binary images. In other cases, lower levels of compression may yield higher compression ratios. We can't tell until we test the database, which is where the Compression Analyzer comes in. Go ahead and click on the Compression Analyzer button and start a test against the DatabaseForSQLBackups database. Your figures will probably vary a bit, but you should see results similar to those displayed in Figure 8-4.
Figure 8-4:
For our database, Levels 1 and 4 offer the best compression ratio, and since a backup size of about 4.5 MB (for a database of 550 MB) is pretty good, we'll pick Level 1 compression, which should also provide the fastest backup times.
Should I use the compression analyzer for every database?
Using the analyzer for each database in your infrastructure is probably not your best bet. We saw very fast results when testing this database, because it is very small compared to most production databases. The larger the database you are testing, the longer this test will take to run. This tool is recommended for databases that you are having compression issues with, perhaps a database where you are not getting the compression ratio that you believe you should be.
The next question we need to consider carefully is this: do we want to encrypt all of that data that we are writing to disk? Some companies operate under strict rules and regulations, such as HIPAA and SOX, which require database backups to be encrypted. Some organizations just like the added security of encrypted backups, in helping prevent a malicious user getting access to those files. If encryption is required, simply tick the Encrypt backup box, select the level of encryption and provide a secure password. SQL Backup will take care of the rest, but at a cost. Encryption will also add CPU and I/O overhead to your backups and restores. Each of these operations now must go through the "compress | encrypt | store on disk" process to back up, as well as the "retrieve from disk | decrypt | decompress" routines, adding an extra step to both the backup and restore processes.
Store your encryption password in a safe and secure location!
Not having the password to those backup files will stop you from ever being able to restore those files again, which is just as bad as not having backups at all!
Our database doesn't contain any sensitive data and we are not going to use encryption on this example.
Depending on the system, some good performance benefits can be had by allowing more threads to be used in the backup process. When using a high-end CPU, or many multi-core CPUs, we can split hefty backup operations across multiple threads, each of which can run in parallel on a separate core. This can dramatically reduce the backup processing time.
The Maximum transfer size and Maximum data block options can be important parameters in relation to backup performance. The transfer size option dictates the maximum size of each block of memory that SQL Backup will write to the backup file, on disk. The default value is 1,024 KB (i.e. 1 MB), but if SQL Server is experiencing memory pressure, it may be wise to lower this value so that SQL Backup can write to smaller memory blocks, and therefore isn't fighting so hard, with other applications, for memory. If this proves necessary, we can add a DWORD registry key to define a new default value on that specific machine. All values in these keys need to be in multiples of 65,536 (64 KB), up to a maximum of 1,048,576 (1,024 KB, the default).
HKEY_LOCAL_MACHINE\SOFTWARE\Red Gate\SQL Backup\BackupSettingsGlobal\<instance name>\MAXTRANSFERSIZE (32-bit)
HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Red Gate\SQL Backup\BackupSettingsGlobal\<instance name>\MAXTRANSFERSIZE (64-bit)
The maximum data block option (by default 2 MB) determines the size of the actual data blocks on disk, when storing backup data. For optimal performance, we want this value to match or fit evenly in the block size of the media to which the files are being written; if the data block sizes overlap the media block boundaries, it may result in performance degradation. Generally speaking, SQL Server will automatically select the correct block size based on the media. However, if necessary, we can create a registry entry to override the default value:
HKEY_LOCAL_MACHINE\SOFTWARE\Red Gate\SQL Backup\BackupSettingsGlobal\<instance name>\MAXDATABLOCK (32-bit)
HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Red Gate\SQL Backup\BackupSettingsGlobal\<instance name>\MAXDATABLOCK (64-bit)
The network resilience options determine how long to wait before retrying a failed backup operation, and how many times to retry; when the retry count has been exhausted, a failure message will be issued. We want to be able to retry a failed backup a few times, but we don't want to retry so many times that a problem in the network or disk subsystem is masked for too long; in other words, rather than extend the backup operation as it retries again and again, it's better to retry only a few times and then fail, therefore highlighting the network issue. This is especially true when performing full backup operations on large databases, where we'll probably want to knock that retry count down from a default of 10 to 2 or 3 times. Transaction log backups, on the other hand, are typically a short process, so retrying 10 times is not an unreasonable number in this case.
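Returning to the transfer size registry value, the "multiples of 65,536, up to 1,048,576" rule leaves only sixteen permissible values. The short script below is just an arithmetic aid for picking or checking one of them; it does not touch the registry or SQL Backup, and the candidate value of 262,144 is an arbitrary example.

```sql
-- List every value that is a multiple of 65,536 bytes (64 KB),
-- up to the 1,048,576-byte (1,024 KB) maximum.
WITH Multiples AS
(
    SELECT 1 AS n
    UNION ALL
    SELECT n + 1 FROM Multiples WHERE n < 16
)
SELECT  n * 65536 AS transfer_size_bytes ,
        n * 64 AS transfer_size_kb
FROM    Multiples;

-- Check a candidate value before writing it to the registry key.
-- 262144 is an arbitrary example value.
DECLARE @Candidate INT = 262144;
SELECT  CASE WHEN @Candidate BETWEEN 65536 AND 1048576
                  AND @Candidate % 65536 = 0
             THEN 'valid'
             ELSE 'invalid'
        END AS candidate_check;
```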
The On completion section gives us the option to verify our backup files after completion and send an email once the operations are complete. The verification process is similar to the CHECKSUM operation in native SQL Server backups (see Chapter 2). SQL Backup will make sure that all data blocks have been written correctly and that you have a valid backup file. Note that, at the time of this book going to print, Red Gate released SQL Backup Pro Version 7, which expands the backup verification capabilities to allow standard backup verification (BACKUP ... WITH CHECKSUM) and restore verification (RESTORE VERIFYONLY), as shown in Figure 8-5b. We'll revisit the topic of backup verification with SQL Backup later in the chapter.
Fig 8-5b:
The email notification can be set to alert on any error, warning, or just when the backup has completed in any state including success.
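For comparison, the two native verification mechanisms mentioned above look roughly like this in plain T-SQL. This is a sketch only: the .bak file name is made up for illustration, and COPY_ONLY is added here so that trying it out does not disturb the backup chain used in the rest of the chapter.

```sql
USE [master]
GO
-- Backup verification: validate existing page checksums as pages are read
-- and write a checksum for the backup as a whole.
-- The file name is a hypothetical example.
BACKUP DATABASE [DatabaseForSQLBackups]
TO DISK = N'C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Full_Verify.bak'
WITH CHECKSUM, COPY_ONLY, STATS = 10
GO
-- Restore verification: confirm the backup set is complete and readable,
-- rechecking the checksums, without actually restoring any data.
RESTORE VERIFYONLY
FROM DISK = N'C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Full_Verify.bak'
WITH CHECKSUM
GO
```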
Figure 8-6: Status details from a successful SQL Backup GUI backup operation.
Backup metrics: SQL Backup Pro vs. native vs. native compressed
Figure 8-6 reports an initial database size of 600 MB, a backup file size of 3.6 MB and backup time of about 3 seconds. Remember, these metrics are ballpark figures and won't match exactly the ones you get on your system. Don't necessarily trust SQL Backup on the backup size; double-check it in the backup location folder (in my case it was 3.7 MB; pretty close!).
Figure 8-7 compares these metrics to those obtained for identical operations using native backup, compressed native backup, as well as SQL Backup using a higher compression level (3).

Backup Operation                       Backup File Size   Backup Time   % Difference compared to Native Full
                                       in MB (on disk)    (seconds)     (+ = bigger / faster)
                                                                        Size       Speed
Native Full                            501                19.6
Native Full with Compression           6.7                4.6           -98.7      +76.5
SQL Backup Full Compression Level 1    3.7                3.1           -99.3      +84.2
SQL Backup Full Compression Level 3    7.3                4.5           -98.5      +77

Figure 8-7:
SQL Backup (Compression Level 1) produces a backup file that requires less than 1% of the space required for the native full backup file. To put this into perspective, in the space needed to store 3 days' worth of native full backups, we could store around 400 SQL Backup files. It also increases the backup speed by about 84%. How does that work? Basically, every backup operation reads the data from the disk and writes to a backup-formatted file, on disk. By far the slowest part of the operation is writing the data to disk. When SQL Backup (or native compression) performs its task, it is reading all of the data, passing it through a compression tool and writing much smaller segments of data, so less time is used on the slowest operation. In this test, SQL Backup with Compression Level 1 outperformed the native compressed backup. However, for SQL Backup with Compression Level 3, their performance was almost identical. It's interesting to note that while Level 3 compression should, in theory, have resulted in a smaller file and a longer backup time, compared to Level 1, we in fact saw a larger file and a longer backup time! This highlights the importance of selecting the compression level carefully.
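The percentage columns in Figure 8-7 are simple ratios against the native full backup (501 MB, 19.6 seconds). As a worked example, the following query recomputes them from the sizes and times quoted above; the column aliases are my own, and the results should agree with the figure to one decimal place.

```sql
-- Size difference:  (backup size - native size) / native size * 100  (negative = smaller)
-- Speed difference: (native time - backup time) / native time * 100  (positive = faster)
SELECT  b.operation ,
        b.size_mb ,
        b.seconds ,
        CAST(( b.size_mb - n.size_mb ) * 100.0 / n.size_mb AS DECIMAL(6, 1)) AS size_pct_diff ,
        CAST(( n.seconds - b.seconds ) * 100.0 / n.seconds AS DECIMAL(6, 1)) AS speed_pct_diff
FROM    ( VALUES ( 'Native Full with Compression', 6.7, 4.6 ) ,
                 ( 'SQL Backup Full Compression Level 1', 3.7, 3.1 ) ,
                 ( 'SQL Backup Full Compression Level 3', 7.3, 4.5 )
        ) AS b ( operation, size_mb, seconds )
        CROSS JOIN ( VALUES ( 501.0, 19.6 ) ) AS n ( size_mb, seconds );
```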
Take a look in the SQLData folder, and you'll see that the data file is now about 1 GB in size and the log file is still around 100 MB. Now that we have some more data to work with, Listing 8-4 shows the SQL Backup T-SQL script to perform a second full backup of our DatabaseForSQLBackups database.
EXECUTE master..sqlbackup '-SQL "BACKUP DATABASE [DatabaseForSQLBackups] TO DISK = ''C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Full_2.sqb'' WITH DISKRETRYINTERVAL = 30, DISKRETRYCOUNT = 10, THREADCOUNT = 2, COMPRESSION = 1"'
GO
Listing 8-4: Second full backup of DatabaseForSQLBackups using SQL Backup T-SQL.
The first thing to notice is that the backup is executed via an extended stored procedure called sqlbackup, which resides in the master database and utilizes some compiled DLLs that have been installed on the server. We pass to this stored procedure a set of parameters as a single string, which provides the configuration settings that we wish to use for the backup operation. Some of the names of these settings are slightly different from what we saw for native backups but, nevertheless, the script should look fairly familiar. We see the usual BACKUP DATABASE command to signify that we are about to back up the DatabaseForSQLBackups database. We also see the TO DISK portion and the path to where the backup file will be stored.
The latter portion of the script sets values for some of the optimization settings that we saw on Step 4 of the GUI wizard. These are not necessary for taking the backup, but it's useful to know what they do.
• DISKRETRYINTERVAL: One of the network resiliency options; amount of time in seconds SQL Backup will wait before retrying the backup operation, in the case of a failure.
• DISKRETRYCOUNT: Another network resiliency option; number of times a backup will be attempted in the event of a failure. Bear in mind that the more times we retry, and the longer the retry interval, the more extended will be the backup operation.
• THREADCOUNT: Using multiple processors and multiple threads can offer a huge performance boost when taking backups.
The only setting that could make much of a difference is the threadcount parameter, since on powerful multi-processor machines it can spread the load of backup compression over multiple processors. Go ahead and run the script now and the output should look similar to that shown in Figure 8-8.
Figure 8-8:
We are now presented with two result sets. The first result set, shown in Figure 8-8, provides our detailed backup metrics, including database size, and the size and compression rate of the resulting backup file. The second result set (not shown) gives us the exit code from SQL Backup, an error code from SQL Server and a list of files used in the command. We can see from the first result set that the new backup file size is just under 7 MB, and if we take a look in our directory we can confirm this. Compared to our first Red Gate backup, we can see that this is a bit less than double the original file size, but we will take a closer look at the numbers in just a second. After these metrics, we see the number of pages processed. Referring back to Chapter 3 confirms that this is the same number of pages as for the equivalent native backup, which is as expected. We also see that the backup took just 6 seconds to complete which, again, is roughly double the figure for the first full backup. Figure 8-9 compares these metrics to those obtained for native backups, and compressed native backups.
All the results are broadly consistent with those we achieved for the first full backup. It confirms that for this data, SQL Backup Compression Level 1 is the best-performing backup, both in terms of backup time and backup file size.
Log Backups
In this section, we'll walk through the process of taking log backups of the example DatabaseForSQLBackups database, again using either the SQL Backup GUI, or a SQL Backup T-SQL script.
Listing 8-5: Switching DatabaseForSQLBackups to FULL recovery and taking a full backup.
The database is now operating in FULL recovery and this backup file, DatabaseForSQLBackups_Full_BASE.sqb, will be the one to restore, prior to restoring any subsequent log backups. Finally, let's perform a third, much smaller, data load, adding 50,000 new rows to each message table, as shown in Listing 8-6.
USE [DatabaseForSQLBackups]
GO
INSERT INTO dbo.MessageTable1
VALUES ( '1st set of short messages for MessageTable1', GETDATE() )
GO 50000
-- this is just to help with our point-in-time restore (Chapter 8)
PRINT GETDATE()
GO
INSERT INTO dbo.MessageTable2
VALUES ( '1st set of short messages for MessageTable2', GETDATE() )
GO 50000
Figure 8-10: Selecting the type of backup and the target database.
On Step 3, we set the name and location for our log backup file. Again, adhering to the convention used throughout the book, we'll place the backup file in the C:\SQLBackups\Chapter8\ folder and call it DatabaseForSQLBackups_Log_1.sqb. Step 4 of the wizard is where we will configure the compression, optimization and resiliency options of our transaction log backup. The Compression Analyzer only tests full backups and all transaction logs are pretty much the same in terms of compressibility. We'll choose Compression Level 1, again, but since our transaction log backup will, in this case, process only a small amount of data, we could select maximum compression (Level 4) without affecting the processing time significantly. We're not going to change any of the remaining options on this screen, and we have discussed them all already, so go ahead and click on Next. If everything looks as expected on the Summary screen, click on Finish to start the backup. If all goes well, within a few seconds the appearance of two green checkmarks will signal that all pieces of the operation have been completed, and some backup metrics will be displayed. If you prefer the script-based approach, Listing 8-7 shows the SQL Backup T-SQL script that does directly what our SQL Backup GUI did under the covers.
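As a rough sketch only, a log backup command of this kind might look like the following. It is modeled on the full backup script in Listing 8-4, swapping BACKUP DATABASE for BACKUP LOG; the option values are assumptions carried over from that listing rather than the exact script the GUI generates.

```sql
USE [master]
GO
-- Transaction log backup via the SQL Backup extended stored procedure.
-- Option values mirror Listing 8-4 and are assumptions, not GUI output.
EXECUTE master..sqlbackup '-SQL "BACKUP LOG [DatabaseForSQLBackups] TO DISK = ''C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Log_1.sqb'' WITH DISKRETRYINTERVAL = 30, DISKRETRYCOUNT = 10, COMPRESSION = 1, THREADCOUNT = 2"'
GO
```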
Whichever way you decide to execute the log backup, you should see backup metrics similar to those shown in Figure 8-11.
Backing up DatabaseForSQLBackups (transaction log) to:
C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Log_1.sqb
Backup data size : 50.625 MB
Compressed data size: 7.020 MB
Compression rate : 86.13%
Processed 6261 pages for database 'DatabaseForSQLBackups', file 'DatabaseForSQLBackups_log' on file 1.
BACKUP LOG successfully processed 6261 pages in 0.662 seconds (73.877 MB/sec).
SQL Backup process ended.
The backup metrics report a compressed backup size of 7 MB, which we can verify by checking the actual size of the file in the C:\SQLBackups\Chapter8\ folder, and a processing time of about 0.7 seconds. Once again, Figure 8-12 compares these metrics to those obtained for native backups, compressed native backups and for SQL Backup with a higher compression level.
In all cases, there is roughly a 90% saving in disk space for compressed backups, over the native log backup. In terms of backup performance, native log backups, native compressed log backups, and SQL Backup Compression Level 1 all run in sub-second times, so it's hard to draw too many conclusions except to say that for smaller log files the time savings are less significant than for full backups, as would be expected. SQL Backup Compression Level 3 does offer the smallest backup file footprint, but the trade-off is backup performance that is significantly slower than for native log backups.
Differential Backups
Finally, let's take a very quick look at how to perform a differential database backup using SQL Backup. For full details on what differential backups are, and when they can be useful, please refer back to Chapter 7. First, simply adapt and rerun Listing 8-6 to insert a load of 100,000 rows into each of the message tables (also, adapt the message text accordingly).
Then, if you prefer the GUI approach, jump-start SQL Backup and work through in exactly the same way as described for the full backup. The only differences will be:
• at Step 2, choose Differential as the backup type
• at Step 3, call the backup file DatabaseForSQLBackups_Diff_1.sqb and locate it in the C:\SQLBackups\Chapter8 folder
• at Step 4, choose Compression Level 1.
If you prefer to run a script, the equivalent SQL Backup script is shown in Listing 8-8.
USE [master]
GO
EXECUTE master..sqlbackup '-SQL "BACKUP DATABASE [DatabaseForSQLBackups] TO DISK = ''C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Diff_1.sqb'' WITH DISKRETRYINTERVAL = 30, DISKRETRYCOUNT = 10, COMPRESSION = 1, THREADCOUNT = 2, DIFFERENTIAL"'
Again, there is little new here; the command is more or less identical to the one for full backups, with the addition of the DIFFERENTIAL keyword to the WITH clause, which instructs SQL Server to back up only the data changed since the last full backup was taken.
Backing up DatabaseForSQLBackups (differential database) to:
C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Diff_1.sqb
Backup data size : 269.063 MB
Compressed data size: 2.575 MB
Compression rate : 99.04%
Processed 33752 pages for database 'DatabaseForSQLBackups', file 'DatabaseForSQLBackups' on file 1.
Processed 2 pages for database 'DatabaseForSQLBackups', file 'DatabaseForSQLBackups_log' on file 1.
BACKUP DATABASE WITH DIFFERENTIAL successfully processed 33754 pages in 1.695 seconds (155.573 MB/sec).
SQL Backup process ended.
Let's do a final metrics comparison for a range of differential backups.

Backup Operation                               Backup File Size   Backup Time   % Difference compared to Native
                                               in MB (on disk)    (seconds)     (+ = bigger / faster)
                                                                                Size       Speed
Native Differential                            270.4              4.4
Native Differential with Compression           4.2                2.4           -98.4      +45.5
SQL Backup Differential Compression Level 1    2.6                1.7           -99.0      +61.4
SQL Backup Differential Compression Level 3    4.7                2.7           -98.3      +38.6
Once again, the space and time savings from compressed backup are readily apparent, with SQL Backup Compression Level 1 emerging as the most efficient on both counts, in these tests.
• configure the type of backup required (full, differential, or log)
• store the backup files using the default naming convention set up in Red Gate SQL Backup Pro
• capture a report of any error or warning codes during the backup operation.
Take a look at the script in Listing 8-9, and then we'll walk through all the major sections.
USE [master]
GO
DECLARE @BackupFileLocation NVARCHAR(200) ,
    @EmailOnFailure NVARCHAR(200) ,
    @SQLBackupCommand NVARCHAR(2000) ,
    @DatabaseList NVARCHAR(2000) ,
    @ExitCode INT ,
    @ErrorCode INT ,
    @BackupType NVARCHAR(4)
-- Configure Options Here
SET @BackupFileLocation = N'\\NetworkServer\ShareName\' + @@SERVERNAME + '\'
SET @EmailOnFailure = N'DBATeam@MyCompany.com'
SET @DatabaseList = N'DatabaseForDiffBackups_SB'
SET @BackupType = N'DIFF'
-- Do Not Modify Below
SET @SQLBackupCommand = CASE @BackupType
    WHEN N'FULL'
    THEN N'-SQL "BACKUP DATABASES [' + @DatabaseList + N'] TO DISK = ''' + @BackupFileLocation + N'<AUTO>.sqb'' WITH MAILTO_ONERRORONLY = ''' + @EmailOnFailure + N''', DISKRETRYINTERVAL = 30, DISKRETRYCOUNT = 10, COMPRESSION = 3, THREADCOUNT = 2"'
    WHEN N'LOG'
    THEN N'-SQL "BACKUP LOGS [' + @DatabaseList + N'] TO DISK = ''' + @BackupFileLocation + N'<AUTO>.sqb'' WITH MAILTO_ONERRORONLY = ''' + @EmailOnFailure
The script starts by declaring the required variables, and then sets the values of the four configurable variables, as below.

@BackupFileLocation
The backup path for our database backup files. This should be pointed at the centralized storage location. Also notice that the servername variable is used as a subdirectory. It is common practice to separate backup files by server, so this will use each server name as a subdirectory to store that set of backup files.

@EmailOnFailure
We always want to know when backups fail. In some environments, manually checking all database backup operations is just not feasible. Having alerts sent to the DBA team on failure is a good measure to have in place. Be sure to test the email setting on each server occasionally, to guarantee that any failure alerts are getting through. Details of how to configure the email settings are in Appendix A.

@DatabaseList
A comma-delimited text list that contains the names of any databases that we want to back up. SQL Backup allows you to back up any number of databases at one time with a single command. When backing up every database on a server, we can simply omit this parameter from the script and, in the BACKUP command, replace [' + @DatabaseList + '] with [*].

@BackupType
Used to determine what type of database backup will be taken. In this script, the variable can take one of three values: full, log, or differential.

In the next section of the script, we build the BACKUP commands, using the variables for which we have just configured values. We store the BACKUP command in the @SQLBackupCommand variable, until we are ready to execute it. Notice that a simple CASE statement is used to determine the type of backup operation to be performed, according to the value stored in @BackupType. We don't need to modify anything in this section of the script, unless making changes to the other settings being used in the BACKUP command, such as the compression level. Of course, we could also turn those into configurable parameters.

The next section of the script is where we execute our BACKUP command, storing the ExitCode and ErrorCode output parameters in our defined variables, where:

ExitCode is the output value from the SQL Backup extended stored procedure. Any number above 0 indicates some sort of issue with the backup execution.
ExitCode >= 500 indicates a serious problem with at least one of the backup files, and we will need to investigate further.
ExitCode < 500 is just a warning code. The backup operation itself may have run successfully, but there was some issue that was not critical enough to cause the entire operation to fail.

ErrorCode is the SQL Server return value. A value above 0 is returned only when SQL Server itself runs into an issue. Having an error code returned from SQL Server almost always indicates a critical error for the entire operation.

We test the value of each of these codes and, if a serious problem has occurred, we raise an error to SQL Server so that, if the script is run in a SQL Server Agent job, the job is guaranteed to fail and alert someone, provided it is configured to do so. We do have it set up to send email from SQL Backup on a failure, but also having the SQL Agent job alert on failure is a nice safeguard to have in place. What we do in this section is totally customizable and dictated by our needs.
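To make the execute-and-test step concrete, here is a minimal sketch of how such a section might look. It is not a copy of the remainder of Listing 8-9: the backup command string, the error message text, and the warning branch are assumptions chosen purely for illustration, and the code needs SQL Backup Pro installed on the instance.

```sql
USE [master]
GO

DECLARE @SQLBackupCommand NVARCHAR(2000) ,
    @ExitCode INT ,
    @ErrorCode INT

-- Hypothetical command for illustration; in the full script this string
-- is built by the CASE expression from the configured variables.
SET @SQLBackupCommand = N'-SQL "BACKUP DATABASE [DatabaseForSQLBackups] TO DISK = ''C:\SQLBackups\Chapter8\<AUTO>.sqb'' WITH COMPRESSION = 1"'

-- Run the backup and capture both output codes
EXECUTE master..sqlbackup @SQLBackupCommand, @ExitCode OUTPUT, @ErrorCode OUTPUT

-- Serious problem: raise an error so that a SQL Server Agent job step fails
IF ( @ExitCode >= 500 OR @ErrorCode > 0 )
    BEGIN
        RAISERROR('SQL Backup failed. Exit code: %d. SQL error code: %d.', 16, 1, @ExitCode, @ErrorCode)
    END
-- Warning only: the backup may have succeeded, so just report the code
ELSE IF ( @ExitCode > 0 )
    BEGIN
        PRINT 'SQL Backup completed with warning code ' + CAST(@ExitCode AS VARCHAR(10))
    END
GO
```

Raising the error with severity 16 is what causes an Agent job step to report failure; how warnings are handled (printed, logged to a table, or ignored) is a matter of local preference.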
So, to recap, we have 2 million rows captured in the base full backup. We switched the database from SIMPLE to FULL recovery model, added 100,000 rows to our tables, then did a log backup, so the TLog1 backup captures the details of inserting those 100,000 rows. We then added another 200,000 rows and took a differential backup. Differentials capture all the data added since the last full backup, so 300,000 rows in this case. We then added 21 rows and took a second log backup, so the TLog2 backup will capture details of 200,021 inserted rows (i.e. all the changes since the last log backup). Finally, we added another 21 rows which are not currently captured in any backup. The backup scheme seems quite hard to swallow when written out like that, so hopefully Figure 8-15 will make it easier to digest.
However, if the restore is taking place several days or weeks after the backup, then the files will likely have been moved to a new location, and we'll need to manually locate each of the required files for the restore process before SQL Backup will let us proceed. To do so, select Browse for backup files to restore from the top drop-down box, and then click the Add Files button to locate each of the required files, in turn. We can select multiple files in the same directory by holding down the Ctrl button on the keyboard. We can also add a network location into this menu by using the Add Server button, or by pasting in a full network path in the file name box. Whether SQL Backup locates the files for us, or we do it manually, we should end up with a screen that looks similar to that shown in Figure 8-17.
Figure 8-17: Identifying the required files for the restore process.
In this example, we need our base full backup and our differential backup files, followed by the second transaction log backup. Note that the availability of the differential backup means we can bypass our first transaction log backup (DatabaseForSQLBackups_Log_1.sqb). However, if for some reason the differential backup was unavailable, then we could still complete the restore process using the full backup followed by both the log files. We're going to overwrite the existing DatabaseForSQLBackups database and leave the data and log files in their original C:\SQLData directory. Note that the handy File locations drop-down is an innovation in SQL Backup 6.5; if using an older version, you'll need to manually fill in the paths using the ellipsis (...) buttons.
Click Next and, on the following screen, SQL Backup warns us that we've not performed a tail log backup and gives us the option to do so, before proceeding. We do not need to restore the tail of the log as part of this restore process as we're deliberately only restoring to the end of our second log backup file. However, remember that the details of our INSERTs into MessageTable2, in Listing 8-10, are not currently backed up, and we don't want to lose details of these transactions, so we're going to go ahead and perform this tail log backup. Accepting this option catapults us into a log backup operation and we must designate the name and location for the tail log backup, and follow through the process, as was described earlier in the chapter.
Once complete, we should receive a message saying that the backup of the tail of the transaction log was successful and to click Next to proceed with the restore. Having done so, we re-enter Step 3 of our original restore process, offering a number of database restore options.
The first section of the screen defines the Recovery completion state, and we're going to stick with the first option, Operational (RESTORE WITH RECOVERY). This will leave our database in a normal, usable state once the restore process is completed. The other two options allow us to leave the database in a restoring state, expecting more backup files, or to restore a database in Standby mode. We'll cover each of these later in the chapter.

The Transaction log section of the screen is used when performing a restore to a specific point in time within a transaction log backup file, and will also be covered later.

The final section allows us to configure a few special operations, after the restore is complete. The first option will test the database for orphaned users. An orphaned user is created when a database user has permissions set internally to a database, but that user doesn't have a matching login, either as a SQL login or an Active Directory login. Orphaned users often occur when moving a database between environments or servers, and are especially problematic when moving databases between operationally different platforms, such as between development and production, as we discussed in Chapter 3. Be sure to take care of these orphaned users after each restore, by either matching the user with a correct SQL Server login or by removing that user's permission from the database.

The final option is used to send an email to a person or a group of people when the operation has completed. It can be configured such that a mail is sent regardless of outcome, or when an error or warning occurs, or on error only. This is a valuable feature for any DBA. Just as we wrote email notification into our automated backup scripts, so we also need to know if a restore operation fails for some reason, or reports a warning. This may be grayed out unless the mail server options have been correctly configured. Refer to Appendix A if you want to try out this feature here.

Another nice use case for these notifications is when performing time-sensitive restores on VLDBs. We may not want to monitor the restore manually as it may run long into the night. Instead, we can use this feature so that the DBA, and the departments that need the database immediately, get a notification when the restore operation has completed.
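For readers who want to check for orphaned users by hand after a restore, the following is a minimal sketch using the standard catalog views. It is not what SQL Backup runs internally (that is not described here); it simply lists SQL-authenticated database users whose SID has no matching server login. The user name in the commented-out fix is hypothetical.

```sql
USE [DatabaseForSQLBackups]
GO

-- List SQL users in this database that have no matching login on the server.
-- Note: users deliberately created WITHOUT LOGIN will also appear here.
SELECT  dp.name AS OrphanedUser ,
        dp.sid
FROM    sys.database_principals AS dp
        LEFT JOIN sys.server_principals AS sp ON dp.sid = sp.sid
WHERE   dp.type = 'S'            -- SQL-authenticated users only
        AND dp.principal_id > 4  -- skip dbo, guest, INFORMATION_SCHEMA and sys
        AND sp.sid IS NULL
GO

-- To re-map an orphaned user to an existing login (hypothetical name):
-- ALTER USER [AppUser] WITH LOGIN = [AppUser]
```

Whether to re-map the user or remove it depends on whether the login is supposed to exist on the target server, which is a decision for the DBA rather than something the query can determine.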
Click Next to reach the, by now familiar, Summary screen. Skip to the Script tab to take a sneak preview of the script that SQL Backup has generated for this operation. You'll see that it's a three-step restore process, restoring first the base full backup, then the differential backup, and finally the second log backup (you'll see a similar script again in the next section and we'll go over the full details there).
EXECUTE master..sqlbackup '-SQL "RESTORE DATABASE [DatabaseForSQLBackups] FROM DISK = ''C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Full_BASE.sqb'' WITH NORECOVERY, REPLACE"'

EXECUTE master..sqlbackup '-SQL "RESTORE DATABASE [DatabaseForSQLBackups] FROM DISK = ''C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Diff_1.sqb'' WITH NORECOVERY"'

EXECUTE master..sqlbackup '-SQL "RESTORE LOG [DatabaseForSQLBackups] FROM DISK = ''C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Log_2.sqb'' WITH RECOVERY, ORPHAN_CHECK"'
Listing 8-11: The SQL Backup script generated by the SQL Backup Wizard.
As a side note, I'm a little surprised to see the REPLACE option in the auto-generated script; it's not necessary as we did perform a tail log backup. If everything is as it should be, click Finish and the restore process will start. All steps should show green check marks to let us know that everything finished successfully and some metrics for the restore process should be displayed, a truncated view of which is given in Figure 8-21.
<snip>
Restoring DatabaseForSQLBackups (database) from:
  C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Full_BASE.sqb
Processed 125208 pages for database 'DatabaseForSQLBackups', file 'DatabaseForSQLBackups' on file 1.
Processed 3 pages for database 'DatabaseForSQLBackups', file 'DatabaseForSQLBackups_log' on file 1.
RESTORE DATABASE successfully processed 125211 pages in 16.715 seconds (58.522 MB/sec).
SQL Backup process ended.
<snip>
Restoring DatabaseForSQLBackups (database) from:
  C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Diff_1.sqb
Processed 33744 pages for database 'DatabaseForSQLBackups', file 'DatabaseForSQLBackups' on file 1.
Processed 2 pages for database 'DatabaseForSQLBackups', file 'DatabaseForSQLBackups_log' on file 1.
RESTORE DATABASE successfully processed 33746 pages in 6.437 seconds (40.956 MB/sec).
SQL Backup process ended.
<snip>
Restoring DatabaseForSQLBackups (transaction logs) from:
  C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Log_2.sqb
Processed 0 pages for database 'DatabaseForSQLBackups', file 'DatabaseForSQLBackups' on file 1.
Processed 12504 pages for database 'DatabaseForSQLBackups', file 'DatabaseForSQLBackups_log' on file 1.
RESTORE LOG successfully processed 12504 pages in 1.798 seconds (54.330 MB/sec).
No orphaned users detected.
SQL Backup process ended.
<snip>
Figure 8-21: Metrics for SQL Backup restore to end of the second log backup.
We won't dwell on the metrics here as we'll save that for a later section, where we compare the SQL Backup restore performance with native restores. Being pessimistic DBAs, we won't believe the protestations of success from the output of the restore process, until we see with our own eyes that the data is as it should be.
USE DatabaseForSQLBackups
GO

SELECT  MessageData ,
        COUNT(MessageData)
FROM    dbo.MessageTable1
GROUP BY MessageData

SELECT  MessageData ,
        COUNT(MessageData)
FROM    dbo.MessageTable2
GROUP BY MessageData
The result confirms that all the data is there from our full, differential, and second log backups, but that the 21 rows we inserted into MessageTable2 are currently missing.
Never fear; since we had the foresight to take a tail log backup, we can get those missing 21 rows back.
There shouldn't be too much here that is new, but let's go over some of the WITH clause options.

DISCONNECT_EXISTING
Used in Step 1, this kills any current connections to the database. Without this option, we would need to use functionality similar to that which we built into our native restore script in Chapter 4 (see Listing 4-2).

REPLACE
This is required here, since we are now working with a new, freshly restored copy of the database and we aren't performing a tail log backup as the first step of this restore operation. SQL Server will use the logical file names and paths that are stored in the backup file. Remember that this only works if the paths in the backup file exist on the server to which you are restoring.

NORECOVERY
Used in Steps 1-3, this tells SQL Server to leave the database in a restoring state and to expect more backup files to be applied.

ORPHAN_CHECK
Used in Step 4, this is the orphaned user check on the database, after the restore has completed, as described in the previous section.

RECOVERY
Used in Step 4, this instructs SQL Server to recover the database to a normal usable state when the restore is complete.
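To show where each of these options sits in a four-step restore, here is a hedged sketch assembled from the file names used earlier in the chapter. It is an illustration, not a reproduction of the chapter's own listing: in particular, the tail log backup file name in Step 4 is hypothetical, since the name chosen in the wizard is not stated.

```sql
USE [master]
GO

-- Step 1: base full backup; drop existing connections and overwrite the database
EXECUTE master..sqlbackup '-SQL "RESTORE DATABASE [DatabaseForSQLBackups] FROM DISK = ''C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Full_BASE.sqb'' WITH DISCONNECT_EXISTING, REPLACE, NORECOVERY"'

-- Step 2: differential backup; stay in the restoring state
EXECUTE master..sqlbackup '-SQL "RESTORE DATABASE [DatabaseForSQLBackups] FROM DISK = ''C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Diff_1.sqb'' WITH NORECOVERY"'

-- Step 3: second log backup; stay in the restoring state
EXECUTE master..sqlbackup '-SQL "RESTORE LOG [DatabaseForSQLBackups] FROM DISK = ''C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Log_2.sqb'' WITH NORECOVERY"'

-- Step 4: tail log backup (hypothetical file name); recover and check for orphaned users
EXECUTE master..sqlbackup '-SQL "RESTORE LOG [DatabaseForSQLBackups] FROM DISK = ''C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Log_Tail.sqb'' WITH RECOVERY, ORPHAN_CHECK"'
GO
```

Only the final step uses RECOVERY; specifying it any earlier would bring the database online and prevent the remaining backups from being applied.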
Execute the script, then, while it is running, we can take a quick look at the SQL Backup monitoring stored procedure, sqbstatus, a feature that lets us monitor any SQL Backup restore operation, while it is in progress. Quickly open a second tab in SSMS and execute Listing 8-14.
EXEC master..sqbstatus
GO
Listing 8-14: Red Gate SQL Backup Pro monitoring stored procedure.
The stored procedure returns four columns: the name of the database being restored; the identity of the user running the restore; how many bytes of data have been processed; and the number of compressed bytes that have been produced in the backup file. It can be useful to check this output during a long-running restore the first time you perform it, to gauge compression rates, or to get an estimate of completion time for restores and backups on older versions of SQL Server, where Dynamic Management Views are not available to tell you that information. Once the restore completes, you'll see restore metrics similar to those shown in Figure 8-21, but with an additional section for the tail log restore. If you rerun Listing 8-12 to verify your data, you should find that the "missing" 21 rows in MessageTable2 are back!
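On versions of SQL Server where Dynamic Management Views are available, a query along the following lines can provide a progress estimate for backup and restore requests. This is a general-purpose sketch against sys.dm_exec_requests, not part of SQL Backup, and the column aliases are arbitrary.

```sql
-- Show progress for any backup or restore request currently running
SELECT  r.session_id ,
        r.command ,
        r.percent_complete ,
        -- estimated_completion_time is reported in milliseconds
        DATEADD(SECOND, r.estimated_completion_time / 1000, GETDATE()) AS EstimatedCompletionTime
FROM    sys.dm_exec_requests AS r
WHERE   r.command LIKE 'BACKUP%'
        OR r.command LIKE 'RESTORE%'
```

The estimate tends to be most useful for long-running operations, since the figures can fluctuate considerably in the first few seconds.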
ship over and apply transaction logs, using the WITH STANDBY option, to roll forward the standby database and keep it closely in sync with the primary. In between log restores, the standby database remains accessible but in a read-only state. This makes it a good choice for near real-time reporting solutions where some degree of time lag in the reporting data is acceptable.

However, this option is occasionally useful when in the unfortunate position of needing to roll forward through a set of transaction logs to locate exactly where a data mishap occurred. It's a laborious process (roll forward a bit, query the standby database, roll forward a bit further, query again, and so on) but, in the absence of any other means to restore a particular object or set of data, such as a tool that supports object-level restore (more on this a little later), it could be a necessity.

In order to simplify our point-in-time restore, let's run another full backup, as shown in Listing 8-15.
EXECUTE master..sqlbackup '-SQL "BACKUP DATABASE [DatabaseForSQLBackups] TO DISK = ''C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Full_BASE2.sqb'' WITH DISKRETRYINTERVAL = 30, DISKRETRYCOUNT = 10, THREADCOUNT = 2, COMPRESSION = 1"'
GO
We'll then add some more data to each of our message tables, before simulating a disaster, in the form of someone accidentally dropping MessageTable2.
USE [DatabaseForSQLBackups]
GO

INSERT  INTO dbo.MessageTable1
VALUES  ( 'MessageTable1, I think the answer might be 41. No, wait...', GETDATE() )
GO 41
In this simple example, we have the luxury of knowing exactly when each event occurred. However, imagine this is a busy production database, and we only find out about the accidental table loss many hours later. Listing 8-17 simulates one of our regular, scheduled log backups, which runs after the data loss has occurred.
USE [master]
GO

EXECUTE master..sqlbackup '-SQL "BACKUP LOG [DatabaseForSQLBackups] TO DISK = ''C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Log_3.sqb'' WITH DISKRETRYINTERVAL = 30, DISKRETRYCOUNT = 10, COMPRESSION = 1, THREADCOUNT = 2"'
Figure 8-23: Identifying the backup files for our PIT restore.
Our intention here, as discussed, is to restore a copy of the DatabaseForSQLBackups database in Standby mode. This will give us read access to the standby copy as we attempt to roll forward to just before the point where we lost MessageTable2. So, this time, we'll restore to a new database, called DatabaseForSQLBackups_Standby, as shown in Figure 8-24.
At Step 3, we're going to choose a new option for the completion state of our restored database, which is Read-only (RESTORE WITH STANDBY). In doing so, we must create an undo file for the standby database. As we subsequently apply transaction log backups to our standby database, to roll forward in time, SQL Server needs to be able to roll back the effects of any transactions that were uncommitted at the point in time to which we are restoring. However, the effects of these uncommitted transactions must be preserved. As we roll further forward in time, SQL Server may need to reapply the effects of a transaction it previously rolled back. If SQL Server didn't keep a record of that activity, we wouldn't be able to keep our database in a consistent state. All of this information regarding the rolled back transactions is managed through the undo file. We'll place the undo file in our usual SQLBackups directory.

In the central portion of the screen, we have the option to restore the transaction log to a specific point in time; we're going to roll forward in stages, first to a point as close as we can after 10:41:36.540, which should be the time we completed the batch of 41 INSERTs into MessageTable1. Again, remember that in a real restore scenario, you will probably not know which statements were run when.
Click Next to reach the Summary screen, where we can also take a quick preview of the script that's been generated (we'll discuss this in more detail shortly). Click Finish to execute the restore operation and it should complete quickly and successfully, with the usual metrics output. Refresh the SSMS Object Explorer to reveal a database called DatabaseForSQLBackups_Standby, which is designated as being in a Standby/Read-Only state. We can query it to see if we restored to the point we intended.
USE DatabaseForSQLBackups_Standby
GO

SELECT  MessageData ,
        COUNT(MessageData)
FROM    dbo.MessageTable1
GROUP BY MessageData
As we hoped, we've got both tables back, and we've restored the 41 rows in MessageTable1, but not the 42 in MessageTable2.
To get these 42 rows back, so the table is back to the state it was when dropped, we'll need to roll forward a little further, but stop just before the DROP TABLE command was issued. Start another restore operation on DatabaseForSQLBackups, and proceed as before to Step 2. This time, we want to overwrite the current DatabaseForSQLBackups_Standby database, so select it from the drop-down box.
At Step 3, we'll specify another standby restore, using the same undo file, and this time we'll roll forward to just after we completed the load of 42 rows into MessageTable2, but just before that table got dropped (i.e. as close as we can to 10:42:45.897).
Once again, the operation should complete successfully, with metrics similar to those shown in Figure 8-29.
Restoring DatabaseForSQLBackups_Standby (database) from:
  C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Full_BASE2.sqb
Processed 125208 pages for database 'DatabaseForSQLBackups_Standby', file 'DatabaseForSQLBackups' on file 1.
Processed 3 pages for database 'DatabaseForSQLBackups_Standby', file 'DatabaseForSQLBackups_log' on file 1.
RESTORE DATABASE successfully processed 125211 pages in 16.512 seconds (59.241 MB/sec).
SQL Backup process ended.
Restoring DatabaseForSQLBackups_Standby (transaction logs) from:
  C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Log_3.sqb
Processed 0 pages for database 'DatabaseForSQLBackups_Standby', file 'DatabaseForSQLBackups' on file 1.
Processed 244 pages for database 'DatabaseForSQLBackups_Standby', file 'DatabaseForSQLBackups_log' on file 1.
RESTORE LOG successfully processed 244 pages in 0.032 seconds (59.555 MB/sec).
No orphaned users detected.
SQL Backup process ended.
Rerun our data verification query, from Listing 8-18, and you should see that we now have the 42 rows restored to MessageTable2.
The first command restores the base backup file to a new database, using the MOVE argument to copy the existing data and log files to the newly designated files. We specify NORECOVERY so that the database remains in a restoring state, to receive further backup files. The second command applies the log backup file to this new database. Notice the use of the WITH STANDBY clause, which indicates the restored state of the new database, and associates it with the correct undo file. Also, we use the STOPAT clause, with which you should be familiar from Chapter 4, to specify the exact point in time to which we wish to roll forward. Any transactions that were uncommitted at the time will be rolled back during the restore.
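The two commands described here might look something like the sketch below. It is an approximation for illustration rather than the wizard's exact output: the physical file names in the MOVE clauses, the undo file name, and the date portion of the STOPAT value are all assumptions (only the time of day, 10:42:45, comes from the text), while the logical file names are taken from the restore metrics shown earlier.

```sql
USE [master]
GO

-- Command 1: restore the base full backup to a new database, relocating its files,
-- and leave it in a restoring state (file paths are hypothetical)
EXECUTE master..sqlbackup '-SQL "RESTORE DATABASE [DatabaseForSQLBackups_Standby] FROM DISK = ''C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Full_BASE2.sqb'' WITH NORECOVERY, MOVE ''DatabaseForSQLBackups'' TO ''C:\SQLData\DatabaseForSQLBackups_Standby.mdf'', MOVE ''DatabaseForSQLBackups_log'' TO ''C:\SQLData\DatabaseForSQLBackups_Standby_log.ldf''"'

-- Command 2: apply the log backup up to a point in time, leaving the database
-- read-only in standby mode (undo file name and date are hypothetical)
EXECUTE master..sqlbackup '-SQL "RESTORE LOG [DatabaseForSQLBackups_Standby] FROM DISK = ''C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Log_3.sqb'' WITH STANDBY = ''C:\SQLBackups\Undo_DatabaseForSQLBackups_Standby.dat'', STOPAT = ''2012-01-01T10:42:45''"'
GO
```

Rolling further forward later means repeating the restore with a later STOPAT value and the same undo file, as was done through the wizard.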
This is the first of our restore operations that didn't end with the RECOVERY keyword. STANDBY is one of three ways (RECOVERY, NORECOVERY, STANDBY) to finalize a restore, and one of only two (along with RECOVERY) that leave the data in an accessible state. It's important to know which finalization technique to use in which situations, and to remember they don't all do the same thing.
Processed 125208 pages for database 'DatabaseForSQLBackups', file 'DatabaseForSQLBackups' on file 1.
Processed 3 pages for database 'DatabaseForSQLBackups', file 'DatabaseForSQLBackups_log' on file 1.
RESTORE DATABASE successfully processed 125211 pages in 15.901 seconds (82.195 MB/sec).
Then, we'll take a native, compressed full backup of the newly restored database, and then a native restore from that backup.
USE [master]
GO

BACKUP DATABASE DatabaseForSQLBackups
TO DISK = N'C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Full_Native.bak'
WITH COMPRESSION, INIT, NAME = N'DatabaseForSQLBackups-Full Database Backup'
GO

RESTORE DATABASE [DatabaseForSQLBackups]
FROM DISK = N'C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Full_Native.bak'
WITH FILE = 1, STATS = 25, REPLACE
GO

25 percent processed.
50 percent processed.
75 percent processed.
100 percent processed.
Processed 125208 pages for database 'DatabaseForSQLBackups', file 'DatabaseForSQLBackups' on file 1.
Processed 2 pages for database 'DatabaseForSQLBackups', file 'DatabaseForSQLBackups_log' on file 1.
RESTORE DATABASE successfully processed 125210 pages in 13.441 seconds (53.044 MB/sec).
Listing 8-21: Code and metrics for native restore of full database backup.
As a final test, we can rerun Listing 8-21, but performing a native, non-compressed backup, and then restoring from that. In my tests, the restore times for native compressed and SQL Backup compressed backups were roughly comparable, with the native compressed restores performing slightly faster. Native non-compressed restores were somewhat slower, running in around 21 seconds in my tests.
Verifying Backups
As discussed in Chapter 2, the only truly reliable way of ensuring that your various backup files really can be used to restore a database is to perform regular test restores. However, there are a few other things you can do to minimize the risk that, for some reason, one of your backups will be unusable. SQL Backup backups, like native backups, can to some extent be checked for validity using both BACKUP...WITH CHECKSUM and RESTORE VERIFYONLY. If both options are configured for the backup process (see Figure 8-5b), then SQL Backup will verify that the backup is complete and readable, and then recalculate the checksum on the data pages contained in the backup file and compare it against the checksum values generated during the backup. Listing 8-22 shows the script.
EXECUTE master..sqlbackup '-SQL "BACKUP DATABASE [DatabaseForSQLBackups] TO DISK = ''D:\SQLBackups\Chapter8\DatabaseForSQLBackups_FULL.sqb'' WITH CHECKSUM, DISKRETRYINTERVAL = 30, DISKRETRYCOUNT = 10, THREADCOUNT = 2, VERIFY"'
Listing 8-22: BACKUP...WITH CHECKSUM and RESTORE VERIFYONLY with SQL Backup.
Alternatively, we can run either of the validity checks separately. As discussed in Chapter 2, BACKUP...WITH CHECKSUM verifies only that each page of data written to the backup file is error free in relation to how it was read from disk. It does not validate that the backup data is valid, only that what was being written was written correctly. It can cause a lot of overhead and slow down backup operations significantly, so evaluate its use carefully, based on available CPU capacity.
Nevertheless, these validity checks do provide some degree of reassurance, without the need to perform full test restores. Remember that we also need to be performing DBCC CHECKDB routines on our databases at least weekly, to make sure they are in good health and that our backups will be restorable. There are two ways to do this: we can run DBCC CHECKDB before the backup, as a T-SQL statement, in front of the extended stored procedure that calls SQL Backup Pro or, with version 7 of the tool, we can also enable the integrity check via the Schedule Restore Jobs wizard, as shown in Figure 8-30.
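For comparison, the same sequence of checks can be sketched with native commands only: an integrity check, a backup with checksums, and a verification of the resulting file. The backup file name below is hypothetical, and the same DBCC CHECKDB statement could equally be placed in front of a call to the SQL Backup extended stored procedure.

```sql
USE [master]
GO

-- 1. Check the integrity of the database before backing it up
DBCC CHECKDB (N'DatabaseForSQLBackups') WITH NO_INFOMSGS
GO

-- 2. Native backup, validating page checksums as the pages are read
--    (hypothetical file name, used only for this example)
BACKUP DATABASE [DatabaseForSQLBackups]
TO DISK = N'C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Full_Checksum.bak'
WITH CHECKSUM, INIT
GO

-- 3. Confirm the backup file is complete and readable, rechecking the checksums
RESTORE VERIFYONLY
FROM DISK = N'C:\SQLBackups\Chapter8\DatabaseForSQLBackups_Full_Checksum.bak'
WITH CHECKSUM
GO
```

None of these steps replaces a real test restore; they only reduce the chance of discovering a bad backup at the worst possible moment.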
Backup Optimization
We discussed many ways to optimize backup storage and scheduling back in Chapter 2, so here, we'll focus just on a few optimization features that are supported by SQL Backup. The first is the ability to back up to multiple backup files on multiple devices. This is one of the best ways to increase throughput in backup operations, since we can write to several backup files simultaneously. This only applies when each disk is physically separate hardware; if we have one physical disk that is partitioned into two or more logical drives, we will not see a performance increase since the backup data can only be written to one of those logical drives at a time.
Listing 8-23 shows the SQL Backup command to back up a database to multiple backup files, on separate disks. The listing will also show how to restore from these multiple files, which requires the addition of extra file locations.
EXECUTE master..sqlbackup '-SQL "BACKUP DATABASE [DatabaseForSQLBackups] TO DISK = ''C:\SQLBackups\Chapter8\DatabaseForSQLBackups_FULL_1.sqb'', DISK = ''D:\SQLBackups\Chapter8\DatabaseForSQLBackups_FULL_2.sqb'' WITH DISKRETRYINTERVAL = 30, DISKRETRYCOUNT = 10, COMPRESSION = 3"'

EXECUTE master..sqlbackup '-SQL "RESTORE DATABASE [DatabaseForSQLBackups] FROM DISK = ''C:\SQLBackups\Chapter8\DatabaseForSQLBackups_FULL_1.sqb'', DISK = ''D:\SQLBackups\Chapter8\DatabaseForSQLBackups_FULL_2.sqb'' WITH RECOVERY"'
This is useful not only for backup throughput/performance; it is also a useful way to cut down on the size of a single backup, for transfer to other systems. Nothing is more infuriating than trying to copy a huge file to another system only to see it fail after 90% of the copy is complete. With this technique, we can break down that large file and copy the pieces separately. Note that backup to multiple files is also supported in native T-SQL and for native backups, as shown in Listing 8-24.
BACKUP DATABASE [DatabaseForSQLBackups]
TO DISK = 'C:\SQLBackups\Chapter8\DatabaseForSQLBackups_FULL_Native_1.bak',
   DISK = 'C:\SQLBackups\Chapter8\DatabaseForSQLBackups_FULL_Native_2.bak'
Having covered how to split backups to multiple locations, let's now see how to back up a single file, but have it stored in multiple locations, as shown in Listing 8-25. This is useful when we want to back up to a separate location, such as a network share, when taking the original backup. This is only an integrated option when using the SQL Backup tool.
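A command of this kind might be sketched as follows, using SQL Backup's COPYTO option to copy the finished backup file to a second location. The local file name and the network share are placeholders chosen for illustration, and the other WITH options simply mirror those used elsewhere in the chapter.

```sql
-- Back up locally, then have SQL Backup copy the file to a network share.
-- Both the local path and the share name are placeholders.
EXECUTE master..sqlbackup '-SQL "BACKUP DATABASE [DatabaseForSQLBackups] TO DISK = ''C:\SQLBackups\Chapter8\DatabaseForSQLBackups_FULL.sqb'' WITH COPYTO = ''\\NetworkServer\ShareName\'', DISKRETRYINTERVAL = 30, DISKRETRYCOUNT = 10, COMPRESSION = 3, THREADCOUNT = 2"'
```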
This will cause the backup process to run a bit longer, since it has to back up the database as well as copy it to another location. Just remember that network latency can be a major time factor in the completion of the backups, when using this option.

Ultimately, to get the most performance out of a backup, we will need to tune each backup routine to match the specific environment. This requires testing the disk subsystem, the throughput of SQL Server, and making adjustments in the backup process to get the best backup performance. There is a useful listing on the Red Gate support site that offers some tips on how to do this using just a few extra parameters in the SQL Backup stored procedure: www.red-gate.com/supportcenter/content/SQL_Backup/help/6.5/SBU_OptimizingBackup.
Summary
All of the same principles that we have discussed when using native SQL Server backup procedures apply when using Red Gate SQL Backup Pro. We want to follow the same best practices, and we can implement the same type of backup strategies if we are using SQL Backup in our environment. Red Gate SQL Backup Pro is not a requirement for our backup strategies, but it is a great tool that can save substantial amounts of time and disk space. Always remember to use the right tool for the right job.
Chapter 9: File and Filegroup Backup and Restore

performing several different types of file-based restore, namely:
complete restores: restore right to the point of failure
point-in-time restores: restore to a specific point in a transaction log backup using the STOPAT parameter
restoring just a single data file: recovering the database as a whole, by restoring just a single "failed" secondary data file
online piecemeal restore, with partial database recovery: bringing a database back online quickly after a failure, by restoring just the primary data file, followed later by the other data files. This operation requires SQL Server Enterprise (or Developer) Edition.
Improved disk I/O performance
Achieved by separating different files and filegroups onto separate disk drives. This assumes that the SAN, DAS, or local storage is set up to take advantage of the files being on separate physical spindles, or SSDs, as opposed to separate logical disks.

Piecemeal restores can be a massive advantage for any database, and may be a necessity for VLDBs, where the time taken for a full database restore would fall outside the down-time stipulated in the SLA.

In terms of disk I/O performance, it's possible to gain performance advantages by creating multiple data files within a filegroup, and placing each file on a separate drive and, in some cases, by separating specific tables and indexes into a filegroup, again on a dedicated drive. It's even possible to partition a single object across multiple filegroups (a topic we won't delve into further in this book, but see http://msdn.microsoft.com/en-us/library/ms162136.aspx for a general introduction).

In general, I would caution against going overboard with the idea of trying to optimize disk I/O by manual placement of files and filegroups on different disk spindles, unless it is a proven necessity from a performance or storage perspective. It's a complex process that requires a lot of prior planning and ongoing maintenance, as data grows. Instead, I think there is much to be said for keeping file architecture as simple as is appropriate for a given database, and then letting your SAN or direct-attached RAID array take care of the disk I/O optimization. If a specific database requires it, then by all means work with the SAN administrators to optimize file and disk placement, but don't feel the need to do this on every database in your environment. As always, it is best to test and see what overhead this would place on the maintenance and administration of the server as opposed to the potential benefits which it provides.
With that in mind, let's take a closer, albeit still relatively brief, look at possible filegroup architectures.
In other words, a single data file in the PRIMARY filegroup, plus a log file (remember that log files are entirely separate from data files; log files are never members of a filegroup). However, as discussed in Chapter 1, it's common to create more than one data file per filegroup, as shown in Listing 9-2.
CREATE DATABASE [FileBackupsTest] ON PRIMARY
( NAME = N'FileBackupsTest',
  FILENAME = N'C:\SQLData\FileBackupsTest.mdf'),
( NAME = N'FileBackupsTest2',
  FILENAME = N'D:\SQLData\FileBackupsTest2.ndf')
LOG ON
( NAME = N'FileBackupsTest_log',
  FILENAME = N'E:\SQLData\FileBackupsTest_log.ldf' )
GO
Now we have two data files in the PRIMARY filegroup, plus the log file. SQL Server will utilize all the data files in a given filegroup on a "proportional fill" basis, writing to each file in turn, in a round-robin fashion, in proportion to the free space it contains, so that equally sized files are used roughly equally. We can also back up each of those files separately, if we wish.

We can place each data file on a separate spindle to increase disk I/O performance. However, we have no control over exactly which data gets placed where, so we may end up with most of the data that is very regularly updated written to one file and most of the data that is rarely touched in the second. We'd have one disk working to peak capacity while the other sat largely idle, and we wouldn't achieve the desired performance benefit.

The next step is to exert some control over exactly what data gets stored where, and this means creating some secondary filegroups, and dictating which objects store their data where. Take a look at Listing 9-3; in it we create the usual PRIMARY filegroup, holding our mdf file, but also a user-defined filegroup called SECONDARY, in which we create three secondary data files.
CREATE DATABASE [FileBackupsTest] ON PRIMARY
( NAME = N'FileBackupsTest',
  FILENAME = N'E:\SQLData\FileBackupsTest.mdf',
  SIZE = 51200KB, FILEGROWTH = 10240KB ),
FILEGROUP [Secondary]
( NAME = N'FileBackupsTestUserData1',
  FILENAME = N'G:\SQLData\FileBackupsTestUserData1.ndf',
  SIZE = 5120000KB, FILEGROWTH = 1024000KB ),
( NAME = N'FileBackupsTestUserData2',
  FILENAME = N'H:\SQLData\FileBackupsTestUserData2.ndf',
  SIZE = 5120000KB, FILEGROWTH = 1024000KB ),
( NAME = N'FileBackupsTestUserData3',
  FILENAME = N'I:\SQLData\FileBackupsTestUserData3.ndf',
  SIZE = 5120000KB, FILEGROWTH = 1024000KB )
LOG ON
( NAME = N'FileBackupsTest_log',
  FILENAME = N'F:\SQLData\FileBackupsTest_log.ldf',
  SIZE = 1024000KB, FILEGROWTH = 512000KB )
GO
USE [FileBackupsTest]
GO
-- Make Secondary the default filegroup for new user objects
ALTER DATABASE [FileBackupsTest] MODIFY FILEGROUP [Secondary] DEFAULT
GO
Crucially, we can now dictate, to a greater or lesser degree, what data gets put in which filegroup. In this example, immediately after creating the database, we have stipulated the SECONDARY filegroup, rather than the PRIMARY filegroup, as the default filegroup for this database. This means that our PRIMARY filegroup will hold only our system objects and data (plus pointers to the secondary data files). By default, any user objects and data will now be inserted into one of the data files in the SECONDARY filegroup, unless this was overridden by specifying a different target filegroup when the object was created. Again, the fact that we have multiple data files means that we can back each file up separately, if the entire database can't be backed up in the allotted nightly window.

There are many different ways in which filegroups can be used to dictate, at the file level, where certain objects and data are stored. In this example, we've simply decided that the PRIMARY is for system data, and SECONDARY is for user data, but we can take this further. We might decide to store system data in PRIMARY, plus any other data necessary for the functioning of a customer-facing sales website. The actual sales and order data might be in a separate, dedicated filegroup. This architecture might be especially beneficial when running an Enterprise Edition SQL Server, where we can perform online piecemeal restores. In this case, we could restore the PRIMARY first, and get the database back online and the website back up. Meanwhile we can get to work restoring the other, secondary filegroups.

We might also split tables logically into different filegroups, for example:
• separating rarely-used archive data from current data (e.g. current year's sales data in one filegroup, archive data in others)
• separating out read-only data from read-write data
• separating out critical reporting data.

Another scheme that you may encounter is use of filegroups to separate the non-clustered indexes from the indexed data, although this seems to be a declining practice in cases where online index maintenance is available, with Enterprise Editions of SQL Server, and due to SAN disk systems becoming faster. Remember that the clustered index data is always in the same filegroup as the base table.

We can also target specific filegroups at specific types of storage, putting the most frequently used and/or most critical data on faster media, while avoiding eating up high-speed disk space with data that is rarely used. For example, we might use SSDs for critical report data, a slower SAN-attached drive for archive data, and so on.

All of these schemes may or may not represent valid uses of filegroups in your environment, but almost all of them will add complexity to your architecture, and to your backup and restore process, assuming you employ file-based backup and restore. As discussed earlier, I only recommend you go down this route if the need is proven, for example for VLDBs where the need is dictated by backup and/or restore requirements. For databases of a manageable size, we can continue to use database backups and so gain the benefit of using multiple files/filegroups without the backup and restore complexity.

Of course, one possibility is that a database, originally designed with very simple file architecture, grows to the point that it is no longer manageable in this configuration. What is to be done then? Changing the file architecture for a database requires very careful planning, both with regard to immediate changes to the file structure and how this will evolve to accommodate future growth.
For the initial redesign, we'll need to consider questions such as the number of filegroups required, how the data is going to be separated out across those filegroups, how many data files are required, where they are going to be stored on disk, and so on. Having done this, we're then faced with the task of planning how to move all the data, and how much time is available each night to get the job done. These are all questions we need to take seriously and plan carefully, with the help of our most experienced DBAs; getting this right the first time will save some huge headaches later.

Let's consider a simple example, where we need to re-architect the file structure for a database which currently stores all data in a single data file in PRIMARY. We have decided to create an additional secondary filegroup, named UserDataFilegroup, which contains three physical data files, each of which will be backed up during the nightly backup window. This secondary filegroup will become the default filegroup for the database, and the plan is that from now on only system objects and data will be stored in the PRIMARY data file.

How are we going to get the data stored in the primary file into this new filegroup? It depends on the table index design, but ideally each table in the database will have a clustered index, in which case the easiest way to move the data is to re-create the clustered index on the new filegroup; since the leaf level of a clustered index is the table data, this moves the data over as well. The code would look something like that shown in Listing 9-4. In Enterprise editions of SQL Server, we can set the ONLINE parameter to ON, so that the table remains available while the index is moved. When using Standard edition, switch this to OFF.
CREATE CLUSTERED INDEX [IndexName]
ON [dbo].[TableName] ([ColumnName] ASC)
WITH (DROP_EXISTING = ON, ONLINE = ON)
ON [UserDataFileGroup]
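The index move assumes that the target filegroup already exists. As a minimal sketch of how the filegroup described above might be created, the following adds a filegroup with three data files and makes it the default. The database name MyDatabase, the logical file names, the drive letters and the sizes are all hypothetical placeholders, not values from this chapter.

```sql
-- Sketch only: [MyDatabase], file names, paths and sizes are assumptions.
ALTER DATABASE [MyDatabase] ADD FILEGROUP [UserDataFileGroup];
GO

-- Add three data files to the new filegroup
ALTER DATABASE [MyDatabase]
ADD FILE
    ( NAME = N'MyDatabase_UserData1',
      FILENAME = N'G:\SQLData\MyDatabase_UserData1.ndf',
      SIZE = 1024MB, FILEGROWTH = 256MB ),
    ( NAME = N'MyDatabase_UserData2',
      FILENAME = N'H:\SQLData\MyDatabase_UserData2.ndf',
      SIZE = 1024MB, FILEGROWTH = 256MB ),
    ( NAME = N'MyDatabase_UserData3',
      FILENAME = N'I:\SQLData\MyDatabase_UserData3.ndf',
      SIZE = 1024MB, FILEGROWTH = 256MB )
TO FILEGROUP [UserDataFileGroup];
GO

-- New user objects now land here unless a filegroup is named explicitly
ALTER DATABASE [MyDatabase] MODIFY FILEGROUP [UserDataFileGroup] DEFAULT;
GO
```

A filegroup must contain at least one file before it can be made the default, which is why the files are added before the final statement.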
If the database doesn't have any clustered indexes, then this was a poor design choice; it should! We can create one for each table, on the most appropriate column or columns, using code similar to Listing 9-4, but omitting the DROP_EXISTING clause, since there is no existing index to drop and SQL Server raises an error if the named index does not exist. Once the clustered index is built, the table will be moved, along with the index, to the new filegroup.
If this new index is not actually required, we can go ahead and drop it, as shown in Listing 9-5, and the table data will remain in the new filegroup. Ideally, though, we'd work hard to create a useful index that we want to keep.
DROP INDEX [IndexName]
ON [dbo].[TableName]
WITH ( ONLINE = ON )
GO
Keep in mind that these processes will move the data and clustered indexes over to the new filegroup, but not the non-clustered, or other, indexes. We will still need to move these over manually. Many scripts can be found online that will interrogate the system tables, find all of the non-clustered indexes and move them. Remember, also, that the process of moving indexes and data to a different physical file or set of files can be long, and disk I/O intensive. Plan out time each night, over a certain period, to get everything moved with as little impact to production as possible. This is also not a task to be taken lightly, and it should be planned out with the senior database administration team.
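As a starting point for such a script, the query below is a minimal sketch that simply lists the non-clustered indexes on user tables that still live in the PRIMARY filegroup. It uses only the sys.indexes and sys.data_spaces catalog views; the decision to filter on PRIMARY is an assumption based on the example above.

```sql
-- Sketch: list non-clustered indexes on user tables still stored in PRIMARY.
SELECT  OBJECT_SCHEMA_NAME(i.object_id) AS SchemaName ,
        OBJECT_NAME(i.object_id) AS TableName ,
        i.name AS IndexName ,
        ds.name AS FilegroupName
FROM    sys.indexes AS i
        INNER JOIN sys.data_spaces AS ds
            ON ds.data_space_id = i.data_space_id
WHERE   i.type = 2                                       -- non-clustered
        AND ds.name = N'PRIMARY'
        AND OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
ORDER BY SchemaName ,
        TableName ,
        IndexName;
```

Each index returned can then be re-created on the new filegroup using the same DROP_EXISTING pattern as Listing 9-4, with CREATE NONCLUSTERED INDEX in place of CREATE CLUSTERED INDEX.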
File Backup
When a database creeps up in size towards the high hundreds of gigabytes, or into the terabyte realm, then database backups start to become problematic. A full database backup of a database of this size could take over half of a day, or even longer, and still be running long into the business day, putting undue strain on the disk and CPU and causing performance issues for end-users. Also, most DBAs have experienced the anguish of seeing such a backup fail at about 80% completion, knowing that starting it over will eat up another 12 hours.
Hopefully, as discussed in the previous sections, this database has been architected such that the data is spread across multiple data files, in several filegroups, so that we can still back up the whole database bit by bit, by taking a series of file backups, scheduled on separate days. While this is the most common reason for file backups, there are other valid reasons too, as we have discussed; for example if one filegroup is read-only, or modified very rarely, while another holds big tables, subject to frequent modifications, then the latter may be on a different and more frequent backup schedule.

A file backup is simply a backup of a single data file, subset of data files or an entire filegroup. Each of the file backups contains only the data from the files or filegroups that we have chosen to be included in that particular backup file. The combination of all of the file backups, along with all log backups taken over the same period of time, is the equivalent of a full database backup. Depending on the size of the database, the number of files, and the backup schedule, this can constitute quite a large number of backups.

We can capture both full file backups, capturing the entire contents of the designated file or filegroup, and differential file backups, capturing only the pages in that file or filegroup that have been modified since the last full file backup (there are also partial and partial differential file backups, but we'll get to those in Chapter 10).
Is there a difference between file and filegroup backups?
The short answer is no. When we take a filegroup backup we are simply specifying that the backup file should contain all of the data files in that filegroup. It is no different than if we took a file backup and explicitly referenced each data file in that group. They are the exact same backup and have no differences. This is why you may hear the term file backup used instead of filegroup backup. We will use the term file backup for the rest of this chapter.
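To make the equivalence concrete, here is a small sketch using the FileBackupsTest database from Listing 9-3. The two statements should capture the same three data files; the backup file paths are assumptions, and the database is assumed to be using the FULL recovery model.

```sql
-- Sketch: back up the Secondary filegroup as a whole...
BACKUP DATABASE [FileBackupsTest]
    FILEGROUP = N'Secondary'
TO DISK = N'C:\SQLBackups\FileBackupsTest_Secondary_FG.bak';   -- assumed path
GO

-- ...or name each of its data files explicitly.
BACKUP DATABASE [FileBackupsTest]
    FILE = N'FileBackupsTestUserData1',
    FILE = N'FileBackupsTestUserData2',
    FILE = N'FileBackupsTestUserData3'
TO DISK = N'C:\SQLBackups\FileBackupsTest_Secondary_Files.bak'; -- assumed path
GO
```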
Of course, the effectiveness of file backups depends on these large databases being designed so that there is, as best as can be achieved, a distribution of data across the data files and filegroups such that the file backups are manageable and can complete in the required time frame. For example, if we have a database of 900 GB, split across three filegroups, then ideally each filegroup will be a more manageable portion of the total size. Depending on which tables are stored where, this ideal data distribution may not be possible, but if one of those groups is 800 GB, then we might as well just take full database backups.

For reasons that we'll discuss in more detail in relation to file restores, it's essential, when adopting a backup strategy based on file backups, to also take transaction log backups. It's not possible to perform file-based restores unless SQL Server has access to the full set of accompanying transaction logs. The file backups can and will be taken at different times, and SQL Server needs the subsequent transaction log backups in order to guarantee its ability to roll forward each individual file backup to the required point, and so restore the database, as a whole, to a consistent state.
File backups and read-only filegroups
The only time we don't have to apply a subsequent transaction log backup when restoring a file backup is when SQL Server knows for a fact that the data file could not have been modified since the backup was taken, because the backup was of a filegroup explicitly designated as READ_ONLY. We'll cover this in more detail in Chapter 10, Partial Backup and Restore.
So, for example, if we take a weekly full file backup of a particular file or filegroup, then in the event of a failure of that file, we'd potentially need to restore the file backup plus a week's worth of log files, to get our database back online in a consistent state. As such, it often makes sense to supplement occasional full file backups with more frequent differential file backups. In the same way as differential database backups, these differential file backups can dramatically reduce the number of log files that need processing in a recovery situation.

In coming sections, we'll first demonstrate how to take file backups using both the SSMS GUI and native T-SQL scripts. We'll take full file backups via the SSMS GUI, and then differential file backups using T-SQL scripts. We'll then demonstrate how to perform the same actions using the Red Gate SQL Backup tool.
Note that it is recommended, where possible, to take a full database backup and start the log backups, before taking the first file backup (see: http://msdn.microsoft.com/en-us/library/ms189860.aspx). We'll discuss this in more detail shortly, but note that, in order to focus purely on the logistics of file backups, we don't follow this advice in our examples.
Recovery model
Since we've established the need to take log backups, we will need to operate the database in FULL recovery model. We can also take log backups in the BULK_LOGGED model but, as discussed in Chapter 1, this model is only suitable for short-term use during bulk operations. For the long-term operation of databases requiring file backups, we should be using the FULL recovery model.
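As a quick sketch, once the DatabaseForFileBackups test database used in this chapter exists, we could set and then confirm its recovery model as follows. Both the ALTER DATABASE option and the sys.databases column are standard; nothing else is assumed.

```sql
-- Sketch: switch the test database to the FULL recovery model...
ALTER DATABASE [DatabaseForFileBackups] SET RECOVERY FULL;
GO

-- ...and confirm the setting.
SELECT  name ,
        recovery_model_desc
FROM    sys.databases
WHERE   name = N'DatabaseForFileBackups';
```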
The big difference between this database creation script, and any that have gone before, is that we're creating two data files: a primary (mdf) data file called DatabaseForFileBackups, in the PRIMARY filegroup, and a secondary (ndf) data file called DatabaseForFileBackups_Data2 in a user-defined filegroup called SECONDARY. This name is OK here, since we will be storing generic data in the second filegroup, but if the filegroup was designed to store a particular type of data then it should be named appropriately to reflect that. For example, if creating a secondary filegroup that will group together files used to store configuration information for an application, we could name it CONFIGURATION. Listing 9-7 creates two sample tables in our DatabaseForFileBackups database, with Table_DF1 stored in the PRIMARY filegroup and Table_DF2 stored in the SECONDARY filegroup. We then load a single initial row into each table.
Listing 9-7: Table creation script and initial data load for file backup configuration.
Notice that we specify the filegroup for each table as part of the table creation statement, via the ON keyword. SQL Server will create a table on whichever of the available filegroups is marked as the default group. Unless specified otherwise, the default group will be the PRIMARY filegroup. Therefore, the ON PRIMARY clause, for the first table, is optional, but the ON SECONDARY clause is required. In previous chapters, we've used substantial data loads in order to capture meaningful metrics for backup time and file size. Here, we'll not be gathering these metrics, but rather focusing on the complexities of the backup (and restore) processes, so we're keeping row counts very low.
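For illustration only, a script along these lines might look like the sketch below. The table and filegroup names come from the text, but the single Message column, its data type and the text of the initial rows are assumptions made for this example.

```sql
USE [DatabaseForFileBackups]
GO

-- ON [PRIMARY] is optional here, since PRIMARY is the default filegroup
CREATE TABLE dbo.Table_DF1
    (
      Message NVARCHAR(50) NOT NULL      -- assumed column definition
    )
ON  [PRIMARY];
GO

-- ON [SECONDARY] is required to place the table in the second filegroup
CREATE TABLE dbo.Table_DF2
    (
      Message NVARCHAR(50) NOT NULL      -- assumed column definition
    )
ON  [SECONDARY];
GO

-- Load a single initial row into each table (row text is assumed)
INSERT  INTO dbo.Table_DF1
VALUES  ( 'Initial data load for Table_DF1' );
INSERT  INTO dbo.Table_DF2
VALUES  ( 'Initial data load for Table_DF2' );
GO
```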
Figure 9-1:
Following the convention used throughout the book, we're going to store the backup files in C:\SQLBackups\Chapter9, so go ahead and create that subfolder on your database server, and then, in the Backup wizard, remove the default backup destination, click Add, locate the Chapter9 folder and call the backup file DatabaseForFileBackups_FG1_Full.bak.

Once back on the main configuration page, double-check that everything on the screen looks as expected and, if so, click OK to start the file backup operation. It should complete in the blink of an eye, and our first file/filegroup backup is complete!

We aren't done yet, however. Repeat the whole file backup process exactly as described previously but, this time, pick the SECONDARY filegroup in the Select Files and Filegroups window and, when setting the backup file destination, call the backup file DatabaseForFileBackups_FG2_Full.bak. Having done this, check the Chapter9 folder and you should find your two backup files, ready for use later in the chapter.

We've completed our file backups, but we're still not quite done here. In order to be able to restore a database from its component file backups, we need to be able to apply transaction log backups so that SQL Server can confirm that it is restoring the database to a consistent state. So, we are going to take one quick log backup. Go back into the Back Up Database screen a third time, select Transaction Log as the Backup type, and set the backup file destination as C:\SQLBackups\Chapter9\DatabaseForFileBackups_TLOG.trn.
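For readers who prefer scripts, the three GUI backups described above correspond roughly to the following T-SQL. This is a sketch to run instead of the GUI steps, not in addition to them, since repeating a backup to the same file would append a second backup set to it. The file names are those used in the walkthrough; the STATS option is simply a convenience.

```sql
USE [master]
GO

-- Full file backup of the PRIMARY filegroup
BACKUP DATABASE [DatabaseForFileBackups]
    FILEGROUP = N'PRIMARY'
TO DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_FG1_Full.bak'
WITH STATS = 10
GO

-- Full file backup of the SECONDARY filegroup
BACKUP DATABASE [DatabaseForFileBackups]
    FILEGROUP = N'SECONDARY'
TO DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_FG2_Full.bak'
WITH STATS = 10
GO

-- Transaction log backup, taken after both file backups
BACKUP LOG [DatabaseForFileBackups]
TO DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_TLOG.trn'
GO
```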
Without further ado, the script to perform a differential file backup of our primary data file is shown in Listing 9-9.
USE [master]
GO
BACKUP DATABASE [DatabaseForFileBackups]
    FILE = N'DatabaseForFileBackups'
TO DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_FG1_Diff.bak'
WITH DIFFERENTIAL, STATS = 10
GO
Listing 9-9: Differential file backup of the primary data file for DatabaseForFileBackups.
The only new part of this script is the use of the FILE argument to specify which of the data files to include in the backup. In this case, we've referenced by name our primary data file, which lives in the PRIMARY filegroup. We've also used the DIFFERENTIAL argument to specify a differential backup, as described in Chapter 7. Go ahead and run the script now and you should see output similar to that shown in Figure 9-2.
Figure 9-2:
What we're interested in here are the files that were processed during the execution of this command. You can see that we only get pages processed on the primary data file and the log file. This is exactly what we were expecting, since this is a differential file backup, capturing the changed data in the primary data file, and not a differential database backup.

If we had performed a differential (or full) database backup on a database that has multiple data files, then all of those files would have been processed as part of the BACKUP command and we'd have captured all data changed in all of the data files. So, in this case, it would have processed both the data files and the log file.

Let's now perform a differential file backup of our secondary data file, as shown in Listing 9-10.
USE [master]
GO
BACKUP DATABASE [DatabaseForFileBackups]
    FILEGROUP = N'SECONDARY'
TO DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_FG2_Diff.bak'
WITH DIFFERENTIAL, STATS = 10
GO
This time, the script demonstrates the use of the FILEGROUP argument to take a backup of the SECONDARY filegroup as a whole. Of course, in this case there is only a single data file in this filegroup and so the outcome of this command will be exactly the same as if we had specified FILE = N'DatabaseForFileBackups_Data2' instead. However, if the SECONDARY filegroup had contained more than one data file, then all of these files would have been subject to a differential backup. If you look at the message output, after running the command, you'll see that only the DatabaseForFileBackups_Data2 data file and the log file are processed.

Now we have a complete set of differential file backups that we will use to restore the database a little later. However, we are not quite done. Since we took the differential backups at different times, there is a possible issue with the consistency of the database, so we still need to take another transaction log backup. In Listing 9-11, we first add one more row each to the two tables and capture a date output (we'll need this later in a point-in-time restore demo) and then take another log backup.
USE [master]
GO
INSERT INTO DatabaseForFileBackups.dbo.Table_DF1
VALUES ( 'Point-in-time data load for Table_DF1' )
INSERT INTO DatabaseForFileBackups.dbo.Table_DF2
VALUES ( 'Point-in-time data load for Table_DF2' )
GO
-- Note this date output for the later point-in-time restore demo
SELECT GETDATE()
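The log backup that rounds off this step might look something like the sketch below. The backup file name, DatabaseForFileBackups_TLOG2.trn, is an assumption made for illustration, chosen to sit alongside the first log backup in the Chapter9 folder.

```sql
USE [master]
GO

-- Sketch: second log backup, taken after both differential file backups.
-- The file name is assumed, not taken from the original listing.
BACKUP LOG [DatabaseForFileBackups]
TO DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_TLOG2.trn'
GO
```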
Listing 9-12: SQL Backup full file backup of primary and secondary data files.
Most of the details of this script, with regard to the SQL Backup parameters that control the compression and resiliency options, have been covered in detail in Chapter 8, so I won't repeat them here. We can see that we are using the FILEGROUP parameter here to perform the backup against all files in our PRIMARY filegroup. Since this filegroup includes just our single primary data file, we could just as well have specified the file explicitly, which is the approach we take when backing up the secondary data file. Having completed the full file backups, we are going to need to take a quick log backup of this database, just as we did with the native backups, in order to ensure we can restore the database to a consistent state, from the component file backups. Go ahead and run Listing 9-13 in a new query window to get a log backup of our DatabaseForFileBackups_SB test database.
USE [master]
GO
EXECUTE master..sqlbackup
    '-SQL "BACKUP LOG [DatabaseForFileBackups_SB] TO DISK = ''C:\SQLBackups\Chapter9\DatabaseForFileBackups_SB_TLOG.sqb''"'
Listing 9-13: Taking our log backup via SQL Backup script.
Next, Listing 9-15 shows the script to perform the differential file backups for both the primary and secondary data files.
USE [master]
GO
EXECUTE master..sqlbackup
    '-SQL "BACKUP DATABASE [DatabaseForFileBackups_SB] FILEGROUP = ''PRIMARY'' TO DISK = ''C:\SQLBackups\Chapter9\DatabaseForFileBackups_SB_Diff.sqb'' WITH DIFFERENTIAL, DISKRETRYINTERVAL = 30, DISKRETRYCOUNT = 10, COMPRESSION = 3, THREADCOUNT = 2"'
EXECUTE master..sqlbackup
    '-SQL "BACKUP DATABASE [DatabaseForFileBackups_SB] FILE = ''DatabaseForFileBackups_SB_Data2'' TO DISK = ''C:\SQLBackups\Chapter9\DatabaseForFileBackups_SB_Data2_Diff.sqb'' WITH DIFFERENTIAL, DISKRETRYINTERVAL = 30, DISKRETRYCOUNT = 10, COMPRESSION = 3, THREADCOUNT = 2"'
Listing 9-15: SQL Backup differential file backup of primary and secondary data files.
The only significant difference in this script compared to the one for the full file backups, apart from the different backup file names, is use of the DIFFERENTIAL argument to denote that the backups should only take into account the changes made to each file since the last full file backup was taken. Take a look at the output for this script, shown in truncated form in Figure 9-3; the first of the two SQL Backup operations processes the primary data file (DatabaseForFileBackups_SB) and the transaction log, and the second processes the secondary data file (DatabaseForFileBackups_SB_Data2) plus the transaction log.
Backing up DatabaseForFileBackups_SB (files/filegroups differential) to:
C:\SQLBackups\Chapter9\DatabaseForFileBackups_SB_Diff.sqb
Backup data size : 3.250 MB
Compressed data size: 55.500 KB
Compression rate : 98.33%
Processed 56 pages for database 'DatabaseForFileBackups_SB', file 'DatabaseForFileBackups_SB' on file 1.
Processed 2 pages for database 'DatabaseForFileBackups_SB', file 'DatabaseForFileBackups_SB_log' on file 1.
BACKUP DATABASE...FILE=<name> WITH DIFFERENTIAL successfully processed 58 pages in 0.036 seconds (12.396 MB/sec).
SQL Backup process ended.
Figure 9-3:
Having completed the differential file backups, we do need to take one more backup and I think you can guess what it is. Listing 9-16 takes our final transaction log backup of the chapter.
USE [master]
GO
EXECUTE master..sqlbackup
    '-SQL "BACKUP LOG [DatabaseForFileBackups_SB] TO DISK = ''C:\SQLBackups\Chapter9\DatabaseForFileBackups_SB_TLOG2.sqb''"'
File Restore
In all previous chapters, when we performed a restore operation, we restored the database as a whole, including all the data in all the files and filegroups, from the full database backup, plus any subsequent differential database backups. If we then wished to roll forward the database, we could do so by applying the full chain of transaction log backups.

However, it is also possible to restore a database from a set of individual file backups; the big difference is that we can't restore a database just from the latest set of full (plus differential) file backups. We must also apply the full set of accompanying transaction log backups, up to and including the log backup taken after the final file backup in the set. This is the only way SQL Server can guarantee that it can restore the database to a consistent state.

Consider, for example, a simple case of a database comprising three data files, each in a separate filegroup, and where FG1_1, FG2_1 and FG3_1 are full file backups of each separate filegroup, as shown in Figure 9-4.
Notice that the three file backups are taken at different times. In order to restore this database, using the backups shown, we have to restore the FG1_1, FG2_1 and FG3_1 file backups, and then the chain of log backups Log1 to Log5. Generally speaking, we need the chain of log files starting directly after the oldest full file backup in the set, and finishing with the one taken directly after the most recent full file backup. Note that even if we are absolutely certain that in Log5 no further transactions were recorded against any of the three filegroups, SQL Server will not trust us on this and requires this log backup file to be processed in order to guarantee that any changes recorded in Log5 that were made to any of the data files, up to the point the FG3_1 backup completed, are represented in the restore, and so the database has transactional consistency.

We can also perform point-in-time restores, to a point within the log file taken after all of the current set of file backups; in Figure 9-4, this would be to some point in time within the Log5 backup. If we wished to restore to a point in time within, say, Log4, we'd need to restore the backup for filegroup 3 taken before the one shown in Figure 9-4 (let's call it FG3_0), followed by FG1_1 and FG2_1, and then the chain of logs, starting with the one taken straight after FG3_0 and ending with Log4.

This also explains why Microsoft recommends taking an initial full database backup and starting the log backup chain before taking the first full file backup. If we imagine that the FG1_1, FG2_1 and FG3_1 file backups were the first-ever full file backups for this database, and that they were taken on Monday, Wednesday and Friday, then we'd have no restore capability in that first week, until the FG3_1 and Log5 backups were completed.
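To make the sequence concrete, the restore described for Figure 9-4 could be sketched as follows. Everything named here is hypothetical: the database name ThreeFilegroupDB, the filegroup names FG1 to FG3, and the backup file paths. The sketch also assumes that the database has already been placed in a restoring state, for example by a tail log backup taken WITH NORECOVERY.

```sql
-- Sketch only: database, filegroup and file names are hypothetical.
-- Restore each full file backup, leaving the database in a restoring state.
RESTORE DATABASE [ThreeFilegroupDB] FILEGROUP = N'FG1'
FROM DISK = N'C:\SQLBackups\FG1_1.bak' WITH NORECOVERY;
RESTORE DATABASE [ThreeFilegroupDB] FILEGROUP = N'FG2'
FROM DISK = N'C:\SQLBackups\FG2_1.bak' WITH NORECOVERY;
RESTORE DATABASE [ThreeFilegroupDB] FILEGROUP = N'FG3'
FROM DISK = N'C:\SQLBackups\FG3_1.bak' WITH NORECOVERY;
GO

-- Apply the whole log chain, in order, recovering only on the last one.
RESTORE LOG [ThreeFilegroupDB] FROM DISK = N'C:\SQLBackups\Log1.trn' WITH NORECOVERY;
RESTORE LOG [ThreeFilegroupDB] FROM DISK = N'C:\SQLBackups\Log2.trn' WITH NORECOVERY;
RESTORE LOG [ThreeFilegroupDB] FROM DISK = N'C:\SQLBackups\Log3.trn' WITH NORECOVERY;
RESTORE LOG [ThreeFilegroupDB] FROM DISK = N'C:\SQLBackups\Log4.trn' WITH NORECOVERY;
RESTORE LOG [ThreeFilegroupDB] FROM DISK = N'C:\SQLBackups\Log5.trn' WITH RECOVERY;
GO
```

For a point-in-time restore within Log5, the final statement would use the STOPAT option alongside RECOVERY.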
It's possible, in some circumstances, to restore a database by restoring only a single file backup (plus required log backups), rather than the whole set of files that comprise the database. This sort of restore is possible as long as you've got a database composed of several data files or filegroups, regardless of whether you're taking database or file backups; as long as you've also got the required set of log backups, it's possible to restore a single file from a database backup.
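As a rough sketch of this idea, the script below restores one damaged data file from a full database backup and then rolls it forward. The database name, logical file name and backup file names are all hypothetical, and the sketch assumes an offline restore, in which the tail of the log is backed up WITH NORECOVERY first.

```sql
-- Sketch only: [MyDatabase], MyDatabase_Data2 and all paths are hypothetical.
-- 1. Back up the tail of the log and put the database into a restoring state.
BACKUP LOG [MyDatabase]
TO DISK = N'C:\SQLBackups\MyDatabase_TLOG_TAIL.trn'
WITH NORECOVERY;
GO

-- 2. Restore only the damaged file from the full database backup.
RESTORE DATABASE [MyDatabase] FILE = N'MyDatabase_Data2'
FROM DISK = N'C:\SQLBackups\MyDatabase_Full.bak'
WITH NORECOVERY;
GO

-- 3. Apply every log backup taken since that full backup, then the tail.
RESTORE LOG [MyDatabase]
FROM DISK = N'C:\SQLBackups\MyDatabase_TLOG1.trn' WITH NORECOVERY;
RESTORE LOG [MyDatabase]
FROM DISK = N'C:\SQLBackups\MyDatabase_TLOG_TAIL.trn' WITH RECOVERY;
GO
```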
The ability to recover a database by restoring only a subset of the database files can be very beneficial. For example, if a single data file for a VLDB goes offline for some reason, we have the ability to restore from file backup just the damaged file, rather than restoring the entire database. With a combination of the file backup, plus the necessary transaction log backups, we can get that missing data file back to the state it was in as close as possible to the time of failure, and much quicker than might be possible if we needed to restore the whole database from scratch!

With Enterprise Edition SQL Server, as discussed earlier, we also have the ability to perform online piecemeal restores, where again we start by restoring just a subset of the data files, in this case the primary filegroup, and then immediately bring the database online having recovered only this subset of the data.

As you've probably gathered, restoring a database from file backups, while potentially very beneficial in reducing down-time, can be quite complex and can involve managing and processing a large number of backup files. The easiest way to get a grasp of how the various types of file restore work is by example. Therefore, over the following sections, we'll walk through some examples of how to perform, with file backups, the same restore processes that we've seen previously in the book, namely a complete restore and a point-in-time restore. We'll then take a look at an example each of recovering from a "single data file failure," as well as online piecemeal restore.

We're not going to attempt to run through each type of restore in four different ways (SSMS, T-SQL, SQL Backup GUI, SQL Backup T-SQL), as this would simply get tedious. We'll focus on scripted restores using either native T-SQL or SQL Backup T-SQL, and leave the equivalent restores, via GUI methods, as an exercise for the reader.
It's worth noting, though, that whereas for database backups the SQL Backup GUI will automatically detect all required backup files (assuming they are still in their original locations), it will not do so for file backups; each required backup file will need to be located manually.
Figure 9-5 depicts the current backups we have in place. We have the first data load captured in full file backups, the second data load captured in the differential file backups, and a third data load that is not in any current backup file, but we'll need to capture it in a tail log backup in order to restore the database to its current state. In a case where we were unable to take a final tail log backup we'd only be able to roll forward to the end of the TLOG2 backup. In this example, we are going to take one last backup, just to get our complete database back intact.
Figure 9-5:
The first step is to capture that tail log backup, and prepare for the restore process, as shown in Listing 9-18.
USE master
GO
--backup the tail
BACKUP LOG [DatabaseForFileBackups]
TO DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_TLOG_TAIL.trn'
WITH NORECOVERY
GO
Notice the use of the NORECOVERY option in a backup; this lets SQL Server know that we want to back up the transactions in the log file and immediately place the database into a restoring state. This way, no further transactions can slip past us into the log while we are preparing the RESTORE command. We're now ready to start the restore process. The first step is to restore the two full file backups. We're going to restore over the top of the existing database, as shown in Listing 9-19.
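As a rough sketch of what restoring the two full file backups can look like in native T-SQL, consider the following. The backup file names (_FG1_Full.bak and _FG2_Full.bak) are assumptions, chosen by analogy with the differential file backups used later in this chapter, and this sketch states NORECOVERY explicitly on both statements for clarity, whereas the discussion that follows relies on the default behavior for the first one.

```sql
USE master
GO
-- Assumption: the full file backups are named _FG1_Full.bak and _FG2_Full.bak;
-- substitute the names of your own full file backups.
RESTORE DATABASE [DatabaseForFileBackups]
    FILE = N'DatabaseForFileBackups'
    FROM DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_FG1_Full.bak'
    WITH NORECOVERY   -- leave the database restoring; more backups follow
GO
RESTORE DATABASE [DatabaseForFileBackups]
    FILE = N'DatabaseForFileBackups_Data2'
    FROM DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_FG2_Full.bak'
    WITH NORECOVERY
GO
```

Because the tail log backup was taken WITH NORECOVERY, the database is already in a restoring state when these statements run.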
Processed 184 pages for database 'DatabaseForFileBackups', file 'DatabaseForFileBackups' on file 1.
Processed 6 pages for database 'DatabaseForFileBackups', file 'DatabaseForFileBackups_log' on file 1.
The roll forward start point is now at log sequence number (LSN) 23000000026800001. Additional roll forward past LSN 23000000036700001 is required to complete the restore sequence.
This RESTORE statement successfully performed some actions, but the database could not be brought online because one or more RESTORE steps are needed. Previous messages indicate reasons why recovery cannot occur at this point.
RESTORE DATABASE ... FILE=<name> successfully processed 190 pages in 0.121 seconds (12.251 MB/sec).
Processed 16 pages for database 'DatabaseForFileBackups', file 'DatabaseForFileBackups_Data2' on file 1.
Processed 2 pages for database 'DatabaseForFileBackups', file 'DatabaseForFileBackups_log' on file 1.
RESTORE DATABASE ... FILE=<name> successfully processed 18 pages in 0.105 seconds (1.274 MB/sec).
Figure 9-6: Output message from restoring the full file backups.
Notice that we didn't specify the state to which to return the database after the first RESTORE command. By default this would attempt to bring the database back online, with recovery, but in this case SQL Server knows that there are more files to process, so it keeps the database in a restoring state. The first half of the message output from running this command, shown in Figure 9-6, tells us that the roll forward start point is at a specific LSN but that additional roll forward is required, so more files will have to be restored to bring the database back online.

The second part of the message simply reports that the restore of the backup for the secondary data file was successful. Since we specified that the database should be left in a restoring state after the second restore command, SQL Server doesn't try to recover the database to a usable state (and would be unable to do so). If you check Object Explorer in SSMS, you'll see that DatabaseForFileBackups is still in a restoring state.

After the full file backups, we took a transaction log backup (_TLOG), but since we're rolling forward past the subsequent differential file backups, where any data changes will be captured for each filegroup, we don't need to restore the first transaction log on this occasion. So, let's go ahead and restore the two differential file backups, as shown in Listing 9-20.
USE master
GO
RESTORE DATABASE [DatabaseForFileBackups]
FILE = N'DatabaseForFileBackups'
FROM DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_FG1_Diff.bak'
WITH NORECOVERY
GO
RESTORE DATABASE [DatabaseForFileBackups]
FILE = N'DatabaseForFileBackups_Data2'
FROM DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_FG2_Diff.bak'
WITH NORECOVERY
GO
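If you prefer to confirm the database state from a query window rather than from Object Explorer, a query along the following lines should do the job at any point during the restore sequence. It is only a small sketch against the sys.databases catalog view.

```sql
-- While the restore sequence is still in progress, state_desc
-- should report RESTORING for this database.
SELECT  name ,
        state_desc ,
        recovery_model_desc
FROM    sys.databases
WHERE   name = N'DatabaseForFileBackups';
```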
The next step is to restore the second transaction log backup (_TLOG2), as shown in Listing 9-21. When it comes to restoring the transaction log backup files, we need to specify NORECOVERY on all of them except the last. The log backups we restore at this stage (only a single log backup in this example!) may contain changes for all of the data files and, if we do not specify NORECOVERY, SQL Server will recover the database and make it usable, leaving us unable to apply the last of the log backup files.
USE master
GO
-- RESTORE LOG is the documented statement for applying a log backup
RESTORE LOG [DatabaseForFileBackups]
FROM DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_TLOG2.trn'
WITH NORECOVERY
GO
Finally, we need to apply the tail log backup, where we know our third data load is captured, and recover the database.
-- apply the tail log backup and bring the database online
RESTORE LOG [DatabaseForFileBackups]
FROM DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_TLOG_TAIL.trn'
WITH RECOVERY
GO
Listing 9-22: Restore the tail log backup and recover the database.
A simple query of the restored database will confirm that we've restored the database, with all the rows intact.
USE [DatabaseForFileBackups]
GO
SELECT * FROM Table_DF1
SELECT * FROM Table_DF2
(4 row(s) affected)

Message
--------------------------------------------------
This is the initial data load for the table
This is the second data load for the table
This is the point-in-time data load for the table
This is the third data load for the table

(4 row(s) affected)
Notice that we include the REPLACE keyword in the first restore, since we are trying to replace the database and in this case aren't starting with a tail log backup, and there may be transactions in the log that haven't been backed up yet. We then restore the second full file backup and the two differential file backups, leaving the database in a restoring state each time. Finally, we restore the second transaction log backup, using the STOPAT parameter to indicate the time to which we wish to restore the database. In my example, I set the time for the STOPAT parameter to be about 30 seconds before the two INSERTs were executed, and Listing 9-25 confirms that only the first two data loads are present in the restored database.
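The final step described above, a log restore that stops at a chosen point in time, might be sketched as follows. The timestamp is purely hypothetical, and the backup file name assumes that the second log backup is the _TLOG2.trn file used earlier in the chapter; adjust both to suit your own backups.

```sql
USE master
GO
-- Hypothetical STOPAT value: pick a time shortly before the
-- unwanted INSERTs (or other damage) occurred.
RESTORE LOG [DatabaseForFileBackups]
    FROM DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_TLOG2.trn'
    WITH STOPAT = '2012-01-01T12:00:00' ,
         RECOVERY
GO
```

Transactions committed after the STOPAT time are not rolled forward, which is why only the earlier data loads appear in the restored database.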
Great! Our data is exactly what we expected: we have restored the database to precisely the point in time we wanted. In the real world, you would be restoring to a point in time before a disaster struck, or before data was somehow removed or corrupted.
Everything is running smoothly, and we get great performance for this database until our SAN suddenly suffers a catastrophic loss on one of its disk enclosures and we lose the drive holding one of the secondary data files. The database goes offline. We quickly get a new disk attached to our server, but the secondary data file is lost and we are not going to be able to get it back. At this point, all of the tables and data in that data file will be lost but, luckily, we have been performing regular full file, differential file, and transaction log backups of this database, and if we can capture a final tail-of-the-log backup, we can get this database back online using only the backup files for the lost secondary data file, plus the necessary log files, as shown in Figure 9-7.
In order to get the lost data file back online, with all data up to the point of the disk crash, we would:

• perform a tail log backup (this requires a special form of log backup, which does not truncate the transaction log)
• restore the full file backup for the lost secondary data file
• restore the differential file backup for the secondary data file, at which point SQL Server would expect to be able to apply log backups 3 and 4 plus the tail log backup
• recover the database.
Once you do this, your database will be back online and recovered up to the point of the failure! This is a huge time saver since restoring all of the files could take quite a long time. Having to restore only the data file that was affected by the crash will cut your recovery time down significantly. However, there are a few caveats attached to this technique, which we'll discuss as we walk through a demo.

Specifically, we're going to use SQL Backup scripts (these can easily be adapted into native T-SQL scripts) to show how we might restore just a single damaged, or otherwise unusable, secondary data file for our DatabaseForFileBackups_SB database, without having to restore the primary data file. A script demonstrating the same process using native T-SQL is available with the code download for this book.

Note that if the primary data file went down, we'd have to restore the whole database, rather than just the primary data file. However, with Enterprise Edition SQL Server, it would be possible to get the database back up and running by restoring only the primary data file, followed subsequently by the other data files. We'll discuss that in more detail in the next section.

If you recall, for this database we have Table_DF1, stored in the primary data file (DatabaseForFileBackups_SB.mdf) in the PRIMARY filegroup, and Table_DF2, stored in the secondary data file (DatabaseForFileBackups_SB_Data2.ndf) in the SECONDARY filegroup. Our first data load (one row into each table) was captured in full file backups for each filegroup. We then captured a first transaction log backup. Our second data load (an additional row into each table) was captured in differential file backups for each filegroup. Finally, we took a second transaction log backup.
Let's perform a third data load, inserting one new row into Table_DF2.
Now we're going to simulate a problem with our secondary data file which takes it (and our database) offline; in Listing 9-27 we take the database offline. Having done so, navigate to the C:\SQLData\Chapter9 folder and delete the secondary data file!
-- Take DatabaseForFileBackups_SB offline
USE master
GO
ALTER DATABASE [DatabaseForFileBackups_SB] SET OFFLINE;
GO
/*Now delete DatabaseForFileBackups_SB_Data2.ndf!*/
Listing 9-27: Take DatabaseForFileBackups_SB offline and delete secondary data file!
Scary stuff! Next, let's attempt to bring our database back online.
USE master
GO
ALTER DATABASE [DatabaseForFileBackups_SB] SET ONLINE;
GO

Msg 5120, Level 16, State 5, Line 1
Unable to open the physical file "C:\SQLData\DatabaseForFileBackups_SB_Data2.ndf". Operating system error 2: "2(failed to retrieve text for this error. Reason: 15105)".
Msg 945, Level 14, State 2, Line 1
Database 'DatabaseForFileBackups_SB' cannot be opened due to inaccessible files or insufficient memory or disk space. See the SQL Server errorlog for details.
Msg 5069, Level 16, State 1, Line 1
ALTER DATABASE statement failed.
Listing 9-28: Database cannot come online due to missing secondary data file.
As you can see, this is unsuccessful, as SQL Server can't open the secondary data file. Although unsuccessful, this attempt to bring the database online is still necessary, as we need the database to attempt to come online so that the log file can be read.

We urgently need to get the database back online, in the state it was in when our secondary file failed. Fortunately, the primary data file and the log file are on separate drives from the secondary data file, so these are still available. We know that there is data in the log file that isn't captured in any log backup, so our first task is to back up the log. Unfortunately, a normal log backup operation, such as the one shown in Listing 9-29, will not succeed.
-- a normal tail log backup won't work
USE [master]
GO
EXECUTE master..sqlbackup '-SQL "BACKUP LOG [DatabaseForFileBackups_SB] TO DISK = ''C:\SQLBackups\Chapter9\DatabaseForFileBackups_SB_TLOG_TAIL.sqb'' WITH NORECOVERY"'
GO
In my tests, this script just hangs and has to be cancelled, which is unfortunate. The equivalent script in native T-SQL results in an error message to the effect that:
"Database 'DatabaseForFileBackups_SB' cannot be opened due to inaccessible files or insufficient memory or disk space."
SQL Server cannot back up the log this way because the database cannot be opened, and part of the log backup process, even a tail log backup, needs to write some log info into the database header. Since it cannot, we need to use a different form of tail backup to get around this problem.
What we need to do instead is a special form of tail log backup that uses the NO_TRUNCATE option, so that SQL Server can back up the log without access to the data files. In this case the log will not be truncated upon backup, and all log records will remain in the live transaction log. Essentially, this is a special type of log backup and isn't going to remain useful to us after this process is over. When we do get the database back online and completely usable, we want to be able to take a backup of the log file in its original state and not break the log chain. In other words, once the database is back online, we can take another log backup (TLOG3, say) and the log chain will be TLOG2 followed by TLOG3 (not TLOG2, TLOG_TAIL, TLOG3). I would, however, suggest attempting to take some full file backups immediately after a failure, if not a full database backup, if that is at all possible.
USE [master]
GO
EXECUTE master..sqlbackup '-SQL "BACKUP LOG [DatabaseForFileBackups_SB] TO DISK = ''C:\SQLBackups\Chapter9\DatabaseForFileBackups_SB_TLOG_TAIL.sqb'' WITH NO_TRUNCATE"'
GO
Listing 9-30: Performing a tail log backup with NO_TRUNCATE for emergency single file recovery.
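For readers working without SQL Backup, the native T-SQL form of the same emergency tail log backup might look something like the sketch below. The .trn file name is an assumption; any path that SQL Server can write to will do.

```sql
USE [master]
GO
-- NO_TRUNCATE lets the log be backed up even though the database
-- cannot be brought online; the log is not truncated afterwards.
-- Assumption: a native-format file name ending in .trn.
BACKUP LOG [DatabaseForFileBackups_SB]
    TO DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_SB_TLOG_TAIL.trn'
    WITH NO_TRUNCATE
GO
```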
Note that if we cannot take a transaction log backup before starting this process, we cannot get our database back online without restoring all of our backup files. This process will only work if we lose a single file from a drive that does not also house the transaction log. Now that we have our tail log backup done, we can move on to recovering our lost secondary data file. The entire process is shown in Listing 9-31. You will notice that there are no backups in this set of RESTORE commands that reference our primary data file. This data would have been left untouched and doesn't need to be restored. Having to restore only the lost file will save us a great deal of time.
What we have done here is restore the full file, the differential file and all of the transaction log file backups that were taken after the differential file backup, including the tail log backup. This has brought our database back online and right up to the point where the tail log backup was taken. Now you may be afraid that you didn't get to it in time and some data may be lost, but don't worry. Any transactions that would have affected the missing data file would not have succeeded after the disaster, even if the database stayed online, and any that didn't use that missing data file would have been picked up by the tail log backup.
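To make the sequence just described concrete, here is a minimal sketch of how it might be scripted in native T-SQL. All of the backup file names, and the logical file name DatabaseForFileBackups_SB_Data2, are assumptions made for illustration; the SQL Backup version uses its own file names and syntax.

```sql
USE [master]
GO
-- 1. Full file backup of the lost secondary data file (hypothetical file name)
RESTORE DATABASE [DatabaseForFileBackups_SB]
    FILE = N'DatabaseForFileBackups_SB_Data2'
    FROM DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_SB_FG2_Full.bak'
    WITH NORECOVERY
GO
-- 2. Differential file backup of the same data file (hypothetical file name)
RESTORE DATABASE [DatabaseForFileBackups_SB]
    FILE = N'DatabaseForFileBackups_SB_Data2'
    FROM DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_SB_FG2_Diff.bak'
    WITH NORECOVERY
GO
-- 3. Every log backup taken after the differential file backup
RESTORE LOG [DatabaseForFileBackups_SB]
    FROM DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_SB_TLOG2.trn'
    WITH NORECOVERY
GO
-- 4. The tail log backup, recovering the database
RESTORE LOG [DatabaseForFileBackups_SB]
    FROM DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_SB_TLOG_TAIL.trn'
    WITH RECOVERY
GO
```

Note that nothing in this sketch touches the primary data file; only the lost file is restored and rolled forward.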
This script is pretty long, but nothing here should be new to you. Our new database contains both a primary and secondary filegroup, and we establish the secondary filegroup as the DEFAULT, so this is where user objects and data will be stored, unless we specify otherwise. We then switch the database to FULL recovery model just to be sure that we can take log backups of the database; always validate which recovery model is in use, rather than just relying on the default being the right one. Finally, we create one table in each filegroup, and insert a single row into each table.

Listing 9-33 simulates a series of file and log backups on our database; we can imagine that the file backups for each filegroup are taken on successive nights, and that the log backup after each file backup represents the series of log files that would be taken during the working day. Note that, in order to keep focused, we don't start proceedings with a full database backup, as would generally be recommended.
USE [master]
GO
BACKUP DATABASE [DatabaseForPartialRestore]
FILEGROUP = N'PRIMARY'
TO DISK = N'C:\SQLBackups\Chapter9\DatabaseForPartialRestore_FG_Primary.bak'
WITH INIT
GO
BACKUP LOG [DatabaseForPartialRestore]
TO DISK = N'C:\SQLBackups\Chapter9\DatabaseForPartialRestore_LOG_1.trn'
WITH NOINIT
GO
BACKUP DATABASE [DatabaseForPartialRestore]
FILEGROUP = N'SECONDARY'
TO DISK = N'C:\SQLBackups\Chapter9\DatabaseForPartialRestore_FG_Secondary.bak'
WITH INIT
GO
BACKUP LOG [DatabaseForPartialRestore]
TO DISK = N'C:\SQLBackups\Chapter9\DatabaseForPartialRestore_LOG_2.trn'
WITH NOINIT
GO
With these backups complete, we have all the tools we need to perform a piecemeal restore! Remember, though, that we don't need to be taking file backups in order to perform a partial/piecemeal restore. If the database is still small enough, we can still take full database backups and then restore just a certain filegroup from that backup file, in the manner demonstrated next. Listing 9-34 restores just our primary filegroup, plus subsequent log backups, and then brings the database online without the secondary filegroup!
USE [master]
GO
RESTORE DATABASE [DatabaseForPartialRestore]
FILEGROUP = 'PRIMARY'
FROM DISK = 'C:\SQLBackups\Chapter9\DatabaseForPartialRestore_FG_Primary.bak'
WITH PARTIAL, NORECOVERY, REPLACE
GO
RESTORE LOG [DatabaseForPartialRestore]
FROM DISK = 'C:\SQLBackups\Chapter9\DatabaseForPartialRestore_LOG_1.trn'
WITH NORECOVERY
GO
RESTORE LOG [DatabaseForPartialRestore]
FROM DISK = 'C:\SQLBackups\Chapter9\DatabaseForPartialRestore_LOG_2.trn'
WITH RECOVERY
GO
Listing 9-34: Restoring our primary filegroup via an online piecemeal restore.
Notice the use of the PARTIAL keyword to let SQL Server know that we will want to only partially restore the database, i.e. restore only the primary filegroup. We could also restore further filegroups here, but the only one necessary is the primary filegroup. Note the use of the REPLACE keyword, since we are not taking a tail log backup. Even though we recover the database upon restoring the final transaction log, we can still, in this case, restore the other data files later. The query in Listing 9-35 attempts to access data in both the primary and secondary filegroups.
The first query should return data, but the second one will fail with the following error:
Msg 8653, Level 16, State 1, Line 1 The query processor is unable to produce a plan for the table or view 'Message_Secondary' because the table resides in a filegroup which is not online.
This is exactly the behavior we expect, since the secondary filegroup is still offline. In a well-designed database we would, at this point, be able to access all of the most critical data, leaving just the least-used data segmented into the filegroups that will be restored later. The real bonus is that we can subsequently restore the other filegroups, while the database is up and functioning! Nevertheless, the online restore is going to be an I/O-intensive process and we would want to affect the users as little as possible, while giving as much of the SQL Server horsepower as we could to the restore. That means that it's best to wait till a time when database access is sparse before restoring the subsequent filegroups, as shown in Listing 9-36.
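USE [master]
GO
RESTORE DATABASE [DatabaseForPartialRestore]
FROM DISK = 'C:\SQLBackups\Chapter9\DatabaseForPartialRestore_FG_Secondary.bak'
WITH RECOVERY
GO
-- apply the log backup taken after the secondary filegroup backup
RESTORE LOG [DatabaseForPartialRestore]
FROM DISK = 'C:\SQLBackups\Chapter9\DatabaseForPartialRestore_LOG_2.trn'
WITH RECOVERY
GO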
We restore the secondary full file backup, followed by all subsequent log backups, so that SQL Server can bring the other filegroup back online while guaranteeing relational integrity. Notice that in this case each restore is using WITH RECOVERY; in an online piecemeal restore, with the Enterprise edition, each restore leaves the database online and accessible to the end-user. The first set of restores, in Listing 9-34, used NORECOVERY, but that was just to get us to the point where the primary filegroup was online and available. All subsequent restore steps use RECOVERY. Rerun Listing 9-35, and all of the data should be fully accessible!
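As well as rerunning the test queries, we can ask SQL Server directly which files are online. The following sketch joins two catalog views in the restored database; any file reporting a state other than ONLINE belongs to a filegroup that still needs to be restored.

```sql
USE [DatabaseForPartialRestore]
GO
-- One row per database file; the log file has no filegroup,
-- hence the LEFT JOIN.
SELECT  f.name AS logical_file_name ,
        fg.name AS filegroup_name ,
        f.type_desc ,
        f.state_desc
FROM    sys.database_files AS f
        LEFT JOIN sys.filegroups AS fg
            ON fg.data_space_id = f.data_space_id;
```

Running this before and after the restore of the secondary filegroup gives a quick before-and-after picture of the piecemeal restore.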
A similar argument applies to missing differential file backups; we'll have to simply rely on the full file backup and the chain of log files.

If a log backup file is lost, the situation is more serious. Transaction log backups are the glue that holds together the other two file backup types, and they should be carefully monitored to make sure they are valid and that they are being stored locally as well as on long-term storage. Losing a transaction log backup can be a disaster if we do not have a set of full file and file differential backups that cover the time frame of the missing log backup. If a log backup is unavailable or corrupt, and we really need it to complete a restore operation, we are in bad shape. In this situation, we will not be able to restore past that point in time and will have to find a way to deal with the data loss. This is why managing files carefully, and keeping a tape backup offsite, is so important for all of our backup files.
Considerations for the SLA, for any databases requiring file backups, include those shown below.

• Scheduling of full file backups: I recommend full file backup at least once per week, although I've known cases where we had to push beyond this as it wasn't possible to get a full backup of each of the data files in that period.
• Scheduling of differential file backups: I recommend scheduling differential backups on any day where a full file backup is not being performed. As discussed in Chapter 7, this can dramatically decrease the number of log files to be processed for a restore operation.
• Scheduling of transaction log backups: These should be taken daily, at an interval chosen by yourself and the project manager whose group uses the database. I would suggest taking log backups of a VLDB using file backups at an interval of no more than 1 hour. Of course, if the business requires a more finely-tuned window of recoverability, you will need to shorten that schedule down to 30 or even 15 minutes, as required. Even if the window of data loss is more than 1 hour, I would still suggest taking log backups hourly.

For any database, it's important to ensure that all backups are completing successfully, and that all the backup files are stored securely and in their proper location, whether on a local disk or on long-term tape storage. However, in my experience, the DBA needs to be exceptionally vigilant in this regard for databases using file backups. There are many more files involved, and a missing log file can prevent you from being able to restore a database to a consistent state. With a missing log file and in the absence of a full file or differential file backup that covers the same time period, we'd have no choice but to restore to a point in time before the missing log file was taken. These log backups are also your "backup of a backup" in case a differential or full file backup goes missing.
I would suggest keeping locally at least the last two full file backups, subsequent differential file backup and all log backups spanning the entire time frame of these file backups, even after they have been written to tape and taken to offsite storage.
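One way to keep an eye on which backups exist, and where they were written, is to query the backup history that SQL Server records in msdb. The query below is a sketch only; the database name is just the example database from this chapter, and you would normally add a date filter to suit your retention period.

```sql
-- Recent backups for one database, newest first, with a readable
-- description of each backup type and the file it was written to.
SELECT  bs.database_name ,
        CASE bs.type
          WHEN 'D' THEN 'Full database'
          WHEN 'I' THEN 'Differential database'
          WHEN 'L' THEN 'Transaction log'
          WHEN 'F' THEN 'File or filegroup'
          WHEN 'G' THEN 'Differential file'
          WHEN 'P' THEN 'Partial'
          WHEN 'Q' THEN 'Differential partial'
        END AS backup_type ,
        bs.backup_start_date ,
        bs.backup_finish_date ,
        bs.first_lsn ,
        bs.last_lsn ,
        bmf.physical_device_name
FROM    msdb.dbo.backupset AS bs
        INNER JOIN msdb.dbo.backupmediafamily AS bmf
            ON bmf.media_set_id = bs.media_set_id
WHERE   bs.database_name = N'DatabaseForFileBackups'
ORDER BY bs.backup_finish_date DESC;
```

Bear in mind that this history only tells us that a backup was taken, not that the file is still present and readable in that location, so it complements, rather than replaces, regular test restores.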
It may seem like a lot of files to keep handy, but since this is one of the most file-intensive types of restore, it is better to be safe than sorry. Waiting for a file from tape storage can cost the business money and time that they don't want to lose.

Restore times for any database using file backups can vary greatly, depending on the situation; sometimes we'll need to restore several large full file backups, plus any differential backups, plus all the necessary log backups. Other times, we will only need to restore a single file or filegroup backup plus the covering transaction log backups. This should be reflected in the SLA, and the business owners and end-users should be prepared for this varying window of recovery time, in the event that a restore is required. If the estimated recovery time is outside acceptable limits for complete restore scenarios, then if the SQL Server database in question is Enterprise Edition, consider supporting the online piecemeal restore process discussed earlier. If not, then the business owner will need to weigh up the cost of upgrading to Enterprise Edition licenses, against the cost of extended down-time in the event of a disaster.

As always, the people who use the database will drive the decisions made and reflected in the SLA. They will know the fine details of how the database is used and what processes are run against the data. Use their knowledge of these details when agreeing on appropriate data loss and recovery time parameters, and on the strategy to achieve them.
The error is quite subtle so, if you can't spot it, go ahead and execute the script and take a look at the error message, shown in Figure 9-8.
The most revealing part of the error message here states that The file "SECONDARY" is not part of database "DatabaseForFileBackups". However, we know that the SECONDARY filegroup is indeed part of this database. The error we made was with our use of the FILE parameter; SECONDARY is the name of the filegroup, not the secondary data file. We can either change the parameter to FILEGROUP (since we only have one file in this filegroup), or we can use the FILE parameter and reference the name of the secondary data file explicitly (FILE= N'DatabaseForFileBackups_Data2').

Let's now move on to a bit of file restore-based havoc. Consider the script shown in Listing 9-38, the intent of which appears to be to restore our DatabaseForFileBackups database to the state in which it existed when we took the second transaction log backup file.
The error in the script is less subtle, and I'm hoping you worked out what the problem is before seeing the error message in Figure 9-8.
We can see that the first three RESTORE commands executed successfully but the fourth failed, with a message stating that the LSN contained in the backup was too recent to apply. Whenever you see this sort of message, it means that a file is missing from your restore script. In this case, we forgot to apply the differential file backup for the secondary data file; SQL Server detects the gap in the LSN chain and aborts the RESTORE command, leaving the database in a restoring state.

The course of action depends on the exact situation. If the differential backup file is available and you simply forgot to include it, then restore this differential backup, followed by TLOG2, to recover the database. If the differential file backup really is missing or corrupted, then you'll need to process all transaction log backups taken after the full file backup was created. In our simple example this just means TLOG and TLOG2, but in a real-world scenario this could be quite a lot of log backups.

Again, hopefully this hammers home the point that it is a good idea to have more than one set of files on hand, or available from offsite storage, which could be used to bring your database back online in the event of a disaster. You never want to be in the situation where you have to lose more data than is necessary, or are not able to restore at all.
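When faced with a "too recent to apply" message, it can help to look at the LSN range recorded in each backup file before deciding what to restore next. RESTORE HEADERONLY reads this information from the backup file without restoring anything. The sketch below assumes a native-format backup at the path used earlier in the chapter.

```sql
-- The result set includes FirstLSN and LastLSN (among many other
-- columns); a log backup can be applied only if its LSN range
-- joins up with the point the restore sequence has reached.
RESTORE HEADERONLY
FROM DISK = N'C:\SQLBackups\Chapter9\DatabaseForFileBackups_TLOG2.trn';
```

Comparing the FirstLSN of the failing backup with the LastLSN values of the backups already restored usually reveals which file has been left out of the script.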
Summary
In my experience, the need for file backup and restore has tended to be relatively rare, among the databases that I manage. The flipside to that is that the databases that do need them tend to be VLDBs supporting high visibility projects, and all DBAs need to make sure that they are well-versed in taking, as well as restoring databases from, the variety of file backups.

File backup and restore adds considerable complexity to our disaster recovery strategy, in terms of both the number and the type of backup file that must be managed. To gain full benefit from file backup and restores, the DBA needs to give considerable thought to the file and filegroup architecture for that database, and plan the backup and restore process accordingly. There are an almost infinite number of possible file and filegroup architectures, and each would require a subtly different backup strategy. You'll need to create some test databases, with multiple files and filegroups, work through them, and then document your approach. You can design some test databases to use any number of data files, and then create jobs to take full file, differential file, and transaction log backups that would mimic what you would use in a production environment. Then set yourself the task of responding to the various possible disaster scenarios, and bring your database back online to a certain day, or even a point in time that is represented in one of the transaction log backup files.
Chapter 10: Partial Backup and Restore

set of archive tables for future reporting and auditing, stored in a read-only filegroup (this filegroup would be switched temporarily to read-write in order to run the archive process). We can perform this archiving in a traditional data-appending manner by moving all of the data from the live tables to the archive tables, or we could streamline this process via the use of partitioning functions.

Once the archive process is complete and the new course data has been imported, we can take a full backup of the whole database and store it in a safe location. From then on, we can adopt a schedule of, say, weekly partial backups interspersed with daily differential partial backups. This way we are not wasting any space or time backing up the read-only data. Also, it may well be acceptable to operate this database in SIMPLE recovery model, since we know that once the initial course data is loaded, changes to the live course data are infrequent, so an exposure of one day to potential data loss may be tolerable.

In our example, taking a full backup only once every three months may seem a little too infrequent. Instead, we might consider performing a monthly full backup, to provide a little extra insurance, and simplify the restore process.
The script is fairly straightforward and there is nothing here that we haven't discussed in previous scripts for multiple data file databases. The Archive filegroup will eventually be set to read-only, but first we are going to need to create some tables in this filegroup and populate one of them with data, as shown in Listing 10-2.
Listing 10-2: Creating the MainData and ArchiveData tables and populating the MainData table.
The final preparatory step for our example is to simulate an archiving process, copying data from the MainData table into the ArchiveData table, setting the Archive filegroup as read-only, and then deleting the archived data from MainData, and inserting the next set of "live" data. Before running Listing 10-3, make sure there are no other query windows connected to the DatabaseForPartialBackups database. If there are, the conversion of the secondary filegroup to READONLY will fail, as we need to have exclusive access on the database before we can change filegroup states.
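If you are not sure whether anything else is connected, a quick look at the sessions DMV can save a failed ALTER DATABASE. This is a sketch only; it requires VIEW SERVER STATE permission, and the database_id column used here is available in sys.dm_exec_sessions from SQL Server 2012 onwards.

```sql
-- Lists every session whose current database is DatabaseForPartialBackups.
-- Your own query window will appear here too if it is using that database.
SELECT  session_id ,
        login_name ,
        host_name ,
        program_name
FROM    sys.dm_exec_sessions
WHERE   database_id = DB_ID(N'DatabaseForPartialBackups');
```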
GO
ALTER DATABASE [DatabaseForPartialBackups] MODIFY FILEGROUP [Archive] READONLY
GO
DELETE FROM dbo.MainData
GO
INSERT INTO dbo.MainData VALUES ( 'Data for second database load: Data 4' )
INSERT INTO dbo.MainData VALUES ( 'Data for second database load: Data 5' )
INSERT INTO dbo.MainData VALUES ( 'Data for second database load: Data 6' )
GO
Finally, before we take our first partial backup, we want to capture one backup copy of the whole database, including the read-only data, as the basis for any subsequent restore operations. We can take a partial database backup before taking a full one, but we do want to make sure we have a solid restore point for the database, before starting our partial backup routines. Therefore, Listing 10-4 takes a full database backup of our DatabaseForPartialBackups database. Having done so, it also inserts some more data into MainData, so that we have fresh data to capture in our subsequent partial backup.
Listing 10-4: Full database backup of DatabaseForPartialBackups, plus third data load.
The output from the full database backup is shown in Figure 10-1. Notice that, as expected, it processes both of our data files, plus the log file.
Processed 176 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups' on file 1.
Processed 16 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups_ReadOnly' on file 1.
Processed 2 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups_log' on file 1.
BACKUP DATABASE successfully processed 194 pages in 0.043 seconds (35.110 MB/sec).
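For readers following along without the listing to hand, a full backup followed by a third data load might look like the sketch below. The backup file name matches the one used by the restore scripts later in this chapter. The text of the three inserted rows is an assumption, chosen to follow the pattern of the other data loads.

```sql
USE [master]
GO
-- Full database backup: captures read-write and read-only filegroups alike
BACKUP DATABASE [DatabaseForPartialBackups]
TO DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_FULL.bak'
GO

USE [DatabaseForPartialBackups]
GO
-- Third data load (row text is assumed, following the earlier loads)
INSERT INTO dbo.MainData
VALUES ( 'Data for third database load: Data 7' )
INSERT INTO dbo.MainData
VALUES ( 'Data for third database load: Data 8' )
INSERT INTO dbo.MainData
VALUES ( 'Data for third database load: Data 9' )
GO
```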
The only difference between this backup command and the full database backup command shown in Listing 10-4 is the addition of the READ_WRITE_FILEGROUPS option. This option lets SQL Server know that the command is a partial backup and to only process the read-write filegroups contained in the database. The output should be similar to that shown in Figure 10-2. Notice that this time only the primary data file and the log file are processed. This is exactly what we expected to see: since we are not processing any of the read-only data, we shouldn't see that data file being accessed in the second backup command.
Processed 176 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups' on file 1.
Processed 2 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups_log' on file 1.
BACKUP DATABASE...FILE=<name> successfully processed 178 pages in 0.039 seconds (35.481 MB/sec).
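A partial backup command of the kind described above can be sketched as follows. This is an illustration rather than the original listing; the file name is taken from the restore scripts that appear later in the chapter.

```sql
USE [master]
GO
-- READ_WRITE_FILEGROUPS restricts the backup to the primary filegroup
-- and any other filegroups that are currently read-write
BACKUP DATABASE [DatabaseForPartialBackups] READ_WRITE_FILEGROUPS
TO DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_PARTIAL_Full.bak'
GO
```

Because the Archive filegroup is read-only at this point, its data file should not appear in the messages this command returns.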
Before we run a differential partial backup, we need some fresh data to process.
USE [DatabaseForPartialBackups]
GO
INSERT INTO MainData
VALUES ( 'Data for fourth database load: Data 10' )
INSERT INTO MainData
VALUES ( 'Data for fourth database load: Data 11' )
INSERT INTO MainData
VALUES ( 'Data for fourth database load: Data 12' )
GO
Listing 10-6: Fourth data load, in preparation for partial differential backup.
Listing 10-7 shows the script to run our partial differential backup. The one significant difference is the inclusion of the WITH DIFFERENTIAL option, which converts the command from a full partial to a differential partial backup.
USE [master]
GO
BACKUP DATABASE [DatabaseForPartialBackups] READ_WRITE_FILEGROUPS
TO DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_PARTIAL_Diff.bak'
WITH DIFFERENTIAL
GO
Once this command is complete, go ahead and check the output of the command in the messages tab to make sure only the proper data files were processed. We are done taking partial backups for now and can move on to our restore examples.
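As well as reading the messages tab, we can inspect the backup file itself. The sketch below is an optional extra check, not part of the original example. It uses RESTORE HEADERONLY to report the backup type, and RESTORE FILELISTONLY to list the database files and show which of them are present in the backup.

```sql
USE [master]
GO
-- BackupType in the result set: 1 = full database, 5 = differential database,
-- 7 = partial, 8 = differential partial
RESTORE HEADERONLY
FROM DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_PARTIAL_Diff.bak'
GO
-- The IsPresent column indicates whether each file is contained in the backup
RESTORE FILELISTONLY
FROM DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_PARTIAL_Diff.bak'
GO
```

Neither command restores anything; both only read metadata from the backup file.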
The output from running this script is shown in Figure 10-3. We should see all files being processed in the first command, and only the read-write data file and the transaction log file being processed in the second command.
Processed 176 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups' on file 1.
Processed 16 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups_ReadOnly' on file 1.
Processed 2 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups_log' on file 1.
RESTORE DATABASE successfully processed 194 pages in 0.760 seconds (1.986 MB/sec).
Processed 176 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups' on file 1.
Processed 2 pages for database 'DatabaseForPartialBackups', file 'DatabaseForPartialBackups_log' on file 1.
RESTORE DATABASE ... FILE=<name> successfully processed 178 pages in 0.124 seconds (11.159 MB/sec).
Everything looks good, and exactly as expected, but let's put on our Paranoid DBA hat once more and check that the restored database contains the right data.
USE [DatabaseForPartialBackups]
GO
SELECT  ID ,
        Message
FROM    dbo.MainData
SELECT  ID ,
        Message
FROM    dbo.ArchiveData
Hopefully, we'll see three rows of data in the ArchiveData table and six rows of data in the read-write table, MainData, as confirmed in Figure 10-4.
Figure 10-4: Results of the data check on our newly restored database.
Listing 10-10 shows the script; we restore the full database and full partial backups, leaving the database in a restoring state, then apply the differential partial backup and recover the database.
USE [master]
GO
RESTORE DATABASE [DatabaseForPartialBackups]
FROM DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_FULL.bak'
WITH NORECOVERY
GO
RESTORE DATABASE [DatabaseForPartialBackups]
FROM DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_PARTIAL_Full.bak'
WITH NORECOVERY
GO
RESTORE DATABASE [DatabaseForPartialBackups]
FROM DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_PARTIAL_Diff.bak'
WITH RECOVERY
GO
Listing 10-10:
Once again, check the output from the script to make sure everything looks as it should, and then rerun Listing 10-9 to verify that there are now three more rows in the MainData table, for a total of nine rows, and still only three rows in the ArchiveData table.
backups, provided the full file backup contains the primary filegroup, with the database system information. This is useful if we need to recover a specific table that exists in the read-write filegroups, or we want to view the contents of the backup without restoring the entire database.
-- restore the read-write filegroups
RESTORE DATABASE [DatabaseForPartialBackups]
FROM DISK = N'C:\SQLBackups\Chapter10\DatabaseForPartialBackups_PARTIAL_Full.bak'
WITH RECOVERY, PARTIAL
GO
Listing 10-11:
Now, we should have a database that is online and ready to use, but with only the read-write filegroup accessible, which we can verify with a few simple queries, shown in Listing 10-12.
USE [DatabaseForPartialBackups]
GO
SELECT  ID ,
        Message
FROM    MainData
GO
SELECT  ID ,
        Message
FROM    ArchiveData
GO
Listing 10-12:
The script attempts to query both tables and the output is shown in Figure 10-5.
We can see that we did pull six rows from the MainData table, but when we attempted to pull data from the ArchiveData table, we received an error, because that filegroup was not part of the file we used in our restore operation. We can see the table exists and even see its structure, if so inclined, since all of that information is stored in the system data, which was restored with the primary filegroup.
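To see which files actually came online, rather than inferring it from a failed query, we can ask the catalog views. The following is a minimal sketch, offered as an extra check and not part of the original example. After a restore like the one above, the file in the Archive filegroup would be expected to report a state other than ONLINE (typically RECOVERY_PENDING), but treat that as something to confirm on your own system.

```sql
USE [DatabaseForPartialBackups]
GO
SELECT  df.name AS logical_file_name ,
        fg.name AS filegroup_name ,   -- NULL for the log file, which belongs to no filegroup
        df.type_desc ,
        df.state_desc ,               -- ONLINE for restored files
        df.is_read_only
FROM    sys.database_files AS df
        LEFT JOIN sys.filegroups AS fg
            ON df.data_space_id = fg.data_space_id
GO
```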
Listing 10-13: A SQL Backup script for full partial and differential partial backups.
The commands are very similar to the native commands, and nearly identical to the SQL Backup commands we have used in previous chapters. The only addition is the same new option we saw in the native commands earlier in this chapter, namely READ_WRITE_FILEGROUPS. Listing 10-14 shows the equivalent restore commands for partial backups; again, they are very similar to what we have seen before in other restore scripts. We restore the last full database backup leaving the database ready to process more files. This will restore all of the read-only data, and leave the database in a restoring state, ready to apply the partial backup data. We then apply the full partial and differential partial backups, and recover the database.
-- full database backup restore
EXECUTE master..sqlbackup '-SQL "RESTORE DATABASE [DatabaseForPartialBackups]
FROM DISK = ''C:\SQLBackups\Chapter10\DatabaseForPartialBackups_FULL.sqb''
WITH NORECOVERY"'

-- full partial backup restore
EXECUTE master..sqlbackup '-SQL "RESTORE DATABASE [DatabaseForPartialBackups]
FROM DISK = ''C:\SQLBackups\Chapter10\DatabaseForPartialBackups_Partial_Full.sqb''
WITH NORECOVERY"'

-- differential partial backup restore
EXECUTE master..sqlbackup '-SQL "RESTORE DATABASE [DatabaseForPartialBackups]
FROM DISK = ''C:\SQLBackups\Chapter10\DatabaseForPartialBackups_Partial_Diff.sqb''
WITH RECOVERY"'
Listing 10-14:
Listing 10-15:
Do you spot the mistake? Figure 10-6 shows the resulting SQL Server error messages.
The first error we get in the execution of our script is the error "File DatabaseForPartialBackups is not in the correct state to have this differential backup applied to it." This is telling us that the database is not prepared to process our second restore command, using the differential partial backup file. The reason is that we have forgotten to process our partial full backup file. Since the partial full file, not the full database file, acts as the base for the partial differential, we can't process the partial differential without it. This is why our database is not in the correct state to process that differential backup file.
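When it is unclear which backup a differential depends on, the backup history in msdb can help. The query below is a sketch of one way to list the backups taken for our example database, along with their types and files; it is an addition for illustration and assumes the history for these backups has not been purged from msdb.

```sql
USE [msdb]
GO
SELECT  bs.backup_set_id ,
        bs.type ,                    -- D = full, I = differential, P = partial, Q = differential partial
        bs.backup_finish_date ,
        bs.backup_set_uuid ,
        bs.differential_base_guid ,  -- where recorded, the backup_set_uuid of the differential's base
        bmf.physical_device_name
FROM    dbo.backupset AS bs
        INNER JOIN dbo.backupmediafamily AS bmf
            ON bs.media_set_id = bmf.media_set_id
WHERE   bs.database_name = N'DatabaseForPartialBackups'
ORDER BY bs.backup_finish_date
GO
```

Reading the rows in date order makes a missing step in the restore sequence, such as a skipped partial full backup, easier to spot.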
Summary
You should now be familiar with how to perform both partial full and partial differential backups and be comfortable restoring this type of backup file. With this, I invite you to sit back for a moment and reflect on the fact that you now know how to perform all of the major, necessary types of SQL Server backup and restore operation. Congratulations!
The book is over, but the journey is not complete. Backing up databases and performing restores should be something all DBAs do on a very regular basis. This skill is paramount to the DBA and we should keep working on it until the subject matter becomes second nature.
Nevertheless, when and if disaster strikes a database, in whatever form, I hope this book, and the carefully documented and tested restore strategy that it helps you generate, will allow you to get that database back online with an acceptable level of data loss, and minimal down-time.
Good luck!
Shawn McGehee
Appendix A: SQL Backup Pro Installation and Configuration SQLDBABundle.exe, so double-click the file, or copy the file to an uncompressed folder, in order to begin the installation.
Figure A-1:
There is the option to install several DBA tools from Red Gate, but here you just want to install SQL Backup (current version at time of writing was SQL Backup 6.5), so select that tool, as shown in Figure A-2, and click the Next button to continue.
On the next page, you must accept the license agreement. This is a standard EULA and you can proceed with the normal routine of just selecting the "I accept" check box and clicking Next to continue. If you wish to read through the legalese more thoroughly, print a copy for some bedtime reading. On the next screen, select the folder where the SQL Backup GUI and service installers will be stored. Accept the default, or configure a different location, if required, and then click Install. The installation process should only take a few seconds and, once completed, you should see a success message and you can close the installer program. That's all there is to it, and you are now ready to install the SQL Backup services on your SQL Server machine.
Figure A-3:
Once the Add SQL Server screen opens, you can register a new server. Several pieces of information will be required, as shown below.
SQL Server: This is the name of the SQL Server to register.
Authentication type: Choose to use your Windows account to register or a SQL Server login.
User name: SQL Login user name, if using SQL Server authentication.
Password: SQL Login password, if using SQL Server authentication.
Remember password: Check this option to save the password for a SQL Server login.
Native Backup and restore history: Time period over which to import details of native SQL Server backup and restore activity into your local cache.
Install or upgrade server components: Leaving this option checked will automatically start the installation or upgrade of a server's SQL Backup services.
Having filled in the General tab information, you should see a window similar to that shown in Figure A-4.
There is a second tab, Options, which allows you to modify the defaults for the options below.
Location: You can choose a different location so the server will be placed in the tab for that location.
Group: Select the server group in which to place the server being registered.
Alias: An alias to use for display purposes, instead of the server name.
Network protocol: Select the SQL Server communication protocol to use with this server.
Network packet size: Change the default packet size for SQL Server communications in the SQL Backup GUI.
Connection time-out: Change the length of time that the GUI will attempt communication with the server before failing.
Execution time-out: Change the length of time that SQL Backup will wait for a command to start before stopping it.
Accept all the defaults for the Options page, and go ahead and click Connect. Once the GUI connects to the server, you will need to fill out two more pages of information about the service that is being installed.
On the first page, select the account under which the SQL Backup service will run. It will need to be one that has the proper permissions to perform SQL Server backups and to execute the extended stored procedures that it will install in the master database, and it will need to have the sysadmin server role, for use by the GUI. You can use a built-in system account or a service account from an Active Directory pool of users (recommended, for management across a domain). For this example, use the Local System account. Remember, too, that in SQL Server 2008 and later, the BUILTIN\Administrators group is no longer, by default, a sysadmin on the server, so you will need to add whichever account you are using to the SQL Server to make sure you have the correct permissions set up. Figure A-5 shows an example of the security setup for a typical domain account for the SQL Backup service on a SQL Server.
On the next page, you will see the SQL Server authentication credentials. You can use a different account from the service account, but it is not necessary if the permissions are set up correctly. Stick with the default user it has selected and click Finish to start the installation. Once that begins, the installation files will be copied over and you should, hopefully, see a series of successful installation messages, as shown in Figure A-6.
You have now successfully installed the SQL Backup service on a SQL Server. In the next section, you'll need to configure that service.
File management
The File Management tab allows you to configure backup file names automatically, preserve historical data and clean up MSDB history on a schedule.
The first option, Backup folder, sets the default location for all backup files generated through the GUI. This should point to the normal backup repository for this server so that any backups that are taken, especially if one is taken off the regular backup schedule, will end up in the right location. The next option, File name format, sets the format of the auto-generated name option that is available with SQL Backup, using either the GUI or T-SQL code. Of course, it is up to you how to configure this setting, but my recommendation is this:
<DATABASE>_<DATETIME yyyymmdd_hhnnss>_<TYPE>
Using this format, any DBA can very quickly identify the database, what time the backup was taken, and what type of backup it was, and it allows the files to be sorted in alphabetical and chronological order. If you store in a shared folder multiple backups from multiple instances that exist on the same server, which I would not recommend, you can also add the <INSTANCE> tag for differentiation. The next option, Log file folder, tells SQL Backup where on the server to store the log files from SQL Backup operations. Also in this section of the screen, you can configure log backup management options, specifying for how long backups should be retained in the local folder. I would recommend keeping at least 90 days' worth of files, but you could keep them indefinitely if you require a much longer historical view. The final section of the screen is Server backup history, which will clean up the backup history stored in the msdb database; you may be surprised by how much historical data will accumulate over time. The default is to remove this history every 90 days, which I think is a little low. I would keep at least 180 days of history, but that is a choice you will have to make based on your needs and regulations. If the SQL Backup server components are installed on an older machine which has never had its msdb database cleaned out, then the first time SQL Backup runs the msdb "spring clean," it can take quite a long time and cause some blocking on the server. There are no indexes on the backup and restore history tables in msdb, so it's a good idea to add some.
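For comparison, the same kind of history cleanup can be run by hand with the native msdb procedure sp_delete_backuphistory. The sketch below is illustrative only: the 180-day figure simply follows the retention suggested above, and this is not necessarily how SQL Backup performs its own cleanup.

```sql
USE [msdb]
GO
-- Remove backup and restore history older than roughly 180 days;
-- adjust the retention period to your own needs and regulations
DECLARE @oldest_date DATETIME = DATEADD(DAY, -180, GETDATE())
EXEC dbo.sp_delete_backuphistory @oldest_date = @oldest_date
GO
```

On a server with years of accumulated history, it may be gentler to purge in several steps, working forward from the oldest data, than to delete everything in one pass.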
Once you are done configuring these options, the window should look similar to, though probably not identical to, that shown in Figure A-8.
Email settings
The Email Settings tab will allow you to configure the settings for the emails that SQL Backup can send to you, or to a team of people, when a failure occurs in, or warning is raised for, one of your backup or restore operations. It's a very important setting to get configured correctly, and tested. There are just five fields that you need to fill out.
SMTP Host: This is the SMTP server that will receive any mail SQL Backup needs to send out.
Port: The standard port is 25; do not change this unless instructed to by your email administration team.
Username: Email account user name. This is not needed if your SQL Server is authorized by Exchange, or some other email server, to relay email from an address. Talk to your email admins about getting this set up for your SQL Servers.
Password: Email account password.
Send from: This is the actual name of the email account from which the email will be sent. You can use one email account for all of your SQL Servers (e.g. SQLBackups@MyCompany.com), or use an identifiable account for each server (e.g. MyServer123@MyCompany.com). If your SQL Servers are authorized to relay email without authentication, you can just use the second example without having to actually set up the mail account. That is also something to discuss with your email administrators.
Once this screen is configured, it should look similar to that shown in Figure A-9. Always use the Send Test Email button to send a test email and make sure everything was set up correctly.
You are now installed and ready to start with the SQL Backup sections of the book! As noted at the start, if you have any trouble getting SQL Backup installed and running properly, visit the support section of the Red Gate site. There are many FAQs on common issues and a full message board with Red Gate support staff monitoring all of your questions and concerns.