SQL Server has 3 different types of built-in backups: Full, Differential and Log. When working out how you’re going to backup your database, you need to understand what the 3 different types are intended for, so you can work out what you need to do.
A full backup is the simplest type to understand – every bit of data stored in the data file is backed up – all of your row data, your indexes, all of your stored procedures, views, security, functions and so on, are saved into your backup file.
A differential backup is a copy of everything in your data files that has changed since your last full backup. Every time data is inserted, deleted, updated, the page that the data is stored on is marked as dirty. Similarly, when any of your schema changes, it is marked as dirty – if you add or remove a column on a table, create a new stored proc, edit the definition of a view, give access to a user – any change you make. SQL Server marks that object/page as requiring a backup.
When a differential backup runs, SQL finds all of the data and schema objects that it marked as requiring a backup and backs them up. Here’s the bit that a lot of people miss though, when it backs it up it does not mark those objects as backed up. Let’s look at an example. My database has a table called Names – with 2 columns – ID and Name. I create the table and do a full backup.
Next, I insert 5 rows into my table:
Now if I do a differential backup, it will contain those 5 rows – the page that they’re stored on is marked as dirty, so it backs them up. Now lets insert another 5 rows:
If we do another differential backup now, it will contain all 10 rows. There is no check for data already contained in a differential backup.
It’s important 2 understand this for 2 reasons:
- You differential backups will continue to grow in size until you do your next full backup. Your differential backup is not just the differences since your last differential, it’s all of the changes since your last full backup. Which means:
- If you need to restore using a differential backup, you don’t need to restore all of the differentials since your last full backup, simply restore your last differential to get all of the changes.
Log backups are a backup of the log file instead of the data file. There’s a lot to learn about the log file, but it is basically a sequential log of all changes to the database as they happen. To make writes faster, SQL writes all of the changes sequentially to a log file – it doesn’t have to find the write place in the file to save it. When it has some spare time later, it commits all of the changes to the data files.
When your database is in full or bulk recovery model, SQL will keep writing to the end of the log file waiting for you to do a log backup. The log backup saves all of the data in the log file to your backup, and then marks that space for re-use, allowing SQL to overwrite your data.
Log backups give you something that Full and Differential backups don’t – they allow you to restore to any point in time. With a full backup, you can only restore the whole database, as it was at the time of the backup. With a differential backup, you can only restore the database as it was at the time of the backup, but, with a log backup, you can choose to restore to any point in time. If someone accidentally updates a whole lot of data, you can restore you database to the point right before the update happened. If someone drops a table, you can restore to the point right before the drop happened. Log backups are the only option that can achieve this for you.
Log backups tend to be quick – SQL knows exactly where the start and end of the log to be backed up is. Where a full backup could take minutes or even hours, log backups can be as quick as a few seconds.
Log backups are also sequential. You have to have every log backup, in order, if you want to restore from them. Going back to our last example, if you run a log backup after you insert the first 5 rows, then only those 5 rows will be in the backup. If you insert another 5 rows and do another log backup, then only those 5 rows will be in the 2nd backup. To get all 10 rows, you need to restore both log files.
The log chain is also important. You cannot restore a log backup until the previous log backup has been restored. If you loose a log backup, then you cannot restore anything after that point, even if you have all of the files after it.
When your database is in simple recovery model, the log file is overwritten as soon as a transaction has been completed – you can’t back it up because the data is constantly overwritten.
These are the built in options for SQL Backups. There are a few other ways to backup SQL. I’ll also discuss in a future post how to think about setting up each of these different backup types so that you can achieve the restore points that you require.