Failing Mirror Backups

Failing Backups! Is there anything more worrying for a DBA?

Boot Failure
We need the backups!

Recently we have been having multiple failures a week for our full and differential backups. We use Ola Hallengren’s backup scripts, and unless you are using a 3rd party product, you should be using Ola’s scripts as they are amazing and so simple to use.

We have a requirement to ship our full and differential backups to a secondary location to be used for testing. Ola’s scripts have a simple parameter that you can set called @MirrorDirectory along with parameters to control clean up time and so on. It was super simple to setup, we just added the Mirror Directory to the backups we wanted to copy and it was done.

However, over the last few weeks we’ve been having network issues when connecting to the secondary location. This causes the job to fail. Because we are backing up multiple databases in the same command if the first one fails then the subsequent databases are not backed up.

After working with the network team, they said it must be something to do with SQL and how it copies the data across. I needed a way to prove that it wasn’t a SQL problem, and more importantly I needed a way to stop our backups failing.

I decided to change the SQL job to no longer use the mirror, and instead add a PowerShell step to the job after all the backups to the primary location were done. The step finds all of the backup files in the primary location and copies them across with a simple Power Shell Copy-Item command.

$query = @"
;with lastBackups AS
(
 SELECT MAX(media_set_id) as media_set_id
 ,b.[database_name]
 FROM msdb.dbo.backupset b
 WHERE type = 'D'  --Differential or Full
 AND b.database_name in ('Database1', 'Database2')
 GROUP BY b.database_name
)
SELECT lastBackups.database_name, 'Microsoft.Powershell.CoreFileSystem::' + MAX(physical_device_name) as FilePath
FROM msdb.dbo.backupmediafamily
JOIN lastBackups
ON lastBackups.media_set_id = backupmediafamily.media_set_id
group by lastBackups.database_name


"@

$results = Invoke-Sqlcmd -Query $query -ServerInstance $(ESCAPE_SQUOTE(SRVR))

$destinationPath = "Microsoft.Powershell.CoreFileSystem::\serverpath"

foreach ($row in $results)
{
 $resolvedDestinationPath = $destinationPath + $row.database_name +"_Full" + [IO.Path]::GetExtension($row.FilePath)
 Copy-Item -Path $results.FilePath -Destination $resolvedDestinationPath
}

2 things caught me out:

  1. To use the file system you have to break out of the normal SQL namespace in PowerShell and prepend the source and destination file paths with Microsoft.Powershell.CoreFileSystem::
  2. To get the server name PowerShell running inside SQL Agent has a key word SRVR

Other than those 2 things, it’s pretty simple. Select the last full or differential backup file and use PowerShell to copy it over.

This didn’t actually fix the problem of the secondary location failing, but the networking team agreed it wasn’t a SQL problem and we didn’t miss out on doing the full and differential backups that we require in case of disaster. Rather than investigate they created a new secondary storage location which has been working fine ever since, but we still copy via PowerShell so that if it does fail we know we have the primary backups ready to go.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s