We use SQL AAG’s at work for our production databases. There are 3 servers in the AAG, 2 in synchronous replication, one in Async. We have a number of SQL Jobs that need to run, usually just on the primary replica, but occasionally on the other replica’s.
We also want to be able to failover the jobs should the primary server go down, so that jobs that we need to just run (like backups) continue to run even when 1 or 2 nodes go offline. SQL doesn’t provide anything out of the box, so what are the options?
When researching how to setup SQL Agent, the best suggestion seems to be to purchase an additional SQL Standard edition license and use that server to just run SQL Agent. Every job is then managed from this one server, running jobs against the servers it requires to run against. The problem with this approach is pricing – you have to license a whole Standard Edition SQL server which will cost about US$7,000 (possibly less after Volume License Discounts), or we might be able to license using the CAL model – not sure how many CAL’s would be required, that’s a question for the licensing guru’s.
Either way, it’s a lot of money for what is essentially a Task Scheduler, so we went looking for alternatives. The two that presented themselves were:
- Use Windows Task Scheduler to run a whole lot of commands using either SQL Command line, or writing a basic application. This option would require us to rebuild a lot of what SQL Agent does naturally.
- Replicate the jobs between all 3 servers.
We chose to replicate the jobs. What do we have to do then to make this happen? There are 3 things that need to happen:
- Create a job that replicates to all of the SQL Agent Jobs from the primary to all the other servers
- Each job in SQL Agent on every server needs to be written in a way that it can run all of the time on every server
- Automatically created jobs (for us, SSRS jobs) need to disable themselves if they aren’t on the primary. We only realised after going live and turning on monitoring that the SSRS jobs would try to run on the secondary replicas, but fail because SSRS was in read-only mode on the server.
In the next few blog posts, I’ll go through the steps that we have for the Job that replicates all other jobs between servers.