Automated MongoDB Restore with PowerShell

PowerShell 3 & MongoDB 2.6

As I’ve mentioned before, running MongoDB on Windows presents its own set of challenges, since MongoDB is, for all intents and purposes, Linux native. Recently, I was tasked with creating an automated process in a Windows environment that would:

  • Spin up on a “clean” server at a specific time.
  • Restore a dumped database.
  • Report any errors.
  • Clean up afterwards.

The purpose of this is twofold. We want to continuously verify the backups we are making, and we want to test the system that would be our restore point should the primary go down. Taken together, this tests all of our backup and restore processes. If and when we need to restore a backup, we will know that the backup system works and that the backups are valid (mongorestore does validation on restore).

Since we’re on a Windows system and didn’t want to install anything we didn’t have to, the obvious choice was to script it in PowerShell 3, run it as a Scheduled Task and have it report to a custom Event Log. The Event Log would allow an email alert to be sent out when an error was recorded.

PowerShell

PowerShell (I used version 3, since it was the lowest version with all the event-logging features I needed) has a few idiosyncrasies that presented more than one roadblock. Enumerating all of them would not be terribly informative, as they’re better documented elsewhere, so I’ve narrowed it down to the few that are specific to MongoDB itself.

First of all, launching mongod directly from PowerShell blocks the script until the instance shuts down. To get around this, I used two scripts, executed at different times. The first script sets up the environment and starts mongod with the given parameters.

CMD /C "mongod.exe --port 27017 --smallfiles --dbpath $restoreDir"

The second script does the restore and error reporting, and then shuts down the instance. At that point, the first script resumes, cleaning up and exiting.

CMD /C "mongo admin --port 27017 --eval `"db.shutdownServer()`""
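Because the two scripts are only coordinated by the clock, one option is for the second script to confirm that mongod is accepting connections before it starts the restore. The following is a minimal sketch, assuming the instance listens on localhost port 27017 as in the commands above. Test-PortOpen and Wait-ForMongod are hypothetical helper names used for illustration, not part of MongoDB or PowerShell.

```powershell
# Hypothetical helper: returns $true if a TCP connection to the port succeeds.
function Test-PortOpen {
    param(
        [string]$HostName = "127.0.0.1",
        [int]$Port = 27017
    )
    $client = New-Object System.Net.Sockets.TcpClient
    try {
        $client.Connect($HostName, $Port)
        return $true
    } catch {
        # Connection refused or timed out: nothing is listening yet.
        return $false
    } finally {
        $client.Close()
    }
}

# Hypothetical helper: polls the port until it opens or the timeout expires.
function Wait-ForMongod {
    param(
        [int]$Port = 27017,
        [int]$TimeoutSeconds = 60,
        [int]$IntervalSeconds = 2
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    while ((Get-Date) -lt $deadline) {
        if (Test-PortOpen -Port $Port) { return $true }
        Start-Sleep -Seconds $IntervalSeconds
    }
    return $false
}
```

The second script could call Wait-ForMongod first and, if it returns $false, log an error and exit instead of running mongorestore against an instance that never came up.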

You’ll notice that I had to call CMD and then run mongo/mongod within that. This creates a problem when error reporting as the stderr doesn’t always get reported properly to PS from CMD (for reasons I won’t go into here, but if you’re interested check out these two StackOverflow articles here and here).

To get around this, stderr needs to be redirected into stdout and captured in a string. Also, mongorestore needs the --quiet switch so that it doesn’t write extra informational output to stderr.

$someerror = CMD /C "mongorestore.exe --port 27017 $dumpDir --quiet" 2>&1

Then later, push that error string, if it exists, into the Event Log. Checking for it first means that only real problems get logged, rather than every piece of informational output.

if ($someerror) {
    Write-EventLog -LogName Application -Source "MongoDB Auto-Restore" -EntryType Warning -EventId 10010 -Message "Mongo Auto-Restore: `n$someerror"
}
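Two details are easy to trip over here. Write-EventLog fails if the source has not been registered on the machine, and output captured with 2>&1 comes back as a collection of lines (some of them error records) rather than a single string. The sketch below shows one way to handle both, assuming the same source name as above. ConvertTo-LogMessage and Register-RestoreEventSource are hypothetical helper names.

```powershell
# Hypothetical helper: flatten captured output (strings or error records)
# into one newline-separated string for the event message.
function ConvertTo-LogMessage {
    param($Output)
    $lines = @($Output) | ForEach-Object { "$_" }
    return ($lines -join "`r`n").Trim()
}

# Hypothetical helper: register the event source once, before the first
# Write-EventLog call. This needs an elevated (administrator) session.
function Register-RestoreEventSource {
    param([string]$Source = "MongoDB Auto-Restore")
    if (-not [System.Diagnostics.EventLog]::SourceExists($Source)) {
        New-EventLog -LogName Application -Source $Source
    }
}
```

With these in place, the message passed to Write-EventLog could be built from ConvertTo-LogMessage $someerror, which keeps each line of mongorestore output on its own line in the event.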

Also, you’ll want to delete the mongod.lock file before running the restore. If this file exists in the dump directory, it will cause the restore to fail. You’ll get an error on the order of “mongorestore does not know what to do with this file.”

if (Test-Path $dumpDir\mongod.lock) {
    Remove-Item $dumpDir\mongod.lock
}

A mongod.lock file usually means that the instance was shut down improperly, but can be generated for other reasons as well. For the purposes of testing a restore environment, it’s superfluous and can be outright deleted.

If you’d like to see my two scripts in their entirety, you can get them from my github.

Scheduler

The next task was to get the scripts running on a schedule, which is less trivial than it sounds. Task Scheduler is simple: it just launches an executable with some arguments and, optionally, a starting directory. Since mongo, mongorestore and mongod all produce output, you’ll have to run PowerShell as the executable and pass it the script file, with explicit switches to not display anything and to allow scripts to run on the local machine.

Action:
Start a Program
Program/script:
PowerShell
Arguments:
-noprofile -nologo -noninteractive -executionpolicy bypass -file .\autoMongoRestore_Run.ps1
Start in:
C:\Script\Directory

You’ll also need to give the account that executes the script Local Admin permissions (to run scripts, create directories, etc.) as well as various other abilities through gpedit.msc. The basics of this are outlined in this forum post.

Testing

Testing this setup is pretty straightforward. If you don’t have production dumps to test with initially, or using them is impractical, you can spin up a database, insert a few objects and then dump it. The size and number of the databases shouldn’t matter.

First, restore the database(s) manually: spin up your own mongod and restore to that instance. Then set up your scripts to restore this test database and see how it goes.

Next, make sure you test for corruption in both your indexes and the database itself. This is as simple as opening the index or database file and replacing a few } or " characters with something else. When the restore runs, it should abort and throw an error. If you’re using the scripts I provided above, it will write this to the event log.
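If you would rather not hand-edit dump files, the corruption step can be scripted. This is a minimal sketch, assuming you point it at a copy of a file from a test dump. Copy-CorruptedFile is a hypothetical helper name, and the choice of replacing } with X is arbitrary.

```powershell
# Hypothetical helper: writes a copy of a file in which the first few '}'
# bytes are replaced with 'X', leaving the original file untouched.
# Use full paths: the .NET file methods resolve relative paths against the
# process working directory, which may differ from the PowerShell location.
function Copy-CorruptedFile {
    param(
        [string]$Path,
        [string]$Destination,
        [int]$Count = 3
    )
    $bytes = [System.IO.File]::ReadAllBytes($Path)
    $replaced = 0
    for ($i = 0; $i -lt $bytes.Length -and $replaced -lt $Count; $i++) {
        if ($bytes[$i] -eq 0x7D) {    # 0x7D is '}'
            $bytes[$i] = 0x58         # 0x58 is 'X'
            $replaced++
        }
    }
    [System.IO.File]::WriteAllBytes($Destination, $bytes)
    # Return how many bytes were changed so the caller can confirm
    # that the file was actually damaged.
    return $replaced
}
```

If the function returns 0, the file contained no } characters and the copy is identical to the original, so the restore test would not be exercising the failure path.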

Lastly, you’re going to want to restore with your actual production data. Depending on your database size, this may take a while, but it’s worth it to see if the backup machine can handle the load and has enough space.

Restore Complete

While this is my method for testing the Windows restore environment, I’d be interested to hear whether anyone else has tried this sort of procedure. I like the ease of this setup, but the fact that two different scripts must be run at differing times seems a bit janky. I’ve tried other ways of starting mongod and then continuing on, such as start /b or running mongod with --service, but neither was stable enough to repeat consistently.

Again, you can check out my scripts on github, or leave me a comment here with any efficiency or method improvements. I’d be interested in seeing them.

-CJ Julius