Tag Archives: index

Top 5 New Features in SQL Server 2017 (that I care about)

New Year, New Database Engine

http://icons8.com/
This is either a database or a stack of licorice pancakes.

It’s finally here! A few days ago as of this writing, SQL Server 2017 was released for Windows, Ubuntu, Red Hat and Docker. There are a lot of new things in SQL Server 2017, from Python support in Machine Learning* to better CLR security, but I thought I’d narrow the list down to the changes I’m most interested in.

1. sudo Your Way To A Better SQL Server

SQL SERVER IS ON LINUX NOW! No surprise this is the first thing on my list as I keep going on and on about it. But it’s really here!

Installation is a cinch, especially if you’re a Linux or Unix person. Curl the GPG keys, add the repository, apt-get (if you’re on a real Distro) the installer and run the setup. It’s really that easy.

If you want it even easier, then check out Docker. Slap it in and go!

All the main features are there: the engine (of course), Agent, full-text search, DB Mail, AD authentication, the SQL command-line tools, etc. Pretty much everything you need to get going on Linux, even if you’re in a Windows-dominated environment.

2. It’s Like A Car That Repairs Itself

So this is a big one. With automatic tuning on, SQL Server can detect performance problems, recommend solutions and automatically fix (some) problems. There are two flavors of this, one in SQL Server 2017 and one in Azure; I’ll be talking about the one in SQL 2017 here.

Automatic plan choice correction is the main feature: it watches for query plans that have regressed in performance and, if the feature is enabled, reverts to the last known good plan. As you can see below, a new plan was chosen (4) but it didn’t do so well. SQL 2017 reverted to the old plan (3) automatically and got most of the performance back.

Image Source: Microsoft
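If you want to flip the feature on yourself and see what the engine has spotted, the basic shape looks like this (a quick sketch; run it against the database you care about):

Code:
-- Turn on automatic plan choice correction for the current database
ALTER DATABASE CURRENT SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);

-- See which plan regressions the engine has detected and what it recommends
SELECT reason, score, state
FROM sys.dm_db_tuning_recommendations;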

I’m sure Microsoft will keep expanding this feature, so we can expect to see more from it. Azure already has automatic index tuning in place, so we’ll probably see that in the on-prem version eventually.

3. Indexes That Start, Stop And Then Start Again

This is a feature I didn’t know I needed. Basically, it allows an online index rebuild that has stopped or failed (say, it ran out of disk space) to be resumed. The index build fails, you fix whatever made it fail, and then resume the rebuild, picking up from where it left off. The rebuild will be in the ‘PAUSED’ state until you’re ready to RESUME or ABORT it.

Code:
-- Start a Resumable index rebuild
ALTER INDEX [NCIX_SomeTable_SomeColumn] on [dbo].[SomeTable]
REBUILD WITH (ONLINE=ON, RESUMABLE=ON, MAX_DURATION=60)

-- PAUSE the rebuild:
ALTER INDEX [NCIX_SomeTable_SomeColumn] on [dbo].[SomeTable] PAUSE

/* If you'd like to resume either after a failure or because we paused it. This syntax will also cause the resume to wait 5 minutes and then kill all blockers if there are any. */
ALTER INDEX [NCIX_SomeTable_SomeColumn] on [dbo].[SomeTable]
RESUME WITH (MAX_DURATION= 60 MINUTES,
WAIT_AT_LOW_PRIORITY (MAX_DURATION=5, ABORT_AFTER_WAIT=BLOCKERS))

/*Or if you just want to stop the whole thing because you hate unfragmented indexes */
ALTER INDEX [NCIX_SomeTable_SomeColumn] on [dbo].[SomeTable] ABORT

This is also great if you want to pause a rebuild because it’s interfering with some process. You can PAUSE the rebuild, wait for the transaction(s) to be done and RESUME. Pretty neat.
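If you want to check where a paused rebuild stands before you RESUME it, there’s a catalog view for that (a quick sketch; verify the column names against your instance):

Code:
-- Shows any resumable index rebuilds that are running or paused
SELECT name, state_desc, percent_complete
FROM sys.index_resumable_operations;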

4. Gettin’ TRIM
A cute Sheepy.
There hasn’t been a picture in a while and I was afraid you might be getting bored.

This one is kind of minor, but it excited me a lot because it was one of the first things I noticed as a DBA that made me say ‘Why don’t they have a function that does that?’ (note the single quotes). TRIM will trim a string down based on the parameters you provide.

If you provide no parameters, it will just cut off all the spaces from both sides. It's equivalent to writing RTRIM(LTRIM(' SomeString ')).
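For example, both of these return 'SomeString':

SELECT TRIM(' SomeString ') AS TrimmedString;
SELECT RTRIM(LTRIM(' SomeString ')) AS OldWay;

You can also hand TRIM a list of characters to strip from either end, which is what the example below does.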

Code:
SELECT TRIM( '.! ' FROM ' SomeString !!!') AS TrimmedString;

Output:
SomeString

5. Selecting Into The Right Groups

Another small important change. In previous versions of SQL Server, you could not SELECT INTO a specific Filegroup when creating a new table. Now you can, and it uses the familiar ON syntax to do it.

Code:
SELECT [SomeColumn] INTO [dbo].[SomeNewTable] ON [FileGroup] FROM [dbo].[SomeOriginalTable];

Also, you can now SELECT INTO to import data from Polybase. You know, if you’re into that sort of thing.

Honorable Mentions!

Here are some things that caught my eye, but didn’t really need a whole section explaining them. Still good stuff, though.

Query Store Can Now Wait Like Everybody Else

Query store was a great feature introduced in SQL Server 2016. Now they’ve added the ability to capture wait stats as well. This is going to be useful when trying to correlate badly performing queries and plans with SQL Server’s various waits.

See the sys.query_store_wait_stats catalog view in your preferred database for the deets. Obviously, you’ll need to turn on Query Store first.
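Something like this will get you a quick “who’s waiting on what” list, worst offenders first (a rough sketch; double-check the column names against your instance):

Code:
SELECT TOP (10) plan_id, wait_category_desc, total_query_wait_time_ms
FROM sys.query_store_wait_stats
ORDER BY total_query_wait_time_ms DESC;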

Con The Cat With Strings

Just as minor as TRIM in some people’s books, but this is a great function for CONCAT_WS’ing (it’s not Concatenating, right? That’s a different function) strings with a common separator, ignoring NULLs.

Code:
SELECT CONCAT_WS('-','2017', '09', NULL, '22') AS SomeDate;

Output:
2017-09-22

Get To Know Your Host

sys.dm_os_host_info – This DMV returns host operating system information whether you’re on Windows or Linux. Nothing else, just thought that was neat.
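A quick peek at what your instance is sitting on (a sketch; there are a few more columns than this):

Code:
SELECT host_platform, host_distribution, host_release
FROM sys.dm_os_host_info;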

Get Your Model Serviced

Not something I’m jumping for joy about, but it is a good change of pace (I think). No more Service Packs, only Zuul… er… just Cumulative Updates. Check out my article on it if you want to know more about all the Service Model changes.

And Lots Of Other Things

Of course, there are hundreds of other things in SQL Server 2017 and its related features. I didn’t even touch on the SSIS, SSAS, SSRS, MDS or ML stuff. Check out the shortened list here, broken down by category. Exciting new toys!

-CJ Julius

* Seriously, how do you release a Data Science platform and not include Python? That’s like releasing a motorcycle with only the rear tire. Yes, you can technically use it, but you’re limiting yourself to a customer base with a very selective skill-set.

Automated MongoDB Restore with Powershell

Powershell 3 & MongoDB 2.6

As I’ve mentioned before, running MongoDB on Windows can present a unique set of challenges since MongoDB is, for all intents and purposes, Linux native. Recently, I was tasked with creating a method by which the following could be done in a Windows environment via automation:

  • Be spun up on a “clean” server at a specific time.
  • Restore a dumped database.
  • Report any errors.
  • Clean up afterwards.

The purpose of this is two-fold: we want to continuously verify the backups we were making, and we want to test the system that would be our restore point should the primary go down. In sum, this method exercises all of our backup and restore processes. If/when we needed to restore a backup, we would know the backup system worked and the backups were valid (mongorestore validates the data as it restores).

Since we’re on a Windows system, and we didn’t want to install anything we didn’t have to, the obvious choice was to script it in PowerShell 3, run it as a Scheduled Task and have it report to a custom Event Log source. The Event Log would then allow an email alert to be sent out whenever an error was recorded.

PowerShell

PowerShell (here I used 3, since it was the lowest version with all the event-logging methods I needed) has a few idiosyncrasies that presented more than one roadblock. Enumerating each of these would not be terribly informative, as they’re better documented elsewhere, so I’ve narrowed it down to the few that were specific to MongoDB itself.

First of all, launching a straight mongod instance from PS results in the script stopping until the instance shuts down. To get around this, I used two scripts, executed at different times. The first script sets up the environment and starts mongod  with the given parameters.

CMD /C "mongod.exe --port 27017 --smallfiles --dbpath $restoreDir"

The second script does the restore, error reporting and the shuts down the instance. At that point, the first script resumes, cleaning up and exiting.

CMD /C "mongo admin --port 27017 --eval `"db.shutdownServer()`""

You’ll notice that I had to call CMD and then run mongo/mongod within that. This creates a problem for error reporting, as stderr doesn’t always make it back to PS from CMD properly (for reasons I won’t go into here, but if you’re interested, check out these two Stack Overflow posts here and here).

To get around this, stderr needs to be redirected from the CMD into stdout and captured in a string. Also, mongorestore needs the --quiet switch so it doesn’t dump extra informational output into that string.

$someerror = CMD /C "mongorestore.exe --port 27017 $dumpDir --quiet" 2>&1

Then, later, push that error string into the Event Log if it exists. This keeps routine informational output from being reported as an error.

if ($someerror) {
    Write-EventLog -LogName Application -Source "MongoDB Auto-Restore" -EntryType Warning -EventId 10010 -Message "Mongo Auto-Restore: `n$someerror"
}
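One gotcha with the snippet above: Write-EventLog will throw an error if the event source hasn’t been registered yet, so you’ll need a one-time setup step along these lines (a sketch; the source name simply matches the one used above):

if (-not [System.Diagnostics.EventLog]::SourceExists("MongoDB Auto-Restore")) {
    New-EventLog -LogName Application -Source "MongoDB Auto-Restore"
}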

Also, you’ll want to delete the mongod.lock file when doing the restore. If this file exists in the dump directory, it will cause the restore to fail. You’ll get an error on the order of “mongorestore does not know what to do with this file.”

if (Test-Path "$dumpDir\mongod.lock") {
    Remove-Item "$dumpDir\mongod.lock"
}

A mongod.lock file usually means that the instance was shut down improperly, but can be generated for other reasons as well. For the purposes of testing a restore environment, it’s superfluous and can be outright deleted.

If you’d like to see my two scripts in their entirety, you can get them from my github.

Scheduler

The next task was to get the scripts running on a schedule, which is less trivial than it sounds. Task Scheduler is simple: it just sends arguments, and maybe a path, to an executable. Since mongo/mongorestore/mongod produce output, you’ll have to run PowerShell as the executable and hand it the script file, with explicit instructions not to display anything and to allow scripts to be run on the local machine.

Action:
Start a Program
Program/script:
PowerShell
Arguments:
-noprofile -nologo -noninteractive -executionpolicy bypass -file .\autoMongoRestore_Run.ps1
Start in:
C:\Script\Directory
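If you’d rather create the task from PowerShell instead of clicking through the GUI, something like this should do it (a sketch, assuming the ScheduledTasks module on Server 2012 / Windows 8 or later; the task name and trigger time are just placeholders):

$action = New-ScheduledTaskAction -Execute "PowerShell" -Argument "-noprofile -nologo -noninteractive -executionpolicy bypass -file .\autoMongoRestore_Run.ps1" -WorkingDirectory "C:\Script\Directory"
$trigger = New-ScheduledTaskTrigger -Daily -At 2am
Register-ScheduledTask -TaskName "MongoDB Auto-Restore" -Action $action -Trigger $trigger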

You’ll also need to give the account executing the script Local Admin permissions (to run scripts, create directories, etc.) as well as various other rights through gpedit.msc. The basics of this are outlined in this forum post.

Testing

Testing this setup is pretty straightforward. If you don’t have production dumps you can test with initially, or using them is unreasonable, you can spin up a database, insert a few objects and then dump it. The size and number of the databases shouldn’t matter.

First, restore the database(s) manually: spin up your own mongod and restore to that instance. Then set up your scripts to restore the same test database and see how it goes.

Next, make sure you test for corruption in both your indexes and the database itself. This is as simple as opening the index/db dump files and replacing a few } or " characters with something else. When the restore runs, it should abort and throw an error. If you’re using the scripts I provided above, it will write this to the event log.

Lastly, you’re going to want to restore with your actual production data. Depending on your database size, this may take a while, but it’s worth it to see if the backup machine can handle the load and has enough space.

Restore Complete

While this is my method for testing the Windows restore environment, I’d be interested to hear if anyone else has tried this sort of procedure. I like the ease of this setup, but the fact that two different scripts must be run at differing times seems a bit janky. I’ve tried other ways of starting mongod and then continuing on, such as start /b or running mongod with --service, but neither was stable enough to repeat consistently.

Again, you can check out my scripts on github, or leave me a comment here with any efficiency or method improvements. I’d be interested in seeing them.

-CJ Julius

A Few Useful Commands for MongoDB

MongoDB NoSQL database

About four months ago I was handed an already existing MongoDB infrastructure and told to take care of it. It’s been a long road, and I’ve had to train up quite quickly on this system.

While I’m certainly no expert at this point, it is what I’ve been spending the bulk of my time doing. Now that I’ve got some actual administration time under my belt, I thought I’d relay a few commands and tricks that I wish I’d known from the outset.

db.setProfilingLevel

db.setProfilingLevel(level, slowms)

This turns on the database profiler in MongoDB. The level parameter sets the amount of profiling (logging) you want:

0 = Off
1 = Only slow operations
2 = All operations

The slowms is optional, but sets the threshold for what is considered a “slow” operation. It is 100ms by default and generally you can leave it there.

Keep in mind that if you are using the cloud-based MMS solution from MongoDB, this will send information to them. If your business falls under HIPAA or some similar security requirements, you may not want to use this.

Note: You can find out if profiling is enabled by using db.getProfilingStatus()
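Putting the two together (output is roughly what you’d see; your values will differ):

db.setProfilingLevel(1, 200)   // log only operations slower than 200ms
db.getProfilingStatus()        // { "was" : 1, "slowms" : 200 }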

show profile

show profile

This, when used in conjunction with the previous command, will show the last 5 slow operations. Very useful if you think there might be an errant query or index problem slowing down your DB.
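Under the hood this is, as far as I can tell, just reading from the system.profile collection, so if five entries aren’t enough you can query it directly (a sketch):

// the most recent profiled operations, newest first
db.system.profile.find().sort({ ts: -1 }).limit(10).pretty()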

db.serverCmdLineOpts()

db.serverCmdLineOpts()

db.serverCmdLineOpts() will give you output similar to this. The “parsed” section is probably what you’re most interested in.

I was handed an already existing infrastructure and had to do a little bit of walking backwards through it to put my own configuration documentation together.

This command tells you which command-line options were used when this particular mongod instance was started. It includes everything entered on the command line as well as anything set in the config file.

Collection Schema

var findschema = db.Collection.findOne();
for (var key in findschema) { print(key); }

While MongoDB is a schemaless* database, it does help to know what is being stored where. To get a rough layout of the database, you can use the two lines above to print the top-level keys of a collection. Note that findOne() only samples a single document, so keys that appear only in other documents won’t show up. Make sure you swap out the Collection portion for the collection you want to inspect.

logRotate

db.runCommand( { logRotate : 1})

If you’re running MongoDB on Linux, then you have a few more choices via syslog, but if you’re like me and using it in Windows, then this is really your only option to rotate the logs.

That is, it stops writing to the current log file and starts writing to a new one, renaming the old file with a timestamp appended. This rotates all logs as far as I can tell, including audit logs if you’re using those. I personally run this command once a month just to keep the files manageable.

More to Come

This is just a quick starter set that I had specifically noted as “things I wished I knew when I started,” and I’ve got quite a few more. In the future I’ll delve into some of my simpler metrics and stat commands that I use in conjunction with MMS.

-CJ Julius

* It’s not really schema-less as MongoDB does use a schema, but it doesn’t enforce that schema. Objects can be added and removed as needed.