I’m always looking for a way to save space in SQL Server. From archiving old data to just flat out deleting unused objects, I take great joy in removing superfluous stuff. The less junk in the system, the easier it is to focus on the things that matter.
…and fit it in a 10 kg bag
The biggest useless space eaters are tables that are (supposedly) no longer used. I could script them with data to a file, but what if they’re 100+GB? I could also back them up to another DB and then drop them from the database; that would certainly free up the space in the original DB. But what if they’re needed for some process I was unaware of, and we can’t wait for the time it takes to restore or move them back?
That was my conundrum. So, I decided to implement a process that looks at a single DBA-controlled schema and compresses every table created prior to a certain date. I can TRANSFER a superfluous table to that schema and leave it; at some point in the future, a job comes along and compresses it.
If the data is needed within X days, the table can easily be transferred back to the original schema, no harm, no foul. I also save space, as tables are automatically PAGE compressed and can be decompressed if needed. De/Compression is really fast in SQL Server.
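The whole workflow is plain T-SQL. Here’s a sketch with a hypothetical table name (`BigUnusedTable` is my invention; `Cleanup` is the schema named below):

```sql
-- Park a (supposedly) unused table in the DBA-controlled schema:
ALTER SCHEMA Cleanup TRANSFER dbo.BigUnusedTable;

-- What the cleanup job eventually does to it:
ALTER TABLE Cleanup.BigUnusedTable REBUILD WITH (DATA_COMPRESSION = PAGE);

-- And undoing both steps if the table turns out to be needed after all:
ALTER TABLE Cleanup.BigUnusedTable REBUILD WITH (DATA_COMPRESSION = NONE);
ALTER SCHEMA dbo TRANSFER Cleanup.BigUnusedTable;
```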
It’s Compression Time
So, I created this super-simple stored procedure, prCompressCleanupTables (click for GitHub link). It takes the following parameters:
@CompressBeforeDate – A DATETIME parameter; tables created before this date are compressed (based on the table’s create date)
@Schema – A sysname parameter that takes the schema name you want to compress. Keep in mind that the same schema is used in every database, so make sure it’s unique (I use the ‘Cleanup’ schema personally, hence the name).
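A call matching my own setup would look something like this (the parameter names come from the list above; the exact signature may differ slightly from the GitHub version):

```sql
-- Compress everything in the Cleanup schema created more than 60 days ago
EXEC dbo.prCompressCleanupTables
     @CompressBeforeDate = DATEADD(DAY, -60, GETDATE()),
     @Schema             = N'Cleanup';
```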
It skips the following databases by default: master, tempdb, model, msdb, distribution, ReportServer, SSISDB. It will skip any database that is in any state other than ONLINE, too.
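The database filter amounts to something like this (a sketch; the procedure’s actual query may differ):

```sql
-- Only ONLINE databases, minus the system/utility databases the proc skips
SELECT name
FROM   sys.databases
WHERE  state_desc = N'ONLINE'
  AND  name NOT IN (N'master', N'tempdb', N'model', N'msdb',
                    N'distribution', N'ReportServer', N'SSISDB');
```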
Also remember that data compression is locked to certain editions of SQL Server (Enterprise-only prior to SQL Server 2016 SP1), and requires 2008+ (you really need to upgrade if being on 2008 is a limiting factor).
I’m Also A Client
I have this implemented as a job on several servers, which checks weekly for new tables to compress in the appropriate databases. It checks for any tables created prior to GETDATE() – 60. I have to say that it runs very quickly, even on large tables.
Always moving forward, here at KnowledgeHunter Corp! (Note: not a real corporation) Just pushed the new SDIMS 2.3 to GitHub. There are changes in just about every corner, but the big overall change is the addition of job information in the data pull.
If you have no idea what SDIMS is or what I’m yammering about, here’s the run-down. Otherwise, read on for the 2.3 update news.
Dey took er jerbs
The newest feature is the ability to inventory pretty thorough job information. Well, the stuff I’m always looking for, anyway. Also included is some extended information, which right now basically includes just a description with a few other identifying columns. Future versions may gather more data and put it here. PowerShell handles it well, even with several thousand rows/jobs.
Here is the kind of information that you can expect to see:
Is the Job enabled?
Is the Schedule enabled?
Schedule Type, Occurrence, Recurrence (if applicable) and Frequency
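Under the hood, that kind of job and schedule information lives in msdb. Here’s a sketch of the sort of query that pulls it (my query, not necessarily the one SDIMS actually uses):

```sql
SELECT j.name    AS JobName,
       j.enabled AS JobEnabled,
       s.name    AS ScheduleName,
       s.enabled AS ScheduleEnabled,
       s.freq_type,                  -- schedule type (daily, weekly, etc.)
       s.freq_interval,              -- occurrence within that type
       s.freq_subday_type,           -- recurrence unit (minutes, hours...)
       s.freq_subday_interval        -- frequency within the day
FROM   msdb.dbo.sysjobs            AS j
LEFT JOIN msdb.dbo.sysjobschedules AS js ON js.job_id      = j.job_id
LEFT JOIN msdb.dbo.sysschedules    AS s  ON s.schedule_id  = js.schedule_id;
```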
This information has been added to the ‘Full Inventory’, so be aware that it has a lot of new stuff in it, too.
Full disclosure: I used heavily modified versions of a few queries from Dattatrey Sindol to pull the jobs data.
Fixed a few bugs:
The isProduction column no longer displays in the Instances list. That feature was removed from everywhere else before 1.0; the column is finally gone, too.
The HIPAA-level feature is now completely removed. I never got it working right, and I decided that in the future I’ll go with something a bit more general, maybe just ‘priority’ or something.
Fixed a few typos. Me spel gud now.
Too many varchar(max) columns. I know. I know.
So, how do you get this? Well…
If you’ve never used it before, I would suggest going through the walkthrough I put out a while ago. I may in the future build a more simplified version for people who just want to install and don’t care about the behind-the-scenes. (UPDATE: I did just that, here)
Elseif you’re on a version prior to 2.1, then you should probably just drop all the SQL objects and rebuild with the new 2.3 items. This includes the PowerShell datapull and frontend pieces; they’re the only two PowerShell pieces.
Elseif you’re on 2.1 right now, you can just replace the items that were changed. I have created a script that will do this for you, because that’s the kind of guy I am. You may need to run the script twice because I didn’t check for object existence; running it multiple times won’t hurt. It doesn’t touch the Servers table, but it does drop the InstanceList, so if you have that statically assigned, you’ll want to back it up and then re-insert the data with the removed columns left out.
And Here’s A Count of All the Bolts
Lastly, here are the items that were altered in the 2.1 – 2.3 release, if you’re curious:
dbo.JobList – This holds all the jobs information. Added for new feature (look! No varchar(max)!)
dbo.InstanceList – Removed a column.
dbo.prGetInstances – Altered to fix bug
dbo.prGetInventory – Altered to fix bug
dbo.prGetJobs – New for new feature
dbo.prGetJobsExt – New for new feature
dbo.prGetServersAndInstances – Altered to fix bug
dbo.prInsertJobList – New for new feature
dbo.prUpdateInstanceList – Altered to fix bug
DB_DataPull – Altered to pull Jobs (new feature) and fix bugs
DB_DataPull_FrontEnd – Altered to display Jobs/Jobs Extended
Management Data Warehouse is a neat tool for collecting data on your instances with regards to performance, disk usage, and anything else related to SQL Server you might want to know. I’ve recently been rolling it out to various servers in test environments to get a good handle on all that it can do, and maybe more importantly, how it’s going to affect performance.
When I was satisfied that it was working the way I wanted in these environments, I started to push it to what you could call QA. Immediately I was hit with a litany of errors, all of which said basically the same thing:
Caught error#: 14684, Level: 16, State: 1, in Procedure: sp_syscollector_create_collection_set, Line: 203, with Message: Caught error#: 14262, Level: 16, State: 1, in Procedure: sp_syscollector_create_collection_set, Line: 80, with Message: The specified @schedule_name ('CollectorSchedule_Every_30min') does not exist. (Microsoft SQL Server, Error: 14684)
Um, what? Of course it doesn’t exist, I’m installing it now. Just to verify, I did some poking around, and I found that it was indeed attempting to add a schedule that didn’t exist (as you would expect) to the collection set. After a while my Jr DBA (read: Google) turned up this article on removing MDW which says, in no uncertain terms:
The fact is that, while these schedules may look like they were created by MDW, they are created by default in a default installation of SQL Server 2008 or above. You can delete these easily, if you really want to, and if you are sure you will never enable MDW again. I would still suggest scripting the schedules and saving those scripts to your file system or source control so that you can recover them in the future.
I took a look back on the TEST instances where I was previously successful and found these schedules existed both on instances where MDW had been installed and in places where it had not. On the QA boxes, these schedules were part of the default SQL Server install and had been manually removed. Whoops.
As far as I know, there isn’t any direct way to script out schedules so I could port them to the boxes where they had been removed. To get around this, I attached them to the collector set jobs and scripted those; then I only had to change a few variables (and remove the schedule_uid) to recreate the schedules.
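Recreating a schedule then boils down to a call like this. A sketch: the schedule name comes from the error above, but the frequency parameter values are my assumptions for an every-30-minutes schedule, not the scripted originals:

```sql
EXEC msdb.dbo.sp_add_schedule
     @schedule_name        = N'CollectorSchedule_Every_30min',
     @enabled              = 1,
     @freq_type            = 4,   -- daily
     @freq_interval        = 1,   -- every 1 day
     @freq_subday_type     = 4,   -- sub-day unit: minutes
     @freq_subday_interval = 30,  -- every 30 minutes
     @active_start_time    = 0;   -- starting at midnight
```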
I ran these add_schedules on the QA boxes, went through the Configure Management Data Warehouse Wizard again, and tada!
For your convenience, I’ve scripted out the four schedules I needed to get it going below. These are not from the most current version of SQL Server (2016 as of this writing), and are not all of the schedules included out of the box, but these let me get MDW running so I could add my own custom schedules for the remainder.
The moral of this story: Don’t remove MDW schedules unless you really mean it, for all time, forever and ever, pinky swear.
The four ‘core’ schedules I needed, for your convenience:
Now that we’ve built the tables and populated them with data (you did that, right?), we can define how our PowerShell script is going to access that data. If you don’t know what database or tables I’m talking about, check out the Introduction and Part I.
You can send dynamic SQL to your database via PowerShell, but you really want all interactions to go through stored procedures. Why? Lots of reasons, but here are two:
Standardization – Every time you call a stored procedure, the code is exactly the same. If you change a hard-coded query in one part of your script but not another, you can get inconsistent results.
Performance – Unless you’re accessing thousands of instances with this script, you may not care too much about performance, but SQL Server has a hard time reusing query plans for ad-hoc dynamic SQL; it does so wonderfully when the code is in a compiled stored procedure.
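As a sketch of what “everything through stored procedures” looks like, here’s a hypothetical proc in the style of the real ones. The name and columns are my inventions for illustration, not the actual SDIMS objects:

```sql
-- Hypothetical example: all the PowerShell script ever sends is EXEC calls
CREATE PROCEDURE dbo.prGetServicesByServer
    @ServerName sysname
AS
BEGIN
    SET NOCOUNT ON;

    -- Column names here are assumptions about dbo.ServiceList's shape
    SELECT ServiceName,
           ServiceStartMode,
           ServiceAccount
    FROM   dbo.ServiceList
    WHERE  ServerName = @ServerName;
END;
```

The PowerShell side then only ever issues `EXEC dbo.prGetServicesByServer @ServerName = N'...'`, so the query text (and its cached plan) is identical on every call.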
So, that’s all well and good, but what do we need for our Powershell project to do? I’ve created the procedures that will allow you to do the basic tasks that you might be interested in. The sky is the limit if you want to do more.
Wow. Look at all the VarChar(MAX)s. These could probably be turned into smaller datatypes, especially ServiceStartMode. In the interest of brevity, I just made them as versatile as I could. This may change in the future if I get a chance to tighten things up.
Oh, and this stored procedure inserts new services into the dbo.ServiceList table by finding the ServerName in dbo.ServerList via text search.
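The matching logic looks roughly like this. Heavily hedged sketch: dbo.ServiceList and dbo.ServerList come from the series, but the staging table and every column name here are my assumptions:

```sql
-- Insert newly discovered services, resolving each one to a known server
-- by text-matching the service's server name against dbo.ServerList
INSERT INTO dbo.ServiceList (ServerID, ServiceName, ServiceStartMode)
SELECT sl.ServerID,
       st.ServiceName,
       st.ServiceStartMode
FROM   #StagedServices AS st
JOIN   dbo.ServerList  AS sl
       ON st.ServerName LIKE sl.ServerName + N'%';
```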
All DBAs should keep track of their servers, instances, etc., not only for their own edification, but for management and security reasons as well. If you’re not, then you need to start; it comes in incredibly handy even when it isn’t a requirement of the job.
Most of the time, this information is compiled into a spreadsheet of some kind or possibly in a word processing document somewhere. Keeping this data up-to-date and accurate is a pain, especially when you have to break it out into multiple tabs and/or over multiple documents.
You could get a full-blown inventory manager that collects and compiles all the data and organizes it for you. But there’s a definite cost to that solution, and not one that all companies will find worthwhile (read: “It’s not in the budget this quarter”).
What if you can’t get someone to shell out the money for a product like that? Then you either stick with the spreadsheets (yuck) or you find another solution with the tools you have.
What’s the Catch?
So, this is my attempt to resolve this issue using two tools that any MSSQL DBA should have: PowerShell and SQL Server. I will point to other software products or versions, both paid and free, below, but the core code should run using things you should already have. That said, here are my assumptions:
Your infrastructure security is set up using Active Directory.
Setting up a new instance is already done and you can connect to it.
Your personal login or the login you are using to execute the code is able to query the relevant system tables and server info on each of the target systems.
You know your current Server\Instance setup.
There is no fifth thing.
What You Get
By the time this series is finished you’ll have a simple GUI front-end (shown above) for current data with all of your servers, instances and all the information you could want about them.
We can pull back and organize any data that SQL Server or Windows Server can spit out including, but not limited to:
OS Version, Service Pack.
SQL Services running, their statuses and logon information.
SQL Server Instance names, versions, and editions.
SQL Server Database names and sizes.
Ability to dynamically re-pull any of this information if needed.
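Much of the instance-level information above can be pulled straight from SERVERPROPERTY(). Here’s a sketch of the kind of query the data pull can run against each instance (column aliases are mine):

```sql
SELECT SERVERPROPERTY('MachineName')    AS ServerName,
       SERVERPROPERTY('InstanceName')   AS InstanceName,   -- NULL for default instance
       SERVERPROPERTY('ProductVersion') AS SqlVersion,
       SERVERPROPERTY('Edition')        AS Edition,
       SERVERPROPERTY('ProductLevel')   AS ServicePack;    -- e.g. RTM, SP1
```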
I will walk you through my solution piece by piece over 4 posts (5 including this intro) that will consist of the following parts. This list will be updated with links to the different sections as they are released.
There are a lot of free materials out there for learning Microsoft products, and surprisingly (or not?) a lot of them are from Microsoft themselves. I thought I’d take a moment to organize and collect my list of free resources, in the hopes that it will not only help me organize and find what I need, but also help those of you who don’t know about this stuff.
The one main source I’m using here is the MSDN MSsmallBiz blog, with posts by Eric Ligman. There are a massive number of titles to look at, but I’ve not seen them compiled in one place. Keep in mind that some of these are older, and not all of the links may still work. I will update this list in the future if I find new/interesting free education materials in this genre.
The last group contains quite a few of the previous two sections (but not all, I’ve found). Most are in PDF or DOCX (word) format with a few in portable and non-portable formats thrown in.
MS Office – Powershell 4.0 (this stuff is really good) – CRM – Quick Start Guide group – even more SQL Server 2012
If you’re looking for information on specific Microsoft technologies or if you’re gearing up for an MS cert, check out the Microsoft Virtual Academy. They’ve got kind of a neat gamification thing going on where you get points for completing certain courses.