What would your reaction be if, one day, all that old data you’ve been holding onto for years suddenly got up and walked out?
I’m not talking about unsecured data being hacked, personnel issues, malware in your network, or even HVAC systems with too much access to the network.
I’m talking data that suddenly, somehow, becomes self aware and rather than sitting around rotting away while you do nothing with it, decides to leave.
I recently read an article about toasters that act this way.
Yes, toasters.
Those devices that produce either charred lumps of what is presumed to have been bread or lukewarm bread that soft butter easily tears apart. Basically, a series of toasters were networked together if one was not being used, it would complain and want to go somewhere else. You can read the article here.
Now, I’m not suggesting you try to make you database and data self aware, though that would be very interesting; but I am suggesting you take a serious look at what data you are holding onto that you really don’t need.
If you are holding onto years of old data, and customers, clients or users only ever look at the last 12 months, do you still really need it all?
Maybe you need it for compliance with some regulation. We all know there are a ton of those out there, but does that regulation state it must be sitting in your live production database and immediately available? Odds are no, it doesn’t. It probably states that the older data has to be accessible in a reasonable amount of time. What’s that mean? Well, it depends on how “reasonable amount of time” is defined. If you have a couple of days you can probably restore a backup with the data from a tape that was sent offsite. This means you can get a good backup specifically with the old data, then purge it from your production system. Then you can sit back and marvel at the lack of expense required to gain a fair amount of performance & storage because you have much smaller data sets to work with.
An alternative is to archive old data out to a data warehouse environment, if you really need it. Even then, look at the warehouse and figure out just how much of the data you have there that is really required. The odds are that you have a lot more data than you will ever find useful. Just saying ‘It’s a warehouse, it’s designed for holding all the old stuff’ isn’t a really valid justification for holding onto worthless, old data. Cleaning up the warehouse leads to more important data being able to be stored, and retrieved more efficiently.
If you insist on keeping the data around in your primary database and you are lucky enough to be using Enterprise edition of SQL Server, you could look into file groups and partitioning your tables. This way you can direct your more recent, active data onto faster hardware and focused indexes to help with performance.
If you are not running Enterprise edition, you can apply for the TV show Hoarders, here.