Stretch Databases for the Budget-Poor DBA

SQL Server's stretch database is a feature that a lot of people liked and were excited about when Microsoft announced it.

Stretch databases were going to provide "cost-effective" availability for cold data, and unlike typical cold data storage, our data would always be online and available to query. Applications would not need to be modified to work with the seamless design of the stretch database: run a query, and the data was there, pulled from the cloud when needed. It would streamline on-premises data maintenance by reducing the local footprint of the data files as well as the size of backups! It was even going to keep data secure via encrypted connections to the cloud and, in theory, make a migration to the cloud easier.

It was destined to be a major win!

Then the price was mentioned.

Do you know anyone using stretch databases today?

Yeah, me neither.

Recently, at the MVP Summit, it was mentioned that a lot of folks like the concept of Microsoft's stretch database, but the pricing makes it nearly impossible to implement anywhere.

That got me thinking: there has to be a way to work around this using the features we already have in SQL Server. So I started going through our capabilities.

Since 2016 we can map our files to the cloud and run an on-prem engine. We have filegroups and multiple data files we can use to split file locations. It is possible to mount a cloud file share path as a local drive that persists across reboots (and secure it, as you should!). It all started falling into place in my head. I started playing and found a way to place the cold data of a live database into the cloud and make it seamlessly available to queries!

By my quick review, those existing features hit almost all the benefits of stretch databases, and the only cost was cheap file storage plus the cost of pulling the data back down! An expense I could even redirect to the groups that insist on keeping that old data around and making me manage it.

No, I'm not bitter at all, why do you ask?

It Ain't Pretty, but It's Kinda Cool!

Yeah, this is not a pretty predefined feature where you click a couple of things and you're done, nor is there a nice simple script to use. This method requires rolling up our sleeves and getting a little dirty. Oh, and it may not work well for you and your requirements. I'm not even sure I'd attempt to roll this into any live environment (OK, to be real, I probably will use it at some data-hoarder clients at some point).

The trick was finding a way to make this all work together and still tick a majority of the benefit checkboxes of stretch databases. I think I got close. It still needs more digging and tweaking, but the proof of concept is just too cool to hold back from everyone!

Enough already! Gimme the Goods!

Got it. Here are the steps I have so far to make this stretching thing work:

  1. Get a file share set up in the cloud.
  2. Map a drive to the cloud.
  3. Create a new SQL filegroup and data file.
  4. Put data into the new file.
  5. Create a PowerShell script and startup procedure.
  6. Alter your mindset.

Let's go through each step of the process, review some of the issues I encountered along the way, and see how I worked around them. You may know better ways to do some of these things, and I am open to hearing them!

1 - Get a file share set up in the cloud.

For the purposes of this POC I opted for Azure. It is pretty much the default for those of us working in SQL Server. In reality, I think the methods shown here will work fine for AWS S3 storage too, as well as any other cloud file store you can access. (Still waiting for my AWS account to be unlocked... #JustSaying.)

The key is to set up the file share (or blob storage, or whatever you want to use that can hold the file) so you can access it from outside the cloud, be it on a private network or wide open if that's your preference.

I'm not going to show screenshots and such here; there are plenty of tutorials elsewhere on how to set up file shares in Azure and AWS S3 storage, written by people far smarter than I am.

2 - Map a drive to the cloud.

Here things got a little tricky at first. When I first did this, I logged into my machine with the same account used to run the SQL Server service. (I know, not the best plan, but yay labs!) I mapped the drive like any other drive would be mapped, entered credentials, and voila! I had a drive mapped to the cloud. I tested it by creating a plain text file on my local mapped drive, then went to the Azure portal, and there it was!

On to adding a data file mapped to the new drive!

Yeah, no. Doesn't work that easy.

The issue was that the SQL Server engine didn't see the mounted drive. It took me a while, but I figured out that I needed the SQL engine itself, running as that same login, to map the drive. This meant punching a hole in the SQL security and enabling xp_cmdshell. Not an end-of-the-world crisis, but something to be strongly aware of.

To map the Azure drive, I collected the info from the share's Connect section in the Azure portal and used it to mount the drive via xp_cmdshell. Here's what that code looked like (obfuscated, obviously; you don't get to use my Azure accounts! I like y'all, but not that much!):

USE [master]
GO
-- we need to turn on advanced options and map the drive so SQL can see it. 
EXEC sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
-- to do this we will use the xp_cmdshell 
EXEC sp_configure 'xp_cmdshell',1
GO
RECONFIGURE
GO
-- Make sure to use the persistence flag (/p:yes) so the drive is still available if the SQL service restarts (in theory).
-- Syntax shown is for Azure; use whatever command your cloud service indicates for mapping a drive.

EXEC XP_CMDSHELL 'net use A: \\<YourStorageAccount>.file.core.windows.net\woxdemodbfiles /u:AZURE\<StorageAccountName> <Secret Key instead of password ex: 77iDs5mU0apD.....b5A==> /p:yes'

-- breakdown: /p = persist the connection
-- A: = drive letter - A for Azure! (or AWS?)
-- \\<YourStorageAccount>... = path to the drive in a UNC format
-- /u: user for azure / aws
-- 77id... = secret key for the login to use to validate with.
GO
-- goofy command needed to make sure the drive is made available and active
EXEC XP_CMDSHELL 'Dir A:' 
GO

 

Yep, that's right. There is a goofy little command to essentially initialize the connection to the mapped drive. Without that little "Dir A:" command, the drive didn't become visible to SQL Server.

Oh, but wait, it gets more weird!

Once the drive is mapped in SQL, if you log in as the service account and open Windows File Explorer, guess what! You don't see the mounted drive! If you use this method at all, make sure it is well documented and well known that the drive is mapped internally!
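If you ever want to verify what the engine actually sees, a quick sanity check (a rough sketch, with the same xp_cmdshell caveats as above) is to list the mappings from inside SQL rather than from Explorer:

-- list the drive mappings as the SQL Server service sees them
EXEC xp_cmdshell 'net use';
-- and confirm the engine can actually read the share
EXEC xp_cmdshell 'dir A:\';
GO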

Anyhow, on to more fun trying to make this all work.

3 - Create new SQL Filegroup and Data File.

This was the easy part. I just added a filegroup and then a new data file mapped to the new drive. I probably don't need to show you how to do this, but here you go anyhow:

USE [master]
GO
ALTER DATABASE [WOxDemo] ADD FILEGROUP [CloudFiles]
GO
ALTER DATABASE [WOxDemo] ADD FILE ( NAME = N'WOxDemo_Cloud', FILENAME = N'A:\SQLData\SQL2016\WOxDemo_Cloud.ndf' , SIZE = 8192KB , FILEGROWTH = 65536KB ) TO FILEGROUP [CloudFiles]
GO
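If you want to double-check that the file actually landed where you think it did, a quick look at the metadata (nothing fancy, just a sanity check) does the trick:

USE [WOxDemo]
GO
-- confirm the new data file points at the A: drive and the right filegroup
SELECT df.name AS logical_name
     , df.physical_name
     , fg.name AS filegroup_name
FROM   sys.database_files AS df
       LEFT JOIN sys.filegroups AS fg ON fg.data_space_id = df.data_space_id;
GO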
 

4 - Put data into the new file.

Now that we have a new file, in the cloud, let's throw some data in it and see if it works.

USE WOxDemo;
GO

CREATE TABLE testvalues
    (
        rowid INT IDENTITY(1, 1)
      , randomvalue INT
    ) ON CloudFiles;
GO

INSERT dbo.testvalues ( randomvalue )
VALUES ( 146257 );
GO 5000 -- just throw 5k rows in quickly


SELECT *
FROM   dbo.testvalues;
 
[Screenshot: Insert to the cloud file]

Slow insert, around 7 seconds, but kind of expected since it is 5,000 rows going over the wire to the cloud with no special tuning or anything.

How about a query to dump the results?

[Screenshot: Select from the cloud file]

Nice! It works! We have a hybrid DB with some tables locally and some in the cloud!
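For real cold data you would probably want to move existing tables (or their oldest chunks) onto the CloudFiles filegroup rather than create brand-new tables out there. One way to sketch that, assuming the table has a clustered index you can rebuild (the table and index names here are made up purely for illustration), is:

USE WOxDemo;
GO
-- hypothetical example: relocate an existing cold table to the cloud filegroup
-- by rebuilding its clustered index on CloudFiles (expect this to be slow over the wire)
CREATE UNIQUE CLUSTERED INDEX PK_ColdSales
    ON dbo.ColdSales ( SaleID )
    WITH ( DROP_EXISTING = ON )
    ON [CloudFiles];
GO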

What could possibly go wrong!

5 - Create PowerShell and Startup Procedure.

Here's where things got really interesting. In an effort to do a little due diligence on this whole little experiment, I turned the SQL service off and on to simulate an update or a server failure. Or to go to lunch. I don't recall.

Anyhow, things got interesting. The SQL service started as normal with no issues! That's not the interesting part. The interesting part was that the database I had 'stretched' didn't come back online. In fact, it started up in the 'Recovery Pending' state.

Ugh!

What happened?

Upon checking, I saw the A: drive was persisted and the SQL Server saw it listed, but something was different. The mount was there, but it was 'unavailable'. This meant that as the SQL Server engine started up and began mounting the databases and bringing them online, the A: drive I had mounted was not available, and thus the server couldn't bring the database fully online. This presented a problem: I needed the A: drive to mount prior to the databases loading. I tried Agent startup jobs, startup stored procedures (yes, those are a thing; I'll probably go through them in a later post), about four registry hacks, and a few other questionable tactics to get this drive up and running before the engine needed it. Nothing worked. I hung my head and went off semi-defeated. This could be so cool if only I could get the file back online.

After more trial-and-error research, I finally found a way to get the DB back online after this nasty drive availability issue.

First, I had to make sure the drive was available. I would rerun the mount command and then the goofy Dir A: command, and the drive was there and ready. Then it was just a matter of getting the DB out of the recovery state.

I found two ways that do that well. The first is the dreaded DBCC CHECKDB with REPAIR_ALLOW_DATA_LOSS. It works, but I don't know anyone who would want to use that command with any regularity. The alternative seemed the nicer solution: simply detach and re-attach the database files once the A: drive was available. Doing this manually proved it worked nicely, but how could I make it happen automatically in case something bounced the server when I wasn't around?
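For reference, the dreaded option looks roughly like this. It's a sketch only, and REPAIR_ALLOW_DATA_LOSS means exactly what it says, so I wouldn't make a habit of it:

-- the nuclear option: force the database into EMERGENCY and let CHECKDB repair it
ALTER DATABASE WOxDemo SET EMERGENCY;
ALTER DATABASE WOxDemo SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DBCC CHECKDB ( WOxDemo, REPAIR_ALLOW_DATA_LOSS );
ALTER DATABASE WOxDemo SET MULTI_USER;
GO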

The solution came in two parts: create a PowerShell script that executes the detach and attach T-SQL, and create a startup stored procedure to call the PowerShell and get things underway.

"But wait!" you say, "You could do that in just a startup procedure!"

True, I could, but I wanted a way to easily get at the script outside of SQL Server, and a local PowerShell script made that easier. Plus, it let me practice my PowerShell skills some more. Have to knock the dust and rust off them every now and then.

What I wound up with was this.

PowerShell first, to detach and attach the DB:

cls

Add-PSSnapin SqlServerCmdletSnapin* -ErrorAction SilentlyContinue
Import-Module SQLPS -WarningAction SilentlyContinue

$DetachCmd = "USE [master]
GO
ALTER DATABASE WOxDemo SET EMERGENCY
-- you can only detach the database once it is in the EMERGENCY state
GO
EXEC master.dbo.sp_detach_db @dbname = N'WOxDemo'
GO"

$AttachCmd = "USE [master]
GO
CREATE DATABASE WOxDemo ON
    ( FILENAME = N'C:\SQLData\SQL2016\WOxDemo.mdf' ),
    ( FILENAME = N'C:\SQLData\SQL2016\WOxDemo_log.ldf' ),
    ( FILENAME = N'A:\SQLData\SQL2016\WOxDemo_Cloud.ndf' )
FOR ATTACH
GO"

Invoke-Sqlcmd $DetachCmd -QueryTimeout 3600 -ServerInstance ".\sql2016"
Invoke-Sqlcmd $AttachCmd -QueryTimeout 3600 -ServerInstance ".\sql2016"

If ($?) { Write-Host 'Attached database WOxDemo successfully!' }
Else    { Write-Host 'Attaching failed!' }

Then the T-SQL to create a procedure that calls the PowerShell script at service startup:

use master
GO
CREATE PROCEDURE Remount_WoxDemo AS 
-- ensure the drive is mounted - use your credentials here (these won't work for you, I changed them)
EXEC XP_CMDSHELL 'net use A: \\<YourStorageAccount>.file.core.windows.net\woxdemodbfiles /u:AZURE\<YourStorageAccount> 77iDs5mU0a...o6mb5A==';
-- run a simple command on the drive to ensure SQL Service is seeing it
EXEC xp_cmdshell 'DIR a:\';
-- run a customized PowerShell script to detach and reattach the DB
EXEC XP_CMDSHELL 'powershell.exe "c:\Scripts\RemountDB.ps1" -nologo' ;

GO

-- make the procedure a startup procedure to ensure the DB comes online properly when the service restarts

EXEC sys.sp_procoption @ProcName = N'Remount_WoxDemo'   -- nvarchar(776)
                     , @OptionName = 'startup'  -- varchar(35)
                     , @OptionValue = 'on' -- varchar(12)
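
If you want to double-check that the procedure really is flagged to run at startup, the metadata is easy enough to query:

-- confirm the procedure is marked as a startup procedure
SELECT name
FROM   master.sys.procedures
WHERE  OBJECTPROPERTY(object_id, 'ExecIsStartup') = 1;
GO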

Yep, it is all hard-coded. This is a lab, and I don't think I'd want to risk dynamic code when trying to get my database back online. I would probably hard-code any implementation of this method regardless of the environment.

What got me excited is that this worked. Sure, the SQL Server would start up and the database would sit in Recovery Pending, but a few seconds later, once the A: drive was re-enabled and the DB detached and re-attached, it came back online like normal.

There are still going to be a LOT of gotchas I haven't come across yet in the limited time I've been playing with this. Know something else that could go wrong? Great! Let me know in the comments and share your thoughts and ideas! One example I have already hit: logins with the 'stretched' DB as their default database will lose that association on detach, but that could be dealt with by further customizing the startup scripts.
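That default-database gotcha, for instance, would be easy enough to patch up in the startup script once the database is re-attached. A minimal sketch (the login name is just a placeholder):

-- re-point a login whose default database association was lost during the detach
ALTER LOGIN [AppUser] WITH DEFAULT_DATABASE = [WOxDemo];
GO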

6 - Alter Your Mindset.

I haven't done this yet in my little POC sample, but it is on my list to figure out. Consider filegroup- or file-based backups rather than your regular backup strategy. This does not mean an OS file backup or snapshots. Use the advanced backup options in SQL Server to back up the active files on a regular basis, and the files in cold storage less often. You should still make a backup of the cloud data every now and then, and the cloud should help with that, but don't pay for pulling all that cold data back locally for regular backups. The bonus is that your backups should now be a LOT smaller and faster, even if a little more work is needed to restore them!
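I haven't tested it yet, but the shape of it would be something like this: mark the cold filegroup read-only, take one last backup of it, and then routinely back up only the read-write filegroups (the backup paths here are placeholders):

USE [master]
GO
-- freeze the cold data so it only needs an occasional backup
ALTER DATABASE WOxDemo MODIFY FILEGROUP [CloudFiles] READ_ONLY;
GO
-- one-off (or infrequent) backup of the cold filegroup
BACKUP DATABASE WOxDemo FILEGROUP = N'CloudFiles'
    TO DISK = N'C:\Backups\WOxDemo_CloudFiles.bak';
GO
-- the regular backups then only touch the local, read-write filegroups
BACKUP DATABASE WOxDemo READ_WRITE_FILEGROUPS
    TO DISK = N'C:\Backups\WOxDemo_Hot.bak';
GO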

Don't forget to keep your mind open and try different things. Just because something is expensive or not easily done doesn't mean it is impossible to work around. There is a lot of capability in our SQL Server environments, and I am always impressed by what people come up with to make our systems work the way we want and need them to.

Go out and play and learn something new, like how to stretch a database into the cloud using an alternative to SQL Server stretch databases.