Monthly Archives: April 2010

The Rabbit hole, Tips on Dynamic SQL

Leaving the Subway 

Ya never know what’s down the rabbit hole. 

I always try and warn people when you go down the Dynamic SQL path you have to be ready to go down the rabbit hole. You always think it’s just a simple thing until you’re face to face with the Queen of hearts and she yells “Off with her head!”There is one thing that Dynamic SQL is good at, Making you think out of the box.

So I’ll share some tips on Dynamic SQL  through a conversation I had with a friend. We’ll call him Bob.

Bob: hey why can’t I do this in Dynamic SQL, select @reccount = count ( * ) in dynamic sql?

Me: you have to declare the variable you are trying to set inside the Dynamic SQL, Remember Dynamic SQL is in another Process/Thread. So it cannot see cursors/variables outside its set of Dynamic SQL. If you run a USE statement in the Dynamic SQL it will only effect the statements inside the Dynamic SQL. To get something like a count between the two threads you would need to insert it into a temp table. I’ve listed a typical way to do this below.

--Original way that fails. 
DECLARE @cmd NVARCHAR(4000
DECLARE @reccount INT 

SET @cmd = 'Select @reccount = Count(*) from master..sysdatabases'

PRINT @cmd 
EXEC sp_executeSQL @cmd 

--Way that works by inserting to a temp table. 
DECLARE @cmd NVARCHAR(4000
DECLARE @reccount INT 
DROP TABLE
#rowcount
CREATE TABLE #rowcount 
  
(reccount INT NULL)

SET @cmd = 'Select Count(*) from master..sysdatabases'

PRINT @cmd 
INSERT INTO #rowcount (reccount
EXEC sp_executeSQL @cmd 

SELECT * FROM #rowcount

Bob: Ok so what about this problem I have of adding all these strings together this get’s really complicated and hard to read when I’m adding 4-5 variables together. Here’s an example.

DECLARE @cmd NVARCHAR(4000
DECLARE @startdate date
DECLARE @enddate date
DECLARE @dblevel INT 

SET @startdate = '02-01-10'
SET @enddate = '05-01-10'
SET @dblevel = 100

SET @cmd = 'Select * from master..sysdatabases where crdate between ''' + CONVERT(VARCHAR(25),@startdate) + ''' and ''' + CONVERT(VARCHAR(25),@enddate) + ''' and cmptlevel = ' + CONVERT(VARCHAR(10),@dblevel)
PRINT @cmd 

EXEC sp_executeSQL @cmd 

This is hard to read and I have to do lots of conversions to get things to work.

Me: I’ve found when dealing with Dynamic SQL it’s best to write the query you want with values and then replace the place holders you want to change. So taking your example here’s another way.

DECLARE @cmd NVARCHAR(4000
DECLARE @startdate date
DECLARE @enddate date
DECLARE @dblevel INT 

SET @startdate = '02-01-10'
SET @enddate = '05-01-10'
SET @dblevel = 100

SET @cmd = 'Select * 
from master..sysdatabases 
where 
crdate between ''<startdate>'' and ''<Enddate>''
and 
cmptlevel = ''<dblevel>'''

SET @cmd = REPLACE(@cmd,'<startdate>',@startdate)         
SET @cmd = REPLACE(@cmd,'<enddate>',@enddate)
SET @cmd = REPLACE(@cmd,'<dblevel>',@dblevel)
PRINT @cmd 
EXEC sp_executeSQL @cmd 

This is much easier to read and find what you’re looking for and what you need to replace. Now of course there’s many ways to do this for me this has always been easier.

BOB: Ok last question. Sometimes my Dynamic SQL doesn’t run. It doesn’t return anything and doesn’t seem to do anything at all. It won’t print it doesn’t error it just does nothing! It’s very frustrating. Any idea as to why this happens?

Me: This is where you need to be careful down the rabbit hole. Very careful, here’s an example of what will cause this exact situation to occur can you spot the difference and what the problem is?

DECLARE @cmd NVARCHAR(4000
DECLARE @startdate date
DECLARE @enddate date
DECLARE @dblevel INT 

SET @startdate = '02-01-10'
SET @enddate = '05-01-10'
SET @dblevel = NULL

SET @cmd = 'Select * 
from master..sysdatabases 
where 
crdate between ''<startdate>'' and ''<Enddate>''
and 
cmptlevel = ''<dblevel>'''

SET @cmd = REPLACE(@cmd,'<startdate>',@startdate)         
SET @cmd = REPLACE(@cmd,'<enddate>',@enddate)
SET @cmd = REPLACE(@cmd,'<dblevel>',@dblevel)
PRINT @cmd 
EXEC sp_executeSQL @cmd

I set the @dblevel = null. Anytime a variable is null and you place it into a Dynamic SQL string it nullifies the entire string. Worse yet it doesn’t error or complain about this issue it just does nothing. Go ahead and test it out on your system pass any NULL into a Dynamic SQL string and it will do nothing. You should make 100% sure that every variable/value that you pass into a Dynamic String is not a null.

This is just some quick tips I had for Dynamic SQL it’s by no means a full list of all the issues. I do warn people that with “Great power comes great responsibility” – Ya I went there.  🙂 Here’s some simple pro’s and con’s to keep in mind with Dynamic SQL

Pro’s

  • Great flexibility
  • Ability to get through many db’s without writing code that lives in each one. If you have many db’s on a server and you need to check X then there’s a way to do that.
  • They can now store execution plans. I still wouldn’t guarantee this but they previous to Sql 2005 Dynamic SQL never re-used plans.

Con’s

  • Prone to errors, it’s very hard to track errors and deal with errors in Dynamic SQL.
  • SQL Injection, this is the primary method to get SQL injection into your system. Protect and check your variables.
  • Readability, even with the sample I posted above it’s hard to read and figure out what Dynamic SQL is doing
  • Debugging/Testing, Debugging Dynamic SQL consists of many print statements and lots of trial and error to make sure you have everything just right.

I would also suggest that you look into PowerShell before going the Dynamic SQL route. Many of the things you would use Dynamic SQL for can now be done in PowerShell. Here are some Articles about PowerShell from Buck Woody and Aaron Nelson.

Regardless of what you decide be careful when you going down the Rabbit hole sooner or later you’ll be stuck in a big way with only a little door to get out from.

Thanks to @SqlVariant (Blog|Twitter) on the tips for Simple-Talk’s Prettify code.   It made the code samples in here look much better. 

Sql Saturday #31 Chicago

I wasn’t able to attend the actual sessions for SQL Saturday Chicago.  For good reason though I was busy taking pictures of my wonderful Niece’s. 

I was able to attend the after party and help “Cover” the event.  SQL People and Karaoke are always a good thing and I don’t want to miss a chance to record them.  I’m thinking for PASS I will need to get a Light panel for late night karaoke.  :) 

As I spoke with people about the event they all said it was a huge success and had a lot of fun learning at the event.  My hat’s off to all the organizers and the volunteers that put this together it’s always good to see successful events like this.

Ally my Pictures/Videos can be seen here ( Facebook | Flickr | YouTube ).  They are the same you can choose which you want to tag in. 

pat

T-SQL Tuesday #005 – Vertical Tables

 

So I’m finally getting back to blogging and figured I would get in on a post for #TSQL2sDay. I didn’t want to make something to terribly long so I thought I would talk about a common reporting scenario I run into all the time. What is meant by a vertical table and some ways to get data out of the table?

My definition of a Vertical table is one that has been created to hold lots of disparate data dynamically. I’ve seen some of the largest of these in a call center environment so I’ll use that as my example.

Let’s say you’re a Customer service agent. You take hundreds of calls a day and each call has specific data attached to the call for reporting purposes. Things like Talktime (Amount of time you were on the phone), Hold Time (The time you put the caller on hold) and TransferExtension (You transferred the call to another agent). All of these things in the system are known as attached data, basically data that is riding along with your call for later system/reporting needs. Now customers are typically allowed to add attached data to the call system. Perhaps the customer would like to not only track when you were on hold and when you were talking but they also want to know something about the person that called the system. Perhaps there CallerID information also known as ANI, this is another form of attached data that the customer may dynamically assign to the call that the system has to handle. This is why many systems employ the use of vertical tables.

Vertical tables allow for you to dynamically create new pieces of data about something. There is no need to add another column to a table it can automatically start making the data appear in the table without any changes. This is the great power of Vertical tables. Hopefully you will also see this is a great weakness. So here’s what the vertical table above would look like.

Create table Attached_Call_Data

(Id int identity(1,1) not null,

Callid int not null,

Keyname varchar(255) not null,

Value Varchar(2000) null)

The Callid – would link all information back to a specific call (many times you might have a sequence as well)

The Keyname – would tell you what piece of data you have such as (‘Talktime’,’Holdtime’,’ANI’)

The Value – this would be the value of the call. So perhaps Talktime = 90 Seconds the value here would be 90

Now once again hopefully you’re seeing the problems here. The value column is just a varchar so it can hold any data. This makes it very difficult to keep consistent with what’s in the column. It also means lots of conversions in your future. Where clauses have to convert the rows as well so get ready for lots of index scans.

Ok one last example and we’ll talk pro’s and cons. Here’s what the data could look like for example.

Id,Callid,Keyname,Value

1,2882,talktime,90

2,2882,holdtime,10

3,2882,ANI,John Smith 555-555-5555

4,2818,talktime,109

5,2818,Ani,555555555

This is a very simple example but a way to prove the point with vertical tables your going to get lots of varied data. Ok so how do we report off of them and what’s the pro’s and con’s?

Vertical tables

Pro’s

1. Dynamic adding of columns to data sets without changes to code or DB design.

2. Makes for small tables as far as size go. You only have 3-4 columns in a vertical table so it’s not going to get nearly as large as a table with 100-200+ columns

3. Can be fast to search and index because of the narrow size.

Con’s

1. Can’t control the data type allowing any variation of data to come in.

2. Reporting off the table is very cumbersome.

3. Too much processing/conversion to get the data out.

4. Many millions of rows for a small data set. If the customer takes 100,000 calls in one day that could be a potential of 100,000 * 100 pieces of attached data = 10,000,000 rows of data.

5. No lookup tables for the values so trying to get a list of what a specific keyname has is not possible unless someone has written another table that already has this information. Which I’ve yet to see a system designed this way.

6. Forces Scans frequently since so many conversions have to be done on the Value column.

Besides the obvious Data integrity and design issues with vertical tables the really big issue is reporting off of them. Most OLTP reporting is best done in a de-normalized flat table with lots of columns so you can get to data easily and with less joins. To get the data in that form for Vertical tables you don’t have a lot of good options. Here’s a few I have used.

Subquery method.

Select callid,

(select value from Attached_Call_Data where callid = A.callid and keyname= ‘Talktime’) as Talktime

From Attached_call_data as a

This obviously has a lot of reads to get the data it needs.

Pivot is a method I have yet to use since it’s hard to get around the dynamic nature of the tables but it’s one I’m looking into for a future project. A friend used Pivot with great success and speed to get the data out and I’m eager to use it as well.

You could use a Loop/cursor to get through it. I can’t see this as a good way to do it unless you have some way to get the data out of SQL quickly and then loop it in memory. I have yet to ever try and use a loop to get data out.

I’m sure there are methods in C#/CLR that you could use to get through this as well.

I’m sure there are other method’s out there and I would love to hear about them from anyone that has to deal with Vertical tables. No matter which method you’re looking at though it’s going to be more processing than your standard OLTP/Normalized data structure. I’m a firm believer in every tool has a place though and I know and understand that the vertical table does. If you need a dynamic structure to hold your data and you can control validation on the app side and you can create a separate process that runs out of band for your reporting needs then this can be a very useful table. If you can’t meet these criteria you should stay away from this table, Real time reporting or real time reports written off vertical tables never work well without lots of hardware backing them.

I think this went a little longer than I hoped but hopefully you got some good information about Vertical tables if you have never had to report off these count yourself lucky.

Pat