Monthly Archives: January 2008

Performance Trace in SQL 2005

I've always used a server-side trace to keep track of what's going on with my servers. It helps me answer the question when someone asks what's happening on the server. Typically I have it track everything that takes over 2 seconds, though the exact threshold depends on the environment and the load. I'm not going to go through my whole method in this post, but I wanted to point out an interesting difference I recently learned between SQL 2005 and SQL 2000: the Duration column is stored in microseconds when you write it to a table or file. It's no longer stored in milliseconds. So in 2000 we could do this.

SQL 2000
Select duration * .001 as Seconds from PerformanceTraceTable

SQL 2005
Select duration * .000001 as Seconds from PerformanceTraceTable

You will not see this when running the Profiler GUI, since it converts to milliseconds for you. It's only an issue if you save the data to a table or file when you run a server-side trace.
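
For completeness, here is a minimal sketch of the kind of server-side trace I'm describing. It's not my full method: the file path is a placeholder, and the event and column IDs (10 = RPC:Completed, 12 = SQL:BatchCompleted, 13 = Duration, and so on) are the standard trace IDs from Books Online. Note that on 2005 the Duration filter value is in microseconds and has to be passed as a bigint.

declare @traceid int, @maxfilesize bigint, @minduration bigint
set @maxfilesize = 100          -- MB per rollover file
set @minduration = 2000000      -- 2 seconds, expressed in microseconds on SQL 2005

-- create the trace; option 2 = file rollover, path is a placeholder (no .trc extension)
exec sp_trace_create @traceid output, 2, N'X:\Traces\PerformanceTrace', @maxfilesize

-- capture SQL:BatchCompleted (12) and RPC:Completed (10) with the columns I care about
exec sp_trace_setevent @traceid, 12, 1, 1     -- TextData
exec sp_trace_setevent @traceid, 12, 13, 1    -- Duration
exec sp_trace_setevent @traceid, 12, 16, 1    -- Reads
exec sp_trace_setevent @traceid, 12, 17, 1    -- Writes
exec sp_trace_setevent @traceid, 12, 18, 1    -- CPU
exec sp_trace_setevent @traceid, 10, 1, 1
exec sp_trace_setevent @traceid, 10, 13, 1
exec sp_trace_setevent @traceid, 10, 16, 1
exec sp_trace_setevent @traceid, 10, 17, 1
exec sp_trace_setevent @traceid, 10, 18, 1

-- only keep events where Duration (column 13) >= 2 seconds
exec sp_trace_setfilter @traceid, 13, 0, 4, @minduration

-- start it
exec sp_trace_setstatus @traceid, 1

-- later, pull the file into a table so you can query it
-- select * into PerformanceTraceTable from ::fn_trace_gettable(N'X:\Traces\PerformanceTrace.trc', default)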

I'm also finding some negative durations, which I'm still looking into; I'll post more when I find more information on those.

Happy tracing!

Running Checkdb on Very Large Tables

So I found myself in a very precarious position on Monday morning. I was sent an email showing screenshots of a DTS package that had failed with the error message:

“A possible Database Consistency problem has been detected on database X DBCC checkdb and Dbcc checkcatalog should be run on Database X”.

I thought the worst, of course, and went out to investigate the server to see what was going on. I'm new to this position and hadn't looked at this server before, so I didn't know what to expect. The first bad news was that the most recent backup was from October. Not a good thing!

Once I got connected to the machine, I found it was a SQL 2000 Enterprise Edition box with 4 processors, 8 GB of RAM, and what I consider a VLDB. The main database was 596 GB in size and had a table that boasted 929 million rows. I began to look in the logs and found assertion failures and long waits to write records, with messages like this:

SQL Server has encountered 4 occurrence(s) of IO requests taking longer than 15 seconds to complete on file [x:\data.mdf] (21). The OS file handle is 0x000003AC. The offset of the latest long IO is: 0x0000000ce67800

Having seen this message before, I realized we must have had some sort of hardware issue: either the controller or the disk subsystem had been unavailable. After looking in the Windows event log I found SCSI controller errors. At this point we already had a maintenance page up for anyone trying to access this server, so I called on my hardware/operations guys to take a look and see what they could find. After a few hours and a few patches to the OS, they did what they could for now without incurring a longer downtime. We thought the DB side would take much longer to fix, so we decided to see what was wrong there. I also made a backup at this time; even if it was corrupt, some backup is better than no backup at all.

I figured the very large table would take a long time to run CHECKDB on, so I decided to run CHECKTABLE on all the other tables (only 38 of them) to try to find any other issues I could possibly repair while the big one was running. It only took about an hour to run CHECKTABLE through everything else, and it all came back clean. I then ran CHECKTABLE against just the large table. To my surprise it only took 56 minutes and there was nothing wrong with it! What luck! So I figured there must be something wrong elsewhere in the DB, and I started the full CHECKDB. Again I was amazed at the speed: 1 hour 56 minutes later it was done and still no errors. At this point I counted myself very lucky that SQL Server did what it was supposed to and didn't corrupt any data when the controller failed.
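
The checks themselves are nothing fancy. Here's roughly what I ran, with placeholder table and database names:

-- check a single table
DBCC CHECKTABLE ('dbo.BigTable') WITH NO_INFOMSGS, ALL_ERRORMSGS

-- full consistency check of the whole database, plus the catalog check the error message asked for
DBCC CHECKDB ('DatabaseX') WITH NO_INFOMSGS, ALL_ERRORMSGS
DBCC CHECKCATALOG ('DatabaseX')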

Make sure that if you have a disk or controller failure you always run CHECKDB on the database after the system is back up. Even if it's a very large database, I think you'll be surprised at how fast it can go.

Here are some steps I have used and talked about in the past to approach a corruption issue.

  1. Identify the issue. Nine times out of ten, corruption is because of a hardware failure of some sort.
  2. Make a backup of the DB. Even if it's corrupt, you could save much of the data in it. If you can take extra downtime, attempt to correct your hardware issue before running your checks; if your checks cause the issue to happen again, you're not going to be any better off. Usually in the real world you won't have time to deal with the hardware side unless your disks are completely unavailable.
  3. Start to identify where the corruption is. If you have a large DB and you're short on time, running CHECKTABLE on the tables you believe to be the issue might be a good first step. You will still have to run the full CHECKDB at some point, though, to make sure you caught everything.
  4. Start fixing the problems. Remember to try dropping and re-creating indexes on tables that have corruption; sometimes only the indexes are corrupt, and that can usually be fixed by dropping and re-creating them. If not, work through the repair levels that DBCC suggests, going to REPAIR_ALLOW_DATA_LOSS only as a last resort. Sometimes you can export all the data out to another table and then just drop the corrupted one (see the repair sketch after this list).
  5. Make another backup when everything is clean. If you lost data, you can possibly compare against your corrupted backup and recover any other data that might be gone.
  6. Make sure to double-check your backups, and I suggest running CHECKDB on a consistent basis so you know if there are any issues.
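
To make step 4 concrete, here's a rough sketch of the repair side, using a placeholder database name. Only run the repair options after reviewing what CHECKDB actually reports; it tells you the minimum repair level at the end of its output.

-- take a backup first, even of the corrupt database (step 2)
BACKUP DATABASE DatabaseX TO DISK = 'X:\Backups\DatabaseX_corrupt.bak'

-- see what is broken and what repair level CHECKDB recommends
DBCC CHECKDB ('DatabaseX') WITH NO_INFOMSGS, ALL_ERRORMSGS

-- repairs require single-user mode
ALTER DATABASE DatabaseX SET SINGLE_USER WITH ROLLBACK IMMEDIATE

-- try the non-destructive repair first
DBCC CHECKDB ('DatabaseX', REPAIR_REBUILD)

-- last resort only: this can throw data away to make the database consistent
-- DBCC CHECKDB ('DatabaseX', REPAIR_ALLOW_DATA_LOSS)

ALTER DATABASE DatabaseX SET MULTI_USER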

Hope you never have to deal with corruption. But be prepared if you do. 🙂

Pat

New Position

So I started a new job today. I'm very excited about it and look forward to the many challenges ahead. As I create new procedures to troubleshoot and track the servers I'll be managing, I will post that information up here.

One useful tidbit that I always seem to lose track of is the exact version of SQL I'm dealing with. Sure, I know 8.0 is 2000 and 9.0 is 2005, but what is 8.00.2039? Some out there will say that's SP4, and they would be right, but they've simply memorized it. Since I can't seem to remember anything perfectly, I figured I would once again find the KB article on it.

http://support.microsoft.com/kb/321185

This article will tell you what version of SQL you're using instead of just the build number. Very helpful if you can't always remember which version you're dealing with.
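
If you just want to pull the numbers to look up, a quick query gets you the build, service pack level, and edition (SERVERPROPERTY works on both 2000 and 2005):

SELECT SERVERPROPERTY('ProductVersion') AS Build,        -- e.g. 8.00.2039
       SERVERPROPERTY('ProductLevel')   AS ServicePack,  -- e.g. SP4
       SERVERPROPERTY('Edition')        AS Edition,
       @@VERSION                        AS FullVersionString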

Upcoming User Group Event

I try to support my local user groups as much as possible, the ones I run and the ones I don't. I will be presenting at the SLC IT Pro user group on Tuesday, January 8th. I'll be presenting SQL Server for Systems Administrators. This is good information for those who may not have a DBA around, or who just want to know more about how to properly manage SQL Server. Hopefully we'll get a great turnout. You can find the location and all the info at www.slcitpro.com

A View of Performance Problems

This is a look at what happens for a DBA when the DB Server starts to run slow.

4:00 a.m. Phone starts ringing with High Priority messages.

4:05 a.m. Helpdesk calls to tell you people are complaining of slowness and timeouts happening in applications.

4:10 a.m. Wearily walk down to the computer/office/chair and try not to trip over toys and other dangerous parts of the house, like stairs!

4:15 a.m. Get logged in and start to troubleshoot.

Here are the steps I commonly use.
1. Ask where specifically they are seeing slowness. This gives me an idea of the DB and server I'm dealing with. I also try to pin down times as specifically as possible. While asking for info, if you have change control processes or system change control, ask if anything has changed or if something is running in your operations department.

2. Once you find out this info, I typically use a trace that I always have running. My trace captures anything that runs longer than 2 seconds to a table in the DBA database. By using this I can see all the procedures that are currently running slow. Once I find these procedures, it's usually a matter of knowing my data: I know some procedures take a little longer, while others should have run faster. This also tells me who may be taking more resources than others, since I capture the Reads, Writes, CPU, and Duration (see the query sketch after this list).

3. In parallel with #2 above, or if you don't have traces running, I open Perfmon and start capturing stats from the server. The common counters I grab first are Processor % (just the total is fine) and Current Disk Queue Length for the drives that house the data and log files. If I'm going to troubleshoot longer, I grab other counters such as Page Life Expectancy to check for memory pressure.

4. I also check sp_who2 'Active' to see who is running what and to gauge how much blocking is going on. Recently Adam Machanic posted a great new script on who is doing what, and I love it! Thanks Adam! I'll add the link here if you want a great way to get the information. http://sqlblog.com/blogs/adam_machanic/archive/2007/12/31/a-gift-of-script-for-2008-who-s-active-what-are-they-doing-and-who-is-blocked.aspx

5. I keep checking Perfmon to see if there are spikes, noting the times they happen. If they are consistent on some sort of schedule, find out what jobs are running at that time or on that schedule (every 5 minutes, 10 minutes, etc.).

6. If you have a baseline (which you should), compare how your server normally runs to what it is doing now. This should give you an idea of how bad the problem is.
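
Here's a rough sketch of what steps 2 and 4 look like when I actually run them. The trace table and DBA database names are just my own convention from the trace I keep running:

-- step 2: recent slow statements from the trace table, worst first
SELECT TextData, Duration, Reads, Writes, CPU, StartTime
FROM DBA.dbo.PerformanceTraceTable
WHERE StartTime >= DATEADD(minute, -30, GETDATE())
ORDER BY Duration DESC

-- step 4: a quick look at active sessions and blocking
EXEC sp_who2 'Active'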

All these things should point you in the right direction. This is by no means everything you could do, and I'm sure others could comment on additional steps. These are the quick, common ones that I look into; knowing your data and your DB servers really helps with any performance problem. With performance issues you cannot always produce a smoking gun as to what caused the issue, but you should aim to. How much data you have from the past and how much you have now will determine how close you can get to that smoking gun. Hopefully this gets you started on the right track.

Happy Performance Hunting

That was easy now for the tough part

Ok, so I thought it would be a little harder to set up a blog. But through the power of Google and technology I was done in just a few clicks.

Ok, so about me: I'm a SQL DBA who likes to help out in his community and help others in the DBA and SQL community. You've probably seen me at a PASS Summit if you've had a chance to attend one. I am currently serving on the Board of Directors for PASS, and I volunteer on a few other committees as well.

This blog will mostly be dedicated to SQL Server, but from time to time I'll put in the occasional personal post or talk about my other hobbies (photography, R/C cars, restoring old cars, video editing).

Thanks for Reading