Correlating Data in SQL Trace, Part 2 - in which Johnny learns a valuable lesson

In last week's "Correlating Data in SQL Trace, Part 1", we took a trace to diagnose lock timeouts, and developed a query to correlate lines in the trace: this let us see what code was generating what lock timeouts.

The trick for the query is to use a correlated subquery to find the statement rownumber, thusly (simplified):

SELECT
        [Timeout].* ,
        Statement.*
FROM    Trace AS [Timeout]
        INNER JOIN Trace AS [Statement] ON Statement.spid = [Timeout].spid
               AND Statement.DatabaseID = [Timeout].DatabaseID
               AND statement.rownumber = ( SELECT TOP 1 rownumber
                                  FROM trace
                                  WHERE spid = [Timeout].spid
                                  AND rownumber < [Timeout].rownumber
                                  AND eventclass = 11
                                  ORDER BY rownumber DESC
                                  )
WHERE   [Timeout].EventClass = 189

A little translation: "WHERE EventClass = 189" gives us all our timeouts.  "statement.rownumber = (subquery)" finds the closest (to the timeout) previous RPC:Starting row with the same spid...in other words, the timeout's associated RPC call.

Another use for this same type of correlation query is to find all "RPC starting" with no associated "RPC:complete" in a given timeframe.  Okay, it's a pretty specific need, but it certainly illustrates that this can be a very versatile query.

Instead of correlating timeouts to queries, we want to correlate RPC start with RPC complete, and then display only those RPC:Starting rows with no associated RPC:Complete.

To achieve this, we have to make some changes to our query.

  • For one thing, we need a left outer join; this lets us return rows with NULL matches on the right side (meaning, no RPC:Complete)
  • In the rownumber=(subquery), we change the less than to greater than...before, we found timeouts and then went back in time to find their origin. Here, we're finding RPC:Starting and going forward in time to find RPC:Complete.  
  • We change the subquery's ORDER BY to ASC for the same reason.
  • And of course, change the subquery eventclass = 10, as we're looking to return RPC:Completed.
  • Introducing an outer join to the query adds a new element: the possibility of matching up rows that don't actually go together.  SPIDs get reused...if we keep our current criteria - an RPC:Completed with the same SPID and DBID with a greater rownumber - we're very likely to get false matches.  The way to fix this is to match start times in the subquery.

So our final query looks like this:

SELECT TOP 1
        [RPCStart].rownumber ,
        [RPCStart].objectname ,
        [RPCStart].textdata,
        [RPCStart].loginname ,
        [RPCStart].spid ,
        [RPCStart].starttime ,
        [RPCStart].endtime ,
       
'COMPLETE:  ' AS spacer,
        [RPCComplete].duration ,
        [RPCComplete].rownumber ,
        [RPCComplete].objectname ,
        [RPCComplete].textdata,
        [RPCComplete].loginname ,
        [RPCComplete].spid ,
        [RPCComplete].starttime ,
        [RPCComplete].endtime
FROM    Trace AS [RPCStart]
        LEFT OUTER JOIN Trace AS [RPCComplete]
                  ON [RPCComplete].spid = [RPCStart].spid
                     AND [RPCComplete].DatabaseID = [RPCStart].DatabaseID
                     AND [RPCComplete].rownumber = ( SELECT TOP 1
                     rownumber
                     FROM Trace
                     WHERE spid = [RPCStart].spid
                     AND rownumber > [RPCStart].rownumber
                     AND startTime > [RPCStart]. startTime
                     AND eventclass = 10 -- rpc completed
                     ORDER BY rownumber ASC
                     )
WHERE   [RPCStart].EventClass = 11   -- rpc starting

What have we learned?  That it's always best to tell the truth, little Johnny.  Oh, and with a little careful restructuring, a good query can be recycled for another, equally cool use.

-Jen McCown, who is really posting this at midnight-thirty local time.

http://www.MidnightDBA.com 


Currently rated 1.0 by 1 people

  • Currently 1/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags: , ,
Categories: Admin | SQLServerPedia Syndication | T-SQL

14 Comments
Actions: E-mail | Permalink | Comment RSSRSS comment feed

Correlating Data in SQL Trace, Part 1

I’ve been chasing down a series of issues with a system. One of  the symptoms we’ve been getting is massive lock timeouts, so I’ve been taking (server side) profiler traces to see what code is requesting the locks that are being timed out.  Today I landed on a lovely solution for correlating the lock timeouts I see in the trace file with the associated code.  Let’s talk setup, and then we’ll walk through the thinking that led to our solution.

Setup

Take a server side trace, be sure to get “existing connections”, RPC:Starting, RPC:Completed, and Lock Timeout (Timeout > 0) (see the previous blog “Lock Timeout Revelation on Twitter”).  For the RPCs and Timeouts,  get your basic measures, plus ObjectName and DatabaseID.

Once you’ve captured enough of a trace, save it to a table, preferably on your local box. 

Depending on how much time you’re going to spend digging through this data, you may want to index your trace table.  Here are the indexes I developed for this query:

CREATE INDEX ix_trace ON trace (eventclass, spid, rownumber, databaseID, objectName) INCLUDE (starttime, endTime)
CREATE INDEX ix_trace2 ON trace (spid, eventclass, rownumber, databaseID, objectName)

Developing

Once you save the trace into a table, you'll need to know the event id (event class names aren't displayed). (*See footnote about trace in tables...) 

Trace event 189 is Lock Timeout (timeout > 0).  So when you find a 189, it’s easy to go back up the trace rows until you find the closest one with a corresponding SPID, which will have an eventclass of 11: RPC:Started.  Rough pseudocode:

1.       Find a timeout eventclass = 189

2.       Find the closest row with rownumber < timeout.rownumber and spid = timeout.spid

Part 1 is easy:

SELECT TOP 10
        [Timeout].RowNumber ,
        [Timeout].EventClass ,
        [Timeout].DatabaseID ,
        [Timeout].spid ,
        [Timeout].starttime ,
        [Timeout].endtime ,
FROM    Trace AS [Timeout]       
WHERE   [Timeout].EventClass = 189

Note that I’m limiting the query to the TOP 10 for now, just so I don’t have to look through 10,000 rows at once.

So Part 2: for each timeout, find the closest previous RPC Started with the same spid.  On first look, I started falling into the cursor trap.  You say it to yourself procedurally: “for each timeout” is actually direct pseudocode for a cursor. But don’t go there….just don’t…

What do we know about the timeout and its associated code call?

  • They have the same spid
  • They’re on the same database
  • The statement will have an RPC start (eventclass = 11), but not necessarily an RPC complete (eventclass = 10)
  • The statement will precede the timeout, and therefore have an earlier row number

So we could say that for a timeout, the corresponding statement is

  • the MAX rownum less than the timeout rownum
  • where the spids and databases match,
  • and eventclass=11. 

Well partner, that’s a correlated subquery right there! Pretend for a moment that we only have one lock timeout, SPID = 52, rownumber = 11,304.  So to find the code that goes with it:

SELECT TOP 1 rownumber
      FROM trace
WHERE spid = 52         --Timeout spid
AND rownumber < 11304   -–Timeout rownumber
AND eventclass = 11
ORDER BY rownumber DESC 

That’s our subquery. Let’s plug it into the part 1 query, and correlate the thing! 

SELECT TOP 10
        [Timeout].RowNumber ,
        [Timeout].EventClass ,
        [Timeout].DatabaseID ,
        [Timeout].spid ,
        [Timeout].starttime ,
        [Timeout].endtime ,
        Statement.*
FROM    Trace AS [Timeout]
        INNER JOIN Trace AS [Statement] ON Statement.spid = [Timeout].spid
               AND Statement.spid = [Timeout].spid
               AND Statement.DatabaseID = [Timeout].DatabaseID
               AND statement.rownumber = ( SELECT TOP 1 rownumber
                                  FROM trace
                                  WHERE spid = [Timeout].spid
                                  AND rownumber < [Timeout].rownumber
                                  AND eventclass = 11
                                  ORDER BY rownumber DESC
                                  )
WHERE   [Timeout].EventClass = 189

And finally, let’s do better with our select than just Statement.*:

SELECT TOP 10
        [Timeout].RowNumber ,
        [Timeout].EventClass ,
        [Timeout].DatabaseID ,
        [Timeout].spid ,
        [Timeout].starttime ,
        [Timeout].endtime ,
'STATEMENT:  ' AS spacer,
        Statement.rownumber ,
        Statement.objectname ,
        Statement.textdata,
        Statement.loginname ,
        Statement.spid ,
        Statement.starttime ,
        Statement.endtime
FROM    Trace AS [Timeout]
        INNER JOIN Trace AS [Statement] ON Statement.spid = [Timeout].spid
               AND Statement.spid = [Timeout].spid
               AND Statement.DatabaseID = [Timeout].DatabaseID
               AND statement.rownumber = ( SELECT TOP 1 rownumber
                                  FROM trace
                                  WHERE spid = [Timeout].spid
                                  AND rownumber < [Timeout].rownumber
                                  AND eventclass = 11
                                  ORDER BY rownumber DESC
                                  )
WHERE   [Timeout].EventClass = 189

I like the “STATEMENT” spacer; it’s a nice break between your timeout columns, and your statement columns.

I found another great application for this, but we’ve gone a bit long here. So, come back tomorrow for Correlating Data in SQL Trace, Part 2!

-Jen McCown

http://www.MidnightDBA.com

 

*Footnote: Again, I'll have to come back later with a link, but I also learned today that when you port your trace results into a table, duration is stored in microseconds (1/1,000,000 s), instead of milliseconds (1/1,000).  This seems like it would be pretty good info to have.   Edit: Linky! Here's the Brian Moran article "Duration and CPU Values" that mentions this.


Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags: , ,
Categories: Admin | Dev | Performance | SQLServerPedia Syndication | T-SQL

134 Comments
Actions: E-mail | Permalink | Comment RSSRSS comment feed

Lock Timeout Revelation on Twitter

I was all proud of myself today, and bragged a little bit on Twitter about a query I'd written.  That query gets its own blog here shortly, but first, a worthwhile Twitter conversation with Adam Machanic (@AdamMachanic on Twitter, Adam's blog is on SQLBlog.com) that turned very, very useful:

Me: Okay, that cool query from yesterday that correlates lock timeouts to their queries? Yah, just modified to find... (1/2) 
Me: ...RPC start statements that have no correlated RPC complete statements within a timeframe. Shows what's bunging up the CPU.(2/2) Will blog!

Adam: Why are you hitting lock timeouts on a regular enough basis to need such a query?

Me: BizTalk transactions + UPDLOCK on SELECT

Adam: BizTalk overrides the lock timeout setting?

Me: I don't think so (not sure), but we've got maxed CPU and 2 SPs that run again & again, lotsa waiting & timeouts.

Adam: It sounds like you're referring to "client" timeouts, not "lock" timeouts, if I'm understanding what you're talking about.

Me: In a SS-profiler trace, we have tons of lock timeouts (eventclass 27) showing up. I can associate those w/the SPIDs (1/2)
Me: ...that were holding the locks when they timed out. Tells me what's causing the lock contention, which I suspect is (ok 2/3)
Me: contributing to the CPU maxout. Sound crazy? (3/3)

Adam: Ah, I see what you're doing. I don't think a lock probe is very expensive but I guess if you're getting millions it adds up.

Me: Suppose emphasis should be more on the high CPU ,& longrunning queries that're CAUSING the timeouts, not the timeouts thmselvs

Adam: Is the duration on the events > 0? If not, they're not 'really' timeouts, just an indication of a lock wait.

Me: HOLY MOTHER OF POTATOES! They're all waits...I so didn't know that, thx

Adam: Glad to help :-)

Me: I even RTFM, and BOL doesn't make that clear. I can see now how the article would mean that. But now I hav acheived clarity!!

Adam: Glad to help :-)
Adam: "I was just sitting here, eating my muffin, drinking my coffee, when I had what alcoholics refer to as a moment of clarity."


After I wrote the "HOLY MOTHER OF POTATOES" line, I went and looked up Lock Timeouts in BOL.  And yes, it does mention this, but it's terribly cryptic to the untrained eye (like mine):

Lock:Timeout events with a duration of 0 are commonly the result of internal lock probes and are not necessarily an indication of a problem. The Lock:Timeout (timeout > 0) event can be used to ignore time-outs with a duration of 0.

Yeah, I read that two years ago, and disregarded it as making no sense. 

So there's our tip for the day! Lock Timouts with duration = 0 aren't really timeouts.  And, it's a good segue into the next blog, so...

-Jennifer McCown

http://www.MidnightDBA.com


Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags: , ,
Categories: Admin | Performance | SQLPASS | SQLServerPedia Syndication | T-SQL

45 Comments
Actions: E-mail | Permalink | Comment RSSRSS comment feed

Find Deprecated SQL Code Using Profiler

I stumbled across something really cool when I was reading through the BOL entry on sp_trace_setevent*, which gives a list of all events and columns available to trace.  This is, by the way, an extremely useful entry to have bookmarked.

There are two events I hadn't really noticed before: number 39, Deprecated, and  125, Deprecation Announcement.  "Deprecated" clearly tells you when a deprecated (that means phased out, kids) feature is used on the instance.  And Deprecation Announcement "Occurs when you use a feature that will be removed from a future version of SQL Server."

That is SO insanely useful, I might throw a deprecation party. We'd all have to drink TAB and wear really old clothes, or something.  (Get it? Because they've gone out of style? Ha! I made a funny!)  

When you begin to look at upgrading SQL Server, this is a great way to start to track down code that will break after the upgrade.  Set up a small server side trace, with just the Deprecated, Deprecation Announcement, and Existing Connections events, and let it run for a while - maybe a day, or a week, depending on how fast the trace file grows.  Then start your lists of code to update.

* By the way, you use sp_trace_setevent when you set up a server side trace.  Definitely see my SQLServerPedia article on server side trace, and the accompanying video on MidnightDBA.com.

-Jen McCown

http://www.MidnighDBA.com


Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags: ,
Categories: Admin | SQLServerPedia Syndication | T-SQL

112 Comments
Actions: E-mail | Permalink | Comment RSSRSS comment feed

Highly Amusing Index Frag Anecdote

I can't believe I haven't told this story from SQLPASS yet... and by the way, here's Paul Randal's blog on this same event/topic.

One of the few sessions I managed to attend was a really cool Grant Fritchey (blog, twitter) session on DMVs.  He was talking about the DMV that includes index fragmentation data (sys.dm_db_index_usage_stats) and mentioned briefly that below a certain page level, he doesn't bother defragmenting.  During the QnA (and Tweeting all the while) I asked Grant, "For you, how many pages does a table need to have before you start worrying about the fragmentation level?" 

He answered that his lower cutoff is in the single digits, because defragmenting a table that small will make no difference at all.  And "my [upper] cutoff is at about 100 pages, but Microsoft says 1,000 pages. But even a 30 page table can wreak havoc on a query if highly fragmented." 

Paul Randal (blog, twitter) was very active on Twitter during the summit, and saw the tweets. He replied immediately: "#sqlpass 1000 is a number I made up when back at MS, but it only applies if the table is in memory. Otherwise, sure, defrag."  I got to read that out while we were still in Grant's session, and it got a big laugh.

-Jen McCown

http://www.MidnightDBA.com


Currently rated 5.0 by 1 people

  • Currently 5/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags: , , ,
Categories: Admin | Dev | SQLPASS | SQLServerPedia Syndication

80 Comments
Actions: E-mail | Permalink | Comment RSSRSS comment feed

Restore Database WITH MOVE - A Review

Content rating: Beginner/review material.

The most common database restore I do is the WITH MOVE format.  I get DB backups from friends, book or presenter sites, and restore them on my box to play with.  The RESTORE syntax WITH MOVE, from BOL, is:

RESTORE DATABASE TestDB
FROM DISK = 'C:\AdventureWorks.bak'
WITH MOVE 'AdventureWorks_Data' TO 'C:\testdb.mdf',
MOVE 'AdventureWorks_Log' TO 'C:\testdb.ldf'
GO

See, database backups store information about the database they came from...namely, the logical file names, the source file locations, and so on.  When I restore a backup on my box, I'm very likely going to restore to a different filepath - I want to save my files in a folder on C:\, not F:\ (I don't even HAVE an F drive).  

Let's take a look.  First, locate the backup file itself - in this case, Brent Ozar's Twittercache database - and get the full filepath (e.g., is it saved in "c:\temp"?).  Then you can run this in SSMS:

RESTORE FILELISTONLY FROM DISK='C:\temp\TwitterCache.bak'

That'll get you the backup information:

TwitterCache F:\MSSQL\DATA\TwitterCache.mdf D PRIMARY 134217728 35184372080640 1 0 0 6DC5C93F-811F-40D3-9C8F-56692E098701 0 0 57147392 512 1 NULL 26000000004400037 B9297268-D556-44FE-9940-13A7B8550289 0 1
TwitterCache_log F:\MSSQL\DATA\TwitterCache_log.ldf L NULL 201138176 2199023255552 2 0 0 B5C0DCCF-9EBA-4A62-8B8A-6B09844B953C 0 0 0 512 0 NULL 0 00000000-0000-0000-0000-000000000000 0 1

All you really need are the LogicalNames. In this example, we get these logical names:

TwitterCache
TwitterCache_log

Now build your restore statement, using the Logical Names (in bold) and the filepath to your SQL data and log folder(s):

RESTORE DATABASE TwitterCache
FROM DISK='C:\temp\TwitterCache.bak'
WITH MOVE 'TwitterCache' TO 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\TwitterCache.mdf',
MOVE 'TwitterCache_log' TO 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\TwitterCache_log.ldf' 

The easiest part to mess up is the comma between the MOVE statements, so don't miss it.

The end.
-Jen McCown
http://www.MidnightDBA.com


Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags: ,
Categories: Admin | Dev | SQLServerPedia Syndication | T-SQL

88 Comments
Actions: E-mail | Permalink | Comment RSSRSS comment feed

Quick Reference: Job Steps

There are other ways to do this, but the quick n easy way for me to get job and job step information is

  1. Get job names using

    EXEC msdb.dbo.sp_help_job

  2. Paste those names and surround them with single quotes, and preface each with

    EXEC msdb.dbo.sp_help_jobstep @job_name= '

  3. Hit CTRL-T for "results to text". (This makes it easy to browse and search, if a little less readable than the graphics pane.)
  4. Run, and browse the results.

 It's a very fast, very easy to remember way to search through job steps for certain code, or what have you.

Bonne beurre,

Jen McCown, http://www.MidnightDBA.com


Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags: , ,
Categories: Admin | Dev | SQLServerPedia Syndication | T-SQL

40 Comments
Actions: E-mail | Permalink | Comment RSSRSS comment feed

AWE and 64-bit SQL Server?

I had a guy at work tell me that he'd read about turning AWE on for 64bit SQL.  That sounded TOTALLY wrong to me, so I checked BOL, which says "AWE is not required and cannot be configured on 64-bit operating systems."  I got all self-satisfied, but then I thought, I'd better check with the internetz.

Thanks to Denny Cherry (@mrdenny on Twitter) for pointing me to a reputable resource: Bob Ward's most excellent CSS SQL Server Engineers article "Fun with Locked Pages, AWE, Task Manager, and the Working Set…". This in essence says that BOL is right in one sense: the sp_configure option for AWE makes absolutely no difference for 64-bit SQL. But you can get the effect of "locking pages" (not to be confused with page locks) by using the 'lock pages in memory' privilege set. (See also Bob's accompanying article Do I have to assign the Lock Pages in Memory privilege for Local System?). 

Edit: Here's a blog by Slava Oks that also addresses 64bit and AWE (thanks for the link goes to Mladen Prajdic)

Bottom line: AWE has NO effect on 64 bit systems! (And, you set "lock pages in memory" independently, if that's the behavior you're looking for.)

-Jen McCown

http://www.midnightdba.com/


Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags: , , , ,
Categories: Admin | SQLPASS | SQLServerPedia Syndication | Windows

80 Comments
Actions: E-mail | Permalink | Comment RSSRSS comment feed

Free Stuff Friday!

Free is greatness.  Please to enjoy:

North Texas User Group Next Week!

North Texans (and anyone in earshot): Free training, networking, and SQL community ties at the North Texas SQL Server User Group.  Next meeting is in a week, Thursday 9/17.  Topic this month is "Bad SQL - Why Does This "Perfectly Good" T-SQL Run So Slow?"

Oops...

Ok, sorry folks...it looks like this wasn't a legit free offering.  So sorry all, and especially to the authors of said book.  That's what I get for being naiive....

More Free Goodness!

-Jen McCown, http://www.MidnightDBA.com


Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags: , , , , , , ,
Categories: Admin | Dev | SQLServerPedia Syndication | T-SQL | Windows

93 Comments
Actions: E-mail | Permalink | Comment RSSRSS comment feed

New Vids: SQL password vulnerability, CASE, grouping, DBAs@, and more

We were busy little bees over the Labor Day holiday (or, for our European readers, "houliday").  All of these are great, of course, but I'm going to steer you DIRECTLY to Sean's new video on Recovering SQL Passwords...do not pass anywhere or collect anything, just watch it!

  • Recover SQL Passwords - Did you know that SQL Server stores username/password combos in cleartext in memory?  And did you know that you can also use a debugger to sniff the memory to retrieve those passwords?  Here I show you how to attach to an instance of SQL Server with a debugger and get a password you need to retrieve.  Thanks again to Sentrigo for bringing this to all of our attention.  12 minutes
  • T-SQL: CASE Statement -  Here’s an intro to case statements, in the form of a conversation with Sean.  I cover simple case and searched case, and the special use of null.  BTW, I really like this video, and I don’t say that very often. 12 minutes.
  • Grouping Basics: Level Zero - Here’s a ground level intro to grouping, with a lovely 80s arcade theme.  See the next blog post for the written companion to this video. 10 minutes.
  • Dirty Tricks:  List from CMD - Here’s one of my dirty tricks: getting a list of folders or files from a command prompt, and formatting it for use in code using  Word. This is the extended remix version of a Midnight Snack. 7 minutes.
  • Troubleshooting Simple Flatfile Load Errors - In this video I put together a small SSIS flatfile-to-SQL package, and troubleshoot a couple of errors.  11 minutes.
  • Who Should Manage Backups - SORRY GUYS!!! This was about a 45min video and a glitch with the computer lost the first part of it so you only get the last 19mins.  We thought about recording it again, but we honestly don't remember what we said so I made the command decision to just post it as-is. 19 min.

 -Jen McCown, http://www.MidnightDBA.com


Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5
Tags: , , , , , , , ,
Categories: Admin | Dev | Security | SQLServerPedia Syndication | SSIS | T-SQL

70 Comments
Actions: E-mail | Permalink | Comment RSSRSS comment feed