• Coming Right Up

    February 5, 2011
    I have submitted sessions to SQLSaturday #60, Cleveland.

    February 26, 2011
    I have submitted sessions to SQLSaturday #65, Vancouver, BC.
    This is right before the MVP Summit and is going to be one action packed SQLSat!

    March 15-16, 2011
    I am submitting sessions to #24HOP / aka 24 Hours of PASS. If you'd like to see a session, vote for me!

    April 9, 2011
    I am submitting sessions to SQLSaturday #68, Olympia, WA

    May 11-13, 2011
    I have submitted sessions for the inaugural SQL Rally in Orlando, Florida. Voting for sessions will go out soon.

  • Oh Nos! It’s Over

    December 4, 2010
    I presented "Big and Tall: Introduction to Table Partitioning" at SQL Saturday #61 in Washington DC / Northern Virginia.

    SQLPASS: November 2010
    Quiz Bowl
    My SQLPASS Quiz Bowl team won this year! Props to the raccoon hand puppet.

    Chalk Talk
    I gave a Chalk Talk at SQLPASS on Agile Database Operations.

    Lightning Talk
    I gave a lightning talk advocating peer-review based change management instead of approval based systems.

    September 18, 2010
    I presented sessions on the Data Collector and about Administering Databases in an Agile Environment at SQLSaturday #50 in East Iowa.

    August 21, 2010
    I gave a talk about the Data Collector at SQLSaturday #51 in Nashville, TN.

    August 12, 2010
    I presented to the Columbus SQLPASS User Group on how to be a DBA working in an Agile development environment.

Date Rounding Tactics and the Tiny Devil of SMALLDATETIME

Happy New Year! With every new year I think a little bit about time and dates, so let’s look at that in TSQL.

Rounding Dates: Which way is best?

Sometimes in TSQL you need to round a datetime value to the precision of either a day, hour, minute, or second.

I realized recently that I have a few ways I know how to do this, but I wasn’t sure which was the most efficient.

I did a little searching and didn’t find anything super conclusive. I had a little chat with Jeremiah Peschka (blog | twitter) and he told me which way he thought was fastest and why.

And so I decided to run some tests. Jeremiah has a way of being right about these things, but I had to see for myself.

I’ll go ahead and tell you: He was totally right, and I’ll show you why. But I learned a couple things along the way.

Reference: Types and Storage Sizes

To get started, let’s review some of our friends, the SQL Server datatypes. Hi friends!

DataType Storage Size Note
DateTime 8 bytes Rounded to increments of .000, .003, or .007 seconds
DateTime2() 6 bytes for precisions less than 3; 7 bytes for precisions 3 and 4. All other precisions require 8 bytes. 0 to 7 digits, with an accuracy of 100ns. The default precision is 7 digits.
Date 3 bytes 1, 3-byte integer stores date.
SmallDateTime 4 bytes Be very careful! Values of 29.999 seconds or more are rounded up to the nearest minute.
BigInt 8 bytes I’ll explain why this is here later. :)

Rounding to the Day

The most frequent case in which I need to round dates is to the day level. So instead of ’1/4/2011 6:15:03.393921′, I want just ’1/4/2011′.

SQL 2008′s date type made this a lot easier for everyone– now we can just cast a datetime or datetime2 value as a date, and we’ve got what we need. PLUS, our new value is nice and small, weighing in at 3 bytes.

I think most everyone agrees, we like this!

SELECT CAST('1/1/2010 23:59:59.000' AS DATE) AS [I'm a date!]

Rounding to Hours, Minutes, or Seconds:
Beware the tiny devil of SMALLDATETIME

This is still a bit more complicated. When you start thinking about these and different datatypes, you need to make sure you understand what you mean by rounding.

In SQL Server, our datatypes actually have some different opinions about what rounding means. Check this out:

SELECT
    CAST('1/1/2010 23:59:59.000' AS DATETIME) AS [I'm a DATETIME!],
    CAST('1/1/2010 23:59:59.000' AS DATETIME2(0))  AS [I'm a DATETIME2(0)!'],
    CAST('1/1/2010 23:59:59.000' AS SMALLDATETIME) AS [I'm a SMALLDATETIME, and I'm very confused.],
    CAST('1/1/2010 23:59:59.000' AS DATE) AS [I'm a DATE!]

This returns:

The SMALLDATETIME value rounds this up to January 2nd, instead of January 1. The Date datatype does not.

In considering whether or not to use SMALLDATETIME, you need to learn and establish whether or not to round up for minutes and date values. With a different example, if something occurred at 12:30:31 AM, would that be represented as having  happened in the 12:30 minute, or at 12:31?

Most of us actually want to round down. We want the largest minute number which is less than or equal to the datetime value. This is similar to what FLOOR does for integers. You could also call this truncating the portion of the datetime value you don’t want.  This is not, however, what SMALLDATETIME gives you, so use it with care.

So this is what I’m saying:

Like, seriously, SMALLDATETIME: you are SO messed up.

Comparing Methods of Rounding Dates

So given that warning, let’s actually round some date values, and let’s compare the efficiency of each method.

To start out with, let’s create a table and toss in a bunch of date values. We’ll run queries against these dates and measure SQL Server’s abilities to work with it.

To make up a bunch of datetime data, I’m using my trusty recursive CTE from my prior post.


--Populate a table with some data
CREATE TABLE dbo.Donuts ( DonutTime DATETIME2(7) )

DECLARE
    @startDate DATETIME2(7)= '2010-12-01 00:00:00' ,
    @endDate DATETIME2(7)= '2010-12-11 01:30:00' ;

WITH    MyCTE
          AS ( SELECT
                @startDate AS [Makin' the Donuts]
               UNION ALL
               SELECT
                DATEADD(ms, 1225, [Makin' the Donuts])
               FROM
                MyCTE
               WHERE
                [Makin' the Donuts] < @endDate )
    INSERT  dbo.Donuts
            SELECT
                [Makin' the Donuts]
            FROM
                MyCTE
    OPTION
                ( MAXRECURSION 0 ) ;

SELECT @@ROWCOUNT
--We now have 709716 rows of DonutTime

Now let’s look at different methods to manipulate datevalues. For our examples I’ll be rounding to the minute.

Contestant 1 -
DATEPART: isolate each part of the date, then concatenate

As we learn TSQL, this is the first method that occurs to us. We know DATEPART will return part of a date (great name!), so we can chop apart the bits. However, to get them back together properly we have to turn each part into a string to clue them back together. And then if we want to treat it like a date (which we pretty much always do), we have to cast it back.

Just look at this baby. It’s pretty ugly.

SELECT
    CAST(CAST(DATEPART(YY, DonutTime) AS CHAR(4)) + '-' + CAST(DATEPART(MM, DonutTime) AS NVARCHAR(2)) + '-'
    + CAST(DATEPART(DD, DonutTime) AS NVARCHAR(2)) + '  ' + CAST(DATEPART(hh, DonutTime) AS NVARCHAR(2)) + ':'
    + CAST(DATEPART(mi, DonutTime) AS NVARCHAR(2)) + ':00.000' AS DATETIME2(0)) AS [Wow, that was a lot of typing.]
FROM
    dbo.Donuts

Running this (after cleaning out buffers), I got these results:

Contestant 2 -
Subtracting what you don’t want

There’s a couple of variations on contestant #2. I’ll take the one I like best, which is casting to a smaller byte size by using DATETIME2(0), which is 6 bytes rather than 8 and effectively truncates to the second. Then I’ll subtract the seconds.

SELECT
    DATEADD(ss, -DATEPART(ss, DonutTime), CAST (DonutTime AS DATETIME2(0)))
FROM
    dbo.Donuts

Running this one (yes, I cleaned out the buffers), I got these results:

Well now, that’s much lower CPU time there.

NB: I did test, and in all my trials it was lower CPU time to cast into DATETIME2 rather than using a nested DATEADD function to subtract milliseconds.

Contestant 3-
Convert to a shorter character string, then back to date

This contestant is near and dear to my heart. I like it because it’s easy for me to remember. You take a short trip into CHAR() with the 121 date format and set the length to chop off the parts of the date you don’t want. Then you cast or convert back to a DATETIME2(0).

I think I like this one because it feels just a little bit violent. But not in a bad way. It’s like roller derby.

SELECT
    CAST(CONVERT(CHAR(16), DonutTime, 121) AS DATETIME2(0))
FROM
    dbo.Donuts

Oh, sad. This one didn’t do very well. It’s definitely better than Contestant #1, at least.

Contestant 4-
Use DATEADD to calculate the minutes since a given date, then add them back

Here’s the method Jeremiah suggested to me. The way he described it was “Just figure out the number of minutes since the beginning of time, and use that.”

Being a philosophy major, I of course asked “So, when was the beginning of time?”

Being a developer, he answered, “Just call it zero.”

SELECT
    DATEADD(mi, DATEDIFF(mi, 0, CAST(DonutTime AS DATETIME2(0))), 0)
FROM
    dbo.Donuts

Here are the results (clean buffers, as usual):

Ooo, check out the CPU time on that one.

Note: I ran a few trials and this is faster on the CPU when you cast as DATETIME2(0) before doing your maths. I did that to make all things equal with the other contestants, who had the same benefit.

Who Won, and Why

Here’s a recap of how everyone performed:

Contestant CPU Time (ms) Notes
Contestant 1-
DatePart o’ Rama
1872 I’m really glad this didn’t win. Because it’s awful to type. I almost wanted to skip it, it’s so unpleasant.
Contestant 2-
Subtract What You Don’t Want
375 An admirable CPU time! And altogether a lovely contestant. Check how this performs for your specific task.
Contestant 3-
Convert to CHAR and back
905 Sad, I really like this one. But although this dog will hunt, it doesn’t seem to do it all that well.
Contestant 4-
DATEDIFF from the beginning of time, as expressed in DATETIME2(0)
172 Well now, that’s impressive.

Why did contestants 2 and 4 do so well?

Jeremiah pointed out that datetime values are stored internally as two four byte integers. (BOL reference: see “Remarks”) Performing mathematic functions on an integer value is a nice fast activity on the CPU.

Performing conversions back and forth to character based datatypes, however, is not so natural, nor so fast.

What’s the internal storage format of DateTime2?  Well, I’m not sure about that one. BOL isn’t so up-front about these things anymore. If you happen to know, please tell me in the comments. I can tell, however, that it’s something that enjoys mathematics.

2010 liner notes and my theme for 2011

The MorningNews asked the following question recently:

Who you would recognize in your 2010 liner notes?

To all the people in your lives (maybe you know them, maybe you don’t) who deserve shout-outs, a la the album’s notes, the book’s acknowledgments, the piece’s title, the award’s dedication. Who would you include in your 2010 acceptance speech? Maybe your fifth grade teacher, the buddy who got you through a brutal summer, the musician who had you walking tall during a particular slog.

My grandparents

A lot changed in 2010

I don’t write much about my personal life on this blog, but in short: I changed jobs, moved house, and I’m in a new relationship. My last grandparent passed away. I remembered that life is finite.

2010 – The liner notes

To My Mamaw:

You not only taught me how to draw, you taught me how to love it, and how to be patient and thoughtful when it doesn’t turn out exactly like I planned. When I create art, I’ll always think of you. You also showed me what it is to love someone more than anything else, and how to live gracefully with loss. I love that about you.

Me with Crys in West Palm Beach

To Crys Manson ( b / t ):

Thanks for convincing me to do things which I fear will make me:
a) exhausted, b) terrified, c) nauseous d) all of the above.
You’re a fantastic friend a the BEST DBA co-pilot ever. You make me a better person.
I’ll fly into West Palm Beach with  you anytime.

To Organizers of SQL Saturday Nashville, Iowa City, and Washington DC ( b ):

Thanks for putting on a great show, and for the speaking opportunities. It’s been crazy good. I’ve loved the chance to travel to places I’ve never been and meet all sorts of smart people. Thanks for letting me figure out how to get up and talk in front of a group of strangers, and for letting me learn from other people’s presentations.

To Brent Ozar ( b / t ) and Tim Ford ( b / t ):

SQL Clown Car

Thanks for not just throwing an awesome party, but for really making friends.
I still owe you guys a ride to Tacoma.

To Richard Kim:

Thanks for believing in me so much, and for understanding me better than I know myself, sometimes.

To Len Cozza and Richard Fried:

Thanks for the honest, thoughtful, sane conversations, in a crazy place and time.

To Dev Nambi ( t ):

I love how passionate and good you are at what you do. I’m sure I’ll still think of calling you if I have crazy new performance issues to talk about. Except that’s probably not really appropriate now.

To My Former Employer, Whom I Have Not Explicitly Named Herein:

Thanks for bringing me in touch with smart and interesting people from so many different places. I don’t really miss you because, let’s face it, we still spend a lot of time together in so many ways, and that’s a good thing.

To My New Employer:

We’re going to do some really awesome stuff.
And way to go for having the gym onsite and offering free physical training sessions. Seriously awesome.

To Carl, the Giant Rabbit ( t ):

Carl, who wants nothing to do with ramps.

Thanks for all the headbutts, and for dancing on your hind feet in greedy mango-lust.
Sorry about those litter box shavings you didn’t like. I won’t do that EVER again.

To Kevin Kline ( b / t ):

I’m not sure if you’re aware that I owe you an ice cream cone.

To Jeremiah Peschka ( b / t ):

Good choice asking me to be on your SQL PASS Quiz Bowl team!
Thanks for helping me remember the books I love to read, and for helping me think about things differently sometimes. For getting me to listen to new music, and teaching me to talk in dinosaur. And for creating your own datatype. Let’s go get some tacos and I’ll tell you the rest.

2011: Theme for next year

My theme for next year comes from a blog post by Penelope Trunk from 2007:

Here’s some practical advice: Do not what you love; do what you are. …

Relationships make your life great, not jobs.  But a job can ruin your life – make you feel out of control in terms of your time or your ability to accomplish goals –  but no job will make your life complete. It’s a myth mostly propagated by people who tell you to do what you love. Doing what you love will make you feel fulfilled. But you don’t need to get paid for it.

A conversation about cheesecake and President Clinton

In many ways, I’m lucky, because I love nerding out on SQL Server and talking about it. But I don’t love all the responsibilities and trappings of work. This next year I want to devote to respecting my job and being a great contributor (and my new job is interesting and exciting), but also separating out and making plenty of time for what I love.

In 2011 I want to become a better speaker, because I love doing it. I want to develop more presentations and think through creating training materials around RDBMses, because I enjoy it.  I’ll probably learn plenty of things in doing this that make me better at  my job, and that’s icing.

Which leads me to my most important goal for 2011: take time to have fun with people I love outside of work. Schedule it. Stick with it. Do it unless the world is on fire. And if the world’s truly on fire, take the time back within a week.

I’m going to learn to be a better partner, to make plenty of time for zombie movies, and to sometimes think nothing at all about SQL.

Why I’m All For 24HOP Showcasing 24 Women Speakers

This is not the HOP you are looking for.

I’ve been thinking about the upcoming 24HOP event planned for March 15 and 16 which will showcase 24 women speakers. Karen Lopez (post | twitter), Jenn Stirrup (post | twitter), and Jen McCown (post | twitter) have all written posts about it. Kalen Delaney ( blog|twitter) has left some comments with her views, which I’ve also appreciated.

If you’re not sure what 24HOP is, prepare yourself for an acronym within an acronym: 24HOP = 24 Hours of PASS. PASS= the Professional Association for SQL Server. This is an online event featuring 24 one-hour sessions on all sorts of nerdy relational database topics.

This will be the fourth 24HOP event, I believe– there have been two general-topic English speaking events, and one Spanish and Portuguese language 24HOP event (LATAM HOP).

Some people think it’s a bad idea to have a woman-only 24HOP. And there are some thoughtful comments on why that might be the case.

However, I think it’s a very smart idea.

Think about this from the perspective of an editor who is putting together a collection of great fiction. You have a lot of choices when it comes to your book– you can choose items by theme, by time period, by genre, or by some factor of the author– their culture, their gender, their age.

A way to make a collection compelling is by a subtle thread. You want a variety of topics, and you want things in your collection to be different and interesting. There are many good types of way to build a collection, and one of those ways is by gender.

So if you’re a person who’s put on a few 24HOP events, do you want to put on something that’s very much like the one you did before? With maybe a little bit of new feature content?

Wouldn’t it be way more exciting, and a bit risky, to come up with a challenging idea that creates an interesting and different collection of speakers? It might spur your crowd of presenters to be creative and invested in the event more than a more general speaking engagement.

Well, I think so, at least. Regardless of the gender issue itself, I think this is a smart theme to build new content for the community.

High five, Tom LaRock.

Do I Want to Present?

I do!

I submitted a session. The title is “No More Bad Dates: Using Temporal Data Wisely”

(I am actually really excited about the topic. Yeah, dates and times— all sorts of weird fun to be had.)

Would I have Submitted a Session if it Wasn’t Women-Only?

Nope.

Here’s why: SQL Rally is coming up, and I’ve submitted two sessions for that.  SQL Rally and 24HOP are going to both do selections by popular vote.

And popularity contests are, frankly, terrifying. I almost didn’t submit to Rally altogether because of the risks of being voted Least Likely to Present.  But I talked myself into submitting two sessions because I really love presenting, and I know the SQL community down in Florida is super passionate and interesting.

So if this was a general-topic 24HOP, but with sessions now being selected by public vote, I would not have submitted. I’m concerned that by being up for votes on two bills, I’ll hurt my chances on whichever is voted on second, even though I’ve submitted different abstracts. (People may feel like they voted for me before, so they want to give their vote for someone else on the other to be more fair.) Plus, it’s frankly twice the anxiety. Did I mention that popularity contests are terrifying?

But I do think this 24HOP idea is exciting and a little risky, and I would like to do what I can to try to make it great, if only by submitting. So I decided to submit a session, also.

24Hop: Hat Edition

Even if I’m not selected, I want to see how this 24HOP turns out. Maybe it’ll be remarkably like the ones that came before. Maybe there’ll be just something a little different that you can’t put your finger on. Maybe it’ll be really unique.

So, Should We Have 24 HOP Showcasing 24 Men Speakers?

Well, I actually thought that would be a good idea. Until I realized that we’ve already had 24HOP featuring 23 men speakers. I think that means the theme wouldn’t generate much but confusion. It would be all sorts of “So… what are we supposed to be doing?”

But there are other things to group by. New speakers. New presentations only. Mythbusters 24HOP. Present about your favorite feature. Maybe it’s the whole event, maybe each day has a different theme, but I like the idea of a focus. You could do a 24HOP like those playwriting events where a whole production is created in 24 hours.

However this goes, I hope we continue to have experimentation with 24HOP. I’d really like for it to continue on in an exciting direction where it doesn’t quite do what you expect– otherwise I think it’ll get to be SQL PASS Express Edition, with Auto Close enabled.

Filling in Data Potholes with Recursive CTEs

Data: it can break your foot.

Imagine that you are writing a script that looks at data grouped by the minute. You notice that there are no rows for some minutes, and you’d like to display a value when that is the case, probably showing a count of zero.

In thinking about this problem this week, I spent some time getting to know CTEs (Common Table Expressions) again. And I came to the conclusion that I should spend much more time with them. Maybe I won’t end up using them all the time, but I should be looking at them regularly as options when I’m writing queries.

Here’s the story of a handy way I found to work with this.

Let’s create some data

Our story starts with some data. It’s been lovingly scripted out, but it has a few holes.

CREATE TABLE dbo.MyImperfectData (
    ItemDate DATETIME2(0) ,
    ItemCount SMALLINT )
GO

INSERT  dbo.MyImperfectData ( ItemDate, ItemCount )
VALUES  ( '2010-12-01 00:00:00', 12 ),
        ( '2010-12-01 00:01:00', 3 ),
        ( '2010-12-01 00:02:00', 6 ),
        ( '2010-12-01 00:03:00', 12 ),
        ( '2010-12-01 00:04:00', 24 ),
        ( '2010-12-01 00:05:00', 1 ),
		-- Gap where 6 would be
        ( '2010-12-01 00:07:00', 122 ),
        ( '2010-12-01 00:08:00', 1 ),
        ( '2010-12-01 00:09:00', 1244 ),
        ( '2010-12-01 00:10:00', 23 ),
        ( '2010-12-01 00:11:00', 12 ),
        ( '2010-12-01 00:12:00', 24 ),
        ( '2010-12-01 00:13:00', 27 ),
        ( '2010-12-01 00:14:00', 28 ),
		--Gap where 15, 16, 17 would be
        ( '2010-12-01 00:18:00', 34 ),
        ( '2010-12-01 00:19:00', 93 ),
        ( '2010-12-01 00:20:00', 33 ),
        ( '2010-12-01 00:21:00', 65 ),
        ( '2010-12-01 00:22:00', 7 ),
        ( '2010-12-01 00:23:00', 5 ),
		--Gap where 24 would be
        ( '2010-12-01 00:25:00', 4 ),
        ( '2010-12-01 00:26:00', 6 ),
        ( '2010-12-01 00:27:00', 7 ),
        ( '2010-12-01 00:28:00', 77 ),
        ( '2010-12-01 00:29:00', 94 )

CREATE UNIQUE CLUSTERED INDEX cxMyCTE ON dbo.MyImperfectData(ItemDate)

The data is at the minute level. We’re missing data for five minutes in this period– one three minute chunk, and two other minutes.

What’s the quickest way to show the missing rows?

At first I thought about querying the data itself to find what’s missing. This made my head hurt a bit, and seemed pretty expensive.

I thought about the fact that many data warehouse databases have calendar tables, where all sorts of information about months, days, years, hours, and minutes are normalized out into tables.

However, I didn’t have those types of tables around. For the scope of my problem I was dealing with short date ranges (<b>and by short, I mean 3 hours)</b> , and ideally I would not need to create a bunch of ancillary objects to fill in the gaps.

After some thinking, I realized that we can create a date time table at the minute level on the fly by using a recursive CTE.

Here’s a sample that counts out a few minutes:

WITH    MyCTE
          AS ( SELECT   CAST('2010-12-01 00:00:00' AS DATETIME2(0)) AS [I can count!]
               UNION ALL
               SELECT   DATEADD(mi, 1, [I can count!])
               FROM     MyCTE
               WHERE    [I can count!] < DATEADD(mi, -1,
                                            CAST('2010-12-01 00:10:00' AS DATETIME2(0))) )
    SELECT  [I can count!]
    FROM    MyCTE
OPTION  ( MAXRECURSION 0 ) ;

Our results:

Putting it all together

Taking the format of this CTE, we can change it to create a table with every minute in our time range.

We can then select from it and use a LEFT OUTER JOIN to our table with data, and use the CTE dates to fill in the gaps.


DECLARE @startDate DATETIME2(0) ,
    @endDate DATETIME2(0) ;

SELECT  @startdate = MIN(ItemDate), @endDate = MAX(ItemDate)
FROM    dbo.MyImperfectData ;

WITH    MyCTE
          AS ( SELECT   @startDate AS MyCTEDate
               UNION ALL
               SELECT   DATEADD(mi, 1, MyCTEDate)
               FROM     MyCTE
               WHERE    MyCTEDate < DATEADD(mi, -1, @endDate) )
    SELECT  MyCTEDate, CASE WHEN Itemcount IS NULL THEN '[Missing Row]'
                            ELSE ''
                       END AS ColumnDescription,
            COALESCE(ItemCount, 0) AS ItemCount
    FROM    MyCTE
            LEFT OUTER JOIN dbo.MyImperfectData ld
                ON MyCTE.MyCTEDate = ld.ItemDate
    ORDER BY MyCTEDate
OPTION  ( MAXRECURSION 0 ) ;

And there we have it! No gaps:

No gaps allowed.

Use Cases

Check out the comments! In my initial posting, I didn’t say enough about where this is best applied, and how this scales.

I think this is mostly a party trick, but it’s also a nice simple example of recursion that got me thinking about CTEs.  And while there are some situations where it can come in useful, it doesn’t scale up to large date ranges. (Check out Brad Schulz’ post on recursive CTEs here.)

So in other words, this may be helpful in some ad-hoc situations.

However, looking at the “pseudo-recursive” parts of Brad’s post, I really feel a follow-up post or two coming on.

The 9th Day of SQL: Things Aren’t as Simple as They Seem

The 12 days of SQL

Brent Ozar (blogtwitter) had an idea: a group of people should blog about writing which they’ve loved this year by people in the SQL community. For each “day of SQL,” someone picks a blog which they thought was great and writes about it.

Yesterday was Day 8, when Karen Lopez talked about a post by Louis Davidson and asked “What is your over/under?”  Karen is a great speaker, an inspiring writer, and just an incredibly interesting person. Check out her post!

On the 9th Day of SQL the engine gave to me: Something a little different than I expected.

Day 9: The Day of Paul White

This day of SQL is not about nine ladies dancing. (Sorry Karen!) Instead, it’s devoted to one New Zealander writing: his name is Paul White (blogtwitter).

First off, let me say that Paul White’s blog, “Page Free Space,” is just plain awesome. When I see Paul’s written a new post I know to allocate some time for it and read it through slowly, and that I should expect to have to think about what I’m reading to understand it.

I swear I can sometimes feel things moving around in my head when I read Paul’s posts. Apply the warning about overhead bins during flight: be careful, contents may shift while you’re reading Paul White’s blog.

So What’s My Favorite Post of the Year?

I picked Paul’s post, The Case of the Missing Shared Locks.

There’s a lot to love about this post. It is a great demonstration that things aren’t as simple as they seem.

Paul starts the post with the question:

If I hold an exclusive lock on a row, can another transaction running at the default read committed isolation level read it?

The answer to that would seem fairly straightforward. But in fact, things are pretty complicated. However, if you go through it slowly and really look at the examples, it can help you understand a lot about locking.

This is good.

Why is it Good that Things Are So Complicated? It’s CONFUSING.

Have you ever said something along these lines? “I’d like to give a presentation sometime, but I don’t have anything to talk about.”

Or, “I’m not sure that I have anything that interesting to fill a whole hour.”

Well, take a look at Paul’s post. He took something small, and he looked very closely at it. He played with it a couple of different ways, and he worked on it to see how it behaved. He stepped through it in a series of short, straightforward steps.

You can do the same thing with many things you’re familiar with. You can take a topic, or a feature, or a method of doing something and distill it into an interesting question. You can then look closely at the question and work with it carefully. Use it as a chance to explore something. You’re probably familiar with it, but by taking the time to write about it or present it, you’ll have the opportunity to get to know it better than you ever thought you could.

Who’s Next?

I’m handing the dreidl off to Crys Manson (blogtwitter) for Day 10.

Crys is a seriously great DBA, a fantastic friend, and she sometimes makes me snort liquid through my nose laughing.

Tag, Crys, you’re it!

How’d We Get Here?

If you want to check out where we’ve been so far, we’ve had:

A Little Present

You don’t need to be Jewish for this to be your favorite holiday song this year. Rock  on with the Maccabeats, y’all. (You will need to click the “watch on YouTube” link.)

Select * from dbo.FAIL: invalid metadata in views and user defined functions

This week a question on the Twitter #sqlhelp hash tag reminded me of a detail of SQL Server that I learned the hard way, but forgot to blog about. The question was:

SQLHelp in action

For those without images enabled (you anarchic luddites), that’s “Is there a good reason to have a SQL view that is just select * from a table with no filtering?”

Why Shouldn’t You Do That?

There are things you should know about how SQL Server handles metadata. When you update the schema for an underlying object (such as adding or removing a column in a table), the metadata for referencing objects is not automatically updated.

This can impact you whether or not you are using SELECT *. However it is far more likely to impact you, and to impact you more widely, when you have a practice of  using SELECT * in your object definitions for views, functions, etc.

Once Upon a Time…

There was a team of people working on optimizing their schema. Columns were being added and removed from several tables. The changes were checked in, tested, and handed off to operations for deployment. The operations DBA verified which changes were to replicated tables and validated that changes were approved by owners of subscriber systems. The change was put through pre-production and the change management system  appropriately.

Thirty minutes after the change went through to production, users of tools on a downstream system began to complain about large volumes of errors in the tool. It was discovered that many databases on the instance with the replication subscriber  had views and table valued functions referencing the article, using the “Select * syntax”. With the removal of the columns, literally hundreds of these views were returning 5402 errors, even though the users didn’t specifically need the columns which had been removed.

After the issue was identified, the operations team was able to resolve the incident by updating the metadata on the views and functions with the sp_refreshsqlmodule stored procedures.

For future schema changes to replicated tables, a script looping through all views and functions was used. This tended to be a bit problematic occasionally, as many users had longrunning queries using the impacted views and functions.

Learn More

To learn more, check out my sample script below, and also these MSDN articles on the stored procedures you use to update metadata:

Note that sp_refreshsqlmodule can also be used for views, so you likely want to just standardize with it.

Remember:

  • Check your replication subscrbiers: If you’re changing the schema of replicated objects, you may need to update metadata for referencing objects in your subscriber databases, and in other databases on all instances where objects may be referencing the tables by three part names.
  • Prepare to be blocked: Your request to update metadata can and will be blocked by running processes using those objects. Make sure to plan for this and determine if and whether you should kill running processes or not to update the metadata.

Why Many People Don’t Know About This

If you’re used to controlling  your data access with stored procedures or ORM tools such as nHibernate, it’s easy to never quite learn this detail, or to forget it.

Stored procedures won’t usually have this issues because they recompile after the schema is changed on the underlying object, which automatically refreshes their metadata.

Alternatives

If you need to reference an object by a different name, create a synonym!

Example Script: views, functions, sprocs, synonyms, and metadata.

Here’s some sql code to walk through some basic examples of how this all behaves.

This script shows a simple example of how schema updates aren’t reflected in the metadata for views and table valued functions unless they are explicitly edited, and it also slows a basic example of a 5402 error.

--Create a base table for testing
create table dbo.LookAtMe (
	i int identity primary key,
	Msg nvarchar(128)
)
GO

--Add a row
insert dbo.LookAtMe (Msg) VALUES('Hi, how are you?')
GO

--Create a view looking at the table using select *
create view dbo.vImLookinAtYou AS
	SELECT *
	FROM dbo.LookAtMe
GO

--Create a table valued function looking at the table with select *
create function dbo.tvfImLookinAtYou ()
RETURNS TABLE
AS
RETURN
(
	SELECT *
	FROM dbo.LookAtMe
)
GO

--Create a procedure looking at the table with select *
create procedure dbo.prcImLookinAtYou
AS
	SELECT *
	FROM dbo.LookAtMe
GO

--create a synonym for dbo.LookAtMe
create synonym dbo.synImLookinAtYou
FOR dbo.LookAtMe

--Our view, function, and proc each return two columns, i and Msg
select * from dbo.vImLookinAtYou
GO
select * from dbo.tvfImLookinAtYou()
GO
exec dbo.prcImLookinAtYou
GO
select * from dbo.synImLookinAtYou
GO

--Now, let's add a column
alter table dbo.LookAtMe
add lookTime datetime
GO

--Our view and functions still work
--but they aren't returning the new column!
select * from dbo.vImLookinAtYou
GO
select * from dbo.tvfImLookinAtYou()
GO
--Our procedure, however does return the new column!
exec dbo.prcImLookinAtYou
GO
--Our synonym also lets us see the new column.
select * from dbo.synImLookinAtYou
GO

--Let's refresh the metadata for the view and function
exec sp_refreshview 'dbo.vImLookinAtYou'
GO
exec sp_refreshsqlmodule 'dbo.tvfImLookinAtYou'
GO

--Now we can see the new column in the view and function
select * from dbo.vImLookinAtYou
GO
select * from dbo.tvfImLookinAtYou()
GO

--Now let's remove a column from the table
alter table dbo.LookAtMe
drop column lookTime
GO

--Oh no!
--Our view and function both return error 4502:
--"View or function ... has more column names specified than columns defined."
select * from dbo.vImLookinAtYou
GO
select * from dbo.tvfImLookinAtYou()
GO
--Our procedure is fine, though
exec dbo.prcImLookinAtYou
GO
--Our synonym is also fine. It's just a pointer.
select * from dbo.synImLookinAtYou
GO

--Let's refresh the metadata for the view and function.
exec sp_refreshview 'dbo.vImLookinAtYou'
GO
exec sp_refreshsqlmodule 'dbo.tvfImLookinAtYou'
GO

--Now they work again.
select * from dbo.vImLookinAtYou
GO
select * from dbo.tvfImLookinAtYou()
GO

--Clean up our objects
drop table dbo.LookAtMe
drop view dbo.vImLookinAtYou
drop procedure dbo.prcImLookinAtYou
drop function dbo.tvfImLookinAtYou
drop synonym dbo.synImLookinAtYou
GO

What the Business Wants: FEATURES (TSQL Tuesday #13)

It’s TSQL Tuesday Again…

This month’s #tsql2sDay is hosted by Steve Jones (blog | twitter), and the topic is “What the Business Says is Not What the Business Wants.” Steve asks the question:

What issues have you had in interacting with the business to get your job done?

I thought about this for a long time. Eventually I realized that I wanted to see the pattern in the trees from the last ten years.

What the Business Says: “We Want Features!”

As a DBA working with software developers, I’ve heard many times from the business that they want new features.

They need a new feature to land a big client, to be the first to market with something, to retain a client, to compete with x.

They need a feature to land on Mars, and if we can’t do that reliably, well, we should just do that once. They don’t care what it takes to get to that feature, we needed to get it last week.

And so the wheels squeak, and the features are prioritized with a mysterious calculus involving the business which is never fully revealed.

But although I have seen that the business does not always dictate the priorities directly and fully, I have  often seen that the business dictates the timeline for what is prioritized. And that timeline is usually pretty quick.

What the Business Wants: They Actually Do Want Features. They Just Don’t Want to Talk about Sacrificing Scalability/Availability/Recoverability for Speed of Delivery.

The thing is, the business is telling the truth. They do need features to make money– at least in the competitive markets I’ve worked in.

The Scalability Fairy sadly doesn't show up nightly when your application loses a tooth. Or a customer.

However, typically people are so busy stressing when they need the new features that no honest assessment and agreement on quality is reached.

Maybe people think it’s embarrassing to have a frank discussion about what type of SLA and reliability will come with a new feature. Possibly that opens a company to some legal challenges if a different level of service has been provided to the customer. However, this is exactly what the business should truly want: to promise the correct level of reliability to the customer.

Now, not all new features need to be high quality work. Sometimes you’re delivering a shiny toy, and nobody’s going to use it to shave, wear it as a diaper, or drive it to work. But a feature needs to have a clearly defined level of scalability and it needs to be safety-proofed to the appropriate level for that product. It shouldn’t enrage your large customers on a regular basis.

Commonly, this part of the process is overlooked. People don’t want to talk about it. Unfortunately, it leads to delivering a product that  may look good at first on the outside, but may not really please the customer until it’s been patched up in the middle of the night a few times, retrofitted on a weekend, and ultimately largely re-written and re-released.

So, What Do you Do?

As an operations person, you bring up the conversation about SLAs repeatedly and make sure all parties come to an agreement. You ask questions about disaster recovery and availability and make sure you have the time and resources to meet those needs.

You make sure there’s time in the release schedule for performance and load testing, and  you set your expectations to medium-low until you have strong data that shows how something will perform. Trickiest of all: you somehow get that data prior to release.

You document everything, because people won’t remember later.

But most of all, you establish a good relationship with people in the business by being genuine, open, and honest with them, and not pre-judging them. If your business people trust you, they are likely to ask you a few questions before making many promises.

Follow

Get every new post delivered to your Inbox.