Enjoy Every Sandwich

Thoughts on SQL, XML, .NET and sometimes beer.

<November 2008>
SuMoTuWeThFrSa
2627282930311
2345678
9101112131415
16171819202122
23242526272829
30123456


Navigation

Tools

List O'Links

Kent's Other Stuff

Subscriptions

News

Please read these
Notices and Disclamiers

Post Categories

Article Categories



Monday, March 20, 2006 - Posts

I’m always Sort of out of sorts when that sort of sort sorts that way

There’s an interesting note in the SQL Server 2005 SP1 March CTP, and yes, this is really a bug fix. I’m surprised, however, that more people didn’t get caught by it.  Find the topic “Performing Operations on User-defined Types” in Books-On-Line and look at the change history. You’ll see a note reading:

“Added information about the following behavior change in SQL Server 2005 Service Pack 1: Strings that are returned by user-defined type methods assume the collation of the database in which the user-defined type was created, regardless of the current database.

In the earlier version of SQL Server 2005, strings that are returned by user-defined type methods assume the collation of the current database.”

If you’re not all that familiar with collations in SQL Server, you might not be sure why this really is a fix and not a… well, something else. It is pretty easy to see, however, if you give it a try on the RTM version and the CTP SP1 version. First, start with a simple enough CLR-based User Defined Type. Here’s an example:

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.IO;
[Serializable]
[Microsoft.SqlServer.Server.SqlUserDefinedType(Format.UserDefined,MaxByteSize=8000,IsByteOrdered=true)]
public class SqlStr : INullable,IBinarySerialize
{
    private bool _IsNull = true;
    private string _Str = string.Empty;
    public void Read(BinaryReader r)
    {
        _Str = r.ReadString();
        _IsNull = r.ReadBoolean();
    }
    public void Write(BinaryWriter w)
    {
        w.Write(_Str);
        w.Write(_IsNull);
    }   
    public override string ToString()
    {
        return _Str;
    }
    public bool IsNull
    {
        get
        {
            return _IsNull;
        }
    }
    public static SqlStr Null
    {
        get
        {
            return new SqlStr();
        }
    }
    public static SqlStr Parse(SqlString Source)
    {
        if (Source.IsNull)
            return Null;
        SqlStr s = new SqlStr();
        s._Str = Source.Value;
        s._IsNull = false;
        return s;
    }
}

Okay, so that’s pretty simple. We also need a bit T-SQL to demonstrate this behavior. Run this first in the RTM version of SQL Server 2005, changing the assembly path to work on your machine:

use master
drop database scratch1
drop database scratch2
go
sp_configure 'clr enabled',1
go
reconfigure
go
create database scratch1
alter database scratch1 collate Latin1_General_CI_AI
create database scratch2
alter database scratch2 collate Latin1_General_CS_AI
go
use scratch1
go
create assembly sqlstrAsm
from 'c:\etc\ktegels\SqlStr\SqlStr\bin\Debug\SqlStr.dll'
go
create type SqlStr
external name sqlstrAsm.SqlStr
go
create table dbo.v(id tinyint identity(1,1),s SqlStr)
go
insert into dbo.v values ('ABC')
insert into dbo.v values ('abc')
insert into dbo.v values ('Abc')
insert into dbo.v values ('abC')
insert into dbo.v values ('aBc')
insert into dbo.v values ('aBC')
insert into dbo.v values ('ABc')
insert into dbo.v values ('AbC')
go
select id,s.ToString() from scratch1.dbo.v order by s.ToString()
go
use scratch2
go
select id,s.ToString() from scratch1.dbo.v order by s.ToString()
go
sp_configure 'clr enabled',0
go
reconfigure
go

Your output should be two result sets, the first called from the Scratch1 database, the second, from Scratch2.

id / v
1 / ABC
2 / abc
3 / Abc
4 / abC
5 / aBc
6 / aBC
7 / ABc
8 / AbC

id / v
2 / abc
4 / abC
5 / aBc
6 / aBC
3 / Abc
8 / AbC
7 / ABc
1 / ABC

That makes sense when stop to consider that the second result is being order case sensitively, where as the first set is case insensitive. If you run that same script in SQL Server 2005 Service Pack 1 instances, you get different results. And that’s because the case insensitive ordering of Scratch1 gets used in Scratch2.

id / v
1 / ABC
2 / abc
3 / Abc
4 / abC
5 / aBc
6 / aBC
7 / ABc
8 / AbC

id / v
1 / ABC
2 / abc
3 / Abc
4 / abC
5 / aBc
6 / aBC
7 / ABc
8 / AbC

 So, why isn’t this a change actually a bug? To understand that, you have know what SQL Server does with “normal” string data in cross-collation situations. And that’s easy enough. Change the script to this and run it in the RTM version:

use master
drop database scratch1
drop database scratch2
go
create database scratch1
alter database scratch1 collate Latin1_General_CI_AI
create database scratch2
alter database scratch2 collate Latin1_General_CS_AI
go
use scratch1
go
create table dbo.v(id tinyint identity(1,1),s char(3))
go
insert into dbo.v values ('ABC')
insert into dbo.v values ('abc')
insert into dbo.v values ('Abc')
insert into dbo.v values ('abC')
insert into dbo.v values ('aBc')
insert into dbo.v values ('aBC')
insert into dbo.v values ('ABc')
insert into dbo.v values ('AbC')
go
select id,s as 'v' from scratch1.dbo.v order by s
go
use scratch2
go
select id,s as 'v' from scratch1.dbo.v order by s
go

When you look at the output, you’ll get it. All this change does is make a UDT’s string-returning function behave like another string functions in a cross-collation situation.

So with that sorted out, I’m sort of back to my normal sort of self.

posted Monday, March 20, 2006 12:31 PM by ktegels




Powered by Dot Net Junkies, by Telligent Systems