How do Multiple Instances Affect your Server

There are some pretty obvious challanges to managing multiple instances of SQL on the same box.  Most of which are performance and troublehooting issues.  Like with most heavily loaded SQL servers (single instances) there are the problems of job management, however, with multiple instances it's a little more complex. There are many other problems that are inherent with multiple instances but I won't list them all here.

In my mind one of the biggest issues is how is it affecting system resources, and, are resources waisted or causing contention by having multiple instances?  In my current gig we have a 2 node active\active cluster running 7 (yes I said 7) instances of SQL (I'm fighting the battle of a perception that this is ok with management).  These machines are x86 machines with 4 HT procs and 16GB RAM each and they are attached to a SAN.  The instances were laid out logically to divide applications to different instances, buisness apps, olap, default instance requirements, financial apps and so on.  This was all done during a consolidation project far before my time here and with no specific internal SQL knowledge at the time.

I have been working on educating the staff with SQL internals and the affects that this architecture has on the system, what counters to look at etc.  This MS whitepaper explains, fairly well, the performance affects of having multiple instances on a mchine and fit's my arguments of single instances and x64.

Anyway, enjoy and have a good weekend,
Zach

TempDB: Using Multiple Files in SQL 2005

I have seen in a couple of places where it is recommended to continue to use multiple files for TempDB in SQL Server 2005.  Recently we had a MS consultant on site that was able to contact the SQL Product Development group about that question in particular.  Here is the responce that I got in return.

______________________________________

The information in KB 328551 is exactly why I don’t think any recommendation based on number of processors is of much use. 

This recommendation has been cut and pasted for several versions of SQL Server now, and while it may have worked great for SQL 7 and SQL 2000 when most machines had just 1-2 processors or a really large server may have had 4 processors.  Now with SQL Server 2005, many more servers are starting with 2 sockets with dual core processors (4 cores) and then enabling Hyper threading to give the appearance of 8 processors to the OS.

Configuring a server with either 4 or 8 tempdb files may by itself cause performance problems (due to the overhead of SQL having to manage too many files), so the recommendation should be changed to start with just 1, and monitor for tempdb contention. ONLY if you see contention then add more files until the contention is resolved. However it’s worth noting that just adding more files won’t always avoid tempdb performance issues, as it’s entirely possible that the issue is with a slow I/O subsystem. Adding more files ONLY helps resolve contention for the schema lock when creating new objects, which in tempdb can happen at a very high rate.  However not all applications make use of tempdb. It’s possible that specific application may make no use of tempdb. In that case creating multiple tempdb files is a big overhead and a waste of time.

On the other hand, the application might make extensive use of tempdb, but the bottleneck is not on the schema lock, but rather on the I/O throughput. Unless the additional files are created on different disks, then just adding more tempdb files won’t help this either.

Adding tempdb files only helps if the issue is around schema lock contention. This is another point worth bearing in mind, and one that’s not addressed at all in the number of files = number of processors recommendation.

All these points help illustrate why the recommendation for tempdb on a completely new system should be

1.       Start with a single tempdb file
2.       Monitor for tempdb related performance issues
3.       If the bottleneck is the schema lock then consider adding more tempdb files until the contention is resolved.
4.       If the bottleneck is I/O throughput then consider adding disks to the array backing the tempdb files, or create additional tempdb files on additional disks. 

Since you already have an existing system, you should monitor the tempdb usage and bottlenecks (you have the perfmon logs already), and use this information to help determine the starting number of tempdb files, and the number of disks backing the tempdb files.

As always this is just a starting point as new server hardware is invariably faster, has more memory, and faster disks, which will most likely change where the next bottleneck is.

________________________________________

Anyway, thought I would pass it along.

-Zach

Pictures of my boys

Now I know most of you dont care, but I have posted pictures of my boys in the gallery.  I was showing them what daddy was working on and they wanted me to put pictures of them on the stuff that I work on (that's about as technical as it gets with my family). 

-Zach

Tough Row to Hoe (Part 2)

Alright, so there is a spreadsheet that I am attempting to deliver on to you guys that read Tough Row to Hoe.  I am working on cleaning up my DMV spreadsheet a bit, and figuring out a way that I can post the Excel file that I created it in. 

Now with the RTM version of 2005 almost all of the DMV's are documented but when I started the task of organizing all of the DMV's and documenting them there was very little information in the BOL about them.  So now you can find what you need in BOL in reference to the DMV's so I thought I wouldn't need my spreadsheet anymore. 

Well, this has turned out to not be true.  I have had multiple situations where I was doing something in 2005 and I had no idea what DMV I would need to look at to find the information I was researching.  With the way I organized my spreadsheet I have found that it's extremely easy to find what I need in seconds.  I am working on adding the links to the BOL entry for the views so that I can look at the info there too, but that is in the next version. 

In the mean time does anyone know how to attach a file to a blog enrty here?

Thanks,
Zach

Science: Kinetic Energy, CPU physics

So I know this is not SQL Server specific information but these are a couple of the things I have read recently that I have found interesting. 

There was an incredible discovery at the Sandia National Lab that could mean a great deal, potentially, for our world.  Sandia recently reported being able to heat steel and tungsten to 2 billion degrees Kelvin for 10 billionths of a second (Wow, I wonder how they accomplish measuring something in billionths of a second, much less measuring something that is 2 billion degrees).  The process that the metal goes through was interesting too. 

They took a spool of steel threads, the spool was about the size of a coffee cup and the threads of steel were thinner than a human hair.  Then they pumped 20 million amps of current through the core (spool), and the current vaporizes the core into a cloud of ions (plasma).  There is a magnetic field from the current that takes the ions and squishes (that's right I said it, squishes) the plasma to a thickness about the size of a pencil lead with a velocity that could travel a plane from New York to San Fran in just a few seconds.  Then for a short period of time (no doubt measured in 10s of billionths of a second), the plasma ions, once squished (yup, there's that word again and I'm sure the appropriate word is squashed but I like squished better), has no place to travel because it has been condensed to it's most compact form, therefore keeping the ions from traveling and should have kept the ions from producing energy.  But the ions continued to produce energy and during this period is when the temperatures became super charged and produced the 2 billion degree heat.  I read from one source (and there are many out there) that they may have discovered a new state of matter.   

One thing that I think was pretty significant was " the radiated x-ray output was as much as four times the expected kinetic energy input", that was interesting in itself, check it out if your interested. Anyway, I thought that one at the very least, blog worthy.

The second interesting find was on the future of CPU's.  This was an article I found on ZDNet, and it refers to Moore's Law as coming to end of life with current processor technology.  The article stated that "Simply put, today's devices, which are based on complementary metal oxide semiconductor standards, can't get much smaller and still function properly and effectively. That's where spintronics comes in," said UCLA engineering professor Kang Wang, who will act as director of the institute."

Spintronics is a technology that is being invested in by the big chip manufacturers.  The spintronics technology has something to do with (I'm no engineer) the spin of electrons as they are charged and passed from one point to another (I can picture engineering geeks everywhere just elated with the possibilities). Anyway, short but interesting read.

Enjoy,
Zach

Transactional Database Structures, Intriguing...

I subscribe to Jason Haley's blog and there was an interesting blog reference today on "Transactional" Database Structures.  I have to admit I have never worked on a database that has required this kind of design before and I found myself intrigued by the concept, I like the simplicity and complexity of it. 

He is referencing an architecture that uses purely inserts and no updates along with a checksum to show whether or not a client is active or inactive based on whether they have made their payments to the provider.  He also summarizes the pro's and con's of using this architecture. 

Because I have no working experience with this structure and architecture I don't know what the performance caveats would be so I can't say whether or not I support it, but I will say that I do find it interesting. 

It's a good read to get the brain going though.

-Zach

Interview with Louis Davidson

Check out the podcast interview of Louis Davidson on SQL Down Under's web site.  It really is a great concept.  There are 11 other mp3 downloadable interviews with people that have great SQL knowledge (Kalen Delany, Adam Machanic, Tom Moreau, Itzik Ben-Gan) and others.

Enjoy,
Zach

SQL Server 2005 System Map

If you haven't heard it yet (and it has more than likely been blogged about a 100 times over) the SQL Server 2005 system map is downloadable off of Microsoft's Download site.  Check it out for yourself.  I can't wait for the compiled help version of this to come out. 

Enjoy!
Zach

Movin On

Recently I made a decision to go out on my own.  I am taking a contract for a few months while I build up and start my own business. 

I know that I havent blogged at all in the past few months and I'll be starting up again shortly. 

-Zach

Tough Row to Hoe

Man, I know that you all have been patient, and it seems like forever ago that I told you all that I would be posting my findings for the DMV's.  I have probably spent over 80 hours researching DMV's and I still feel so far from ready to post anything.  Problem is, I am reasearching rabbit trails as I find them, and while this is good for discovery it keeps me from my original goal.  Work has been catching up with me too. 

I promise to finish at least a couple of these before the week is out so I can post something.

Sorry to keep you waiting,
Zach

SQL Server 2005 Migration

A friend of mine, Darshan Singh, wrote this article for SQL Server magazine.  It's a great article for those of you that are looking to make the migration from 2000 to 2005.  The article covers two methods of migrating SQL Servers (in-place upgrade) and databases (side-by-side upgrade). 

Enjoy the article, it's very well written, clear, and has some great content.

Enjoy,
Zach

DMV Research

I have been doing a great deal of research on the DMV's, functions, stored procs and tables that are shipped in 2005.  During my adventure I have learned a lot along the way.  I have collected a ton of great information and have had some luck finding information in a lot of great places.  Some of the information that I have picked up has been a result of just trying to understand exactly what it is I'm looking at when I query a DMV.  This stuff can be mind boggling when your not a developer and there is not a bit of information available about it on the web.

I have been collecting the information and breaking the DMV's, functions and stored procs up into categories and trying to derive what the object is doing, then determining how it could be used and maybe giving some query examples. 

Look for some of the results soon.

-Zach 

DMV's - Query Troublshooting

I read the SQL Server Engine Tips blog this morning about troubleshooting SQL statements in 2005.  If you don't read this blog you really should consider checking it out.  The blog uses the new T-SQL features and DMV's to retrieve information about queries that have run through the server. 

There's a lot of great content in the post.

Enjoy,
Zach

What's Next?

This week I am doing research on 2 different SQL 2005 topics.  The first is a collection of the DMV's I've heard about and grouping them together for usage.  The second is testing DB Mirroring.  I want to hammer mirroring and see how it performs. The plan is to write a couple of articles about the setup, performance and test results of the technology. 

If there is anything in specific you would like to know about DB Mirroring, let me know and I'll try to fit it into the testing.

Later,
Zach

PASS 2005 - Day 2

Wow is all I can really say.  I only got to show up today for PASS for a couple of hours but all I can say is WOW!  This was my first PASS conference and man was it great.  I got to see all of my old co-workers at Scalability Experts today and that was great, I really enjoyed seeing them.  I saw all of the great booths and new products in the SQL Server market, which was a lot of fun.  Met some new people and hopefully new friends.  I really enjoyed the social side of the gathering.  It was great spending some time with ex-coworkers of mine Randy Dyess, Rick Heiges, Erin Welker, Darshan Singh and Larry Chesnut.  I met Louis Davidson and talked with him for a bit (very nice guy). And I got to stay for almost all of one session. 

I went and saw Bob Dorr's session about the SQL Server 2005 OS Engine.  Man was it great, the content was excellent and there was a lot of talk about how to use the system dmv's that PSS will use for trouble shooting issues, how to get to a sessions last error message, all of the statistics (OS, SQL, thread, etc.) for that thread when the error occured and so on.  Man it was great and I wish that I could have stayed for the whole thing, but it was great while it lasted.  Some of the DMV's he covered were...

  • dm_os_ring_buffer
  • dm_os_sys_info
  • dm_os_hosts
  • dm_os_stacks
  • Dm_os_wait_tasks
  • Dm_os_workers
  • Dm_os_schedulers
  • Dm_os_buffer_descriptors

 SIDE NOTE:  Have I ever told you how much I love OneNote? Dang, I love that app.  It's a great app for taking notes at things like technical sessions.

There were awsome topics beinng covered and it was a lot of fun.  Hope everyone that is there enjoys Friday.

Thanks PASS, you all did a fantastic job!!

-Zach

PASS 2005

There are a lot of great blog posts about the events and things going on at PASS this year.  I get to go today for about half the day which will be exciting (Thanks Tom!!).  I am really looking forward to seeing everyone there. 

SQL Server 2005 - DMV's

This is a topic that I am extremely insterested in.  I learned more about SQL Server internals from Microsoft than any other source by diving into the shipped system tables and stored procedures.

This is a big thing in SQL Server 2005, because, as I'm sure you're aware, we no longer have access to the system tables within SQL Server 2005.  There is a great blog by Randy Dyess that talks about this topic in particular. 

Along the lines of looking into system stored procedures, the sp_helptext stored procedure is replaced with object_definition() function.  The object_definition function is an undocumented function that will return the definition of a given object.

EXAMPLE:
select object_definition(object_id('sys.objects'))

The example above will give you the code behind creating that view, and will more than likely reference resource objects that you can't get to unless you are using the dedicated ADMIN connection.  The 'sys.objects' object is a system view, this object can be replaced with the name of a stored procedure to return the source text of that procedure or the name of a funtion to return the text of a function. 

This will be huge in SQL 2005 to determine how the MS Dev team is using SQL Server to give you the information that you need, to discover how and where SQL is getting and putting its information. I will blog some more on this topic tomorrow as I get into the actual DMV's and what information they are providing.

Good luck in the SQL 2005 hunt,
Zach

P.S. If you are going to PASS, have fun!! Hopefully I'll get a day this week to go.

 

Retrieving Unique Key Column Names

It's easy enough to get the PK and the column names it uses from SQL Server but what about Unique Key column names?  Ever tried it?  A friend of mine asked how you get those column names from SQL since there is no direct table to retrieve those from.  This is how I went about doing it. 

First you have to have a starting point.  For example, for all unique keys give me the column names for each, or another example might be, for a given table give me the column names of it's unique keys (don't mistake a unique key for a primary key, they are different things).  These two examples require different queries.  But I'll show you how I did it using the second example.

select o.name, o.id, o.parent_obj, o.xtype, o2.name, o2.id, i.name, i.indid, k.id, k.colid, c.name
from sysobjects o
inner join sysobjects o2
on o2.id = o.parent_obj
inner join sysindexes i
on o2.id = i.id and o.name = i.name
inner join sysindexkeys k
on o2.id = k.id and i.indid = k.indid
inner join syscolumns c
on c.id = o2.id and k.colid = c.colid
WHERE o2.name = 'table3'

Got Questions? Let me know and I'll post the answers in the comments.

Take it easy,
Zach

CHAR vs VARCHAR

This topic was discussed a bit in a forum post recently that has quite a bit of interest in my opinion.  Since I posted about the use of the GUID recently I thought I would do some some testing on the performance differences and the downfalls of using the CHAR and VARCHAR data types.

First we'll consider the functional differences between the 2 data types.  The CHAR is obviously a static length character datatype while the VARCHAR is a variable length character datatype.  The obviously positive and negative (we’ll cover this later) thing about a VARCHAR, is that you only store the amount of data you need to and no more, well, to get a little deeper you store an additional 2 bytes with every VARCHAR value. 

FUNDAMENTAL CONCEPT

When inserting into a CHAR(100) column regardless of the length of the data going into the column, SQL Server will store 100 bytes for this column.  When inserting into a VARCHAR(100) column depending on the length of the data going into the column, SQL Server will store the number of bytes of data going into this column. 

SELECTS

If you are using the same data to insert into 2 separate tables, one utilizing a char and the other utilizing a varchar, and the length of the data going into those tables are of fixed length, then there will be no performance difference in the retrieval of that data as long as no functions are used as part of the data retrieval.  So in other words if I had a char(100) column in the first table and a varchar(100) column in the same spot on the second table and all of the data that went into those columns were exactly 100 characters in length then both data types would perform equally as well.  If the data that was going into the char and varchar columns was of variable length (values varying from 1-100 characters equally) then the varchar would perform better.  This is not because the varchar data type is more efficient but because there will be fewer data pages to retrieve from disk (I'll get into this some more when I go over the inserts).

INSERTS

Comparing inserts for these data types is where the difference in performance lies because of the storage of data.  If the same 2 tables are used as in the select example and the data that is going into those tables is of variable length then the inserts will perform faster on the table with varchar.  If those same 2 tables are used and the data is of a fixed 100 character length then the inserts using both data types will be equal. 

UPDATES

Now, lets take a future scenario, lets say you have your variable length data in your 2 tables, and you want to do an update.  If your varchar(100) value is only storing 10 characters in a particular row.  And your data pages are full.  And you update it with a value that is 100 characters long, a page split will occur.  This will cause this update to be slower than updating your char column value that is 10 characters in length with a 100 character value.  Why you may ask?  Well because when you insert a 10 character value into a char(100) column, the database will store a value of 100 characters, the original 10 characters plus a padding of 90 spaces on the end.   When you insert a 10 character value into a varchar(100) column, the database will store a value only giving space on the data page for the 10 characters.  After the page fills there is no room for the additional 90 bytes of data so the system splits the page and moves to row to a new data page, placing a pointer to the new page and row. Evidence of this activity can be seen in the test below.

TEST RESULTS

In these tests I cover a couple of different scenarios.  

  1. Inserting 200,000 rows into a table called T1, which contained an integer identity column and a char(100) column
  2. Inserting 200,000 rows into a table called T2, which contained an integer identity column and a varchar(100) column
  3. Selecting all rows from table T1
  4. Selecting all rows from table T2
  5. Selecting a range of 15,000 rows from table T1
  6. Selecting a range of 15,000 rows from table T2
  7. Selecting a single row from table T1
  8. Selecting a single row from table T2
  9. Updating 200,000 rows from table T1
  10. Updating 200,000 rows from table T1

Inserting 200,000 NULL rows into the char and varchar tables T1 and T2 proves the amount of space allocated to store the data for each table.  During these two tests table T1 (char table) required 2817 data pages to store the data.  Table T2 (varchar table) required only 322 data pages to store all of the data.  This was because the T1 table reserved the full 100 character value for the char column regardless of the fact that the data inserted into the char column was a NULL value and therefore required 2817 pages to store the data.  Table T2 did not reserve the 100 characters of space when inserting the NULL value into the table therefore only needing 322 pages to store the data.  The inserts for both tables took 60 seconds for T1 and 61 seconds for T2.  The inserts were run multiple times and the averages were used for reporting.   After the inserts on table T1 the Scan Density and Page Density were 99.44% and 99.09% respectively.  After the inserts on table T2 the Scan Density and Page Density were 95.35% and 99.73% respectively.

Selecting all of the rows from both tables only 174 milliseconds for T1 and 152 milliseconds for T2.  This is because of the drastic difference in the number of pages being returned by both queries with no WHERE clause.  The number of logical reads that required by the system for table T1 were 2819 and the number logical reads required to retrieve all pages for T2 was 323.  When returning a range of 15,000 rows from T1 the query completed in 21 milliseconds and retrieved 214 data pages.  When returning the 15,000 rows from table T2 the query completed in 17 milliseconds and retrieved only 26 data pages.   When returning a single row from T1 the query completed in 0 milliseconds and retrieved 3 data pages.  When returning a single row from table T2 the query completed in 0 milliseconds and retrieved only 2 data pages. 

The performance hit when using a varchar in these test cases can be seen doing updates on these two tables.  When updating all rows in these two tables from a NULL value to a 100 character value the results vary greatly.  It took 1032 milliseconds to update all 200,000 rows in table T1, while taking 1911 milliseconds to update all rows in T2.  After doing a SHOWCONTIG on the two tables after the update you can see the cost of the updates to be page splits caused by the updates against T2.  There are still only 2817 pages for table T1 but now there are 3859 pages for T2 instead of the 322 pages we saw earlier.  There is also still a Scan Density and Page Density of 99.44% and 99.09% for T1 but now there is a 46.04% and 74.25% Scan Density and Page Density for T2.

CONCLUSION

There is no performance difference fundamentally retrieving data from the char and varchar datatypes.  The difference in performance comes from how you will be utilizing the columns. 

Make sure that you plan for the way you will be using the data, be it static or dynamic.

-Zach

XML discussion

Check out Adam Machanic's blog from yesterday about XML opinions.  He has 2 links there pointing to XML discussions that have 2 great view points.

Kent Tegel's blog on his view of XML usage is, of course, a good read as usual.

Enjoy!!

-Zach