Sample code can be downloaded here.
Table partitioning is a powerful new feature in SQL Server 2005. It
is primarily a manageability feature, but my interest here is in the performance and scalability
of
table and index partitioning for use with very large tables.
When I started looking for table partitioning resources, Books Online seemed to be the only resource, period. It
covers the basics pretty well (see the References
section for topics). Then Kimberly Tripp published her excellent
white
paper. I definitely recommend reading it. If you like to
understand the theory before trying out a new feature, read Kimberly's paper
first, then use the example code here to expand your understanding of
partitioning. If you want to get your feet wet right away, read this and
run the sample code first, then follow up by reading Kimberly's paper. I
will not cover the same things here as Kimberly did but will give
step-by-step examples and test code, as well as compare the performance of different
partitioning schemes.
Enough of that, let's get right to the code. All of these examples use
a simple work database.
USE master
GO
CREATE DATABASE [TestPart] ON PRIMARY (
NAME =
N'TestPartPrimary',
FILENAME = N'C:\TestPart_Primary.MDF',
SIZE = 20,
FILEGROWTH = 10%
)
LOG ON (
NAME =
N'TestPart_Log',
FILENAME = N'C:\TestPart_Log.LDF',
SIZE = 2,
FILEGROWTH = 10%
)
GO
USE [TestPart]
GO |
Create a Simple Partitioned Table and Index
This table uses a bit column for the partition key and will create two
partitions.
Create a Partition Function and Partition Scheme
CREATE PARTITION FUNCTION PFN_Partition2(bit)
AS RANGE RIGHT FOR VALUES (
1 -- Partition 1 gets records
where int < 1 (0)
-- Partition 2 gets
records where int >= 1 (1)
)
GO
CREATE PARTITION SCHEME PS_Partition2 AS PARTITION PFN_Partition2
ALL TO ([PRIMARY])
GO |
After executing the above script, you should see this message:
Partition scheme 'PS_Partition2' has been created
successfully. 'PRIMARY' is marked as the next used filegroup in partition
scheme 'PS_Partition2'.
Create a Partitioned Table and Index
CREATE TABLE dbo.TblPartBit (
Col1 varchar(32) NOT
NULL,
PartKey
bit NOT NULL
) ON [PS_Partition2] (PartKey)
GO
CREATE NONCLUSTERED INDEX TblPartBit_CustomerID_IDX
ON dbo.TblPartBit (Col1, PartKey)
ON PS_Partition2 (PartKey)
GO |
Done. But how do you know that it's partitioned? Look at
the table properties screen. Right-click on the table in the Object
Explorer and select Properties.

When a table is partitioned, the partition scheme is listed after
"Partition Scheme" and "Table Is Partitioned" is True.
For the index, right-click on the index in the Object Explorer, select
Properties, then click on Storage in the left pane.

Create Customer/Order/Order-Item Partitioned Tables and
Indexes
Let's move on to more-involved examples. We will compare partitioned
tables that have the same columns and indexes but different partitioning
schemes. The questions here are:
- If partitioning is good, how does it affect performance?
- What affect does the number of partitions have on performance?
These tables implement a standard customer/order/order-item database.
The examples assume that the database is OLTP and heavily favors queries by
customer. The schema uses a "PartKey" column that is a persisted
computed column based on a mod of the integer CustomerID. This is a type
of hash partitioning. This allows stored
procedures to take a CustomerID as input and direct the query to just the
partition on which that customer will reside. Understand that this is not
pure, transparent partitioning. It is, however, what I see as
the most powerful use of table partitioning in an OLTP environment. It allows even distribution
of records within the tables and indexes.

To understand the benefits here, take the largest example: one million
customers, an average of ten orders per customer (ten million orders), and two
items per order (20 million items). Using a 100-partition table and
index scheme, each order partition will have on average 1/100th of the total, or
100,000 records. Note that 1,000 is the maximum number of partitions
allowed.
One important caveat here is that to partition a unique index, the index must
include the partition key. In this example, creating a unique index on the
combination of CustomerName and PartKey is not at all the same as creating a
unique index on CustomerName alone. So in these examples, the unique index
is not partitioned.
Example code below is for ten-way partitions only. The full code to
create and populate all of these tables is in Code.zip.
CREATE PARTITION FUNCTION PFN_Partition10(int)
AS RANGE RIGHT FOR VALUES (
1, -- Partition 1 holds
records where int <
1
(0)
2, -- Partition 2 holds
records where int >= 1 and int < 2 (1)
3, -- Partition 3 holds
records where int >= 2 and int < 3 (2)
4, -- Partition 4 holds
records where int >= 3 and int < 4 (3)
5, -- Partition 5 holds
records where int >= 4 and int < 5 (4)
6, -- Partition 6 holds
records where int >= 5 and int < 6 (5)
7, -- Partition 7 holds
records where int >= 6 and int < 7 (6)
8, -- Partition 8 holds
records where int >= 7 and int < 8 (7)
9 -- Partition 9
holds records where int >= 8 and int < 9 (8)
-- Partition 10
holds records where int >=
9
(9)
)
GO
CREATE PARTITION SCHEME PS_Partition10 AS
PARTITION PFN_Partition10
ALL TO ([PRIMARY])
GO
CREATE TABLE dbo.TblCustomer10 (
CustomerID int
NOT NULL,
CustomerName
varchar(80) NOT NULL,
PartKey AS (CustomerID % 10) PERSISTED,
DateCreated datetime
NOT NULL,
CONSTRAINT TblCustomer10_PK PRIMARY KEY CLUSTERED (CustomerID,
PartKey)
) ON [PS_Partition10] (PartKey)
CREATE UNIQUE NONCLUSTERED INDEX TblCustomer10_CustomerName_UIDX
ON dbo.TblCustomer10 (CustomerName)
WITH FILLFACTOR = 80
ON
[PRIMARY]
-- Note that this index is not partitioned
CREATE TABLE dbo.TblOrder10 (
OrderID
bigint
NOT NULL IDENTITY(1,1),
PartKey
int
NOT NULL,
CustomerID int
NOT NULL,
OrderGUID
uniqueidentifier NOT NULL,
OrderDate
datetime NOT NULL,
CONSTRAINT TblOrder10_PK PRIMARY KEY CLUSTERED (OrderID,
PartKey),
CONSTRAINT TblOrder10_CustomerID_FK FOREIGN KEY (CustomerID,
PartKey)
REFERENCES dbo.TblCustomer10 (CustomerID,
PartKey)
) ON [PS_Partition10] (PartKey)
CREATE NONCLUSTERED INDEX TblOrder10_CustomerID_OrderID_IDX
ON dbo.TblOrder10 (CustomerID, PartKey)
WITH FILLFACTOR = 80
ON [PS_Partition10] (PartKey)
CREATE TABLE dbo.TblOrderItem10 (
OrderItemID bigint
NOT NULL IDENTITY(1,1),
PartKey
int
NOT NULL,
OrderID
bigint
NOT NULL,
ProductID int
NOT NULL,
CONSTRAINT TblOrderItem10_PK PRIMARY KEY CLUSTERED (OrderItemID,
PartKey),
CONSTRAINT TblOrderItem10_CustomerID_FK FOREIGN KEY (OrderID,
PartKey)
REFERENCES dbo.TblOrder10 (OrderID,
PartKey)
) ON [PS_Partition10] (PartKey)
CREATE NONCLUSTERED INDEX TblOrderItem10_OrderID_IDX
ON dbo.TblOrderItem10 (OrderID, PartKey)
WITH FILLFACTOR = 80
ON [PS_Partition10] (PartKey)
GO |
Read PerformanceWhat is the read-performance benefit or penalty of table and index partitioning?
Well, let's just say we're hoping to see better results from subsequent betas
and the final RTM product. The execution plan statistics look great.
The actual scripted read tests show partitioned tables to be significantly
slower than non-partitioned tables.
Read Performance - Execution Plan Statistics
In each test, there were ten orders per customer and two items per order.
Orders were placed against random customers, so some customers have no orders
and some have several.
The percentages here are out of 100% using the "Include Actual Execution
Plan" statistics from SQL Server Management Studio. All tests were run on a single-processor desktop
with a 2.66 GHz P4, 1 GB RAM, and a single-spindle 80 GB IDE drive running
Windows 2003 Server and SQL Server 2005 Beta 2. Your mileage may vary.
| Query |
Partition
Count |
Customers (x 1000) |
| 1 |
10 |
100 |
1000 |
| 1 |
1 |
93% |
93% |
93% |
93% |
| 10 |
2% |
1% |
1% |
2% |
| 100 |
6% |
5% |
5% |
5% |
|
| 2* |
1 |
42% |
36% |
30% |
23% |
| 10 |
26% |
28% |
24% |
38% |
| 100 |
31% |
35% |
46% |
39% |
|
| 3 |
1 |
89% |
90% |
90% |
93% |
| 10 |
4% |
3% |
3% |
2% |
| 100 |
7% |
7% |
7% |
5% |
|
* Query 2 uses the non-partitioned index.
Read Performance - Scripted Random Reads
Each test performs 1,000
queries against random customers. The query is similar to Query 1 above
but returns just the CustomerName to a local variable so that there is no result
set.
SELECT @customerName = c.CustomerName
FROM dbo.TblCustomer10 AS c
JOIN dbo.TblOrder10 AS o
ON c.CustomerID = o.CustomerID
AND c.PartKey = o.PartKey
JOIN dbo.TblOrderItem10 AS i
ON o.OrderID = i.OrderID
AND o.PartKey = i.PartKey
WHERE c.CustomerID = @customerID
AND c.PartKey = @partKey
Partition
Count |
Customers (x 1000) |
| 1 |
10 |
100 |
1000 |
| Time |
Factor |
Time |
Factor |
Time |
Factor |
Time |
Factor |
| 1 |
141 ms |
baseline |
160 ms |
baseline |
168 ms |
baseline |
65.3 sec |
baseline |
| 10 |
445 ms |
3.2x |
1981 ms |
12.5x |
16067 ms |
95.6x |
226.3 sec |
3.6x |
| 100 |
3192 ms |
22.7x |
4520 ms |
28.3x |
21821 ms |
129.9x |
230.9 sec |
3.5x |
|
Write Performance
What is the write-performance penalty of table and index partitioning?
Each test writes 1,000 records to each of the three tables used above. Recovery
model is set to "Full". The tests were run four times and results were
averaged.
Partition
Count |
System with
No Load |
System with
Heavy Load |
| Time |
Factor |
Time |
Factor |
| 1 |
1.40 sec |
baseline |
44.99 sec |
baseline |
| 10 |
2.32 sec |
1.7x |
52.19 sec |
1.2x |
| 100 |
2.61 sec |
1.9x |
60.47 sec |
1.3x |
| 1000 |
111.44 sec |
79.5x |
472.95 sec |
10.5x |
|
If your tables are relatively small, you likely haven't even read this
far. For large tables, partitioning of tables and indexes in the final
version of SQL Server
2005 may be a win.
For large tables, consider between 10
and 100 partitions. Avoid using more than 100
partitions unless given specific guidance.
As always, experiment and test. Then wait for the next version of the
beta and experiment and test again. The optimal solution will depend
greatly on your access patterns, your hardware, and your configuration. It
will be very interesting to see how these performance numbers change as SQL
Server 2005 gets closer to RTM.
Left as an Exercise for the Reader ...
There is much more to partitioning than is covered here. All these
examples use the PRIMARY filegroup. In a real implementation, you would
want to use multiple filegroups for manageability and possibly performance.
Kimberly's paper has examples with multiple filegroups. Also, a real
implementation would have much more powerful hardware than I used here.
This could greatly affect the performance statistics.
- "SQL
Server 2005 Partitioned Tables and Indexes" by Kimberly L. Tripp
- SQL Server 2005 Books Online
- "Partition Functions"
- "Partition Schemes"
- "Partitioned Tables"
- "Partitioned Tables and Indexes"
- "Partitioned Indexes"
- "Taking Advantage of Multiple Disk Drives"
- $PARTITION
- sys.data_spaces
- sys.destination_data_spaces
- sys.partition_*
- sys.index_columns.partition_ordinal
- sys.partitions