posted on Thursday, August 04, 2005 3:05 PM by amachanic

Swinging From Tree to Tree Using CTEs, Part 1: Adjacency to Nested Sets

I'm not sure how many times over the last several years I've seen the same tired article titles... "Climbing Trees in SQL," "Climbing Up the SQL Tree," or maybe, "Naked Coeds Playing in the Trees!" ... Oh wait, I think that last one might be something else.

But anyway, the point is, I'm going to adhere to that standard.  But I'm adding a Tarzan-esque theme to this post, because we're not going to just climb a tree.  We're going to swing about between trees.  Different types of tree representations are appropriate for different scenarios.  Which is why, as I pointed out in my recent SearchSQLServer article on recursive Common Table Expressions, we have so many different ways of representing them.  Adjacency Lists, Materialized Paths, Nested Sets, and Nested Intervals spring to mind.  And there are probably others.

My article shows how to use the CTEs, in conjunction with a dynamically generated materialized path, to manipulate an Adjacency List, getting many of the benefits associated with using the Nested Sets Model.  And that's great.  But the Nested Sets Model itself might be useful in your endeavors.  So in this post, I will show how to extend the CTE discussed in that article.

The end-result CTE in the article, which can be run in the AdventureWorks database, looks something like this (renamed for the sake of this post):

WITH EmployeeLevels AS
(
    SELECT
        EmployeeId,
        CONVERT(VARCHAR(MAX), EmployeeId) AS thePath,
        1 AS Level
    FROM HumanResources.Employee
    WHERE ManagerId IS NULL

    UNION ALL

    SELECT
        e.EmployeeId,
        x.thePath + '.' + CONVERT(VARCHAR(MAX), e.EmployeeId) AS thePath,
        x.Level + 1 AS Level
    FROM EmployeeLevels x
    JOIN HumanResources.Employee e on e.ManagerId = x.EmployeeId
)
thePath is the materialized path, a '.' delimited breadcrumb from the root to each node.  We also end up with a level, which represents how many nodes away from the root in the hierarchy each employee sits (very important for those upwardly-mobile junior execs, no doubt!)

I'm going to assume that readers of this post are familiar with the Nested Sets Model, as popularized by Joe Celko.  If not, read the link.

After staring at the Nested Sets for a while, I arrived at the following mathematical properties:

  • Value of Lft for the root node is 1
  • Value of Rgt for the root node is 2 * (Number of nodes)
  • Value of Lft for any node is ((Number of nodes visited) * 2) - (Level of current node)
  • Value of Rgt for any node is (Lft value) + ((Number of subnodes) * 2) + 1
I think the only factor here that requires further explanation is "(Number of nodes visited)".  By this, I mean the number of nodes that would have been visited (including the current node) if one were doing a preorder traversal of the tree. Luckily, this number is quite easy to determine.  The row number for each row, as determined by ordering by the materialized path, is this number.  I encourage readers to validate this with pencil and paper if it doesn't quite make sense reading on the screen.  Draw a simple tree and traverse, starting from the lefthand side of the root node.

But how to translate that into T-SQL?  Luckily, SQL Server 2005, in addition to CTEs, also includes the very useful ROW_NUMBER() function.  So to get the row number, we simply need to add another CTE into the chain (did you know that successive CTEs can use the results of previous CTEs?):

WITH EmployeeLevels AS
(
    SELECT
        EmployeeId,
        CONVERT(VARCHAR(MAX), EmployeeId) AS thePath,
        1 AS Level
    FROM HumanResources.Employee
    WHERE ManagerId IS NULL

    UNION ALL

    SELECT
        e.EmployeeId,
        x.thePath + '.' + CONVERT(VARCHAR(MAX), e.EmployeeId) AS thePath,
        x.Level + 1 AS Level
    FROM EmployeeLevels x
    JOIN HumanResources.Employee e on e.ManagerId = x.EmployeeId
),
EmployeeRows AS
(
    SELECT
         EmployeeLevels.*,
         ROW_NUMBER() OVER (ORDER BY thePath) AS Row
    FROM EmployeeLevels
)
We now have current level and number of nodes visited, which gives us the Lft value for each node.  But how to determine the number of subnodes, in order to get the Rgt value?  Luckily, the materialized path also gives us that capability...

For any given node, number of subnodes can be determined by counting all nodes whose path value is LIKE the current path value + '.%'.

The resultant query should be fairly obvious at this point:

WITH EmployeeLevels AS
(
    SELECT
        EmployeeId,
        CONVERT(VARCHAR(MAX), EmployeeId) AS thePath,
        1 AS Level
    FROM HumanResources.Employee
    WHERE ManagerId IS NULL

    UNION ALL

    SELECT
        e.EmployeeId,
        x.thePath + '.' + CONVERT(VARCHAR(MAX), e.EmployeeId) AS thePath,
        x.Level + 1 AS Level
    FROM EmployeeLevels x
    JOIN HumanResources.Employee e on e.ManagerId = x.EmployeeId
),
EmployeeRows AS
(
    SELECT
         EmployeeLevels.*,
         ROW_NUMBER() OVER (ORDER BY thePath) AS Row
    FROM EmployeeLevels
)
SELECT
     ER.EmployeeId,
     ER.thePath,
     ER.Level,
     ER.Row,
     (ER.Row * 2) - ER.Level AS Lft,
     ((ER.Row * 2) - ER.Level) +
        (
            SELECT COUNT(*) * 2
            FROM EmployeeRows ER2
            WHERE ER2.thePath LIKE ER.thePath + '.%'
        ) + 1 AS Rgt
FROM EmployeeRows ER
ORDER BY thePath
... And that's it!  We have now converted the Adjacency List tree into a Nested Sets tree.

In the next installment, I will show how to determine the Nested Intervals encoding from an Adjacency List tree, also using recursive CTEs.

Questions?

Comments

# Swinging From Tree to Tree Using CTEs, Part 1: Adjacency to Nested Sets @ Thursday, July 13, 2006 12:47 AM

Originally posted here.I'm not sure how many times over the last several years I've seen the
same tired...

Anonymous

# Swinging From Tree to Tree Using CTEs, Part 1: Adjacency to Nested Sets @ Thursday, July 13, 2006 12:48 AM

Originally posted here.

I'm not sure how many times over the last several years I've seen the
same...

Anonymous

# re: Swinging From Tree to Tree Using CTEs, Part 1: Adjacency to Nested Sets @ Thursday, August 03, 2006 3:54 PM

Do you have a version that will work with SQL Server 2000?

Brian

# Swinging From Tree to Tree Using CTEs, Part 1: Adjacency to Nested Sets @ Monday, January 08, 2007 2:29 PM

Originally posted here . I'm not sure how many times over the last several years I've seen the same tired

Anonymous