Getting Time Right in Analysis Services
By Tom Chester
Published: 10/30/2003
Reader Level: Beginner Intermediate
Rated: 4.50 by 4 member(s).
Tell a Friend
Rate this Article
Printable Version
Discuss in the Forums

Rare is the OLAP cube that does not include a so-called Time dimension – a dimension that allows data to be analyzed over time. Though often named simply “Time”, the typical Time dimension is actually a date dimension, with a hierarchy consisting of Year-Quarter-Month-Day or the like.

Despite the unique ever-presence of Time, this is the one dimension that developers who are new to SQL Server Analysis Services (a.k.a Microsoft OLAP) often get wrong. This article will explain the correct way to implement Time dimensions in Analysis Services, and the benefits thereof.

Readers are expected to be familiar with Analysis Services, fact tables, dimension tables, and star schemas.

The Typical Mistake

Consider the following star schema. This is the basis for a cube that will contain three dimensions: Time, Product, and Customer.

Figure 1 – Star schema lacking a time dimension table

This is a typical example, albeit a simplified one, of a schema implemented by a novice OLAP developer. Notice there is a dimension table for Product and a dimension table for Customer, but not for Date; data is being derived from the fact table. There are many drawbacks to this design, as explained below.

The Pitfalls

The lack of a Time dimension table leads to several pitfalls, some of which may not be evident until after the database goes into production:

No Control of Dimension Membership

When the Time dimension is derived from the fact table, fact instances determine the membership of the dimension. When facts determine dimension membership, you can wind up with unexpected (and undesired) gaps in the dimension.

Suppose your database is used to analyze Web site usage. Further suppose your Web site goes down for a day, hence there are no entries in the fact table for this date. The missing day will not exist in the OLAP dimension. This day of non-activity may be important to users of the cube, something you want the ability to view or report. Only with a dimension table do you have control over dimension membership.

Slower to Process

When the Time dimension is derived from the fact table there are extra steps required as part of the regular cube processing cycle. It becomes necessary to perform an incremental update on the Time dimension, as well as the cube. This takes extra processing time, perhaps significant in the case of large fact tables.

Limited Control of Member Names

When the Time dimension is derived as a date-time column in the fact table, Analysis Services derives the member names, over which you have limited control. Take quarters for instance. Here are some common names you may wish to assign to a quarter:

  • Q1
  • Qtr1
  • 2004-Q1|
  • FY2004-Q1 (where FY is short for “fiscal year”)

By default, Analysis Services will assign the name Qtr1. While it is possible to modify the expression that defines the name, it takes complex gyrations to generate the last two examples.

Unable to Implement Non-Standard Rollups

When Analysis Services derives a time hierarchy (e.g., Year-Quarter-Month) from a date-time column, the developer has no control of the rollups. For example, January 1 will always rollup to January. At first glance, this may seem good, but there are many situations where flexibility is a requirement. None of the following examples of non-standard rollups are possible when the time hierarchy is derived from a date-time column:

  • Financial applications often divide a year into 13 four-week accounting periods, as opposed to 12 months.
  • There are businesses and even entire industries (e.g., the advertising industry) where a month begins on a date other than the first for reporting and/or accounting purposes. For example, when the month is defined as beginning on the 7th, February 1-6 must rollup to January.
  • Suppose the time hierarchy must include a Weeks level. If derived from a date-time column, you have no control of what day should be considered the start of week.

Unable to Assign Member Properties

Suppose you need to analyze Web site activity that occurs on holidays. Naturally, this requires that the database be aware of which dates are holidays. This requires a Holiday Y/N attribute assigned to dates, which is not practical without a dimension table.

OLAP member properties are derived from attributes in the dimension table. It is possible to assign attributes to dates only with a dimension table, which in turn enables special time-based reporting and analysis. Even if this is not an immediate requirement, it is critical that your database be designed to support this capability in the future should the requirement emerge.

Dimension Can’t Be Shared

When a time dimension is derived directly from the fact table, the dimension cannot be shared with cubes that are derived from different fact tables. As a result, you will wind up with multiple time dimensions, each requiring care and feeding. It is practical to implement a shared Time dimension only with a dimension table.

To drive home the value of shared dimensions, be aware that a shared dimension can have different granularity in different cubes. This permits, for example, a shared Time dimension with a Year-Quarter-Month-Day hierarchy to have Day granularity in Cube A, and Month granularity in Cube B.

Time Done Right

Here’s a revised star schema that includes a Time dimension table. Notice that time is treated no different from the other dimensions. Like Product and Customer, a foreign key in the fact table joins to the Time dimension table.

Figure 2 – Star schema with Time dimension table

Typically, as with other dimensions, an integer is used as the primary key/foreign key for the dates. This is not required; you can use a date-time column instead. An integer will result in a leaner fact table though. However, it is important to note that in order to achieve flexibility of member names, and flexibility of rollups, that the date-time column not be used as the basis for the dimension. That’s the reason for the various Name and Key columns.

There is one important caveat: Contrary to common practice, it is my belief that denormalized designs, like TimeDim in Figure 2, are to be avoided. (Common practice does not necessarily equate to best practice after all.) In order to really do time right, this table should be snowflaked into separate year, month, quarter, and day tables.

What’s Missing?

Here are some additional columns that might be applicable in your time dimension table:

  • Fiscal period names and keys -- FYYearName, FYYearKey, FYQtrName, FYQtrKey, etc.
  • DayOfWeek (e.g., Monday)
  • Attributes such as IsHoliday

Building the OLAP Dimension

To fully realize the benefits of a Time dimension table, you must build the dimension correctly within Analysis Manager. Take note, because this requires some counter-intuitive inputs.

When using the dimension wizard, once you have specified the dimension table, you are prompted to specify if you are building a Time dimension. (This prompt only occurs when there is a date-time column in the dimension table, hence you may not get the prompt, in which case this is a non-issue.) Do not check Time Dimension; do check Standard Dimension. This in turn displays the same user interface that is used to create all other dimensions, where the levels are specified one by one, vs. the one-click interface used to define (less flexible) time dimensions. This is where you gain the benefit of specifying the member names, and gain the benefit of non-standard rollups, per the hierarchies defined in the table.

An “Official” Time Dimension

This begs the question “What is special about time?” The answer is, "almost nothing." Analysis Services possesses no special awareness of time.

But, you might ask, what about the myriad date-time functions contained in the MDX language (the query/formula language used in Analysis Services), such as LastPeriods, ParallelPeriod, and the like? Guess what, these functions can be used with any dimension, not just time, illogical though it may be to do so. This raises the one trivial area where there is significance to an “official” Time dimension: With an official Time dimension, certain arguments for certain MDX date-time functions become optional arguments.

In addition, it is conceivable that a given Analysis Services front-end will treat an official Time dimension differently than a non-Time dimension. Thus, to anoint your hand-rolled Time dimension as an official Time dimension, perform these inputs in the Analysis Services dimension editor, bearing in mind the benefits are superficial:

  1. Select the dimension object and change the Type property to Time.
  2. Select each level one by one, and set the Type property accordingly (Year, Month, etc.).

Populating the Dimension

With a Time dimension table in place, it is incumbent on the OLAP developer to populate it. If historical data is being added to the cube, you will backfill the dimension with historical dates. Going forward, it is typical to add a year’s worth of new dates in advance of the new year, rather than adding new dates as part of a daily or weekly load routine.

In the case of planning applications (e.g., forecasting), a Time dimension table makes it possible to create an OLAP planning model in advance of fact instances. That is, you can populate the Time dimension with future dates. This creates an empty cell space in the cube where formulas can be used to generate a plan, similar conceptually to the way that spreadsheets are used.

What’s the Key?

It is common that the keys for the Time dimension table are sequential numbers starting with 1. This is inane for two reasons:

  • This precludes, or at least renders difficult, the task of backloading historical data. Even if backloading is not part of the original requirement, it could emerge as a future requirement.
  • The benefits of logically formed keys are not obviated just because this is a data warehouse. Assuming day granularity, it is recommended that the key be a big integer in the form of YYYYMMDD.

Include the All Member?

By default, all dimensions have a system-generated level that contains a system generated All member — a member that represents the grand total for the dimension. You may want to exclude the All level/member from your Time dimension. That’s because a grand total for time may represent an arbitrary range of dates; it may be a number you do not wish to report. To exclude All, select the dimension object in the dimension editor and set the All Level property to False.

It’s a good idea to make this decision up-front, prior to the rollout of the database. When you take away the All member, it is possible MDX formulas will have to be revised.

However, if you do exclude the All level/member, be aware that any query that doesn’t explicitly reference the Time dimension will default to the first member at the top level (e.g., the oldest year), whereas it is probably desirable to default to the last member (e.g., the most recent year). This behavior can be overridden in the dimension editor. With the dimension object selected, change the Default Member property. (It’s possible to use a dynamic MDX expression so that yearly maintenance is not required, a topic outside the scope of this article.)

Conclusion

To summarize the main recommendations:

  • Always use a dimension table for time; do not derive the dimension from the fact table.
  • Do not derive the dimension from a date/time column.
  • Use a meaningful key for dates, e.g., YYYYMMDD. This makes it easier to backload historical data.
  • Decide in advance if the All member is meaningful; remove it, if not.

Tom is a software developer, database consultant, author and educator. He has been focused in the field that has come to be known as Business Intelligence since 1990. Tom has created software products that were published by Microsoft (Open EIS Builder) and Hyperion, formerly Arbor Software (Open Budget Pak, EssReporter). He can be reached via his Web site: www.tomchester.net.



Marketplace
(Sponsored Links)
What are the green links?
   



 
Copyright 2007 CMP Tech LLC | Hosted By SecureWebs.com
Privacy Policy (4/10/06) | Your California Privacy Rights (4/10/06) | Terms of Service | Advertising Info | About Us | Help