Use new aggregation types to solve complex business problems

The power of an OLAP solution resides in its ability to quickly answer highly summarized questions such as "What were the total product sales in Europe by quarter?" Most OLAP solutions answer summarized questions by performing simple addition to find the sum of a series of numbers. Some OLAP situations, however, require aggregation types other than sum. For example, when aggregating inventory levels, you want the last value in a time series. If you have a measure called Minimum Inventory, you would aggregate it using the minimum of the series of numbers in the measure. SQL Server 2000 Analysis Services supports five types of simple aggregations: sum, count, min, max, and distinct count. These aggregation types are self-explanatory except for distinct count, which you use to determine how many distinct combinations of dimension members exist. The classic use of distinct count is to find answers to questions such as "How many sales transactions included both eggs and milk?"

However, some OLAP solutions require aggregation formulas that go beyond these five. For example, if you have a dimension that models a financial income statement, you might want most of the numbers to sum to the parent dimension member except in one situation: At the point where income categories and expense categories come together, you want to subtract expenses from income. Analysis Services' custom rollups feature lets you build custom aggregation formulas that can solve nonstandard aggregation problems. In this issue, I cover the three custom rollup techniques—custom rollup formulas, custom member formulas, and custom rollup operators—that Analysis Services supports and describe the benefits and drawbacks of each.

Customizing Aggregation


The first thing to know about custom rollups is that they affect a different scope within the cube than does a measure. For example, using the count function to aggregate a particular measure in your cube affects that measure across all dimensions. Analysis Services' three custom rollup techniques affect only one dimension but all measures within that cube. You can use MDX to further limit the scope of the custom rollups to a subset of a cube's measures or to a subset of a dimension, but you can't expand custom rollups beyond one dimension.

Custom rollup techniques can solve a variety of problems, but before I describe the types of custom rollups, let's explore a couple of alternative techniques that also create custom aggregations. For example, if you need to define an algorithm for aggregating one or more measures across all dimensions but the basic Analysis Services aggregation types won't do, you can use either a calculated member in the measure's dimension or a calculated cell formula that you limit to one measure. Both of these techniques are powerful because you use MDX formulas, which are flexible and extensive, to define them. Calculated cells are possibly the most powerful custom aggregation tool because they control the way existing (noncalculated) dimension members are evaluated, and you can limit their effects to almost any subset of a cube. (For more information about calculated cells, see "The Power of Calculated Cells," November 2000.)

A common technique for customizing the way you aggregate a measure is to define a calculated measure based on a loaded measure, then hide the loaded measure. For example, you might aggregate a Sales measure as a sum, but in two dimensions, you want to aggregate the measure as an average. In the measure definition, you can specify that a measure be named TempSales and be loaded directly from the Sales column in the fact table. You can mark this measure as hidden so that it's invisible to OLAP client applications; then, you can use TempSales in a calculation without TempSales being available to your analysis users. You can then use Analysis Manager to create a new calculated measure named Sales that will return the unmodified TempSales value except when you want the value to be an average of TempSales.

This technique of creating calculated measures and hiding loaded measures is common in SQL Server 7.0 OLAP Services implementations because OLAP Services doesn't support calculated cells or custom rollup techniques. However, calculated measures in both SQL Server 2000 and 7.0 have several drawbacks. For example, you can't use calculated measures when writing back to cube cells. One reason Analysis Services and OLAP Services won't let you write back to a calculated measure is that a calculated measure doesn't map directly back to a column in the fact table, so Analysis Services doesn't know which loaded measure to modify. Consequently, calculated cells aren't particularly useful in budgeting or modeling applications.

Another drawback of calculated members is that you can't use them with the MDX AGGREGATE function. The AGGREGATE function is a common MDX function that you use to aggregate a set of members or tuples. The measure you identify in the set determines the aggregation method that the AGGREGATE function uses. If you use a calculated measure in the set, Analysis Services (and OLAP Services) can't determine the aggregation method, so the AGGREGATE function fails. If you use a technique such as calculated cells to modify a measure's aggregation, the AGGREGATE function works because it is based on the measure's defined aggregation method.

Custom Rollup Techniques


The first custom rollup technique, custom rollup formulas, lets the cube builder define an MDX formula for each dimension level. Analysis Services uses this formula to determine the value of the dimension level's members. For example, you could use an AVERAGE function rather than a summation to determine all members in one dimension level. If you use the AVERAGE function, the MDX formula for a dimension called Customers would be Avg( Customers.CurrentMember.Children ). One advantage of custom rollup formulas is that you don't have to add columns to your dimension tables to store the formulas. Instead, you can add the formula in the Dimension Editor Advanced Properties pane, as Figure 1 shows.

If you want to apply a custom MDX formula to a subset of the members in a dimension level, you have a couple of choices. You can define a custom rollup formula for all members at the dimension level and include some conditional logic to restrict which members the formula affects. For example, if you want to apply the average formula to all members of the Customers, Country level except USA, you could use this formula:

iif( Customers.CurrentMember.Name = "USA",
   Customers.CurrentMember, Avg(
        Customers.CurrentMember.Children ) ).

The second custom rollup technique, custom member formulas, lets you define a different MDX formula for each dimension member. To define these formulas, you need to add an extra column to the dimension table for storing the formulas. Analysis Manager adds this column automatically when you turn on custom member formulas in the Dimension Editor, as Figure 2 shows. Defining a formula for every dimension member might seem like overkill, but in some situations, the resulting flexibility is useful. Imagine you have a cube containing sales figures that span multiple countries with different currencies. You could store currency in the fact table as local currency values, but when you aggregate the values to the country level, you might want to convert them to a common currency. This approach would let you use like numbers for country-by-country comparisons but still show the sales values within the country in units that make sense to local salespeople.

Custom member formulas can also be useful when you want to apply custom weighting to values as you aggregate them. For example, if you generally average regional numbers, but some regions are more important to your business, you might want to increase the weighting of those numbers in the average. Maybe you usually have a lot of sales in Yugoslavia, but because of recent turmoil in that area, you want to reduce the effects of those sales numbers in your decision making. With custom member formulas, you can tweak the weighting numbers for each region in a geography dimension.

The third custom rollup technique, custom rollup operators, is a simpler version of custom member formulas. As with custom member formulas, you must create an extra column in your dimension table to store the operators. Custom rollup operators are simpler than custom member formulas because you have a choice of just five operators to use, whereas custom member formulas can be any MDX formula. Rollup operators control how Analysis Services aggregates a set of dimension members to form the parent dimension member's value. For example, if all the child members have + operators, Analysis Services adds their values. If one of the children has a - operator and the rest have +, Analysis Services subtracts this child from the sum of the other children. The operators are applied in the order of the dimension members. In other words, Analysis Services evaluates the members in their natural dimension order, applying each unary operator in sequence. (Unary operators are operators that require only one operand—i.e., value.) The legal unary operators are:

+   The Analysis Services calculation adds the dimension member's value to the total of the previous members' values.
-   The Analysis Services calculation subtracts the dimension member's value from the total of the previous members' values.
*   The Analysis Services calculation multiplies the dimension member's value by the total of the previous members' values.
/   The Analysis Services calculation divides the total of the previous members' values by the dimension member's value.
~   The Analysis Services calculation ignores the dimension member's value.

Custom rollup operators are especially useful for simple aggregation algorithms. A good example is a budgeting cube that aggregates budget categories. You must add income categories to the total and subtract expense categories. You could use the multiply operator to apply tax rates when modeling an income statement.

For an example of custom rollup operators, let's look at the FoodMart 2000 Budget cube that comes with Analysis Services. The Budget cube's account dimension is a parent-child dimension that uses custom rollup operators. Figure 3 shows the account dimension table; the account_rollup column contains the rollup operators. Because account is a parent-child dimension, the combination of the account_id column and the account_parent column determines the parent-child hierarchy.

The budgeting scenario seems like an obvious match for custom rollup operators, but beware of one thing: Write-back allocations, which are new to Analysis Services, don't support custom rollup operators. Write-back allocations let you write a new value to an intermediate level in a cube. Internally, Analysis Services determines which leaf-level (bottom-level) members the write-back operation affects, then changes those members and reaggregates back up to the targeted member. Suppose you used the simple equal allocation scheme (in which the write-back value is split equally among all leaf-level descendant members) to write back to an intermediate account in the FoodMart Budget cube. Then after the leaf-level accounts were filled in and aggregated back up to the targeted account, the value wouldn't be the same one you specified. You get a different result because the equal allocation assumes a simple summation of all leaf-level members but the Budget cube uses custom rollup operators. To make write-back work with custom rollup operators, you probably need to create a write-back application that specifies a complex MDX formula to apply the write-back allocations to the leaf-level members. Or your application might have to implement write-back allocations outside of Analysis Services and write only to leaf-level cells.

Building cubes can be tricky because real-world business situations always have special requirements, such as currency conversions and non-additive measures (e.g., inventory levels). To meet these special requirements, an OLAP solution must be flexible in defining aggregation formulas. Understanding Analysis Services' techniques for customizing the way values are aggregated in a cube can help you design your cubes in the best way to address your particular business needs.