21732.zip

Ensure correct MDX calculations by using solve order and pass numbers

Solve order and pass numbers are among the most complex concepts in SQL Server 2000 Analysis Services' MDX language. Many analytical applications require calculations—in the form of calculated members, custom formulas, and cell calculations—to be embedded in the cubes. Individual MDX queries also frequently require embedded calculations. Because embedded and nested calculations are so common, the order in which MDX evaluates these calculations is crucial to achieving the correct results. You can use a solve-order keyword and pass numbers to control the order of evaluation. This topic is advanced, even for regular MDX users, so put on your crash helmet—we're diving in!

You often use two calculated members in combination in an MDX query, such as when you include a calculated member in the list of members for the columns and another calculated member in the list of members for the rows. Listing 1 shows a simple example of a new FoodDrink member that's the sum of Food and Drink and a new CAOR member that's the sum of CA and OR (California and Oregon). I included these members on the rows and columns, respectively. Figure 1 shows that these calculated members are totals of the columns and rows. When the two calculated members intersect (in the bottom right cell), you end up with a grand total. You can think about the combination of these calculated members as the sum of the right column, which you can express as

`                              (CA, FoodDrink) + (OR, FoodDrink)`

This formula is equal to

`                              ((CA,Food) + (CA,Drink)) + ((OR,Food) + (OR,Drink))`

Or the formula could be the sum of the bottom row. You can express that sum as

`                              (CAOR, Food) + (CAOR, Drink)`

which is equal to

`                              ((CA, Food) + (OR, Food)) + ((CA, Drink) + (OR, Drink))`

In Listing 1's query, the order of the calculated members doesn't matter because the formulas are simple summations. But the order does matter when the mathematical operators aren't transitive.

The MDX query that Listing 2 shows has two calculated members. The first calculated member, called Canned Percent, returns Canned Foods' percent of the total of Canned Foods and Canned Products. The second calculated member returns California's percent of all states in the United States. Both of these calculated members are percent-of-total calculations that are embedded within their own dimensions; they don't specify which measure (or numeric quantity) they operate on. Creating a calculation in a dimension rather than as part of a measure's definition can be convenient because you can combine such a calculation with any measure in a query to determine the percent of total for that measure. The MDX query in Listing 2 returns these two calculated members and a couple of the members from which these calculations derive. Figure 2 shows the result of the MDX sample application we're working with.

Again, look at the lower right cell, in which the two calculated members intersect. If Analysis Services evaluates the Canned Percent member first, the expression looks like

`                              (CA, Canned Percent) / (USA, Canned Percent)`

Expanding Canned Percent's formula gives you

`                              ((CA, Canned Foods) / ((CA, Canned Foods)                               + (CA, Canned Products))) /                              ((USA, Canned Foods) / ((USA, Canned                               Foods) + (USA, Canned Products)))`

If you fill in the numbers, you get the totals that Figure 2 shows:

`                              (5,268/(5,268 + 448)) / (19,026/(19,026 + 1,812)) = .9216/.9130 = 1.01`

Now, look at how the formula appears if Analysis Services evaluates CA Percent before Canned Percent:

`                              (CA Percent, Canned Foods) / ((CA Percent, Canned Foods) + (CA Percent, Canned Products))`

If you expand CA Percent's formula, you get the following formula:

`                              ((CA, Canned Foods)/(USA, Canned Foods)) /                              (((CA, Canned Foods)/(USA, Canned Foods)) +                               ((CA, Canned Products)/(USA, Canned Products)))`

or

`                              (5,268/19,026) / ((5,268/19,026) + (448/1,812)) = 0.528`

The result of this evaluation order is different from the result that Figure 2 shows. If Canned Percent is evaluated first, the formula evaluates to 1.01 (or 101 percent), but if you evaluate CA Percent first, the formula evaluates to 0.53 (or 53 percent). If you needed to evaluate CA Percent first to get the appropriate result, you could use the SOLVE_ORDER keyword to force Analysis Services to give CA Percent priority. Listing 3 shows the query in Listing 2 with the SOLVE_ORDER keyword added so that CA Percent is evaluated first. You can use SOLVE_ORDER to control the evaluation order of calculated members and of calculated cells and custom rollups.

Now let's add complexity to the solve-order problem by introducing the concept of pass numbers. When executing an MDX query, Analysis Services resolves the embedded calculations in passes. Evaluation passes are identified by number. The last pass (i.e., the most nested) is always pass number 0. If a query doesn't contain cell calculations, custom rollup formulas, or custom rollup operators, Analysis Services executes the MDX query in a single pass—pass number 0. Analysis Services executes calculated members at pass number 0 and executes custom rollup formulas and custom rollup operators at pass number 1. Analysis Services performs cell calculations at pass number 1 unless you specify otherwise by changing the value of CALCULATION_PASS_NUMBER.

You can use this multipass execution process with pass numbers to create iterative formulas, such as goal-seeking formulas. Goal-seeking formulas are formulas for which you know what the outcome needs to be; you simply adjust formula input until you achieve the desired outcome. For example, you might want to know how much you need to decrease your product cost to increase your overall profitability by 10 percent. To solve this problem, your formula must try different product costs until it finds the value that achieves a profit increase of 10 percent.

Another use of a goal-seeking formula might be to determine how much revenue your company must make next month to meet revenue objectives for the year. To determine this amount, you can create a goal-seeking algorithm that tries various revenue values for next month; with each value, the formula uses a forecasting algorithm to determine what your total revenue for the year might be.

To control both iterative formulas and evaluation order, you can use pass numbers and solve order together. Solve order specifies the evaluation order within a single evaluation pass. You can then use pass numbers to control which iteration of a multipass formula will use a cell calculation. The pass number specifies the first pass in which the formula is used, and pass depth specifies the number of passes to use the formula. For example, you could specify that a cell calculation has a pass number of 3 and a pass depth of 2—meaning that Analysis Services will perform the calculation at pass number 3 and pass number 2. The calculation won't be in effect for pass number 1 because a calculation defined with a pass number of 3 would have to be active for 3 passes to reach pass number 1 (e.g., a pass depth of 3).

If you can use solve order to control the evaluation order of two calculated members, why would you need to use pass numbers? Although you can use pass numbers to control evaluation order, that's not generally why you use them. Pass numbers are important in more complicated formulas, such as recursive calculations (i.e., calculations that reference themselves) or goal-seeking calculations.

Consider the code example that Listing 4 shows. The first thing you might notice is the abundance of a function called CalculationPassValue(). This function lets you control which pass numbers Analysis Services uses to determine a formula's value. For example, CalculationPassValue(Time.CurrentMember, 0) means that you want the value for Time.CurrentMember after pass number 0, ignoring all other passes. Referencing pass number 0 is useful when you want the actual value that's loaded in the cube. In other words, you don't want other cell calculations to affect the value you're retrieving.

The MDX query in Listing 4 returns the Unit Sales for the months of 1997, which Figure 3, page 68, shows. The row labeled OldUnitSales shows the Unit Sales values as they exist in the cube. The row labeled Unit Sales contains values affected by the cell calculation MinRecentValue. MinRecentValue is a recursive formula that returns the lesser of the current month and the previous month values. The formula is recursive because the same formula determines the previous month's value. The formula continues searching through previous months as long as the values continue to get smaller. Two conditions cause the formula to stop searching: the formula finds a month value that is larger than the successive month or the formula has exceeded its pass depth.

Because the MDX query in Listing 4 has a pass depth of 1, the cell calculation looks back only 1 month. Figure 3 shows that month 6 returned the value 21,081.00. The cell calculation MinRecentValue compared month 5 with month 6 and returned month 5 because it was smaller.

Figure 4 shows the results you get if you change the query's CALCULATION_PASS_DEPTH property to 3. Notice in this result that the month-6 value is 20,179.00. Because the pass depth increased, the formula could recurse (call itself) three times before it reached pass number 0, in which the cell calculation was no longer in effect. For month 6, the formula goes back to month 4. The formula determines that month 3 has a value greater than month 4, so it stops and returns the month-4 value.

Note that when a formula calls itself, that recursive call doesn't change the pass number—in other words, a recursive call doesn't constitute another execution pass. You can always reference another pass with the CalculationPassValue() function, as the MinRecentValue function does. The reason the MinRecentValue formula changes pass numbers when it calls itself is that it uses CalculationCurrentPass()-1 to reference the next lower pass number.

To write effective cell-calculation formulas, you need to have at least a cursory understanding of pass numbers. Even the simplest cell-calculation formulas can mushroom into infinite recursion if you're not careful. For example, say you write the following formula to try to scale down all the month values in the Sales cube by 10:

`                              (Time.CurrentMember, \[Unit Sales\]) / 10`

The problem with this formula is that it references the same cell that initiated your cell calculation. Therefore, the formula will call itself indefinitely. The only way to avoid this infinite-recursion problem is to reference the cell value from a lower pass number, as the following formula shows:

`                              CalculationPassValue( (Time.CurrentMember, \[Unit Sales\]), 0 ) / 10`

The concepts of solve order and pass numbers are complex, and you need practice to learn when and how to use them. If you're writing queries that select calculated members on more than one axis (e.g., rows and columns), you definitely need to consider using solve order. If you're using calculated cell formulas, you probably need to use pass numbers. And if you use multiple calculated cell formulas with recursion, you might need to use both solve order and pass numbers—but I'll leave those scenarios for you to explore.

To practice writing MDX queries that return ordered lists, tackle the puzzle in the Web sidebar "September MDX Puzzle," http://www.sqlmag.com, InstantDoc ID 21988. For the answer to the August puzzle, see the Web sidebar "August MDX Puzzle Solution Revealed," InstantDoc ID 21989.