OLAP lets end users drill down for report details
You might think that financial reporting should be a straightforward implementation of the OLAP model. Almost every vendor that offers a product in the business intelligence (BI) arena touts financial reporting as one of the primary implementations of its tool. But if you're going to use SQL Server 2000 Analysis Services to implement financial reporting, you can expect some complicated problems to come your way. Although Analysis Services can readily provide the detailed information that users want, it isn't designed to present the information in the form they require.
Fortunately, many OLAP developers encounter these problems, and you can learn from the experiences of others. In this article, I describe some of the problems that my team encountered when we tried to create a financial-reporting system for a large banking group, and I share the solution we used to overcome these problems. Our solution uses the Analysis Services database, not the presentation layer. To create presentation-layer reports, you can use an OLAP client tool, such as those that ProClarity or IntelligentApps provide, or a reporting tool, such as SQL Server 2000 Reporting Services.
If you want to practice the technique I describe in this article, you can download a sample Microsoft Access database and OLAP archived database, along with the included MDX queries, to reproduce our team's model. To use the sample databases, download the .zip file at InstantDoc ID 41783, open the file in a directory of your choice, and restore the OLAP database. You'll need to change the path of the Access database, which is the source for the OLAP database, to the correct path on your system. You can examine the sample MDX queries by using the MDX Sample Application that comes with Analysis Services.
Getting to the Bottom Line
When you're an OLAP developer, the problems you're concerned with aren't accounting problems. I assume that you've already solved problems such as eliminating intercompany transactions and converting currency by using your financial or enterprise resource planning (ERP) system and that you're starting with a clean trial balance. Now, you're ready to start OLAP modeling.
The starting point of financial reporting is the chart of accounts, a hierarchical list of accounts that make up the organization's financial system. Usually, this chart is an unbalanced parent-child list. In our team's starting OLAP model, ChartOfAccounts became the first dimension in our cube. Other dimensions included Time (most financial reports compare figures across several periods), Organization (a list of subsidiaries that's also a parent-child dimension), and business-specific dimensions such as Currencies and Responsibilities.
Accountants call the data source for our model the trial balance. In our starting model, TrialBalance is a table that has keys relating to each account, period, organization, and any other dimensions. We used the TrialBalance table as a fact table for our OLAP cube. The TrialBalance table contains several data fields, including the opening balance, transactions over the period, and the closing balance. Figure 1 shows a model of table relationships in the starting model.
Problems surfaced when we tried to create reports from this model. If you've ever looked at a company's annual report, you know that it contains some standard statements (i.e., reports) such as a balance sheet, a statement of operation (aka a profit-and-loss statement), and a report about changes in shareholders' equity. In addition, annual reports typically contain notes that provide further information about data in the main statements.
Statements and notes contain several lines representing various accounts, with subtotals at several levels. Figure 2 shows a typical profit-and-loss statement. Each of the lines in the report represents the total of several accounts in the ChartOfAccounts table. So, for example, a large number of accounts are consolidated into the Cost of Sale line in Figure 2. Figure 3 shows how the relationships between accounts and lines in a report might look. You can see that account 203, for example, is directly included in report line A and indirectly included in report lines B and C through account 203's parent account, 210.
Report users usually want to see reports in a format like the one in Figure 2, but they also want the ability to drill down from each report line into the accounts that contribute to that line. To create this kind of dynamic report, our team first tried adding the report structure into the ChartOfAccounts dimension. However, as Figure 3 shows, the joined structure creates a network hierarchy in which a child can have more than one parent. Analysis Services doesn't support this structure.
The second solution we tried was to create multiple hierarchies in the ChartOfAccounts dimension, with a different hierarchy for each report. We quickly discovered that option was also impractical for several reasons. First, our organization uses more than 100 reports, which means we'd need an equally large number of hierarchies. Because Analysis Services works behind the scenes to create a full dimension for each hierarchy, this option would significantly decrease database performance. Second, in a model such as ours, every report includes only a small subset of the leaf-level members of the ChartOfAccounts dimension, but Analysis Services requires that all leaf-level members be included in all hierarchies. Third, our organization's list of reports—and the inclusion of accounts in each report—tends to change frequently. If we had a separate hierarchy for each report, we would have to frequently restructure the whole database.
Finally, we realized that the solution was hidden in the problem. Our problem was that there were many-to-many (M:N) relationships between the report lines and the accounts. A basic rule in the mathematical model behind OLAP is that if two lists of entities create M:N relationships, the lists can't define two levels of the same dimension; instead, you must create two dimensions. So, we needed to separate the report lines from the chart of accounts, then map the relationships between the report lines in a Reports table and the accounts in the ChartOfAccounts table.
To accomplish this remodeling, we first created a new table and a new dimension—Reports—that reflected the structure of the reports without containing the chart of accounts. Reports is a parent-child dimension that contains all the lines in all reports. Figure 4 shows the Reports dimension table rows that correspond to the report lines in Figure 2. The table's UO field contains for each report line a unary operator that controls how level-member values roll up to their parent's values. If the unary operator is +, the value of the member is added to the aggregate value of the preceding sibling members. If the unary operator is -, the value of the member is subtracted from the aggregate value of the preceding sibling members. The unary operator value ~ means that the value of the member is ignored in the rollup process. In the Reports table, we created another field called Order, which determines the sort order for displayed members; we use this member property instead of sorting by member name or member key.
Our team's second step in remodeling table relationships was to map the relationships between the report lines in the Reports table and the accounts in the ChartOfAccounts table. Our new map created a new fact table called ReportStructure, which ties the chart of accounts to the individual report lines. This table contains three fields—Report_Line_ID, Account_ID, and Include_In_Rep—and includes a record for any account from any level that's included directly (i.e., not through its parents) in any leaf-level report line. If we want to include an account in a report line, we have to add a record to this table. The value of the Include_In_Rep field shows how to include a given account in the report. A value of 1 means "include as is" and a value of -1 means "change sign, then include." The Include_In_Rep field can have any value (not only a 1 or -1), and as you'll see in a moment, the model will multiply the account's data by this coefficient before including it in the report line.
The ReportStructure table has two keys: Account_ID and Report_Line_ID. By using this table as a fact table related to the ChartOfAccounts and Reports dimensions, we created a second cube—the ReportStructure cube—which maps the M:N relationships between accounts and report lines.
Because our team might include in our reports accounts at any level—not only the leaf level—we had to enable non-leaf members with data in the ChartOfAccounts dimension. To do so, we set the ChartOfAccounts dimension's Members with Data advance property to Non-leaf data hidden.
Analysis Services has some special features that are tied to its Measures dimension, and we used those features to convert the account data to report-line values. We moved into a separate dimension called Variables all the items that made up the Measures dimension in our previous model and left the TrialBalance cube with one measure, Value. Figure 5 shows our final model.
Be aware that a drawback to moving measures to another dimension is that you can no longer use special features in your OLAP front end that are tied to the Measures dimension. For example, in our case, we discovered limitations to on-the-fly calculations that involved the Measures dimension. But we devised a workaround for this problem that's satisfactory for our situation, and although the workaround is too complex to cover in this article, it lets us keep the items in the Variables dimension as measures without losing the benefits of our solution.
To create the reports we needed from the two cubes in our new model, we had to build a virtual cube that contains all the dimensions of the TrialBalance cube, the Reports dimension from the ReportStructure cube, and two measures: Value from the TrialBalance cube and Include_In_Rep from the ReportStructure cube.
The Include_In_Rep measure in the ReportStructure cube contains a non-zero value for each of the accounts to include in any particular report line. The Value measure from the TrialBalance cube stores the actual value for each account. If we multiply these two measures, we get the value of each account included in the report line and a 0 for accounts that aren't included. (Figure 6 shows an example of TrialBalance accounts that will be included in a particular report line.)
The multiplication is complex because the Value and Include_In_Rep measures have different dimensionality (i.e., the two measures relate to different dimensions). Value measure lacks the Reports dimension and the Include_In_Rep measure lacks the Time, Organization, Variables, and ChartOfAccounts dimensions. When you compute values in a virtual cube, measure cells contain only values at the All level for dimensions that aren't common to all the underlying cubes. For those dimensions, you can use the ValidMeasure() function to return the measure value from the cell at the All level coordinates.
Because the Value and Include_In_Rep measures have different dimensionality, we had to use the ValidMeasure() function to force dimensions that aren't common to all the cubes to their All level, as the following code shows.
\[Measure\].\[Report Value\] AS 'ValidMeasure (\[Measure\].\[Value\]) * ValidMeasure(\[Measure\].\[ Include In Rep\])'
The result is a calculated measure, Report Value. What we actually wanted to include in the final result is the total of Report Value across all the accounts. The right place for this total is in the All member of the ChartOfAccounts dimension. So, we first had to calculate the Report Value for all accounts, then aggregate them into the All member. Listing 1 shows the code to create the Report Value calculated measure. Note that, while calculating the All member's value, we excluded it from \[ChartOfAccounts\].Members. If we didn't, we'd have an infinite recursive formula.
We still had one hole in our solution. We needed to avoid aggregating the Include_In_Rep field across the ChartOfAccounts dimension because we didn't want to have a parent account in any report that included one of its descendants. However, we did want to aggregate measures across the Reports dimension. As Russ Whitney noted in his Mastering Analysis column "Alternative Aggregations" (March 2003, InstantDoc ID 37707), Analysis Services doesn't natively support aggregating a measure across one dimension and avoiding it across another one. To solve the problem, we created a custom rollup in the ChartOfAccounts dimension, as the code in Listing 2 shows. In the case of the Value measure, we aggregated all leaf-level descendants of the current account, and for the Include_In_Rep measure, we simply used the non-leaf member data.
Now, we can easily produce common financial reports like the one that Figure 2 shows. If our users need further detail about any line in a report, they can cross-join the Reports and ChartOfAccounts dimensions, suppress zero, and get a drill-down report like the one that Figure 7 shows.
One word of warning: When you create the ReportStructure table, be sure you don't directly or indirectly include an account more than once in the same report. Doing so will produce double counting. The relationships in Figure 3 illustrate the kind of duplication that can occur. Account 101 is included twice in report line A, once directly and once indirectly, through its parents.
An important benefit of our solution is its flexibility. If users in our organization want to add or change reports, they can update the ReportStructure and Reports tables, then reprocess the small ReportStructure cube. We don't have to change the ChartOfAccounts table or reprocess the large TrialBalance cube. You can apply this modeling technique to many kinds of applications other than financial reporting. Wherever you need to create large numbers of dynamic subsets of members of a given dimension and create a hierarchy of these sets, this technique provides the solution.