Maintain and update your time-tracking application's write-back cube

Data analysts use database information not only to examine past business trends, but also to forecast future trends and allow for changes. Analysis Services' write-back capabilities let you do this kind of complex analysis. Write-back lets users change the underlying values stored in an OLAP cube. Analysts can use this write-back capability to develop forecasting, modeling, and data-entry systems. The creation of an employee time-tracking application for a consulting firm is a specific example of how to use Analysis Services to develop a complete write-back application. To track employees' billable time, consulting companies need to see not only what hours an employee has logged, but also what time is available for a project. And employees need to be able to enter changes to their billable time. In last month's Mastering Analysis column, "Tracking Time," June 2002, InstantDoc ID 24769, I explained the first part of how to use Analysis Services' write-back capabilities to develop a time-tracking application. I described how to design the cube, load it with data, and create a calculated member that determines the cost of your employees' work hours. This month, I complete the application by explaining how to automate cube maintenance and how to create a data-entry application.

Automating Maintenance

As I discussed in "Tracking Time," one problem with write-back applications is that changes to the data or the data's structure in the star schema might cause Analysis Services to delete and recreate the write-back table. The recreation is a problem because the write-back table contains all the information that the application's users have entered. So if Analysis Services recreates the write-back table in your time-tracking application, you could lose all the information employees have entered about their billable time.

To solve this problem, I created a Data Transformation Services (DTS) task that moves records from the write-back table into the star schema's fact table. After the records are in the fact table, you can safely reprocess the cube without being concerned about whether Analysis Services will recreate the write-back table. Also, limiting the number of records in the write-back table significantly improves cube performance. Regularly moving the write-back table's records keeps the number of records at a manageable level. For these reasons, I schedule my DTS task to run daily.

Creating the DTS task to move the records to the fact table is fairly straightforward. The most complex aspect of the process is that the write-back table and the cube's fact table are structured differently, as Figure 1 shows. The Tracker cube's fact table has dimension tables for Employee, Project, and Time values, and the write-back table contains all that information. The write-back table is a completely denormalized version of the star schema. Every dimension level in the Tracker cube has a column in the write-back table. This relationship is the reason that Analysis Services needs to frequently recreate the write-back table: Any change to the structure of a dimension leads to a change in the columns of the write-back table. The Author and ChangeTime columns of the write-back table contain the user's name and time that the write-back entry was made. This information about each write-back entry lets analysts determine the source of incorrectly entered information, so I created the same Author and ChangeTime columns in the fact table. This extra copy of the columns lets an analyst drill through to the fact table from a selected cube cell value to view who entered the value and when.

Listing 1 shows the SQL INSERT statement that moves the fact table records. This INSERT statement also joins the records in the write-back table to the dimension tables in the star schema to determine which foreign key values the fact table needs. For example, the write-back table contains a ProjectName level but not a ProjectID. The fact table requires a ProjectID, so the INSERT statement joins the write-back table to the Projects table by ProjectName to determine the correct ProjectID. Note that the format of the date in the write-back table differs from the standard date and time format that the Time dimension uses. This difference makes joining the tables to determine the TimeID difficult. I resolved the problem when I first created the Time dimension table by creating another column called TimeText in the TimeTable table that matches the format Analysis Services uses in the write-back table.

I won't cover all the details of creating and scheduling a DTS package that moves the records, but I'll walk through enough to get new DTS users started. Start Enterprise Manager, and in the tree view in the left pane of the window, find the server that contains your star schema tables. Then, right-click the Data Transformation Services folder. Select New Package to create and edit a new DTS package. After you define the data-movement and cube-reprocessing steps in the package, you can save it. When you right-click the package name, you can select Schedule Package to enter information about how frequently you want the package to execute.

Creating a Data-Entry Application

After you design and build the cube and you have a DTS package to maintain it, the next step in creating the complete write-back application is to create an application that lets users enter new data. I created two such applications for my Tracker cube. One is a simple Web-browser­based application that lets employees enter time data directly into the cube. The other, a desktop application for employees who work at home or travel for business, lets employees enter data when they're not connected to the company network. Later, they can use this application to upload the time information to a Web service, which will write the information back to the Tracker cube.

I used C# with ASP.NET to build the browser-based application that Figure 2 shows. If you're using the .NET languages for development, you know that the .NET Framework supports data access through ADO.NET, which is a native SQL driver and a native OLE DB driver. But in the .NET Framework, no ADO MD equivalent exists for accessing multidimensional data sources. Microsoft hopes to close this gap with ADOMD.NET, a new .NET API that the company announced when I was writing this article. ADOMD.NET, which should be in beta by late summer, won't require that you have Microsoft Data Access Components (MDAC) and PivotTable Service loaded on the client computer; instead ADOMD.NET will be based on XML for Analysis. (For more information about this Web service API, see Mastering OLAP, "XML for Analysis," April 2001, InstantDoc ID 19846.) With ADOMD.NET, you'll be able to develop analysis applications by using only .NET, which means the applications you create will be much smaller and easier to deploy.

Until ADOMD.NET is available, you have a couple of other choices for accessing multidimensional data sources. You can use the native OLE DB driver, which lets you execute MDX queries and retrieve the results as recordsets (rather than the multidimensional cellsets available in ADO MD). Or you can access the COM-based ADO MD through .NET's COM Interop capability. Although C# and the other .NET languages include excellent support for COM, accessing ADO MD is tricky because some of the ADO MD methods (e.g., OpenSchema) require parameter types that don't have exact .NET equivalents. For the time-tracking application I was working on, I decided to use the native .NET OLE DB driver because it was easier.

If you choose to use OLE DB to access Analysis Services, Listings 2, 3, and 4 will help you. The code in Listing 2 shows how to establish a connection to an Analysis Services database. Listing 3 shows code that executes an MDX query and retrieves the name of a dimension member from the resulting columns. Listing 4 demonstrates how to execute an MDX UPDATE statement to change a value in the cube.

You should know a few things about using the MDX UPDATE statement. First, only SQL Server 2000 Analysis Services supports this statement. If you're using SQL Server 7.0 OLAP Services, the MDX UPDATE statement will give you a syntax error. Even if you have Analysis Services, you might get a syntax error because you might be using the SQL Server 7.0 version of PivotTable Service. If you uninstall and reinstall the .NET Framework (which includes MDAC 2.7), the default version of PivotTable Service switches to SQL Server 7.0. This switch can happen even if you've never installed OLAP Services because the basic SQL Server 2000 components include both the SQL Server 2000 and 7.0 versions of PivotTable Service. You can fix this bug by changing the HKEY_CLASSES_ROOT\MSOLAP\Clsid registry subkey value to \{a07ccd0c-8148-11d0-87bb-00c04fc33942\}.

Second, the MDX UPDATE statement always updates leaf-level cells regardless of whether you specify a mid-level dimension member. A leaf-level dimension member is a dimension member from the lowest level in the dimension. A leaf-level cell is a cell identified by leaf-level dimension members from every dimension in the cube. If you don't specify a leaf-level cell in an MDX UPDATE statement, Analysis Services splits the UPDATE statement you provided into individual leaf-level UPDATE statements for all the cells that contribute to the aggregated value of the cell you specified. Sometimes this automatic allocation is exactly what you want—but not often. Notice that in Listing 4's UPDATE statement, I used dimension members from both the Time.Project and Time.Fiscal hierarchies of the Time dimension that Figure 1 shows. If I'd specified a dimension member from only one of the Time hierarchies, Analysis Services would evenly distribute the UPDATE value I specified across all the leaf-level members of the hierarchy that I didn't specify. This distribution happens because Analysis Services treats different dimension hierarchies as if they were different dimensions. Thus, Analysis Services would write back about 700 fractional numbers to the cube, one for each day in the Time dimension.

Finally, remember that it's appropriate to use transactions with the MDX UPDATE statement. When you use a transaction to group UPDATE statements between a BeginTransaction and a Commit statement, either all the updates work or none work. Listing 5 shows an example of how to structure a transaction using the OLE DB connection object. You can start a transaction with the BeginTransaction method of the OLE DB connection object. The BeginTransaction method returns a transaction object that you can use when you execute MDX UPDATE statements. When the transaction is complete, you can issue either a Rollback or Commit method on the transaction object. I used the transaction object to combine all the updates associated with a page of data—such as the page that Figure 2 shows—into a single Commit. When you use transactions, you guarantee that all employee changes will be made unless an error occurs—in which case no changes are made.

Unfortunately, I can't include all the source code for this application because it includes some proprietary information. However, the instructions I've presented in this article should be enough to help you implement an equivalent application for your business. The key parts of creating the application are structuring the cube, moving records from the write-back table to the fact table, and using MDX to perform updates. Unless you have specific analysis requirements that require you to implement your own project time-tracking application, you might prefer to buy an application rather than build it. But you can use the write-back capabilities of Analysis Services to develop many other custom applications such as a custom forecasting application or an application that explores your organization's "what-if" business scenarios.