Most analysis applications examine historical data but don't let users modify that data. However, SQL Server 2000's Analysis Services lets users write back to the analysis data (the cube). Write-back lets users change the underlying values stored in an OLAP cube. Analysts can use this write-back capability to develop forecasting, modeling, or data-entry systems. Enabling Analysis Services' write-back capability is simple: Right-click a cube in Analysis Manager, and select Write Enable. But developing a complete write-back application is much more complex. A write-back application is valuable because it lets you analyze the present or the future—rather than just the past. For example, you can view your progress in the current month or even predict upcoming sales results. Recently I developed a write-back solution for our software-development organization to track the time that each team member spends on each active project.
The application I wrote is simple but capable, much like the time-tracking applications that consultants use for billing time or tracking time to calculate their Return on Investment (ROI). The difference between this application and many other time-tracking applications is that it's primarily an analysis application. Most other time-tracking applications are primarily billing applications that include simple reporting capabilities. You could extend this application to integrate it with a billing system or produce billing reports, but this application's purpose is to determine ROI, utilization (which is a measurement of the percentage of hours billed), and efficiency.
The fact table contains a record for each time entry for every combination of employee, project, and date. The Hours measure is summed in the fact table so that each Hours value represents a change to the previous value. For example, if you entered 8 hours for a work day but changed the value to 6 hours, the fact table would contain two rows—one containing 8 hours, the other containing -2 hours. The fact table rows also contain the username of the person who made the time entry and the date and time of the entry. This information is useful for diagnosing problems that might crop up later in the data because, for each cell in the cube, you can see when values were entered and who made the changes.
In the cube I created, the Employee dimension is a two-level dimension even though employees are categorized in three ways: by department, by group, and by name. To make the dimension easier to navigate, I omitted Group Name from the dimension levels and included it as a member property. If you need to know an individual's group, you can inspect the member property without having to navigate groups during analysis. If measuring a group's cost or utilization is important, you might want to include such a level in the Employee dimension.
Note two other details about the Employee dimension. First, the hourly billing rate for an employee is available as a member property. This cube structure doesn't contain enough information to determine historical billing rates for an employee—only the current billing rate is available. This limitation could be a problem if you're viewing a sum of an employee's billings for a time period that includes a billing-rate change. This problem wasn't a concern for my application, but one possible solution is to include a Billing Rate measure in the fact table. Then when an employee makes a time entry, an automated process—such as the Data Transformation Services (DTS) package I describe later—can copy the current billing rate from the Employee dimension to the new fact-table entry. If the billing rate later changes in the Employee dimension, the previous fact-table entries will still contain the older (appropriate) billing rate.
A second thing to note about the Employee dimension is that it includes the employee's email address. Making sure employees keep their time entries current is a constant battle. Because the email address is a member property, the application can run an automated service against the cube to send an email reminder to employees.
The Project dimension includes estimated and actual dates for important project milestones. The milestones that we track for commercial software development are feature complete (FC), release candidate 1 (RC1), and release to manufacturing (RTM). Tracking only one set of these estimated dates is somewhat limiting because, in reality, several revisions of the project schedule exist, and each revision has a set of estimated milestone dates. Each set is the best estimate at a particular point in time. For example, after the project's investigation phase, you have an estimate for when the project will be completed, but that estimate is updated after the project design milestone and again after the FC milestone. To keep this application simple, I tracked only the most recent schedule estimates.
The last dimension to discuss is the Time dimension. I generated a Time dimension that has two hierarchies. The first hierarchy (Time.Project) matches a standard calendar but is organized into weeks rather than quarters and months. The second hierarchy (Time.Fiscal) organizes the days in our company's fiscal calendar, in which first quarter starts April 1. The week organization matches our project schedules and is the most straightforward way for team members to view time when entering their time information. The fiscal calendar is the most natural way to analyze the data because it matches the way our company manages budgets and measures performance.
In a historical-analysis application, the cube's Time dimension usually contains entries only for the date range that contains data. In a data-entry application such as mine, including dates in the cube for all time periods in which employees will enter data is important. I used Microsoft Excel 2002 to construct the Time.Project dimension, as Figure 2 shows. Because I needed to track which days were work days, I needed to include in the table fields that identify weekends and holidays. Excel makes the identification of weekends easy with the WEEKDAY() function. The following Excel formula returns a 1 if the date in cell B3 is a Saturday or Sunday; otherwise, the formula returns a 0:
After I created the time table, I imported the data into the star schema's TimeTable, which Figure 1 shows. TimeTable has the date listed twice. The first field, Date, is a normal datetime field that creates the Time.Project dimension in Analysis Manager. The second field, DateText, is a string-type field that matches the Analysis Services date format. I explain why these date formats are important later.
The star schema contains only one measure: Hours. This measure is the one that employees modify when they write back to the cube. I also created a calculated member called Cost to use in analysis. Cost uses Hours and the Billing Rate in the Employee dimension to determine the Cost for any time period, employee, and project (or combination). In a consulting company, this calculated member would be called Revenue. But my application is designed for analysis by the person paying for the project, not the person working on the project. The following MDX formula returns the Cost measure:
If I were to create a project time-tracking application for a consulting group, I'd include facilities for tracking project expenses, because time is only part of what you bill a client for. Expense tracking is an important part of analysis because you can often find ways to save money when you analyze expense data. I won't describe expense tracking in this article, but note that an expense cube can use the same Project and Time dimensions that a time-tracking cube uses. You can easily create a virtual cube that lets an analyst use these shared dimensions to see time and expense information together.
When you write-enable the cube, Analysis Manager creates a write-back table. Initially, the table is empty, but Analysis Services puts a record in the table for each write-back value (one day, employee, project, and hourly value). This write-back table is different from the fact table, but it contains the same information. You can think of the write-back table as a completely denormalized version of the star schema that Figure 1 shows. The write-back table contains a field for each level of each dimension plus a field for each measure. The write-back table also includes fields for the username of the person who made the entry and the date and time that the entry was made.
Because of a couple of problems with the write-back table, you probably want to periodically empty this table into the cube's fact table and reprocess the cube. The first problem is that cube access becomes slower when the write-back table grows. The second—and more important—problem is that making changes to the cube's structure can cause Analysis Services to delete and reconstruct the write-back table. Any change to the cube's dimension levels or measures causes a structural change in the write-back table. So, for example, if you add a level to the Project dimension, Analysis Services will reconstruct the write-back table and destroy the table's contents. Therefore, you need to make sure the write-back table is empty whenever anyone makes structural changes to the star schema. To convert the write-back table's records to the form of the star schema's fact table, you need to join the write-back table records to the dimension tables to determine the TimeID, EmployeeID, and ProjectID. At this point, the TimeTable's DateText field becomes important. The format of the DateText field matches the format of the date in the write-back table, which simplifies joining the two tables.
In my next article, I'll describe how to automate moving records from the write-back table to the fact table and how to reprocess the cube. I'll also show you how to create a simple data-entry application for entering time values into the cube. Although the application I created is simple, it shows what you need to know to use Analysis Services to develop a write-back or modeling application.