Using stored procedures keeps data available
At the Cendian Corporation, a chemical logistics provider, the business intelligence (BI) group I work with gives users throughout the company the information they need to make better decisions. So my solution covers a variety of end-user needs, such as financial reporting and accrual determination, margin analysis across shipping lanes, and delivery reliability tracking. My company employs 300 people, 125 of which are active Cognos BI users. My group also supports users who access data directly from Microsoft Access or Query Analyzer or who receive automated reports through email. I perform daily incremental updates to my data warehouse, and I perform daily rebuilds of my data mart and BI tables. The tables are the key denormalized access point for the source data for my users and they help improve the performance and consistency of subsequent aggregate table builds and data-mart loads by eliminating redundancy in multi-table joins and business-rule logic. A daily reload provides me with a great deal of design flexibility; however, the reload can increase load times as volumes grow because all data is reprocessed daily.
During my testing, I found that tables reloaded faster if they had no indexes and I added indexes after the reload. Also, many of my BI tables were being loaded from views that had evolved over time to include all necessary business-rule logic, and the extraction, transformation, and loading (ETL) tool I use was only providing a pipe to and from the SQL Server. My BI solution resides on two SQL Servers: a data-warehouse server (our main workhorse, a clustered, 8-CPU server running SQL Server 2000 Enterprise Edition) and a data-mart server that's for reporting. The environment also employs an Informatica Server for ETL and a Cognos Server for reporting and analysis. The two SQL Servers have 600GB of usable space on EMC drives and are currently utilizing 400GB of that disk space.
The key to my solution is using SQL Server's EXEC command to execute a dynamic character string in a T-SQL batch. To load tables without indexes, I needed to drop the indexes before the load and add them back afterward. To avoid the maintenance nightmare that can result from hard-coded scripts, I use a handful of stored procedures. The sp_CBI_UpdateIndexInfo stored procedure collects all index information for key tables in my databases and puts it into a central table repository. The sp_CBI_IndexMaintenance stored procedure creates and executes the SQL code that drops or creates all defined indexes, including statistical and constraint indexes, for a specified table. Because I can invoke pre- and post-SQL commands for each target table from the ETL tool, I use the information from the central index repository table, which sp_CBI_UpdateIndexInfo collected, to invoke sp_CBI_IndexMaintenance before and after a table load to drop and create an entire table's indexes.
I created the sp_CBI_BuildTable stored procedure that Web listing 1 shows, which dynamically generates and executes SQL code that creates a target table from a source table or view by using a SELECT INTO−type load. Figure 1 shows the steps my solution uses to create the target table. Sp_CBI_BuildTable also calls sp_CBI_IndexMaintenance to create any indexes that existed on the target table previously. I needed to minimize table down time by using temporary table names during the rebuild. When the temporary table and index creation is successful, sp_CBI_BuildTable drops the current target table and renames the temporary table to the final target table name.
To users, my tables are read-only, so my stored procedures automatically recreate the indexes, compacted with no empty space, by using the index option fillfactor set to 100 (regardless of whether the index was originally created that way). The result is smaller indexes, which means less disk space to hold them and less I/O to query them.
As a result of using my dynamic SQL solution to drop and create indexes before and after a table reloads, I can reload a table twice as fast as I could before. Plus, I don't have to modify any scripts if I choose to add new indexes to any of my tables. I can use a view to refresh SQL Server tables that I've rebuilt dynamically 10 times faster than before. These two solutions have given me the speed and flexibility to offer more pre-built BI results, which has attracted more users within the company and has made my BI support crucial to the company.