Downloads
21453.zip

Finally, a use for the cross join

SQL Server developers must be endlessly inventive. Among the nifty devices bouncing around in their bag of tricks is the technique of using T-SQL to create T-SQL statements. When a potentially tedious project I worked on cried out for this technique, I ended up writing many such statements, which I share with you here. The project began when a client asked me to create 375 tables for a staging database. I spend a lot of time at the keyboard, but the idea of writing code to create 375 tables was daunting. So, I looked for another way to create the tables.

The solution to my problem lay in the data definition and the client's intended use for the tables. The 375 tables needed to hold data that automated data pumps funnel in from external data sources. After the data was lodged in the staging tables, my client could run the data through a sequence of data-integrity checks, transform it, then load it into a production warehouse. The data is organized into 75 regions that are numbered 01 through 75 and normalized at the external sources so that each region's data is distributed into five related tables. Because the data-integrity checks and transformations vary by table and by region, the region data must remain compartmentalized into these 75 separate tables. According to the client's established naming convention, the five types of tables are labeled A, B, C, D, and E. Consequently, I must name Region 01's tables A01, B01, C01, D01, and E01, for example, so that the automated data pump can find its target.

Because the table names and table structures are so standardized, I could develop a shortcut process for creating the 375 tables. Although several kinds of shortcuts would have worked in this scenario, I decided to try a type of query that I've had little use for in the past—the cross join. Most often, you use the cross join to quickly create large test data sets. SQL Server Books Online (BOL) defines a cross join as "\[a join\] that does not have a WHERE clause." That description is partially correct. A cross join doesn't contain a phrase in either the FROM clause or the WHERE clause that forces a match between the join columns of the two tables that you're linking. If you write a join by using the ANSI technique—which inserts the join expression into the FROM clause—you can specify CROSS JOIN, as the following code shows:

USE pubs
GO
SELECT au_fname, au_lname, royaltyper
FROM Authors CROSS JOIN TitleAuthor

This query contains no instruction that says, "match values of column A to values of column B," as you expect in a standard INNER JOIN or OUTER JOIN query. Instead, a cross join's resultset is a Cartesian product, which "multiplies" each row from the first table by all rows from the second table. The Authors table contains 23 rows, and the TitleAuthor table consists of 25 rows, so the resultset from this query, the Cartesian product (23 x 25), has 575 rows.

After I decided to use cross joins to help build all these tables, I proceeded to write my queries. First, for tables A through E, I created a set of "template" tables that contained the appropriate column names and data types for each of the tables but no rows of data. Next, I built two more tables—one that contained the region prefixes (A, B, C, D, and E) and one that contained the region numbers (01 through 75). Then, I cross-joined the two tables and created yet another table that contained the 375 new table names. Because I was going to build from the template tables that I created in the first step, I wrote a second query that created another table containing 375 rows, each row a SELECT INTO statement. After I created the 375 SELECT INTO statements as rows in a table, I could easily select from the table, copy the resultset, paste it into Query Analyzer, and execute all 375 SELECT INTO statements, thereby creating 375 tables.

Here's the step-by-step process, which uses the Pubs database. First, open Enterprise Manager, Query Analyzer, then choose the Pubs database from the drop-down list at the top of the window. Note that the SELECT INTO operation is part of the process. Therefore, you have to set the database to accept SELECT INTO, either from Enterprise Manager or from Query Analyzer. In Enterprise Manager, right-click Pubs, select Properties, select the Options tab, and select the Select Into/Bulkcopy check box. From Query Analyzer, run the following statement:

USE pubs
GO
sp_dboption 'pubs', 'SELECT into/bulkcopy', true

Now, create and populate a table to hold the prefixes for the 375 new table names:

CREATE TABLE MyPrefix (PrefixCode CHAR(1) )
GO

From Enterprise Manager, open the table MyPrefix (to see the new table, you might have to refresh the display) and input the values A, B, C, D, and E to create five rows in the table. Close the table window. Then, create and populate a table to hold the region values:

CREATE TABLE MyRegion(RegionID CHAR(2) )
GO

From Enterprise Manager, open the MyRegion table and input values 01 through 75, thereby creating 75 rows in the table. Close the table window. Now, create the template tables; for this exercise, you can use five tables from the Pubs database:

SELECT * INTO Table_A FROM authors WHERE 0=1
SELECT * INTO Table_B FROM publishers WHERE 0=1
SELECT * INTO Table_C FROM discounts WHERE 0=1
SELECT * INTO Table_D FROM stores WHERE 0=1
SELECT * INTO Table_E FROM sales WHERE 0=1

Next, run the following code to create a set of 375 names for the new tables by cross-joining the prefix table (MyPrefix) with the table that contains the list of region numbers (MyRegion):

SELECT PrefixCode + RegionID AS 'NewName' INTO #MyTableName
FROM MyPrefix CROSS JOIN MyRegion
GO

The resultset that you get should look like the data that Figure 1 shows. You can create temporary tables because you'll use T-SQL to create all inserts from this point on.

Now, you can build and populate a temporary table to hold the 375 SELECT...INTO code strings, as Listing 1 shows. All you have to do to produce the necessary code to create the 375 tables is first run the following query:

SELECT * FROM #MyString
GO

Then, display the resultset in text mode, as Figure 2 shows. Highlight the resultset, copy it to the Windows Clipboard (press CTRL+C), paste it back into the query window, and run the 375 SELECT INTO statements. In a few minutes, the 375 new tables will reside in the Pubs database.

Time-Saving Solutions for Repetitious Code


Until I received this project assignment, I hadn't considered using T-SQL to create T-SQL statements. I've never needed to. Neither had I been able to give a positive answer to the question, "What good is a cross join?" In one simple project assignment, I've been able to do both. You can use this technique in your job when your manager or client asks you to develop sets of repetitious code. I know that in the future, I'll look more carefully at my assignments and employ this technique whenever I can.