As part of my role as a hands-on data management advisor, I have to migrate data stored in legacy and nonrobust systems into a managed system environment. One of the most common formats I have to work with is Microsoft Excel. I prefer to review the data from large Excel worksheets and workbooks in SQL Server because I can use T-SQL queries to thoroughly check the data. To get the data into SQL Server, I had been using the Import Data feature of SQL Server Management Studio (SSMS) or building INSERT statements by concatenating values together in Excel expressions. However, both methods proved tedious, so I developed a solution that automatically builds the INSERT statements for me.
My solution consists of an Excel macro named Create_Insert_Statements and a stored procedure named BSP_UTILITY_MatchColumnDataTypes. Here’s how the solution works. In the Excel worksheet whose data you want to export, you start the macro. The macro prompts you for a name for the table that will contain its output. The macro then constructs a CREATE TABLE statement using that table name and the worksheet’s column headers. The headers must be unique and can’t contain spaces. In addition, the worksheet’s first column needs to have contiguous values.
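To make the macro's first step concrete, here is a sketch of the CREATE TABLE construction in Python rather than the macro's actual VBA. The function name `build_create_table` is my own for illustration; the real logic lives inside the Create_Insert_Statements macro. It follows the rules described above: every column starts out as varchar(255), and the headers must be unique and contain no spaces.

```python
def build_create_table(table_name, headers):
    """Sketch of the macro's CREATE TABLE step: every column is
    created as varchar(255); the stored procedure tightens the
    types later. Enforces the header rules from the article."""
    if len(set(headers)) != len(headers):
        raise ValueError("column headers must be unique")
    if any(" " in h for h in headers):
        raise ValueError("column headers cannot contain spaces")
    cols = ",\n    ".join(f"{h} varchar(255)" for h in headers)
    return f"CREATE TABLE {table_name} (\n    {cols}\n)"

# Example (hypothetical table name):
print(build_create_table("StagingSales", ["OrderID", "Customer", "Amount"]))
```

Starting every column as varchar(255) keeps the generation logic simple: nothing about the data needs to be inspected at this stage, because the stored procedure handles typing afterward.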
Next, the macro loops through the data, building one INSERT statement per row that covers all the columns. The INSERT statements, which take the form
INSERT INTO <table name> VALUES ('<value1>', '<value2>', ...)
are written to a blank column (i.e., the third column past the last column containing data).
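The row-by-row generation can be sketched in Python (the actual solution does this in VBA inside the macro; the function name `build_insert_statements` is hypothetical). Because every target column is varchar(255), each value is emitted as a quoted string, with embedded single quotes doubled in the usual T-SQL way:

```python
def build_insert_statements(table_name, rows):
    """Sketch of the macro's loop: emit one INSERT per worksheet row.
    All values are written as quoted strings (the target columns are
    varchar(255)); embedded single quotes are doubled, T-SQL style."""
    stmts = []
    for row in rows:
        vals = ", ".join("'" + str(v).replace("'", "''") + "'" for v in row)
        stmts.append(f"INSERT INTO {table_name} VALUES ({vals})")
    return stmts

# Example with a value containing a quote:
for s in build_insert_statements("StagingSales", [["1", "O'Brien"], ["2", "Smith"]]):
    print(s)
```

Doubling the quotes is what lets names like O'Brien survive the round trip into SSMS without breaking the generated statements.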
After the last INSERT statement is written, the macro copies the column to the clipboard. You then paste the INSERT statements into a New Query window in SSMS and execute them in any database (provided you have the necessary permissions). The result is a table and a command that you can use to run the BSP_UTILITY_MatchColumnDataTypes stored procedure. Because the macro saves all the columns as varchar(255), the stored procedure changes the data type of each column into a more appropriate type:
- If the values are numeric, have no decimals, and fall between -2^31 and 2^31 - 1, the column data type is changed to int.
- If the values are numeric but not integers, the column data type is changed to float.
- If every value passes SQL Server's ISDATE() check (i.e., no value returns ISDATE() = 0), the column data type is changed to date.
- If the values don’t meet any of the aforementioned criteria, the varchar column is resized to the length of the longest value in the column.
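The decision rules above can be mirrored in a short Python sketch (the real checks run in T-SQL inside BSP_UTILITY_MatchColumnDataTypes; `infer_column_type` and the two date formats here are my own simplifications, since ISDATE() accepts far more formats than this):

```python
from datetime import datetime

def infer_column_type(values):
    """Sketch of the stored procedure's rules for one column of strings:
    int if all values are whole numbers in the int range, float if
    numeric otherwise, date if every value parses as a date, else
    varchar sized to the longest value."""
    def is_number(s):
        try:
            float(s)
            return True
        except ValueError:
            return False

    if all(is_number(v) for v in values):
        nums = [float(v) for v in values]
        if all(n == int(n) and -2**31 <= n <= 2**31 - 1 for n in nums):
            return "int"
        return "float"

    def is_date(s):
        # ISDATE() recognizes many formats; this sketch checks only two.
        for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
            try:
                datetime.strptime(s, fmt)
                return True
            except ValueError:
                pass
        return False

    if all(is_date(v) for v in values):
        return "date"
    return f"varchar({max(len(v) for v in values)})"
```

In the actual stored procedure, the chosen type would then be applied with an ALTER TABLE ... ALTER COLUMN statement against the staging table.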
You can download the code for the stored procedure (which is in the MatchColumnDataTypes.sql file) as well as the Create_Insert_Statements macro (which is in the Dynamic Database Insert Excel Tool.xls spreadsheet) from the SQL Server Magazine website. Go to the top of this page and click the Download the Code Here button.
Besides the macro, the Dynamic Database Insert Excel Tool.xls spreadsheet contains sample data you can use to test the solution. To do so, follow these steps:
- Run MatchColumnDataTypes.sql to create the BSP_UTILITY_MatchColumnDataTypes stored procedure.
- Open Dynamic Database Insert Excel Tool.xls in Excel. Navigate to cell A2.
- Press Ctrl+Q.
- When the macro prompts you, enter a name for the output table.
- After the macro executes, paste the copied column of INSERT statements into a New Query window in SSMS.
- Execute the code.
- In the results, find and uncomment the command that executes the BSP_UTILITY_MatchColumnDataTypes stored procedure. Manually execute that command to change the column data types.
I’ve tested this solution with SQL Server 2005 and Excel 2007. To view the macro’s code in Excel, press Alt+F8 to launch the Macro window. Select the Create_Insert_Statements macro and click Edit. If you’re unfamiliar with Excel macros, you can find information about them at the Microsoft Office web page.
With this solution, I can quickly import Excel data into SQL Server so that I can thoroughly review it using T-SQL. It’s not a perfect solution for all Excel data migration scenarios, but it greatly helps me with ad hoc reviews of large Excel data sets.