How to create an .msi file equivalent package to move databases cross platform
Suppose that you're the administrator of hundreds of SQL Server instances in a virtualized environment. Previously, many of your SQL Server computers were running a single departmental application and its database, but you've recently consolidated some of those applications and their databases on a single SQL Server instance to reduce the number of SQL Server computers.
Now you'd like the ability to move individual databases from one SQL Server instance to another, just as you can move OS resources (and users of OS resources) in a virtualized environment. Specifically, you'd like to be able to use a dashboard to monitor the resource utilization of each SQL Server instance and each database. If you notice that one instance is being overutilized or underutilized, you'd like to balance the resources by using the dashboard to quiesce a database, then drag and drop it from one instance to another.
Currently, there are a few challenges that you'd have to overcome to implement this idea.
Challenge #1. Databases aren't standalone resources that can be moved between instances at will. You need to move logins along with the databases, and logins live in the master database, not your database. So, to move Database A from Instance A to Instance B, you must, at minimum, move the logins from Instance A's master database to Instance B's master database. In addition, you must make sure that your database has no dependencies on msdb, other database objects in the master database, or instance-level configuration parameters such as instance collation. In other words, you need to limit the objects in your database to make sure it's as instance-independent as possible.
Challenge #2. What mechanism do you use for the drag-and-drop operation? Do you use backup and restore operations, plus something to accommodate logins? Do you script the database objects and data?
Using backup and restore operations might not work because your instances might be running different versions of SQL Server and those operations don't accommodate restoring a newer version on an older instance. The drag-and-drop mechanism needs to be instance-independent, with intelligence in the drag-and-drop API to check for possible object-version conflicts (e.g., spatial columns in a database you're going to drag and drop onto a SQL Server 2005 instance).
Using scripts might not work either. DBAs spend a nontrivial amount of time honing their scripts and checking for conflicts when doing a database move, SQL Server upgrade, or application upgrade. But suppose you want to write a script that generalizes "the drag-and-drop a database" concept for databases that the script doesn't have intimate knowledge of? The script would need to do enough checking to "just work" and return the appropriate errors so that the drag-and-drop operation doesn't cause subtle problems down the road.
From Theory to the Real World
Let's move from discussing this as a theoretical problem to a real-world use case. The types of challenges that I've described are an intrinsic part of moving databases in a SQL Azure Database platform. Currently, the SQL Azure Database platform doesn't have an msdb database, and only a discrete set of database objects are permitted. Except for database logins, none of the usual custom database objects are allowed in the master database. You also have no control of the instance environment. You only have control of logins and database-level objects. And almost all database-level properties (e.g., collation) aren't user-settable.
As a DBA, you might need to deploy on-premises databases to a SQL Azure Database platform. To do so, you need a portable, instance-independent, and version-independent database representation that lets you move schema and data. Neither attach and detach operations nor backup and restore operations will work for this because the SQL Azure version isn't under your control. These operations also don't work across trust environments, such as on-premises-to-cloud or appliance-to-cloud.
However, you can use a data-tier application (DAC) to create a "database container" that lets you move schema (and soon data) to and from SQL Server and SQL Azure databases. It's helpful to think of DAC as the .msi file equivalent for databases. DAC operates at the logical layer, so it provides a file artifact that is open, transparent, portable, and trustable. DAC isn't for everyone, though. A multi-database application with lots of dependencies on msdb, the master database, linked servers, and other instance-level and cross-database objects (such as Service Broker routes) is not a good candidate for DAC.
DAC targets departmental databases in the enterprise as well as simple third-party applications that contain a few gigabytes of data. These are also the types of applications that would be well-suited for consolidation and likely well-suited for a SQL Azure-based deployment.
DAC databases can be registered with the SQL Server Utility. This utility lets you monitor and control the configuration and performance of a set of SQL Server instances. (Although the SQL Server Utility must run on a SQL Server 2008 R2 server, SQL Server 2008 SP2 and SQL Server 2005 SP4 servers can be supported with the utility.) If a SQL Server instance is under the control of a utility control point (UCP -- i.e., a central server for collecting and monitoring data and controlling policies), you can observe and control not only instance-level data but also database-level data of DAC databases.
Equally important, after a database is registered as a DAC, you can use SQL Server Utility health policies to tune and control views showing DAC resource consumption. Thus, the SQL Server Utility can provide the dashboard that I referred to previously (although we're not using it to move databases around just yet).
How DAC Works
DAC provides a logical representation of your database and a set of services for managing it. The logical representation consists of a container that houses a set of files. The container is called a DAC package or a DACPAC because of its extension (.dacpac). The DACPAC contains at least four files: a script that contains all of the database objects or an upgrade script, and three XML files that contain the logical database description. Additional files, such as pre- and post-deployment scripts, can be included. The DACPAC is simply a .zip file, so you can easily open it to view the files inside.
The DAC services are exposed in two ways, the first of which is through GUI-based tools such as Visual Studio (VS) 2010 DAC Projects and SQL Server Management Studio (SSMS). Although DAC supports operations in previous SQL Server versions (see the list), support for DAC services isn't present in SSMS versions before SQL Server 2008 R2.
DAC services are also exposed through a .NET API contained in Microsoft.SqlServer.Management.Dac.dll. You can use the API directly in any .NET language (e.g., C#). The .NET DAC Management API surfaces its services as a series of methods in the DacStore and DacExtractionUnit classes. The methods include:
- DacExtractionUnit.Register -- Registers a database in SQL Server or SQL Azure Database as a DAC.
- DacExtractionUnit.Extract -- Produces a DACPAC from an existing SQL Server database.
- DacStore.Install -- Installs a new database from a DACPAC. It also registers the database.
- DacStore.IncrementalUpgrade -- Upgrades an existing database from a DACPAC. This method is used in DAC 1.1 and later.
- DacStore.Unmanage -- Removes the database from DAC control, but leaves the database intact.
- DacStore.Uninstall -- Removes the database from DAC control, and either deletes or detaches the database, based on a parameter value.
The DAC registration process updates the dbo.sysdac_history_internal and dbo.sysdac_instances_internal tables in the msdb database to record the process. Because the SQL Azure Database platform doesn't contain an msdb database, these tables exist in the master database in SQL Azure. You can use these tables to keep a version history. If the SQL Server instance is being managed with a UCP, registration also enables database monitoring by UCP.
The way that DAC works to update an existing database changed between DAC 1.0 (released with SQL Server 2008 R2 and VS 2010) and DAC 1.1 (released with SQL Server 2008 R2 SP1 and VS 2010 SP1). In DAC 1.0, a copy is made of the original database, a new database is created with the new database schema, and the relevant data is copied from the old database to the new database. Then the databases are renamed to reflect that the updated database is the current version and the original database is an old version. In DAC 1.1, databases are updated in-place. Based on early customer feedback, it's suggested that you should upgrade to DAC 1.1 if you're using DACs.
Both DAC 1.0 and DAC 1.1 support a set of the most commonly used database objects. (It's the same set because no additional objects were added in DAC 1.1.) If your database contains objects outside of the supported set, you might not be able to export them to a DAC database. Some objects such as SQL Server CLR (SQLCLR) assemblies and CLR-based database objects (e.g., SQLCLR procedures) cause errors on export or import. Other objects such as users and roles simply aren't exported or imported. In addition to database objects, logins can be imported into or exported from a DAC database.
You can write pre- and post-deployment scripts to migrate objects that aren't supported and don't produce errors. You can also insert data into newly created database tables using a post-deployment script.
In addition to the database object metadata and scripts, you can include a server selection policy in your DACPAC. You use this optional policy to limit the SQL Server instances on which the DACPAC can be deployed (e.g., only on servers running SQL Server 2008 R2 or later, only on servers running SQL Server 2008 R2 Enterprise Edition). If an attempt is made to deploy the DACPAC on a different version or edition, it will fail because the instance doesn't match the server selection policy. A DBA can also choose to override the server selection policy.
The DACPAC should be kept under version control to guard against ad-hoc changes being made to the database outside of DAC controlled versioning. SSMS will check to see if any objects have been added or changed outside of DAC management at deployment time. Note that VS 2010's Schema Compare feature can use a DACPAC as a source "database," a target "database," or both.
Now that you've gotten a feel for how DAC works, let's walk through a simple example of how to use DAC to deploy an on-premises database on a SQL Server 2008 R2 instance to a SQL Azure Database platform. The on-premises database is a copy of the pubs database, which ships with SQL Server. The following instructions are based on using DAC 1.1, SSMS in SQL Server 2008 R2 SP1, and VS 2010 SP1.
First, you need to register the pubs database as a DAC database using SSMS. Registering the database will validate that the database's objects are supported by DAC and assign an application name and version to it. As Figure 1 shows, all of the pubs database objects are supported. If the instance is under UCP control, the pubs database is registered with a UCP. When the registration is completed, a report is produced. If the validation fails, you can get a detailed report.
Next, you need to extract the DACPAC. In SSMS Object Explorer, select Databases. Right-click the pubs database, choose Tasks, and select Extract Data-tier Application to bring up the Extract Data-tier Application Wizard. Click Next on all the pages of the wizard to accept the defaults and produce a DACPAC file.
If you put the DACPAC under version control, you can use it as the "one version of truth" as far as the database schema is concerned. Other DBAs can then extract a new DACPAC after making changes to the database schema, or developers can use the DACPAC directly when developing a new version of the application in VS 2010. So, in VS 2010, follow these steps:
1. Create a new SQL Server Data-tier Application project.
2. Right-click the project in Solution Explorer, and choose Import Data-tier Application to import your DACPAC file. This also populates your project with the database objects contained in the DACPAC.
3. Make some schema changes.
4. Choose Build Solution from the Build menu. VS 2010 will save your changes and build the project. This produces an updated DACPAC. Its location will be shown in the Output window.
After you've made some schema changes, it's time to deploy the updated DACPAC to your SQL Azure Database platform. You can perform this in VS 2010 or in SSMS. Let's go the SSMS route:
1. Connect SSMS Object Explorer to your SQL Azure database, right-click the SQL Azure database instance icon, and choose Deploy Data-tier Application.
2. On the Introduction page, click Next.
3. On the Select Package page, browse to and select the updated DACPAC.
4. On the Update Configuration page, notice that the SQL Azure database is given the same name as the original database by default. If desired, change it. Click Next.
5. On the Summary Page, click Next to attempt to deploy the DACPAC.
This deployment attempt will fail because the stored procedures in the database use the COMPUTE BY clause, which isn't supported in SQL Azure. You can fix this problem by deleting the stored procedures in question or by updating them to use WITH ROLLUP instead of COMPUTE BY. Either way, you would need to coordinate this change with your application programming team in a real deployment. However, since this a practice deployment, simply delete the stored procedures in the VS 2010 project. After rebuilding the project, try to deploy the DACPAC to the SQL Azure Database platform again using SSMS. This time it will succeed.
After the deployment, you need to retrofit the on-premises pubs database with the changes you made to the DACPAC. Follow these steps:
1. In SSMS Object Explorer, select Management/Data-Tier Applications, right-click the pubs database, and choose Upgrade Data-tier Application.
2. On the Introduction page of the Upgrade Data-tier Application Wizard, click Next.
3. On the Select Package page, browse to and select the DACPAC you just deployed. Even though this is a first-time deployment on SQL Azure and an update to the on-premises pubs database, you can use the same DACPAC because it includes not only a change script but also a complete copy of the schema. Click Next.
4. On the Detect Changes page, the wizard will compare the on-premises pubs database with the DACPAC, then report on the detected changes. Click Save Report if you want to save a copy of the report, then click Next.
5. On the Options page, you'll find options to roll back the schema changes on failure and options to run pre- and post-deployment scripts, as Figure 2 shows. For this simple example, you can leave these check boxes unselected. Click Next.
6. On the Review Upgrade Plan page, you can review the updates that will be made to the pubs database, as shown in Figure 3. You can save a copy of the updates by clicking Save Action Report. You can also generate a copy of the change script for review before performing the updates by clicking Save Script.
7. On the Summary page, review the information to make sure it's correct. If everything is okay, click Next to start the upgrade. The changes are then applied, in-place, to the on-premises pubs database. It's strongly suggested that the database be quiesced before performing the database updates, but that's a best practice with or without DAC.
Note that although the DACPAC was successfully deployed, you'll encounter problems later on. SQL Azure databases require a clustered index on every table, and two of the tables in the pubs database don't have a clustered index. You'll find this out when you attempt to insert rows, in which case you'll need to change the database schema again. If you would've run a post-deployment script to validate clustered indexes on all tables, you would've caught the problem at that point.
What's Coming Up
DAC 2.0 is supported in the current Community Technology Preview (CTP) of SQL Server Data Tools (formerly code-named Juneau) and SQL Server 2012 (formerly code-named Denali), as well as in SQL Azure Database. This functionality is also available using the Import/Export section of the Database tab available on the Windows Azure portal.
A Promising Future
As you've seen, DAC is a container and a set of libraries for database deployment and management. You can use it to simplify the management and upgrade of databases, as well as to formalize the database development process. DAC has improved by leaps and bounds since its initial release, and more improvements are just around the corner. It's not quite at the "drag and drop a database" stage yet, but that day might not be too far in the future.