SQL Server 2005 introduces enhancements and features that affect almost every aspect of the product, as well as completely new infrastructures and platforms. For the DBA in particular, there are many exciting features to learn about. When I started writing this article, I created a list of features that are either significantly enhanced or completely new, and before long I had dozens of items to cover. But if I cover all the items on my list, you won't be able to see the forest for the trees. So I decided that for DBAs, this article needs to present two main areas of information: which new features in SQL Server 2005 make it worthwhile for your organization to upgrade, and which features will immediately affect you in terms of managing and maintaining the new product. I don't cover enhancements in tools. You can learn more about that topic in Kalen Delaney and Ron Talmage's "SQL Server 2005 Management Tools" (August 2005).

Database Mirroring

Database mirroring is one of the new features in SQL Server 2005 that I find the most exciting for high availability. In simple terms, database mirroring lets you maintain a standby server that provides data redundancy. That is, SQL Server maintains a mirrored copy of the principal database on another server and can automatically fail over to the mirror if the principal server fails.

In SQL Server 2000 Enterprise and Developer editions, log shipping lets you maintain a standby server. Log shipping was widely adopted because it provides an affordable standby server solution that's fairly easy to set up and doesn't require specialized hardware. However, log shipping is a hybrid of several features that weren't originally designed for the sole purpose of maintaining a standby server. Essentially, log shipping backs up the transaction log in the source server, copies the backup files, and restores them to the target server. SQL Agent jobs maintain the process. Log shipping has several limitations:

  • It incurs latencies according to the backup/copy/load intervals.
  • You can't set log shipping up to work in a synchronous mode where a transaction isn't committed in the source unless it was received by the target.
  • It doesn't support auto failover.
  • There are several possible points of failure because log shipping isn't a native solution, and upon failure it might not be easy to restore the log shipping process.

Database mirroring addresses these limitations by introducing a native solution for maintaining a standby server.

Three servers play major roles in database mirroring. The principal holds the primary database. The mirror holds the copy. The witness (which is an optional server) allows support for auto failover. A new client feature, auto-client redirect, appears to support the client functionality in a failover event. No specialized hardware is necessary, and setting up a mirroring session is amazingly simple. Database mirroring transfers transaction log records over a mirroring session rather than copies backup files. The mirror copy of the database is in a non-recovered mode, meaning it's not accessible for read purposes. Figure 1 illustrates the database mirroring server configuration. An option exists to make a snapshot of the mirror available for read purposes. I discuss database snapshots next.

You can set up a mirroring session to work in synchronous or asynchronous mode. In synchronous mode, changes to the principal database are committed only after the mirror receives them. In this mode, latencies are based on the network speed and number of changes. Asynchronous mode is the faster mode because changes sent to the principal can be committed immediately. However, in asynchronous mode there's no guarantee that the mirror received all changes that were committed in the principal. (For more information about database mirroring, refer to Ron Talmage's Microsoft article, "Database Mirroring in SQL Server 2005".)

Author's Note: After evaluating the early beta test coverage, Microsoft learned that the database mirroring feature has been tested and deployed by fewer customers than is necessary to build full confidence in such a crucial database-availability feature. Therefore, Microsoft decided that database mirroring needs more time in the hands of customers before it's made generally available for production use. The feature is complete, has passed extensive internal testing, and Microsoft is committed to making it generally available in the first half of 2006. The SQL Server 2005 release to manufacturing (RTM) will have database mirroring turned off by default, but the feature will be available for evaluation under a trace flag. Microsoft will continue to field-test the feature with customers and will release it for general use as soon as it is ready.

Database Snapshots

Database snapshots is a new feature in SQL Server 2005 that lets you provide new functionality and higher availability for your databases. A database snapshot is an object that looks like a read-only copy of a database at a particular moment. With a database snapshot, you create a new database that's a snapshot of an existing database, providing the snapshot a name and a corresponding data file for each existing data file in the source database. Because snapshots are read-only and don't accept changes, no transaction log files are needed. A database snapshot is extremely space-efficient. The data files in the snapshot are initially created completely empty. SQL Server uses a trick to maintain the snapshot as a point-in-time picture of the source database: Upon the first change that a page in the source database incurs after a snapshot is created, the page is copied to the corresponding snapshot file before the change is applied. Further changes to the same page don't result in copying because the snapshot maintains the state of the page before any changes were made to it.

When you query a snapshot, SQL Server reads from the snapshot files pages that were modified and reads from the source database pages that weren't modified. The largest a snapshot can become is the size of the source database's data files, and only if each and every page of the source database is changed after the snapshot is created.

You can create multiple snapshots of the same database at different points in time. For example, you might want to create a snapshot periodically (e.g., every day or every hour). In cases in which data is accidentally deleted, you can recover the data from the most recent snapshot. You can also create a snapshot before applying a very intensive process that has the potential to damage the data. For example, the process might be too big to maintain in a single transaction. If the process fails, you can revert the database back to its snapshot's state as long as only one snapshot of the database was created. You can also use snapshots for reporting purposes for applications that logically need to access all data at a certain point in time.

Finally, you can create a database snapshot on the mirrored database, and in this way make the mirror available for reading purposes. Doing so lets you reduce the query load from the principal database.

Although the benefits to working with database snapshots are obvious, keep in mind that whenever a page is modified for the first time after a snapshot is created, its image before the change is copied to the snapshot. The copying process has a cost and will slow down some of the modifications against the source database.

Row Versioning and Snapshot Isolation

SQL Server 2000 could maintain only one version of a row—the most recent one. SQL Server 2005 introduces a new technology that lets it maintain versions of a row at different points in time in linked lists stored in tempdb. This new technology serves several purposes: It supports the new Snapshot and Read Committed with Snapshot isolation levels (known together as Snapshot-based Isolation Levels), building the inserted and deleted tables in triggers and online index operations and supporting Multiple Active Result Sets (MARS).

The new isolation levels let SQL Server work in a mode in which shared locks aren't acquired when you read data; therefore, readers aren't blocked by writers. SQL Server returns to the readers a consistent (committed) version of the row by using row versioning. In snapshot isolation, SQL Server returns to the reader the most recent consistent version of a row according to the time the reader's transaction started (technically, this is the time at which the first statement of the transaction was submitted). Multiple reads within the same transaction are guaranteed to return the same version of the row. Snapshot isolation also detects update conflicts. If another transaction changed the data between the time you accessed a row and the time you modified it in the same transaction, upon modification your transaction will fail. The failure signals that someone else changed the row and your calculations may not be valid anymore. At this point, you can retry the transaction.

SQL Server 2005 introduces another isolation level based on row versioning that's actually an enhancement to an existing isolation level. This new isolation level is called Read Committed with Snapshot. In this isolation level, your transaction doesn't acquire shared locks when reading and gets the most recent committed version of a row according to when the current SELECT statement started (rather than when the transaction started). In this mode, multiple different reads within the same transaction can get different versions of the row (all committed). Your reads don't need to wait for modifying transactions to commit.

These new isolation levels benefit applications that principally read data because the applications won't need to wait for modifying transactions to commit. Also, you now have a simple way to enforce optimistic locking with conflict detection. But remember that the new isolation levels aren't adequate for all environments because of the cost to maintaining and traversing the linked lists with the row versions in tempdb. For example, the isolation levels won't be adequate in cases where modifications are frequent and many update conflicts exist.

Row versioning is also used in SQL Server 2005 to build the inserted and deleted tables in triggers. In SQL Server 2000, inserted and deleted are actually views on top of the section in the transaction log that contains the log records of the change that fired the trigger. By querying inserted and deleted, you gain access to the old and new images of the affected rows. Whenever you access inserted or deleted you're in effect scanning a portion of the transaction log. Even when not working with triggers, the transaction log can potentially be a bottleneck, mainly in online transaction processing (OLTP) systems. SQL Server must first write a change to the transaction log before it can apply the change to the data portion of the database. The architecture of the transaction log is such that SQL Server can only write to it sequentially. Therefore, any interference with transaction log activity (e.g., querying deleted and inserted in triggers, transaction log replication) ultimately postpones flushing changes to the data. You need to put a lot of attention and focus on the transaction log when implementing and tuning your database.

SQL Server 2005 uses row versioning to keep track of versions of rows that are affected both by the change that fired the trigger and by changes submitted from the trigger itself. SQL Server builds inserted and deleted from the linked lists with the row versions it maintains in tempdb. Unlike the transaction log, tempdb can be written to and read from in parallel if tempdb is striped on multiple disk drives. It's comforting to know that the performance tension on the transaction log is reduced, but you should also realize that you need to put more focus on tuning tempdb now.

Another very important new feature that uses row versioning technology is online index operations. All index operations in SQL Server 2000 (including create, rebuild, and drop) occur offline. It's important to rebuild indexes periodically to combat index fragmentation, which can dramatically slow ordered index range scans (i.e., range queries that use an index). When you rebuild a clustered index, SQL Server acquires an exclusive lock on the table so that the table isn't available for reading or writing. When you rebuild a non-clustered index, SQL Server acquires a shared lock on the table so that the table isn't available for writing, and the index itself can't be used during the operation. SQL Server 2000 introduced an online index defragmentation utility (DBCC INDEXDEFRAG) to address some of the problems with index rebuilds. However, this utility sometimes takes longer than an index rebuild, consumes more log resources, and the end result is not as optimal as that of a full index rebuild.

SQL Server 2005 introduces online index operations (i.e., create, rebuild, drop). I find the ability to rebuild an index online important because doing so is necessary for regular maintenance. SQL Server uses row versioning to maintain changes that take place during the rebuild activity. Logically, imagine that another copy of the index is constructed, allowing you to access the source data, then when the operation finishes, the copy substitutes the source. Keep in mind that you'll need enough database space for the operation. This enhancement is especially crucial for 24/7 shops that can't allow downtime.

Native Partitioning

Partitioning isn't a new concept in SQL Server 2005. Scaling and maintaining huge tables can be a nightmare for DBAs, requiring rebuilding indexes to deal with fragmentation, purging and archiving historic data, and other time-consuming and tedious tasks. Dataloading becomes slower as a table grows larger for some insert patterns to the different indexes. DBAs use partitioning to physically split a single huge table into multiple, smaller, more manageable units, resulting in an environment that is much more scalable.

Partitioning in SQL Server 2000 is a hybrid of several elements that weren't originally designed for partitioning: namely, tables, check constraints, views, and some added functionality. In SQL Server 2000 partitioning is achieved by:

  • Creating multiple smaller tables instead of a single large one
  • Limiting the range supported by each table with a check constraint
  • Creating a view that unifies all pieces; making the view look and feel like the original single table to the eyes of the user and application

The fact that partitioning isn't native in SQL Server 2000 has several implications, both in terms of managing partitioned data and performance. DBAs need to maintain multiple physical objects instead of one. Each partition is a separate table, and there might be variations among tables in terms of schema and indexing. The optimizer needs to separately consider each relevant table for a query when generating an execution plan, resulting in long optimization time and large plans. Also, there are many requirements and limitations to support the updatability of the data through the partitioned view, which Figure 2, illustrates.

SQL Server 2005 introduces native partitioning of tables and indexes, addressing many of the limitations of partitioning in SQL Server 2000. You create a single object (table or index) that is internally partitioned, as Figure 3 shows. You need only work with a single entity, resulting in significantly reduced management and maintenance overhead. Because partitioning is native in SQL Server 2005, you use a new vocabulary with native commands for activities such as adding, removing, and switching partitions. A significantly reduced set of requirements for partitioning fully supports both retrieval and modification of data. In addition, all of a table's partitions must be identical in structure (e.g., schema, indexes); therefore, the optimizer needs to spend much less time to optimize queries, and execution plans are typically smaller. There's much more to say about partitioning in SQL Server 2005.

Monitoring with DMVs and DMFs

SQL Server 2005 introduces 70 new Dynamic Management Views (DMVs) and Dynamic Management Functions (DMFs). The new DMVs and DMFs provide a lot of useful information about the current state of SQL Server and its health, allowing you to diagnose problems and tune the performance of your instance and databases.

Diagnostic information is now easily accessible in a fully documented, convenient table format. Some of this information was not exposed at all in SQL Server 2000, some was documented but not available in a convenient table format (e.g., fragmentation information through DBCC SHOWCONTIG), and some was available only through undocumented tools (e.g., wait information through DBCC SQLPERF(WAITSTATS)).

The new DMVs and DMFs cover two types of scope: database scope and server scope. The new dynamic management objects cover the following categories of diagnostics: Common Language Runtime, Database Mirroring, Databases, Execution, Full-Text Search, Indexing, I/O, Query Notifications, Replication, Service Broker, SQL Operating System, and Transactions. You can find a lot of information about these new objects in SQL Server 2005 Books Online under the subject Dynamic Management Objects. Researching these new objects is time well spent because you can now find much useful information about them.

Security

SQL Server 2005 introduces several important security-related enhancements. All permissions in SQL Server 2005 are now grantable with the GRANT statement. In SQL Server 2000, some permissions were directly grantable and others were available only through other means (e.g., by being a member of a role). The management of security in SQL Server 2005 is hierarchical and requires you to get used to new vocabulary. For example, securables are entities that you can secure with permissions (e.g., a login or database at the server scope, an assembly or a service at the database scope, or lower-level entities such as a type at the schema scope or a table at the object scope). Actions on securables are granted to principals (e.g., Windows groups/logins, SQL Server roles/logins, database roles/users).

SQL Server 2005 also has an accurate notion of a schema (as defined by the ANSI committee) and completely separates database users from schemas. In SQL Server 2000, the schema to which an object belongs and the database user who creates the object are the same and can't be separated. If user1 creates Table1, the table becomes user1.Table1. If you want to drop the database user user1 because the user leaves the company, you first must change the owner of all objects that user1 owns to a different database user. In SQL Server 2005, you create objects in schemas, and a schema doesn't represent a database user—it's only a schema. You can grant, deny, and revoke permissions from principals on a schema, which in SQL Server 2005 is a securable. You don't have any problems when you need to drop database users because they don't "own" objects. You also have much more flexibility in regulating security against collections of objects (i.e., schemas).

Another enhancement that DBAs have been waiting for is the ability to use built-in tools to encrypt data within the database. SQL Server 2005 introduces a new tool that lets you control the security context of a command or a routine by using the EXECUTE AS statement. EXECUTE AS replaces the older SETUSER statement, providing more flexible means for security context switching. For more information on security enhancements in SQL Server 2005, refer to BOL and to the Microsoft SQL Server 2005 article "Introduction to SQL Server 2005 Relational Engine Security Features.

T-SQL Enhancements

If you're in charge of writing, reviewing, and maintaining T-SQL code, you'll find a wealth of new features and enhancements in SQL Server 2005. Let me highlight a couple of Data Definition Language (DDL) and Data Manipulation Language (DML) enhancements that I find particularly useful and interesting.

SQL Server 2005 introduces DDL Triggers, which let you reject or react to DDL events. A trigger can react to server-level events (e.g., ALTER LOGIN) or database-level events (e.g., CREATE TABLE). The potential in this new feature is enormous—you can use it to enforce company policies with object naming conventions, auditing, schema version changes, and more.

SQL Server 2005 also introduces the new XML datatype. This is a native XML datatype that lets you store XML data within your database and also use it for local variables and input/output arguments of routines. You can index the new XML datatype with specialized XML indexes, you can constrain it with schemas, and you can query and modify it by using the XQuery language.

SQL Server 2005 enhances the variable length datatypes (VARCHAR, NVARCHAR, and VARBINARY) by introducing the MAX specifier instead of an actual size. With the MAX specifier, you let SQL Server determine when to internally manage the value as a regular value or as a large object. The nice thing about working with the enhanced datatypes is that the programming model for regular types and large objects is unified. You can use them as local variables and as input/output arguments for routines, and you can also use most of the functions that apply to regular datatypes with the enhanced datatypes.

The two querying-related enhancements that I find most exciting are Analytical Ranking Functions and Recursive Queries. The former lets you provide row numbers and other ranking values to rows of a result set of a query. I've been able use those functions to tune many existing solutions. The latter lets you to write recursive queries, which are very handy in manipulating hierarchical data.

What's New in SQL Server 2005 for the DBA?

By now, you probably realize that it's easier to answer the question "What's not new or changed in SQL Server 2005?" Almost every area of the new product has changed. I hope that this overview gave you a taste of some of the exciting features that will affect your life as a DBA for the better and allow your organization to get more from SQL Server.