There's no doubt that storage is one of the key components to SQL Server performance and availability for both large and small SQL Server instances. With today's increases in server and virtual server processing power and large-scale memory support, storage and I/O can easily become the bottleneck for overall system throughput. You can avoid this bottleneck if you have a basic understanding how SQL Server uses storage and know some essential SQL Server storage best practices.
The Basics: Data and Log Files
The most basic concept to understand concerning how SQL Server uses storage is that databases are composed of two types of files:
- Data files. These files store the database data. SQL Server data files are NTFS files that have a file extension of .mdf. A basic database consists of a single data file, but a database can consist of multiple data files that reside on one or more drives.
- Log files. These files store the database transactions that allow a database to be restored back to a specific point in time. SQL Server transaction log files are NTFS files that have a file extension of .ldf. A database can have multiple log files that reside on one or more drives.
If you use SQL Server Management Studio (SSMS) to create a new database, the data and log files are stored on the same drive by default. Unless you specify otherwise, the data and log files are created in the same directory as the SQL Server system databases, which is <Drive>:\Program Files\Microsoft SQL Server\MSSQL<version>.MSSQLSERVER\MSSQL\DATA. For example, for a SQL Server 2014 instance installed on the C drive, the data and log files are in the C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA directory by default.
Related: SQL Server Log Files Update
As a best practice, you should place the data and log files on different drives. Because SQL Server writes all the database transactions into the transaction log, the log files benefit from being on drives with high write performance. Data files are used to support queries and often need to support numerous read operations. When creating databases, you can specify the locations of your data and log files using the T-SQL CREATE DATABASE command. To change the location of existing data and log files, you can run the ALTER DATABASE command with the MODIFY FILE option. Listing 1 shows an example of how to move a database's data file to a different location.
USE master SELECT name, physical_name FROM sys.master_files WHERE database_id = DB_ID('AdventureWorks2012'); ALTER DATABASE AdventureWorks2012 SET offline GO ALTER DATABASE AdventureWorks2012 MODIFY FILE (NAME = AdventureWorks2012_Data, FILENAME = "E:\Data\AdventureWorks2012_Data.mdf") ALTER DATABASE AdventureWorks2012 MODIFY FILE (NAME = AdventureWorks2012_Log, FILENAME = "F:\Logs\AdventureWorks2012_log.ldf") ALTER DATABASE AdventureWorks2012 SET online GO
Although not everyone agrees with this recommendation, another best practice for SQL Server databases is to enable AutoGrow. When you enable this property for a database, the database's data and log files will automatically grow if more space is required. This setting prevents the system from stopping if no space is available.
With that said, AutoGrow should be considered a last ditch safety mechanism. It shouldn't be used as your primary method to manage database growth. You should manually manage the growth of all data and log files. Databases activity halts while the database experiences AutoGrow events. Frequent AutoGrow events can lead to disk fragmentation and reduced performance. Nevertheless, AutoGrow is a good safety measure for unexpected data growth. The following commands provide an example of how to set the AutoGrow property for a database's data and log files:
ALTER DATABASE AdventureWorks2012 MODIFY FILE (NAME = 'AdventureWorks2012_data', FILEGROWTH = 1024MB) GO ALTER DATABASE AdventureWorks2012 MODIFY FILE (NAME = 'AdventureWorks2012_log', FILEGROWTH = 256MB) GO
It's almost always a bad practice to enable the AutoShrink property for a database. Like AutoGrow events, AutoShrink events cause all database activity to halt. Plus, you can't control when AutoShrink runs. Using AutoShrink can result in a spiral of AutoGrow then AutoShrink events, resulting in database performance hits and excessive file fragmentation. You can turn off AutoShrink using the command:
ALTER DATABASE AdventureWorks2012 SET AUTO_SHRINK OFF
Another storage best practice is to enable Instant File Initiation. Unlike most of the other configuration settings discussed here, Instant File Initialization is controlled with a Windows Server policy. Instant File Initialization avoids zeroing out allocated space for a file. It simply allocates the required space. SQL Server uses Instant File Initialization during database creation, AutoGrow, and database restore operations. You can enable Instant File Initialization on the Windows server by using the Administrative menu to open Local Security Policy. Then expand Local Policies and double-click Performance volume maintenance tasks, as shown in Figure 1.
This will open the Performance volume maintenance tasks Properties dialog box in which you can enter the name of the SQL Server Service account.
Storage and RAID Levels
After you have a basic understanding of the way SQL Server uses storage, one of the next most basic storage concepts to understand is the RAID levels that you can use for the disks in a storage subsystem. The RAID levels have a big impact on both performance and availability. As you might expect, the more expensive options typically offer the best performance and availability. The most common RAID levels that you'll encounter are:
- RAID 0 (sometimes called disk striping). This RAID level spreads all the data across all the available disks. You often see this RAID level used in different database benchmarks. RAID 0 provides good performance, but you should never use it on a production server because one disk failure will result in data loss.
- RAID 1 (sometimes called disk mirroring). With RAID 1, data is mirrored on the disks. Read and write performance is good, but the overall disk capacity is cut in half. RAID 1 is often used for SQL Server log files. It can sustain one disk failure.
- RAID 5 (sometimes called disk striping with parity). RAID 5 stripes data across multiple disks and uses a disk for data redundancy. It's often used for data files. This RAID level provides good read performance and can sustain the failure of one disk. However, it's known for slower write performance.
- RAID 10 (sometimes called disk mirroring with striping). RAID 10 combines the performance of striping with the protection of mirroring. RAID 10 provides the highest levels of performance and availability out of the different RAID levels. RAID 10 uses twice as many disks as RAID 5, but it can sustain multiple disk failures. A RAID 10 array can sustain failures for as many as half of the disks in the set. RAID 10 is good for both data and log files.
Another important SQL Server storage component to understand is tempdb. Tempdb is a SQL Server system database that's a global resource available to all users. SQL Server uses tempdb to complete internal database operations. Tempdb is used for temporary user objects and internal database engine operations, including joins, aggregations, cursors, sorting, hashing, and row versioning. Unlike the data in a typical user database, the data in tempdb isn't persisted after the SQL Server instance is shut down.
Tempdb is typically one of the most active databases on a production SQL Server instance, so following a few best practices for tempdb can help ensure that your SQL Server database performs well. First, the tempdb data and log files should be placed on different physical drives than your production database data and log files. Because tempdb is so active, it's also a good idea to make sure the drives are protected with RAID 1 or striped with RAID 10. The Microsoft SQL Server Customer Advisory Team (SQLCAT) has come out with a recommendation that tempdb should have one data file for each CPU core. However, this recommendation is best suited for very heavy workloads. It's more commonly recommended that tempdb have a 1:2 or 1:4 ratio of data files to CPU cores. As with most performance recommendations, this is a general guideline; the requirements for your system will vary. If you're unsure of how many data files to use for tempdb, a common recommendation is to start with four data files. Typically, one log file is enough for tempdb. (For more in-depth tempdb recommendations, see the resources listed in the Learning Path.)
Next, tempdb should be adequately sized to avoid AutoGrow events. Like user databases, tempdb will experience delays during AutoGrow events. By default, tempdb is sized with an 8MB data file, a 1MB log file, and a 10 percent AutoGrow capacity, which is too small for most production workloads. It's also important to know that restarting SQL Server results in tempdb being reset to its last configured size.
The tempdb data and log files can be sized and moved using code like you saw in the section titled "The Basics: Data and Log Files." The query in Listing 2, which is from the MSDN site, shows how to find the size and growth percentage of the tempdb data and log files.
SELECT name AS FileName, size*1.0/128 AS FileSizeinMB, CASE max_size WHEN 0 THEN 'Autogrowth is off.' WHEN -1 THEN 'Autogrowth is on.' ELSE 'Log file will grow to a maximum size of 2 TB.' END, growth AS 'GrowthValue', 'GrowthIncrement' = CASE WHEN growth = 0 THEN 'Size is fixed and will not grow.' WHEN growth > 0 AND is_percent_growth = 0 THEN 'Growth value is in 8-KB pages.' ELSE 'Growth value is a percentage.' END FROM tempdb.sys.database_files; GO
Solid State Disks
Multiple cores have increased processing power, and many of today's tier 1 servers support very large amounts of RAM, which has made the I/O subsystem the bottleneck for many workloads. Traditional hard disk drives (HDDs) have gotten larger, but they really haven't gotten any faster. Solid state disks (SSDs) can be a solution to this problem. SSDs are a relatively new storage technology that has begun to gain critical mass in the SQL Server market the past year. Previously, SSDs were too expensive and the capacity was too low to use them for many production database scenarios. One reason why SSDs have really begun to take off is that they offer raw performance advantages over traditional rotational HDDs. For example, a 15,000rpm Serial Attached SCSI (SAS) drive can provide about 200MB/sec of throughput. In contrast, an SSD Serial ATA (SATA) drive using a 6GB connection can provide about 550MB/sec of sequential throughput. Why are SSDs faster than traditional HDDs? The main reason is that the seek time is drastically reduced. When a spinning HDD needs to retrieve random data, the drive head must move to the new data location. SSDs have no moving parts, so moving to new data locations happens with the speed of memory.
SSDs and high-performance flash storage can be implemented in a number of different ways. One common implementation is the 2.5" SSD drive form factor. These drives plug into DAS slots, and the electronics interface is the same as a standard HDD. Another common form factor for SSDs is the PCI Express (PCIe) board, which plugs directly into the system bus. This device takes advantage of the speed of the PCIe bus to provide more I/O operations per second (IOPS) and more throughput than the standard disk interface allows. In addition, many SANs provide SSD drive tiers and automated data tiering to move hot workloads onto the high-performance SSD layer while keeping less demanding workloads on the slower and less expensive HDD tiers.
There are several different types of SSD storage. The common types include SSD storage based on DRAM and SSD storage based on flash memory technology, such as single-level cell (SLC) and multi-level cell (MLC). Each type has its pros and cons:
- DRAM. Much like standard computer RAM, DRAM is very fast but not durable. DRAM implementations need a battery backup to persist data during a power outage. This type of storage is often implemented as PCIe cards that are installed into the server's motherboard.
- SLC. SLC is faster and has a longer life cycle than MLC, so SLC is used in enterprise-grade SSD storage. However, SLC implementations are significantly more expensive than MLC implementations.
- MLC. MLC is usually found in consumer devices and has a lower price than SLC. However, MLC implementations have lower write performance and a significantly higher wear rate than SLC implementations.
Although SSDs provide better performance than rotational HDDs, SSDs have a significantly shorter life cycle. I/O-intensive applications such as SQL Server will shorten the life of an SSD drive. In addition, the greater the percentage of the drive that's used, the shorter the SSD's lifespan will be. One recommendation is to make sure that your SSDs have at least 20 percent of the drive unallocated. Read performance is consistent over the entire life cycle of the drive. However, write performance degrades as the drives are used, which means the length of time it takes to perform write actions will increase. It's also important to know that you don't need to defragment SSDs because they don't access data the same way as HDDs. In fact, defragmenting SSDs will only serve to shorten their life cycle.
If you want to use SSDs, don't use single SSD drive implementations and plan on replacing the SSDs during the life cycle of your server. Some of the possible SQL Server implementations for SSDs include:
- Moving indexes to SSD. Typically, indexes aren't very large and they have a lot of random read activities, making them ideal candidates to move onto SSD drives.
- Moving data files to SSD. Data files usually have more read operations than write operations, so in most cases, they're well suited for SSDs.
- Moving log files to SSD. Log files are write-intensive. Therefore, if you use SSDs for log files, use enterprise-grade SSDs and a RAID 1 or RAID 10 mirrored implementation.
- Moving tempdb to SSD. Tempdb tends to have very high levels of random writes, which can lead to SSD burnout. Therefore, if you use SSDs for tempdb, use enterprise-grade SSDs, use a RAID 1 or RAID 10 mirrored implementation, and plan for the SSDs' replacement cycle. In addition, you might look into PCIe DRAM implementation for tempdb. DRAM storage provides higher levels of write performance and doesn't experience SSD burnout. However, DRAM storage can be expensive.
Another important storage best practice is to create performance baselines and periodically recheck your system's performance against these baselines. This can really help you with troubleshooting problems that might arise, as well as tracking your system's database growth and other performance trends. Baselining is one of the best ways to proactively manage your systems. Although the topic of measuring SQL Server performance is well beyond the scope of this article, I'll briefly discuss some of the most important storage metrics that you should keep an eye on.
The first set of performance counters that you should monitor are the memory-related counters in Windows Performance Monitor. Although they're technically not storage counters, if you don't have enough memory, the rest of the counters don't really matter. Be sure to watch the Memory object's Available MBytes counter. This counter shows the amount of physical memory available for allocation to a process or for system use. If this counter shows lower than 100MB, your system would probably benefit from additional memory. Another important counter to watch is the Paging File object's % Usage counter, which shows the amount of the Windows page file that's in use. This value should be less than 70 percent. If this value is higher, the system probably needs more memory.
In addition to the Windows memory-related counters, there are a number of Windows Server storage performance counters that can help you monitor your storage performance. However, these counters are really only meaningful when your SQL Server instance is running on DAS. If you're using a SAN, you need to check the SAN's performance metrics.
If your SQL Server instance is using DAS, you should first check to see that there's at least 20 percent free disk space for each NTFS drive. Afterward, you can check the Windows Server storage counters using Performance Monitor. Table 1 lists a few of the most important counters to check, all of which are under the Logical Disk object.
Logical Disk: % Idle Time
This counter tracks how much time the disk is in an idle state, which means all the requests from the OS to the disk have been completed. If this value is less than 50 percent, the disk might be experiencing high I/O, which means you might want to move some files off this drive.
Logical Disk: Avg. Disk Sec/Read
These two counters display the average time the disk transfers took to complete. A value of more than 15ms might mean that you have a disk I/O performance problem.
Logical Disk: Disk Reads/Sec
You can use these two counters to calculate your system's overall read/write ratio.
SQL Server offers a large array of performance counters to help you monitor and manage your SQL Server instance. Some of the most important SQL Server storage counters that you should monitor are listed in Table 2. You can use Performance Monitor to monitor them.
SQLServer:SQL Statistics:Batch Requests/sec
This counter tells you how busy your system is. A value of 1000 or higher is a heavy workload.
SQLServer:Buffer Manager:Buffer Cache Hit Ratio
This counter tells you how often the system is able to find data in its buffer cache. This value should be high—over 90 percent. The higher the value, the better.
SQLServer:Buffer Manager:Free List Stalls/sec
This counter tracks when SQL Server can't get data from the buffer because the buffer is out of space. If this value is greater than 2, the system would probably benefit from additional memory.
SQLServer:Access Methods:Full Scans/sec
This counter shows how many full table scans are being performed. This value should be low—less than 1. If it's higher, you probably need to add indexes.
SQLServer:Access Methods:Page Splits/sec
This counter tracks when a new row is added to a full index page, which causes a page split. This value should be low—less than 20. If it's higher, consider increasing the fill factor on your indexes or rebuilding your indexes more frequently.
Store and Forward!
Storage is a critical component in SQL Server database performance. Knowing some basic best practices can help you optimize SQL Server's storage availability and performance. For more in-depth information about the different storage aspects covered here, check the references in the Learning Path.
For more in-depth information about the different storage aspects covered, see:
- Demystify Tempdb Performance & Management
- How to Optimally Use SSDs Without Burning Them Out
- Optimizing tempdb Performance
- Place Data and Log Files on Separate Drives
- A SQL Server DBA myth a day: (12/30) tempdb should always have one data file per processor core
- SQL Server PerfMon Counters of Interest