Up until this point, I’ve covered how to manipulate data stored in physical tables. Now I’ll introduce you to querying data represented by virtual tables. These virtual tables are SQL constructs known as views.
Unlike conventional physical tables, views don’t contain data. A view is merely a stored SQL statement that, when executed, generates a result set. When you query a view, you’re really querying the result set of a previously defined query. By the end of this lesson, you’ll have a firm grasp of views and how to create them. You’ll also understand the advantages and disadvantages of using them.
The PrerequisitesTo run the examples in this lesson, make sure your MyDB database contains the following tables:
- The Employee table created in Lesson 3
- The Movie and Genre tables created in Lesson 5
- The revised MovieReview table created in Lesson 6 (and not the original MovieReview table created in Lesson 4)
- The MovieReviewSummary table created in Lesson 6
If you haven’t created these tables, you’ll find the code in the 99765.zip file. To download this file, click the "Download the Code" link at the top of this page.
How to Create Views
To create a view, you first need to define a SELECT statement. The SELECT statement doesn’t have be a simple query of one table. It can be as complex as you need it to be. For example, you can join multiple tables or use functions to create computed columns. You can even include other views in the SELECT statement.
However, there are a few items you can’t include in a SELECT statement used to define a view. As noted in the SQL Server 2005 Books Online (BOL) documentation, you can’t include
- COMPUTE or COMPUTE BY clauses
- ORDER BY clauses unless there’s also a TOP clause
- INTO clauses
- OPTION clauses
- References to temporary tables or table variables
After you have a SELECT statement defined, you need to use the CREATE VIEW command. The basic syntax is
( ColumnName1, ColumnName2, ColumnName3\[,...n\] ) AS --Your SELECT statement
In the DatabaseName.SchemaName.View-Name segment, DatabaseName is the name of the database in which the new view will reside and Schema-
Name is the schema to which the table will belong. (If you’re unfamiliar with schemas, see Lesson 6.) The DatabaseName and SchemaName arguments are optional. If you don’t include the DatabaseName argument, the view will be created in the current database. If you don’t include the SchemaName argument, the table will be owned by the dbo schema, which is the default schema of the database owner (DBO). The examples that follow assume the current database is MyDB and you’re the DBO, so they don't include the optional Database-Name and Schema-Name arguments.
You use the View-Name argument to specify the name of the view you want to create. View names can be as long as 128 characters and must follow the rules of identifiers. The ViewName argument is required.
Inside the parentheses is where you can list the names of the view’s columns. Alternatively, you can specify the names within the SELECT statement, as the following examples show. Like column names in tables, column names in views are limited to 128 characters, must be unique, and must follow the identifier rules.
Let’s create a view with the SELECT statement that was used to create the MovieReviewSummary table in Lesson 6. To begin, execute Listing 1’s code in the query window in SQL Server 2005’s SQL Server Management Studio (SSMS) or SQL Server 2000’s Query Analyzer. In Listing 1, notice that I broke the code into two sections called batches. A batch is a series of one or more related T-SQL statements that are submitted to SQL Server for execution. SQL Server executes the commands in each batch sequentially from top to bottom. Each batch is separated by the GO keyword. In this case, the first batch causes the vMovieReview-
Summary view to be dropped if it already exists in the current database. The second batch contains the command that creates the view.
Note that there are multiple ways to determine whether a view or any other SQL object exists in a database. I prefer using the OBJECT_ID function because it’s short and to the point. The OBJECT_ID function returns the object’s ID value from the sysobject table if the object exists. If the object doesn’t exist, OBJECT_ID returns a null value. When the value isn’t null, SQL Server uses the DROP VIEW command to delete the existing view.
After you run the code in Listing 1, you can examine the view by running the code
ORDER BY Stars DESC
Figure 1 shows the results.
Using views offers several advantages. A key benefit is that views let you use only the data you’re interested in. In addition, views can help reduce code redundancy and complexity. (Views can also restrict access to subsets of data in a table and isolate applications from base-table structural changes, but these advanced uses for views go beyond the scope of this lesson.)
To demonstrate the benefits that views can provide, imagine the following scenario: You work at a ficticious company in which employees review movies. Your boss has asked you to produce two reports. The first report needs to show the least-popular movies reviewed so far, those movies’ genres and ratings, and the names of the employees who reviewed the movies. The second report needs to show the top three movie reviewers and how many movies they reviewed.
When looking at the Movie-Review, Movie, Genre, and Employee tables, you become a bit overwhelmed because of all the extra data available. You’re interested only in the employee names and the movies’ names, genres, and ratings, yet the tables include hire dates, salary information, and other data. Furthermore, both reports need to access the same set of tables, so you don’t want to write the same code twice.
Thankfully, as Listing 2 shows, you can create a view that contains only the information you need, all in one spot. The vMovieReview view contains three inner joins; the MovieReview table is joined to the Movie, Genre, and Employee tables. The view contains only those columns needed, so the view doesn’t contain any extraneous information.
With the vMovieReview view, writing the reports is a piece of cake. To produce the first report
that shows the least-popular movies, you write the query
WHERE Rating = 1
ORDER BY Movie,
To produce the second report that shows the three top movie reviewers, you write the query
COUNT(*) AS 'Reviews'
Group BY EmployeeName
ORDER BY 2 DESC
Executing these two queries produces the reports in Figure 2 and Figure 3, respectively.
Now let’s compare the two queries you just ran against the vMovie-
Review view with the queries that would need to run against the physical tables. To produce the first report using the physical tables, the query would be
mr.Stars AS 'Rating',
LEFT(e.FirstName + ‘ ‘ +
FROM MovieReview mr
INNER JOIN Movie m
ON m.MovieID = mr.MovieID
INNER JOIN Genre g
ON g.GenreID = m.GenreID
INNER JOIN Employee e
ON e.EmployeeID = mr.EmployeeID
WHERE Stars = 1
ORDER BY Movie,
To produce the second report using the physical tables, the query would be
LEFT(e.FirstName + ' ' +
FROM MovieReview mr
INNER JOIN Employee e
GROUP BY LEFT(e.FirstName +
' ' + e.LastName,18)
ORDER BY 2 DESC
The two queries that ran against the view are clearly shorter and less complicated than the queries that you'd need to run against the physical tables. With less code to write, the chance of making errors such as typos is reduced.
As stated previously, a view is a virtual table. A view’s virtual nature is its greatest strength and its greatest weakness. Because the contents of a view are determined at runtime, every time you reference a view, the SELECT statement that was used to define the view must be executed because the view exists only in temporary storage. (Note that there is one exception: indexed views. See the sidebar “Indexed Views vs. Regular Views,” for more information.)
Consider the following scenario: Some developers at a company create a view that performs complex transformations on data derived from multiple tables. This view is the source for an online historical sales report. Hundreds of regional managers run this report at least once a day, complaining all the while about the sluggishness of the system.
Each time this report is executed, the view is executed. The view, in turn, reads the original data, transforms it, then stores the transformed data in temporary storage. So, the original data is being transformed hundreds of times each day even though that data hasn't changed. Although the developers saved themselves some coding time by using a view, they greatly increased the amount of work the server must perform daily.
In this case, the developers ignored the virtual nature of the view and treated it as if it were a physical table. It would have made more sense from a performance perspective to use the view to generate the report data once and store the results in a physical table. Subsequent report executions could then query the physical table, not the view.
You might find it hard to conceive that a view could actually hurt a server’s performance, but it happens all too often. For example, a few years ago I was troubleshooting a fairly standard performance problem. After an upgrade, users were complaining that a certain application was painfully slow. I tracked down the problem to a view used to populate the application’s main grid. This innocent-looking view didn’t merely join a few tables together—it joined multiple views together. Each view, in turn, joined to other views, and so on. When all was said and done, there were more than 100 tables being joined together. No wonder the grid was slow!
Just because SQL Server lets you join views to other views, it doesn’t mean that you should. Be careful with joins and always examine the contents of a view before incorporating it into your code. Test the view under various scenarios to make sure it performs as expected.
Use Views Wisely
Views are incredibly useful constructs that can either make or break a project. Used wisely, views can reduce code redundancy, reduce errors, and make your job easier. Used unwisely, views can result in performance problems.