This month, I'll show you how to write simple SQL queries. You can run the queries that this article describes from the ISQL/W window or the Query window in the Enterprise Manager. Both windows have the same features. Select the Query tab at the top of the Query window before you type in SQL statements. You use the Query window to write select statements, or database queries, when you want to retrieve information from the database. You can also use the Query window for data modification statements, such as insert, update, and delete instructions. And you can use the Query window to administer the database, create tables and columns, and so on.
The Select Statement
You use an SQL query to retrieve the data you want from the database. A query has three parts: SELECT, FROM, and WHERE. SELECT tells SQL Server which columns you want to see as the result of the query. FROM specifies which table or tables you want to use in the query. WHERE limits which rows you want to see. A select statement contains three elements: select, from, and where. The syntax isn't case sensitive, but I'll use uppercase and lowercase to distinguish the commands from the object names. A typical query looks like this:
SELECT col1, col2, col3...... FROM table1 WHERE col1 = search condition
Consider this simple query:
USE pubs GO SELECT * FROM authors
The phrase "USE pubs" ensures that your statement queries the pubs database. You could instead select the database in the DB window on the Toolbar. GO ensures that SQL Server has completed the previous command before the next one starts. This query will return all columns and rows in the authors table. You use the asterisk to specify all the columns in the table. You use FROM authors to retrieve data from the authors table.
But one of the three elements—the WHERE clause—is missing from this statement. So if you use this statement, SQL Server will return all the rows. A query without a WHERE clause can return a large amount of data, increasing network traffic. And an update or delete statement without a WHERE clause to restrict the affected rows could be disastrous.
Because the authors table in the pubs sample database is small, you can run this query without worry. Click on the green arrow or press CTRL-E. Screen 1 shows the results. You need to scroll the window to see all the columns and all the rows.
If you want to see only certain columns, you can restrict the output by listing the names of the columns you want. You can run a stored procedure that SQL Server provides to determine the column names. From the Query window, type and run:
You can type this query in the same window as the previous query, then highlight it to run the line on its own. Or you can use the cursor or mouse to highlight the word authors, then press ALT-F1. This action runs the sp_help stored procedure on the highlighted table name. Both methods will yield the same output—a list of columns and their datatypes and length. Another way you can identify the column names is to add a line to the query:
SELECT * FROM authors WHERE 0 = 1
Because the WHERE condition is never met for this query (zero can't equal one), the query won't produce output, as Screen 2 shows. But the query will return a page header that includes column names, with the order and spacing in which they'll appear on the final query. Now you can choose which columns you want to see by listing them in the SELECT clause:
SELECT au_id, au_lname, au_fname, phone FROM authors
Note that commas separate the column names. For clarity, FROM is now on the line below SELECT. I also indented it, again for clarity, but you can type it on the same line. If you run this query, you'll see that the output consists of only the four requested columns.
Changing the Output Column Order
The query returned the columns in the same order in which SQL Server created them when the table was built. But you can list the columns in the order in which you want them. Screen 3 shows the results if you list the author's first name, last name, phone number, and author ID in that order:
SELECT au_fname, au_lname, phone, au_id FROM authors
Labeling the Output Columns
The top row of the output lists the names of the columns you selected. Programmers devised the column names, so the names might not mean much to the person who sees the output report. You can use T-SQL or ANSI standard syntax to rename the output columns. If you use the T-SQL syntax, supply the label first, followed by an equal sign, then the column name:
SELECT 'First Name' = au_fname, 'Last Name' = au_lname, Phone = phone, 'Author ID' = au_id FROM authors
Three of the four labels are in quotes because they contain spaces. You don't need to use quotes when the label is a single word, as for the Phone column. An alternative syntax is to put the column name first, then the label, separated only by a space (no equal sign).
|Setting Query Display Options|
| To show which query produced which set of results in the screen shots for this article, I selected a configuration option that instructed ISQL/W to repeat the query at the top of the result set. This option isn't the default setting. To turn it on, you can click on the wrench icon on the toolbar to bring up a tabbed dialog box, then select the Format Options tab and check Output Query. Or you can select Query, Set Options from the menu to bring up the same tabbed dialog box. |
Also, you can turn off the Print Headers display option, so the headers aren't the first row of output. Turning off this option might be useful, for example, if you export the data to another program.
SELECT au_fname 'First Name', au_lname 'Last Name', phone 'Phone', au_id 'Author ID' FROM authors
Using the ANSI standard, you add 'AS' between the column name and the label:
SELECT au_fname AS 'First Name', au_lname AS 'Last Name', phone AS 'Phone', au_id AS 'Author ID' FROM authors
Any of these three select statements will produce the same results, as Screen 4 shows.
Restricting the Rows
The previous select statements return all the rows in a table, but you can use the WHERE clause to retrieve only a subset of the rows that meet certain criteria. The WHERE clause consists of a column name, a comparison operator, and a value or range of values. The comparison operator can be an equality or an operator such as less than or greater than. Examples include:
WHERE city = 'Oakland' WHERE au_lname = 'Bennet' WHERE zip > '84000'
The quotes around the zip code value denote that this is a text value. Zip code is a text field, not a numeric field, so if you need to do a comparison, you must base it on the text value of the zip code field.
Also, you can use the BETWEEN operator, which retrieves data within a specified range, for example, WHERE zip BETWEEN '80000' AND '89999'. Or you can ask for records with the column you specify in a list. The following WHERE clause will return data only from the western states:
WHERE state IN ('AL', 'CA', 'CO', 'ID', 'UT', 'NV', 'WA')
Modifying Search Conditions
You can use the NOT operator to exclude a certain set of data and retrieve the rest. The NOT operator excludes all records that meet the criteria you specify. For example, the pubs database contains names of many authors from California. You can use this query to find authors from other states:
SELECT au_fname, au_lname, phone, au_id FROM authors WHERE state NOT LIKE 'ca'
Be careful about using NOT because SQL Server has a powerful Query Optimizer that calculates the best way (i.e., the way that uses the fewest resources) to run a query. Most often, the Query Optimizer will use an index. An index is a set of pointers to where data is stored, just as a book index points to information, especially when you ask for a range of data, such as a mailing list sorted by zip code. The appropriate index will make the query run fast and efficiently. The query uses the index to get the first record in the range, then starts reading in one record after another. The SQL Server read-ahead process can pre-fetch the records, anticipating what will be needed next. But when you use NOT, such as
WHERE zip NOT BETWEEN 80400 AND 80499
(for example, because you're sending a mailing to everyone outside your local range of zip codes), the Query Optimizer can't use an index. Instead, it performs a table scan, which means it looks at every record to see whether it meets the search conditions. This method is much slower than using an index. To perform this query correctly, you need to use two conditions:
WHERE zip 80499
This query defines two ranges, 00000 through 80399 and 80500 through 99999, so the query can use the index again.
Combining Search Conditions
As the previous example demonstrates, you can combine search conditions. But be cautious when combining search conditions if you aren't familiar with Boolean logic: Don't confuse the AND and OR conditions. Suppose I say, "Get me a list of all our authors from Utah and Texas." You know what I want. But if you wrote the query as
SELECT au_fname, au_lname, phone, au_id FROM authors WHERE state LIKE 'ut' AND state LIKE 'tx'
the query will not return any data because it asked for all authors who live in both Utah and Texas. The result you want is data that meets both conditions, authors who live in either Utah or Texas. Instead, use this query:
SELECT au_fname, au_lname, phone, au_id FROM authors WHERE state LIKE 'ut' OR state LIKE 'tx'
Note that you need to repeat the search condition. You can't write
WHERE state LIKE 'ut' OR 'tx'
because SQL Server expects a complete search condition on either side of the OR. When the list gets longer, you need to use IN:
WHERE state IN ('AL', 'CA', 'CO', 'ID', 'UT', 'NV', 'WA')
is the same as:
WHERE state LIKE 'AL' OR state LIKE 'CA' OR state LIKE 'CO' OR state LIKE 'ID' OR state LIKE 'UT' OR state LIKE 'NV' OR state LIKE 'WA'
Note that you can use uppercase and lowercase state codes interchangeably if you use a case-insensitive sort order for data, so that a search for 'tx' produces the same results as a search for TX.
When you combine multiple searches, you need to consider the order in which SQL Server will evaluate the expressions. SQL Server's order of precedence is NOT, then AND, then OR. For example, suppose you want to find in the titles table all the books in the computer category that cost more than $10 and all the books in the business category. You can use SQL rules to ensure that you get what you want, rather than all the books in both categories that cost more than $10. Here are some examples of how different syntax produces different results. This query returns four rows, because the price restriction applies to business and computer books.
/* Price restriction applies to both categories */ SELECT type, SUBSTRING (title,1,50) as Title, price FROM titles WHERE (type = 'Business' OR type = 'Popular_Comp') AND Price >10
This query returns six rows because the price restriction applies only to computer books.
/* Price restriction applies only to computer books */ SELECT type, SUBSTRING (title,1,50) as Title, price FROM titles WHERE type = 'Business' OR ( type = 'Popular_Comp' AND Price >10)
If you don't use parentheses, you get six rows back because SQL Server interprets the query as in this example. SQL Server processes the AND first, and it joins the price requirement to the computer book requirement. Then SQL Server processes the OR condition, which returns either business books or the higher-priced computer books. Screen 5 shows the results of the three queries. (For clarity, I omitted the queries and showed only the results.)
/* No parentheses to help out */ SELECT type, SUBSTRING (title,1,50) as Title, price FROM titles WHERE type = 'Business' OR type = 'Popular_Comp' AND Price >10
Notice the SUBSTRING in the above queries: I had trouble showing the long book title, and SQL Server formatting is limited, so I requested only the first 50 characters of the title. Next month, I'll discuss formatting output data with SQL Server. I'll also continue the discussion of select statements.