T-SQL problems have a way of tricking programmers. At first glance, those problems look so simple that you're certain you'll need only a few minutes to solve them. However, after you start working on a solution, you realize that the problem isn't so simple after all. Validating IP addresses can be one of those problems. In particular, you might be stymied when you try to write a CHECK constraint that validates character strings containing IP addresses entered into a table column. I found this challenge interesting because I had to perform T-SQL acrobatics and exhaust my bag of tricks before devising a solution. As I share my conclusions in this article, I assume that you know how to use the LIKE predicate to write simple CHECK constraints and pattern searches and that you're familiar with basic string-manipulation functions such as CHARINDEX(), REPLACE(), SUBSTRING(), LEFT(), and RIGHT(). For each part of the solution, I've provided complete Web listings that you can download from http://www.sqlmag .com, InstantDoc ID 26728.

The Problem


Let's lay the groundwork for the problem by describing an IP address. In binary format, an IP address is composed of four octets, each of which contains 8 bits. An octet can hold 256 (28) different values. Typically, you represent an IP address as w.x.y.z, in which w, x, y, and z are whole numbers in the 0-to-255 range. A common way to store an IP address in a database is in a varchar column in the form 'w.x.y.z'. The following statement creates the IPs table for this article's examples:

CREATE TABLE IPs
( ip varchar(15) NOT NULL CONSTRAINT PK_IPs PRIMARY KEY
   /* other columns */ )

I chose a maximum length of 15 characters for the ip column because each of the four octets can hold a number no longer than three digits, and I used three periods to separate the octets. To verify that only valid IP addresses enter the table, you need to add a CHECK constraint to the table by following the requirements that I just specified. In practice, the requirements for valid IP addresses are a bit more complex. For more information about these requirements, see the Microsoft article "Understanding TCP/IP Addressing and Subnetting Basics" at http://support.microsoft.com/default.aspx ?scid=KB;EN-US;q164015. To write a CHECK constraint that validates an IP address, you need to devise a logical expression that evaluates to TRUE only when the IP address is valid.

The Solution


To devise the logical expression that evaluates to TRUE when the IP address is valid, you need to complete three tasks. For each task, you need to write a simple logical expression, then construct a complex logical expression that combines all the simple logical expressions.

Task 1: Confirm that the input value holds exactly three periods and that no octet is empty. You can script the first task in several ways. The following example shows how you can use the LIKE predicate to verify that the input value holds exactly three periods and that each octet holds at least one value:

ip LIKE '%_%.%_%.%_%.%_%'
AND
ip NOT LIKE '%.%.%.%.%'

Alternatively, you can write the first task by using the REPLACE() function:

LEN(ip) - LEN(REPLACE(ip, '.', '')) = 3
AND
ip NOT LIKE '%..%' AND ip NOT LIKE '.%'
AND ip NOT LIKE '%.'

In the first line of this code snippet, the REPLACE() function replaces each occurrence of a period in the input value with an empty string. Subtracting the length of the result string from the length of the original string gives you the number of times a period occurs in the original string. The second line of code confirms no empty octets in the middle, beginning, and end of the input value.

Task 2: Extract each octet from the input value. To accomplish the second task, you can start by using the LEFT() function to extract the first octet. You want to express the length of the string that you need to extract from the input value as the position of the first period in the input value minus 1, as the following code shows:

LEFT(ip, CHARINDEX('.', ip) - 1)

To extract the second octet, you can use the SUBSTRING() function. The starting position from which you want to extract the substring is the position of the character after the first period. The following code shows how you can express the length of the substring as the position of the second period, minus the position of the first period, minus 1:

SUBSTRING(
   ip,
   -- Start: position of the first period plus 1
   CHARINDEX('.', ip) + 1,
   -- Length: position of the second period
   -- minus position of the first period minus 1
   CHARINDEX('.', ip, CHARINDEX('.', ip) + 1) -
      CHARINDEX('.', ip) - 1)

The tricky part here is to find the position of the second period. The CHARINDEX() function has a third argument specifying the character position from which to start searching for the first argument within the second argument. In this case, I specified the position of the first period plus one as the start location.

You can use the SUBSTRING() function again to extract the third octet. You need to express the starting position from which you want to extract the substring as the position of the character after the second period. I showed how to calculate the position of the second period when I extracted the second octet. You can express the length of the substring as the length of the input value, minus the character position of the third period from the end of the string, minus the character position of the second period. Asking for the character position of the third period from the end of the string is the same as asking for the position of the first period in the string in reverse order:

SUBSTRING(
   ip,
   -- Start: position of second period + 1
   CHARINDEX('.', ip, CHARINDEX('.', ip) + 1) + 1,
   -- Length: the length of whole string minus the
   -- position of the third period from the end minus
   -- the position of the second period
   LEN(ip) -
      CHARINDEX('.', REVERSE(ip)) -
      CHARINDEX('.', ip, CHARINDEX('.', ip) + 1))

To retrieve the fourth octet, you can use the RIGHT() function. The following code shows how you can express the length of the string as the position of the third period from the end of the string minus 1:

RIGHT(ip, CHARINDEX('.', REVERSE(ip)) - 1)

Task 3: Convert the octet to an integer value and check that the integer value is in the 0-to-255 range. In the third step, you need to convert each extracted octet to an integer value and confirm that the result is in the 0-to-255 range. The complete expression in the CHECK constraint should consist of four simple logical expressions separated by the AND operator, each handling a different octet. Each octet's expression should take the following pattern:

CAST( AS int) BETWEEN 0 AND 255

Now that you know how to implement each of the three tasks for validating an IP address, you can create the required CHECK constraint. Run the script that Web Listing 1 shows to add the CHECK constraint to the IPs table. Then, try to insert a valid IP address into the IPs table:

INSERT INTO IPs VALUES('131.107.2.200')

The INSERT is successful. Now try to insert the following invalid IP addresses into the IPs table:

INSERT INTO IPs VALUES('1.1.1.256')
INSERT INTO IPs VALUES('1.1.1.1.1')
INSERT INTO IPs VALUES('1.1.1')
INSERT INTO IPs VALUES('1..1.1')
INSERT INTO IPs VALUES('.1.1.1')

These INSERT statements should generate the following CHECK constraint violation error:

Server: Msg 547, Level 16, State 1, Line 1
INSERT statement conflicted with COLUMN CHECK constraint 'CHK_IP_valid'. The conflict occurred in database 'Northwind', table 'IPs', column 'ip'.
The statement has been terminated.

You might think that the CHECK constraint you created would be sufficient to prevent the insertion of invalid IP addresses, even when you attempt to insert a value that contains invalid characters such as 'a.1.1.1'. You reason that such an attempt would fail on a conversion-to-integer error, and the row wouldn't be inserted. You'd be right, but you'd encounter other problems. First, the error that you receive when you insert invalid characters is a conversion error, not the CHECK constraint violation that you expect. You need to consider this difference when you develop the error-handling routine in your client application's code. Second, a conversion error terminates the batch, so you won't have a chance to handle the error in T-SQL. This point deserves elaboration. Run the following code, which attempts to insert a row containing an invalid IP address, then prints 'After INSERT' in the same batch:

INSERT INTO IPs VALUES('1.1.1.256')
PRINT 'After INSERT'

The insert results in a CHECK constraint violation, then the batch continues, and SQL Server executes the PRINT statement. You should receive the following message:

Server: Msg 547, Level 16, State 1, Line 1
INSERT statement conflicted with COLUMN CHECK constraint 'CHK_IP_valid'. The conflict occurred in database 'Northwind', table 'IPs', column 'ip'.
The statement has been terminated.
After INSERT

Now run the following code, which also tries to insert a row containing an invalid IP address, then prints 'After INSERT' in the same batch:

INSERT INTO IPs VALUES('a.1.1.1')
PRINT 'After INSERT'

You receive the following error message:

Server: Msg 245, Level 16, State 1, Line 1
Syntax error converting the varchar value 'a' to a column of data type int.

When the code produces a conversion error, SQL Server terminates the batch and fails to execute the PRINT statement.

Somehow, you need to verify that SQL Server can convert each octet into a valid integer before you attempt to convert the octet. One way to verify conversion is to wrap each expression that extracts an octet in the ISNUMERIC() function, then convert the octet to an integer value as follows:

ISNUMERIC() = 1 AND CAST(
AS integer) BETWEEN 0 AND 255

However, using the ISNUMERIC() function introduces several problems. First, when you write the expression, you rely on T-SQL's support for short circuits. (For information about short circuits, see the T-SQL Black Belt column "Short Circuit," September 2000, InstantDoc ID 9148.) That is, you know that if ISNUMERIC() = 1 returns FALSE, SQL Server doesn't evaluate the rest of the expression because the result is already known to be FALSE. This way, the code doesn't generate a conversion error. However, when you write a logical expression such as

AND

for optimization reasons, SQL Server might decide to evaluate the expression as

AND

If the optimizer rearranges your expression so that the CAST() function is evaluated before the ISNUMERIC() function, you might receive a conversion error. You can solve this problem by using a CASE expression, as the following code shows:

CASE
   WHEN NOT () THEN 0
   WHEN NOT () THEN 0
   ELSE 1
END = 1

This CASE expression forces a short circuit in the desired order of evaluation because it returns the first value for which the logical expression evaluates to TRUE and doesn't continue evaluating the other logical expressions.

The second problem in using the ISNUMERIC() function is that the function treats monetary, decimal, floating-point, and integer values as valid numeric values. For this reason, you need to add yet another logical expression that verifies that the values hold no commas, money symbols, or the characters D and E, which are valid in floating-point values.

The third problem is that using the ISNUMERIC() function requires a lengthy logical expression that would be hard to maintain, and a combination of all the logical expressions you've written so far already generates a long complex expression. In short, writing a logical expression based on a pattern search that makes certain that each octet can be converted to a valid integer is easier. The following expression confirms a possible conversion by disallowing any characters other than digits and periods:

ip NOT LIKE '%\[^0-9.\]%'

Now you can revise the CHECK constraint in Web Listing 1 by adding the above logical expression. Note that you should also revise the constraint so that it uses the CASE expression to ensure that SQL Server converts the octets to integer values only after it checks all the other logical expressions. Run the script that Web Listing 2 shows to implement the revised CHECK constraint. Next, run the following code, which attempts to enter an IP address containing an invalid character:

INSERT INTO IPs VALUES('a.1.1.1')
PRINT 'After INSERT'

Notice that you receive an error message that reports a CHECK constraint violation, and SQL Server executes the PRINT statement.

Simplifying the Solution


Although you've added to the table column a CHECK constraint that successfully filters out invalid IP addresses, the constraint's logical expression is hard to read and maintain because it is long and complex. The longest and most complex part of the code extracts the octets from the original input value. You can simplify this section of code in a couple of ways. The first approach is to look for elements in T-SQL that already handle the extraction of certain parts of a string. When I was looking for a simplified solution, I noticed that an IP address looks very similar to another construct in T-SQL—a four-part object name, which is depicted as server.database.owner .object. T-SQL provides a function called PARSENAME() to parse each part of an object name. The function's syntax is PARSENAME ('object_name', object_piece). In this syntax, object_name is the four-part name, and object_piece is an integer value in the range 1 to 4 where you specify which object part you want to extract (1 ­ object, 2 ­ owner, 3 ­ database, 4 ­ server). To extract the nth octet of an IP address, you invoke the PARSENAME() function as follows:

PARSENAME (, 5 - n).

Web Listing 3 shows how to implement the revised CHECK constraint that uses the PARSNAME() function.

The second approach to simplifying the problematic part of the CHECK constraint in Web Listing 2 is to write logical expressions based on pattern searches only. In this approach, you verify that an octet is a number in the range 0 to 255 without converting it to an integer. You already wrote logical expressions that confirm that the octets are composed only of digits. Now you can add a logical expression that verifies that a number contains no more than three digits:

ip NOT LIKE '%\[0-9\]\[0-9\]\[0-9\]\[0-9\]%'

an expression that reports no numbers in the range 300 to 999:

ip NOT LIKE '%\[3-9\]\[0-9\]\[0-9\]%'

one that confirms no numbers in the range 260 to 299:

ip NOT LIKE '%2\[6-9\]\[0-9\]%'

and finally, an expression that finds no numbers in the range 256 to 259:

ip NOT LIKE '%25\[6-9\]%'

Because the above logical expressions replace the section of code that extracts the octets and converts them to integers, conversion errors are eliminated, and the CHECK constraint can check each logical expression independently without regard to the order of validation. In short, you don't need to use a CASE expression to force the order of validation. Web Listing 4 shows how to implement the revised CHECK constraint.

A Completely Different Approach


If you'd rather avoid the methods I've discussed so far—and the option is available to you—consider not storing the IP addresses in one varchar column. For example, you can store each octet in a separate column that has a data type of tinyint and add a computed column that concatenates all octets and periods to generate a formatted IP address for presentation purposes. The great thing about this approach is that the tinyint data type accepts only values in the range 0 to 255. No validation is required. Web Listing 5 shows how to create the revised IPs table and populate it with a few valid IP addresses. This solution is easier than the others and can save you time and effort.

The Road to the Solution


Sometimes you can learn more from the process that leads to the solution than from the solution itself. Trying to improve the problematic parts of your code is always a good idea, but experimenting with radically different approaches than the one you started with can also yield spectacular results. By taking a different approach, you might find a simple solution that eclipses the original.