Enforcing Restricted Uniqueness on Columns

Congratulations to Carlos Antonio Alvarado Andrade and Toby Ovod-Everett. Carlos won first prize of $100 for the best solution to the February Reader Challenge, "Imposing Data Restrictions." Toby won second prize of $50. Here's a recap of the problem and the solution to the February Reader Challenge.

Problem:


Arun is a database architect who designs database schemas for products that use SQL Server 2000 and 7.0. Arun is currently working on a schema for a product that lets end users configure field names. The product's schema table contains a column that stores a field name, and this field name is displayed in the product's UI. The table can be installed under any case-insensitive database collation supported by SQL Server. The schema of the table looks like this:

create table meta_FieldNames ( fieldid int not null primary key, fieldname nvarchar(50) not null )

As part of the schema design, Arun wants to allow only a mix of upper or lowercase alphabetical characters with no numbers, international alphabet characters (e.g., accent marks, tildes) or special characters in the field names. How can he impose this restriction on the "fieldname" column of the table?

Solution:


Arun can use a CHECK constraint on the fieldname column to impose restrictions on the data that users enter. He can add the check constraint by using an ALTER TABLE statement that uses a LIKE expression to verify the value:

ALTER TABLE meta_FieldNames
ADD CONSTRAINT CK_FieldName CHECK (fieldname NOT LIKE '%\[^a-z\]%')

However, although this expression will restrict upper or lowercase letters depending on the collation of the column, it won't restrict letters with special characters. For example, this CHECK constraint will let users enter an "a" with an accent mark in the column. This expression alone doesn't satisfy Arun's requirements, so he should modify the expression to include an explicit list of letters:

ALTER TABLE meta_FieldNames DROP CK_FieldName
ALTER TABLE meta_FieldNames
ADD CONSTRAINT CK_FieldName CHECK (fieldname NOT LIKE '%\[^abcdefghijklmnopqrstuvwxyz\]%')

By modifying the LIKE expression to include the explicit list of letters, Arun ensures the constraint's success, irrespective of the various alphabetical letters' sorting order based on the collation of the column. Finally, to prevent users from entering empty values (') into the column, Arun can modify the CHECK constraint to include a LEN check:

ALTER TABLE meta_FieldNames
ADD CONSTRAINT CK_FieldName CHECK (fieldname NOT LIKE '%\[^abcdefghijklmnopqrstuvwxyz\]%' AND len(fieldname) > 0)

To validate the CHECK constraint, Arun can run the following insert statements:

INSERT INTO meta_FieldNames values (1, 'aaaaaa')
INSERT INTO meta_FieldNames values (2, 'AAAAAA')
INSERT INTO meta_FieldNames values (3, '12345')         -- will be rejected
INSERT INTO meta_FieldNames values (4, 'a1')            -- will be rejected
INSERT INTO meta_FieldNames values (5, 'A1')            -- will be rejected
INSERT INTO meta_FieldNames values (6, 'a,b,c') -- will be rejected
INSERT INTO meta_FieldNames values (7, 'A,B,C') -- will be rejected
INSERT INTO meta_FieldNames values (8, ')               -- will be rejected

MARCH READER CHALLENGE:


Now, test your SQL Server savvy in the March Reader Challenge, "Enforcing Restricted Uniqueness on Columns" (below). Submit your solution in an email message to challenge@sqlmag.com by February 16. Umachandar Jayachandran, a SQL Server Magazine technical editor, will evaluate the responses. We'll announce the winner in an upcoming SQL Server Magazine UPDATE. The first-place winner will receive $100, and the second-place winner will receive $50.

Problem:


John is a database architect for a company that uses a sales application populated with customer data from various data sources. The customer data is stored in a SQL Server 2000 database table called Customers. The following column shows some of the columns in the table:

CREATE TABLE Customers
(
  CustomerId int identity(1,1) NOT NULL primary key,
  CompanyCode varchar(10) NULL
  /* ... other columns ...*/
)

The CompanyCode column isn't always present in the data that SQL Server imports into the table. In some cases, the value can be empty or NULL. John wants to enforce uniqueness on the CompanyCode column-but only for non-null values-and exclude empty values. How can John enforce uniqueness with minimal coding and without affecting the legacy applications that provide the customer data?

Discuss this Article 1

TOBY (not verified)
on Feb 13, 2006
The above listed solution is incorrect. There exist collations for which the above will not work! First off, we will use SQL Server 2000 to simulate installation of the above table under the SQL_Latin1_General_Cp850_CI_AI collation. This is the Case-Insensitive and Accent-Insensitive variant of CP850. Since the original problem statement only requires case-insensitivity and makes no assertion regarding accent sensitivity, the above collation meets the requirements. To do this under SQL Server 2000: create table meta_FieldNames ( fieldid int not null primary key, fieldname nvarchar(50) COLLATE SQL_Latin1_General_Cp850_CI_AI not null ) This has assigned that particular collation to the fieldname column. Now we use the exact constraint proposed above: ALTER TABLE meta_FieldNames ADD CONSTRAINT CK_FieldName CHECK (fieldname NOT LIKE '%[^abcdefghijklmnopqrstuvwxyz]%') And now we go to town: INSERT INTO meta_FieldNames values (1, 'Aaâ') That should be an a with a caret above it as the third character. No problem. I have now inserted a character with an accent on it into the table. The problem is much more difficult than it appears at first sight. If you take into consideration codepages like SQL_Latin1_General_Cp1254_CI_AS (the Turkish codepage (in which an upper case 'i' is an 'I' with a dot on top, and a lower case 'I' is an 'i' without the dot!), you will find that many initially obvious solutions fall victim to at least one of the fascinating codepages available! There is a reason my solution involved 156 REPLACE calls, many of them with varbinary parameters. Note that it is an easy problem to solve if you aren't worried about SQL Server 7.0. If you are only worried about SQL Server 2000, simply an appropriate collation to the column in your DDL and then it's easy. It's when you have to deal with a nasty collation that things get difficult. --Toby Ovod-Everett

Please or Register to post comments.

IT/Dev Connections

Las Vegas
September 30th - October 4th

Paul ThurottOur Experts will show you:
• Common SQL Server
Problems
• Best Practices for T-SQL
• SQL Server Integration
Services
• Database Development

Come See Mike Otey & Tim Ford in Person!

Early Registration Now Open

From the Blogs
May 9, 2013
blog

My ISO 8601-Compliant Signature 2

My family recently just "officially" announced that we're in the process of adopting a child from South Africa. We're quite excited, of course, but there's a ton of paperwork to do—along with the need for gobs of signatures....More
May 8, 2013
blog

Use SSIS for ETL from Hadoop

In this blog post, Mark Kromer walks you through using SSIS as a way to use ETL techniques using Microsoft's Hadoop on Windows (HDInsight) as a source using Hive connectors...More
Vision road sign
May 6, 2013
blog

Cheaters Never Win, Even in TPC Benchmarks

In this portion of the series on database benchmarking, I want to tell you about one of my favorite aspects of the TPC benchmarks – CHEATING....More
SQL Server Pro Forums

Get answers to questions, share tips, and engage with the SQL Server community in our Forums.