Who’s in charge of definitions? And who gets to decide which new words make it into the dictionary? One of my favorite words, uniqueifier, was introduced in SQL Server 7.0 to describe the value that guarantees every clustered index key can be referenced uniquely. If the declared clustered key isn't defined to be unique, SQL Server will make it unique by adding a 4-byte uniqueifier to all duplicate values. When I Googled "uniqueifier," I was asked if I really meant "uniquifier." There were 822 hits for the spelling with the middle "e" and 3990 hits for the spelling without it. The SQL Server 2005 Books Online (BOL) glossary spells the word with the middle "e," which is the spelling I use. Although plenty of technical documents used the word, some without even defining it, it didn't appear on www.wikipedia.com or www.dictionary.com. So is uniqueifier (or uniquifier) a word? I use it when I write about SQL Server, so to me, it definitely is a word.

My academic background and undergraduate degree are in Linguistics, so I'm well aware that language changes and new words enter it all the time, but I still wonder when a new word actually becomes "official." This musing was inspired by a technical document I read recently, written by a member of the SQL Server product team at Microsoft, that described some of the features in SQL Server 2008. In this document, the author used the abbreviation DML, which stands for Data Manipulation Language, without defining it. After a bit of confusion, I was able to discern that he was referring only to data-modification statements (i.e., INSERT, DELETE, UPDATE).

When I learned SQL many years ago, I learned that there were three parts to the language: Data Definition Language (DDL), Data Control Language (DCL), and DML. DML was defined to include all data-access statements (i.e., INSERT, UPDATE, DELETE, SELECT).

However, many of the developers at Microsoft talk about DML as if it stands for Data Modification Language and use the term specifically to exclude the SELECT statement. I wondered whether the developers were confused because they weren't really database guys. Although they develop a database product, they don’t use it, so the nuances of the SQL language might be something they haven’t thought much about. So it might be that database guys know that DML includes SELECT, but developers don’t.
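To make the disagreement concrete, here is a minimal sketch in Python of the two readings of DML. The function names and keyword sets are hypothetical and purely illustrative; classifying a statement by its first keyword is a simplifying assumption, not how any SQL Server component decides anything.

```python
# Leading keywords under each reading of "DML" (illustrative, not exhaustive).
DML_KEYWORDS = {
    # Data Manipulation Language: all data-access statements.
    "manipulation": {"SELECT", "INSERT", "UPDATE", "DELETE"},
    # Data Modification Language: only statements that change data.
    "modification": {"INSERT", "UPDATE", "DELETE"},
}


def leading_keyword(statement: str) -> str:
    """Return the first word of a statement, upper-cased ('' if empty)."""
    words = statement.split()
    return words[0].upper() if words else ""


def is_dml(statement: str, reading: str = "manipulation") -> bool:
    """Say whether a statement counts as DML under the chosen reading."""
    if reading not in DML_KEYWORDS:
        raise ValueError(f"unknown reading: {reading!r}")
    return leading_keyword(statement) in DML_KEYWORDS[reading]


if __name__ == "__main__":
    samples = [
        "SELECT name FROM customers",
        "INSERT INTO customers (name) VALUES ('Ann')",
        "CREATE TABLE customers (name varchar(50))",
    ]
    for sql in samples:
        print(sql, is_dml(sql, "manipulation"), is_dml(sql, "modification"))
```

Under these assumptions, the two readings agree on every statement except SELECT, which is exactly where a reader of an undefined "DML" can be misled.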

I checked the handy BOL glossary again and found good news and bad news. First, the entry for DML referred me to Data Manipulation Language, which did have what I considered to be the correct definition of the term: "The subset of SQL statements that is used to retrieve and manipulate data. DML statements typically start with SELECT, INSERT, UPDATE, or DELETE." However, the waters became muddied when I saw the following definition for "DML Trigger" in the same glossary: "A stored procedure that executes when data in a specified table is modified." So someone reading that definition might just assume that the "M" in DML stands for modification.

I wrote to Steve Kass, a friend of mine who is a SQL Server MVP with a PhD in Math. Steve is also very well versed in the intricacies of the SQL language. I asked him what he thought about the use of the term DML. He was reassuring and told me that I wasn’t exactly wrong to think the way I do (and now that I’ve checked BOL, I’m even more convinced of that), but he referred to the 2003 and 1999 American National Standards Institute (ANSI) SQL standards and pointed out that they offer little assistance. The concept <direct select statement> isn't in the chapter that discusses data manipulation; instead, it's in a chapter entitled "Direct invocation of SQL." He went on to describe the ANSI standard’s list of the "main classes" of SQL statements and said that there was no class called SQL-manipulation statements or SQL-modification statements.

I’m left to wonder if insisting on my definition of DML is a hopeless cause and this little bit of the SQL language is destined to change no matter what I say or do. I feel like I might be trying to hold back the tide if I try to explain my concerns to the small percentage of product developers that I work with directly. I’m well aware that you should choose your battles carefully, and since this battle might be one I can't win, all I can hope for is that anyone using the term DML at least uses it consistently. It would be even better if DML were defined in every document in which it's used, so that readers know how the writer is using the term.