SQL Server resources

I'd like to tell you about some of my favorite SQL Server resources. I monitor both Microsoft's SQL Server news server forums (microsoft.public.sqlserver) and Steve Wynkoop's SQL Server 6.5 and SQL Server 7.0 mailing lists (http://www.swynk.com) because I like to see how people are using SQL Server and the kinds of problems they're running into. Even a rough count of active messages in Microsoft's newsgroups or the length of a message thread (both forums) is instructive. For example, as of mid-May, here are some active message counts in Microsoft categories: microsoft.public.repository: 314; microsoft.public.sqlserver.clients: 782 (overlaps with OLAP); microsoft.public.sqlserver.connect: 748 (overlaps with ODBC); microsoft.public.sqlserver.datawarehouse: 877; microsoft.public.sqlserver.mseq: 97 (English Query).

My conclusions are that, despite ODBC and OLE DB, a lot of people are having problems with connectivity. And data warehousing and OLAP aren't catching on as fast as Microsoft probably expected. Also, I'm surprised that Microsoft doesn't have a newsgroup for SQL Server Enterprise Edition (clustering) customers. However, that lack seems to confirm my informal poll that not many folks are using it. Are you? For implementation tips, see Brian Moran and David Sapery, "SQL Server Clustering" (June 1999).

English Query and the Microsoft Repository are yawners in most people's eyes. However, everyone wants natural language interfaces to back-end data, and I'm convinced it's just a matter of time until the English Query feature of SQL Server catches on. In September 1997, Microsoft invested $45 million in linguistic technology vendor Lernout & Hauspie. The Microsoft Speech API software development kit (SDK) has been available from Microsoft Developer Network (MSDN) for several years now, and Microsoft Research's Natural Language Processing (NLP) group consists of 27 researchers.

Perspectives on Scalability


Scalability is always an issue. At the launch of SQL Server 7.0 at Comdex last fall, Microsoft demonstrated a multi-terabyte TerraServer database and Unisys showed its 2TB database consisting of 178 tables, some in third normal form (3NF) and some in star schema layouts. SQL Server can scale, but most firms aren't willing to give up Sun Starfires or NCR-hosted Teradata data warehouses for an Intel or Alpha box. Sun and NCR come to mind because of their early-1999 press releases: Sun announced shipment of its one-thousandth Sun Enterprise 10000 server, known as Starfire, and NCR announced its fiftieth Terabyte Club Teradata customer. Barnes & Noble's Web site uses SQL Server 7.0, but Amazon.com runs on Oracle. Speaking about the explosive customer relationship management (CRM) market, META Group's Aaron Zornes said, "Most people do not know that a lot of the basis for [Amazon.com's] success is a 2TB data warehouse. What it lets them do is keep track of customer interest and inventory movement to tailor the shopping experience."

Successful e-commerce sites generate a lot of customer data. In a recent press release, Oracle described three customers that had abandoned SQL Server for Oracle, citing scalability as the reason. According to Oracle, the customers became victims of their own success. Yes, scalability matters.

I asked some of my Manhattan-based colleagues whether Sybase, often described as "owning Wall Street," was beginning to see defections now that SQL Server 7.0 was shipping. "Are you kidding?" was the typical reply. Yes, most securities and financial firms are using Windows NT for decision-support applications, and sometimes SQL Server, but only for meat-and-potatoes OLTP processing. But such firms certainly aren't ready to give up their UNIX boxes, thank you. One of my colleagues agrees, saying that the issue is not converting from Oracle, Sybase, or whatever to SQL Server. The issue is converting from tested, tried and true UNIX boxes to NT. Some people have more faith in UNIX's reliability and stability than in NT's. The logical continuation of this perception is that because SQL Server runs only on NT, an OS perceived as second to UNIX, such people won't buy into SQL Server, no matter how good and convenient it is or how much marketing appeal it has.

Subsecond Performance


You're probably tired of hearing about Oracle's million-dollar challenge to anyone who could run the Transaction Processing Performance Council's TPC-D query number five on SQL Server 7.0 in under two hours (100 times slower than the 71 seconds Oracle then needed to run the query). Oracle's Larry Ellison closed the challenge in February, but the way Oracle set up this challenge makes you wonder whether the major databases contain lines of code that were written specifically for the TPC and other benchmarks.

For the record, IBM published results documenting the subsecond performance of DB2 on Windows NT and said, "Other vendors don't seem to understand what scalability means. While Microsoft SQL Server is limited to a single machine, Oracle, with its shared-disk architecture, has been unable to scale beyond six nodes, and IBM has demonstrated the ability to handle a terabyte of data on Windows NT using 32 nodes. And while Oracle publishes TPC-D benchmarks with a single user on the system, IBM ran the 1TB benchmark on Windows NT with eight parallel query streams, a new record for concurrent users. No other vendor has even attempted this."

Bill Gates' Vision


Bill Gates describes his vision of creating a single view of the Microsoft customer in his new book, Business @ the Speed of Thought. A customer-centric view sounds reasonable, and many other CEOs have the same idea. A host of software vendors market customer-centric solutions, including CRM, call center, sales force automation (SFA), database marketing, and help desk products.

However, ERP systems, Web and e-commerce applications, and data warehouses will prove challenging if Microsoft tries to implement Gates' vision. Also, Microsoft will have to address problems associated with extended supply chains. If a supply chain includes distributors, VARs, and suppliers, the question of who owns the customer will certainly come up. Put another way, if you buy a copy of NT or SQL Server from CompUSA, will Microsoft ever find out? When you start thinking about these issues and realize that some Microsoft customers wear many hats (e.g., a contact person in a firm that has paid for Alliance support but who is also a developer and owns Microsoft software on a home PC), it's not hard to see how complex Microsoft's customer database problem must be.

Design Challenge


So here's my challenge to you: Design a customer database for Microsoft. Submit a normalized design, either as SQL Data Definition Language (DDL) or as an entity relationship diagram (ERD), to me and tell me whether I can publish it.
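
As one possible starting point, and by no means a finished answer, a design might separate parties (people and organizations) from the roles they play, so that one person can wear several hats and a sale through a reseller still points back to the end customer. The sketch below is a rough illustration under stated assumptions: every table, column, and sample name is hypothetical, the DDL is written in SQLite syntax (not SQL Server's Transact-SQL), and it is wrapped in Python's built-in sqlite3 module only so that it can be run without a server.

```python
import sqlite3

# Hypothetical schema: every table and column name below is an illustrative
# assumption, written in SQLite syntax rather than SQL Server T-SQL.
DDL = """
CREATE TABLE party (
    party_id   INTEGER PRIMARY KEY,
    party_type TEXT NOT NULL CHECK (party_type IN ('person', 'organization')),
    name       TEXT NOT NULL
);
CREATE TABLE role_type (
    role_type_id INTEGER PRIMARY KEY,
    description  TEXT NOT NULL UNIQUE
);
-- One row per hat a party wears; organization_id is set when the role
-- is held on behalf of a firm (for example, a support contact).
CREATE TABLE party_role (
    party_role_id   INTEGER PRIMARY KEY,
    party_id        INTEGER NOT NULL REFERENCES party(party_id),
    role_type_id    INTEGER NOT NULL REFERENCES role_type(role_type_id),
    organization_id INTEGER REFERENCES party(party_id)
);
-- Records who sold to whom, so a reseller sale still links to the buyer.
CREATE TABLE purchase (
    purchase_id INTEGER PRIMARY KEY,
    buyer_id    INTEGER NOT NULL REFERENCES party(party_id),
    seller_id   INTEGER NOT NULL REFERENCES party(party_id),
    product     TEXT NOT NULL
);
"""


def build_demo():
    """Create an in-memory database and load a few made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)
    with conn:  # commit the sample rows as one transaction
        conn.executemany(
            "INSERT INTO party (party_id, party_type, name) VALUES (?, ?, ?)",
            [
                (1, "organization", "Example Firm"),
                (2, "person", "Pat Example"),
                (3, "organization", "Example Reseller"),
            ],
        )
        conn.executemany(
            "INSERT INTO role_type (role_type_id, description) VALUES (?, ?)",
            [(1, "support contact"), (2, "developer"),
             (3, "home user"), (4, "reseller")],
        )
        conn.executemany(
            "INSERT INTO party_role (party_id, role_type_id, organization_id) "
            "VALUES (?, ?, ?)",
            [(2, 1, 1), (2, 2, None), (2, 3, None), (3, 4, None)],
        )
        conn.execute(
            "INSERT INTO purchase (buyer_id, seller_id, product) "
            "VALUES (2, 3, 'SQL Server')"
        )
    return conn


def roles_for(conn, party_id):
    """Return the role descriptions held by one party, in alphabetical order."""
    rows = conn.execute(
        "SELECT rt.description FROM party_role pr "
        "JOIN role_type rt ON rt.role_type_id = pr.role_type_id "
        "WHERE pr.party_id = ? ORDER BY rt.description",
        (party_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling roles_for(build_demo(), 2) lists the three hats worn by the sample person. A real design would also have to cover products, licenses, support contracts, and address history, none of which this sketch attempts.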