Reading the tea leaves of key technologies

In his TechEd 99 keynote speech, Bob Muglia, senior vice president of Microsoft's Business Productivity Group, outlined a roadmap of technologies that Microsoft plans to deliver in the next year or so. I'll give you my spin on these technologies.

Collaboration and Knowledge Management

The first technology in this category is a forthcoming Exchange-brokered framework for collaboration and knowledge management (KM). Rumors suggest that major turf wars raged at Microsoft about which product would serve as the foundation of the KM effort and that SQL Server lost out to Exchange. Web Store, which will be part of Platinum (the code name for the next version of Exchange), reportedly provides a portal for accessing information from the Web, from Exchange collaboration servers, and from traditional online transaction processing (OLTP) database systems.

Team Productivity Update (TPU) is a new feature add-on for BackOffice Server 4.5. According to Sarah Sterling, program manager in Microsoft's Workgroup Solutions Group, in her TechEd presentation "Building Team Knowledge Management Solutions Using the Team Productivity Update for BackOffice Server 4.5," you can use either Access 2000 or SQL Server as TPU's tracking database. Another component associated with TPU is Team Workspace, which works with both Internet Explorer 5.0 (IE5) and Outlook, as well as with Microsoft's new Office Server Extensions (OSE) and Application Instantiation Model (AIM).

Even so, I think Platinum's public-folder-based applications are likely to be the center of Microsoft's workflow and KM strategy. I'll be monitoring announcements from the Microsoft Exchange Conference 99 (http://www.microsoft.com/corpevents/mec99/), to be held in October in Atlanta, to find out more about these products.

Data-Mining Technology

Another technology Muglia spoke briefly about at TechEd is Tahoe, Microsoft's code name for server-side services related to document management and data-mining technology, some of which might be rolled into a future release of SQL Server. According to Muglia, Tahoe will search an organization's data stores and return best-bet results based on information tagged with Extensible Markup Language (XML). Microsoft also officially launched its OLE DB for Data Mining extensions in a TechEd forum.

Tahoe's kind of data mining isn't just fancy querying. It's high-end stuff that involves a lot of statistics, predictive (classification) and dependency (density estimation) modeling, algorithms for clustering (segmentation) and deviation detection, and fairly compute-intensive summarization and visualization. And if that complexity isn't enough, preparing data for the algorithms is often harder than doing the data mining itself. No wonder end-user data-mining tools have garnered a lackluster response. Even Microsoft avoids suggesting that OLE DB for Data Mining can magically bring data mining to the masses.
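
To give a feel for two of the techniques named above, here is a minimal Python sketch of clustering (segmentation) and deviation detection. It is an illustration only and has nothing to do with Tahoe's internals or the OLE DB for Data Mining interfaces: the function names, the one-dimensional k-means variant, the z-score threshold, and the sample numbers are all my own assumptions.

```python
from statistics import mean, pstdev


def kmeans_1d(values, k, iterations=20):
    """Toy 1-D k-means (hypothetical helper). Returns (centroids, assignments)."""
    ordered = sorted(values)
    # Deterministic start: pick k points spread evenly across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(values, assignments) if a == c]
            if members:
                centroids[c] = mean(members)
    return centroids, assignments


def deviations(values, threshold=2.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Made-up order totals that fall into two obvious customer segments.
order_totals = [12, 15, 14, 13, 16, 98, 102, 97, 101, 100]
centroids, segments = kmeans_1d(order_totals, k=2)

# Made-up daily counts with one obvious outlier.
daily_counts = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]
outliers = deviations(daily_counts)
```

Even in this toy, the numbers arrive clean and already numeric. In real projects, getting data into that shape is the data-preparation work that often costs more than the mining itself.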

Microsoft's Data Mining and Exploration (DMX) group has been instrumental in developing the OLE DB for Data Mining specification. Microsoft isn't creating a proprietary specification, however. It is rallying the troops to create a de facto industry standard by repeating the process it used successfully for its OLE DB for OLAP specification. Microsoft is inviting selected independent software vendors (ISVs) to help formulate a draft specification, give public feedback, and ship a software development kit (SDK) before the official rollout.

The specification and SDK should be available soon at http://www.microsoft.com/data. In the meantime, I recommend that you read about data mining at http://research.microsoft.com/dmx/. Microsoft's DMX team focuses on scaling data mining, reduction, and analysis algorithms for use on large data sets. The team's emphasis areas include classification, clustering, sequential data modeling, frequent-event detection, and fast data-reduction techniques. The team collaborates with Microsoft's database group on the implications and requirements that data mining imposes on the database engine. Given that Microsoft took 1.5 years to move OLE DB for OLAP from spec to ship, I think data mining is still leading-edge technology for Microsoft.
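
For readers wondering what a fast data-reduction technique can look like, the sketch below shows reservoir sampling, a well-known way to draw a fixed-size uniform sample from a data set too large to hold in memory. I chose it purely as a generic example. I don't know which techniques the DMX team actually uses, and the function name and parameters here are hypothetical.

```python
import random


def reservoir_sample(stream, size, seed=0):
    """Keep a uniform random sample of `size` items from a single pass."""
    rng = random.Random(seed)  # fixed seed so repeated runs match
    sample = []
    for index, item in enumerate(stream):
        if index < size:
            # Fill the reservoir with the first `size` items.
            sample.append(item)
        else:
            # Replace a random slot with probability size / (index + 1).
            slot = rng.randint(0, index)
            if slot < size:
                sample[slot] = item
    return sample


# Reduce 100,000 made-up row IDs to a 100-row sample in one pass.
reduced = reservoir_sample(range(100_000), 100)
```

The appeal for large data sets is that the data is read once and memory use depends only on the sample size, not on the number of rows.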