OLE DB or ODBC? Look before you leap
OLE DB and ODBC are APIs that provide access to a range of data sources. Microsoft designed ODBC to access SQL data and OLE DB to access any type of data in a COM environment. Many people don't entirely understand the factors that led Microsoft to implement OLE DB. Others overestimate the current role of OLE DB in the corporate and Web-development environments. In this article, I aim to clarify why Microsoft introduced OLE DB, the role it plays today, and more important, the role it will play in the future. In doing so, I address the main concerns that seasoned ODBC developers might have about OLE DB.
What's Wrong with ODBC?
The introduction of OLE DB doesn't mean that Microsoft is dropping ODBC. For the foreseeable future, Microsoft plans to support ODBC, as do other database and tool vendors. So what's wrong with ODBC? Nothing. ODBC is adequate for data access. My real-world experience confirms that if ODBC meets your needs and your clients' needs, you can forget (for now) about OLE DB and related technologies.
However, ODBC has become a mature, fully expressed technology, and Microsoft probably won't develop it further. ODBC is on a dead-end track with a few switches left before the end of the line. The last of these switches shunts your applications toward OLE DB. And you have to decide whether you've already passed the last switch.
So nothing is wrong, unsupported, or invalid about ODBC. You know about its performance, flexibility, and architecture. You know about the various development tools and the framework built on top of ODBC (e.g., RDO). To determine how close you are to the point of decision, analyze how well your information system's planned future enhancements match ODBC's capabilities. Keep in mind that in the next five years, ODBC will provide virtually the same operational capabilities as it does today. ODBC will continue to let you access SQL data that isn't integrated with other, nonrelational data types such as extensible markup language (XML) files, Microsoft Office documents, or email. If these elements are part of your company's data store, you need to consider OLE DB.
What's Right with OLE DB
OLE DB has evolved from ODBC. OLE DB and ODBC both provide a relatively universal layer of code that uses the same core API to access different databases. The operation under the hood is left to special, invisible modules that are database-specific and that act as drivers for the topmost layers of code. The open database connectivity principle emerges somewhat intact in OLE DB. The big difference is in the surrounding context and in the generalization of a few main terms; Table 1 shows commonly used terms and what they mean in ODBC and OLE DB.
Through a common functionality set, OLE DB accesses all types of information—relational and non-relational, flat and hierarchical, persistent and volatile, based on SQL or on any other query language. To facilitate that access, OLE DB data sources are COM-based components with a well-defined programming interface. These components, called data providers, wrap a data store. When connected to a data provider, a client application always receives a set of records, whether the data provider manages tables from a relational database management system (RDBMS) or from a directory listing. The data provider is responsible for retrieving the data from the physical source and formatting it. The data can reside on a persistent medium (disk files or databases), be kept in memory, or span various machines and platforms. It can be relational or hierarchical, structured or flat, written in a standard or a proprietary format, and accessible or inaccessible through ODBC.
In addition, the result set (called a rowset or a recordset) that OLE DB returns isn't a simple stream of bytes copied into the client application's memory, as with ODBC. It's a stream of data that an independent COM module, with a separate programming interface, holds. Such a module gives you several facilities for manipulating the recordset—sorting, filtering, and scrolling—and it enables concurrent access to the same data from multiple clients. You can set up the data rows to work even when they're disconnected from the data source, making the recordset a new and powerful data type.
Drawbacks to Using OLE DB
So, is OLE DB all good? To decide, carefully consider two points. On one hand, OLE DB is a somewhat immature technology. On the other hand, Microsoft is pushing it as the core data service for future Windows platforms. This push means that Microsoft will significantly improve OLE DB in the future. Let's look at these two points in detail to anticipate their impact on users.
Microsoft proposed OLE DB as a central data-access technology with Visual Studio 97. At TechEd 99, OLE DB 2.5—a new release that is scheduled to ship with Windows 2000—debuted. Many changes since OLE DB's inception make the technology seem somewhat unstable. But in the early 1990s, people perceived ODBC as unstable, too. ODBC 3.x and later worked fine for most people, and today developers think of ODBC as a stable and consolidated technology. But that process took time. Data is the fuel of companies, so carefully evaluate any technology that affects data access.
Figure 1 compares the architecture of ODBC and OLE DB. Both rely on specialized components (drivers in ODBC and providers in OLE DB) that connect to the data source. Within ODBC, a driver usually acts as a proxy component, passing in SQL commands to the DBMS and returning result sets. An OLE DB provider accepts data in any query language it supports (not only and not necessarily SQL) and returns recordsets. A data provider encapsulating a DBMS is limited to passing SQL commands to the underlying database server. A data provider that renders a nonrelational data store (e.g., an email store) accomplishes the additional task of creating the recordset and populating it with the data. Such a provider might support a query language that is simpler than SQL. For example, to return email from a customer, the provider needs to know only the name of the sender. This type of command
is simple, effective, and easy to code.
OLE DB is a two-part technology with a foreground interface and a background engine. The background engine processes the queries and retrieves the data. The foreground part of OLE DB contains the programming features that any valid provider needs to expose to its clients. In this area, only Microsoft can establish how, functionally and technologically, OLE DB evolves, by determining the details of the COM interfaces that a provider needs to support. You'll see great improvements with OLE DB 2.5, which fixes design limitations and adds more powerful features. For example, OLE DB 2.5 allows the return of irregular, nontabular recordsets. You can use OLE DB to expose semi-structured and hierarchical data such as XML streams, Word and Excel documents, or the content of a file system directory. In addition, OLE DB 2.5 reduces the amount of information that a consumer needs to know about the required provider. Instead of specifying complex connection and command strings, you can bind directly to the desired recordset via a URL-based syntax. This feature, called direct URL binding, lets you use a connection string such as
Behind that intuitive URL scheme is the name of the data provider and the command text that asks the provider to scan the Inbox folder and retrieve all the messages sent by Joe User. The latest Microsoft platform Software Development Kit (SDK) partially covers the highlights of OLE DB 2.5.
Because Microsoft considers OLE DB the main technology for data access in the Windows environment, you'll get the support you need to use it for the foreseeable future. Also, SQL Server is Microsoft's principal database server product, so SQL Server providers will immediately reflect new features in OLE DB. Will other vendors' databases accommodate new OLE DB features? In my opinion, OLE DB's weakness is that the only reliable providers are those for SQL Server 6.5 and 7.0. The OLE DB providers for Jet and Oracle reportedly have bugs and incomplete, or inadequate, functionality implementation, which limits their usefulness to developers. Microsoft's OLE DB provider for Oracle isn't great, but it's perhaps the best available. OLE DB 2.1 facilitated multienvironment data access, but the environment for the problem remains. For OLE DB to become a widely accepted standard, all database vendors, and perhaps some third parties, need to release complete providers for various databases and possibly work together to develop them in a better way.
In general, OLE DB has yet to gain the reliability and the acceptance that ODBC has built during the past few years. But OLE DB has the potential to surpass ODBC in terms of supported data sources, flexibility, and programming interface. Moreover, OLE DB's central role in Microsoft's Windows Distributed interNet Applications (DNA) and DNS architectures makes it a serious candidate to replace ODBC. OLE DB promises to be an important technology for planning your future data-access strategies. But don't adopt it blindly.
Why Choose OLE DB?
If you have to decide about OLE DB today, what are the factors to consider? Microsoft designed OLE DB with performance in mind, but architecturally speaking, an OLE DB call has to traverse more layers of code than does a SQL query issued through raw code to the ODBC API. So don't be surprised if you find that a pure ODBC approach is slightly faster. To make up for this inefficiency, OLE DB gives you the ability to aggregate and integrate structurally different types of data and the possibility of using these data types with Web applications.
Don't underestimate the effort necessary to move to OLE DB. This caution applies even if you're already using an object model such as RDO or Data Access Object (DAO), or ADO before version 2.x. Also, proceed carefully if your target database isn't SQL Server. You begin to recoup the cost of moving to OLE DB only when you can exploit the integration with applications and system services that OLE DB allows. If you don't know how to leverage heterogeneous queries, don't have a data warehouse, or don't plan to integrate nonrelational data types (e.g., documents, spreadsheets, email) into the company's data store, you might end up with worse performance! Or you might upgrade your system, but get no concrete advantage.
I usually advise my clients to move to OLE DB only to satisfy urgent business needs (such as those above) that OLE DB addresses well. I also recommend it for new systems because you can design the system to take advantage of OLE DB's strengths. Flexibility, integration, and uniformity are important aspects of OLE DB; if they are integral parts of your business objectives, moving to OLE DB might be cost-effective for you.
Web-based systems are a great territory to employ OLE DB in, although the inferior reliability of some providers might hamper your efforts. However, choosing Microsoft Data Access Components (MDAC), which includes OLE DB as the core technology, lets you leverage Remote Data Services (RDS) on the Web to issue remote calls to components via HTTP. And OLE DB lets you receive, on the client side, disconnected recordsets to work with—not bad, after all!
OLE DB has good and bad points; it isn't magic. It's simply undergoing a process of refinement to become a widely accepted standard. Consider it today, but use it only to get concrete advantages.
Learn more from Ken Miller's "The Ins and Outs of Data Access."