SQL Server now holds every conceivable world record for the TPC-E database benchmark. That news would be slightly more impressive if TPC-E scores existed for any database besides SQL Server, but heck, winning a race with just one runner doesn't mean that runner did a bad job. I first wrote about TPC-E, the latest benchmark from the Transaction Processing Performance Council, in my commentary "TPC's New Benchmark Strives for Realism."

Microsoft became the first database vendor to have a published TPC-E result when Unisys published a TPC-E score on July 12 using SQL Server 2005 on a dual-core 16-processor ES7000. IBM followed suit with a dual-core 2-processor server two weeks later, and Dell posted a dual-core 4-processor result on August 24. Both IBM's and Dell's results used SQL Server, so SQL Server is currently the only database listed, which means it holds all the top scores. Sane vendors don't post TPC-E scores that make them look bad, but I suspect it's only a matter of time before IBM and Oracle post TPC-E scores for their database products that leapfrog the latest SQL Server scores, which will in turn be bested by Microsoft in the never-ending game of benchmark leapfrog.

Still, I thought now would be a good time to revisit TPC-E to help you understand how this benchmark compares to TPC-C. I have to admit that I've become jaded over the years when it comes to database benchmarks. My primary complaint has always been that the systems being benchmarked are generally so much larger and more powerful than any customer would ever have that the benchmarks themselves end up providing little value to customers. I reserve the right to regret saying this a year from now, but I was pleasantly surprised when I began learning a bit more about the TPC-E test, and I cautiously hold out hope that this benchmark could be more valuable to online transaction processing (OLTP) customers than the TPC-C benchmark. The full TPC-E specification document is 259 pages and can be found at http://www.tpc.org/tpce/spec/TPCE-v1.2.0.pdf. It's not the most exciting read.

IBM has published a white paper called Overview of TPC Benchmark E: The Next Generation of OLTP Benchmarks. I've found this document to be invaluable in understanding the TPC-E design goals and some of the key differences between TPC-C and TPC-E. The white paper is 29 pages, which sounds like a lot to read. But trust me, there's a lot of white space and plenty of charts. You can quickly scan the document in five minutes or less and take away many of the key facts about the goals of TPC-E and how this new benchmark is different from TPC-C. The document also contains useful background information for anyone interested in the topic of database benchmarking. Check it out. It's definitely worth your time. Here are a few of the most interesting facts about TPC-C and TPC-E that lead me to cautiously hope TPC-E will prove to be more relevant to database customers than TPC-C.

Some characteristics of TPC-C include

  • no server-side referential integrity
  • no check constraints
  • only one round trip from client to server per transaction
  • no RAID required on the server, so the only recoverability is what the transaction log in the relational database management system (RDBMS) provides
  • no TPC-C code provided, making it difficult for organizations to use this test as an in-house benchmarking tool
  • solutions generally involve 50-200 separate physical disks to ensure adequate performance numbers

Some characteristics of TPC-E include

  • Server-side referential integrity is enforced across 50 foreign keys.
  • There are 22 check constraints.
  • Transactions can involve more than one round trip to the server.
  • RAID-protected data is required, and vendors must publish how long it takes for the database to recover to at least 95% of rated throughput after a catastrophic disk failure.
  • The Transaction Processing Performance Council publishes the full TPC-E source code and provides a software tool called EGen, which is designed to facilitate the implementation of TPC-E, making EGen a useful tool for customers to experiment with directly.
  • TPC-E solutions use 1-20 physical disks, ensuring that the disk layout more closely resembles what's likely to be used in the real world.
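
To make the first two TPC-E characteristics concrete, here's a minimal sketch of what server-side enforcement means in practice: the database itself, not the client application, rejects rows that break a foreign key or a check constraint. The sketch uses Python's built-in sqlite3 module as a stand-in for a real server, and the customer and trade tables, their columns, and the helper functions are made up for illustration. They aren't taken from the TPC-E schema or from any vendor's benchmark kit.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one foreign key and one check constraint.

    The tables below are hypothetical and are NOT the TPC-E schema.
    """
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign key enforcement off unless it is switched on per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE customer (c_id INTEGER PRIMARY KEY)")
    conn.execute(
        """
        CREATE TABLE trade (
            t_id  INTEGER PRIMARY KEY,
            c_id  INTEGER NOT NULL REFERENCES customer(c_id),  -- referential integrity
            t_qty INTEGER NOT NULL CHECK (t_qty > 0)           -- check constraint
        )
        """
    )
    conn.execute("INSERT INTO customer (c_id) VALUES (1)")
    conn.commit()
    return conn


def try_insert_trade(conn: sqlite3.Connection, t_id: int, c_id: int, qty: int) -> bool:
    """Return True if the database accepts the row, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO trade (t_id, c_id, t_qty) VALUES (?, ?, ?)",
                (t_id, c_id, qty),
            )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = build_demo_db()
    print(try_insert_trade(db, 1, 1, 100))    # valid row
    print(try_insert_trade(db, 2, 999, 100))  # customer 999 does not exist
    print(try_insert_trade(db, 3, 1, 0))      # quantity fails the check
```

Every insert into the trade table pays for a lookup against the customer table and an evaluation of the check expression. A benchmark that skips both, as TPC-C does, avoids work that most production OLTP systems have to do.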

I hope the bullets I've listed above have whetted your appetite for learning how and why TPC-E might prove to be more useful and practical to customers than TPC-C. It will be interesting to see how Microsoft stacks up when other database vendors begin publishing their results. What would it mean if other database vendors didn't publish results? Trust me, Oracle and IBM will publish TPC-E results if the scores make their products look good in comparison to SQL Server. If IBM and Oracle don't publish TPC-E results, Microsoft's win in a race that no one else is running would look better and better.