2007-10-29
One size mostly fits all
Database pioneer Michael Stonebraker has published a series of papers arguing that the traditional relational database engine is obsolete. His claim is that, for every database workload important today, there is a specialized database architecture that can dramatically outperform (by an order of magnitude or more) established “one size fits all” databases.
The latest paper in this series is The End of an Architectural Era (It's Time for a Complete Rewrite) (Michael Stonebraker, Samuel Madden, Daniel J. Abadi, Stavros Harizopoulos, Nabil Hachem, Pat Helland). This paper concentrates on transaction processing workloads. It presents a research database called H-Store, which outperforms a traditional RDBMS by a factor of 82 on the TPC-C benchmark. H-Store has some features that may be familiar to some of my readers: It's an in-memory database, with the application logic in the same process as the database. Because the data is held in memory and the target is OLTP workloads, transactions are very short-lived, so H-Store confines transactions to a single thread, avoiding the overhead of locking or any other form of concurrency control.
(Some aspects of H-Store are less than elegant. The transaction classification scheme used to support clustered operation seems rather ad hoc. And the idea of handling conflicting transactions by requiring that “each execution site must wait a small period of time (meant to account for network delays) for transactions arriving from other initiators” (section 4.4) is positively scary.)
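To make the single-threaded idea concrete, here is a minimal Python sketch. It is an illustration only, not H-Store's design or code: the names SingleThreadedStore, execute and transfer are hypothetical, the "database" is a plain dict, and there is no rollback, durability or clustering. What it does show is that when every transaction runs to completion on one worker thread, the data itself needs no locks.

```python
import queue
import threading


class SingleThreadedStore:
    """Toy in-memory store (hypothetical): transactions run serially on one thread."""

    def __init__(self):
        self._data = {}                 # the whole "database" lives in memory
        self._inbox = queue.Queue()     # transactions waiting to run
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        # Only this thread ever touches self._data, so no locks are needed on it.
        while True:
            item = self._inbox.get()
            if item is None:            # shutdown signal
                break
            txn, result = item
            try:
                result["value"] = txn(self._data)
            except Exception as exc:    # report the failure to the caller
                result["error"] = exc
            finally:
                result["done"].set()

    def execute(self, txn):
        """Submit a transaction (a function of the data dict) and wait for it."""
        result = {"done": threading.Event()}
        self._inbox.put((txn, result))
        result["done"].wait()
        if "error" in result:
            raise result["error"]
        return result["value"]

    def close(self):
        self._inbox.put(None)
        self._worker.join()


def transfer(src, dst, amount):
    """Build a transaction that moves `amount` from one account to another."""
    def txn(data):
        # Validate before mutating: this sketch has no rollback.
        if data.get(src, 0) < amount:
            raise ValueError("insufficient funds")
        data[src] -= amount
        data[dst] = data.get(dst, 0) + amount
        return data[src], data[dst]
    return txn
```

Any number of client threads can call execute concurrently; the queue serializes them, which only works well because each transaction is short.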
Other workloads are discussed in an earlier paper: One Size Fits All? Part 2: Benchmarking Results (Michael Stonebraker, Chuck Bear, Uğur Çetintemel, Mitch Cherniack, Tingjian Ge, Nabil Hachem, Stavros Harizopoulos, John Lifter, Jennie Rogers, and Stan Zdonik). One particular workload discussed in that paper is data warehousing (i.e., reporting). For these applications, they propose column stores: database engines that store a table as multiple sequences of column values, rather than as a single sequence of row tuples. They demonstrate a performance advantage for a research database called C-Store, and its commercial descendant Vertica, in a telco call analysis application and a simplified variant of the TPC-H benchmark.
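The difference between the two layouts can be sketched in a few lines of Python. The call records, column names and helper functions below are made up for illustration and have nothing to do with how C-Store or Vertica actually lay out data on disk; the sketch only shows why a reporting query that aggregates one column touches less data in the columnar form.

```python
# Row layout: a single sequence of row tuples (caller, city, seconds).
# The sample call records are invented for this example.
rows = [
    ("alice", "NYC", 120),
    ("bob", "SF", 45),
    ("carol", "NYC", 300),
]


def to_columns(rows, names):
    """Transpose row tuples into one list per column name."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


# Column layout: one sequence of values per column.
columns = to_columns(rows, ["caller", "city", "seconds"])


def total_seconds_row_store(rows):
    # Must walk every full tuple, even though only one field is needed.
    return sum(row[2] for row in rows)


def total_seconds_column_store(columns):
    # Reads only the "seconds" column; the other columns are never touched.
    return sum(columns["seconds"])
```

Both functions return the same total; the point is how much data each has to read to get there. With wide tables and queries that use only a few columns, that gap grows accordingly.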
Despite these results, I doubt that there is an imminent threat to established “one size fits all” databases. The market is much more likely to stick with traditional databases and SQL whenever they provide adequate performance, complementing them with specialized database architectures only when there is a compelling need for the performance advantages that they can provide.