The End of an Architectural Era (It’s Time for a Complete Rewrite)

Page history last edited by Nathan T Suver 5 years, 5 months ago

Source:

STONEBRAKER, M., MADDEN, S., ABADI, D. J., HARIZOPOULOS, S., HACHEM, N., AND HELLAND, P. The end of an architectural era (it’s time for a complete rewrite). In International Conference on Very Large Data Bases (VLDB), 2007.

 

 

 

ABSTRACT

In previous papers [SC05, SBC+07], some of us predicted the end of  “one size fits all” as a commercial relational DBMS paradigm.  These papers presented reasons and experimental evidence that showed that the major RDBMS vendors can be outperformed by 1-2 orders of magnitude by specialized engines in the data warehouse, stream processing, text, and scientific database markets.   Assuming that specialized engines dominate these markets over time, the current relational DBMS code lines will be left with the business data processing (OLTP) market and hybrid markets where more than one kind of capability is required.  In this paper we show that current RDBMSs can be beaten by nearly two orders of magnitude in the OLTP market as well.  The experimental evidence comes from comparing a new OLTP prototype, H-Store, which we have built at M.I.T., to a popular RDBMS on the standard transactional benchmark, TPC-C. We conclude that the current RDBMS code lines, while attempting to be a “one size fits all” solution, in fact, excel at nothing.  Hence, they are 25 year old legacy code lines that should be retired in favor of a collection of “from scratch” specialized engines.  The DBMS vendors (and the research community) should start with a clean sheet of paper and design systems for tomorrow’s requirements, not continue to push code lines and architectures designed for yesterday’s needs.

 

 

Summary

This article argues aggressively for the complete replacement of traditional RDBMSs, on the grounds that specialized engines can dramatically outperform them.  The authors note that most modern database management systems derive (at least in part) from System R (https://en.wikipedia.org/wiki/IBM_System_R), which was designed for the business data processing market and for hardware profiles that have changed dramatically in the last 25 years.  They argue that RDBMS vendors have made incremental improvements over the last few decades but have never done a complete redesign, and that these legacy systems are simply no longer competitive.  To back up that claim, they present experiments in which their prototype, H-Store, beats a traditional RDBMS on TPC-C by nearly two orders of magnitude.

 

The article dates from 2007, about a year after Amazon launched AWS.  In general, its conclusions appear valid and have held up since its original publication, as evidenced by current trends in data management such as the rise of eventual consistency models and NoSQL.  It is difficult to argue that the older, monolithic relational model and the "one size fits all" approach of legacy RDBMSs are weathering well.

 

Solution Extensions:

The statement "it's time for a complete rewrite" is appealing on paper, but moving away from an entire data processing paradigm is not a cheap endeavor.  Simply stating that "long running commands should be broken into smaller transactions, running on independent nodes" glosses over the fact that doing so may require rewriting non-functional requirements, such as what to do when a transaction fails.  The analysis required for that kind of refactoring is significant, especially on a mature system.  The end goal of eventually leaving monolithic database systems behind may well be rational, but a detailed, phased approach that weighs the costs and risks of such a move would be hugely beneficial to a business unit owner.
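
To make the refactoring cost concrete, here is a minimal sketch of what "breaking a long running command into smaller transactions" can look like on a single node. It is not taken from the paper: the `accounts` and `progress` tables, the function names, and the use of SQLite are all assumptions chosen for illustration. The point is that the failure question ("what happens if batch 2 of 3 fails?") now needs an explicit answer, here a progress marker committed together with each batch.

```python
import sqlite3


def make_demo_db(n_accounts):
    """Build a hypothetical in-memory database with n_accounts rows."""
    conn = sqlite3.connect(":memory:")
    with conn:
        conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
        conn.execute("CREATE TABLE progress (last_id INTEGER)")
        conn.execute("INSERT INTO progress VALUES (0)")
        conn.executemany(
            "INSERT INTO accounts VALUES (?, ?)",
            [(i, 1000) for i in range(1, n_accounts + 1)],
        )
    return conn


def credit_in_batches(conn, amount, batch_size=100):
    """Credit every account, one small transaction per batch.

    Each batch updates the progress marker in the same transaction, so a
    failure rolls back only the current batch and a rerun resumes after
    the last committed one. Returns the number of batches committed.
    """
    batches = 0
    while True:
        with conn:  # commits on success, rolls back if an exception escapes
            last_id = conn.execute("SELECT last_id FROM progress").fetchone()[0]
            ids = [
                row[0]
                for row in conn.execute(
                    "SELECT id FROM accounts WHERE id > ? ORDER BY id LIMIT ?",
                    (last_id, batch_size),
                )
            ]
            if not ids:
                return batches  # nothing left to process
            conn.executemany(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                [(amount, account_id) for account_id in ids],
            )
            conn.execute("UPDATE progress SET last_id = ?", (ids[-1],))
        batches += 1
```

With 250 accounts and a batch size of 100, the single logical command becomes three independent transactions. Running it a second time does nothing, because the progress marker already points past the last account; that restart behaviour is exactly the kind of requirement a monolithic single-transaction version never had to spell out.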

 

Relevant Content:

In contrast, some members of the DBMS community proposed much nicer embedding of database capabilities in programming languages, typified in the 1970s by Pascal R [Sch80] and Rigel [RS79].  Both had clean integration with programming language facilities, such as control flow, local variables, etc.  Chris Date also proposed an extension to PL/1 with the same purpose [Dat76].  Obviously none of these languages ever caught on, and the data sublanguage camp prevailed.  The couplings between a programming language and a data sublanguage that our community has designed are ugly beyond belief and are low productivity systems that date from a different era.  Hence, we advocate scrapping sublanguages completely, in favor of much cleaner language embeddings.

In the programming language community, there has been an explosion of “little languages” such as Python, Perl, Ruby and PHP.  The idea is that one should use the best language available for any particular task at hand.  Also little languages are attractive because they are easier to learn than general purpose languages.  From afar, this phenomenon appears to be the death of “one size fits all” in the programming language world.

Little languages have two very desirable properties.  First, they are mostly open source, and can be altered by the community.  Second they are less daunting to modify than the current general purpose languages.  As such, we are advocates of modifying little languages to include clean embeddings of DBMS access.

Our current favorite example of this approach is Ruby-on-Rails.  This system is the little language, Ruby, extended with integrated support for database access and manipulation through the “model-view-controller” programming pattern.  Ruby-on-Rails compiles into standard JDBC, but hides all the complexity of that interface.
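
As a rough illustration of what a cleaner embedding might feel like from the application side, the sketch below wraps a table in an ordinary Python object so that calling code never writes a query string. This is not the paper's design and not how Ruby-on-Rails is implemented: the `Table` class, its `insert` and `where` methods, and the `customers` example are hypothetical, and SQLite stands in for the DBMS. The wrapper still generates SQL internally; it only shows the difference in what the programmer sees.

```python
import sqlite3


class Table:
    """Hypothetical wrapper exposing one table through plain Python calls.

    Table and column names are interpolated into SQL, so they are assumed
    to come from trusted code, never from user input.
    """

    def __init__(self, conn, name, columns):
        self.conn = conn
        self.name = name
        self.columns = list(columns)
        conn.execute(f"CREATE TABLE IF NOT EXISTS {name} ({', '.join(self.columns)})")

    def insert(self, **values):
        """Insert one row given as keyword arguments (column=value)."""
        cols = ", ".join(values)
        marks = ", ".join("?" for _ in values)
        with self.conn:  # one small transaction per insert
            self.conn.execute(
                f"INSERT INTO {self.name} ({cols}) VALUES ({marks})",
                tuple(values.values()),
            )

    def where(self, **conditions):
        """Return rows matching all column=value conditions as dicts."""
        clause = " AND ".join(f"{col} = ?" for col in conditions) or "1"
        rows = self.conn.execute(
            f"SELECT {', '.join(self.columns)} FROM {self.name} WHERE {clause}",
            tuple(conditions.values()),
        )
        return [dict(zip(self.columns, row)) for row in rows]


conn = sqlite3.connect(":memory:")
customers = Table(conn, "customers", ["name", "city"])
customers.insert(name="Ada", city="London")
customers.insert(name="Grace", city="New York")
in_london = customers.where(city="London")
```

Here `in_london` is an ordinary list of dictionaries, so the result can be used with normal control flow and local variables, which is the kind of integration the quoted passage asks for.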

 
