Facebook Needs ‘Major Rewrite’ Warns Database Guru
Facebook’s reliance on MySQL is unsustainable and requires a total rewrite, warns Michael Stonebraker
The rapid growth of Facebook has resulted in it operating an unsustainable database structure that needs a complete rewrite – according to database guru Michael Stonebraker.
Facebook is operating a huge, complex MySQL implementation that amounts to “a fate worse than death”, computer scientist Stonebraker said in an interview with Gigaom. Stonebraker was a founder of Ingres as well as Vertica (recently bought by HP), and has also been an adjunct professor at MIT and a pioneer of object-oriented databases.
Fate Worse Than Death
Stonebraker warned that the only way out is to “bite the bullet and rewrite everything”. Facebook has split its MySQL database into 4,000 shards in order to handle the social networking site’s massive data volume, he said, and is running 9,000 instances of memcached in order to keep up with the number of transactions the database must serve.
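To illustrate what sharding with a cache in front of it involves, here is a minimal sketch in Python. It is not Facebook’s implementation: the hash-based routing, the `ShardedStore` class and the plain dictionaries standing in for MySQL shards and memcached are all assumptions made for illustration. Only the figure of 4,000 shards comes from the article.

```python
import hashlib

NUM_SHARDS = 4000  # shard count quoted in the article; everything else is hypothetical


def shard_for(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index using a stable hash."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Toy model: one dict per database shard, one dict as the cache."""

    def __init__(self, num_shards: int = NUM_SHARDS):
        self.num_shards = num_shards
        self.shards = [dict() for _ in range(num_shards)]  # stand-in for MySQL shards
        self.cache = {}                                    # stand-in for memcached
        self.db_reads = 0                                  # counts reads that reach a shard

    def put(self, user_id: int, profile: dict) -> None:
        # Write to the owning shard, then drop any stale cached copy.
        self.shards[shard_for(user_id, self.num_shards)][user_id] = profile
        self.cache.pop(user_id, None)

    def get(self, user_id: int):
        # Cache-aside read: try the cache first, fall back to the shard.
        if user_id in self.cache:
            return self.cache[user_id]
        self.db_reads += 1
        profile = self.shards[shard_for(user_id, self.num_shards)].get(user_id)
        if profile is not None:
            self.cache[user_id] = profile
        return profile
```

In this sketch a repeated read of the same user is answered from the cache and never reaches a shard, which is the job memcached does in front of the database. Every piece of application code also has to know which shard owns which record, which helps explain why such a setup is hard to unwind later.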
Gigaom points to the oft-quoted statistic from 2008 that Facebook had 1,800 servers dedicated to MySQL and 805 servers dedicated to memcached. Facebook meanwhile maintains a MySQL Facebook page dedicated to updating readers on the progress of its extensive work to make the database scale along with the site.
For it all to work, the social networking giant of course requires extensive data centre resources.
In September last year Facebook had to defend its first dedicated data centre in Prineville, Oregon, against environmental campaign group Greenpeace, which argued that Facebook should be reducing its carbon footprint by moving from coal-fired electricity to renewable sources.
Transaction Workload
Meanwhile Stonebraker and co have warned that MySQL was not designed for “webscale applications or those that must handle excessive transaction volumes.”
Stonebraker is quoted as saying that the problem with MySQL and other SQL databases is that they “consume too many resources for overhead tasks (e.g., maintaining ACID compliance and handling multithreading) and relatively few on actually finding and serving data.”
ACID stands for Atomicity, Consistency, Isolation and Durability, the set of properties that guarantee database transactions are processed reliably and accurately. This is vital in sectors such as e-commerce, where transactions depend heavily on the accuracy of the data set.
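The atomicity part of ACID can be shown with a short Python sketch. It uses the standard library’s `sqlite3` module rather than MySQL, purely so that it runs without a server, and the `accounts` table and the `transfer`, `make_db` and `balances` helpers are hypothetical names invented for this example.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with two example accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: int) -> bool:
    """Move `amount` from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            row = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if row is None or row[0] < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
        return True
    except ValueError:
        return False


def balances(conn: sqlite3.Connection) -> dict:
    """Return a mapping of account id to balance."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))
```

A transfer that would overdraw the first account is rolled back in full, so the debit that had already been applied never becomes visible. The bookkeeping needed to make that guarantee is the kind of overhead Stonebraker is referring to.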
Stonebraker believes this is fine for small applications with small data sets, but when a company such as Facebook experiences such rapid growth, it “quickly becomes too much to handle as data and transaction volumes grow.”
Facebook exists to let its users interact, yet even a simple action such as clicking the Like button requires a transaction that Facebook’s MySQL database has to process.
Database Management
Gigaom quotes Stonebraker as saying that “old SQL (as he calls it) is good for nothing” and needs to be “sent to the home for retired software.” After all, he explained, SQL was created decades ago before the web, mobile devices and sensors forever changed how and how often databases are accessed.
The advantage of MySQL, however, is that it is open source and therefore free, and skilled personnel are relatively easy to hire. This is the reason, according to Stonebraker, why web startups often choose the product when they need to build a system in a hurry. The problem, he warns, is that a startup that experiences rapid growth, like Facebook, does not have the time to re-engineer the service from the database up.
Instead, he said, “they end up applying Band-Aid fixes that solve problems as they occur, but that never really fix the underlying problem of an inadequate data-management strategy.”
MySQL is of course now owned by Oracle following its acquisition of Sun Microsystems.
Neither Facebook nor Oracle had responded to eWEEK Europe UK at the time of writing.