Last week Oracle certified Amazon EC2 as a supported platform, that same week Larry Elison attacked the concept of cloud computing as pure hype. Obviously, Larry is not happy with this whole cloud thing, and I think it’s not just the threat it poses to the software industry’s traditional licensing model that worries him, rather, as Robert X. Cringely points out in his “Cloud computing will change the way we look at databases” post, it’s the likelihood that it sounds the death-knell for large-scale traditional databases.
This new database paradigm is memory rather than disk centric, with the disk-based element acting as an archive/backup/restore mechanism which can easily be stored on commodity SAN devices ( e.g. Amazon’s ESB). Using MapReduce technology Google effectively holds the whole Internet in memory, not in one big super computer but in lots of cheap commodity servers.
But it’s not just in the realm of mega datasets that RAM based databases threaten traditional models. Excel is a memory-based database engine, so too in-memory OLAP tools such as Palo. Such products’ ability to handle large volumes of data has increased over the years, with the decrease in RAM costs and the appearance of cheap 64 bit machines (which are no longer limited to 2G/3G process working sets).
That doesn’t mean that we’ll throw away SQL databases in their entirety, SQL and the relational model will continue to be useful. But perhaps of greater use in local datastores/caches that as the building blocks for large scale datastores. For such local caches, less will be more; fewer features, easier to configure, more flexibility. That’s why I like SQLite; long after the dinosaurs of the database world have disappeared, I imagine SQLite databases will continue to survive, embedded in mobile phones, browsers, wherever a local datastore is required. And more than likely operating in memory rather than off disk.
By combining Excel with an in-memory SQLite database, linked to a Palo OLAP in-memory server, it’s possible to take advantage of three powerful data-processing technologies (spreadsheets, SQL, multi-dimensional cubes) all within your PC’s RAM. You could do serious datasmithing with such a combination on a pretty mediocre laptop, with most modern machines providing an excess of CPU power, no need for super fast disks, just as much memory as you can muster. And, with Windows on EC2, these three amigos will soon be capable of being used as a cloud bursting platform.
Excel, SQLite and Palo, my data trinity.