Google Spreadsheets – ETL tool

Although I’m a total Excel fanboy, I most admit I rarely use it any longer for personal stuff such as home budgets, tax calculations, what-ifs, to-do lists etc.; I now tend to use Google Spreadsheets. Likewise, personal notes, drafts and useful bits of code are stored using Google Docs rather than MS Word. Three main reasons for this shift to the cloud:

  • Google Docs & Spreadsheets are ‘good enough’ for most of the trivial lists and calculations I require in my personal life and indeed for most business purposes as well, at least those that don’t require a pivot table.
  • These spreadsheets and documents are important but not necessarily in the ‘state secret/I-could-tell-but-then-I’d-have-to-kill-you’ scale of things, by building them in Google Apps they are securely backed-up and easily accessible.
  • A lot of the spreadsheets are collaborative in nature, and in the collaboration field, Google Spreadsheets just gets better and better.

Today, Google announced further additions to their spreadsheet product. The AutoFill feature adds functionality I’ve come to expect from Excel, but with a twist, integration with Google Sets. But the additions that really caught my eye were the new data import functions. Now again, Excel has had web queries since Excel97, and it always amazed me why online pretenders to the throne tended to ignore the most common source of tabular data on the web, the HTML table; something to do with the great XML/Tables divide I guess!

Google now not only fixes this omission,providing access to HTML tables and comma/tab separated file, but also provides access to RSS/ATOM and generic XML sources. All that’s missing now are functions that can read other common online data files formats such as Excel, MSAccess, XBase and of course SQLite.

This addition of HTML import support and the AutoFill feature will further reduce the number of times I’ll need to fire up Excel for personal tasks, but the RSS/ATOM/XML import feature also has potential as a tool in my micro-ETL toolbox. Using Excel as my only micro-ETL tool is possible when the data is either already in Excel/CSV or accessible via a COM API or via ODBC drivers, otherwise I can call-in either Ruby, Talend, Kettle or even RSSBus. But now I’ve another option, if the data is public and published as RSS/ATOM or some other variation on XML, I can use Google Spreadsheets to fetch the data and import the resulting tabular dataset into Excel via a Web Query or via the GData API.

New Google Reader Search facilityOne other thing. While researching this post, looking up links etc. I used another new feature Google added today, Google Reader’s new Search facility. As most of my references are discovered via the blogs I subscribe to, the ability to restrict searches to that subset of the web is fantastic; I even used it to search through my own blog posts! If del.icio.us offered the same option it would make re-finding stuff even easier. I did try to use Google Co-Op to build a search engine restricted to my del.icio.us links but it didn’t seem to like the volume of links (4000 odd) I sent it.

Advertisements

One response to “Google Spreadsheets – ETL tool

  1. Pingback: Google Reader - KM killer app « Gobán Saor