S3 as an Excel hub

In my Steam Powered PowerPivot post I explained the concept of powered-spoke based Excel systems. One of the hub options was Amazon’s Simple Storage Service, S3. I’ve been using S3 both as a simple backup service and as a “systems hub” since the service began in 2006. Back then, it was not necessarily the easiest to use or the most suitable, but it was cheap and reliable. The code I used to access S3 from Excel can be found here http://blog.gobansaor.com/2008/02/22/xlaws-excel-vba-code-for-accessing-amazons-s3-and-simpledb/

Over the years, lots of new features have been added to S3 that have made the service more useful and easier. The final big missing, for me at least, was the lack of multiple  authorisation credentials  for a single account i.e. the ability to set-up multiple users with their own access policies. Last year, with the launch of IAM (Identity & Access Management) Service, that last major problem was removed.

Although currently IAM is accessible only by the API that’s not a problem as my favourite tool for managing S3 (and AWS in general) is Python’s boto package, and it has that, and everything else you might wish to do S3 -wise, covered.

Although Boto does most of my heavy lifting (particularly on the server-side), I still have need for a VBA based S3 utility. Boto is fine for situations where I’m using microETL with Python enabled, but in many situations this is not ideal or even possible, particularly on the client-side of systems. Here, even a single file add-in may not be possible which often means a macro-enabled workbook is the only option.

MicroETL has a noSQL compile option which enables a lot of the code to operate without the SQLite libraries, thus allowing it to be used as a single-file VBA add-in or embedded in a workbook.

Up until now my S3 code depended on the xliteSha1Hmac.dll to provide SHA1-HMAC hashing (necessary to authorise access to S3) meaning that only pre-signed S3 URLS could be used in single-file deliverables. Even if this was not the case, prior to IAM, pre-signed URLs (usually with time limits) were often the only option as sharing an S3 account’s single authorisation credentials with all and sundry was not ideal from a security point of view. But post IAM, being able to set-up sub-accounts with specific access policies means the full power of S3 can now be used at client-level.

I’m in the process of adding a pure VBA HMAC-SHA1 facility (32 bit only at the moment, 64 bit continues to require the xliteSha1Hmac.dll). I’m also adding a JSON encode/decode module to allow tables to be more easily shared with non-Excel programs, currently I default to using tables encoded using ADO’s export XML schema to share via S3, but the JSON option will make it easier to share with a pure Python server (such as a PiClound based service, see my Expand Excel’s Horizons – look to the cloud post).

Cloud-shared tables can not only pass datasets back and forth but can pass SQLScripts (which are, after all, simply tables) for processing elsewhere.

For example, client workbooks not having Python  (or SQLite ) functionality installed could write a script, post it to S3; that script table could then be picked up by a “server” microETL workbook (or by a pure Python server), executed, and the results posted back to S3 to be picked up by the client (possibly waiting a response using a TIMER).

Server processes could also do the opposite; passing down SQLScripts for client-side execution.

When people think of utilising the cloud, this sort of scenario is not what comes to mind, thinking that Excel is a desktop product so it cannot take advantage of the enormous cost-advantages of cloud-based solutions. Not so, it’s now very easy to integrate the likes of S3 into existing processes, adding that cloud magic but keeping those existing works-so-don’t-fix-it processes firmly on the ground.

If you need help figuring out how to take advantage of S3 (or other AWS services) and how to integrate them with your existing technology base do contact me. I’ve been working with this technology for over 5 years (and have 30 odd years of experience with good-old reliable “ground-based systems”,

What’s with the bucket picture? S3 calls its primary data storage unit a bucket.

UPDATE:

IAM is now available from the AWS Management Console http://aws.typepad.com/aws/2011/05/identity-and-access-management-console-support.html


Advertisements

3 responses to “S3 as an Excel hub

  1. There is actually a few third party desktop apps that helps managing Amazon IAM service such as CloudBerry Explorer . Check out our blog post on how you can create IAM users, groups and policies. http://blog.cloudberrylab.com/2010/10/how-to-create-subaccounts-and-share.html

  2. Pingback: S3 as an Excel hub

  3. Pingback: PowerPivot mini-me server! | Gobán Saor