Steam Powered PowerPivot

In previous posts I described various options for enabling PowerPivot models (or indeed any Excel model) as the hub in a hub’n’spoke application. This thin’n’core approach is the norm for web apps as it was for the client server applications that preceded them.

But as we know, Excel “applications” rarely follow this pattern.

They tend to follow a more peer-to-peer topology, often with each player being an “equal”; collaboration being the primary aim rather than service provision.

But even in such scenarios there usually is a hub at the centre providing the communication linkages. Most likely the company’s email system or a “shared drive” of some sort. In fact, on many corporate teams or projects, you know you’ve been accepted into the fold when the project manager says the magic words “I’ll get you access to the shared drive”.

But sometimes, hub-like service provision or a single controller role is required and in such situations one of the spokes takes over that responsibility. For example, in a budgeting scenario one spoke may be the source of historical stats, in effect, performing a service role.

If classic client-server uses the powered-hub-driving-the-client-spokes like the driven wheels of a car as an analogy, then many Excel systems could be compared to a steam engine’s driven wheels; where the hub provides a centre to rotate about but it’s a particular spoke that does the driving due to its linkages to the piston arm.

Such systems built around the sending back and forth of workbooks or CSV files via email or shared drives are common and most of us have seen them in action. Most tend to be manual, in that they require user action on both client and server side, but with a bit of VBA magic one or both can be automated. Options for doing so on the server side are:

  • A VBA loop that sleeps and checks for change, process any changes and sleeps …
  • Trigger a microETL.SQLScript using a COM automation call from a external program like that used in this scenario.
  • Trigger a script using an RTD Server that checks for new activity.
  • Use a “Javascript” timer as per Excel Hero’s neat trick.

RTD servers are rarely encountered in the wild as they only appeared on the scene in Excel XP and were not the easiest things to code or understand but now new tools such as ExcelDNA makes coding RTDs much more approachable and useful.

Daniel Ferry‘s JSTimer idea is a simpler VBA alternative to an RTD and is non-blocking unlike a pure VBA sleep-check-sleep loop. It also has the advantage that no add-in is required so the client-side could be a single-file macro-enabled workbook.

Using a shared folder or email as the conduit for passing data back and forth and using Excel files or CSVs as the data artefacts may be the obvious choices but other “more modern” alternatives exist.

For example, for situations where a shared folder isn’t an option why not use something like Amazon’s S3? S3 is secure (much more secure than external email), reliable, simple to use and extremely cheap. A modern-day alternative to an FTP server. Ideal for big corporates,SMEs and sole-traders.

Orchestrating the processing of files on S3 is very simple using tools such as Python’s Boto (another good reason for having Python in-built in microETL). For the client-side, pure VBA could  be used, see my xlAWS code.

Likewise, using workbooks as the medium of data exchange is usually the first option, with CSV files a close second. But since Excel 2003 Professional, Excel has had excellent support for the exchange of XML files via XML Maps. Although not useful for highly complex schemas or very large datasets (both of which are anyway the work of the devil 😉 ), XML Maps are really good at handling relatively simple dialog and configuration inter-change requirements.

The process of importing/exporting data via XML can be easily automated using basic VBA but can also be safely left under manual control. XML files can also be generated and read by non-Excel components.

So when you’re confronted with a PowerPivot model that is not suitable for direct sharing, due to:

  • technical constraints e.g. non-2010 clients
  • or business reasons e.g. detail data behind model, or the model itself, needing to be kept secret
  • or model complexity reasons i.e. too complex for a single PowerPivot model …

… investing in a top-of-the-range SharePoint farm is not your only option. By utilising some of the enormous power that comes as standard with Excel such as VBA, XML Maps, RTDs etc. further empowered by an Excel-focused ETL tool such as microETL a lot can be achieved on a relatively small budget.

If you need help designing or configuring such a scenario or need help in understanding where PowerPivot can fit in to your decision making and information distribution requirements, contact me. My job in life is to provide than help …

Download latest version of microETL from http://www.gobansaor.com/microetl

Advertisements

13 responses to “Steam Powered PowerPivot

  1. Pingback: Steam Powered PowerPivot

  2. Do you think there would be a place for a distributed revision control system for data? I’d hate to develop software without git/bzr/hg/… and I don’t quite understand why there isn’t an equivalent for spreadsheet or database work. In my spare time, I’ve been working on applying fossil (a distributed revision control system created by the sqlite folks) to data sharing. But I suspect there must be a reason why this hasn’t been done before that I’m not understanding. Do you have any insight into this?

  3. Hi Paul,

    I’m guessing you’re the guy behind http://share.find.coop/ ?

    So is there a need for a fossil/git/etc-like distributed revision control for spreadsheets?

    The short answer is yes. I myself have looked at using fossil for same reason, that’s how I came across Coopy which has some very good stuff in it.

    The long answer is again yes but maybe the comparison with software artefacts is slightly off-course . This is because businesses may often look chaotic from the outside and the sending back and forth of spreadsheets may look like a recipe for disaster (as it sometimes is), but the reality is usually very different. All successful businesses are built around processes aka work-flows; the individuals involved in such work-flows tend to know there place and their role. So multiple actors working on the same problem and then later merging that work is usually a sign of a broken process and the last thing a business manager will think is “I need to automate this sort of behaviour” rather she will try to fix the broken process.

    Day to day business (which is what most business systems support) is very much water-fall “pass the parcel” in nature (non-linear “exceptions” are also very common, but that’s why businesses employ humans who can react on the fly i.e. make it up as they go along, this part of business is much harder to “system’ise”).

    So I guess what I’m saying is that DIFF management rather than MERGE management is probably the primary ‘need’ that something like Coopy could satisfy. And also possibly thinking “single service spoke” rather than “service hub” or “multiple service spokes” (as I’ve tried to describe above) may also lead to a better definition of the problem to be “solved”.

    I’m not 100% clear in my own mind on the best approach so don’t take above as gospel, just my gut instinct at the present moment.

    Tom

  4. That’s very helpful, thank you. Yes, I’m the Coopy guy. Since you mention DIFFs: I spent some time working with Joe Panico, author of diffkit (www.diffkit.org), trying to hammer out a good DIFF format for data (we got as far as this: http://share.find.coop/doc/tdiff_spec_draft.html). Do you have thoughts on how diffs should be expressed? In the database world, people seem to stick with SQL for diffs, which is fine in closed environments but less useful in more loose environments (which I’m interested in). For Excel users, would it even be useful to have a diff format, or are graphical visualizations key?

  5. Paul,

    If you can come up with a generic DIFF graphical visualisation for Excel that would be ideal, but nobody has yet, at least I’m not aware of one.

    But a structured document type format would still be very useful but it must be easily loaded into Excel without code-based (or at least not too much code) parsing. Again, to return to my post above, I’ve hinted I’m increasingly interested in XML Maps as a data inter-change data format for Excel apps. By XML maps I mean that type of XML (relatively simple master-detail(s) type schemas i.e. not anything like XBRL!!!) that the XML Map functionality, now widely available in Excel 2007/2010, supports. So perhaps a DIFF document capable of being handled by XML Maps might be worth pursuing.

    This whole DIFF business has been a constant interest to me as I’ve spent a large portion of my career doing system migrations, where initial loads for testing and pre-live need to be kept in sync via some sort of delta tracking process. The data structure I eventually settled on to aid me in such endeavours was a variation on a star-schema (that maybe due to the “I’ve a hammer so everything is a nail syndrome” so be warned 😉 ) .

    Each value (text or number) was dimensioned against its name (i.e. it’s column name) it’s key (i.e its “tuple”), various date roles (creation, modification, deletion), who did the change, and so on. This of course involves an enormous explosion in the meta-data to track such deltas but as usually deltas track master data rather than high volume transactions that’s normally not a problem.

    I’ve been meaning to add a few helper functions to microETL to facilitate such scenarios and will do so over the next few months. The idea being to be useful in “loose environments” (which I’m also interested in).

    Tom

  6. Thanks again, that’s very useful.

  7. Pingback: S3 as an Excel hub | Gobán Saor

  8. Pingback: PowerPivot mini-me server! | Gobán Saor

  9. Pingback: Those with a datasmith’s hammer, see every problem as a table. | Gobán Saor

  10. Pingback: Excel as a book of record. | Gobán Saor

  11. Pingback: When HAMMER met SWF | Gobán Saor

  12. Pingback: Excel as a Google Visualization API Data Source | Gobán Saor

  13. Pingback: PowerPivot for Excel as a XML/A Server | Gobán Saor