Python the new VBA ?

These last two weeks, Python has been on my mind. First off, last week I decided to make time to fully investigate Picalo, an open-source Python-based data analysis tool, and then, this week, Google announced their long awaited cloud-computing offering, Google Apps Engine, with the language at its core.

Python was the first of the “LAMP generation” scripting languages that I decided to learn in any detail ( I had used Perl before that but only on a per-task basis (similar to how I’d used AWK)). I then invested time in learning PHP, then Ruby and finally JavaScript. And here I am, back where I started, with Python.

But it’s not the same Python I learned three years ago, not that it has changed that much, but my appreciation of the language has, largely due to my deep dives into other languages. For example, JavaScript’s treatment of functions as first-class objects, highlighted the same functionality in Python, something I’d missed (or rather, not fully understood) the first time I encountered the language. Likewise, Ruby’s RoR introduced me to a “best of breed” approach to web application design, something that can be used as a comparison aid when approaching new web frameworks such as Django.

But of course the scripting language that continues to power most of my datasmithing activities is Excel VBA. That’s why I was so excited to see a tool such as Proto utilise VBA as its scripting language. But, Microsoft has abandoned VBA, there will be no more Protos.

Also, Excel VBA is now a Windows only language. Windows, however, is no longer the ‘only’ business client OS (see how many Apple laptops you can spot the next time you’re in a business-class airport lounge, a few years ago it would have been zero, not any more), and is currently nowhere to be seen as a cloud computing platform (but that’ll change).

I’m at heart a table-oriented programmer, and I, like Picalo’s author Conan Albrecht, believe “data analysis is best done through scripting”; but not just data analysis, the T in ETL (Extract, Transform and Load) and the I in DI (Data Integration) and SI (Systems Interfacing) also benefit from a scripting approach.

So, what to adopt as a successor/companion-in-her-old-age to VBA, will it be Ruby, JavaScript, Python, Perl, even PHP?

It looks like it’ll be Python because it’s …

The runner up is of course Ruby, but its poor integration with Windows is a major problem and the datasmithing “prior art” of Picalo and Resolver makes Python hard to beat.

UPDATE Jan 2010:

To experience the best of both worlds, VBA & Python, my xLite (Excel combined with SQLite) datasmithing platform now allows Python to be used in conjunction with VBA.  Check it out here

 

UPDATE: July 2011:

For another method of integrating Python (this time .NET’s IronPython) with Excel/VBA see http://blog.gobansaor.com/2011/07/18/vba-multithreading-net-integration-via-hammer/
UPDATE:

Also, as Dan pointed out in the comments below, I’d not included Jython in my list of reasons for embracing Python. I must add it to my list of things to try out particularly as both my “classic” ETL tools, Talend and Kettle are JVM based.

Another thing to add to the (ever growing) list is Mike Pitarro’s SnapLogic python-based ETL tool. They have …

…just released a 2.0 Beta version with some major architectural enhancements. The SnapLogic model is very different from traditional ETL systems. It takes an approach that’s more like the web, based on loose coupling and HTTP interactions. We model data source, sinks, and transformations as URI addressable endpoints, and have a model where than can be chained together in pipelines to build transformation logic. We use a plugin architecture to make it easy to add custom components.

Advertisements

11 responses to “Python the new VBA ?

  1. Don’t forget Jython, you may not work on the JVM now but you might down the road. Jython is moving quickly again toward 2.5 (or will it be 2.6?)

  2. Dan,

    You’re right, and as I do a lot of work with the JVM via Talend and Pentaho PDI and am actively looking at WaveMaker I must look into Jython.

    Tom

  3. Not to mention that SPSS also links to Python as its scripting language of choice.

    See http://www.spss.com/devcentral/ for more details.

  4. Pingback: Python to replace VB6 … « Gobán Saor

  5. Joseph B Cohen

    Are we back to ‘my python’ can beat ‘your perl’ ? ANYTHING you can do in python you have been able to do in perl for a ‘long time now’. And this above ‘It’s powerful, fun to use, easy to learn, function orientated and I even like the use of human-eye-friendly ????? indentation rather than braces and delimiters as a code grouping mechanism.’ would get you fired if you worked for me.
    I’ll tell you what I think – working in the computer field since 1968 – it’s called perspective : python is a perl ripoff ! Change some syntax, steal the C libraries and wow ! a new language.
    Quote from King Solomon – Ecclesiastes(spelling?) “there is nothing new under the sun”. Works as well today, in the computer field, as in did two thousand years ago .

  6. Joseph,

    Calm down, it’s only a programming language! I expect these sort of rants from the “young folks”; there are many things to get excited about in life, but not programming language wars.

    Firing people , i.e. affecting people’s “real lives”, would I hope be based on something more substantial than expressing an opinion on something as trivial as a programming tool!

    I’m a relative new comer to the business, 1978, but I’ve worked with a lot of tools, most average, some excellent, some atrocious, but as long as I got paid, it made very little difference to me what the tool was. Perl by the way would be up there in the excellent category, as would PL/SQL, MUMPS, SAP Config, ESSBASE, Excel VBA.

    As I said, when others who pay me pick the tool I just get on with it, but when I get to pick my own tools I tend to choose based on (a) fit for purpose (b) cost effectiveness (being self employed) and (c) my own totally subjective “would I like to spend my time working with this” attitude. Being “un-fair” to Perl by choosing Python doesn’t come into it.

    To paraphrase, William Munny as he pointed the shotgun at Little Bill Daggett in the closing scenes of Unforgiven, “Fair’s got nothin’ to do with it”. But I guess that’s what you say as you fire another employee for daring to express an opinion contrary to your own.

    Tom

  7. It’s nice to read an article from someone that is not a programming language fanatic and see that python can solve his problems in an easy way.
    Python is not ‘the language’, it is not the solution to all problems. But makes our day’s work more easy!
    just a comment: python is not like vb scripts. Macros are always runs as scripts. Python makes the first time you run the script, a compiled program. Always the second run is faster than first. But if you change something in the code, it recompiled and you never know of that.
    Regards,

  8. @matigro

    Agreed, I like Python simply because I can get things done with it. The same reason I like VBA, C, SQL etc. , fit for purpose, minimum ceremony, that’s all I ask.

    Tom

  9. Pingback: TAG Cubes – SQLite Star Query Part III « Gobán Saor

  10. Great post. BTW, Picalo will soon have a state of the art table engine (via PyTables), so the in-memory list-based table limitation will be gone. It’s C-based and automatically does disk caching. I’m excited to get it finished.

    • Thanks Conan,

      I saw from the mailing-list that you’re working on adding PyTables support to the already excellent Picalo, looking forward to trying it out.

      Another addition you might consider at some stage is tightly integrating Picalo tables with SQLite (perhaps using APSW http://bit.ly/2Dq2lm as it allows for easy coding of user-defined functions and SQLite virtual tables).

      I’ve combined Excel, SQlite and Python to create my own “datasmithing” tool xLite http://bit.ly/3a4uz7 and find SQLite fast and capable of handling very large volumes of data.

      I don’t look on SQLite as a “miniature database”, more as highly efficient memory-based SQL-speaking table engine with inbuilt on-dish caching facility http://bit.ly/dbysQ, a bit like a SQL PyTables I guess.

      Tom