Dublin Bus and PALO ETL – the connection!

Dublin buses, as is the norm with most road-based public transport systems in our increasingly car-choked cities, tend to operate on the basis of “no sign of a bus for ages, then two or three arrive at the same time”. Palo MOLAP ETL options appear to be following the same pattern; we’ve been waiting for ETL support for ages and now we see three of them heading down the road towards us. There’s Palo’s own offering, then came Stratebi’s Kettle Plugin, and now Talend Version 2.3.0RC2 is offering a Palo output component.

Mind you, the Talend offering is very basic and I’ve not managed to get the Stratebi plugin to work, leaving Palo’s ETL Server as the front runner at the moment (drill-through capability is a winner in my book).

I’ve also been busy refactoring my VBA SQLite and Amazon S3 code with the intention of publishing it as an Excel-based micro-ETL platform. While cleaning up the Amazon AWS modules I’ve been playing with SimpleDB, and I’m impressed: Excel combined with SimpleDB rocks!

I’ve also wrapped the open source XySSL SHA1 HMAC C code in a VBA-friendly DLL, as my search for a VBA HMAC-SHA1 hash implementation (essential for Amazon AWS access) proved fruitless.
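
For readers who haven't met HMAC-SHA1 before, here is a minimal sketch of the kind of hash the DLL has to produce. It is written in Python using only the standard library, not in VBA or C, and it is not the code from my DLL. The function names are made up for illustration, and the assumption that the service expects the digest base64-encoded should be checked against the documentation for whichever AWS API you are calling.

```python
import base64
import hashlib
import hmac


def hmac_sha1_hex(secret_key: str, message: str) -> str:
    """Return the HMAC-SHA1 of message, keyed with secret_key, as a hex string."""
    mac = hmac.new(secret_key.encode("utf-8"),
                   message.encode("utf-8"),
                   hashlib.sha1)
    return mac.hexdigest()


def hmac_sha1_base64(secret_key: str, message: str) -> str:
    """Return the same 20-byte digest, base64-encoded (assumed signature format)."""
    mac = hmac.new(secret_key.encode("utf-8"),
                   message.encode("utf-8"),
                   hashlib.sha1)
    return base64.b64encode(mac.digest()).decode("ascii")


if __name__ == "__main__":
    # Key and message taken from the RFC 2202 HMAC-SHA1 test cases (test case 2).
    print(hmac_sha1_hex("Jefe", "what do ya want for nothing?"))
    print(hmac_sha1_base64("Jefe", "what do ya want for nothing?"))
```

Any wrapper, whatever the language, can be sanity-checked against the published RFC 2202 test vectors before it is pointed at a real service.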

I hope to release the lot by the end of next month.


Thanks to Javier and Jorge from Stratebi I’ve managed to get the new Kettle Palo plugin to work. It seems that the TEST facility in the Kettle database connection dialogue throws an exception for Palo connections but the connections work fine in the actual Palo input/output steps. Did a quick test and it looks very easy to use and fits in well with the Kettle “way of doing things”.


5 responses to “Dublin Bus and PALO ETL – the connection!”

  1. Hi goban,

    Hope we helped :). Thanks for trying the plugin; please contact us if you discover anything new.


  2. Hi goban,

    As the developer of the Talend Palo plugin, I’d like to know what you think is missing.
    To give you an idea of who you are dealing with, I am the guy who developed the Pentaho JScript_mod.
    I’m sorry that you didn’t receive any feedback on your question in our forum. Maybe times change, and you are always welcome to post any questions.


  3. Hi Martin,

    When I’m working with Talend the thing I miss most from Kettle is JavaScript support, so thanks for JScript_mod; maybe you could add something similar to Talend 😉

    As for your Palo Plugin, it’s OK in that it handles the value (aka measure) loading side of things. But, the real “pain-point” for MOLAP ETL is the management of dimensions, particularly the hierarchical levels; this is essential in any ETL tool that attempts to deal with MOLAP cubes.

    Also, as Talend is both a DI and ETL tool, the ability to read both dimensions and measures would be very useful.

    As for the forums, they (along with the Talend product itself) have improved greatly. I’m using Talend (Java) for a current project and am greatly enjoying the experience. Keep up the good work.


  4. hi goban,

    I have now tried the Cubeware Importer, the Kettle plugin and Talend OS to create dimension structures. Cubeware (freeware) allows a maximum of only 50,000 rows per import, Talend is only useful for cell-filling, and Kettle is the only one left that works properly. I absolutely agree with your requirements for ETL tools!

    I have a small problem. The data fields I transfer are typed as integer and string. When I create a dimension with Kettle from string fields, the dimension elements are created as text elements (integer fields become numeric elements, as I want), and Palo does not calculate with text elements.
    Maybe you have a solution?

    nice blog

  5. Markus,

    Cubeware is a no-no, not just because of its row limitations but also because its free licences are valid only until next December. After that, who knows?

    As for the Kettle problem, I’ve not done any “real work” with the Palo plugin, just taken it for a “spin around the block”. But I’ve found the folks at http://www.stratebi.com/ very helpful; if you contact them (or the excellent Pentaho PDI forum) they should be able to help you.

    If you’re doing a lot of PALO work you should take a look at Jedox’s new ETL-Server product ( http://blog.gobansaor.com/2008/05/26/palo-olap-and-sparse-dimensions/ ), it’s very good.