Tag Archives: PDI 3.0

Open Source Metrics and Benchmarks

Marc Russel’s blog links to a Manapps ETL benchmark report comparing the performance of several leading ETL tools, both proprietary (DataStage and Informatica) and open source (Talend and PDI, aka Kettle). As would be expected, each tool has its own strengths and weaknesses, but one thing stands out: the venerable Kettle ETL, aka PDI 3.0, is now a serious contender for handling very large datasets. Oops, that’s what I get for wishing for a result and (mis-)reading the report early in the morning with a cold and a bad sore throat. Sadly, PDI is still very much slower than its OS cousin Talend. In fact, Talend continues to play on the strength that comes from a code-generated solution, i.e. raw speed. As a pure ETL play, Talend is well capable of playing on the same pitch as the “big kids”.

Interestingly, the report is also “open source” as it’s released under a Creative Commons License, so I can link to it here.


There’s now a new version of the report available (www.manapps.com, Topic Benchmark); it seems the original was just a work in progress and was not meant for public release. The main difference appears to be a significant improvement in Informatica’s ‘score’, but I’m not sure, as I was really only interested in comparing the two OSS products, Talend and Pentaho PDI. In that ‘battle’, Pentaho still comes out ‘slower’.

The original Marc Russel blog entry and a subsequent one announcing the updated report both appear to have been removed.

Also, I was informed of the ‘updated’ report via this email from Manapps, which assures vendors that Manapps is happy to rerun any tests and provide any information re the running of such tests …

Dear Sir,

You referred on your web site to the report called “Benchmark ETL” by Manapps, from November 2008. This draft report was not intended to be publicly released since just a working document.
We would like you (i) publish Asap the modified version (or its related link) that supersedes the former one (on our web site (www.manapps.com, Topic Benchmark), (ii) state that Manapps had no intend to release the former report and accordingly takes no responsibility on its content, (iii) state that Manapps holds all necessary elements at the disposal of all vendors so that they can rerun some tests if wished that will then be published.

Philippe THOMAS

Time: Thursday March 5, 2009 at 5:10 pm


Another analysis of OSS in the wild, this time from Chris Keene, WaveMaker CEO, on OSS as a marketing tool. Bottom line: 1% conversion rate, 700 paying customers in 9 months …

WaveMaker OSS as a marketing tool


Why not join me on Twitter at gobansaor?