Last week I tried out both the latest Palo 2.5 release and its sister product, ETL-Server. Although I’ve not done any proper benchmarks, 2.5 does appear to be faster than the previous release and the Excel add-in also behaves better when co-habiting with other add-ins and macros (the previous release’s use of, and response to, application level events meant it often caused the entire Excel session to grind to a halt when other macros were processing data).
There is however, a major memory leak problem when using cubes with sparse dimensions (such as the Biker database Invoice dimension), at one stage my Excel session had a working set size of 750M!
The failure of MOLAP cubes to effectively handle sparse datasets has always been something that ROLAP star schema advocates (myself included) have pointed to as a major short-coming. Over time, products such as Essbase have managed to get around this limitation (but often only with careful up front cube design by skilled staff). Palo hasn’t quite made it to that level yet. But then, maybe it shouldn’t, perhaps simplicity of setup and use should be Palo’s goal, not the ability to handle a telco’s ‘grain-at the-level-of-call’ fact table.
Two of the new features in Palo 2.5 that I’d been looking forward to, Zero Suppression and Drill-through are both relevant to the handling of cube sparsity.
- Drill-through; to reduce the sparseness of a dimension, by moving the ‘grain’ up a level in the consolidation hierarchy, e.g. have a base level of Invoice Type instead of Invoice Number or indeed, dropping the dimension entirely.
- Zero Suppress; to filter out excessive elements, e.g. only show Invoice Numbers that were raised this month rather than showing every invoice number ever raised depending on a non-zero value to indicate that it belongs to this month.
It was my testing of Zero Suppression using the Biker database that appears to have caused Excel to go on a memory binge. It doesn’t worry me that much as the right design can reduce the need for zero suppression e.g. in a customer dimension, ‘hide’ customers who trade with you infrequently under a “Others” consolidation. Also, the use of drill-through would eliminate the need for many sparse dimensions, but, alas, the drill-through functionality is severely nobbled in Palo 2.5.
Firstly, the functionality is only available in Excel if you purchase the €8,000 Palo Supervision Server which is all fine and dandy if you have a need for all the good things that Supervision Server offers. Having said that, I did managed to bypass this requirement by calling the ETL-Server’s SOAP API directly from Excel with the help of Simon Fell’s superb PocketSoap library, so all is not lost on that front. But …
… the other reason I’m not overly impressed with the feature, is its implementation; particularly for something Jedox is asking you to pay €8000 for! Here’s what you get.
Within the ETL-Server’s cube ‘export’ you can specify “<info>” fields such as invoice number, that will not form part of the final cube coordinate list. These coordinates and the info fields along with ‘value’ field are then written to a Derby (aka JavaDB) database table, where the schema is the ETL project name and the table name matches that of the cube. So you end up with, in essence, a ‘fact table’ at a finer ‘grain’ than the cube and the role of degenerate dimensions being provided by the “info” fields. The resulting table is not indexed, so large datasets will be a problem and for it to work, a cube can only be ‘owned’ by a single ETL project.
The other thing to note is that drill-thru only works for ‘N’ coordinates i.e. no consolidation elements. This is unlike Excel’s Pivot table which allows you to double-click at any cell, to reveal the underlying dataset.
On the plus side, the new ETL-Server is actually very good and well put together. Better off not thinking of it as a generic ETL tool but as a specialist Palo loader, like Oracle’s SQL*Loader. And like SQL*Loader’s role on many Oracle projects, for a large percentage of Palo projects ETL-Server will be the only ETL tool needed.