The latest version of the technology stack behind PowerPivot (“Denali” CTP3) is now available for download. This expands the technology significantly, with the VertiPaq engine taking on an equal role within SQL Server AS alongside the traditional OLAP engine (and it’s clear the intention is for this new “tabular” BI approach to eventually replace its venerable OLAP cousin).
Of course, tabular BI (aka reporting) is not new, it was and still remains the primary reporting framework for most. Pure OLAP has always been a minority sport (admittedly a well funded and very successful one). Relational datastores hold most of the world’s business data, so naturally those seeking to report and analyse this data tend to opt for relational tools. Most such tools are SQL driven, whether hand-written or generated, the 40 year old query language is still master of its own house.
Microsoft’s MDX language attempted to create an alternative to SQL for BI reporting. From a purely technical point of view it has succeeded, MDX can model every conceivable reporting requirement but its ability to replace SQL was always going to be hampered by:
- The need to build an intermediate non-relational model to operate on i.e. OLAP hypercubes. This is not to say that OLAP models are not a useful way of modelling a business problem, for certain requirements, top-down budgeting, for example, they’re ideal; but for the other 90% of reporting requirements, a relational model works as well if not better and has the advantage of reducing the “impedance mismatch” between the operational and reporting models.
- MDX is a very difficult language to master, at least compared to SQL. It surpasses SQL in it ability to frame a complex query, but if expecting “civilian datasmiths” to learn SQL was perhaps a realisable dream, expecting end-user MDX mastery was always a pipe-dream. MDX is, and will remain, the preserve of professional programmers (or auto-generated by the query tools built by such professionals).
Ah, but could not the same be said of relational based reporting efforts; star-schemas are logically different from OLTP models, SQL also remains largely the preserve of professionals? Yes, the necessity to create a separate model, adds a level of friction to the process, but the reasoning behind creating star-schemas is to simplify the complex SQL that would otherwise be required to access complex OLTP schemas (and most schema are complex these days as bespoke data models have been replaced with models that support highly generic packaged solutions).
Pure star-schemas (i.e central unified fact table surrounded by a single layer of conformed dimensions) also made it easier for RDBMs to add suitable indexing structures and optimisers to enable fast query responses. In-memory columnar stores such as VertiPaq negate some of the performance reasoning for employing pure stars (the primary simplification reason remains) enabling a return to more “natural” snow-flaked schemas and dis-jointed fact tables (e.g. order-header & order-line rather than a denormalised order-line-fact).
What about SQL then?Although the Q is SQL stands for Query, SQL failed to capture the heart and minds of those who undoubtedly could have benefited from its query power due to the lack of easy to declare powerful query expressions. This may seem a strange thing to say, as anybody who has fully mastered SQL and understands OLTP design principles will tell you that nearly any query is possible using pure SQL. And is SQL not easy to learn? Yes it is, and for say ETL tasks the level of SQL mastery required is not the great, but once you stray into the world of cross-table reporting, time windows, complex aggregations, most civilians will abandon the effort and revert back to using sub-optimal spreadsheet techniques, often patching together solutions that are both costly and fragile.
This is where I think DAX will shine. The language has the potential to pick up where SQL faltered and MDX overshot, a query/reporting language that will empower both professionals and end-users. Easier to learn that SQL for low-hanging fruit, but with the power of MDX (but within easier reach) for those complex, but high-return, queries.
But will civilians invest the time to master DAX or will it become yet another SQL or MDX, largely confined to IT? I think it has a fighting chance mainly due to its close relationship with Excel. SQL always felt like a subject best learned in a university lecture hall, while DAX 101 could easily become a night-class at your local adult-education centre this autumn.
I haven’t had a chance to fully investigate the new features of PowerPivot in Excel yet, but one new feature that’s going to help PowerPivot become the default way data analysts build pivot tables (and hence drive the adoption of DAX) is the ability to show details (aka drill-thru). Another, is the ability to sort columns using another column (e..g. show Month Name but sort by Month Number). Both of these may seem trivial compared to the other deep-technical changes such as the ability to natively host PowerPivot models in SQL Server (i.e. no need for SharePoint). But these two features where major missings for those who depend on “normal” pivot tables for dashboarding and discovery purposes.
Other new features that caught my attention where:
- New relational functions: CROSSJOIN, GENERATE (Join), GENERATEALL ( Left Outer Join) and SUMMARIZE (Group by).
- Ability to generate hierarchies, and functions to navigate such hierarchies.
- Ability to handle multiple relationships between tables (via functions that can define a new relationship at calculation time). This will allow for “roles” such as using billing date for one measure and order date for another.
- Removal of restriction on fact table date dimension foreign keys being of type Date. Will make direct import of many data warehouse star schemas much easier (most traditional DW star schemas use integer surrogate foreign keys).
- New “programming” helpers such as SWITCH and HASONEVALUE, will make construction and interpretation of complex measures easier .
So what do you think,? DAX, new dawn or old wine in new bottle?