DAX the new SQL?

The latest version of the technology stack behind PowerPivot (“Denali” CTP3) is now available for download. This expands the technology significantly, with the VertiPaq engine taking on an equal role within SQL Server AS alongside the traditional OLAP engine (and it’s clear the intention is for this new “tabular” BI approach to eventually replace its venerable OLAP cousin).

Of course, tabular BI (aka reporting) is not new, it was and still remains the primary reporting framework for most. Pure OLAP has always been a minority sport (admittedly a well funded and very successful one). Relational datastores hold most of the world’s business data, so naturally those seeking  to report and analyse this data tend to opt for relational tools. Most such tools are SQL driven, whether  hand-written or generated, the 40 year old query language is still master of its own house.

Microsoft’s MDX language attempted to create an alternative to SQL for BI reporting. From a purely technical point of view it has succeeded, MDX can model every conceivable reporting requirement but its ability to replace SQL was always going to be hampered by:

  • The need to build an intermediate non-relational model to operate on i.e. OLAP hypercubes. This is not to say that OLAP models are not a useful way of modelling a business problem, for certain requirements, top-down budgeting, for example, they’re ideal; but for the other 90% of reporting requirements, a relational model works as well if not better and has the advantage of reducing the “impedance mismatch” between the operational and reporting models.
  • MDX is a very difficult language to master, at least compared to SQL. It surpasses SQL in it ability to frame a complex query, but if expecting “civilian datasmiths” to learn SQL was perhaps a realisable dream, expecting end-user MDX mastery was always a pipe-dream. MDX is, and will remain, the preserve of professional programmers (or auto-generated by the query tools built by such professionals).

Ah, but could not the same be said of relational based reporting efforts; star-schemas  are logically different from OLTP models, SQL also remains largely the preserve of professionals? Yes, the necessity to create a separate model, adds a level of friction to the process, but the reasoning behind creating star-schemas is to simplify the complex SQL that would otherwise be required to access complex OLTP schemas (and most schema are complex these days as bespoke data models have been replaced with models that support highly generic packaged solutions).

Pure star-schemas (i.e central unified fact table surrounded by a single layer of conformed dimensions) also made it easier for RDBMs to add suitable indexing structures and optimisers to enable fast query responses. In-memory columnar stores such as VertiPaq negate some of the performance reasoning for employing pure stars (the primary simplification reason remains) enabling a return to more “natural” snow-flaked schemas and dis-jointed fact tables (e.g. order-header & order-line rather than a denormalised order-line-fact).

What about SQL then?Although the Q is SQL stands for Query, SQL failed to capture the heart and minds of those who undoubtedly could have benefited from its query power due to the lack of easy to declare powerful query expressions. This may seem a strange thing to say, as anybody who has fully mastered SQL and understands OLTP design principles will tell you that nearly any query is possible using pure SQL. And is SQL not easy to learn?  Yes it is, and for say ETL tasks the level of SQL mastery required is not the great, but once you stray into the world of cross-table reporting, time windows, complex aggregations, most civilians will abandon the effort and revert back to using sub-optimal spreadsheet techniques, often patching  together solutions that are both costly and fragile.

This is where I think DAX will shine. The language has the potential to pick up where SQL faltered and MDX overshot, a query/reporting language that will empower both professionals and end-users. Easier to learn that SQL for low-hanging fruit, but with the power of MDX (but within easier reach) for those complex, but high-return, queries.

But will civilians invest the time to master DAX or will it become yet another SQL or MDX, largely confined to IT? I think it has a fighting chance mainly due to its close relationship with Excel. SQL always felt like a subject best learned in a university lecture hall, while DAX 101 could easily become a night-class at your local adult-education centre this autumn.

I haven’t had a chance to fully investigate the new features of PowerPivot in Excel yet, but one new feature that’s going to help PowerPivot become the default way data analysts build pivot tables (and hence drive the adoption of DAX) is the ability to show details (aka drill-thru). Another, is the ability to sort columns using another column (e..g. show Month Name but sort by Month Number). Both of these may seem trivial compared to the other deep-technical changes such as the ability to natively host PowerPivot models in SQL Server (i.e. no need for SharePoint). But these two features where major missings for those who depend on “normal” pivot tables for dashboarding and discovery purposes.

Other new features that caught my attention where:

  • New relational functions: CROSSJOIN, GENERATE (Join), GENERATEALL ( Left Outer Join) and SUMMARIZE (Group by).
  • Ability to generate hierarchies, and functions to navigate such hierarchies.
  • Ability to handle multiple relationships between tables (via functions that can define a new relationship at calculation time). This will allow for “roles” such as using billing date for one measure and order date for another.
  • Removal of restriction on fact table date dimension foreign keys being of type Date. Will make direct import of many data warehouse star schemas much easier (most traditional DW star schemas use integer surrogate foreign keys).
  • New “programming” helpers such as SWITCH and HASONEVALUE, will make construction and interpretation of complex measures easier .

So what do you think,? DAX, new dawn or old wine in new bottle?

Advertisements

7 responses to “DAX the new SQL?

  1. Still a steep learning curve, even for Excel experts. The biggest stumbling block currently is the job market. Even though Microsoft touts PowerPivot as BI for the (Excel) masses, there are 0 jobs listed on indeed.com for Excel-centric jobs with PowerPivot expertise. Employers don’t get it, so why should those people invest their valuable time learning DAX when there is no financial incentive to do so? For BI experts, it’s SQL and MDX and now DAX.
    Most are probably still busy mastering MDX. Microsoft’s technical direction is great, but they are poor salesmen, and it is obvious that DAX and who will using it must be sold to the business community.

  2. Yes David, you’re right. Success is not guaranteed, and it is a steep learning curve, and not just for Excel jocks but also for IT folks.

    There’s no “easy” way to BI mastery, but as you rightly point out, people will only invest in something if there’s a good chance of a return on that investment. If you’re IT and specialise in MS’s BI stack then the need to start learning DAX is obvious, the same cannot be said for those who’s primary focus is Excel.

    That’s partial MS’s “fault” for wrapping the PowerPivot product completely in the Enterprise SharePoint and (now with Denali) Enterprise SQL Server flags. If you took a cursory look at the information out there on PowerPivot (both MS and external) your average Joe Excel would pass right by, thinking “this is not for the likes of me”. But then again the “fault” also lies with the the nature of Excel usage in the marketplace. Most Excel users can’t (or won’t) use a basic pivot table never mind a PowerPivot powered one. Excel is a broad church and trying to market to it or even to part so it, is very difficult.

    It’s the same problem faced by anybody trying to market to the SME (SMB) sector. But MS has proved to be very successful selling to SMEs, not by big budget centrally managed marketing, but by using channel partners. It is those partners (both official and one-man-and-his-dog local IT shops) that MS needs to get this message out to. It is in this SME sector, supported and advised by their trusted (hopefully) local suppliers that PowerPivot has the potential to be a game changer. For these local suppliers, Excel based PowerPivot offers the potential to offer a new service, call it BI for the masses, cheap but powerful (initial capital entry cost, a fast modern PC plus a single Excel 2010 licence) but with the potential to grow as the company or the company’s appetite for BI, grows.

    Exciting times.

  3. Pingback: PowerPivot Show Details | Gobán Saor

  4. Exciting times indeed!

    David, it is incumbant for the entrenched high-end Excel jocks to communicate to the biz what PP is all about.

    Come on, how hard is it?
    Step 1: Get boatload of SAP
    Step 2: Load into PP
    Step 3: Go nuts

  5. Pingback: DAX the new noSQL? | Gobán Saor

  6. Pingback: Automating PowerPivot Refresh operation from VBA | Gobán Saor

  7. Pingback: Building an Excel 2013 Percentile dashboard without PowerPivot or a PivotTable | Gobán Saor