DAX the new noSQL?

In my previous post DAX the new SQL I argued that perhaps DAX would replace SQL as the database query language, in particular for those I term “civilian datasmiths”. But perhaps I should instead have hailed DAX as a new contender in the world of noSQL.

When I started in this business in the late 1970s, the database world I encountered was a noSQL one. Database structures were in the main laid out by DBAs using either hierarchical (e.g. MUMPS) or network topologies (the various CODASYL offerings). Programmers would then navigate these predetermined access paths using a cacophony of APIs and query languages. It worked quite well, and in fact many developers found the move to a SQL world difficult, as they regarded the pure relational model as sub-optimal when developing applications (and many still do!).

But there were two main problems with such databases:

  • Although the pre-defined access paths worked well for development, they were often next to useless for downstream reporting, which usually meant that data had to be dumped to files for “reshaping” and de-normalisation, with the resulting flattened data often being consumed by early end-user reporting tools such as Datatrieve. (Sound familiar?)
  • Likewise, although the original data design tended to be ideal for the original application design, downstream changes in requirements were not easy to incorporate. DBAs were busy, expensive and usually very grumpy, which meant that many enhancements, as with their reporting cousins, had to make do with external files.

The SQL era was a huge improvement. Nevertheless, many developers found the impedance mismatch between this relational SQL world and their own “networked” object-oriented world to be a continuing problem, leading to the constant search for the perfect ORM and eventually the re-emergence of noSQL database engines.

PowerPivot’s VertiPaq engine is one of these new noSQL databases. It resembles those pre-SQL databases in that its access and relationship paths are largely determined not by its DAX query language but by whoever builds the database in the first place. The big difference is, of course, that PowerPivot’s primary (and only) use case is reporting, while most databases have in the past targeted system data storage as their primary purpose in life, with reporting as a by-the-way. The other difference is that the creators of PowerPivot “databases” are less likely to be DBAs and more likely to be end-users, or at least those in IT, like data/business analysts, who are close to the business, i.e. more common on the ground, less expensive and hopefully less grumpy. Indeed many civilian DAX practitioners will have the luxury of being able to populate their reporting models with suitable datasets that have already been constructed by existing data warehousing systems.

It’s this separation of filtering, aggregation and transformation from topology that I think will help endear DAX to a less technical audience than has been the case with SQL. MDX also had this concept, but the underlying, “more natural”, tabular nature (business users live by lists!) that DAX builds on will, I think, be easier for many to engage with.

I used “I think” a lot in the previous paragraph as it’s very difficult for somebody like myself (a professional programmer and veteran of many a database mind-share war) to know how non-technical folks will actually view something like DAX. Although my career has been one largely based on business-user interaction, I’m still a techie at heart.
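
To make the “separation from topology” point a little more concrete, here’s a minimal sketch in DAX (the table and column names, Sales, Products, Sales[SalesAmount] and Products[Colour], are hypothetical stand-ins, not taken from any particular model). In a PowerPivot model where someone has already related Sales to Products, a measure only states the aggregation and the filter; the navigation between the tables lives in the model, not in the formula:

    -- total of a numeric column; note that no join is written here
    Total Sales := SUM ( Sales[SalesAmount] )

    -- the same total restricted to red products; the Sales-to-Products
    -- relationship is already part of the model, so the measure only
    -- states the filter and the aggregation
    Red Sales := CALCULATE ( [Total Sales], Products[Colour] = "Red" )

The equivalent SQL would have to spell out the join between Sales and Products before it could filter on colour. In DAX that piece of topology has already been decided by whoever built the model, which is exactly the division of labour described above.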

So what do you civilian datasmiths think? DAX: yep, I’ll learn it, or nah, that’s what those geeks down in the basement are for?

2 responses to “DAX the new noSQL?”

  1. I’m a business analyst. Since PowerPivot came out I’ve been able to analyse data in a better, more productive way.

    DAX is not too hard to learn, but the main learning block for existing Excel power users will be having to think in terms of multiple contexts and having to think more about data modelling.

    Additionally, the new power means we can do far more. Some new types of analysis I’ve been able to do include simulations and scenario planning using very large data sets.

    Despite PowerPivot enabling ‘Self Service’ BI, we still need our DBD. As I’m sure you know, data in databases is not really designed for human consumption. The requirement for ETL is something I’m not keen on getting involved in any time soon.

    Lee

    • Agreed, as I’ve said before …

      “Self service BI doesn’t mean do-it-all-by-yourself BI, no more than a self-service sausage and bacon hotel breakfast means you first catch a pig.”
      (see this post http://bit.ly/hbuczD).

      There are all sorts of “civilian datasmiths” out there: some, like yourself, are lucky to have the support of an IT department; others, alas, are less lucky and must get their heads around the ETL process (or find somebody who can) if they’re to benefit from the power that tools like PowerPivot now provide.

      Even where there’s IT support (or decent data export APIs provided by a vendor, SAP for example), in many cases this only covers the “E” part of ETL. The “T” part is usually where the added value begins, and as such I would recommend that datasmiths arm themselves, if they can, with the necessary tools and knowledge. It’s mainly that type of datasmith that I had in mind when I developed microETL and now HAMMER.

      Tom