In my previous post Dax the new SQL I argued that perhaps DAX would replace SQL as the database query language, in particular for those I term “civilian datasmiths”. But perhaps I should have hailed DAX as a new contender in the world of noSQL.
When I started in this business in the late 1970s, the database world I encountered was a noSQL one. Database structures were in the main laid out by DBAs using either hierarchical (e.g. MUMPS) or network topologies (the various CODASYL offerings). Programmers would then navigate these predetermined access paths using a cacophony of APIs and query languages. It worked quite well, and if fact many developers found the move to a SQL world difficult, as they regarded the pure relational model to be sub-optimal when developing applications (and many still do!).
But there were two main problems with such databases:
- Although the pre-defined access paths worked well for development, they were often next to useless for downstream reporting, which usually meant that data had to be dumped to files for “reshaping” and de-normalisation, with resulting flattened data often being consumed by early end-user reporting tools such as Datatrieve. (Sound familiar?)
- Likewise, although the original data design tended to be ideal for the original application design, downstream changes in requirements were not easy to incorporate. DBAs were busy, expensive and usually very grumpy, which meant that many enhancements, as with their reporting cousins, had to make do with using external files.
The SQL era was a huge improvement. Nevertheless, many developers found the impedance mismatch between this relational SQL world and their own “networked” object-oriented world to be a continuing problem, leading to the constant search for the perfect ORM and eventually the re-emergence of noSQL database engines.
PowerPivot’s VertiPaq engine is one of these new noSQL databases. It resembles those pre-SQL databases in that its access and relationships paths are largely determined not by its DAX query language but by whoever builds the database in the 1st place. The big difference is of course, PowerPivot’s primary (and only) use-case is reporting while most databases have in the past targeted system datastorage as their primary purpose in life, with reporting as a by-the-way. The other difference is that the creators of PowerPivot “databases” are less likely to be DBAs and more likely to be end-users or at least those in IT, like data/business analysts, who are close to the business i.e. more common on the ground, less expensive and hopefully less grumpy. Indeed many civilian DAX practitioners will have the luxury of being able to populate their reporting models with suitable datasets that have already been constructed by existing data warehousing systems.
It’s this separation of filtering- aggregation-transformation from topology that I think will help endear DAX to a less technical audience that has been the case with SQL. MDX also had this concept, but the underlying “more natural” tabular nature (business users live by lists!) that DAX builds on, will I think,be easier for many to engage with.
I used “I think” at lot in the previous paragraph as it’s very difficult for somebody like myself (a professional programmer and veteran of many a database mind-share war) to know how non-technical folks will actually view something like DAX. Although my career has been one largely based on business-user interaction, I’m still a techie at heart.
So what do you civilian datasmiths think? DAX, yep I’ll learn it, or nah, that’s what those geeks down in the basement are for!