Cross Join Three Tables via DAX Query to seed a Date Dimension.

This is a variation on an old SQL trick to seed a Date (aka Time) dimension. You’ll need such a table in many, if not all, PowerPivot models as the proper functioning of Time Intelligence functions require a separate table containing a date column populated with a no-gaps sequence of dates covering the potential time span of the model.

If you’re sourcing your data from an existing data warehouse, most likely you’ll already have access to a date dimension. But if not, you’ll need to generate one. You could use the auto-fill facility in Excel to do so, but it can be awkward to use when generating a many-year table (e.g. pension & mortgage models might need current year +/-  20 or even 90 years).

To seed the dataset you’ll need three one-column tables; one for DAYs with values of 1 to 31, another for MONTHs with values 1 to 12 and finally YEARs listing the years required e.g. 1995 to 2060 or 2011 to 2013. Next, load these tables into a PowerPivot model, as linked-tables or whatever, and apply the following DAX Query against the model.

Evaluate(
ADDCOLUMNS(
SUMMARIZE(
ADDCOLUMNS(
CROSSJOIN(Years,Months,Days),
“FullDate”,DATE([Years],[Months],[Days])),[FullDate]
),
“Year”,YEAR([FullDate]),
“MonthNo”,MONTH([FullDate]),
“Month”,FORMAT([FullDate],”MMM”)
)
)
ORDER BY [FullDate]

You can apply the above DAX query using this trick (Note: this will not work for Excel 2013, but no worries, as  in Excel 2103, DAX Queries are a supported data-table source, see here).

Or, you could  install the excellent open source DAX Studio Excel Add-in (again doesn’t work in Excel 2013 but will eventually, I would think).

So how does this work?

Let’s start with the most inner function call:

CROSSJOIN(Years,Months,Days)

This simply cross-joins the three tables. The cross-join creates a table consisting of every single combination of days, months & years.

Next:

ADDCOLUMNS(
CROSSJOIN(Years,Months,Days),
“FullDate”,DATE([Years],[Months],[Days])
)

ADDCOLUMNS is the equivalent to the PowerPivot add-in’s “Calcuated Columns” in a DAX Query. This will add a new column call “FullDate” to the table using the DATE function to form a valid date.

Now you might think that many cross-join combinations would generate an invalid date (e.g. 30th March) but DATE is a forgiving function and will convert the likes of 30th of Match to the 2nd of April (or 1st in leap years). In fact you could use a 1 to 366 DAYs table and a single value (i.e. 1 ) MONTHs table! The result will include some duplicate dates but the next layer in the onion will take care of that:

SUMMARIZE(
ADDCOLUMNS(CROSSJOIN(Years,Months,Days),”FullDate”,DATE([Years],[Months],[Days])),
[FullDate]
)

SUMMARISE is like a SQL Group By, and will remove any duplicate dates in the [FullDate] column.

You could stop now as the minimum requirement for a date dimension has been produced, but in reality you’ll need a few descriptive/selective/sorting attributes to be assigned to each date.

You could do this via an Excel Table or via Calculated Columns when you import the table into a PowerPivot model. Or, as here, add them via a DAX Query:

ADDCOLUMNS(
SUMMARIZE(ADDCOLUMNS(CROSSJOIN(Years,Months,Days),”FullDate”,DATE([Years],[Months],[Days])),[FullDate]),
“Year”,YEAR([FullDate]),
“MonthNo”,MONTH([FullDate]),
“Month”,FORMAT([FullDate],”MMM”)
)

Again using ADDCOLUMNS to add calculated columns: a “Year” column (1995,1996 etc.) for selecting, a MonthNo column (1,2,3 …) for sorting and a “Month” column (Jan,Feb) for descriptive/selection purposes.

Finally, add a SORT BY [FullDate] to return the table in sorted order.

The table can then be saved as a CSV or an Excel worbook or loaded back into the source database. All, or a filtered range, can then be loaded into future models as required.

Advertisements