While investigating the nature of SQLite‘s query optimisation strategy I came across this 2004 presentation (link dead! try internet archive) by D. Richard Hipp which provides a excellent overview of SQLite’s technical architecture. The reason I’m looking at the optimiser is to figure out if SQLite can handle a star-schema query.
First impressions were not promising as SQLite doesn’t support merge-joins or bitmap-joins, only the classic loop-join is implemented e.g.
Select * from t1,t2;
…is evaluated as;
for each row in t1:
for each row in t2:
output one row of result
Also, as only one index per table is used in a query this appears to rule out the classic one-index-per-dimension scenario normally applied to fact tables. All is not lost however, by using the INTERSECT command it’s possible to simulate a bit mapped query e.g.
Select sum(sales) from fact_table ft
where ft.rowid in (
select rowid from fact_table ft where ft.customer_dim_id = "35"
select rowid from fact_table ft where ft.product_dim_id = "56787"
select rowid from fact_table ft where ft.time_dim_id = "20070823")
Not as slick as an Oracle 10G star transformation but ‘good enough’ for the task at hand. What, I hear you ask, is that task and why would anybody use SQLite to implement a data warehouse star schema? For the last few months I’ve been rabbiting on about micro-ETL, well, I’m about to embark on my micro-BI phase (you’ve been warned) and for a micro-BI environment a micro-datawarehouse is essential. More anon…