Lakehouse Roadmap

The Lakehouse is built in phases, each adding a layer of capability. Phases 0-4 are complete; Phase 5 is deferred, and Phases 6-8 are planned.

Phase Summary

Phase  Title                          Status    Effort
0      IaC Scaffolding                Done      S
1      Containerization               Done      M
2      Control Plane Deployment       Done      M
3      Seed Content                   Done      L
3-fix  CI & Build Fixes               Done      S
3.5    Branding & Cleanup             Done      S
4      RBAC                           Done      M
5      ML Export Validation           Deferred  S
6      Data Catalog & Registry        Todo      L
7      Metrics Layer                  Todo      L
8      Data Quality & Observability   Todo      XL

Effort key: S = less than 1 day, M = 1-2 days, L = 3-5 days, XL = 1-2 weeks

Dependency Diagram

Phase 0  IaC Scaffolding
  |
Phase 1  Containerization
  |
Phase 2  Control Plane Deployment
  |
Phase 3  Seed Content --> Phase 3-fix  CI & Build Fixes
  |
Phase 3.5  Branding & Cleanup
  |
Phase 4  RBAC
  |
  +-------------------------------+
  |                               |
Phase 5  ML Export Validation   Phase 6  Data Catalog
  (deferred)                      |
                                  +-------------------+
                                  |                   |
                               Phase 7             Phase 8
                               Metrics Layer       Data Quality
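
The diagram above can also be expressed as data, which makes it easy to check what blocks a given phase. The following is a minimal sketch, not part of the repository: the `DEPENDS_ON` mapping and both helper functions are illustrative names, and it assumes Phase 3.5 follows Phase 3 directly, with Phase 3-fix as a side branch.

```python
from graphlib import TopologicalSorter

# Each phase maps to the phases it depends on, transcribed from the diagram.
# Assumption: Phase 3.5 depends on Phase 3, not on the 3-fix side branch.
DEPENDS_ON = {
    "0": set(),
    "1": {"0"},
    "2": {"1"},
    "3": {"2"},
    "3-fix": {"3"},
    "3.5": {"3"},
    "4": {"3.5"},
    "5": {"4"},
    "6": {"4"},
    "7": {"6"},
    "8": {"6"},
}


def build_order(graph):
    """Return one valid order in which the phases can be delivered."""
    return list(TopologicalSorter(graph).static_order())


def blocked_by(graph, phase):
    """Return every phase that must be finished before `phase` can start."""
    seen = set()
    stack = list(graph[phase])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph[dep])
    return seen
```

For example, `blocked_by(DEPENDS_ON, "7")` includes Phase 6 but not Phase 5, which matches the diagram: the Metrics Layer does not wait on the deferred ML Export Validation work.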

Completed Phases

Phase 0 -- IaC Scaffolding

Control Plane secret policy and Metabase secrets runbook.

Phase 1 -- Containerization

Dockerfile.metabase pinned to Metabase v0.59.7 with DuckDB driver, duckdb-init.sql, Docker Compose service.

Phase 2 -- Control Plane Deployment

cpln workload YAML, GitHub Actions CI/CD, domain binding (analytics.{env}.smackz.co).

Phase 3 -- Seed Content

Reproducible seed_metabase.py CLI with upsert semantics. Created 5 dashboards, 5 ML saved questions, 2 permission groups, 30 SQL files.
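
Upsert semantics mean the seed script can be re-run without creating duplicates: an item that already exists is updated, and a missing one is created. The sketch below illustrates the idea only. It is not the code in seed_metabase.py; `FakeStore` and `upsert` are hypothetical names, the in-memory store stands in for a Metabase instance, and matching by name is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class FakeStore:
    """In-memory stand-in for a Metabase instance (hypothetical)."""
    items: dict = field(default_factory=dict)
    next_id: int = 1


def upsert(store, name, payload):
    """Update the item called `name` if it exists, otherwise create it.

    Returns (item_id, action) where action is "created" or "updated".
    """
    for item_id, item in store.items.items():
        if item["name"] == name:
            item.update(payload)
            return item_id, "updated"
    item_id = store.next_id
    store.next_id += 1
    store.items[item_id] = {"name": name, **payload}
    return item_id, "created"


store = FakeStore()
first = upsert(store, "Platform Health", {"description": "v1"})
second = upsert(store, "Platform Health", {"description": "v2"})
```

Running the same upsert twice leaves a single item whose fields reflect the latest payload, which is what makes the seed reproducible.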

Phase 3.5 -- Branding & Cleanup

SMACKZ branding (logo, favicon, colors, font), removed sample database and default dashboards, pinned Platform Health as homepage.

Phase 4 -- RBAC

Expanded from 2 to 4 permission groups (admin, analyst, developer, readonly).

Upcoming Phases

Phase 6 -- Data Catalog & Registry

A _metadata DuckDB schema documenting every table, column, and relationship. See Data Catalog.
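
One possible shape for such a catalog is sketched below. This is an illustration under stated assumptions, not the Phase 6 design: the three `catalog_*` tables, their columns, and the `orders` and `restaurants` sample rows are all hypothetical, and the standard-library `sqlite3` module stands in for DuckDB so the example runs without extra dependencies.

```python
import sqlite3

# sqlite3 stands in for DuckDB; an attached database plays the role of
# the _metadata schema.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS _metadata")

# Hypothetical layout: one row per table, per column, and per relationship.
conn.executescript("""
CREATE TABLE _metadata.catalog_tables (
    table_name  TEXT PRIMARY KEY,
    description TEXT NOT NULL
);
CREATE TABLE _metadata.catalog_columns (
    table_name  TEXT NOT NULL,
    column_name TEXT NOT NULL,
    data_type   TEXT NOT NULL,
    description TEXT NOT NULL,
    PRIMARY KEY (table_name, column_name)
);
CREATE TABLE _metadata.catalog_relationships (
    from_table  TEXT NOT NULL,
    from_column TEXT NOT NULL,
    to_table    TEXT NOT NULL,
    to_column   TEXT NOT NULL
);
""")

# Sample entries (invented for illustration).
conn.execute(
    "INSERT INTO _metadata.catalog_tables VALUES (?, ?)",
    ("orders", "One row per order"),
)
conn.executemany(
    "INSERT INTO _metadata.catalog_columns VALUES (?, ?, ?, ?)",
    [
        ("orders", "order_id", "BIGINT", "Primary key"),
        ("orders", "restaurant_id", "BIGINT", "Restaurant that received the order"),
    ],
)
conn.execute(
    "INSERT INTO _metadata.catalog_relationships VALUES (?, ?, ?, ?)",
    ("orders", "restaurant_id", "restaurants", "restaurant_id"),
)


def describe(table_name):
    """Return (column_name, data_type) pairs documented for a table."""
    return conn.execute(
        "SELECT column_name, data_type FROM _metadata.catalog_columns "
        "WHERE table_name = ? ORDER BY column_name",
        (table_name,),
    ).fetchall()
```

Keeping the catalog in ordinary tables means it can be queried with the same SQL tooling as the data it documents.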

Phase 7 -- Metrics Layer

Canonical metric definitions as DuckDB views, eliminating duplicated SQL. See Metrics Layer.
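
The sketch below shows the general idea of defining a metric once as a view and querying it from everywhere else. It is illustrative only: the `orders` table, its columns, and the `metric_daily_revenue` view are hypothetical, and `sqlite3` stands in for DuckDB so the example needs no extra dependencies. The `CREATE VIEW` statement itself is standard SQL.

```python
import sqlite3

# sqlite3 stands in for DuckDB; table and view names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, order_date TEXT, total_cents INTEGER);
INSERT INTO orders VALUES
    (1, '2026-04-01', 1200),
    (2, '2026-04-01', 800),
    (3, '2026-04-02', 500);

-- The metric is defined once here instead of in each dashboard query.
CREATE VIEW metric_daily_revenue AS
SELECT order_date,
       SUM(total_cents) AS revenue_cents,
       COUNT(*)         AS order_count
FROM orders
GROUP BY order_date;
""")


def daily_revenue():
    """Read the canonical metric; callers never repeat the aggregation."""
    return conn.execute(
        "SELECT order_date, revenue_cents, order_count "
        "FROM metric_daily_revenue ORDER BY order_date"
    ).fetchall()
```

If the definition of revenue changes, only the view needs editing, and every query that reads from it picks up the change.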

Phase 8 -- Data Quality & Observability

Automated quality checks with catalog-driven rules and a Metabase dashboard. See Data Quality.
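
A catalog-driven check can be sketched as rules stored as data plus one SQL template per rule type. The example below is an assumption about how this might look, not the Phase 8 design: the `RULES` list, the two rule types, and the `orders` sample data are hypothetical, and `sqlite3` stands in for DuckDB.

```python
import sqlite3

# sqlite3 stands in for DuckDB; the sample table is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, restaurant_id INTEGER);
INSERT INTO orders VALUES (1, 10), (2, NULL), (2, 11);
""")

# Hypothetical rules, as they might be read from the data catalog.
RULES = [
    ("orders", "order_id", "not_null"),
    ("orders", "order_id", "unique"),
    ("orders", "restaurant_id", "not_null"),
]

# One SQL template per rule type; each query counts violating rows.
TEMPLATES = {
    "not_null": "SELECT COUNT(*) FROM {table} WHERE {column} IS NULL",
    "unique": (
        "SELECT COUNT(*) - COUNT(DISTINCT {column}) "
        "FROM {table} WHERE {column} IS NOT NULL"
    ),
}


def run_checks(connection, rules):
    """Run every rule and return (table, column, rule, violations) tuples."""
    results = []
    for table, column, rule in rules:
        # Identifiers are interpolated directly, so the rules must come
        # from a trusted source such as the catalog, never from user input.
        sql = TEMPLATES[rule].format(table=table, column=column)
        violations = connection.execute(sql).fetchone()[0]
        results.append((table, column, rule, violations))
    return results
```

Because the rules are rows rather than code, adding a check for a new column is a catalog change, and the result tuples can be written to a table that a dashboard reads.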

Backlog

Item                                      Notes
Exploratory Analysis & Transformation UI  Interactive exploration layer
Metabase Pro evaluation                   Unlocks custom branding, SSO, embedded dashboards, row-level sandboxing
Restaurant-owner facing dashboards        Per-restaurant data sandboxing
Incremental re-backfill support           Incremental re-ingestion of lakehouse data

Timeline (Estimated)

2026-Q2
  Apr  [====]     Phase 5 -- ML Export Validation
  May  [========] Phase 6 -- Data Catalog & Registry
  Jun  [========] Phase 7 -- Metrics Layer

2026-Q3
  Jul  [==========] Phase 8 -- Data Quality & Observability
  Aug  [====]       Backlog items

Key Files

  • smackz-lakehouse/docs/LAKEHOUSE-ROADMAP.md -- Full roadmap with task breakdowns
  • smackz-lakehouse/docs/Lakehouse-Data-Catalog-FRD.md -- Phase 6 FRD
  • smackz-lakehouse/docs/Lakehouse-Metrics-Layer-FRD.md -- Phase 7 FRD
  • smackz-lakehouse/docs/Lakehouse-Data-Quality-FRD.md -- Phase 8 FRD