Lakehouse Roadmap
The Lakehouse is built in phases, each adding a layer of capability. Phases 0-4 are complete, Phase 5 is deferred, and Phases 6-8 are planned.
Phase Summary
| Phase | Title | Status | Effort |
|---|---|---|---|
| 0 | IaC Scaffolding | Done | S |
| 1 | Containerization | Done | M |
| 2 | Control Plane Deployment | Done | M |
| 3 | Seed Content | Done | L |
| 3-fix | CI & Build Fixes | Done | S |
| 3.5 | Branding & Cleanup | Done | S |
| 4 | RBAC | Done | M |
| 5 | ML Export Validation | Deferred | S |
| 6 | Data Catalog & Registry | Todo | L |
| 7 | Metrics Layer | Todo | L |
| 8 | Data Quality & Observability | Todo | XL |
Effort key: S = less than 1 day, M = 1-2 days, L = 3-5 days, XL = 1-2 weeks
Dependency Diagram
Phase 0    IaC Scaffolding
   |
Phase 1    Containerization
   |
Phase 2    Control Plane Deployment
   |
Phase 3    Seed Content --> Phase 3-fix  CI & Build Fixes
   |
Phase 3.5  Branding & Cleanup
   |
Phase 4    RBAC
   |
   +----------------------------------+
   |                                  |
Phase 5  ML Export Validation      Phase 6  Data Catalog
(deferred)                            |
                            +-------------------+
                            |                   |
                         Phase 7             Phase 8
                         Metrics Layer       Data Quality
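The dependencies in the diagram can also be written down as data and checked mechanically. The sketch below is one reading of the diagram, not a file that exists in the repository: the `DEPENDS_ON` mapping and both helper functions are hypothetical names introduced for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each phase maps to the phases it directly depends on.
# This mapping is an interpretation of the diagram, not project configuration.
DEPENDS_ON = {
    "0": set(),
    "1": {"0"},
    "2": {"1"},
    "3": {"2"},
    "3-fix": {"3"},
    "3.5": {"3"},
    "4": {"3.5"},
    "5": {"4"},
    "6": {"4"},
    "7": {"6"},
    "8": {"6"},
}


def build_order(depends_on):
    """Return one valid execution order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(depends_on).static_order())


def blocked_by(phase, depends_on):
    """Return every phase that must finish before `phase` can start."""
    seen = set()
    stack = list(depends_on[phase])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(depends_on[current])
    return seen
```

Under this reading, `blocked_by("7", DEPENDS_ON)` does not include Phase 5, so deferring ML Export Validation would not hold up the Metrics Layer.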
Completed Phases
Phase 0 -- IaC Scaffolding
Control Plane secret policy and Metabase secrets runbook.
Phase 1 -- Containerization
Dockerfile.metabase pinned to Metabase v0.59.7 with DuckDB driver, duckdb-init.sql, Docker Compose service.
Phase 2 -- Control Plane Deployment
cpln workload YAML, GitHub Actions CI/CD, domain binding (analytics.{env}.smackz.co).
Phase 3 -- Seed Content
Reproducible seed_metabase.py CLI with upsert semantics. Created 5 dashboards, 5 ML saved questions, 2 permission groups, 30 SQL files.
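To illustrate what upsert semantics mean for a seeding script, here is a minimal sketch. It is not the code in seed_metabase.py: `FakeStore`, `upsert`, and `seed` are hypothetical names, and an in-memory dictionary stands in for the Metabase instance.

```python
from dataclasses import dataclass, field


@dataclass
class FakeStore:
    """In-memory stand-in for a Metabase instance (hypothetical)."""
    items: dict = field(default_factory=dict)  # item name -> payload


def upsert(store, name, payload):
    """Create the item if absent, update it if changed, otherwise do nothing."""
    existing = store.items.get(name)
    if existing is None:
        store.items[name] = dict(payload)
        return "created"
    if existing != payload:
        store.items[name] = dict(payload)
        return "updated"
    return "unchanged"


def seed(store, definitions):
    """Apply every definition and report the action taken for each name."""
    return {name: upsert(store, name, payload) for name, payload in definitions.items()}
```

Running `seed` twice with the same definitions reports `unchanged` for every item on the second pass and creates no duplicates. That repeatability is what makes a seed run reproducible.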
Phase 3.5 -- Branding & Cleanup
SMACKZ branding (logo, favicon, colors, font), removed sample database and default dashboards, pinned Platform Health as homepage.
Phase 4 -- RBAC
Expanded from 2 to 4 permission groups (admin, analyst, developer, readonly).
Upcoming Phases
Phase 6 -- Data Catalog & Registry
A _metadata DuckDB schema documenting every table, column, and relationship. See Data Catalog.
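The goal of documenting every table, column, and relationship implies a completeness check. The sketch below shows one way such a check could look in plain Python. The record types, function names, and example table names are assumptions made for illustration; the actual _metadata schema is defined in the Phase 6 FRD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnDoc:
    """One documented column (hypothetical catalog record)."""
    table: str
    column: str
    description: str


@dataclass(frozen=True)
class RelationshipDoc:
    """One documented foreign-key style link (hypothetical catalog record)."""
    from_table: str
    from_column: str
    to_table: str
    to_column: str


def undocumented_columns(actual_columns, docs):
    """Return (table, column) pairs that exist but have no catalog entry."""
    documented = {(d.table, d.column) for d in docs}
    return sorted(set(actual_columns) - documented)


def dangling_relationships(relationships, docs):
    """Return relationships with an endpoint that is not a documented column."""
    documented = {(d.table, d.column) for d in docs}
    return [
        r for r in relationships
        if (r.from_table, r.from_column) not in documented
        or (r.to_table, r.to_column) not in documented
    ]
```

Both functions return an empty list when the catalog is complete, so either could serve as a CI gate.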
Phase 7 -- Metrics Layer
Canonical metric definitions as DuckDB views, eliminating duplicated SQL. See Metrics Layer.
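As a rough illustration of defining a metric once and reusing it, the sketch below renders a `CREATE OR REPLACE VIEW` statement from a single metric definition. The function name, the `metric_` view prefix, and the example table and column names are assumptions, not the conventions chosen in the Phase 7 FRD.

```python
def render_metric_view(name, expression, source, group_by=()):
    """Render a CREATE OR REPLACE VIEW statement for one metric definition.

    Inputs are interpolated as-is, so this assumes trusted, hand-written
    definitions rather than user-supplied strings.
    """
    columns = list(group_by) + [f"{expression} AS {name}"]
    sql = (
        f"CREATE OR REPLACE VIEW metric_{name} AS\n"
        f"SELECT {', '.join(columns)}\n"
        f"FROM {source}"
    )
    if group_by:
        sql += f"\nGROUP BY {', '.join(group_by)}"
    return sql + ";"
```

Dashboards would then select from the view instead of repeating the aggregate expression, so a change to the metric is made in one place.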
Phase 8 -- Data Quality & Observability
Automated quality checks with catalog-driven rules and a Metabase dashboard. See Data Quality.
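One possible shape for catalog-driven rules is that the rules are data (column and rule-name pairs) and a small runner dispatches each one to a check function. The sketch below assumes that design. The rule names `not_null` and `unique`, the function names, and the row format are illustrative and not taken from the Phase 8 FRD.

```python
def check_not_null(rows, column):
    """Count rows where `column` is missing or None."""
    return sum(1 for row in rows if row.get(column) is None)


def check_unique(rows, column):
    """Count non-null values in `column` that repeat an earlier value."""
    seen, duplicates = set(), 0
    for row in rows:
        value = row.get(column)
        if value is None:
            continue
        if value in seen:
            duplicates += 1
        seen.add(value)
    return duplicates


# Registry mapping rule names (as they might be stored in a catalog) to checks.
CHECKS = {"not_null": check_not_null, "unique": check_unique}


def run_rules(rows, rules):
    """Run each (column, rule) pair; return failure counts keyed by 'column:rule'."""
    return {f"{column}:{rule}": CHECKS[rule](rows, column) for column, rule in rules}
```

With this layout, adding a check to a column means adding a rule row rather than writing new code, and the failure counts are the kind of figure a dashboard could chart over time.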
Backlog
| Item | Notes |
|---|---|
| Exploratory Analysis & Transformation UI | Interactive exploration layer |
| Metabase Pro evaluation | Unlocks custom branding, SSO, embedded dashboards, row-level sandboxing |
| Restaurant-owner facing dashboards | Per-restaurant data sandboxing |
| Incremental re-backfill support | Incremental re-ingestion of lakehouse data |
Timeline (Estimated)
2026-Q2
Apr [====] Phase 5 -- ML Export Validation
May [========] Phase 6 -- Data Catalog & Registry
Jun [========] Phase 7 -- Metrics Layer
2026-Q3
Jul [==========] Phase 8 -- Data Quality & Observability
Aug [====] Backlog items
Key Files
smackz-lakehouse/docs/LAKEHOUSE-ROADMAP.md -- Full roadmap with task breakdowns
smackz-lakehouse/docs/Lakehouse-Data-Catalog-FRD.md -- Phase 6 FRD
smackz-lakehouse/docs/Lakehouse-Metrics-Layer-FRD.md -- Phase 7 FRD
smackz-lakehouse/docs/Lakehouse-Data-Quality-FRD.md -- Phase 8 FRD