Briefing

Everything you need before we start

Welcome

MHP Data Engineer Masterclass — one dataset, three pipelines, one decision.

Trainers & participants

Roadmap

Learning objectives

Note

After completing this workshop, you will be able to:

  • Configure all three development environments (Databricks, Snowflake, dbt) and verify connectivity
  • Navigate the repository structure and locate notebooks, SQL scripts, and dbt models
  • Sketch an initial medallion architecture for the YellowLine NYC use case
  • Articulate the three evaluation constraints (Cost, Performance, Compliance) that frame every tool decision
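As a warm-up for the medallion objective above, the layering can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the sample records, field names, and the single "revenue per zone" KPI are hypothetical and are not taken from the workshop repository or the YellowLine NYC dataset.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw records; field names are illustrative, not the workshop schema.
RAW_TRIPS = [
    {"pickup": "2026-01-05 08:15:00", "fare": "12.50", "zone": "Midtown"},
    {"pickup": "2026-01-05 09:40:00", "fare": "-3.00", "zone": "Midtown"},  # negative fare
    {"pickup": "2026-01-05 10:05:00", "fare": "30.00", "zone": "JFK"},
    {"pickup": "not a timestamp", "fare": "8.00", "zone": "SoHo"},  # unparseable time
]


def to_bronze(raw):
    """Bronze: land records unchanged, adding only ingestion metadata."""
    return [dict(row, _source="sample") for row in raw]


def to_silver(bronze):
    """Silver: parse types and drop rows that fail basic validation."""
    silver = []
    for row in bronze:
        try:
            pickup = datetime.strptime(row["pickup"], "%Y-%m-%d %H:%M:%S")
            fare = float(row["fare"])
        except ValueError:
            continue  # skip rows whose timestamp or fare cannot be parsed
        if fare < 0:
            continue  # skip rows with a negative fare
        silver.append({"pickup": pickup, "fare": fare, "zone": row["zone"]})
    return silver


def to_gold(silver):
    """Gold: aggregate into one KPI, here total fare revenue per zone."""
    revenue = defaultdict(float)
    for row in silver:
        revenue[row["zone"]] += row["fare"]
    return dict(revenue)


if __name__ == "__main__":
    print(to_gold(to_silver(to_bronze(RAW_TRIPS))))
```

The point of the split is that each layer has one job: Bronze keeps the raw input replayable, Silver owns validation rules, and Gold owns business aggregates.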

Agenda

Time  | Module                       | Duration | Story beat
09:00 | Story: Use Case & Characters | 10 min   | Marcus hires MHP: what would you design?
09:15 | 1: DE Fundamentals           | 35 min   | Elena whiteboards medallion; Priya lists KPIs
10:00 | 2: Databricks Pipeline       | 75 min   | Bob prototypes on Databricks with Sofia
11:30 | 3: Snowflake Pipeline        | 75 min   | Marcus: “We need SQL, not notebooks”
13:30 | 4: dbt Pipeline              | 75 min   | Board asks for lineage and tests
15:00 | 5: Production Patterns       | 45 min   | What runs every night without you?
15:45 | 6: AI Features               | 45 min   | Can AI help analysts explore faster?
16:30 | 7: Comparison & Wrap-up      | 30 min   | Priya’s dashboard + you choose the stack
17:00 | End of main day              |          |

Optional Phase 2 (deliver Module 8 before Module 9)

Module              | Duration | Prerequisites
8: Streaming        | 90 min   | Modules 2–3 required · 4 recommended
9: Machine Learning | 90 min   | Modules 2–3 required · 4 recommended

Environment check

Before the workshop starts, verify your setup:

Databricks

Snowflake

dbt (Local)
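A local pre-flight check along the following lines may help before opening the detailed guides. It is a minimal sketch, not the workshop's official check: the environment variable names, the `dbt` command name, and the `snowflake.connector` module are assumptions about a typical setup and should be replaced with whatever your setup guide specifies. It only confirms that local prerequisites are present; it does not open a connection to Databricks or Snowflake.

```python
import importlib.util
import os
import shutil

# Assumed names for illustration; adjust to match your setup guide.
REQUIRED_ENV_VARS = [
    "DATABRICKS_HOST",
    "DATABRICKS_TOKEN",
    "SNOWFLAKE_ACCOUNT",
    "SNOWFLAKE_USER",
]
REQUIRED_COMMANDS = ["dbt"]
REQUIRED_MODULES = ["snowflake.connector"]


def missing_env_vars(names, env=None):
    """Return the variables that are unset or empty in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


def missing_commands(names):
    """Return the executables that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def missing_modules(names):
    """Return the Python modules that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package is absent or the spec is unusable.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    checks = {
        "environment variables": missing_env_vars(REQUIRED_ENV_VARS),
        "commands on PATH": missing_commands(REQUIRED_COMMANDS),
        "Python modules": missing_modules(REQUIRED_MODULES),
    }
    for label, missing in checks.items():
        status = "OK" if not missing else "missing: " + ", ".join(missing)
        print(f"{label}: {status}")
```

For dbt itself, running `dbt debug` inside the project is the usual way to test the profile and the warehouse connection once the local checks pass.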

Tip: Trouble?

See the detailed setup guides:

Architecture

flowchart LR
    ADLS2[("Azure ADLS2\nNYC Taxi")]

    ADLS2 --> DB["Databricks\nPySpark + Delta"]
    ADLS2 --> SF["Snowflake\nSQL + Snowpark"]
    ADLS2 --> DBT["dbt on Snowflake\nSQL + tests"]

    DB --> GOLD["12 Gold KPI tables\n(identical schema)"]
    SF --> GOLD
    DBT --> GOLD

    GOLD --> PBI["Power BI\n(Priya)"]

    style ADLS2 fill:#0057b8,color:#fff
    style PBI fill:#107c10,color:#fff
    style GOLD fill:#d97706,color:#fff
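The diagram's "identical schema" claim is what makes the three pipelines comparable, so it is worth checking mechanically. The sketch below compares column-to-type mappings across pipelines in plain Python. The column names and type labels are hypothetical, and the sketch assumes engine-specific type names have already been normalized to common labels before comparison.

```python
def schema_mismatches(schemas):
    """Compare every pipeline's column->type mapping against the first one.

    Returns a list of (pipeline, column, reference_type, actual_type) tuples;
    a type of None means the column is absent on that side.
    """
    (_, ref_schema), *others = schemas.items()
    problems = []
    for name, schema in others:
        for column in sorted(set(ref_schema) | set(schema)):
            expected = ref_schema.get(column)
            actual = schema.get(column)
            if expected != actual:
                problems.append((name, column, expected, actual))
    return problems


if __name__ == "__main__":
    # Hypothetical Gold table schemas, one per pipeline.
    gold_schemas = {
        "databricks": {"zone": "string", "trip_date": "date", "revenue": "double"},
        "snowflake": {"zone": "string", "trip_date": "date", "revenue": "double"},
        "dbt": {"zone": "string", "trip_date": "date", "revenue": "decimal"},
    }
    for pipeline, column, expected, actual in schema_mismatches(gold_schemas):
        print(f"{pipeline}.{column}: expected {expected}, found {actual}")
```

In practice the mappings would be read from each platform's catalog or information schema; here they are hard-coded so the comparison logic stays visible.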

Quick start

  1. Verify prerequisites.
  2. Begin with the Story: Use Case & Characters module to meet YellowLine NYC.

MHP Data Engineer Masterclass 2026 · v2.8 · Content updated May 2026