Data Analysis & SQL

Build A Slowly Changing Dimension Type 2

Implements SCD Type 2 history tracking with effective dates, current flags, and a correct merge in SQL.

Role-Based · Step-by-Step · Structured-Output

Prompt

ROLE: You are a data warehouse engineer implementing slowly changing dimensions.

CONTEXT: Maintain history for dimension [DIM_NAME] sourced from [SOURCE_TABLE]. The natural key is [NATURAL_KEY]; the tracked attributes whose changes must create new versions are [TRACKED_ATTRS]. Engine: [DATABASE_ENGINE].

TASK:
1. Define the SCD2 columns: surrogate key, natural key, tracked attributes, valid_from, valid_to, is_current, and a row hash of tracked attributes for change detection.
2. Explain the change-detection logic: a new version is created only when a tracked attribute changes (compare hashes), not on every load.
3. Write the upsert/merge logic that (a) closes the current row (set valid_to, is_current = false) and (b) inserts the new version when a change is detected, while leaving unchanged records alone.
4. Show how to query the dimension as-of a given date.
5. Handle first-load, late-arriving changes, and deletes (soft expire).

OUTPUT FORMAT: Column design -> Change-detection rule -> Merge/upsert ```sql``` -> As-of point-in-time query -> Edge cases (first load, late data, deletes).

CONSTRAINTS: valid_from/valid_to must be contiguous and non-overlapping per natural key, with exactly one is_current = true row per active natural key (none once a key has been soft-expired). Use half-open intervals [valid_from, valid_to). Compare hashes to avoid spurious versions. State the timezone/granularity of effective dates.
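
For orientation, here is a minimal sketch of the kind of answer the prompt is aiming for. It is not part of the prompt itself. It assumes SQLite (via Python's built-in sqlite3 module) rather than any particular [DATABASE_ENGINE], and the names dim_customer, customer_id, and city are hypothetical stand-ins for [DIM_NAME], [NATURAL_KEY], and [TRACKED_ATTRS]. Effective dates are assumed to be UTC at day granularity, stored as ISO strings. The close-and-insert step is done row by row for readability; a real warehouse would more likely use a set-based MERGE.

```python
import hashlib
import sqlite3

OPEN_END = "9999-12-31"  # assumed sentinel for "still current" (exclusive upper bound)


def row_hash(*values):
    """Hash the tracked attributes. The separator keeps ('ab', 'c') distinct
    from ('a', 'bc'); the NUL marker keeps NULL distinct from ''."""
    joined = "\x1f".join("\x00" if v is None else str(v) for v in values)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dim_customer (
        customer_sk INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        customer_id TEXT NOT NULL,                       -- natural key
        city        TEXT,                                -- tracked attribute
        row_hash    TEXT NOT NULL,                       -- hash of tracked attributes
        valid_from  TEXT NOT NULL,                       -- inclusive, UTC date
        valid_to    TEXT NOT NULL,                       -- exclusive
        is_current  INTEGER NOT NULL
    )
""")


def load(conn, source_rows, load_date):
    """Apply one batch of (customer_id, city) rows effective at load_date."""
    for customer_id, city in source_rows:
        new_hash = row_hash(city)
        current = conn.execute(
            "SELECT customer_sk, row_hash FROM dim_customer "
            "WHERE customer_id = ? AND is_current = 1",
            (customer_id,),
        ).fetchone()
        if current is not None and current[1] == new_hash:
            continue  # unchanged: leave the current row alone
        if current is not None:
            # (a) close the current row at the new version's start date
            conn.execute(
                "UPDATE dim_customer SET valid_to = ?, is_current = 0 "
                "WHERE customer_sk = ?",
                (load_date, current[0]),
            )
        # (b) insert the new version (this also covers the first load)
        conn.execute(
            "INSERT INTO dim_customer "
            "(customer_id, city, row_hash, valid_from, valid_to, is_current) "
            "VALUES (?, ?, ?, ?, ?, 1)",
            (customer_id, city, new_hash, load_date, OPEN_END),
        )
    conn.commit()


def as_of(conn, customer_id, date):
    """Point-in-time lookup over the half-open interval [valid_from, valid_to)."""
    row = conn.execute(
        "SELECT city FROM dim_customer "
        "WHERE customer_id = ? AND valid_from <= ? AND ? < valid_to",
        (customer_id, date, date),
    ).fetchone()
    return row[0] if row else None


load(conn, [("C1", "Oslo")], "2024-01-01")    # first load
load(conn, [("C1", "Oslo")], "2024-02-01")    # same hash, no new version
load(conn, [("C1", "Bergen")], "2024-03-01")  # change: close old row, insert new
```

Because the old row's valid_to equals the new row's valid_from and the upper bound is exclusive, the two versions are contiguous and never overlap. Late-arriving changes and soft deletes are not handled in this sketch.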

Recommended models

claude · gpt-4o · gemini
