A consolidated, scenario‑first reference organized around the four official exam domains. Every topic is paired with the decisions, trade‑offs, and distractors you will actually encounter on the exam.
Eight chapters covering the exam blueprint, deep topic notes for all four domains, comparison tables, the traps the exam likes to set, and a twenty‑item pre‑exam checklist.
The DP‑600 exam validates your ability to design, build, and deploy enterprise‑scale analytics solutions in Microsoft Fabric — spanning Lakehouse, Warehouse, semantic models, DAX, T‑SQL, and PySpark. It sits between data engineering and BI: you are the person who turns raw data into reusable, trusted analytics assets.
| Domain | Weight | Focus |
|---|---|---|
| 01 Plan, Implement & Manage a Solution | 10–15% | Workspaces, lifecycle, security |
| 02 Prepare & Serve Data | 40–45% | Lakehouse, Warehouse, ingest |
| 03 Implement & Manage Semantic Models | 20–25% | Star schema, DirectLake, DAX |
| 04 Explore & Analyze Data | 20–25% | SQL endpoints, profiling |
Prepare & Serve Data is by far the largest domain — roughly four in ten questions. Spend the most prep time on Lakehouse vs. Warehouse choice, Delta table maintenance, Dataflow Gen2, notebooks, and medallion architecture. Breadth is the challenge: the exam spans data‑engineer skills (Spark, Lakehouse) and BI skills (DAX, semantic models).
Core technologies: Fabric · OneLake · Lakehouse · Warehouse · Power BI
Question formats: Multiple choice · Drag‑and‑drop · Case studies
Languages: SQL · DAX · PySpark · Power Query (M)
Know the division of labor between Git integration (source control, diffing, PRs) and deployment pipelines (stage promotion with rules). They are complementary — Git tracks change, pipelines promote it.
| Requirement | Feature | Note |
|---|---|---|
| Isolate tenants by row | Row‑Level Security (RLS) | Security predicate; applies in Warehouse, Lakehouse SQL endpoint, and semantic models. |
| Hide sensitive columns | Column / Object‑Level Security | DENY SELECT on specific columns. |
| Mask PII at presentation | Dynamic Data Masking | Not encryption — DBAs still see real data. |
| Restrict folders in OneLake | OneLake file/folder ACLs | Enforced across every engine reading the path. |
| Protect downstream exports | Sensitivity labels (Purview) | Labels flow to PBIX, Excel, shared reports. |
| Signal trust | Endorsements | Promoted (author) or Certified (admin) — not a security control. |
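A hedged T‑SQL sketch of the first two rows of the table (row‑level and column‑level security) as they could be applied in a Warehouse. The security schema, UserTenant mapping table, FactSales, DimCustomer, TaxId column, and SalesAnalysts role are illustrative assumptions, not names from the exam.

-- Row-level security: an inline predicate function plus a security policy (illustrative names)
CREATE FUNCTION security.fn_TenantFilter (@TenantId INT)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
    SELECT 1 AS Allowed
    FROM dbo.UserTenant AS ut
    WHERE ut.TenantId = @TenantId
      AND ut.LoginName = USER_NAME();   -- match the signed-in user

CREATE SECURITY POLICY security.TenantPolicy
    ADD FILTER PREDICATE security.fn_TenantFilter(TenantId) ON dbo.FactSales;

-- Column-level security: hide a sensitive column from a role
DENY SELECT ON dbo.DimCustomer (TaxId) TO SalesAnalysts;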
| Item | Primary purpose | Owner‑persona |
|---|---|---|
| Lakehouse | Files + Delta tables over OneLake. | Data engineer + analytics eng. |
| Warehouse | T‑SQL DW with multi‑table transactions. | Analytics engineer |
| SQL endpoint | Read‑only T‑SQL over a Lakehouse. | Analyst / analytics eng. |
| Semantic model | Star‑schema model for Power BI. | Analytics engineer |
| Notebook | PySpark / SparkSQL transformations. | Analytics eng. / DE |
| Dataflow Gen2 | Low‑code Power Query ingest + shape. | Analyst |
| Data pipeline | Orchestration with copy, notebook, dataflow. | Analytics eng. / DE |
| Power BI report | Visual explore & distribute. | Analyst |
The analytics engineer's job is to turn Lakehouse/Warehouse tables into a reusable semantic model with clean names, clear relationships, and performant DAX. Everything on this page supports that goal.
| Aspect | Lakehouse | Warehouse |
|---|---|---|
| Storage | Delta Parquet (files + tables) | Delta (managed T‑SQL) |
| Primary language | Spark SQL / PySpark | T‑SQL |
| Multi‑table txn | No | Yes |
| Unstructured data | Yes (Files area) | No |
| DML via SQL endpoint | Read‑only | Full |
| Shortcuts in | Yes | Via OneLake |
| Best fit | Medallion ELT, ML, raw & curated data | Star‑schema DW, serve semantic model |
- Ingest into the Lakehouse with spark.read, Shortcuts, or Mirroring.
- Partition large fact tables on a low‑cardinality column (e.g., EventDate). Never partition by high‑cardinality keys — it creates tiny files.
- Load the Warehouse with INSERT … SELECT, MERGE, and stored procedures.
- Track a watermark column (LastModified) for incremental loads; use CDC / Change Tracking when available.
- SCD Type 1: overwrite in place with MERGE … WHEN MATCHED UPDATE.
- SCD Type 2: EffectiveFrom, EffectiveTo, IsCurrent. Close the prior row; insert the new one (a sketch follows the snippets below).
- Deduplicate with ROW_NUMBER() OVER (PARTITION BY natural_key ORDER BY ts DESC), keep rn = 1.
- Handle nulls with COALESCE defaults, or quarantine nulls.
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER
(PARTITION BY CustomerId
ORDER BY UpdatedAt DESC) rn
FROM stg.Customer
) t WHERE rn = 1;
-- pipeline expression
@{activity('LookupWM').output.firstRow.LastLoad}
SELECT * FROM src.Orders
WHERE ModifiedDate > @prevWatermark;
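A minimal T‑SQL sketch of the SCD Type 2 close‑and‑insert pattern from the list above. The dim.Customer and stg.Customer tables, the tracked City attribute, and the use of generic T‑SQL DML are illustrative assumptions; exact DML support varies by engine.

-- Step 1: close the current row when a tracked attribute changed (illustrative names)
UPDATE d
SET    d.EffectiveTo = GETDATE(),
       d.IsCurrent   = 0
FROM   dim.Customer AS d
JOIN   stg.Customer AS s
       ON d.CustomerId = s.CustomerId
WHERE  d.IsCurrent = 1
  AND  d.City <> s.City;

-- Step 2: insert the new version (covers changed and brand-new customers)
INSERT INTO dim.Customer (CustomerId, City, EffectiveFrom, EffectiveTo, IsCurrent)
SELECT s.CustomerId, s.City, GETDATE(), NULL, 1
FROM   stg.Customer AS s
LEFT JOIN dim.Customer AS d
       ON d.CustomerId = s.CustomerId
      AND d.IsCurrent = 1
WHERE  d.CustomerId IS NULL;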
When the exam asks "which tool for this transformation?" read for persona clues: analyst + UI → Dataflow Gen2 (M); large joins / ML → Notebook (PySpark); multi‑table transaction → Warehouse (T‑SQL).
- Partition Delta tables on a low‑cardinality date column (e.g., EventDate). High‑cardinality partitioning creates the small‑file problem.
- Avoid SELECT *; minimize data movement across distributions.

| Symptom | Remedy | Why |
|---|---|---|
| Many tiny files from streaming | OPTIMIZE | Compacts into larger Parquet files. |
| Slow filter on one column | Z‑ORDER BY col | Co‑locates matching rows. |
| Storage bill growing | VACUUM | Removes files past retention. |
| Slow read in Power BI DirectLake | Keep V‑Order on | Optimized layout for Vertipaq. |
| Query does full scan on a date filter | Partition on date_col | Partition pruning. |
OPTIMIZE ≠ VACUUM. OPTIMIZE compacts small files; VACUUM removes obsolete files after retention. The exam tests you on picking the right one for the stated symptom.
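The same remedies expressed as Spark SQL, as you might run them from a Fabric notebook; sales.fact_orders and CustomerId are placeholder names.

-- Compact small files and co-locate rows on a common filter column
OPTIMIZE sales.fact_orders ZORDER BY (CustomerId);

-- Remove unreferenced files older than the retention window (168 hours = 7 days)
VACUUM sales.fact_orders RETAIN 168 HOURS;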
- Activate an inactive relationship inside a measure with USERELATIONSHIP.
- Role‑playing dimensions: one Date dim referenced by OrderDate, ShipDate via multiple (inactive) relationships.

| Mode | How it works | Best for | Trade‑off |
|---|---|---|---|
| Import | Data loaded into VertiPaq (in‑memory). | Fast reports; complex DAX. | Needs refresh; memory‑bound. |
| DirectQuery | Every visual issues a query to source. | Live data; very large models. | Slower; pushes load to source. |
| DirectLake | Reads Delta directly from OneLake into VertiPaq on demand. | Fabric‑native; no refresh; near‑Import speed. | Fabric Lakehouse/Warehouse only; falls back to DirectQuery if rules violated. |
| Composite | Mix: Import fact + DQ dim, dual tables. | Hybrid scenarios. | Relationship limits; complexity. |
Framing refreshes the view of Delta for the model (DirectLake's counterpart to an Import refresh).

When the scenario mentions huge volumes + always‑current + Fabric source, the right answer is usually DirectLake. But if calculated columns or non‑Fabric sources are required, it must be Import or DirectQuery.
- Filter functions: FILTER, ALL, KEEPFILTERS, REMOVEFILTERS.
- Time intelligence: DATEADD, SAMEPERIODLASTYEAR, TOTALYTD, DATESBETWEEN. Require a marked Date table with contiguous dates.
- Iterators: SUMX, AVERAGEX — evaluate expression per row; avoid on huge tables when a plain SUM would do.
- Variables: VAR + RETURN — evaluate once, reuse, improve readability and performance.
- Context transition: CALCULATE converts row context to filter context; know this for iterator questions.
- Incremental refresh: RangeStart / RangeEnd parameters; set a policy for storing N years + refreshing M days. Power BI auto‑partitions.
- Dynamic RLS: filter a security table by the signed‑in user (USERPRINCIPALNAME()), as in the snippet below.
[Region] =
LOOKUPVALUE(
UserSecurity[Region],
UserSecurity[Email],
USERPRINCIPALNAME()
)
Sales PY :=
VAR _prev =
CALCULATE(
[Sales],
SAMEPERIODLASTYEAR('Date'[Date])
)
RETURN _prev
If a DAX measure is slow, first check whether you are iterating a large fact table. Replace SUMX(Fact, Fact[x]*Fact[y]) with a pre‑calculated column only if the model isn't DirectLake — otherwise push the math upstream into the Lakehouse/Warehouse.
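For example, the multiplication can be materialized once in the gold layer so the measure becomes a plain SUM. A sketch assuming hypothetical silver.Sales and gold.FactSales tables:

-- Pre-compute the line amount at load time instead of iterating with SUMX in DAX
CREATE TABLE gold.FactSales AS
SELECT
    OrderId,
    ProductKey,
    Quantity,
    UnitPrice,
    Quantity * UnitPrice AS LineAmount   -- SUM(LineAmount) replaces SUMX(Fact, x * y)
FROM silver.Sales;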
- Explore in notebooks with %%sql, %%pyspark, display() for rich previews, describe for profiling.

| Metric | T‑SQL / SparkSQL | Use |
|---|---|---|
| Row count | SELECT COUNT(*) | Sanity check after load. |
| Cardinality | COUNT(DISTINCT col) | Dedupe targets, key candidates. |
| Null ratio | AVG(CASE WHEN col IS NULL THEN 1.0 ELSE 0 END) | Missing‑data audit. |
| Min / max / avg | MIN / MAX / AVG | Range sanity + outlier hunt. |
| Top frequencies | GROUP BY col ORDER BY count DESC | Skew detection. |
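The checks above can be combined into a single pass; the table and column names (silver.Customer, CustomerId, Email, CreatedDate) are illustrative.

-- One-pass profile: row count, key cardinality, null ratio, and value range
SELECT
    COUNT(*)                                         AS TotalRows,
    COUNT(DISTINCT CustomerId)                       AS DistinctCustomers,
    AVG(CASE WHEN Email IS NULL THEN 1.0 ELSE 0 END) AS EmailNullRatio,
    MIN(CreatedDate)                                 AS MinCreated,
    MAX(CreatedDate)                                 AS MaxCreated
FROM silver.Customer;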
- queryinsights.exec_requests_history surfaces long‑running and frequent queries in the Warehouse.
- sys.dm_exec_* DMVs cover sessions, requests, waits.
- Profile interactively with display() in notebooks or the T‑SQL profiling queries above.
SUM(Amount) OVER (
PARTITION BY CustomerId
ORDER BY OrderDate
ROWS UNBOUNDED PRECEDING
) AS RunTotal
WITH r AS (
SELECT *, ROW_NUMBER() OVER
(PARTITION BY CategoryId
ORDER BY Sales DESC) rn
FROM Products
)
SELECT * FROM r WHERE rn <= 5;
MERGE dim.Customer AS t
USING stg.Customer AS s
ON t.CustomerKey = s.CustomerKey
WHEN MATCHED THEN UPDATE SET ...
WHEN NOT MATCHED THEN INSERT (...)
VALUES (...);
SELECT f.*, d.ProductName
FROM [SalesWH].dbo.FactSales f
JOIN [RetailLH].dbo.DimProduct d
ON f.ProductKey = d.ProductKey;
- Read: spark.read.format("delta").load(path) / spark.table("lh.schema.tbl").
- Transform: .filter(), .groupBy().agg(), .withColumn().
- Write: .write.format("delta").mode("append").saveAsTable(...).
- Broadcast small dimensions with F.broadcast(dim_df).
- Prefer built‑in F functions over UDFs; avoid collect() on large sets.

Windowed dedup, top‑N, running totals, and MERGE upsert are the four SQL patterns most likely to appear in drag‑and‑drop questions. Know what each one looks like at a glance.
| Feature | Lakehouse | Warehouse | KQL Database | SQL DB in Fabric |
|---|---|---|---|---|
| Primary workload | Files + Delta tables | T‑SQL DW | Real‑time / telemetry | OLTP |
| Language | Spark SQL / PySpark | T‑SQL | KQL | T‑SQL |
| SQL writes | read‑only endpoint | full | Append | full |
| Multi‑table txn | no | yes | no | yes |
| Unstructured files | yes | no | no | no |
| Best for | Medallion, ML, raw+curated | Star schema, gold serving | Logs, IoT, events | App backend |
| Mode | Data location | Refresh model | Choose when… |
|---|---|---|---|
| Import | In‑memory Vertipaq | Scheduled | Fastest reports, complex DAX, size fits memory. |
| DirectQuery | Stays in source | On query | Real‑time, very large, can't import. |
| DirectLake | Delta in OneLake, loaded on demand | Framing | Fabric‑native, always current, near‑Import speed. |
| Composite | Mixed (Dual tables) | Mixed | Hybrid: Import facts + DQ dims. |
| Language | Strength | Typical item |
|---|---|---|
| Power Query (M) | Low‑code UI shaping. | Dataflow Gen2 |
| PySpark / Spark SQL | Distributed compute, ML. | Notebook |
| T‑SQL | Set ops, transactions, views. | Warehouse, SQL endpoint (read) |
| DAX | Analytical measures in model. | Semantic model |
The exam uses plausible‑sounding options to test depth of understanding. Twelve of the most common traps, with corrections.
Review the night before. If any item feels unfamiliar, revisit that topic in the guide.
- Dynamic RLS with USERPRINCIPALNAME() and a security table.
- Incremental refresh with RangeStart/RangeEnd and detect‑changes.

Source: Microsoft Learn — Study Guide for Exam DP‑600: Implementing Analytics Solutions Using Microsoft Fabric.