A consolidated, scenario‑first reference organized around the three official exam domains. Every topic is paired with the decisions and distractors you will actually encounter on the exam.
Seven chapters covering the exam blueprint, deep topic notes, comparison tables, the traps this exam likes to set, and a twenty‑item pre‑exam checklist.
The DP‑800 exam validates your ability to design, develop, secure, optimize, and deploy AI‑enabled database solutions across SQL Server, Azure SQL, and SQL databases in Microsoft Fabric.
| Domain | Weight | Focus |
|---|---|---|
| 01 Design & Develop Database Solutions | 35 – 40% | Schema, T‑SQL, AI tooling |
| 02 Secure, Optimize & Deploy | 35 – 40% | Security, perf, CI/CD, Azure |
| 03 Implement AI Capabilities | 25 – 30% | Embeddings, search, RAG |
Most questions cover GA features, but commonly used Preview features — T‑SQL vector functions, AI_* functions — may appear. Expect heavy case‑study and scenario‑based questions.
SQL Server · Azure SQL · SQL DB in Fabric
Multiple choice · Drag‑and‑drop · Case studies
100 min · 40–60 items · 700 / 1000 to pass
NVARCHAR(MAX) vs. fixed‑length — and the storage / performance impact.MEMORY_OPTIMIZED) for OLTP‑heavy workloads; temporal for point‑in‑time audit; ledger for tamper‑evidence; graph for relationship data; external for data outside SQL.PRIMARY KEY, FOREIGN KEY, UNIQUE, CHECK, DEFAULT.SEQUENCE objects generate ordered numeric values independently of any table — prefer over IDENTITY when you need values before INSERT or across tables.TRY/CATCH, transaction management.AFTER, INSTEAD OF); DDL triggers monitor schema changes. Use judiciously — they add hidden complexity.Know the difference between inline TVFs (single SELECT, no BEGIN/END) and multi‑statement TVFs. The exam often presents performance scenarios where replacing a multi‑statement TVF with an inline TVF is the correct fix.
A code‑heavy sub‑domain. Expect to read and write queries using modern T‑SQL features.
ROW_NUMBER, RANK, DENSE_RANK, NTILE for ranking; LAG/LEAD for row comparison; SUM/AVG OVER for running totals.JSON_OBJECT, JSON_ARRAY, JSON_ARRAYAGG, OPENJSON, JSON_VALUE, JSON_CONTAINS.EXISTS / NOT EXISTS. Know when they outperform JOINs.TRY…CATCH, THROW, RAISERROR, XACT_ABORT, ERROR_NUMBER(), ERROR_MESSAGE(), ERROR_LINE().REGEXP_LIKE, REGEXP_REPLACE, REGEXP_SUBSTR, REGEXP_INSTR, REGEXP_COUNT, REGEXP_MATCHES, REGEXP_SPLIT_TO_TABLE.EDIT_DISTANCE (Levenshtein), EDIT_DISTANCE_SIMILARITY (%), JARO_WINKLER_DISTANCE. Dedup, data cleansing, approximate matching.NODE / EDGE tables with the MATCH operator for pattern matching across relationships.The new regex and fuzzy‑matching functions are heavily tested. Practice combining REGEXP_LIKE in WHERE clauses for data validation, and EDIT_DISTANCE for approximate‑match scenarios in customer data.
A new, distinctive sub‑domain covering Copilot integration and the Model Context Protocol.
.github/copilot‑instructions.md for project‑specific standards.tbl_ prefix, snake_case columns) → create a copilot‑instructions.md file with explicit rules.| Object | Choose when… | Watch out for |
|---|---|---|
| Temporal table | You need point‑in‑time queries and a history of changes. | History is mutable by admins — not tamper‑evident. |
| Ledger table | You need cryptographic tamper‑evidence for compliance or regulatory audit. | Heavier write path; schema changes more restricted. |
| In‑memory (OLTP) | Write‑heavy OLTP where latching / locking is the bottleneck. | Durability trade‑offs; memory footprint. |
| Graph (node/edge) | Queries traverse relationships — social graph, fraud detection. | Avoid for purely relational workloads. |
| External table | Query data that lives outside SQL without importing it. | Latency and per‑query data movement. |
| Clustered columnstore | Pure analytical / reporting workload on large tables. | Not ideal for frequent point updates. |
| Nonclustered columnstore | Mix of OLTP writes with occasional analytics (HTAP). | Maintenance overhead on the rowstore. |
| Need | Function | Note |
|---|---|---|
| Scalar from JSON | JSON_VALUE(doc,'$.path') | Scalars only — not objects. |
| Object / array from JSON | JSON_QUERY, OPENJSON | OPENJSON shreds to rows. |
| Build JSON | JSON_OBJECT, JSON_ARRAY, JSON_ARRAYAGG | Aggregates rows → array. |
| Regex test | REGEXP_LIKE(col,'pattern') | Use in WHERE for validation. |
| Approximate string match | EDIT_DISTANCE, JARO_WINKLER_DISTANCE | Dedup, fuzzy joins. |
| Call external REST | sp_invoke_external_rest_endpoint | Core of RAG pipelines. |
| Similarity over vectors | VECTOR_DISTANCE, VECTOR_SEARCH | ENN without index, ANN with. |
Exam questions rarely ask "what does function X do?" They ask "given this scenario, which feature do you reach for?" — read every question as a decision tree.
Security is foundational: encryption, masking, row‑ and object‑level access, auditing, endpoint security.
EncryptByKey/DecryptByKey. Keys managed in the database — less secure than AE but more flexible server‑side.UNMASK see real data.SESSION_CONTEXT or SYSTEM_USER.GRANT, DENY, REVOKE. Principle of least privilege.sys.dm_exec_query_stats, sys.dm_exec_requests, sys.dm_os_wait_stats, sys.dm_db_index_usage_stats.sys.dm_exec_requests + sys.dm_tran_locks to find blockers. Deadlock graphs via Extended Events or system_health. Resolve with shorter transactions, consistent access order, SNAPSHOT.| Level | Dirty read | Non‑repeatable | Phantom | Blocking |
|---|---|---|---|---|
READ UNCOMMITTED | yes | yes | yes | Minimal |
READ COMMITTED (default) | no | yes | yes | Low |
REPEATABLE READ | no | no | yes | Moderate |
SERIALIZABLE | no | no | no | High |
SNAPSHOT | no | no | no | None · row vers. |
SNAPSHOT isolation eliminates blocking by using row versioning in tempdb — but increases tempdb usage. The exam loves to test when SNAPSHOT is the right choice versus SERIALIZABLE.
.sqlproj) for cross‑platform builds.MERGE for lookup tables, stored alongside schema..sql files.CODEOWNERS for governance.| Mechanism | Use case | Granularity |
|---|---|---|
| Change Data Capture (CDC) | Full change history for ETL / audit. | Row‑level, all columns |
| Change Tracking | Lightweight sync — "what changed?" | Row‑level, flag only |
| Change Event Streaming (CES) | Real‑time streaming to downstream systems. | Event‑based |
| Azure Functions + SQL trigger | Serverless reaction to row changes. | Row‑level, code‑driven |
| Azure Logic Apps | Low‑code workflow on data changes. | Row‑level, connector |
sp_invoke_external_rest_endpoint or native AI functions). Store vectors in VECTOR columns.The exam tests your ability to choose the right embedding maintenance strategy. Triggers are simplest but add latency to writes. CDC + Azure Functions is asynchronous and scalable. Choose based on latency tolerance and write throughput.
CONTAINS, FREETEXT.VECTOR storage; configure dimensionality; supports vector indexes for fast approximate search.VECTOR_DISTANCE, VECTOR_NORMALIZE, VECTORPROPERTY, VECTOR_SEARCH.| Search type | Matches on | Best for | Limitation |
|---|---|---|---|
| Full‑Text | Keywords / linguistic forms | Exact term / phrase lookup | No semantic understanding |
| Vector | Meaning / similarity | Concept search, synonym handling | Needs embeddings; approximate |
| Hybrid (RRF) | Keywords + meaning | Production search apps | More complex setup |
A pattern that grounds language‑model responses in your data. Know the end‑to‑end flow — the exam tests each step individually and as a complete pipeline.
sp_invoke_external_rest_endpoint calls the model.OPENJSON + JSON_VALUE parse the response.sp_invoke_external_rest_endpoint. The T‑SQL stored procedure for calling external REST APIs (including Azure OpenAI). Sends prompts, receives completions.FOR JSON PATH/AUTO, JSON_OBJECT, JSON_ARRAY) to include as context in the prompt.sp_invoke_external_rest_endpoint.OPENJSON and JSON_VALUE pull out the generated text.| Feature | SQL Server · on‑prem | Azure SQL Database | SQL DB in Fabric |
|---|---|---|---|
| Deployment | Self‑managed VM / bare metal | Fully managed PaaS | Fabric workspace |
| Scaling | Manual — add CPU/RAM | DTU or vCore auto‑scale | Fabric capacity units |
| AI integration | External REST calls | Native + REST endpoints | Deep Fabric AI/ML |
| Copilot | GitHub Copilot in IDE | GitHub Copilot + portal | Copilot in Fabric |
| Graph tables | yes | yes | yes |
| Ledger tables | 2022+ | yes | yes |
| Vector support | preview | preview / GA | preview / GA |
| CI/CD | SQL Database Projects | SQL Database Projects | Fabric Git integration |
| Method | Key location | Server can read? | Use case |
|---|---|---|---|
| Always Encrypted | Client only | no | Protect from DBAs & cloud admins |
| AE + Enclaves | Secure enclave | In enclave only | AE + range queries, LIKE |
| Column‑level (T‑SQL) | Database | yes | Flexible server‑side processing |
| TDE | Server / HSM | Yes (data at rest) | Full‑database at‑rest encryption |
| Dynamic Data Masking | N/A — no encryption | yes | Presentation‑layer obfuscation |
| Aspect | Data API Builder | Custom API · App Service / Functions |
|---|---|---|
| Code required | Zero — config‑driven | Full application code |
| Protocols | REST + GraphQL | Any |
| Setup time | Minutes | Hours to days |
| Customization | Limited (config options) | Unlimited |
| Auth | Entra ID, API keys | Any auth provider |
| Best for | Standard CRUD over DB objects | Complex business logic |
The exam uses plausible‑sounding options to test depth of understanding. Below are the twelve most common traps — read each distractor, then the correction.
UNMASK or db_owner see real data. For true protection use Always Encrypted.INSERT and works across multiple tables. IDENTITY is table‑bound.tempdb — I/O and space cost. Prevents blocking but not write‑write conflicts.VECTOR_DISTANCE works without an index (ENN). Indexes enable ANN for performance, but are not mandatory.JSON_VALUE returns scalars only. Use OPENJSON or JSON_QUERY to extract objects or arrays.Review the night before. If any item feels unfamiliar, revisit that topic in the guide before going to bed.
JSON_OBJECT, JSON_ARRAY, JSON_ARRAYAGG, OPENJSON, JSON_VALUE, JSON_CONTAINS.ROW_NUMBER, RANK, LAG/LEAD, running SUM.REGEXP_LIKE, REGEXP_REPLACE, …) and can write basic patterns.EDIT_DISTANCE and JARO_WINKLER_DISTANCE.MATCH query.VECTOR_DISTANCE / VECTOR_SEARCH.Microsoft Learn — Study Guide for Exam DP‑800.