Microsoft Certified · SQL AI Developer Associate

Exam DP‑800
Study Guide

Developing AI‑Enabled Database Solutions across SQL Server, Azure SQL, and SQL DB in Fabric.

A consolidated, scenario‑first reference organized around the three official exam domains. Every topic is paired with the decisions and distractors you will actually encounter on the exam.

Domain weighting

Design & Develop · 35–40%
Secure, Optimize, Deploy · 35–40%
AI Capabilities · 25–30%

Exam code: DP‑800
Duration: 100 min exam / 120 min seat time
Questions: 40–60
Passing score: 700 / 1000

00 Contents

What's inside.

Seven chapters covering the exam blueprint, deep topic notes, comparison tables, the traps this exam likes to set, and a twenty‑item pre‑exam checklist.

01 Exam Domains & Weightage: Blueprint, audience profile, and what the exam actually tests.
02 Domain 1 · Design & Develop: Tables, programmability, advanced T‑SQL, AI‑assisted tooling.
03 Domain 2 · Secure, Optimize, Deploy: Security, performance, CI/CD, Azure integration.
04 Domain 3 · AI Capabilities: Embeddings, intelligent search, RAG in T‑SQL.
05 Service Comparison Tables: SQL platforms, encryption methods, DAB vs. custom APIs.
06 Common Pitfalls & Distractors: Twelve plausible‑but‑wrong answers the exam loves.
07 Final Checklist · 20 Must‑Knows: Night‑before review list.

01 Official Exam Domains & Weightage

The blueprint.

The DP‑800 exam validates your ability to design, develop, secure, optimize, and deploy AI‑enabled database solutions across SQL Server, Azure SQL, and SQL databases in Microsoft Fabric.

Audience profile

Subject matter expertise in designing AI‑enabled database solutions across Microsoft SQL platforms.
Experience writing T‑SQL and developing databases on SQL Server, Azure SQL, and Fabric SQL DB.
Familiarity with CI/CD in GitHub and AI‑assisted tools — GitHub Copilot, Copilot in Fabric.
Understanding of AI concepts: embeddings, vectors, models, and retrieval‑augmented generation (RAG).

Weighting

Domain	Weight	Focus
01 Design & Develop Database Solutions	35 – 40%	Schema, T‑SQL, AI tooling
02 Secure, Optimize & Deploy	35 – 40%	Security, perf, CI/CD, Azure
03 Implement AI Capabilities	25 – 30%	Embeddings, search, RAG

Exam alert

Most questions cover GA features, but commonly used Preview features — T‑SQL vector functions, AI_* functions — may appear. Expect heavy case‑study and scenario‑based questions.

Format at a glance

Platforms

SQL Server · Azure SQL · SQL DB in Fabric

Question types

Multiple choice · Drag‑and‑drop · Case studies

Duration & scoring

100 min · 40–60 items · 700 / 1000 to pass

02 Domain 1 · 35–40%

Design & develop database solutions.

2.1 Design & implement database objects

Core concepts

Table design. Appropriate data types and column sizes. When NVARCHAR(MAX) vs. fixed‑length — and the storage / performance impact.
Specialized tables. In‑memory (MEMORY_OPTIMIZED) for OLTP‑heavy workloads; temporal for point‑in‑time audit; ledger for tamper‑evidence; graph for relationship data; external for data outside SQL.
JSON columns & indexes. Store semi‑structured data natively. Index computed columns on JSON paths for performant queries.
Columnstore. Clustered replaces the heap for analytics; nonclustered adds analytics to OLTP tables (HTAP).
Partitioning. Partition large tables by range (e.g. date) via a partition function + scheme mapped to filegroups.
Constraints. PRIMARY KEY, FOREIGN KEY, UNIQUE, CHECK, DEFAULT.
Sequences. SEQUENCE objects generate ordered numeric values independently of any table — prefer over IDENTITY when you need values before INSERT or across tables.
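
The SEQUENCE point above can be sketched as follows. This is a minimal illustration, not a prescribed pattern: `dbo.OrderNumbers`, `dbo.WebOrders`, and `dbo.StoreOrders` are hypothetical names that do not come from the guide. It shows a value being obtained before the INSERT and a single sequence numbering rows in two tables. `GO` is the batch separator used by SSMS and sqlcmd.

```sql
-- Hypothetical objects, for illustration only.
CREATE SEQUENCE dbo.OrderNumbers AS int START WITH 1000 INCREMENT BY 1;
GO
CREATE TABLE dbo.WebOrders   (OrderNo int PRIMARY KEY, Customer nvarchar(100) NOT NULL);
CREATE TABLE dbo.StoreOrders (OrderNo int PRIMARY KEY, StoreId int NOT NULL);
GO
-- Obtain the value before the INSERT, so the caller knows the key up front.
DECLARE @next int = NEXT VALUE FOR dbo.OrderNumbers;
INSERT INTO dbo.WebOrders (OrderNo, Customer) VALUES (@next, N'Contoso');

-- The same sequence can number rows in a second table.
INSERT INTO dbo.StoreOrders (OrderNo, StoreId)
VALUES (NEXT VALUE FOR dbo.OrderNumbers, 42);
```

An IDENTITY column cannot do either of these things, because its value is generated by the table at insert time.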

If → the business needs an immutable audit trail that regulators can verify → use a Ledger table. Temporal tables track history but are mutable by admins; ledger tables provide cryptographic tamper‑evidence.

If → the workload mixes heavy OLTP writes with occasional analytics → add a nonclustered columnstore index for real‑time operational analytics (HTAP).

If → JSON documents are queried by a known set of paths → create computed columns on those paths and index them, rather than scanning JSON at query time.
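
A minimal sketch of the JSON rule above, assuming a hypothetical `dbo.Events` table whose `Payload` column holds documents with a `$.customer.city` path (both the table and the path are illustrative assumptions).

```sql
-- Hypothetical table; Payload stores one JSON document per row.
CREATE TABLE dbo.Events (
    EventId int IDENTITY PRIMARY KEY,
    Payload nvarchar(max) NOT NULL CHECK (ISJSON(Payload) = 1),
    -- Computed column over one known JSON path. JSON_VALUE returns
    -- nvarchar(4000), so cast to a narrower type to keep the index key small.
    CustomerCity AS CAST(JSON_VALUE(Payload, '$.customer.city') AS nvarchar(100))
);

-- Index the computed column so lookups on that path can seek.
CREATE INDEX IX_Events_CustomerCity ON dbo.Events (CustomerCity);

-- This predicate can use the index instead of parsing every document.
SELECT EventId
FROM dbo.Events
WHERE CustomerCity = N'Seattle';
```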

2.2 Implement programmability objects

Views. Simplify joins, enforce column‑level security, provide backward compatibility. Indexed views materialize results.
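Scalar UDFs. Return a single value. Scalar UDF inlining (SQL Server 2019+) lets the optimizer inline eligible scalar UDFs into the calling query; UDFs that are not inlined run row by row and hurt performance.
Table‑valued functions. Inline TVFs are preferred (the optimizer expands them like views). Multi‑statement TVFs are opaque to the optimizer and often get poor cardinality estimates (a fixed guess unless interleaved execution applies); avoid them for large data.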
Stored procedures. Encapsulate logic server‑side; parameterized execution, TRY/CATCH, transaction management.
Triggers. DML triggers (AFTER, INSTEAD OF); DDL triggers monitor schema changes. Use judiciously — they add hidden complexity.

Exam alert

Know the difference between inline TVFs (single SELECT, no BEGIN/END) and multi‑statement TVFs. The exam often presents performance scenarios where replacing a multi‑statement TVF with an inline TVF is the correct fix.
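
To make the distinction concrete, here is the same logic written both ways. The table `dbo.SalesOrders` and both function names are hypothetical, introduced only for this sketch.

```sql
-- Hypothetical base table for the two functions below.
CREATE TABLE dbo.SalesOrders (
    OrderId int PRIMARY KEY,
    CustomerId int NOT NULL,
    Total decimal(18,2) NOT NULL
);
GO
-- Multi-statement TVF: BEGIN/END plus a table variable.
-- The optimizer treats the body as a black box.
CREATE FUNCTION dbo.fn_OrdersForCustomer_Multi (@CustomerId int)
RETURNS @result TABLE (OrderId int, Total decimal(18,2))
AS
BEGIN
    INSERT INTO @result (OrderId, Total)
    SELECT OrderId, Total FROM dbo.SalesOrders WHERE CustomerId = @CustomerId;
    RETURN;
END;
GO
-- Inline TVF: a single SELECT, no BEGIN/END.
-- The optimizer expands it into the calling query, like a view.
CREATE FUNCTION dbo.fn_OrdersForCustomer (@CustomerId int)
RETURNS TABLE
AS
RETURN
    SELECT OrderId, Total FROM dbo.SalesOrders WHERE CustomerId = @CustomerId;
GO
SELECT OrderId, Total FROM dbo.fn_OrdersForCustomer(17);
```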


2.3 Write advanced T‑SQL

A code‑heavy sub‑domain. Expect to read and write queries using modern T‑SQL features.

Language features

CTEs. Recursive CTEs for hierarchical data (org charts, BOMs). Anchor + recursive member pattern.
Window functions. ROW_NUMBER, RANK, DENSE_RANK, NTILE for ranking; LAG/LEAD for row comparison; SUM/AVG OVER for running totals.
JSON. JSON_OBJECT, JSON_ARRAY, JSON_ARRAYAGG, OPENJSON, JSON_VALUE, JSON_CONTAINS.
Correlated subqueries. EXISTS / NOT EXISTS. Know when they outperform JOINs.
Error handling. TRY…CATCH, THROW, RAISERROR, XACT_ABORT, ERROR_NUMBER(), ERROR_MESSAGE(), ERROR_LINE().
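
A short sketch combining two of the features above: a recursive CTE (anchor plus recursive member) and a window function. The `dbo.Employees` table and its rows are made up for the example.

```sql
-- Hypothetical org-chart table with a self-referencing manager column.
CREATE TABLE dbo.Employees (
    EmployeeId int PRIMARY KEY,
    ManagerId int NULL REFERENCES dbo.Employees (EmployeeId),
    Name nvarchar(100) NOT NULL
);
INSERT INTO dbo.Employees (EmployeeId, ManagerId, Name) VALUES
    (1, NULL, N'Ava'), (2, 1, N'Ben'), (3, 1, N'Cy'), (4, 2, N'Dee');

WITH OrgChart AS (
    -- Anchor member: the root of the hierarchy (no manager).
    SELECT EmployeeId, ManagerId, Name, 0 AS Depth
    FROM dbo.Employees
    WHERE ManagerId IS NULL
    UNION ALL
    -- Recursive member: direct reports of rows found so far.
    SELECT e.EmployeeId, e.ManagerId, e.Name, oc.Depth + 1
    FROM dbo.Employees AS e
    JOIN OrgChart AS oc ON e.ManagerId = oc.EmployeeId
)
SELECT EmployeeId, Name, Depth,
       -- Window function: position of each employee within their level.
       ROW_NUMBER() OVER (PARTITION BY Depth ORDER BY Name) AS PositionInLevel
FROM OrgChart
ORDER BY Depth, Name;
```

With these sample rows, Ava sits at depth 0, Ben and Cy at depth 1, and Dee at depth 2.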

Text & graph

Regular expressions. REGEXP_LIKE, REGEXP_REPLACE, REGEXP_SUBSTR, REGEXP_INSTR, REGEXP_COUNT, REGEXP_MATCHES, REGEXP_SPLIT_TO_TABLE.
Fuzzy matching. EDIT_DISTANCE (Levenshtein), EDIT_DISTANCE_SIMILARITY (%), JARO_WINKLER_DISTANCE. Dedup, data cleansing, approximate matching.
Graph queries. NODE / EDGE tables with the MATCH operator for pattern matching across relationships.
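
The graph bullet above can be illustrated with a two-hop MATCH pattern. `dbo.Person` and `dbo.Follows` are hypothetical node and edge tables used only for this sketch.

```sql
-- Hypothetical node and edge tables.
CREATE TABLE dbo.Person (PersonId int PRIMARY KEY, Name nvarchar(100) NOT NULL) AS NODE;
CREATE TABLE dbo.Follows AS EDGE;

INSERT INTO dbo.Person (PersonId, Name) VALUES (1, N'Ava'), (2, N'Ben'), (3, N'Cy');

-- Edges store the $node_id of each endpoint: Ava -> Ben, Ben -> Cy.
INSERT INTO dbo.Follows VALUES
    ((SELECT $node_id FROM dbo.Person WHERE PersonId = 1),
     (SELECT $node_id FROM dbo.Person WHERE PersonId = 2));
INSERT INTO dbo.Follows VALUES
    ((SELECT $node_id FROM dbo.Person WHERE PersonId = 2),
     (SELECT $node_id FROM dbo.Person WHERE PersonId = 3));

-- Two-hop pattern: who is followed by someone Ava follows?
SELECT p3.Name
FROM dbo.Person AS p1, dbo.Follows AS f1, dbo.Person AS p2,
     dbo.Follows AS f2, dbo.Person AS p3
WHERE MATCH(p1-(f1)->p2-(f2)->p3)
  AND p1.Name = N'Ava';
```

Note the comma-separated FROM list: MATCH works over that older join style rather than over explicit JOIN clauses.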

Pro tip

The new regex and fuzzy‑matching functions are heavily tested. Practice combining REGEXP_LIKE in WHERE clauses for data validation, and EDIT_DISTANCE for approximate‑match scenarios in customer data.
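
A hedged sketch of the two practice scenarios in the tip. It assumes a platform where `REGEXP_LIKE` and `EDIT_DISTANCE` are available (they are recent additions, and may require a current compatibility level or preview features to be enabled). The sample rows and the email pattern are illustrative and deliberately simple, not a complete email validator.

```sql
-- Sample data in a table variable; names and values are made up.
DECLARE @Customers TABLE (CustomerId int, Name nvarchar(100), Email nvarchar(256));
INSERT INTO @Customers (CustomerId, Name, Email) VALUES
    (1, N'Jon Smith',   N'jon.smith@example.com'),
    (2, N'John Smith',  N'john.smith@example'),
    (3, N'Maria Lopez', N'maria@example.org');

-- Validation: rows whose email does not look like local@domain.tld.
SELECT CustomerId, Email
FROM @Customers
WHERE NOT REGEXP_LIKE(Email, N'^[^@\s]+@[^@\s]+\.[A-Za-z]{2,}$');

-- Approximate match: candidate duplicate pairs whose names are
-- within two edits of each other.
SELECT a.CustomerId,
       b.CustomerId AS PossibleDuplicateOf,
       EDIT_DISTANCE(a.Name, b.Name) AS Edits
FROM @Customers AS a
JOIN @Customers AS b ON a.CustomerId < b.CustomerId
WHERE EDIT_DISTANCE(a.Name, b.Name) <= 2;
```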

2.4 Design SQL solutions with AI‑assisted tools

A new, distinctive sub‑domain covering Copilot integration and the Model Context Protocol.

Core concepts

Security impact. Copilot suggestions may expose sensitive schema or data patterns. Configure policies to limit data exposure.
GitHub Copilot for SQL. Enable in VS Code / Azure Data Studio. Chat‑based prompts to generate T‑SQL. Respects .github/copilot‑instructions.md for project‑specific standards.
Copilot in Fabric. AI assistance inside the Fabric SQL editor. Works with Fabric SQL databases.
MCP — Model Context Protocol. A standard for connecting AI models to data tools. Configure MCP tool options in Copilot chat; connect to MCP server endpoints for SQL Server and Fabric lakehouse.
Instruction files. Customize naming conventions, prohibited patterns, preferred T‑SQL constructs.

If → the team wants Copilot to follow company naming conventions (e.g. tbl_ prefix, snake_case columns) → create a copilot‑instructions.md file with explicit rules.

If → a developer needs AI‑assisted queries over Fabric lakehouse data → configure an MCP server endpoint pointing to the lakehouse.

02 · card · Table & object decision guide

When to reach for which object.

Object	Choose when…	Watch out for
Temporal table	You need point‑in‑time queries and a history of changes.	History is mutable by admins — not tamper‑evident.
Ledger table	You need cryptographic tamper‑evidence for compliance or regulatory audit.	Heavier write path; schema changes more restricted.
In‑memory (OLTP)	Write‑heavy OLTP where latching / locking is the bottleneck.	Durability trade‑offs; memory footprint.
Graph (node/edge)	Queries traverse relationships — social graph, fraud detection.	Avoid for purely relational workloads.
External table	Query data that lives outside SQL without importing it.	Latency and per‑query data movement.
Clustered columnstore	Pure analytical / reporting workload on large tables.	Not ideal for frequent point updates.
Nonclustered columnstore	Mix of OLTP writes with occasional analytics (HTAP).	Maintenance overhead on the rowstore.

T‑SQL function cheat‑lines

Need	Function	Note
Scalar from JSON	`JSON_VALUE(doc,'$.path')`	Scalars only — not objects.
Object / array from JSON	`JSON_QUERY`, `OPENJSON`	OPENJSON shreds to rows.
Build JSON	`JSON_OBJECT`, `JSON_ARRAY`, `JSON_ARRAYAGG`	Aggregates rows → array.
Regex test	`REGEXP_LIKE(col,'pattern')`	Use in WHERE for validation.
Approximate string match	`EDIT_DISTANCE`, `JARO_WINKLER_DISTANCE`	Dedup, fuzzy joins.
Call external REST	`sp_invoke_external_rest_endpoint`	Core of RAG pipelines.
Similarity over vectors	`VECTOR_DISTANCE`, `VECTOR_SEARCH`	VECTOR_DISTANCE is exact (ENN, no index); VECTOR_SEARCH is approximate (ANN, uses a vector index).

Mental model

Exam questions rarely ask "what does function X do?" They ask "given this scenario, which feature do you reach for?" — read every question as a decision tree.

03 Domain 2 · 35–40%

Secure, optimize, & deploy.

3.1 Data security & compliance

Security is foundational: encryption, masking, row‑ and object‑level access, auditing, endpoint security.

Always Encrypted. Column‑level encryption where keys never leave the client. Deterministic allows equality comparisons; randomized does not. Enclaves extend to range comparisons and pattern matching.
Column‑level encryption (T‑SQL). EncryptByKey/DecryptByKey. Keys managed in the database — less secure than AE but more flexible server‑side.
Dynamic Data Masking. Presentation‑layer masking only. Does not encrypt. Mask types: default, email, partial, random. Users with UNMASK see real data.
Row‑Level Security. Filter predicates (read) and block predicates (write) via security policies bound to inline TVFs. Identify the user with SESSION_CONTEXT or SYSTEM_USER.
Object‑level permissions. GRANT, DENY, REVOKE. Principle of least privilege.
Passwordless auth. Microsoft Entra ID, Managed Identity for service‑to‑service. Eliminates connection‑string secrets.
Auditing. SQL Server Audit (server + database); Azure SQL Auditing writes to Storage, Log Analytics, or Event Hubs.
Endpoint security. Secure model, GraphQL, REST, and MCP endpoints with Managed Identity + auth policies.
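
As a small illustration of the Dynamic Data Masking bullet, the sketch below masks two columns and shows how UNMASK changes what a user sees. `dbo.Customers` and `SupportUser` are hypothetical names.

```sql
-- Hypothetical table with masked columns.
CREATE TABLE dbo.Customers (
    CustomerId int IDENTITY PRIMARY KEY,
    Email nvarchar(256) MASKED WITH (FUNCTION = 'email()') NOT NULL,
    Phone varchar(20)   MASKED WITH (FUNCTION = 'partial(0,"XXX-XXX-",4)') NULL
);
INSERT INTO dbo.Customers (Email, Phone) VALUES (N'ava@example.com', '555-123-4567');

-- Hypothetical low-privilege user.
CREATE USER SupportUser WITHOUT LOGIN;
GRANT SELECT ON dbo.Customers TO SupportUser;

EXECUTE AS USER = 'SupportUser';
SELECT Email, Phone FROM dbo.Customers;   -- returns masked values
REVERT;

-- With UNMASK the same query returns the real data: masking is not encryption.
GRANT UNMASK TO SupportUser;
```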

If → sensitive data (SSN, credit card) must be protected even from DBAs → use Always Encrypted. DDM is not sufficient: DBAs can still query the real data.

If → multi‑tenant apps need row isolation per tenant → implement Row‑Level Security with a filter predicate keyed on tenant ID.

If → applications need to connect to Azure SQL without storing credentials → use Managed Identity.
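
The multi-tenant rule above might look like this in practice. This is a sketch under assumptions: the `Security` schema, `dbo.TenantOrders`, and the `TenantId` session key are hypothetical, and the application is assumed to set the session context after connecting.

```sql
-- Hypothetical multi-tenant table.
CREATE TABLE dbo.TenantOrders (
    OrderId int PRIMARY KEY,
    TenantId int NOT NULL,
    Total decimal(18,2) NOT NULL
);
GO
CREATE SCHEMA Security;
GO
-- Inline TVF used as the predicate: a row qualifies only when its
-- TenantId equals the value the application stored in SESSION_CONTEXT.
CREATE FUNCTION Security.fn_TenantPredicate (@TenantId int)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
    SELECT 1 AS fn_result
    WHERE @TenantId = CAST(SESSION_CONTEXT(N'TenantId') AS int);
GO
CREATE SECURITY POLICY Security.TenantPolicy
    -- Filter predicate: silently hides other tenants' rows on read.
    ADD FILTER PREDICATE Security.fn_TenantPredicate(TenantId) ON dbo.TenantOrders,
    -- Block predicate: rejects inserts for another tenant.
    ADD BLOCK PREDICATE Security.fn_TenantPredicate(TenantId) ON dbo.TenantOrders AFTER INSERT
WITH (STATE = ON);
GO
-- The application sets the tenant once per connection.
EXEC sp_set_session_context @key = N'TenantId', @value = 7;
SELECT OrderId, Total FROM dbo.TenantOrders;   -- only tenant 7 rows are visible
```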

3.2 Optimize database performance

Execution plans. Read actual vs. estimated. Look for key lookups, table scans, hash joins on large tables, implicit conversions.
DMVs. sys.dm_exec_query_stats, sys.dm_exec_requests, sys.dm_os_wait_stats, sys.dm_db_index_usage_stats.
Query Store. Plans + runtime stats over time; plan forcing, regression detection, A/B comparison. SQL Server 2016+ and Azure SQL.
Query Performance Insight. Azure‑portal view over Query Store — top resource consumers.
Blocking & deadlocks. sys.dm_exec_requests + sys.dm_tran_locks to find blockers. Deadlock graphs via Extended Events or system_health. Resolve with shorter transactions, consistent access order, SNAPSHOT.
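
A minimal diagnostic sketch for the blocking bullet, using only the DMVs named above plus `sys.dm_exec_sql_text`. It assumes the caller has permission to view server or database state.

```sql
-- Requests that are currently blocked, with the session blocking them.
SELECT r.session_id,
       r.blocking_session_id,
       r.wait_type,
       r.wait_time AS wait_ms,
       t.text      AS running_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0
ORDER BY r.wait_time DESC;

-- Lock requests that are still waiting to be granted.
SELECT l.request_session_id,
       l.resource_type,
       l.request_mode,
       l.request_status
FROM sys.dm_tran_locks AS l
WHERE l.request_status = 'WAIT';
```

Following `blocking_session_id` up the chain until it reaches a session that is not itself blocked identifies the head blocker.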


Transaction isolation levels

Level	Dirty read	Non‑repeatable	Phantom	Blocking
`READ UNCOMMITTED`	yes	yes	yes	Minimal
`READ COMMITTED` (default)	no	yes	yes	Low
`REPEATABLE READ`	no	no	yes	Moderate
`SERIALIZABLE`	no	no	no	High
`SNAPSHOT`	no	no	no	Readers don't block writers · row versioning

Exam alert

SNAPSHOT isolation removes reader/writer blocking by using row versioning in tempdb, at the cost of higher tempdb usage; two writers updating the same row still conflict. The exam loves to test when SNAPSHOT is the right choice versus SERIALIZABLE.
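
A sketch of using SNAPSHOT isolation, including the write-write conflict it does not prevent. `dbo.Accounts` is a hypothetical table. Error 3960 is the snapshot update-conflict error. On Azure SQL Database snapshot isolation is typically already enabled, so the ALTER DATABASE step may be unnecessary there.

```sql
-- Hypothetical table for the example.
CREATE TABLE dbo.Accounts (AccountId int PRIMARY KEY, Balance decimal(18,2) NOT NULL);
INSERT INTO dbo.Accounts (AccountId, Balance) VALUES (1, 500.00);
GO
-- Snapshot isolation must be allowed at the database level first.
ALTER DATABASE CURRENT SET ALLOW_SNAPSHOT_ISOLATION ON;
GO
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRY
    BEGIN TRANSACTION;
    -- Reads see a consistent snapshot and do not block writers.
    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
    -- 3960: another transaction changed the row after our snapshot began.
    IF ERROR_NUMBER() = 3960
        PRINT 'Update conflict: retry the transaction.';
    ELSE
        THROW;
END CATCH;
```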

3.3 CI/CD with SQL Database Projects

SQL Database Projects. Declarative, source‑controlled representation of your schema. SDK‑style (.sqlproj) for cross‑platform builds.
Testing. Unit tests (tSQLt) for stored procedures and functions; integration tests against a real instance.
Reference / static data. Post‑deployment scripts or MERGE for lookup tables, stored alongside schema.
Schema drift detection. Compare project model to live database; catch out‑of‑band changes in production.
Branching & PRs. Feature branches, pull requests with code review, conflict resolution for .sql files.
Secrets. Never in source. Use Azure Key Vault, GitHub Secrets, or environment variables.
Pipelines. Build → validate → deploy. Approval gates, branching policies, authentication tables, CODEOWNERS for governance.

If → unauthorized schema changes are appearing in production → enable schema drift detection in the pipeline and enforce all changes through SQL Database Projects + pull requests.

If → deployments need multi‑environment approval (dev → staging → prod) → configure approval gates and branching policies.
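
For the reference / static data point in this section, a post-deployment script often uses an idempotent MERGE so that every deployment converges on the same lookup rows. The table `dbo.OrderStatus` and its values are hypothetical.

```sql
-- Post-deployment script fragment. dbo.OrderStatus is a hypothetical
-- lookup table with columns (StatusId int PRIMARY KEY, StatusName nvarchar(50)).
MERGE dbo.OrderStatus AS target
USING (VALUES
    (1, N'Pending'),
    (2, N'Shipped'),
    (3, N'Cancelled')
) AS source (StatusId, StatusName)
ON target.StatusId = source.StatusId
-- Correct rows whose name drifted from the source-controlled value.
WHEN MATCHED AND target.StatusName <> source.StatusName THEN
    UPDATE SET StatusName = source.StatusName
-- Add rows that are missing in this environment.
WHEN NOT MATCHED BY TARGET THEN
    INSERT (StatusId, StatusName) VALUES (source.StatusId, source.StatusName)
-- Remove rows that are no longer part of the reference set.
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;
```

Running the script twice in a row changes nothing the second time, which is what makes it safe to run on every deployment.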


3.4 Integrate SQL solutions with Azure services

Data API Builder (DAB). Generates REST and GraphQL endpoints from database objects — zero application code. JSON configuration. Caching, pagination, filtering, field‑level authorization.
Expose database objects. DAB can expose tables, views, and stored procedures as GraphQL types or REST endpoints. Configure relationships for nested queries.
DAB deployment. Azure Static Web Apps backend, container, or App Service. Managed Identity for database auth.
Azure Monitor. Application Insights for request telemetry; Log Analytics with KQL for centralized logs. Alerts on query performance or error rates.

Change event handling — which mechanism?

Mechanism	Use case	Granularity
Change Data Capture (CDC)	Full change history for ETL / audit.	Row‑level, all columns
Change Tracking	Lightweight sync — "what changed?"	Row‑level, flag only
Change Event Streaming (CES)	Real‑time streaming to downstream systems.	Event‑based
Azure Functions + SQL trigger	Serverless reaction to row changes.	Row‑level, code‑driven
Azure Logic Apps	Low‑code workflow on data changes.	Row‑level, connector

If → a frontend needs a quick REST/GraphQL API over existing tables with no backend code → use Data API Builder.

If → downstream systems need real‑time change events → use CES or Azure Functions with the SQL trigger binding.

If → you need full before/after images of every change for audit → CDC. If you only need to know which rows changed → Change Tracking.
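
A minimal Change Tracking sketch for the "which rows changed?" case. `dbo.Products` is hypothetical and the retention period is an arbitrary choice for the example. The tracked table must have a primary key.

```sql
-- Hypothetical table to track.
CREATE TABLE dbo.Products (ProductId int PRIMARY KEY, Name nvarchar(100) NOT NULL);
GO
-- Enable at the database level, then per table.
ALTER DATABASE CURRENT
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);
GO
ALTER TABLE dbo.Products ENABLE CHANGE_TRACKING;
GO
INSERT INTO dbo.Products (ProductId, Name) VALUES (1, N'Trail shoe');

-- Version saved by the previous sync; 0 means "everything since tracking began".
DECLARE @last_sync bigint = 0;

-- Returns keys and the operation (I/U/D) only, not before/after column values.
SELECT ct.ProductId, ct.SYS_CHANGE_OPERATION, ct.SYS_CHANGE_VERSION
FROM CHANGETABLE(CHANGES dbo.Products, @last_sync) AS ct;

-- Store this value and pass it as @last_sync on the next run.
SELECT CHANGE_TRACKING_CURRENT_VERSION() AS next_sync_version;
```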

04 Domain 3 · 25–30%

Implement AI capabilities.

4.1 Models & embeddings

External models. AI models hosted outside SQL (Azure OpenAI, etc.). Evaluate on modality (text, image, multimodal), language support, size, latency, cost, structured‑output capability.
Create & manage external models. Register endpoints in SQL; manage credentials, versioning, failover.
Embedding maintenance. Embeddings go stale as source data changes. Options: table triggers (sync), Change Tracking, Azure Functions with SQL trigger, Logic Apps, CDC, CES, Microsoft Foundry.
Column selection. Identify columns with semantic meaning. Concatenate meaningful text. Exclude IDs, timestamps, numeric codes.
Chunking. Split large text before embedding — by sentence, paragraph, or fixed token count. Overlap for context continuity.
Generating embeddings. Call the external endpoint from T‑SQL (sp_invoke_external_rest_endpoint or native AI functions). Store vectors in VECTOR columns.

Pro tip

The exam tests your ability to choose the right embedding maintenance strategy. Triggers are simplest but add latency to writes. CDC + Azure Functions is asynchronous and scalable. Choose based on latency tolerance and write throughput.
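
The chunking bullet in this section (fixed-size pieces with overlap) can be sketched directly in T-SQL. This version is an assumption-laden illustration: it splits by character count rather than tokens, the sizes are arbitrary, and it relies on `GENERATE_SERIES`, which needs SQL Server 2022 or later (compatibility level 160).

```sql
DECLARE @text nvarchar(max) =
    N'Lightweight trail shoe with a wide toe box. Firm arch support suits flat feet. '
    + N'The outsole grips well on wet rock and the upper dries quickly after rain.';

-- Illustrative sizes, in characters (real pipelines usually count tokens).
DECLARE @size int = 60, @overlap int = 15;

SELECT ROW_NUMBER() OVER (ORDER BY s.value) AS ChunkNo,
       s.value                              AS StartPos,
       SUBSTRING(@text, s.value, @size)     AS ChunkText
-- Each chunk starts (@size - @overlap) characters after the previous one,
-- so consecutive chunks share @overlap characters of context.
FROM GENERATE_SERIES(1, CAST(LEN(@text) AS int), @size - @overlap) AS s;
```

Each `ChunkText` row would then be sent to the embedding model and stored with its own vector.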

4.2 Intelligent search

Full‑text search. Keyword‑based using inverted indexes — exact and linguistic matches (stemming, thesaurus). CONTAINS, FREETEXT.
Vector (semantic) search. Similarity over embeddings using cosine, dot product, or Euclidean distance. Returns semantically similar results with no keyword overlap.
Hybrid search. Combines full‑text + vector using Reciprocal Rank Fusion (RRF) to merge ranked lists.
Vector data type. Native VECTOR storage; configure dimensionality; supports vector indexes for fast approximate search.
Vector functions. VECTOR_DISTANCE, VECTOR_NORMALIZE, VECTORPROPERTY, VECTOR_SEARCH.
ANN vs. ENN. Approximate — fast, uses vector index. Exact — scans all rows, precise but slow. ANN for production; ENN for small datasets or validation.
Vector index types. IVF, HNSW, and DiskANN (the type behind SQL Server's vector index); evaluate by dataset size, dimensionality, recall, build/query trade‑offs.
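
An exact (ENN) search needs nothing beyond a VECTOR column and `VECTOR_DISTANCE`. The sketch below uses a hypothetical `dbo.DocEmbeddings` table and toy 3‑dimensional vectors; real embeddings have hundreds or thousands of dimensions and come from a model, not from literals. It assumes a platform where the vector type is available.

```sql
-- Hypothetical table with a native vector column (3 dimensions for readability).
CREATE TABLE dbo.DocEmbeddings (
    DocId int PRIMARY KEY,
    Body nvarchar(400) NOT NULL,
    Embedding vector(3) NOT NULL
);
INSERT INTO dbo.DocEmbeddings (DocId, Body, Embedding) VALUES
    (1, N'Running shoes for flat feet', '[0.9, 0.1, 0.0]'),
    (2, N'Waterproof hiking boots',     '[0.2, 0.9, 0.1]'),
    (3, N'Cushioned jogging sneakers',  '[0.8, 0.2, 0.1]');

-- Stand-in for the embedded user query.
DECLARE @q vector(3) = '[0.85, 0.15, 0.05]';

-- ENN: every row is compared, then sorted by distance (smaller = more similar).
SELECT TOP (2) DocId, Body,
       VECTOR_DISTANCE('cosine', Embedding, @q) AS Distance
FROM dbo.DocEmbeddings
ORDER BY Distance;
```

No vector index is involved here, which is why this approach is precise but scales poorly on large tables.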

Search type	Matches on	Best for	Limitation
Full‑Text	Keywords / linguistic forms	Exact term / phrase lookup	No semantic understanding
Vector	Meaning / similarity	Concept search, synonym handling	Needs embeddings; approximate
Hybrid (RRF)	Keywords + meaning	Production search apps	More complex setup
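
Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over the ranked lists it appears in. The sketch below is one possible way to express that in T-SQL, not a built-in feature. It assumes a hypothetical `dbo.ProductDocs (DocId int PRIMARY KEY, Body nvarchar(max), Embedding vector(3))` table that already has a full-text index on `Body`. The constant 60 is a commonly used value for k, chosen here as an assumption.

```sql
DECLARE @keywords nvarchar(200) = N'running shoes';
DECLARE @q vector(3) = '[0.85, 0.15, 0.05]';   -- stand-in for the embedded query
DECLARE @k int = 60;                            -- RRF smoothing constant (assumed)

WITH keyword_hits AS (
    -- Ranked list 1: full-text relevance (higher RANK is better).
    SELECT ft.[KEY] AS DocId,
           ROW_NUMBER() OVER (ORDER BY ft.[RANK] DESC) AS rnk
    FROM FREETEXTTABLE(dbo.ProductDocs, Body, @keywords, 20) AS ft
),
vector_hits AS (
    -- Ranked list 2: vector similarity (smaller distance is better).
    SELECT TOP (20) d.DocId,
           ROW_NUMBER() OVER (ORDER BY VECTOR_DISTANCE('cosine', d.Embedding, @q)) AS rnk
    FROM dbo.ProductDocs AS d
    ORDER BY VECTOR_DISTANCE('cosine', d.Embedding, @q)
)
-- Fuse: a document missing from one list contributes 0 for that list.
SELECT TOP (10)
       COALESCE(kh.DocId, vh.DocId) AS DocId,
       COALESCE(1.0 / (@k + kh.rnk), 0) + COALESCE(1.0 / (@k + vh.rnk), 0) AS RrfScore
FROM keyword_hits AS kh
FULL OUTER JOIN vector_hits AS vh ON kh.DocId = vh.DocId
ORDER BY RrfScore DESC;
```

Because RRF works on ranks rather than raw scores, the two lists do not need comparable scoring scales.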


4.3 RAG in T‑SQL.

A pattern that grounds language‑model responses in your data. Know the end‑to‑end flow — the exam tests each step individually and as a complete pipeline.

Step 01 · User query: Natural‑language question arrives from the app.
Step 02 · Retrieve: Embed the query, vector‑search for context rows.
Step 03 · Augment: System prompt + context JSON + user question.
Step 04 · Invoke: sp_invoke_external_rest_endpoint calls the model.
Step 05 · Extract: OPENJSON + JSON_VALUE parse the response.

Core concepts

Use cases. Customer‑support bots grounded in product docs, internal knowledge search, report summarization, SQL natural‑language interfaces.
sp_invoke_external_rest_endpoint. The T‑SQL stored procedure for calling external REST APIs (including Azure OpenAI). Sends prompts, receives completions.
Structured data → JSON. Convert query results to JSON (FOR JSON PATH/AUTO, JSON_OBJECT, JSON_ARRAY) to include as context in the prompt.
Send results to the LLM. System message + retrieved context + user question. Call the model via sp_invoke_external_rest_endpoint.
Extract responses. Parse the JSON response — OPENJSON and JSON_VALUE pull out the generated text.
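
The augment, invoke, and extract steps can be sketched end to end as below. Several things here are assumptions rather than facts from the guide: the `dbo.ProductDocs` table, the placeholder URL parts in angle brackets, a database scoped credential that already exists under the resource URL, and a chat-completions style response in which the text sits under `choices[0].message.content`. The retrieval step is reduced to a plain TOP (3) so the example stays short.

```sql
DECLARE @question nvarchar(4000) = N'Which shoes suit flat feet?';

-- Retrieve + convert to JSON (a real pipeline would rank by VECTOR_DISTANCE).
DECLARE @context nvarchar(max) = (
    SELECT TOP (3) DocId, Body
    FROM dbo.ProductDocs          -- hypothetical table
    ORDER BY DocId
    FOR JSON PATH
);

-- Augment: system message + retrieved context + user question.
DECLARE @payload nvarchar(max) = JSON_OBJECT(
    'messages': JSON_ARRAY(
        JSON_OBJECT('role': 'system',
                    'content': N'Answer only from the provided context.'),
        JSON_OBJECT('role': 'user',
                    'content': CONCAT(N'Context: ', @context, N' Question: ', @question))
    )
);

-- Invoke: placeholders in <...> must be replaced with real values.
DECLARE @status int, @response nvarchar(max);
EXEC @status = sp_invoke_external_rest_endpoint
    @url = N'https://<your-resource>.openai.azure.com/openai/deployments/<deployment>/chat/completions?api-version=<api-version>',
    @method = N'POST',
    @credential = [https://<your-resource>.openai.azure.com],
    @payload = @payload,
    @response = @response OUTPUT;

-- Extract: the service reply is wrapped under $.result.
SELECT c.content AS GeneratedText
FROM OPENJSON(@response, '$.result.choices')
     WITH (content nvarchar(max) '$.message.content') AS c;
```

OPENJSON with an explicit `nvarchar(max)` column is used for the extraction because `JSON_VALUE` returns at most 4,000 characters, which a long completion can exceed.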

If → users search by product name and expect exact matches → Full‑Text Search.

If → users describe what they want in natural language ("comfortable running shoes for flat feet") → Vector Search over description embeddings.

If → you need both keyword precision and semantic recall → Hybrid Search with RRF.

If → the dataset has 1M+ rows and needs sub‑second search → ANN with a vector index. ENN would be too slow.

05 Service comparison tables

Choose the right platform.

SQL platform comparison

Feature	SQL Server · on‑prem	Azure SQL Database	SQL DB in Fabric
Deployment	Self‑managed VM / bare metal	Fully managed PaaS	Fabric workspace
Scaling	Manual: add CPU/RAM	DTU or vCore tiers; auto‑scale with serverless	Fabric capacity units
AI integration	External REST calls	Native + REST endpoints	Deep Fabric AI/ML
Copilot	GitHub Copilot in IDE	GitHub Copilot + portal	Copilot in Fabric
Graph tables	yes	yes	yes
Ledger tables	2022+	yes	yes
Vector support	preview	preview / GA	preview / GA
CI/CD	SQL Database Projects	SQL Database Projects	Fabric Git integration

Encryption methods

Method	Key location	Server can read?	Use case
Always Encrypted	Client only	no	Protect from DBAs & cloud admins
AE + Enclaves	Secure enclave	In enclave only	AE + range queries, LIKE
Column‑level (T‑SQL)	Database	yes	Flexible server‑side processing
TDE	Server / HSM	Yes (data at rest)	Full‑database at‑rest encryption
Dynamic Data Masking	N/A — no encryption	yes	Presentation‑layer obfuscation

Data API Builder vs. custom API

Aspect	Data API Builder	Custom API · App Service / Functions
Code required	Zero — config‑driven	Full application code
Protocols	REST + GraphQL	Any
Setup time	Minutes	Hours to days
Customization	Limited (config options)	Unlimited
Auth	Entra ID, API keys	Any auth provider
Best for	Standard CRUD over DB objects	Complex business logic

06 Common pitfalls & distractor answers

Plausible, but wrong.

The exam uses plausible‑sounding options to test depth of understanding. Below are the twelve most common traps — read each distractor, then the correction.

01 · Dynamic Data Masking protects data from admins.
Wrong: DDM is presentation‑layer only. Users with UNMASK or db_owner see real data. For true protection use Always Encrypted.

02 · Temporal tables provide tamper‑proof audit.
Wrong: Admins can modify history tables. For tamper‑evidence use Ledger tables with cryptographic hashes.

03 · Multi‑statement TVFs perform like inline TVFs.
Wrong: Multi‑statement TVFs have fixed cardinality estimates and prevent optimizer push‑down. Inline TVFs are expanded like views.

04 · IDENTITY is always better than SEQUENCE.
Wrong: SEQUENCE generates values before INSERT and works across multiple tables. IDENTITY is table‑bound.

05 · ENN is preferred for production.
Wrong: ENN scans all vectors, which is too slow for large datasets. Use ANN with a vector index for production.

06 · CDC and Change Tracking are interchangeable.
Wrong: CDC captures full before/after images; Change Tracking only tells you which rows changed. CDC is heavier but richer.

07 · SNAPSHOT isolation has no cost.
Wrong: SNAPSHOT uses row versioning in tempdb, with an I/O and space cost. It prevents blocking but not write‑write conflicts.

08 · Always Encrypted supports all query operations.
Wrong: Standard AE supports equality only (deterministic) or no comparisons (randomized). Use Enclaves for range / LIKE.

09 · Vector indexes are required for vector search.
Wrong: VECTOR_DISTANCE works without an index (ENN). Indexes enable ANN for performance, but are not mandatory.

10 · Triggers are the best way to maintain embeddings.
Wrong: Triggers are synchronous and slow writes. For high‑throughput tables use async methods such as CDC + Functions or CES.

11 · RRF replaces the need for both search types.
Wrong: Reciprocal Rank Fusion merges results from full‑text AND vector search. You still need both implementations; RRF combines them.

12 · JSON_VALUE can return entire objects.
Wrong: JSON_VALUE returns scalars only. Use OPENJSON or JSON_QUERY to extract objects or arrays.

07 Night‑before review · 20 must‑know items

Twenty things you must know.

Review the night before. If any item feels unfamiliar, revisit that topic in the guide before going to bed.

01 You can explain when to use temporal vs. ledger vs. graph vs. in‑memory tables.
02 You can write a recursive CTE for hierarchical data.
03 You know the core JSON functions: JSON_OBJECT, JSON_ARRAY, JSON_ARRAYAGG, OPENJSON, JSON_VALUE, JSON_QUERY, JSON_CONTAINS.
04 You can use window functions: ROW_NUMBER, RANK, LAG/LEAD, running SUM.
05 You understand the new regex functions (REGEXP_LIKE, REGEXP_REPLACE, …) and can write basic patterns.
06 You know the difference between EDIT_DISTANCE and JARO_WINKLER_DISTANCE.
07 You can write a graph MATCH query.
08 You can configure GitHub Copilot instruction files and MCP server endpoints.
09 You understand Always Encrypted (standard vs. enclaves) and when to use each.
10 You can implement Dynamic Data Masking and explain its limitations.
11 You can implement Row‑Level Security with filter and block predicates.
12 You can read a query execution plan and identify key lookups, scans, implicit conversions.
13 You can explain all five transaction isolation levels and their trade‑offs.
14 You can diagnose blocking and deadlocks using DMVs.
15 You understand SQL Database Projects, schema drift, and CI/CD pipeline controls.
16 You can configure Data API Builder for REST and GraphQL endpoints.
17 You know the differences between CDC, Change Tracking, CES, and SQL trigger bindings.
18 You can design an embedding pipeline: columns, chunking, generation, maintenance.
19 You understand ANN vs. ENN, vector indexes, and VECTOR_DISTANCE / VECTOR_SEARCH.
20 You can implement end‑to‑end RAG: embed query → vector search → build prompt → call LLM → parse response.


— Good luck

Focus on scenario‑based reasoning: understand not just what each technology does, but when and why you would choose it over the alternatives.

Source

Microsoft Learn — Study Guide for Exam DP‑800.

End of guide · DP‑800
