Salesforce Data Architect Study Guide (Winter '26)
Your complete guide to passing the Salesforce Data Architect exam — data modelling, MDM patterns, large data volume strategy, Shield, and ETL design.
Written and reviewed by Krishna Mohan (ADM-201, PD1, PD2, App Builder & Consultant certified). Updated for Winter '26.
Exam Sections & Weightings
What Each Section Tests
Data Modelling & Management
Designing scalable data models: object relationships (lookup, master-detail, many-to-many), field types, record types, and custom metadata. Evaluating when to use custom objects vs platform features. External objects and Salesforce Connect for federated data. Schema design for reporting performance.
Master Data Management
MDM patterns: consolidation, coexistence, and centralisation. Matching and merging duplicate records — Duplicate Management, matching rules, merge fields. Salesforce as system of record vs system of engagement. Golden record strategy and data stewardship workflows.
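As a rough illustration of matching and merging, the sketch below groups records from two source systems by a matching rule and builds one golden record per group. The record shape, the exact-match rule on a normalised email, and the "most recently updated non-empty value wins" survivorship rule are all assumptions made for the example; they are not taken from Salesforce Duplicate Management.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class SourceRecord:
    system: str   # hypothetical source system name
    email: str
    name: str
    phone: str
    updated: date


def match_key(rec: SourceRecord) -> str:
    """Assumed matching rule: exact match on a trimmed, lower-cased email."""
    return rec.email.strip().lower()


def golden_records(records: list[SourceRecord]) -> dict[str, dict[str, str]]:
    """Group records by match key, then merge each group field by field.

    Assumed survivorship rule: the most recently updated non-empty value wins.
    """
    groups: dict[str, list[SourceRecord]] = {}
    for rec in records:
        groups.setdefault(match_key(rec), []).append(rec)

    merged: dict[str, dict[str, str]] = {}
    for key, group in groups.items():
        newest_first = sorted(group, key=lambda r: r.updated, reverse=True)
        golden = {"email": key}
        for field in ("name", "phone"):
            # Take the first non-empty value, scanning newest to oldest.
            golden[field] = next(
                (getattr(r, field) for r in newest_first if getattr(r, field)), ""
            )
        merged[key] = golden
    return merged


records = [
    SourceRecord("crm", "Ana@Example.com", "Ana Diaz", "", date(2024, 5, 1)),
    SourceRecord("erp", "ana@example.com ", "A. Diaz", "555-0100", date(2023, 1, 9)),
    SourceRecord("crm", "bo@example.com", "Bo Lee", "555-0101", date(2024, 2, 2)),
]
result = golden_records(records)
print(result["ana@example.com"])
```

The two "Ana" rows collapse into one golden record that keeps the newer name and falls back to the older system for the phone number, which is the kind of field-level trade-off a data stewardship workflow has to decide.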
Large Data Volume Strategy
LDV best practices: skinny tables, custom indexes, division-based partitioning. Data skew: ownership skew, lookup skew, and their impact on record locking. Archiving strategies: big objects, external archiving via Heroku or third-party. Bulk API 2.0 for high-volume loads. SOQL optimisation for large datasets.
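One way to make skew concrete is to count how many records share each parent or owner in a data extract. The sketch below does that for a hypothetical list of Contact-to-Account references. The 10,000 threshold reflects the figure commonly cited in Salesforce guidance, but treat it here as an assumed review threshold, not a hard platform limit.

```python
from collections import Counter

# Assumed review threshold: about 10,000 child records per parent
# (or records per owner) is the commonly cited point of concern.
SKEW_THRESHOLD = 10_000


def find_skew(values, threshold=SKEW_THRESHOLD):
    """Return {value: count} for every value shared by more than `threshold` records."""
    counts = Counter(values)
    return {value: n for value, n in counts.items() if n > threshold}


# Hypothetical extract: one AccountId per Contact row.
contact_account_ids = ["001A"] * 50_000 + ["001B"] * 120 + ["001C"] * 9_999
skewed = find_skew(contact_account_ids)
print(skewed)
```

The same function works for ownership skew if the input is a list of OwnerId values instead of parent IDs.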
Data Governance & Compliance
Data governance frameworks, data stewardship roles, metadata management. Salesforce Shield: Platform Encryption (deterministic vs probabilistic), Event Monitoring, Field Audit Trail. GDPR and data residency considerations: field-level encryption, data masking, data retention policies.
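The difference between deterministic and probabilistic encryption comes down to whether the same plaintext always produces the same ciphertext. The toy sketch below shows only that property. It is not secure, and it is not how Shield Platform Encryption is implemented; the key, the function names, and the hash-based keystream are all assumptions chosen so the example runs with the standard library alone.

```python
import hashlib
import hmac
import os

KEY = b"demo-key-not-for-production"   # hypothetical key
PLAINTEXT = b"ana@example.com"


def _keystream(key: bytes, iv: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key + iv + counter, repeated as needed."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, plaintext: bytes, deterministic: bool) -> bytes:
    if deterministic:
        # IV derived from the plaintext: equal inputs give equal ciphertexts.
        iv = hmac.new(key, plaintext, hashlib.sha256).digest()[:16]
    else:
        # Random IV: equal inputs give different ciphertexts each time.
        iv = os.urandom(16)
    return iv + _xor(plaintext, _keystream(key, iv, len(plaintext)))


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    iv, body = ciphertext[:16], ciphertext[16:]
    return _xor(body, _keystream(key, iv, len(body)))


det_a = encrypt(KEY, PLAINTEXT, deterministic=True)
det_b = encrypt(KEY, PLAINTEXT, deterministic=True)
prob_a = encrypt(KEY, PLAINTEXT, deterministic=False)
prob_b = encrypt(KEY, PLAINTEXT, deterministic=False)

print(det_a == det_b)    # equality filter can match on stored ciphertext
print(prob_a == prob_b)  # no stable value to match against
```

Because the deterministic ciphertexts are equal, an exact-match filter can compare stored values directly. The probabilistic ciphertexts differ on every call, which is why filtering on such a field is not possible, and also why it leaks less about repeated values.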
Data Migration & ETL
ETL tool selection: Salesforce Data Loader, Informatica, Jitterbit, MuleSoft. Migration strategy: data profiling, cleansing, transformation, and validation. Bulk API vs REST API for migration volume. Rollback strategy and data validation post-migration. External IDs for upsert and relationship mapping.
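For orientation, a Bulk API 2.0 upsert keyed on an External ID follows a create, upload, close, poll sequence. The sketch below only builds the sequence of requests as plain data and sends nothing. The API version, the `Legacy_Id__c` field, and the `ingest_plan` helper are assumptions for illustration, and the real job ID comes from the response to the first request.

```python
import json

API_VERSION = "v60.0"  # assumption: substitute the version your org supports


def ingest_plan(sobject, external_id_field, csv_data, job_id="<jobId>"):
    """Return the ordered requests for one Bulk API 2.0 upsert ingest job.

    Each step is (method, path, content_type, body). Nothing is sent.
    """
    base = f"/services/data/{API_VERSION}/jobs/ingest"
    create_body = {
        "object": sobject,
        "operation": "upsert",
        "externalIdFieldName": external_id_field,
        "contentType": "CSV",
        "lineEnding": "LF",
    }
    close_body = {"state": "UploadComplete"}
    return [
        # 1. Create the job; the response carries the job ID.
        ("POST", base, "application/json", json.dumps(create_body)),
        # 2. Upload the CSV data for the job.
        ("PUT", f"{base}/{job_id}/batches", "text/csv", csv_data),
        # 3. Signal that the upload is finished so processing can start.
        ("PATCH", f"{base}/{job_id}", "application/json", json.dumps(close_body)),
        # 4. Poll the job until its state is JobComplete or Failed.
        ("GET", f"{base}/{job_id}", None, None),
    ]


csv_data = "Legacy_Id__c,Name\nA-1,Acme\nA-2,Bolt\n"
plan = ingest_plan("Account", "Legacy_Id__c", csv_data)
for method, path, _, _ in plan:
    print(method, path)
```

Keeping the plan as data makes it easy to review the lifecycle before wiring it to an HTTP client and an authenticated session.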
10-Week Study Plan
Scenario Strategy Tips
- 1. LDV mitigation hierarchy: When performance is the problem, first check whether a custom index can help. If not, consider skinny tables. If the model itself causes skew, redesign the relationship or use big objects for archiving.
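- 2. MDM pattern selection: Consolidation = source data is merged into a hub (here, Salesforce) to build the golden record, while the source systems keep their own copies. Coexistence = the golden record lives in the hub and changes are synchronised back, so Salesforce is one of several systems that can author master data. Centralisation = master data is created and maintained only in the hub, and other systems consume it. Match the pattern to the business scenario described.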
- 3. Encryption trade-offs: Deterministic encryption allows exact-match filtering on the field. Probabilistic encryption is more secure, but you cannot filter on the field. GDPR questions often require deterministic encryption so that a person's data can be found and deleted.
- 4. External IDs for migration: Always create External ID fields on objects before a data migration. They enable upsert (insert + update in one operation) and allow relationship mapping across systems without knowing Salesforce record IDs.
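The External ID tip can be sketched with an in-memory model of upsert. The code below is not the Salesforce API; the `upsert` and `resolve_parent` helpers, the `Legacy_Id__c` field, and the generated record IDs are hypothetical stand-ins that show why a stable legacy key lets you insert, update, and link records without knowing target IDs in advance.

```python
import itertools

_ids = itertools.count(1)  # stand-in for platform-generated record IDs


def upsert(table, rows, ext_field):
    """Update rows whose external ID already exists, insert the rest.

    `table` maps record Id -> record dict. Returns (inserted, updated).
    """
    by_ext = {rec[ext_field]: rec for rec in table.values()}
    inserted = updated = 0
    for row in rows:
        existing = by_ext.get(row[ext_field])
        if existing is not None:
            existing.update(row)
            updated += 1
        else:
            rec = {"Id": f"REC{next(_ids):04d}", **row}
            table[rec["Id"]] = rec
            by_ext[row[ext_field]] = rec
            inserted += 1
    return inserted, updated


def resolve_parent(parents, ext_field, legacy_key):
    """Map a legacy parent key to the target record Id."""
    for rec in parents.values():
        if rec[ext_field] == legacy_key:
            return rec["Id"]
    raise KeyError(legacy_key)


accounts = {}
first = upsert(accounts, [{"Legacy_Id__c": "A-1", "Name": "Acme"}], "Legacy_Id__c")
second = upsert(
    accounts,
    [
        {"Legacy_Id__c": "A-1", "Name": "Acme Ltd"},  # same key: update
        {"Legacy_Id__c": "A-2", "Name": "Bolt"},      # new key: insert
    ],
    "Legacy_Id__c",
)

# The child row only knows the parent's legacy key, not its target Id.
contact = {"LastName": "Diaz", "AccountId": resolve_parent(accounts, "Legacy_Id__c", "A-1")}
print(first, second, contact)
```

Re-running the same load is safe in this model because a repeated key updates instead of duplicating, which is the property that makes External IDs useful for retries and incremental migration runs.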
Mock Exam Benchmark
Aim for 75%+ on practice exams before scheduling. Data Architect questions are scenario-heavy — most describe a business situation with performance, compliance, or data quality challenges and ask for the optimal design. If you can justify your answer (not just identify it), you are ready.
Top 10 Concepts to Review
- Object relationship types and when to use lookup vs master-detail
- Three MDM patterns: consolidation, coexistence, centralisation — and when to use each
- Duplicate Management: matching rules, duplicate rules, merge workflow
- LDV: custom indexes, skinny tables, selective SOQL, the Query Plan tool
- Ownership skew and lookup skew: causes, symptoms, and mitigation
- Big objects: use cases, limitations, Async SOQL for queries
- Salesforce Shield: Platform Encryption, Event Monitoring, Field Audit Trail
- Deterministic vs probabilistic encryption and searchability implications
- Bulk API 2.0: job types, serial vs parallel, ingest lifecycle
- External IDs: creating them, using upsert, relationship mapping in migration
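"Selective SOQL" in the list above becomes easier to reason about with numbers. The sketch below estimates whether a filter is selective enough for the query optimiser to use an index. The percentages and caps are the thresholds as commonly documented for standard and custom indexes; treat them as assumptions and verify them against current Salesforce documentation before relying on them. The function names are hypothetical.

```python
def selectivity_threshold(total_records: int, custom_index: bool) -> int:
    """Maximum number of matching rows for a filter to count as selective.

    Assumed thresholds:
      standard index: 30% of the first million rows, 15% beyond, capped at 1,000,000
      custom index:   10% of the first million rows,  5% beyond, capped at 333,333
    """
    if custom_index:
        first_pct, rest_pct, cap = 10, 5, 333_333
    else:
        first_pct, rest_pct, cap = 30, 15, 1_000_000
    first = min(total_records, 1_000_000)
    rest = max(total_records - 1_000_000, 0)
    # Integer arithmetic avoids floating-point rounding surprises.
    return min((first * first_pct + rest * rest_pct) // 100, cap)


def is_selective(matching_rows: int, total_records: int, custom_index: bool) -> bool:
    return matching_rows < selectivity_threshold(total_records, custom_index)


print(selectivity_threshold(5_000_000, custom_index=True))
print(selectivity_threshold(5_000_000, custom_index=False))
print(is_selective(400_000, 5_000_000, custom_index=True))
```

Under these assumed thresholds, a filter matching 400,000 of 5,000,000 rows is too broad for a custom index, so adding the index alone would not fix the query; the filter itself has to narrow.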
Frequently Asked Questions
- What is the Salesforce Data Architect certification?
- The Salesforce Data Architect certification validates expertise in designing scalable data models, managing large data volumes, implementing MDM strategies, and enforcing data governance on the Salesforce platform. The exam has 60 questions, a 110-minute time limit, ~68% passing score, and a $400 fee. It is part of the Application Architect credential path.
- What is large data volume (LDV) and why is it important?
- Large data volume refers to Salesforce orgs with millions of records that can cause query performance issues, record locking contention, and report timeouts. LDV best practices include creating custom indexes on frequently queried fields, using skinny tables (read-only tables that Salesforce Support enables, holding a subset of an object's fields so queries avoid joins between the standard and custom field tables), avoiding cross-object formula fields on large objects, and designing to avoid ownership and lookup skew. The Data Architect exam heavily tests LDV trade-offs.
- What is data skew in Salesforce?
- Data skew occurs when a disproportionate number of records share a common field value, causing performance problems. Ownership skew: many records owned by one user (e.g., integration user) — causes slow sharing rule calculations. Lookup skew: many child records on one parent record (e.g., 50,000 Contacts on one Account) — causes record locking during DML. The exam tests both types and the mitigation strategies.
- What is Salesforce Shield and when is it required?
- Salesforce Shield is a set of security tools for regulated industries: Platform Encryption (encrypt data at rest, field-level), Event Monitoring (audit user actions via log files), and Field Audit Trail (retain field history for up to 10 years). Shield is required when compliance mandates (HIPAA, GDPR, PCI-DSS) require encryption of specific sensitive fields or long-term audit logs beyond standard field history retention (18 months).
- How long should I study for the Data Architect exam?
- Plan for 10–12 weeks with 10–15 hours per week. Hands-on experience with SOQL query optimisation, Bulk API loading, and data model design is essential. Candidates without LDV experience should spend extra time on that section (20% of exam) as it tests trade-offs that are hard to understand without project experience.
What Comes After This Certification?
After this certification, consider: Application Architect, System Architect, or Technical Architect (CTA).
Exam Section Difficulty Heatmap
Which sections are straightforward and which ones trap confident candidates. Use this to prioritise your final-week revision.
| Exam Section | Difficulty | Study Tip |
|---|---|---|
| Data Modeling | Hard | Normalization, LDV, and data model trade-offs — scenario questions on design. |
| Master Data Management | Moderate | MDM concepts and when to use Data Cloud vs CRM — know the positioning. |
| Data Governance | Trap ⚠ | Data quality and stewardship — governance vs security is often confused. |
| Data Architecture | Hard | Architecture decisions and documentation — integration with other domains. |
| Data Integration | Moderate | ETL and replication patterns — know the tools and limits. |
Difficulty based on analysis of common candidate errors across each exam section.