Tech Due Diligence ARCH-2: Database Design and Data Management
What This Control Requires
The assessor evaluates the database architecture, schema design, data modelling practices, query performance, migration management, and the overall data management strategy including data lifecycle and storage efficiency.
In Plain Language
If you want to see the true quality of an engineering team's work, look at their database. Schema design, indexing, migration discipline, and query performance reveal more about engineering maturity than almost any other area. Poorly designed databases create performance bottlenecks, make feature development painfully slow, and become extremely costly to refactor once they hold significant data.
Assessors dig into schema normalisation (avoiding redundant data that creates inconsistency), index design (appropriate indexes for actual query patterns without over-indexing), migration management (version-controlled, reversible schema changes), and query performance (no N+1 queries, slow queries, or full table scans on large tables). They also look at whether the schema can handle 10x data volume and whether backup and recovery strategies are solid.
Beyond schema design, data management practices matter just as much. Are there retention policies, or does data accumulate indefinitely? Is there an archival strategy for old data? Can the system handle GDPR requirements like data deletion and export? For multi-tenant products, how is customer data isolated?
How to Implement
Design database schemas with clear domain modelling. Use consistent naming conventions, appropriate data types, and foreign key constraints for referential integrity. Document the data model with entity-relationship diagrams that show relationships, cardinality, and key constraints.
Set up version-controlled migration management using a proper tool (Flyway, Liquibase, Alembic, TypeORM migrations, Prisma Migrate). Every schema change should be defined as a migration script, version-controlled alongside application code, tested before hitting production, reversible where possible, and applied automatically during deployment.
Optimise query performance systematically. Set up query logging to catch slow queries in production, run EXPLAIN analysis on complex queries, build indexes based on actual query patterns rather than guesswork, paginate large result sets, and use connection pooling to manage database connections efficiently.
Monitor the key database health metrics: query latency percentiles (p50, p95, p99), connection pool utilisation, disk I/O and storage growth, replication lag if using replicas, and index usage statistics. Set up alerts on degradation trends before they become user-facing problems.
Put data lifecycle management in place. Define retention policies for different data categories, implement soft deletion where the business needs it (with eventual hard deletion for compliance), archive old data to cheaper storage, and ensure GDPR compliance capabilities including data export and right to deletion.
For multi-tenant SaaS applications, pick a tenancy model and document it clearly: shared database with tenant ID columns (row-level isolation), schema-per-tenant, or database-per-tenant. Document the trade-offs and any migration plans as the customer base grows.
Plan for data scalability. Evaluate whether the current database handles projected volumes. Consider read replicas for read-heavy workloads, partitioning or sharding for large tables, caching layers (Redis, Memcached) for hot data, and whether the current database technology will hold up at scale.
Evidence Your Auditor Will Request
- Entity-relationship diagrams for the data model
- Database migration scripts (version-controlled)
- Query performance monitoring dashboard or reports
- Data retention and lifecycle management policies
- Database scalability assessment relative to growth projections
Common Mistakes
- No migration management; schema changes applied manually or ad-hoc
- Slow queries in production with no systematic monitoring or optimisation
- Missing indexes causing full table scans on large, growing tables
- No data retention policy; data accumulated indefinitely, growing costs and risk
- Multi-tenancy design allows data leakage between customers
Related Controls Across Frameworks
Frequently Asked Questions
Is NoSQL acceptable or should everything be in a relational database?
What is a healthy migration count?
Track Tech Due Diligence compliance in one place
AuditFront helps you manage every Tech Due Diligence control, collect evidence, and stay audit-ready.
Start Free Assessment