AuditFront

Tech Due Diligence CQ-9: Error Handling and Logging Practices

What This Control Requires

The assessor evaluates the robustness of error handling across the application, the quality and comprehensiveness of logging practices, and the ability to diagnose production issues through available observability data.

In Plain Language

When something breaks in production at 2am, your error handling and logging determine whether you diagnose it in minutes or spend hours guessing. These practices directly affect reliability, debuggability, and how efficiently your team operates under pressure.

Poor error handling is a major risk signal. Assessors look for patterns like silent exception swallowing (catching errors without logging or acting on them), overly broad catch blocks that mask specific failures, inconsistent API error formats, and missing handling for edge cases or external service failures. Any of these leads to unexpected crashes, data corruption, and frustrated users.

Logging quality matters just as much. Can your logs tell you what went wrong, with enough context to reproduce the issue? Are log levels used correctly (ERROR, WARN, INFO, DEBUG)? Is sensitive data kept out of logs? Are logs structured as JSON so you can actually search and filter them effectively?

How to Implement

Define error handling conventions for the entire codebase. Agree on how errors are caught and propagated, what context gets attached (request ID, user context, operation name), how errors are presented to API consumers (consistent format with error codes, messages, and correlation IDs), and how you handle different categories - client errors versus server errors versus transient failures.
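One way to encode such conventions is a small exception hierarchy mapped onto a single response shape. The sketch below is illustrative, not a standard: the class names, error codes, and field names are assumptions you would adapt to your own API.

```python
class AppError(Exception):
    """Base class for errors that cross the API boundary."""
    status_code = 500
    error_code = "internal_error"

    def __init__(self, message: str, **context):
        super().__init__(message)
        self.message = message
        self.context = context  # e.g. operation name, entity IDs


class ClientError(AppError):
    """Caller's fault: reject the request, don't retry."""
    status_code = 400
    error_code = "bad_request"


class TransientError(AppError):
    """Downstream hiccup: safe to retry with backoff."""
    status_code = 503
    error_code = "temporarily_unavailable"


def to_error_response(err: AppError, correlation_id: str) -> dict:
    """Render any AppError into the one format API consumers ever see."""
    return {
        "error": {
            "code": err.error_code,
            "message": err.message,
            "correlation_id": correlation_id,
        }
    }
```

Because every error funnels through one renderer, API consumers get a consistent format regardless of which endpoint or failure category produced it.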

Adopt structured logging across the application. Every log entry should be JSON-formatted with consistent fields: timestamp, log level, service name, request/correlation ID, anonymised user context, operation name, and relevant metadata. Use a proper logging library (Winston, Pino, Logback, slog) rather than hand-rolling print statements.
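As a minimal sketch of the idea, here is a JSON formatter built on Python's stdlib logging; the service name and extra field names are hypothetical, and a dedicated library would add rotation, async transport, and schema enforcement on top of this.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Emit each log record as one JSON object with consistent fields."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "service": "billing-api",  # hypothetical service name
            "correlation_id": getattr(record, "correlation_id", None),
            "operation": getattr(record, "operation", None),
            "message": record.getMessage(),
        }
        return json.dumps(entry)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("billing-api")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Context travels in the `extra` dict, not interpolated into the message,
# so every entry stays machine-searchable.
logger.info("payment captured",
            extra={"operation": "capture_payment", "correlation_id": "req-42"})
```

Keeping the human-readable message separate from the structured fields is what makes "show me every ERROR for operation=capture_payment" a one-line query instead of a regex hunt.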

Standardise log level usage. ERROR for failures requiring attention. WARN for unexpected but handled conditions. INFO for significant business events like user registration or payment processing. DEBUG for detailed diagnostics, disabled in production. Excessive DEBUG logging creates noise and drives up storage costs.

Implement correlation IDs that propagate through every service in a request chain. Being able to trace a single user request across multiple services and log entries is essential for diagnosing issues in distributed systems.

Make sure sensitive data never appears in logs. Sanitise passwords, auth tokens, personal data, financial details, and health data. Use an allowlist of loggable fields rather than dumping entire objects.
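The allowlist approach is a few lines in any language; this Python sketch uses hypothetical field names, but the principle is fixed: list what may be logged, never what may not, because new sensitive fields appear faster than a denylist can keep up.

```python
# Only fields named here ever reach a log entry; everything else is dropped.
LOGGABLE_FIELDS = {"user_id", "operation", "order_id", "status"}


def sanitize(payload: dict) -> dict:
    """Keep only explicitly approved fields before logging a payload."""
    return {k: v for k, v in payload.items() if k in LOGGABLE_FIELDS}
```

Calling `sanitize()` at the logging boundary means that dumping a whole request object, the usual way secrets leak into logs, becomes safe by default.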

Centralise log collection with an aggregation platform - ELK Stack, Datadog, Grafana Loki, or CloudWatch Logs. Set retention policies that balance diagnostic usefulness with storage costs and data protection requirements.

Include error handling and logging in your code review checklist. Reject code that swallows exceptions silently or logs sensitive information.

Evidence Your Auditor Will Request

  • Error handling conventions documentation
  • Structured logging configuration and examples
  • Log aggregation platform and dashboard screenshots
  • Evidence of log sanitisation for sensitive data
  • Error response format documentation for APIs

Common Mistakes

  • Silent exception swallowing: errors caught but neither logged nor handled
  • Inconsistent error response formats across different API endpoints
  • Sensitive data appearing in logs (passwords, tokens, PII)
  • Unstructured log messages that are difficult to search and analyse
  • No correlation IDs; impossible to trace a request through the system
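The first mistake and its fix can be shown side by side; `persist` is a hypothetical persistence call standing in for whatever actually fails.

```python
import logging

logger = logging.getLogger("orders")


def persist(order):
    """Hypothetical persistence call; raises on failure."""
    raise ConnectionError("db unavailable")


def save_order_bad(order):
    try:
        persist(order)
    except Exception:
        pass  # silent swallow: the failure vanishes without a trace


def save_order_good(order):
    try:
        persist(order)
    except Exception:
        # Log with context and stack trace, then re-raise so callers
        # (and your error-rate alerts) actually see the failure.
        logger.exception("failed to persist order",
                         extra={"order_id": order.get("id")})
        raise
```

The bad version returns as if nothing happened, so the bug surfaces later as missing data; the good version leaves a searchable log entry and lets the error propagate to whatever layer can handle or report it.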

Related Controls Across Frameworks

Framework   Control ID   Relationship
ISO 27001   A.8.15       Related
SOC 2       CC7.2        Related

Frequently Asked Questions

How verbose should production logging be?
Run at INFO level by default, with the ability to flip specific services to DEBUG temporarily when diagnosing an issue. INFO should capture significant business events and error conditions. Do not log every function call or database query at INFO level - it creates noise and costs you money.
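With Python's stdlib logging, for instance, that per-service flip is a one-line level override on a named logger, leaving everything else at INFO; the logger names here are hypothetical.

```python
import logging

# Default: the whole application logs at INFO.
logging.basicConfig(level=logging.INFO)

payments = logging.getLogger("payments")

# Temporarily turn on DEBUG for one subsystem while diagnosing an issue,
# then set it back to logging.INFO when done.
payments.setLevel(logging.DEBUG)
```

Most aggregation platforms and frameworks expose the same idea as a runtime setting, so the flip does not require a redeploy.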
Should we use a centralised logging platform?
For anything beyond a single server, yes - it is a baseline expectation in due diligence. Centralised log aggregation lets you search across services, correlate events, set up alerting on error patterns, and do historical analysis. Without it, diagnosing production issues becomes guesswork.
