Building Audit Trails for Compliance
This guide is part of the Feature Flag Architecture & Lifecycle Management series. Every production flag mutation — a targeting rule edit, a percentage adjustment, an emergency rollback — must land in a durable, tamper-evident record that an auditor can query, export, and trust. This guide shows you how to structure those records, wire the pipeline, enforce RBAC, set retention, and produce signed evidence exports for SOC2 and GDPR reviews.
Problem Framing: What Audit Means for Flag Systems
What audit logs are not: observability metrics, request traces, or latency dashboards. Those measure system health. Audit logs capture governance actions — who changed what, from what prior state, to what new state, and why.
Feature flags are a dynamic configuration surface. A production targeting rule change can affect millions of users in seconds, with no binary deploy to trigger an approval workflow. That makes the audit trail the only reliable record of authorization. Without it, you cannot answer the auditor’s first question: “Who approved this change?”
This guide covers the mutation-side audit trail — the record of flag configuration changes. It does not cover evaluation telemetry (which variant each request saw), sampling strategies for high-volume logs, or SDK performance tuning.
Prerequisites
- flag taxonomy (at minimum: viewer, editor, release-manager)
Core Concept & Architecture
The Audit Event Schema
Every mutation produces one audit event. The schema must carry enough context to reconstruct the before-and-after state and answer “who authorized this?”
{
  "event_id": "01HZ3K7Q4J9BVFM2WXRDC5N6P8",
  "schema_version": "1.0",
  "timestamp_utc": "2026-06-20T14:32:07.221Z",
  "actor": {
    "id": "alice@eng.example.com",
    "role": "release-manager",
    "auth_method": "sso"
  },
  "flag_key": "billing.invoicing.pdf-v2",
  "environment": "production",
  "action": "update",
  "before": { "defaultVariant": "off", "targeting": [] },
  "after": {
    "defaultVariant": "off",
    "targeting": [{ "if": [{"==": [{"var":"plan"},"enterprise"]},"on","off"] }]
  },
  "reason": "Enable PDF v2 for enterprise tier — INC-5042 sign-off",
  "approver": "bob@eng.example.com",
  "prev_hash": "a3f8c2...",
  "hash": "9d1e47..."
}
The prev_hash and hash fields form a cryptographic chain: a retroactive edit changes that record’s hash, which invalidates the prev_hash link of every record after it. The approver field supplies the change-authorization evidence that SOC2 CC8.1 (change management) expects. The reason field ties the change to a ticket or incident, so change-management requirements are met without forcing a code deploy.
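A schema like this is only useful if incomplete events are rejected before they reach the log. The following is a minimal sketch of such a check, assuming the field names from the example event above. The function name validate_audit_event and the specific rules (for example, that production changes need a non-empty reason and approver) are illustrative assumptions, not part of any particular flag platform.

```python
from datetime import datetime

# Field names follow the example event; the rules are assumptions for illustration.
REQUIRED_FIELDS = (
    "event_id", "schema_version", "timestamp_utc", "actor", "flag_key",
    "environment", "action", "before", "after", "reason", "prev_hash", "hash",
)


def validate_audit_event(event: dict) -> list:
    """Return a list of problems; an empty list means the event looks complete."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in event]

    actor = event.get("actor")
    if isinstance(actor, dict):
        for key in ("id", "role", "auth_method"):
            if not actor.get(key):
                problems.append(f"missing actor.{key}")
    elif "actor" in event:
        problems.append("actor must be an object")

    # Assumed policy: production changes need a reason and a named approver.
    if event.get("environment") == "production":
        if not str(event.get("reason") or "").strip():
            problems.append("production change without a reason")
        if not event.get("approver"):
            problems.append("production change without an approver")

    ts = event.get("timestamp_utc")
    if isinstance(ts, str):
        try:
            # Older Python versions do not accept a trailing "Z", so normalize it.
            datetime.fromisoformat(ts.replace("Z", "+00:00"))
        except ValueError:
            problems.append("timestamp_utc is not ISO 8601")
    return problems
```

Returning a list of problems rather than raising on the first one lets the control plane report every gap in a single rejection message.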
RBAC Enforcement
Targeting rule changes in production must require an elevated role. Enforce this at the API layer — the control plane rejects the write before it generates an audit event if the actor lacks the required role.
ROLE_PERMISSIONS = {
    "viewer": frozenset({"read"}),
    "editor": frozenset({"read", "update_nonprod"}),
    "release-manager": frozenset({"read", "update_nonprod", "update_prod"}),
    "break-glass": frozenset({"read", "update_nonprod", "update_prod", "force_override"}),
}

def check_permission(actor_role: str, required: str) -> None:
    # Unknown roles fall back to an empty permission set and are always rejected
    if required not in ROLE_PERMISSIONS.get(actor_role, frozenset()):
        raise PermissionError(
            f"Role '{actor_role}' cannot perform '{required}'. "
            "Escalate to release-manager or open a break-glass request."
        )
Tie break-glass access to a separate approval workflow. Every break-glass action must carry a mandatory reason string or the mutation is rejected.
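The mandatory-reason rule for break-glass actions could be enforced with a guard along these lines. This is a sketch only: authorize_mutation is a hypothetical name, and treating force_override as the sole break-glass action is an assumption drawn from the permission table above.

```python
def authorize_mutation(actor_role: str, action: str, reason: str = "") -> None:
    """Reject a break-glass override that lacks the right role or a reason.

    Assumed policy: force_override is the only break-glass action.
    """
    if action == "force_override":
        if actor_role != "break-glass":
            raise PermissionError("force_override requires the break-glass role")
        # Whitespace-only strings do not count as a reason.
        if not reason.strip():
            raise ValueError("break-glass actions must include a reason")
```

Raising a different exception type for the missing reason makes it easy for the API layer to return a validation error rather than an authorization error.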
Step-by-Step Implementation
Step 1 — Capture mutation events at the control plane
Intercept every flag write at the mutation API boundary. The control plane’s event hook fires before the write commits, so you can enrich the event with actor metadata from the authenticated request context.
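// mutation-hook.go: intercept flag writes and emit audit events
func (h *AuditHook) OnFlagWrite(ctx context.Context, e MutationEvent) error {
	actor := authz.ActorFromContext(ctx)
	prev, err := h.store.Get(ctx, e.FlagKey)
	if err != nil {
		// Without the prior state the "before" field would be wrong, so fail the write
		return fmt.Errorf("audit: load prior state for %q: %w", e.FlagKey, err)
	}
	// Serialize chain updates so two concurrent writes cannot share a prev hash.
	// h.mu is assumed to be a sync.Mutex field on AuditHook.
	h.mu.Lock()
	defer h.mu.Unlock()
	event := AuditEvent{
		EventID:      ulid.New(),
		TimestampUTC: time.Now().UTC(),
		Actor:        actor,
		FlagKey:      e.FlagKey,
		Environment:  e.Environment,
		Action:       e.Action,
		Before:       prev,
		After:        e.NewState,
		Reason:       e.Reason,
		Approver:     e.ApproverID,
		PrevHash:     h.lastHash, // stored in the record so the chain is self-describing
	}
	// chainHash must hash the event body without its Hash and PrevHash fields
	event.Hash = h.chainHash(event, h.lastHash)
	if err := h.sink.Publish(ctx, event); err != nil {
		return err
	}
	h.lastHash = event.Hash // advance the chain only after a successful publish
	return nil
}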
Pitfall: emitting the event after the write commits risks losing it if the publish fails. Use a transactional outbox pattern: write the event to a pending_audit table in the same transaction as the flag state, then have a relay process deliver it to the log sink.
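The transactional outbox pattern can be sketched with SQLite from the Python standard library. The pending_audit table name comes from the text above; the flags table, the column layout, and the function names are assumptions for illustration, and a real control plane would use its own database and relay process.

```python
import json
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS flags (flag_key TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pending_audit ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "delivered INTEGER NOT NULL DEFAULT 0)"
    )


def write_flag_with_audit(conn, flag_key: str, new_state: dict, audit_event: dict) -> None:
    # One transaction: the flag state and the audit row commit together or not at all.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO flags (flag_key, state) VALUES (?, ?)",
            (flag_key, json.dumps(new_state)),
        )
        conn.execute(
            "INSERT INTO pending_audit (payload) VALUES (?)",
            (json.dumps(audit_event),),
        )


def relay_once(conn, publish) -> int:
    """Deliver pending events in insertion order; return how many were sent."""
    rows = conn.execute(
        "SELECT id, payload FROM pending_audit WHERE delivered = 0 ORDER BY id"
    ).fetchall()
    sent = 0
    for row_id, payload in rows:
        publish(json.loads(payload))  # raises on failure, so the row stays pending
        with conn:
            conn.execute("UPDATE pending_audit SET delivered = 1 WHERE id = ?", (row_id,))
        sent += 1
    return sent
```

This relay gives at-least-once delivery: if it crashes after publishing but before marking the row delivered, the event is sent again on the next pass, so downstream consumers should de-duplicate on event_id.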
Step 2 — Write to an append-only log with idempotent producers
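The log sink must be append-only. With enable.idempotence=true and acks=all, the Kafka producer de-duplicates its own retries, so a network retry cannot write the same record twice to a partition. This is not end-to-end exactly-once delivery: a relay that restarts and resends an event can still produce a duplicate, so consumers should de-duplicate on event_id.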
# kafka-audit-producer.yaml
bootstrap.servers: audit-broker.internal:9092
acks: all
enable.idempotence: true
retries: 10
max.in.flight.requests.per.connection: 1
compression.type: lz4
# Retention is a topic setting, not a producer setting. On the compliance topic,
# set retention.ms=-1 (retain indefinitely); TTL is managed by the storage tier, not Kafka.
For object-store destinations (S3, GCS), write each event as an immutable object with a content-addressable key (sha256_of_content.json). Enable versioning and object-lock (WORM) on the bucket so no actor — including the service account — can overwrite or delete records before the retention window expires.
Pitfall: Kafka log compaction on a compliance topic will silently discard older records for the same key. Disable compaction on the audit topic: cleanup.policy=delete.
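For the object-store path described above, a content-addressable key can be derived from the canonical serialization of the event. The sketch below is one way to do it; the audit/ prefix and the function name object_key are assumptions, and uploading the returned bytes is left to whichever storage client you use.

```python
import hashlib
import json


def object_key(event: dict, prefix: str = "audit/") -> tuple:
    """Return (key, body) where the key is the SHA-256 of the exact bytes stored."""
    # Canonical form: sorted keys and no whitespace, so equal events give equal bytes.
    body = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = hashlib.sha256(body).hexdigest()
    return f"{prefix}{digest}.json", body
```

Because the key is derived from the content, writing the same event twice targets the same object, and any modified copy necessarily lands under a different key.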
Step 3 — Hash-chain records for tamper evidence
Compute each record’s hash over its serialized content plus the previous record’s hash. Store both in the event.
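import hashlib
import json

def chain_hash(event: dict, prev_hash: str) -> str:
    # Hash the event body only: drop hash and prev_hash so the result matches
    # what a verifier recomputes after removing those two fields
    body = {k: v for k, v in event.items() if k not in ("hash", "prev_hash")}
    # Canonical serialization: sort keys, no whitespace
    payload = json.dumps(body, sort_keys=True, separators=(',', ':'))
    chain_input = f"{prev_hash}:{payload}"
    return hashlib.sha256(chain_input.encode()).hexdigest()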
To verify the chain, replay every record in order and recompute each hash. The first mismatch marks the earliest altered, missing, or reordered record. Run this verification on a schedule (daily) and before exporting evidence for an audit.
Pitfall: storing only the hash without the prior hash reference makes the chain unverifiable in isolation. Always store both prev_hash and hash in the record body, not just in a sidecar index.
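On the writer side, threading prev_hash through a sequence of events might look like the following sketch. The names seal and build_chain are hypothetical, and the all-zero genesis value is an assumption that must match whatever sentinel your first real record used.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed sentinel for the first record in the chain


def seal(event: dict, prev_hash: str) -> dict:
    """Return a copy of the event with prev_hash and hash stored in the body."""
    # Hash only the event content, never the two chain fields themselves.
    body = {k: v for k, v in event.items() if k not in ("hash", "prev_hash")}
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(f"{prev_hash}:{payload}".encode()).hexdigest()
    return {**body, "prev_hash": prev_hash, "hash": digest}


def build_chain(events: list) -> list:
    """Seal events in order so each record points at the one before it."""
    chain, prev = [], GENESIS
    for event in events:
        record = seal(event, prev)
        chain.append(record)
        prev = record["hash"]
    return chain
```

Each sealed record carries both fields, so any contiguous slice of the log can be checked given only the hash that precedes it.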
Step 4 — Set retention and cold-storage tiering
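SOC2 does not prescribe a fixed retention period, but a Type II audit reviews a window of operation (commonly 12 months), so keep at least that much log history readily queryable and retain evidence for the full audit period. GDPR narrows retention for PII-adjacent data. Masking PII in the evaluation context upstream means audit events carry hashed identifiers rather than raw fields, which supports the longer retention window. Hashed identifiers are generally still pseudonymized personal data under GDPR, so document the lawful basis for keeping them.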
# terraform: S3 lifecycle for compliance audit bucket
# Assumes aws_s3_bucket.audit_logs is defined elsewhere with
# bucket = "compliance-flag-audit-logs", versioning on, and object lock enabled.
resource "aws_s3_bucket_lifecycle_configuration" "audit_retention" {
  bucket = aws_s3_bucket.audit_logs.id
  rule {
    id     = "hot-to-glacier"
    status = "Enabled"
    filter {} # apply the rule to every object in the bucket
    transition {
      days          = 90
      storage_class = "GLACIER_IR" # instant retrieval for audit window queries
    }
    expiration {
      days = 2555 # about 7 years for financial / SOC2 long-tail
    }
  }
}
resource "aws_s3_bucket_object_lock_configuration" "worm" {
  bucket = aws_s3_bucket.audit_logs.id
  rule {
    default_retention {
      mode = "COMPLIANCE"
      days = 2555
    }
  }
}
Step 5 — Export a signed, time-bounded evidence package
When an auditor requests evidence for a specific control period, produce a signed export rather than granting direct log access.
#!/usr/bin/env bash
# export-evidence.sh — produce a tamper-evident export for a date range
set -euo pipefail
START_DATE="${1:?Usage: $0 YYYY-MM-DD YYYY-MM-DD}"
END_DATE="${2:?}"
OUT="flag-audit-evidence_${START_DATE}_${END_DATE}.jsonl"
# Pull events from the immutable store
curl -sf "https://audit-api.internal/v1/export" \
-H "Authorization: Bearer ${AUDIT_TOKEN}" \
--data-urlencode "start=${START_DATE}" \
--data-urlencode "end=${END_DATE}" \
-o "${OUT}"
# Sign, then verify the signature; abort if verification fails
openssl dgst -sha256 -sign "${AUDIT_SIGNING_KEY}" -out "${OUT}.sig" "${OUT}"
if ! openssl dgst -sha256 -verify "${AUDIT_VERIFY_KEY}" -signature "${OUT}.sig" "${OUT}"; then
  echo "Signature verification failed: ${OUT}" >&2
  exit 1
fi
echo "Evidence export verified: ${OUT}"
echo "SHA-256: $(sha256sum "${OUT}" | awk '{print $1}')"
Deliver ${OUT} and ${OUT}.sig together. The auditor verifies the signature with the public key registered in your security controls documentation.
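A time-bounded export depends on a clear rule for which events fall inside the control period. The sketch below shows one possible rule, assuming the end date is inclusive and that boundaries are midnight UTC. Both choices are assumptions that the export API would need to document, and in_window is a hypothetical helper name.

```python
from datetime import date, datetime, time, timedelta, timezone


def in_window(timestamp_utc: str, start: date, end: date) -> bool:
    """True if the event falls in [start 00:00 UTC, day after end 00:00 UTC)."""
    # Normalize a trailing "Z" so older Python versions can parse the timestamp.
    ts = datetime.fromisoformat(timestamp_utc.replace("Z", "+00:00"))
    lower = datetime.combine(start, time.min, tzinfo=timezone.utc)
    upper = datetime.combine(end + timedelta(days=1), time.min, tzinfo=timezone.utc)
    return lower <= ts < upper
```

A half-open interval means two adjacent export windows never include the same event twice and never drop one at the boundary.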
Verification & Testing
Verify chain integrity before any audit submission:
# chain-verify.py: replay the log and recompute every hash
python3 - <<'EOF'
import hashlib, json, sys

prev_hash = "0" * 64  # genesis sentinel
count = 0             # stays 0 if the export file is empty
with open("flag-audit-evidence_2026-01-01_2026-06-20.jsonl") as f:
    for count, line in enumerate(f, 1):
        rec = json.loads(line)
        stored_hash = rec.pop("hash")
        prev_ref = rec.pop("prev_hash")
        if prev_ref != prev_hash:
            sys.exit(f"Chain break at record {count}: prev_hash mismatch")
        # Same canonical form as the writer: sorted keys, no whitespace
        payload = json.dumps(rec, sort_keys=True, separators=(',', ':'))
        computed = hashlib.sha256(f"{prev_hash}:{payload}".encode()).hexdigest()
        if computed != stored_hash:
            sys.exit(f"Tamper detected at record {count}: hash mismatch")
        prev_hash = stored_hash
print(f"Chain verified: {count} records, no tampering detected.")
EOF
Troubleshooting & FAQ
Why does the chain-verification script report a break at record 1?
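The genesis sentinel (prev_hash = "0" * 64) in the script must match the sentinel used when the first event was written. If the service initialized with a different sentinel (empty string, a UUID), every record fails. Check the prev_hash field in your very first audit event and update the script’s initial value to match. A time-bounded export that starts partway through the log hits the same error: in that case, seed prev_hash with the hash of the last record before the export window instead of the genesis sentinel.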
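One way to make the starting point explicit is to pass it in as a parameter, so the same check works from the genesis sentinel or from the middle of the log. This is a sketch with hypothetical names (record_hash, first_break); it assumes records are hashed over their body without the hash and prev_hash fields, using sorted keys and no whitespace.

```python
import hashlib
import json
from typing import Optional


def record_hash(record: dict, prev_hash: str) -> str:
    # Hash the record body only, in canonical form.
    body = {k: v for k, v in record.items() if k not in ("hash", "prev_hash")}
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{prev_hash}:{payload}".encode()).hexdigest()


def first_break(records: list, anchor_hash: str) -> Optional[int]:
    """Return the 1-based position of the first bad record, or None if intact.

    anchor_hash is the genesis sentinel for a full log, or the hash of the
    record immediately before the first one in a partial export.
    """
    prev = anchor_hash
    for position, record in enumerate(records, 1):
        if record.get("prev_hash") != prev:
            return position
        if record_hash(record, prev) != record.get("hash"):
            return position
        prev = record["hash"]
    return None
```

If first_break returns 1 with the sentinel you expected, compare the first record's stored prev_hash against that sentinel before suspecting tampering.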
How do I handle PII in audit events without breaking the chain?
Hash user identifiers before they enter the event schema. The actor.id can be a work email for internal actors: it is still personal data under GDPR, but recording who made a change is the purpose of the audit trail, so keep it and document the lawful basis. targetingKey values from evaluation context must be hashed. See GDPR compliance for feature flags for the hashing approach. The hash function must be deterministic: the same input always produces the same output, so you can correlate across records without reversing to PII.
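A deterministic identifier hash might be sketched as below. It uses a keyed hash (HMAC-SHA256) rather than a bare SHA-256, which is a choice made for this example and not necessarily the approach in the linked GDPR guide; the function name pseudonymize and the way the secret is supplied are likewise assumptions.

```python
import hashlib
import hmac


def pseudonymize(targeting_key: str, secret: bytes) -> str:
    """Map an identifier to a stable token that cannot be recomputed without the secret."""
    # Same key and same input always yield the same 64-character hex digest.
    return hmac.new(secret, targeting_key.encode("utf-8"), hashlib.sha256).hexdigest()
```

The key matters because identifiers such as emails or sequential user IDs are low-entropy: an unkeyed hash of them can be reversed by hashing every candidate value. Rotating the secret breaks correlation with older records, so treat rotation as a deliberate retention decision.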
What is the difference between an audit event and an evaluation event?
An audit event records a mutation to flag configuration: who changed the rule, what the rule was before, what it is now. An evaluation event records that a flag was resolved for a specific request: which variant was returned, what context was used. Audit events are low-volume, high-retention, compliance-grade. Evaluation events are high-volume, short-retention, observability-grade. Keep them in separate pipelines and separate retention policies.
Which SOC2 controls does this satisfy?
The schema and pipeline described here directly address CC6.1 (logical access), CC6.3 (role-based access authorization), and CC8.1 (change management). Map each field: actor + role → CC6.1/CC6.3; before/after + reason + approver → CC8.1. See SOC2 evidence collection for flag changes for the full control mapping and the automated pull workflow.
Performance & Scale
Audit events are low-volume relative to evaluation events — a busy platform might see a few hundred flag mutations per day, not per second. The bottleneck is almost never throughput; it is latency in the transactional outbox relay. Keep the relay SLA under 5 seconds so audit records are queryable before the change has propagated to every replica. For kill switch scenarios, the audit event must land before the incident retrospective starts — wire the relay as a first-class service, not an afterthought batch job.
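Relay latency can be measured per event by comparing the commit time with the time the log sink acknowledged it. The sketch below assumes both are recorded as ISO 8601 UTC strings; the names relay_lag_seconds and breaches_sla are hypothetical, and the 5-second threshold is the target stated above.

```python
from datetime import datetime

RELAY_SLA_SECONDS = 5.0  # target from the guidance above


def _parse(value: str) -> datetime:
    # Normalize a trailing "Z" so older Python versions can parse the timestamp.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def relay_lag_seconds(committed_at: str, delivered_at: str) -> float:
    """Seconds between the flag write committing and the audit event landing."""
    return (_parse(delivered_at) - _parse(committed_at)).total_seconds()


def breaches_sla(committed_at: str, delivered_at: str) -> bool:
    return relay_lag_seconds(committed_at, delivered_at) > RELAY_SLA_SECONDS
```

Alerting on the worst observed lag, not the average, is what catches a stalled relay during an incident.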