Backend Evaluation & Server-Side SDKs
Server-side evaluation keeps targeting logic, segmentation rules, and sensitive context inside your trust boundary, resolving flags deterministically before a byte reaches the client. This overview owns the backend half of a controlled rollout system: how a server-side SDK bootstraps, how it assembles an evaluation context, how rules are compiled and cached for sub-millisecond resolution, and how the whole path stays safe when the control plane degrades.
Architecture Overview & Client-Server Boundaries
Strict client-server boundaries keep sensitive targeting logic out of untrusted environments. Backend evaluation guarantees deterministic resolution while minimizing frontend payload — the browser never sees the rule tree, only the resolved variant. Stateless evaluation calls a remote endpoint per request and trades latency for zero local state; stateful (local) evaluation downloads the rule set once and resolves in-process for sub-millisecond reads. Most production services choose local evaluation and treat the network only as a sync channel.
OpenFeature standardizes the provider abstraction so application code calls one vendor-neutral API while the provider behind it can be flagd, a SaaS vendor, or a custom backend. A declarative flag definition is the contract between the control plane and the engine:
# flagd-format definition — keys use namespace.service.feature
flags:
checkout.payments.express-pay:
state: ENABLED
variants: { "on": true, "off": false }
defaultVariant: "off"
targeting:
if:
- { "==": [ { var: "tenantTier" }, "enterprise" ] }
- "on"
- "off"
Lifecycle & Governance
A structured lifecycle enforces governance across development, staging, and production. Creation requires schema validation; rollout phases mandate gradual exposure thresholds; retirement is driven by usage telemetry rather than guesswork. Promotion between environments should be automated and diffable — see multi-environment flag promotion pipelines for drift detection and GitOps gates, and designing a scalable flag taxonomy for the metadata schema that makes ownership and expiry enforceable. Mandatory deprecation windows prevent the configuration sprawl that quietly inflates evaluation cost.
Ecosystem Integration & DevOps Alignment
Treating flag configuration as code enables version control, peer review, and automated validation. Declarative definitions get linted in CI; dry-run evaluations verify targeting against synthetic contexts before a deploy reaches production. Webhook orchestration propagates config changes to caches and warms them ahead of a rollout, and observability correlation ties each flag decision to a trace so you can answer “which variant did this request see?” during an incident.
# CI gate: validate definitions, then dry-run against flagd before deploy
npx ajv-cli validate -s ./flags/schema.json -d './flags/*.json'
curl -sf -X POST http://localhost:8013/schema.v1.Service/ResolveBoolean \
-H 'Content-Type: application/json' \
-d '{"flagKey":"checkout.payments.express-pay","context":{"targetingKey":"ci-123","tenantTier":"enterprise"}}' \
| jq -e '.reason=="TARGETING_MATCH" or .reason=="DEFAULT"' || { echo "flag validation failed"; exit 1; }
Progressive Delivery & Experimentation
Server-side resolution is where progressive delivery actually executes: deterministic bucketing assigns each targetingKey to a stable variant so a user does not flip cohorts between replicas, and traffic is shifted by percentage rather than by redeploy. Experiment integrity depends on guardrail metrics that can auto-halt a rollout when error rate or latency regresses, and on attribution that records the resolved variant alongside the outcome event without adding blocking I/O to the request path.
Operational Safety
Every evaluation call needs a strict timeout and a safe default — an unhandled null variant is how flag infrastructure takes down a request path it was supposed to protect. A circuit breaker around the provider isolates a failing control plane, and a kill switch bypasses complex targeting to force a known-good state instantly, without a code deploy. Choosing the right sync transport matters here too: polling versus streaming determines how fast a kill switch actually propagates to every node.
func evaluateWithResilience(ctx context.Context, c openfeature.IClient, key string) bool {
evalCtx, cancel := context.WithTimeout(ctx, 50*time.Millisecond)
defer cancel()
val, err := c.BooleanValue(evalCtx, key, false, openfeature.EvaluationContext{})
if err != nil { breaker.RecordFailure(); return false } // safe default
breaker.RecordSuccess()
return val
}
Compliance & Audit
Immutable audit trails capture every flag mutation, the actor, the evaluator context, and a resolution timestamp — the raw material for an incident post-mortem and a security review alike. Role-based access control restricts who can change production targeting, and retention policies align evaluation telemetry with regulatory boundaries. The mechanics of structuring those logs, satisfying SOC2 evidence requests, and masking personal data in the evaluation context are covered in the compliance and context-enrichment guides.
Key Concepts at a Glance
- Server-side SDK integration — initialization lifecycle, dependency injection, and resilience for high-throughput services.
- Context enrichment — assembling sanitized user, tenant, and request attributes for precise targeting.
- Distributed caching — cache topologies and consistency that keep evaluation latency flat across nodes.
- Rule-engine performance — AST compilation and latency budgets for sub-5ms resolution.
- Flag synchronization transport — the polling-versus-streaming decision and its propagation guarantees.
Troubleshooting & FAQ
Should I evaluate flags server-side or in the browser?
Evaluate server-side whenever targeting depends on sensitive attributes (entitlements, plan tier, internal segments) or when rule logic must stay private. Resolve in-process and send only the variant to the client. Use client-side evaluation for purely presentational toggles where exposing the rule has no cost.
Why does a flag return its default variant in production but not in staging?
The provider usually is not connected: a missing SDK key, a blocked egress route to the control plane, or an initialization race where the first request runs before the rule set finished loading. Confirm with a readiness probe on the SDK and check the evaluation reason — DEFAULT with errorCode PROVIDER_NOT_READY points straight at bootstrap ordering.
How do I keep flag evaluation off the critical latency path?
Use local (in-process) evaluation against a pre-compiled rule set, cap every call with a short timeout, and return a safe default on error. Never make a synchronous network call per evaluation on the request path; let a background stream or poll keep the local rule set fresh.