Backend Evaluation & Server-Side SDKs

Server-side evaluation keeps targeting logic, segmentation rules, and sensitive context inside your trust boundary, resolving flags deterministically before a byte reaches the client. This overview owns the backend half of a controlled rollout system: how a server-side SDK bootstraps, how it assembles an evaluation context, how rules are compiled and cached for sub-millisecond resolution, and how the whole path stays safe when the control plane degrades.

Server-side flag evaluation data flow A request enters a service that hosts an OpenFeature SDK and provider; the provider evaluates against a cached, pre-compiled rule set synced from the control plane, and emits telemetry. Request user + context Service (in-process) OpenFeature SDK + provider Rule engine compiled AST Local cache TTL + fallback Control plane flag config stream / poll Telemetry traces + audit
Backend evaluation resolves flags in-process against a pre-compiled rule set kept warm by a local cache, syncing config from the control plane and emitting traces and audit events.

Architecture Overview & Client-Server Boundaries

Strict client-server boundaries keep sensitive targeting logic out of untrusted environments. Backend evaluation guarantees deterministic resolution while minimizing frontend payload — the browser never sees the rule tree, only the resolved variant. Stateless evaluation calls a remote endpoint per request and trades latency for zero local state; stateful (local) evaluation downloads the rule set once and resolves in-process for sub-millisecond reads. Most production services choose local evaluation and treat the network only as a sync channel.

OpenFeature standardizes the provider abstraction so application code calls one vendor-neutral API while the provider behind it can be flagd, a SaaS vendor, or a custom backend. A declarative flag definition is the contract between the control plane and the engine:

# flagd-format definition — keys use namespace.service.feature
flags:
  checkout.payments.express-pay:
    state: ENABLED
    variants: { "on": true, "off": false }
    defaultVariant: "off"
    targeting:
      if:
        - { "==": [ { var: "tenantTier" }, "enterprise" ] }
        - "on"
        - "off"

Lifecycle & Governance

A structured lifecycle enforces governance across development, staging, and production. Creation requires schema validation; rollout phases mandate gradual exposure thresholds; retirement is driven by usage telemetry rather than guesswork. Promotion between environments should be automated and diffable — see multi-environment flag promotion pipelines for drift detection and GitOps gates, and designing a scalable flag taxonomy for the metadata schema that makes ownership and expiry enforceable. Mandatory deprecation windows prevent the configuration sprawl that quietly inflates evaluation cost.

Ecosystem Integration & DevOps Alignment

Treating flag configuration as code enables version control, peer review, and automated validation. Declarative definitions get linted in CI; dry-run evaluations verify targeting against synthetic contexts before a deploy reaches production. Webhook orchestration propagates config changes to caches and warms them ahead of a rollout, and observability correlation ties each flag decision to a trace so you can answer “which variant did this request see?” during an incident.

# CI gate: validate definitions, then dry-run against flagd before deploy
npx ajv-cli validate -s ./flags/schema.json -d './flags/*.json'
curl -sf -X POST http://localhost:8013/schema.v1.Service/ResolveBoolean \
  -H 'Content-Type: application/json' \
  -d '{"flagKey":"checkout.payments.express-pay","context":{"targetingKey":"ci-123","tenantTier":"enterprise"}}' \
  | jq -e '.reason=="TARGETING_MATCH" or .reason=="DEFAULT"' || { echo "flag validation failed"; exit 1; }

Progressive Delivery & Experimentation

Server-side resolution is where progressive delivery actually executes: deterministic bucketing assigns each targetingKey to a stable variant so a user does not flip cohorts between replicas, and traffic is shifted by percentage rather than by redeploy. Experiment integrity depends on guardrail metrics that can auto-halt a rollout when error rate or latency regresses, and on attribution that records the resolved variant alongside the outcome event without adding blocking I/O to the request path.

Operational Safety

Every evaluation call needs a strict timeout and a safe default — an unhandled null variant is how flag infrastructure takes down a request path it was supposed to protect. A circuit breaker around the provider isolates a failing control plane, and a kill switch bypasses complex targeting to force a known-good state instantly, without a code deploy. Choosing the right sync transport matters here too: polling versus streaming determines how fast a kill switch actually propagates to every node.

func evaluateWithResilience(ctx context.Context, c openfeature.IClient, key string) bool {
    evalCtx, cancel := context.WithTimeout(ctx, 50*time.Millisecond)
    defer cancel()
    val, err := c.BooleanValue(evalCtx, key, false, openfeature.EvaluationContext{})
    if err != nil { breaker.RecordFailure(); return false } // safe default
    breaker.RecordSuccess()
    return val
}

Compliance & Audit

Immutable audit trails capture every flag mutation, the actor, the evaluator context, and a resolution timestamp — the raw material for an incident post-mortem and a security review alike. Role-based access control restricts who can change production targeting, and retention policies align evaluation telemetry with regulatory boundaries. The mechanics of structuring those logs, satisfying SOC2 evidence requests, and masking personal data in the evaluation context are covered in the compliance and context-enrichment guides.

Key Concepts at a Glance

Troubleshooting & FAQ

Should I evaluate flags server-side or in the browser?

Evaluate server-side whenever targeting depends on sensitive attributes (entitlements, plan tier, internal segments) or when rule logic must stay private. Resolve in-process and send only the variant to the client. Use client-side evaluation for purely presentational toggles where exposing the rule has no cost.

Why does a flag return its default variant in production but not in staging?

The provider usually is not connected: a missing SDK key, a blocked egress route to the control plane, or an initialization race where the first request runs before the rule set finished loading. Confirm with a readiness probe on the SDK and check the evaluation reasonDEFAULT with errorCode PROVIDER_NOT_READY points straight at bootstrap ordering.

How do I keep flag evaluation off the critical latency path?

Use local (in-process) evaluation against a pre-compiled rule set, cap every call with a short timeout, and return a safe default on error. Never make a synchronous network call per evaluation on the request path; let a background stream or poll keep the local rule set fresh.