Implementing Progressive Delivery Workflows

This guide is part of the Feature Flag Architecture & Lifecycle Management series. Progressive delivery decouples a code deployment from a user-facing release — the binary lands in production but only a fraction of traffic sees it, and that fraction grows only when measured signals stay healthy. Feature flags are the control knobs: they define who sees what variant, when exposure shifts, and what triggers an instant rollback without a redeploy.

This guide covers canary releases, ring deployments, percentage ramps, automated traffic shifting, and rollback triggers. It does not cover the mechanics of how flags are stored, versioned, or named — see designing a scalable flag taxonomy for that — nor the specifics of stale-flag retirement after a rollout completes, which is covered in managing flag deprecation and cleanup.

[Figure: progressive delivery ramp and canary diagram. Phases: 1% Canary, 10% Ring 1, 25% Ring 2, 50% Ring 3, 100% Full, with auto-rollback on a guardrail breach.]
Each ring widens traffic exposure; a guardrail breach at any phase triggers an automatic rollback to the previous ring.

Prerequisites

Core Concept & Architecture

Progressive delivery sits at the intersection of deployment automation and flag evaluation. The deployment pipeline brings new code to every host simultaneously; the flag controls what percentage of requests activate it. This separates two previously coupled events — “code reaches production” and “users see the change” — and makes the second one a continuous, measurable process rather than a binary switch.

Three patterns dominate production use:

| Pattern | Traffic control | Best for |
| --- | --- | --- |
| Canary | 1–5% of real users | Catch low-rate defects before broad exposure |
| Ring deployment | Internal → early adopters → general | Risk segmentation by user population |
| Percentage ramp | Linear or exponential ramp on a schedule | Smooth rollouts with SLO gates between steps |

All three use the same underlying mechanism: a flag that maps a targetingKey to a variant based on a bucketing rule. The key requirement is stickiness — a given user must see the same variant on every request as the percentage climbs, across all replicas. The how-to for that is covered in percentage-based rollout with sticky bucketing.
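To make the stickiness requirement concrete, here is a minimal Python sketch of deterministic bucketing. It is an illustration under stated assumptions, not the algorithm of any particular provider: SHA-256 stands in for whatever hash your provider uses, and `bucket` and `resolve` are hypothetical helper names. The property it shows is that a user who lands in the 1% cohort remains in the cohort at 10%.

```python
import hashlib

FLAG_KEY = "checkout.payments.express-pay"


def bucket(flag_key: str, targeting_key: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    # SHA-256 is an assumption for illustration; any stable hash works.
    digest = hashlib.sha256(f"{flag_key}:{targeting_key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def resolve(flag_key: str, targeting_key: str, rollout_pct: int) -> str:
    """Return "on" when the user's bucket falls below the rollout percentage."""
    return "on" if bucket(flag_key, targeting_key) < rollout_pct else "off"


users = [f"user-{i}" for i in range(10_000)]
on_at_1 = {u for u in users if resolve(FLAG_KEY, u, 1) == "on"}
on_at_10 = {u for u in users if resolve(FLAG_KEY, u, 10) == "on"}

# Raising the percentage only adds users; nobody already "on" flips back.
print(on_at_1 <= on_at_10)
```

Because the bucket depends only on the flag key and the targeting key, every replica computes the same answer without coordination. Including the flag key in the hash input keeps different flags from always selecting the same users.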

Step-by-Step Implementation

Step 1 — Define the flag with a percentage rollout rule

Start with 1% and make the ramp schedule explicit in flag metadata so operators know what phase they’re reading.

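# flagd-format definition; flag keys follow the namespace.service.feature convention
# Note: recent flagd releases name this operator "fractional"; check which name your version expects.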
flags:
  checkout.payments.express-pay:
    state: ENABLED
    variants: { "on": true, "off": false }
    defaultVariant: "off"
    targeting:
      fractionalEvaluation:
        - { "var": "targetingKey" }
        - ["on", 1]     # 1% get "on"
        - ["off", 99]   # 99% get "off"

Pitfall: using a non-deterministic bucketing source (random UUID per request, server timestamp) breaks stickiness — a user can flip between variants on consecutive requests. Always derive the bucket from a stable identity attribute like userId or sessionId.

Step 2 — Gate progression on guardrail metrics

Automate the ramp: measure, compare to thresholds, and advance only when metrics are healthy. Never advance manually on a schedule without a metric check.

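// metrics, flagAPI, alerting, and rollbackFlag are assumed platform clients,
// not part of the OpenFeature SDK; replace them with your own integrations.
declare const metrics: { query(expr: string): Promise<number> };
declare const flagAPI: { setRolloutPercentage(flagKey: string, pct: number): Promise<void> };
declare const alerting: { fire(event: string, details: Record<string, number>): void };
declare function rollbackFlag(flagKey: string, fromPct: number): Promise<void>;

const MAX_ERROR_RATE = 0.01; // 1% of requests returning 5xx
const MAX_P95_MS = 200;

async function advanceRollout(flagKey: string, currentPct: number): Promise<void> {
  // Query only the "on" variant so control-group traffic does not dilute the signal.
  const errorRate = await metrics.query(`rate_5xx{flag="${flagKey}",variant="on"}`);
  const p95Latency = await metrics.query(`p95_ms{flag="${flagKey}",variant="on"}`);

  if (errorRate > MAX_ERROR_RATE || p95Latency > MAX_P95_MS) {
    await rollbackFlag(flagKey, currentPct);
    alerting.fire(`rollback.${flagKey}`, { errorRate, p95Latency });
    return;
  }

  const nextPct = Math.min(currentPct * 2, 100); // exponential: 1→2→4→8→16→32→64→100
  await flagAPI.setRolloutPercentage(flagKey, nextPct);
  console.log(`${flagKey} advanced to ${nextPct}%`);
}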

Pitfall: measuring error rate globally rather than per-variant conflates control-group errors with variant errors. Attribute metrics to the resolved variant (available from the evaluation details returned by the flag call, not from the evaluation context) so the signal is clean.
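The effect of this pitfall is easy to see with numbers. The following Python sketch uses made-up request records (the `RequestRecord` type and the traffic mix are assumptions for illustration) to compare a global error rate with per-variant error rates at a 1% rollout.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestRecord:
    variant: str  # variant resolved for this request, e.g. "on" or "off"
    status: int   # HTTP status code returned to the caller


def error_rate_by_variant(records):
    """Return the fraction of 5xx responses for each variant."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for record in records:
        totals[record.variant] += 1
        if record.status >= 500:
            errors[record.variant] += 1
    return {variant: errors[variant] / totals[variant] for variant in totals}


# Illustrative traffic: 990 control requests, 10 variant requests, 2 of which fail.
records = (
    [RequestRecord("off", 200)] * 990
    + [RequestRecord("on", 200)] * 8
    + [RequestRecord("on", 500)] * 2
)

rates = error_rate_by_variant(records)
global_rate = sum(r.status >= 500 for r in records) / len(records)

print(f"global: {global_rate:.3f}, on: {rates['on']:.3f}, off: {rates['off']:.3f}")
```

In this sample the global rate sits well under a 1% guardrail while a fifth of the variant's requests fail, which is why the gate should read the per-variant number.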

Step 3 — Configure ring deployments for user population segmentation

Ring deployments restrict early exposure to low-risk populations (internal employees, beta opt-ins) before exposing the general user base.

flags:
  checkout.payments.express-pay:
    state: ENABLED
    variants: { "on": true, "off": false }
    defaultVariant: "off"
    targeting:
      if:
        - { "in": [ { "var": "ring" }, ["internal", "beta"] ] }
        - "on"
        - fractionalEvaluation:
            - { "var": "targetingKey" }
            - ["on", 5]
            - ["off", 95]

This rule resolves on for internal/beta users unconditionally, and gives 5% of the general population the new variant. Promote to the next ring by updating the fractionalEvaluation percentages after the internal ring shows no regressions.

Pitfall: defining rings as separate flags rather than targeting rules in one flag creates attribution confusion — you can’t compare variant outcomes across rings when the flag key differs.
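The order of the rules matters: the ring check runs first and the percentage rule is only a fallback. A small Python sketch of that decision order follows. It models the logic of the targeting rule rather than flagd's implementation, and `resolve_variant`, `bucket`, and the SHA-256 hash are assumptions for illustration.

```python
import hashlib

EARLY_RINGS = {"internal", "beta"}


def bucket(targeting_key: str) -> int:
    """Stable bucket in 0..99 derived from the targeting key (hash choice is illustrative)."""
    digest = hashlib.sha256(targeting_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def resolve_variant(context: dict, general_pct: int) -> str:
    """Early rings always get "on"; everyone else goes through the percentage rule."""
    if context.get("ring") in EARLY_RINGS:
        return "on"
    return "on" if bucket(context["targetingKey"]) < general_pct else "off"


# An internal user resolves "on" even when the general percentage is 0.
print(resolve_variant({"targetingKey": "emp-42", "ring": "internal"}, general_pct=0))
```

Promoting the rollout means changing only `general_pct`; the ring membership rule stays untouched, so early-ring users never lose the variant mid-rollout.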

Step 4 — Wire automated rollback triggers

The kill switch for a rollout should fire without human intervention when a guardrail breaches. Wire a webhook from your alerting system to a flag-update endpoint.

# FastAPI webhook: automated rollback on SLO breach
import os

import httpx
from fastapi import FastAPI, Request

app = FastAPI()

GUARDED_FLAG = "checkout.payments.express-pay"
ERROR_RATE_THRESHOLD = 0.01  # roll back when more than 1% of requests fail

@app.post("/webhook/slo-alert")
async def rollback_on_breach(request: Request):
    payload = await request.json()
    flag_key = payload.get("labels", {}).get("flag_key", "")
    metric = payload.get("metric", "")
    value = float(payload.get("value", 0.0))  # tolerate numeric strings in the alert payload

    if flag_key == GUARDED_FLAG and metric == "error_rate" and value > ERROR_RATE_THRESHOLD:
        async with httpx.AsyncClient(timeout=5.0) as http:
            response = await http.patch(
                f"https://flags.internal/v1/flags/{flag_key}/rollout",
                json={"percentage": 0},
                headers={"Authorization": f"Bearer {os.environ['FLAG_API_TOKEN']}"},
            )
            response.raise_for_status()  # surface a failed rollback instead of reporting success
        return {"status": "rolled_back", "flag": flag_key}
    return {"status": "ignored"}

Pitfall: a rollback that sets the percentage to 0 is not the same as forcing the safe variant. If the SDK default in code differs from defaultVariant, behaviour is ambiguous. Force the explicit variant — see the emergency kill-switch runbook for the production-safe approach.
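One way to picture "forcing the explicit variant" is as a transformation of the flag definition: drop the targeting rule and pin `defaultVariant` to the safe value, so the result no longer depends on any percentage. The Python sketch below assumes the definition is held as a dict shaped like the YAML above; `force_variant` is a hypothetical helper, and how the result is pushed to your flag store is not shown.

```python
import copy


def force_variant(flag: dict, safe_variant: str) -> dict:
    """Return a copy of the flag that always resolves to safe_variant."""
    if safe_variant not in flag["variants"]:
        raise ValueError(f"unknown variant: {safe_variant!r}")
    forced = copy.deepcopy(flag)
    forced["defaultVariant"] = safe_variant
    # With no targeting rule left, evaluation falls through to defaultVariant.
    forced.pop("targeting", None)
    return forced


flag = {
    "state": "ENABLED",
    "variants": {"on": True, "off": False},
    "defaultVariant": "off",
    "targeting": {"fractionalEvaluation": [{"var": "targetingKey"}, ["on", 5], ["off", 95]]},
}

rolled_back = force_variant(flag, "off")
print(rolled_back)
```

The helper returns a new dict and leaves the original untouched, which makes it straightforward to keep the pre-rollback definition for the audit trail.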

Verification & Testing

After each ring advance, confirm both the exposure percentage and the metric signal:

# Confirm the configured targeting rule reflects the target percentage
flagctl get checkout.payments.express-pay --env prod -o json | jq '.targeting'

# Spot-check variant resolution across replicas
for host in $(cat replicas.txt); do
  curl -s "$host/debug/flags/checkout.payments.express-pay" | jq -r '.variant'
done | sort | uniq -c
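# if the endpoint evaluates one fixed targetingKey, expect a single line (all replicas agree);
# expect ~1 in 100 lines to show "on" at 1% only if each call evaluates a different key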
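When checking the exposure percentage against sampled evaluations, the observed fraction will rarely equal the target exactly. A hedged Python sketch of a tolerance check follows; it uses the normal approximation to the binomial distribution, and the three-standard-error band (`z=3.0`) and the function name are assumptions to tune for your sample sizes.

```python
import math


def exposure_within_tolerance(on_count: int, total: int, target_pct: float, z: float = 3.0) -> bool:
    """Check whether the observed "on" fraction is plausible for the target percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    target = target_pct / 100
    observed = on_count / total
    # Standard error of a proportion under the normal approximation.
    stderr = math.sqrt(target * (1 - target) / total)
    return abs(observed - target) <= z * stderr


# 110 of 10,000 sampled evaluations resolved "on" at a 1% target.
print(exposure_within_tolerance(110, 10_000, target_pct=1))
```

With small samples the band is wide, so a passing check says little; sample enough evaluations that a meaningful deviation would fall outside it.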

For ring deployments, also verify that internal users resolve on regardless of percentage:

curl -s -X POST http://flagd.internal:8013/schema.v1.Service/ResolveBoolean \
  -H 'Content-Type: application/json' \
  -d '{"flagKey":"checkout.payments.express-pay","context":{"targetingKey":"emp-42","ring":"internal"}}' \
  | jq '.value'   # must be true

Troubleshooting & FAQ

Why do users flip between variants as the percentage increases?

The bucketing hash is not stable. Either the targetingKey changes between requests (re-generated session ID, anonymous → logged-in transition) or the provider is not using a deterministic hash. Confirm the key is stable for the session lifetime and that the provider uses a repeatable hashing function. See percentage-based rollout with sticky bucketing for the full fix.

My rollback webhook fired, but some replicas still serve the new variant — why?

The flag change has not yet propagated to all replicas. Propagation speed depends on your sync transport: streaming typically delivers a change within about a second, while polling can lag by up to one full poll interval. Check the sync connection state on lagging replicas and verify the transport is healthy. For transport selection guidance see polling vs streaming flag synchronization.

How do I attribute a metric regression to a specific variant without changing the data model?

Resolve the variant in-process and attach it as a structured attribute on the trace or log event at the point of the flag call. OpenFeature hooks expose an after stage that runs once evaluation succeeds and receives the evaluation details, so a hook can add the resolved variant to the active span without a separate network call.
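A sketch of such a hook in Python follows. To stay dependency-free it is duck-typed rather than subclassing the SDK's hook base class, a plain dict stands in for the active span's attributes, and the attribute names are chosen for illustration; in a real service you would adapt it to your SDK's hook interface and your tracing library.

```python
from types import SimpleNamespace


class VariantSpanHook:
    """Records the resolved variant when the hook's after stage runs."""

    def __init__(self, span_attributes: dict):
        # A dict stands in for the active span; swap in your tracer's span object.
        self.span_attributes = span_attributes

    def after(self, hook_context, details, hints):
        # `details` is assumed to carry the flag key and the resolved variant.
        self.span_attributes["feature_flag.key"] = details.flag_key
        self.span_attributes["feature_flag.variant"] = details.variant


# Simulate one successful evaluation with stand-in evaluation details.
span_attributes = {}
hook = VariantSpanHook(span_attributes)
details = SimpleNamespace(flag_key="checkout.payments.express-pay", variant="on", value=True)
hook.after(hook_context=None, details=details, hints={})

print(span_attributes)
```

Because the variant is attached where the flag is evaluated, downstream metrics can be grouped by it without changing the underlying data model.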

Should I use one flag per ring or one flag with ring-based targeting?

One flag. Multiple flags for the same feature make cross-ring attribution difficult: you can’t compare variant metrics when the flag key differs, and the audit trail is fragmented. Keep all ring rules in a single flag’s targeting configuration and promote by updating that configuration.

Performance & Scale Considerations

Percentage evaluation adds negligible overhead — it is a deterministic hash and a modulo comparison. The cost that scales with ring complexity is the evaluation context assembly: building the ring attribute requires a lookup or claim parse, which should happen once per request boundary and be cached on the request context, not re-fetched per flag call.
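A minimal Python sketch of that caching pattern, assuming a per-request object that owns the evaluation context. `RequestContext` and `lookup_ring` are hypothetical names, and the lookup here is a stub standing in for a directory query or token claim parse.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class RequestContext:
    user_id: str
    ring_lookup: Callable[[str], str]
    _ring: Optional[str] = field(default=None, init=False, repr=False)

    def evaluation_context(self) -> dict:
        """Build the flag evaluation context, resolving the ring at most once."""
        if self._ring is None:
            self._ring = self.ring_lookup(self.user_id)
        return {"targetingKey": self.user_id, "ring": self._ring}


calls = []


def lookup_ring(user_id: str) -> str:
    """Stub for an expensive lookup; records each call so reuse is visible."""
    calls.append(user_id)
    return "internal" if user_id.startswith("emp-") else "general"


ctx = RequestContext("emp-42", lookup_ring)
first = ctx.evaluation_context()   # performs the lookup
second = ctx.evaluation_context()  # reuses the cached ring

print(first, len(calls))
```

Scoping the cache to the request object avoids serving a stale ring after a user's membership changes, at the cost of one lookup per request.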

At high traffic volumes, the exponential ramp schedule (1% → 2% → 4% → 8% …) limits blast radius: a defect caught at 1% has reached only about 1 in 100 users, which gives early warning before a significant population is exposed. The same ramp structure applies to experimentation and A/B testing guardrails. The difference is that experiment variants hold at a fixed percentage long enough to collect a statistically significant sample rather than ramping to 100%.
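For reference, the doubling schedule can be generated rather than hard-coded. This is a small Python sketch; `ramp_schedule` is a hypothetical helper, and the starting percentage and growth factor are parameters to choose for your own risk tolerance.

```python
def ramp_schedule(start_pct: int = 1, factor: int = 2) -> list[int]:
    """Return rollout percentages that grow by `factor` per step, capped at 100."""
    if not 0 < start_pct <= 100:
        raise ValueError("start_pct must be in 1..100")
    if factor < 2:
        raise ValueError("factor must be at least 2")
    steps = [start_pct]
    while steps[-1] < 100:
        steps.append(min(steps[-1] * factor, 100))
    return steps


print(ramp_schedule())  # [1, 2, 4, 8, 16, 32, 64, 100]
```

Starting at 1% and doubling reaches full exposure in seven advances, so each guardrail evaluation window is multiplied by seven when estimating the minimum rollout duration.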