Precompiling Targeting Rules into an AST

This how-to is part of Optimizing Rule Engine Performance. It solves a specific, measurable problem: every time a flag is evaluated, the SDK parses the JSON targeting rule from scratch, builds an operator tree, and then discards it. For a service handling 5,000 evaluations per second that overhead — typically 1–4 ms per call — accumulates into the dominant share of your p99 evaluation latency. The fix is to parse once, compile to an in-memory AST, and reuse that tree for every subsequent evaluation of the same flag.

Parse and compile run once at init; every evaluation call does only a map lookup and an AST walk — no JSON is read on the hot path.

Prerequisites

OpenFeature server SDK ≥ 1.x installed (@openfeature/server-sdk for Node.js, openfeature-sdk for Python, or the Go SDK)
Flag targeting rules available as JsonLogic or flagd-format JSON at provider init
A p99 latency baseline showing per-evaluation parse cost (use nanosecond spans or process.hrtime)
Flag definitions using namespace.service.feature key schema
A flag-update event or callback from your provider that fires when a rule changes

Step 1 — Parse rules once at provider initialisation

The provider’s initialize hook runs once before any evaluation is served. This is where all rule parsing belongs. Load every flag definition, parse the targeting expression, and store the result.

import { OpenFeature, Provider, EvaluationContext, ResolutionDetails } from '@openfeature/server-sdk';

interface ASTNode {
  op: 'AND' | 'OR' | 'EQ' | 'IN' | 'GT' | 'NOT';
  left?: ASTNode;
  right?: ASTNode;
  attr?: string;
  value?: unknown;
  values?: unknown[];
}

type CompiledMap = Map<string, ASTNode>; // key: "flagKey@version"

function parseJsonLogic(expr: Record<string, unknown>): ASTNode {
  const [[op, args]] = Object.entries(expr) as [[string, unknown[]]];
  // JsonLogic and/or are variadic: fold N operands into a left-leaning binary tree
  const fold = (kind: 'AND' | 'OR'): ASTNode =>
    args.map((a) => parseJsonLogic(a as Record<string, unknown>))
        .reduce((left, right) => ({ op: kind, left, right }));
  switch (op) {
    case 'and':  return fold('AND');
    case 'or':   return fold('OR');
    // "!" accepts either a bare operand or a one-element array
    case '!':    return { op: 'NOT', left: parseJsonLogic((Array.isArray(args) ? args[0] : args) as any) };
    case '==':   return { op: 'EQ',  attr: (args[0] as any).var, value: args[1] };
    case 'in':   return { op: 'IN',  attr: (args[0] as any).var, values: args[1] as unknown[] };
    case '>':    return { op: 'GT',  attr: (args[0] as any).var, value: args[1] };
    default: throw new Error(`Unsupported JsonLogic op: ${op}`);
  }
}

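class CompiledRuleProvider implements Provider {
  readonly metadata = { name: 'compiled-rule-provider' } as const; // illustrative name
  readonly rulesVersion = 'flagd-1.0';
  private compiled: CompiledMap = new Map();

  async initialize(context?: EvaluationContext): Promise<void> {
    const defs = await fetchFlagDefinitions(); // your provider's fetch
    for (const def of defs) {
      const cacheKey = `${def.key}@${def.version}`;
      try {
        this.compiled.set(cacheKey, parseJsonLogic(def.targeting));
      } catch (err) {
        // One unsupported rule must not abort init; that flag falls back to interpreted evaluation
        console.warn(`Rule compilation failed for ${cacheKey}`, err);
      }
    }
    // Compiled map is ready before the first evaluation is served
  }

  // The resolve*Evaluation methods required by Provider are omitted from this excerpt
}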

Pitfall: if your SDK calls initialize lazily on the first evaluation rather than eagerly at startup, that first call will still pay the parse cost. Force eager initialisation with await OpenFeature.setProviderAndWait(provider) and block the HTTP server from accepting traffic until the promise resolves.
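One way to enforce that ordering is a readiness gate that the health endpoint consults. The following is a minimal, framework-agnostic sketch in Python; `ReadinessGate`, `init_rules`, and `readiness_status` are hypothetical names, and `load` stands in for whatever performs your parse-and-compile step.

```python
import threading
from typing import Callable


class ReadinessGate:
    """Reports ready only after rule compilation has finished."""

    def __init__(self) -> None:
        self._ready = threading.Event()

    def init_rules(self, load: Callable[[], None]) -> None:
        # Run the (hypothetical) parse-and-compile step, then open the gate.
        # If load() raises, the gate stays closed and the probe keeps failing.
        load()
        self._ready.set()

    def readiness_status(self) -> int:
        # HTTP status that a readiness probe handler would return.
        return 200 if self._ready.is_set() else 503
```

Wiring `readiness_status` into the probe handler means the orchestrator withholds traffic until compilation has completed, regardless of whether the SDK initialises eagerly or lazily.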

Step 2 — Compile to an AST or closure tree

Parsing builds a raw data structure; compilation turns it into a form optimised for repeated evaluation. The simplest approach is a tree of typed nodes (as above). A more aggressive option is to compile each rule into a closure that closes over its constants — this eliminates the switch dispatch on every walk.

from __future__ import annotations
from typing import Callable, Any
import json

# A compiled rule is just a callable: (context) -> bool
CompiledRule = Callable[[dict[str, Any]], bool]

def compile_rule(expr: dict) -> CompiledRule:
    """Recursively compile a JsonLogic expression into a closure."""
    if "and" in expr:
        # JsonLogic "and" is variadic: compile every operand, not just the first two
        parts = [compile_rule(sub) for sub in expr["and"]]
        # Closure captures the compiled operands; no JSON is touched at eval time
        return lambda ctx: all(part(ctx) for part in parts)

    if "==" in expr:
        attr: str = expr["=="][0]["var"]
        expected: Any = expr["=="][1]
        return lambda ctx, a=attr, e=expected: ctx.get(a) == e

    if "in" in expr:
        attr = expr["in"][0]["var"]
        allowed = frozenset(expr["in"][1])      # frozenset: O(1) membership test
        return lambda ctx, a=attr, s=allowed: ctx.get(a) in s

    raise ValueError(f"Unsupported op in: {list(expr.keys())}")

# At init, build the compiled map
_rules: dict[str, CompiledRule] = {}

def load_flag_defs(defs: list[dict]) -> None:
    global _rules
    new_rules = {}
    for d in defs:
        cache_key = f"{d['key']}@{d['version']}"
        new_rules[cache_key] = compile_rule(json.loads(d["targeting"]))
    _rules = new_rules   # atomic swap — in-flight evals finish against old map

The closure-based approach removes one layer of dispatch per predicate. For a rule with five predicates evaluated 10,000 times per second, that is roughly 50,000 fewer dispatches per second, a small but measurable reduction in CPU work. Combined with the init-time parsing from Step 1, it keeps both parsing and operator dispatch off the request path, which matters when you are targeting sub-5 ms evaluation latency.

Step 3 — Cache compiled rules keyed by version

The version key is critical: it lets you hold compiled rules for multiple versions simultaneously during a transition, and it ensures you never serve a stale compilation after a flag update.

package flagengine

import (
    "fmt"
    "sync/atomic" // atomic.Pointer requires Go 1.19+
)

type ASTNode struct {
    Op     string
    Left   *ASTNode
    Right  *ASTNode
    Attr   string
    Value  interface{}
    Values []interface{}
}

// compiledStore is swapped atomically — never mutated in place
type compiledStore map[string]*ASTNode

var store atomic.Pointer[compiledStore]

func LoadRules(defs []FlagDef) {
    newStore := make(compiledStore, len(defs))
    for _, d := range defs {
        key := fmt.Sprintf("%s@%d", d.Key, d.Version)
        newStore[key] = compileAST(d.Targeting) // pure function, no side-effects
    }
    store.Store(&newStore)
}

func LookupAST(flagKey string, version int) (*ASTNode, bool) {
    s := store.Load()
    if s == nil {
        return nil, false
    }
    node, ok := (*s)[fmt.Sprintf("%s@%d", flagKey, version)]
    return node, ok
}

Pitfall: storing flagKey alone (without the version) means a rule update and an in-flight evaluation for the old version can collide. Include the version number in the cache key and keep the old compiled tree until all in-flight evaluations for that version complete.
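A minimal sketch of one possible retention policy, written in Python for brevity: each update builds a new map that keeps the incoming version plus the one before it and drops anything older. The `with_new_version` helper and the keep-two policy are assumptions for illustration, not part of any SDK; a stricter implementation would track in-flight evaluations per version before evicting.

```python
from typing import Any, Callable, Dict

CompiledRule = Callable[[Dict[str, Any]], bool]


def with_new_version(
    rules: Dict[str, CompiledRule],
    flag_key: str,
    version: int,
    compiled: CompiledRule,
    keep: int = 2,
) -> Dict[str, CompiledRule]:
    """Return a new map with flag_key@version added and stale versions dropped.

    Assumes flag keys never contain "@" and versions are integers.
    """
    prefix = f"{flag_key}@"
    new_rules = dict(rules)  # copy; the old map is never mutated
    new_rules[f"{prefix}{version}"] = compiled
    # Collect every cached version of this flag, newest first.
    versions = sorted(
        (int(k[len(prefix):]) for k in new_rules if k.startswith(prefix)),
        reverse=True,
    )
    for stale in versions[keep:]:  # everything beyond the newest `keep`
        del new_rules[f"{prefix}{stale}"]
    return new_rules
```

Because the function returns a fresh map, the caller can publish it with a single reference swap, and evaluations that already hold the previous map continue to see the old version.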

Step 4 — Re-compile only on flag change; benchmark the result

Subscribe to flag update events from the control plane. When one arrives, compile only the changed flag and swap the store. Confirm the improvement with a benchmark.

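// Re-compile a single flag on change: one compile, plus an O(n) shallow copy of the map
func OnFlagUpdate(updated FlagDef) {
    key := fmt.Sprintf("%s@%d", updated.Key, updated.Version)
    node := compileAST(updated.Targeting) // compile once, outside the retry loop
    for {
        current := store.Load() // nil until LoadRules has run
        newStore := make(compiledStore, 1)
        if current != nil {
            newStore = make(compiledStore, len(*current)+1)
            for k, v := range *current {
                newStore[k] = v // shallow copy: existing ASTs are shared, not recompiled
            }
        }
        newStore[key] = node
        // CompareAndSwap fails if another update swapped the store meanwhile; retry on the fresh copy
        if store.CompareAndSwap(current, &newStore) {
            return
        }
    }
}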

// Benchmark: parsed-each-call vs compiled-reused
// Run: go test -bench=. -benchtime=10s
func BenchmarkParsedEachCall(b *testing.B) {
    raw := `{"and":[{"==":[{"var":"tenantTier"},"enterprise"]},{"in":[{"var":"region"},["us-east-1","eu-west-1"]]}]}`
    ctx := map[string]interface{}{"tenantTier": "enterprise", "region": "us-east-1"}
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        var expr map[string]interface{}
        json.Unmarshal([]byte(raw), &expr)   // parse on every iteration
        evaluateJsonLogicDirect(expr, ctx)
    }
}

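func BenchmarkCompiledReused(b *testing.B) {
    ctx := map[string]interface{}{"tenantTier": "enterprise", "region": "us-east-1"}
    // Assumes LoadRules has already populated the store (for example in TestMain)
    ast, ok := LookupAST("api.search.semantic-rerank", 3) // pre-compiled
    if !ok {
        b.Fatal("compiled rule not found; call LoadRules before benchmarking")
    }
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        walkAST(ast, ctx)   // tree walk only; the map lookup happens once, above
    }
}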

Expected outcome: BenchmarkCompiledReused should be 15–50× faster than BenchmarkParsedEachCall. If the ratio is less than 10×, first confirm that json.Unmarshal still runs on every iteration of BenchmarkParsedEachCall, then rerun with -benchmem to see whether walkAST is allocating on each call.

Verification Step

After deploying, confirm the parse cost has moved off the request path by checking the flag_eval_duration_seconds histogram in Prometheus:

# p99 should drop from 3–10 ms to < 1 ms after precompilation
# sum by (le, flag_key) merges replicas so each flag yields a single p99 value
curl -sg 'http://prometheus:9090/api/v1/query' \
  --data-urlencode \
  'query=histogram_quantile(0.99, sum by (le, flag_key) (rate(flag_eval_duration_seconds_bucket{job="app"}[5m])))' \
  | jq -r '.data.result[] | "\(.metric.flag_key): \(.value[1]) s"'

Also verify that a flag update triggers recompilation. Flip a flag in staging, then query the compiled map (via a debug endpoint or log) to confirm the new version is present before the next evaluation for that flag.

For distributed caching set-ups where evaluation results are cached across replicas, note that precompilation and result-caching are complementary: precompilation removes parse overhead on a cache miss; result-caching eliminates the AST walk on a cache hit.
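To make that split concrete, here is a rough Python sketch of a result cache layered over precompiled rules. It is a single-process stand-in for a distributed cache; `evaluate_cached`, `_results`, and the use of a sorted attribute tuple as the context fingerprint are all assumptions for illustration.

```python
from typing import Any, Callable, Dict, Tuple

CompiledRule = Callable[[Dict[str, Any]], bool]
ResultKey = Tuple[str, Tuple[Tuple[str, Any], ...]]

_results: Dict[ResultKey, bool] = {}  # stand-in for a shared result cache


def evaluate_cached(
    rules: Dict[str, CompiledRule],
    cache_key: str,  # "flagKey@version"
    ctx: Dict[str, Any],
) -> bool:
    # Fingerprint the context; assumes attribute values are hashable.
    fingerprint = (cache_key, tuple(sorted(ctx.items())))
    if fingerprint in _results:
        return _results[fingerprint]  # cache hit: the rule is not evaluated at all
    result = rules[cache_key](ctx)  # cache miss: run the precompiled rule
    _results[fingerprint] = result
    return result
```

Because the version is part of the key, a flag update naturally bypasses results cached for the previous version. A real deployment would also bound the cache or attach a TTL, which this sketch leaves out.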

Gotchas & Edge Cases

Lazy provider init: some OpenFeature providers defer initialize until the first call. If your service handles its first real request before init completes, it will pay the parse cost for that call and may return the default variant. Always use setProviderAndWait and fail the readiness probe until init is confirmed.
Unsupported operators: targeting rules written directly in the control plane UI may use operators your custom compiler does not handle. Wrap compileAST in a try/catch that falls back to interpreted evaluation (with a metric increment) so a malformed rule fails open rather than crashing the init path.
Memory footprint at scale: compiled ASTs for 10,000 flags with moderate complexity occupy tens of megabytes — usually acceptable. If footprint is a constraint, compile only the flags active in the current environment and defer the rest to on-demand compilation with a short local TTL.

Troubleshooting & FAQ

Compilation succeeds at init but evaluations still show high latency — what is wrong?

The most common cause is that the evaluation function is not looking up the compiled AST — it is still calling the parse function inline. Add a counter flag_rules_compiled_total that increments in load_flag_defs and confirm it reaches the expected value before the first request is served. If the counter is correct but latency is still high, profile the walkAST function itself: the bottleneck may now be a regex predicate or an expensive context attribute lookup inside the walk.
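As a rough illustration of both checks, the Python sketch below counts compiled rules at registration time and counts evaluations that fail to find a compiled rule. The `register` and `evaluate` helpers and the flag_eval_compiled_miss_total name are hypothetical; a plain dict stands in for a metrics client.

```python
from typing import Any, Callable, Dict

CompiledRule = Callable[[Dict[str, Any]], bool]

_rules: Dict[str, CompiledRule] = {}
metrics = {"flag_rules_compiled_total": 0, "flag_eval_compiled_miss_total": 0}


def register(cache_key: str, rule: CompiledRule) -> None:
    # Called from the init-time loader for every successfully compiled rule.
    _rules[cache_key] = rule
    metrics["flag_rules_compiled_total"] += 1


def evaluate(flag_key: str, version: int, ctx: Dict[str, Any], default: bool) -> bool:
    # Hot path: one dict lookup and one call; no parsing happens here.
    rule = _rules.get(f"{flag_key}@{version}")
    if rule is None:
        # A non-zero miss count means evaluations are bypassing the compiled map.
        metrics["flag_eval_compiled_miss_total"] += 1
        return default
    return rule(ctx)
```

If the compiled total matches the number of flags and the miss counter stays at zero while latency remains high, the lookup is working and the cost lies inside the rule evaluation itself.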

How do I handle a rule syntax the compiler does not support?

Log the unsupported operator, increment a flag_compile_error_total counter, and fall back to interpreted evaluation for that flag. Never let a compile error prevent the provider from initialising: a single bad rule should not take down flag evaluation for every other flag.

Can I run the compiler in a worker thread to avoid blocking the event loop?

Yes, for Node.js and similar event-loop runtimes. Move parseJsonLogic into a Worker thread and post the compiled result back via postMessage. The main thread receives the compiled tree and stores it atomically. The key constraint is that whatever you post must survive the structured clone that postMessage performs: plain objects and arrays are fine, but functions are not. Use the typed-node AST approach (the TypeScript example in Step 1) rather than the closure approach from Step 2 for cross-thread compilation.
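The same trade-off appears outside Node.js whenever compiled rules have to cross a thread or process boundary. The Python sketch below is an analogy, not a Worker implementation: it uses a JSON round trip to stand in for the boundary crossing and pickle to show that a closure does not serialise. The `walk` function and the dict-based node layout are assumptions modelled on the typed-node shape used earlier.

```python
import json
import pickle
from typing import Any, Dict


def walk(node: Dict[str, Any], ctx: Dict[str, Any]) -> bool:
    """Evaluate a typed-node AST made only of plain dicts, lists, and scalars."""
    op = node["op"]
    if op == "AND":
        return walk(node["left"], ctx) and walk(node["right"], ctx)
    if op == "OR":
        return walk(node["left"], ctx) or walk(node["right"], ctx)
    if op == "EQ":
        return ctx.get(node["attr"]) == node["value"]
    if op == "IN":
        return ctx.get(node["attr"]) in node["values"]
    raise ValueError(f"Unsupported node op: {op}")


ast = {
    "op": "AND",
    "left": {"op": "EQ", "attr": "tenantTier", "value": "enterprise"},
    "right": {"op": "IN", "attr": "region", "values": ["us-east-1", "eu-west-1"]},
}

# Plain data survives a serialisation round trip unchanged.
restored = json.loads(json.dumps(ast))


def closure_is_serialisable() -> bool:
    rule = lambda ctx: ctx.get("tenantTier") == "enterprise"
    try:
        pickle.dumps(rule)
        return True
    except (pickle.PicklingError, AttributeError):
        # Locally defined functions cannot be pickled.
        return False
```

The typed-node tree pays a dispatch on `op` at every node during the walk, which is the cost of keeping the compiled form as plain data.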