Precompiling Targeting Rules into an AST
This how-to is part of Optimizing Rule Engine Performance. It solves a specific, measurable problem: every time a flag is evaluated, the SDK parses the JSON targeting rule from scratch, builds an operator tree, and then discards it. For a service handling 5,000 evaluations per second, that overhead (typically 1–4 ms per call) is paid on every request and becomes the dominant share of your p99 evaluation latency. The fix is to parse once, compile to an in-memory AST, and reuse that tree for every subsequent evaluation of the same flag.
Prerequisites
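- An OpenFeature SDK (@openfeature/server-sdk, openfeature Python, or Go equivalent)
- A high-resolution timer for measuring evaluation latency (for example, process.hrtime)
- Flag keys that follow the namespace.service.feature key schema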
Step 1 — Parse rules once at provider initialisation
The provider’s initialize hook runs once before any evaluation is served. This is where all rule parsing belongs. Load every flag definition, parse the targeting expression, and store the result.
import { OpenFeature, Provider, EvaluationContext, ResolutionDetails } from '@openfeature/server-sdk';
interface ASTNode {
  op: 'AND' | 'OR' | 'EQ' | 'IN' | 'GT' | 'NOT';
  left?: ASTNode;
  right?: ASTNode;
  attr?: string;
  value?: unknown;
  values?: unknown[];
}
type CompiledMap = Map<string, ASTNode>; // key: "flagKey@version"
function parseJsonLogic(expr: Record<string, unknown>): ASTNode {
  const [[op, args]] = Object.entries(expr) as [[string, unknown[]]];
  switch (op) {
    case 'and': return { op: 'AND', left: parseJsonLogic(args[0] as any), right: parseJsonLogic(args[1] as any) };
    case 'or': return { op: 'OR', left: parseJsonLogic(args[0] as any), right: parseJsonLogic(args[1] as any) };
    case '==': return { op: 'EQ', attr: (args[0] as any).var, value: args[1] };
    case 'in': return { op: 'IN', attr: (args[0] as any).var, values: args[1] as unknown[] };
    case '>': return { op: 'GT', attr: (args[0] as any).var, value: args[1] };
    default: throw new Error(`Unsupported JsonLogic op: ${op}`);
  }
}
// Note: the metadata property and the resolve*Evaluation methods that the
// Provider interface requires are omitted here to keep the focus on initialize.
class CompiledRuleProvider implements Provider {
  readonly rulesVersion = 'flagd-1.0';
  private compiled: CompiledMap = new Map();
  async initialize(context?: EvaluationContext): Promise<void> {
    const defs = await fetchFlagDefinitions(); // your provider's fetch
    for (const def of defs) {
      const cacheKey = `${def.key}@${def.version}`;
      this.compiled.set(cacheKey, parseJsonLogic(def.targeting));
    }
    // Compiled map is ready before the first evaluation is served
  }
}
Pitfall: if your SDK calls initialize lazily on the first evaluation rather than eagerly at startup, that first call will still pay the parse cost. Force eager initialisation with await OpenFeature.setProviderAndWait(provider) and block the HTTP server from accepting traffic until the promise resolves.
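The "block traffic until init resolves" advice can be sketched as a small readiness gate. This is a minimal illustration in Python, not part of any OpenFeature SDK: `ReadinessGate`, `readiness_probe`, and `start_service` are hypothetical names, and `load_rules` stands in for whatever compiles your flag definitions.

```python
from __future__ import annotations

import threading
from typing import Callable


class ReadinessGate:
    """Tracks whether rule compilation has finished (hypothetical helper)."""

    def __init__(self) -> None:
        self._ready = threading.Event()

    def mark_ready(self) -> None:
        self._ready.set()

    def is_ready(self) -> bool:
        return self._ready.is_set()


def readiness_probe(gate: ReadinessGate) -> tuple[int, str]:
    """Return an HTTP-style (status, body) pair for a readiness endpoint."""
    if gate.is_ready():
        return 200, "ready"
    return 503, "compiling rules"


def start_service(gate: ReadinessGate, load_rules: Callable[[], None]) -> None:
    """Compile every rule first, and only then report ready."""
    load_rules()       # parse + compile all flag definitions (assumed callable)
    gate.mark_ready()  # reached only if load_rules did not raise
```

If `load_rules` raises, the gate never opens and the probe keeps returning 503, so the orchestrator does not route traffic to an instance that would parse on the request path.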
Step 2 — Compile to an AST or closure tree
Parsing builds a raw data structure; compilation turns it into a form optimised for repeated evaluation. The simplest approach is a tree of typed nodes (as above). A more aggressive option is to compile each rule into a closure that closes over its constants — this eliminates the switch dispatch on every walk.
from __future__ import annotations
from typing import Callable, Any
import json
# A compiled rule is just a callable: (context) -> bool
CompiledRule = Callable[[dict[str, Any]], bool]
def compile_rule(expr: dict) -> CompiledRule:
    """Recursively compile a JsonLogic expression into a closure."""
    if "and" in expr:
        left = compile_rule(expr["and"][0])
        right = compile_rule(expr["and"][1])
        # Closure captures left/right; no dict lookup at eval time
        return lambda ctx: left(ctx) and right(ctx)
    if "==" in expr:
        attr: str = expr["=="][0]["var"]
        expected: Any = expr["=="][1]
        return lambda ctx, a=attr, e=expected: ctx.get(a) == e
    if "in" in expr:
        attr = expr["in"][0]["var"]
        allowed = frozenset(expr["in"][1])  # frozenset: O(1) membership test
        return lambda ctx, a=attr, s=allowed: ctx.get(a) in s
    raise ValueError(f"Unsupported op in: {list(expr.keys())}")
# At init, build the compiled map
_rules: dict[str, CompiledRule] = {}
def load_flag_defs(defs: list[dict]) -> None:
    global _rules
    new_rules = {}
    for d in defs:
        cache_key = f"{d['key']}@{d['version']}"
        new_rules[cache_key] = compile_rule(json.loads(d["targeting"]))
    _rules = new_rules  # atomic swap: in-flight evals finish against old map
The closure-based approach removes one layer of dispatch per predicate. For a rule with five predicates evaluated 10,000 times per second, that is 50,000 fewer dispatches per second, a measurable reduction in CPU work. For the sub-5 ms latency target, compiling at load time is what keeps parse cost entirely off the request path.
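To make the compile-once, evaluate-many pattern concrete, here is a self-contained sketch of a closure compiler followed by repeated evaluation. It is an illustration under assumptions rather than the SDK's implementation: the variadic `and`/`or` handling and the `>` operator are additions for this example, and the `seats` attribute is made up.

```python
from __future__ import annotations

from typing import Any, Callable

CompiledRule = Callable[[dict[str, Any]], bool]


def compile_rule(expr: dict[str, Any]) -> CompiledRule:
    """Compile a single-operator JsonLogic dict into a closure."""
    op, args = next(iter(expr.items()))
    if op in ("and", "or"):
        children = [compile_rule(a) for a in args]  # compiled once, here
        combine = all if op == "and" else any
        return lambda ctx: combine(child(ctx) for child in children)
    attr = args[0]["var"]
    if op == "==":
        expected = args[1]
        return lambda ctx: ctx.get(attr) == expected
    if op == "in":
        allowed = frozenset(args[1])  # assumes hashable list members
        return lambda ctx: ctx.get(attr) in allowed
    if op == ">":
        threshold = args[1]

        def greater_than(ctx: dict[str, Any]) -> bool:
            actual = ctx.get(attr)
            # A missing attribute never satisfies a numeric comparison
            return actual is not None and actual > threshold

        return greater_than
    raise ValueError(f"Unsupported op: {op}")


# Compile once (for example at provider init)...
rule = compile_rule({
    "and": [
        {"==": [{"var": "tenantTier"}, "enterprise"]},
        {"or": [
            {"in": [{"var": "region"}, ["us-east-1", "eu-west-1"]]},
            {">": [{"var": "seats"}, 100]},
        ]},
    ]
})

# ...then call the closure on every evaluation; no parsing happens here.
matched = rule({"tenantTier": "enterprise", "region": "ap-south-1", "seats": 250})
not_matched = rule({"tenantTier": "free", "region": "us-east-1"})
```

Each call to `compile_rule` has its own scope, so the lambdas capture the right constants without the default-argument trick.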
Step 3 — Cache compiled rules keyed by version
The version key is critical: it lets you hold compiled rules for multiple versions simultaneously during a transition, and it ensures you never serve a stale compilation after a flag update.
package flagengine
import (
    "fmt"
    "sync/atomic" // atomic.Pointer requires Go 1.19 or later
)
type ASTNode struct {
    Op     string
    Left   *ASTNode
    Right  *ASTNode
    Attr   string
    Value  interface{}
    Values []interface{}
}
// compiledStore is swapped atomically, never mutated in place
type compiledStore map[string]*ASTNode
var store atomic.Pointer[compiledStore]
func LoadRules(defs []FlagDef) {
    newStore := make(compiledStore, len(defs))
    for _, d := range defs {
        key := fmt.Sprintf("%s@%d", d.Key, d.Version)
        newStore[key] = compileAST(d.Targeting) // pure function, no side-effects
    }
    store.Store(&newStore)
}
func LookupAST(flagKey string, version int) (*ASTNode, bool) {
    s := store.Load()
    if s == nil {
        return nil, false
    }
    node, ok := (*s)[fmt.Sprintf("%s@%d", flagKey, version)]
    return node, ok
}
Pitfall: storing flagKey alone (without the version) means a rule update and an in-flight evaluation for the old version can collide. Include the version number in the cache key and keep the old compiled tree until all in-flight evaluations for that version complete.
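The interaction between versioned keys and in-flight evaluations can be sketched as a copy-on-write store. This is a hedged Python illustration, not the Go store: `VersionedRuleStore`, `publish`, and `retire` are hypothetical names, and the lambdas stand in for compiled rules.

```python
from __future__ import annotations

import threading
from typing import Any, Callable

CompiledRule = Callable[[dict[str, Any]], bool]


class VersionedRuleStore:
    """Copy-on-write map of "flagKey@version" to a compiled rule (hypothetical)."""

    def __init__(self) -> None:
        self._rules: dict[str, CompiledRule] = {}
        self._write_lock = threading.Lock()  # serialises writers only

    def snapshot(self) -> dict[str, CompiledRule]:
        # A reader grabs the current dict once and uses it for the whole evaluation.
        return self._rules

    def publish(self, flag_key: str, version: int, rule: CompiledRule) -> None:
        with self._write_lock:
            updated = dict(self._rules)  # shallow copy of the current map
            updated[f"{flag_key}@{version}"] = rule
            self._rules = updated  # rebind; the old dict is left untouched

    def retire(self, flag_key: str, version: int) -> None:
        with self._write_lock:
            updated = dict(self._rules)
            updated.pop(f"{flag_key}@{version}", None)
            self._rules = updated


store = VersionedRuleStore()
store.publish("api.search.semantic-rerank", 3, lambda ctx: False)
in_flight = store.snapshot()  # an evaluation starts against version 3
store.publish("api.search.semantic-rerank", 4, lambda ctx: True)
store.retire("api.search.semantic-rerank", 3)
# in_flight still contains version 3; fresh snapshots contain only version 4.
```

Because a published dict is never mutated, the old compiled tree stays alive for as long as any evaluation holds a snapshot of it, and is garbage-collected afterwards.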
Step 4 — Re-compile only on flag change; benchmark the result
Subscribe to flag update events from the control plane. When one arrives, compile only the changed flag and swap the store. Confirm the improvement with a benchmark.
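// Re-compile a single flag on change: one compile, plus a shallow copy of the map.
// The copy is O(n) in the number of entries but does no parsing.
func OnFlagUpdate(updated FlagDef) {
    key := fmt.Sprintf("%s@%d", updated.Key, updated.Version)
    node := compileAST(updated.Targeting) // compile once, outside the retry loop
    for {
        current := store.Load() // nil until LoadRules has run
        size := 1
        if current != nil {
            size += len(*current)
        }
        newStore := make(compiledStore, size)
        if current != nil {
            for k, v := range *current {
                newStore[k] = v
            }
        }
        newStore[key] = node
        // Retry if a concurrent update swapped the store in the meantime
        if store.CompareAndSwap(current, &newStore) {
            return
        }
    }
}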
// Benchmark: parsed-each-call vs compiled-reused
// Run: go test -bench=. -benchtime=10s
// Lives in a _test.go file that imports "encoding/json" and "testing"
func BenchmarkParsedEachCall(b *testing.B) {
    raw := `{"and":[{"==":[{"var":"tenantTier"},"enterprise"]},{"in":[{"var":"region"},["us-east-1","eu-west-1"]]}]}`
    ctx := map[string]interface{}{"tenantTier": "enterprise", "region": "us-east-1"}
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        var expr map[string]interface{}
        if err := json.Unmarshal([]byte(raw), &expr); err != nil { // parse on every iteration
            b.Fatal(err)
        }
        evaluateJsonLogicDirect(expr, ctx)
    }
}
func BenchmarkCompiledReused(b *testing.B) {
    ctx := map[string]interface{}{"tenantTier": "enterprise", "region": "us-east-1"}
    // Assumes LoadRules has already populated the store with this flag
    ast, ok := LookupAST("api.search.semantic-rerank", 3) // pre-compiled
    if !ok {
        b.Fatal("compiled rule not found; call LoadRules before benchmarking")
    }
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        walkAST(ast, ctx) // tree walk only; the lookup above is outside the timed loop
    }
}
Expected outcome: BenchmarkCompiledReused should be roughly 15–50× faster than BenchmarkParsedEachCall. If the ratio is less than 10×, check that BenchmarkParsedEachCall really parses inside the timed loop and that walkAST is not re-parsing or allocating heavily on each call.
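The same comparison can be sketched in Python with `timeit`, which is useful if your service is not written in Go. This is a rough illustration under assumptions: `evaluate` is a toy interpreter covering only `and`, `==`, and `in`, and the sketch isolates parse cost (parse on every call versus parse once) rather than measuring a closure compiler. Actual ratios depend on the runtime and rule size, so none are quoted here.

```python
import json
import timeit

RAW = ('{"and":[{"==":[{"var":"tenantTier"},"enterprise"]},'
       '{"in":[{"var":"region"},["us-east-1","eu-west-1"]]}]}')
CTX = {"tenantTier": "enterprise", "region": "us-east-1"}


def evaluate(expr, ctx):
    """Interpret a JsonLogic expression directly (and, ==, in only)."""
    op, args = next(iter(expr.items()))
    if op == "and":
        return all(evaluate(a, ctx) for a in args)
    if op == "==":
        return ctx.get(args[0]["var"]) == args[1]
    if op == "in":
        return ctx.get(args[0]["var"]) in args[1]
    raise ValueError(f"Unsupported op: {op}")


def parsed_each_call(ctx):
    return evaluate(json.loads(RAW), ctx)  # JSON parse on every call


PARSED_ONCE = json.loads(RAW)  # parse moved off the hot path


def parsed_once(ctx):
    return evaluate(PARSED_ONCE, ctx)


def seconds_per_call(fn, number=20000):
    """Average wall-clock seconds per call over `number` calls."""
    return timeit.timeit(lambda: fn(CTX), number=number) / number


if __name__ == "__main__":
    slow = seconds_per_call(parsed_each_call)
    fast = seconds_per_call(parsed_once)
    print(f"parse each call: {slow * 1e6:.2f} us/call")
    print(f"parse once:      {fast * 1e6:.2f} us/call")
```

Both functions must return the same result for the same context; if they diverge, the comparison is measuring two different rules.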
Verification Step
After deploying, confirm the parse cost has moved off the request path by checking the flag_eval_duration_seconds histogram in Prometheus:
# p99 should drop from 3–10 ms to < 1 ms after precompilation
curl -sg 'http://prometheus:9090/api/v1/query' \
--data-urlencode \
'query=histogram_quantile(0.99, rate(flag_eval_duration_seconds_bucket{job="app"}[5m]))' \
| jq -r '.data.result[] | "\(.metric.flag_key): \(.value[1]) s"'
Also verify that a flag update triggers recompilation. Flip a flag in staging, then query the compiled map (via a debug endpoint or log) to confirm the new version is present before the next evaluation for that flag.
For distributed caching set-ups where evaluation results are cached across replicas, note that precompilation and result-caching are complementary: precompilation removes parse overhead on a cache miss; result-caching eliminates the AST walk on a cache hit.
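How the two layers fit together can be sketched as follows. This is a simplified, process-local illustration: `CachedEvaluator` is a hypothetical name, the in-memory dict stands in for whatever shared cache your replicas use, and the cache is unbounded, which a real deployment would not want.

```python
from __future__ import annotations

from typing import Any, Callable, Iterable

CompiledRule = Callable[[dict[str, Any]], bool]


class CachedEvaluator:
    """Result cache in front of one compiled rule (hypothetical helper)."""

    def __init__(self, rule: CompiledRule, relevant_attrs: Iterable[str]) -> None:
        self._rule = rule
        # Only attributes the rule reads belong in the cache key
        self._attrs = tuple(sorted(relevant_attrs))
        self._cache: dict[tuple, bool] = {}
        self.walks = 0  # number of times the compiled rule actually ran

    def evaluate(self, ctx: dict[str, Any]) -> bool:
        key = tuple(ctx.get(a) for a in self._attrs)  # assumes hashable values
        if key in self._cache:
            return self._cache[key]  # hit: no walk at all
        self.walks += 1
        result = self._rule(ctx)  # miss: walk the precompiled rule, no parsing
        self._cache[key] = result
        return result


evaluator = CachedEvaluator(lambda ctx: ctx.get("region") == "us-east-1", ["region"])
first = evaluator.evaluate({"region": "us-east-1", "requestId": "a"})
second = evaluator.evaluate({"region": "us-east-1", "requestId": "b"})
```

The second call is a cache hit even though `requestId` differs, because the key is built only from attributes the rule reads. A new flag version should get a fresh evaluator so cached results never outlive the rule that produced them.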
Gotchas & Edge Cases
- Lazy provider init: some OpenFeature providers defer initialize until the first call. If your service handles its first real request before init completes, it will pay the parse cost for that call and may return the default variant. Always use setProviderAndWait and fail the readiness probe until init is confirmed.
- Unsupported operators: targeting rules written directly in the control plane UI may use operators your custom compiler does not handle. Wrap compileAST in a try/catch that falls back to interpreted evaluation (with a metric increment) so a malformed rule fails open rather than crashing the init path.
- Memory footprint at scale: compiled ASTs for 10,000 flags with moderate complexity occupy tens of megabytes, which is usually acceptable. If footprint is a constraint, compile only the flags active in the current environment and defer the rest to on-demand compilation with a short local TTL.
Troubleshooting & FAQ
Compilation succeeds at init but evaluations still show high latency — what is wrong?
The most common cause is that the evaluation function is not looking up the compiled AST — it is still calling the parse function inline. Add a counter flag_rules_compiled_total that increments in load_flag_defs and confirm it reaches the expected value before the first request is served. If the counter is correct but latency is still high, profile the walkAST function itself: the bottleneck may now be a regex predicate or an expensive context attribute lookup inside the walk.
How do I handle a rule syntax the compiler does not support?
Log the unsupported operator, increment a flag_compile_error_total counter, and fall back to interpreted evaluation for that flag. Never let a compile error prevent the provider from initialising: a single bad rule should not take down flag evaluation for every other flag.
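A minimal sketch of that fallback, assuming a compiler that raises `ValueError` on unknown operators. The toy `compile_rule` and `interpret` functions below are stand-ins for your real compiler and interpreter, and the `Counter` stands in for a real metrics client.

```python
from __future__ import annotations

import logging
from collections import Counter
from typing import Any, Callable

log = logging.getLogger("flag_compiler")
metrics: Counter = Counter()  # stand-in for a real metrics client

CompiledRule = Callable[[dict[str, Any]], bool]


def compile_rule(expr: dict[str, Any]) -> CompiledRule:
    """Toy compiler that only understands "==" (stand-in for the real one)."""
    op, args = next(iter(expr.items()))
    if op == "==":
        attr, expected = args[0]["var"], args[1]
        return lambda ctx: ctx.get(attr) == expected
    raise ValueError(f"Unsupported op: {op}")


def interpret(expr: dict[str, Any], ctx: dict[str, Any]) -> bool:
    """Slower interpreter that covers more operators (also a stand-in)."""
    op, args = next(iter(expr.items()))
    actual = ctx.get(args[0]["var"])
    if op == "==":
        return actual == args[1]
    if op == "!=":
        return actual != args[1]
    raise ValueError(f"Unsupported op: {op}")


def compile_with_fallback(flag_key: str, expr: dict[str, Any]) -> CompiledRule:
    """Return a compiled rule, or an interpreting wrapper if compilation fails."""
    try:
        return compile_rule(expr)
    except ValueError as exc:
        metrics["flag_compile_error_total"] += 1
        log.warning("flag %s not compiled, using interpreter: %s", flag_key, exc)
        # Same call signature as a compiled rule, so callers need no special case
        return lambda ctx: interpret(expr, ctx)
```

Because the fallback has the same signature as a compiled rule, it can sit in the compiled map under the usual key, and only the counter reveals that the flag is on the slow path.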
Can I run the compiler in a worker thread to avoid blocking the event loop?
Yes, for Node.js and similar event-loop runtimes. Move parseJsonLogic into a Worker thread and post the compiled result back via postMessage. The main thread receives the compiled tree and stores it atomically. The key constraint is that whatever you post must survive the structured clone that postMessage performs: plain objects and arrays are fine; closures are not, so use the typed-node AST approach (Step 1 TypeScript example) rather than the closure approach for cross-thread compilation.