PEP Proxy

Wilma 2.0

PEP Proxy — from Node.js to C

Overview

What Wilma Does

Wilma is a Policy Enforcement Point (PEP) Proxy — a reverse proxy that sits between every HTTP client and every protected backend service. It validates authentication tokens and enforces authorization policies on every single request. If Wilma adds 5ms of latency, every API call in the entire platform pays that tax.

Client request → Extract token → Validate JWT → Check cache → Keyrock (cache miss) → AuthZForce (if Level 3) → Proxy to backend → Return response

Three security levels:

- Level 1 (authentication): is the token valid?
- Level 2 (basic authorization): is this user allowed to use this HTTP verb on this path? (checked against Keyrock)
- Level 3 (advanced authorization): full XACML policy evaluation via AuthZForce, including inspection of the request payload

Source: github.com/ging/fiware-pep-proxy (v8.3.0, MIT license)

Current Implementation

Node.js + Express — The Bottlenecks

Technology Stack

| Component | Library | Problem |
|---|---|---|
| HTTP server | Express 4.x | ~50μs per middleware layer, 6+ middlewares per request |
| JWT verification | jsonwebtoken | Synchronous crypto on the event loop — blocks all other requests |
| Token cache | node-cache | JS hash map, subject to GC pressure |
| HTTP proxy client | got | No connection pooling, full body buffering, new TCP connection per request |
| XACML handling | xml2js / xml2json | Full DOM parse for XML — heavy allocation |
| Body handling | Express middleware | Buffer.concat() copies all chunks before processing |
| Logging | morgan + debug | String formatting on every request |
| Clustering | cluster.fork() | N full Node.js processes = N × 100 MB RAM |

The Hot Path — What Happens Per Request

```js
// Step 1: Express middleware chain (~0.5-2ms)
app.use(bodyParser)     // Buffer.concat() all body chunks
app.use(cors)           // CORS header check
app.use(morgan)         // Log formatting

// Step 2: Token extraction (~0.01ms)
token = req.headers['authorization'].split(' ')[1]

// Step 3: JWT verification (~0.1-1ms) ⚠️ BLOCKS EVENT LOOP
jwt.verify(token, secret)  // synchronous RSA/HMAC crypto

// Step 4: Cache check (~0.01ms)
nodeCache.get(token)

// Step 5: Keyrock call on cache miss (~5-50ms) ⚠️ EXTERNAL I/O
got('http://keyrock:3000/user?access_token=...')

// Step 6: Forward to backend (~1-500ms) ⚠️ NEW TCP CONNECTION
got(PROXY_URL + req.url, { method, headers, body, retry: 0 })
```

Performance Numbers

| Metric | Value | Notes |
|---|---|---|
| RAM (single instance) | ~100 MB | Node.js V8 heap baseline |
| RAM (8-core cluster) | ~800 MB | 8 separate Node.js processes |
| Throughput (cache hit) | ~5,000–15,000 req/s | Limited by Express + got proxy overhead |
| Throughput (cache miss) | ~200–1,000 req/s | Limited by Keyrock HTTP round-trip |
| Throughput (Level 3) | ~100–500 req/s | Two sequential HTTP round-trips + XML |
| p99 latency (cache hit) | ~5–20ms | GC pauses + synchronous JWT crypto |
| Startup time | ~1–2s | Node.js + module loading |

The Three Killer Bottlenecks

1. Synchronous JWT on Event Loop

jsonwebtoken.verify() performs RSA or HMAC cryptography synchronously. During the ~0.1–1ms crypto operation, the event loop is frozen. Every other connection — reading, writing, proxying — stalls. Under load, this creates cascading latency spikes.

2. No Connection Pooling

The got HTTP client creates a new TCP connection for every proxied request. TCP handshake (~0.5ms local, ~50ms remote) + no keep-alive means the proxy overhead alone can exceed the backend's response time.
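The fix is a pool of persistent, keep-alive sockets to each upstream. Stripped of the HTTP details, the bookkeeping is tiny; this is an illustrative sketch (the `ConnPool` type is ours, not the actual fwHttp pool, and it omits locking and health checks):

```c
/* Minimal fixed-size connection pool: acquire returns an idle pooled fd,
   or -1 to signal "dial a new connection and pool_add() it". */
#include <stdbool.h>

#define POOL_SIZE 8

typedef struct {
    int  fd[POOL_SIZE];
    bool busy[POOL_SIZE];
    int  count;                      /* live sockets currently in the pool */
} ConnPool;

void pool_init(ConnPool *p) { p->count = 0; }

/* Reuse an idle socket if one exists — this is what skips the handshake. */
int pool_acquire(ConnPool *p) {
    for (int i = 0; i < p->count; i++)
        if (!p->busy[i]) { p->busy[i] = true; return p->fd[i]; }
    return -1;
}

/* Register a freshly connected socket as busy; false when the pool is full. */
bool pool_add(ConnPool *p, int fd) {
    if (p->count == POOL_SIZE) return false;
    p->fd[p->count]   = fd;
    p->busy[p->count] = true;
    p->count++;
    return true;
}

/* Hand the socket back so the next request can reuse it (keep-alive). */
void pool_release(ConnPool *p, int fd) {
    for (int i = 0; i < p->count; i++)
        if (p->fd[i] == fd) { p->busy[i] = false; return; }
}
```

After the first request to a backend, every later request starts on an already-open socket, so the per-request handshake cost disappears.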

3. Full Body Buffering

Every request body is fully buffered in memory (Buffer.concat()) before processing begins. For large NGSI-LD payloads (batch entity creation), this means copying megabytes of data before a single byte is validated or forwarded. No streaming.
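The streaming alternative needs only a fixed-size chunk buffer between the two sockets: memory use stays constant whether the payload is 1 KB or 100 MB. A minimal sketch (illustrative helper, not fwHttp code — a production version would use non-blocking I/O or splice()):

```c
/* Forward a request body from client_fd to backend_fd in fixed-size chunks,
   so memory use is O(1) regardless of payload size. */
#include <unistd.h>
#include <sys/types.h>

ssize_t stream_body(int client_fd, int backend_fd, size_t content_length)
{
    char    buf[16 * 1024];              /* one reusable 16 KB chunk */
    size_t  remaining = content_length;
    ssize_t total = 0;

    while (remaining > 0) {
        size_t  want = remaining < sizeof buf ? remaining : sizeof buf;
        ssize_t n = read(client_fd, buf, want);
        if (n <= 0) return -1;           /* EOF or error mid-body */

        for (ssize_t off = 0; off < n; ) {   /* tolerate short writes */
            ssize_t w = write(backend_fd, buf + off, (size_t)(n - off));
            if (w < 0) return -1;
            off += w;
        }
        remaining -= (size_t)n;
        total     += n;
    }
    return total;
}
```

The first chunk reaches the backend while the client is still uploading the rest, which also improves time-to-first-byte for large batch operations.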

Wilma 2.0

C + fw-libs — The Rewrite

Architecture

```c
// The entire PEP proxy in C
KhServer server;
khInit(&server, 8080, 0);

// All routes go through the same handler
khRegister(&server, KhGet,     "/**", pepHandler, true);
khRegister(&server, KhPost,    "/**", pepHandler, true);
khRegister(&server, KhPatch,   "/**", pepHandler, true);
khRegister(&server, KhDelete,  "/**", pepHandler, true);

// pepHandler hot path:
// 1. Extract token from header (zero-copy pointer into read buffer)
// 2. fwHash lookup in token cache (O(1), no GC)
// 3. If miss: validate JWT via OpenSSL (threaded, non-blocking)
// 4. If miss: HTTP to Keyrock via persistent connection pool
// 5. Forward to backend via persistent connection pool
// 6. Stream response back (zero-copy splice where possible)
```
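Step 1 of the hot path is worth spelling out, because "zero-copy" here just means returning a pointer into the request's own read buffer. A plain-C sketch (the `extract_bearer` helper is ours, for illustration — not an fw-libs function):

```c
/* Zero-copy token extraction: return a (pointer, length) view into the
   "Authorization: Bearer <token>" header value — no allocation, no copy. */
#include <string.h>
#include <stddef.h>

const char *extract_bearer(const char *auth_header, size_t *len_out)
{
    static const char prefix[] = "Bearer ";

    if (!auth_header ||
        strncmp(auth_header, prefix, sizeof prefix - 1) != 0)
        return NULL;

    const char *tok = auth_header + sizeof prefix - 1;
    *len_out = strlen(tok);      /* caller works with (ptr, len), never a copy */
    return *len_out ? tok : NULL;
}
```

Compare this with the Node.js `split(' ')[1]`, which allocates an array and two new strings per request.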

Component-by-Component Replacement

| Wilma Component | Node.js | C + fw-libs | Impact |
|---|---|---|---|
| HTTP server | Express (event loop, JS middleware) | fwHttp (epoll, zero-copy parse) | 10–50× faster request parsing |
| JWT verification | jsonwebtoken (sync crypto) | OpenSSL HMAC/RSA (threaded) | Non-blocking, 2–5× faster crypto |
| Token cache | node-cache (JS object) | fwHash (flat table, no GC) | ~10× faster, zero GC pressure |
| Proxy to backend | got (no pool, full buffer) | fwHttp client + connection pool | Eliminates TCP handshake per request |
| Proxy to Keyrock | got (new conn per call) | Persistent connection to Keyrock | ~10× faster cache-miss path |
| Body handling | Buffer.concat() | Zero-copy (fwHttp read buffer) | No copy, no allocation |
| JSON parsing | JSON.parse() | fwJson (in-place, zero-alloc) | 5–10× faster |
| Memory allocation | V8 heap + GC (~100 MB) | fwAlloc bump allocator (~2–5 MB) | 20–50× less RAM, zero GC |
| Logging | morgan + debug | fwTrace (structured, near-zero cost) | ~100× less logging overhead |
| Clustering | N × Node.js processes | SO_REUSEPORT (fwHttp built-in) | N instances at ~2 MB each, not N × 100 MB |

Performance Projection

The PEP proxy hot path (cache hit) becomes:

| Phase | Node.js (Wilma) | C (Wilma 2.0) |
|---|---|---|
| Parse HTTP request | ~0.5–2ms (Express + middlewares) | ~3–5μs (fwHttp zero-copy) |
| Extract token | ~0.01ms | ~0.01μs (pointer into buffer) |
| Cache lookup | ~0.01ms | ~0.5μs (fwHash) |
| JWT verify (if needed) | ~0.1–1ms (blocks event loop) | ~0.05–0.5ms (threaded OpenSSL) |
| Enrich headers | ~0.05ms (JSON.stringify roles) | ~1μs (pre-cached) |
| Forward to backend | ~0.5–1ms (new TCP conn) | ~0.02–0.05ms (pooled conn) |
| Return response | ~0.1–0.5ms (full buffer) | ~0.01–0.05ms (stream/splice) |
| Total (cache hit) | ~1.5–5ms | ~0.01–0.05ms (+ backend time) |

Summary

| Metric | Wilma (Node.js) | Wilma 2.0 (C) | Improvement |
|---|---|---|---|
| Throughput (cache hit) | ~5,000–15,000 req/s | ~100,000–200,000 req/s | ~10–40× |
| Throughput (cache miss) | ~200–1,000 req/s | ~2,000–5,000 req/s | ~5× (Keyrock-limited) |
| p99 latency (cache hit) | ~5–20ms | ~0.05–0.2ms | ~50–100× |
| RAM per instance | ~100 MB | ~2–5 MB | ~20–50× less |
| RAM (8-core cluster) | ~800 MB | ~5–10 MB | ~80–160× less |
| Startup time | ~1–2s | <10ms | ~100× |
| Proxy overhead added | ~1.5–5ms | ~0.01–0.05ms | ~50–100× less |

The cache-miss throughput is still limited by Keyrock's response time, but with persistent connection pooling the overhead drops from ~50ms (new TLS handshake) to ~5ms (reused connection). A Keyrock 2.0 in C (see Keyrock analysis) would reduce this further.

Development

Effort Estimate with Claude Max

Wilma is the easiest FIWARE GE to rewrite. It's a thin proxy with well-defined behavior. The fw-libs provide ~80% of the infrastructure.

| Component | Work | Estimate |
|---|---|---|
| HTTP proxy core | fwHttp server + client with connection pooling, header forwarding, body streaming | 1 week |
| JWT validation | OpenSSL HMAC-SHA256/RS256 verification, token parsing with fwJson | 3–4 days |
| Token/decision cache | fwHash with TTL expiry, thread-safe access | 2–3 days |
| Keyrock integration | HTTP client for token validation, user info enrichment | 3–4 days |
| Authorization PDPs | Keyrock basic, AuthZForce XACML, OPA HTTP client | 1 week |
| NGSI-LD payload analysis | Level 3 body inspection (entity IDs, attributes, types) with fwJson | 3–4 days |
| Configuration & startup | Config file parsing, environment variable overrides, graceful shutdown | 2–3 days |
| Testing & hardening | Unit tests, integration tests against Keyrock, load testing | 1 week |
| Total | | 3–5 weeks |
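To give a sense of scale for the token/decision cache item: a flat, GC-free cache with TTL expiry is a few dozen lines of C. The sketch below is our illustration of the idea, not fwHash itself; it uses open addressing over a fixed array and omits the tombstone handling and locking a production version needs:

```c
/* Flat token cache with TTL expiry: fixed array, open addressing, no heap
   churn. An entry whose deadline has passed reads as a miss. */
#include <string.h>
#include <time.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHE_SLOTS 1024             /* power of two for cheap masking */
#define TOKEN_MAX   128

typedef struct {
    char   token[TOKEN_MAX];
    time_t expires;                  /* 0 = slot never used */
} CacheSlot;

typedef struct { CacheSlot slot[CACHE_SLOTS]; } TokenCache;

static uint64_t fnv1a(const char *s)
{
    uint64_t h = 1469598103934665603ULL;
    while (*s) { h ^= (unsigned char)*s++; h *= 1099511628211ULL; }
    return h;
}

void cache_put(TokenCache *c, const char *token, time_t now, int ttl_s)
{
    uint64_t i = fnv1a(token) & (CACHE_SLOTS - 1);
    for (int k = 0; k < CACHE_SLOTS; k++, i = (i + 1) & (CACHE_SLOTS - 1)) {
        CacheSlot *s = &c->slot[i];
        /* Claim an empty slot, an expired slot, or our own old entry. */
        if (s->expires == 0 || s->expires <= now ||
            strcmp(s->token, token) == 0) {
            strncpy(s->token, token, TOKEN_MAX - 1);
            s->token[TOKEN_MAX - 1] = '\0';
            s->expires = now + ttl_s;
            return;
        }
    }
}

bool cache_has(const TokenCache *c, const char *token, time_t now)
{
    uint64_t i = fnv1a(token) & (CACHE_SLOTS - 1);
    for (int k = 0; k < CACHE_SLOTS; k++, i = (i + 1) & (CACHE_SLOTS - 1)) {
        const CacheSlot *s = &c->slot[i];
        if (s->expires == 0) return false;       /* never-used slot: stop */
        if (s->expires > now && strcmp(s->token, token) == 0) return true;
    }
    return false;
}
```

Unlike node-cache, there is no per-entry object allocation, so lookups never trigger a garbage collector and latency stays flat under load.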

Verdict: The Quick Win

Wilma is the ideal first rewrite target. It's small (~2,500 lines of JavaScript), the behavior is well-defined (validate token, proxy request), and the performance impact is massive — because the PEP proxy sits in front of every protected service, making it faster benefits the entire platform.

With 3–5 weeks of effort, you get a PEP proxy that adds 50μs of overhead instead of 5ms, uses 2 MB instead of 100 MB, and handles 100K+ req/s per core. The latency reduction alone justifies the rewrite — every API call in the platform becomes 1–5ms faster.

Bonus: building Wilma 2.0 serves as a test bed for fwHttp's proxy capabilities (connection pooling, streaming, header manipulation) that will also be needed for other GE rewrites.