PEP Proxy — from Node.js to C
Wilma is a Policy Enforcement Point (PEP) Proxy — a reverse proxy that sits between every HTTP client and every protected backend service. It validates authentication tokens and enforces authorization policies on every single request. If Wilma adds 5ms of latency, every API call in the entire platform pays that tax.
Three security levels:
- **Level 1 — Authentication**: is the token valid? (checked against Keyrock)
- **Level 2 — Basic authorization**: may this user perform this HTTP verb on this path? (Keyrock roles/permissions)
- **Level 3 — Advanced authorization**: full XACML policy evaluation delegated to AuthZForce, including inspection of the request body
Source: github.com/ging/fiware-pep-proxy (v8.3.0, MIT license)
| Component | Library | Problem |
|---|---|---|
| HTTP server | Express 4.x | ~50μs per middleware layer, 6+ middlewares per request |
| JWT verification | jsonwebtoken | Synchronous crypto on the event loop — blocks all other requests |
| Token cache | node-cache | JS hash map, subject to GC pressure |
| HTTP proxy client | got | No connection pooling, full body buffering, new TCP connection per request |
| XACML handling | xml2js / xml2json | Full DOM parse for XML — heavy allocation |
| Body handling | Express middleware | Buffer.concat() copies all chunks before processing |
| Logging | morgan + debug | String formatting on every request |
| Clustering | cluster.fork() | N full Node.js processes = N × 100 MB RAM |
```js
// Step 1: Express middleware chain (~0.5–2ms)
app.use(bodyParser)  // Buffer.concat() all body chunks
app.use(cors)        // CORS header check
app.use(morgan)      // Log formatting

// Step 2: Token extraction (~0.01ms)
token = req.headers['authorization'].split(' ')[1]

// Step 3: JWT verification (~0.1–1ms) ⚠️ BLOCKS EVENT LOOP
jwt.verify(token, secret)  // synchronous RSA/HMAC crypto

// Step 4: Cache check (~0.01ms)
nodeCache.get(token)

// Step 5: Keyrock call on cache miss (~5–50ms) ⚠️ EXTERNAL I/O
got('http://keyrock:3000/user?access_token=...')

// Step 6: Forward to backend (~1–500ms) ⚠️ NEW TCP CONNECTION
got(PROXY_URL + req.url, { method, headers, body, retry: 0 })
```
| Metric | Value | Notes |
|---|---|---|
| RAM (single instance) | ~100 MB | Node.js V8 heap baseline |
| RAM (8-core cluster) | ~800 MB | 8 separate Node.js processes |
| Throughput (cache hit) | ~5,000–15,000 req/s | Limited by Express + got proxy overhead |
| Throughput (cache miss) | ~200–1,000 req/s | Limited by Keyrock HTTP round-trip |
| Throughput (Level 3) | ~100–500 req/s | Two sequential HTTP round-trips + XML |
| p99 latency (cache hit) | ~5–20ms | GC pauses + synchronous JWT crypto |
| Startup time | ~1–2s | Node.js + module loading |
jsonwebtoken.verify() performs RSA or HMAC cryptography synchronously. During the ~0.1–1ms crypto operation, the event loop is frozen. Every other connection — reading, writing, proxying — stalls. Under load, this creates cascading latency spikes.
The got HTTP client creates a new TCP connection for every proxied request. TCP handshake (~0.5ms local, ~50ms remote) + no keep-alive means the proxy overhead alone can exceed the backend's response time.
Every request body is fully buffered in memory (Buffer.concat()) before processing begins. For large NGSI-LD payloads (batch entity creation), this means copying megabytes of data before a single byte is validated or forwarded. No streaming.
```c
// The entire PEP proxy in C
KhServer server;
khInit(&server, 8080, 0);

// All routes go through the same handler
khRegister(&server, KhGet,    "/**", pepHandler, true);
khRegister(&server, KhPost,   "/**", pepHandler, true);
khRegister(&server, KhPatch,  "/**", pepHandler, true);
khRegister(&server, KhDelete, "/**", pepHandler, true);

// pepHandler hot path:
// 1. Extract token from header (zero-copy pointer into read buffer)
// 2. fwHash lookup in token cache (O(1), no GC)
// 3. If miss: validate JWT via OpenSSL (threaded, non-blocking)
// 4. If miss: HTTP to Keyrock via persistent connection pool
// 5. Forward to backend via persistent connection pool
// 6. Stream response back (zero-copy splice where possible)
```
| Wilma Component | Node.js | C + fw-libs | Impact |
|---|---|---|---|
| HTTP server | Express (event loop, JS middleware) | fwHttp (epoll, zero-copy parse) | 10–50× faster request parsing |
| JWT verification | jsonwebtoken (sync crypto) | OpenSSL HMAC/RSA (threaded) | Non-blocking, 2–5× faster crypto |
| Token cache | node-cache (JS object) | fwHash (flat table, no GC) | ~10× faster, zero GC pressure |
| Proxy to backend | got (no pool, full buffer) | fwHttp client + connection pool | Eliminates TCP handshake per request |
| Proxy to Keyrock | got (new conn per call) | Persistent connection to Keyrock | ~10× faster cache-miss path |
| Body handling | Buffer.concat() | Zero-copy (fwHttp read buffer) | No copy, no allocation |
| JSON parsing | JSON.parse() | fwJson (in-place, zero-alloc) | 5–10× faster |
| Memory allocation | V8 heap + GC (~100 MB) | fwAlloc bump allocator (~2–5 MB) | 20–50× less RAM, zero GC |
| Logging | morgan + debug | fwTrace (structured, near-zero cost) | ~100× less logging overhead |
| Clustering | N × Node.js processes | SO_REUSEPORT (fwHttp built-in) | N instances at ~2 MB each, not N × 100 MB |
The PEP proxy hot path (cache hit) becomes:
| Phase | Node.js (Wilma) | C (Wilma 2.0) |
|---|---|---|
| Parse HTTP request | ~0.5–2ms (Express + middlewares) | ~3–5μs (fwHttp zero-copy) |
| Extract token | ~0.01ms | ~0.01μs (pointer into buffer) |
| Cache lookup | ~0.01ms | ~0.5μs (fwHash) |
| JWT verify (if needed) | ~0.1–1ms (blocks event loop) | ~0.05–0.5ms (threaded OpenSSL) |
| Enrich headers | ~0.05ms (JSON.stringify roles) | ~1μs (pre-cached) |
| Forward to backend | ~0.5–1ms (new TCP conn) | ~0.02–0.05ms (pooled conn) |
| Return response | ~0.1–0.5ms (full buffer) | ~0.01–0.05ms (stream/splice) |
| Total (cache hit) | ~1.5–5ms | ~0.01–0.05ms (+ backend time) |
| Metric | Wilma (Node.js) | Wilma 2.0 (C) | Improvement |
|---|---|---|---|
| Throughput (cache hit) | ~5,000–15,000 req/s | ~100,000–200,000 req/s | ~10–40× |
| Throughput (cache miss) | ~200–1,000 req/s | ~2,000–5,000 req/s | ~5× (Keyrock-limited) |
| p99 latency (cache hit) | ~5–20ms | ~0.05–0.2ms | ~50–100× |
| RAM per instance | ~100 MB | ~2–5 MB | ~20–50× less |
| RAM (8-core cluster) | ~800 MB | ~5–10 MB | ~80–160× less |
| Startup time | ~1–2s | <10ms | ~100× |
| Proxy overhead added | ~1.5–5ms | ~0.01–0.05ms | ~50–100× less |
The cache-miss throughput is still limited by Keyrock's response time, but with persistent connection pooling the overhead drops from ~50ms (new TLS handshake) to ~5ms (reused connection). A Keyrock 2.0 in C (see Keyrock analysis) would reduce this further.
Wilma is the easiest FIWARE GE to rewrite. It's a thin proxy with well-defined behavior. The fw-libs provide ~80% of the infrastructure.
| Component | Work | Estimate |
|---|---|---|
| HTTP proxy core | fwHttp server + client with connection pooling, header forwarding, body streaming | 1 week |
| JWT validation | OpenSSL HMAC-SHA256/RS256 verification, token parsing with fwJson | 3–4 days |
| Token/decision cache | fwHash with TTL expiry, thread-safe access | 2–3 days |
| Keyrock integration | HTTP client for token validation, user info enrichment | 3–4 days |
| Authorization PDPs | Keyrock basic, AuthZForce XACML, OPA HTTP client | 1 week |
| NGSI-LD payload analysis | Level 3 body inspection (entity IDs, attributes, types) with fwJson | 3–4 days |
| Configuration & startup | Config file parsing, environment variable overrides, graceful shutdown | 2–3 days |
| Testing & hardening | Unit tests, integration tests against Keyrock, load testing | 1 week |
| **Total** | | **3–5 weeks** |
Wilma is the ideal first rewrite target. It's small (~2,500 lines of JavaScript), the behavior is well-defined (validate token, proxy request), and the performance impact is massive — because the PEP proxy sits in front of every protected service, making it faster benefits the entire platform.
With 3–5 weeks of effort, you get a PEP proxy that adds 50μs of overhead instead of 5ms, uses 2 MB instead of 100 MB, and handles 100K+ req/s per core. The latency reduction alone justifies the rewrite — every API call in the platform becomes 1–5ms faster.
Bonus: building Wilma 2.0 serves as a test bed for fwHttp's proxy capabilities (connection pooling, streaming, header manipulation) that will also be needed for other GE rewrites.