For most of Avtrz’s life, an avatar request that missed the cache went to a pool of workers running Sharp. The worker pulled the original photo, resized and cropped it, and handed it back. It worked. It was also a fleet of machines we had to run, scale, and patch so that a request could wait in a queue.
This spring we moved the transform to the edge. The crop now happens in the same edge function that serves the response, next to the user, with no worker pool behind it. The move cut about 200 ms off every cache miss. It also cost us three production bugs, which are the interesting part.
Why move it at all
A worker pool is latency you pay for twice: the network hop to the pool, and the time the request spends queued behind other requests when traffic spikes. Warm-up jobs spike traffic by design, so the queue was worst exactly when customers cared most.
Edge runtimes can now decode, resize, and re-encode an image inline. So the transform moved to where the request already was.
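The resize-and-crop step is mostly arithmetic, which is part of why it fits in an edge function. Here is a minimal sketch of the geometry, assuming square, center-cropped avatars. The post does not spell out the crop policy, and `planSquareCrop` and `CropPlan` are hypothetical names used only for illustration.

```typescript
// Hypothetical shape: which square region of the source to keep,
// and how much to scale it to reach the requested size.
interface CropPlan {
  sx: number;    // left edge of the crop square in source pixels
  sy: number;    // top edge of the crop square in source pixels
  side: number;  // side length of the crop square in source pixels
  scale: number; // factor applied to the crop square to reach the target
}

// Assumption: avatars are square and cropped around the image center.
function planSquareCrop(srcW: number, srcH: number, target: number): CropPlan {
  // The largest centered square is bounded by the shorter source side.
  const side = Math.min(srcW, srcH);
  return {
    sx: Math.floor((srcW - side) / 2),
    sy: Math.floor((srcH - side) / 2),
    side,
    scale: target / side,
  };
}
```

For a 1200x800 original and a 200px avatar, this keeps the centered 800x800 square and scales it by 0.25.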
“Moving work closer to the user is easy. Noticing the assumptions that moved with it is the hard part.”
The three bugs
One: memory limits, not CPU limits
The worker pool had room to decode a large source image into memory. The edge runtime is tighter. A handful of unusually large originals decoded fine in staging and then hit the memory ceiling under real traffic. The fix was to cap the source dimensions before decoding and reject anything implausible early.
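One way to cap dimensions before decoding is to read them straight from the file header, which costs a few bytes instead of a full bitmap. The sketch below does this for PNG only, where width and height sit at fixed offsets in the IHDR chunk. The limits `MAX_SIDE` and `MAX_PIXELS` are made-up values, not our real ceiling, and the function names are hypothetical.

```typescript
// Hypothetical limits; the real ceiling depends on the runtime's memory budget.
const MAX_SIDE = 8192;
const MAX_PIXELS = 40_000_000;

interface Size {
  width: number;
  height: number;
}

// Read width and height from a PNG header without decoding any pixels.
// Layout: 8-byte signature, 4-byte chunk length, "IHDR", then two
// big-endian uint32 values (width, height).
function readPngSize(bytes: Uint8Array): Size | null {
  const signature = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
  if (bytes.length < 24) return null;
  for (let i = 0; i < signature.length; i++) {
    if (bytes[i] !== signature[i]) return null;
  }
  // Bytes 12..15 must spell "IHDR".
  const ihdr = [0x49, 0x48, 0x44, 0x52];
  for (let i = 0; i < ihdr.length; i++) {
    if (bytes[12 + i] !== ihdr[i]) return null;
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // DataView reads big-endian by default, which matches PNG.
  return { width: view.getUint32(16), height: view.getUint32(20) };
}

// Reject unreadable, empty, or oversized sources before any decode is attempted.
function isPlausibleSource(size: Size | null): boolean {
  if (size === null) return false;
  if (size.width === 0 || size.height === 0) return false;
  if (size.width > MAX_SIDE || size.height > MAX_SIDE) return false;
  return size.width * size.height <= MAX_PIXELS;
}
```

Checking the pixel count as well as each side matters because decoded memory grows with width times height: an image can pass a per-side limit and still be too large to hold in memory.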
Two: the cache key lost a field
The worker keyed its output cache on size and output format. The first edge version keyed only on size, so a request for webp could be served a cached png. It looked fine until a client sent an Accept header we had not seen in testing. We added format back to the cache key and added Accept to the Vary header, so downstream caches keep the formats apart too.
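A rough sketch of the corrected key, assuming the output format is negotiated from the Accept header and that webp and png are the only outputs. The key layout and the names `negotiateFormat`, `cacheKey`, and `responseHeaders` are illustrative, not our production code.

```typescript
type OutputFormat = "webp" | "png";

// Pick the output format from the Accept header.
// Simplification: q-values are ignored; any mention of image/webp wins.
function negotiateFormat(accept: string | null): OutputFormat {
  const types = (accept ?? "")
    .split(",")
    .map((part) => (part.split(";")[0] ?? "").trim().toLowerCase());
  return types.includes("image/webp") ? "webp" : "png";
}

// The key carries every input that changes the bytes: size AND format.
// The layout of the key string is hypothetical.
function cacheKey(avatarId: string, size: number, format: OutputFormat): string {
  return `avatar/${avatarId}/${size}.${format}`;
}

// Vary: Accept tells downstream caches that the body depends on Accept.
function responseHeaders(format: OutputFormat): Record<string, string> {
  return { "Content-Type": `image/${format}`, "Vary": "Accept" };
}
```

With the format in the key, a webp request and a png request for the same avatar and size can no longer collide on one cache entry.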
Three: cold starts on rare regions
Popular regions stayed warm and fast. A handful of low-traffic regions cold-started the transform code on the first request and looked slower than the old worker pool there. We now keep the function warm with a low-rate health ping per region. Boring, and it worked.
Where it landed
The win was real and so were the bugs. Every one of them came from an assumption that was true in the worker pool and silently false at the edge: more memory, a forgiving cache key, always-warm code. Moving work closer to the user is easy. The job is finding the assumptions that did not make the trip.