
How we cache 50M profile photos without going broke

S3 was the easy answer. It was also the wrong one. Notes on the boring storage decisions behind a cheap-feeling API.

Maya Okonkwo · 9 min read
S3 → R2: $0.07/GB egress saved

Avtrz serves one thing: a profile photo for a business contact. At rest that is roughly 50 million images, and the number that matters is not how many we store; it is how many times each one leaves our network. Storage is cheap. Egress is not.

When we started, we put every photo in S3 and put a CDN in front of it. It worked on day one. It also quietly set us up for a bill that scaled with our success, which is the worst kind of bill.

Why S3 was the easy answer

S3 is durable, boring, and everywhere. You can have a bucket in four minutes and never think about it again. For storing the photos, it is still the right call.

The problem is the path out. Every cache miss on the CDN reads from the origin, and every origin read on S3 is billed as egress. Our miss rate was low (under 4%), but 4% of a large number is still a large number, and the per-gigabyte rate does not care how clever your architecture is.

The egress math

We did the arithmetic we should have done before signing up: take the average object size, multiply by monthly origin pulls, multiply by the egress rate. Our average avatar is small (a 64px crop is a few kilobytes), but the long tail of larger sizes pulls the average up, and warm-up jobs hammer the origin in bursts.
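That arithmetic fits in a few lines. A minimal sketch in Python: the function names are made up for illustration, and the request volume and object size are placeholder numbers, not Avtrz's real traffic. Only the 4% miss rate and the $0.07/GB rate echo figures quoted in this post.

```python
def origin_pulls(monthly_requests: int, miss_rate: float) -> int:
    """Origin pulls are the CDN requests that miss the cache."""
    return round(monthly_requests * miss_rate)


def monthly_egress_cost(avg_object_bytes: float, pulls: int, rate_per_gb: float) -> float:
    """Estimate monthly origin egress cost in dollars.

    cost = average object size (GB) x monthly origin pulls x egress rate ($/GB)
    """
    bytes_per_gb = 1_000_000_000  # assumption: decimal gigabytes
    egress_gb = avg_object_bytes * pulls / bytes_per_gb
    return egress_gb * rate_per_gb


if __name__ == "__main__":
    # Hypothetical inputs: 1 billion requests a month, 20 KB average object.
    pulls = origin_pulls(1_000_000_000, 0.04)
    cost = monthly_egress_cost(20_000, pulls, 0.07)
    print(f"{pulls:,} origin pulls, about ${cost:,.2f} per month")
```

The useful part is not the total but the shape: the cost is linear in every input, so growth in traffic or in average object size shows up directly on the bill.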

The bug was not in the code. It was in the pricing page we did not read closely enough.

The fix was not a smarter cache. It was moving the origin to storage that does not bill for egress at all. We moved the hot set to R2, kept S3 as the cold archive, and pointed the CDN at R2.

The move itself

Copy, then cut over

We did not migrate in place. We ran a background copy of the hot set into R2, left S3 serving live traffic, and only flipped the origin once the copy finished and checksums matched. A cutover you can roll back is a cutover you can do on a Tuesday.
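The "checksums matched" gate can be sketched as a comparison of two manifests. This is an illustration under assumptions, not the tooling we ran: it assumes each store can produce a mapping of object key to content checksum, and the helper names are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Content checksum for one object."""
    return hashlib.sha256(data).hexdigest()


def verify_copy(source: dict[str, str], destination: dict[str, str]) -> list[str]:
    """Return the keys that are missing or differ in the destination.

    Both arguments map object key -> checksum. An empty result means
    every source object arrived intact.
    """
    return sorted(
        key for key, checksum in source.items()
        if destination.get(key) != checksum
    )


def safe_to_cut_over(source: dict[str, str], destination: dict[str, str]) -> bool:
    """Only flip the origin when nothing is missing or mismatched."""
    return not verify_copy(source, destination)


if __name__ == "__main__":
    # Hypothetical manifests: b.jpg has not been copied yet.
    src = {"a.jpg": sha256_hex(b"one"), "b.jpg": sha256_hex(b"two")}
    dst = {"a.jpg": sha256_hex(b"one")}
    print(verify_copy(src, dst), safe_to_cut_over(src, dst))
```

Because the check is read-only, it can run repeatedly while the old origin keeps serving traffic, which is what keeps the cutover reversible.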

Get the cache headers right

Most of the savings come from the CDN never asking the origin in the first place. Profile photos change rarely, so we serve them with a long max-age and a stale-while-revalidate window, and we key the cache on the normalized request, not the raw query string.

Cache-Control: public, max-age=86400, stale-while-revalidate=604800
Vary: Accept
ETag: "a1f3c9-64"
A 24-hour public max-age with a 7-day stale-while-revalidate window.
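To make the two windows in those headers concrete, here is a rough sketch of how a cache could classify a stored response by its age. It is a simplification of real CDN behavior (it ignores validators, `Vary`, and other directives), and the function name and state labels are invented for illustration.

```python
def cache_state(age: int, max_age: int, stale_while_revalidate: int) -> str:
    """Classify a cached response by its age in seconds.

    fresh            -> serve from cache, origin is not contacted
    stale-revalidate -> serve the stale copy now, refresh in the background
    expired          -> fetch from the origin before responding
    """
    if age < max_age:
        return "fresh"
    if age < max_age + stale_while_revalidate:
        return "stale-revalidate"
    return "expired"


if __name__ == "__main__":
    MAX_AGE = 86_400   # 24 hours, from the header above
    SWR = 604_800      # 7 days, from the header above
    for age in (3_600, 90_000, 700_000):
        print(age, cache_state(age, MAX_AGE, SWR))
```

In the middle state the user still gets an immediate response from the edge, so a viewer only waits on the origin once a photo has gone unrequested for more than eight days.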

Normalize before you cache

?size=64 and ?size=64&v=1 are the same image. If you cache on the raw URL, you pay for both. We collapse the request to a canonical form (sorted params, dropped tracking keys) before it ever touches the cache layer.
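A minimal sketch of that canonicalization using Python's standard library. The set of dropped keys and the `/photo/abc` path are assumptions for illustration; the real list of ignorable parameters depends on what your API actually treats as meaningful.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Assumption: parameters that never change the image that is returned.
DROPPED_KEYS = frozenset({"v", "utm_source", "utm_medium", "utm_campaign"})


def cache_key(url: str) -> str:
    """Collapse a request URL to a canonical cache key.

    Drops ignorable parameters and sorts the rest, so equivalent
    requests map to one cache entry.
    """
    parts = urlsplit(url)
    params = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in DROPPED_KEYS
    )
    query = urlencode(params)
    return f"{parts.path}?{query}" if query else parts.path


if __name__ == "__main__":
    # Both requests resolve to the same key.
    print(cache_key("/photo/abc?size=64"))
    print(cache_key("/photo/abc?v=1&size=64"))
```

This sketch keys on path and query only. If one cache serves several hostnames, the host would need to be part of the key as well.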

What it bought us

$0.07/GB egress saved
98.9% edge hit rate
0 code changes for users

None of this was clever. The clever version (a custom cache tier, a bespoke eviction policy) is what we almost built first. The boring version shipped in a week and is still the reason the API feels cheap.

If you are storing a lot of small files and serving them a lot, read the egress line on the pricing page before you read anything else. That line is your architecture.

Written by
Maya Okonkwo
Works on storage and the edge at Avtrz. Believes most infrastructure problems are really cost problems wearing a hat.