HTTP content negotiation on AWS CloudFront Part 2

My earlier post on HTTP content negotiation in AWS CloudFront covered support for negotiating the response encoding using the request’s Accept-Encoding header. This post builds on that by adding support for negotiating Content-Type using the request’s Accept header.

As Ilya Grigorik laid out several years ago, content negotiation is tricky to get right, and yet particularly important when serving images. This is even more relevant today, as the browser ecosystem’s support for WebP has recently taken a big step forward: Edge just shipped support, and work on Firefox support has resumed after a long hiatus. Furthermore, there continue to be interesting new image formats on the horizon, such as AVIF and HEIF.

There are several reasons why Content-Type negotiation is more difficult than Content-Encoding negotiation. First, the media types being negotiated are hierarchical, supporting a type and subtype model, e.g. image/png. In addition, wildcards are supported, e.g. image/*. Finally, unlike Accept-Encoding, where browsers explicitly send all of the encodings that they support, browsers tend not to do this with the Accept header. For example, Firefox sends Accept: */* when requesting images. This gives the HTTP server no indication of what the browser actually supports; per the RFC, the server would be allowed to return any content at all. As a result, servers are typically either conservative, returning only formats which are highly likely to be supported like image/jpeg, or fall back to heuristics like User-Agent sniffing to detect specific browser builds which support a server-preferred content type.
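To make the type/subtype and wildcard structure concrete, here is a minimal sketch of parsing an Accept header into media ranges and testing a concrete type against one. The names parseAccept and rangeMatches are hypothetical and exist only for this illustration; they are not the API of any library discussed in this post, and the parser ignores corner cases such as quoted parameter values.

```javascript
// Parse an Accept header into media ranges of the form { type, subtype, q }.
// Simplified: only the "q" parameter is interpreted; others are ignored.
function parseAccept(header) {
  return header
    .split(',')
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part) => {
      const [range, ...params] = part.split(';').map((s) => s.trim());
      const [type, subtype] = range.toLowerCase().split('/');
      let q = 1; // a missing q-value means q=1
      for (const param of params) {
        const [name, value] = param.split('=').map((s) => s.trim());
        if (name.toLowerCase() === 'q' && value !== undefined) {
          q = parseFloat(value);
        }
      }
      return { type, subtype, q };
    });
}

// True if a concrete media type such as "image/webp" falls within a range.
// A "*" in either position of the range matches anything in that position.
function rangeMatches(range, mediaType) {
  const [type, subtype] = mediaType.toLowerCase().split('/');
  return (
    (range.type === '*' || range.type === type) &&
    (range.subtype === '*' || range.subtype === subtype)
  );
}
```

Parsing Firefox's `*/*` yields a single range that matches image/webp, image/jpeg, and everything else equally, which is exactly why it tells the server so little.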

What is an HTTP server implementer to do? Apache’s mod_negotiation has a fairly sophisticated set of heuristics for content negotiation that covers some of this, including working around overly permissive wildcards.

With AWS CloudFront, we can implement something similar to drive this process on Lambda@Edge. The code below is being used to serve this article, and will cause the image of a dog to be returned as WebP if your browser supports it.

How does this work?

Similar to my previous post, this uses the zero-dependency, MIT-licensed http-content-negotiation-js library to run the content negotiation process. This library implements all of the requisite media range parsing and semantics, as well as some of the heuristics from mod_negotiation. For example, it treats matches against a subtype wildcard as having an implicit q-value of 0.02 if none of the media ranges in the request have an explicit q-value specified.
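The wildcard heuristic can be sketched roughly as follows. This is an illustration of the rule as described above, not the library's code: the function name applyWildcardHeuristic and the { type, subtype, q, explicitQ } shape are assumptions made for this example. The 0.01 value for `*/*` comes from the mod_negotiation documentation; whether the library applies it in the same way is not something this post covers.

```javascript
// Down-weight wildcard ranges when the client sent no explicit q-values.
// Each range is { type, subtype, q, explicitQ } (a shape assumed for this sketch).
function applyWildcardHeuristic(ranges) {
  if (ranges.some((r) => r.explicitQ)) {
    // The client expressed its own preferences; leave them untouched.
    return ranges.map((r) => ({ ...r }));
  }
  return ranges.map((r) => {
    if (r.type === '*' && r.subtype === '*') return { ...r, q: 0.01 }; // "*/*"
    if (r.subtype === '*') return { ...r, q: 0.02 }; // e.g. "image/*"
    return { ...r }; // fully specified types keep their q-value
  });
}

// "Accept: image/webp,image/*" carries no q-values, so image/* is down-weighted.
const adjusted = applyWildcardHeuristic([
  { type: 'image', subtype: 'webp', q: 1, explicitQ: false },
  { type: 'image', subtype: '*', q: 1, explicitQ: false },
]);
```

After this adjustment, a type the client named explicitly outranks anything that only matched through a wildcard, even though the header itself gave both the default q-value of 1.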

First, the SERVER_IMAGE_TYPES list of ValueTuple objects is created to represent our content type preferences for images. Note that we indicate a slight preference for image/jpeg (q-value 0.99) over image/webp (q-value 0.98). This handles user agents that do not send an explicit indication of support for one of the image types (e.g. Firefox sends Accept: */*). In this case, we’ll prefer the more conservative image/jpeg. For user agents that do send specific image types (e.g. Chrome sends Accept: image/webp,image/apng,image/*,*/*;q=0.8), we’ll honor their request for a more specific type.
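The effect of those two q-values can be illustrated with a small self-contained sketch. Plain objects stand in for ValueTuple here, and negotiateType is a hypothetical function, not the library's API. To keep it short, the Accept headers are written out as already-parsed media ranges. Ranking exact matches ahead of wildcard matches is an assumption of this sketch, chosen to reproduce the outcomes described above; the library may arrive at them differently.

```javascript
// Server preferences. Plain objects stand in for the library's ValueTuple.
const SERVER_IMAGE_TYPES = [
  { type: 'image/jpeg', q: 0.99 },
  { type: 'image/webp', q: 0.98 },
];

// How specifically a range matches: 2 = exact, 1 = "type/*", 0 = "*/*", -1 = none.
function specificity(range, mediaType) {
  const [type, subtype] = mediaType.split('/');
  if (range.type === type && range.subtype === subtype) return 2;
  if (range.type === type && range.subtype === '*') return 1;
  if (range.type === '*' && range.subtype === '*') return 0;
  return -1;
}

// Pick the server type with the most specific client match, breaking ties
// by the product of client and server q-values. Returns null if none fit.
function negotiateType(clientRanges, serverTypes) {
  let best = null;
  for (const server of serverTypes) {
    let match = null; // most specific client range covering this server type
    for (const range of clientRanges) {
      const s = specificity(range, server.type);
      if (s >= 0 && (match === null || s > match.s)) match = { s, q: range.q };
    }
    if (match === null || match.q === 0) continue; // unsupported or refused
    const candidate = { type: server.type, s: match.s, score: match.q * server.q };
    if (
      best === null ||
      candidate.s > best.s ||
      (candidate.s === best.s && candidate.score > best.score)
    ) {
      best = candidate;
    }
  }
  return best === null ? null : best.type;
}

// Accept: */*
const FIREFOX = [{ type: '*', subtype: '*', q: 1 }];
// Accept: image/webp,image/apng,image/*,*/*;q=0.8
const CHROME = [
  { type: 'image', subtype: 'webp', q: 1 },
  { type: 'image', subtype: 'apng', q: 1 },
  { type: 'image', subtype: '*', q: 1 },
  { type: '*', subtype: '*', q: 0.8 },
];
```

With these inputs, negotiateType(FIREFOX, SERVER_IMAGE_TYPES) falls back to image/jpeg because both types match only through `*/*` and the server's 0.99 edges out 0.98, while negotiateType(CHROME, SERVER_IMAGE_TYPES) selects image/webp because it is the only server type the client named exactly.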

Next, the request handler looks for request URLs ending in .jpg and interprets this as a request for an image, performing type negotiation. We then rewrite the URL for the upstream request with a new file extension based on the negotiated content type.
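A rough sketch of that rewriting step follows, using the Lambda@Edge request event shape (event.Records[0].cf.request, with headers keyed by lowercase name). The helper names, the extension map, and the deliberately simplistic negotiateImageType are all hypothetical stand-ins for illustration; the real handler delegates negotiation to the library rather than to a check like this.

```javascript
// Map from negotiated content type to the file extension stored upstream.
const EXTENSION_FOR_TYPE = {
  'image/jpeg': '.jpg',
  'image/webp': '.webp',
};

// Hypothetical stand-in for full negotiation: pick WebP only when the client
// names it explicitly with a non-zero q-value, otherwise fall back to JPEG.
function negotiateImageType(acceptHeader) {
  for (const part of acceptHeader.split(',')) {
    const [range, ...params] = part.split(';').map((s) => s.trim().toLowerCase());
    const refused = params.some((p) => /^q=0(\.0*)?$/.test(p));
    if (range === 'image/webp' && !refused) return 'image/webp';
  }
  return 'image/jpeg';
}

// Rewrite the request URI in place, e.g. "/dog.jpg" becomes "/dog.webp".
// Requests that do not end in ".jpg" pass through unchanged.
function rewriteRequest(request) {
  if (!request.uri.endsWith('.jpg')) return request;
  const acceptValues = (request.headers['accept'] || []).map((h) => h.value);
  const type = negotiateImageType(acceptValues.join(','));
  request.uri = request.uri.slice(0, -'.jpg'.length) + EXTENSION_FOR_TYPE[type];
  return request;
}

// Lambda@Edge entry point; in a deployment this would be exported as the
// function's handler.
async function handler(event) {
  return rewriteRequest(event.Records[0].cf.request);
}
```

Because the browser still asks for the .jpg URL, the upstream object is chosen by the negotiated extension, so both variants need to exist at the origin under matching names.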

Finally, it’s worth noting that while type and encoding negotiation are not mutually exclusive, it is generally not worthwhile spending CPU cycles applying and undoing a content encoding on images, since formats like JPEG and WebP are already compressed. Because of this, we only bother performing encoding negotiation if we’re not serving an image.