
What a holy grail image format would mean for the browser’s lookahead pre-parser

By Jason Grigsby

Published on August 27th, 2012


As long as I’m posting puzzles, riddle me this: what happens to the browser’s lookahead pre-parser if we find the holy grail of responsive images—an image format that works for all resolutions?

In nearly every discussion about responsive images, the conversation eventually turns to the idea that what we really need is a new image format that would contain all of the resolutions of a given image.

For example, JPEG 2000 has the ability to “display images at different resolutions and sizes from the same image file”.

A solution like this would be ideal for responsive images. One image file that can be used regardless of the size the image is in the page. The browser only downloads what it needs and nothing more.

As Ian Hickson recently wrote on the WhatWG mailing list regarding future srcset syntax complexities that might come with hypothetical 4x displays:

Or maybe by then we’ll have figured out an image format that magically does the right thing for any particular render width, and we can just do:
<img src="jupiter-holiday-home.smf">
…and not have to worry about any of this.

It seems that no matter where you’d like to see responsive images go—srcset, picture, whatever—everyone agrees we’d all be happier with a new, magical image format.

If this mythical image format existed and was adopted by browsers, it would mean the end of image breakpoints for most responsive images. You can see this in Ian Hickson’s hypothetical code above.

Unless the image required editing at various viewport widths (see art direction use case), the document author could simply point to a single file and know that the browser would download the data necessary to display the image for anything from a small mobile screen to a large, high-density display.

This is where the thought experiment gets interesting for me. As noted before, the big challenge with responsive images is that the browser’s lookahead pre-parser wants to start downloading images before the layout is finished. This is the central conflict for responsive images which I described as:

How do we reconcile a pre-parser that wants to know what size image to download ahead of time with an image technique that wants to respond to its environment once the page layout has been calculated?

Image breakpoints are designed to give the lookahead pre-parser the information it needs to pick the right image before the browser knows the size of the element in the layout. This is how we get around the problem.
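For illustration, here is roughly what declared breakpoints look like in the srcset syntax under discussion on the WhatWG list at the time (the filenames, widths, and densities here are hypothetical):

```html
<!-- Hypothetical filenames and breakpoints, shown only to illustrate how
     breakpoints declared in markup let the pre-parser pick a file using
     just the viewport size and display density, before layout happens. -->
<img src="jupiter-small.jpg"
     srcset="jupiter-medium.jpg 600w,
             jupiter-large.jpg 1200w,
             jupiter-large-2x.jpg 1200w 2x"
     alt="Jupiter holiday home">
```

Because the candidate files and the conditions for choosing them are right there in the markup, the pre-parser never has to wait to find out how big the image will be in the page.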

But we just saw that image breakpoints would go away if we had a new, magical image format. Without the image breakpoints and without knowing the size of the image in the page, how would the browser know when to stop downloading the image?

Unless I’m missing something, it wouldn’t. The browser would start downloading the image file and would only stop once the layout had been determined. In the meantime, it may download a lot more data for a given image than is necessary.

Let’s take a look at Flickr as an example. Flickr would be a huge beneficiary of a new image format. Right now, Flickr maintains a bunch of different sizes of images in addition to the original source image.

Sample of the different sizes that a Flickr image can be downloaded at.

Instead of having to resize every image, Flickr could use a single image file no matter where the image was used throughout the site.

Let’s also assume that in this future world, Flickr used a responsive design with flexible images, so that the size of an image depends on the width of the viewport. What happens to a typical Flickr layout like the one below?

Screenshot of Flickr home screen for a logged in user. Shows lots of smaller images.

Because it would be a responsive design, the size of the thumbnails on the page would vary based on the screen size. But no matter how big the thumbnails got, they would never equal the full size of the original image, which means the browser would need to stop downloading the magic image file at some point or it would download extra data.

What would the lookahead pre-parser know at the moment it started downloading the images? All it would know is the size of the viewport and the display density.

Based on those two data points, the pre-parser might try to determine when to stop downloading the image based on the maximum size the display could support. That works OK if the display is small, but on a large, high-density display, the maximum image could be gigantic.

And because our magical file format would contain multiple resolutions, the browser would likely keep downloading image data until the layout was calculated, making it very likely that excess image data would be downloaded and that the download of other assets would be delayed.

Of course, this is all hypothetical. We don’t have a magical image format on the table. So for now we can avoid resolving what I see as a central conflict between the need for the pre-parser to start speculatively downloading images before it knows the layout of the page and the fact that the size of responsive images isn’t known until the page layout is determined.

But in the long run—if we find our holy grail—this conflict is likely to resurface which makes me wonder about our current efforts.

I wholeheartedly agree with Steve Souders that “speculative downloading is one of the most important performance improvements from browsers”, and until a new image format materializes, it seems we should do everything we can to accommodate the pre-parser.

And at the same time, I can’t help but wonder, if we all want this magical image format and if in some ways it seems inevitable, then are we jumping through hoops to save browser behavior that won’t work in the long run regardless?

Comments

Andy Davies said:

Perhaps we don’t need a new image format but browsers that are better at enlarging images…

There’s a technique called “Super Resolution from a Single Image”; there are quite a few papers around about it. Here’s an example of one:

http://www.wisdom.weizmann.ac.il/~vision/SingleImageSR.html

(You have to wait quite a while for the images to load once you’ve clicked the blue button)

I have no idea what processor and memory load this places on a device but I’d love to find out more.

Bryan Rieger said:

I’m not entirely sure the pre-parser should determine the required image download size. As you mention “…all [the pre-parser] would know is the size of the viewport and the display density…” – at which point it could fetch a reference file that would later be rendered as required in the context of the actual page layout.

The idea of ‘reference files’ (or stubs – 1-2Kb files) has been around for a while – we used them with QuickTime (reference movies), Shockwave and Flash (stubs) in order to pre-process the layout context required BEFORE downloading and rendering the required resource. Often these ‘movies/reference files’ contained logic in the form of simple switch or if/then statements – or in the case of Flash SWFs they would contain more advanced logic to dynamically draw the image at the correct size and aspect ratio for the layout while lazy-loading any additional resources. This worked not only for photographs, but also for line drawings and other vector-based artwork. In theory, you could do something similar using canvas but it wouldn’t have the performance you’d get from a native implementation.

The ‘Super Resolution from a Single Image’ stuff is super exciting and I’ve seen quite a few demonstrations of similar projects. These tend to work great for photographs, not so well for line-art/vectors/dataviz rendered as bitmaps.

One last thing to consider is that a lot of the ideas required for a new image format solution to work already exist, and unfortunately many of them are likely covered by patents. So the problem of ‘how’ is only half the problem, with ‘will we get sued’ being the other.

Noah Adams said:

A file format won’t save us from needing breakpoints.

The problem isn’t only about finding a means to push enough pixels to a retina display, but rather finding something that works for all of our resolutions and layouts, as you allude to in your post about the art direction problem.

Today, there’s effectively one place we can make these sorts of decisions: as part of a breakpoint in CSS. The drawback here is that we don’t actually specify any resources in the original markup, and thus eschew any possibility of a download starting before the CSS has been downloaded and parsed.
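As a sketch of that limitation (the class name and filenames are hypothetical), a CSS-only breakpoint keeps the image URL out of the markup entirely:

```html
<!-- Hypothetical example: the image URLs live only in the stylesheet,
     so the pre-parser, which scans markup, cannot start either download
     until the CSS has been fetched and parsed. -->
<style>
  .hero { background-image: url(hero-small.jpg); }
  @media (min-width: 40em) {
    .hero { background-image: url(hero-large.jpg); }
  }
</style>
<div class="hero"></div>
```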

Moving forward, it seems very likely that we’re going to get some form of the W3C’s proposed picture element, which moves these resource breakpoints into the markup for the element.

With today’s new W3C draft for the responsive picture element, we actually see two levels of breakpoints, one using media queries to choose a layout appropriate asset, and a second for choosing what resolution of the asset to get. Something like JPEG 2000 (or even a simple progressive JPEG) could work well to allow you to provide a single file for the second set of breakpoints, but you’d still need to specify what would constitute a sufficient number of pixels or amount of data to download as part of the breakpoint definition.
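A rough sketch of what those two levels of breakpoints might look like in the draft picture syntax (the filenames and breakpoint values are hypothetical):

```html
<!-- Sketch based on the 2012 W3C draft picture element.
     Level one: media queries choose a layout-appropriate asset.
     Level two: srcset chooses a resolution of that asset. -->
<picture>
  <source media="(min-width: 45em)" srcset="large-1x.jpg 1x, large-2x.jpg 2x">
  <source media="(min-width: 18em)" srcset="medium-1x.jpg 1x, medium-2x.jpg 2x">
  <source srcset="small-1x.jpg 1x, small-2x.jpg 2x">
  <img src="small-1x.jpg" alt="Jupiter holiday home">
</picture>
```

Even here, note that every candidate file and cutoff still has to be spelled out ahead of time, which is Noah’s point: a single magic file could replace the second level, but only if you still declare how much of it is enough.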

Put more simply, to be pre-parser friendly and download as little as necessary you have to know what, or how much of what, is enough, ahead of time, holy grail or not.

gern said:

How would embedding different sizes of an image into one file reduce bytes downloaded?

gern said:

I’m actually thinking the best way to handle this is streamlined device detection. Once we know what device is used, we can download all the assets for that device and nothing else.

J said:

It seems to me that the lookahead pre-parser is merely a browser optimization of the current HTML spec. If HTML changes, shouldn’t it be up to the browser vendors to optimize for the new spec?