The Web OS is already here… it’s just not what you thought it would be. Web technologies are currently powering content and interactions across multiple devices effectively turning the most popular native applications into Web browsers. The end result is a widely distributed and used Web-based operating system. Just not the one you imagined. — Luke Wroblewski
Some people do a lot of research before they travel. They read guidebooks. They like to know what they are going to do before they get there.
Others want to experience the new place and let the serendipity happen. In their minds, planning ahead would take all of the fun out of the trip.
Either approach works fine. But anyone who has travelled in a group knows what happens when the person who plans ahead is forced to contend with the carefree attitude of someone who wants to wait until the last minute to decide what to do.
And that is essentially the conflict we have between the browser’s lookahead pre-parser and responsive images.
What does the lookahead pre-parser do?
In recent years, browser makers have put an emphasis on making pages load as quickly as possible. The lookahead pre-parser is part of their efforts.
The Internet Explorer team describes the lookahead pre-parser as follows:
To reduce the delay inherent in downloading script, stylesheets, images, and other resources referenced in an HTML page, Internet Explorer needs to request the download of those resources as early as possible in the loading of the page.… Internet Explorer runs a second instance of a parser whose job is to hunt for resources to download while the main parser is paused. This mode is called the lookahead pre-parser because it looks ahead of the main parser for resources referenced in later markup.
The lookahead pre-parser is much simpler than the full parser used to determine how the page will be rendered. Because the scripts and CSS have yet to be processed, the lookahead pre-parser has to guess which assets will be downloaded:
The download requests triggered by the lookahead are called “speculative” because it is possible (not likely, but possible) that the script run by the main parser will change the meaning of the subsequent markup (for instance, it might adjust the BASE against which relative URLs are combined) and result in the speculative request being wasted.
How lookahead pre-parsers work isn’t that important. What does matter is that the browser wants to start downloading assets before full page layout has been determined. To be successful, the pre-parser needs to know what the page is likely to do ahead of time.
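To make that concrete, here is a rough TypeScript sketch of what a speculative scan amounts to. The function name `speculativeScan`, the regular expression, and the sample markup are all assumptions for illustration; real browsers use a proper tokenizer rather than a regex. The only point is that the scan sees URLs in the markup and nothing about layout.

```typescript
// Hypothetical sketch: collect resource URLs from raw markup without
// running scripts, applying CSS, or computing layout.
function speculativeScan(html: string): string[] {
  const urls: string[] = [];
  // Match src on <img>/<script> and href on <link>. The regex stands in
  // for a real tokenizer and will miss many edge cases.
  const pattern =
    /<(?:img|script)\b[^>]*?\bsrc\s*=\s*"([^"]*)"|<link\b[^>]*?\bhref\s*=\s*"([^"]*)"/gi;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const url = match[1] ?? match[2];
    if (url) urls.push(url);
  }
  return urls;
}

const page = `
  <link rel="stylesheet" href="site.css">
  <script src="app.js"></script>
  <img src="photo.jpg" alt="" style="width: 50%">
`;

// Logs the three URLs in document order: site.css, app.js, photo.jpg
console.log(speculativeScan(page));
```

Notice that the scan finds `photo.jpg` right away, but it has no idea how many pixels wide "50%" will turn out to be.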
Responsive images travel without a guidebook
If the lookahead pre-parser is the tourist with a detailed itinerary of places to visit, responsive images are the go-with-the-flow tourist waiting to see what things look like before choosing what to do.
In a responsive web design, the layout and images are all fluid. The size of any given image cannot be determined until the page layout is calculated by the rendering engine.
All of the solutions suffer from this conflict
No matter if you favor <picture>, @srcset, or some other solution, the fundamental conflict between the lookahead pre-parser and responsive images persists. Let me demonstrate with some of the more popular options.
@srcset min-width only
One of the biggest points of confusion for the srcset proposal was understanding what the width and height attributes in the new syntax were supposed to represent.
<img src="face-600-200@1.jpeg" alt="" srcset="face-600-200@1.jpeg 600w 200h 1x, face-600-200@2.jpeg 600w 200h 2x, face-icon.png 200w 200h">
Originally, I thought the 600w 200h in the example syntax represented the image size. But instead, they list the minimum viewport resolution that should be used for a particular image. Think CSS min-width being used in a media query.
Once I understood what the width was meant to represent, I began to see the shortcomings of this approach. Whatever values were listed in srcset would need to match the breakpoints specified in the design’s media queries.
Jeremy Keith pointed out that by only supporting “min-width”, srcset will have difficulty matching breakpoints defined using max-width in CSS. He writes:
One of the advantages of media queries is that, because they support both min- and max- width, they can be used in either use-case: “Mobile First” or “Desktop First”.
Because the srcset syntax will support either min- or max- width (but not both), it will therefore favour one case at the expense of the other.
Both use-cases are valid. Personally, I happen to use the “Mobile First” approach, but that doesn’t mean that other developers shouldn’t be able to take a “Desktop First” approach if they want. By the same logic, I don’t much like the idea of srcset forcing me to take a “Desktop First” approach.
The inability of srcset to match breakpoints defined in media queries is just the beginning of the challenges.
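A small sketch may help show why the values have to track the CSS. The TypeScript below parses a candidate list in the proposal's syntax and picks an image by treating each `w` value as a min-width threshold, which is the reading described above. That reading is an assumption, and so are the names `parseSrcset` and `pickCandidate` and the file names; this is not the selection algorithm from the specification, and the `h` values are parsed but ignored for brevity.

```typescript
// One entry from a srcset attribute in the proposal's syntax,
// e.g. "large.jpg 600w 200h 2x". Missing descriptors stay undefined.
interface Candidate {
  url: string;
  w?: number;
  h?: number;
  x?: number;
}

function parseSrcset(srcset: string): Candidate[] {
  return srcset.split(",").map((entry) => {
    const parts = entry.trim().split(/\s+/);
    const candidate: Candidate = { url: parts[0] ?? "" };
    for (const descriptor of parts.slice(1)) {
      const value = parseFloat(descriptor);
      if (descriptor.endsWith("w")) candidate.w = value;
      else if (descriptor.endsWith("h")) candidate.h = value;
      else if (descriptor.endsWith("x")) candidate.x = value;
    }
    return candidate;
  });
}

// Assumption: "w" is a min-width threshold on the viewport, in pixels.
function pickCandidate(
  candidates: Candidate[],
  viewportWidth: number,
  pixelDensity: number
): Candidate | undefined {
  const eligible = candidates.filter(
    (c) => (c.w ?? 0) <= viewportWidth && (c.x ?? 1) <= pixelDensity
  );
  // Prefer the widest threshold, then the highest density.
  eligible.sort((a, b) => (b.w ?? 0) - (a.w ?? 0) || (b.x ?? 1) - (a.x ?? 1));
  return eligible[0];
}

const candidates = parseSrcset(
  "large.jpg 600w 1x, large-2x.jpg 600w 2x, small.jpg 200w"
);

console.log(pickCandidate(candidates, 800, 2)?.url); // large-2x.jpg
console.log(pickCandidate(candidates, 320, 2)?.url); // small.jpg
console.log(pickCandidate(candidates, 100, 1)); // undefined: fall back to src
```

The 600 and 200 are hard-coded viewport widths living in the markup. If the stylesheet's breakpoints move, or are written with max-width, these numbers no longer line up with the design.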
@srcset px only
You can use em and % freely in your stylesheets/CSS. The values from srcset are used to fetch the right resource during early prefetch, checked against the width and height of the viewport (and only that viewport).
Having ems or % would make no sense whatsoever there, because you don’t know what they mean… If you make a solution that will support em/% in a meaningful way, you would have to wait for layout in order to know what size that means. So you will have slower-loading images, and ignore the “we want pictures fast” requirement.
Sound familiar? It’s the conflict between the lookahead pre-parser and responsive images again.
Without support for ems, it will be very difficult to match srcset attributes to responsive design breakpoints that use ems.
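The arithmetic is simple enough to sketch. In a media query, an em resolves against the browser's initial font size, which is commonly 16px but can be changed by the user. The function name `emBreakpointToPx` and the 37.5em breakpoint below are made up for illustration.

```typescript
// Media-query ems resolve against the browser's initial font size,
// so the pixel value of an em breakpoint is not fixed.
function emBreakpointToPx(em: number, initialFontSizePx: number): number {
  return em * initialFontSizePx;
}

// Hypothetical breakpoint: @media (min-width: 37.5em)
const breakpointEm = 37.5;

console.log(emBreakpointToPx(breakpointEm, 16)); // 600
console.log(emBreakpointToPx(breakpointEm, 20)); // 750
```

A pixel value of 600 in srcset lines up with that breakpoint only for the first user. For the second, the layout changes at 750 pixels while the image still switches at 600.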
@srcset and <picture> maintenance nightmare
As the challenges of matching media queries to @srcset values became clearer to me, it also became apparent what a mess updating this code would be when a redesign occurred. This is a problem for both @srcset and <picture> as D. Pritchard pointed out on the WhatWG list:
I dread the day when I have to dig through, possibly hundreds of pages, with multiple images per, to update the breakpoints and resolutions. Surely there’s a better way to manage breakpoints on a global level rather than burying the specifics within the elements or attributes themselves.
Unfortunately, no one has suggested a better way yet, likely because most of the global rules for a site exist in CSS, which is parsed much later by the browser.
A progressive image format
Surely all these problems point out that we shouldn’t be messing around with breakpoints in HTML anyways. What we really need is a new progressive image format (or perhaps an old format used in a new way).
Use a progressive image format and HTTP range requests. Ideally the image metadata at the start of the file would include some hints about how many bytes to download to get an exact image size. The browser can then download the smallest size equal to or greater than the dimensions it needs based on the layout’s width as specified in CSS.
Unfortunately, this description also demonstrates why this approach will battle the lookahead pre-parser. How will the browser know when to stop downloading the image? It will know based on the size of the image in the layout.
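Here is a minimal sketch of that selection step, assuming a made-up metadata shape. The `SizeHint` interface, the function name `rangeForLayoutWidth`, and the byte counts are hypothetical and do not describe any real image format. The `Range: bytes=first-last` request header itself is standard HTTP, with both byte positions inclusive and counted from zero.

```typescript
// Hypothetical metadata: for each decodable width, how many bytes of the
// progressive file must be downloaded. No real image format is implied.
interface SizeHint {
  widthPx: number;
  bytes: number;
}

// Returns the Range header value for the smallest rendition at least as
// wide as the layout needs, or for the whole file if none is wide enough.
function rangeForLayoutWidth(hints: SizeHint[], layoutWidthPx: number): string {
  const sorted = [...hints].sort((a, b) => a.widthPx - b.widthPx);
  const match = sorted.find((hint) => hint.widthPx >= layoutWidthPx);
  // "bytes=0-" asks for everything from the first byte to the end.
  return match ? `bytes=0-${match.bytes - 1}` : "bytes=0-";
}

const hints: SizeHint[] = [
  { widthPx: 320, bytes: 20000 },
  { widthPx: 640, bytes: 70000 },
  { widthPx: 1280, bytes: 240000 },
];

console.log(rangeForLayoutWidth(hints, 500)); // bytes=0-69999
```

The second argument, `layoutWidthPx`, is exactly the value the pre-parser does not have when it wants to start the request.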
And so on and so on
I could list other proposed solutions. Each has its own merits and problems, but they all suffer from the same conflict: the proper size of a responsive image isn’t known until the layout is complete, which is too late for the lookahead pre-parser.
No intrinsic cut off points for images
What if we’re complicating this too much? Perhaps we shouldn’t try to replicate the breakpoints in HTML. Instead, we should simply supply different sizes of the image and let the browser decide on the best version.
There are two problems with this idea.
- How does the browser know when to download each image? At the time of the lookahead pre-parser, it doesn’t know what size the image will be. So it will need you to tell it when to use each size image.
- If you’re not tying the selection of images to the breakpoints in your design, how do you decide when you should switch images? Should you create three versions of each image? Five? Ten? Should you switch them at 480 because that is the iPhone width in landscape and then lament your decision if rumors of a taller iPhone screen come to pass?
The problem is there is nothing intrinsic to the image that would guide you in deciding where you should switch from one size of the image to another. Whatever we select will be entirely arbitrary unless we base it on our design breakpoints.
What matters more: lookahead pre-parser or responsive images?
Since coming to the realization that the real conflict is between the lookahead pre-parser and responsive images, I’ve been wondering which we should prioritize.
The lookahead pre-parser has been essential to providing a better experience for users given the way that web pages have been traditionally coded. Doing anything that prevents the pre-parser from working seems like a step backward.
At the same time, while it may seem that responsive images are an author issue, the biggest impact is felt by users. Downloading images that are larger than the size at which they are displayed makes pages load more slowly. It is possible that the performance gains from the pre-parser could be lost again due to unnecessarily large image downloads.
For existing web content, the lookahead pre-parser is undoubtedly the fastest way to render the page. But if web development moves towards responsive images as standard practice, then delaying the download of images until the proper size of the image in the layout can be determined may actually be faster than using the lookahead pre-parser. The difference in size between a retina image for iPad and an image used on a low-resolution mobile phone is significant.
It seems like this tradeoff ought to be measurable in some way so we can quantify what the impact would be. I’m not skilled enough to construct that test, but hopefully others can help evaluate it so we can make an informed decision.
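This is not that test, but a back-of-the-envelope model at least shows the shape of the tradeoff. The sketch assumes an image finishes loading at the time its request starts plus bytes divided by bandwidth. It ignores latency, caching, and competing requests, and every number in it is invented for illustration rather than measured.

```typescript
// Hypothetical model: finish time = request start + bytes / bandwidth.
// All values below are made up; nothing here is a measurement.
interface ImageRequest {
  startMs: number;    // when the browser is able to issue the request
  imageBytes: number; // size of the file it asks for
}

function finishTimeMs(request: ImageRequest, bytesPerSecond: number): number {
  return request.startMs + (request.imageBytes * 1000) / bytesPerSecond;
}

// Pre-parser: starts early but has to fetch the large image.
const eager: ImageRequest = { startMs: 100, imageBytes: 400000 };
// Wait for layout: starts later but fetches an image sized to fit.
const deferred: ImageRequest = { startMs: 600, imageBytes: 100000 };

for (const bytesPerSecond of [125000, 1250000]) {
  console.log(
    bytesPerSecond,
    finishTimeMs(eager, bytesPerSecond),
    finishTimeMs(deferred, bytesPerSecond)
  );
}
// 125000 3300 1400
// 1250000 420 680
```

With these invented inputs, waiting for layout wins on the slow connection and the pre-parser wins on the fast one. Where the crossover falls for real pages is the part that needs actual measurement.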
Two years later and I finally understand the problem
It has been two years since I first started looking at images in responsive designs. It seemed simple. The <img> tag had one src and we needed multiple sources.
Until the WhatWG got fully engaged in this question, I thought I understood the problem. Now I realize it was much bigger and more difficult than I originally thought.
We have an existential problem here. A chicken and egg conundrum.
How do we reconcile a pre-parser that wants to know what size image to download ahead of time with an image technique that wants to respond to its environment once the page layout has been calculated?
I don’t know what the answer is, but I’m very curious to see what we decide.
Over the last few weeks many more web developers and designers have become engaged in the conversation surrounding responsive images. On the whole, this is great news because the more people we have telling browser makers that this is a legitimate issue, the more likely it is to get addressed quickly.
However, some of the conversations about responsive images end up going in circles because people are talking past each other. I believe this is happening because we don’t have a common framework to look at the problem.
I believe there are two separate, but related issues that need to be solved regarding the use of the img element in responsive designs. They are:
1. How do we enable authors so that they can display different images under different conditions based on art direction?
To understand this issue, it helps to look at a specific use case. Take for example the following photo of President Obama speaking at a Chrysler plant.
When the image is displayed at larger sizes, it makes sense for the image to show the automobile factory in the background. The background helps explain where the event took place and adds to the image. But look what happens when we scale the image down to fit a smaller screen.
At that size, you can barely recognize Obama. You can’t make out his face. Instead of simply resizing the image, it may make sense to crop the image to get rid of some of the background and focus in on him. The end result is an image that works better at the smaller size:
This is what I refer to as enabling art direction. Authors need to be able to provide different sources for images at different sizes not based on resolution or based on network speed, but based on the judgment of the designer for what is the best image at a particular breakpoint.
As an aside, showing photographs at different sizes to illustrate a point is more difficult when you’re dealing with flexible images in a responsive design. If those examples don’t make sense on your phone, I’m afraid you may have to look at it on a wider screen to see what I’m talking about! :-)
2. How do we enable authors to provide different resolutions of images based on a variety of conditions?
When people talk about how to handle images for retina displays, they are talking about this second issue. The same is true of making decisions about the size of images based on bandwidth. All are dealing with how to deliver different resolutions based on conditions (pixel density, network speed, etc.).
Apple’s proposed addition to CSS4, image-set, is designed to solve this issue in CSS. Authors define images at different densities and the browser picks the best one based on some criteria that could be as simple as pixel density or as complex as a combination of observed network speed, pixel density, and user preference. What the criteria are remains to be defined.
Where the <picture> element fits
The proposed <picture> element attempts to solve issue #1. It focuses on how to give authors the ability to specify different images, but doesn’t do anything about pixel density or bandwidth.
I’ve seen a lot of feedback on the <picture> element that says we should have a new type of image format to replace JPEG, PNG and GIF that would be resolution independent. That would be awesome. And it would solve issue #2, but it wouldn’t help with the art direction outlined in issue #1.
When we discuss various solutions, it behooves us to figure out which issue we’re trying to solve. We can also debate whether or not the two issues I outlined are legitimate or if there are other issues that aren’t addressed by them.
But in order to have a fruitful discussion about how to solve these issues, we need to be clear about which issues we’re talking about or we’ll end up wasting more time. What I often see in comment threads and on Twitter is two people debating different solutions for responsive images, both looking at different issues, and neither realizing that the other isn’t looking at the same problem.
My hope is that by defining these issues, we can stop spinning our wheels and have more successful discussions.
Over the last few months, I’ve been proud to represent Portland as the WebVisions conference branched out into other cities. It’s been awesome to see “the little conference that could” find new communities eager to welcome it.
Despite that fact, I’ll freely admit that I have a special affection for the Portland version of WebVisions. It was the first conference I attended and the first one I spoke at. There are talks that I saw at WebVisions that still influence my work to this day.
This is the 12th year of the conference. The organizers bring in fantastic speakers and do so at a price that is much less expensive than what it costs to attend most other conferences. At my previous job, I would often take our entire team to WebVisions.
This year is no different. The conference starts on May 16th. Lyza and I will be giving a workshop showing desktop web developers how to convert their skills to mobile. I’m also giving a talk called Casting Off Our Desktop Shackles. We’d love to see you at both!
If you’re in the Portland area, I encourage you to attend WebVisions. I love the community around the event. And I like playing host to all of the smart people who travel to our city to attend or speak at WebVisions. If you do attend, please say hello.