User Interface Design :: Lessons :: Output
Today's hall of fame or shame candidate is the Domino's Pizza build-your-own-pizza process. You can try it yourself by going to the Domino's website and clicking Order to start an order (you'll have to fill in an address to get to the part we care about, the pizza-building UI).
On your own think aboult the following aspects of usability:
- user control & freedom
This lesson continues our look into the mechanics of implementing user interfaces, by considering output in more detail.
There are basically three ways to represent the output of a graphical user interface.
Objects is the same as the view tree we discussed previously. Parts of the display are represented by view objects arranged in a spatial hierarchy, with automatic redraw propagating down the hierarchy. There have been many names for this idea over the years; the GUI community hasn't managed to settle on a single preferred term.
Strokes draws output by making procedure calls to high-level drawing primitives, like drawLine, drawRectangle, drawArc, and drawText.
Pixels regards the screen as an array of pixels and deals with the pixels directly.
All three output approaches appear in virtually every modern GUI application. The object approach always appears at the very top level, for windows, and often for graphical objects within the windows as well. At some point, we reach the leaves of the view hierarchy, and the leaf views draw themselves with stroke calls. A graphics package then converts those strokes into pixels displayed on the screen. For performance reasons, an object may short-circuit the stroke package and draw pixels on the screen directly. On Windows, for example, video players do this using the DirectX interface to have direct control over a particular screen rectangle.
What approach do each of the following representations use? HTML (object); Postscript laser printer (stroke input, pixel output); plotter (stroke input and output); PDF (stroke); LCD panel (pixel).
Since virtually every GUI uses all three approaches, the design question becomes: at which points in your application do you want to step down into a lower-level kind of output? Here's an example. Suppose you want to build a view that displays a graph of nodes and edges.
One way to do it would represent each node and each edge in the graph by a component (as in the tree on the right). Each node in turn might have two components, a circle and a text label. Eventually, you'll get down to the primitive objects available in your GUI toolkit. Most GUI toolkits provide a text label; most don't provide a primitive circle. (One notable exception is SVG, which has component equivalents for all the common drawing primitives.) This would be a pure object approach, at least from your application's point of view – stroke output and pixel output would still happen, but inside primitive objects that you took from the library.
Alternatively, the top-level window might have no subcomponents. Instead, it would draw the entire graph by a sequence of stroke calls: drawCircle for the node outlines, drawText for the labels, drawLine for the edges. This would be a pure stroke.
Finally, your graph view might bypass stroke drawing and set pixels in the window directly. The text labels might be assembled by copying character images to the screen. This pure pixel approach is rarely used nowadays, because it's the most work for the programmer, but it used to be the only way to program graphics.
Hybrid approaches for the graph view are certainly possible, in which some parts of the output use one approach, and others use another approach. The graph view might use objects for nodes, but draw the edges itself as strokes. It might draw all the lines itself, but use label objects for the text.
There are a number of things to think about when choosing an output model:
- Layout: Objects remember where they were put, and draw themselves there. They also support automatic layout. With strokes or pixls, you have to figure out (at drawing time) where each piece goes, and put it there.
- Input: Objects participate in event dispatch and propagation, and the system automatically does hit-testing (determining whether the mouse is over the component when an event occurs) for objects, but not for strokes. If a graph node is an object, then it can receive its own click and drag events. If you stroked the node instead, then you have to write code to determine which node was clicked or dragged.
- Redraw: An automatic redraw algorithm means that components redraw themselves automatically when they have to. Furthermore, the redraw algorithm is efficient: it only redraws components whose extents intersect the damaged region. The stroke or pixel model would have to do this test by hand. In practice, most stroked objects don't bother, simply redrawing everything whenever some part of the view needs to be redrawn.
- Drawing order: It's easy for a parent to draw before (underneath) or after (on top of) all of its children. But it's not easy to interleave parent drawing with child drawing. So if you're using a hybrid model, with some parts of your view represented as components and others as strokes, then the components and strokes generally fall in two separate layers, and you can't have any complicated layering relationships between strokes and components.
- Heavyweight objects: Objects may be big -- even an object with no fields costs about 20 bytes in Java. As we've seen, the view tree is overloaded not just with drawing functions but also with event dispatch, automatic redraw, and automatic layout, so the properties and state used by those processes further bulks up the class. Views derived from large amounts of data – say, a 100,000- node graph – generally can't use an object for every individual data item. The "flyweight" pattern can help, by storing redundant information in the object's context (i.e., its parent) rather than in each component, but few toolkits support flyweight objects.
- Device dependence: The stroke approach is largely device independent. In fact, it's useful not just for displaying to screens, but also to printers, which have dramatically different resolution. The pixel approach, on the other hand, is extremely device dependent. A directly-mapped pixel image won't look the same on a screen with a different resolution.
As we said earlier, almost every GUI program uses all three approaches. At the highest level, a typical program presents itself in a window, which is an object. At the lowest level, the window appears on the screen as a rectangle of pixels. So a series of steps has to occur that translates that window object (and all its descendents in the view tree) into pixels.
The step from objects down to strokes is usually called drawing. We'll look at that first.
The step from strokes down to pixels is called rasterization (or scan conversion). The specific algorithms that rasterize various shapes are beyond the scope of this course but we'll talk about some of the effects of rasterization, and what you need to know as a UI programmer to control those effects.
Here's how drawing works in the object approach. Drawing is a top-down process: starting from the root of the view tree, each object draws itself, then draws each of its children recursively. The process is optimized by passing a clipping region to each object, indicating the area of the screen that needs to be drawn. Children that do not intersect the clipping region are simply skipped, not drawn. In the example above, nodes B and C would not need to be drawn. When an object partially intersects the clipping region, it must be drawn – but any strokes or pixels it draws when the clipping region is in effect will be masked against the clip region, so that only pixels falling inside the region actually make it onto the screen.
For the root, the clipping region might be the entire screen. As drawing descends the tree, however, the clipping region is intersected with each object's bounding box. So the clipping region for an object deep in the tree is the intersection of the bounding boxes of its ancestors.
For high performance, the clipping region is normally rectangular, using bounding boxes rather than the graphical object's actual shape. But it doesn't have to be that way. A clipping region can be an arbitrary shape on the screen. This can be very useful for visual effects: e.g., setting a string of text as your clipping region, and then painting an image through it like a stencil. Postscript was the first stroke model to allow this kind of nonrectangular clip region. Now many graphics toolkits support nonrectangular clip regions. For example, on Microsoft Windows and X Windows, you can create nonrectangular windows, which clip their children into a nonrectangular region.
Here's an example of the redraw algorithm running on the graph window (starting with the clipping region shown earlier).
- First the clip region is intersected with the whole window's bounding box, and the window is told to draw itself within that intersection. The window draws its titlebar and its gray background. The window background effectively erases the previous contents of the window. 2.The window's clip region is now intersected with its first child's bounding box (Node A), and Node A is told to draw itself within that. In this particular example (where nodes are represented by circle and label components), Node A doesn't do any of its own drawing; all the drawing will be handled by its children.
- Now Node A's circle child is told to draw itself. In this case, the circle has the same bounding box as Node A itself, so it receives the same clip region that Node A did. It draws a white circle.
- Now Node A's label child is told to draw itself, again using the same clip region because it has the same bounding box. It draws text on top of the circle just drawn.
- Popping back up the tree, the next child of the window, Edge A-B, is told to draw itself, using the clip region that intersects its own bounding box with the window's clip region. Only part of the edge falls in this clip region, so the edge only draws part of itself.
- The next child of the window, Node B, doesn't intersect the window's clip region at all, so it isn't told to draw itself.
- The algorithm continues through the rest of the tree, either drawing children or skipping them depending on whether they intersect the clip region. (Would Edge A-C be drawn? Would Node C be drawn?)
Note that the initial clip region passed to the redraw algorithm will be different every time the algorithm is invoked. Clip regions generally come from damage rectangles, which will be explained in a moment.
When the bounding boxes of two objects overlap, like the circle and label components in the previous example, the redraw algorithm induces an ordering on the objects that makes them appear layered, one on top of the other. For this reason, 2D graphical user interfaces are sometimes called 2.5D. They aren't fully 3D, in which objects have x, y, and z coordinates; instead the z dimension is merely an ordering, called z order.
Z order is a side-effect of the order that the objects are drawn when the redraw algorithm passes over the tree. Since drawing happens top-down, parents are generally drawn underneath children (although parents get control back after their children finish drawing, so a parent can draw some more on top of all its children if it wants). Older siblings (with lower indexes in their parent's array of children) are generally drawn underneath younger ones. Java Swing is a curious exception to this – its redraw algorithm draws the highest-index child first, so the youngest sibling ends up on the bottom of the z order.
Z order can be affected by rearranging the tree, e.g. moving children to a different index position within their parent, or promoting them up the tree if necessary. This is often important for operations like drag-and-drop, since we generally want the object being dragged to appear on top of other objects.
When a graphical object needs to change its appearance, it doesn't repaint itself directly. It can't, because the drawing process has to occur top-down through the view tree: the object's ancestors and older siblings need to have a chance to paint themselves underneath it. (So, in Java, even though a graphical object can call its own paint() method directly, you generally shouldn't do it!)
Instead, the object asks the graphics system to repaint it at some time in the future. This request includes a damaged region, which is the part of the screen that needs to be repainted. Often, this is just the entire bounding box of the object; but complex objects might figure out which part of the screen corresponds to the part of the model that changed, so that only that part is damaged.
The repaint request is then queued for later. Multiple pending repaint requests from different objects are consolidated into a single damaged region, which is often represented just as a rectangle – the bounding box of all the damaged regions requested by individual objects. That means that undamaged screen area is being considered damaged, but there's a tradeoff between the complexity of the damaged region representation and the cost of repainting.
Eventually – usually after the system has handled all the input events (mouse and keyboard) waiting on the queue -- the repaint request is finally satisfied, by setting the clipping region to the damaged region and redrawing the view tree from the root.
There's an unfortunate side-effect of the automatic damage/redraw algorithm. If we draw a view tree directly to the screen, then moving an object can make the screen appear to flash – objects flickering while they move, and nearby objects flickering as well.
When an object moves, it needs to be erased from its original position and drawn in its new position. The erasure is done by redrawing all the objects in the view hierarchy that intersect this damaged region; typically the drawing of the window background is what does the actual erasure. If the drawing is done directly on the screen, this means that all the objects in the damaged region temporarily disappear, before being redrawn. Depending on how screen refreshes are timed with respect to the drawing, and how long it takes to draw a complicated object or multiple layers of the hierarchy, these partial redraws may be briefly visible on the monitor, causing a perceptible flicker.
Double-buffering solves this flickering problem. An identical copy of the screen contents is kept in a memory buffer. (In practice, this may be only the part of the screen belonging to some subtree of the view hierarchy that cares about double-buffering.) This memory buffer is used as the drawing surface for the automatic damage/redraw algorithm. After drawing is complete, the damaged region is just copied to screen as a block of pixels. Double-buffering reduces flickering for two reasons: first, because the pixel copy is generally faster than redrawing the view hierarchy, so there's less chance that a screen refresh will catch it half-done; and second, because unmoving objects that happen to be caught, as innocent victims, in the damaged region are never erased from the screen, only from the memory buffer.
It's a waste for every individual view to double-buffer itself. If any of your ancestors is doublebuffered, then you'll derive the benefit of it. So double-buffering is usually applied to top-level windows.
Why is it called double-buffering? Because it used to be implemented by two interchangeable buffers in video memory. While one buffer was showing, you'd draw the next frame of animation into the other buffer. Then you'd just tell the video hardware to switch which buffer it was showing, a very fast operation that required no copying and was done during the CRT's vertical refresh interval so it produced no flicker at all.
In our description of the redraw algorithm, we said a graphical object "draws itself," meaning that it produces strokes to show itself on the screen. How that is actually done depends on the GUI toolkit you're using.
In Java Swing (and many other desktop GUI toolkits, like Win32 and Cocoa), every object has a drawing method. In Swing, this method is paint(). The redraw algorithm operates by recursively calling paint() down the view hierarchy. Objects can override the paint() method to change how they draw themselves. In fact, Swing breaks the paint() method down into several overridable template methods, like paintComponent() and paintChildren(), to make it easier to affect different parts of the redraw process.
The DOM in web browsers has no drawing method available to override – the redraw algorithm is hidden from the programmer, much like the event loop is hidden by many toolkits. Instead, you make a sequence of stroke calls into the object, and the object records this sequence of calls. Subsequently, whenever the object needs to redraw itself, it just plays back the recorded sequence of stroke calls. This approach is sometimes called retained graphics.
A key difference between these approaches is when stroke calls can be made. With the drawing method approach, drawing should only be done while the drawing method is active. Drawing done at a different time (like during an event handler) will not interact correctly with the redraw algorithm; it won't respect z order, and it will be ephemeral, overwritten and destroyed the next time the redraw algorithm touches that object. With the retained graphics approach, however, the stroke calls can be recorded at any time, and the toolkit automatically handles playing them back at the right point in the redraw. The retained graphics approach tends to be less error prone for a programmer; drawing at the wrong time is a common mistake for beginning Swing programmers.
A potential downside of the retained graphics approach is performance. The recorded strokes must be stored in memory. Although this recording is not as heavyweight as a view tree (since it doesn't have to handle input or layout, or even necessarily be represented as objects), you probably wouldn't want to do it with millions of stroke calls. So if you had an enormous view (like a map) being displayed inside a scrolling pane (so that only a small part of it was visible on screen), you wouldn't want to stroke the entire map. The drawing method approach gives more control over this; since you have access to the clip region in the drawing method, you can choose not to render strokes that would be clipped. To do the equivalent thing with retained graphics would put more burden on the programmer to determine the visible rectangle and rerecord the stroke calls every time this rectangle changed.
Now let's look at the drawing capabilities provided by the stroke model.
Every toolkit's stroke model has some notion of a drawing surface. The screen is only one possible place where drawing might go. Another common drawing surface is a memory buffer, which is an array of pixels just like the screen. Unlike the screen, however, a memory buffer can have arbitrary dimensions. The ability to draw to a memory buffer is essential for doublebuffering. Another target is a printer driver, which forwards the drawing instructions on to a printer. Although most printers have a pixel model internally (when the ink actually hits the paper), the driver often uses a stroke model to communicate with the printer, for compact transmission. Postscript, for example, is a stroke model.
Most stroke models also include some kind of a graphics context, an object that bundles up drawing parameters like color, line properties (width, end cap, join style), fill properties (pattern), and font.
The stroke model may also provide a current coordinate system, which can be translated, scaled, and rotated around the drawing surface. We've already discussed the clipping region, which acts like a stencil for the drawing. Finally, a stroke model must provide a set of drawing primitives, function calls that actually produce graphical output.
Many systems combine all these responsibilities into a single object. Java's Graphics object is a good example of this approach. In other toolkits, the drawing surface and graphics context are independent objects that are passed along with drawing calls.
When states like graphics context, coordinate system, and clipping region are embedded in the drawing surface, the surface must provide some way to save and restore the context. A key reason for this is so that parent views can pass the drawing surface down to a child's draw method without fear that the child will change the graphics context. In Java, for example, the context can be saved by Graphics.create(), which makes a copy of the Graphics object. Notice that this only duplicates the graphics context; it doesn't duplicate the drawing surface, which is still the same.
It's beyond the scope of this lesson to talk about algorithms for converting a stroke into pixels. But you should be aware of some important techniques for making strokes look good.
One of these techniques is antialiasing, which is a way to make an edge look smoother. Instead of making a binary decision between whether to color a pixel black or white, antialiasing uses a shade of gray whose value varies depending on how much of the pixel is covered by the edge. In practice, the edge is between two arbitrary colors, not just black and white, so antialiasing chooses a point on the gradient between those two colors. The overall effect is a fuzzier but smoother edge.
Subpixel rendering takes this a step further. Every pixel on an LCD screen consists of three discrete pixels side-by-side: red, green, and blue. So we can get a horizontal resolution which is three times the nominal pixel resolution of the screen, simply by choosing the colors of the pixels along the edge so that the appropriate subpixels are light or dark. It only works on LCD screens, not CRTs, because CRT pixels are often arranged in triangles, and because CRTs are analog, so the blue in a single "pixel" usually consists of a bunch of blue phosphor dots interspersed with green and red phosphor dots. You also have to be careful to smooth out the edge to avoid color fringing effects on perfectly vertical edges. And it works best for highcontrast edges, like this edge between black and white. Subpixel rendering is ideal for text rendering, since text is usually small, high-contrast, and benefits the most from a boost in horizontal resolution. Windows XP was the first version of Windows to use ClearType, an implementation of subpixel rendering for Windows fonts.
Finally, let's talk in more detail about what a pixel image looks like.
Put simply, it's a rectangular array of pixels – but pixels themselves are not always so simple. A pixel itself has a depth, so this model is really three dimensional. Depth is often expressed in bits per pixel (bpp). The simplest kind of pixel model has 1 bit per pixel; this is suitable for representing black and white images. It's also used for bitmasks, where the single-bit pixels are interpreted as boolean values (pixel present or pixel missing). Bitmasks are useful for clipping – you can think of a bitmask as a stencil.
Another kind of pixel representation uses each pixel value as an index into a palette, which is just a list of colors. In the 4-bpp model, for example, each of the 16 possible pixel values represents a different color. This kind of representation, often called Indexed Color, was useful when memory was scarce; you still see it in the GIF file format, but otherwise it isn't used much today.
The most common pixel representation is often called "true color" or "direct color"; in this model, each pixel represents a color directly. The color value is usually split up into multiple components: red, green, and blue. (Color components are also called channels or bands; the red channel of an image, for example, is a rectangular array of the red values of its pixels.)
A pixel model can be arranged in memory (or a file) in various ways: packed tightly together to save memory, or spread out loosely for faster access; with color components interleaved or separated; and scanned from the top (so that the top-left pixel appears first) or the bottom (the bottom-left pixel appearing first).
Many pixel models have a fourth channel in addition to red, green, and blue: the pixel's alpha value, which represents its degree of transparency. We'll talk more about alpha in a future lesson.
The primary operation in the pixel model is copying a block of pixels from one place to another – often called bitblt (pronounced "bit blit"). This is used for drawing pictures and icons on the screen, for example. It's also used for double-buffering – after the offscreen buffer is updated, its contents are transferred to the screen by a bitblt.
Bitblt is also used for screen-to-screen transfers. To do fast scrolling, for example, you can bitblt the part of the window that doesn't change upwards or downwards, to save the cost of redrawing it. (For example, look at Swing's JViewport.BLIT_SCROLL_MODE.)
It's also used for sophisticated drawing effects. You can use bitblt to combine two images together, or to combine an image with a mask, in order to clip it or composite them together.
Bitblt isn't always just a simple array copy operation that replaces destination pixels with source pixels. There are various different rules for combining the destination pixels with the source pixels. These rules are called compositing (alpha compositing, when the images have an alpha channel), and we'll talk about them in a later lesson.
Here are a few common image file formats. It's important to understand when to use each format. For user interface graphics, like icons, JPG generally should not be used, because it's lossy compression – it doesn't reproduce the original image exactly. When every pixel matters, as it does in an icon, you don't want lossy compression. JPG also can't represent transparent pixels, so a JPG image always appears rectangular in your interface.
For different reasons, GIF is increasingly unsuitable for interface graphics. GIF isn't lossy – you get the same image back from the GIF file that you put into it – but its color space is very limited. GIF images use 8-bit color, which means that there can be at most 256 different colors in the image. That's fine for some low-color icons, but not for graphics with gradients or blurs. GIF has limited support for transparency – pixels can either be opaque (alpha 1) or transparent (alpha 0), but not translucent (alpha between 0 and 1). So you can't have fuzzy edges in a GIF file, that blend smoothly into the background. GIF files can also represent simple animations.
PNG is the best current format for interface graphics. It supports a variety of color depths, and can have a full alpha channel for transparency and translucency. If you want to take a screenshot, PNG is the best format to store it.
A final word about debugging the output of a graphical user interface, which can sometimes be tricky. A common problem is that you try to draw something, but it never appears on the screen. Here are some possible reasons why.
- Wrong place: what's the origin of the coordinate system? What's the scale? Where is the component located in its parent?
- Wrong size: if a component has zero width and zero height, it will be completely invisible no matter what it tries to draw– everything will be clipped. Zero width and zero height are the defaults for all components in Swing – you have to use automatic layout or manual setting to make it a more reasonable size. Check whether the component (and its ancestors) have nonzero sizes.
- Wrong color: is the drawing using the same color as the background? Is it using 100% alpha, so that it's completely transparent?
- Wrong z-order: is something else drawing on top?