Basic X Concepts

Federico Mena Quintero

    federico@gimp.org
  

This article presents the basic concepts that programmers need to know about the architecture of the X Window System. It presents an overview of the X client and server concepts, asynchronous operation, drawables, visuals and colormaps, graphics contexts, and the basics of drawing and event-driven programming from the viewpoint of GTK+.


Table of Contents
Introduction
Client/Server Architecture
Drawables
Visuals and Colormaps
Graphics Contexts
Common Drawing Operations
References

Introduction

The X Window System[1] is a big and complex hairball. It is not designed to be a specific graphical user interface (GUI), but a generic system for building graphical user interfaces. The original X authors did not have a particular user interface model in mind, so they wanted to create the framework necessary to experiment with different types of user interfaces[2].

Despite its age, X has held up surprisingly well, adapting to many user interface paradigms. These include multiple tiled windows, overlapping windows, and custom turn-key “single application” systems.

In the desktop computer arena, the prevailing user interface model is that of multiple applications running simultaneously in separate, overlapping windows. This is the model that most people are accustomed to, and it is the model that the GNOME libraries are designed to be used for.

The GNOME libraries provide a large number of high-level abstractions and wrappers that make programming the X Window System easy. Still, programmers will be much more productive if they know some basic concepts about the architecture of X. The X Window System is notable for its asynchronous, network-transparent client/server model. It also has many important concepts related to how windows are represented and how drawing operations are performed on them.

This article explains the basic concepts that GNOME programmers should have in mind when programming the X Window System. It is not a replacement for the Xlib manual; you should definitely get your hands on the Xlib programming and reference manuals for complete information.


Client/Server Architecture

X has a client/server architecture. The X server typically runs on the user's console and takes care of managing the video display, the keyboard, and the mouse. X clients include application programs and window managers. Clients issue requests to the server; these include commands to create and destroy windows and draw graphical objects. This client/server architecture has several important characteristics, which are outlined below.

Asynchronous Operation. Requests from the clients to the X server are not synchronous. A client may queue a number of requests, keep on executing, and flush the requests to the server at a later point. Since the server runs in a separate process, it may get requests from different clients at unpredictable times. This means that clients must sometimes explicitly synchronize with each other if they need operations to occur in a certain order. Most of the time, however, applications can simply ignore asynchronous issues, since they only care that the X server performs their requests.

Network Transparency. X is network transparent; this means that a client does not care whether the X server is running on the same machine as itself. Clients running on different machines may connect to an X server. Applications do not need to do anything special to support this; however, they must sometimes be careful if they wish to share information via the X server, as in any networked environment. For instance, transferring local filenames usually does not make sense in a networked environment.

Client-side and Server-side Resources. Programmers with no previous experience in X may not realize that some things in X applications are server-side resources, while other things are client-side resources. This means that applications can generally share server-side resources, because they are identified uniquely within the server; however, they may not be able to share client-side resources without doing some extra work. Also, sometimes resources must be transferred from the client to the server or vice-versa, and this may lead to important performance considerations, especially if the client and the server are separated by a slow network link.
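
The asynchronous behaviour described above can be illustrated with a small C model. This is only a sketch: the Connection type and the connection_* functions are hypothetical names invented for this example, not Xlib or GDK calls. It shows a client that buffers requests, keeps running, and learns about a bad request only when the buffer is flushed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: Connection and the connection_* functions are
 * hypothetical and are not part of Xlib or GDK. */

enum { QUEUE_CAPACITY = 16 };

typedef struct {
    int pending[QUEUE_CAPACITY]; /* requests buffered on the client side */
    size_t n_pending;            /* how many requests are buffered */
    int processed;               /* requests the model server has processed */
    int first_error;             /* sequence number of the first bad request, or -1 */
} Connection;

void connection_init(Connection *c)
{
    assert(c != NULL);
    c->n_pending = 0;
    c->processed = 0;
    c->first_error = -1;
}

/* Hand every buffered request to the model server, in order.
 * A negative request value stands for an invalid request. */
void connection_flush(Connection *c)
{
    assert(c != NULL);
    for (size_t i = 0; i < c->n_pending; i++) {
        if (c->pending[i] < 0 && c->first_error < 0)
            c->first_error = c->processed; /* the error surfaces only now */
        c->processed++;
    }
    c->n_pending = 0;
}

/* Queue a request and return immediately; nothing is processed yet.
 * The buffer is flushed first if it is already full. */
void connection_queue(Connection *c, int request)
{
    assert(c != NULL);
    if (c->n_pending == QUEUE_CAPACITY)
        connection_flush(c);
    c->pending[c->n_pending++] = request;
}
```

In this model, queuing three requests of which the second is invalid leaves processed at 0 and first_error at -1 until connection_flush() runs; only then is the bad request noticed, long after the call that queued it returned.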


Drawables

Drawables are server-side resources which you can paint on[3]. X has two types of drawables, windows and pixmaps. Pixmaps are off-screen entities which you cannot see, and they are just hunks of raw pixel data. Windows can be visible on the screen if they are mapped. Windows can also be unmapped, which means they exist as data structures in the X server, but are not shown on the screen. Windows can be hidden and shown, or more properly, unmapped and mapped, at any time.

Windows can be nested in a hierarchical tree structure. All windows except the root window have a single parent, but all windows may have any number of children. Pixmaps cannot be nested and they don't have parents.

Windows have x/y/width/height properties that define their position within their parent. Pixmaps only have width and height properties.
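
Because a window's position is relative to its parent, finding where it sits relative to the root window means walking up the tree. The following C sketch models that idea. ModelWindow and model_window_root_position() are hypothetical names for this example, not X or GDK types, and the model ignores border widths for simplicity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a window; not an Xlib or GDK type. */
typedef struct ModelWindow {
    int x, y;                         /* position within the parent */
    int width, height;
    const struct ModelWindow *parent; /* NULL for the root window */
} ModelWindow;

/* Sum the parent-relative offsets from the window up to the root. */
void model_window_root_position(const ModelWindow *w, int *root_x, int *root_y)
{
    assert(w != NULL && root_x != NULL && root_y != NULL);

    int x = 0, y = 0;
    for (; w != NULL; w = w->parent) {
        x += w->x;
        y += w->y;
    }
    *root_x = x;
    *root_y = y;
}
```

For a button at (10, 20) inside a top-level window at (100, 50) on the root, this yields the root-relative position (110, 70).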

Windows and pixmaps have many other properties. One especially important property is the visual class of the drawable, which is discussed in the next section.

Windows and pixmaps are both server-side resources. Both are identified by simple numerical IDs which are unique within the server. Different clients can share drawables by passing their IDs to each other; an application could hand a pixmap ID to another application to let it draw on the pixmap[4].


Visuals and Colormaps

Visuals and colormaps together define how image data can be represented by the X server. Roughly, a visual defines the memory representation that a piece of hardware uses to store the contents of an image. A colormap defines a look-up table that is used to translate raw pixel information into RGB colors that are finally sent to the display.


Visual Classes

X supports different kinds of visuals to suit the different kinds of available hardware. There are three basic kinds of visuals, each divided into two classes:

  • Grayscale visuals. These are used for displays that use a single channel of color information. Black and white or grayscale monitors, including amber and green monitors, may use this type of visual. Grayscale visuals can be either static gray or grayscale.

    Static gray visuals are those in which you cannot change the gray intensities of the hardware. Plain monochrome (B/W) displays or fixed 4-gray displays may be of the static gray kind.

    Grayscale visuals are those in which you can change the gray intensities used by the hardware. Exotic 12-bit grayscale displays that let you change the gray intensities, such as the ones used for medical visualization, may be of the grayscale visual type.

  • Indexed visuals. These use the “paint-by-number” concept: each pixel value is an integer that indexes a table of colors. So 0 may represent black, 1 may represent pink, 2 may represent blue, and so on. These visuals can be either static color or pseudo color.

    Static color visuals are those in which you cannot change the actual colors that the indices correspond to. Old PC CGA cards with four fixed colors in graphics mode could be considered of the static color type.

    Pseudo color visuals are those in which you can change the actual colors that the indices correspond to. Each index maps to a red/green/blue (RGB) triplet that defines the color that will be displayed on the screen. You can change these RGB triplets for each index. Pseudo color visuals are very common in low-end graphics cards; for example, 256-color SVGA cards that let you change the individual colors in the palette are of the pseudo color visual type.

  • Color visuals. These are the “big fat ones”. They usually provide the highest quality you can get from the hardware, and they also consume the most resources in terms of speed and memory. Color visuals store explicit RGB values for every pixel, instead of storing a single value like indexed visuals. Color visuals can be either true color or direct color.

    True color visuals use the exact RGB values you specified for a pixel as the color that gets displayed on the screen for that pixel. Most “true color” SVGA cards are of this kind.

    The values in a direct color visual go through an indirection step before being sent to the display. Each of the R/G/B values you specify is an index in separate red, green, and blue tables. This means that an RGB triplet gets translated into an R'G'B' triplet, that is, the three tables together define an f(r, g, b)↦(r', g', b') function. For most purposes, your tables will be filled by the identity function and you will get linearly increasing intensity values for each of the RGB channels. Things can become quite interesting, however, when you modify the tables to have a nonlinear mapping. If you fill them using an exponential function, you can do color correction on hardware, for example. Most high-end hardware (Sun, HP, SGI) supports direct color visuals.
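
The indirection step of a direct color visual can be sketched in C as three 256-entry tables. The names below (DirectColorTables, ChannelTriplet, tables_*) are hypothetical and exist only for this example. To stay in integer arithmetic, the nonlinear mapping is a simple square curve rather than the exponential function mentioned above; that choice is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the three per-channel tables. */
typedef struct {
    uint8_t red[256];
    uint8_t green[256];
    uint8_t blue[256];
} DirectColorTables;

typedef struct { uint8_t r, g, b; } ChannelTriplet;

/* Identity mapping: every index maps to the same intensity. */
void tables_fill_identity(DirectColorTables *t)
{
    assert(t != NULL);
    for (int i = 0; i < 256; i++)
        t->red[i] = t->green[i] = t->blue[i] = (uint8_t)i;
}

/* Nonlinear mapping: intensity = i * i / 255 (a square curve). */
void tables_fill_square(DirectColorTables *t)
{
    assert(t != NULL);
    for (int i = 0; i < 256; i++)
        t->red[i] = t->green[i] = t->blue[i] = (uint8_t)((i * i) / 255);
}

/* Translate (r, g, b) into (r', g', b') through the three tables. */
ChannelTriplet tables_lookup(const DirectColorTables *t, ChannelTriplet in)
{
    assert(t != NULL);
    ChannelTriplet out = { t->red[in.r], t->green[in.g], t->blue[in.b] };
    return out;
}
```

With the identity tables a pixel value comes out unchanged; with the square curve, a mid-range index of 128 maps to an intensity of 64 while 0 and 255 stay fixed.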

To remind yourself of what the different visual classes mean, think in terms of static gray, static color, and true color having read-only intensity mappings; and grayscale, pseudo color, and direct color having read/write mappings.
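
The read-only versus read/write distinction can be modelled with a small palette structure in C. This is a sketch under stated assumptions: Palette, PaletteEntry, and the palette_* functions are hypothetical names, and the writable flag stands in for the difference between a static color visual and a pseudo color one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PALETTE_SIZE = 256 };

typedef struct { uint8_t r, g, b; } PaletteEntry;

/* Hypothetical model of an indexed visual's color table. */
typedef struct {
    PaletteEntry entries[PALETTE_SIZE];
    int writable; /* 1 models pseudo color, 0 models static color */
} Palette;

/* Try to change the color behind an index.
 * Returns 1 on success, 0 if the mapping is read-only. */
int palette_store(Palette *p, unsigned index, PaletteEntry color)
{
    assert(p != NULL && index < PALETTE_SIZE);
    if (!p->writable)
        return 0;
    p->entries[index] = color;
    return 1;
}

/* "Paint by number": a pixel value is just an index into the table. */
PaletteEntry palette_lookup(const Palette *p, uint8_t pixel)
{
    assert(p != NULL);
    return p->entries[pixel];
}
```

Every pixel that stores index 1 changes color as soon as entry 1 is rewritten, which is what makes read/write mappings both powerful and awkward to share between applications.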

The X server does not deal with RGB triplets directly because not all hardware thinks in terms of RGB triplets. X acts close to the metal in this respect, for both performance and historical reasons. However, the GNOME libraries make it easy for applications to think in terms of RGB images; the libraries will convert these to whatever representation the X server needs.

In addition to the visual class, each X visual has a bit depth. This is the number of significant bits that are used to encode the value of every pixel. Most 256-color PC video cards operate on an 8-bit pseudo color visual. Better video cards operate on 24-bit true color visuals, with eight bits of information per channel. Some Amiga video cards operate on 12-bit pseudo color visuals, which leads to a palette of 4096 indexed colors. Some exotic hardware uses 8-bit true color visuals, using 3/3/2 bits for the RGB channels, respectively.
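
The arithmetic behind bit depths can be sketched in C. The functions pack_rgb332() and colors_for_depth() are hypothetical helpers written for this example. The bit layout used here, with red in the top three bits, is an assumption; a real visual reports its own channel masks, which need not match.

```c
#include <assert.h>
#include <stdint.h>

/* Pack 8-bit channels into one 8-bit pixel using 3/3/2 bits for R/G/B.
 * Assumed layout: RRRGGGBB. Low-order bits of each channel are dropped. */
uint8_t pack_rgb332(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint8_t)(((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6));
}

/* Number of distinct pixel values that a given depth can encode. */
unsigned long colors_for_depth(unsigned depth)
{
    assert(depth < 32);
    return 1UL << depth;
}
```

A depth of 12 gives 4096 possible pixel values, matching the palette size quoted above, and pure red (255, 0, 0) packs to 0xE0 under the assumed layout.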

The best way to know about the visual types your hardware supports is to run the xdpyinfo program. You will get a load of interesting information. If you can get hold of a high-end video card and X server, run xdpyinfo on it so that you can see all the exotic visuals it supports.


Colormaps

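A colormap is the look-up table that translates raw pixel values into the RGB colors that are finally sent to the display. In the visual classes with read/write mappings (grayscale, pseudo color, and direct color) the entries of that table can be changed; in the static classes they are fixed.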


Relationship to Drawables

As we mentioned in the preceding section, each drawable has a visual class associated with it. This defines the low-level representation of image data for that drawable. It is important to note that if you want to copy image data between two drawables, they must have visuals with the same depth. In general you should guarantee that two drawables have the same visual when copying image data between them; if you try to copy data from an 8-bit pseudo color pixmap into a 24-bit true color window, you will get a BadMatch error from the X server[5].
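
The depth rule can be expressed as a simple check. The C sketch below is a model only: ModelDrawable, ModelStatus, and model_copy_area() are hypothetical names, and the function mimics the rule stated above rather than calling the X server.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes modelled on the error named in the text. */
typedef enum { MODEL_SUCCESS = 0, MODEL_BAD_MATCH = 1 } ModelStatus;

/* Hypothetical model of a drawable; only the depth matters here. */
typedef struct {
    unsigned depth; /* bits per pixel value */
    unsigned width, height;
} ModelDrawable;

/* Refuse to copy between drawables whose depths differ. */
ModelStatus model_copy_area(const ModelDrawable *src, const ModelDrawable *dst)
{
    assert(src != NULL && dst != NULL);
    if (src->depth != dst->depth)
        return MODEL_BAD_MATCH;
    return MODEL_SUCCESS;
}
```

Copying from an 8-bit pixmap to a 24-bit window fails in this model, while copying between two 24-bit drawables succeeds.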


Graphics Contexts

X supports many different parameters for drawing operations. For example, when drawing lines you could configure the foreground color for drawing, the line style (solid or dashed), the stipple pattern, the line width, join and cap styles for polylines, and other parameters. It would be inconvenient to have to pass all of the possible parameters to the line drawing function every time. Graphics contexts solve this problem.

A graphics context, or GC for short, is a record structure that describes a set of drawing parameters like foreground color and line style. When you invoke a drawing function, you pass in a GC so that X will know what options to use for that drawing operation. GCs can be reused for multiple drawing operations; this means that you can configure a GC just once with the desired drawing parameters and use it for many drawing operations.

For example, the prototype for gdk_draw_line() looks like this:

void gdk_draw_line(GdkDrawable *drawable, GdkGC *gc, gint x1, gint y1, gint x2, gint y2);

Here, the gc argument specifies the GC to be used when drawing the line. The GC parameters that apply to lines include foreground color, line width, cap style, dash pattern, stipple pattern, and other miscellaneous parameters like a clipping region[6].

When you create a GC, you specify a drawable whose visual class will also be used for the GC. You can then use this GC to paint on drawables that have the same depth as the original one; if you need to paint on a drawable with a different depth, you will need to create a new GC suited for it.
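
The idea of a GC as a reusable record of drawing parameters can be modelled in a few lines of C. The types and the model_draw_line() function below are hypothetical and only imitate the shape of the gdk_draw_line() prototype shown above; they record what would be drawn instead of talking to the X server.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { LINE_SOLID, LINE_DASHED } ModelLineStyle;

/* Hypothetical stand-in for a graphics context. */
typedef struct {
    unsigned long foreground;
    int line_width;
    ModelLineStyle line_style;
} ModelGC;

/* One recorded line together with the parameters it was drawn with. */
typedef struct {
    int x1, y1, x2, y2;
    unsigned long foreground;
    int line_width;
    ModelLineStyle line_style;
} RecordedLine;

enum { MAX_LINES = 8 };

typedef struct {
    RecordedLine lines[MAX_LINES];
    size_t n_lines;
} ModelCanvas;

/* The caller passes only coordinates; everything else comes from the GC. */
void model_draw_line(ModelCanvas *canvas, const ModelGC *gc,
                     int x1, int y1, int x2, int y2)
{
    assert(canvas != NULL && gc != NULL);
    assert(canvas->n_lines < MAX_LINES);

    RecordedLine *line = &canvas->lines[canvas->n_lines++];
    line->x1 = x1;
    line->y1 = y1;
    line->x2 = x2;
    line->y2 = y2;
    line->foreground = gc->foreground;
    line->line_width = gc->line_width;
    line->line_style = gc->line_style;
}
```

Configuring the record once and passing it to several calls gives every line the same width and style; changing one field of the record affects only the lines drawn afterwards.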


Common Drawing Operations

This section describes several common drawing operations that applications may need to perform. You should choose the drawing model that best fits your application.


Data Transfer and Drawables

X provides several drawing primitives such as lines, rectangles, polygons, ellipses, and text. The requests used to draw these primitives are small and can be transferred very quickly over network links. Also, the X server can often use hardware functions to draw these primitives, making them very fast.

In other situations applications may need to draw images such as photographs or icons. These must be converted to a representation the X server understands, and the result has to be transferred over the wire from the client to the server. Images can be big, so this process could be slow over network links. When the X server and the client are running on the same machine, applications can often use the MIT shared memory extension so that images can be ‘transferred’ using shared memory, for increased performance. However, application writers need to keep in mind that applications that transfer big images over the wire could be slow when run across network links.

Similarly, sometimes applications will need to fetch image data from the server to the client. This could be used to take screenshots or otherwise retrieve the pixel contents of a drawable. This is the same situation as in the previous paragraph but in reverse — image data has to be transferred from the server to the client, instead of the other way around. Applications that need to do this often and for large images could also be slow over network links.


Drawing Primitives to Drawables

This is the most common drawing model, and applications can use it to draw most of their displays. This involves creating windows and drawing to them using the standard X primitives. Often it is a good idea to create a temporary off-screen pixmap, then draw whatever needs to be redrawn to it, and then copy the contents of that pixmap to the final on-screen window in a single operation. Since the final contents of the display are copied in a single operation, from the pixmap to the window, no flicker will appear.
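
The off-screen drawing technique described above can be modelled with two small pixel arrays in C. This is a sketch, not GDK code: ModelSurface and the surface_* functions are hypothetical names, with one array standing in for the on-screen window and the other for the temporary pixmap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { FB_WIDTH = 8, FB_HEIGHT = 4 };

/* Hypothetical model of a drawable's pixel contents. */
typedef struct {
    unsigned char pixels[FB_HEIGHT][FB_WIDTH];
} ModelSurface;

void surface_clear(ModelSurface *s, unsigned char value)
{
    assert(s != NULL);
    memset(s->pixels, value, sizeof s->pixels);
}

/* A stand-in for any drawing primitive. */
void surface_fill_rect(ModelSurface *s, int x, int y, int w, int h,
                       unsigned char value)
{
    assert(s != NULL);
    assert(x >= 0 && y >= 0 && w >= 0 && h >= 0);
    assert(x + w <= FB_WIDTH && y + h <= FB_HEIGHT);
    for (int row = y; row < y + h; row++)
        for (int col = x; col < x + w; col++)
            s->pixels[row][col] = value;
}

/* Copy the finished off-screen contents to the window in one step. */
void surface_present(const ModelSurface *offscreen, ModelSurface *window)
{
    assert(offscreen != NULL && window != NULL);
    memcpy(window->pixels, offscreen->pixels, sizeof window->pixels);
}
```

However many primitives are drawn to the off-screen surface, the window keeps its old contents until surface_present() runs, so a viewer never sees a half-drawn frame.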

If only pixmaps, windows, and the normal drawing primitives are used, then everything is done server-side and can generally be considered fast. Server-side drawing primitives can often be done in hardware by the X server. Thus applications that use this method of drawing will most likely run quickly even over a network link, since only small X requests have to be transferred over the wire.


Drawing to an RGB Buffer

Some applications like image manipulation programs and games may need to deal with images such as RGB buffers directly. Since X does not deal with these, client-side RGB buffers have to be converted to whatever data format is defined by the visual class of the destination drawable. For good visual results, this could involve color reduction, remapping, and dithering.
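
As one concrete instance of dithering, the C sketch below reduces 8-bit gray values to 1-bit pixels using ordered dithering with a 2x2 threshold matrix. It illustrates the general technique only and is not claimed to be the algorithm any particular library uses; the function name dither_gray_to_1bit() and the threshold values are choices made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 2x2 ordered-dither thresholds, spread evenly over the 0..255 range. */
static const uint8_t dither_thresholds[2][2] = {
    {  32, 160 },
    { 224,  96 }
};

/* Convert a width x height gray image to 0/1 pixels.
 * Each pixel is compared with the threshold for its position in the
 * repeating 2x2 tile, so mid grays become a pattern of on and off pixels. */
void dither_gray_to_1bit(const uint8_t *gray, uint8_t *out,
                         size_t width, size_t height)
{
    assert(gray != NULL && out != NULL);
    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            size_t i = y * width + x;
            out[i] = gray[i] > dither_thresholds[y & 1][x & 1] ? 1 : 0;
        }
    }
}
```

A uniform mid gray of 128 turns on two of the four pixels in each 2x2 tile, so the average brightness is preserved even though only two output levels are available.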

The GdkRGB library, which is part of GDK, provides an easy way to render RGB buffers to drawables. GdkRGB performs color reduction, mapping, and dithering, and transfers the final image data to the X server. It will automatically use shared memory to do this if the client and the X server are running on the same machine, or it will use the default slower method if they are running on different machines.

Whether this is fast or slow depends on several factors. Consider a plotting program that must plot millions of data points, such as for a very detailed graph or for a 3D point cloud. If one used the normal gdk_draw_point() to draw every single point, this could lead to a very large number of X requests. Remember that these still have to be transferred over the wire. Say that the size of the combined requests is A. Now consider a program that drew the point cloud to a client-side RGB image, of size B. If B is less than A, then it may actually be faster to create an RGB image and transfer it over the wire than to issue an extremely large number of point-drawing requests.
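
The comparison between A and B is simple arithmetic, sketched here in C. The function names are hypothetical, and the per-request and per-pixel sizes are parameters precisely because they are assumptions: the real figures depend on the protocol encoding, on any batching done by the client library, and on the destination visual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A: total bytes for n_points separate point-drawing requests. */
size_t point_requests_bytes(size_t n_points, size_t bytes_per_request)
{
    assert(bytes_per_request == 0 || n_points <= SIZE_MAX / bytes_per_request);
    return n_points * bytes_per_request;
}

/* B: bytes for one image of the given size. */
size_t rgb_image_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    assert(height == 0 || width <= SIZE_MAX / height);
    size_t pixels = width * height;
    assert(bytes_per_pixel == 0 || pixels <= SIZE_MAX / bytes_per_pixel);
    return pixels * bytes_per_pixel;
}

/* Nonzero when sending the image moves fewer bytes than the requests. */
int image_transfer_is_smaller(size_t a, size_t b)
{
    return b < a;
}
```

Assuming, purely for illustration, 16 bytes per point request and 3 bytes per pixel, five million points cost 80,000,000 bytes as individual requests, while an 800x600 image costs 1,440,000 bytes, so the image wins. With only a thousand points the requests total 16,000 bytes and the balance tips the other way.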

References

Xlib Programming Manual.

Notes

[1]

Correct ways of calling it include “X Window System” and “X”, not “X Windows”.

[2]

Many thanks to Jim Gettys, one of the original authors of the X Window System, for this insight.

[3]

Strictly speaking, drawing is done using the appropriate X protocol requests, just as everything else in X. Protocol requests are nicely wrapped with the Xlib API, and GNOME in turn wraps Xlib using the GDK library for added portability and convenience.

[4]

To do this, however, an application that created the pixmap may need to explicitly synchronize with the server so that all X requests have been flushed to the server and executed before the application hands the pixmap ID to the other client.

[5]

Moreover, due to the asynchronous nature of X, this error could be reported at a later stage in the program — say your program does such an illegal operation, then keeps executing, then flushes its X request queue; then the error would appear to be reported after you called the drawing function that caused the error. You can pass the GNOME programs with the

[6]

The prototype for XDrawLine() looks very similar to the one for gdk_draw_line(). The only difference is the addition of a display parameter that tells X which display to draw on. At this time GDK does not support multiple simultaneous displays in a single program, so its drawing functions do not take a display parameter.