Bug Blog: Affine Transformations

I am implementing a mouse based graph editor as part of an assignment. I am of course taking about mathematical graphs with nodes and edges, not statistical charts.

An important feature of this application is the ability to pan the canvas by shift dragging the mouse. This seems quite easy at first, since good graphics libraries can automatically map between logical coordinates (the locations of objects on the canvas), and physical coordinates (the locations of the objects on the screen).

This is done using an affine transformation, a concept from linear algebra. An affine transformation combines translation, rotation, reflection, scaling, and shear into a single operation. The affine transformation in a graphics library usually maps logical coordinates into physical coordinates.

The trick is to arrange for an affine transformation to be applied by the graphics library. As the user shift drags the mouse, the translation component of the transformation is updated to reflect the motion.

Simple.

The bug

The question though is by how much should the translation be adjusted. The obvious answer is to subtract the previous mouse position from the current position to obtain the vector \([x - x_0, y - y_0]\).

However, I also want to add zooming functionality to this application. If the user zooms out, the physical coordinates will get smaller relative to the logical coordinates, and vice versa when the user zooms in. This means that when zoomed in, the canvas will pan too slow, and when zoomed out, the canvas will pan too fast.

Recall that the affine transformation maps logical coordinates into physical coordinates. Clearly the answer to my problem is to apply the inverse transformation (mapping physical coordinates to logical coordinates) to \([x - x_0, y - y_0]\).

This succeeds in adjusting the scale, but the canvas jumps around, seemingly at random, as soon as I start dragging!

Why?

At this point I remembered something from linear algebra class: Affine transformations aren't linear. What does that mean? Well, a linear transformation is one where \(T(a + b) = T(a) + T(b)\) and \(nT(a) = T(na)\), where \(a\) and \(b\) are vectors, and \(n\) is a scale factor. Rotations, reflections, scaling, and shearing are linear, which is quite easy to prove.

Translation, however, is not linear. \(T(a - b) \neq T(a) - T(b)\). This is also clear if you visualize it right. Imagine finding the vector from point a to point b, then adding some translation vector (which is essentially what the inverse affine transformation is doing). The resulting vector is very definitely not what you intended. Now imagine adding the same translation vector to \(a\) and \(b\), then finding the vector between them. Much better.

I guess the lesson here is that it pays to pay attention in maths class. It really can matter whether an operation is linear or not.