Graphs in thought

Any system of entities and relationships is a type of a graph: friendships among people, course prerequisites at a university, or stations in a subway system. Graphs show up constantly in life. We rarely think about them consciously as graphs, but we've very good at thinking about them implicitly, just as we are for software structure.

To get from St. Paul's Cathedral to Regent's Park in London, you can take the Central tube line to Oxford Circus and change to the Bakerloo line. If service interruptions prevent the Central line from stopping at Oxford Circus, the Piccadilly line can be used as a detour at the cost of a second change.

The two sentences above describe a small section of the graph composed of London tube stations (nodes) and subway lines (edges):

The modern London tube map prioritizes the transit network's abstract structure over the physical layout of the city, just as the above diagram does. Relative distances on the map don't necessarily correspond to physical distance between stations, curves on the map don't necessarily correspond to curves in the physical tracks, and so on.

Scans of London tube maps before (1908) and after (1933) Beck's redesign are shown below. The orginal maps were confusing because complications of the physical world were shown: physically-close stations were drawn close together, the lines were shown to curve as they do in reality, etc. Beck's redesign shows logical structure rather than physical structure, allowing us to plot a course more easily. We don't care whether the Central line's tracks curve or not, and we rarely care about the walking distance between stations; we care about how they connect.

A first-time visitor will follow the graph literally, tracing different connections. They might even look at a physical map of the city, trying to reconcile the tube and street maps. (Don't do it; it will confuse you!)

With time, we internalize abstract graphs like the Tube's, just as we internalize programming language syntax trees. A long-time programmer generates syntax with few errors; a long-time Londoner knows how to get from one station to another, even with reroutes cropping up.

Both types of understanding are important in software, transit networks, and elsewhere. We have to write functions and to walk down streets. But when considering a software system as a whole, or when moving across large distances in London, fine-grained details are distractions. We need the abstract maps to mitigate our finite minds.

To understand software systems, programmers commonly think at four different levels of abstraction. First, the code in all of its complexity: function bodies, statements, expressions, and so on. Second, the call graph, which discards the functions' internals. Third, the module interaction graph, which discards individual functions entirely. Fourth, the package dependency graph, which discards all internal structure of the packages, and is shared by all users of a language.

The next several sections will dig into each of those in detail. We'll find similar graph structures in each, but we'll also find nuanced differences that stop us from making tempting, but naive, generalizations.

This is one section of The Programmer's Compendium's article on Software Structure, which contains more details and context.