The utility of graph analysis

Most properties of software systems are structured as graphs; those examined here are a small sample. When programming in a text editor, these graphs remain implicit. We see functions calling other functions, but we don't see the bigger picture unless we intentionally step back.

Thinking explicitly about graphs becomes more valuable as systems grow. That doesn't necessarily require automated analysis of the kind used to produce many of the graph diagrams in this article; it can mean drawing out the relationships as you understand them, or even constructing the graph in your head.

Intentional analysis of the graph can lead to many insights, often by prompting questions like:

  • Why do so many classes reference User? Are they relying on several discrete ideas erroneously buried within User, and should we extract those ideas into new modules? (That is, do the nodes represent clear concepts?)
  • Why do the arrows in this call graph go like initialize -> load -> analyze? Analyzing isn't part of loading, and loading isn't part of initializing, yet that's how it's implemented. Can we introduce a new function that calls these three in sequence, expressing that linear process better? (That is, do the edges represent relationships as we understand them, or are they the accidental results of historical changes?)
  • Why do modules referencing Transaction also reference User in so many cases? Is there a subset of User's and Transaction's behaviors that can be extracted as, say, OrderHistory? (That is, do other modules connect to sets of related modules in ways that indicate missing abstractions?)
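
The first and third questions above can be asked mechanically once the reference graph is written down. The following is a minimal sketch in Python, not a definitive tool: the module names other than User and Transaction, and the edges between them, are invented for illustration, and the helper functions fan_in and co_references are hypothetical names, not part of any library.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical reference graph (an assumption for illustration):
# each key is a module, each value is the set of modules it references.
REFERENCES = {
    "Checkout": {"User", "Transaction", "Cart"},
    "Invoice": {"User", "Transaction"},
    "Refund": {"User", "Transaction"},
    "Profile": {"User"},
    "Cart": {"Product"},
}


def fan_in(references):
    """Count how many modules reference each target (incoming edges)."""
    counts = defaultdict(int)
    for targets in references.values():
        for target in targets:
            counts[target] += 1
    return dict(counts)


def co_references(references):
    """Count how often each pair of targets is referenced by the same module."""
    pairs = defaultdict(int)
    for targets in references.values():
        # Sort so each pair has one canonical ordering, e.g. ("Transaction", "User").
        for a, b in combinations(sorted(targets), 2):
            pairs[(a, b)] += 1
    return dict(pairs)


incoming = fan_in(REFERENCES)
shared = co_references(REFERENCES)

# Most-referenced modules first: candidates for "is this node one clear concept?"
for module, count in sorted(incoming.items(), key=lambda item: -item[1]):
    print(f"{module}: referenced by {count} module(s)")

# Pairs that are frequently referenced together: candidates for a missing abstraction.
for (a, b), count in sorted(shared.items(), key=lambda item: -item[1]):
    if count > 1:
        print(f"{a} and {b} are referenced together by {count} modules")
```

A high fan-in count or a frequently co-referenced pair is only a prompt to look closer; whether User hides several ideas, or whether something like OrderHistory is missing, remains a design judgment.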

Graphs are omnipresent in software. Sometimes, thinking about them is undeniably important, as with module interaction graphs in medium to large systems. Sometimes, they may be overshadowed by external factors, as when social dynamics complicate package dependency ecosystems. Because they show up frequently, sometimes in critical ways, learning to think about them pays for itself over and over again. It shows us the structure in our systems, allowing us to understand them at multiple levels of meaning simultaneously.

This is one section of The Programmer's Compendium's article on Software Structure, which contains more details and context.