Module interaction graphs

Call graphs are important, but they become unwieldy beyond a few dozen functions. When we track the call graph in our heads, it becomes unwieldy even faster. To analyze large-scale design properties, we need to abstract beyond the function level. Instead of asking "which functions call which other functions?", we can ask "which modules call functions in which other modules?" The answer is a module interaction graph.

As before, we'll ignore the specific internal details of the modules. For the function call graph, we didn't note which line of code the call came from. For the module interaction graph, we won't note which specific function in module A called which function in module B.
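To make the abstraction step concrete, here is a minimal sketch of collapsing a function-level call graph into a module interaction graph. The call list, the "Module#method" naming scheme, and the helper names are all hypothetical, invented for illustration. Dropping calls that stay inside a single module is also an assumption of this sketch, not something the text prescribes.

```ruby
require "set"

# Hypothetical function-level call graph: each pair is [caller, callee],
# written as "Module#method". None of these names come from a real codebase.
FUNCTION_CALLS = [
  ["Cart#total", "Item#price"],
  ["Cart#total", "Cart#discount"],
  ["Cart#checkout", "Payment#charge"],
  ["Cart#checkout", "User#email"],
  ["Payment#charge", "User#billing_address"],
].freeze

# Extract the module part of a "Module#method" name.
def module_of(qualified_name)
  qualified_name.split("#").first
end

# Collapse function-level edges into module-level edges. The specific
# functions involved are forgotten; only "A calls into B" remains.
def module_graph(function_calls)
  function_calls
    .map { |from, to| [module_of(from), module_of(to)] }
    .reject { |from, to| from == to } # assumption: ignore calls within one module
    .to_set
end

MODULE_EDGES = module_graph(FUNCTION_CALLS)
MODULE_EDGES.each { |from, to| puts "#{from} -> #{to}" }
```

Five function-level calls collapse into four module-level edges here, and the reduction grows as modules gain more methods.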

Our example will be rubygems.org, the website of Ruby's package repository. It's 3,365 lines of Rails code, which is big enough to show real-world complications but still small enough to be tractable as an example. There are 389 method definitions, and many more method call points than that, making the call graph too large for direct consideration.

The module interaction graph for rubygems.org is shown below, where an edge from A to B indicates a call from some function in module A to some function in module B. There's a lot of complexity here, so here's a suggested goal: scan through the diagram, looking at the number of edges coming into and out of each node. Ask yourself: which modules are most heavily connected to other modules?

(Now, some notes on this graph for the curious. These can be skipped safely. The diagram was generated by using Ruby's set_trace_func to observe method calls during a full run of the test suite. rubygems.org has good test coverage, so this should present a good approximation of the true runtime interactions in production. A minority of tests failed for environmental reasons, but that will have only a small effect on this graph. The results are limited to function call relationships between modules; inheritance isn't considered. About ten nodes not connected to the main graph were removed. One node, Redirector, was removed because it referenced every controller and caused layout trouble. None of the removed nodes had significant design implications. Revision ec03f00 of the RubyGems.org repository was used, limited to the app directory, which contains 3,365 lines of Ruby.)
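For readers who want to try something similar, the following is a rough sketch of recording module-to-module calls at runtime. It is not the script used to generate the diagram. It swaps in TracePoint, the newer API that Ruby's documentation recommends over set_trace_func, and the Item and Cart classes and the record_module_edges helper are hypothetical. It only sees methods defined in Ruby, and class-level methods would be reported under their singleton class rather than the class itself.

```ruby
require "set"

# Hypothetical toy classes standing in for application modules.
class Item
  def initialize(price)
    @price = price
  end

  # Written with def (not attr_reader) so that it is a Ruby-level method
  # and fires a :call event.
  def price
    @price
  end
end

class Cart
  def initialize(items)
    @items = items
  end

  def total
    @items.sum { |item| item.price }
  end
end

# Runs the given block with tracing enabled and returns a Set of
# [caller_module_name, callee_module_name] pairs observed during the run.
def record_module_edges
  edges = Set.new
  stack = [] # modules whose Ruby methods are currently executing

  trace = TracePoint.new(:call, :return) do |tp|
    if tp.event == :call
      callee = tp.defined_class
      caller_module = stack.last
      # Record an edge only when the call crosses a module boundary.
      if caller_module && caller_module != callee
        edges << [caller_module.name, callee.name]
      end
      stack.push(callee)
    else
      stack.pop
    end
  end

  trace.enable { yield }
  edges
end

edges = record_module_edges do
  Cart.new([Item.new(3), Item.new(4)]).total
end
edges.each { |from, to| puts "#{from} -> #{to}" }
```

Wrapping a test suite run in a helper like this would yield an edge set that a graph layout tool could render, subject to the same caveat as above: the result reflects only the code paths that actually ran.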

The most prominent features of this graph are three ubermodules with disproportionate incoming edges: User, Version, and Rubygem. User classes in particular are often highly connected to other modules; most web applications have one like this. This makes intuitive sense: most functionality concerns users in some way, making User an easy dumping ground for application code. (Sometimes, users are represented by both a User and a Profile, in which case the latter usually becomes the dumping ground.)
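The same scan can be done mechanically by counting incoming and outgoing edges per module. The sketch below uses a made-up edge list, not the real rubygems.org graph, and degree_counts is a hypothetical helper name.

```ruby
# Made-up module edges for illustration: each pair is [caller, callee].
EDGES = [
  ["SessionsController", "User"],
  ["ProfilesController", "User"],
  ["OrdersController", "User"],
  ["OrdersController", "Order"],
  ["Order", "User"],
  ["Order", "Mailer"],
].freeze

# Returns two hashes mapping module name to edge count:
# incoming (how many modules call into it) and outgoing (how many it calls).
def degree_counts(edges)
  incoming = Hash.new(0)
  outgoing = Hash.new(0)
  edges.uniq.each do |from, to|
    outgoing[from] += 1
    incoming[to] += 1
  end
  [incoming, outgoing]
end

INCOMING, OUTGOING = degree_counts(EDGES)

# Modules sorted by incoming edges, most referenced first.
INCOMING.sort_by { |_name, count| -count }.each do |name, count|
  puts "#{name}: #{count} incoming, #{OUTGOING[name]} outgoing"
end
```

In this toy graph, User has four incoming edges and no outgoing ones, which is the shape the text describes for a heavily referenced module.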

The other two ubermodules, Rubygem and Version, represent a Ruby package and a version of a package, respectively. Rubygems.org allows users to publish versions of Ruby gems, so it's no surprise that User, Rubygem, and Version are referenced frequently. Most web applications will have a handful of heavily connected, domain-specific classes like this. The number varies, but few applications have none.

This is one section of The Programmer's Compendium's article on Software Structure, which contains more details and context.