Improving the module graph

Ubermodules like User are a nearly universal disease of the graph, manifesting as a vague fear about our changes. It's hard to see the problem in any more detail while looking at a text editor; we just can't see enough of the system at one time.

Stepping back to the module graph clarifies these phenomena. How complex is the call graph within a module? How heavily connected is the module interaction graph? Are there nodes in these graphs that are disproportionately large, or disproportionately connected, or both?
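
As a rough illustration of the last question, here is a minimal Ruby sketch that counts incoming edges in a module interaction graph. The graph itself, the module names, and the helper names (REFERENCES, incoming_edge_counts, most_connected) are all invented for this example; in a real system the edges would have to come from some kind of static analysis.

```ruby
# Hypothetical module interaction graph: each key references the modules
# in its list. The names and edges are invented for illustration.
REFERENCES = {
  "Checkout"       => ["User", "Cart"],
  "Cart"           => ["User"],
  "Mailer"         => ["User"],
  "AdminDashboard" => ["User", "Checkout"],
  "User"           => []
}.freeze

# Count incoming edges: how many distinct modules reference each module.
def incoming_edge_counts(references)
  counts = Hash.new(0)
  references.each_value do |targets|
    targets.uniq.each { |target| counts[target] += 1 }
  end
  counts
end

# Modules sorted from most to least referenced (ties broken by name).
def most_connected(references)
  incoming_edge_counts(references).sort_by { |name, count| [-count, name] }
end
```

Calling most_connected(REFERENCES) puts User first with four incoming edges, which is the disproportionately connected node in this toy graph.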

Ubermodules indicate lopsided design graphs. They have too much internal complexity (250 methods in our imagined rubygems.org of the future). They're referenced by too many other modules (90 incoming edges). They change too frequently (#1 in terms of commits modifying them). They contain too much text (#1 in terms of lines of code). All of these properties are highly correlated, with multiplicative effects on our confusion.

How do we fix this? Returning to an earlier example: suppose that we extract OrderHistory from User in a standard ecommerce site. Now, any module accessing order history loses its module interaction edge to User. When order history code changes, we don't have to ask which of the 90 modules referencing User might be affected. Only a few modules will care about order history, so our change analysis is limited to those.
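
The narrowing of change analysis can be sketched with two tiny graphs, one before the extraction and one after. Everything here is hypothetical: the module names other than User and OrderHistory, the edges, and the referrers_of helper are assumptions made for illustration, and the sketch assumes that OrderPage and Refunds only ever needed order history.

```ruby
# Hypothetical module interaction graph before extracting OrderHistory.
BEFORE = {
  "Checkout"  => ["User"],
  "Mailer"    => ["User"],
  "OrderPage" => ["User"], # assumed to need only order history
  "Refunds"   => ["User"]  # assumed to need only order history
}.freeze

# The same hypothetical system after the extraction: modules that only
# needed order history now point at OrderHistory instead of User.
AFTER = {
  "Checkout"     => ["User"],
  "Mailer"       => ["User"],
  "OrderPage"    => ["OrderHistory"],
  "Refunds"      => ["OrderHistory"],
  "OrderHistory" => []
}.freeze

# Modules that reference the target directly. These are the ones we
# have to consider when the target's code changes.
def referrers_of(references, target)
  references.select { |_name, targets| targets.include?(target) }.keys.sort
end
```

Before the extraction, a change to order history code means considering all four referrers of User. Afterwards, referrers_of(AFTER, "OrderHistory") names only OrderPage and Refunds.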

With small modules like OrderHistory, we can easily make behavioral guarantees. We might decide that OrderHistory is read-only; it won't write to the database. Or, we might decide that its methods always raise exceptions on errors; they never return nil. Now we don't need to guess about those properties when calling OrderHistory from the outside, even though we may choose other constraints for other modules. We can be more confident and less defensive when calling methods, both internally within OrderHistory and from the outside. Repeat that a few times and the system starts to feel more maintainable.
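
Here is one way those two guarantees might look in Ruby. This is a sketch under stated assumptions, not a real implementation: it stands in for the database with an in-memory array of hashes, and the method names, the OrderNotFound error, and the shape of an order are all invented for the example.

```ruby
# Hypothetical OrderHistory with two module-wide guarantees:
#   1. It is read-only: no method modifies the order data.
#   2. Its methods raise on errors; they never return nil.
class OrderHistory
  class OrderNotFound < StandardError; end

  # orders: an array of hashes such as { id: 1, total_cents: 1999 }.
  # This shape is an assumption standing in for database rows.
  def initialize(orders)
    # Frozen copies, so nothing in this class can mutate the data.
    @orders = orders.map { |order| order.dup.freeze }.freeze
  end

  # Returns the matching order, or raises OrderNotFound. Never nil.
  def find(id)
    order = @orders.find { |candidate| candidate[:id] == id }
    raise OrderNotFound, "no order with id #{id}" if order.nil?
    order
  end

  # Sum of all order totals; 0 for an empty history, never nil.
  def total_spent_cents
    @orders.sum { |order| order[:total_cents] }
  end
end
```

Because every method follows the same rules, a caller can use the result of find directly, without a nil check, and can rescue OrderHistory::OrderNotFound at whatever level makes sense.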

The solution, in short, is simple on its face: break the large modules down into smaller pieces, and make the methods within each small module similar to each other, so that they share the same guarantees. In practice, this is subtle and difficult. However, it's important to avoid worrying over which module to extract first. In advanced ubermodule cases, extracting almost any module can be beneficial, even if it's not the ideal extraction. The first step is to introduce some kind of module boundary. We can always move the methods around later.

This is one section of The Programmer's Compendium's article on Software Structure, which contains more details and context.