
Module Decomposition

Finding the right boundaries by applying domain-driven decomposition to modular monoliths

The Foundational Principle: Information Hiding

In 1972, David Parnas published "On the Criteria to Be Used in Decomposing Systems into Modules," a paper whose central idea remains the most important in software decomposition. His argument: a module should hide a design decision behind a stable interface. Everything internal to the module (its data structures, its algorithms, its storage choices) is invisible to the outside world. Consumers interact only with what the module deliberately exposes.

Sam Newman, building on Parnas, captures the practical implication: "Don't expose anything unless someone needs it. Everything you hide, you can always expose later." This inverts the common instinct to make things public "because someone might need it." A module boundary is fundamentally an information-hiding boundary. Internal domain models, database schemas, and implementation strategies remain invisible to consumers. The public API and adapter patterns we use in NestJS are direct applications of this principle.

Information hiding is why module decomposition matters at all. Without it, modules are just folders — organizational convenience with no architectural force. With it, each module becomes a unit that can change internally without rippling across the system.
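As a minimal sketch of what this looks like in code, the example below uses plain TypeScript with no NestJS wiring. Every name in it (`TaskSummary`, `TaskPublicService`, `TaskRecord`, `InMemoryTaskService`) is hypothetical and chosen only for illustration. The point is that consumers see a narrow interface while the storage choice stays private.

```typescript
// Hypothetical names throughout; plain TypeScript, no NestJS wiring.

// Public surface: the only types other modules are meant to import.
interface TaskSummary {
  readonly id: string;
  readonly title: string;
}

interface TaskPublicService {
  findSummary(id: string): TaskSummary | undefined;
}

// Internal model: a hidden design decision that can change freely.
interface TaskRecord {
  id: string;
  title: string;
  draftNotes: string; // internal only
  revision: number;   // internal only
}

class InMemoryTaskService implements TaskPublicService {
  // Replacing this Map with a database table would not affect consumers.
  private readonly records = new Map<string, TaskRecord>();

  save(record: TaskRecord): void {
    this.records.set(record.id, record);
  }

  findSummary(id: string): TaskSummary | undefined {
    const record = this.records.get(id);
    // Copy only the exposed fields so internals never cross the boundary.
    return record ? { id: record.id, title: record.title } : undefined;
  }
}
```

Because `findSummary` builds a fresh object from two fields, adding or renaming internal fields on `TaskRecord` cannot ripple outward.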

"How Many Modules?" Is the Wrong Question

Newman spent a decade evolving away from size-based thinking about service boundaries. The same lesson applies to modules: the number or size of modules is not the goal. The goal is boundaries that enable independent change.

Teams that fixate on module count end up in one of two failure modes. The first is too many small modules: thin CRUD wrappers around individual entities, with weak cohesion inside each module and high coupling between them. The second is too few large modules: monolithic blobs where everything touches everything. Both are symptoms of decomposing by the wrong criteria.

The right criteria, drawn from Newman's framework and Eric Evans' domain-driven design, are: model modules around business domains, enforce information hiding at boundaries, and evaluate boundaries by how well they contain change.

Bounded Contexts: The Primary Decomposition Tool

Eric Evans introduced bounded contexts in Domain-Driven Design, recognizing that "total unification of the domain model for a large system will not be feasible or cost-effective." A bounded context defines a boundary within which a domain model is internally consistent, terms have unambiguous meanings, and a shared language holds.

Newman makes bounded contexts his primary decomposition tool. The mapping rule applies directly to modules: one bounded context equals one or more modules, but one module should not span multiple bounded contexts.

Ubiquitous Language as Boundary Detection

The most reliable signal for where one context ends and another begins is language. When the same word means different things to different parts of the business, you've found a boundary.

Vaughn Vernon provides a clear example: "product" in a catalog context means descriptions and features for presentation. In an inventory context it means stock counts, shelf locations, and physical dimensions. In a shopping cart context it means pricing and selected quantities. These aren't the same concept — they share a name but have different attributes, different behaviors, and different reasons to change.

In our own systems, consider "connection." In the connections module, a connection is an OAuth credential with token refresh logic and provider-specific configuration. In the execution module, a connection is just a flag — is the required integration available? The connections module owns the rich model; the execution module works with a simplified representation obtained through the public API.
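The two meanings of "connection" can be sketched as two separate types with a translation at the boundary. The field names below are assumptions made for illustration, and "available" is simplified to mean "not yet expired"; the real rule would likely account for token refresh.

```typescript
// Hypothetical shapes; the field names are assumptions for illustration.

// Connections context: the rich model, owned by the connections module.
interface OAuthConnection {
  id: string;
  provider: string;
  accessToken: string;
  refreshToken: string;
  expiresAt: Date;
}

// Execution context: "connection" only answers "is the integration available?"
interface ConnectionAvailability {
  provider: string;
  available: boolean;
}

// Translation performed behind the connections module's public API.
// Simplification: "available" here just means the token has not expired.
function toAvailability(connection: OAuthConnection, now: Date): ConnectionAvailability {
  return {
    provider: connection.provider,
    available: connection.expiresAt.getTime() > now.getTime(),
  };
}
```

The execution module would only ever see `ConnectionAvailability`, so tokens and refresh details stay inside the connections context.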

These linguistic divergences are not bugs to resolve but signals to respect. They reveal where one context ends and another begins. When you catch yourself qualifying a term — "the execution connection, not the configuration connection" — you're looking at two bounded contexts.

Context Mapping Between Modules

Modules don't exist in isolation. When bounded contexts interact, the DDD strategic patterns describe the relationship:

Customer-Supplier. One module produces a capability that another module consumes. The producing module's public API is the contract. This maps directly to our public service pattern — the organization module exposes OrganizationPublicService, and consuming modules depend on that interface.

Anti-Corruption Layer. A consuming module translates the upstream model into its own terms, protecting its domain from external concepts leaking in. This is our adapter pattern — the task module defines OrganizationProvider (what it needs) and the adapter translates from OrganizationPublicService (what the organization module provides). The task module's domain never imports organization concepts directly.

Shared Kernel. Two modules agree on a small, stable set of shared types — common identifiers, value objects, enums. Newman warns that shared kernels are dangerous because updates can break all consumers simultaneously. Keep it minimal: if the shared kernel keeps growing, you're papering over a boundary problem.

Separate Ways. Not all modules need to interact. Accepting data duplication between two modules is sometimes better than coupling them. If module A needs a user's display name and module B also needs it, both can store their own copy rather than one depending on the other — provided the consistency requirements allow it.
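The Customer-Supplier and Anti-Corruption Layer relationships above can be sketched together. Only the type names `OrganizationPublicService` and `OrganizationProvider` come from the text; the methods, the `OrganizationDto` fields, and `createTask` are assumptions, and the calls are synchronous to keep the sketch short where real services would likely be asynchronous.

```typescript
// Hypothetical method names and fields, for illustration only.

// Upstream contract, owned by the organization module (the supplier).
interface OrganizationDto {
  id: string;
  displayName: string;
  billingPlan: string;
}
interface OrganizationPublicService {
  findById(id: string): OrganizationDto | undefined;
}

// Downstream port, owned by the task module: only what it needs.
interface OrganizationProvider {
  exists(organizationId: string): boolean;
}

// Anti-corruption layer: the single place that knows both vocabularies.
class OrganizationAdapter implements OrganizationProvider {
  private readonly upstream: OrganizationPublicService;

  constructor(upstream: OrganizationPublicService) {
    this.upstream = upstream;
  }

  exists(organizationId: string): boolean {
    return this.upstream.findById(organizationId) !== undefined;
  }
}

interface TaskDraft {
  organizationId: string;
  title: string;
}

// Task-module code depends on the port, never on OrganizationDto.
function createTask(orgs: OrganizationProvider, organizationId: string, title: string): TaskDraft {
  if (!orgs.exists(organizationId)) {
    throw new Error(`Unknown organization: ${organizationId}`);
  }
  return { organizationId, title };
}
```

If the organization module later renames `billingPlan` or restructures its DTO, only `OrganizationAdapter` has to change.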

Aggregates: The Unit of Consistency

An aggregate is a cluster of domain objects treated as a single unit for data changes, with one entity — the aggregate root — as the sole external access point. Vaughn Vernon's four rules of aggregate design directly inform module boundaries:

  1. Model true invariants within consistency boundaries. One transaction modifies at most one aggregate. If two things must be consistent after every operation, they belong in the same aggregate.

  2. Design small aggregates. Large aggregates that lock many records become bottlenecks. Prefer the smallest cluster that protects genuine business invariants.

  3. Reference other aggregates by identity only. Don't hold direct object references to entities in other aggregates. Store the ID and look it up through the public API when needed.

  4. Use eventual consistency outside the boundary. Cross-aggregate operations don't need to be transactionally consistent. Events, sagas, or simple read-then-act patterns handle cross-aggregate coordination.

Newman connects aggregates to module design directly: "The Aggregate is a self-contained state machine that focuses on a single domain concept, with the Bounded Context representing a collection of associated aggregates, again with an explicit interface to the wider world."

How Aggregates Guide Module Boundaries

Each module typically owns one primary aggregate and possibly several supporting ones. The aggregate's consistency boundary often aligns naturally with the module boundary:

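```typescript
// Illustrative types; the real definitions are internal to the execution module
type ExecutionStatus = "pending" | "running" | "completed" | "failed";
interface ExecutionStep { readonly id: string; readonly status: ExecutionStatus; }

// The execution module owns this aggregate
class Execution {
  constructor(
    readonly id: string,
    readonly taskId: string,           // Reference by identity: task is a different aggregate
    readonly organizationId: string,   // Reference by identity: organization is a different aggregate
    readonly status: ExecutionStatus,
    readonly steps: ExecutionStep[],   // Part of this aggregate: must be consistent with execution
  ) {}
}
```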

The Execution and its ExecutionSteps must be transactionally consistent — changing a step's status must update the execution's overall status atomically. But the execution doesn't need to be instantly consistent with the task definition it was created from, or the organization it belongs to. Those are separate aggregates, separate modules.
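One way to keep a step change and the overall status consistent is to route every step change through the aggregate root. The sketch below is a hypothetical variant (`ExecutionRun`, with assumed statuses and method names), not the execution module's real code. It holds state in memory only; persisting the aggregate in a single transaction is assumed to happen elsewhere.

```typescript
// Hypothetical statuses and method names; a sketch of the invariant only.
type StepStatus = "pending" | "completed" | "failed";
type RunStatus = "running" | "completed" | "failed";

class ExecutionRun {
  readonly id: string;
  readonly taskId: string; // reference by identity, as in the text
  private readonly steps = new Map<string, StepStatus>();
  private currentStatus: RunStatus = "running";

  constructor(id: string, taskId: string, stepIds: string[]) {
    this.id = id;
    this.taskId = taskId;
    for (const stepId of stepIds) this.steps.set(stepId, "pending");
  }

  get status(): RunStatus {
    return this.currentStatus;
  }

  // The root is the only way to change a step, so the invariant
  // "overall status reflects step statuses" cannot be bypassed.
  recordStepResult(stepId: string, result: "completed" | "failed"): void {
    if (!this.steps.has(stepId)) throw new Error(`Unknown step: ${stepId}`);
    this.steps.set(stepId, result);
    this.currentStatus = this.deriveStatus();
  }

  private deriveStatus(): RunStatus {
    const statuses = Array.from(this.steps.values());
    if (statuses.some((s) => s === "failed")) return "failed";
    if (statuses.every((s) => s === "completed")) return "completed";
    return "running";
  }
}
```

Because callers cannot reach the steps directly, there is no code path that updates a step while leaving the overall status stale.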

The decision test: if a business change requires modifying aggregates across multiple modules simultaneously, the boundaries are likely wrong. Either the aggregates belong in the same module, or the consistency requirement is softer than it appears and eventual consistency would suffice.

Discovering Boundaries in Practice

Getting boundaries right from the start is hard. Even experienced architects struggle with it. The advantage of working within a modular monolith is that the cost of getting boundaries wrong is dramatically lower than in a distributed system — refactoring module boundaries uses the same tools as any other code change. No API versioning, no deployment orchestration, no data migration between databases.

This means you can discover boundaries incrementally. Start coarser-grained and split as you learn.

Event Storming

Alberto Brandolini's Event Storming is Newman's recommended technique for collaborative domain discovery. The workshop places technical and nontechnical stakeholders together, building a timeline of domain events: "Task Created," "Execution Started," "Connection Authorized." Commands trigger events. Aggregates receive commands and produce events. Policies react to events.

The outputs inform module boundaries through three mechanisms:

  • Linguistic boundaries emerge when different groups use different words for the same concept, or the same word for different concepts. Resist the urge to unify the vocabulary: divergent language hints at different bounded contexts.
  • Pivotal events — major moments marking phase transitions — often mark boundaries between contexts. "Order Placed" is a pivotal event that separates the ordering context from the fulfillment context.
  • Aggregate clustering reveals groups of related commands and events that change together, forming natural module candidates.

Domain Storytelling

Stefan Hofer and Henning Schwentner's technique captures how work actually happens by having domain experts tell concrete scenarios: who does what, with what, in what order. A key heuristic: unidirectional flows in domain stories usually indicate separate bounded contexts. When activity flows in one direction (from ordering to fulfillment to shipping), each phase is a candidate for its own context.

Seam Identification

For existing codebases, Newman adapts Michael Feathers' concept of a "seam" — a portion of code that can be treated in isolation. The approach: identify bounded contexts, map existing code to those contexts, and look for where code "doesn't really fit anywhere." That leftover code often identifies bounded contexts you missed.

Newman emphasizes that namespace and package structures provide initial hints but are often organized by technical layer. Reorganizing into domain-aligned modules is itself valuable — through the process, you discover what actually belongs together.

Evaluating Boundaries: The Coupling Taxonomy

Newman defines four types of coupling, ordered from most to least acceptable. This taxonomy is the primary tool for evaluating whether a proposed module boundary is sound:

Domain coupling — one module calls another because it genuinely needs that module's capability. The task module needs to verify an organization exists, so it calls the organization module. This is unavoidable but should be minimized: share only what the consumer actually needs, not the entire internal model. Our adapter pattern enforces this — the consuming module defines a narrow interface for only what it requires.

Pass-through coupling — a module passes data through purely because a downstream module needs it, with no use for the data itself. If the execution module passes organization details through to the tools module untouched, that's pass-through coupling. It creates fragile chains where changes to the downstream module's needs force changes in the intermediary. Fix it by having the downstream module call the source directly.

Common coupling — modules share a resource, typically a database table. Changes to the shared resource require all dependent modules to change in coordination. This is the most insidious form because it's invisible in the code — no imports to grep for, no function calls to trace. Each module must own its data exclusively.

Content coupling — one module reaches into another's internals to read or manipulate its state. Importing another module's domain entities, querying another module's database tables, or depending on another module's internal data structures. This should be avoided entirely. It's the antithesis of information hiding.

The evaluation rule: if two proposed modules would have common or content coupling between them, they should remain together. Domain coupling is acceptable. Pass-through coupling is a warning sign to restructure the dependency chain.
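Pass-through coupling and its fix can be shown side by side. In this sketch, plain functions stand in for module public APIs, and every name (`organizationApi`, `runToolPassThrough`, `runTool`, and the `region` field) is hypothetical.

```typescript
// Hypothetical names; plain functions stand in for module public APIs.

interface OrganizationDetails {
  id: string;
  region: string;
}

// Source of truth: the organization module's public API (assumed shape).
const organizationApi = {
  getDetails(organizationId: string): OrganizationDetails {
    return { id: organizationId, region: "eu-west" };
  },
};

// Before: execution receives and forwards details it never reads,
// so it is coupled to OrganizationDetails for no benefit of its own.
function runToolPassThrough(details: OrganizationDetails, toolName: string): string {
  return invokeTool(toolName, details);
}
function invokeTool(toolName: string, details: OrganizationDetails): string {
  return `${toolName}@${details.region}`;
}

// After: execution passes only an identifier; tools asks the source itself.
function runTool(organizationId: string, toolName: string): string {
  return invokeToolById(toolName, organizationId);
}
function invokeToolById(toolName: string, organizationId: string): string {
  const details = organizationApi.getDetails(organizationId); // domain coupling
  return `${toolName}@${details.region}`;
}
```

In the second version, a new field needed by the tools module changes only the tools and organization modules; the execution module's signature stays the same.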

Cognitive Load as a Boundary Heuristic

Matthew Skelton and Manuel Pais' Team Topologies introduces a complementary lens: modules should be sized so that a developer (or team) can cognitively manage them. Teams have finite capacity, which is spent on three kinds of load: intrinsic load (the fundamentals of the task, such as the language and framework), extraneous load (tooling and infrastructure overhead), and germane load (productive learning about the domain).

A developer working on the connections module shouldn't need to understand execution internals. A developer working on task definitions shouldn't need to hold the file storage strategy in their head. If understanding a module requires understanding three other modules, the boundary is cutting through a cohesive concept — or the module has too many dependencies.

Conway's Law reinforces this: organizations produce designs that mirror their communication structures. Module boundaries and team responsibilities should reinforce each other. A module owned by one team should not require constant coordination with another team's module.

Antipatterns

Entity Modules

Michael Nygard's critique of "entity services" applies directly to modules. Entity modules wrap individual database tables with CRUD operations — a UsersModule, an OrdersModule, a ProductsModule, each a thin data-access layer.

The problem is the "activation set": most features require multiple entity modules simultaneously. Pricing a shopping cart activates cart, products, accounts, and pricing modules in concert. Business logic can't live in any single entity module because it spans several, so it migrates to an orchestrating "god module" that coordinates across all of them. You end up with anemic entity modules (data without behavior) and bloated orchestrator modules (behavior without data).

The fix: decompose around business capabilities where data and behavior are encapsulated together. A "pricing" module owns the rules, the data, and the API. An "order fulfillment" module owns the workflow end-to-end.
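A capability-oriented module might look like the sketch below, where the prices, the discount rule, and the API all live in one place. The SKUs, the amounts, and the 10% rule are invented for illustration.

```typescript
// Hypothetical pricing rules and names, for illustration only.
interface CartLine {
  sku: string;
  quantity: number;
}

class PricingModule {
  // Data owned by the module: unit prices in cents, keyed by SKU.
  private readonly unitPrices = new Map<string, number>([
    ["notebook", 500],
    ["pen", 150],
  ]);
  // Rule owned by the module: 10% off subtotals of 2000 cents or more.
  private readonly discountThreshold = 2000;

  // API owned by the module: callers get a total, not prices or rules.
  priceCart(lines: CartLine[]): number {
    let subtotal = 0;
    for (const line of lines) {
      const unitPrice = this.unitPrices.get(line.sku);
      if (unitPrice === undefined) throw new Error(`No price for ${line.sku}`);
      subtotal += unitPrice * line.quantity;
    }
    return subtotal >= this.discountThreshold ? Math.round(subtotal * 0.9) : subtotal;
  }
}
```

Changing the discount rule touches only this module, whereas with entity modules the same change would typically land in an orchestrator that spans several of them.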

Shared Data

Newman's rule is unequivocal: modules must not share database tables. When two modules read from and write to the same tables, schema changes for one can break the other. There is no separation between what's shared and what's internal. Developers cannot know what they can safely change.

Each module owns its data. If another module needs that data, it asks through the public API. If two modules seem to need the same table, either one module is the true owner (and the other should call its API), or the table represents two different concepts that should be split — the same entity appearing differently in two bounded contexts.

Premature Decomposition

Martin Fowler observes that "almost all the successful [decomposition] stories have started with a monolith that got too big and was broken up," while "almost all the cases where I've heard of a system built [decomposed] from scratch, it has ended up in serious trouble."

Within a modular monolith, the pressure to get boundaries right on day one is much lower. You can start with coarser-grained modules: group related capabilities together even if you suspect they'll eventually separate. As you build features and see where changes actually cluster, the real boundaries become visible. Splitting a well-structured module is a routine refactor; merging poorly-structured modules is archaeological excavation.

Stefan Tilkov offers an important caveat: the discipline to maintain module boundaries within a monolith "rarely works" without enforcement. This is why we use dependency-cruiser rules to enforce boundaries automatically rather than relying on convention.
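To make concrete what automated enforcement checks, here is a sketch of the rule itself in plain TypeScript. This is not dependency-cruiser's configuration format or API, and the `src/modules/<name>/public/` layout is an assumed convention; the function only shows the decision such a rule encodes.

```typescript
// A sketch of the rule such a tool enforces, NOT dependency-cruiser's API.
// The src/modules/<name>/public/ layout is an assumed convention.

const MODULE_PATH = /^src\/modules\/([^/]+)\/(.*)$/;

function violatesBoundary(importerPath: string, importedPath: string): boolean {
  const importer = MODULE_PATH.exec(importerPath);
  const imported = MODULE_PATH.exec(importedPath);
  if (!importer || !imported) return false;      // outside the modules tree
  if (importer[1] === imported[1]) return false; // same module: no restriction
  // Cross-module imports must target the other module's public folder.
  return !imported[2].startsWith("public/");
}
```

A check like this, run in CI over every import edge, turns the boundary from a convention into a build failure.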

Circular Dependencies

When module A depends on module B and module B depends on module A, both modules become impossible to understand or test in isolation. Circular dependencies usually reveal that the boundary is in the wrong place — the two modules are actually one bounded context that was artificially split.

Fixes: merge the modules, extract the shared concept into a third module, or invert one of the dependencies using events (module A publishes an event that module B reacts to, rather than A calling B directly).
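The event-based inversion can be sketched with a minimal in-process bus. `InProcessEventBus`, the `ExecutionCompleted` event name, and the two module roles are hypothetical; a real system would more likely use the framework's own event mechanism.

```typescript
// Hypothetical event bus and event names; a minimal in-process sketch.
type Handler<T> = (payload: T) => void;

class InProcessEventBus {
  private readonly handlers = new Map<string, Handler<unknown>[]>();

  subscribe<T>(eventName: string, handler: Handler<T>): void {
    const existing = this.handlers.get(eventName) ?? [];
    this.handlers.set(eventName, [...existing, handler as Handler<unknown>]);
  }

  publish<T>(eventName: string, payload: T): void {
    for (const handler of this.handlers.get(eventName) ?? []) handler(payload);
  }
}

const bus = new InProcessEventBus();

// Module B (say, notifications) depends on the bus, not on module A.
const notified: string[] = [];
bus.subscribe<{ executionId: string }>("ExecutionCompleted", (event) => {
  notified.push(event.executionId);
});

// Module A (say, execution) publishes and knows nothing about its listeners.
function completeExecution(executionId: string): void {
  bus.publish("ExecutionCompleted", { executionId });
}
```

After this change, module A no longer imports module B at all, so the cycle is broken even though B still reacts to what A does.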

The God Module

When a module imports half the system to coordinate a single operation, the boundaries are likely wrong. This is the module-level equivalent of the "death star architecture" — a central nexus where everything connects, creating a bottleneck for change.

Examine what the god module is actually doing. Often it's taking on responsibilities that belong in the downstream modules. Sometimes it reveals that several of its dependencies should be merged because they always participate in the same operations. If a module genuinely needs to orchestrate many others, it should receive only the minimal data it needs (avoiding pass-through coupling) and delegate behavior to the owning modules.

The Decomposition Checklist

When evaluating whether module boundaries are right, ask:

  1. Does each module hide its implementation? Can you change the module's internal data structures, storage strategy, or algorithms without any other module noticing?

  2. Is coupling between modules intentional and minimal? Check Newman's taxonomy — is it domain coupling (acceptable), pass-through coupling (restructure), common coupling (fix immediately), or content coupling (never)?

  3. Do changes stay within module boundaries? If most features require touching multiple modules, the boundaries don't align with how the system actually changes.

  4. Can a developer understand this module without understanding others? If working on one module requires holding three others in your head, the boundary is cutting through a cohesive concept.

  5. Does each module have a name that describes a business capability? "Task Execution" and "Connection Management" are good. "Shared Utilities" and "Common Services" are signs of missing boundaries.

  6. Are the aggregate consistency boundaries respected? Things that must be transactionally consistent live in the same module. Things that are eventually consistent can live in different modules.

Further Reading

  • Building Microservices (2nd Edition) by Sam Newman — Chapter 2 on domain-driven decomposition
  • Domain-Driven Design by Eric Evans — Part IV on strategic design and bounded contexts
  • Implementing Domain-Driven Design by Vaughn Vernon — Aggregate design rules
  • Team Topologies by Matthew Skelton and Manuel Pais — Cognitive load and team-first boundaries
  • Bounded Contexts by Martin Fowler
  • Services by Lifecycle by Michael Nygard — The entity service critique