Service documentation

Decision

We document every new service we create with the Bounded Context Canvas.

We start from BOUNDED_CONTEXT_CANVAS_TEMPLATE.md in the root of the monorepo and store the filled-in service canvas as README.md in the root of the service project. The template contains placeholder content and comments to make it easier to use.
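The README.md convention above lends itself to a simple automated check. The following is a minimal sketch, not an existing script in the monorepo: the function name and the `services/` directory prefix are assumptions made for illustration, and the file list is passed in as plain data so the logic stays independent of how files are actually discovered.

```typescript
// Hypothetical helper: reports service directories that have no canvas yet.
// Assumption: the canvas is always stored as README.md in the service root,
// as described in the decision above.
const CANVAS_FILE = "README.md";

function servicesWithoutCanvas(
  serviceDirs: string[],
  existingFiles: Set<string>,
): string[] {
  // A service counts as documented only if "<dir>/README.md" exists.
  return serviceDirs.filter(
    (dir) => !existingFiles.has(`${dir}/${CANVAS_FILE}`),
  );
}

// Illustrative input; the "services/" prefix is an assumed layout.
const undocumentedServices = servicesWithoutCanvas(
  ["services/recipient-importer", "services/supplier-manager"],
  new Set(["services/recipient-importer/README.md"]),
);
console.log(undocumentedServices);
```

A check like this only confirms that a README.md exists; whether it actually follows the canvas template still needs a human review.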

For existing services, especially the monolithic Hermione and Filch applications, we do not drop other work just to create the documents. When touching specific parts of those projects, we aim to bundle domain logic into separate modules, where each module should have its own documentation, e.g. an auth module.

Due to the way we modelled our data in MongoDB, some contexts currently can't be cleanly mapped to a specific service or module. When we run into those situations, we take the topic to the cross-team retro to align between teams on how to approach the concrete problem. We need to come up with solutions for modelling the domains properly so that the responsibilities and ownership of each domain become clear.

Problems

If we document anything at all, we only describe in the corresponding README file what a backend service is responsible for. This has also led to situations where it's not clear which service should own a specific domain model. One example is the campaign model, which is used in different ways across our whole application landscape. We use our MongoDB database to model relationships between collections in the way relational databases are meant to be used. As a result, it's unclear which parts actually belong together from a business perspective.

Also, we're missing any documentation of the dependencies a service has. This includes which data the service needs from other services, which events it subscribes to as well as which data is provided to other services.

Most of the time, we only recognize that we had issues after a customer's campaign has been negatively affected.

Context

We had an Event Storming workshop in 2022 where we modelled which business domains exist at optilyz. The result might not be fully up to date anymore, but it is a good starting point to see which domains exist.

We aim to move towards individual domain services, each of which can be owned by one team. Currently, it's unclear which team is responsible for which part of the system, as most of our code still lives within a monolithic application. It's really hard to keep track of which data is used in which part of our application landscape.

Inside product, we monitor logged production errors to observe if a service is healthy. Most of the time, we don't have any concrete key metric defined to observe the health of an individual service.

All code-related documentation lives within the monorepo itself.

Options

The following approaches were considered:

Bounded Context Canvas
Each team decides on its own how to document its services

Reasoning

Since the goal is to further grow the company, we need one common way to document our services. It helps us identify dependencies between the different parts of our system and therefore also between the individual teams.

In addition, it needs to be clear which domain models are owned by which service.

Consequences

How do we implement this change?

When creating a new service, each team starts by documenting the service using the Bounded Context Canvas.

When adapting an existing service, each team should double-check whether the change request also requires an update to the service canvas before rolling out the change.

Who will implement the change?

Each team creates the context canvases for its domains and services. When a team needs data that no other service documents as provided, the canvas can be used to discuss what the message flow should look like and which data is needed.
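Once the dependencies from several canvases are written down in a structured form, gaps of this kind can be found mechanically. The sketch below is only an illustration under stated assumptions: the `ServiceCanvas` shape, its field names, and the function are hypothetical and do not reflect the actual sections of the template, and the example dependencies are made up rather than taken from the real services.

```typescript
// Hypothetical, simplified view of the dependency part of a canvas.
interface ServiceCanvas {
  service: string;
  // Data this service needs, and the service it expects to get it from.
  consumes: { data: string; from: string }[];
  // Events this service subscribes to (recorded here, not checked below).
  subscribesTo: string[];
  // Data this service offers to other services.
  provides: string[];
}

// Lists every consumed item that the named provider does not document
// as provided, including providers that have no canvas at all.
function findUndocumentedDependencies(canvases: ServiceCanvas[]): string[] {
  const providedBy = new Map<string, Set<string>>();
  for (const canvas of canvases) {
    providedBy.set(canvas.service, new Set(canvas.provides));
  }

  const gaps: string[] = [];
  for (const canvas of canvases) {
    for (const dep of canvas.consumes) {
      if (!providedBy.get(dep.from)?.has(dep.data)) {
        gaps.push(`${canvas.service} needs "${dep.data}" from ${dep.from}`);
      }
    }
  }
  return gaps;
}

// Illustrative data only; these are not the services' real dependencies.
const exampleCanvases: ServiceCanvas[] = [
  {
    service: "recipient-importer",
    consumes: [
      { data: "supplier", from: "supplier-manager" },
      { data: "campaign", from: "hermione" },
    ],
    subscribesTo: [],
    provides: ["recipient"],
  },
  {
    service: "supplier-manager",
    consumes: [],
    subscribesTo: [],
    provides: ["supplier"],
  },
];

const dependencyGaps = findUndocumentedDependencies(exampleCanvases);
console.log(dependencyGaps);
```

In this made-up example, the only reported gap is the campaign data expected from hermione, because no canvas for hermione is included. Each reported gap is a prompt for a conversation between the two teams, not an error in itself.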

How do we teach this change?

The Enable team gave an introduction in the bounded context workshop on 4 July 2024.

The parts we worked out during the workshop are documented within recipient-importer, supplier-manager, and automation (a domain inside Hermione) and can be used as a guide for other use cases.

What could go wrong?

We might fail to document some dependencies due to missing domain knowledge.

Also, the correct separation of domains and data might need multiple iterations.

What do we do if something goes wrong?

Since it's about documentation, we can adapt the bounded context canvas at any point in time according to our best knowledge.

What is still unclear?

It's still unclear how to properly split and document the monolithic applications (Filch and Hermione), as most of the logic still lives within those applications. Documenting everything we have right now is a time-consuming task, and it's unclear how much value it would add.

Related ADRs

None