Domain Driven Design
created: 04 November 2022
revision: 1
- In case you wonder about Event Sourcing issues, just ask your accountant :-)
- compensating events instead of updates
- no versioning?! Either use an updated event with sensible defaults, or switch to a new stream.
- Optimistic concurrency control on the streams.
- Represent domain (business) logic as pure functions over ADTs (immutable algebraic data types). It should not depend on serialization (DB), infrastructure (web services), or async code, and it should not throw exceptions (pure functions).
Domain Modeling Made Functional
- This whole section is a summary of, and quotations from: (Wlaschin, 2018)
- Wlaschin's site with a lot of good stuff
- 3 approaches to domain modelling:
- mainstream: define data structures and functions to act on them
- semistructured and/or flexible structures such as maps to store key-value pairs (see Clojure)
- combining elements together to make other elements, with a focus on composition rules (algebra)
chapter 1
- A business doesn’t just have data, it transforms it somehow. The value of the business is created in this process of transformation, so it is critically important to understand how these transformations work and how they relate to each other.
- Domain Events are the starting point for almost all of the business processes we want to model. It is a record of something that happened in the system
- scenario and use case are user centric, business process is business centric.
- Commands are things that make domain events happen (e.g. "Do this for me"). A command is a request for some process to happen and is triggered by a person or another event.
- domains from the problem space are mapped to bounded contexts in the solution space (the domain model). The relationship does not always need to be one-to-one.
- Context maps communicate the interactions between contexts.
chapter 2
- In domain-driven design we let the domain drive the design, not a database schema.
- The concept of a “database” is certainly not part of the ubiquitous language. The users do not care about how data is persisted.
In DDD terminology this is called persistence ignorance. It is an important principle because it forces you to focus on modeling the domain accurately, without worrying about the representation of the data in a database.
- Letting classes drive the design can be just as dangerous as letting a database drive the design: again, you’re not really listening to the requirements.
- We need to capture … phases in our domain model, not just for documentation but to make it clear that (for example) an unpriced order should not be sent to the shipping department.
The easiest way to do that is by creating new names for each phase: UnvalidatedOrder, ValidatedOrder, and so on. It does mean that the design becomes longer and more tedious to write out, but the advantage is that everything is crystal clear.
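The "one type per phase" idea can be sketched in Scala 3 as follows. This is only a minimal illustration: the field names, the validation rule and the `ship` function are assumptions, not taken from the book.

```scala
// Hypothetical fields; each phase of the order gets its own type.
final case class UnvalidatedOrder(orderId: String, productCode: String, quantity: Int)
final case class ValidatedOrder(orderId: String, productCode: String, quantity: Int)
final case class PricedOrder(order: ValidatedOrder, totalCents: Long)

// Hypothetical validation rule: non-empty product code and a positive quantity.
def validate(o: UnvalidatedOrder): Option[ValidatedOrder] =
  if o.quantity > 0 && o.productCode.nonEmpty then
    Some(ValidatedOrder(o.orderId, o.productCode, o.quantity))
  else None

def price(o: ValidatedOrder, unitPriceCents: Long): PricedOrder =
  PricedOrder(o, unitPriceCents * o.quantity)

// Shipping accepts only a PricedOrder, so an unpriced order cannot be passed in.
def ship(order: PricedOrder): String =
  s"shipping ${order.order.orderId} for ${order.totalCents} cents"
```

Passing an UnvalidatedOrder or ValidatedOrder to `ship` is a compile error, which is the point of naming each phase.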
chapter 3
- Transferring Data Between Bounded Contexts: The data objects that are passed around may be superficially similar to the objects defined inside the bounded context (which we’ll call domain objects), but they are not the same; they are specifically designed to be serialized and shared as part of the intercontext infrastructure. We will call these objects Data Transfer Objects or DTOs.
- Trust Boundaries and Validation: The perimeter of a bounded context acts as a trust boundary. Anything inside the bounded context will be trusted and valid, while anything outside the bounded context will be untrusted and might be invalid. Therefore, we will add “gates” at the beginning and end of the workflow that act as intermediaries between the trusted domain and the untrusted outside world.
- At the input gate, we will always validate the input to make sure that it conforms to the constraints of the domain model. It often plays the role of an anti-corruption layer, acting as a translator between models.
- The job of the output gate is to ensure that private information doesn’t leak out of the bounded context, both to avoid accidental coupling between contexts and for security reasons.
- command -> public workflow -> domain event
- Avoid Domain Events within a Bounded Context… Instead, if we need a “listener” for an event, we just append it to the end of workflow.
- Keep I/O at the edges… A function that reads or writes to a database or file system would be considered “impure”, so we would try to avoid these kinds of functions in our core domain.
chapter 4
- A value is just a member of a type, something that can be used as an input or an output. For example, 1 is a value of type int, “abc” is a value of type string, and so on.
- Functions can be values too. If we define a simple function such as let add1 x = x + 1, then add1 is a (function) value of type int->int.
- Values are immutable (which is why they are not called “variables”). And values do not have any behavior attached to them, they are just data.
- In contrast, an object is an encapsulation of a data structure and its associated behavior (methods).
- So in the world of functional programming (where objects don’t exist), you should use the term “value” rather than “variable” or “object.”
- Optional for missing values
- Result<E,T> for modelling errors (an OR type for success or failure).
- workflow is modelled with functions
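In Scala the same two ideas map to `Option` and `Either`. A small sketch, where the price list and the `LookupError` type are hypothetical:

```scala
// Hypothetical error type and lookup table, for illustration only.
enum LookupError:
  case UnknownProduct(code: String)

val priceList: Map[String, Int] = Map("W1234" -> 250)

// Option models a value that may be missing.
def findPrice(code: String): Option[Int] = priceList.get(code)

// Either models success (Right) or failure (Left).
def requirePrice(code: String): Either[LookupError, Int] =
  findPrice(code).toRight(LookupError.UnknownProduct(code))
```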
chapter 5
- ! Listing all the effects explicitly is useful, but it does make the type signature ugly and complicated, so we would typically create a type alias for this to make it look nicer.
- In DDD terminology, objects with a persistent identity are called Entities and objects without a persistent identity are called Value Objects. Let’s start by discussing Value Objects first.
- In a business context, Entities are often a document of some kind: orders, quotes, invoices, customer profiles, product sheets, and so on. They have a life cycle and are transformed from one state to another by various business processes.
The distinction between “Value Object” and “Entity” is context-dependent. For example, consider the life cycle of a cell phone. During manufacturing, each phone is given a unique serial number - a unique identity - so in that context, the phone would be modeled as an Entity. When they’re being sold, however, the serial number isn’t relevant - all phones with the same specs are interchangeable - and they can be modeled as Value Objects. But once a particular phone is sold to a particular customer, identity becomes relevant again and it should be modeled as an Entity: the customer thinks of it as the same phone even after replacing the screen or battery.
- Here’s an example of how an Entity can be updated in F#. First, we’ll start with an initial value:
let initialPerson = {PersonId=PersonId 42; Name="Joseph"}
To make a copy of the record while changing only some fields, F# uses the with keyword, like this:
let updatedPerson = {initialPerson with Name="Joe"}
- Is Order an Entity or a Value Object? Obviously it’s an Entity - the details of the order may change over time, but it’s the same order.
On Aggregates
- A very common situation: we have a collection of Entities, each with their own ID and also some “top-level” Entity that contains them. In DDD terminology, a collection of Entities like this is called an aggregate, and the top-level Entity is called the aggregate root. In this case, the aggregate comprises both the Order and the collection of OrderLines, and the aggregate root is the Order itself.
- the Customer and the Order are distinct and independent aggregates. They each are responsible for their own internal consistency, and the only connection between them is via the identifiers of their root objects.
- This leads to another important aspect of aggregates: they are the basic unit of persistence. If you want to load or save objects from a database, you should load or save whole aggregates. Each database transaction should work with a single aggregate and not include multiple aggregates or cross aggregate boundaries.
- Just to be clear, an aggregate is not just any collection of Entities. For example, a list of Customers is a collection of Entities, but it’s not a DDD “aggregate,” because it doesn’t have a top-level Entity as a root and it isn’t trying to be a consistency boundary.
- Here’s a summary of the important role of aggregates in the domain model:
- An aggregate is a collection of domain objects that can be treated as a single unit, with the top-level Entity acting as the “root”.
- All of the changes to objects inside an aggregate must be applied via the top level to the root, and the aggregate acts as a consistency boundary to ensure that all of the data inside the aggregate is updated correctly at the same time.
- An aggregate is the atomic unit of persistence, database transactions, and data transfer.
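A minimal Scala 3 sketch of these rules, assuming a hypothetical Order aggregate with a stored total: lines are changed only through the root, which keeps the total consistent, and the Customer is referenced by id only.

```scala
// Hypothetical Order aggregate: OrderLine entities are changed only via the root.
final case class OrderLine(lineId: Int, productCode: String, quantity: Int, priceCents: Long)

final case class Order(orderId: String, customerId: String, lines: List[OrderLine], totalCents: Long):
  // The root applies the change and recomputes the total in the same step,
  // so the aggregate is never observed in an inconsistent state.
  def changeQuantity(lineId: Int, quantity: Int): Order =
    val updated = lines.map(l => if l.lineId == lineId then l.copy(quantity = quantity) else l)
    copy(lines = updated, totalCents = Order.total(updated))

object Order:
  def total(lines: List[OrderLine]): Long =
    lines.map(l => l.priceCents * l.quantity).sum

  // The other aggregate (Customer) is referenced by its id, not by an object.
  def create(orderId: String, customerId: String, lines: List[OrderLine]): Order =
    Order(orderId, customerId, lines, total(lines))
```

Because the values are immutable, `changeQuantity` returns a new Order; saving that whole value in one transaction is what makes the aggregate the unit of persistence.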
chapter 6
- Integrity (or validity) in this context means that a piece of data follows the correct business rules (it is a business term).
- Consistency here means that different parts of the domain model agree about facts (it is a business term).
- !!! it’s important to recognize that consistency and atomicity of persistence are linked. There’s no point, for example, in ensuring that an order is internally consistent if the order is not going to be persisted atomically. If different parts of the order are persisted separately and then one part fails to be saved, then anyone loading the order later will be loading an order that is not internally consistent.
- In general, a useful guideline is “only update one aggregate per transaction”. If more than one aggregate is involved, we should use messages and eventual consistency, even though both aggregates are within the same bounded context. But sometimes—and especially if the workflow is considered by the business to be a single transaction—it might be worth including all affected entities in the transaction. A classic example is transferring money between two accounts, where one account increases and the other decreases.
- If the accounts are represented by an Account aggregate, then we would be updating two different aggregates in the same transaction. That’s not necessarily a problem, but it might be a clue that you can refactor to get deeper insights into the domain. In cases like this, for example, the transaction often has its own identifier, which implies that it’s a DDD Entity in its own right. In that case, why not model it as such?
type MoneyTransfer = {
    Id: MoneyTransferId
    ToAccount : AccountId
    FromAccount : AccountId
    Amount: Money
}
- After this change, the Account entities would still exist, but they would no longer be directly responsible for adding or removing money. Instead the current balance for an Account would now be calculated by iterating over the MoneyTransfer records that reference it. We’ve not only refactored the design, but we’ve also learned something about the domain.
- How do we ensure that the constraints are enforced? Answer: The same way we would in any programming language—make the constructor private and have a separate function that creates valid values and rejects invalid values, returning an error instead. In FP communities, this is sometimes called the smart constructor approach.
- Capturing Business Rules in the Type System
// before
type CustomerEmail = {
    EmailAddress : EmailAddress
    IsVerified : bool
}
// after
type CustomerEmail =
    | Unverified of EmailAddress
    | Verified of VerifiedEmailAddress // different from normal EmailAddress
// hence no need for a unit test, as the compiler will complain if the wrong type is used.
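The smart constructor approach mentioned in this chapter could look like this in Scala 3. It is a sketch under assumptions: the 1 to 1000 range and the error messages are illustrative. In Scala 3 a private case-class constructor also makes the generated `apply` and `copy` private, so `create` is the only way in.

```scala
// Hypothetical constraint: a unit quantity must be between 1 and 1000.
final case class UnitQuantity private (value: Int)

object UnitQuantity:
  // Creates valid values and rejects invalid ones, returning an error instead.
  def create(n: Int): Either[String, UnitQuantity] =
    if n < 1 then Left("UnitQuantity must be at least 1")
    else if n > 1000 then Left("UnitQuantity must be at most 1000")
    else Right(new UnitQuantity(n))
```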
chapter 7
- A much better way to model the domain is to create a new type for each state of the order. This allows us to eliminate implicit states and conditional fields (boolean).
- for example, convert a design with a flag into a design with two choices, one for each state “unverified” and “verified”.
- Hence “state machine” and transitions from one state to another.
- The dependencies for the top-level workflow function should not be exposed, because the caller doesn’t need to know about them. The signature should just show the inputs and outputs.
- But for each internal step in the workflow, the dependencies should be made explicit, just as we did in our original designs. This helps to document what each step actually needs. If the dependencies for a step change, then we can alter the function definition for that step, which in turn will force us to change the implementation.
- For long-running workflows, each step could be a smaller, independent workflow, triggered by an event.
- This is where the state machine model is a valuable framework for thinking about the system. Before each step, the order is loaded from storage, having been persisted as one of its states. The mini-workflow transitions the order from the original state to a new state, and at the end the new state is saved back to storage again.
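A state machine with one type per state might be sketched like this in Scala 3. The shopping-cart states and the transition rules below are assumptions chosen for illustration.

```scala
// Hypothetical cart states; each state carries only the data valid for it.
enum ShoppingCart:
  case Empty
  case Active(items: List[String])
  case Paid(items: List[String], amountCents: Long)

// Transition: adding an item moves Empty -> Active and keeps Active in Active.
def addItem(cart: ShoppingCart, item: String): ShoppingCart = cart match
  case ShoppingCart.Empty         => ShoppingCart.Active(List(item))
  case ShoppingCart.Active(items) => ShoppingCart.Active(item :: items)
  case paid: ShoppingCart.Paid    => paid // a paid cart can no longer change

// Transition: only an Active cart can become Paid.
def pay(cart: ShoppingCart, amountCents: Long): ShoppingCart = cart match
  case ShoppingCart.Active(items) => ShoppingCart.Paid(items, amountCents)
  case other                      => other // nothing to pay for, or already paid
```

There is no boolean "isPaid" flag and no optional amount field: the paid amount exists only in the Paid state.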
chapter 8
- Functions are things. They can be passed as input, returned as output, or passed as a control parameter. Because functions are things, we can put them in a list.
- Currying converts any multiparameter function into a series of one-parameter functions.
let add x y = x + y // int -> int -> int
let adderGenerator x = fun y -> x + y // int -> (int -> int)
- In F#, we don’t need to do this explicitly—every function is a curried function! That is, any two-parameter function with signature ‘a -> ‘b -> ‘c can also be interpreted as a one-parameter function that takes an ‘a and returns a function (‘b -> ‘c), and similarly for functions with more parameters.
- Partial application: If every function is curried, that means you can take any multiparameter function and pass in just one argument, and you’ll get a new function back with that parameter baked in but all the other parameters still needed.
// sayGreeting: string -> string -> unit
let sayGreeting greeting name = printfn "%s %s" greeting name
// sayHello: string -> unit
let sayHello = sayGreeting "Hello"
// sayGoodbye: string -> unit
let sayGoodbye = sayGreeting "Goodbye"
sayHello "Alex" // output: "Hello Alex"
sayGoodbye "Alex" // output: "Goodbye Alex"
- total function: where every input has an associated output (no exceptions).
- One technique is to restrict the input to eliminate illegal values (smart constructor: validation is either in the constructor or in a specific input type).
- Another technique is to extend the output: use Option.
- Workflow composition: sequential and/or parallel
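For comparison, a Scala 3 sketch of the same chapter 8 ideas. Unlike F#, Scala functions are not curried by default: currying is opted into with multiple parameter lists or with `.curried`. The names mirror the F# snippets above, and `safeDivide` is a hypothetical example of a total function.

```scala
// Curried method: one parameter list per argument.
def sayGreeting(greeting: String)(name: String): String = s"$greeting $name"

// Partial application: fix the first argument, get a String => String back.
val sayHello: String => String = sayGreeting("Hello")

// An uncurried two-argument function can be curried explicitly.
val add: (Int, Int) => Int = (x, y) => x + y
val addCurried: Int => Int => Int = add.curried
val add1: Int => Int = addCurried(1)

// A total function: extend the output with Option instead of throwing on zero.
def safeDivide(x: Int, y: Int): Option[Int] =
  if y == 0 then None else Some(x / y)
```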
chapter 9: dependencies
- Function composition has 2 main problems:
- Some functions have extra parameters that aren’t part of the data pipeline, but are needed for the implementation (i.e. dependencies).
- Functions with effects in their output (Result type), cannot be directly connected to functions that accept unwrapped plain data as input.
- Adapter Function, Function Transformer
- Composing functions with different shapes: monads (Reader Monad, Free Monad, etc.), which are not addressed in this book, or partial application.
// pseudocode for partial application
// workflow input: UnvalidatedOrder
// before
ValidateOrder(CheckA, CheckB, UnvalidatedOrder): ValidatedOrder
PriceOrder(CheckC, ValidatedOrder): PricedOrder
// after, with partial application using currying
ValidateOrder2(ValidateOrder, CheckA, CheckB): ValidatedOrder
PriceOrder2(PriceOrder, CheckC): PricedOrder
// real code: pass all dependencies in the composition root
let placeOrder
    checkProductExists // dependency
    checkAddressExists // dependency
    getProductPrice // dependency
    createOrderAcknowledgmentLetter // dependency
    sendOrderAcknowledgment // dependency
    : PlaceOrderWorkflow =
    // function definition
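The same "dependencies first, pipeline input last" shape can be sketched in Scala 3. All types, the two dependencies and the validation rule here are simplified assumptions, not the book's full workflow.

```scala
// Hypothetical, simplified types and dependencies, for illustration only.
final case class UnvalidatedOrder(productCode: String, quantity: Int)
final case class ValidatedOrder(productCode: String, quantity: Int)
final case class PricedOrder(order: ValidatedOrder, totalCents: Long)

type CheckProductExists = String => Boolean
type GetProductPrice = String => Long

// Dependencies come first, the pipeline input comes last.
def validateOrder(checkProductExists: CheckProductExists)(o: UnvalidatedOrder): Option[ValidatedOrder] =
  if checkProductExists(o.productCode) && o.quantity > 0 then
    Some(ValidatedOrder(o.productCode, o.quantity))
  else None

def priceOrder(getProductPrice: GetProductPrice)(o: ValidatedOrder): PricedOrder =
  PricedOrder(o, getProductPrice(o.productCode) * o.quantity)

// Composition root: bake the dependencies in and expose only input => output.
def placeOrder(
    checkProductExists: CheckProductExists,
    getProductPrice: GetProductPrice
): UnvalidatedOrder => Option[PricedOrder] =
  val validate = validateOrder(checkProductExists) // partially applied
  val price = priceOrder(getProductPrice)          // partially applied
  order => validate(order).map(price)
```

The caller of the returned function sees only UnvalidatedOrder in and PricedOrder out; the dependencies stay visible on each internal step.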
chapter 10: effects
- Domain errors: expected as part of the domain model, where the business will already have procedures in place to deal with them.
- Panics: errors that leave the system in an unknown state, such as an NPE, divide by zero and similar.
- Infra: network timeout, authentication error, etc.
- Error handling in the code is ugly => switch functions, often called monadic functions as well: bind or flatMap. So instead of a one-track pipeline, the final result is a two-track pipeline, with a “success” track and a “failure” track. Going from success to failure is possible, but not the other way around.
- This could be combined with a map function to turn one-track functions into two-track functions, where the output of the one-track function is not a Result, Option or similar wrapped value.
- It also means error types must be compatible with one another (e.g. an OR type of several errors).
- ?? The switch function can also be used to handle dead-end functions, like I/O effects
- !!! After all, it finally goes again to Monads and Applicatives.
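A sketch of the two-track pipeline with Scala's `Either`, where `flatMap` plays the role of bind. The `Request` type, the error cases and the validation rules are hypothetical.

```scala
// Hypothetical error type: an OR type, so every step shares one failure track.
enum OrderError:
  case EmptyName
  case InvalidQuantity(value: Int)

final case class Request(name: String, quantity: Int)

// Switch functions: one-track input, two-track output.
def validateName(r: Request): Either[OrderError, Request] =
  if r.name.trim.isEmpty then Left(OrderError.EmptyName) else Right(r)

def validateQuantity(r: Request): Either[OrderError, Request] =
  if r.quantity > 0 then Right(r) else Left(OrderError.InvalidQuantity(r.quantity))

// One-track function, lifted onto the success track with map.
def normalizeName(r: Request): Request = r.copy(name = r.name.trim.toLowerCase)

// flatMap acts as bind: once a step returns Left, every later step is skipped.
def process(r: Request): Either[OrderError, Request] =
  validateName(r).flatMap(validateQuantity).map(normalizeName)
```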
chapter 11: serialization
- DTOs as a contract between bounded contexts
- Serializing your domain model
chapter 12: persistence
- Separate pure functions from I/O and use a root function to pass the results of the pure code on to I/O: I/O -> Pure -> I/O, where the I/O sits at the domain boundaries.
- If there’s too much mixing of I/O and logic, the simple “sandwich” may become more of a “layer cake.” In that case, you might want to break the workflow into shorter mini-workflows, as discussed in Long-Running Workflows. This way each workflow can stay as a small, simple sandwich.
// command handler at the edge of the bounded context
let payInvoice
    loadUnpaidInvoiceFromDatabase // dependency
    markAsFullyPaidInDb // dependency
    updateInvoiceInDb // dependency
    payInvoiceCommand =
    // load from DB
    let invoiceId = payInvoiceCommand.InvoiceId
    let unpaidInvoice = loadUnpaidInvoiceFromDatabase invoiceId
    // call into pure domain
    let payment = payInvoiceCommand.Payment
    let paymentResult = applyPayment unpaidInvoice payment
    // handle result
    match paymentResult with
    | FullyPaid ->
        markAsFullyPaidInDb(invoiceId)
        postInvoicePaidEvent(invoiceId)
    | PartiallyPaid updatedInvoice ->
        updateInvoiceInDb updatedInvoice
- Command and Query separation, applied to FP:
- Functions that return data should not have side effects.
- Functions that have side effects (updating state) should not return data - that is, they should be unit-returning functions.
- !!! Bounded Contexts Must Own Their Data Storage
- No other system can directly access the data owned by the bounded context. Instead, the client should either use the public API of the bounded context or use some kind of copy of the data store.
Command-Query Responsibility Segregation
- It’s often tempting to try to reuse the same objects for reading and writing. For example, if we have a Customer record, we might save it to a database and load it from a database with side-effecting functions like these:
type SaveCustomer = Customer -> DbResult<Unit>
type LoadCustomer = CustomerId -> DbResult<Customer>
- However, it’s not really a good idea to reuse the same type for both reading and writing for a number of reasons.
- First, the data returned by the query is often different than what is needed when writing. For example, a query might return denormalized data or calculated values, but these wouldn’t be used when writing data. Also, when creating a new record, fields such as generated IDs or versions wouldn’t be used, yet would be returned in a query. Rather than trying to make one data type serve multiple purposes, it’s better to design each data type for one specific use.
- A second reason to avoid reuse is that the queries and commands tend to evolve independently and therefore shouldn’t be coupled. For example, you may find that over time you need three or four different queries on the same data, with only one update command. It gets awkward if the query type and the command type are forced to be the same.
- Finally, some queries may need to return multiple entities at once for performance reasons. For example, when you load an order, you may also want to load the customer data associated with that order, rather than making a second trip to the database to get the customer. Of course, when you are saving the order to the DB, you would use only the reference to the customer (the CustomerId) rather than the entire customer.
- Based on these observations, it’s clear that queries and commands are almost always different from a domain-modeling point of view, and therefore they should be modeled with different types. This separation of query types and command types leads naturally to a design where they are segregated into different modules so that they are truly decoupled and can evolve independently. One module would be responsible for queries (known as the read model) and the other for commands (the write model), hence command-query responsibility segregation or CQRS.
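A small Scala 3 sketch of separate write and read types. `SaveCustomer`, `CustomerView` and the in-memory `CustomerStore` are hypothetical stand-ins for a real persistence layer; the mutable store exists only to keep the example self-contained.

```scala
// Hypothetical write model: only what the command needs.
final case class SaveCustomer(name: String, email: String)

// Hypothetical read model: includes a generated id and a calculated value.
final case class CustomerView(customerId: Long, name: String, email: String, orderCount: Int)

// In-memory stand-in for a database.
final class CustomerStore:
  private var rows = Vector.empty[(Long, SaveCustomer)]
  private var orders = Map.empty[Long, Int]

  // Commands: change state and return Unit.
  def save(cmd: SaveCustomer): Unit =
    rows = rows :+ ((rows.size + 1).toLong, cmd)

  def recordOrder(customerId: Long): Unit =
    orders = orders.updated(customerId, orders.getOrElse(customerId, 0) + 1)

  // Query: returns data and has no side effects.
  def findByEmail(email: String): Option[CustomerView] =
    rows.collectFirst {
      case (id, c) if c.email == email =>
        CustomerView(id, c.name, c.email, orders.getOrElse(id, 0))
    }
```

The generated id and the order count appear only in the read type, so each type can evolve without dragging the other along.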
chapter 13: wrap up
- We should try to capture important constraints and business rules in the type system wherever possible. Our motto is “make illegal states unrepresentable”.
- We should also try to design our functions to be “pure” and “total”, so that every possible input has an explicit documented output (no exceptions) and all behavior is entirely predictable (no hidden dependencies).
- Building a complete workflow using only composition of smaller functions.
- Parameterizing functions whenever there’s a dependency, or even just a decision that we want to put off
- Using partial application to bake dependencies into a function, allowing the function to be composed more easily and to hide unneeded implementation details.
- Creating special functions that could transform other functions into various shapes. In particular we learned about bind—the “adapter block” that we used to convert error-returning functions into two-track functions that could easily be composed.
- Solving type-mismatch problems by “lifting” disparate types into a common type.
F# to Scala
// F#
// AND Type
type FruitSalad = {
    Apple: AppleVariety
    Banana: BananaVariety
    Cherries: CherryVariety
}
// OR Type
type FruitSnack =
    | Apple of AppleVariety
    | Banana of BananaVariety
    | Cherries of CherryVariety
type AppleVariety =
    | GoldenDelicious
    | GrannySmith
    | Fuji
// Simple Type
type ProductCode = ProductCode of string
type SimpleName = SimpleName of string
// we cannot confuse different types by mistake
// SimpleName = ProductCode leads to compile Error
// Scala
// AND Type
case class FruitSalad(
  apple: AppleVariety,
  banana: BananaVariety,
  cherries: CherryVariety
)
// OR type Scala 2
sealed trait FruitSnack
case class Apple(variety: AppleVariety) extends FruitSnack
case class Banana(variety: BananaVariety) extends FruitSnack
case class Cherry(variety: CherryVariety) extends FruitSnack
// OR type Scala 3
// SUM of 3 types, where each one is a Product
// (although a degenerate one with 1 argument only)
enum FruitSnack {
  case Apple(variety: AppleVariety)
  case Banana(variety: BananaVariety)
  case Cherry(variety: CherryVariety)
}
enum AppleVariety {
  case GoldenDelicious, GrannySmith, Fuji
}
enum BananaVariety:
  case Cavendish, GrosMichel, Manzano
enum CherryVariety:
  case Montmorency, Bing
// Simple Type
opaque type ProductCode = String
object ProductCode {
  def apply(code: String): ProductCode = code
}
// example usage (enum cases are qualified with their enum name)
val snack = FruitSnack.Apple(AppleVariety.Fuji)
val salad = FruitSalad(AppleVariety.GoldenDelicious, BananaVariety.Cavendish, CherryVariety.Bing)
// Scott Wlaschin to Scala 3
enum CardType:
  case Visa, Mastercard
enum Currency:
  case EUR, USD
object OpaqueTypes: // ':' is the Scala 3 syntax; 'with' comes from a pre-release Dotty build
  opaque type CheckNumber = Int
  object CheckNumber:
    def apply(n: Int): CheckNumber = n
  opaque type CardNumber = String
  object CardNumber:
    def apply(s: String): CardNumber = s
  opaque type PaymentAmount = Float
  object PaymentAmount:
    def apply(f: Float): PaymentAmount = f
import OpaqueTypes.*
case class CreditCardInfo(cardType: CardType, cardNumber: CardNumber)
enum PaymentMethod:
  case Cash
  case Check(checkNumber: CheckNumber)
  case Card(cardNumber: CardNumber)
case class Payment(
  amount: PaymentAmount,
  currency: Currency,
  method: PaymentMethod
)
val cash10EUR = Payment(PaymentAmount(10), Currency.EUR, PaymentMethod.Cash)
val check10USD = Payment(PaymentAmount(10), Currency.USD, PaymentMethod.Check(CheckNumber(123)))
// Simple Types implementation in Scala
// a) type aliases - NO
type CustomerId = Int // + companion object
type OrderId = Int // + companion object
val customerId = CustomerId(42)
val orderId = OrderId(42)
// compiler will treat both as Int
assert(orderId == customerId) // :-( this will compile
val customerId: CustomerId = OrderId(42) // :-( this will compile
val something: Int = CustomerId(42) // :-( this will compile
def display(id: CustomerId): Unit = println(s"customerId=$id")
display(orderId) // :-( this will compile
// b) value classes - AnyVal with one val parameter
class CustomerId(val id: Int) extends AnyVal
// assert will work again, but adding `derives CanEqual` will give an error
display(orderId) // will give an error
// c) case class which `derives CanEqual` works fine too
case class CustomerId(id: Int) derives CanEqual
// d) opaque types, because value classes and case classes have performance issues
// similar to type aliases, but the alias exists only at compile time.
// Not preserved at runtime, do not produce classes
// always a valid model - smart constructor approach.
// private constructor and creation function with validation
// hence if customerId < 0 then Either[Error, CustomerId] = ... , which is a Monad
// alternatively, to get all validation errors use an Applicative
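Option d) combined with a smart constructor could be sketched like this. The enclosing `Ids` object, the `from` and `value` names and the error message are assumptions; the opaque type must live inside an object (or another scope) for the alias to be hidden from callers.

```scala
object Ids:
  // Inside Ids, CustomerId is known to be an Int; outside, it is opaque.
  opaque type CustomerId = Int

  object CustomerId:
    // Smart constructor: the only way to obtain a CustomerId outside Ids.
    def from(n: Int): Either[String, CustomerId] =
      if n < 0 then Left(s"CustomerId must not be negative: $n") else Right(n)

  // Explicit accessor, since callers cannot treat a CustomerId as an Int.
  extension (id: CustomerId) def value: Int = id

import Ids.*
```

Outside `Ids`, `val n: Int = someCustomerId` does not compile, and no wrapper class exists at runtime.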
Insights from Conferences
1
- based on (Servienti, 2018)
- Beware of the distributed monolith :)
- Services (a.k.a. contexts) should be free to evolve without affecting anyone else in the system (ie. be autonomous).
- Example with the “shopping cart”:
- The “shopping cart” does not exist. Every single context has its own view of the “shopping cart” concept.
- So “Sales” owns a subset of the “shopping cart”, while “Warehouse” and “Marketing” own other parts and do not care about the rest.
- Still, there should be an owner of the “shopping cart” concept, though not of all the data in it: “Sales” in this case.
- The UX presents a “shopping cart” view model. In practice this is a cache (a dictionary) with no owner.
- View Model composition and decomposition. No need of complex projections and read models.
- Append-only model
- Zero coupling is utopian. But we could group things in the same boundary based on the coupling: things that change together should be in the same context. Follow the coupling.
- Define service boundaries. Behaviour, not the data itself, defines how to aggregate data and where the boundaries lie.
- Do not bring in more technology to solve non-technical problems. In case of issues go back to the whiteboard.
- The user's mental model can badly influence service design. Users, analysts and PMs tend to think in terms of data representation. They see the “shopping cart”, while developers see the composed view model.
- Do not give names prematurely.
- Use the “anti-requirements” technique for validation (a.k.a. ask stupid questions): “Will there be a case where the last name of the customer is used to apply a discount price?” If NO, then last name should not be in the same place as price!
2
- based on (Richardson, 2016)
- Aggregate:
- a unit of consistency
- cluster of objects that can be treated as a unit
- a graph with a root.
- reference an aggregate only by the identity (id, primary key) of the root. Not by an object reference.
- Do not use foreign keys (object references) in Domain Models - considered parasites
- Transactions:
- should only CRUD 1 aggregate. Processing 1 command by 1 aggregate.
- transaction scope = service
- maintain consistency between services with an event-driven architecture.
- create order in PENDING
- send event to service B
- receive event from service B
- order becomes SUCCESS or FAILURE
- a drawback is that “rollback” logic might need to be implemented: compensating transactions. The impact can be minimized with proper validations.
- How to atomically “update a db” and “send event”?
- use event sourcing. Hence both the DB update and the event sending are one and the same thing. Events become first-class citizens. Events are persisted, not state, and state is calculated from the events on demand.
- JPA will query the events and recreate the state. In FP (Scala), use fold or reduce:
var currentState = fold(applyEvent, initState, events)
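The fold above can be made concrete with `foldLeft`. A minimal sketch, assuming a hypothetical account with two event types:

```scala
// Hypothetical account events; state is never stored, only derived.
enum AccountEvent:
  case Deposited(amountCents: Long)
  case Withdrawn(amountCents: Long)

final case class Account(balanceCents: Long)

// Pure function: previous state + event => next state.
def applyEvent(state: Account, event: AccountEvent): Account = event match
  case AccountEvent.Deposited(a) => Account(state.balanceCents + a)
  case AccountEvent.Withdrawn(a) => Account(state.balanceCents - a)

// Current state is the fold of all events over the initial state.
def replay(events: List[AccountEvent]): Account =
  events.foldLeft(Account(0L))(applyEvent)
```

A mistaken deposit is corrected by appending a compensating Withdrawn event, not by editing the stream.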
13 ways of looking at a Turtle !!!
another simplified explanation
miscellaneous
- On GDPR: one idea is to split public and private data with separate ids, and reference the public one in the private domain object.
- Invariants are business rules that must always hold.
- On statistics, but still useful :) “All models are wrong, but some are useful.” George Box
- Use DDD to create hypotheses, and the Theory of Constraints to validate them and find bottlenecks
- Bounded context = Autonomy
- How to atomically “update a db” and “send event”?
- (me, not a good solution) use @Transactional and first update the DB, then send the event. If the HTTP call fails, it is easier to roll back the JPA code.
- Use:
- ADT to model structures representing concepts and relations inside the model, and make illegal states in the Domain model unrepresentable (compiler error)
- Categories (Option, Either, Future, …) to model effects of operations
- Function Composition to describe business processes (state transitions)
- use Function composition and Value Objects (ADT) + validations/smart constructors to construct Entities (ADT). Whenever an object is constructed it is already valid. See the Validated applicative too.
- states could be case classes with different info, instead of enums
- Partial functions are indistinguishable from total functions by their signatures alone. So maybe do not use them! Solution 1: use Either[L,R]; solution 2: use refined types (positive integer instead of just integer), which will lead to a compile-time error, not a runtime one.
- Try to derive most of the functions (like map from flatmap)
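As an illustration of deriving one function from another, here is a sketch with a hypothetical two-track type `Outcome`: only `flatMap` is written by hand, and `map` is defined in terms of it.

```scala
// A minimal, hypothetical two-track type.
sealed trait Outcome[+A]
final case class Ok[+A](value: A) extends Outcome[A]
final case class Failed(reason: String) extends Outcome[Nothing]

// Primitive: written by hand.
def flatMap[A, B](o: Outcome[A])(f: A => Outcome[B]): Outcome[B] = o match
  case Ok(a)          => f(a)
  case failed: Failed => failed

// Derived: map is flatMap followed by re-wrapping the result in Ok.
def map[A, B](o: Outcome[A])(f: A => B): Outcome[B] =
  flatMap(o)(a => Ok(f(a)))
```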
References
- Richardson, C. (2016). Developing microservices with aggregates. https://www.youtube.com/watch?v=7kX3fs0pWwc
- Schwarz, P. (2020). Scala 3 by example, part 1. https://www.slideshare.net/pjschwarz/scala-3-by-example-algebraic-data-types-for-domain-driven-design-part-1
- Schwarz, P. (2020). Scala 3 by example, part 2. https://www.slideshare.net/pjschwarz/scala-3-by-example-algebraic-data-types-for-domain-driven-design-part-2
- Servienti, M. (2018). Talk Session: All Our Aggregates Are Wrong. https://www.youtube.com/watch?v=KkzvQSuYd5I
- Wlaschin, S. (2018). Domain Modeling Made Functional. Pragmatic Bookshelf.