2025-06-18 11:46

BA Techniques Explained: Business Domain Model and Logical Data Model

Business analysis

In this essay, I’ll share my perspective on two closely related models in business analysis — the domain model (aka business domain model) and the logical data model. The first one can eventually give rise to the second, and both are essential structural visual models that any self-respecting analyst would be wise to keep in their toolkit.

Let me clarify right away: I’m a strong advocate for distinguishing between the terms “model” and “diagram” in most contexts. A model is something that is a result of a BA’s work; it’s what we deliberately construct to provide insight or value. A diagram, on the other hand, is a tool we use to represent that model. You could call them differently, but here’s an example: take the popular UML modeling notation, where you have things like “Use Case Diagram,” “Activity Diagram,” etc. With the Use Case Diagram, its purpose is fairly clear — it’s hard to misapply it, even if some approaches distinguish between business and system use cases. But when it comes to drawing, say, an Activity or Class Diagram, the results vary wildly depending on the BA’s level of knowledge and imagination. Some overly enthusiastic analysts even start sketching out software class structures.

What’s truly helpful to understand is that diagrams like the UML Class Diagram are just hammers: it’s all about using them to hit the right nails. And in this context, the right nails are those two models I mentioned earlier. A UML Class Diagram isn’t the only suitable hammer. There are ERDs (in various notations), IDEF1X diagrams, and several lesser-known ones. But for our purposes, we’ll focus on these models through the lens of UML Class Diagrams, which I’m especially fond of. Let’s dive in.

Exposition. You've just kicked off a new project and are relentlessly questioning the client, trying to decipher their latest scheme for world domination. Gasping for clarity, you’re teasing out fragments of their vision and requirements, navigating through a jungle of unknown business terms, rules, policies, and more.

Let’s bring this to life. Imagine your client wants to create a website guide for the enchanting world of The Witcher by Andrzej Sapkowski, aiming to monetize it via premium subscriptions (we’ll skip modeling the monetization bit to keep things simple). Suppose you’ve understood the target audience and the general idea, and now you want to deeply immerse yourself in the world — not just to churn out content, but to really grasp what the client wants to include and how to advise them strategically.

This means, at some point, you'll need to study the domain (also known as the business domain). According to our beloved BABOK, a domain is a sphere of knowledge that defines a set of common requirements, terminology, and functionality for any program or initiative solving a problem. Well… they tried. In my own words: it’s a field of knowledge directly tied to a Change (remember the BABOK’s Core Concepts), defined by its shared information, terminology, and common characteristics.

In our example, the domain is The Witcher universe. In your project, it might be insurance in China, healthcare in Zimbabwe, or food delivery — anything that represents the business side of your project (i.e., the client’s or users’ activities).

Immersion in the domain:

You’ve decided to dive into the domain — to understand the world Mr. Sapkowski crafted so you can meaningfully participate in ideation with the client. What do you need to do? Naturally, study all available sources, starting with the originals. Sure, you could ask your client to tell you a bedtime story, but that’s hardly professional. If I were the client, I’d politely (or not) ask why you can’t study the materials first before asking questions.

So, you dive into the books, then the games, and even suffer through the Netflix show in the evenings, groaning at how poorly they adapted the material. Your main challenge here is to document the knowledge you gain. You need to structure and organize it — filter out the fluff and tie everything together into a coherent picture. Trust me, you’ll end up with a mountain of information. If your approach is to just “absorb and recall” whatever sticks, it’s time we talked about your qualifications. You need some form of structured distillate from all the materials (we’ll call this a “summary” for simplicity).

Let’s abstract away from our example. Suppose you've used your full arsenal of information-gathering techniques: interviews, document reviews, web research, observation, etc. Now you have a final set of domain knowledge. What form is it in? A chaotic mess of notes? Good luck finding answers later. You'll have to re-analyze everything. It's clearly better to organize it for usability. Beyond just presenting it in structured text form, I’m a fan of two approaches in such situations:

1. Mind Maps

A simple yet incredibly powerful way to present information as a tree structure. As you study the material, you note down key points hierarchically, with each level of branches representing a deeper level of detail. You continuously reconcile new info with previous notes, refining them as needed. It’s intuitive and, with some practice, becomes a go-to method for structured knowledge capture.

2. Domain Model

What is it? A visual model (diagram) that uses boxes and arrows to represent key concepts in the domain (its entities, terms, actors, etc.) and how they relate. It reveals the structure of the domain.

How to build it?

As you study your information set, identify the key nouns (concepts, objects, actors — we’ll call them domain entities) and place them as elements (classes, if we’re using UML Class Diagrams). Then pick out the verbs — actions, processes, events, or relationships — and connect the entities using them (we’ll call these relationships).

At its simplest, that’s all there is to it.

Let’s go back to our earlier Witcher example. Say you started by reading general articles online. Here’s a summary excerpt:

The series’ main character is Geralt of Rivia, a witcher — a monster slayer protecting humans. As a child, he was taken to Kaer Morhen, a keep where witchers undergo mutations enhancing their speed, stamina, and poison resistance. He is known by nicknames like the White Wolf (Gwynbleidd), the White-Haired Witcher, and the Butcher of Blaviken. His main job is slaying monsters for coin. His life changes when he becomes guardian to a girl named Ciri, who is hunted by both Northern Kingdoms and the Nilfgaardian Empire. Geralt seeks to protect her and navigates his complex relationship with the sorceress Yennefer.

This is already a condensed version, but let’s turn it into a model using the approach above. I won’t detail every step of the transformation, but I invite you to compare the text with how it maps onto the diagram.

You’ll notice I also added comments — some of the questions that might naturally arise when reading this for the first time without prior knowledge of the domain. And that leads us to the first and arguably most important benefit of this modeling approach: it helps a BA analyze information. Just reading the text, I doubt I would’ve immediately thought of those crucial follow-up questions. You’ll soon see even more advanced modeling aspects that support deeper analysis.
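The noun-and-verb approach can also be sketched in code. The snippet below is one possible reading of the summary excerpt, not a transcription of the diagram; the entity names and verb labels are my own picks from the text, and the `Association` class and `dangling` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """A labelled arrow between two domain entities: source -verb-> target."""
    source: str
    verb: str
    target: str


# Nouns picked out of the summary excerpt become domain entities.
entities = {"Geralt of Rivia", "Witcher", "Monster", "Kaer Morhen", "Ciri", "Yennefer"}

# Verbs become named relationships between those entities.
associations = [
    Association("Geralt of Rivia", "is a", "Witcher"),
    Association("Witcher", "slays", "Monster"),
    Association("Witcher", "undergoes mutations at", "Kaer Morhen"),
    Association("Geralt of Rivia", "is guardian of", "Ciri"),
    Association("Geralt of Rivia", "has a relationship with", "Yennefer"),
]


def dangling(entities, associations):
    """Return relationship endpoints that were never declared as entities."""
    used = {a.source for a in associations} | {a.target for a in associations}
    return used - entities


for a in associations:
    print(f"{a.source} --{a.verb}--> {a.target}")
```

A non-empty result from `dangling(entities, associations)` would point at a concept that appears in a relationship but has not been captured as an entity yet, which is often a follow-up question in disguise.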

Advanced modeling:

1) More sophisticated relationships, beyond the basic arrow (which, by the way, is called an association in UML). UML provides three types of relationships that are particularly useful to analysts and have more specific meanings than the universally applicable association:

Generalization or inheritance. This is a relationship between two elements indicating that the descendant is a further specification of the parent — that is, the element it inherits from. In other words, the child is essentially the same as the parent, but with its own specifics. It is represented by a solid line with an empty triangle at the end. For example, suppose we have a class Student of Learning Center X. A Student of Learning Center X is a Student, but we’ve added a new qualifying parameter — the learning center. Between Student and Student of Learning Center X, we can draw a generalization relationship. And we can continue the chain: Student of Learning Center X → Student of BA Courses at Learning Center X. So, as we move deeper, the classes become more and more specific and less abstract. By the way, inheritance implies that the child class adopts all properties (attributes) of the parent class, but typically adds something of its own.
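For readers who think in code, the Student chain maps onto class inheritance. This is only an illustration of the idea of generalization, not a suggestion to design software classes in a domain model; the class and attribute names are assumptions made up for the sketch.

```python
from dataclasses import dataclass, fields


@dataclass
class Student:
    name: str


@dataclass
class LearningCenterStudent(Student):
    """A Student, further specified by the learning center they attend."""
    learning_center: str


@dataclass
class BACourseStudent(LearningCenterStudent):
    """A LearningCenterStudent, further specified by the course."""
    course: str = "Business Analysis"


s = BACourseStudent(name="Alice", learning_center="Learning Center X")

# The most specific class carries every attribute of its ancestors plus its own.
inherited = [f.name for f in fields(s)]
print(inherited)
```

Each step down the chain adds one qualifying attribute, and an instance of the most specific class is still a valid `Student`.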

Aggregation and composition. Both of these represent a “whole-part” relationship. In simple terms, they describe one element being included in another. If aggregation or composition goes from A to B, this means that A is a constituent part of the whole B. What’s the difference between the two? Aggregation is shown as a solid line with an unfilled diamond at the end that touches the whole; composition, with a filled diamond. But that’s not all. Aggregation is considered a weak relationship, while composition is strong. If the conceptual destruction of the parent object leads to the destruction, literal or figurative, of its children (this could be physical destruction, marking as removed, archiving, or anything else that signifies destruction in the domain), then it’s a strong relationship. If the child objects continue to exist, the relationship is weak.

Let’s look at some examples. What is aggregation? Example: Student – Group. Suppose a student in a learning center attends both a Business Analysis course and a Database course. Now ask: if the BA course group is disbanded, do all its students lose their meaning as entities — are they destroyed as objects? No. A person still remains a student of the learning center and might be in another group or none at all — just recorded in the system as a past or future student. Therefore, this is a weak relationship.

Now, what is composition? Composition is when the components of a system are inseparable from it and have no meaning outside of it. For instance: Course of Learning Center X → Learning Center X. If the learning center is dissolved, the concept of its courses ceases to exist too. All courses are conceptually destroyed as well. This is a strong relationship.
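The difference between the weak and the strong relationship shows up in what happens to the parts when the whole goes away. Here is a minimal sketch under stated assumptions: the method names `disband` and `dissolve` and the `active` flag are hypothetical, chosen only to make “conceptual destruction” visible.

```python
class Student:
    def __init__(self, name):
        self.name = name


class Group:
    """Aggregation: the group refers to students but does not own them."""

    def __init__(self, title):
        self.title = title
        self.students = []

    def enroll(self, student):
        self.students.append(student)

    def disband(self):
        # Only the membership links are dropped; the students live on.
        self.students.clear()


class Course:
    def __init__(self, title, center):
        self.title = title
        self.center = center
        self.active = True


class LearningCenter:
    """Composition: courses are created by the center and end with it."""

    def __init__(self, name):
        self.name = name
        self.courses = []

    def add_course(self, title):
        course = Course(title, self)
        self.courses.append(course)
        return course

    def dissolve(self):
        # Destroying the whole conceptually destroys every part.
        for course in self.courses:
            course.active = False
        self.courses.clear()


alice = Student("Alice")
ba_group = Group("BA course group")
ba_group.enroll(alice)
ba_group.disband()

center = LearningCenter("Learning Center X")
ba_course = center.add_course("Business Analysis")
center.dissolve()
```

After `disband()`, `alice` is untouched and could join another group. After `dissolve()`, `ba_course` is marked inactive because it has no meaning outside its center.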

2) Multiplicities on relationships that clarify their nature. Multiplicity on a relationship shows how many objects, or class instances, may exist at each end of the relationship.

Let’s take the Student – Group example again, where there is aggregation between them. Let’s define multiplicity at both ends of that aggregation. What questions should we ask to determine this? For multiplicity near the Student class, we ask: how many Students can belong to one Group? (Substitute the name of your own relationship for “belong to.”) The answer (let’s say) is 10 to 16.

Now for the multiplicity near the Group class, the question is similar but reversed: how many Groups can one Student belong to? We've reversed the relationship direction. The answer: anywhere from zero (and yes, they’re still a Student; perhaps they enrolled in a program that doesn’t require group participation, which is exactly why we used aggregation, not composition) to any number with no known upper limit. The unbounded upper limit is denoted by an asterisk (*), so the multiplicity reads 0..*.
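The two multiplicities from this example can be written down as simple ranges and checked against actual counts. A minimal sketch follows; the `Multiplicity` class is a hypothetical helper, and `None` is my stand-in for the asterisk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Multiplicity:
    lower: int
    upper: Optional[int]  # None stands for "*", i.e. no known upper limit

    def allows(self, count):
        """Check whether a concrete number of instances fits the range."""
        return count >= self.lower and (self.upper is None or count <= self.upper)

    def __str__(self):
        upper = "*" if self.upper is None else str(self.upper)
        return f"{self.lower}..{upper}"


# How many Students can belong to one Group?
students_per_group = Multiplicity(10, 16)

# How many Groups can one Student belong to?
groups_per_student = Multiplicity(0, None)

print(students_per_group, groups_per_student)
```

Here `students_per_group.allows(9)` is false, which already hints at a business question: what happens to a group that drops below ten students?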

3) Grouping elements by thematic areas or spreading them across different diagrams to reduce clutter and improve readability. I suppose this is self-explanatory.

Let’s try to apply these extra features to the previous model. I’ll remove the earlier questions (assuming we’ve answered them or at least noted them — no need to clutter the diagram now) and add some new ones prompted by the added modeling aspects, as well as additional info obtained.

As you can see, the models:
a) became more readable (some might disagree though),
b) became more informative (assuming one can read specific relationships and multiplicities),
c) raised a number of additional questions.

A few practical tips:

Name associations according to the direction of the arrow (for associations only — the other types of relationships are generally self-evident and don’t usually need labels).
Use multiplicities on relationships (except generalizations) wherever they might add even a bit of value. This is additional information that, when properly analyzed, can provide tons of insight and serve as a source for useful business rules (e.g., “An order cannot be empty when submitted to the company”).
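To show how a multiplicity turns into a business rule, here is a small sketch of the order example. It assumes the Order to Item relationship has a lower bound of one at submission time; the `Order` class and its methods are hypothetical.

```python
class Order:
    """An order that must contain at least one item when it is submitted."""

    def __init__(self):
        self.items = []
        self.submitted = False

    def add_item(self, item):
        self.items.append(item)

    def submit(self):
        # The lower bound of the assumed 1..* multiplicity becomes a rule.
        if not self.items:
            raise ValueError("An order cannot be empty when submitted")
        self.submitted = True
```

Note that the rule only applies at submission: a draft order with zero items is still allowed, which is exactly the kind of nuance worth confirming with stakeholders.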

Now I can take these models and either:

bring them to a stakeholder meeting, presenting and explaining them to confirm my understanding of the domain or to get useful feedback, or
present them to the team to convey my understanding of the domain, clarifying any ambiguous points during the presentation.

This is the second major goal of models like this — and all visual models, really: to simplify communication with stakeholders. Not in the sense that I’ll send the diagram with a “figure it out yourself” message and expect instant mutual understanding, but in the sense that showing this visually will make discussions around the areas I’m interested in — or confirming my understanding — much more effective than if I were to just explain everything I’d read on my own.

And going forward, I can either discard these models once they’ve served their purpose (though that feels a bit wasteful), or I can continue to maintain them as I study more books, games, or the TV show, expanding and updating them as needed. Naturally, with deeper exploration, the models will grow many times over. The key, as with all such techniques, is to learn to spend just the right amount of time on these artifacts — so the value they generate justifies the effort. You can only find this balance through trial and experience. And of course, learning the tools to work fast and confidently is always helpful.

After countless sleepless nights and completed phases of business analysis, you move on to developing solution requirements. As many know, solution requirements come in two flavors: functional and non-functional. But what fewer people are aware of — and even fewer apply in practice — is the fact that functional requirements can lie in two dimensions (or, to put it another way, it’s helpful to view them from two perspectives): the system's behavior (functionality) and the data that this behavior operates on.

Remember that the IT analyst’s focus is on information systems? The term “information” isn’t there by chance — it’s the cornerstone of software solutions. These systems operate on information, on data. So, a rather obvious but important conclusion follows: when dealing with requirements, it's helpful both for yourself and others to see what kind of data or information your solution will handle.

Let’s set aside the debate over whether you should be dealing with this layer of requirements at all. It depends on your business analysis approach combined with the project methodology, your view on what constitutes complete requirements, the position of the planets in the sky, etc. What truly matters is that it should be a conscious decision, not one forced by ignorance or lack of knowledge.

But let’s say you do decide to dive into this part. So, what can help you here?

A solid foundation is a model called the logical data model.

Since the analyst’s view ideally shouldn’t be clouded by implementation details, this model is called logical, meaning it’s not tied to physical implementation. The analyst doesn’t know in advance whether the information will be stored in a database (and if so, will it be a relational one or not), or just dumped into text files. Or maybe the data won’t be stored at all, but rather hard-coded directly into the system.

The good news is that a logical data model:
a) often evolves naturally from your domain model — many of the same entities and attributes might appear here too (which makes sense, since the solution automates the domain),
b) is built using the same diagrams (for instance, UML Class Diagrams are perfect here, which we’ll explore further),
c) uses the same modeling techniques we’ve already studied.

And what’s more — it’s usually very well received by the development team, who’ll shower you with gratitude for including such an artifact in your requirements.

The bad news? You need to approach this model far more carefully and rigorously than the domain model. If the domain model is your creative interpretation of the domain (and it’s entirely normal to have radically different but equally valid models for the same domain), then the logical data model shouldn't have that kind of variability. If it does, it likely signals a flaw in your requirements — one that will come back to bite you.

Let’s move on to our example. Imagine that, after several iterations involving pliers and blowtorches applied to the client, you’ve determined that the website should be a catalog of articles describing the Witcher universe, with authenticated users able to leave comments on them. Articles are organized by category. Any user (authenticated or anonymous) can view articles, but only admins can create, edit, and otherwise mess with them. A simple example, but let’s complicate it a bit: comments must go through pre-moderation by an admin after posting, with the comment author notified of the result.

This is, however, the kind of case where the contents of the data model will hardly overlap with the domain model. The Witcher universe, with all its inner workings, gets transformed into different concepts within the context of the software system. Into what exactly? Let’s revisit the system description and think about the kinds of information the solution will manage. There will be articles, and those articles will have categories. You could argue that category is just another atomic attribute of an article, like its title. But let’s add some context: the client wants admins to manage categories (CRUD), including the order they appear in the site menu. Now that atomic attribute becomes a more complex component and, therefore, a separate entity in the model. Articles can have comments. The system will have users: after all, you need to handle registration, authentication, and authorization, which means storing a certain amount of information per user. And if we assume comment feedback is delivered via an internal notification system, then notifications are also a part of the system.

Let’s translate all that text into a model and use some of our familiar modeling tricks. It might look something like this:

A few notes on the model:

1) Attributes now have data types, unlike in the domain model. In fact, the domain model often didn’t even have attributes in classes unless necessary for clarity. The data types here are logical — described in plain language from the analyst’s point of view (you can choose the terminology that works best for you and your team). Later, in the physical data model, these will be converted into whatever implementation-specific types your team needs — char, string, int64, and so on. But as we said earlier, let's stay above the implementation for now.

That said, detailing attributes and their data types is a useful and important part of data requirements. Just keep in mind that if you're also maintaining a data dictionary, repeating attribute details in the model might be redundant.

Two interesting types you can use in this model (or replace with others depending on your team’s preference):

Enumeration – a predefined, finite set of values. Unlike plain text, where the value can be anything, enumerations are fixed. For example, a Comment’s Status might always be one of: “Approved”, “Rejected”, or “Pending”. You’d list these specific values in your data dictionary or supplementary documentation.
Data types matching entity names – for example, the Author attribute of a Comment has a data type of User. What’s this about? Well, think about it: when we say a Comment has an Author, what do we mean? A text value? A login? You could show the relationship using just an atomic attribute (e.g., a login that links comments to users), but I personally prefer this approach: showing that the Author of a Comment is an instance of the User entity, with all its internal structure. Thus, the attribute has a composite type — namely, the entity User.
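Both of these types have direct counterparts in code, which may help when explaining them to the team. The sketch below is an illustration only: the `login` attribute and the default status of “Pending” are assumptions (the latter follows from the pre-moderation requirement), and nothing here prescribes a physical implementation.

```python
from dataclasses import dataclass
from enum import Enum


class CommentStatus(Enum):
    """Enumeration: a predefined, finite set of values."""
    PENDING = "Pending"
    APPROVED = "Approved"
    REJECTED = "Rejected"


@dataclass
class User:
    login: str  # hypothetical attribute, used only for illustration


@dataclass
class Comment:
    text: str
    author: User  # the data type is the User entity itself, not a text value
    status: CommentStatus = CommentStatus.PENDING


comment = Comment(text="Great article!", author=User(login="dandelion"))
print(comment.author.login, comment.status.value)
```

Because `author` is a `User`, everything the model says about users is reachable from a comment, whereas a plain-text author would carry none of that structure. Likewise, `CommentStatus("Archived")` raises a `ValueError`, since that value is not in the predefined set.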

2) Some comments on relationships:

Comment–Article: This is a composition, because a Comment doesn’t make sense without its parent Article. If an Article is deleted/archived/etc., its comments are purged too. A Comment only exists under one article. An Article, however, can have no Comments or many.
Article–Category: A bit more nuanced and reflects a subjective take on requirements (which, in a real project, should be clarified with the client). This is an aggregation, because deleting a Category might just move the Article to an “uncategorized” group or prompt a Category change — not delete the Article. Thus, Articles can exist in or outside Categories. Unassigned ones might not appear to regular users until categorized.
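The two relationships differ mainly in their delete behavior, which a short sketch makes concrete. It assumes an Article belongs to at most one Category (the text leaves this open) and that deleting a Category leaves its articles uncategorized; the function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Category:
    name: str


@dataclass
class Comment:
    text: str


@dataclass
class Article:
    title: str
    category: Optional[Category] = None  # aggregation: may be uncategorized
    comments: List[Comment] = field(default_factory=list)  # composition: 0..*

    def add_comment(self, text):
        # A Comment is only ever created under exactly one Article.
        comment = Comment(text)
        self.comments.append(comment)
        return comment


def delete_category(category, articles):
    """Weak relationship: the articles survive and become uncategorized."""
    for article in articles:
        if article.category is category:
            article.category = None


def delete_article(article, articles):
    """Strong relationship: the comments are purged together with the article."""
    article.comments.clear()
    articles.remove(article)


bestiary = Category("Bestiary")
griffin = Article("Griffin", category=bestiary)
griffin.add_comment("Needs a section on weaknesses.")
articles = [griffin]

delete_category(bestiary, articles)
still_listed = griffin in articles

delete_article(griffin, articles)
```

After `delete_category`, the article is still in the catalog with no category; after `delete_article`, both the article and its comments are gone.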

3) Admin and Reader are modeled as separate entities to show that only a specific kind of User interacts with comments and notifications. Otherwise, this distinction wouldn’t be necessary, since they’re just users with different role values.

So what can I do with this model? The goals are the same as for the domain model: to facilitate your own analysis and improve communication. More specifically:

Doing this kind of modeling will raise many helpful questions: Can an article exist without a category? If so, how should the system handle it? What content types are allowed in articles? What happens to comments when an article is deleted? What comment statuses are there? What information must users provide for entity X? And so on. Each new perspective on requirements reveals insights you wouldn’t catch otherwise. Here, we’re looking at requirements through the lens of data. Think about system behavior, security, implementation constraints, etc., and new knowledge (and new questions) will emerge.

It’s also a requirements artifact — something you hand over to those implementing the solution and later verifying it. Include it in your specs, upload to Confluence, reference it in your stories — these are additional requirements that increase the likelihood of delivering value through business analysis. It also helps with requirements traceability, reducing the risk of incomplete or poor-quality requirements. For instance, if you start with the data and build behavior around it, you’re less likely to forget something important — or build something unnecessary. Example: you decide articles need a Date and Name. Later, you design the CRUDL functionality for articles. If admins can edit articles, this model will remind you whether editing includes changing the name or just the content. It may even lead you to question: why do we need a publication date? If it never shows up in behavior, maybe drop it? Or maybe it should be displayed on the UI?
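The traceability idea can be approximated with a simple cross-check between the attributes in the data model and the attributes each piece of behavior touches. This is a rough sketch with made-up behavior names; only Name, Date, and Content come from the example.

```python
# Attributes of the Article entity in the logical data model.
article_attributes = {"Name", "Date", "Content"}

# Hypothetical behaviors and the attributes each one reads or writes.
behaviors = {
    "Create article": {"Name", "Content"},
    "Edit article": {"Name", "Content"},
    "List articles": {"Name"},
}


def unused_attributes(attributes, behaviors):
    """Attributes in the data model that no behavior ever touches."""
    used = set().union(*behaviors.values())
    return attributes - used


def undefined_attributes(attributes, behaviors):
    """Attributes a behavior relies on that the data model does not define."""
    used = set().union(*behaviors.values())
    return used - attributes


print(unused_attributes(article_attributes, behaviors))
print(undefined_attributes(article_attributes, behaviors))
```

With these inputs the first check flags Date, which is precisely the prompt to ask whether the publication date should be dropped or shown somewhere in the UI.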

And let’s not forget how useful this model can be when coordinating with external stakeholders. Sure, it’s not beginner-friendly (and to the untrained eye, nearly incomprehensible), and you might have to guide their focus. Still — though it’s not an essential part of every project — there are times when discussing requirements over a beer, using this lens, reveals key nuances. In fact, I once had a project fail precisely because such a model wasn’t created and key details — like class multiplicities — weren’t clarified. So yeah, it matters.

These are just two powerful mini-techniques. Of course, we’ve left out many practical nuances about when and how to use each model — but it’s best to just try it out in the field.