Written By: Steve Zagieboylo, Senior Architect at Calavista
I recently started a new greenfield project, where the decision was to use a microservices-based architecture. The application was pretty well defined, including most of the data model, since there was a working prototype, so my biggest first concern was how to break it up appropriately into microservices. I’ve participated in this process before, including making some of the mistakes I’ll try to warn you of, so I developed a way to think about the question that helps with the process.
Why Microservices?
There is plenty of documentation out there telling us why a microservices-based architecture is preferable for a large project, so I’ll just hit the highlights.
- Clean, well-defined interfaces make for better code. Sure, you could keep the interfaces this clean in a monolith, maybe if you were the only developer. But you can’t when there are tens or hundreds of developers trying to sneak around them, intentionally or otherwise.
- Separately deployable services allow for quicker turnaround of features and bug fixes. Honestly, I’ve never gotten this far in an application, including this one we just developed. In the early stages, there are still interdependencies, where one service needs certain info from another, so somebody has to create the API, somebody has to make the code that calls it, and it probably the same person doing both. But I truly believe that in a mature version of this product, those sorts of changes will diminish, and most small features will remain internal to a single service.
- Horizontal scaling becomes more efficient. I have seen this already. Some services have so little load that two instances is plenty, whereas some services scale up to a dozen instances. (We don’t drop below two so that a single failure doesn’t shut us down.)
How to Approach the Process
There are no hard and fast rules for how to break up your application into microservices, but these are the things that I’ve found are the key considerations.
Data Model
I tend to be a data-first architect, so, of course, this is where I start. Build a basic Entity-Relationship Diagram (ERD) of your data. This will naturally create some groupings, but it’s the rare data model that isn’t completely connected by relationships. Draw a circle around the major groups. This task is generally obvious, so the rest of this section will focus on how to resolve trickier cases.
Consider whether they are really Joins, or just References
Frequently, you’ll find that you have relationships which are many-to-one, where there is really no reason to connect them at a level through which you can perform a join. For instance, say you have a table of CELL_TOWER, with location, power, date_installed, etc. There is also a table of CELL_PHONE, with a separate table that connects the two. (Even though we assume a phone is only connected to one tower, you put this in a separate table because it changes so fast, and it might be an in-memory table.) You also have a table of TOWER_MAINTENANCE_RECORD.
The TOWER_MAINTENANCE table clearly is in a different microservice from the one CELL_PHONE is in. But which of these two services is CELL_TOWER in?
The answer lies in considering how you’re going to use this data, and where the real join is. Which is more likely: That you’re going to be presenting a list of towers, perhaps filtered or sorted by something to do with its maintenance records? Or that you’re going to present a list of towers filtered or sorted by something to do with the cell phones connected to it? Clearly, the first is far more likely. You might want to get a snapshot of all the phones currently attached to a particular tower, but you don’t need a join to do that. This operation is probably a drill-down from some list of towers, so you already have the tower information in your hand, including its ID. You just need to go to whatever service has the table that connects phones and towers and query against that. This query does want a join to CELL_PHONE, because you’re about to present a list of phones, probably with several columns. The query doesn’t pull anything from CELL_TOWER, though; it just needs the particular tower’s ID to filter against.
Rule of Thumb: When you’re going to be presenting a list of things with filtering and sorting, that’s one important place where you need a real join. The table with the items themselves must join to the tables that have the filtering and sorting parameters, so they have to be in the same microservice.
UI Interaction
Ideally, any given screen of UI Interaction should interact with only one or two microservices. (Yes, I know we don’t have independent screens anymore, but you know what I mean.) For example, say you have one wrapper screen with everything about a PATIENT. Within it are tabs with different information about the patient: one with a list of VISITs and a drill-down on VISIT; another with a BILLING_AND_PAYMENTs; etc. Almost certainly, those two tabs are accessing two different microservices, and the wrapper with basic patient information (name, date of birth, …) is possibly a third. However, this does not violate my rule of thumb, because you can think of the different tabs as different “screens,” given the somewhat expanded definition of “screen.” In any case, as a rule of thumb, it’s pretty weak as it gets violated frequently. Its value is that it might tip the balance when you are unsure which service should hold a particular table, consider what the UI is interacting with.
The latest trend in microservices is to create Microservice UI, where the UI components that interact with a microservice live in that service and are only included as components in the larger framework. If, for instance, another column is added to one table in one service, then the UI to present and to edit that data is guaranteed to be in the same service. This targeted UI is included as a component, perhaps in several places in the application, but none of those places need to change. I admit that I’ve not used this approach in a real application, yet, but I’ve experimented with it. My experience was that I still needed to make changes in the wrapper to account for changes in the size and geography of the component. However, my experiment was not developed with a reactive UI, and most of the issues I saw would not have been problems if it were.
Rule of Thumb: Try to break up your microservices such that any given “screen” of the UI is interacting with no more than two services. When you do find yourself violating this rule, do so knowingly and have a good reason.
Rule of Thumb: Any UI component should interact with only a single service. Consider putting that UI code in the same repo as the back end code for that service.
Don’t Overdo It
The most common mistake I’ve seen (and done myself) is to go overboard in making lots of nanoservices. (A term I thought I had just made up, but a quick Google search shows that I’m not the first one.) Just because this one table of PATIENT_COMMENTS is only accessed on a screen dedicated CRUD of those comments, and there are no real joins to it because it is just a drill-down from a Patient page doesn’t mean that needs its own microservice. Creating a microservice adds some overhead in mental load for understanding the overall application, that somewhat offsets the mental load of isolating a section cleanly. The net result should be to decrease the mental load, not increase it.