> In practice microservices can be just as tough to wrangle as monoliths.
What's worse: Premature scalability.
I joined one project that failed because the developers spent so much time on scalability, without realizing that some basic optimization of their ORM would have been enough for a single instance to handle any predictable load.
Now I'm wrangling a product that has premature scalability. It was designed with a lot of loosely coupled services and a high degree of flexibility, but it's impossible to understand and maintain with a small team. Much of the "cleanup" ends up merging modules or cutting out abstraction.
The company I'm at is a well-funded startup that doesn't receive humongous traffic at all. Yet the so-called engineers in the early days decided to split every little piece of functionality into a microservice.
Now we have 20+ microservices that are set up together in a fucked up way. Today, every engineer on our 150+ person engineering team struggles to implement trivial stuff and get it over the finish line.
Many tasks require making code changes in multiple codebases, and there are way too many moving parts. The knowledge overhead required to set up and test shit locally is too high as well. The documentation goes obsolete quickly, and people spend an obscene amount of time reaching out to other teams and running in circles to get unblocked. Our productivity would literally 5x if we just had 3 or 4 services overall. Even 1 giant service with clear abstractions between teams would have worked well, actually.
Yet, for the flashiness and to keep sounding cool, the folks at our company keep living with the pain. As an IC, I just fucking do my work 5 hours a day and keep reminding myself to ignore whatever horrors I see. Seems to be going well.
IMO: Either advocate to merge services, or find a new job.
As professionals, part of our job is to advocate for the right solution. Often there is a leader who is emotional about their chosen framework / pattern / architecture / library. It's merely a matter of speaking truth to the right power, and then getting enough consensus to push on the leader to see the error in their ways (or to get the other leaders to push the bad leader out.)
In your job, what you really need to do is point out to non-technical leaders that development of new features is too slow, given the current team size. Don't get too technical with them, but try to find high-level analogies. You can also work on your more direct leadership, but that requires feeling out their motivations.
Thank you for the note. I've actually tried many times to bring up the barriers that are major problems in our company, but everything seems to be subject to a lot of politics. No one seems to be willing to listen, and at one point it started affecting my morale and peace of mind. My personal situation is currently such that I am unable to switch jobs, though. I am willing to take a break and find something new later next year.
I’ve started taking 5 layers out of a Rails app, back to MVC. It’s so much faster now I actually feel bad, and I’m not the one that built the app in the first place. The premise during its construction was that it would scale to millions of active users. It…is not doing that in the wild…
As an ex-FAANG engineer myself, I have never advocated for more services; I've usually pushed for unifying repos and multiple build targets in the same codebase. I am forever chasing the zen of google3 the way I remember it.
If anything my sin has been forgetting how much engineering went into supporting the monorepo at Google and duo-repo at Facebook when advocating for it.
There seems to be more interest in building monorepo support now: some tools, startups, etc. I would bet GitHub is working on better support for large repos as well. So I think Google was ahead of the curve there.
The main ones were www which contained most of the PHP code and fbcode which contained most of the other backend services. There were actually separate repos for the mobile apps also.
Elixir + Phoenix is so great at this with contexts and eventually umbrella apps. So easy to make things into apps that receive messages and services with a structure. I’m amazed it isn’t more popular really given it’s great at everything from runtime analysis to RPC/message passing to things like sockets/channels/presence and Live View.
I've been picking up Elixir and the Phoenix framework and I'm impressed so far. Phoenix is a very productive framework. Under the hood Elixir is very Lisp-like, but the syntax is more palatable for developers who are put off by the syntax of Lisp.
Why isn't it more popular? It's always an uphill battle to introduce a new programming language or framework if BIGNAME doesn't use it.
The Elixir shop I was at, folks just REPL'd into prod to do work. Batshit insanity to me. Is that the Elixir way? Are you able to easily lock down all writes and all side effects and be purely read-only? If so, they never embraced that.
It's an order of magnitude harder to debug when you don't have access to prod, but there's a reason to block that access. I think you need to put controls on that fairly early in your project's evolution.
Any good strategies to reduce the pain? My previous employer never solved this.
I always wanted to explore contextual logging - by which I mean, logging is terse by default, but in an error state, the stack is walked for contextual info to make the log entry richer; and also, ideally, previous debug log entries that are suppressed by default are instead written. I guess that implies buffering log entries and writing only a subset at the end of the happy path.
To illustrate what I mean, a happy path log:

    10:21:04 Authenticated
    10:21:05 Scumbulated

Error condition log:

    10:21:04 Authenticating id 49234 request DEADBEEF
             IDP response OK for 49234 request DEADBEEF https://idp.dundermifflin.com
             Session cookie OK request DEADBEEF
             Authenticated
    10:21:05 Scumbulating flange 7671529 user 49234 request DEADBEEF
             NullFlangeError flange 7671529
               at scum.py:265
               Frame vars a=42, password=redacted, flags=0x05
I'm reacting to hard-to-repro bugs at $employer where we chucked logging statements at a dartboard, deployed, waited, didn't capture the issue, repeat several times. At a cadence of 5-10 deploys a week, this is below what I consider acceptable velocity. We often took days to fix major bugs, we'd run degraded for weeks at a time.
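For what it's worth, here is a minimal sketch of the buffering idea described above, using only Python's standard library. The handler name and thresholds are my own choices, not an established API: INFO and above go straight through (the terse happy path), DEBUG is held in a ring buffer, and the buffered context is replayed only when an ERROR shows up.

    import collections
    import logging
    import sys

    class ContextualHandler(logging.Handler):
        """Terse by default: INFO+ is written immediately, DEBUG is buffered,
        and the buffered context is replayed when an ERROR arrives."""

        def __init__(self, target: logging.Handler, capacity: int = 200):
            super().__init__(level=logging.DEBUG)
            self.target = target                       # the handler that actually writes
            self.buffer = collections.deque(maxlen=capacity)

        def emit(self, record: logging.LogRecord) -> None:
            if record.levelno >= logging.ERROR:
                for buffered in self.buffer:           # replay the suppressed context
                    self.target.handle(buffered)
                self.buffer.clear()
                self.target.handle(record)
            elif record.levelno >= logging.INFO:
                self.target.handle(record)             # terse happy-path line
            else:
                self.buffer.append(record)             # DEBUG: keep it, don't write it yet

    log = logging.getLogger("scum")
    log.setLevel(logging.DEBUG)
    log.addHandler(ContextualHandler(logging.StreamHandler(sys.stderr)))

    log.debug("Authenticating id 49234 request DEADBEEF")   # buffered, not printed
    log.info("Authenticated")                                # printed
    log.error("NullFlangeError flange 7671529")              # prints the buffered debug line, then this

The standard library's logging.handlers.MemoryHandler is in the same spirit: it buffers records and flushes them to a target handler once a record at or above its flush level (ERROR by default) arrives, though it doesn't let INFO through immediately the way this sketch does.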
That sounds like something that will not scale, because people make mistakes. Our DBA at $prev_gig was very skilled. And also truncated a production database. Mistakes happen. Write access to prod all willy-nilly gives me the heebie-jeebies.
It depends on the situation: if something is broken in a live system and you can log in and introspect the real thing, this is awesome. Obviously there are trade-offs, and potentially you might break things further!
I'm used to introspecting live data with read-only access. They used write access and were an accidental keystroke away from deleting. Writing in prod should take permission escalation.
In a nutshell, a Django project is made up of 'apps', and you can 'install' multiple apps together. Each app can come with its own database tables + migrations, but they all live under the same gunicorn, on the same infra, within the same codebase. Many Django plugins are set up as an 'app'.
Django apps can be the modular part of a modular monolith, but it requires some discipline. Django apps do not have strong boundaries within a Django project.
Often there will be foreign keys crossing app boundaries which makes one wonder why there are multiple apps at all.
In fact some people opt for putting everything into a single app [0]. Others opt for no app [1].
Django apps are good for installing Django packages into Django projects. But there's no firm mechanism that enforces any real separation; an app is just another Python module in a different folder (that you can simply import from your other apps).
The rule would be something like, if you can’t pip install your Django app into a project, it’s probably too weak of a boundary (that might be a bit too extreme, but if it is, it’s not too far off).
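For a rough illustration of how weak the boundary is (app and model names below are hypothetical, not from this thread): two apps can live in one project, one process, one deploy, and nothing stops one from pointing a foreign key straight into the other's tables.

    # settings.py -- both apps installed into the same project
    INSTALLED_APPS = [
        "django.contrib.contenttypes",
        "django.contrib.auth",
        "billing",      # hypothetical local app: billing/models.py, billing/migrations/, ...
        "invoicing",    # another hypothetical local app in the same codebase
    ]

    # invoicing/models.py
    from django.db import models

    class Invoice(models.Model):
        # A foreign key into another app's table: perfectly legal, very convenient,
        # and exactly the cross-app coupling described above.
        account = models.ForeignKey("billing.Account", on_delete=models.PROTECT)
        total_cents = models.IntegerField()

And invoicing can just as easily import billing's internal helpers, which is why the "could you pip install it?" test above is a useful smell check.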
I can't help but feel like the author has taken some fairly specific experiences with microservice architecture and drawn a set of conclusions that still results in microservices, but in a monorepo. There's nothing about microservices that suggests you have to go to the trouble of setting up K8s, service meshes, individual databases per service, RPC frameworks, and so on. It's all cargo culting and all this...infra... simply lines the pockets of your cloud provider of choice.
The end result in the context of a monolith reads more like domain driven design with a service-oriented approach and for most people working in a monolithic service, the amount of abstraction you have to layer in to make that make sense is liable to cause more trouble than it's worth. For a small, pizza-sized team it's probably going to be overkill where more time is spent managing the abstraction instead of shipping functionality that is easy to remove.
If you're going to pull in something like Bazel or even an epic Makefile, and the end result is that you are publishing multiple build artifacts as part of your deploy, it's not really a monolith any more, it's just a monorepo. Nothing wrong with that either; certainly a lot easier to work with compared to bouncing around multiple separate repos.
Fundamentally I think that you're just choosing if you want a wide codebase or a deep one. If somehow you end up with both at the same time then you end up with experiences similar to OP.
I think the assumption here is that "microservices" means each team is dealing with lots of services. Sometimes it's like that. But if you go by the "one service <=> one database" rule of thumb, there will probably be 1-3 services per team. And when you want to use other teams' stuff, you'll be thankful it's across an RPC. The first basic reason: you may not agree with that other team on what language to write in.
It'd really help to see a concrete example of a modular monolith compared to the microservice equivalent.
The thing microservices give you is an enforced API boundary. OOP classes tried to do that with public/private, but they fail because something that's public for the rest of my module should be private outside of it. I've written many classes thinking they were for my module only, and then someone discovered and abused one elsewhere. Now their code is tightly coupled to mine in a place I never intended to be a coupling point.
I don't know the answer to this; it's just a problem I'm fighting.
What you'd want is Architecture Unit Tests; you can define in code the metastructures and relationships, and then cause the build to fail if the relationship is violated.
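A minimal sketch of one way to do that in Python, assuming a repo with one top-level package per module; the package names and the allowed-dependency map are made up, and tools like import-linter (Python) or ArchUnit (Java) cover the same ground more thoroughly.

    # test_architecture.py -- fail the build when a module imports something it shouldn't.
    # Package names and the allowed-dependency map are illustrative assumptions.
    import ast
    import pathlib

    ALLOWED = {
        "billing":   {"shared"},
        "invoicing": {"billing", "shared"},
        "shared":    set(),
    }

    def imported_top_level_packages(py_file: pathlib.Path) -> set:
        tree = ast.parse(py_file.read_text())
        found = set()
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                found.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                found.add(node.module.split(".")[0])
        return found

    def test_module_boundaries():
        for module, allowed in ALLOWED.items():
            for py_file in pathlib.Path(module).rglob("*.py"):
                illegal = imported_top_level_packages(py_file) & (ALLOWED.keys() - allowed - {module})
                assert not illegal, f"{py_file} imports {illegal}, which {module} may not depend on"

Because it only parses source files, a check like this stays cheap enough to run with the normal unit tests; the heavier reflection-based tools are the ones that tend to get slow on big codebases.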
The problem I've had with these tests is that the run time is abysmal, meaning they only really get run as part of CI, and devs complain that they fail too late in the process.
Different languages handle this in different ways, but the most common seems to be adding access controls to the class itself, rather than just its members.
For instance, Java lets you say "public class" for a class visible outside its package, and just "class" otherwise. And if you're using Java modules (JPMS, since Java 9; nobody is though :( ), you can choose which packages are exported to consumers of your module.
In a similar vein, Rust has a `pub` access control that can be applied to modules, types, functions, and so on. A `pub` symbol is accessible outside the current crate; non-pub symbols are only accessible within one crate.
Of course, lots of languages don't have anything like this. The biggest offender is probably C++, although once its own version of modules is widely supported, we'll be able to control access somewhat like Java modules and Rust crates, with "partitions" serving the role of a (flattened) internal package hierarchy. Right now, if you do shared libraries, you can tightly control what the linker exports as a global symbol, and therefore control what users of your shared library can depend on -- `-fvisibility=hidden` will be your best friend!
> A `pub` symbol is accessible outside the current crate;
This is not universally true; it's more that pub makes it accessible to the enclosing scope. Wrapping in an extra "mod" so that this works in one file:
    mod foo {
        mod bar {
            pub fn baz() {
            }
        }

        pub fn foo() {
            bar::baz();
        }
    }

    fn main() {
        // this is okay, because foo can call baz
        foo::foo();

        // this is not okay, because bar is private, and so even though baz
        // is marked pub, its parent module isn't
        foo::bar::baz();
    }
Modules are almost as old as compiler technology. A good module structure is a time proven way to deal with growing code bases. If you know your SOLID principles, they apply to most module systems at any granularity. It doesn't matter if they are C header files, functions, Java classes or packages, libraries, python modules, micro services, etc.
I like to think of this in terms of cohesiveness and coupling rather than the SOLID principles. Much easier to reason about and it boils down to the same kind of outcomes.
You don't want a lot of dependencies on other modules (tight coupling) and you don't want to have one module do too many things (lack of cohesiveness). And circular dependencies between modules are generally a bad idea (and sadly quite common in a lot of code bases).
You can trivially break dependency cycles by introducing new modules. This is both good and bad. As soon as you have two modules, you will soon find reasons to have three, four, etc. This seems to be true with any kind of module technology. Modules lead to more modules.
That's good when modules are cheap and easy. E.g. most compilers can deal with inlining and things like functions don't have a high cost. Small functions, classes, etc. are easy to test and easy to reason about. Being able to isolate modules from everything else is a nice property. If you stick to the SOLID principles, you get to have that.
But lots of modules is a problem with micro services because they are kind of expensive as a module relative to alternatives. Having a lot of them isn't necessarily a great idea. You get overhead in the form of build scripts, separate deployments, network traffic, etc. That means increased cost, performance issues, increased complexity, long build times, etc.
Add circular dependencies to the mix and you now get extra headaches from those as well (which one do you deploy first?). Things like GraphQL (a.k.a. doing database joins outside the database) make this worse (coupling). And of course many companies confuse their org chart with their internal architecture and run into all sorts of issues when the two no longer align. If you have one team per service, that's probably going to be an issue; it's called Conway's law. If you have more services than teams, you are over-engineering. If you struggle to have teams collaborate on a large code base, you definitely have modularization issues. Microservices aren't the solution.
But it's a huge difference. No RPC overhead. No lost or duplicated RPC messages. All logs can literally go to the same file (via e.g. simple syslog).
Local deployment is dead simple, and you can't forget to start any service. Prod deployment never needs to handle a mix of versions among deployed services.
Besides that, the build step is much simpler. Common libraries' versions can never diverge, because there's one copy for the whole binary (which can be a disadvantage sometimes, too). You can attach a debugger and follow the entire chain, even when it crosses module boundaries.
With that, you can make self-contained modules as small as makes logical sense. You can pretty cheaply move pieces of functionality from one module to another if that makes better sense, and it's trivially easy to factor out common parts into another self-contained module.
Still you have all the advantages of fast incremental / partial builds, contained dependencies, and some of the advantages of isolated / parallel testing. But most importantly, it preserves your sanity by limiting the scope of most changes to a single module.
There would be a mix of versions, managed via branches.
The part about debuggability sounded appealing at first, but if the multiple services you want to run are truly that hard to spin up locally, it won't be any easier as a monorepo. First thing you'll do is pass in 30 flags for the different databases to use. If these were RPCs, you could use some common prod or staging instance for things you don't want to bother running locally.
> There would be a mix of versions, managed via branches
"We build the image slated for deployment from the release branch which is cut from master daily / weekly at noon." Works for MPOW, and some previous places. It's a monolith, there are rules!
> but if the multiple services you want to run are truly that hard to spin up locally, it won't be any easier as a monorepo. First thing you'll do is pass in 30 flags for the different databases to use.
Agreed! Maybe that would be a reason to finally split the thing into separate services. Not necessarily micro-services, just parts that are self-contained enough.
But most code bases are not nearly as heavyweight. They can work pretty well as a monolith, running as a whole on a decent laptop along with a database or two, or even three (typically Postgres, Redis, and Elastic). I know because I've done it many times, and so have a ton of other people. Worse yet, I've also run a whole bunch of microservices locally, much like production, as many a developer here has too. At a scale this small, the complexity just slows you down.
> use some common prod or staging instance for things you don't want to bother running locally.
I've seen this in much bigger projects, and there it made complete sense. When you have to move literally a ton, it makes sense to use a forklift. But if the thing is a stack of papers that fits into a backpack, the forklift is unnecessary bulk and expense. It could still be a neatly organized stack of papers, not a shapeless wad.
I'd avoid having separate services share a DB. Besides the overhead, you get scary hidden dependencies that way. If this approach is considered not micro but rather an in-between, the article should mention it as an option.
Indeed, they should not share a DB (except maybe Redis as a common cache, with a single writer per item type). But a single Postgres installation can run multiple schemas / users, so you only need to run one Postgres container locally.
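For concreteness, a sketch of that local setup (module and role names are made up, and psycopg2 is just one way to run the DDL): one Postgres container, one schema and one owning role per module.

    # One Postgres instance, one schema + role per module; names are hypothetical.
    import psycopg2

    MODULES = ["billing", "invoicing", "catalog"]

    conn = psycopg2.connect("dbname=app user=postgres password=postgres host=localhost")
    conn.autocommit = True
    with conn.cursor() as cur:
        for module in MODULES:
            role = f"{module}_rw"
            cur.execute(f"CREATE ROLE {role} LOGIN PASSWORD '{module}-dev-pw'")
            cur.execute(f"CREATE SCHEMA {module} AUTHORIZATION {role}")
            # unqualified table names resolve inside the module's own schema only
            cur.execute(f"ALTER ROLE {role} SET search_path = {module}")
    conn.close()

Each module then connects as its own role and never sees the other schemas unless explicitly granted access, which keeps the scary hidden dependencies visible.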
A good analogy is lacking though. "Modular Monolith" sounds like a contradiction. It doesn't help the idea.
It inherits culture from OOP: that abstraction leaked into repositories, then leaked into packages, and now it's being roughly patched together into meaningless buzzwords.
It's no surprise no one understands all of this. I see the react folks trying to come up with a chemical analogy (atoms, molecules and so on), and the functional guys borrowed from a pretty solid mathematical frame of mind.
What is the OOP point of view missing here? Maybe it was a doomed analogy from the beginning. Let's not go into biology though, that can't do any good.
Spare parts, connectors, moving parts versus passive mechanisms, subsystems. Hard separation and soft separation. It's all about that when doing component stuff. And it has all been figured out; we just keep messing up how we frame it, for no reason.
Oo I love a terminological discussion, well said! I would disagree on your first point though: aren’t most large machines modular monoliths? Say, cars, airplanes, and dams? I absolutely agree that this usage kinda erases the original intent of the word “monolith” in a software context, though. Or at least complicates it greatly…
Personally, I’m putting all my money on cognitive, and the terms that go along with it - say, social, agential, discursive, and conversational. Not to forget the deeper cuts from philosophy of mind (the precursor to CS!), such as dialectic (a highly-mutable data structure with holistic modification functions?), architectonic (code-generators built in to the very “top” of a system, breaking it down into a binary tree of generated-code-facilitated subsections?), and striated vs smooth systems (describes the level of obstruction/complication present in each?).
Ultimately my takeaway from this article is that absolutes (namely, microservices) rarely work in real world architecture contexts. When looked at in those general terms, I think we have little choice but to start treating software like minds to be “shaped” or “sculpted”, to use Minsky’s preferred terminology. After all,
I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted. I believe further that no useful purpose is served by concealing these beliefs.
- Alan Turing, 1950’s Computing Machinery and Intelligence
Isn't OOP supposed to be about bringing everyday human things as the paradigm for thinking? Stuff like doors, buttons, locks and keys, levers.
There are some good gems in the popular developer culture. "Circuit Breaker" is a de-facto name for an OOP pattern I enjoy. "Dependency Injection" not so much, could be called "Spare Parts Design".
Architectonic styles come from Christopher Alexander's stuff. It's a classic; patterns and the GoF book come from that. We could use some of his later _Nature of Order_ material as well (I don't like the "Order" name that much, but it's good content nonetheless).
I think this is the most cognitive (in what it does, not in what it says) we can get without being too geeky for beginners.
Minds to be "educated", maybe? Like we humans do. Small everyday stuff first, then more complicated human things later.
Anyway, about the "modular monolith": how about we just call it a "machine"? We already expect most machines to be modular, expect the sum of the parts to do more than the individual components, and expect the components to be designed and maintained together.
> Isn't OOP supposed to be about bringing everyday human things as the paradigm for thinking?
Not really. That's just the metaphors that are used for teaching. Really just for introducing the concept, if the teacher is any good. Because it's not very helpful for what developers actually do most of the time.
If you want to believe Alan Kay, it's about "messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things", and almost everyone's been doing it wrong since the 70s. Oh, and he actually "thought of objects being like biological cells".
Personally, I'd say it's most importantly about encapsulating state with the code that operates on it.
I made my point in the first comment I made on this thread. Whatever it is, it is lacking something for human brains to grasp it.
My point is that people are doing it wrong because the names don't help. It was called _Smalltalk_, wasn't it? Like a casual friendly conversation. These more technical descriptions are the theory behind it, not what it's actually supposed to look like when applied.
I don't need to know that doors have hinges or the complex manufacturing of locks to call a door a door. When I open a lock, I don't need to call it a "mechanical system of pins and springs" or anything else.
Of course not. Circuit breakers come from electrical engineering. Façades come from architecture. Mediators are from debate and diplomacy. And promises are things humans tell each other regarding a future expectation.
Or were you stating the fact that it's not a documented "official" pattern by a popular book author?
The problem isn't that it is missing something but that it has extraneous parts. Inheritance as your default composition method couples together subtyping and interfaces. You basically can't build a modular monolith out of a large class hierarchy because the very act of being a large class hierarchy means you have more coupling between your classes than a modular monolith permits and already have a just-plain-monolith. From there standard code entropy will only grind it in harder.
You really need to eschew inheritance for a good modular monolith and base it around interfaces only. Every major bit of functionality in your system needs a cleanly specified set of services it depends on, where that specification is just an interface and not hard-coded to some specific type, and certainly not some sort of class where subclasses must not only fulfill the interface but conform to the Liskov Substitution Principle, which is much, much harsher than most people realize.
Then you have a true module, where if one day you need to yank out a particular service and make it a true microservice, you "just" take the interfaces it needs and either carry along the services locally or run them over the network. I scare-quote "just" because it's not infrequent to need to tweak the interface to permit it to work over the network (more things that can fail), but it's still a relatively mechanical and feasible process versus taking a whole bunch of super-hard-coded specific types, accessed randomly through whatever language scoping mechanism made sense at the time, and trying to convert that to run over the network.
This is just a summary, of course, because it's an HN post and I can hardly lay out a complete design philosophy to the n'th degree in a comment. But there definitely is a true distinction between a monolith with all sorts of hard wiring internally and a way you can design a modular monolith that is still a monolith, but the coupling between the modules has been slimmed down to a minimum and controlled through some consistent gate rather than just willy-nilly wired together.
Or, to put it another way, it's the difference between a module that directly loads another module to resolve DNS, directly uses the filesystem, directly wires in a specific type for user authentication, and uses a global instance of a logging system initialized by some other module (you know, basically how most people write code all the time), and a module that accepts "a thing to resolve DNS addresses", "a filesystem interface", "a thing that authenticates users", and "a logger", where each of those things is an interface minimized down to just what the local module actually needs. One is a coupled nightmare no one wants to touch. The other is easily swapped out to work with S3 to store files instead, or to inject a different authentication system that ends up using a network service rather than directly hitting the DB, or to swap in a hard-coded DNS resolver for testing purposes to avoid dependencies on the real network, etc.
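A small sketch of that shape in Python, using typing.Protocol for the narrow interfaces; the names (FileStore, UserAuthenticator, ReportModule) are illustrative, not from the comment above.

    from dataclasses import dataclass
    from typing import Optional, Protocol

    class FileStore(Protocol):
        def write(self, name: str, data: bytes) -> None: ...

    class UserAuthenticator(Protocol):
        def authenticate(self, token: str) -> Optional[str]: ...   # user id, or None

    class Logger(Protocol):
        def info(self, msg: str) -> None: ...

    @dataclass
    class ReportModule:
        # The module receives exactly the capabilities it needs, as interfaces,
        # instead of reaching for the filesystem, the DB, or a global logger.
        files: FileStore
        auth: UserAuthenticator
        log: Logger

        def publish(self, token: str, report: bytes) -> bool:
            user = self.auth.authenticate(token)
            if user is None:
                self.log.info("publish rejected: bad token")
                return False
            self.files.write(f"reports/{user}.bin", report)
            self.log.info(f"report published for {user}")
            return True

In tests each dependency can be a five-line fake, and if authentication later moves behind an RPC, ReportModule itself doesn't change; only the implementation handed to it does.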
It isn't even really anything that surprising. A lot of people will at least claim this is just "good programming". But you don't get the benefits if you don't actually do it.
Agreed with all of this. Composition over inheritance. The other consequence of turning one of these not-actually-modular monoliths into microservices is that they bring the same bad behavior with them: tightly coupled services where any failure in the chain cascades throughout. It takes a LOT of work and effort to build microservices that aren't a fragile mess. How many companies going full steam ahead on microservices don't have another team dedicated to trying to bring the system down just to ensure resiliency (http://principlesofchaos.org/)? And you have to ask what you're actually doing this work for, because everyone here knows you can scale the shit out of a monolith just by throwing more hardware at it, and hardware is cheap. I think the real problem microservices are trying to solve is how to do software development with thousands of developers contributing to related code bases without productivity grinding to a halt.
Sure, at the very high end of scalability, isolating computationally intensive portions of the code into their own services might justify decomposing the monolith a bit. But very rarely do scalability and performance requirements necessitate full-blown microservice adoption. Craigslist is an example of a company with so few developers and such a restricted feature set that they would likely never have to move to microservices, because the handful of developers they have can all grok the system as a whole, regardless of how many users they had to scale up to support.
"True" modules rarely exist in real life. Maybe toys like Lego bricks. Even those have a lot of mismatch sometimes. Sometimes we want to build something that the toymakers never imagined.
Pipes and tubes require glue, engines require lubricating oil, bricks require cement. Lots of non-modular things holding stuff together.
But we want true modules to exist in programming, right? Why not remove all the glue and oil and messy cement? Make it _truly_ modular, everything fitting together perfectly.
I've done plenty on both sides of the argument. Had my fair share of trying to square the circle, only to end up chasing an infinite problem. Increasingly monstrous interface sets, contradictory names, things so abstract I needed a map to understand them. Sometimes using a drop of glue was better than all of it.
Requiring numerous large interfaces is a True Truth about your code. You didn't make anything go away by just giving up; you just forestalled any other future options. The complexity was there regardless.
I suspect it is likely you did not do the interface minimization, which I didn't have time to go into. You should not hand "a file system" to your code when it does not need it; if all it needs is to "read a file by file name" then the interface it receives should have only that.
If you have the occasional module that takes 25-50 inputs, I just pay the price for that particular module. If every module in your system is taking dozens upon dozens of inputs, you aren't building a "modular monolith", you're just building a monolith with extra random cuts in it, which is the worst of both worlds.
You can take an architecture diagram and drop an arbitrarily complicated line across any bit of it. The resulting code will also be arbitrarily complicated, but you can do it. (See also what microservice architectures tend to become anyhow.) But if you keep dropping lines across your architecture and they're all really, really complicated, the problem is your procedure for drawing those lines, not the utility of a separated architecture.
This is one of the reasons I'm not a big fan of rigidly following "clean" or "hexagonal" or any other particular architecture, because even as they have good ideas, rigidly following them inevitably involves drawing overcomplicated lines across the architecture. They're useful ways to get started but true design taste requires an inexpressible understanding of where to cut things that none of them can hand to you as a rigid recipe.
You were talking about _true modularity_. In my previous comment I was just exploring what "true" means in terms of that.
What you said is like precision in metrology. You're just using more words to explain it. You never have _true_ absolute precision. The more precision you have, the better the parts fit, and the higher the price. Sometimes glue is better, or a shim. Both in technical design _AND_ in terminology.
However, the programming world for some reason hates shims, and glue code, and non-modular stuff, and the idea of anything less than the maximum possible precision. In the fever dreams of perfect design, those things aren't considered a solution.
It's like absolute reuse is a MUST for software engineering, even worth bending what the word "monolith" means (it means a single piece of stone: mono lith). The contradiction is amazingly bizarre.
Glue. And shims. And a weird cosmological constant in an otherwise perfect equation. All imperfect things that are not modular, don't fit perfectly, they're not exact. We also can't do away with them.
To me it seems like the main advantages of microservices are
a) you can use different languages
b) you can run different parts of your system on different servers
I feel like you can solve both without giving up the niceties of a monolith just with a good RPC framework. A really good one would even give you the flexibility to run "microservices" as separate local threads for easy development.
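A rough sketch of that flexibility, with made-up names and only the standard library: callers depend on one interface, and a config switch decides whether the "service" runs in-process or behind HTTP. A real RPC framework would generate the remote half for you.

    import json
    import urllib.request
    from typing import Protocol

    class PricingService(Protocol):
        def quote(self, sku: str, quantity: int) -> float: ...

    class LocalPricingService:
        """Runs in the caller's process: the monolith / local-development case."""
        def quote(self, sku: str, quantity: int) -> float:
            return 9.99 * quantity          # stand-in for real business logic

    class RemotePricingService:
        """Same interface, forwarded over HTTP to a separately deployed service."""
        def __init__(self, base_url: str):
            self.base_url = base_url

        def quote(self, sku: str, quantity: int) -> float:
            body = json.dumps({"sku": sku, "quantity": quantity}).encode()
            req = urllib.request.Request(f"{self.base_url}/quote", data=body,
                                         headers={"Content-Type": "application/json"})
            with urllib.request.urlopen(req) as resp:
                return json.load(resp)["price"]

    def make_pricing_service(mode: str) -> PricingService:
        if mode == "local":
            return LocalPricingService()
        return RemotePricingService("http://pricing.internal:8080")   # hypothetical host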
Well, that's just the normal way to write software, no?
Aside from some websites and small scripts, all software is written like that.
You simply create a hierarchical directory structure where the directories correspond to modules and submodules and try to make sure that the code is well split and public interfaces are minimal.
Well, you try that, but in general someone in a different module discovers that this thing you have over here is useful and starts using it, and before you know it you have everything tightly coupled to everything else.
All non-toy programming languages support encapsulation, usually implemented with "private" or "public"/"export" keywords (well-designed languages make private the default). That means that unless the "thing" was marked as public/exported, in which case it's designed to be reused and stable and thus OK to depend on, using it will trigger a compiler or runtime error (in well-designed languages, a compiler error).
Obviously, in that case it's perfectly normal and acceptable to either export or make public the thing, if it's a good idea for it to be part of the module interface; or, if that's not a good idea, to factor out the useful part into a third module that both the original and new modules depend on. This should come with some documentation about the interface if it's not obvious or fully specified by the types.
I mean, of course they are a good idea, what we need is more examples of actually doing them in practice. :-)
I.e. quoting from the post:
- monolithic databases need to be broken up
- Tables must be grouped by module and isolated from other modules
- Tables must then be migrated to separate schemas
- I am not aware of any tools that help detect such boundaries
Exactly.
For as much press as "modular monoliths" have gotten, breaking up a large codebase is cool/fine/whatever--breaking up a large domain model is imo the "killer app" of modular monoliths, and what we're missing (basically the Rails of modular monoliths).
I'm nearing greybeard status, so I have to chime in on the "get off my lawn" aspect.
There is no one general "good engineering". Everything is different. Labels suck because even if you called one thing "microservices", or even "monolith of microservices", I can show you 10 different ways that can end up. So "modular monolith" is just as useless a descriptor; it's too vague.
Outside of the HN echo chamber, good engineering practice has been happening for decades. Take open source for example. Many different projects exist with many different designs. The common thread is that if a project creates some valuable functionality, they tend to expose it both at the application layer and library layer. They know some external app will want to integrate with it, but also they know somebody might want to extend the core functionality.
I personally haven't seen that method used at corporations. If there are libraries, they're almost always completely independent from an application. And because of that, they then become shared across many applications. And then they suddenly discover the thing open source has been dealing with for decades: dependency.
If you aren't aware, there is an entire universe out there of people working solely on managing dependencies so that you, a developer or user, can "just" install software into your computer and have it magically work. It is fucking hard and complicated and necessary. If you've never done packaging for a distro or a language (and I mean 250+ hours of it), you won't understand how much work it is or how it will affect your own projects.
So yes, there are modular monoliths, and unmodular monoliths, and microservices, and libraries, and a whole lot of varied designs and use cases. Don't just learn about these by reading trendy blog posts on HN. Go find some open source code and examine it. Package some annoying-ass complex software. Patch a bug and release an update. These are practical lessons you can take with you when you design for a corporation.
> If you aren't aware, there is an entire universe out there of people working solely on managing dependencies so that you, a developer or user, can "just" install software into your computer and have it magically work. It is fucking hard and complicated and necessary. If you've never done packaging for a distro or a language (and I mean 250+ hours of it), you won't understand how much work it is or how it will affect your own projects.
When new employees joined the engineering org at a former employer, they were required to spend six months on the sustaining team, where they could be assigned customer-escalated bugs in any part of the codebase. Under the clock to deliver a hotfix for production customers, they would be required to learn a new area of the complex codebase, work with specialists in that area, develop/test a fix and shepherd it through review/approval by senior engineers. Those who survived this process could apply to the subsystem team of their choice.
There is much for developers to learn from a period of apprenticeship in cross-platform software packaging. Start with .deb/.rpm, then image customization, A/B upgrades, stateless systems and work up to reproducible builds with Yocto, NixOS, Guix. In the next few years, SBOMs (software "bill of materials" aka package provenance) will become mandatory in some regional and vertical markets. This will either cause a reduction in dependencies, or increased attention to software supply chain relationships that bring regulatory costs.
I am already in the latter half of my fifties and find articles/discussions like these irksome and a sad reflection on the state of knowledge of the Programmers today.
Everything is merely cookie cutter recipes, patterns, cute jargons/acronyms, a general lack of understanding of computation models/paradigms, an inability to disambiguate actual concepts from language constructs, a lack of knowledge of fundamentals/important nuances all of which leads to simplistic cargo-culting.
There seems to be no emphasis on thinking through the problem and a solution, only an eagerness to reach for the latest faddish framework/library/pattern to put something together and "make it work". Reminds me of Tesla's observation on Edison:
“His [Thomas Edison] method was inefficient in the extreme, for an immense ground had to be covered to get anything at all unless blind chance intervened and, at first, I was almost a sorry witness of his doings, knowing that just a little theory and calculation would have saved him 90 per cent of the labor. But he had a veritable contempt for book learning and mathematical knowledge, trusting himself entirely to his inventor's instinct and practical American sense. In view of this, the truly prodigious amount of his actual accomplishments is little short of a miracle.”
The folks who maintain system-level package managers (apt, rpm, etc.) love to equate their stuff with language-level package/dependency managers.
But the folks who maintain language-level package/dependency managers essentially never reference system-level package manager stuff.
Which makes sense. They're categorically different things. System-level package maintainers are, often, stuck in an anachronistic model of software delivery.