A single EC2 instance is an equally bad trade-off on the opposite side of the spectrum from over-architected SQS, SNS, etc…
The ideal trade-off is a single Kubernetes cluster with as much in the cluster as makes sense for the team and stage of the project. As you say, toss the app on a single node to start, but the control plane is tremendously valuable from the onset of most projects.
A startup that outgrows an EC2 server will be making enough money to hire more people and scale the system properly, beyond what was initially designed: a design that traded away everything else for development velocity.
Kubernetes is not the right tool for this startup. Kubernetes is what large, old-school non-tech companies use to orchestrate resources, because it's easier to find someone who "knows k8s" (no one knows k8s unless they're consulting) than it is to find someone who can properly build distributed systems (in the eyes of whoever is in charge of hiring).
Most startups are at least going to want to be able to deploy, scale up or down, and restart an app without downtime. I wouldn't say that's overkill.
While it's not impossible to do with a single instance, you can spend a lot of time shaving that yak. It's reasonable to pay a bit more to have that stuff handled for you in a robust way.
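For what it's worth, the zero-downtime piece is mostly declarative once you have the control plane. Here's a minimal sketch using the official Kubernetes Python client; the image, port, probe path, and namespace are made-up assumptions for illustration, not a drop-in config:

```python
from kubernetes import client, config

config.load_kube_config()  # outside the cluster; use load_incluster_config() inside
apps = client.AppsV1Api()

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "web"}),
        # Rolling updates: never take the last healthy pod down, so deploys and
        # restarts happen without an outage.
        strategy=client.V1DeploymentStrategy(
            type="RollingUpdate",
            rolling_update=client.V1RollingUpdateDeployment(max_unavailable=0, max_surge=1),
        ),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "web"}),
            spec=client.V1PodSpec(containers=[client.V1Container(
                name="web",
                image="registry.example.com/web:1.2.3",  # hypothetical image
                ports=[client.V1ContainerPort(container_port=8080)],
                # Traffic only reaches a pod once its readiness probe passes.
                readiness_probe=client.V1Probe(
                    http_get=client.V1HTTPGetAction(path="/ready", port=8080),
                    period_seconds=5,
                ),
            )]),
        ),
    ),
)
apps.create_namespaced_deployment(namespace="default", body=deployment)

# Scaling up or down is a one-line change to the desired replica count.
apps.patch_namespaced_deployment_scale(
    name="web", namespace="default", body={"spec": {"replicas": 4}},
)
```

The same declared state covers restarts: kill a pod or roll the deployment and the controller replaces it without dropping below the healthy replica count.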
These reasons relate to deployment, but there's also lots of value in the security aspects of the control plane.
* automatic service account for each workload
* automatic service-to-service auth to 3rd-party services
* the audit log
* role based access control
* well defined api
* the explain subcommand
* liveness and readiness probes
* custom resources
The list goes on, but the big ones for a small team just getting started are workload identity and security.
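To make the workload-identity point concrete, here's a rough sketch with the same Python client: one service account per workload, bound to a narrowly scoped role. The names and the ConfigMap-read permission are illustrative assumptions only:

```python
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()
rbac = client.RbacAuthorizationV1Api()

# Each workload gets its own identity instead of sharing machine-level credentials.
core.create_namespaced_service_account(
    namespace="default",
    body=client.V1ServiceAccount(metadata=client.V1ObjectMeta(name="web")),
)

# A narrowly scoped Role: this workload may only read ConfigMaps in its namespace.
rbac.create_namespaced_role(
    namespace="default",
    body=client.V1Role(
        metadata=client.V1ObjectMeta(name="web-read-config"),
        rules=[client.V1PolicyRule(
            api_groups=[""], resources=["configmaps"], verbs=["get", "list"],
        )],
    ),
)

# Bind the Role to the service account (plain dict body for brevity; the client
# serializes dicts the same way as the typed models).
rbac.create_namespaced_role_binding(
    namespace="default",
    body={
        "metadata": {"name": "web-read-config"},
        "roleRef": {"apiGroup": "rbac.authorization.k8s.io", "kind": "Role", "name": "web-read-config"},
        "subjects": [{"kind": "ServiceAccount", "name": "web", "namespace": "default"}],
    },
)
```

Assuming audit logging is enabled, every action taken under that identity then shows up attributed to the workload, which is the kind of thing that's tedious to bolt onto a single EC2 box.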
K8S is basically another answer to Conway’s Law. Every startup I’ve worked at switched to it because then the infrastructure could map more closely to the code. Not unlike microservices at a higher level.
The old-skool approach is depending on a team of SREs or sysadmins to provision hardware for you and handle the deployment, which K8S plus container images basically abstract away.
Not to say that dedicating resources to platform development (k8s style) isn’t a time sink when you’re trying to build product and find a fit in the market.
In my experience, giving code preferential treatment is how you end up with complexity lunacy; so I’ll add an addendum to Conway’s Law:
“Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure — and which mirrors the skills of its key creators.”
K8s is designed to solve Google problems. Your startup will not have Google problems. Your startup will have Pinterest problems, or Gitlab problems, or Reddit problems, at which point you do not need K8s; you need someone who knows infra (which I expect devs working on distributed systems to understand).
Using K8s in a startup context is a sign of conformist thinking, detached from any critical analysis.
> The old-skool approach is depending on a team of SREs or sysadmins to provision hardware for you
This assumes that K8s won't require a "team of SREs". My experience is that you need the same number of SREs to maintain Kubernetes, probably more, because now you have a complicated control plane and a networking nightmare, and then you layer on resource-contention issues, security issues, cloud provider compatibility issues, buggy controllers; the list goes on.
The only people K8s is great for are the maintainers, the consultants, and the highly experienced SREs that inevitably have to be hired to clean up the mess that was created. This is my experience working in two similarly sized environments, one with >1M containers and another with an equivalent scale of bare-metal servers.
Conway's law is about mapping teams to code+infrastructure (generally: areas of responsibility), not about mapping code to infrastructure. It's about people and politics.
You're right that K8S is an answer to Conway's Law: our people don't get along or can't collaborate or we have too many of them, so we will split them into a team per service and force them to collaborate over network interfaces. Likewise, the infrastructure people will communicate with the other teams using Dockerfiles.