These kinds of posts always focus on the complexity of running k8s, the large am...

envoked · on March 5, 2020

I'm in the same camp. I think a lot of these anti-k8s articles are written by software developers who haven't really been exposed to the world of SRE and mostly think in terms of web servers.

A few years ago I joined a startup where everything (including the db) was running on one, not-backed-up, non-reproducible, VM. In the process of "productionizing" I ran into a lot of open questions: How do we handle deploys with potentially updated system dependencies? Where should we store secrets (not the repo)? How do we manage/deploy cronjobs? How do internal services communicate? All things a dedicated SRE team managed in my previous role.

GKE offered a solution to each of those problems while allowing me to still focus on application development. There's definitely been some growing pains (prematurely trying to run our infra on ephemeral nodes) but for the most part, it's provided a solid foundation without much effort.

mslm · on March 5, 2020

Exactly, all these articles seem to come from operational novices, who think in terms of 1-2 click solutions. K8s is not a 1-2 click solution, and clearly isn't designed to be; it's solving particular tough operational problems that if you don't know exist in the first place you won't really be able to evaluate these kinds of things properly.

If a group literally doesn't have the need to answer questions like the ones you posed, then OK, don't bother with these tools. But that's all that needs to be said - no need for a new article every week on it.

collyw · on March 5, 2020

> it's solving particular tough operational problems that if you don't know exist in the first place

They probably don't exist for the majority of people using it. We are using k8s for when we need to scale, but at the moment we have a handful of customers and it isn't changing quickly any time soon.

ownagefool · on March 5, 2020

They basically exist for everyone.

As soon as you go down the road of actually doing infrastructure-as-code, using (not running) k8s is probably as good as any other solution, and arguably better than most when you grow into anything complex.

Most of the complaints are false equivalence: i.e. running k8s is harder than just using AWS, which I already know. Of course it is. You don't manage AWS. How big do you think their code base is?

If you don't know k8s already, and you're a start-up looking for a niche, maybe now isn't the time to learn k8s, at least not from the business point of view (personal growth, another issue).

But when you do know k8s, it makes a lot of sense to just rent a cluster and put your app there, because when you want to build better tests, it's easy, when you want to do zero trust, it's easy, when you want to integrate with vault, it's easy, when you want to encrypt, it's easy, when you want to add a mesh for tracing, metrics and maybe auth, it's easy.

What's not easy is inheriting a similarly done product that's entirely bespoke.

collyw · on March 5, 2020

No the complaint is that we aren't scaling to google levels, in fact we are barely scaling at all. K8s isn't needed.

We ran applications without it fine a few years ago. And it was a lot simpler.

ownagefool · on March 6, 2020

"Fine"

As in, doesn't get hacked or doesn't go down? We live in different worlds.

jen20 · on March 6, 2020

> Of course it is. You don't manage AWS. How big do you think their code base is?

This seems like a fairly unreasonable comparison. The reason I pay AWS is so that I _do not_ have to manage it. The last thing I want to do is then layer a system on top that I do have to manage.

ownagefool · on March 6, 2020

https://www.digitalocean.com/products/kubernetes/

https://cloud.google.com/kubernetes-engine/

???

Spooky23 · on March 5, 2020

Remember that not everyone is using an operational model that would benefit from K8s.

013a · on March 5, 2020

Yet, they write articles that prescribe their operational model on the rest of the world.

bcrosby95 · on March 5, 2020

Problem is hidden assumptions. Happens a lot with microservices too. People write about the problems they're solving somewhat vaguely and other people read it and due to that vagueness think it also is the best solution to their problem.

Spooky23 · on March 5, 2020

They are engineers, writing about what they do and what their companies sell. I wouldn't ascribe an "imposition" to that!

As a practitioner or manager, you need to make informed choices. Deploying a technology and spending the company's money on the whim of some developer is an example of immaturity.

ex_amazon_sde · on March 5, 2020

> I think a lot of these anti-k8s articles are written by software developers who haven't really been exposed to the world of SRE ...

Think again. There's plenty of SREs at FAANGs that dislike the unnecessary complexity of k8s, docker and most "hip" devops stuff.

p_l · on March 5, 2020

From my knowledge, pretty much all FAANGs got their own tooling in place before k8s went public, covering those same issues.

Now imagine you have to do it from scratch.

honkycat · on March 5, 2020

Agreed. I've been saying for years that if you go with Docker-Compose, or Amazon ECS, or something lower level, you are just going to end up rebuilding a shittier version of Kubernetes.

I think the real alternative is Heroku or running on VMs, but then you do not get service discovery, or a cloud agnostic API for querying running services, or automatic restarts, or rolling updates, or encrypted secrets, or automatic log aggregation, or plug-and-play monitoring, or VM scaling, or an EXCELLENT decoupled solution for deploying my applications ( keel.sh ), or liveness and readiness probes...

But nobody needs those things right?

sk5t · on March 5, 2020

You do in fact get a lot of this stuff with ECS and Fargate - rolling updates, automatic restart, log aggregation, auto scaling, some discovery bits, healthchecks, Secrets Manager or Parameter Store if you want, etc.

throwaway894345 · on March 5, 2020

This. We’ve been chugging along happily on Fargate for a while now. We looked into EKS, but there is a ton of stuff you have to do manually that is built into Fargate or at least Fargate makes it trivial to integrate with other AWS services (logging, monitoring, IAM integration, cert manager, load balancers, secrets management, etc). I’d like to use Kubernetes, but Fargate just works out of the box. Fortunately, AWS seems to be making EKS better all the time.

Logishort · on March 5, 2020

And what if Amazon feels like changing Fargate or charging differently or deprecating it or... Whats your strategy for that?

pjmlp · on March 5, 2020

We think about it when it happens, if it ever happens.

I have seen too many projects burn money with vendor independence abstraction layers that were never relevant in production.

throwaway894345 · on March 5, 2020

Like others have said, cross that bridge if/when we get to it--there's no sense in incurring the Kubernetes cost now because one day we might have to switch to Kubernetes anyway.

It's also worth noting that Fargate has actually gotten considerably cheaper since we started using it, probably because of the firecracker VM technology. I'm pretty happy with Fargate.

sk5t · on March 5, 2020

The fallback is to use something else less convenient--maybe k8s, nomad, some other provider's container-runner-load-balancer-thing.

russellendicott · on March 5, 2020

I'm pretty sure AWS has a better track record than Google when it comes to keeping old crud alive for the benefit of its customers.

In my experience AWS generally gives at least a year's notice before killing something or they offer something better that's easy to move to well in advance of killing the old.

Hell, they _still_ support non-VPC accounts...

Legogris · on March 5, 2020

And suddenly you have a MUCH higher degree of vendor lock-in than on k8s.

eropple · on March 5, 2020

The dirty secret of virtually every k8s setup--every cloud setup--is that the cloud providers' stuff is simply too good, while at the same time being just different enough from one another, that the impedance mismatch will kill you when you attempt to go to multi-cloud unless you reinvent every wheel (which you will almost certainly do poorly) or use a minimal feature set of anything you choose to bring in (which reduces the value of bringing those things in in the first place).

"Vendor lock-in" is guaranteed in any environment to such a degree that every single attempt at a multi-cloud setup that I've ever seen or consulted on has proven to be more expensive for no meaningful benefit.

It is a sucker's bet unless you are already at eye-popping scale, and if you're at eye-popping scale you probably have other legacy concerns in place already, too.

Legogris · on March 5, 2020

I'm not saying you'd be able to migrate without changes, but having moved workloads from managed k8s to bare-metal and having some experience with ECS and Fargate, I can tell you that the scale of disparities is significant.

eropple · on March 5, 2020

To me the disparities aren't "how you run your containers". They're "how you resolve the impedance mismatch between managed services that replace systems you have to babysit." Even something as "simple" as SQS is really, really hard to replicate, and attempting to use other cloud providers' queueing systems has impedance mismatch between each other (ditto AMQP, etc., and when you go that route that's just one more thing you have to manage).

Running your applications was a solved problem long before k8s showed up.

tylerl · on March 5, 2020

This.

The whole point of k8s, the reason Google wrote it to begin with, was to commoditize the management space and make vendor lock-in difficult to justify. It's the classic market underdog move, but executed brilliantly.

Going with a cloud provider's proprietary management solution gives you generally a worse overall experience than k8s (or at least no better), which means AWS and Azure are obliged to focus on improving their hosted k8s offering or risk losing market share.

Plus, you can't "embrace and extend" k8s into something proprietary without destroying a lot of it's core usability. So it becomes nearly impossible to create a vendor lock-in strategy that customers will accept.

archisgore · on March 5, 2020

Amen to that. We use ECS to run some millions of containers on thousands of spot nodes for opportunistic jobs, and some little Fargate for where we need uptime. It's a LOT less to worry about.

dilyevsky · on March 5, 2020

Pesky monitoring and logging just makes you notice bugs more and get distracted from making the world a better place

NicoJuicy · on March 5, 2020

Why would docker be shittier? It's easier and flexible.

Eg. Sidecar pattern resolves most things (eg. logging)

rclayton · on March 5, 2020

1000%. If you take a little bit of time to learn k8s and run it on a hosted environment (e.g. EKS), it’s a fantastic solution. We are much happier with it than ECS, Elastic Beanstalk, etc.

AlchemistCamp · on March 5, 2020

> If anyone has a solution that is actually simpler and just as easy to set up, I'm very much interested.

Sure. @levelsio runs Nomad List (~100k MRR) all on a single Linode VPS. He uses their automated backups service, but it's a simple setup. No k8s, no Docker, just some PHP running on a Linux server.

As I understand it, he was learning to code as he built his businesses.

https://twitter.com/levelsio/status/1177562806192730113

icedchai · on March 5, 2020

Many startups are not as resource efficient. The ones I am familiar with spend 50%+ of their MRR on cloud costs.

p_l · on March 5, 2020

My first production k8s use was actually due to resource efficiencies. We looked at the >50 applications we had to serve, the technical debt that would lead us down the road of building customized distribution to avoid package requirements incompabilties, and various other things, and decided to go with containers - just to avoid the conflicts involved.

Thanks to k8s, we generally keep to 1/5th of the original cost, thanks to bin packing of servers, and sleep sounder thanks to automatic restarts of failed pods, ability to easily allocate computing resources per container, globall configure load balancing (we had to scratch use of cloud-provider's load balancer because our number of paths was too big for URL mapping API).

Everything can be moved to pretty much every k8s hosting that runs 1.15, biggest difference would be hooking the load balancer to the external network and possibly storage.

AlchemistCamp · on March 5, 2020

That's very surprising to me! As an employee at multiple startups in multiple countries, I've never seen cloud costs anywhere near payroll.

icedchai · on March 5, 2020

Well, I said 50% of monthly revenue, not total costs.

Glyptodon · on March 5, 2020

You're asking exactly what I've been wondering. The answer in this thread this so far has been "maybe dokku." I can totally buy that K8s is overkill, but the alternatives for small scale operators seem to require a lot more work internally than just using a K8s service.

avereveard · on March 5, 2020

same we use kubernet not for scale but for the environment repeatability. we can spin any branch at any time on any provider and be sure it's exactly as we have it in production down to networking and routing, to built that out of plain devops tools would require a lot more in depth knowledge and it won't port exactly one provider to another

VvR-Ox · on March 5, 2020

> These kinds of posts always focus on the complexity of running k8s...

Yes but that is also the worst already that you could criticize about k8s.

Complexity is dangerous because if things are growing beyond a certain threshold X you will have side effects that nobody can predict, a very steep learning curve and therefor many people screwing up something in their (first) setups as well as maintainability nightmares.

Probably some day someone will prove me wrong but right now one of my biggest goals to improve security, reliability and people being able to contribute is reducing complexity.

After all this is what many of us do when they refactor systems.

I am sticking with the UNIX philosophy at this point and in the foreseeable future I will not have a big dev-team at my disposal as companies like Alphabet have to maintain and safe-guard all of this complexity.

hedora · on March 5, 2020

From a developer perspective, k8s seems ripe for disruption.

It does a bunch of junk that is trivial to accomplish on one machine - open network ports, keep services running, log stuff, run in a jail with dropped privileges, and set proper file permissions on secret files.

The mechanisms for all of this, and for resource management, are transparent to unix developers, but in kubernetes they are not. Instead, you have to understand a architectural spaghetti torrent to write and execute “hello world”.

It used to be similar with RDBMS systems. It took months and a highly paid consultant to get a working SQL install. Then, you’d hire a team to manage the database, not because the hardware was expensive, but because you’d dropped $100k’s (in 90’s dollars) on the installation process.

Then mysql came along, and it didn’t have durability or transactions, but it let you be up and running in a few hours, and have a dynamic web page a few hours after that. If it died, you only lost a few hours or minutes of transactions, assuming somone in your organization spent an afternoon learning cron and mysqldump.

I imagine someone will get sufficiently fed up with k8s to do the same. There is clearly demand. I wish them luck.

p_l · on March 5, 2020

It seems to be. When you start implementing something, you'll soon find that most of that complexity is inherent, not accidental. Been there, done that, didn't even get the T-Shirt.

Today it's much easier to package nicer API on top of the rather generic k8s one. There are ways to deploy it easier (in fact, I'd wager that a lot of complexity in deploying k8s is accidental due to deploy tools themselves, not k8s itself. Just look at OpenShift deploy scripts...)

rclayton · on March 5, 2020

Exactly - I deployed a k8s cluster with eksctl in about 10 min. Deployed all my services within another 10min (plain Helm charts).

hrgiger · on March 5, 2020

This, kubernetes is container orchestration framework. If you know your needs then you have opportunity to make a better decision and planning. I had similar experience with ingress also I would like to add installing Kubernetes on bare metal is a pain in the neck, even with kubespray.

Here is a comparison with other frameworks, from 2018: https://arxiv.org/pdf/2002.02806.pdf

geggam · on March 5, 2020

Running k8s on GKE is the only rational choice for anyone who isnt a large enterprise with a team of Google SREs to support k8s.

Period.

halbritt · on March 5, 2020

I'm curious how, with your experience, that you came to the conclusion that k8s is a monster to run?

rawoke083600 · on March 5, 2020

"Pod, Deployment, Job, CronJob, Service, Ingress, ConfigMap, Secre"

Wow as a new developer coming onboard your company, I will walk out the door after seeing that, and the fact that you admit its a small serivce.

jorams · on March 5, 2020

The names may seem daunting, but here's what they do: a Pod is one or more containers, usually 1, that run an application. A Deployment manages a number of the same pods, making sure the right number is running at all times and rolling out new versions. A Job runs a pod until it succeeds. A CronJob creates a job on a schedule. A service provides a DNS name for pods of a certain kind, automatically load balancing between multiple of them. An Ingress accepts traffic from the outside and routes it to services. A ConfigMap and a Secret are both just sets of variables to be used by other things, similar to config files but in the cluster.

It's a small service according to " web scale", but it's serving, and vital for, a good number of customers.

eska · on March 5, 2020

People seem to hate learning vocabulary for concepts that they are already dealing with. Personally I love having a common vocabulary to have precise conversations with other developers.

hedora · on March 5, 2020

It’s more than that. People hate learning concepts that should be abstracted away from then.

As an example, why can’t ConfigMap and Secret just be plain files that get written to a well known location (like /etc)?

Why should the application need to do anything special to run in kubernetes? If they are just files, then why do they have a new name? (And, unless they’re in /etc, why aren’t they placed in a standards-compliant location?)

If they meet all my criteria, then just call them configuration files. If they don’t, then they are a usability problem for kubernetes.

kube-system · on March 5, 2020

ConfigMaps and Secrets can hold multiple files, or something that isn't a file at all, and your application doesn't have to do anything special to access them. You define how they configure your application in the pod/deployment definition. They solve a problem not previously solved by a single resource.

Maybe you don't personally find value in the abstraction, but there are certainly people who do find it useful to have a single resource that can contain the entire configuration for a application/service.

curryst · on March 5, 2020

They can be files on the disk. They can also be used as variables that get interpolated into CLI args. What they really are is an unstructured document stored in etcd that can be consumed in a variety of ways.

As the other user said, they can also be multiple files. I.e. if I run Redis inside my pod, I can bundle my app config and the Redis config into a single ConfigMap. Or if you're doing TLS inside your pod, you can put both the cert and key inside a single Secret.

The semantics of using it correctly are different, somewhat. But you can also use a naive approach and put one file per secret/ConfigMap; that is allowed.

nrb · on March 5, 2020

It's an afternoon worth of research to understand the basic concepts. Then, with the powerful and intuitive tooling you can spin up your own cluster on your computer in minutes and practice deploying containers that:

- are automatically assigned to an appropriate machine(node) based on explicit resource limits you define, enabling reliable performance

- horizontally scale (even automatically if you want!)

- can be deployed with a rolling update strategy to preserve uptime during deployments

- can rollback with swiftness and ease

- have liveness checks that restart unhealthy apps(pods) automatically and prevent bad deploys from being widely released

- abstracts away your infrastructure, allowing these exact same configs to power a cluster on-prem, in the cloud on bare metal or vms, with a hosted k8s service, or some combination of all of them

All of that functionality is unlocked with just a few lines of config or kubectl command, and there are tools that abstract this stuff to simplify it even more or automate more of it.

You definitely want some experienced people around to avoid some of the footguns and shortcuts but over the last several years I think k8s has easily proven itself as a substantial net-positive for many shops.

scarface74 · on March 5, 2020

So why should I do all of that instead of throwing a little money at AWS, run ECS and actually spend my time creating my product?

Heck, if my needs are simple enough why should I even use ECS instead of just putting my web app on some VM's in an auto-scaling group behind a load balancer and used managed services?

Legogris · on March 5, 2020

I don't think anyone is arguing that you should use k8s for a simple web app. There's definitely some inherent stack complexity threshold before solutions like k8s/mesos/nomad are warranted.

When you start having several services that need to fail and scale independently, some amount of job scheduling, request routing... You're going to appreciate the frameworks put in place.

My best advice is to containerize everything from the start, and then you can start barebones and start looking at orchestration systems when you actually have a need for it.

scarface74 · on March 5, 2020

If you need to fail and scale independently.

-you can use small EC2 instances behind an application load balancer and within autoscaling groups with host based routing for request routing.

- converting a stand-alone api to a container is not rocket science and nor should it require any code rewrite.

- if you need to run scheduled Docker containers that can also be done with ECS or if it is simple enough lambda.

- the first thing you should worry about is not “containerization”. Its getting product market fit.

As far as needing containerization for orchestration, you don’t need that either. You mentioned Nomad. Nomad can orchestrate anything - containers, executables, etc.

Not to mention a combination of Consul/Nomad is dead simple. Not saying I would recommend it. In most cases (I’ve used it before), but only because the community and ecosystem is smaller. But if you’re a startup, you should probably be using AWS or Azure anyway so you don’t have to worry about the infrastructure.

geggam · on March 5, 2020

What requirement is driving containers ?

How are you managing your infrastructure and if you have that already automated how much effort is it to add the software you develop to that automation vs the ROI of adding another layer of complexity ?

The idea everything needs to be in containers is similar to the idea everything needs to be in k8s.

Let the business needs drive the technology choices don't drive the business with the technology choices

Legogris · on March 5, 2020

Portability - the idea being that you can migrate to an orchestration technology that makes sense when you have the need. The cost and effort of containerizing any single service from the get-go should be minimal. It also helps a lot with reproducibility, local testing, tests in CI, etc.

Valid reasons to not run containerized in production can be specific security restrictions or performance requirements. I could line up several things that are not suitable for containers, but if you're in a position of "simple but growing web app that doesn't really warrant kubernetes right now" (the comment I was replying to), I think it's good rule of thumb.

I agree with your main argument, of course.

geggam · on March 5, 2020

The overhead managing a container ecosystem to run production is not trivial. IF you are doing this in a service then by all means leverage that package methodology.

If you are managing your systems who already have a robust package management layer then adding the container stacks on top of managing the OS layers you have just doubled the systems your operations team is managing.

Containers also bring NAT and all sorts of DNS / DHCP issues that require extremely senior well rounded guys to manage.

Developers dont see this complexity and think containers are great.

Effectively containers moves the complexity of managing source code into infrastructure where you have to manage that complexity.

The tools to manage source code are mature. The tools to manage complex infrastructure are not mature and the people with the skills required to do so ... are rare.

Legogris · on March 5, 2020

> If you are managing your systems who already have a robust package management layer then adding the container stacks on top of managing the OS layers you have just doubled the systems your operations team is managing.

Oh yeah, if you're not building the software in-house it's a lot less clear that "Containerizate Everything!" is the answer every time. Though there are stable helm charts for a lot of the commonly used software out there, do whatever works for you, man ;)

> Containers also bring NAT and all sorts of DNS / DHCP issues that require extremely senior well rounded guys to manage.

I mean, at that point you can just run with host mode networking and it's all the same, no?

scarface74 · on March 5, 2020

Or you can just use ECS/Fargate and each container registers itself to Route53 and you can just use DNS...

scarface74 · on March 5, 2020

Out of all of the business concerns that a startup or basically any company has, “cloud lock in” is at the bottom of the list.

scarface74 · on March 5, 2020

Creating the infrastructure is using your infrastructure as code framework of choice - Terraform or CloudFormation.

Monitoring can be done with whatever your cloud platform provides.

geggam · on March 5, 2020

Hate to tell you this but that second idea is much easier to manage from an operations standpoint and significantly more reliable.

Also easier to debug and monitor... but you run your business to make developers happy right ?

p_l · on March 5, 2020

Having worked with ECS 2.0 when Fargate was released, the setup was much harder to use from operations standpoint the moment you needed anything more complex (not to mention any kind of state) than k8s. Just getting monitoring out of AWS was annoyance involving writing lambda functions to ship logs...

scarface74 · on March 5, 2020

Fargate has built in logging and monitoring via CloudWatch. It redirects console output to CloudWatch logs.

p_l · on March 5, 2020

And I'll be blunt that CloudWatch, especially logs, aren't all that hot in actual use. It might have gotten better, but all places I worked for last two years if they used AWS they depended on their own log aggregation, even if using AWS-provided Elastic Search.

scarface74 · on March 6, 2020

There are other supported options.

https://aws.amazon.com/blogs/opensource/fargate-container-lo...

rclayton · on March 6, 2020

CloudWatch sucks. Sorry.

takeda · on March 5, 2020

I think that was his point (you are referring to using VMs, right?)

hedora · on March 5, 2020

> It's an afternoon worth of research to understand the basic concepts. Then, with the powerful and intuitive tooling you can spin up your own cluster on your computer in minutes and practice deploying containers that.

Source? I’ve never heard of someone going from “what’s kubernetes?” to a bare metal deployment in 4 hours.

p_l · on March 5, 2020

That's more for minikube, which is a few minutes to run, as it is specifically for developer to test things locally on their computer.

The basic concepts in k8s are also pretty easy to learn, provided you go from the foundations up -- I have a bad feeling a lot of people go the opposite way.

akvadrako · on March 5, 2020

That’s a pretty small API. I don’t know what developers are doing if they can’t learn that in stride.

rclayton · on March 5, 2020

Really? If someone told me they were going to write all the glue code that basically gets you the same thing a UI deployment of k8s and a couple yaml files can provide, I’d walk out.

p_l · on March 5, 2020

Reminds me of the business trying to reduce our spend on VMware licenses while keeping the same architecture.

A high level person actually asked me to reimplement vCenter :|