* You want to simplify infrastructure, but there's a new learning curve here. Why did you decide to go with diagramming as a solution? What other methods did you evaluate and discard?
* How does an organization with existing infrastructure implement Massdriver?
* How do you handle edge cases, custom configurations, complex logic, etc.? For example, workflows that use custom scripts or some other form of band-aid.
* The visual approach could make it too easy to piece together infrastructure without understanding the implications. How do you prevent developers from creating poorly architected systems just because you make it simple to connect components?
* When things go wrong, how do developers debug issues at the infrastructure level? Do they reach out to ops?
> Why did you decide to go with diagramming as a solution?
I had a similar idea. I have enough experience with visual programming environments to be wary. Here are my thoughts on why it might be a good approach here:
* It would be possible to take a whiteboard scribble and turn it into a real system. Combining this with the services available in the cloud, you end up with something really powerful. It all comes down to the level of abstraction supported. You have to be able to draw boxes at a level that adds value, but also zoom in to parameters at the service/API level as necessary.
* I've worked on a team that was responsible for designing and maintaining its own AWS infrastructure, and with that comes the responsibility for controlling cost. A living architectural diagram that also reports cost in near real time would be really helpful, especially if you could start to do things like project cost for a given level of traffic or some other measure.
Once you have a decent library of TF modules, an understanding of networking and compute fundamentals, and familiarity with the services offered by your cloud provider, you have something really powerful. If a service can help accelerate that, it's worth it IMHO.
You have really hit the nail on the head with what we were going for! Very early on, Cory and I said, "We draw this stuff, agree on it, then go build it in TF, which is where the problems start."
We imagined a world where you could go into architecture review and come out of that meeting with staging stood up and ready to run your application.
This makes sense for infra because it's mostly config management and API calls. Visual programming in general is rough because control structures are so hard to visualize.
> * You want to simplify infrastructure, but there's a new learning curve here. Why did you decide to go with diagramming as a solution? What other methods did you evaluate and discard?
We try to make it so both teams have to learn as little as possible. For the ops team, we are built on the tools those teams are already familiar with: Terraform, Helm, Ansible, etc. Our extension model is also ops-oriented: you add additional provisioners by writing Dockerfiles, and you enforce pre-validations with JSON Schema (this is the best we could come up with, but we figured it was a safe bet ops-wise since it's part of OpenAPI). Devs don't have to learn the ops team's tools to provision infrastructure; they just diagram. Massdriver was originally a wall of YAML to connect all the pieces, but it felt fumbly (and like everything else).
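To make the JSON Schema pre-validation idea concrete, here is a hedged sketch of what an ops team might publish to constrain a module's parameters. The field names and allowed values are made up for illustration; Massdriver's actual schema layout may differ:

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "postgres-instance params (illustrative)",
  "type": "object",
  "required": ["instance_class", "storage_gb"],
  "properties": {
    "instance_class": {
      "type": "string",
      "enum": ["db.t3.medium", "db.r5.large"],
      "description": "Only instance classes the ops team has approved"
    },
    "storage_gb": {
      "type": "integer",
      "minimum": 20,
      "maximum": 1024,
      "description": "Keeps storage requests within a sane, budgeted range"
    }
  }
}
```

Anything a dev enters in the UI that fails this schema would be rejected before a provisioner ever runs.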
I wanted to make a VR version like something youd see in a bad hacker movie, but Dave told me not to get ahead of myself. :D
> * How does an organization with existing infrastructure implement Massdriver?
Depends on whether they have IaC or not. If they have IaC, they publish their modules. If their IaC has a remote state backend, it's usually good to go; if they're using local files for state, we offer a state server they can push state into.
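For the local-state case, pointing an existing module at a remote state server is a one-block change in Terraform. A hedged sketch in Terraform's JSON syntax, using the generic `http` backend with a placeholder address (not Massdriver's actual endpoint):

```json
{
  "terraform": {
    "backend": {
      "http": {
        "address": "https://state.example.invalid/terraform/state/my-module"
      }
    }
  }
}
```

After adding a block like this, `terraform init` offers to migrate the local state file into the remote backend.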
If teams don't have IaC, we run workshops on "reverse terraforming" (or "tofuing") and also offer professional services to codify that stuff for you.
> * How do you handle edge cases, custom configurations, complex logic, etc.? For example, workflows that use custom scripts or some other form of band-aid.
As noted above, it's all based on common ops tooling. Let's say you want to use a new security-scanning tool for IaC and we don't have it in our base provisioners: you can write a Dockerfile, build the image, and then include that scanning tool in any of your Massdriver configs. We also have folks doing day-2 operations with the platform, things like database migrations. The lines in the graph actually carry information and can push that info across different tools, so you can have, say, Helm charts get information from a Terraform run. You could build a provisioner with the psql tool, or a Helm chart running Bucardo, and use it to set up replication between an old and a new Postgres instance.
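The "information carried on the lines" can be pictured as a structured payload emitted by one module and consumed by another. A hypothetical Postgres artifact might look like this (every field name here is invented for illustration):

```json
{
  "data": {
    "authentication": {
      "hostname": "db.internal.example.com",
      "port": 5432,
      "username": "app",
      "password_secret_arn": "arn:aws:secretsmanager:us-west-2:123456789012:secret:db-creds"
    },
    "security": {
      "iam_policy_arns": ["arn:aws:iam::123456789012:policy/db-access"]
    }
  },
  "specs": {
    "engine": "postgresql",
    "version": "15"
  }
}
```

A Terraform run could produce this payload, and a Helm chart downstream could consume it as values, which is how one tool "gets information from" another across a line.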
> * The visual approach could make it too easy to piece together infrastructure without understanding the implications. How do you prevent developers from creating poorly architected systems just because you make it simple to connect components?
The lines and connections are actually a type system that you can extend (also based on JSON Schema). That way ops teams can encode common things into the platform once, e.g. "this is how we authenticate to Postgres: it's an AWS secret, these security groups, and these IAM policies." All of that information flows across the line into the other module. The modules reject invalid types, so common misconfigurations _can't_ happen. It also lets you "autocomplete" infrastructure. Let's say I'm a dev and I want to deploy a database: I can drop it on the canvas, and since Massdriver understands the types, it'll automatically connect it to a subnet that dev has access to.
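One way to picture the typed connections: a hedged sketch of the JSON Schema a consuming module might declare for its Postgres connection, so anything on the other end of the line that doesn't match is rejected before provisioning (field names and the pattern are illustrative, not Massdriver's actual type definitions):

```json
{
  "title": "postgres-authentication connection (illustrative)",
  "type": "object",
  "required": ["hostname", "port", "username", "password_secret_arn"],
  "properties": {
    "hostname": { "type": "string" },
    "port": { "type": "integer", "default": 5432 },
    "username": { "type": "string" },
    "password_secret_arn": {
      "type": "string",
      "pattern": "^arn:aws:secretsmanager:",
      "description": "Credentials must come from AWS Secrets Manager, not plaintext"
    }
  }
}
```

Because both ends of a line agree on a schema like this, the canvas can also infer which compatible endpoints exist, which is what makes the "autocomplete" behavior possible.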
> * When things go wrong, how do developers debug issues at the infrastructure level? Do they reach out to ops?
They may, but we have a lot built in to make the system as truly self-service (through day 2) as possible. There are runbooks per module, so ops teams that have built a module around a use case can put in common troubleshooting steps, all accessible from the same graph. Alarms and metrics also show up there. Ops teams can also publish day-2 modules to the catalog, so developers can drag and drop common one-off maintenance tasks onto their canvas and run them.
The VR version of network management already existed [1]. It was called CA [2] Unicenter TNG. It could really use an update with some Unreal Engine rendering! :D
Unrelated, but it could be confused with what was shown in Jurassic Park as "Unix."