the elixir shop I was at, folks just repl'd into prod to do work. Batshit insani...

andrewmutz · on Sept 13, 2024

> repl'd into prod to do work

Like for debugging production problems and fixing customer data? Or for normal development?

If it's the former, that's a great use of technology, and if it's the latter, it sounds insane.

psd1 · on Sept 14, 2024

It's an order of magnitude harder to debug when you don't have access to prod, but there's a reason to block that access. I think you need to put controls on that fairly early in your project's evolution.

Any good strategies to reduce the pain? My previous employer never solved this.

I always wanted to explore contextual logging - by which I mean, logging is terse by default, but in an error state, the stack is walked for contextual info to make the log entry richer; and also, ideally, previous debug log entries that are suppressed by default are instead written. I guess that implies buffering log entries and writing only a subset at the end of the happy path.

To illustrate what I mean: happy path log:

10:21:04 Authenticated
10:21:05 Scumbulated

Error condition log:

10:21:04 Authenticating id 49234 request DEADBEEF
         IDP response OK for 49234 request DEADBEEF https://idp.dundermifflin.com
         Session cookie OK request DEADBEEF
         Authenticated
10:21:05 Scumbulating flange 7671529 user 49234 request DEADBEEF
         NullFlangeError flange 7671529 at scum.py:265
         Frame vars a=42, password=redacted, flags=0x05

I'm reacting to hard-to-repro bugs at $employer where we chucked logging statements at a dartboard, deployed, waited, didn't capture the issue, and repeated that several times. At a cadence of 5-10 deploys a week, this is below what I consider acceptable velocity. We often took days to fix major bugs, and we'd run degraded for weeks at a time.
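
A minimal sketch of the buffering idea above, in Python (assumed only because the sample log mentions scum.py). It uses the standard logging module. The class name BufferUntilErrorHandler, its end_request() method, and the logger name "scum_demo" are hypothetical, made up for illustration. It covers only the "suppress debug entries unless something failed" part, keeps one shared buffer rather than one per request, and does not walk the stack for frame variables.

```python
import io
import logging


class BufferUntilErrorHandler(logging.Handler):
    """Hypothetical handler: buffer every record, decide what to write at the end."""

    def __init__(self, target, error_level=logging.ERROR):
        super().__init__(level=logging.DEBUG)
        self.target = target            # the handler that actually writes entries
        self.error_level = error_level  # level that counts as "error state"
        self.buffer = []

    def emit(self, record):
        # Nothing is written yet; records keep their original timestamps.
        self.buffer.append(record)

    def end_request(self):
        """Write the whole buffer if anything failed, else only INFO and above."""
        failed = any(r.levelno >= self.error_level for r in self.buffer)
        for record in self.buffer:
            if failed or record.levelno >= logging.INFO:
                self.target.handle(record)
        self.buffer = []


# Usage: write to an in-memory stream so the result is easy to inspect.
stream = io.StringIO()
target = logging.StreamHandler(stream)
target.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

handler = BufferUntilErrorHandler(target)
log = logging.getLogger("scum_demo")
log.setLevel(logging.DEBUG)
log.propagate = False
log.addHandler(handler)

# Happy path: the debug entry is buffered, then dropped.
log.debug("Authenticating id 49234 request DEADBEEF")
log.info("Authenticated")
handler.end_request()
happy_output = stream.getvalue()

# Error path: the same debug entry is written because an error followed it.
log.debug("Scumbulating flange 7671529 user 49234 request DEADBEEF")
log.error("NullFlangeError flange 7671529")
handler.end_request()
error_output = stream.getvalue()[len(happy_output):]
```

The standard library's logging.handlers.MemoryHandler is close to this, but it also flushes when its buffer fills up, so it would write debug entries on a long happy path. A real service would also need one buffer per request (for example keyed by request id) and some cap on buffer size.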

ryoshu · on Sept 14, 2024

It has hot swappable code. For skilled practitioners it's fine.

sethammons · on Sept 14, 2024

> For skilled practitioners it's fine

That sounds like something that will not scale, because people make mistakes. Our DBA at $prev_gig was very skilled. And also truncated a production database. Mistakes happen. Write access to prod all willy-nilly gives me the heebie-jeebies.

andy_ppp · on Sept 13, 2024

It depends on the situation. If something is broken in a live system and you can log in and introspect the real thing, this is awesome. Obviously there are trade-offs, and potentially you might break things further!

sethammons · on Sept 14, 2024

I'm used to introspecting live data with read-only access. They used write access and were one accidental keystroke away from deleting data. Writing in prod should take permission escalation.

psd1 · on Sept 14, 2024

Sometimes auditors won't even ok read-only access. I'd have loved RO access, but we hosted financial institutions etc.

Nice if you can get it.

andy_ppp · on Sept 19, 2024

With great power and all that...