How are you getting these results? Even with grounding in sources, careful context engineering, and whatever other technique comes to mind, we are just getting sloppy junk out of every model we have tried.
The sketchy part is that LLMs are super good at faking confidence and expertise, all while randomly injecting subtle but critical hallucinations. This ruins basically all significant output. Double-checking and babysitting the results is a huge time and energy sink, and human post-processing negates nearly all of the benefits.
It's not like there is zero benefit to it, but I am genuinely curious how you get consistently correct output for a "complicated subject matter like insurance".
I genuinely think the biggest issue with LLM tools is that most people expect magic, because first attempts at some simple things feel magical. However, the tools take an insane amount of time to build expertise in. What is confusing is that SWEs in general spend immense amounts of time learning the tools of the trade, but this seems to escape a lot of people when it comes to LLMs. On my team, every developer is using LLMs all day, every day. On average, based on sprint retros, each developer spends no less than an hour each day experimenting/learning/reading about how to make them work. The realization we made early on is that when it comes to LLMs there are two large groups:
- a group that sees them as invaluable tools capable of being an immense productivity multiplier
- a group that tried things here and there and gave up
We collectively decided that we want to be in the first group and were willing to put in the time to get there.
I'm persisting. I've been using LLMs quite a bit for the last year, and they're now where I start with any new project. Throughout that time I've been doing constant experimentation and have made significant workflow improvements.
I've found that they're a moderate productivity increase, i.e. on a par with, say, using a different language, using a faster CI system, or breaking down some bureaucracy. Noticeable, worth it, but not entirely transformational.
I only really get useful output from them when I'm holding _most_ of the context that I'd be holding if writing the code, and that's a limiting factor on how useful they can be. I can delegate things that are easy, but I'm hand-holding enough that I can't realistically parallelise my work that much more than I already do (I'm fairly good at context switching already).
How are you measuring increased productivity? Honest question, because I've seen teams claim more code, but I've also seen teams say they're seeing more unnecessary churn (which is more code).
I'm interested in business outcomes, is more code or perceived velocity translating into benefits to the business? This is really hard to measure though because in pretty much any startup or growing company you'll see better business outcomes, but it's hard to find evidence for the counterfactual.
Same as we have for a decade before LLMs: story points. We move faster now, and we have automated stuff we could never automate before. Same project, largely the same team since 2016; we just get a lot more shit done, a lot more.
Hehe, not snarky at all, great question. This was heavily discussed, but in order to measure productivity gains we kept the estimations the same as before. As my colleague put it, you don't estimate based on a "10x developer", so we applied the same concept. Now that everyone is "on board" we are phasing this out.
Thanks. I'm probably a kook, but I've never wanted to put tasks unrelated to product or user-visible features (tests, code cleanup, etc.) on the board with story points; I just fold that into the related user work (mainly to avoid some product person thinking they "own" it and can make technical decisions).
So the product velocity didn't exactly go up, but you are now producing less technical debt (hopefully) at a similar velocity. Sounds reasonable.
I'm glad you're more productive, although I would question this result both in terms of objectivity (story points are typically very subjective) and in terms of capturing all the externalities of the LLM workflow. It's easy to have "build the thing", "fix the thing", "remove tech debt in the thing", and "replace the thing" be four separate projects, each with story points, where "build the better thing" would have been one, and that kind of churn is something there is evidence of with LLM development.
Don't you think it would be better to get that expertise in actual system design, software engineering, and all the programming-related fields? By getting ChatGPT to write code, we'll eventually lose the skill to sit and craft code the way we have all these years. After all, the brain's neural pathways only remember what you put to work daily.
- lots of experimentation; specifically, I have spent hours and hours redoing the exact same feature (my record is 23 times).
- if something "doesn't work" I immediately create a task to investigate and understand it. Even for the smallest thing that bothers me, I will spend hours figuring out why it might have happened (this is sometimes frustrating) and how to prevent it from happening again (this is fun)
My colleague describes the process as a JavaScript developer trying to learn Rust while tripping on mushrooms :)
> It's not like there is zero benefit to it, but I am genuinely curious how you get consistently correct output for a "complicated subject matter like insurance".
Most likely by trying to get a promotion or bonus now and getting the hell out of Dodge before anyone notices those subtle landmines left behind :-)
Cynical, but maybe not wrong. We are plenty familiar with ignoring technical debt and letting it pile up. Dodgy LLM code seems like more of that.
Just like tech debt, there's a time for rushing. And if you're really getting good results from LLMs, that's fabulous.
I don't have a final position on LLMs, but it has only been two days since I worked with a colleague who definitely had no idea how to proceed when they were off the "happy path" of LLM use, so I'm sure there are plenty of people getting left behind.
Wow, the bad faith is quite strong here. As it turns out, small to mid-sized insurance companies have some ridiculously poorly architected front ends.
Not everyone is the biggest cat in town with infinite money and expertise. I have no intention of leaving anytime soon, so I am confident that the code the AI generated (after confirming with our guy who is the insurance OG) is a solid improvement over what was there before.
The bad faith is super strong when it's being swamped by a lot more bad faith driven by greed. I'm not talking about you, but about all these companies with overnight valuations in the billions and their PR machines.
As to your example, frankly, I would have started with that very important caveat: an initial situation defined by very poor quality. It's a very valid angle, as a lot of the code out there today is of very low quality, and if AI can take a 1/10 or 2/10 and make it a 5/10 or 6/10, then yes, everyone benefits.
A lot of programmers that say that LLMs are awesome tend to be inexperienced, not good programmers, or just gloss over the significant amount of extra work that using LLMs requires.
Programmers tend to overestimate their knowledge of non-programming domains, so the OP is probably just not understanding that there are serious issues with the LLM's output for complicated subject matters like insurance.
It depends a lot. I use it for one-off scripts, particularly for anything Microsoft 365 related (expanding SharePoint drives, analyzing AWS usage, general IT stuff). Where there is a lot of heavy, context-dependent business logic it will fail, since there's too much context for it to be successful.
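To give a rough idea of the kind of one-off script I mean, here is a simplified sketch (not one of our actual scripts; the metric and grouping are just assumptions): summarizing last month's AWS spend per service with boto3's Cost Explorer client. This is exactly the sweet spot: small, self-contained, and easy to check by eye.

    # Hedged sketch of a one-off "analyze AWS usage" script (illustrative only).
    # Assumes AWS credentials with ce:GetCostAndUsage permission are configured.
    import boto3
    from datetime import date, timedelta

    ce = boto3.client("ce")  # Cost Explorer

    end = date.today().replace(day=1)                  # first day of this month
    start = (end - timedelta(days=1)).replace(day=1)   # first day of last month

    resp = ce.get_cost_and_usage(
        TimePeriod={"Start": start.isoformat(), "End": end.isoformat()},
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )

    # Print last month's cost broken down by service.
    for group in resp["ResultsByTime"][0]["Groups"]:
        service = group["Keys"][0]
        amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
        print(f"{service}: ${amount:,.2f}")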
I work in custom software where the gap in non-LLM users and those who at least roughly know how to use it is huge.
It largely depends on the prompt, though. Our ChatGPT account is shared, so I get to take a gander at the other usages, and it's pretty easy to see: "okay, this person is asking the wrong thing". The prompt and the context have a major impact on the quality of the response.
In my particular line of work, it's much more useful than not. But I've been focusing on helping build the right prompts with the right context, which makes many tasks actually feasible where before they would have been way out of scope for our clients' budgets.
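As a concrete, made-up illustration of what "the right prompt with the right context" means in practice (the business rules, data shapes, and model name below are placeholders, not our real setup), the difference is usually whether the model is given the rules it needs or has to invent them:

    # Hedged sketch: the same request, asked with the business context it depends on.
    # All domain details here are invented for illustration; "gpt-4o" is a placeholder.
    from openai import OpenAI

    client = OpenAI()

    # What someone might type on its own: no context, so the model has to guess.
    bare_request = "Write a function that calculates the renewal premium."

    # What actually works: spell out the rules, the input shape, and the expected output.
    context = """
    You are helping with our policy-admin system.
    Business rules (assume these; do not invent others):
    - renewal premium = base premium * rate factor for the policyholder's risk tier
    - risk tiers: 'low' -> 0.95, 'standard' -> 1.00, 'high' -> 1.20
    - premiums are integers in cents; round half up
    Input: a dict with 'base_premium_cents' and 'risk_tier'.
    Output: a single Python function with type hints and a docstring.
    """

    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": context},
            {"role": "user", "content": bare_request},
        ],
    )
    print(resp.choices[0].message.content)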