I use it a lot now for knocking up grafana charts etc. It’s not so much that the LLM is feeding the numbers through. You can still use real tools to analyse and summarise the numbers, it’s just much quicker at driving them.
As ever with data analysis, two things will continue to be true. Real insights come from spotting something that looks off and digging into it deeper. Secondly, it’s really easy to connect data in a misleading way.
I’ve had a Claude analysis handed to me this morning including a summary list of actions we’re going to take next which falls into this very trap.
The insights you’ll get from your data will only be as deep as the curiosity of the person at the helm.
Sure it depends how it is done but for most uses I'd say they are not appropriate - building tools with them is ok if you double check (though how many people will when the answers seem good enough at first?).
I'd find it really troubling if financial analysts are using them without knowing the deep limitations of the tooling (which the companies selling them will not highlight for you). They don't actually count or reason so they are liable to just make up figures based on their training dataset, not the data you give them.
Using them for actual financial analysis and generating reports based on data will lead to hallucinated figures that conform to what was asked for rather than what the data says, with gaps in the data silently filled in. It's extremely dangerous and not something they are good at at all.
Don’t get me wrong, I very much agree with the danger. As I highlighted - I saw it this morning when someone used Claude to draw the wrong conclusions.
I’m saying there is a way in which they can be used where there isn’t scope for numerical hallucinations at all. They can write sql queries, for example, without ever being allowed to even see numbers.
What invariably does and will happen though is they’ll inner join instead of left join and some data will get missed. Or there will be some missing context (users in this set already have a certain class of property by virtue of some selection bias and that will be mistreated as some signal etc).
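To make the join point concrete, here's a minimal sketch using Python's built-in sqlite3. The `users` and `orders` tables and their rows are made up purely for illustration. The only difference between the two queries is the join type, and the inner join drops the user with no orders without any error or warning.

```python
import sqlite3

# Hypothetical toy schema: three users, one of whom has no orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (user_id INTEGER, amount REAL);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
    INSERT INTO orders VALUES (1, 10.0), (1, 5.0), (2, 7.5);
""")

# INNER JOIN: only users with at least one matching order survive.
inner_rows = conn.execute("""
    SELECT u.name, COUNT(o.user_id) AS n_orders
    FROM users u
    INNER JOIN orders o ON o.user_id = u.id
    GROUP BY u.id, u.name
    ORDER BY u.id
""").fetchall()

# LEFT JOIN: every user is kept; COUNT skips the NULL from the missing match.
left_rows = conn.execute("""
    SELECT u.name, COUNT(o.user_id) AS n_orders
    FROM users u
    LEFT JOIN orders o ON o.user_id = u.id
    GROUP BY u.id, u.name
    ORDER BY u.id
""").fetchall()

print(inner_rows)  # 'cat' is missing entirely
print(left_rows)   # 'cat' appears with zero orders
```

Both result sets look perfectly plausible on their own, which is why this kind of slip is hard to spot unless you already know how many users to expect.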
For the people that didn’t live through this time, lining these images up makes it obvious why those that did speak of how visually impressive the Amiga was.
It always makes me a bit sad that everyone knows RHCP but less so their early stuff. Blood Sugar Sex Magik is a funk masterpiece. Didn’t help that for years Spotify used the singles versions of the tracks so the levels were all over the place and it was basically unstreamable.
I loved all the early stuff. Freaky Styley, Mother's Milk, The Uplift Mofo Party Plan. With Rick Rubin at the controls I just think Blood Sugar Sex Magik took their sound to another level.
JJ can save conflict related state with the change so that you don't need to resolve a conflict in the middle of a stack of changes for rebasing to continue for the remaining changes. Concretely, it uses a "conflict algebra" where it can track the impact of a conflict as it propagates through the stack of rebased changes: https://docs.jj-vcs.dev/latest/technical/conflicts/
You have to fix them at some point, but not in the middle of doing other things. Right now. With no possible way to make progress elsewhere while deferring this decision.
Not really very similar at all for the scenario discussed here. Rerere remembers how you have resolved a conflict before. It doesn't let you rebase a stack of commits that result in different conflicts. You will have to stop and resolve each conflict and then `git rebase --continue`.
Avoiding manual conflict resolution isn't really a good thing though - conflicts are an indication that multiple different changes affect some code and you really should think hard about what the combination of them should be. Even what git does automatically already can be dangerous.
> To what degree did I expand scope because I knew I could do more using the AI?
Someone at work recently termed this “Claude Creep”. It’s so easy to generate things that it pushes you towards going further, but the reality is that you’re setting yourself up for more and more work to get them over the line.
If you’re an employee who can finish their work 25% faster but you’re not getting a 4-day work week, what are the incentives for not introducing creep?
Just sitting around and thinking of solutions to your own problems beats giving yourself work. As refactor_nietzsche would put it, resource slack is what lets you be a refactor_master instead of a refactor_slave. If you feel pressured into self-imposed creep, it's probably because you've internalized the idea that having too much slack makes you look dangerous to your superiors, so you default to playing the worker bee.
As someone who really hopes developers get much more productive in our team, I would hope we are able to finally keep pace with the desired roadmap and fix bugs at a much faster rate so customers don’t have to wait too long for things to get fixed. Currently, we always run late and the roadmap has to be changed to remove less important things so we focus on fewer things. That is not ideal: we are definitely losing business when we drop items, and we know that because potential customers keep running into use cases we knew would be useful but never managed to get to.
It's a matter of whether you are just writing more regular quality things, or whether you are improving the quality of what you write. There's many things that increase quality, but are time consuming, which Claude Code can do for you.
One thing I recently did was run a pass over some unit test and functional test suites, asking for standardization on initialization, and creating reasonable helper methods to minimize boilerplate. Any dev can do that, if they have a week, and it'll make future code changes more pleasant. For Claude, an hour was a -8000 line PR that kept all the tests, with all the assertions.
It's what people need to figure out out of a codebase. Our normal quality practices have an embedded max safe speed for changes without losing stability. If you use LLMs to try to change things faster, the quality practices have to improve if one wants to keep the number of issues per week constant. Whether it's improving testing, or sending the LLM to look at logs and find the bugs faster, one needs to increase the quality budget.
Dude. I’ve been thinking about this a lot! I think it’s because the traditional way we internalize the costs of what we are building just got taken for a ride. We don’t really (or I don’t anyway) fully know what “too much scope” feels like with one of these Claude thingies. So it’s easy to both overestimate complexity and underestimate it too. Sometimes the LLM makes a seemingly daunting refactor super simple and sometimes something seemingly not complex can take it forever… and there really isn’t, for me, a good “gut sense” of how something will go.
So lately I’ve just decided that I’ll time box things instead of set defined endpoints. And by “endpoint” I really mean “I’m done for the day” and honestly maybe thinking about it… “I’m done with this project”.
I don’t know. But the term “Claude Creep” is absolutely something I can identify with. That thing will take you down a rathole that started with just pulling in some document and ends with you completely repartitioning your file system. lol.
Some of the expanded scope that I’ve done almost for free is usually around UX polish and accessibility. I even completely redid the --help for a few CLI tools I have when I would never have invested over an hour on each before agents.
I agree that the efficiency and quality are very hard to measure. I’m extremely confident that when used well, agents are a huge gain for both though. When used poorly, it is just slop you can make really fast.
the flip side of claude creep is that the easy parts are now genuinely free, which means all your time goes to the 30% that was already hard. ai doesn't save you time on the hard bits, it just eliminates the excuse to not have done the easy bits first. what's helped: think in postconditions, not tasks. instead of 'add feature X', define 'the tests pass and the user can do Y'. the agent figures out what X means. without that anchor there's nothing to mark as done, so scope drifts indefinitely.
100%
Over the years I've amassed hundreds of code boilerplate snippets/templates that I would copy and paste and then modify, and now they're all just sitting in Obsidian gathering dust.
Why would I waste my time copying and pasting when I can just have Claude generate me basic ansible playbooks on the spot in 30 seconds.
Cognitive overload? For me, it’s easier to construct a mental model of the thing than have a full (sometimes complex) example which may not necessarily be valid and/or on point. And as I’ve been exploring some foundational ideas of computing, you can do away with a lot of complexity in modern development.
Yup same. Going a little insane trying to set up a new device that's having network issues. Claude was helping me diagnose. Seems like I've infected the whole internet.
One annoying thing about that flow is that when you change the world outside the model it breaks its assumptions and it loses its way faster (in my experience).
Ha! I had someone doing a task the other day and the LLM they used wrote a regex in Perl. I joked that in 25 years all that seems to have changed is the number of layers between me and Perl.