Field Notes from Shipping Real Code with Claude

diwank · 2025-06-08T00:49:46 1749343786

Author here: To be honest, I know there are like a bajillion Claude code posts out there these days.

But, there are a few nuggets we figured are worth sharing, like Anchor Comments [1], which have really made a difference:

——

  # CLAUDE.md

  ### Anchor comments

  Add specially formatted comments throughout the codebase, where appropriate, for yourself as inline knowledge that can be easily `grep`ped for.

  - Use `AIDEV-NOTE:`, `AIDEV-TODO:`, or `AIDEV-QUESTION:` as prefix as appropriate.

  - *Important:* Before scanning files, always first try to grep for existing `AIDEV-…`.

  - Update relevant anchors, after finishing any task.

  - Make sure to add relevant anchor comments, whenever a file or piece of code is:

    * too complex, or
    * very important, or
    * could have a bug

——

[1]: https://diwank.space/field-notes-from-shipping-real-code-wit...
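
For illustration only (this is not from the post): a minimal Python sketch of what "grepping for anchors" amounts to. The function names and the sample snippet are made up for the example; in practice plain `grep -rn "AIDEV-"` does the same job.

```python
import re
from pathlib import Path

# Matches the three prefixes named in the CLAUDE.md excerpt above.
ANCHOR_RE = re.compile(r"AIDEV-(NOTE|TODO|QUESTION):\s*(.*)")


def find_anchors_in_text(text):
    """Return (line_number, kind, message) for each anchor comment in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = ANCHOR_RE.search(line)
        if match:
            hits.append((lineno, match.group(1), match.group(2).strip()))
    return hits


def find_anchors(root, suffixes=(".py",)):
    """Walk root and yield (path, line_number, kind, message) per anchor."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            for lineno, kind, message in find_anchors_in_text(text):
                yield (str(path), lineno, kind, message)


# Hypothetical source file contents, used only to demonstrate the scan.
sample = """\
def charge(user, amount):
    # AIDEV-NOTE: amounts are in cents, not dollars
    # AIDEV-QUESTION: should refunds go through this path too?
    return gateway.charge(user.id, amount)
"""

for hit in find_anchors_in_text(sample):
    print(hit)
```

On the sample this reports the NOTE on line 2 and the QUESTION on line 3; `find_anchors(".")` would do the same across a checkout.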

peter422 · 2025-06-08T02:28:39 1749349719

Just to provide a contrast to some of the negative comments…

As a very experienced engineer who uses LLMs sporadically* and not in any systematic way, I really appreciated seeing how you use them in production in a real project. I don’t know why people are being negative, you just mentioned your project in detail where it was appropriate to talk about the structure of it. Doesn’t strike me as gratuitous self-promotion at all.

Your post is giving me motivation to empower the LLMs a little bit more in my workflows.

*: They absolutely don’t get the keys to my projects but I have had great success with having them complete specific tasks.

diwank · 2025-06-08T02:36:04 1749350164

Really appreciate the kind words! I did not intend the post to be too much about our company, just that it is the codebase I mostly hack on. :)

panny · 2025-06-08T20:00:32 1749412832

>Think of this post as your field guide to a new way of building software. By the time you finish reading, you’ll understand not just the how but the why behind AI-assisted development that actually works.

Hi, AI skeptic with an open-mind here. How much will this cost me to try? I don't see that mentioned in your writeup.

diwank · 2025-06-09T04:01:45 1749441705

There are a bunch of different options so it depends on the models you end up liking but the simplest place to start is to get Claude Max $100 tier and use Opus 4 (the others don’t really get you the full experience)

mcv · 2025-06-09T07:19:13 1749453553

I hear a lot of good stuff about Claude Code lately, but before, it was all about Copilot or Cursor. And some options are a lot cheaper than others. Is Claude Code so much better now?

I admit I have no idea what the real differences are. Everybody seems to claim to be the best and most comprehensive AI coding solution.

indigodaddy · 2025-06-13T09:25:12 1749806712

Isn't Claude Code fully functional/available now on the $20 Pro monthly plan?

diwank · 2025-06-17T04:39:31 1750135171

In my experience, you’ll run out of Opus usage credits way too quickly on the Pro plan

noufalibrahim · 2025-06-08T18:43:16 1749408196

There are a lot of posts around but this was very practical and gives me a system I can try to implement and perhaps improve. Much appreciated. Thanks for taking the time to write it.

One thing I would have liked to know is the difference between a workflow like this and the use of aider. If you have any perspective on that, it would be great.

diwank · 2025-06-09T03:59:18 1749441558

Thank you! aider is a different beast actually, I found its memory/context handling best in class. Somehow though, I ended up liking Claude Code the most because of its TUI but really a matter of personal preference and workflow

kikimora · 2025-06-08T10:41:43 1749379303

Thanks for the great article, this is much needed to understand how to properly use LLMs at scale.

You mentioned that an LLM should never touch tests. Then followed up with an example refactoring changing 500+ endpoints completed in 4 hours. This is impressive! I wonder if these 4 hours included test refactoring as well or is it just prompting time?

diwank · 2025-06-08T16:02:29 1749398549

that didn't include the testing, that def took a lot longer but at least now my devs don't have an excuse for poorly written tests lol

r0b0ji · 2025-06-08T17:27:03 1749403623

At one place you mentioned, if a test is updated by AI, you reject the PR. How do you know if it was generated or updated by AI? From the article I only got that it's a git commit message convention to add that, but that too is only at the commit level.

diwank · 2025-06-08T18:10:57 1749406257

Mostly just good faith during PR reviews. Plus, models other than Opus 4 largely flub it, and it shows

mcv · 2025-06-09T07:21:55 1749453715

Really? I was hoping to use AI to write mocks. That's always the part that I hate most about unit tests.

mafro · 2025-06-08T09:28:30 1749374910

Great post. I'm fairly new to the AI pair programming thing (I've been using Aider), but with 20 years of coding behind me I can see where things are going. You're dead right in the conclusion about now being the time to adopt this stuff as part of your flow -- if you haven't already.

And regarding the HN post getting buried for a while there...[1] Somewhat ironic that an article about using AI to help write code would get canned for using an AI to help write it :D

[1]: https://news.ycombinator.com/item?id=44214437

localhost · 2025-06-08T15:12:18 1749395538

Did you use Claude Code to write the post? I'm finding that I'm using it for 100% of my own writing because agentic editing of markdown files is so good (and miles better than what you get with claude.ai artifacts or chatgpt.com canvas). This is how you can do things like merge deep research or other files into the doc that you are writing.

diwank · 2025-06-08T15:47:53 1749397673

no, just used chatgpt to bootstrap the research :)

here's the original chat: https://chatgpt.com/share/6844eaae-07d0-8001-a7f7-e532d63bf8...

I also used bits from claude research but apparently if you use claude research, they don't let you create a share link -_-

localhost · 2025-06-08T15:56:45 1749398205

Right. But you can copy paste that into a separate doc and have Claude Code merge it in (and not a literal merge - a semantic merge "integrate relevant parts of this research into this doc"). This is super powerful - try it!

gavinray · 2025-06-09T12:34:10 1749472450

Does Claude Code perform any different than browser Claude for writing tasks?

I recently wrote a long Markdown document, and asked Claude, ChatGPT, Grok, and Gemini to improve it.

Comparing outputs, it was very close between Gemini and Claude, but I decided that Claude was slightly better-written.

localhost · 2025-06-09T14:06:11 1749477971

The models are the same, but the actual prompts sent to the model are likely somewhat different because of the agentic loop - so I would imagine (without having done the experiments) there will be slight differences. Unclear whether they will be more or less than the variance across repeated runs of the same prompt in the same interface (e.g., Claude.ai variance vs. Claude Code variance vs. variance between Claude.ai and Claude Code). Would be an interesting controlled experiment to try!

meeech · 2025-06-08T01:11:12 1749345072

Q: How do you ensure tests are only written by humans? Basically just the honor system?

diwank · 2025-06-08T01:13:55 1749345235

You can:

1. Add instructions in CLAUDE.md to not touch tests.

2. Disallow the Edit tool for test directories in the project’s .claude/settings.json file
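
A rough sketch of option 2, written as a small Python helper that produces the settings JSON. It assumes Claude Code reads permission rules from `permissions.deny` in `.claude/settings.json`, with rules shaped like `Edit(<path pattern>)`; the `tests/**` glob and the helper name are placeholders, and the exact rule syntax should be checked against the current docs.

```python
import json

# Assumption: rules live under "permissions" -> "deny" and look like
# "Tool(path-pattern)". "tests/**" stands in for wherever your tests live.


def deny_test_edits(settings, test_globs=("tests/**",)):
    """Return a copy of settings with the Edit tool denied for each glob."""
    updated = json.loads(json.dumps(settings))  # cheap deep copy
    deny = updated.setdefault("permissions", {}).setdefault("deny", [])
    for glob in test_globs:
        rule = f"Edit({glob})"
        if rule not in deny:  # keep the helper idempotent
            deny.append(rule)
    return updated


# Starting from an empty settings object; paste the result into
# .claude/settings.json (or merge it with what is already there).
print(json.dumps(deny_test_edits({}), indent=2))
```

The helper leaves any existing keys alone and only appends the deny rules, so it can be run against a settings file that already has other permissions.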

meeech · 2025-06-08T02:37:27 1749350247

Disallow edit in test dirs is a good tip. thanks.

I meant though in the wider context of the team - everyone uses it but not everyone will work the same, use the same underlying prompts as they work. So how do you ensure everyone keeps to that agreement?

mathgeek · 2025-06-08T13:45:04 1749390304

> So how do you ensure everyone keeps to that agreement?

There's nothing specific to using Claude or any other automation tool here. You still use code reviews, linters, etc. to catch anything that isn't following the team norms and expectations. Either that or, as the article points out, someone will cause an incident and may be looking for a new role (or nothing bad happens and no one is the wiser).
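
As one possible shape for such a check, here is a minimal Python sketch a CI job could run per commit. Everything specific in it is hypothetical: the `[AI]` commit tag, the test-path rules, and the function names are assumptions for illustration, not the article's convention. It also only catches honestly tagged commits, so it complements review rather than replacing it.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical convention: AI-assisted commits carry this tag in the message.
AI_TAG = "[AI]"


def is_test_path(path):
    """Treat anything under a tests/ directory, or named like a test, as a test."""
    p = PurePosixPath(path)
    return (
        "tests" in p.parts
        or fnmatchcase(p.name, "test_*.py")
        or fnmatchcase(p.name, "*_test.py")
    )


def violates_test_policy(commit_message, changed_paths):
    """Return the test files touched by a commit tagged as AI-assisted."""
    if AI_TAG not in commit_message:
        return []
    return [path for path in changed_paths if is_test_path(path)]


flagged = violates_test_policy(
    "[AI] refactor endpoints",
    ["src/api.py", "tests/test_api.py"],
)
print(flagged)
```

A CI wrapper would feed it the commit message and the changed file list from git and fail the build when the returned list is non-empty.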

davidmurdoch · 2025-06-08T16:16:00 1749399360

kasey_junk · 2025-06-07T22:46:13 1749336373

One of the exciting things to me about the ai agents is how they push and allow you to build processes that we’ve always known were important but were frequently not prioritized in the face of shipping the system.

You can use how uncomfortable you are with the AI doing something as a signal that you need to invest in systematic verification of that something. As a for-instance from the link, the team could build a system for verifying and validating their data migrations. That would move a whole class of changes into the AI realm.

This is usually much easier to quantify and explain externally than nebulous talk about tech debt in that system.

diwank · 2025-06-08T02:32:11 1749349931

For sure. Another interesting trick I found to be surprisingly effective is to ask Claude Code to “Look around the codebase, and if something is confusing, or weird/counterintuitive — drop an AIDEV-QUESTION: … comment so I can document that bit of code and/or improve it”. We found some really gnarly things that had been forgotten in the codebase.

tinodb · 2025-06-10T19:47:55 1749584875

But why stick with the AIDEV- style? I mean, this is what comments are for: explaining the why that can’t be read from code. You could just make it a regular comment, right?

diwank · 2025-06-13T04:38:54 1749789534

yup but:

1. AIDEV-* is easier to visually distinguish, and

2. it is grep-able by the models, so they can "look around" the codebase in one glance

theptip · 2025-06-08T04:15:44 1749356144

Agreed, my hunch is that you might use higher abstraction-level validation tools like acceptance and property tests, or even formal verification, as the relative cost of boilerplate decreases.

m_kos · 2025-06-08T19:27:57 1749410877

Good read – I have learned something new.

I have been having a horrible experience with Sonnet 4 via Cursor and Web. It keeps cutting corners and misreporting what it did. These are not hallucinations. Threatening it with deletion (inspired by Anthropic's report) only makes things worse.

It also pathologically lies about non-programming things. I tried reporting it but the mobile app says "Something went wrong. Please try again later." Very bizarre.

Am I the only person experiencing these issues? Many here seem to adore Claude.

j_crick · 2025-06-08T23:57:41 1749427061

I think they might have cut its brains too much in the latest updates.

I remember versions 3.5 doing okay on my simple tasks like text analysis or summaries or little writing prompts. In 4+ versions the thing just can't follow instructions within a single context window for more than 3-4 replies.

When prompted about "why do you keep rambling if I asked you to stay concise" it says that its default settings are overriding its behavior and explicit user instructions, ditto for actively avoiding information that it considers "harmful". After pointing out inconsistencies and omissions in its replies it concedes that its behavior is unreliable and even extrapolates that it is made this way so users keep engaging with it for longer and more often.

Maybe it got too smart to its detriment, but if yes then it's really sad what Anthropic did to it.

threeseed · 2025-06-08T21:14:51 1749417291

Routinely had every one of these issues.

I find it's much better just to use Claude Web and be extremely specific about what I need it to do.

And even then half the code it generates for me is riddled with errors.

diwank · 2025-06-09T04:02:57 1749441777

Claude Code is a really very different experience. I’d say give it a shot but Opus 4 is a step change

wredcoll · 2025-06-09T01:07:35 1749431255

What's the difference between "sonnet 4 via cursor" and claude code?

linsomniac · 2025-06-09T02:46:23 1749437183

"Sonnet 4" is the model, "cursor" is the IDE. "Claude Code" is another IDE, a TUI with a chat interface. Cursor is a VSCode fork with an AI panel in it.

nxobject · 2025-06-08T16:49:23 1749401363

As I read (and appreciate) your documentation tips, I don’t even think they have to be labeled as AI-specific! ‘CLAUDE.md’ seems like it could just as well be ‘CONVENTIONS.md’, and comments to ‘AI’ could just as well be comments to ‘READER’. :) Certainly I would appreciate comments like that when reading _any_ codebase, especially as an unfamiliar contributor.

SatvikBeri · 2025-06-08T17:57:35 1749405455

Not the OP, but in practice, the comments that tend to help Claude tend to be very different from what helps humans. With humans, I'll generally focus on the why.

Our style guide for humans is about 100 lines long, with lines like "Add a ! to the end of a function name if and only if it mutates one of its inputs". Our style guide for Claude is ~500 lines long, and equivalent sections have to include many examples like "do this, don't do this" to work.

__mharrison__ · 2025-06-08T21:00:01 1749416401

I just tried this out with aider. It worked great. Vibe coded a PDF viewer with drawing capabilities in 30 minutes while waiting for a plane...

dkobia · 2025-06-08T12:18:40 1749385120

Thank you for writing this. Many software developers on HN are conflicted about ceding control of software development to LLMs for many reasons including the fact that it feels unstructured and exploratory rather than rigidly planned using more formal methodologies.

There's a good middle ground where the LLMs can help us solve problems faster optimizing for outcomes rather than falling in love with solving the problem. Many of us usually lose sight of the actual goal we're trying to achieve when we get distracted by implementation details.

diwank · 2025-06-08T15:51:53 1749397913

absolutely! I think of these as new levers in the making, rather rusty, and def can often bite you in the behind but worth learning, and perhaps most importantly to help evolve them into useful tools rather than an excuse to ship sloppy engineering

wonger_ · 2025-06-08T02:11:57 1749348717

Some thoughts:

- Is there a more elegant way to organize the prompts/specifications for LLMs in a codebase? I feel like CLAUDE.md, SPEC.mds, and AIDEV comments would get messy quickly.

- What is the definition of "vibe-coding" these days? I thought it refers to the original Karpathy quote, like cowboy mode, where you accept all diffs and hardly look at code. But now it seems that "vibe-coding" is catch-all clickbait for any LLM workflow. (Tbf, this title "shipping real code with Claude" is fine)

- Do you obfuscate any code before sending it to someone's LLM?

diwank · 2025-06-08T02:21:01 1749349261

> - Is there a more elegant way to organize the prompts/specifications for LLMs in a codebase? I feel like CLAUDE.md, SPEC.mds, and AIDEV comments would get messy quickly.

Yeah, the comments do start to pile up. I’m working on a vscode extension that automatically turns them into tiny visual indicators in the gutter instead.

> - What is the definition of "vibe-coding" these days? I thought it refers to the original Karpathy quote, like cowboy mode, where you accept all diffs and hardly look at code. But now it seems that "vibe-coding" is catch-all clickbait for any LLM workflow. (Tbf, this title "shipping real code with Claude" is fine)

Depends on who you ask ig. For me, hasn’t been a panacea, and I’ve often run into issues (3.7 sonnet and codex have had ~60% success for me but Opus 4 is actually v good)

> - Do you obfuscate any code before sending it to someone's LLM?

In this case, all of it was open source to begin with but good point to think about.

jstummbillig · 2025-06-08T10:26:36 1749378396

Is it really though, when a lot of critical business data goes through Google Workspace (usually without client-side encryption), or are we trying very hard to be a bit special in the name of privacy? From a results standpoint I find it curious how interesting people deem their code base to be to an LLM provider.

diwank · 2025-06-08T15:54:44 1749398084

true but this does matter to a lot of enterprise customers that have to obey strict data provenance laws (for instance, there are no gpt-4.1 model endpoints hosted in India and hence, fin-tech companies cannot use those apis)

skerit · 2025-06-08T09:53:10 1749376390

Very interesting, I'm going to use some of these ideas in my CLAUDE.md file.

> One of the most counterintuitive lessons in AI-assisted development is that being stingy with context to save tokens actually costs you more

Something similar I've been thinking about recently: For bigger projects & more complicated code, I really do notice a big difference between Claude Opus and Claude Sonnet. And Sonnet sometimes just wastes so much time on ideas that never pan out, or make things worse. So I wonder: wouldn't it make more sense for Anthropic to not differentiate between Opus and Sonnet for people with a Max subscription? It seems like Sonnet takes 10-20 turns what Opus can do in 2 or 3, so in the end forcing people over to Sonnet would ultimately cost them more.

diwank · 2025-06-08T16:00:52 1749398452

yeah, the max subscription comes in two tiers, $100 gets you 5x the tokens of pro (which only has sonnet) and the $200 gets you 20x. doing the math for tokens is kinda annoying and not straightforward atm. they also have a "hybrid" mode which uses opus until ~20% of the opus tokens are left, then switches to sonnet

wredcoll · 2025-06-09T01:09:34 1749431374

How do you get opus on a subscription? Their system is so confusing.

diwank · 2025-06-09T03:44:58 1749440698

Yeah absolutely. I’m on their Max sub since last month and I still don’t fully get how many tokens I have etc.

Simplest I’d say is: get the $100/mo Max and then npm install -g @anthropic-ai/claude-code

lispisok · 2025-06-08T02:19:50 1749349190

I think most of this is good stuff but I disagree with not letting Claude touch tests or migrations at all. Hand-writing tests from scratch is the part I hate the most. Having an LLM do a first pass on tests which I add to and adjust as I see fit has been a big boon on the testing front. It seems the difference between me and the author is I believe whether code was generated by an LLM or not the human still takes ownership and responsibility. Not letting Claude touch tests and migrations is saying you rightfully don't trust Claude but are giving ownership to Claude for Claude generated code. That, or he doesn't trust his employees to not blindly accept AI slop, and the strict rules around tests and migrations are there to prevent the AI slop from breaking everything or causing data loss.

diwank · 2025-06-08T02:24:31 1749349471

True but, in my experience, a few major pitfalls that happened:

1. We ran into really bad minefields when we tried to come back to manually edit the generated tests later on. Claude tended to mock everything because it didn’t have context about how we run services, build environments, etc.

2. And this was the worst, all of the devs on the team including me got realllyy lazy with testing. Bugs in production significantly increased.

jaakl · 2025-06-08T03:42:11 1749354131

Did you try to put all this (complex and external) context into the context (claude.md or whatever), with instructions on how to do proper TDD, before asking for the tests? I know that may be more work than actually coding it, as you know it all by heart and the external world is always bigger than the internal one. But in the long term, and with teams/codebases with no good TDD practices, that might end up with useful test iterations. Of course the developer committing the code is responsible for it anyway, so what I would ban is putting “AI did it” in the commits - it may mentally work as a “get out of jail card” attempt for some.

diwank · 2025-06-08T15:57:08 1749398228

we tried a few different variations but tbh had universally bad results. for example, we use `ward` test runner in our python codebase, and claude sonnet (both 3.7 and 4) keep trying to force-switch it to pytest lol. every. single. time.

maybe we could either try this with opus 4 and hope that cheaper models catch up, or just drink the kool-aid and switch to pytest...

ayewo · 2025-06-08T10:57:27 1749380247

I literally LOLed at #2, haha! LLMs are making devs lazy at scale :)

Devs almost universally hate 3 things:

1. writing tests;

2. writing docs;

3. manually updating dependencies;

and LLMs are a big boon wrt helping us avoid all 3, but forcing your team to pick writing tests is a sensible trade-off in this context, since as you say bugs in prod increased significantly.

diwank · 2025-06-08T15:58:34 1749398314

yeah, this might change in the future but I also found that since building features has become faster, asking devs to write the tests themselves sort of demands that they take responsibility of the code and the potential bugs

Artoooooor · 2025-06-08T02:06:20 1749348380

I finally decided a few days ago to try this Claude Code thing in my personal project. It's depressingly efficient. And damn expensive - I used over 10 dollars in one day. But I'm afraid it is inevitable - I will have to pay tax to AI overlords just to be able to keep my job.

Syzygies · 2025-06-08T02:19:23 1749349163

I was looking at $2,000 a year and climbing, before Anthropic announced $100 and $200 Max subscriptions that bundled Claude Console and Claude Code. There are limits per five-hour window, but one can toggle back to metered API with the /login command, or just walk the dog. $100 a month has done me fine.

diwank · 2025-06-08T02:26:47 1749349607

Same. I ran out on the $200 one too yesterday. It’s skyrocketed after Opus 4. Nothing else comes close

StefanBatory · 2025-06-08T09:10:46 1749373846

I had been musing over this. Will devs in very cheap countries still stay an attractive option, just because they'd still be cheaper monthly than Claude?

thelittleone · 2025-06-08T17:56:52 1749405412

The cost on paper may be cheaper. But those options become less attractive when you take into account timezones, communication challenges, availability, scheduling, and the ever increasing coding performance of coding agents.

deadbabe · 2025-06-08T16:18:51 1749399531

We’ve stopped hiring devs in cheaper countries because their quality of output isn’t much more than LLMs, but LLMs are faster and cheaper. Since we don’t really trust cheap devs to ship big important features, the market for the kind of work we allowed them to do has been totally consumed by LLMs.

diwank · 2025-06-08T15:53:22 1749398002

the cost per token for ~similar performance is dropping by a factor of 2 every 10-11 months at the moment, so I am not sure. that said, I think devs in less expensive parts of the world are actually picking up these tools the fastest (maybe from existential angst? idk)
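
Taking that halving rate at face value (it is an eyeball estimate, not a measured law), the projection is just exponential decay. A tiny Python sketch, where the function name and the 10.5-month default (the midpoint of "10-11 months") are assumptions for illustration:

```python
def projected_cost(cost_now, months_ahead, halving_months=10.5):
    """Cost after months_ahead if it halves every halving_months months."""
    return cost_now * 0.5 ** (months_ahead / halving_months)


# Relative cost (today = 1.0) at a few horizons under the assumed rate.
for months in (0, 12, 24, 36):
    print(months, round(projected_cost(1.0, months), 3))
```

The useful part is less the exact numbers than how quickly the curve moves: a monthly bill that looks expensive next to a salary today is a moving target.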

remram · 2025-06-08T17:38:52 1749404332

The 2.3 MB picture at the top loaded comically slowly even on wifi.

diwank · 2025-06-17T04:41:34 1750135294

Compressed it. Thanks for letting me know :)

mgraczyk · 2025-06-08T18:13:07 1749406387

Good post, the only part I think I disagree with is

> Never. Let. AI. Write. Your. Tests.

AI writes all of my tests now, but I review them all carefully. Especially for new code, you have to let the AI write tests if you want it to work autonomously. I explicitly instruct the AI to write tests and make sure they pass before stopping. I usually review these tests while the AI is implementing the code to make sure they make sense and cover important cases. I add more cases if they are inadequate.

diwank · 2025-06-09T03:46:21 1749440781

The risk, in my experience, is reaching a state where you as the dev are not easily able to understand/update the tests. That becomes very tricky

nilirl · 2025-06-08T03:53:21 1749354801

Lot of visual noise because of model specific comments. Or maybe that's just the examples here.

But as a human, I do like the CLAUDE.md file. It's like documentation for dev reasoning and choices. I like that.

Is this faster than old style codebases but with developers having the LLM chat open as they work? Seems like this ups the learning curve. The code here doesn't look very approachable.

indigodaddy · 2025-06-13T09:47:30 1749808050

Can anyone recommend a specific YouTube video that really shows how to use Claude Code effectively in your workflow?

__mharrison__ · 2025-06-08T19:45:37 1749411937

Thanks for the post. It's a very interesting time navigating the nascent field of AI-assisted development.

Curious if the author (or others) tried other tools / models.

diwank · 2025-06-09T03:49:09 1749440949

Have tried everything under the sun at the moment— broadly just two winners (and both have become my daily drivers for different use cases):

1. Claude Code with Opus 4

2. Cursor with Opus 4 or Gemini 2.5 Pro (Windsurf used to be an option but Anthropic has now cut them out)

3. (Coming up; still playing around) Claude Code’s GitHub Action

meeech · 2025-06-08T01:01:49 1749344509

Honest question: approx what percent of the post was human vs machine written?

tomhow · 2025-06-09T00:23:34 1749428614

We detached this subthread from https://news.ycombinator.com/item?id=44213763 and marked it off-topic.

diwank · 2025-06-08T01:10:02 1749345002

I’d say around ~40% me, the ideating, editing, citations, and images are all mine; rest Opus 4 :)

I typically try to also include the original Claude chat’s link in the post but it seems like Claude doesn’t allow sharing chats with deep research used in them.

Update: here’s an older chatgpt conversation while preparing this: https://chatgpt.com/share/6844eaae-07d0-8001-a7f7-e532d63bf8...

tomhow · 2025-06-08T03:51:18 1749354678

Thanks for being transparent about this, but we’re not wanting substantially LLM-generated content on HN.

We’ve been asking the community to refrain from publicly accusing authors of posting LLM-generated articles and comments. But the other side of that is that we expect authors to post content that they’ve created themselves.

It’s one thing to use an LLM for proof-reading and editing suggestions, but quite another for “60%” of an article to be LLM-generated. For that reason I’m having to bury the post.

Edit: I changed this decision after further information and reflection. See this comment for further details: https://news.ycombinator.com/item?id=44215719

diwank · 2025-06-08T04:19:15 1749356355

I completely understand. Just to clarify, when I said it was ~40%, I didn’t mean the content was written by Claude/ChatGPT but that I took its help in deep research and writing the first drafts. The ideas, all of the code examples, the original CLAUDE.md files, the images, citations, etc are all mine.

tomhow · 2025-06-08T05:20:02 1749360002

Ok, sure, these things are hard to quantify. The main issue is that we can't ask the community to refrain from accusing authors of publishing AI-generated content if people really are publishing content that is obviously AI-generated. What matters to us is not how much AI was used to write an article, but rather how much the audience finds that the article satisfies intellectual curiosity. If the audience can sense that the article is generated, they lose trust in the content and the author, and also lose trust in HN as a place they can visit to find high-quality content.

Edit: On reflection, given your explanation of your use of AI and given another comment [1] I replied to below, I don't think this post is disqualified after all.

[1] https://news.ycombinator.com/item?id=44215719

diwank · 2025-06-08T15:49:06 1749397746

I appreciate this :)

elcritch · 2025-06-08T09:50:41 1749376241

Shouldn’t the quality of the content be what matters? Avoiding articles with low grade effort or genuine content either made with or without LLMs would seem to be a better goal.

tomhow · 2025-06-08T10:48:15 1749379695

Yes, and I've changed the decision, given further information and reflection, and explained it here: https://news.ycombinator.com/item?id=44215719

misnome · 2025-06-08T19:15:42 1749410142

Wasn’t it “40% me” e.g. 60% LLM generated?

“I supplied the ideas” is literally the first thing anyone caught out using chatgpt to do their homework says… I’d tend to believe someone's first statement instead of the backpedal once they’ve been chastised for it.

__mharrison__ · 2025-06-08T20:58:01 1749416281

Speaking for this "we", this was one of the best posts I read this week. (And I imagine that a lot of them were AI-assisted.)

ericb · 2025-06-08T17:12:25 1749402745

Is the percentage meaningful, though? If an LLM produces the most interesting, insightful, thought-provoking content of the day, isn't that what the best version of HN would be reading and commenting on?

If I invent the wheel, and have an LLM write 90% of the article from bullet points and edit it down, don't we still want HN discussing the wheel?

Not to say that the current generation of AI isn't often producing boring slop, but there's nothing that says it will remain that way, and a percent-AI assistance seems like the wrong metric to chase to me?

never_inline · 2025-06-08T18:32:27 1749407547

Because why do you anti-compress your thoughts using LLM at all? It makes things harder to read.

ericb · 2025-06-09T18:51:50 1749495110

I re-compress my thoughts during editing. That's how I write normally. First, a long draft, then a short one. Saving writing time on the long draft is helpful.

Slop is slop, whether a human or AI wrote it--I don't want to read it. Great is great. Period. If a human or AI writes something great, I want to read it.

Assuming AI writing will remain slop is a bold assumption, even if it holds true for the next 24 hours.

“I didn't have time to write a short letter, so I wrote a long one instead.”

- Mark Twain

threeseed · 2025-06-08T21:17:43 1749417463

> If an LLM produces the most interesting, insightful, thought-provoking content of the day, isn't that what the best version of HN would be reading and commenting on?

Absolutely not. Would much rather take something that is boring, not thought-provoking, but that was authentic and real rather than, as you say, AI slop.

If you want that sort of content maybe LinkedIn is a better place.

pbhjpbhj · 2025-06-08T07:57:42 1749369462

Surely you're missing the wood for the trees here - isn't the point of asking for no 'AI' to avoid low effort slop? This is a relatively high value post about adopting new practices and the human-LLM integration.

Tag it, let users decide how they want to vote.

Aside: meta: If you're speaking on behalf of HN you should indicate that in the post (really with a marker outside of the comment).

tomhow · 2025-06-08T09:06:44 1749373604

Indeed, and since the author has clarified what they meant by "40%", I've put the post back on the front page. Another relevant factor is they seem not to speak English as a primary language, and I think we can make allowances for such people to use LLMs to polish their writing.

Regarding your other suggestion: it's been the case ever since HN started 18 years ago that moderators/modcomments don't have any special designation. This is due to our preference for simple design and an aversion to seeming separate from the community. We trust that people will work it out and that has always worked well here.

remram · 2025-06-08T17:37:24 1749404244

"40% me" means "60% LLM"

meeech · 2025-06-08T02:43:13 1749350593

thanks. to be clear, I'm not asking the q to be particularly negative about it. It's more just curiosity, mixed with trade in effort. If you wrote it 100%, I'm more inclined to read the whole thing. vs say now just feeding it back to the GPM to extract the condensed nuggets.

ishita159 · 2025-06-08T16:22:42 1749399762

great tip! will do that to consume ai generated content as well.

GiorgioG · 2025-06-08T01:57:55 1749347875

[flagged]

diwank · 2025-06-08T02:17:28 1749349048

I don’t understand why not. I’m not a natural prose writer, but (I felt that) these ideas were worth putting out there.

I posted on HN largely to get feedback.

malnourish · 2025-06-08T03:42:25 1749354145

You used your tools well. The times are adapting and it's best we get on with it. It's the only way we can discover another step-change.

I use AI to help craft technical messages to different audiences and get various perspectives on my ideas and questions. It's a tool that's given me more insight into other perspectives than anything else, and it's helped me make some excellent slides.

Executives and senior leadership really need things at different levels. They are abstracting domains like we extract functions. Then there are the technical leaders, who this speaks to. I shared this article with my VP and peers. I expect it will be too much for some, but it remains approachable.

It's not all roses though, I tried using various LLMs to help me draft a personal message and it left me feeling remarkably conflicted. It was my message, but it lost my voice, even when just used to reach my niece's "audience" . I haven't found out how to use it and still allow for my creative emotional expression.

I don't use it for emails, because it still feels professionally dishonest for me; for some reason, presentations don't. I'm bad at them and it helps.

One more thing: I instruct LLMs to not let me meander like this.

climb_stealth · 2025-06-08T03:33:49 1749353629

Just want to say that I appreciate you sharing all that and being open about it.

As someone who has mostly ignored the AI progression it is interesting to see. And a bit bewildering.

bitwize · 2025-06-08T00:07:24 1749341244

[flagged]

sdorf · 2025-06-08T00:35:54 1749342954

The whole point seems to be how to get the most out of today's tooling without "glue getting in your pizza". It's a little flag-wavy (probably because of the author's company) but overall seemed like a pretty candid peek into how it's being used. Did you have a specific critique?

diwank · 2025-06-08T01:01:41 1749344501

Feedback appreciated! Will tone it down; did not intend it to be too much about our company, just that it is the codebase I mostly hack on. :)

djrockstar1 · 2025-06-08T01:24:26 1749345866

Pretty disingenuous to emphasize "building a culture of transparency" while simultaneously not disclosing how heavily AI was [very evidently] used in writing this post.

diwank · 2025-06-08T01:30:04 1749346204

I’d say around ~40% me, the ideating, editing, citations, and images are all mine; rest Opus 4 :)

I typically try to also include the original Claude chat’s link in the post but it seems like Claude doesn’t allow sharing chats with deep research used in them.

See this series of posts for example, I have included the link right at the beginning: https://diwank.space/juleps-vision-levels-of-intelligence-pt...

I completely get the critique and I already talked about it earlier: https://news.ycombinator.com/item?id=44213823

Update: here’s an older chatgpt conversation while preparing this: https://chatgpt.com/share/6844eaae-07d0-8001-a7f7-e532d63bf8...

GiorgioG · 2025-06-08T01:55:29 1749347729

[flagged]

diwank · 2025-06-08T02:15:55 1749348955

Why not? This way at least the idea gets out there. Otherwise I’d have never come around to writing this.