LLMs make learning new material easier than ever. I use them a lot and I am learning new things at an insane pace in different domains.
Both the maximalists and the skeptics are confusing the debate by setting up this straw man that people will be delegating to LLMs blindly.
The idea that someone clueless about OAuth should develop an OAuth lib with LLM support without learning a lot about the topic is... Just wrong. Don't do that.
But if you're willing to learn, this is rocket fuel.
On the flip side, I wanted to see what common 8 layer PCB stackups were yesterday. ChatGPT wasn't giving me an answer that really made sense. After googling a bit, I realized almost all of the top results were AI generated, and also had very little in the way of real experience or advice.
It's going to be like the pre-internet dark ages, but worse. Back then, you simply didn't find the information. Now you find unlimited information, but it's all wrong.
I don't know, this sounds a lot like the late 90s, when we kept hearing that anyone could put information on the internet and that you shouldn't trust what you read online.
Well it turns out you can manage just fine.
You shouldn't blindly trust anything. Not what you read, not what people say.
Using LLMs effectively is a skill too, and that does involve deciding when and how to verify information.
The difference is in scale. Back then, only humans were sometimes putting up false information, and other humans had a chance to correct it. Now, machines are writing infinitely more garbage than humans can ever read. Search engines like Google are already effectively unusable.
I think there will be solutions, although I don't think getting there will be pretty.
Google's case (and Meta and spam calls and others) is at least in part an incentives problem. Google hasn't been about delivering excellent search to users for a very long time. They're an ad company and their search engine is a tool to better deliver ads. Once they had an effective monopoly, they just had to stay good enough not to lose it.
I've been using Kagi for a few years now, and while SEO spam and AI garbage are still an issue, they are far less of one than with Google or Bing. My conclusion is that these problems are at least somewhat addressable if addressing them is what gets the business paid.
But I think a real long-term solution will have to involve a federated trust model. It won't be viable to index everything dumped on the web; there will need to be a component that prioritizes trust in the author or publisher. If that follows the same patterns as email (e.g. owned by Google and Microsoft), then we're really screwed.
You missed the full context: the claim was that you would never be able to trust a bunch of amateur randos self-policing their content. Turns out it's not perfect, but it's better than a very small set of professionals; usually there's enough expertise out there, it's just widely distributed. The challenge this time is 1. the scale, 2. the rate of growth, 3. the decline in expertise.
>> Using LLMs effectively is a skill too, and that does involve deciding when and how to verify information.
How do you verify when ALL the sources share the same AI-generated root, and ALL of the independent (i.e. human) experts have aged out and no longer exist?
> How do you verify when ALL the sources share the same AI-generated root,
Why would that happen? There's demand for high quality, trustworthy information and that's not going away.
When asking an LLM coding questions, for example, you can ask for sources and it'll point you to documentation. It won't always be the correct link, but you can prod it more and usually get it, or fall back to searching the docs the old-fashioned way.
This thread started from the question of where the experts with the ability to use LLMs effectively would still come from in the future.
I was making the point that it's still easy to find great information on the internet despite the fact that there's a lot of incorrect information as well, which has been an often-mentioned 'danger' of the internet since its early days.
I wasn't speaking to broader societal impact of LLMs, where I can easily agree it's going to make misinformation at scale much easier.
>> LLMs make learning new material easier than ever.
Feels like there's a logical flaw here, when the issue is that LLMs are presenting the wrong information or missing it altogether. The person trying to learn from them will experience Donald Rumsfeld's "unknown unknowns".
I would not be surprised if we experience an even more dramatic "Cobol Moment" a generation from now, but unlike that one thankfully I won't be around to experience it.
Learning from LLMs is akin to learning from Joe Rogan.
You are getting a stylised view of a topic from an entity that lacks the deep understanding needed to fully distill the information. But it gives you enough knowledge to feel confident, which is still valuable but also dangerous.
And I assure you that many, many people are delegating to LLMs blindly e.g. it's a huge problem in the UK legal system right now because of all the invented case law references.
I can think of books I used to learn software engineering when I was younger which, in retrospect, I realize were not very good, and taught me some practices I now disagree with. Nevertheless, the book did help me learn, and got me to a point where I could think about it myself, and eventually develop my own understanding.
I think what's missing here is that you should start by reading the RFCs. RFCs tend to be pretty succinct, so I'm not really sure what a summarization buys you there except leaving out important details.
(One thing that might be useful is to use the LLM as a search engine to find the relevant RFCs, since sometimes it's hard to find all of the applicable ones if you don't know their names already.)
I really can’t stress this enough: read the RFCs from end to end. Then read through the code of some reference implementations. Draw a sequence diagram. Don’t have the LLM generate one for you, the point is to internalize the design you’re trying to implement against.
By this time you should start spotting bugs or discrepancies between the specs and implementations in the wild. That's a good sign. It means you're learning.
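To make that concrete, here's a minimal sketch (in TypeScript) of the RFC 6749 authorization-code exchange, the kind of sequence you end up internalizing from the spec. The endpoints, client ID/secret and redirect URI are hypothetical placeholders, not any particular provider's values:

    // Minimal sketch of the RFC 6749 authorization-code flow.
    // All URLs and credentials below are hypothetical placeholders.

    interface TokenResponse {
      access_token: string;
      token_type: string;
      expires_in?: number;
      refresh_token?: string;
      scope?: string;
    }

    // Step 1: redirect the user to the authorization endpoint (RFC 6749 §4.1.1).
    function buildAuthorizeUrl(state: string): string {
      const params = new URLSearchParams({
        response_type: "code",
        client_id: "my-client-id",                    // hypothetical
        redirect_uri: "https://app.example/callback", // hypothetical
        scope: "profile",
        state, // opaque value you must check on the redirect back (CSRF protection)
      });
      return `https://auth.example/authorize?${params}`;
    }

    // Step 2: exchange the returned code for tokens at the token endpoint (RFC 6749 §4.1.3).
    async function exchangeCode(code: string): Promise<TokenResponse> {
      const res = await fetch("https://auth.example/token", {
        method: "POST",
        headers: { "Content-Type": "application/x-www-form-urlencoded" },
        body: new URLSearchParams({
          grant_type: "authorization_code",
          code,
          redirect_uri: "https://app.example/callback", // must match the authorize request
          client_id: "my-client-id",
          client_secret: "my-client-secret", // confidential clients only; public clients use PKCE instead
        }),
      });
      if (!res.ok) throw new Error(`token endpoint returned ${res.status}`);
      return (await res.json()) as TokenResponse;
    }

Once you've traced a reference implementation against something like this, the discrepancies (missing state checks, redirect_uri validation, and so on) start to jump out.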
How do you gain anything useful from a sycophantic tutor that agrees with everything you say, having been trained to behave as if the sun shines out of your rear end?
making mistakes is how we learn, and if they are never pointed out...
It's a bit of a skill. Gaining an incorrect understanding of some topic is a risk any way you learn, and I don't feel it's greater with LLMs than with many of the alternatives.
Sure, having access to legit experts who can tutor you privately on a range of topics would be better, but that's not realistic.
What I find is that if I need to explore some new domain within a field I'm broadly familiar with, just thinking through what the LLM is saying is sufficient for verification, since I can look for internal consistency and check against things I know already.
When exploring a new topic, oftentimes my questions are superficial enough for me to be confident that the answers are very common in the training data.
When exploring a new topic that's also somewhat niche or goes into a lot of detail, I use the LLM first to get a broad overview and then drill down by asking for specific sources and using the LLM as an assistant to consume authoritative material.
> LLMs will tell you 1 or 2 lies for each 20 facts. It's a hard way to learn.
That was my experience growing up in school too, except there you got punished one way or another for speaking up or trying to correct the teacher. If I speak up with the LLM, they either explain why what they said is true or correct themselves, zero emotions involved.
You are ignoring the fact that the types of mistakes or lies are of a different nature.
If you are in class and you incorrectly argue that there is a mistake in an explanation of derivatives or physics, but you are the one in error, your teacher hopefully will not say: "Oh, I am sorry, you are absolutely correct. Thank you for your advice."
Yeah, no, of course if I'm wrong I don't expect the teacher to agree with me, what kind of argument is that? I thought it was clear, but the base premise of my previous comment is that the teacher is incorrect and refuses corrections...
My point is a teacher will not do something like this:
- Confident synthesis of incompatible sources:
LLM: “Einstein won the 1921 Nobel Prize for his theory of relativity, which he presented at the 1915 Solvay Conference.”
Or
- Fabricated but plausible citations:
LLM: “According to Smith et al., 2022, Nature Neuroscience, dolphins recognise themselves in mirrors.”
There is no such paper; the model invents both the authors and the journal reference.
I don't know what reality you live in, but it happens that teachers are incorrect, no matter what your own personal experience has been. I'm not sure how this is even up for debate.
What matters is how X reacts when you point out it wasn't correct, at least in my opinion, and was the difference I was trying to highlight.
A human tutor typically misquotes a real source or says "I'm not sure." An LLM, by contrast, will invent a flawless-looking but nonexistent citation. Even a below-average teacher doesn't churn out fresh fabrications every tenth sentence. Because a teacher usually cites recognizable material, you can check the textbook and recover quickly. With an LLM you first have to discover that the source never existed. That verification cost gets higher the more complex the task you are trying to achieve.
An LLM will give you a perfect paragraph about the AWS Database Migration Service and the list of supported databases, and then include a data flow, like on-prem to on-prem, that is not supported... Relying on an LLM is like flying with a friendly copilot who has multiple personality disorder. You don't know which day he will forget to take his meds :-)
Stressful and mentally exhausting in a different kind of way...
And you are saying human teachers or online materials won't lie to you once or twice for every 20 facts, no matter how small? Did you do any comparison?
> LLMs make learning new material easier than ever. I use them a lot and I am learning new things at an insane pace in different domains.
With learning, aren't you exposed to the same risks? If the LLM has a typical blind spot, it would show up in the learning assistance as well as in the development assistance, so they'd cancel out (i.e. unknown unknowns)?
If you trust everything the LLM tells you, and you learn from code, then yes, the same exact risks apply. But this is not how you use (or should use) LLMs when you're learning a topic. Instead you should use high quality sources, then ask the LLM to summarize them for you to start with (NotebookLM does this very well for instance, but so can others). Then you ask it to build you a study plan, with quizzes and exercises covering what you've learnt. Then you ask it to set up a spaced repetition worksheet that covers the topic thoroughly. At the end of this you will know the topic as well as if you'd taken a semester-long course.
One big technique it sounds like the authors of the OAuth library missed is that LLMs are very good at generating tests. A good development process for today's coding agents is to 1) prompt with or create a PRD, 2) break this down into relatively simple tasks, 3) build a plan for how to tackle each task, with listed-out conditions that should be tested, 4) write the tests, so that things are broken, TDD style, and finally 5) write the implementation. The LLM can do all of this, but you can't one-shot it these days; you have to be a human in the loop at every step, correcting when things go off track. It's faster, but it's not a 10x speed-up like you might imagine if you think the LLM is just asynchronously taking a PRD some PM wrote and building it all. We still have jobs for a reason.
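To illustrate step 4, the tests the agent writes first can be as simple as failing specs against code that doesn't exist yet. A rough sketch using node:test; parseBearerToken and ./auth.js are hypothetical names standing in for whatever the task actually needs:

    // TDD-style: these tests are written before the implementation exists, so they fail first.
    // Assumed (hypothetical) signature: parseBearerToken(header?: string): string | null
    import { test } from "node:test";
    import assert from "node:assert/strict";

    import { parseBearerToken } from "./auth.js"; // not implemented yet

    test("extracts the token from a well-formed Authorization header", () => {
      assert.equal(parseBearerToken("Bearer abc123"), "abc123");
    });

    test("rejects headers with the wrong scheme", () => {
      assert.equal(parseBearerToken("Basic abc123"), null);
    });

    test("rejects empty or missing header values", () => {
      assert.equal(parseBearerToken(""), null);
      assert.equal(parseBearerToken(undefined), null);
    });

The value is that the tests pin down the intended behaviour before the agent writes the implementation, so your review effort goes into the spec rather than the diff.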
> Instead you should use high quality sources, then ask the LLM to summarize them for you to start with (NotebookLM does this very well for instance, but so can others).
How do you determine whether the LLM accurately reflects what the high-quality source contains, if you haven't read the source? When learning from humans, we put trust in them to teach us based on a web-of-trust. How do you determine the level of trust with an LLM?
> When learning from humans, we put trust in them to teach us based on a web-of-trust.
But this is only part of the story. When learning from another human, you'll also actively try to gauge whether they're trustworthy based on general linguistic markers, and will try to find and poke holes in what they're saying so that you can question them intelligently.
This is not much different from what you'd do with an LLM, which is why it's such a problem that they're often more convincing than correct. But it's not an insurmountable issue. The other issue is that their trustworthiness will vary in a different way than a human's, so you need experience to know when they're possibly just making things up. But just based on feel, I think this experience is definitely possible to gain.
Because summarizing is one of the few things LLMs are generally pretty good at. Plus you should use the summary to determine if you want to read the full source, kind of like reading an abstract for a research paper before deciding if you want to read the whole thing.
Bonus: the high quality source is going to be mostly AI written anyway
I did actually use the LLM to write tests, and was pleased to see the results, which I thought were pretty good and thorough, though clearly the author of this blog post has a different opinion.
But TDD is not the way I think. I've never been able to work that way (LLM-assisted or otherwise). I find it very hard to write tests for software that isn't implemented yet, because I always find that a lot of the details about how it should work are discovered as part of the implementation process. This both means that any API I come up with before implementing is likely to change, and also it's not clear exactly what details need to be tested until I've fully explored how the thing works.
This is just me, other people may approach things totally differently and I can certainly understand how TDD works well for some people.
When I'm exploring a topic, I make sure to ask for links to references, and will do a quick keyword search in there or ask for an excerpt to confirm key facts.
This does mean that there's a reliance on me being able to determine what the key facts are and when I should be asking for a source. I have not experienced any significant drawbacks compared to a classic research workflow though, so in my view it's a net speed boost.
However, this does mean that a huge variety of things remain out of reach for me to accomplish, even with LLM "assistance". So there's a decent chance even the speed boost is only perceptual. If nothing else, it does take a significant amount of drudgery out of it all though.
> With learning, aren't you exposed to the same risks? If the LLM has a typical blind spot, it would show up in the learning assistance as well as in the development assistance, so they'd cancel out (i.e. unknown unknowns)?
I don't think that's how things work. In learning tasks, LLMs are sparring partners. You present them with scenarios, and they output a response. Sometimes they hallucinate completely, but they can also update their context to reflect new information. Their output matches what you input.
Another limitation of LLMs lies in their inability to stay in sync with novel topics or recently introduced methods, especially when these are not yet part of their training data or can't be inferred from existing patterns.
It's important to remember that these models depend not only on ML breakthroughs but also on the breadth and freshness of the data used to train them.
That said, the next model release could very well incorporate lessons from the recent Cloudflare OAuth library issues, thanks to the ongoing discussions and community problem-solving efforts.
The value of LLMs is that they do things for you, so yeah the incentive is to have them take over more and more of the process. I can also see a future not far into the horizon where those who grew up with nothing but AI are much less discerning and capable and so the AI becomes more and more a crutch, as human capability withers from extended disuse.
If the hypothesis is that we still need knowledgeable people to run LLMs, but the way you become knowledgeable is by talking to LLMs, then I don’t think the hypothesis will be correct for long..
We need knowledgeable people to run computers, but you can become knowledgeable about computers by using computers to access learning material. Seems like that generalizes well to LLMs.
> LLMs make learning new material easier than ever. I use them a lot and I am learning new things at an insane pace in different domains.
Sorry, but the amount of bad information dispensed by models, combined with the student's inability to go "hey, that's wrong" due to a lack of experience and knowledge, means that this is going to lead to disaster very often.
People are already dispensing terrible information on YouTube because they trusted an AI to generate the voice-over script explaining something when creating learning materials.
And yet, human coders may do that exact type of thing daily, producing far worse code. I find it humorous how much higher a standard is applied to LLMs in every discussion, when I can guarantee those exact same coders likely produce their own bug-riddled software.
We've gone from skeptics saying LLMs can't code, to they can't code well, to they can't produce human-level code, to they are riddled with hallucinations, to now "but they can't one-shot code a library without any bugs or flaws" and "but they can only one-shot code, they can't edit well", even though recent coding utilities have been proving that wrong as well. And still they say they are useless.
Some people just don't hear themselves, or see how they are constantly moving the bar for AI.
This, for me, has been the question since the beginning. I've yet to see anyone talk or think about the issue head-on either. And whenever I've asked someone about it, they've not had any substantial thoughts.
Engineers will still exist and people will vibe code all kinds of things into existence. Some will break in spectacular ways, some of those projects will die, some will hire a real engineer to fix things.
I cannot see us living in a world of ignorance where there are literally zero engineers and no one on the planet understands what's been generated. Weirdly we could end up in a place where engineering skills are niche and extremely lucrative.
Did musical creativity end with synths and sequencers?
Tools will only amplify human skills. Sure, not everyone will choose to use tools for anything meaningful, but those people are driving human insight and innovation today anyway.
To use your own example though, many of these core skills are declining, being mechanized, or viewed through a historical lens rather than applied. I don't know if this is net good or bad, but it is very different. Maybe humans will think as you say, but it feels like there will be significantly less diversity of thought. If you look at the front page of HN as a snapshot of "where's tech these days", it is very homogeneous compared to the past. The same goes for the general internet, and the AI content continues to grow. IMO published works are a precursor to future human discovery, forming the basis of knowledge, community and growth.
No, but it'll become a hobby or artistic pursuit, just like running, playing chess, or blacksmithing. But I personally think it's going to take longer than 30 years.
The implication is that they are hoping to bridge the gap between current AI capabilities and something more like AGI in the time it takes the senior engineers to leave the industry. At least, that's the best I can come up with, because they are kicking out all of the bottom rungs of the ladder here in what otherwise seems like a very shortsighted move.
In a few years hopefully the AI reviewers will be far more reliable than even the best human experts. This is generally how competency progresses in AI...
For example, at one point a human + computer would have been the strongest combo in chess; now you'd be insane to allow a human to critique a chess bot because they're so unlikely to add value, and statistically a human in the loop would be far more likely to introduce error. Similar things can be said in fields like machine vision, etc.
Software is about to become much higher quality and be written at much, much lower cost.
My prediction is that for that to happen we’ll need to figure out a way to measure software quality in the way we can measure a chess game, so that we can use synthetic data to continue improving the models.
I don’t think we are anywhere close to doing that.
Agreed, chess is for all intents and purposes a ‘solved’ game; it’s merely a matter of processing power, and even then we have shortcuts that are easily ‘good enough’ to play perfectly against human opponents.
But how do you reduce the requirements for software to something so simple and elegant as chess rules? Is it foolish to assume that if we could have, we already would have? Even for humans, the process of writing software includes a lot of guess-and-check most of the time - the idea that you could sit down and think through every aspect of software, then describe it immaculately, then translate that description to a working solution with no bugs or review or need for course correction is just… it’s a pipe dream.
Not really... If you're an average company you're not concerned about producing perfect software, but optimising for some balance between cost and quality. At some point companies via capitalist forces will naturally realise that it's more productive to not have humans in the loop.
A good analogy might be how machines gradually replaced textile workers in the 19th century. Were the machines better? Or was there a way to quantitatively measure the quality of their output? No. But at the end of the day companies which embraced the technology were more productive than those who didn't, and the quality didn't decrease enough (if it did at all) that customers would no longer do business with them – so these companies won out.
The same will naturally happen in software over the next few years. You'd be a moron to hire a human expert for $200,000 to critique a cybersecurity-optimised model which costs maybe a hundredth of the cost of employing a human... And this would likely be true even if we assume the human will catch the odd thing the model wouldn't, because there's no such thing as perfect security – it's always a trade-off between cost and acceptable risk.
Bookmark this and come back in a few years. I made similar predictions when ChatGPT first came out that within a few years agents would be picking up tickets and raising PRs. Everyone said LLMs were just stochastic parrots and this would not happen, well now it has and increasingly companies are writing more and more code with AI. At my company it's a little over 50% at the mo, but this is increasing every month.
Almost none of what you said about the past is true. Automated looms, and all of the other automated machinery that replaced artisans over the course of the industrial revolution, produced items of much better quality than what human craftsmen could produce by the time they started to be used commercially, because of precision and repeatability. There were quantitative measurements of quality for textiles and other goods, and the automated processes exceeded human craftsmen on those metrics.
Software is also not remotely similar to textiles. A subtle bug in the textile output won't cause potentially millions of dollars in damages, the way a bug in an automated loom itself or in software can.
No current technology is anywhere close to being able to automate 50% of PRs on any non trivial application (that’s not close to the same as saying that 50% of PRs merged at your startup happens to have an agent as author). To assume that current models will be able to get near 100% without massive model improvements is just that—an assumption.
My point about synthetic data is that we need orders of magnitude more data with current technology and the only way we will get there is with synthetic data. Which is much much harder to do with software applications than with chess games.
The point isn’t that we need a quantitative measure of software in order for AI to be useful, but that we need a quantitative measure in order for synthetic data to be useful to give us our orders of magnitude more training data.