Can someone explain how so many platforms (ChatGPT, Gemini, Claude, etc.) all sprang up so quickly? How did the engineering teams immediately know how to go about building this kind of tech with LLMs and DNNs and whatnot?
By 2020/2021 with the release of GPT-3, the trajectory of a lot of the most obvious product directions had already become clear. It was mainly a matter of models becoming capable enough to unlock them.
E.g. here's a 2021 forecast covering 2021 to 2026, written over a year before ChatGPT was released. It hits a lot of the product beats we've come to see as we move into late 2025.
It's not much different from other ML, it's just at a bigger and more expensive scale. So once someone figured out the rough recipe (NN architecture, ludicrous scale of weights and data, reinforcement learning tuning), it's not hard for other experts in the field to replicate, so long as they have the resources. DeepSeek was pretty much a side project, for example.
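To make "the rough recipe" slightly more concrete, here's a minimal sketch of the core step: a tiny decoder-only language model trained with next-token prediction in PyTorch. Everything here (vocab size, dimensions, random stand-in data) is made up for illustration, and the RL tuning stage is omitted entirely; real labs run essentially this loop, just at vastly larger scale and on real text.

```python
# Minimal sketch (not any lab's actual code): a tiny decoder-only LM
# trained with next-token prediction on random stand-in tokens.
import torch
import torch.nn as nn

vocab, d_model, seq_len = 1000, 128, 64  # toy sizes, purely illustrative

class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.pos = nn.Embedding(seq_len, d_model)  # learned positions
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, vocab)

    def forward(self, x):
        # causal mask so each position only attends to earlier tokens
        mask = nn.Transformer.generate_square_subsequent_mask(x.size(1))
        positions = torch.arange(x.size(1), device=x.device)
        h = self.embed(x) + self.pos(positions)
        return self.head(self.blocks(h, mask=mask))

model = TinyLM()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

for step in range(100):
    tokens = torch.randint(0, vocab, (8, seq_len + 1))  # stand-in for real text
    logits = model(tokens[:, :-1])                      # predict the next token
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, vocab), tokens[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

None of the individual ingredients is exotic; the moat is mostly data, compute, and the engineering around running exactly this kind of loop at enormous scale.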
Was it really that quick? GPT-3 is where I'd put the start of this, and that was in 2020; they had to work on the technology for quite a while before it got to this point. Everyone else has been able to follow their progress and see what works.
I imagine it wasn't as immediate as it might look on the outside. If they all were working independently on similar ideas for a while, one of them launching their product might have caused the others to scramble to get theirs out as well to avoid missing the train.
I think it's also worth pointing out that the polish on these products was not actually there on day one. I remember the first week or so after ChatGPT's initial launch being full of stories and screenshots of people fairly easily getting around some of the intended limitations with silly methods, like asking it to write a play where the dialogue covers the topic it refused to discuss directly, or asking it to give examples of the kinds of things it's not allowed to say in response to certain questions. My point isn't that there wasn't a lot of technical knowledge behind the initial launch, but that it's a bit of an oversimplification to view things as a binary where people didn't know how to do it before, and then they did.
All of the products you mention came out of existing research teams (which, in the case of ChatGPT and Claude, actually predate most of their engineers). So knowing how to build small language models was always in their wheelhouse. Scaling up to larger LLMs required a few algorithmic advances, but for the most part it was a question of sourcing more data and more compute. The remarkable part of transformers is their scaling laws, which let us get much better models without having to invent new architectures.
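For anyone curious what "scaling laws" means in practice: pretraining loss falls off as a smooth power law in parameter count (and data), so you can predict roughly how good a bigger model will be before spending the money to train it. A rough sketch, using ballpark constants in the spirit of Kaplan et al. (2020) rather than exact values:

```python
# Rough sketch of a Kaplan-style scaling law: loss as a power law in
# parameter count N. Constants are ballpark/illustrative, not exact.
def predicted_loss(n_params, n_c=8.8e13, alpha=0.076):
    return (n_c / n_params) ** alpha

for n in (1e8, 1e9, 1e10, 1e11):  # 100M -> 100B parameters
    print(f"{n:.0e} params -> predicted loss ~ {predicted_loss(n):.2f}")
```

The practical consequence is what the parent comment describes: because the curve is smooth and predictable, the path to better models was "spend more on the same architecture" rather than "invent a new one."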
It's the intersection of plentiful cloud compute and existing language models. As I understand it, right now it's really just a matter of throwing compute at existing LM architectures to learn from gigantic datasets.
I don’t like these examples because IRL nobody does things this way.
Try actual problems that require you to use these tools and the interrelationships between them, where it becomes blindingly obvious why they exist. Calculus is a prime example, and it's comical that most students find Calculus hard because their LA is weak. But Calculus has extensive uses, just not for doing basic carb counting.
Honestly, all these cute websites give people a false sense that they're actually learning something. The only way to learn this stuff is to get one of the million good LA books out there and work through the problems. But that's hard, so people look for shortcuts.
Yeah, I think when students actually hit Calculus-level related rates, a small dim light starts to glow. Obviously it only gets brighter the less you have to hold onto and the more you have to mathematically represent something you're trying to reason about; that's when all the tools start to make sense, and the relationships start asking you "is this true in my case, or do I need to take a step back?" and so forth.
I don’t have an axe to grind against the site; I think it’s fine. But if someone wants to learn LA, a college-level course followed by an intense grind of word problems, having to work backwards and forwards, and finding flaws in answers might be a better way to develop the noggin for it. Just my 2c.
You’ve heard this line of thought before, and forgive me for parroting, but here goes:
Bluesky attracts the same people X attracts; they just disagree on specifics, which in most cases are surface-level. The fanaticism and tribalism are basically the same. There is no utopia where a community is pleasant without a lot of guarding and gatekeeping and, really, viewpoint alignment and subject-matter filtering. Some topics are basically there for shitflinging, and those are mostly the topics that seem to be a hot poker for everyone.
No one gets banned for preferring Debian over Fedora.