Hacker News | shpongled's comments

It's not totally novel, but it's very cool to see the continued simplification of protein folding models - AF2 -> AF3 was a reduction in model architecture complexity, and this is another step in the direction of the bitter lesson.

I’m not sure AF3’s performance would hold up if it hadn’t been trained on data from AF2, which itself bakes in a lot of inductive bias, like equivariance.

Probably because ByteDance and Facebook (spun out into EvolutionaryScale) are doing it

Protein folding is in no way "solved". AlphaFold dramatically improved the state-of-the-art, and works very well for monomeric protein chains with structurally resolved nearest neighbors. It abjectly fails on the most interesting proteins - just go check out any of the industry's hottest undrugged targets (e.g. transcription factors)

> When it comes to very complicated things, physics tends to fall down and we need to try non-physics modeling, and/or come up with non-physics abstraction.

"When things are complicated, if I just dream that it is not complicated and solve another problem than the one I have, I find a great solution!"

Joking apart, models that can narrow the search to a much smaller, potentially very interesting region of phase space are incredibly useful. But fundamental understanding of the underlying principles, which allows very educated guesses about what can and cannot be ignored, usually wins against throwing everything at the wall...

And as you are pointing out, when complex reality comes knocking, it usually is much, much messier...


I have your spherical cow standing on a frictionless surface right here, sir. If you act quickly, I can include the "spherical gaussian sphere" addon with it, at no extra cost.

There's no problem with randomness in FP?

You could use a monad (or an external state) for an OS-level RNG, or define a purely functional PRNG.
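The second option is just a function from state to (value, next state), with the caller threading the state explicitly. A minimal Python sketch of the idea (the LCG constants are Knuth's MMIX ones, chosen here purely for illustration):

```python
def next_rand(state):
    """One step of a purely functional PRNG: returns (value, new_state).

    A 64-bit linear congruential generator (Knuth's MMIX constants).
    No hidden mutable state -- the caller threads `state` explicitly,
    so the same seed always yields the same sequence.
    """
    new_state = (6364136223846793005 * state + 1442695040888963407) % 2**64
    return new_state >> 33, new_state  # keep the higher-quality top bits

# Threading the state by hand:
v1, s1 = next_rand(42)
v2, s2 = next_rand(s1)
```

In Haskell this is exactly what `System.Random`'s `StdGen` does under the hood, and what the `State` monad abstracts away.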


It's usually quicksorting a linked list, where a random pivot, median of three, etc. are terrible for performance.

(Merge sort is of course the natural sort for lists, but qs is like 2 lines of Haskell so it gets demoed for being clever)
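That two-liner translates almost directly; a Python sketch of the same head-as-pivot list quicksort (fine as a demo, but O(n²) on already-sorted input, which is exactly why random or median-of-three pivots come up - and why they're awkward on linked lists with no O(1) indexing):

```python
def qsort(xs):
    # Head-as-pivot quicksort: only sequential access needed,
    # so it works on linked-style lists, but the pivot choice is fixed.
    if not xs:
        return []
    pivot, rest = xs[0], xs[1:]
    return (qsort([x for x in rest if x < pivot])
            + [pivot]
            + qsort([x for x in rest if x >= pivot]))
```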


2016 remains one of the greatest single-player FPS games I've played (Titanfall 2 is the other)


As someone who loves SML/OCaml and has written primarily Rust over the past ~10 years, I totally agree - I use it as a modern and ergonomic ML with best-in-class tooling, libraries, and performance. Lifetimes are cool, and I use them when needed, but they aren't the reason I use Rust at all. I would use Rust with a GC instead of lifetimes too.


How do you use Rust without lifetimes?


Either a lot of clones or a lot of reference counted pointers. Especially if your point of comparison is a GC language, this is much less of a crime than some people think


When I say "use" them, I mean make heavy use of them: structs or functions annotated with multiple lifetimes, data flows designed around borrowing, etc. You can often get by with just `clone` and lifetime elision, and if you don't need to eke out that last bit of performance, it's fine.


I looked through their torch implementation and noticed that they are applying RoPE to both query and key matrices in every layer of the transformer - is this standard? I thought positional encodings were usually just added once at the first layer


No, they’re usually applied at each attention layer.


Do you know when this was introduced (or which paper)? AFAIK it's not that way in the original transformer paper, or BERT/GPT-2


All the Llamas have done it (well, 2 and 3, and I believe 1, I don't know about 4). I think they have a citation for it, though it might just be the RoPE paper (https://arxiv.org/abs/2104.09864).

I'm not actually aware of any model that doesn't do positional embeddings on a per-layer basis (excepting BERT and the original transformer paper, and I haven't read the GPT2 paper in a while, so I'm not sure about that one either).
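For concreteness, here's a minimal NumPy sketch (not any particular model's implementation) of the pairwise rotation RoPE applies to the query and key vectors inside every attention layer; the θ schedule follows the RoPE paper's base-10000 convention:

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Rotate consecutive feature pairs of x by position-dependent angles.

    x: (d,) float query or key vector, d even; pos: integer token position.
    Applied to Q and K at every attention layer, not once at the input.
    """
    d = x.shape[0]
    theta = base ** (-np.arange(0, d, 2) / d)   # one frequency per pair
    angles = pos * theta
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[0::2], x[1::2]                   # pair components
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin             # 2x2 rotation per pair
    out[1::2] = x1 * sin + x2 * cos
    return out
```

Since it's a pure rotation, it preserves vector norms, and at position 0 it's the identity - which is part of why it composes cleanly with attention at every layer.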


Thanks! I'm not super up to date on all the ML stuff :)


Should be in the RoPE paper. The OG transformers used additive sinusoidal embeddings, while RoPE does a pairwise rotation.

There's also NoPE, I think SmolLM3 "uses NoPE" (aka doesn't use any positional stuff) every fourth layer.


This is normal. RoPE was introduced after BERT/GPT-2.


I think you cropped out the important part of the quote:

> It’s rare that I have more than a drink or two in one night.

I don't drink that often any more, but 2-3 drinks in a night, done occasionally, is not a problem. I've had weeks where I drink a beer (or two!) every night, and I also don't struggle with any alcohol problems.

2 drinks every single night? That's leaning that way - and it's not great for you just from a health/caloric perspective.


I always wonder why people would make such obvious selective edits that completely change the meaning of a sentence and quote it as if it was what the author intended.

Do they not think people will notice? Or do they not notice that they've even done it?


Maybe they got really excited while reading...


I would pay $5000 to never have to read another LLM-authored piece of text ever again.


One possibility is trying a different form of cardio. I personally don't enjoy running at all... but I love cycling. Running for 30 minutes is super boring, but I can go do a 4-hour ride no problem. If you can't go outside at all, then this won't really help you though.


Same here: I discovered the fun of rollerblading in skate parks at 34.

Never did any sport in my whole life, officially obese, but now I’m taking a group class in a skatepark every week, and I’m having so much fun that I’m forcing myself to do more sessions even when I don’t feel like it. And even if I’m still pretty "bad" at it, it’s just amazingly liberating.

I guess you just have to find your thing?


> If you can't go outside at all, then this won't really help you though.

If going outside is not an option, stationary bicycles are a thing, though there won't be any nice outdoor scenery to go with your cycling.


My best cardio is doubtless the stair machine while doing Anki, with headphones and a small Bluetooth gamepad in hand.

