It's not totally novel, but it's very cool to see the continued simplification of protein folding models - AF2 -> AF3 was a reduction in model architecture complexity, and this is another step in the direction of the bitter lesson.
I’m not sure AF3’s performance would hold up if it hadn’t been trained on data from AF2, which itself bakes in a lot of inductive bias, like equivariance.
Protein folding is in no way "solved". AlphaFold dramatically improved the state-of-the-art, and works very well for monomeric protein chains with structurally resolved nearest neighbors. It abjectly fails on the most interesting proteins - just go check out any of the industry's hottest undrugged targets (e.g. transcription factors)
> When it comes to very complicated things, physics tends to fall down and we need to try non-physics modeling, and/or come up with non-physics abstraction.
"When things are complicated, if I just dream that it is not complicated and solve another problem than the one I have, I find a great solution!"
Joking apart, models that can narrow the search to a potentially very interesting sub-phase-space, much smaller than the original one, are incredibly useful. But fundamental understanding of the underlying principles, which lets you make very educated guesses about what can and cannot be ignored, usually wins against throwing everything at the wall...
And as you are pointing out, when complex reality comes knocking, it usually is much, much messier...
I have your spherical cow standing on a frictionless surface right here, sir. If you act quickly, I can include the "spherical gaussian sphere" addon with it, at no extra cost.
As someone who loves SML/OCaml and has written primarily Rust over the past ~10 years, I totally agree - I use it as a modern and ergonomic ML with best-in-class tooling, libraries, and performance. Lifetimes are cool, and I use them when needed, but they aren't the reason I use Rust at all. I would use Rust with a GC instead of lifetimes too.
Either a lot of clones or a lot of reference-counted pointers. Especially if your point of comparison is a GC language, this is much less of a crime than some people think.
When I say "use" them, I mean make heavy use of them: structs or functions annotated with multiple lifetimes, data flows designed around borrowing, etc. You can often get by with just `clone` and lifetime elision, and if you don't need to eke out that last bit of performance, that's fine.
I looked through their torch implementation and noticed that they are applying RoPE to both query and key matrices in every layer of the transformer - is this standard? I thought positional encodings were usually just added once at the first layer
All the Llamas have done it (well, 2 and 3, and I believe 1, I don't know about 4). I think they have a citation for it, though it might just be the RoPE paper (https://arxiv.org/abs/2104.09864).
I'm not actually aware of any model that doesn't do positional embeddings on a per-layer basis (excepting BERT and the original transformer paper, and I haven't read the GPT2 paper in a while, so I'm not sure about that one either).
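Right - per-layer application is the standard way to use RoPE: instead of adding a positional vector to the token embeddings once at the input, the rotation is applied to the query and key projections inside every attention layer (values are left untouched). A minimal NumPy sketch of the rotation itself (names and shapes are illustrative, not taken from their implementation):

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary position embedding to x of shape (seq_len, d).

    Each pair of dimensions (x1[i], x2[i]) is rotated by an angle
    proportional to the token's position, with a different frequency
    per pair - so q.k dot products depend on relative position.
    """
    seq_len, d = x.shape
    half = d // 2
    # one rotation frequency per pair of dimensions
    freqs = base ** (-np.arange(half) / half)           # (half,)
    angles = np.arange(seq_len)[:, None] * freqs[None]  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # 2D rotation applied pairwise
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

# Inside *every* attention layer, after the q/k projections:
# q, k = rope(q), rope(k)   # v is not rotated
```

Because it's a pure rotation, it preserves the norm of each q/k vector, and there is no positional vector to "wash out" as activations pass through layers - which is part of why re-applying it at every layer works well.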
I think you cropped out the important part of the quote:
> It’s rare that I have more than a drink or two in one night.
I don't drink that often anymore, but 2-3 drinks in a night, done occasionally, is not a problem. I've had weeks where I drink a beer (or two!) every night, and I don't struggle with any alcohol problems either.
2 drinks every single night? That's leaning that way - and not great for you just from a health/caloric perspective either.
I always wonder why people would make such obvious selective edits that completely change the meaning of a sentence and quote it as if it was what the author intended.
Do they not think people will notice? Or do they not notice that they've even done it?
One possibility is trying a different form of cardio. I personally don't enjoy running at all... but I love cycling. Running for 30 minutes is super boring, but I can go do a 4-hour ride no problem. If you can't go outside at all, then this won't really help you though.
Same here: I discovered the fun of rollerblading in skate parks at 34.
Never did any sport in my whole life, officially obese, but now I’m taking a group class at a skatepark every week and I’m having so much fun that I’m pushing myself to do more sessions even when I don’t feel like it. And even if I’m still pretty "bad" at it, it’s just amazingly liberating.