>distributed training networks

Now that's an idea. One bottleneck might be a limit on just how much you can parallelize training, though.



There's a ton of work in this area, and the reality is... it doesn't work for LLMs.

Moving from ~900 GB/s of GPU memory bandwidth (with InfiniBand interconnects between nodes) to 0.01-0.1 GB/s over the internet is brutal, roughly a 9,000x to 90,000x slowdown. That works for simple image classifiers, but I've never seen anything like a large language model trained this way in any meaningful amount of time.
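To put rough numbers on it, here's a back-of-envelope sketch in Python. The 7B parameter count, fp16 gradients, and bandwidth figures are illustrative assumptions, not measurements; it just computes how long one naive full-gradient sync would take at each speed.

    # Back-of-envelope: time for one naive full-gradient sync at different bandwidths.
    # Assumptions (illustrative only): 7B parameters, fp16 gradients (2 bytes each),
    # and the whole gradient is shipped once per optimizer step.
    PARAMS = 7e9
    BYTES_PER_PARAM = 2
    grad_bytes = PARAMS * BYTES_PER_PARAM      # ~14 GB per sync

    bandwidths_gb_per_s = {
        "datacenter interconnect": 100.0,      # NVLink/InfiniBand class
        "fast home internet": 0.1,             # ~1 Gbit/s
        "typical home internet": 0.01,         # ~100 Mbit/s
    }

    for name, bw in bandwidths_gb_per_s.items():
        seconds = grad_bytes / (bw * 1e9)
        print(f"{name:>25}: {seconds:8.1f} s per sync")

Even ignoring latency, that's minutes to tens of minutes per optimizer step on consumer links, which is why this approach needs heavy gradient compression or much less frequent synchronization to be viable at all.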


Maybe there is a way to train a neural network in a distributed fashion by training subsets of it and then propagating the aggregated weight changes to adjacent network segments. It wouldn't recover an interconnect slowdown of that magnitude, but it might still be useful depending on the topology of the network.
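In that spirit, here's a minimal toy sketch of periodic weight averaging ("local SGD" style), where each worker trains on its own shard for many local steps and only the averaged weights cross the slow link. Everything in it (the linear model, shard sizes, step counts) is a made-up illustration, not a description of any real system.

    import numpy as np

    # Toy sketch of periodic weight averaging: each worker trains on its own
    # data shard for LOCAL_STEPS gradient steps, then the workers average
    # weights once per round. Only that averaging step crosses the slow link.
    rng = np.random.default_rng(0)
    true_w = rng.normal(size=8)

    def make_shard(n):
        X = rng.normal(size=(n, 8))
        y = X @ true_w + 0.1 * rng.normal(size=n)
        return X, y

    NUM_WORKERS = 4
    LOCAL_STEPS = 50          # local steps between syncs (the bandwidth-saving knob)
    ROUNDS = 20               # number of averaging rounds
    LR = 0.05

    shards = [make_shard(256) for _ in range(NUM_WORKERS)]
    w = np.zeros(8)           # shared starting point

    for _ in range(ROUNDS):
        local_weights = []
        for X, y in shards:
            w_local = w.copy()
            for _ in range(LOCAL_STEPS):       # cheap: stays on the worker
                grad = 2 * X.T @ (X @ w_local - y) / len(y)
                w_local -= LR * grad
            local_weights.append(w_local)
        # The only cross-worker communication: average weights once per round.
        w = np.mean(local_weights, axis=0)

    print("error vs true weights:", np.linalg.norm(w - true_w))

The knob is LOCAL_STEPS: more local work between syncs means less bandwidth, at the cost of slower or noisier convergence, and for LLM-scale models even the occasional weight exchange is still tens to hundreds of GB.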



