Hacker News

RL can be massively disappointing, indeed. And I agree with you (and with the excellent post I already referenced [1]) that it is hard to get it to work at all. Sorry to hear it has been so disappointing for you!

Nonetheless, I would personally recommend learning at least the basics and fundamentals of RL. Beyond supervised, unsupervised, and the more recent and deservedly hyped self-supervised learning (generative AI, LLMs, and so on), reinforcement learning models the learning problem in a very elegant way: an agent interacting with an environment and receiving feedback. That is, arguably, a very intuitive and natural way of framing learning. You could view the error signal in backpropagation as an implicit reward, but that would be a very limited view.
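To make the agent/environment loop concrete, here is a minimal sketch in plain Python: a toy, hypothetical 5-cell corridor (my own made-up example, not any standard benchmark) solved with tabular Q-learning, the simplest RL algorithm that learns purely from that feedback signal:

```python
import random

# Toy environment (hypothetical): a corridor of 5 cells.
# The agent starts at cell 0; reaching cell 4 gives reward +1 and ends the episode.
N_STATES = 5
ACTIONS = [-1, +1]  # move left or move right

def step(state, action):
    """Environment side of the loop: return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Agent side: tabular Q-learning, learning Q[state][action] from feedback alone."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit the current estimate, sometimes explore.
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda i: Q[s][i])
            s2, r, done = step(s, ACTIONS[a])
            # Q-learning update: nudge the estimate toward reward + discounted future value.
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = train()
# Read off the greedy policy for the non-terminal cells.
policy = ["right" if Q[s][1] >= Q[s][0] else "left" for s in range(N_STATES - 1)]
print(policy)
```

Note there are no labels anywhere: the agent discovers the "always go right" policy purely from the scalar reward, which is exactly the framing described above.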

On a positive note, RL has very practical, successful applications today, even if in niche fields. For example, LLM fine-tuning techniques like RLHF successfully apply RL to modern AI systems, companies like Covariant are working on large robotics models which definitely use RL, and generally as a research field I believe (but I may be proven wrong!) there is so much more to explore. For example, check out Nvidia Eureka, which combines LLMs with RL [2]: pretty cool stuff IMHO!

Far from attempting to convince you of the strength and capabilities of DRL, I'm just recommending that folks not discard it right away and at least give the basics a chance, even just as an intellectual exercise :) Thanks again!

[1] https://www.alexirpan.com/2018/02/14/rl-hard.html

[2] https://blogs.nvidia.com/blog/eureka-robotics-research/
