I don't think it's just the alignment work. I suspect OpenAI and Microsoft are overdoing Reinforcement Learning from Human Feedback combined with LoRA fine-tuning. Most people's prompts are low-quality stuff, so the model gets dumber. LoRA is one of Microsoft's dearest discoveries in the field, so they are likely tempted to overuse it.
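For context, the core idea of LoRA (from Microsoft Research) is to freeze the pretrained weights and train only a small low-rank update on top of them, which makes frequent cheap fine-tuning runs practical. A minimal NumPy sketch of the mechanism (all sizes and names here are illustrative, not OpenAI's actual setup):

```python
import numpy as np

# Illustrative dimensions: output, input, and the low LoRA rank r.
d_out, d_in, r = 8, 16, 2
rng = np.random.default_rng(0)

# Frozen pretrained weight matrix: never updated during fine-tuning.
W = rng.standard_normal((d_out, d_in))

# LoRA trains only A (r x d_in) and B (d_out x r). B starts at zero,
# so the adapted model initially behaves exactly like the base model.
A = rng.standard_normal((r, d_in))
B = np.zeros((d_out, r))
alpha = 4.0  # scaling hyperparameter from the LoRA paper

def forward(x, B, A):
    # Effective weight is W + (alpha / r) * B @ A; gradients would
    # flow only into B and A, a tiny fraction of the full parameters.
    return (W + (alpha / r) * B @ A) @ x

x = rng.standard_normal(d_in)
print(np.allclose(W @ x, forward(x, B, A)))  # True: zero-init adapter is a no-op
```

This is why iterated daily/weekly LoRA passes are tempting: each run touches only the small B and A matrices, but repeated runs on noisy feedback data can still steadily shift the effective weights away from the base snapshot.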
Perhaps OpenAI should roll back to a good older snapshot and be more careful about what they feed into the daily/weekly LoRA fine-tuning.
But this is all guesswork because they don't reveal much.