This period of model scaling at all costs is going to be a major black eye for the industry in a couple of years. We already know that language models are few-shot learners at inference time, and yet OpenAI seems happy throwing petaflops of compute at training models the slow way.
The question is how you can use in-context learning to optimize the model weights. It’s a fun math problem, and it certainly won’t take a billion-dollar supercomputer to solve.
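One concrete angle on this, sometimes called context distillation: treat the model's own few-shot, in-context predictions as soft targets for an ordinary gradient update on the bare model, so the context's effect gets folded into the weights. Here's a minimal sketch, assuming a HuggingFace-style causal LM; the model name, prompts, learning rate, and step count are all illustrative placeholders, and none of this is claimed to be what OpenAI actually does.

```python
# Rough sketch of context distillation: distill the model's in-context
# behaviour into its weights. Everything concrete here (gpt2, the
# prompts, lr, 10 steps) is a placeholder for illustration.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"  # stand-in; any causal LM with the same interface works
tok = AutoTokenizer.from_pretrained(name)
teacher = AutoModelForCausalLM.from_pretrained(name).eval()
student = AutoModelForCausalLM.from_pretrained(name).train()

few_shot = "Q: 2+2? A: 4\nQ: 3+5? A: 8\n"  # the in-context examples
query = "Q: 7+6? A:"

ctx_ids = tok(few_shot + query, return_tensors="pt").input_ids
bare_ids = tok(query, return_tensors="pt").input_ids
n = bare_ids.shape[1]  # query tokens shared by both inputs (assumes the
                       # tokenizer splits the query identically in context)

opt = torch.optim.Adam(student.parameters(), lr=1e-5)
for step in range(10):
    with torch.no_grad():
        # teacher sees the few-shot context; keep only its next-token
        # distributions over the trailing query positions
        t_logits = teacher(ctx_ids).logits[:, -n:, :]
    s_logits = student(bare_ids).logits  # student sees no context
    # pull the bare model toward its own in-context predictions
    loss = F.kl_div(
        F.log_softmax(s_logits, dim=-1),
        F.softmax(t_logits, dim=-1),
        reduction="batchmean",
    )
    opt.zero_grad()
    loss.backward()
    opt.step()
```

After a few updates like this, the bare query alone starts to elicit the few-shot behaviour. Whether anything in this family can stand in for brute-force pretraining at scale is exactly the open question.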