Hacker News

For real. Editing prompts bears no resemblance to engineering at all; there is no accuracy or precision. Say you have a benchmark to test against and you're trying to make an improvement. Will your change to the prompt make the benchmark go up? Down? Why? Can you predict? No, it is not a science at all. It's just throwing shit and examples at the wall with hopes and prayers.


> Will your change to the prompt make the benchmark go up? Down? Why? Can you predict? No, it is not a science at all.

Many prompt engineers do measure and quantitatively compare.


Me too, but it's after the fact. I make a change, then measure; if the score doesn't improve, I roll back. But it's as good as witchcraft or alchemy. Will I get gold with this adjustment? Nope, still lead. Tries variation #243 next.
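A rough sketch of that change, measure, roll back loop, for concreteness. Everything named here is hypothetical: `tune_prompt`, `toy_score`, and the three edits are stand-ins I made up, and a real scorer would run an actual benchmark (and be noisy) rather than count keywords.

```python
from typing import Callable, List, Tuple

def tune_prompt(
    base: str,
    edits: List[Callable[[str], str]],
    score: Callable[[str], float],
) -> Tuple[str, float, List[Tuple[str, float, bool]]]:
    """Try each edit in turn; keep it only if the measured score improves."""
    best_prompt = base
    best_score = score(base)
    history = []  # (candidate prompt, its score, whether it was kept)
    for edit in edits:
        candidate = edit(best_prompt)
        candidate_score = score(candidate)
        kept = candidate_score > best_score
        history.append((candidate, candidate_score, kept))
        if kept:
            best_prompt, best_score = candidate, candidate_score
        # Otherwise "roll back": best_prompt is simply left unchanged.
    return best_prompt, best_score, history

def toy_score(prompt: str) -> float:
    """Assumed stand-in for a benchmark: fraction of magic phrases present."""
    phrases = ("step by step", "example", "JSON")
    return sum(p in prompt for p in phrases) / len(phrases)

# Hypothetical edits; the second one does nothing for the toy benchmark.
edits = [
    lambda p: p + " Think step by step.",
    lambda p: p + " Be nice.",
    lambda p: p + " Answer in JSON.",
]

best_prompt, best_score, history = tune_prompt(
    "Summarize the text.", edits, toy_score
)
for candidate, s, kept in history:
    print(f"{'keep' if kept else 'roll back'}  {s:.2f}  {candidate!r}")
```

Note that nothing in the loop predicts which edit will help; it only finds out afterwards, which is the complaint above.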


This is literally how the light bulb filament was discovered.


And Tesla famously described Edison as an idiot for this very reason. Then Tesla revolutionized the way we use electricity while Edison was busy killing elephants.


Lots of things have been discovered by guess-and-check, but it's a shit method for problem solving. You wouldn't use guess-and-check unless A) you don't know enough about the problem to employ a better method or B) the problem space can be searched quickly enough that using a better method doesn't really matter.


I think the only time guess-and-check fails is when you don't know enough to actually check.

Maybe this is why we disagreed on a previous comment. I think guess-and-check is generally the best way to solve problems, so long as your checks are well-designed (which, to be fair, does require understanding the problem), you incorporate the results from each check back into further guesses, and you can afford to brute force it (which is statistically the most common case among the problems, big and small, that I've had to solve).

In a lot of ways, there's nothing fancy about that, it's just the scientific method -- the whole point of which is that you don't really need to "know" about the problem a priori to get to the truth, and that you can recompile all dependent knowledge from scratch through experimentation to get to it.

In practice it feels really expensive but it's also where the best insights come from -- experimentally re-deriving common knowledge often exposes where its fractures and exceptions are in clear enough detail to translate it into novel discovery.


updating benchmarks and evals is something closer to test engineering / qa engineering work though



