Chess reminds me more of programming, given that each has a defined set of rules. However, I'm biased: I work in radiology and program more as a hobby. So far I've seen far more tools to help me code than to accurately detect radiologic findings.
You can compare the best image synthesis and image understanding from two decades ago (SIFT / HOG), from a decade ago (CNNs, SdAs), and now (Transformers). The progress from barely being able to recognize a face reliably to outperforming human professionals on some benchmarks (see MMMU) is remarkably rapid.
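To make the comparison concrete, here is a minimal sketch contrasting a hand-engineered descriptor (HOG) with a pretrained vision transformer. It assumes scikit-image, Pillow, and Hugging Face transformers are available; the image filename and model checkpoint are just illustrative choices, not anything from the thread.

```python
# Rough sketch: hand-designed gradient histograms (HOG, ~2005-era) vs. a
# pretrained vision transformer (current era). Filename and checkpoint are
# illustrative assumptions.
from PIL import Image
import numpy as np
from skimage.feature import hog
from skimage.color import rgb2gray
from transformers import pipeline

img = Image.open("chest_xray.png").convert("RGB")  # any local image

# Older approach: fixed, hand-engineered features; a separate classifier
# (e.g. an SVM) would still have to be trained on top of these vectors.
features = hog(
    rgb2gray(np.asarray(img)),
    orientations=9,
    pixels_per_cell=(8, 8),
    cells_per_block=(2, 2),
)
print(f"HOG feature vector length: {features.shape[0]}")

# Current approach: a pretrained transformer maps pixels straight to labels.
classifier = pipeline("image-classification", model="google/vit-base-patch16-224")
print(classifier(img)[:3])  # top-3 predicted labels with confidence scores
```

The point of the sketch is the workflow difference: the first half only produces features you would still have to learn from, while the second half is an end-to-end learned model.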
AIUI (and I may be wrong), each of the mentioned technologies was a "breakthrough", not an iterative improvement. Along this vein, I was wondering if there is some promising, novel research OP is aware of for image understanding.
The most recent technology you mentioned, Transformers, is used for both LLMs and image synthesis/understanding. The parent posits that while computer vision lags behind NLP, this may not continue. While your comment points out that image synthesis and understanding have improved over time, I'm not sure I follow the argument that they may soon catch up with, let alone leapfrog, LLMs (i.e. text understanding and synthesis).