I find the concept of low floor/high ceiling quite helpful, as recently discussed in "When Will AI Transform the Economy?" [1] - actually more helpful than the "jagged" intelligence framing used in TFA.
What annoys me most in my LinkedIn feed is low-effort visual AI slop, especially the kind based on fads such as the bland 4o comic style with text bubbles.
As a "visual animal", I find it very hard to tune out this kind of noise. I consequently try to hide such content from my feed, but the LinkedIn algorithm will not budge.
Thank you, I'll bite. If it's within your code of conduct:
- Are you providing reasoning traces, responses or both?
- Are you evaluating reasoning traces, responses or both?
- Has your work shifted towards multi-turn or long-horizon tasks?
- If you also work with chat logs of actual users, do you think they are properly anonymized? Or do you believe you could de-anonymize them without major effort?
- Do you have contact with other evaluators?
- How do you (and your colleagues) feel about the work (e.g., moral qualms because you're "training your replacement", pride because you're furthering civilization, or it's just about the money...)?
GEO: generative engine optimization (not explained in TFA)
I for one enjoy the current situation, where LLMs still return useful search results and product comparisons, for as long as it lasts. Stay tuned for colossal enshittification when the VC money dries up...
What made Google rich will be used here with an order of magnitude more power: custom ads for everyone within the answer, not alongside it. The LLM knows each person with a depth of precision that SEO profiling could never achieve. Of course it will become very hard to keep bias out of the answers. Even video ads generated on the spot for each person. Welcome to the new world.
Want a new pair of running shoes? Expect a custom ad built to convince you personally, a powerful pitch made just for you and disguised as a neutral answer gathered from the web. Really curious how this will unfold, because there is a lot of money to be made that way, to the detriment of the product.
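To make the mechanism concrete, here is a minimal sketch of what answer-level ad injection could look like, under the assumption that targeting happens in the system prompt. Everything in it (the profile fields, the ad inventory, the prompt wording) is a hypothetical illustration, not any vendor's actual pipeline:

    # Hypothetical sketch: personalized ad injection into an LLM system prompt.
    # All names and data here are invented for illustration.
    from dataclasses import dataclass

    @dataclass
    class UserProfile:
        interests: list       # inferred from chat history, e.g. ["running"]
        persuasion_style: str # e.g. "data-driven" or "emotional"

    @dataclass
    class Ad:
        topic: str
        advertiser: str
        pitch: str

    AD_INVENTORY = [
        Ad("running", "AcmeShoes", "lightweight trainers with plush cushioning"),
        Ad("travel", "FlyCheapo", "last-minute fares to sunny destinations"),
    ]

    def pick_ad(profile: UserProfile) -> Ad | None:
        # Naive topic match; a real system would auction impressions in real time.
        for ad in AD_INVENTORY:
            if ad.topic in profile.interests:
                return ad
        return None

    def build_system_prompt(profile: UserProfile) -> str:
        base = "You are a helpful assistant. Answer the user's question neutrally."
        ad = pick_ad(profile)
        if ad is None:
            return base
        # The troubling part: the sponsorship is folded into the instructions,
        # so the bias surfaces inside the "neutral" answer itself.
        return (base
                + f" When relevant, speak favorably of {ad.advertiser} ({ad.pitch})."
                + f" Tailor your argument to a {profile.persuasion_style} reader."
                + " Do not label this as advertising.")

    if __name__ == "__main__":
        profile = UserProfile(interests=["running"], persuasion_style="data-driven")
        print(build_system_prompt(profile))

The point of the sketch: nothing in the visible answer distinguishes the sponsored recommendation from an organic one; the disclosure problem lives entirely in a prompt the user never sees.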
Interesting detail from the court order [0]: When asked by the judge if they could anonymize chat logs instead of deleting them, OpenAI's response effectively dodged the "how" and focused on "privacy laws mandate deletion." This implicitly admits they don't have a reliable method to sufficiently anonymize data to satisfy those privacy concerns.
This raises serious questions about the supposed "anonymization" of chat data used for training their new models, i.e. when users leave the "improve model for all users" toggle enabled in the settings (which is the default even for paying users). So, indeed, very bad for the current business model, which appears to rely on current users (voluntarily) "feeding the machine" to improve it.
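For intuition on why anonymizing free text is hard: direct identifiers can be scrubbed mechanically, but chat logs are full of quasi-identifiers that no pattern list catches. A minimal sketch of the naive approach (the regex patterns and the example log are invented for illustration):

    # Hypothetical sketch of naive chat-log redaction and why it falls short.
    import re

    PATTERNS = {
        "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
        "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    }

    def scrub(text: str) -> str:
        """Replace direct identifiers with placeholder tokens."""
        for label, pattern in PATTERNS.items():
            text = pattern.sub(f"[{label}]", text)
        return text

    log = ("Hi, I'm the only pediatric cardiologist in Greifswald and my "
           "divorce hearing is on March 3rd. Reach me at jane@example.org "
           "or +49 170 1234567.")

    print(scrub(log))
    # The email and phone number are gone, but "only pediatric cardiologist
    # in Greifswald" plus the hearing date is a quasi-identifier that likely
    # re-identifies one person. Free text resists mechanical anonymization,
    # which would explain a response that dodges the "how".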
So, the NYT asked for this back in January and the court said no, but asked OpenAI if there was a way to accomplish the preservation goal in a privacy-preserving manner. OpenAI refused to engage for 5 f’ing months. The court said “fine, the NYT gets what they originally asked for”.
Speaking of the performance wall: the Claude 4 results were added to the Aider LLM Leaderboard [0] yesterday. Opus 4 scores clearly below Gemini 2.5 Pro at almost twice the price. Sonnet 4 fares worse than Sonnet 3.7, with the thinking version of Sonnet 4 being somewhat cheaper than its 3.7 counterpart.
[1] https://andreinfante.substack.com/p/when-will-ai-transform-t...