wfme's comments

If you don't want to use your own product, why should anyone else? Empathy and understanding for your users is what drives a great product.


My experience reflects this too. My hunch is that GPT-4o was trained to game the benchmarks rather than output higher quality content.

In theory the benchmarks should be a pretty close proxy for quality, but that doesn't match my experience at all.


A problem with a lot of benchmarks is that they are out in the open, so the model basically trains to game them instead of actually acquiring the knowledge that would let it solve them. Private benchmarks that are not in the training set of these models should probably give better estimates of their general performance.


Agreed, I assume people are flagging this post for some reason?


It’s most likely HN's flame-war filtering algorithm. Posts that generate a lot of discussion quickly are downranked until an admin fixes the rank manually, or not.


There are more options than storing flags in git or flipping them willy-nilly.

For systems of sufficient scale, it's fairly standard to keep flag changes outside of git so that they can be flipped without a PR. That way the flag change UI can apply other validation steps before any change is attempted, such as ensuring valid enrolment ranges (no accidental overlap, and no accidental rollouts to 100% instead of 10%), and the associated rollout analytics can be shown alongside the changes.

You can also override things in emergencies more easily, which is the parent's point.
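
As a rough illustration of the kind of validation a flag change UI could run before applying a change, here is a minimal sketch. Everything in it is an assumption made for the example: the `Enrolment` type, the `validate_change` function, and the idea of passing the intended rollout ceiling as `max_total_percent` are hypothetical and not taken from any particular flag system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enrolment:
    """Hypothetical rollout bucket covering [start, end) percent of users."""
    variant: str
    start: int  # inclusive, 0-100
    end: int    # exclusive, 0-100


def validate_change(ranges, max_total_percent):
    """Return a list of problems; an empty list means the change passes."""
    problems = []

    # Each range must be non-empty and stay within 0-100.
    for r in ranges:
        if not (0 <= r.start < r.end <= 100):
            problems.append(f"{r.variant}: invalid range {r.start}-{r.end}")

    # Walk the ranges in start order, remembering the one that reaches
    # furthest, so any range starting before that end is an overlap.
    widest = None
    for cur in sorted(ranges, key=lambda r: r.start):
        if widest is not None and cur.start < widest.end:
            problems.append(f"{widest.variant} overlaps {cur.variant}")
        if widest is None or cur.end > widest.end:
            widest = cur

    # Guard against rolling out to far more users than intended.
    total = sum(r.end - r.start for r in ranges)
    if total > max_total_percent:
        problems.append(
            f"rollout covers {total}%, above the intended {max_total_percent}%"
        )
    return problems
```

A UI built this way would refuse the change (or ask for an explicit override) whenever the list is non-empty, which is roughly the "10% typed as 100%" guard described above.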


But it’s also common to stick those flags back under some sort of audit system so we know why and how weird states got set. The simplest way is a separate repo with simpler rules for pushing to master.

Though you could also create your own audit system (just make sure it functions even when the entire site is down).


This is all decided by business needs, not by engineer preference. If devs aren’t pulling the levers it is good to expose them in an accessible interface, not plaintext.


Let’s not pretend like these “business needs” don’t often boil down to a couple of tropes and that the story doesn’t change after a couple of post mortems are turned in for perfectly predictable failure modes.


Chatbots are great for this too. You get flippiness, but you also get auditing implicitly from the chat log.


Because it is cool to hate on React on hn. Write something in Rust and watch everyone swoon, but mention anything that uses React and the comments will be filled with people hating all over it. As is the way on hn.


You're right about the disproportionate hate for React, but your information about Rust is out of date. These days it's 50-50 whether you'll get a lot of love or hate for a Rust project, and several of the top-upvoted Rust posts of the past year are disenchanted ones.


Dropping an SPA onto Cloudflare takes under 5 minutes including sign up and is completely free.


How can the jury be considered unbiased given the amount of media coverage this has had? It seems near impossible that none of the members have been exposed to any of the coverage during or prior to the trial.


I don't think anyone is saying that a mother needs to take acetaminophen for their child to have ASD, just that there is an association between a mother taking acetaminophen and their child having ASD.

It seems that the cause of the connection is/was not known, but there is evidence that acetaminophen during pregnancy has an impact on the child's microbiome, and now the study this post is on points to a connection between an altered microbiome and ASD.

There are no doubt many other things done during pregnancy that can also affect the child's microbiome, but this connection is no less interesting given how freely acetaminophen is taken.


Looks great and works well.

Some small suggestions:

- Please add some keyboard shortcuts for common actions, e.g., cmd/ctrl + z to undo (+ shift to redo), delete/backspace to delete, cmd/ctrl + d to duplicate, cmd/ctrl + a to select all.

- Increase rotate cursor affordance - it's currently relatively tiny.
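
To make the first suggestion concrete, here is a minimal sketch of how those bindings might resolve to actions. The action names and the `resolve_shortcut` function are hypothetical, and it assumes the caller has already folded cmd (macOS) and ctrl (elsewhere) into a single `primary` flag. It is written in Python only to keep it short; the same table would carry over to a browser key handler.

```python
# (key, shift held) -> hypothetical action name, for presses with cmd/ctrl held.
BINDINGS = {
    ("z", False): "undo",
    ("z", True): "redo",
    ("d", False): "duplicate",
    ("a", False): "select_all",
}


def resolve_shortcut(key, *, primary=False, shift=False):
    """Map a key press to an action name, or None if nothing is bound.

    `primary` means cmd on macOS or ctrl elsewhere (normalised by the caller).
    """
    key = key.lower()
    if not primary:
        # Delete/backspace work without any modifier.
        if key in ("delete", "backspace") and not shift:
            return "delete"
        return None
    return BINDINGS.get((key, shift))
```

Keeping the bindings in one table also makes it easy to generate a shortcut list for users from the same data.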


You mean something like a cheat sheet to show all the shortcut keys?


A cheat sheet would be useful too! But I mean shortcuts for frequently used actions like copy/paste and delete.


Fun to imagine such a dystopic setup but entirely unrelated and not even close to equivalent.

Food producers and sellers need to meet certain standards to sell their food. We take these standards for granted but they aren’t without cost. We like the standards because it means that we can generally trust the safety of the food we buy without needing to know much about it.

This is the point Apple sells on. Users are able to download apps and configure their phones without _any_ concerns of compromising their device.

Not everyone wants these guidelines and protections but for a large number of people they are worth a premium.


OK, here's another analogy:

Imagine if every time you installed software on your Mac, they charged a $0.50 fee to the software maker. Even for a shell script.

How is that ANY different?

They're only doing this with iPhone/iPad because they can, because by the time they realized it could be a profitable business for Mac the expectations with their customers had already been established, and they had too much competition.

As soon as they had a competitive edge with iPhone, they resorted to rent seeking.


> They're only doing this with iPhone/iPad because they can, because by the time they realized it could be a profitable business for Mac the expectations with their customers had already been established, and they had too much competition.

That's not historically accurate, and I worked in a core engineering group at Apple. An early concern on battery-dependent devices was ensuring that the software running on them would not deplete the battery wastefully, and the famous/infamous offender giving rise to what you (understandably) question was the Flash runtime. The upshot of poorly written software (which gave rise to attempts to test and charge for access, which you dislike) was customers in a competitive landscape (Symbian and Android) thinking Apple hardware sucked, even though that was not the hardware reality.

So, I get your perspective, but it overlooks the early and very real problems that led to this gateway/gauntlet: an attempt to prioritize improving user experience over sparing developers inconvenience.

