Hacker News | spion's comments

pnpm just added minimum age for dependencies https://pnpm.io/blog/releases/10.16#new-setting-for-delayed-...
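If I'm remembering the linked release notes correctly, the setting is called `minimumReleaseAge` and is given in minutes, configurable in `pnpm-workspace.yaml`. Treat the exact names below as my recollection of the docs rather than gospel:

```yaml
# Only install package versions that have been on the registry
# for at least 24 hours (value is in minutes).
minimumReleaseAge: 1440

# Optionally exempt packages you trust or publish yourself
# (option name recalled from the docs; double-check before relying on it).
minimumReleaseAgeExclude:
  - "@my-org/*"
```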


From your link:

> In most cases, such attacks are discovered quickly and the malicious versions are removed from the registry within an hour.

By delaying the availability of an infected package (by "aging" dependencies), we're only delaying the time until it's detected, and reducing the number of samples that could reveal it. Infections that lie dormant are even more dangerous than explosive ones.

The only benefit would be if, during this freeze, repository maintainers were successfully pruning malware before it hits the fan, with the delay giving scanners more time to finish their verification pipelines. That's not happening afaik: npm is crazy fast going from `npm publish` to worldwide availability, and scanning is insufficient by many standards.


Afaict many of these recent supply chain attacks _have_ been detected by scanners. Which ones flew under the radar for an extended period of time?

From what I can tell, even a few hours of delay between publication and dependencies actually being pulled, giving security tools a chance to find the malware, would have stopped all (?) of the recent attacks in their tracks.


Thank god, adopting this immediately. Next I’d like to see Go-style minimal version selection instead.
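For anyone unfamiliar with Go's approach: every dependent declares the minimum version it needs, and the resolver picks the highest of those minimums rather than the newest matching release. A minimal illustrative sketch (not pnpm's or Go's actual implementation):

```typescript
// Minimal version selection (illustrative): each dependent declares the
// minimum version it requires; the chosen version is the max of those
// minimums -- never whatever the registry happens to have published last.
type SemVer = [major: number, minor: number, patch: number];

const compare = (a: SemVer, b: SemVer): number =>
  a[0] - b[0] || a[1] - b[1] || a[2] - b[2];

function selectVersion(requiredMinimums: SemVer[]): SemVer {
  if (requiredMinimums.length === 0) throw new Error("no requirements");
  // New releases are never pulled in until some dependent explicitly asks.
  return requiredMinimums.reduce((best, v) => (compare(v, best) > 0 ? v : best));
}

// Two dependents asking for >=1.2.0 and >=1.4.1 resolve to 1.4.1,
// even if 1.9.0 was published yesterday.
console.log(selectVersion([[1, 2, 0], [1, 4, 1]])); // [1, 4, 1]
```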


Oh brilliant. I've been meaning to start migrating my use to pnpm; this is the push I needed.


I don't think that's contrary to the article's claim: the current tools are so bad and tedious to use for repetitive work that AI is helpful with a huge amount of it.


It's not settled whether AI training is fair use.


Try actually doing it, realise how very far the outcome is from what the blog posts describe the vast majority of the time, and get dread from the state of (social) media instead.


Yes, but the cooked thing is that you just run more loops with the right prompts and you can resolve defective outcomes. It's terrifying.


No, it still doesn't work. But the only way to realise it is to actually really try using it.


I think agents have a curve: they're kinda bad at bootstrapping a project, very good if used in a small-to-medium-sized existing project, and then it slowly goes downhill from there as size increases.

Something about a brand-new project often makes LLMs drop to "example grade" code, the kind you'd never put in production. (An example: Claude implemented per-task file logging in my prototype project by pushing to an array of log lines, serializing the entire thing to JSON, and rewriting the entire file, for every logged event.)
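Roughly what that looked like versus what you'd actually want; this is a paraphrased sketch from memory, not the literal generated code:

```typescript
import { appendFileSync, writeFileSync } from "node:fs";

// "Example grade": keep every line in memory and rewrite the whole file
// as JSON on every single log call -- full-file serialization and I/O per event.
const lines: string[] = [];
function logNaive(path: string, msg: string) {
  lines.push(`${new Date().toISOString()} ${msg}`);
  writeFileSync(path, JSON.stringify(lines, null, 2));
}

// What you'd actually want, even in a prototype: append one line per event.
function logAppend(path: string, msg: string) {
  appendFileSync(path, `${new Date().toISOString()} ${msg}\n`);
}
```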


There are a few languages where this is not too tedious (although other things tend to be a bit more tedious than needed in those).

The main problem with these is how you actually get the verification needed when data comes in from outside the system. Check with the database every time you want to turn a string/UUID into an ID type? That can get prohibitively expensive.
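For context, TypeScript-style "branded" ID types are one way this ends up not too tedious; the awkward part is exactly the boundary check above. A hypothetical sketch (names like `UserId` and `parseUserId` are made up for illustration):

```typescript
// A "branded" ID type: structurally a string, but not assignable from an
// arbitrary string without going through a checked constructor.
type UserId = string & { readonly __brand: "UserId" };

// The cheap version: validate the shape only (rough UUID check here).
// The expensive version would also confirm the row exists in the database,
// which is exactly what becomes prohibitive if done on every conversion.
function parseUserId(raw: string): UserId {
  const uuidLike =
    /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
  if (!uuidLike.test(raw)) throw new Error(`not a valid user id: ${raw}`);
  return raw as UserId;
}

function loadUser(id: UserId) {
  /* ...only accepts values that went through parseUserId... */
}

loadUser(parseUserId("123e4567-e89b-12d3-a456-426614174000")); // ok
// loadUser("some random string"); // compile error
```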


The OP is the author of grugbrain.dev


What you think is an absurd question may not be as absurd as it seems, given the trillions of tokens of data on the internet, including its darkest corners.

In my experience, it's better to simply try using LLMs in areas where they don't have a lot of training data (e.g. reasoning about the behaviour of Terraform plans). It's not a hard cutoff of being _only_ able to reason about already-solved things, but it's not too far off as a first approximation.

The researchers took existing known problems and parameterised their difficulty [1]. While most of these are by no means easy for humans, the interesting observation to me was that the N at which models failed was not proportional to the complexity of the problem, but tracked how commonly solution "printouts" for that size of the problem can be encountered in the training data. For example, "Towers of Hanoi", which has printed-out solutions for a variety of sizes, went to a very large number of steps N, while the river crossing puzzle, which is almost entirely absent from the training data for N larger than 3, failed above pretty much that exact number.

[1]: https://machinelearning.apple.com/research/illusion-of-think...


It doesn't help that, thanks to RLHF, whenever a good example of this gains popularity, e.g. "How many Rs are in 'strawberry'?", it tends to get snuffed out quickly. If I worked at a company with an LLM product, I'd build tooling to look for these kinds of examples on social media or directly in usage data so they could be prioritized for fixes. I don't know how to feel about this.

On the one hand, it's sort of like red teaming. On the other hand, it clearly gives consumers a false sense of ability.


Indeed. Which is why I think the only way to really evaluate the progress of LLMs is to curate your own personal set of example failures that you don't share with anyone else and only use it via APIs that provide some sort of no-data-retention and no-training guarantees.


Vibe-wise, it seems like progress is slowing down and recent models aren't substantially better than their predecessors. But it would be interesting to take a well-trusted benchmark and plot max_performance_until_date for each month. (Too bad aider changed its benchmark recently and there aren't many older models on it; https://aider.chat/docs/leaderboards/by-release-date.html has not been updated in a while with newer stuff, and the new benchmark doesn't include the classic models such as GPT-3.5, GPT-3.5 Turbo, GPT-4, or Claude 3 Opus.)
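A quick sketch of the kind of curve I mean, assuming you have (release month, score) pairs from whichever benchmark you trust:

```typescript
// Given benchmark scores keyed by model release month, compute the
// best-score-so-far curve: max_performance_until_date for each month.
type Entry = { month: string; score: number }; // month as "YYYY-MM"

function bestSoFar(entries: Entry[]): Entry[] {
  const sorted = [...entries].sort((a, b) => a.month.localeCompare(b.month));
  let best = -Infinity;
  return sorted.map(({ month, score }) => {
    best = Math.max(best, score);
    return { month, score: best };
  });
}

// A flat tail in this curve is what "progress is slowing down" would look like.
```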


I think we can't expect continuous progress either, though. Often in computer science it's more discrete and unexpected: computer chess was basically stagnant until one team made a breakthrough, and even the evolution of species often behaves in a punctuated way rather than as a sum of many small adaptations. I'm much more interested in (worried about) what the world will be like in 30 years rather than in the next 5.


It's hard to say. Historically, new discoveries in AI often generated great excitement and high expectations, followed by some progress, then stalling, disillusionment, and an AI winter. Maybe this time it will be different. Either way, what has been achieved so far is already a huge deal.


Why dagger and not just... any language? (Nushell for example https://www.nushell.sh/)


Because I'm typically building and running tests in containers in CI, which is what dagger is for.

nu is my default shell. Note that I am not talking about dagger shell. https://dagger.io/blog/a-shell-for-the-container-age-introdu...

