What's Going on Inside Your Node_modules Folder? (socket.dev)
64 points by swyx on March 2, 2022 | 33 comments


This post does a really great job explaining the risk we all take when downloading code that we haven't audited from the internet. I've never taken the time to audit the code I'm including in my projects. Instead, I blindly trust that someone else has. I love how socket has identified common security red flags and is automatically warning users about what they find. Great to see!


For a lone dev, it's impossible but I wonder if the big players like Facebook and Google actually audit all the transitive dependencies they've selected each time they release a new version of React or Angular.


I'd be curious to see how many packages have a pre/post install script that gets executed automatically by npm. This seems particularly problematic. It's a huge reduction in security, and I'm surprised npm can justify the feature.
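For a rough local answer to the "how many packages" question, here is a minimal sketch that walks a node_modules tree and lists packages declaring `preinstall`, `install`, or `postinstall` scripts. The function names `installHooks` and `scanNodeModules` are made up for illustration. It only reads `package.json` manifests, so it will miss packages that get an implicit build step some other way, and it has not been tested against symlink-heavy layouts.

```javascript
const fs = require("fs");
const path = require("path");

// Lifecycle scripts that npm runs automatically during `npm install`.
const INSTALL_HOOKS = ["preinstall", "install", "postinstall"];

// Return the install-time hooks declared in a parsed package.json object.
function installHooks(pkg) {
  const scripts = (pkg && pkg.scripts) || {};
  return INSTALL_HOOKS.filter((hook) => typeof scripts[hook] === "string");
}

// Recursively walk a node_modules directory and collect packages with hooks.
function scanNodeModules(dir, found = []) {
  if (!fs.existsSync(dir)) return found;
  for (const entry of fs.readdirSync(dir)) {
    if (entry.startsWith(".")) continue; // skip .bin and other dot entries
    const full = path.join(dir, entry);
    if (entry.startsWith("@")) {
      // Scoped packages live one level deeper: node_modules/@scope/name
      scanNodeModules(full, found);
      continue;
    }
    const manifest = path.join(full, "package.json");
    if (fs.existsSync(manifest)) {
      try {
        const pkg = JSON.parse(fs.readFileSync(manifest, "utf8"));
        const hooks = installHooks(pkg);
        if (hooks.length > 0) found.push({ name: pkg.name, hooks });
      } catch {
        // Skip manifests that are unreadable or not valid JSON.
      }
    }
    // Nested dependencies live in <package>/node_modules.
    scanNodeModules(path.join(full, "node_modules"), found);
  }
  return found;
}
```

Calling `scanNodeModules(path.join(process.cwd(), "node_modules"))` from a project root returns an array of `{ name, hooks }` entries to review by hand.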


Check out http://socket.dev/npm/issue/installScripts

Majority are benign, but when a package without one adds one, you probably want to see why :)


There's a reason I'll only ever run NPM in a VM on any machine that has any data I care about


Install scripts can be disabled with a config file, or npm flag, but your paranoia is not unwarranted.
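The config setting in question is `ignore-scripts=true` in `.npmrc`, and the flag is `npm install --ignore-scripts`. As a sketch of how a CI step might confirm the setting is actually present, the helper below checks the text of an `.npmrc` file. The name `scriptsDisabled` is hypothetical, and the parsing is deliberately simplified (plain `key=value` lines, with the assumption that a later duplicate overrides an earlier one).

```javascript
// Return true if the given .npmrc text sets ignore-scripts to true.
function scriptsDisabled(npmrcText) {
  let disabled = false;
  for (const rawLine of npmrcText.split(/\r?\n/)) {
    const line = rawLine.trim();
    // .npmrc comments start with ';' or '#'.
    if (line === "" || line.startsWith(";") || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue;
    const key = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim();
    // Assumption: if the key appears more than once, the last value wins.
    if (key === "ignore-scripts") disabled = value === "true";
  }
  return disabled;
}
```

Note that some packages rely on install scripts to build native code, so a hardened install may need those packages rebuilt explicitly.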


Does deno (try to) solve any of this?


Deno does not have install scripts, or a lot of other npm+node features. You can turn these off in npm though if you want a hardened install step.


point in the middle about difficulty lining up github + npm package is smart

f-droid distributes builds they run themselves from github, which depending on your trust model, is either better or worse than letting authors upload binaries, but at least shows awareness of the problem

trusted CI for software will be a big deal -- it doesn't have to be open source, but in the case of a breach you should be able to have a trusted third party check the source code for your binary and rule out foul play

we should also be incentivizing ($) expert code review of high-traffic packages -- I suspect this happens informally already (remember how big cos got serious about ssl post snowden), but would be nice to formalize


npm and GitHub have always had an awkward overlap. One of the goals at Socket is to provide an aligned 'union' view into the data both services offer. Keen eye!


Even though this is obviously pitching their auditing app, the article is absolutely excellent and the interactive visualization for webpack alone is extremely well done and rather terrifying.

Looks like an awesome tool, actually.


The crux of the issue is really what we call the "kingdom of humanity". The earlier the population learn about this hard truth, the better.


What makes this problem so endemic to the Node community? I feel like I never hear about these types of scenarios with NuGet or other package managers. Not to say that it's not a problem elsewhere - it just seems like all the horror stories stem from NPM


I think it's a combination of a relatively slim standard library and culture. For the culture part, JS devs seem to be more willing to install dependencies for basic functionality, rather than copy-pasting simple code into their projects. For example, `isobject` (https://www.npmjs.com/package/isobject) gets 55 million weekly downloads, but it's 3 lines of code.
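To give a sense of scale, a check of this kind can be written inline in a few lines. This is a sketch of an equivalent helper, not necessarily the package's exact source.

```javascript
// True for non-null, non-array values whose typeof is "object".
function isObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}
```

Copying something like this trades one dependency for a small amount of code you have to own and test yourself.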


I think it's partly because the JS ecosystem doesn't historically have a solid standard library. Doing simple things can require checking for null/undefined/does the runtime support it, etc. Why do it the hard way when you can NPM install it and call it a day?

That's my understanding of it anyway.


The number of libraries reached for on your average JS app is humongous compared to basically any other programming language I've worked with, partially due to this.

I think it's also just a footgun of the JS community. People tend to jump to "what package do I need to install for this" much quicker instead of thinking "how can I solve this".

Every recent JS developer who is learning through online material is constantly bombarded with "just install this dep, and this dep, and then this one", to the point where it's normalized to have a dependency that comes with who knows what for something that could be a few lines of code and maybe some witty google-fu.


That and an aggressive auto update with lax version constraints for transitive dependencies leads to frequently downloading untrusted code.
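To make "lax version constraints" concrete: npm saves new dependencies with a caret range by default, so `^1.2.3` accepts any later 1.x release the next time the range is resolved without a lockfile. Below is a simplified sketch of caret matching. The helper names are made up, and prerelease tags and other range syntax are deliberately not handled. For real use, the `semver` package is the reference implementation.

```javascript
// Parse "MAJOR.MINOR.PATCH" into numbers (prerelease tags are not handled).
function parseVersion(version) {
  const parts = version.split(".").map(Number);
  if (parts.length !== 3 || parts.some((n) => !Number.isInteger(n) || n < 0)) {
    throw new Error(`unsupported version: ${version}`);
  }
  return parts;
}

// Compare two parsed versions: -1 if a < b, 0 if equal, 1 if a > b.
function compareVersions(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return 0;
}

// Does `version` satisfy the caret range `^base`?
function satisfiesCaret(version, base) {
  const v = parseVersion(version);
  const lo = parseVersion(base);
  // Upper bound: bump the left-most non-zero component and zero the rest.
  const firstNonZero = lo.findIndex((n) => n !== 0);
  const idx = firstNonZero === -1 ? 2 : firstNonZero;
  const hi = lo.map((n, j) => (j < idx ? n : j === idx ? n + 1 : 0));
  return compareVersions(v, lo) >= 0 && compareVersions(v, hi) < 0;
}
```

With this, `satisfiesCaret("1.9.0", "1.2.3")` is true while `satisfiesCaret("2.0.0", "1.2.3")` is false. That means any 1.x release published after you picked 1.2.3 can be pulled in unless a lockfile pins the exact version.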


Nuget has the same issue - it can execute arbitrary code during installation. Perhaps it's being exploited already but nobody noticed?

Regardless, even in case of package managers that don't have install scripts (e.g. Maven) one could simply insert malware directly into library code and have it execute whenever you run tests or your application.

The only true solution would be some sort of sophisticated sandboxing or sophisticated malware detection or distributed code review.


Same reason most of the malware ecosystem focused on Windows instead of macOS for many years: the scale of your potential target, and also the volume of noise that is made when something bad happens. npm also supports nested dependencies, and the tools can accommodate large, deep trees without much pain, so they tend to grow in size over time.


Is Node really that much more widespread than dotnet, Java, or Python? Or is it more about the general experience level/knowledge of the typical JS developer, where it's more likely to have Bootcamp/self-taught experience than a formal CompSci education?



Kinda wish npm gave you a little more info up front, like ‘this package will download # dependencies, do you wish to continue?’
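One way to approximate that number today is to count entries in a `package-lock.json`. npm can write one without installing anything via `npm install --package-lock-only`. The sketch below assumes the newer lockfile layout, with a top-level `packages` map keyed by install path. `countLockedPackages` is a hypothetical helper name.

```javascript
// Count the distinct name@version pairs a lockfile would install.
function countLockedPackages(lock) {
  const seen = new Set();
  const marker = "node_modules/";
  for (const [location, meta] of Object.entries(lock.packages || {})) {
    if (location === "") continue; // "" is the root project itself
    // The package name is whatever follows the last "node_modules/" segment.
    const at = location.lastIndexOf(marker);
    if (at === -1) continue; // not an installed package path (e.g. a workspace folder)
    const name = location.slice(at + marker.length);
    seen.add(`${name}@${meta.version}`);
  }
  return seen.size;
}
```

Usage would look like `countLockedPackages(JSON.parse(fs.readFileSync("package-lock.json", "utf8")))`. The count includes dev and optional dependencies unless you filter on the flags each lockfile entry carries.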


There is a smattering of tools across the ecosystem that provide this kind of info (packagephobia, avanka etc) and it would be fantastic to surface these in a unified product UI. You will know we've met our goals when socket becomes your go-to service to navigate npm!


Changelog podcast with the author: https://changelog.com/podcast/482


I am an Android developer and we have a React bridge, so sometimes I end up dealing with npm/yarn etc. The sight and depth of the node_modules folder gives me panic attacks.

The ~/.gradle/caches isn’t something to write home about either.


Yeah, gradle/maven packages benefit from being much more coarse-grained: fewer, larger packages, fewer authors, fewer transitives, and in theory fewer risks. They're quicker to download and manage too. It really is just the tiny size of JS packages that causes many of the problems. Having an RCE baked into npm doesn't help either, I guess.

I did just check my ~/.gradle is 17G, I can probably delete Gradle 4.0, 4.1, 4.8, 5.0... thanks for the reminder! =)


Even if the pre/post install hooks are removed, you can still execute code in the source code files using some exec command. Doesn't catch as many people out, but anyone running an app locally has given it permissions to your shell.

This is the same problem for Go and other package managers right? I guess the best defence is some npm install alternative that runs a check against a trusted registry of modules that have been audited. Does that exist already?


Why can’t we solve this problem by forcing users to provide checksums? It’s particularly damning that Microsoft owns GitHub and npm for its own strategic purposes and does nothing to mitigate the fact that source code hosted on GitHub can differ drastically from the compiled code in the registry.
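For context on the checksum idea: npm lockfiles already record an `integrity` string per package, in the `sha512-<base64>` style. Here is a minimal sketch of that kind of check using Node's built-in `crypto` module. The helper names are made up, and the sketch assumes a single sha512 hash in the integrity string. This kind of check only proves that a tarball matches what was locked. It does not prove the tarball matches the source on GitHub, which is the gap described above.

```javascript
const crypto = require("crypto");

// Compute an SRI-style integrity string ("sha512-<base64 digest>") for bytes.
function integrityOf(bytes) {
  return "sha512-" + crypto.createHash("sha512").update(bytes).digest("base64");
}

// Compare a tarball's bytes against an expected integrity string.
function matchesIntegrity(bytes, expected) {
  const actual = Buffer.from(integrityOf(bytes));
  const wanted = Buffer.from(expected);
  // timingSafeEqual requires equal lengths, so check that first.
  return actual.length === wanted.length && crypto.timingSafeEqual(actual, wanted);
}
```

In practice `bytes` would be the downloaded `.tgz` contents and `expected` the `integrity` value from the lockfile entry.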


Based on the blog's domain, this may be a mild dupe of https://news.ycombinator.com/item?id=30515090


Would be great to run their tool as a proxy to npm whenever you install rather than being triggered on a GitHub action.


We are assessing this possibility.


I would pay for that.


"Node_modules"? I don't think that exists



