
> If those images disappear, we lose the ability to release and that's not acceptable.

This shines light on why it is so risky (from both availability and security perspectives) to be dependent on any third party for the build pipeline of a product.

I have always insisted that all dependencies must be pulled from a local source even if the ultimate origin is upstream. I am continually surprised by how many groups simply rely on some third-party service (or a dozen of them) being perpetually available, or their product build goes boom.



Likewise. I've always insisted on building from in-house copies of external dependencies for precisely this kind of scenario. It astonishes me how many people didn't get why. Things like Docker rate-limiting/shutdowns, regular supply chain attacks, etc. have been helping make the case, though.

Slightly related: actually knowing for sure that you've got a handle on all of the external dependencies is sometimes harder than it should be. Building in an environment with no outbound network access turns up all sorts of terrible things - far more often than it should. The kind that worry me are supposedly self-contained packages that internally do a bunch of "curl | sudo bash" type processing in their pre/post-install scripts. Those are good to know about before it is too late.
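
One cheap way to surface these is to run the build with networking deliberately disabled and watch what breaks. A rough sketch, where the image tag and build command are placeholders for whatever your build actually is:

    # Build with no network at all; anything that tries to phone home
    # (curl-in-postinstall, implicit package fetches, telemetry) fails
    # loudly instead of silently succeeding.
    docker build --network=none -t myapp:offline-test .

    # Or, outside of Docker, run the build in a fresh network namespace
    # with only loopback available (uses util-linux's unshare).
    unshare --net --map-root-user sh -c 'ip link set lo up && make build'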


> Building in an environment with no outbound network access turns up all sorts of terrible things

Yes, highly recommended to build on such a system, it'll shake out the roaches that lie hidden.

In a small startup environment, the very least you can do is keep a local repository of all external dependencies and build off that, so that if a third party goes offline or deletes what you need, you're still good.

For larger enterprises with more resources, best is to build everything from source code kept in local repositories and do those builds, as you say, on machines with no network connectivity. That way you are guaranteed that every bit of code in your product can be (re)built from source even far in the future.
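
What that looks like in practice depends on the ecosystem, but as one concrete sketch for container images (the internal registry hostname here is made up): pull the upstream artifact once, push it into a registry you control, and have builds reference only the internal copy from then on.

    docker pull nginx:1.25
    docker tag nginx:1.25 registry.internal.example/mirror/nginx:1.25
    docker push registry.internal.example/mirror/nginx:1.25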


Be sure to archive your development tools as well, just in case that rug gets pulled. You don't want to be in the position that you need v3.1415927 of FooWare X++ because version 4 dropped support for BazQuux™, only to find that it's no longer downloadable at any price.


I do not know if Nix will be the answer, but I really hope it or a successor drags us to fully explicit and reproducible builds.


For reproducing a build you need, at the least, the source and the tools to build it, and those tools might not be available either.


Yes, but Nix is essentially about getting things built, so those build tools are part of the recipe to make something happen.

I'm still learning Nix myself, but one small example: a small, Haskell-based utility I've written depends on a specific version of one library, due to API changes, and that version is only bundled with certain GHC versions. The whole situation was uncomfortable: code I had left working would stop building some time later, when I came back and ran it against whatever toolchain seemed more current at the time.

Defining a short nix flake solved all of that. That first compile was a slog, since it fetched and built the appropriate GHC and libraries, including whatever transitive dependencies those needed. Once done though, those are cached, and "nix build" just works.
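
For anyone curious what that looks like, a minimal sketch of the kind of flake I mean (the package name and GHC attribute here are illustrative, not my actual project):

    {
      description = "Pin nixpkgs, GHC, and libraries for a small Haskell utility";

      inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05";

      outputs = { self, nixpkgs }:
        let
          system = "x86_64-linux";
          pkgs = nixpkgs.legacyPackages.${system};
          # Pick the compiler set whose bundled library versions match the code.
          hsPkgs = pkgs.haskell.packages.ghc948;
        in {
          packages.${system}.default =
            hsPkgs.callCabal2nix "my-utility" ./. { };
        };
    }

Because flake.lock records the exact nixpkgs revision, coming back a year later rebuilds the same GHC and library versions.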


We can't go NIH for everything. If we do that we're back to baremetal in our own datacenters and that's expensive and (comparatively) low velocity. We have to pick and choose our dependencies and take the trade off of risk for velocity.

This is the tradeoff we made with the move to cloud. We run our workloads on AWS, GCP or Azure, use DataDog or New Relic for monitoring, use GitHub or GitLab for repos and pipelines, and so forth. Each speeds us up but is a risk. We hope they are relatively low risks, and we work to mitigate them as we can.

An organization like Docker should have been low risk. Clearly, it's not. So now it's a strong candidate for replacement with a local solution rather than a vendor to rely on.


It's less NIH and more "cache your dependencies." Details will vary greatly depending on what your tech stack looks like; if you're lucky, you can just put a cache inline. I know Artifactory is a relatively general commercial solution, although I can't speak to it personally.

If you can't easily use an existing caching solution, then the only NIH you need to do is copying the files that your build system downloads. I know many build systems are "just a bunch of scripts", so those would probably be pretty amenable to this. I don't know if more opaque systems exist that wouldn't give you any access like that; if so, I suppose you could try to just copy the disk the build system writes everything to, but then you're getting into pretty hacky stuff and that's not ideal. Copying the files doesn't give you the nice UX of a cache, but it does mean that in the worst-case scenario you at least have all the dependencies you've used in recent builds, so you'll be able to keep building your things.
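
For ecosystems with first-class support, the "just copy the files" approach can be a single vendoring step; e.g. for a Python project (one example among many stacks, assuming the usual requirements.txt layout):

    # Download every pinned dependency (plus transitive deps) into a
    # local directory that gets archived alongside the build.
    pip download --dest vendor/ --requirement requirements.txt

    # Later builds install strictly from the vendored copies,
    # with no index access at all.
    pip install --no-index --find-links vendor/ --requirement requirements.txt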


> I don't know if more opaque systems exist that wouldn't give you any access like that

As long as there is a "server reimplementation" available, i.e. a private registry, one can always hack together a solution out of a self-signed CA, DNS, and routing to replace "the server" with a local registry.
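
For the Docker Hub case specifically, you don't even need the DNS tricks: the daemon can be pointed at a pull-through mirror directly. A sketch, with a hypothetical mirror hostname:

    # Send Docker Hub pulls through an internal pull-through cache.
    cat <<'EOF' | sudo tee /etc/docker/daemon.json
    {
      "registry-mirrors": ["https://registry-mirror.internal.example:5000"]
    }
    EOF
    sudo systemctl restart docker

    # A self-signed CA for the mirror is trusted by placing its cert at
    # /etc/docker/certs.d/registry-mirror.internal.example:5000/ca.crt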


"Free service which requires $$ to maintain" and "low risk" are not compatible.

We moved to cloud as well, and we use AWS ECR for caching. We have a script for "docker login to ECR" and a list of images to auto-mirror periodically. There is a bit of friction when adding a new, never-seen-before image, but in general this does not slow developers down much. And we never hit any rate limits, either!

We pay for those ECR accesses, so I am pretty confident they are not going to go away. Unlike free docker images.
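
For the curious, such a mirror job can be as simple as something like this (account ID, region, and repository layout are illustrative, and the mirror/<name> repositories are assumed to already exist in ECR):

    ECR=123456789012.dkr.ecr.us-east-1.amazonaws.com
    aws ecr get-login-password --region us-east-1 |
      docker login --username AWS --password-stdin "$ECR"

    # images-to-mirror.txt holds plain Docker Hub names, e.g. "nginx:1.25"
    while read -r image; do
      docker pull "$image"
      docker tag "$image" "$ECR/mirror/$image"
      docker push "$ECR/mirror/$image"
    done < images-to-mirror.txt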


> We can't go NIH for everything. If we do that we're back to baremetal in our own datacenters[...]

It's a bit of a leap from keeping copies of dependencies to building your own datacenter. Even the smallest startup can easily do the former.

> This is the tradeoff we made with the move to cloud.

To clarify, when I say keep local copies, I mean copies which are under local control (i.e. control of your organization). They may well still physically be in AWS somewhere. The key is that they can't be modified/deleted by some third party who doesn't report to your organization.

Yes, this assumes AWS is too big to fail, but for the typical startup whose entire existence is already dependent on their AWS account being available, this would not increase risk beyond what it already is. Whereas each additional hard dependency on third-party repos does increase risk.


> that's expensive and (comparatively) low velocity

The problem with this approach begins when the many parties your build depends upon all start to share it.


You can prototype without NIH and later go NIH when you have stuff to lose.


I run a local Ubuntu mirror for the work systems I manage, for this reason.
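
apt-mirror is one common way to run such a mirror; a minimal /etc/apt/mirror.list sketch (suites and paths are illustrative, not necessarily what I run):

    # Where mirrored packages land on the local server
    set base_path /var/spool/apt-mirror

    # Suites/components to mirror
    deb http://archive.ubuntu.com/ubuntu jammy main restricted universe multiverse
    deb http://archive.ubuntu.com/ubuntu jammy-updates main restricted universe multiverse
    deb http://security.ubuntu.com/ubuntu jammy-security main restricted universe multiverse

    clean http://archive.ubuntu.com/ubuntu

Client machines then point their sources.list at the internal host instead of archive.ubuntu.com.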



