I agree. Pixi solves all of those issues and is fully open source, including the packages from conda-forge.
Too bad there is nowadays confusion between Anaconda (the distribution that requires a license) and the FOSS pieces like conda-forge. Explain that to your legacy IT or procurement -.-
My variation is to use a custom script as `ProxyCommand` that resolves private Route 53 DNS names to instance IDs, because remembering instance IDs is insane.
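For anyone wondering what such a script can look like, here is a minimal sketch, assuming SSH-over-SSM and a private Route 53 hosted zone. The zone ID, record layout, and script name are placeholders, not the poster's actual setup, and it assumes plain A records (not aliases).

```python
#!/usr/bin/env python3
# Hypothetical ProxyCommand helper: resolve a private Route 53 name to an
# instance ID, then tunnel SSH through SSM. Example ~/.ssh/config usage:
#   Host *.internal.example
#       ProxyCommand route53-ssm-proxy %h %p
import os
import sys

import boto3

HOSTED_ZONE_ID = "Z0123456789EXAMPLE"  # placeholder private hosted zone


def main() -> None:
    host, port = sys.argv[1], sys.argv[2]

    # Look up the A record for the private name to get the instance's private IP.
    r53 = boto3.client("route53")
    records = r53.list_resource_record_sets(
        HostedZoneId=HOSTED_ZONE_ID,
        StartRecordName=host,
        StartRecordType="A",
        MaxItems="1",
    )["ResourceRecordSets"]
    ip = records[0]["ResourceRecords"][0]["Value"]

    # Map the private IP back to an instance ID.
    ec2 = boto3.client("ec2")
    reservations = ec2.describe_instances(
        Filters=[{"Name": "private-ip-address", "Values": [ip]}]
    )["Reservations"]
    instance_id = reservations[0]["Instances"][0]["InstanceId"]

    # Hand the connection off to SSM's SSH document
    # (requires the session-manager-plugin on the client).
    os.execvp("aws", [
        "aws", "ssm", "start-session",
        "--target", instance_id,
        "--document-name", "AWS-StartSSHSession",
        "--parameters", f"portNumber={port}",
    ])


if __name__ == "__main__":
    main()
```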
Mine is to run a Tailscale node on a tiny EC2 instance. It enables not only SSH but also direct access to database instances, S3 buckets that are blocked from public access, etc.
How are S3 buckets blocked from public access? I mean, I know there is literally a “Block public access” feature that keeps S3 buckets from being read or written by unauthenticated users. But as far as I know, without some really weird bucket ACLs, you can still access S3 buckets if you have the IAM credentials.
Before anyone well-actually’s me: yes, I know you can also route S3 traffic over the AWS internal network with VPC endpoints between AWS services.
Specifically the VPC endpoint (vpce) condition, as the other poster mentioned, but there are others, like IP limits.
Another way is an IdP that supports network or device rules. For instance, with Cloudflare Access and Okta you can add policies that only let you authenticate if you meet device or network requirements, which achieves the same thing.
> Specifically the VPC endpoint (vpce) condition, as the other poster mentioned, but there are others, like IP limits.
IP conditions don’t cut it to prevent public access: I can create my own personal AWS account with whatever private IP range I want and use the credentials from there. There’s really just VPC endpoints AFAIK.
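To make the VPC endpoint approach concrete, here is a minimal sketch of the kind of bucket policy this refers to, applied with boto3. The bucket name and vpce ID are placeholders, and in practice you would usually carve out exceptions for admin/break-glass roles so you don't lock yourself out.

```python
import json

import boto3

BUCKET = "example-private-bucket"   # placeholder bucket name
VPCE_ID = "vpce-0123456789abcdef0"  # placeholder VPC endpoint ID

# Deny all S3 access to the bucket unless the request arrives via the VPC endpoint.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyOutsideVpcEndpoint",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                f"arn:aws:s3:::{BUCKET}",
                f"arn:aws:s3:::{BUCKET}/*",
            ],
            "Condition": {"StringNotEquals": {"aws:SourceVpce": VPCE_ID}},
        }
    ],
}

boto3.client("s3").put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))
```

Because the Deny is keyed on the endpoint ID rather than an IP range, credentials used from some other account or network don't help.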
I run an EC2 instance with SSM enabled. I then use the AWS CLI to port forward into the 'private' database instance or whatever from my desktop. The nice thing about this is it's all native AWS stuff, no need for 3rd party packages, etc.
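For reference, the port-forwarding piece is the `AWS-StartPortForwardingSessionToRemoteHost` SSM document. A rough sketch of wrapping it from Python follows; it just shells out to the AWS CLI (which needs the Session Manager plugin installed), and the instance ID and database hostname are placeholders.

```python
import json
import subprocess

# Placeholders: the SSM-enabled instance and the private RDS endpoint behind it.
INSTANCE_ID = "i-0123456789abcdef0"
DB_HOST = "mydb.cluster-abc123.eu-west-1.rds.amazonaws.com"

# Forward localhost:5432 through the instance to the private database.
subprocess.run([
    "aws", "ssm", "start-session",
    "--target", INSTANCE_ID,
    "--document-name", "AWS-StartPortForwardingSessionToRemoteHost",
    "--parameters", json.dumps({
        "host": [DB_HOST],
        "portNumber": ["5432"],
        "localPortNumber": ["5432"],
    }),
], check=True)
```

While the session is open, you point your database client at localhost:5432 as if the instance were local.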
The API has mostly stabilized, and at this point, other than some minor errors (“groupby” vs “group_by”), LLMs seem to do pretty well with it. Personally, I’m glad they made the breaking changes they did for the long-term health of the library.
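Assuming the library in question is Polars (where `groupby` was renamed to `group_by`), the current spelling looks like this; the data is made up:

```python
import polars as pl

df = pl.DataFrame({"team": ["a", "a", "b"], "score": [1, 2, 3]})

# Current API uses group_by; the older groupby spelling was deprecated and removed.
totals = df.group_by("team").agg(pl.col("score").sum())
print(totals)
```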
Snowflake usually unloads data to an internal stage bucket in the same region as your Snowflake account. If you use an S3 gateway endpoint, getting that data is free of egress charges.
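If it helps, a gateway endpoint is just a route-table entry for S3, so there is no hourly or data-processing charge for it (unlike interface endpoints). A minimal boto3 sketch, with the region, VPC, and route table IDs as placeholders:

```python
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")  # same region as the stage bucket

# Create an S3 *gateway* endpoint and attach it to the route tables of the
# subnets that pull data from the Snowflake internal stage.
ec2.create_vpc_endpoint(
    VpcId="vpc-0123456789abcdef0",             # placeholder
    ServiceName="com.amazonaws.eu-west-1.s3",
    VpcEndpointType="Gateway",
    RouteTableIds=["rtb-0123456789abcdef0"],   # placeholder
)
```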
Yeah, finally a method to figure out the org ID of all those Snowflake internal stage buckets that Snowflake does not want to share for our VPC endpoint policies…
Hey, thanks for the question! Depot co-founder here.
We've optimized BuildKit for remote container builds with Depot. For example, we've added a different `--load` for pulling the image back that only pulls the layers that have actually changed between the build and what is already on the client. We've also done things like automatically supporting eStargz, adding the ability to `--push` and `--load` at the same time, and the ability to push to multiple registries in parallel.
We've removed saving/loading layer cache over the network. Instead, the BuildKit builder is ephemeral, and we orchestrate the cache across builds by persisting the layer cache to Ceph and reattaching it on the next build.
The largest speedup with Depot is that we build on native CPUs so that you can avoid emulation. We run native Intel and Arm builders with 16 CPUs and 32GB of memory inside of AWS. We also have the ability to run these builders in your own cloud account with a self-hosted data plane.
So the bulk of the speed comes from persisting the layer cache across builds with Ceph and from building on native CPUs. The optimized portions of BuildKit mostly help post-build at the moment. That said, we are working on some things in the middle of the build, related to BuildKit's DAG structure, that will also optimize the front of the build.
Seeing that reminded me of some healthy discussion in https://news.ycombinator.com/item?id=39235593 (SeaweedFS fast distributed storage system for blobs, objects, files and datalake) that may interest you. Ctrl-F for "Ceph" to see why it caught my eye.
Hopefully Depot will reply, but from my perspective it is mostly laid out on their homepage. They are comparing against builds in other CI products that use network-backed disks, virtualized hardware, and don’t keep a layer cache around. Depot provides fast hardware and disks and is good at making the layer cache available for subsequent builds.
You could likely get very similar performance by provisioning a single host with good hardware and simply leveraging the on-host cache.
I like pixi, but I am not likely to make the switch. It doesn't support pyproject.toml and other standards, which disqualifies it from being a potential "recommended tool" by the PyPA or whatever.
Have you compared it with poetry or pip-tools? I'm thinking of trying pixi but still can't muster up the energy to do it, especially since poetry and pip-tools cover most of my use case.