They must do this because they want SSDs to be in a physically separate part of the building for operational reasons; otherwise, what's the point of giving you a "local" SSD that isn't actually plugged into the real machine?
The comment you’re responding to is wrong. AWS offers many kinds of storage. Instance-local storage is physically attached to the host. EBS isn’t, but that’s a separate thing entirely.
The reason for having most instances use network storage is that it makes it possible to migrate instances to other hosts. If the host fails, the network storage can be attached to a new host with a reboot. AWS regularly sends out notices when it is going to reboot or migrate instances.
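For illustration, this is roughly what that re-pointing looks like if done by hand with boto3 (a minimal sketch; the volume and instance IDs are placeholders):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

VOLUME_ID = "vol-0123456789abcdef0"   # hypothetical EBS volume
NEW_INSTANCE = "i-0fedcba9876543210"  # hypothetical replacement host

# Detach the volume from the failed instance (Force is needed when the
# host is unreachable), wait until it's free, then attach it to the new
# instance. The data survives because it lives on network storage.
ec2.detach_volume(VolumeId=VOLUME_ID, Force=True)
ec2.get_waiter("volume_available").wait(VolumeIds=[VOLUME_ID])
ec2.attach_volume(VolumeId=VOLUME_ID, InstanceId=NEW_INSTANCE, Device="/dev/sdf")
```

In practice AWS does the equivalent for you when it migrates an EBS-backed instance; none of this is possible with instance-local disks.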
There probably should be more local instance storage types for use with instances that can be recreated without data loss. But it's simpler for them to have a single way of doing things.
At work, someone used fast NVMe instance storage for ClickHouse, which is a database. It was a huge hassle to copy the data off whenever instances were going to be restarted, because the data would otherwise be lost.
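That copy step amounts to something like the following (a minimal sketch, assuming the data lives under a hypothetical /mnt/nvme/clickhouse mount and gets evacuated to a made-up S3 bucket):

```python
import os
import boto3

s3 = boto3.client("s3")

DATA_DIR = "/mnt/nvme/clickhouse"    # hypothetical instance-store mount
BUCKET = "my-clickhouse-evacuation"  # hypothetical S3 bucket

# Walk the instance-store mount and upload every file to S3 before the
# scheduled restart, since the local NVMe contents won't survive it.
for root, _dirs, files in os.walk(DATA_DIR):
    for name in files:
        path = os.path.join(root, name)
        key = os.path.relpath(path, DATA_DIR)
        s3.upload_file(path, BUCKET, key)
```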
Sure, I understand that, but this user is claiming that on GCP even local SSDs aren't really local, which raises the question of why not.
I suspect the answer is something to do with their manufacturing processes and rack designs. When I worked there (pre-GCP), machines had only a tiny disk used for booting, and they wanted to get rid of even that. Storage was handled by "diskful" machines that had dedicated trays of HDDs connected to their motherboards. If your datacenters and manufacturing processes are optimized for building machines that are either compute or storage but not both, the more conventional cloud model may be hard to support, and that pushes you towards aggregating storage even for "local" SSD.
> At work, someone used fast NVMe instance storage for ClickHouse, which is a database. It was a huge hassle to copy the data off whenever instances were going to be restarted, because the data would otherwise be lost.
We moved to running ClickHouse on EKS with EBS volumes for storage. It survives instances going down much better. I didn't work on it, so I don't know how much slower it is; lowering the management burden was the big priority.
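For a sense of what the EBS-backed setup looks like, here is a minimal sketch using the official kubernetes Python client; the claim name, namespace, and "gp3" StorageClass are assumptions about the cluster, not details from the actual deployment:

```python
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

# A PersistentVolumeClaim backed by an EBS gp3 volume (assuming the
# cluster's EBS CSI driver exposes a "gp3" StorageClass). A ClickHouse
# pod mounting this claim keeps its data when a node dies, because the
# volume re-attaches to whichever node the pod lands on next.
pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="clickhouse-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="gp3",
        resources=client.V1ResourceRequirements(requests={"storage": "500Gi"}),
    ),
)
core.create_namespaced_persistent_volume_claim(namespace="default", body=pvc)
```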
Reboot keeps the instance storage volumes; stopping and starting wipes them, because a start frequently migrates the instance to a new host. And the "restart" notices AWS sends are likely because the host has a problem and they need to migrate it.
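In API terms the distinction looks like this (minimal boto3 sketch; the instance ID is a placeholder):

```python
import boto3

ec2 = boto3.client("ec2")
INSTANCE = "i-0123456789abcdef0"  # hypothetical instance ID

# Reboot: same host, instance-store (ephemeral NVMe) data is kept.
ec2.reboot_instances(InstanceIds=[INSTANCE])

# Stop + start: the instance usually comes back on a different host,
# so anything on instance-store volumes is gone. Only EBS survives.
ec2.stop_instances(InstanceIds=[INSTANCE])
ec2.get_waiter("instance_stopped").wait(InstanceIds=[INSTANCE])
ec2.start_instances(InstanceIds=[INSTANCE])
```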