
The calls per server are probably not the difficult part; this is the type of scale where you start hitting much, much harder problems, e.g.:

- Load balancing across regions [0] without significant latency overhead

- Service-to-service mesh/discovery that scales sub-linearly with the number of servers, i.e. less than O(# of servers) per client (a subsetting sketch is at the end of this comment)

- Reliable processing (retries without triggering retry storms, hedged requests for tail latency; sketched below)

- Coordinating error handling and observability

All without the engineers who actually write the functions needing to know anything about the system (which requires airtight reliability, abstractions, and observability).
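To make the retry-storm and hedging point concrete, here is a minimal sketch (in Go, with illustrative names and numbers, not taken from any particular framework) of the two usual client-side policies: a retry budget plus jittered exponential backoff, so retries stay a bounded fraction of traffic and can't amplify an outage, and a hedged request that fires one duplicate if the first attempt is slow and takes whichever reply lands first.

```go
package reliabilitysketch

import (
	"context"
	"math/rand"
	"sync/atomic"
	"time"
)

// retryBudget permits retries only while they stay a small fraction of total
// requests, so a downstream outage is not amplified into a retry storm.
type retryBudget struct {
	requests atomic.Int64
	retries  atomic.Int64
	ratio    float64 // e.g. 0.1 allows retries up to 10% of request volume
}

func (b *retryBudget) allowRetry() bool {
	if float64(b.retries.Load()) >= b.ratio*float64(b.requests.Load()) {
		return false
	}
	b.retries.Add(1)
	return true
}

// callWithRetries retries a failing call with capped exponential backoff and
// full jitter, but only while the shared budget allows it.
func callWithRetries(ctx context.Context, b *retryBudget,
	call func(context.Context) (string, error)) (string, error) {
	const maxAttempts = 3
	base := 50 * time.Millisecond
	b.requests.Add(1)
	var lastErr error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		resp, err := call(ctx)
		if err == nil {
			return resp, nil
		}
		lastErr = err
		if attempt == maxAttempts-1 || !b.allowRetry() {
			break
		}
		// Full jitter: sleep a random duration in [0, base*2^attempt).
		sleep := time.Duration(rand.Int63n(int64(base << attempt)))
		select {
		case <-time.After(sleep):
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
	return "", lastErr
}

// hedgedCall fires the request, fires one duplicate if no reply has arrived
// within hedgeAfter (typically a tail-latency target like p95), and returns
// whichever reply comes back first.
func hedgedCall(ctx context.Context, hedgeAfter time.Duration,
	call func(context.Context) (string, error)) (string, error) {
	type result struct {
		resp string
		err  error
	}
	ch := make(chan result, 2) // buffered so the losing attempt never blocks
	start := func() {
		go func() {
			r, err := call(ctx)
			ch <- result{r, err}
		}()
	}
	start()
	select {
	case r := <-ch:
		return r.resp, r.err
	case <-time.After(hedgeAfter):
		start() // first attempt is slow: hedge with one duplicate
	case <-ctx.Done():
		return "", ctx.Err()
	}
	select {
	case r := <-ch:
		return r.resp, r.err
	case <-ctx.Done():
		return "", ctx.Err()
	}
}
```

At this scale both of these policies usually live in the RPC layer or a sidecar rather than in the functions themselves, which is exactly the "engineers shouldn't need to know" property above.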

I don't mean to comment on whether this is impressive or not; I'm just pointing out that per-server throughput would never be the difficult part of reaching this scale.

[0] And apparently for this system, load balancing across time, which is at least a mildly interesting way of thinking about it
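For the sub-linear discovery/fan-out point above, one standard trick is deterministic subsetting (described in the Google SRE book): each client connects to a fixed-size subset of backends rather than to all of them, so connection count per client stays O(subset size) instead of O(# of servers), while the deterministic shuffle keeps load roughly even across backends. A minimal Go sketch, assuming the subset size divides the backend count evenly:

```go
package subsettingsketch

import "math/rand"

// subsetBackends returns the backend indices a given client should connect
// to, so each client holds O(subsetSize) connections instead of one to every
// backend. This follows the deterministic-subsetting idea from the Google SRE
// book; the parameter names here are illustrative.
//
// Assumes subsetSize evenly divides backendCount; a real implementation would
// also handle the remainder and changes to the backend list.
func subsetBackends(backendCount, clientID, subsetSize int) []int {
	subsetsPerRound := backendCount / subsetSize
	round := clientID / subsetsPerRound
	// Every client in the same round sees the same deterministic shuffle, so
	// between them they cover all backends exactly once per round and load
	// stays roughly even.
	r := rand.New(rand.NewSource(int64(round)))
	perm := r.Perm(backendCount)
	start := (clientID % subsetsPerRound) * subsetSize
	subset := make([]int, subsetSize)
	copy(subset, perm[start:start+subsetSize])
	return subset
}
```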


