It's always a matter of balancing bottlenecks. Here the bottleneck is memory bandwidth, which limits the machine to (more or less) 32 lanes' worth of network traffic. The platform has 128 lanes, so using more lanes than strictly needed, each at a slower rate, works and (probably) saves a bit of cost. Their Intel Ice Lake test machine had only 64 lanes, which is a bottleneck of its own, so they used Gen4 NVMe to fit the needed storage bandwidth into the lanes available.
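To put rough numbers on the lane trade-off, here's a back-of-the-envelope sketch. It assumes the lanes in question are PCIe lanes and uses the nominal per-lane rates (8 GT/s for Gen3, 16 GT/s for Gen4, both with 128b/130b encoding), ignoring packet and protocol overhead. The 50 GB/s target is a made-up illustrative figure, not something measured on either machine, and `lane_gbytes_per_s` and `lanes_needed` are just names I picked.

```python
import math

# Nominal raw transfer rate per lane in GT/s (one direction).
GT_PER_S = {3: 8.0, 4: 16.0}
# Gen3 and Gen4 both use 128b/130b line encoding.
ENCODING = 128 / 130


def lane_gbytes_per_s(gen: int) -> float:
    """Approximate GB/s per lane, one direction, before protocol overhead."""
    return GT_PER_S[gen] * ENCODING / 8


def lanes_needed(target_gbytes_per_s: float, gen: int) -> int:
    """Smallest whole number of lanes that covers the target bandwidth."""
    return math.ceil(target_gbytes_per_s / lane_gbytes_per_s(gen))


# Hypothetical storage bandwidth target, for illustration only.
target = 50.0

for gen in (3, 4):
    per_lane = lane_gbytes_per_s(gen)
    print(f"Gen{gen}: {per_lane:.3f} GB/s per lane, "
          f"{lanes_needed(target, gen)} lanes for {target:.0f} GB/s")
```

The point is just that Gen4 halves the lane count for the same bandwidth. That doesn't matter much when you have 128 lanes to spare, and it matters a lot when you only have 64 and the network needs a big share of them.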