AFAIK part of the problem with Rust is also that it compiles crates individually before linking them, so it cannot use upfront knowledge of what is actually going to be needed. As a result, a generic function that crosses a crate boundary is going to be handled twice by the compiler.
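To make the generic-function point concrete, here is a minimal sketch of how I understand it. The modules `upstream`, `crate_a` and `crate_b` are hypothetical stand-ins for three separately compiled crates. In a real multi-crate build the body of `largest` cannot be turned into machine code when `upstream` is compiled, because `T` is not known yet, so each downstream crate that uses it may end up instantiating it on its own.

```rust
// Stand-in for a library crate (hypothetical name `upstream`).
mod upstream {
    /// Generic function: no machine code can be emitted for it until a
    /// concrete `T` is known, which only happens at the call site.
    pub fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
        let mut iter = items.iter().copied();
        let first = iter.next()?;
        Some(iter.fold(first, |best, x| if x > best { x } else { best }))
    }
}

// Stand-in for downstream crate A.
mod crate_a {
    pub fn max_id(ids: &[u32]) -> Option<u32> {
        // Needs an instantiation of largest::<u32>.
        super::upstream::largest(ids)
    }
}

// Stand-in for downstream crate B.
mod crate_b {
    pub fn max_port(ports: &[u32]) -> Option<u32> {
        // Needs largest::<u32> again; as a separate crate, B would not
        // know that A already asked for the same thing.
        super::upstream::largest(ports)
    }

    pub fn max_ratio(ratios: &[f64]) -> Option<f64> {
        // A different instantiation: largest::<f64>.
        super::upstream::largest(ratios)
    }
}

fn main() {
    println!("{:?}", crate_a::max_id(&[3, 9, 4]));
    println!("{:?}", crate_b::max_port(&[80, 443]));
    println!("{:?}", crate_b::max_ratio(&[0.5, 0.25]));
}
```

Inside a single file these are just modules of one crate, so this only shows the shape of the problem. The duplicated work would appear once each module lives in its own crate.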
This was initially done so that Rust could compile crates in parallel by spawning more rustc processes, which is obviously much easier than building a parallel compiler directly, but in the end it's suboptimal for performance.
Rust suffers because it compiles everything from source, and the frontend sends piles of unprocessed LLVM IR to the traditionally slow backend.
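As a rough illustration of where some of that IR volume comes from, and of one source-level way people shrink it: a generic function gets one copy of its whole body per concrete type, but if the real work is moved into a non-generic inner function, only a thin wrapper is duplicated. The function name `word_count` is made up for this sketch.

```rust
/// Thin generic wrapper: one small copy is generated per concrete `S`.
pub fn word_count<S: AsRef<str>>(text: S) -> usize {
    // Non-generic worker: generated once, however many `S` types are used.
    fn inner(text: &str) -> usize {
        text.split_whitespace().count()
    }
    inner(text.as_ref())
}

fn main() {
    // Two different `S` types (&str and String) share the same `inner`.
    println!("{}", word_count("a b  c"));
    println!("{}", word_count(String::from("hello world")));
}
```

This only trims how much IR the frontend hands over. The backend that has to chew through it stays as slow as before.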
This can be improved with better tooling: one example is the Cranelift backend, and there could also be an interpreter, and so on.
Examples of languages with similar polymorphic power that don't send compile times to the moon: Standard ML, OCaml, Haskell, D, Ada.