No, there is hardware for it, and it makes a big difference. Ballpark 2x, but it can be more or less depending on the details of the workload (i.e., shader complexity).
One way to get an empirical handle on this question is to write a rasterization pipeline entirely in software and run it in GPU compute. The classic Laine and Karras paper, "High-Performance Software Rasterization on GPUs," does exactly that.
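To make "rasterization in compute" concrete, here's a toy sketch (not the Laine-Karras pipeline, which as I recall builds a full binning/tiling design with careful load balancing): one CUDA thread per pixel evaluates the three edge functions for a single triangle and shades it. All the names (Tri, edge, raster) are mine, purely for illustration.

```cuda
// Toy software rasterizer in a compute kernel: one thread per pixel,
// edge-function inside test for a single triangle. Illustrative only.
#include <cstdio>
#include <cuda_runtime.h>

struct Tri { float2 a, b, c; };

// Signed area term for edge (a -> b) evaluated at point p.
__device__ float edge(float2 a, float2 b, float2 p) {
    return (p.x - a.x) * (b.y - a.y) - (p.y - a.y) * (b.x - a.x);
}

// Each thread tests its pixel center against all three edges and shades it.
__global__ void raster(Tri t, unsigned char* fb, int w, int h) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;
    float2 p = make_float2(x + 0.5f, y + 0.5f);   // sample at pixel center
    bool inside = edge(t.a, t.b, p) >= 0 &&
                  edge(t.b, t.c, p) >= 0 &&
                  edge(t.c, t.a, p) >= 0;
    fb[y * w + x] = inside ? 255 : 0;             // trivial "shader"
}

int main() {
    const int w = 64, h = 64;
    unsigned char* fb;
    cudaMallocManaged(&fb, w * h);
    // Vertices wound so the edge functions are non-negative inside.
    Tri t = { {8, 8}, {24, 56}, {56, 16} };
    dim3 block(16, 16), grid((w + 15) / 16, (h + 15) / 16);
    raster<<<grid, block>>>(t, fb, w, h);
    cudaDeviceSynchronize();
    for (int y = 0; y < h; y += 4) {              // coarse ASCII preview
        for (int x = 0; x < w; x += 2) putchar(fb[y * w + x] ? '#' : '.');
        putchar('\n');
    }
    cudaFree(fb);
}
```

Even in this trivial form you can see where the costs go: every covered and uncovered pixel burns a thread, there's no hierarchical culling, and attribute interpolation, depth test, and blending would all be more software on top. The hardware rasterizer does that bookkeeping in fixed function, which is roughly where the ballpark 2x comes from.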
An intriguing thought experiment is to imagine a stripped-down, greatly simplified GPU that is much more a highly parallel CPU than a traditional graphics architecture. This is, to some extent, what Tim Sweeney was talking about (11 years ago now!) in his provocative talk "The end of the GPU roadmap". My personal sense is that such a thing would indeed be possible, but it would be a performance regression on the order of 2x, which would not fly in today's competitive world. Still, if one were trying to spin up a GPU effort from scratch (say, motivated more by national independence than by cost/performance competitiveness), it would be an interesting place to start.