This is what I'm fiddling with. My 2080 Ti is not quite enough to make it viable: the small models fail too often, so I need larger Whisper and LLM models.
The 4060 Ti, for example, would have been a nice fit if it weren't for its narrow 128-bit memory bus: LLM token generation is memory-bandwidth-bound, so it's actually slower than my 2080 Ti (352-bit bus) for inference.
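For a rough sense of why, here's a back-of-envelope sketch. It assumes single-batch decode speed is capped by reading the full weights once per generated token; the bandwidth figures are the published specs, and the ~8 GB model size is just an illustrative pick:

```python
# Back-of-envelope: single-batch LLM decode is roughly memory-bandwidth-bound,
# since each generated token needs one full pass over the weights.

GB = 1e9

cards = {
    "RTX 2080 Ti": 616 * GB,  # 352-bit GDDR6, published spec
    "RTX 4060 Ti": 288 * GB,  # 128-bit GDDR6, published spec
}

model_bytes = 8 * GB  # illustrative, e.g. a ~13B model at 4-5 bit quantization

for name, bandwidth in cards.items():
    # Upper bound on decode speed: bytes/s available / bytes read per token.
    print(f"{name}: ~{bandwidth / model_bytes:.0f} tokens/s ceiling")
```

So on paper the older card has roughly double the decode ceiling, compute improvements notwithstanding.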
A more expensive card is hard to justify leaving idle in my server, and my gaming card is at times busy gaming.