Just curious (I have no idea how GPU stats influence neural network benchmarks): would slapping a 1080 Ti alongside my 3060 Ti gain me anything? Can I 'cluster' VRAM for better performance? Can we top ~5x transcription speeds with more VRAM?
I'm open to the idea of buying an additional old-gen GPU that offers a good price/VRAM ratio.
> I have no idea how GPU stats influence neural network benchmarks
I don’t have any idea either; I don’t do ML professionally. At my day job I use the same tech (C++, SSE and AVX SIMD, DirectCompute) for a CAM/CAE application.
> would slapping a 1080ti alongside my 3060ti gain me anything
In the current version of my library, you’ll gain very little. You’ll probably get the same performance as on my computer.
I think it should be technically possible to split the work across multiple GPUs. The most expensive compute shaders in that library, by far, are the ones computing matrix×matrix products. As long as each GPU has enough VRAM to fit both input matrices, the problem is parallelizable; see the sketch after the next paragraph.
However, that’s a lot of work, and not something I’m willing to do within the scope of this project. Also, if you have multiple input streams to transcribe, you’ll get better overall throughput by processing these streams in parallel on different GPUs.
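To illustrate why the product splits so cleanly (a minimal CPU-threaded sketch, not the library’s actual DirectCompute shaders; the two threads stand in for two GPUs, and the matrix sizes are made up):

```cpp
#include <cstdio>
#include <thread>
#include <vector>

// Sketch: C = A * B, row-major, split by output columns.
// Each "device" needs the full A and B (just like each GPU would need
// both input matrices in its VRAM), but only writes its own slice of C,
// so the halves need no synchronization until both are done.
static void matmulSlice( const float* a, const float* b, float* c,
    size_t m, size_t k, size_t n, size_t colBegin, size_t colEnd )
{
    for( size_t i = 0; i < m; i++ )
        for( size_t j = colBegin; j < colEnd; j++ )
        {
            float acc = 0.0f;
            for( size_t p = 0; p < k; p++ )
                acc += a[ i * k + p ] * b[ p * n + j ];
            c[ i * n + j ] = acc;
        }
}

int main()
{
    constexpr size_t m = 64, k = 64, n = 64;
    std::vector<float> a( m * k, 1.0f ), b( k * n, 2.0f ), c( m * n );

    // Pretend each thread is a separate GPU computing half of the columns
    std::thread gpu0( matmulSlice, a.data(), b.data(), c.data(), m, k, n, size_t( 0 ), n / 2 );
    std::thread gpu1( matmulSlice, a.data(), b.data(), c.data(), m, k, n, n / 2, n );
    gpu0.join();
    gpu1.join();

    printf( "c[0] = %g (expected %g)\n", c[ 0 ], (float)( 2 * k ) );
    return 0;
}
```

On real GPUs, the hard part is everything around that loop: uploading both inputs to each device, scheduling the dispatches, and gathering the output slices back together.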
> I'm open to the idea of buying an additional old gen GPU that nails a good price/VRAM ratio
If you can, try a Radeon RX 6700 XT, or a better one from that table: https://en.wikipedia.org/wiki/Radeon_RX_6000_series#Desktop The VRAM bandwidth figure is “only” 384 GB/s, but the GPU has 96 MB of L3 cache, which might make a difference for these compute shaders. That’s pure theory though; I haven’t tested on such GPUs. If you do, make sure to play with the comboboxes in the “Advanced GPU Settings” dialog of the desktop example.
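For a rough sense of scale (assuming the large Whisper model with FP16 weights; the d_model = 1280 and FFN dimension = 5120 figures come from the Whisper paper, not from profiling my library): a single 1280×5120 weight matrix takes 1280 × 5120 × 2 bytes ≈ 12.5 MB, so a few of the hot matrices could plausibly stay resident in a 96 MB cache instead of being re-fetched from VRAM on every product. Whether the driver actually keeps them there, I don’t know.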