It probably means you flipped the combobox on the first screen. In the build on github, the only included model implementation is GPU. The other two implementations are disabled with macros, there: https://github.com/Const-me/Whisper/blob/1.1.0/Whisper/stdaf... These implementations are lacking some UX features like callbacks and cancellation, and I haven't tested them for a while, but they might still work.
> does this use both GPU and CPU simultaneously?
No, it's sequential. There's a data dependency between these two stages. The encode function computes some buffers (probably called "cross attention" but I'm not sure, not an ML expert), and then the decode function needs that data to generate the output text.
When I run this, it succeeds fine but I get the message "This build of the DLL doesn’t implement the reference CPU-running Whisper model."
What does this mean? Also very interested in the hybrid option, how do you get this working, and does this use both GPU and CPU simultaneously?