This is pretty decent, but a bit slow on my M2 Pro. Runs better on CPU, which is strange.

Still, here's a quick guide to getting it to work on Metal:

    --requirements.txt additions--
    torchvision==0.18.0
    accelerate==0.30.1

    --gpu_utils.py patch--
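    import logging
    import torch

    def select_device(min_memory=2048):  # min_memory is not checked on the MPS path
        logger = logging.getLogger(__name__)
        if torch.backends.mps.is_available():
            logger.info("Using Metal (MPS)")
            return torch.device('mps')
        return torch.device('cpu')  # assumed fallback so this never returns None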

Could probably do with device_map support for multiple backends...
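
For what it's worth, here's a rough sketch of what picking between several backends might look like. The names `pick_backend` and `detect_backends` are made up for illustration and are not from the repo, and this only chooses a single device name in priority order; it does not implement device_map itself.

```python
def pick_backend(available, preferred=("cuda", "mps", "cpu")):
    """Return the first backend in `preferred` that is marked available."""
    for name in preferred:
        if available.get(name, False):
            return name
    # Nothing matched: CPU is the safe default.
    return "cpu"


def detect_backends():
    """Ask PyTorch which backends are usable; report CPU only if torch is missing."""
    try:
        import torch
    except ImportError:
        return {"cpu": True}
    return {
        "cuda": torch.cuda.is_available(),
        "mps": torch.backends.mps.is_available(),
        "cpu": True,
    }


if __name__ == "__main__":
    # Prints the chosen backend name for the current machine.
    print(pick_backend(detect_backends()))
```

The returned string can be passed straight to `torch.device(...)`. Since CPU was faster for me, passing `preferred=("cpu",)` would be an easy way to force it while comparing.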

Edit: it also seems to hallucinate/become increasingly unreliable with longer sentences.