I've been following open-source LLMs for a while, and at first glance this doesn't seem too powerful compared to other open models. Flan-Alpaca[0] is licensed under Apache 2.0, and it seems to perform much better, although I'm not sure about the legality of that licensing, since it's basically Flan-T5 fine-tuned on the Alpaca dataset (which is under a non-commercial license).
Nonetheless, it's exciting to see all these open models popping up, and I hope that an LLM equivalent of Stable Diffusion comes sooner rather than later.
Sounds like you might be the right person to ask the “big” question.
For a small organization or individual who is technically competent and wants to try self-hosted inference: which open model is showing the most promise, and how do its results compare to the various OpenAI GPTs?
A simple example problem would be asking for a summary of code. I've found OpenAI's GPT-3.5 and GPT-4 to give pretty impressive English descriptions of code. Running that locally in batch would preserve privacy, and even if slow it could just be kept running.
Google's Flan-T5, Flan-UL2, and derivatives are so far the most promising open (including for commercial use) models that I have tried; however, they are very general-purpose and don't perform well on specific tasks like code understanding or generation. You could fine-tune Flan-T5 with a dataset that suits your specific task and get much better results, as shown by Flan-Alpaca.
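To make the "run it locally in batch" idea concrete, here's a minimal sketch of local code summarization with a Flan-T5 checkpoint via the Hugging Face `transformers` library. It uses `flan-t5-small` only to keep the sketch light; the larger `-large`/`-xl` checkpoints give noticeably better descriptions. The prompt wording and `summarize` helper are my own, not anything from a specific project:

```python
# Minimal sketch: summarizing code locally with a Flan-T5 checkpoint.
# Assumes the `transformers` and `torch` packages are installed.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-small"  # small for the sketch; -large/-xl do much better
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def summarize(code: str) -> str:
    # Flan models are instruction-tuned, so a plain natural-language prompt works.
    prompt = "Explain what this code does:\n" + code
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(summarize("def add(a, b):\n    return a + b"))
```

Since everything stays on your machine, you can point this at a whole repo overnight without any privacy concerns.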
Sadly, there's no open model yet that acts like a Swiss Army knife and gets good-enough results across multiple use cases.
Sorry for the late reply. As I said, Flan-UL2 (or Flan-T5 if you want lighter models) fine-tuned on a dataset like CodeAlpaca's[0] is probably the best option if it's intended for commercial use (otherwise LLaMA should perform better).
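That fine-tuning step can be sketched with a bare-bones PyTorch loop; the single instruction/response pair below is made up purely for illustration, and a real run would iterate over the full CodeAlpaca-style dataset for several epochs with batching:

```python
# Sketch of seq2seq fine-tuning a Flan-T5 checkpoint on instruction data.
# Assumes `transformers` and `torch`; the training pair is illustrative only.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-small"  # stand-in for Flan-T5/Flan-UL2
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

pairs = [
    ("Explain what this code does: def add(a, b): return a + b",
     "It defines a function that returns the sum of two numbers."),
]

model.train()
for instruction, response in pairs:
    inputs = tokenizer(instruction, return_tensors="pt", truncation=True)
    labels = tokenizer(response, return_tensors="pt", truncation=True).input_ids
    # The model computes cross-entropy loss over the response tokens itself.
    loss = model(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In practice you'd use something like `Seq2SeqTrainer` for padding, batching, and checkpointing, but the loop above is the core of what it does.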
Their goal isn't to make a powerful model. It's to show how well compute-optimal models do on test loss as a function of increasing model size. That function can then be used, with some caveats, to forecast the test loss of larger models, for which compute-optimality becomes more important.
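The forecasting idea amounts to a power-law fit in log-log space; a quick sketch with NumPy (the size/loss numbers here are made up purely for illustration, not from any paper):

```python
# Sketch: extrapolating test loss from smaller compute-optimal runs.
# The (size, loss) points are invented for illustration only.
import numpy as np

sizes = np.array([1e7, 1e8, 1e9])    # parameter counts of hypothetical runs
losses = np.array([4.0, 3.2, 2.6])   # their measured test losses (made up)

# Fit loss ≈ a * size^b by linear regression in log-log space.
b, log_a = np.polyfit(np.log(sizes), np.log(losses), 1)

# Extrapolate to a hypothetical 10B-parameter model.
forecast = np.exp(log_a) * (1e10) ** b
```

The caveats matter: the fit assumes the power-law trend continues, and real scaling curves can bend once data or compute stops being the binding constraint.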
[0]: https://github.com/declare-lab/flan-alpaca