Yeah, I honestly think some of the language used around LoRA gets in the way of people understanding it. It becomes much easier to understand when you look at an actual implementation, and see how the adapters can be merged into the base weights or kept separate.
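For example, a minimal sketch of the core idea (PyTorch-flavoured; names like LoRALinear are mine, not any particular library's API):

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        # Frozen pretrained linear layer plus a trainable low-rank update:
        #   y = x W^T + (x A^T B^T) * (alpha / rank)
        def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False  # pretrained weights stay frozen
            # Only these two small matrices are trained.
            self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init => starts as a no-op
            self.scale = alpha / rank

        def forward(self, x):
            return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

        def merge(self):
            # Fold B @ A into the base weight: same outputs, zero extra inference cost.
            with torch.no_grad():
                self.base.weight += (self.B @ self.A) * self.scale
            return self.base

Keeping A and B separate lets you hot-swap adapters on one base model; calling merge() bakes the update into the weights so serving code doesn't even know LoRA was involved.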
Create "packages" of context for popular API & Libraries - make them freely available via public url
Keep the packages up to date. They'll be newer than the cutoff date for many models, and they'll be cleaner than the data a model slurps in using web search.
Voila, you're now the trusted repository of context for the developer community.
You'll get a lot of inbound traffic/leads, and ideas for tangential use cases with commercial potential.
You can't start fresh chats with updated context, and you still need to create multiple chats in your preferred chat service and copy-paste data. But this is SO good to use in a development environment through docs or MCP! Thanks for sharing.
To keep it free and avoid forwarding credentials, I built an alternative: all-in-one chatting without auth, plus access to a search API through the web:
https://llmcouncil.github.io/llmcouncil/
It provides a simple interface to chat with Gemini, Claude, Grok, OpenAI, and DeepSeek in parallel.
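The fan-out itself is just concurrent requests. A toy sketch of that part (ask_provider is a stub; each vendor's real HTTP API has its own request shape):

    import asyncio

    async def ask_provider(name: str, prompt: str) -> tuple[str, str]:
        # Stub: a real client would call each vendor's own HTTP API here.
        await asyncio.sleep(0.1)  # simulate network latency
        return name, f"[{name}] answer to: {prompt!r}"

    async def council(prompt: str) -> None:
        models = ["gemini", "claude", "grok", "openai", "deepseek"]
        # Send the same prompt to every model at once; gather keeps the order.
        results = await asyncio.gather(*(ask_provider(m, prompt) for m in models))
        for name, answer in results:
            print(f"{name}: {answer}")

    asyncio.run(council("What is a LoRA adapter?"))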
I love this. A ready-to-use library of context data can be very useful, and at the same time it's a perfect way to bring in new users. Thanks so much.
"reinforcement learning from human feedback .. is designed to optimize machine learning models in domains where specifically designing a reward function is hard"
I think if you are able to define a reward function then it sort of doesn’t matter how hard it was to do that - if you can’t then RLHF is your only option.
For example, say you're building a chess AI that you're going to train with reinforcement learning, AlphaZero-style. No matter how fancy the logic you employ to build the AI itself, it's really easy to make a reward function: "Did it win the game?"
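The entire reward function fits in a few lines (a sketch of the idea, obviously not AlphaZero's actual code):

    def reward(result: str) -> float:
        # Terminal reward: the only training signal is the game's outcome.
        return {"win": 1.0, "draw": 0.0, "loss": -1.0}[result]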
On the other hand, if you're making an AI to write poetry, it's hard (maybe impossible) to come up with an objective function to judge the output, so you use RLHF.
In lots of cases the whole design springs from the fact that it's hard to make a suitable reward function (GANs for generating realistic faces are the classic example). What makes an image of a face realistic? Goodfellow's idea was to have two nets: one that tries to generate images, and one that tries to discern which images are fake and which are real. Now the reward functions are easy: the generator gets rewarded for producing images good enough to fool the classifier, and the classifier gets rewarded for spotting which images are fake and which are real.
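A toy sketch of those two opposed objectives (random vectors standing in for images, not a real face model):

    import torch
    import torch.nn as nn

    G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 32))  # noise -> fake sample
    D = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))   # sample -> real/fake logit
    bce = nn.BCEWithLogitsLoss()
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

    real = torch.randn(8, 32)   # stand-in for a batch of real data
    noise = torch.randn(8, 16)

    # Discriminator's "reward": score real samples as 1 and generated ones as 0.
    fake = G(noise).detach()
    loss_d = bce(D(real), torch.ones(8, 1)) + bce(D(fake), torch.zeros(8, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator's "reward": produce samples the discriminator scores as real.
    loss_g = bce(D(G(noise)), torch.ones(8, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()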
I don't know about ffslice, but you can get frame-perfect slicing with minimal reencoding via LosslessCut's experimental "smart cut" feature[2] or Smart Media Cutter's[3] smartcut[4].
For some reason, when ffmpeg re-encodes 23.976fps H.264 to the same fps and codec, the result looks choppy, as if the shutter speed was halved or something. The smart lossless cutting you mentioned helps a lot here.
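For reference, the general shape of a smart cut looks something like this (a rough Python sketch of the concept, not LosslessCut's actual implementation; in practice the re-encoded head segment has to match the source's codec parameters closely or the concat will glitch):

    import subprocess

    SRC, START, END = "input.mp4", 10.0, 30.0

    # List keyframe timestamps (skip_frame=nokey makes ffprobe decode keyframes only).
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-skip_frame", "nokey", "-show_entries", "frame=pts_time",
         "-of", "csv=p=0", SRC],
        capture_output=True, text=True, check=True).stdout
    keyframes = []
    for line in out.splitlines():
        tok = line.strip().rstrip(",")
        if tok:
            try:
                keyframes.append(float(tok))
            except ValueError:
                pass
    kf = next(t for t in keyframes if t >= START)  # first keyframe at/after the cut

    # Re-encode only the short head segment (START up to that keyframe)...
    subprocess.run(["ffmpeg", "-y", "-ss", str(START), "-i", SRC, "-t", str(kf - START),
                    "-c:v", "libx264", "-c:a", "copy", "head.mp4"], check=True)
    # ...then stream-copy the rest, which starts on a keyframe, so no re-encode is needed.
    subprocess.run(["ffmpeg", "-y", "-ss", str(kf), "-i", SRC, "-t", str(END - kf),
                    "-c", "copy", "tail.mp4"], check=True)

    # Join the two pieces with the concat demuxer.
    with open("list.txt", "w") as f:
        f.write("file 'head.mp4'\nfile 'tail.mp4'\n")
    subprocess.run(["ffmpeg", "-y", "-f", "concat", "-safe", "0",
                    "-i", "list.txt", "-c", "copy", "out.mp4"], check=True)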