There will come a time when computing is so cheap that ads will be injected into the stream in such a way that it is impossible to remove them without a real-time AI detector that identifies the parts where the ads are.
The problem YouTube has is that it wants to make sure you can't skip the ad, so it has to signal in some way to its front end and app that this segment is an ad. That mechanism can and will be used by other clients to skip it. They can put the ads in-band in the video stream like Twitch does, but if the ads are genuinely indistinguishable from the content, then they can also be fast-forwarded and skipped just like the content.
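To make that concrete, here is a minimal sketch of what a third-party client could do once it can read the same ad marker the official player relies on. The `Segment` type and its `is_ad` flag are hypothetical stand-ins for whatever signal is actually sent; nothing here reflects YouTube's real format.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float     # seconds from the start of the stream
    duration: float  # seconds
    is_ad: bool      # hypothetical marker the official client would also need


def skip_ads(segments):
    """Drop ad segments and re-time the rest so playback is contiguous."""
    cleaned = []
    position = 0.0
    for seg in segments:
        if seg.is_ad:
            continue
        cleaned.append(Segment(start=position, duration=seg.duration, is_ad=False))
        position += seg.duration
    return cleaned


# Illustrative timeline: 10 s of content, a 15 s ad, then 10 s of content.
timeline = [
    Segment(0.0, 10.0, False),
    Segment(10.0, 15.0, True),
    Segment(25.0, 10.0, False),
]
cleaned = skip_ads(timeline)
```

The point is only that any marker precise enough for the official player to lock the skip button is also precise enough for another client to cut the segment out.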
If they're injecting targeted ads in the stream, then the stream producer must be 'smart'. It's not much of a stretch for it to enforce playing out the segments at approximately realtime (or whatever speedup they want to allow), and to force the advert segments to play out before anything past them. Some sidechannel could be used to inform the client about what's going on and produce a sensible playhead position.
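A rough sketch of that kind of server-side pacing, assuming a per-session list of segment durations and a set of ad indices. The `PacedStream` class and all of its names are made up for illustration, and a real server would also need some buffering allowance and seek handling.

```python
import time


class PacedStream:
    """Per-session gatekeeper that releases segments no faster than allowed."""

    def __init__(self, durations, ad_indices, max_speedup=1.0, clock=time.monotonic):
        self.durations = list(durations)   # seconds per segment, ads included
        self.ads = set(ad_indices)         # indices of ad segments
        self.max_speedup = max_speedup     # e.g. 2.0 permits 2x playback
        self.clock = clock
        self.started = clock()             # session start time

    def earliest_release(self, index):
        """Seconds after session start at which segment `index` may be served."""
        return sum(self.durations[:index]) / self.max_speedup

    def may_serve(self, index):
        """True once everything before `index`, ads included, has had time to play."""
        return self.clock() - self.started >= self.earliest_release(index)

    def content_playhead(self, index):
        """Side-channel value: content-only position at the start of segment `index`."""
        return sum(d for i, d in enumerate(self.durations[:index]) if i not in self.ads)


# Usage with a fake clock so the behaviour is easy to follow.
now = [0.0]
stream = PacedStream([10.0, 15.0, 10.0], ad_indices=[1], clock=lambda: now[0])
too_early = stream.may_serve(2)   # the ad has not had time to play yet
now[0] = 25.0
on_time = stream.may_serve(2)     # 10 s of content + 15 s of ad have elapsed
```

Because the segment after the ad is simply not handed out until the ad's duration has elapsed, a client that discards the ad bytes gains nothing for realtime playback; it just waits.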
It seems inevitable that this is the end game, and I don't really see viable ways around it for realtime playback. For offline playback, yeah, presumably that sidechannel includes enough information to cut out the ads.
YT has to mark the ad segment; they are required by law to do it. And no matter how they try to obfuscate it, their own web page must be able to extract that info and present it to the user. So it is pointless to integrate ads into the stream if you are going to provide the timestamps needed to skip them.
Just look at how Facebook does it: there is no "Sponsored post" anywhere in the HTML. They literally place the entire alphabet multiple times, each letter in a separate span/paragraph tag, and then use CSS to style that into a message of their choice. All of that work just to prevent simple adblocking rules from working.
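For illustration, here is roughly how a blocker could undo that kind of trick once it knows which classes the stylesheet hides. The markup and the class names `x` and `h` are invented for this sketch and are not Facebook's actual HTML; a real blocker would have to derive the hidden set from the computed CSS.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that is not inside a span carrying a hidden class."""

    def __init__(self, hidden_classes):
        super().__init__()
        self.hidden = set(hidden_classes)
        self.stack = []   # one bool per open span: True if that span is hidden
        self.chars = []

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            classes = (dict(attrs).get("class") or "").split()
            self.stack.append(any(c in self.hidden for c in classes))

    def handle_endtag(self, tag):
        if tag == "span" and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Keep text only when no enclosing span is hidden.
        if not any(self.stack):
            self.chars.append(data)


def visible_text(html, hidden_classes):
    parser = VisibleText(hidden_classes)
    parser.feed(html)
    parser.close()
    return "".join(parser.chars)


# Hypothetical obfuscated label: decoy letters sit in spans with class "h".
markup = (
    '<span class="x">S</span><span class="h">q</span>'
    '<span class="x">p</span><span class="h">z</span>'
    '<span class="x">o</span><span class="x">n</span>'
    '<span class="h">a</span><span class="x">s</span>'
    '<span class="x">o</span><span class="x">r</span>'
    '<span class="x">e</span><span class="x">d</span>'
)
label = visible_text(markup, {"h"})
```

A static text filter never sees the label in the raw HTML, which is the whole point of the obfuscation, but anything that evaluates the page the way the browser does still ends up with the rendered string.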
The laws of thermodynamics suggest it will be easier for ad companies to find a way to spam you than for you to bypass all of their ads. These real-time AI detectors cannot be very cheap to run or train.
If a human can tell what content is an ad (which is also required to be disclosed by the platform anyway), then an AI should be able to detect it fairly easily too.
They have a financial incentive to bypass our LLM ad-blockers, but visual recognition is a fairly milquetoast function for LLMs even today. They will attempt to present ads in an unusual way to fool the models; the models will get updated; they will change their tactics again. It's the same cat-and-mouse game we play today, but fought with LLMs instead. I'm confident we win such a fight.
If it's not worth that effort it probably wasn't worth watching anyway.