They may have an RLHF phase, but there is also just the shape of the distribution of images on the internet to consider, and, since this is from Alibaba, their part of the internet/social media (Weibo).
With today's remote social validation for women, the all-time-low value of men due to lower death rates, and the disconnect from where food and shelter come from, lonely men make up a huge portion of the population.
I'm still not following. Ads for a pickup truck are probably more likely to feature towing a boat than ads for a hatchback even if they're both capable of towing boats. Because buyers of the former are more likely to use the vehicle for that purpose.
If a disproportionate share of users are using image generation for generating attractive women, why is it out of place to put commensurate focus on that use case in demos and other promotional material?
I mean, things that take hard physical labor are typically self-limiting...
I do nerdy computer things and I actually build things too; for example, I busted up the limestone in my backyard and put in a patio and a raised garden. Working 16 hours a day coding or otherwise computering isn't that hard, even if your brain is melted at the end of the day. 8-10 hours of physically hard labor and your body starts taking damage if you keep it up too long.
And really, building houses is a terrible example! In the US we've been chronically behind on homebuilding, to the tune of millions of units. People complain the processes are terribly slow and there's tons of downtime.
> It's incredibly clear who the devs assume the target market is.
Not "assume". That's what the target market is. Take a look at civitai and see what kind of images people generate and what LoRAs they train (just be sure to be logged in and disable all of the NSFW filters in the options).
Considering how gaga r/stablediffusion is about it, they weren't wrong. Apparently Flux 2 is dead in the water even though the knowledge contained in the model is way, way higher than Z-Image's (unsurprisingly).
Z-Image is getting traction because it fits on their tiny GPUs and does porn, sure, but even with more compute Flux 2[dev] has no place.
Weak world knowledge, worse licensing, and it ruins the #1 benefit of a larger LLM backbone with post-training for JSON prompts.
LLMs already understand JSON, so additional training for JSON feels like a cheaper way to juice prompt adherence than more robust post-training.
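(For context, a "JSON prompt" here just means the prompt is passed as structured fields instead of free text; roughly something like the sketch below, with made-up field names rather than Flux 2's actual schema.)

```typescript
// Illustrative only: a hypothetical structured prompt, not Flux 2's real schema.
const jsonPrompt = {
  subject: "a red-brick lighthouse on a rocky shore at dusk",
  style: "35mm film photograph, soft overcast light",
  camera: { angle: "low", focal_length_mm: 35 },
  avoid: ["text", "watermark", "extra fingers"],
};
```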
And honestly even "full fat" Flux 2 has no great spot: Nano Banana Pro is better if you need strong editing, Seedream 4.5 is better if you need strong generation.
I'm a fan of more formal methods in program analysis, but this particular exercise is very hindsight-is-20/20.
> In this case, we can set up an invariant stating that the DNS should never be deleted once a newer plan has been applied
If that invariant had been expressed in the original code — as I'm sure it now is — it wouldn't have broken in the first place. The invariant is obvious in hindsight, but it's hardly axiomatic.
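As a rough illustration of what such a guard could look like in code (names and structure are hypothetical, not AWS's actual implementation):

```typescript
// Hypothetical sketch of the invariant as a runtime guard.
interface Plan {
  id: string;
  generation: number; // monotonically increasing plan version
}

let activePlan: Plan | null = null;

function applyPlan(plan: Plan): void {
  // Only move forward: ignore plans older than what is already applied.
  if (activePlan !== null && plan.generation <= activePlan.generation) return;
  activePlan = plan;
  // ... write this plan's DNS records ...
}

function cleanUpOldPlan(plan: Plan): void {
  // The invariant: never delete the currently active plan (or anything newer).
  if (activePlan !== null && plan.generation >= activePlan.generation) {
    throw new Error(`refusing to delete active or newer plan ${plan.id}`);
  }
  // ... safe to delete this plan's DNS records ...
}
```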
John McCarthy's qualification problem[0] relates to this.
While one can and will add invariants as they are discovered, they cannot all be found.
The Entscheidungsproblem and Trakhtenbrot's theorem apply here: counterintuitively, validity over finite models is co-r.e. but not r.e.
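Stated a bit more precisely (standard formulations, nothing specific to this incident):

```latex
% Gödel's completeness theorem: validity over all structures is r.e.
\{\varphi \mid\; \models \varphi\} \in \mathrm{RE}

% Trakhtenbrot's theorem: finite satisfiability is r.e. but undecidable,
% so validity over all *finite* structures is co-r.e. but not r.e.
\{\varphi \mid \varphi \text{ has a finite model}\} \in \mathrm{RE}\setminus\mathrm{REC}
\qquad
\{\varphi \mid \varphi \text{ holds in all finite models}\} \in \mathrm{coRE}\setminus\mathrm{RE}
```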
Validity in this case does not depend on the truth of the premises or the truth of the conclusion.
Basically, we have to use tools like systems thinking to construct robust systems; we cannot universally apply formal methods across frames.
This is one way in which race conditions are complex.
Hindsight bias makes it seem easy, but that is because it lands on the co-r.e. side.
Well-intended actions taken with hindsight can often result in brittle systems, as their composition tends to set systems in stone, with the belief that axioms are the end solution.
The fact that Gödel's completeness theorem may not apply to finite models when it works so well for infinite ones is hard for me to remember.
Remembering that axiomatization is a powerful tool but not a silver bullet has helped me more times than I can count.
Not deleting the active plan seems like a basic fail-safe design choice, and this isn't the AWS people's first rodeo. Likely there was some rationale for not going with a built-in fallback.
If it was, they would have mentioned it in their summary report, the way they justified other deliberate design decisions. I find it more likely they thought of 25 different ways this system could fail, fixed the ones that needed fixing (some of them hinted at in the summary report), and then forgot about the one way it was actually going to fail. Happens all the time.
I agree this article is very hindsight biased though. We do need a way to model the failure modes we can think of, but we also need a method that helps us think of what the failure modes are, in a systematic manner that doesn't suffer from "oops we forgot the one way it was actually going to fail".
Yes, any analysis after an incident has the benefit, and bias, of hindsight.
But I see this post less as an incident analysis and more as an experiment in learning from hindsight. The goal, it seems, isn’t to replay what happened, but to show how formal methods let us model a complex system at a conceptual level, without access to every internal detail, and still reason about where races or inconsistencies could emerge.
Every such analysis will have some hindsight bias. Still, it’s a great post that shows how to model such behavior. And I agree with the other reply that not deleting an active plan seems like a basic fail-safe choice, which the post also covered.
Vimeo won’t “go down” anytime soon. It might get worse/more expensive, but it’s not in imminent danger. And it’s also not the only white-label provider around, either.
For customer-facing streaming sites they also don't seem to be the clear default choice. I think dropout.tv is one of the few "secondary streaming services" still with Vimeo (and with the strong overlap in their networks I'm sure they got a good deal), while many others, like Nebula, evaluated them but went with other providers.
It looks like the majority of their business is in employee training portals for megacorps.
The Nebula apps are pretty bad. Vimeo white-label has its issues, but the app experience is much better than whoever Nebula is using.
One reason Vimeo is a good deal is that they charge for video transcoding by the minute, not by the file size. So you can upload full ProRes 4K movies and it doesn't cost the earth.
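A rough back-of-the-envelope to show why that matters (bitrates approximate, numbers purely illustrative):

```latex
% Hypothetical 90-minute feature as a ProRes 422 HQ UHD master at ~730 Mbit/s:
90 \times 60\,\mathrm{s} \times 730\,\mathrm{Mbit/s} \approx 3.9\,\mathrm{Tbit} \approx 490\,\mathrm{GB}
% The same film as a 10 Mbit/s H.264 delivery file:
90 \times 60\,\mathrm{s} \times 10\,\mathrm{Mbit/s} = 54\,\mathrm{Gbit} \approx 6.8\,\mathrm{GB}
% Per-minute billing treats both as the same 90 minutes of transcoding.
```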
I dug in a bit and did a little research. Criterion and Dropout, as mentioned, use Vimeo. So does Arrow's streaming service (cult films, mostly). From there on in it gets even more fringe: Taskmaster (the British game show) has a streaming service built on Vimeo, the Z-movie titans Troma and Full Moon Features use Vimeo, and so on. So not insanely crucial but I still think it's a better world in which those services can keep doing what they do.
Yes, there are alternatives. Nonetheless, the niche streamers I'm thinking of would find it a moderate burden to move to a different provider and they're operating on slim margins as is.
Yeah the issue reads as if someone asked Claude Code "find the most serious performance issue in the VSCode rendering loop" and then copied the response directly into GitHub (without profiling or testing anything).
Even if it is a real performance issue, the reasonable fix would be to move the sort call out of the loop - implementing a new data structure in JS is absolutely not the way to fix this.
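Schematically, the suggestion is something like this (a sketch of the general pattern, not the actual VSCode code):

```typescript
// Schematic sketch only; names and structure are made up.
interface Task { priority: number; run(): void; }

// Before: re-sorting the queue on every frame of the render loop.
function renderFrameNaive(queue: Task[]): void {
  queue.sort((a, b) => a.priority - b.priority); // O(n log n) every frame
  for (const task of queue) task.run();
}

// After: sort once when the queue actually changes, render many times.
function onQueueChanged(queue: Task[]): void {
  queue.sort((a, b) => a.priority - b.priority);
}
function renderFrame(queue: Task[]): void {
  for (const task of queue) task.run();
}
```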
Adding a new data structure just for this feels like such an AI thing. I've added to our agents.md a rule to prefer using existing libraries and types; otherwise Gemini will just happily generate things like this.
Right, and also this would show up in the profiler if it were a time sink — and I'm 100% certain this code has been profiled in the 10 years it's been in the codebase.
There’s clearly functionality to push more work to the current window’s queue, so I would not be surprised if the data structure needs to be continually kept sorted.
(Somewhere in the pile of VSCode dependencies you’d think there’d be a generic heap data structure though)
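If the queue really does need to stay continually sorted, a binary heap gives O(log n) inserts instead of a full re-sort per frame. A generic sketch (purely illustrative, not a claim about what VSCode ships):

```typescript
// Minimal binary min-heap sketch; illustrative only.
class MinHeap<T> {
  private items: T[] = [];
  constructor(private compare: (a: T, b: T) => number) {}

  push(item: T): void {
    this.items.push(item);
    let i = this.items.length - 1;
    // Sift up: O(log n) per insert keeps the queue ordered continuously.
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (this.compare(this.items[i], this.items[parent]) >= 0) break;
      [this.items[i], this.items[parent]] = [this.items[parent], this.items[i]];
      i = parent;
    }
  }

  pop(): T | undefined {
    const top = this.items[0];
    const last = this.items.pop();
    if (this.items.length > 0 && last !== undefined) {
      this.items[0] = last;
      // Sift down to restore the heap property.
      let i = 0;
      for (;;) {
        const l = 2 * i + 1, r = 2 * i + 2;
        let smallest = i;
        if (l < this.items.length && this.compare(this.items[l], this.items[smallest]) < 0) smallest = l;
        if (r < this.items.length && this.compare(this.items[r], this.items[smallest]) < 0) smallest = r;
        if (smallest === i) break;
        [this.items[i], this.items[smallest]] = [this.items[smallest], this.items[i]];
        i = smallest;
      }
    }
    return top;
  }
}

// Usage: const queue = new MinHeap<number>((a, b) => a - b);
```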
Also, no discussion of measured runtimes for the rendering code. (If it saves ~1.3 ms that sounds cool, but how many ms is that out of the supposed 16 ms budget?)