It's not done in the cloud; this is all on-device. They pass the image through a moon-detection AI model, whose output is then used to set up and run the moon-enhancement model, which fills in detail from the moon images it was trained on.
These are part of a larger package of super-resolution models and a processing pipeline that Samsung licenses from ArcSoft, a computational-photography software company that specialises in mobile devices.
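Roughly, the flow that description implies looks like the sketch below. To be clear, the function and model interfaces here are my own invention for illustration, not ArcSoft's actual API.

    # Hypothetical sketch of the detect-then-enhance flow described above.
    # The detector/enhancer interfaces and the bbox format are invented for
    # illustration; they are not ArcSoft's actual API.
    import numpy as np

    def process_zoomed_frame(frame: np.ndarray, moon_detector, moon_enhancer) -> np.ndarray:
        detection = moon_detector.run(frame)          # e.g. {"found": True, "bbox": (x, y, w, h)}
        if not detection.get("found"):
            return frame                              # no moon detected: leave the frame alone

        # The detection output is used to set up the enhancement model, which
        # fills the cropped region with detail learned from its training images
        # rather than detail captured by the sensor.
        x, y, w, h = detection["bbox"]
        crop = frame[y:y + h, x:x + w]
        frame = frame.copy()
        frame[y:y + h, x:x + w] = moon_enhancer.run(crop)
        return frame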
Anyway, if they can recognize gender and age and what you have in your refrigerator ("This offers users a smoother and smarter refrigerator experience."):
On Samsung devices that support this, it's implemented in the file libsuperresolution_raw.arcsoft.so in /system/lib64, if you're curious to have a look at how it works.
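If you do want to poke at it, dumping printable strings from that library is enough to see the moon-related bits. Here's a rough Python equivalent of running strings on it and grepping for "moon", assuming you've already copied the file off the device:

    import re

    # Rough stand-in for `strings libsuperresolution_raw.arcsoft.so | grep -i moon`.
    # Assumes the library has already been copied off the device (e.g. with adb pull).
    with open("libsuperresolution_raw.arcsoft.so", "rb") as f:
        data = f.read()

    # Print runs of 6+ printable ASCII characters that mention "moon".
    for m in re.finditer(rb"[\x20-\x7e]{6,}", data):
        s = m.group().decode("ascii")
        if "moon" in s.lower():
            print(s)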
Some strings from that file, relating to the moon detection and enhancement process:
I suspect that if the AI detects two moons, it probably abandons "enhancement" and drops a message in the logs (or renders an error, though I doubt that's what that particular string is for).
I think that's not really how super-resolution works; it's closer to how diffusion models hallucinate details inspired by their training set.
I roughly think of it as a lot of high-res moon photos lossily compressed into the model weights until they're distilled into some abstract sense of "moon image"-ness that is "indexed" by blurry image patches. Running the network then "unzips" some of that detail over the matching low-resolution patches.
(Using quotes to denote terms I am heavily abusing in this loose analogy)
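To make the analogy slightly more concrete, here's a toy "dictionary" version of example-based super-resolution, where blurry patches literally index stored sharp patches; a trained network effectively bakes such a dictionary into its weights. This is purely illustrative and nothing like the real pipeline:

    import numpy as np

    # Toy example-based super-resolution: a literal index from blurry patches to
    # sharp patches. Purely illustrative, not the real ArcSoft pipeline.
    PATCH, SCALE = 8, 4   # low-res patch size and upscale factor

    def build_dictionary(high_res_refs):
        """Pair each downsampled (blurry) patch with its original sharp patch.
        References are assumed grayscale with sides divisible by PATCH * SCALE."""
        keys, values = [], []
        for hr in high_res_refs:
            lr = hr[::SCALE, ::SCALE]                 # crude downsample stands in for a blurry capture
            for y in range(0, lr.shape[0], PATCH):
                for x in range(0, lr.shape[1], PATCH):
                    keys.append(lr[y:y + PATCH, x:x + PATCH].ravel())
                    values.append(hr[y * SCALE:(y + PATCH) * SCALE,
                                     x * SCALE:(x + PATCH) * SCALE])
        return np.array(keys, dtype=np.float32), values

    def enhance(lr_image, keys, values):
        """Replace each blurry patch with the sharp patch whose blurry key matches best.
        lr_image is assumed grayscale with sides divisible by PATCH."""
        out = np.zeros((lr_image.shape[0] * SCALE, lr_image.shape[1] * SCALE), dtype=np.float32)
        for y in range(0, lr_image.shape[0], PATCH):
            for x in range(0, lr_image.shape[1], PATCH):
                q = lr_image[y:y + PATCH, x:x + PATCH].ravel().astype(np.float32)
                best = int(np.argmin(((keys - q) ** 2).sum(axis=1)))   # nearest blurry "index" entry
                out[y * SCALE:(y + PATCH) * SCALE,
                    x * SCALE:(x + PATCH) * SCALE] = values[best]
        return out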
Edit: Importantly, though, I think this isn't that different from what you're describing in terms of whether it potentially misleads customers about the optics on their device, because it is inserting detail that was never actually captured by the camera sensor.
Super-resolution is only part of the process. Samsung's explanation says they then apply "Scene Optimizer’s deep-learning-based AI detail enhancement engine".
If the model and its weights contain detail not in the photo being taken, then it's tantamount to having high res images of the moon stored on camera and composited into the image. And if it doesn't, then it's not the moon being displayed.
Not that it's necessarily bad, but it could be if it fools someone into thinking they're buying superior electro-optics. Apparently it was enough of a concern to warrant the line "Samsung continues to improve Scene Optimizer to reduce any potential confusion that may occur between the act of taking a picture of the real moon and an image of the moon".
I think it's actually worse than compositing in a high-resolution moon image. With AI enhancement, the details will look believable but may be completely inaccurate.
> If the model and its weights contain detail not in the photo being taken, then it's tantamount to having high res images of the moon stored on camera and composited into the image.
This is what is happening. I agree it’s tricking people into thinking it’s all optics and that’s kinda bad.
The "it finds an image in its database" understanding of diffusion models doesn't work for SD because it can interpolate between prompts, but since there is only one moon, and it doesn't rotate, and their system doesn't work when the moon is partially obscured, there is really no need to describe it as anything more complex than that.
You mean like how basically all of today's phones iron out wrinkles (very aggressively, at least on the selfie camera) for silky-smooth skin, or completely change skin tones so that even ghouls look OK in a nice Instagram feed?
I honestly don't get all the outrage. Ironing out faces, changing colors, and removing moles is a celebrated feature that even spawned the whole 'Apple skin' thing, but adding well-known static detail to a blurry image of the moon is somehow suddenly crossing the line? That line was crossed a long time ago, my friends: look at the optical physics of those tiny sensors and crappy lenses, and at the results y'all want from them. People mentioned in yesterday's main thread that Apple's latest swapped a picture of one side of a bunny in the grass for its other side, which is an even more hilarious 'painting rather than photography' case.
Plus, exactly how this is done in the phone is known to maybe 10 engineers, most probably in Korea, yet people here quickly jumped on the outrage wagon based on one very lightly done, non-scientific test on Reddit (I tried to repeat that experiment with an S22 Ultra and failed 100% of the time; using his own images, I got nothing but a blurry mess in every case).
I have an S22 Ultra and it takes really good, completely handheld night photos in near-total darkness (just starlight, no moon, no artificial light). I mean me standing in a dark forest at 11pm, snapping nicely detailed foliage and starry-sky shots where my own eyes see very little of the scene. In this it easily surpasses my full-frame Nikon with a superb lens. But it's true that this is done with the main, big sensor, not the 10x zoom one, which then gets zoomed a further 3-4x digitally to get a shot of the moon that spans the whole frame.
What do you mean, what else? I'm genuinely curious what kind of nefariousness you imagine is going on. Like, they're swapping images of Fords with Hyundais? Swapping images of catenaries with nigh-indistinguishable parabolas?
This looks to be specifically optimized for the Moon, but they are very close to a generic algorithm that takes blurry cell-phone images of common photography subjects as input and matches them to high-resolution, known-good stock photos. Those stock photos may be taken by pros using studio lighting and full-frame DSLRs with lenses containing more glass than the mass of the entire cell phone. Regardless of whether you're copy-pasting a rectangular bitmap from a stored stock photo or your convolutional neural net applies stock-photo-ness to the input data, the effect is the same: you can generate photos that look really astonishingly good given that the input optics stack has a 5mm-deep total track and your sensor measures less than 10mm across the diagonal.
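For what it's worth, the crude "copy-paste a rectangular bitmap from a stored stock photo" end of that spectrum is almost trivial to write down. Everything below, from the stock library to the detector output, is hypothetical; the point is only that the inserted detail comes from the library rather than the sensor:

    import numpy as np

    STOCK = {}   # subject name -> high-res reference image (np.ndarray); hypothetical library

    def composite_stock(frame: np.ndarray, subject: str, bbox: tuple, alpha: float = 0.7) -> np.ndarray:
        """Blend a resized stock photo of `subject` over the detected region.
        The detail pasted in comes from the stock library, not from the sensor."""
        x, y, w, h = bbox
        ref = STOCK[subject]
        # Nearest-neighbour resize of the reference to the detected region.
        rows = np.arange(h) * ref.shape[0] // h
        cols = np.arange(w) * ref.shape[1] // w
        patch = ref[rows][:, cols]
        out = frame.copy()
        region = out[y:y + h, x:x + w]
        out[y:y + h, x:x + w] = (alpha * patch + (1 - alpha) * region).astype(frame.dtype)
        return out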
Once you have that general framework, though, I'd expect Samsung to want to 'enhance' Samsung phones and smartwatches (especially), Coke cans, candles and cars, butterflies and birds, the Moon or a MacBook, rainbows or diamond rings, and anything else that's both hard to photograph well and a common photography subject. Speaking of Coke cans, that gives my pessimistic, dystopian-leaning imagination an idea: I wonder whether Coke or other brands might one day strike an exclusive partnership with Samsung to supply stock photos of their product, so that photos of your meal show the familiar red can looking especially glossy, saturated, reflective, crisp, and refreshing...while an Instagram shot of your meal from the food truck that stocks Pepsi still looks like a cell phone photo.
Upscaling photos of printed pages by matching against a font library seems like a similar opportunity, though you run into the old Xerox copier issue: the transformation throws away information about uncertainty. I'm sure they'd love to make pictures of people look better too, but that's harder when there are billions of subjects to identify.
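A toy version of that glyph-matching idea shows where the Xerox-style failure creeps in: once a blurry glyph is replaced by the nearest clean template, the output looks certain even when the match was wrong. The template bitmaps here are assumed, not provided:

    import numpy as np

    def substitute_glyph(scanned: np.ndarray, templates: dict) -> str:
        """Return the character whose clean template bitmap is closest to the scanned glyph.
        Assumes all bitmaps are same-shape float arrays. The match distance, our only
        measure of uncertainty, is discarded, so a blurry 6 can come back as a crisp 8."""
        return min(templates, key=lambda ch: float(((templates[ch] - scanned) ** 2).sum()))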
I'm not saying that they should or should not do any of this, but when you want to sell phones with cameras that make your shots look really good, this kind of technique lets you cheat the laws of physics.
Sorry for being late in reading your response, but yes I agree that those dystopian scenarios are not outside of the realm of possibility. As I said before, what makes the moon unique is that every photo of the full moon is essentially the same image from the same angle, whereas a photo of a Coke can is going to be from all different angles, sides, lighting conditions, even sizes, and thus much harder to convincingly "enhance." But I have no doubt that Samsung Research is up to the task of figuring out even more ways to monetize our eyeballs, so your point is well taken.
Given that people still get convicted based on eyewitness testimony that has been repeatedly shown to be extremely inaccurate, I hold out little hope that the justice system will figure out how unreliable modern photography is anytime soon.
You think there aren't processes in place for this sort of thing? A due process even.
IT experts are brought into cases all the time for stuff like this. This isn't even the standard camera "lying"; it's explicitly a special tab for taking super-zoomed-in, AI-optimized photos.
And most of the "experts" brought into courtrooms will argue whatever you pay them to argue. It's an industry. Lawyers aren't picking random AI engineers out of a mom-and-pop tech company; they have dedicated people whose job is to sell their professional title to lawyers so a narrative can be laundered through "an expert".