One method uses the photo you actually took as the base and enhances from that. The other uses a random old picture of the moon from elsewhere and copy-pastes details from it onto your photo.
The distinction is significant: for one, your photo is the sole source of truth, while for the other, an image from elsewhere is simply "inserted" into your own photo. The former is expected; the latter is not.
It's the same as the difference between enhancing a photo of yourself with some color/light processing or AI upscaling/sharpening vs. "enhancing" it by having Brad Pitt's eyes and Angelina Jolie's nose copy-pasted onto a photo of your own face.
I'd argue that "enhancing details" is not the same as "replacing details".
Modifying the existing pixels captured by the sensor isn't the same as replacing an AI-recognized section of the original pixels with entirely different pixels.
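To make the distinction concrete, here's a minimal sketch of the two operations. The function names and the unsharp-mask/patch-paste choices are my own illustration, not anyone's actual pipeline: the first function's output depends only on the pixels the sensor captured; the second one's output contains pixels that came from a different image entirely.

```python
import numpy as np

def sharpen(img, amount=1.0):
    """'Enhance': an unsharp mask. Every output pixel is computed
    solely from the captured image's own pixels."""
    # 3x3 box blur via edge padding and neighborhood averaging
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    blur = sum(padded[dy:dy + h, dx:dx + w]
               for dy in range(3) for dx in range(3)) / 9.0
    # Boost the difference between the image and its blur
    return img + amount * (img - blur)

def paste_reference(img, reference, region):
    """'Replace': pixels inside `region` are copied from another
    image entirely; the sensor's data there is discarded."""
    y0, y1, x0, x1 = region
    out = img.copy()
    out[y0:y1, x0:x1] = reference[y0:y1, x0:x1]
    return out
```

The point of the sketch: `sharpen` can only redistribute information that was actually in your shot, while `paste_reference` can "show" detail your camera never recorded.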
I think other interpretations of "enhance" aren't wrong either, since "enhance" is pretty subjective in the first place.
If there were suddenly a new crater on the moon whose photons were reaching the camera, and the AI algorithm decided it didn't exist (and removed it) because older pictures of the moon didn't include it, I'd contend that the photo wasn't "enhanced."