For the non-gamers or those who didn't play it: Spore used to do this with all its creations[0]. You could export them as PNGs and drag and drop them into an editor to load the content. The image itself was a thumbnail of the creation. That's how every creation was shared between the official servers and between users on the forums. Very neat & niche use case.
I can think of Exapunks[1] (by Zachtronics) which uses a similar concept; you can create games for a virtual device called the "TEC Redshift" and to share the solutions between the players, the game creates a cartridge picture with the name of your solution. The instructions are embedded into the file, and you simply have to drag&drop it into the game to import the solution.
So did Fractint in the early 1990s; it saved the coordinates and other settings in separate tags of the GIF image, so you could just load a previously saved image and keep panning and zooming.
My childhood friend reached out about wanting to hide her music collection. This completely nerd sniped me. What I did was create a FAT-like FS in complete reverse (starting at the end of the disk). AES-CTR was used to encrypt it (hurrah plausible deniability), so it would merely look like random empty space. The idea was then to drop some decoy files on the drive, defrag+compact, and then use the free space for the music collection.
She decided against crossing the border with the music collection, and so it was never used. Still a fun steganography project.
New Zealand had one of the strictest anti-piracy laws in the world circa 2000-2011. It's there mostly to appease Hollywood and the US film industry iirc.
Check out the VICE documentary on Kim Dotcom for the story told from his perspective. Public enemy #1, for a file hosting service that happened to be a part of the standard infrastructure for piracy. They went after him almost like they did Assange.
I’m sure there are dozens of equivalent services now but the government doesn’t even bother keeping up. The market adapted in the form of streaming, subscriptions, and services.
Nice of them to write a useful tool for doing it. I have never implemented it myself, but having done stego detection as part of CTFs/creepypasta puzzles: if the Shannon entropy of the data in the LSB (extracted using CyberChef) is greater than that of the rest of the image, it probably has data encoded in it. Similarly, you can randomize the palette of an image to look for noise that would indicate encoded data.
You can also break an image down into layers based on bit position. Static is a clear sign of random and therefore likely encrypted or compressed data.
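A rough sketch of the bit-plane and entropy idea, in plain Python with no image library. The "image" here is a flat list of 8-bit values from a synthetic gradient, and bit_plane / shannon_entropy are names made up for this example, so treat it as an illustration rather than a real detector. Entropy is measured over 8-bit blocks of the plane, which is one of several reasonable choices.

```python
import math
import random
from collections import Counter

def bit_plane(pixels, bit):
    """Return the given bit (0 = least significant) of every 8-bit pixel value."""
    return [(p >> bit) & 1 for p in pixels]

def shannon_entropy(bits, block=8):
    """Empirical Shannon entropy (bits per block) over non-overlapping blocks."""
    chunks = [tuple(bits[i:i + block])
              for i in range(0, len(bits) - block + 1, block)]
    counts = Counter(chunks)
    total = len(chunks)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# Assumption: a smooth synthetic gradient stands in for a cover image.
cover = [(x // 4) % 256 for x in range(16384)]

# Overwrite every LSB with a pseudo-random bit, as an encrypted payload would.
rng = random.Random(0)
stego = [(p & 0xFE) | rng.getrandbits(1) for p in cover]

clean_h = shannon_entropy(bit_plane(cover, 0))
stego_h = shannon_entropy(bit_plane(stego, 0))
print(f"LSB-plane entropy: cover={clean_h:.2f}, stego={stego_h:.2f} (max 8.00)")
```

On a real photo the clean LSB plane is far noisier than this gradient, so the gap is smaller and you would compare against the higher bit planes of the same image rather than against a fixed threshold.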
The end result of encryption is a completely random string of bits. If you replaced an image's raster with a properly sized blob of encrypted data (0 padding could be used pre-encryption to get the right size), you would get an image of pure static.
So if you randomly generated a stream of bits and put that in a photo, that would not be differentiable from an encrypted blob inside of a photo data structure.
That means the question is where do you hide a random string of 1's and 0's without it being obvious there is a string of 1's and 0's. The related question to a noisy photo is, is there structure to the noise? Random 1's and 0's is the opposite of structure. So trying to de-steganogra-fy a photo means looking for absence of structure.
This leads to the observation you are making, which is that the photo you choose has a great effect on the effectiveness of the stenography. If you photo-shopped together an image of a TV displaying static, you could put an encrypted blob in the static with all bits and have a cohesive photo. If you were to hide a payload in a completely black (#000000) picture, any strategy used would likely be obvious.
Stenography is very much "security through obscurity" compared to encryption which is security. The nature of encryption means that stenography is primarily about plausible deniability rather than being technically effective.
Yeah, this is what you do, for example re-encode the image with Gray code (avoids the Hamming cliff), you split the bit-planes into 8x8 blocks, compute the entropy within them and encode the data for that block in such a way that it has same/very similar entropy, re-encode from Gray back into normal binary.
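For anyone who hasn't met Gray code, here is a minimal sketch of just the re-encoding step and why it avoids the Hamming cliff. It does not attempt the per-block entropy matching described above, and the function names are my own.

```python
def to_gray(b):
    """Convert a non-negative integer to its reflected binary Gray code."""
    return b ^ (b >> 1)

def from_gray(g):
    """Invert to_gray by XOR-ing together all right shifts of g."""
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b

def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

# The "Hamming cliff": 127 -> 128 flips all 8 bits in plain binary,
# so adjacent brightness levels look unrelated in every bit plane.
cliff_binary = hamming(127, 128)
# In Gray code, neighbouring values always differ in exactly one bit.
cliff_gray = hamming(to_gray(127), to_gray(128))
print(cliff_binary, cliff_gray)
```

Because neighbouring intensities differ in a single Gray-coded bit, the bit planes of a Gray-coded image follow the image structure more closely, which is what makes per-block statistics meaningful.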
That too, say, 3-state modulation. Ethernet uses extra degree of freedom to increase entropy, but it can decrease entropy too, create a fake bias and approximate desired statistics. As a bonus stretch the key the second time and use it as a running bias for the modulation.
This form of steganography, of course, can't survive virtually any image transformation or format conversion and thus can only be used in limited cases. PICO-8's .p8.png format is one such example [1].
I always wondered how many of the images on eBay were actually clandestine underworld spy messages. The equivalent of the crossword puzzles during WW2. Ok probably not many but you never know. :)
When the Internet was fast enough, image forums were such a novelty it was common to hoard images/memes/wallpapers for later use. I've always wondered what kind of stenographic messages I unknowingly saved in my old files.
nice. to anyone else curious about the difference between the two images in the repo, here's the difference and then equalized: http://telnet.asia/diff.png
What happens if you send a photo generated by this tool through something like Facebook or WhatsApp where they might re-encode the file to save on storage/bandwidth costs?
The concept is lossless vs. lossy compression. Facebook is known to do all manner of lossy transformations on images, while whatsapp likely (definitely if you believe it is e2e encrypted) would just transfer the bits.
Re-sizing would trash the payload, but any type of conversion that losslessly preserves the raster (the matrix of pixels) would likely still be recoverable.
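To make the lossless vs. lossy point concrete, here is a toy LSB embed/extract in plain Python. It is not the submitted tool's code: the raster is a flat list of 8-bit values, zlib stands in for a lossless re-encode, and coarse quantization stands in (very crudely) for what a lossy re-encode does to low-order bits. All names are invented for the example.

```python
import zlib

def embed(pixels, payload):
    """Write payload bits (MSB first) into the LSBs of the first pixels."""
    bits = [(byte >> i) & 1 for byte in payload for i in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("payload too large for this cover")
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return out

def extract(pixels, n_bytes):
    """Read n_bytes back out of the LSBs, MSB first."""
    out = bytearray()
    for i in range(n_bytes):
        byte = 0
        for p in pixels[8 * i: 8 * i + 8]:
            byte = (byte << 1) | (p & 1)
        out.append(byte)
    return bytes(out)

def lossless_roundtrip(pixels):
    """Stand-in for a lossless re-encode: every pixel value comes back exactly."""
    return list(zlib.decompress(zlib.compress(bytes(pixels))))

def lossy_roundtrip(pixels, step=4):
    """Crude stand-in for lossy re-encoding: quantize away the low-order bits."""
    return [(p // step) * step for p in pixels]

cover = [(37 * i) % 256 for i in range(64)]   # arbitrary synthetic raster
stego = embed(cover, b"secret")

after_lossless = extract(lossless_roundtrip(stego), 6)
after_lossy = extract(lossy_roundtrip(stego), 6)
print(after_lossless, after_lossy)
```

Real JPEG recompression is far more involved than this quantizer, but the outcome for LSB payloads is the same in kind: the low-order bits are exactly what lossy codecs throw away.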
> whatsapp likely (definitely if you believe it is e2e encrypted) would just transfer the bits
They could still compress images on the client side before sending the message. Dunno if they do or not, didn’t check. Just saying that E2EE does not prevent them from doing so.
I've wondered whether this could be addressed by making the pixel adjustments in the form of a QR code. Much lower data storage, but much better chance of surviving image transforms thanks to the built-in resilience and redundancy.
Kind of a fun barely-related historical script kiddie note, but due to the way the jpeg and zip file formats work, you can store both in the same file, then just change the file extension from one to the other and it will work correctly.
Nowadays most image parsing is done via libraries that will strip out extraneous info to prevent this, but back in the day when most people would roll their own code for this (or copy paste poorly thought out implementations) this was commonly, if unintentionally, supported.
Nothing has fundamentally changed, you can still upload images containing file archives to most social media services, if you pay attention to the relevant file format specs.
It depends on the site, but Twitter for example will leave a file untouched if its heuristics decide the file is already well compressed. Good compression is expensive, and in many cases more so than the bandwidth you'd save.
Can confirm. I've been stuffing configuration files inside images and storing them on Twitter for a couple years now; they remain untouched, and any newly-booted VM on my home network can pull the files down via API, strip the payload out of the image, put it into place as a config file, and cycle the service, no problems. Works great. All of this was born out of an idea to get something 'useful' out of twitter, rather than "perpetual doom scrolling to find something to be upset about". Now I never touch their GUI, and they are essentially an offsite Puppet repository for me at this point.
I can also confirm that Facebook strips everything out of images, rendering them useless for this purpose. Instagram does the same (not surprisingly).
I know it's cliche to say here on HN - but this kind of comment is why I love this site. Silly, unexpected ways to use technology are so satisfying to me.
Nice work! Steganography is always one of the topics I introduce at the end of CS1, since they've learned all the bits and pieces on how to do it. It's fun to watch their minds get blown over something like this and go "you can DO that?!"
However, LSB is the most famous method of doing steno, so if you actually have something to hide this is probably not a good method, as everyone knows about this method.
Really depends on what your threat model is, no? To anyone who doesn't even know whether there is something to look for, it's definitely far less suspicious than a TrueCrypt container.
I'd say it's good enough. An adversary would need to know the file the secret is hidden inside, what method was used, probably would need the unaltered container file to compare with to extract the bits, and then has to overcome the additional layer of encryption.
I didn't check the code corresponding to the submitted article, but I'd guess it would be infeasible even for an advanced or even state adversarial actor to automatically check lots and lots of images (unless there is some kind of "signature" that can be detected) just in case there might be some LSB-based steganographically hidden secret inside.
If the adversary is doing a targeted attack against an individual/group of individuals, it might be more feasible, but then I'd wonder if other means of attack, including the good old monkey-wrench-to-the-knee aren't still more efficient.
> unless there is some kind of "signature" that can be detected
There is.
You're right that the encryption probably can't be overcome. The risk is more that it is detected and then the adversary beats you up until you hand over the password.
Combine with stable diffusion and you don’t even need to bring your own image to the table. Plus there’s no public reference that exists to compare against.
This is not much of a steganography scheme, and by ditching steganography completely you could make it so that the file is both a valid image and a valid zip.
At least that would make it easy to extract the data.
Needless to say, that too has already been done many times in the last 20 years.
Stenography is quick writing using a vowel-/tone-based script usually used to record the spoken word. You are thinking about steganography, which is hiding something inside other data.
The ZIP thing is a nice trick, but by now is being checked for regularly when computer forensics come into play. Also, I would argue that it is not actually steganography, for it does not hide itself.
That's just about "hiding" being a relative word. "ZIP thing" also can be considered hiding, if you are absolutely clueless. And if we speak about "forensics" — that toy implementation is about as much "hiding" as the ZIP thingy. For more or less solid tool somebody already mentioned steghide here, as well as the reasons why the OP's script doesn't really hide anything, if we are being meticulous.
Generally speaking, no. While theoretically a JPEG could be used to STORE a virus, the chances of that code actually running on your computer are almost nil. (Provided you don't do silly things such as clicking on a file named "img.jpeg.exe" or what have you.) Of course from time to time in the past there have been bugs. Some of the early Windows versions for example. AFAIK most (if not all) JPEG decoders in common use today have already ironed out those issues.
JPEG as a format allows you to add extra data to the end. ZIP as a format allows you to add extra data to the beginning. So it's probably easy to make a JAR-JPEG polyglot.
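A quick sketch of the ZIP half of that trick using only Python's standard library. The "JPEG" here is just start/end markers around filler bytes, not a decodable image, so this only shows that a ZIP reader tolerates arbitrary data in front of the archive; swap in real JPEG bytes to get a file that image viewers open too.

```python
import io
import zipfile

# Assumption: placeholder for real JPEG data (SOI marker, filler, EOI marker).
fake_jpeg = b"\xff\xd8" + b"\x00" * 32 + b"\xff\xd9"

# Build an ordinary ZIP archive in memory.
zip_buf = io.BytesIO()
with zipfile.ZipFile(zip_buf, "w") as zf:
    zf.writestr("hello.txt", "hidden payload")

# Polyglot: image bytes first, archive bytes appended after them.
polyglot = fake_jpeg + zip_buf.getvalue()

# zipfile locates the central directory from the END of the file and
# compensates for the prepended data, so the archive still opens.
with zipfile.ZipFile(io.BytesIO(polyglot)) as zf:
    recovered = zf.read("hello.txt")

print(recovered)
```

This works because ZIP readers find the archive via the end-of-central-directory record at the tail of the file, while image decoders read from the front and stop at the end-of-image marker.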
Generally though, even if you make an executable JPEG, the thing stopping this is that normally you look at the extension to decide how to interpret a file, so when you see a .jpg ending, you don't try and execute.
You could store the bytes of an EXE inside an image, just like how I could paste the base-64 of an EXE on the end of this message. The trick is going from that to actually running it.
The LSBs of a typical photograph are not uniformly random. For example, here's the blue channel LSBs of the input image from the repo: https://i.imgur.com/lOVlcpU.png
If they are uniformly random, then you know something is up.
I should have clarified, I'm not talking about the JPEG artifacts, I'm talking about everything else.
The faint outlines of objects you can see, the varying textures, and the areas of clipping (where the source-brightness was either above 255 or below 0).
There are other statistical correlations not visible in that image - correlations between the different channels, and between the different bits within a channel.
If I showed you the most significant bit of some non-JPEG'd image, you could obviously see that it's non-random (since it'd essentially be a threshold function). If I showed you the second-most significant bit, it would again be non-random, but perhaps less obviously so. As you go through the bits, it starts looking more and more random, but there are still going to be statistical tests you can do to distinguish from true uniform random bits.
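One very simple test of that kind, as a sketch: count how often horizontally adjacent pixels have the same LSB. Uniform random bits agree about half the time; image structure such as flat or clipped regions pushes the figure away from one half. The synthetic "sky" row below is an assumption chosen to exaggerate the effect, and lsb_neighbor_agreement is a made-up name; real photos carry sensor noise and sit much closer to 0.5, which is why practical detectors use stronger statistics.

```python
import random

def lsb_neighbor_agreement(pixels):
    """Fraction of adjacent pixel pairs whose least significant bits match."""
    pairs = list(zip(pixels, pixels[1:]))
    same = sum((a & 1) == (b & 1) for a, b in pairs)
    return same / len(pairs)

# Assumption: a brightening gradient that clips at 255, like a blown-out sky.
cover = [min(255, 200 + x // 16) for x in range(4096)]

# Replace every LSB with a pseudo-random bit, as a full-capacity payload would.
rng = random.Random(0)
stego = [(p & 0xFE) | rng.getrandbits(1) for p in cover]

cover_score = lsb_neighbor_agreement(cover)
stego_score = lsb_neighbor_agreement(stego)
print(f"cover={cover_score:.3f} stego={stego_score:.3f}")
```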
JPEG is super common. Having jpeg compression artifacts, even in images that have been converted to a different format, is not something that will raise eyebrows.
Cool, but Least significant bit steganography is kind of old now. It would be nice to see hiding data in the content or meaning of images, probably using Stable Diffusion.
The steganography aspect does not seem very robust. It significantly alters the distribution of the LSBs of pixel values of typical images.
In contrast steghide[1] encodes the data by swapping pixels within the image. This leaves the distribution of pixel values the same, avoiding this channel of detection.
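To illustrate only the invariant being described (this is not steghide's actual algorithm, and the helper names are invented): overwriting LSBs changes how many pixels hold each value, while exchanging pixels between positions cannot, because every value that leaves one position arrives at another.

```python
import random
from collections import Counter

def overwrite_lsbs(pixels, rng):
    """Naive LSB embedding: replace each LSB with a pseudo-random payload bit."""
    return [(p & 0xFE) | rng.getrandbits(1) for p in pixels]

def swap_random_pairs(pixels, rng, n_swaps):
    """Exchange the values at randomly chosen pairs of positions."""
    out = list(pixels)
    for _ in range(n_swaps):
        i = rng.randrange(len(out))
        j = rng.randrange(len(out))
        out[i], out[j] = out[j], out[i]
    return out

# Assumption: a tiny synthetic cover with a lopsided value histogram.
cover = [10] * 300 + [11] * 100 + [200] * 50
rng = random.Random(0)

overwritten = overwrite_lsbs(cover, rng)
swapped = swap_random_pairs(cover, rng, n_swaps=200)

print("cover      ", sorted(Counter(cover).items()))
print("overwritten", sorted(Counter(overwritten).items()))
print("swapped    ", sorted(Counter(swapped).items()))
```

A real tool still has to decide which exchanges encode which bits and keep them visually unnoticeable; the sketch only shows why the value histogram gives nothing away.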
[0]https://nedbatchelder.com/blog/200806/spore_creature_creator...