Show HN: Encrypt and hide files inside images (github.com/7thsamurai)
208 points by 7thSamurai on Oct 25, 2022 | 85 comments


For the non-gamers or those who didn't play it: Spore used to do this with all its creations[0]. You could export them as PNGs and drag and drop them into an editor to load the content. The image itself was a thumbnail of the creation. That's how every creation was shared between the official servers and between users on the forums. Very neat & niche use case.

[0]https://nedbatchelder.com/blog/200806/spore_creature_creator...


I can think of Exapunks[1] (by Zachtronics) which uses a similar concept; you can create games for a virtual device called the "TEC Redshift" and to share the solutions between the players, the game creates a cartridge picture with the name of your solution. The instructions are embedded into the file, and you simply have to drag&drop it into the game to import the solution.

It's a neat way of sharing save files!

[1] https://www.zachtronics.com/exapunks/


The Pico-8 virtual console also stores games as PNG. What you see as the game cart is the game.

https://www.lexaloffle.com/pico-8.php https://pico-8.fandom.com/wiki/P8PNGFileFormat


So did Fractint in the early 1990s; it saved the coordinates and other settings in separate tags of the GIF image, so you could just load a previously saved image and keep panning and zooming.


Hydrus also distributes plugins as encoded images.

https://hydrusnetwork.github.io/hydrus/index.html


My childhood friend reached out about wanting to hide her music collection. This completely nerd sniped me. What I did was create a FAT-like FS in complete reverse (starting at the end of the disk). AES-CTR was used to encrypt it (hurrah plausible deniability), so it would merely look like random empty space. The idea was then to drop some decoy files on the drive, defrag+compact, and then use the free space for the music collection.

She decided against crossing the border with the music collection, and so it was never used. Still a fun steganography project.


What border crossing is this so I make sure to never visit?


New Zealand if I'm remembering correctly.


New Zealand had one of the strictest anti-piracy laws in the world circa 2000-2011. It's there mostly to appease Hollywood and the US film industry, iirc.

Reference: https://www.google.com/amp/s/torrentfreak.com/new-zealand-3-...


We tried really hard to impress the FBI with helicopters and stuff.

Half of it fell over in court, but that came later, and our government got to be friends with the big guys.

I find it hard to believe NZ customs would be interested in music, pirated or not.

Try bringing a bit of fruit or meat though; that gets them fired up.


Check out the VICE documentary on Kim Dotcom for the story told from his perspective. Public enemy #1, for a file hosting service that happened to be a part of the standard infrastructure for piracy. They went after him almost like they did Assange.

I’m sure there are dozens of equivalent services now but the government doesn’t even bother keeping up. The market adapted in the form of streaming, subscriptions, and services.


Why was she trying to do this?


Clearly "music" is a code word.


New Zealand is lovely.


Did she tell you why she wanted to hide her music collection? Is this a common thing?


Yes, in the old times when your friends were emos but you secretly listened to Britney Spears.


Sounds a bit like the hidden volumes in TrueCrypt/VeraCrypt...


Wow, that sounds super cool! Nice work on that!


Nice of them to write a useful tool for doing it. I have never implemented it, but having done stego detection as part of CTFs/creepypasta puzzles: if the Shannon entropy of the data in the LSB (extracted using CyberChef) is greater than that of the rest of the image, it probably has data encoded in it. Similarly, you can randomize the palette of an image to look for noise that would indicate encoded data.
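A rough sketch of that entropy check in Python, under loud assumptions: the "image" is a flat list of 8-bit values, the cover is a synthetic gradient rather than a real photo, and entropy is measured over groups of 8 consecutive bits from one bit plane. The names `plane_entropy`, `cover` and `stego` are made up for illustration; this is not what CyberChef does internally.

```python
import math
import random
from collections import Counter

def shannon_entropy(symbols):
    """Shannon entropy, in bits per symbol, of a sequence of hashable items."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def plane_entropy(pixels, bit):
    """Entropy of one bit plane (0 = LSB), measured over groups of 8 bits."""
    bits = [(p >> bit) & 1 for p in pixels]
    groups = [tuple(bits[i:i + 8]) for i in range(0, len(bits) - 7, 8)]
    return shannon_entropy(groups)

# Synthetic smooth cover (an assumption, not a real photo): a slow gradient.
cover = [(i // 40) % 256 for i in range(8000)]

# Simulate LSB embedding of encrypted data by overwriting LSBs with random bits.
rng = random.Random(0)
stego = [(p & 0xFE) | rng.getrandbits(1) for p in cover]

print("cover LSB plane:", plane_entropy(cover, 0))
print("stego LSB plane:", plane_entropy(stego, 0))
print("stego bit 4    :", plane_entropy(stego, 4))
```

On this toy data the stego LSB plane comes out far more random than both the untouched cover and the higher bit planes of the same image. A real noisy photo would give a much smaller gap.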


You can also break an image down into layers based on bit position. Static is a clear sign of random and therefore likely encrypted or compressed data.
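For anyone who wants to try the bit-plane split without an image library, here is a minimal sketch assuming 8-bit grayscale values in a flat list. `bit_plane` and `render_rows` are hypothetical helper names, and the gradient is synthetic.

```python
def bit_plane(pixels, bit):
    """Return the 0/1 value of one bit position (0 = LSB, 7 = MSB) per pixel."""
    return [(p >> bit) & 1 for p in pixels]

def render_rows(plane, width):
    """Crude ASCII view of a bit plane: '#' for 1, '.' for 0."""
    return [
        "".join("#" if b else "." for b in plane[i:i + width])
        for i in range(0, len(plane), width)
    ]

# A 32-pixel-wide horizontal gradient, 4 rows tall (synthetic test data).
width = 32
gradient = [x * 8 for _ in range(4) for x in range(width)]

for bit in (7, 0):
    print(f"bit {bit}:")
    for row in render_rows(bit_plane(gradient, bit), width):
        print(" ", row)
```

The high planes show clean structure (the MSB is just a threshold). Visible static in a low plane of an otherwise smooth image is the thing to look for.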


What if the image is a noisy photo?


The end result of encryption is a completely random string of bits. If you replaced an image's raster with a properly sized blob of encrypted data (0 padding could be used pre-encryption to get the right size), you would get an image of pure static.

So if you randomly generated a stream of bits and put that in a photo, that would not be differentiable from an encrypted blob inside of a photo data structure.

That means the question is where do you hide a random string of 1's and 0's without it being obvious there is a string of 1's and 0's. The related question to a noisy photo is, is there structure to the noise? Random 1's and 0's is the opposite of structure. So trying to de-steganogra-fy a photo means looking for absence of structure.

This leads to the observation you are making, which is that the photo you choose has a great effect on the effectiveness of the steganography. If you photo-shopped together an image of a TV displaying static, you could put an encrypted blob in the static with all bits and have a cohesive photo. If you were to hide a payload in a completely black (#000000) picture, any strategy used would likely be obvious.

Steganography is very much "security through obscurity" compared to encryption which is security. The nature of encryption means that steganography is primarily about plausible deniability rather than being technically effective.


Maybe you could compute the spectrum or some other characteristic of the noise and compare it to the type of noise you would expect from a camera.

I guess you can always mimic these things to some degree if you are determined enough though.


Yeah, this is what you do, for example re-encode the image with Gray code (avoids the Hamming cliff), you split the bit-planes into 8x8 blocks, compute the entropy within them and encode the data for that block in such a way that it has same/very similar entropy, re-encode from Gray back into normal binary.

See https://web.archive.org/web/20120905034757/http://www.eece.m...
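To make the "Hamming cliff" part concrete, here is a small sketch of just the Gray-code conversion step (not the 8x8 block entropy matching, which is left out). The function names are illustrative.

```python
def to_gray(n):
    """Convert a non-negative integer to its reflected binary Gray code."""
    return n ^ (n >> 1)

def from_gray(g):
    """Invert to_gray by XOR-ing together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

# 127 -> 128 flips all 8 bits in plain binary but only 1 bit in Gray code.
print("binary:", hamming(127, 128))
print("gray:  ", hamming(to_gray(127), to_gray(128)))
```

Because adjacent intensities always differ by one bit in Gray code, a small brightness change no longer scrambles every bit plane at once, which is why the bit planes are split after the conversion.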


That too, say, 3-state modulation. Ethernet uses extra degree of freedom to increase entropy, but it can decrease entropy too, create a fake bias and approximate desired statistics. As a bonus stretch the key the second time and use it as a running bias for the modulation.


> CTFs/creepypasta puzzles

Where can I find games like that?


Nice tool.

As semi-related trivia, Pico-8 stores its "cartridges" using this technique so the carts are .PNG files.

https://www.lexaloffle.com/pico-8.php


On a side note: Pico-8 is such an awesome platform. I'm truly impressed.


This form of steganography, of course, can't survive virtually any image transformation or format conversion and thus can only be used for limited cases. PICO-8's .p8.png format is one such example [1].

[1] https://pico-8.fandom.com/wiki/P8PNGFileFormat


I always wondered how many of the images on eBay were actually clandestine underworld spy messages. The equivalent of the crossword puzzles during WW2. Ok probably not many but you never know. :)


Not spy messages, but steganography has been used for malvertizing campaigns, hiding malicious code inside ad images to prevent automated detection by ad networks. See for example https://www.proofpoint.com/us/threat-insight/post/massive-ad....


Have you seen any of these images? If so, then yes they were clandestine spy messages. https://vault.fbi.gov/ghost-stories-russian-foreign-intellig...


When the Internet was fast enough, image forums were such a novelty it was common to hoard images/memes/wallpapers for later use. I've always wondered what kind of steganographic messages I unknowingly saved in my old files.


nice. to anyone else curious about the difference between the two images in the repo, here's the difference and then equalized: http://telnet.asia/diff.png


What happens if you send a photo generated by this tool through something like Facebook or WhatsApp where they might re-encode the file to save on storage/bandwidth costs?


The concept is lossless vs lossy compression. Facebook is known to do all manner of lossy transformations on images, while WhatsApp likely (definitely if you believe it is e2e encrypted) would just transfer the bits.

Re-sizing would trash the payload, but any type of conversion that losslessly preserves the raster (the matrix of pixels) would likely still be recoverable.
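A toy illustration of why that is, assuming plain LSB embedding over a flat list of 8-bit values. This is a sketch and not the submitted tool's code (which also encrypts the payload); `embed_lsb`, `extract_lsb` and `lossy_quantize` are hypothetical names, and the quantizer is only a crude stand-in for lossy re-encoding.

```python
def embed_lsb(pixels, payload):
    """Write payload bits (MSB first) into the LSB of successive pixels."""
    bits = [(byte >> (7 - i)) & 1 for byte in payload for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("cover too small for payload")
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return out

def extract_lsb(pixels, n_bytes):
    """Read n_bytes back out of the pixel LSBs."""
    out = bytearray()
    for b in range(n_bytes):
        byte = 0
        for p in pixels[b * 8:(b + 1) * 8]:
            byte = (byte << 1) | (p & 1)
        out.append(byte)
    return bytes(out)

def lossy_quantize(pixels, step=4):
    """Crude stand-in for lossy compression: round values down to a multiple of step."""
    return [(p // step) * step for p in pixels]

cover = [(i * 7) % 256 for i in range(64)]
stego = embed_lsb(cover, b"hi")

lossless_copy = list(bytes(stego))   # raster preserved exactly
lossy_copy = lossy_quantize(stego)   # low bits destroyed

print(extract_lsb(lossless_copy, 2))
print(extract_lsb(lossy_copy, 2))
```

Any transform that changes the low bits of pixel values, which resizing and JPEG-style recompression both do, wipes the payload in the same way.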


> whatsapp likely (definitely if you believe it is e2e encrypted) would just transfer the bits

They could still compress images on the client side before sending the message. Dunno if they do or not, didn’t check. Just saying that E2EE does not prevent them from doing so.


They do, you can choose a level but they still do. Document sending is uncompressed though, but won't generate a thumbnail in the chat


WhatsApp compresses the image/video on the device before sending.

Example video 13.2MB => 7.2MB


you can choose to send as document/uncompressed.


I've wondered whether this could be addressed by making the pixel adjustments in the form of a QR code. Much lower data storage, but much better chance of surviving image transforms thanks to the built-in resilience and redundancy.


Bye bye data, probably.


Or something like Amazon Photos which can store RAWs...


Kind of a fun barely-related historical script kiddie note, but due to the way the jpeg and zip file formats work, you can store both in the same file, then just change the file extension from one to the other and it will work correctly.

The "Dangerous Kitten" pack of hacking tools was famously shared this way on early image boards: https://web.archive.org/web/20110902044711/http://partyvan.i...

Nowadays most image parsing is done via libraries that will strip out extraneous info to prevent this, but back in the day when most people would roll their own code for this (or copy paste poorly thought out implementations) this was commonly, if unintentionally, supported.


Nothing has fundamentally changed, you can still upload images containing file archives to most social media services, if you pay attention to the relevant file format specs.


Don't most social sites reencode/compress everything? I would expect this to get inadvertently broken as everyone tries to save space/bandwidth.


It depends on the site, but Twitter for example will leave a file untouched if its heuristics decide the file is already well compressed. Good compression is expensive, and in many cases more so than the bandwidth you'd save.


Can confirm. I've been stuffing configuration files inside images and storing them on Twitter for a couple years now; they remain untouched, and any newly-booted VM on my home network can pull the files down via API, strip the payload out of the image, put it into place as a config file, and cycle the service, no problems. Works great. All of this was born out of an idea to get something 'useful' out of twitter, rather than "perpetual doom scrolling to find something to be upset about". Now I never touch their GUI, and they are essentially an offsite Puppet repository for me at this point.

I can also confirm that Facebook strips everything out of images, rendering them useless for this purpose. Instagram does the same (not surprisingly).


I know it's cliche to say here on HN - but this kind of comment is why I love this site. Silly, unexpected ways to use technology are so satisfying to me.


Nice work! Steganography is always one of the topics I introduce at the end of CS1, since they've learned all the bits and pieces on how to do it. It's fun to watch their minds get blown over something like this and go "you can DO that?!"


As another example I have an extremely easy to understand and well commented version of a similar technique from about 10 years ago

https://github.com/kristopolous/piggypack


This is really cool.

However, LSB is the most famous method of doing stego, so if you actually have something to hide this is probably not a good method, as everyone knows about this method.


Really depends on what your threat model is, no? To anyone who doesn't even know whether there is something to look for, it's definitely far less suspicious than a TrueCrypt container.


I'd say it's good enough. An adversary would need to know the file the secret is hidden inside, what method was used, probably would need the unaltered container file to compare with to extract the bits, and then has to overcome the additional layer of encryption.

I didn't check the code corresponding to the submitted article, but I'd guess it would be infeasible even for an advanced or even state adversarial actor to automatically check lots and lots of images (unless there is some kind of "signature" that can be detected) just in case there might be some LSB-based steganographically hidden secret inside.

If the adversary is doing a targeted attack against an individual/group of individuals, it might be more feasible, but then I'd wonder if other means of attack, including the good old monkey-wrench-to-the-knee aren't still more efficient.


> unless there is some kind of "signature" that can be detected

There is.

You're right that the encryption probably can't be overcome. The risk is more that it is detected and then the adversary beats you up until you hand over the password.


Combine with stable diffusion and you don’t even need to bring your own image to the table. Plus there’s no public reference that exists to compare against.


This is not much steganography and by ditching steganography completely you could make it so that the file is both a valid image and a valid zip. At least that would make it easy to extract the data.

Needless to say that too has already been done many times in the last 20 years.


Stenography is quick writing using a vowel-/tone-based script usually used to record the spoken word. You are thinking about steganography, which is hiding something inside other data.

The ZIP thing is a nice trick, but by now is being checked for regularly when computer forensics come into play. Also, I would argue that it is not actually steganography, for it does not hide itself.


That's just about "hiding" being a relative word. "ZIP thing" also can be considered hiding, if you are absolutely clueless. And if we speak about "forensics" — that toy implementation is about as much "hiding" as the ZIP thingy. For more or less solid tool somebody already mentioned steghide here, as well as the reasons why the OP's script doesn't really hide anything, if we are being meticulous.


Auto-correction wins again.

Yes, both are easily detected anyway, hence I said "ditching" it would allow making it a zip file.


If a JPEG can store data, can that data (code) also be run upon opening the JPEG? This could be really disturbing.


Generally speaking, no. While theoretically a JPEG could be used to STORE a virus, the chances of that code actually running on your computer are almost nil. (Provided you don't do silly things such as clicking on a file named "img.jpeg.exe" or what have you.) Of course from time to time in the past there have been bugs. Some of the early Windows versions for example. AFAIK most (if not all) JPEG decoders in common use today have already ironed out those issues.


JPEG as a format allows you to add extra data to the end. ZIP as a format allows you to add extra data to the beginning. So it's probably easy to make a JAR-JPEG polyglot.

Generally though, even if you make an executable JPEG, the thing stopping this is that normally you look at the extension to decide how to interpret a file, so when you see a .jpg ending, you don't try and execute.
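A quick way to see the append trick in action with only Python's standard library. Big caveat: `fake_jpeg` is a placeholder made of just the JPEG start and end markers with filler in between, so it is not a decodable image. It stands in for real JPEG bytes, and `secret.txt` is a made-up member name.

```python
import io
import zipfile

# Placeholder for real JPEG bytes: SOI marker, filler, EOI marker.
fake_jpeg = b"\xff\xd8" + b"\x00" * 32 + b"\xff\xd9"

# Build a small ZIP archive in memory.
zip_buf = io.BytesIO()
with zipfile.ZipFile(zip_buf, "w") as zf:
    zf.writestr("secret.txt", "hello")

# Concatenate: image bytes first, archive bytes after.
polyglot = fake_jpeg + zip_buf.getvalue()

# ZIP readers locate the central directory from the end of the file,
# so the leading image bytes are tolerated as prepended data.
with zipfile.ZipFile(io.BytesIO(polyglot)) as zf:
    content = zf.read("secret.txt")

print(polyglot[:2], content)
```

With a real JPEG in place of the placeholder, an image viewer would stop at the end-of-image marker and a ZIP tool would read from the back, which is the whole trick.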


You could store the bytes of an EXE inside an image, just like how I could paste the base-64 of an EXE on the end of this message. The trick is going from that to actually running it.

TVoKSGVsbG8gdGhlcmUuClRoYW5rIHlvdSBmb3IgcmVhZGluZyBteSBlbmNvZGVkIG1lc3NhZ2Uu ClRoZSBiZXN0IHdvcmQgaW4gdGhlIEVuZ2xpc2ggbGFuZ2FnZSBpcyBSdXRhYmFnYS4KQ2hhbmdl IG15IG1pbmQ=
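For the curious, decoding that blob is harmless, since the result is only bytes until something decides to execute them. A minimal sketch using the standard `base64` module (the variable names are mine):

```python
import base64

# The string from the comment above, whitespace removed by concatenation.
encoded = (
    "TVoKSGVsbG8gdGhlcmUuClRoYW5rIHlvdSBmb3IgcmVhZGluZyBteSBlbmNvZGVkIG1lc3NhZ2Uu"
    "ClRoZSBiZXN0IHdvcmQgaW4gdGhlIEVuZ2xpc2ggbGFuZ2FnZSBpcyBSdXRhYmFnYS4KQ2hhbmdl"
    "IG15IG1pbmQ="
)

decoded = base64.b64decode(encoded)

# "MZ" is the magic number at the start of DOS/Windows executables.
has_exe_magic = decoded[:2] == b"MZ"

print(has_exe_magic)
print(decoded.decode("ascii"))
```

The check on the first two bytes is the same kind of sniffing a file-type detector does. Nothing here runs the payload.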


yea I'm not going to run that trojan you just posted here


Oh go on.


Only if there is a flaw in the jpeg decoder


I love this kind of stuff, if I ever would use security by obscurity this is probably the way I would do it :)


When I was a teenager we hid games on DOS on the school's computers by just changing the name and extension.


The author claims it is possible to detect an image having some possible data. How does one detect this?


The LSBs of a typical photograph are not uniformly random. For example, here's the blue channel LSBs of the input image from the repo: https://i.imgur.com/lOVlcpU.png

If they are uniformly random, then you know something is up.


This image is clearly showing JPEG artifacts, you'd never see such square blocks in a photograph that has not been compressed with a lossy algorithm.


I should have clarified, I'm not talking about the JPEG artifacts, I'm talking about everything else.

The faint outlines of objects you can see, the varying textures, and the areas of clipping (where the source-brightness was either above 255 or below 0).

There are other statistical correlations not visible in that image - correlations between the different channels, and between the different bits within a channel.

If I showed you the most significant bit of some non-JPEG'd image, you could obviously see that it's non-random (since it'd essentially be a threshold function). If I showed you the second-most significant bit, it would again be non-random, but perhaps less obviously so. As you go through the bits, it starts looking more and more random, but there are still going to be statistical tests you can do to distinguish from true uniform random bits.
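One classic test of this kind is the "pairs of values" chi-square: overwriting LSBs with random bits tends to equalize the histogram counts of each value pair (2k, 2k+1). Below is a rough sketch on synthetic data. The Gaussian "cover" is an assumption standing in for a real photo, `pairs_of_values_chi2` is an illustrative name, and no p-value is computed.

```python
import random
from collections import Counter

def pairs_of_values_chi2(pixels):
    """Chi-square statistic comparing counts of 2k against the mean of (2k, 2k+1)."""
    hist = Counter(pixels)
    chi2 = 0.0
    for even in range(0, 256, 2):
        a, b = hist[even], hist[even + 1]
        expected = (a + b) / 2
        if expected > 0:
            chi2 += (a - expected) ** 2 / expected
    return chi2

rng = random.Random(1)

# Synthetic cover: values clustered around 100, so neighbouring bins differ.
cover = [min(255, max(0, int(rng.gauss(100, 3)))) for _ in range(20000)]

# Simulated LSB embedding: replace every LSB with a random bit.
stego = [(p & 0xFE) | rng.getrandbits(1) for p in cover]

print("cover chi2:", round(pairs_of_values_chi2(cover), 1))
print("stego chi2:", round(pairs_of_values_chi2(stego), 1))
```

A statistic that is suspiciously small means each pair's counts are nearly equal, which is what uniformly random LSBs produce and what an unmodified photo usually does not.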


JPEG is super common. Having jpeg compression artifacts, even in images that have been converted to a different format, is not something that will raise eyebrows.


usually by the fact that a 200x200 jpg is 15MB


It would be interesting to use this to embed tags for an AI to inference.


Getting errors when building it on my machine, macOS 12.5, Intel.


Looks like there's already a pull request to fix building under macOS: https://github.com/7thSamurai/steganography/pull/1


Just merged it!


How do you install this?


Currently it's only for Linux and Mac users. Check out the building section at https://github.com/7thSamurai/steganography#building to learn how to compile it.


Cool, but least-significant-bit steganography is kind of old now. It would be nice to see hiding data in the content or meaning of images, probably using Stable Diffusion.


Will we get to "hide <secret info x> in 3 red balloons floating over a waterfall"?


The steganography aspect does not seem very robust. It significantly alters the distribution of the LSBs of pixel values of typical images.

In contrast steghide[1] encodes the data by swapping pixels within the image. This leaves the distribution of pixel values the same, avoiding this channel of detection.

[1] https://steghide.sourceforge.net/index.php
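To show why swapping leaves the value distribution untouched, here is a toy scheme that stores one bit in the order of each pixel pair. This is explicitly NOT steghide's algorithm, only an illustration of the histogram-preserving property; the function names are hypothetical and swapping neighbours like this would be visually crude on a real image.

```python
from collections import Counter

def embed_by_swapping(pixels, bits):
    """Encode each bit in the order of a pixel pair: ascending = 0, descending = 1."""
    out = list(pixels)
    remaining = list(bits)
    for i in range(0, len(out) - 1, 2):
        if not remaining:
            break
        a, b = out[i], out[i + 1]
        if a == b:
            continue  # equal values cannot carry a bit
        bit = remaining.pop(0)
        lo, hi = min(a, b), max(a, b)
        out[i], out[i + 1] = (lo, hi) if bit == 0 else (hi, lo)
    if remaining:
        raise ValueError("cover too small for payload")
    return out

def extract_by_order(pixels, n_bits):
    """Read bits back from the order of each unequal pixel pair."""
    bits = []
    for i in range(0, len(pixels) - 1, 2):
        if len(bits) == n_bits:
            break
        a, b = pixels[i], pixels[i + 1]
        if a != b:
            bits.append(0 if a < b else 1)
    return bits

cover = [10, 200, 33, 33, 7, 5, 90, 91]
stego = embed_by_swapping(cover, [1, 0, 1])

print(stego)
print(extract_by_order(stego, 3))
print(Counter(stego) == Counter(cover))
```

Since pixels are only moved and never altered, any test based purely on the histogram of values sees nothing, unlike the LSB-overwrite approach.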


[flagged]


Yeah, I just made this as a fun proof-of-concept, I just figured that other people might also want to play around with it!



