Espressif ESP32: Breaking HW AES with Power Analysis (2023) (raelize.com)
99 points by transpute on Feb 10, 2024 | 25 comments


To what degree does the unknown key extraction rely on the knowledge of exactly when the AES engine is engaged? They use a trigger (on pin 26 of the ESP32) to tell their acquisition hardware when the AES engine is running to zoom in on the sample that carries the most information about the key bits.

In the 'unknown key' example that trick still seems to be in use. The text isn't very clear on that point, at least not to me, but in the code the pin is clearly still toggled around the AES encryption call. In a real-world attack you wouldn't have that information; you'd first have to find some way of determining when the AES engine is operational, and that would make this attack a lot harder.


When you look at a power trace of a whole AES operation, it's usually quite obvious when it starts and stops - you can see a pattern that repeats for each round. If you acquire a slightly larger window, you can align your traces after you've captured them (by shifting them until the average difference is small). This is quite doable but it is a whole extra step, so a trigger pin is always preferable.
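
A minimal sketch of that post-capture alignment step, assuming synthetic data rather than real ESP32 traces. The function name `align_offset`, the 40-sample reference pattern, and the noise levels are all made up for illustration; the idea is only to slide a reference window over a longer capture and keep the shift with the smallest mean absolute difference.

```python
import random


def align_offset(reference, trace, max_shift):
    """Return the shift at which `trace` best matches `reference`,
    judged by the smallest mean absolute difference."""
    n = len(reference)
    best_shift, best_cost = 0, float("inf")
    for shift in range(max_shift + 1):
        window = trace[shift:shift + n]
        if len(window) < n:  # ran off the end of the capture
            break
        cost = sum(abs(a - b) for a, b in zip(reference, window)) / n
        if cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift


# Synthetic demo: a hypothetical 40-sample "AES pattern" buried in a longer capture.
rng = random.Random(0)
pattern = [rng.gauss(0, 1) for _ in range(40)]
true_shift = 17
trace = (
    [rng.gauss(0, 0.1) for _ in range(true_shift)]      # idle noise before the operation
    + [p + rng.gauss(0, 0.05) for p in pattern]         # the operation, slightly noisy
    + [rng.gauss(0, 0.1) for _ in range(30)]            # idle noise afterwards
)

found = align_offset(pattern, trace, max_shift=40)
print("estimated shift:", found)
```

In practice the reference would be one captured trace (or a running average), and each new trace would be shifted by its own estimated offset before analysis.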

The ChipWhisperer Husky has an "analog pattern" trigger mode[1], which can help with this sort of thing.

[1] https://rtfm.newae.com/Capture/ChipWhisperer-Husky/


Does that still hold if you have no idea what's running on the device? Or does that just help to narrow down the blocks of samples to test to a small enough section that you can brute force the remainder?


I'd say for it to be practical, you'd need to know the rough time window (within seconds) that the device is doing cryptography within. For example, if it's got an encrypted bootloader, you can reliably expect it to start decrypting stuff shortly after power on.


I can see how auto correlation might work on the signal traces if you have enough of them.
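
As a rough illustration of the autocorrelation idea, here is a sketch on synthetic data. The per-round shape, the round length of 12 samples, the 10 repetitions, and the noise level are all assumptions chosen for the demo, not measurements from the article. A repeating round pattern shows up as an autocorrelation peak at a lag equal to the round length.

```python
import random


def autocorr(x, lag):
    """Normalised autocorrelation of the sequence `x` at a given lag."""
    mean = sum(x) / len(x)
    num = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(len(x) - lag))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


# Hypothetical per-round power shape: one large spike and a smaller bump.
ROUND_LEN = 12
round_shape = [5.0, 2.0] + [0.0] * (ROUND_LEN - 2)

rng = random.Random(1)
# Ten repetitions of the shape (a stand-in for ten AES rounds) plus noise.
trace = [s + rng.gauss(0, 0.2) for _ in range(10) for s in round_shape]

# The lag with the strongest autocorrelation estimates the round period.
best_lag = max(range(1, 30), key=lambda lag: autocorr(trace, lag))
print("estimated round period:", best_lag)
```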


Spotting the EM fingerprint, I guess. Why would it be hard?


It would be hard(er) because that's not what is demonstrated here. If it were easy, you wouldn't need all that scaffolding to focus on 'sample 410 of a block of samples that occurs exactly after a self-generated trigger point'. That reduces the search space considerably; in the real world you don't have that luxury. Intercepting the power line is easy, but spotting the EM signature of the AES engine in a bunch of software is arguably a harder problem than the one they've solved here. My guess is that if it were that easy they would have shown just that, and the pin 26 magic wouldn't be there.

In essence, as far as I can follow the article what they've done is to first identify a sample in a block of samples taken after their own trigger goes off that has a high likelihood of leaking bits from the key by using a known key and running analysis to verify that that sample is the one to focus on. They then proceed to monitor just that one sample to retrieve an unknown key (this had a very high likelihood of working).

But that's - to me - the same as knowing exactly where in the haystack the needle is buried and then proving it by uncovering the needle. Whether the needle was known or unknown doesn't matter any more at that point; it's simply a way of proving that the extraction itself can work. The hard part is finding an unknown needle in an unknown location, and identifying the location without knowing the key would be a lot harder. The article reads as if you can extract any key from any ESP32 in a couple of seconds, but I don't think it is quite that simple. Keep in mind that normally the flash of the ESP32 would be encrypted, so you wouldn't be able to conveniently patch in a helpful trigger event around the encryption engine.
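
The two-phase procedure described in this comment can be sketched on simulated traces. This is not the article's code: the leakage model (Hamming weight of plaintext XOR key byte), the leaking sample index 7, the noise level, and the key values are all assumptions picked to keep the toy example short. Phase 1 uses a known key to find which sample correlates with the model; phase 2 looks only at that sample to recover a different, "unknown" key byte.

```python
import random


def hw(x):
    """Hamming weight (number of set bits) of a byte."""
    return bin(x).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0


def capture(key_byte, n_traces, rng, n_samples=20, leak_at=7):
    """Simulate traces: pure noise, except sample `leak_at` leaks HW(p ^ key)."""
    plaintexts, traces = [], []
    for _ in range(n_traces):
        p = rng.randrange(256)
        t = [rng.gauss(0, 1) for _ in range(n_samples)]
        t[leak_at] += hw(p ^ key_byte)
        plaintexts.append(p)
        traces.append(t)
    return plaintexts, traces


rng = random.Random(42)

# Phase 1: with a KNOWN key, find the sample that correlates with the model.
known_key = 0x3C
pts, trs = capture(known_key, 500, rng)
model = [hw(p ^ known_key) for p in pts]
leaky_sample = max(range(20), key=lambda i: abs(pearson(model, [t[i] for t in trs])))

# Phase 2: with an UNKNOWN key, test all 256 guesses at that one sample only.
secret = 0xA7  # used only to generate the simulated traces
pts, trs = capture(secret, 500, rng)
column = [t[leaky_sample] for t in trs]
recovered = max(range(256), key=lambda g: pearson([hw(p ^ g) for p in pts], column))

print("leaky sample:", leaky_sample, "recovered key byte:", hex(recovered))
```

Without phase 1, phase 2 would have to be repeated for every candidate sample position, which is the extra search the comment is pointing at.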


When attacking an encrypted flash, the idea is to trigger the sampling based on the activity of the external SPI flash signals.

This is detailed in the paper that inspired this article, or in this more recent blog post: https://courk.cc/breaking-flash-encryption-of-espressif-part...


Even if it's finding a needle in a haystack, it's still a way smaller haystack than cracking the key. Let's say you can force a decryption operation somewhere in one billion samples. You only need to try decryption with, worst case, just under a billion keys as you shift your way through the sample stream until you get something with low entropy. That's literally nothing compared to the time complexity of AES.


Ok, that makes good sense, so the 25 seconds in the article are optimistic, the billion keys you mention are a pessimistic view (a mere 800 years), and the real number would then be somewhere in between those two.
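
The "800 years" figure can be checked with a few lines of arithmetic, assuming it comes from multiplying the article's 25 seconds by one billion candidate positions. The second calculation uses a purely hypothetical rate of ten million trial decryptions per second (not a measurement) to show how much the answer depends on what one "attempt" costs.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

CANDIDATE_POSITIONS = 1_000_000_000   # "one billion samples" from the parent comment
SECONDS_PER_FULL_ATTACK = 25          # the per-key time quoted from the article

# Pessimistic reading: a full 25-second attack at every candidate position.
years_at_25s = CANDIDATE_POSITIONS * SECONDS_PER_FULL_ATTACK / SECONDS_PER_YEAR

# Optimistic reading: each candidate costs only one trial decryption.
HYPOTHETICAL_TRIALS_PER_SECOND = 10_000_000  # assumption for illustration only
seconds_at_fast_rate = CANDIDATE_POSITIONS / HYPOTHETICAL_TRIALS_PER_SECOND

print(f"25 s per position: about {years_at_25s:.0f} years")
print(f"one trial decryption per position: about {seconds_at_fast_rate:.0f} seconds")
```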

Do you account for jitter in your billion keys example or do you assume that the samples will all nicely overlay each other even absent a convenient trigger?


You would think things like spread spectrum clocking and low passing power would thwart this attack but really it just means you need to take a larger average and computers are really fast.
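
A small simulation of why a larger average helps, assuming independent Gaussian noise (a simplification; real countermeasures also introduce misalignment, which this sketch does not model). Averaging N noisy measurements shrinks the noise standard deviation by roughly the square root of N, so 100 times more traces buys about a 10 times cleaner signal. The helper name and parameters are made up for the demo.

```python
import random
import statistics


def averaged_noise_std(n_avg, trials, rng, sigma=1.0):
    """Standard deviation of the mean of `n_avg` noise samples, estimated over `trials` runs."""
    means = [
        sum(rng.gauss(0, sigma) for _ in range(n_avg)) / n_avg
        for _ in range(trials)
    ]
    return statistics.pstdev(means)


rng = random.Random(7)
std_single = averaged_noise_std(1, 2000, rng)     # no averaging
std_avg100 = averaged_noise_std(100, 2000, rng)   # average of 100 measurements
ratio = std_single / std_avg100

print(f"noise reduction from averaging 100 traces: about {ratio:.1f}x")
```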

See for example: http://www1.ece.neu.edu/~saoni/files/Chao_ICCD_2015.pdf

I still can't believe that worked.

That said, these techniques are pretty old now and vendors should be mitigating this attack.


Yes, agreed you'd need a larger average. The thing I'm trying to get at is whether or not this is a practical attack, as in: feasible in the real world, without being able to control the code running on the device. Because if you can do this to any random ESP32 without being able to manipulate the code and it coughs up the keys in a few days, weeks or even months that's an entirely different level of threat than being able to do this the way the article did it.

And I'm having a hard time figuring out how big that difference is; it may well be 'impractical today, child's play tomorrow'. And ESP32 devices are in a lot of different places. Access to the hardware should be assumed (because you're not going to be able to monitor the 3.3V line with this level of accuracy otherwise). I'd assume any caps after the monitoring point would be removed, and the only capacitance left would sit on the supply side before the current transformer. If that's your setup and you have no knowledge of what's running on the chip, is it doable or not?

The article suggests that any key can be recovered in a couple of seconds but I don't think that's the case at all.


In general: One needs to get code on the device for this attack to work.

But, in many demonstrated cases, one doesn't need to get privileged code on the device, which is an important distinction. And in other cases this type of monitoring was done without direct access to the machine, for example by examining the intensity of LEDs with a camera. Admittedly that's within eyeshot, but it's not direct access either.

For this ESP32 attack in particular, it's not clear how it would work without full control of the device.


Thank you.


AES is fast and hardware accelerated, I'd have to test but it's probably in the realm of minutes?


If you can do this without your code on the device in minutes then yes, that would be a serious problem. If it takes hundreds of years, then not so much.


I found this interesting but not actually very informative and walked away really learning nothing from having read the article submitted itself. It’s all “we used this tool” and “we followed this guide” with no explanation of theory or process. (The links provided, I’m sure, are more helpful.)

There wasn’t even a discussion on why the particular AES configuration deployed was used or how it affects the process.


You really learned nothing?

>Therefore, we decided, in similar fashion as Ledger’s scaffold, to make our own custom board where all the relevant signals are routed to dedicated pins.

Explanation or theory: someone else did it the same way.

>Throughout our research we used two acquisition techniques that are supported by our oscilloscope: normal block mode and rapid block mode. We used Picoscope’s Python bindings to communicate with the oscilloscope. We used their ps3000aRapidBlockExample.py as a reference to integrate rapid block mode into Riscure’s FiPy.

Sounds like they used what worked on their hardware.


That's because this article is more of a part 3 of a series of articles.

Part 1: https://eprint.iacr.org/2023/090 (how to break ESP32 AES with power analysis)

Part 2: https://courk.cc/breaking-flash-encryption-of-espressif-part... (how to do it for no more than $100)

Part 3: This article (improving the performance)


The chip should not be leaking this much information but the scaffolding around the attack is such that I doubt whether this could be pulled off so easily on a random ESP32 with its flash encryption enabled.


Very nice analysis. I found the setup to be pretty neat, and of course if you break a key that is distributed with all the chips ...

It is an interesting challenge to secure chips, and something I don't think anyone has a really excellent response to. Always interested in papers along this area.


I don’t think anyone uses ECB mode for anything. At least CTR mode is used when an AEAD is not required. CTR mode would defeat this attack. Still, I really like seeing how this is done.


CTR is just as vulnerable to side channels as ECB, although it might be mildly less convenient to collect the traces.


The article just mentions firmware encryption, it looks like secure boot is compromised as well?

Both features depend on HW AES keys.


Secure boot uses asymmetric crypto for authentication.



