This week people have Opinions on Ledger. A lot of people have expressed betrayal, disbelief, and doubt. While I understand this (people often react negatively to having magical thinking stripped away), I think it’s a little unfounded. Unlike a backdoor or a hidden software flaw, you can deduce from first principles how a HW Wallet must work, and why. People simply chose not to think about it very hard.
Now that I’m done insulting you[1], dear reader, let’s dig into the misconception.
People seem to believe that a HW wallet exists to store a key in a Secure Element. This is wrong. The purpose of a hardware wallet is to make key usage safer by securely communicating with the user. In order to achieve this, it has exactly two functions:
Communicate to the user what they are signing, in order to get permission to sign
Operate on the key in order to create the signature
This purpose informs the technical design of HW wallets. They aim to prevent asset theft by communicating transaction details to users from a trusted interface. Twitter users (incorrectly) assumed that they aim to prevent key exfiltration from the device under any circumstances. For the purposes of the HW wallet security model, signing arbitrary transactions without consent is equivalent to key compromise. The goal of the device is to create informed consent in the signing process. Preventing exfiltration is a secondary goal.
The screen must be trusted
When a user is interacting with a dapp, a compromised webapp can lie about transaction details. For example, it could claim a transaction sends 0.1 ETH, while actually taking 100 ETH. To prevent these sorts of lies, the MetaMask extension introduced a popup. The popup exists outside the webapp, and cannot be compromised the same way.
The popup parses the transaction details, and gives users a reliable view of the transaction even if the webapp that produced it is compromised. The user consents to the transaction via this more-trusted interface, not via the webapp. This provides an extra layer of user-protection by moving the user interaction to a more-trusted context. It is critical that all of the following are done in the more-trusted interface.
Transaction details parsing.
The interface must understand the transaction, so that it can display it. It cannot accept the (potentially faulty) dapp’s input blindly. The trusted context must check the details for itself.
Transaction details display.
The interface must display transaction details to the user, so that a compromised application cannot show incorrect information (lie).
User consent.
The acceptance button must be in the more-secure interface. It cannot trust user consent passed in from the (possibly evil) dapp.
These steps must be managed by trusted code. Trusted code runs in the MetaMask extension and popup. Untrusted code runs in the web page. By ensuring that trusted code handles the entire communication procedure, we ensure that the user is correctly informed about the transaction they’re signing. I leave the design of this communication experience to people better qualified than me.
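To make the three steps concrete, here is a minimal sketch of what the trusted side does. Everything in it is an assumption for illustration: the 52-byte wire format, the names `parse_transaction`, `describe`, and `trusted_sign`, and the `confirm` and `sign` callbacks that stand in for the physical buttons and the key operation. It is not how MetaMask or any real wallet is implemented. The point is only that the prompt is built from the bytes being signed, and the dapp's own summary is never shown.

```python
from dataclasses import dataclass
from typing import Callable, Optional

WEI_PER_ETH = 10**18


@dataclass(frozen=True)
class Transaction:
    to: str
    value_wei: int


def parse_transaction(raw: bytes) -> Transaction:
    """Parse a toy wire format (hypothetical): 20-byte recipient + 32-byte big-endian value."""
    if len(raw) != 52:
        raise ValueError("malformed transaction")
    return Transaction(to="0x" + raw[:20].hex(), value_wei=int.from_bytes(raw[20:], "big"))


def describe(tx: Transaction) -> str:
    """Build the prompt from the parsed transaction only."""
    whole, frac = divmod(tx.value_wei, WEI_PER_ETH)
    amount = f"{whole}.{frac:018d}".rstrip("0").rstrip(".")
    return f"Send {amount} ETH to {tx.to}"


def trusted_sign(
    raw: bytes,
    dapp_summary: str,                 # whatever the (possibly evil) dapp claims; never displayed
    confirm: Callable[[str], bool],    # stand-in for the trusted screen and buttons
    sign: Callable[[bytes], bytes],    # stand-in for the key operation
) -> Optional[bytes]:
    tx = parse_transaction(raw)        # 1. parse in the trusted context
    prompt = describe(tx)              # 2. display what the bytes actually say
    if not confirm(prompt):            # 3. consent is collected here, not passed in
        return None
    return sign(raw)


if __name__ == "__main__":
    # The dapp claims 0.1 ETH, but the bytes say 100 ETH.
    raw = bytes.fromhex("ab" * 20) + (100 * WEI_PER_ETH).to_bytes(32, "big")
    trusted_sign(raw, "Send 0.1 ETH", lambda p: print(p) or False, lambda b: b"signature")
```

Note that `dapp_summary` is accepted and then ignored. That is the whole trick: the untrusted side can say whatever it likes, and the user still sees the 100 ETH.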
A HW Wallet is a physical version of the MetaMask popup. It separates the trusted communication platform from the user’s computer, which may be compromised. It ensures that a compromised machine is not sufficient to steal from the user. HW wallets provide a trusted display. As such, any program that can display information on the HW Wallet screen must be trusted, otherwise the benefit disappears.
The screen must be trusted. The buttons must be trusted. The code that drives them must be trusted. Hardware wallets exist to securely obtain the user’s consent. The apps that you install on the device, which drive the screen, are trusted code. If you don’t trust the code to control your keys, why are you installing it on the device that controls your keys?
Y’all are getting upset that trusted code has access to key material. Which, ya, is silly.
Upgradability is Necessary
Now let’s suppose for a second that only the secure element controlled the screen. In this world, only the secure element can operate on the keys, and keys are never exposed to apps. This means that the transaction parsing and signing processes (RLP transaction formats for Ethereum; transaction parsing and sighash for Bitcoin) run in the secure element as well, to allow for secure display of transaction details.
What do we do when Ethereum adds a new transaction type? Or when we want to support new chains?[2]
Transactions change over time, and signing procedures change over time. As such, a HW wallet that cannot upgrade the code that operates on the key material becomes useless over the course of a few years. If Ledger didn’t allow upgrades to logic operating on the key, it could not have added Cosmos or EIP-1559 or EIP-2718 or Taproot support or Ethereum Validator keys. This would be bad for users.[3] We would all be hanging around overpaying gas because we didn’t want to spend $100 to upgrade to an EIP-1559-compatible Ledger. 💀
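Here is a rough sketch of why a frozen parser ages badly. Under EIP-2718, a typed transaction starts with a type byte in the range 0x00 to 0x7f, while a legacy transaction is an RLP list and so starts with a byte of 0xc0 or higher. The `SigningApp` class, the `LEGACY` sentinel, and the stub parsers are all hypothetical names I made up for illustration, and the parsers don't decode anything real. This is not Ledger's code.

```python
from typing import Callable, Dict

Parser = Callable[[bytes], str]

LEGACY = -1  # hypothetical sentinel for pre-EIP-2718 (RLP list) transactions


class UnsupportedTransaction(Exception):
    """Raised when the trusted code cannot parse, and so cannot display, a transaction."""


def transaction_type(raw: bytes) -> int:
    """Classify a serialized Ethereum transaction by its first byte (EIP-2718)."""
    if not raw:
        raise UnsupportedTransaction("empty transaction")
    first = raw[0]
    if first >= 0xC0:   # RLP list prefix: legacy transaction
        return LEGACY
    if first <= 0x7F:   # typed envelope: type byte followed by payload
        return first
    raise UnsupportedTransaction(f"invalid leading byte {first:#x}")


class SigningApp:
    """Toy model of the code that parses transactions before signing."""

    def __init__(self, parsers: Dict[int, Parser]):
        self._parsers = dict(parsers)

    def describe(self, raw: bytes) -> str:
        tx_type = transaction_type(raw)
        parser = self._parsers.get(tx_type)
        if parser is None:
            # Refusing is the only safe option: we cannot display what we cannot parse.
            raise UnsupportedTransaction(f"no parser for transaction type {tx_type}")
        return parser(raw)

    def upgrade(self, tx_type: int, parser: Parser) -> None:
        """Model an app update that adds support for a new transaction type."""
        self._parsers[tx_type] = parser


if __name__ == "__main__":
    app = SigningApp({LEGACY: lambda raw: "legacy transaction"})
    eip1559 = bytes([0x02]) + b"placeholder payload"  # 0x02 is the EIP-1559 type byte
    try:
        app.describe(eip1559)
    except UnsupportedTransaction as err:
        print("before upgrade:", err)
    app.upgrade(0x02, lambda raw: "EIP-1559 transaction")
    print("after upgrade:", app.describe(eip1559))
```

A device whose parser table is burned in at the factory stays stuck at the "before upgrade" branch forever. The `upgrade` method is exactly the capability people are upset about.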
Ledger also (helpfully) tells you when you’re sending a well-known ERC-20 token. This requires regular code updates to the transaction signing flow so that the device knows token addresses and can securely show token names to users. Without upgrades to the signing flow, securely communicating this would be impossible.
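As a sketch of what that involves: an ERC-20 `transfer(address,uint256)` call is the 4-byte selector `0xa9059cbb` followed by two 32-byte ABI words (the recipient, left-padded, and the amount). To show a token name and a sensible decimal amount, the trusted code needs a table mapping contract addresses to symbols and decimals. The table below, the token `EXMPL`, its placeholder address, and the function names are all made up for illustration. I'm not describing Ledger's actual data format.

```python
TRANSFER_SELECTOR = bytes.fromhex("a9059cbb")  # selector for transfer(address,uint256)

# Hypothetical token table: contract address -> (symbol, decimals).
# A real device would need this data shipped to it in updates.
KNOWN_TOKENS = {
    "0x" + "11" * 20: ("EXMPL", 6),
}


def format_units(amount: int, decimals: int) -> str:
    """Render an integer token amount as a decimal string."""
    if decimals == 0:
        return str(amount)
    whole, frac = divmod(amount, 10**decimals)
    return f"{whole}.{frac:0{decimals}d}".rstrip("0").rstrip(".")


def describe_call(to: str, calldata: bytes) -> str:
    """Describe a contract call for the trusted display, if we know how."""
    token = KNOWN_TOKENS.get(to.lower())
    is_transfer = len(calldata) == 68 and calldata[:4] == TRANSFER_SELECTOR
    if token is None or not is_transfer:
        # Without the table (or a parser for this call), all we can show is raw data.
        return f"Unrecognized contract call to {to}"
    symbol, decimals = token
    recipient = "0x" + calldata[16:36].hex()          # last 20 bytes of the first ABI word
    amount = int.from_bytes(calldata[36:68], "big")   # second ABI word
    return f"Send {format_units(amount, decimals)} {symbol} to {recipient}"


if __name__ == "__main__":
    calldata = (
        TRANSFER_SELECTOR
        + bytes(12) + bytes.fromhex("22" * 20)
        + (2_500_000).to_bytes(32, "big")
    )
    print(describe_call("0x" + "11" * 20, calldata))
    print(describe_call("0x" + "33" * 20, calldata))
```

The second call uses the same calldata against an address that is not in the table, and the best the display can do is admit it doesn't know what it's looking at. New tokens launch constantly, so that table has to change.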
Y’all claim to want a hardcoded secure element that can never have its behavior changed. Okay. Just never upgrade your wallet firmware or apps again and see how you feel. I bet you’ll feel pretty silly.
Aaaaanyway
Basically, the whole point here is that you can’t eat your cake, and also have it secured in a bank vault 24/7. The HW wallet must communicate to users; tx parsing and signing need to be trusted. The HW wallet must be useful to the user; tx parsing and signing need to be upgradable. Trusted code cannot be static.
To resolve this tension between upgradability and security, Ledger chose to extend the ring of trust to include all application code. This is reasonable. It’s also what every other HW wallet I’ve investigated does. The trust barrier is at the USB cable. Everything on the HW Wallet end is trusted[4]. Everything on the computer end is untrusted.
I find this so frustrating because people could just sit down and think about how it works, and deduce the information above. This isn’t some grand fraud.[5]
Now that people are thinking about it, I hope someone proposes a new design that preserves secure communication AND upgradability. Then I hope that person brings a new HW wallet to market. Either that or you all shut up about it. One of the two. Whatever. Your choice.
[1] I’m not done. I lied.
[2] For anyone looking for an answer to this hypothetical, it’s “we manufacture a whole new hardware wallet and make users buy it again.”
[3] The user would have to buy a new device at every Ethereum hard fork!
[4] Frankly I’m kinda astonished that people apparently thought they could just load untrusted code onto their trusted device without taking risks. Like come on y’all. That’s goofy.
[5] Yes, the Ledger Twitter misrepresented the facts. This is undoubtedly shitty.
No, I haven’t seen evidence of Ledger technical leadership lying about it or concealing the capability. Please let me know if you have some as I’ll happily revise my opinion.
Yes, I despise when developers lie to their users about their shitty code.
No, I don’t think Ledger put in a backdoor.
Yes, I think they ought to have communicated the trust model better.
No, this footnote has not gone on too long.
Yes, I am stretching the joke to its breaking point.
No, I won’t stop.