Simple anonymous digital identity with a Merkle tree
Disclaimer: this article is a product of my own thoughts and of various well-known works, and it does not claim to be original. Leave a comment if you like; I’ll be happy to discuss anything related to this topic.
Recent experiments in cryptography-powered digital identity and distributed systems have generated excitement in the community. People have become very concerned about private data security and anonymity. Many teams are working on solutions right now (e.g. uPort and Hyperledger Indy) that let users choose which pieces of their private information they share.
In this article, I want to show you a very simple yet powerful way to implement your own digital identity mechanism. It lets you keep your sensitive information (like passport data) private and share only the small requested pieces, which the counterparty can easily verify.
Algorithm
Let’s suppose I have some legal passport data from my government: my full name, my date of birth and the address where I live. In other words, I have a structure like this:
{
  name: "Alexander Vtyurin",
  dateOfBirth: "12.12.1981",
  residence: "Lenina St. 1, Moscow, Russia"
}
To give this chunk of data the properties mentioned above, we need only two cryptographic primitives: a digital signature and a hash function. With the help of these two, we’ll transform the data into a Merkle tree and then describe a protocol that makes it all work.
Data transformation
- We have N key/value pairs. Apply a hash function to each key and each value, so you have M = 2N hashes.
- Combine the hashes from the previous step in pairs and hash each pair again (leaving M / 2 hashes).
- Repeat the previous step until M = 1. This last hash is the root hash.
- The resulting data structure is shown in picture 1. In our three-field example, the six key and value hashes reduce to H1, H2 and H3 (one per key/value pair), and the root hash can be calculated only from all of them combined: ROOT = hash(H1 + H2 + H3).
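The transformation for the three-field example can be sketched in a few lines of Python. This is only a minimal sketch under my own assumptions: SHA-256 as the hash function, UTF-8 encoding of keys and values, byte concatenation as the "combine" operation, and a final step that hashes all per-field hashes together (as in the three-field example). The helper names `h`, `field_hash` and `root_hash` are made up for illustration.

```python
import hashlib

def h(data: bytes) -> bytes:
    """SHA-256 digest (the choice of hash function is an assumption)."""
    return hashlib.sha256(data).digest()

def field_hash(key: str, value: str) -> bytes:
    """Hash the key and the value separately, then hash the pair together."""
    return h(h(key.encode("utf-8")) + h(value.encode("utf-8")))

def root_hash(field_hashes: list) -> bytes:
    """Combine the per-field hashes (H1, H2, H3) into the root hash."""
    return h(b"".join(field_hashes))

passport = {
    "name": "Alexander Vtyurin",
    "dateOfBirth": "12.12.1981",
    "residence": "Lenina St. 1, Moscow, Russia",
}

# Field order matters: everyone who recomputes the root must use the same order.
field_hashes = [field_hash(k, v) for k, v in passport.items()]
root = root_hash(field_hashes)
print(root.hex())
```

With more fields, the last step would presumably pair the hashes level by level instead of hashing them all at once; the sketch keeps it flat to stay close to the example.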
The interaction protocol consists of two parts:
- Credential issuance by some trusted party (e.g. the government).
- Proving some credential attribute to a counterparty (e.g. some store).
Let’s dive deeper into these processes.
Credential issuance by some trusted party
- Transform your data into a Merkle tree using the algorithm above, following any additional constraints (such as correct field spelling and correct field order).
- Apply your digital signature to the root hash and transfer the entire Merkle tree to the trusted party.
- The trusted party performs all required validations of the tree, then signs the root hash (together with your signature) and sends it back to you.
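The issuance steps above can be sketched as follows. The article does not fix a signature scheme, so this sketch assumes a Lamport one-time signature built on SHA-256, chosen only because it needs nothing beyond the standard library; a real system would more likely use a standard scheme such as Ed25519 or ECDSA, and a Lamport key must never sign more than one message. I also assume that "signs the root hash (with your signature)" means the issuer signs the root hash concatenated with the prover's signature. The root here is a placeholder, and all function names are hypothetical.

```python
import hashlib
import secrets

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def lamport_keygen():
    """Return (private_key, public_key) for a one-time Lamport signature."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(h(a), h(b)) for a, b in private]
    return private, public

def message_bits(message: bytes):
    """The 256 bits of SHA-256(message), most significant bit first."""
    digest = h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]

def lamport_sign(private, message: bytes):
    """Reveal one secret per digest bit."""
    return [private[i][bit] for i, bit in enumerate(message_bits(message))]

def lamport_verify(public, message: bytes, signature) -> bool:
    """Check that every revealed secret hashes to the matching public value."""
    bits = message_bits(message)
    return len(signature) == 256 and all(
        h(signature[i]) == public[i][bit] for i, bit in enumerate(bits)
    )

# Placeholder root: in the real flow this is the Merkle root of the passport data.
root = h(b"placeholder for the Merkle root")

prover_priv, prover_pub = lamport_keygen()
issuer_priv, issuer_pub = lamport_keygen()

# The prover signs the root hash and sends the tree to the trusted party.
prover_sig = lamport_sign(prover_priv, root)

# The trusted party validates the tree (omitted here), checks the prover's
# signature, then signs the root together with that signature (assumption).
if not lamport_verify(prover_pub, root, prover_sig):
    raise ValueError("prover signature is invalid")
issuer_sig = lamport_sign(issuer_priv, root + b"".join(prover_sig))

signed_root = {
    "rootHash": root,
    "issuerSignature": issuer_sig,
    "proverSignature": prover_sig,
}
```

Anyone holding the two public keys can later check both signatures against `signed_root` without seeing any passport field.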
Proving some credential attribute to a counterparty
I will reference this structure later in the algorithm. It is placed here only because Medium does not allow me to put it inside the list without breaking the list.
{
  signedRoot: {
    rootHash: ROOT,
    issuerSignature: Signature,
    proverSignature: Signature
  },
  treeWithRevealedData: [
    {
      type: "hash",
      value: H1
    },
    {
      type: "revealedData",
      value: {
        key: "dateOfBirth",
        value: "12.12.1981"
      }
    },
    {
      type: "hash",
      value: H3
    }
  ]
}
- The counterparty (the verifier) asks us to reveal some credential attribute. Let’s imagine we just want to buy beer and there is a terminal in the store that can interact with us as a verifier. This terminal asks us to reveal the “dateOfBirth” field of the credential issued by the government.
- We, as the prover, respond to the verifier with the structure shown above. It consists of the signed root, the key/value pairs of the revealed fields, and the highest-level hashes possible that are still needed to compute the root hash.
- To verify the validity of the revealed credential, the verifier just needs to compute the root hash from the data we passed to it. In our case, the terminal computes the hash of our birth date (H2) and combines it with the hashes we also passed (H1 and H3).
- The verifier compares this newly computed hash with the one we passed and also checks the signatures. If the hashes are equal and the signatures are valid, the job is done: you’ve disclosed your age but said nothing about your name and address.
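Here is a rough sketch of the hash part of this exchange, from building the disclosure to the terminal's check. It rests on the same assumptions as before (SHA-256, each field hash computed as the hash of the key hash plus the value hash, the root computed as the hash of H1 + H2 + H3), and it leaves the signature checks out entirely. `recompute_root` is a hypothetical helper name.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def field_hash(key: str, value: str) -> bytes:
    return h(h(key.encode("utf-8")) + h(value.encode("utf-8")))

def recompute_root(tree_with_revealed_data) -> bytes:
    """Verifier side: rebuild the root from passed hashes and revealed fields."""
    parts = []
    for node in tree_with_revealed_data:
        if node["type"] == "hash":
            parts.append(node["value"])
        elif node["type"] == "revealedData":
            revealed = node["value"]
            parts.append(field_hash(revealed["key"], revealed["value"]))
        else:
            raise ValueError("unknown node type: " + str(node["type"]))
    return h(b"".join(parts))

# Prover side: the full credential never leaves the prover.
passport = {
    "name": "Alexander Vtyurin",
    "dateOfBirth": "12.12.1981",
    "residence": "Lenina St. 1, Moscow, Russia",
}
hashes = [field_hash(k, v) for k, v in passport.items()]
root = h(b"".join(hashes))  # the root that the issuer signed

disclosure = {
    "signedRoot": {"rootHash": root},  # signatures omitted in this sketch
    "treeWithRevealedData": [
        {"type": "hash", "value": hashes[0]},  # H1: name stays hidden
        {"type": "revealedData", "value": {"key": "dateOfBirth", "value": "12.12.1981"}},
        {"type": "hash", "value": hashes[2]},  # H3: residence stays hidden
    ],
}

# Verifier side: the recomputed root must match the signed one.
is_valid = (
    recompute_root(disclosure["treeWithRevealedData"])
    == disclosure["signedRoot"]["rootHash"]
)
print(is_valid)
```

If the prover changes the revealed birth date, the recomputed root no longer matches the signed one and the check fails.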
Pros:
- The mechanism is simple to understand and implement, yet powerful.
- All required cryptographic primitives are available on most modern programming platforms, so the mechanism can be implemented in almost any environment.
Cons:
- This is not a zero-knowledge proof (ZKP) algorithm, so a malicious entity can pass your disclosed fields on to a third party, which can completely deanonymize you after a while.
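One way to picture this con: the same signed root accompanies every disclosure, so parties that pool what they received can stitch the revealed fields back together. The sketch below is purely illustrative; the record layout, the `pool_disclosures` helper and the "store", "courier" and "bank" verifiers are all hypothetical.

```python
from collections import defaultdict

def pool_disclosures(disclosures):
    """Group revealed fields by the root hash they were presented with."""
    profiles = defaultdict(dict)
    for d in disclosures:
        profiles[d["rootHash"]].update(d["revealed"])
    return dict(profiles)

# Simplified records kept by three different verifiers (made-up data).
store = {"rootHash": "root-A", "revealed": {"dateOfBirth": "12.12.1981"}}
courier = {"rootHash": "root-A", "revealed": {"residence": "Lenina St. 1, Moscow, Russia"}}
bank = {"rootHash": "root-B", "revealed": {"name": "Someone Else"}}

profiles = pool_disclosures([store, courier, bank])
print(profiles["root-A"])
```

The two disclosures that share `root-A` end up in one profile, even though each verifier saw only a single field.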