Our personal data is being harvested for many purposes these days. It is stored in vast databases with little or no encryption, and organizations from our supermarkets to the security services are mining it for reasons ranging from commercial gain to national security. How we keep confidential and sensitive information private is much debated. However, important facts concerning cryptography are being ignored in this discourse. Today I want to describe what is possible as a result of the revolution in cryptography that occurred at the end of the 20th century.
Our intuitions tell us that, ineffective solutions aside (such as identification by indexed numbers), it is impossible to have both the benefits of anonymity and those of transparency. But this is false. Cryptography can combine the two. Pseudo-anonymity is possible and comes in many forms. Without an understanding of these possibilities, any discussion concerning privacy will be missing out on a huge range of potential solutions.
In what follows I will be making a couple of technical assumptions that are not hugely controversial. Firstly I shall assume that certain widely believed mathematical conjectures are true or at least not usefully false. I shall also assume that we are not going to be able to build quantum computers any time soon. Our entire banking transfer system is based on these assumptions so I am in good company in making them!
Various counterintuitive things are possible with modern strong cryptography. A cryptographic signature scheme uses a pair of keys, one private and one public. Anyone in possession of the private key can sign messages (but no one else can). Anyone in possession of the public key can read the messages and check that the signature is genuine. Creating a cryptographic signature is easy with the right software. Cryptographic signatures can be used to create a virtual identity which is hard (or, if desired, impossible) to tie to the person who created it. However, over time such virtual identities can acquire trust in much the same way that individuals have for millennia.
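To make the signing idea concrete, here is a minimal sketch in Python using textbook RSA. Everything in it is an assumption chosen for illustration: the two primes are small, well-known Mersenne primes, there is no padding, and the pseudonym "nightjar" is made up. Real software uses much larger keys and a vetted library, so treat this as a picture of the principle, not as something to deploy.

```python
import hashlib

# Toy parameters (illustrative assumption): two known Mersenne primes.
# Far too small and too structured for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Only the holder of the private exponent D can compute this value."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone holding the public key (N, E) can run this check."""
    return pow(signature, E, N) == digest(message)


# A hypothetical pseudonym signs a post; readers check it with the public key.
post = b"First post from the pseudonym 'nightjar'"
sig = sign(post)
print(verify(post, sig))              # the genuine post checks out
print(verify(b"A forged post", sig))  # the same signature fails on other text
```

Nothing in the public key (N, E) names its owner. A pseudonym built this way earns trust only through the record of messages it has signed.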
Networks may be set up which allow anonymous communications to be sent. This allows not only the content of a message but also its very existence to be hidden. Such networks already exist (for example Tor) and are used by high-tech peer-to-peer music piracy networks.
Protocols are possible in which a certain action (such as decrypting a document) is only possible if certain people agree to the operation. As an example, it is possible to encrypt a document so that any 3 out of 5 key holders can decrypt it together but no 2 of them can decrypt it on their own.
A canary is a characteristic piece of data which identifies the source of a document. The Ordnance Survey includes minor errors dotted over its maps. This allows it to detect any map which has been copied from an Ordnance Survey map. Canaries can be added to many types of data to help identify unauthorized copies (although their utility is restricted to situations where few people have access to the data).
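One well-known way to get the 3-out-of-5 behaviour is Shamir's secret sharing, applied to the key that encrypts the document. The sketch below is my own illustration, not a description of any particular product. The field prime and the function names are assumptions chosen for the example.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; all arithmetic is done modulo this


def split_secret(secret: int, threshold: int = 3, holders: int = 5):
    """Return `holders` shares, any `threshold` of which recover the secret."""
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, holders + 1):
        y = 0
        for c in reversed(coeffs):      # Horner evaluation of the polynomial
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


document_key = secrets.randbelow(PRIME)   # stands in for the decryption key
shares = split_secret(document_key)

print(recover_secret(shares[:3]) == document_key)                       # any 3 work
print(recover_secret([shares[0], shares[2], shares[4]]) == document_key)
print(recover_secret(shares[:2]) == document_key)                       # 2 do not
```

With only two shares, every possible key is still equally consistent with what the two holders know, so they learn nothing by pooling them.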
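As a rough sketch of how canaries might be generated for a small group of recipients, the Python below gives each recipient a copy of some numeric data with a handful of tiny deliberate errors. The positions of the errors are derived from a key that only the data owner holds. The key, the recipient names and the "spot heights" are all hypothetical, and exact matching is the simplest possible detection rule, not a robust one.

```python
import hashlib
import hmac


def canary_positions(owner_key, recipient, n_records, n_marks=5):
    """Pick which records to alter for this recipient, using a keyed hash."""
    positions = []
    counter = 0
    while len(positions) < n_marks:
        label = f"{recipient}:{counter}".encode()
        mac = hmac.new(owner_key, label, hashlib.sha256).digest()
        pos = int.from_bytes(mac[:8], "big") % n_records
        if pos not in positions:        # keep the marked positions distinct
            positions.append(pos)
        counter += 1
    return positions


def marked_copy(records, owner_key, recipient):
    """Return this recipient's copy, with a tiny error at each canary position."""
    copy = list(records)
    for pos in canary_positions(owner_key, recipient, len(records)):
        copy[pos] += 1
    return copy


def identify_source(leaked, records, owner_key, recipients):
    """List the recipients whose marked copy matches the leaked data exactly."""
    return [r for r in recipients
            if marked_copy(records, owner_key, r) == leaked]


owner_key = b"hypothetical key known only to the data owner"
spot_heights = list(range(1000, 2000))      # made-up survey data
recipients = ["alice", "bob", "carol"]

leaked = marked_copy(spot_heights, owner_key, "bob")
print(identify_source(leaked, spot_heights, owner_key, recipients))
```

A leaker who copies only part of the data, or who compares notes with another recipient, can defeat exact matching, which is one reason canaries work best when few people hold copies.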
No one has absolute privacy. There are many ways in which your privacy may be intentionally violated. For instance private detectives can be employed to follow you, public records mined for information or bribery used to obtain sensitive information. We shouldn't try to get any absolute guarantees of privacy because we know it to be impossible. In practice maintaining privacy is a matter of raising the cost of violating privacy to the extent that it is not worth the effort for the eavesdropper.
What matters is the cost of access to private data, the people who can access it, how easy it is to trace them and how susceptible the data is to abuse. The problem with storing masses of credit card details in centralized databases is not that the information needs to be private but that the cost of stealing each record is lowered by a kind of economy of scale for thieves. If only 2 or 3 people have access to a confidential file and anonymous blackmail threats are made, then there is already a ready-made shortlist of suspects. If, furthermore, there are telltale mistakes in the blackmailer's threat (because canaries have been used), then the perpetrator may be identifiable. Finally, the ease with which data can be abused matters. There are sometimes alternative methods to store information, and some may be less prone to abuse than others.
So taking all this into account we should worry when:
1) Data is stored in central databases
The more data in one place the cheaper it is to illegally access that data (per record)
2) This data is in a computer readable format
Working through masses of data by hand takes many more resources than if that process can be automated. As an example, a supermarket's credit card database is computer readable but Google Street View is not (in any useful way)
3) Data is in abusable format
Records of transactions are necessary but these records can be stored in a manner which doesn't expose people's bank accounts to fraud.
4) The data is sensitive
Data about what music you like is not as open to abuse as data identifying which whores you've been visiting.
5) Many people have access to the data
It stands to reason that the more people have access to records the easier it is to trick/bribe them and the more likely it is that there are bad apples.
6) It's hard to trace the source of a leak
Clearly the easier it is to identify abuse the easier it is to discourage it.
7) The value of the data is high
A database containing movie preferences is much less valuable than one containing details of police cautions. The second one needs much better protection than the first.
I will now give some examples of what cryptography could be used for. Firstly it is possible to have electronic voting systems which are private but for which everyone involved in the process can count the votes themselves. Unfortunately no system that is currently in operation uses the necessary technology. Hence I am not against electronic voting in principle but I oppose all systems currently used.
Secondly, any interaction that can be thought of as a sort of game with hidden information (such as a game of poker or a financial transaction) can be implemented using cryptography in such a way that the information can be hidden (such as the face of a card) and yet, when it is revealed (showing your hand), it is still known to be correct.
Thirdly, identity cards are possible which allow you to prove that you are a member of some group (such as non-terrorists or over 18s) without identifying who you actually are. It is possible to do this without making the system more prone to abuse by terrorists or underage drinkers! Please note that the UK government's proposals for ID, unbelievably, do not use such technology.
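The simplest building block for this kind of hidden-but-checkable information is a commitment. Here is a minimal sketch using a hash function: the player publishes a commitment to a card, and later reveals the card together with a random nonce so that others can check it. The function names and the card are my own illustrative choices, and a full card game protocol needs considerably more than this one step.

```python
import hashlib
import hmac
import secrets


def commit(card: str):
    """Return (commitment, nonce). Publish the commitment, keep the nonce secret."""
    nonce = secrets.token_bytes(32)     # fixed length, so the encoding is unambiguous
    commitment = hashlib.sha256(nonce + card.encode()).hexdigest()
    return commitment, nonce


def reveal_is_honest(commitment: str, card: str, nonce: bytes) -> bool:
    """Check that the revealed card and nonce match the earlier commitment."""
    expected = hashlib.sha256(nonce + card.encode()).hexdigest()
    return hmac.compare_digest(commitment, expected)


# The player commits to a card face down...
commitment, nonce = commit("ace of spades")

# ...and later turns it over. The other players check the claim.
print(reveal_is_honest(commitment, "ace of spades", nonce))   # honest reveal
print(reveal_is_honest(commitment, "two of clubs", nonce))    # attempted swap
```

The random nonce is what keeps the card hidden: without it, anyone could hash all 52 cards and see which one matches the commitment.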
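One classical ingredient for credentials of this kind is a blind signature, where an issuer signs a token without ever seeing it. The sketch below uses textbook RSA blinding with toy parameters. It is only an illustration under loud assumptions: the primes are small known Mersenne primes, the "over-18:" label and the function names are invented, and a real anonymous credential scheme needs much more (for instance, protection against tokens being lent or copied).

```python
import hashlib
import math
import secrets

# Toy issuer key (illustrative assumption): far too small for real use.
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537                  # issuer's public key
D = pow(E, -1, (P - 1) * (Q - 1))    # issuer's private exponent


def token_digest(serial: bytes) -> int:
    """Hash a token serial number, with a hypothetical 'over-18' label, into Z_N."""
    return int.from_bytes(hashlib.sha256(b"over-18:" + serial).digest(), "big") % N


def blind(message: int):
    """Holder hides the token from the issuer with a random blinding factor r."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            return (message * pow(r, E, N)) % N, r


def issuer_sign(blinded: int) -> int:
    """Issuer signs after checking the holder's age; it sees only the blinded value."""
    return pow(blinded, D, N)


def unblind(blind_signature: int, r: int) -> int:
    """Holder strips the blinding factor, leaving an ordinary signature."""
    return (blind_signature * pow(r, -1, N)) % N


def verify(serial: bytes, signature: int) -> bool:
    """Anyone with the issuer's public key can check the token."""
    return pow(signature, E, N) == token_digest(serial)


serial = secrets.token_bytes(16)                 # chosen privately by the holder
blinded, r = blind(token_digest(serial))
signature = unblind(issuer_sign(blinded), r)
print(verify(serial, signature))
```

The issuer never sees `serial` or the final `signature`, so when the token is later shown at a door the issuer cannot link it back to the person it checked.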
The benefits of modern cryptography, then, are (A) that pseudo-anonymity is possible and can be used to prove facts such as your age or your criminal status without revealing any other information, (B) that signature schemes allow proofs of transactions without increasing the risk of fraud, and (C) that cryptographic protocols are stricter instruments of public policy than laws, in that they can (subject to our assumptions) be mathematically proven to prevent abuses. One of the many failings of modern liberal democracies is a failure to put our understanding of cryptography to work to provide these benefits, and a failure to recognize the need for cryptographic solutions to provide privacy for the public, data for the government and intelligence for the police.
There are problems with cryptography too though. Cryptographic protocols take time to perform but as computers get faster this objection becomes weaker and weaker. Cryptographic protocols are brittle and not easy to adapt to new usage patterns. I think that it is better to live with this than to risk the massive privacy violations that will occur without it.