Data Anonymisation – opportunities and risks

Following the ICO’s recent public consultation and proposal to establish a stakeholder network, Steve Wood, Head of Policy, discusses the challenges ahead in the use of anonymised data.

The UK is putting more and more valuable data into the public domain, with the government’s open data agenda allowing us to find out more than ever about the performance of public services and hold public bodies to account. It looks to be a popular move: the Home Office crime mapper website, launched early last year, has already received more than 50 million visits.

Also, the private sector is increasingly focused on deriving benefits from big data* – vast amounts of linked data, with huge potential in terms of insight into customers, markets and economic trends. Businesses are seeing the benefit of creating new business models around open datasets.

Whilst the disclosure or use of many datasets will be in the public interest, the public rightly expect that this will not be at the expense of their privacy. The public’s support for and trust in initiatives such as open data will be damaged if their privacy is compromised. We believe anonymisation has a crucial role to play in this data revolution; it offers great opportunities, but it is also important that risks are properly considered and managed. We find that the risks related to anonymisation are sometimes understated and sometimes overstated. Developing an effective and balanced risk framework for anonymisation is vital to protect privacy and provide rich sources of data that can benefit society and the economy.

Anonymised data is essentially information that does not identify any individuals, either in isolation or when cross-referenced with other data already in the public domain. The benefits of working with anonymised data are numerous. Properly anonymised data is not personal information, so an organisation can release it into the public domain without running the risk of inappropriately disclosing personal data. Because it is no longer personal data, the information can also be used in different ways: the Data Protection Act’s requirement that personal information is only used for a specific purpose or purposes does not apply. If two organisations want to share datasets containing personal data, they should always consider whether their needs can be met by anonymised data before using personal data.

We are already seeing anonymised data being used to improve public services. For example, anonymised geo-location data collected from individuals’ mobile phones can be used to inform the introduction of traffic calming measures. This is possible because telecommunications providers have access to a mobile phone number, an approximate location, and a time and date. Once the data is anonymised, it can be provided to a research body to show how many people were on a section of road at a particular time but, crucially, not who those people were or what numbers they were using.
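
To make the traffic example concrete, here is a minimal sketch in Python of how such records might be reduced to counts. Everything in it is an assumption for illustration: the record layout, the road section names, the sample phone numbers, the hourly time bands and the function name are all hypothetical. Suppressing small counts is one common precaution and is not something described above.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw records held by a provider:
# (phone_number, road_section, ISO timestamp). All values are made up.
RAW_RECORDS = [
    ("07700900001", "A-road section 1", "2012-11-05T08:10:00"),
    ("07700900002", "A-road section 1", "2012-11-05T08:40:00"),
    ("07700900003", "A-road section 1", "2012-11-05T08:55:00"),
    ("07700900001", "A-road section 2", "2012-11-05T09:05:00"),
]


def aggregate_counts(records, min_count=3):
    """Count distinct phones per (road section, hour) and drop the numbers.

    min_count is an assumed suppression threshold: cells with fewer
    phones than this are left out, because very small counts can still
    point to an individual.
    """
    seen = set()
    for phone, section, timestamp in records:
        # Coarsen the exact time to an hourly band.
        hour = datetime.fromisoformat(timestamp).strftime("%Y-%m-%d %H:00")
        seen.add((phone, section, hour))

    # The phone number is used only to avoid double counting, then discarded.
    counts = Counter((section, hour) for _, section, hour in seen)
    return {key: n for key, n in counts.items() if n >= min_count}


AGGREGATED = aggregate_counts(RAW_RECORDS)
```

With the sample records, only the first road section in the 08:00 band is kept; the single phone seen on the second section is suppressed rather than published as a count of one. Whether a threshold of three is adequate would depend on the dataset and on what other information is available.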

The field of medical research also stands to reap massive benefits from the release of rich sources of anonymised data which can be used to carry out studies into the latest drugs and treatments. The UK continues to be known as a world leader in medical research and anonymised data opens up a world of possibilities to ensure this reputation is retained and built upon.

Whilst the UK Information Commissioner’s Office (ICO) supports the use of anonymisation techniques, organisations must not be complacent. It may be simple to aggregate and anonymise some datasets, but it is often not as easy as one might expect. For example, while a piece of information may appear to be anonymised when looked at in isolation, this may not be the case when it is considered alongside other information already available in the public arena. With ever-increasing amounts of data in the public domain, this can be challenging. This is why it is so important that anonymisation is carried out correctly.
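
One simple way to look for this kind of risk before release is to check how many records share the same combination of attributes that might also appear in other public sources. The Python sketch below does that. It is an illustration under stated assumptions, not a complete risk assessment: the column names, the sample rows, the function name and the threshold `k` are all hypothetical, and the check is a basic form of what is often called k-anonymity, a technique not named above.

```python
from collections import Counter

# Hypothetical rows prepared for release, with names and other direct
# identifiers already removed. All values are made up.
RELEASED_ROWS = [
    {"postcode_district": "SK9", "birth_year": 1975, "sex": "F"},
    {"postcode_district": "SK9", "birth_year": 1975, "sex": "F"},
    {"postcode_district": "SK9", "birth_year": 1982, "sex": "M"},
    {"postcode_district": "M1", "birth_year": 1990, "sex": "F"},
    {"postcode_district": "M1", "birth_year": 1990, "sex": "F"},
]

# Assumed attributes that could be matched against other public data.
QUASI_IDENTIFIERS = ("postcode_district", "birth_year", "sex")


def risky_combinations(rows, columns, k=2):
    """Return the attribute combinations shared by fewer than k rows."""
    groups = Counter(tuple(row[column] for column in columns) for row in rows)
    return sorted(combo for combo, size in groups.items() if size < k)


RISKY = risky_combinations(RELEASED_ROWS, QUASI_IDENTIFIERS)
```

Here the row for a man born in 1982 in district SK9 is the only one with that combination, so anyone who already knows those three facts about a person could single that row out. A check like this only covers the attributes it is told about; it says nothing about what else is in the public arena, which is where judgment on a case by case basis comes in.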

There have been some high-profile examples of anonymised datasets being “broken” in the US. We believe these were examples of poor and complacent anonymisation. It is simply unrealistic to stop using anonymisation techniques because of the risks, as some commentators have called for. Rather, these cases are a call to ensure anonymisation techniques are more effective and that organisations deploy the right expertise. The demands for open data, big data and information sharing in our information society will not disappear – there are often strong arguments in their favour. What we must do is address the privacy risks with the best privacy-enhancing techniques available and make judgments on a case by case basis about whether data can be disclosed publicly.

The ICO also stands ready to take swift enforcement action against those who negligently or complacently place individuals’ privacy at risk through poor standards of anonymisation.

Earlier this year we carried out a public consultation to inform our upcoming code of practice issued under the Data Protection Act – the code is focused on managing the data protection risks related to anonymisation. The focus on the term risk is important – in some cases anonymisation techniques are not 100% certain, but the Data Protection Act requires these risks to be addressed if they are greater than remote. The code is not a security engineering manual or a complete guide to all the issues – it is a framework that explains the main concepts, using examples throughout. It should be an important starting point.

We are pleased to announce that the feedback we received in response to this consultation has helped us to produce a comprehensive piece of guidance that will help practitioners understand the risks associated with producing anonymised data. We’ve had important input from experts in the public and private sectors. The code will be published next week and we hope that you find it useful. We look forward to receiving feedback and examples of where practitioners are using the code.

We know that producing a code of practice is only part of the solution. More needs to be done to share knowledge about how to deploy anonymisation techniques, particularly for those practitioners tackling the issues for the first time. We will also be announcing the details of the ICO-funded Anonymisation Network, which will be responsible for enabling good practice to address the specific issues affecting the production of anonymised data in all of its different forms. We are providing funding for a group of experts to run the network for two years – this will include a website with membership areas, use of social media, events, seminars and case studies.

The use of anonymised data will have an increasing role to play in the way the UK shares information about its citizens, so it is vitally important that a consistent approach is developed, based on high standards that recognise individuals’ rights to privacy as well as the benefits that anonymised data presents for the population at large. We hope our work in this area will go some way to achieving this.

Editor’s Note

‘Big data’ is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools. The challenges include capture, curation, storage, search, sharing, analysis, and visualization.
