Monthly Archives: January 2014

The NHS data sharing story rumbles on

I’ve found out a lot more about the NHS data sharing scheme.

First, the people in charge do actually have very solid awareness of (de-)anonymisation, but the info is hidden a bit deep in the HSCIC Privacy Impact Assessment and its supporting documents.

Second, there was a different scheme for sharing medical data, the “Summary Care Record”. People were sent opt-out forms for that, some as late as last year. It’s a different system, and it does involve medical data being shared with (e.g.) A&Es. Many GP practices turn out to be confused about this vs. care.data.

My assessment of the situation based on what I understand now has been written up in a piece at The Conversation. For any further links to interesting source documents please do look there. I’ve excluded the scenario from my previous blog post in that.

Since I published that yesterday, some serious discussion has broken out on Twitter (especially around @GeraintLewis, NHS Chief Data Officer) concerning whether data would be sold to insurance companies or not, with apparently contradictory statements on this having come out from the NHS side. See the comments on my piece for a summary of the positions.

NHS care.data: even if you opt out …

Following on from my earlier post

If you ring an insurance company, there is every chance that at some point you will be reminded that data is liberally shared between insurance companies and other authorities in order to prevent fraud. The following scenario now suggests itself …

You opted out from having your data shared in the care.data program. In the end, despite newspaper front pages and assorted expressions of worry about privacy and accountability, you are in a tiny minority of people to have done so. Now you apply for life insurance, or maybe health insurance (in the post-NHS era we may all need to do this!) A week later you receive a letter from the insurance company: “We don’t have access to your medical data from the NHS. Unfortunately in our experience this indicates a high likelihood that you have medical circumstances that you would wish to hide from us. Because of this, we will not be able to provide you with insurance.”

You decide whether this is a likely scenario or not. In today’s Guardian piece, insurance companies were mentioned as potential buyers of the data. (Aside: doesn’t the financial dimension erode the “it’s all for our benefit” story somewhat?) The piece also reminded us that de-pseudonymisation is not only a risk in general, but very likely no problem at all for organisations who already have lots of our data – such as the insurance industry.

I’ll leave it to the game theorists to decide whether this post is arguing for or against opting out. Only at the end of writing this does it come back to me that I actually watched “The Rainmaker” last night 😉

Timing of cyber attacks: a model

Last week Axelrod and Iliev from the University of Michigan (Ann Arbor) published a paper, “Timing of cyber conflict”. Akshat Rathi, science editor at The Conversation, reviewed the paper for the site, and asked me for comments. A quote from me is included in his piece.

There’s a bit more to say than I did there. The Daily Mail also covered this, and lazily presented it as a model that you just enter your data into and presto! it tells you when to perform your cyber attack. That’s not at all the case. The model asks you to guess at some probabilities, and then measure some unmeasurables including a quantification of the attack’s effect. (Makes me think of risk assessment!) There isn’t much of a mathematical model in it, really – there are variables, and a formula, and case studies in the paper, but in the case studies the variables never get values, and the formula isn’t used for anything.

That’s not at all to say it’s a useless paper. Although you can’t actually establish values for the variables, the concepts embodied in them are very useful, and it makes perfect sense to talk about them going up/down in scenarios. The case studies, such as of Stuxnet, make for very interesting reading indeed.

On the NHS data sharing

The cyber security issue that has fascinated me most over the last few days has been the NHS data sharing story, not least because the “data privacy” and “sense about science” camps (both of which I normally strongly support) disagree about it. (I don’t think they are being played off against each other though.) Apparently all our medical data as currently held and controlled by GPs will be shared, in different ways for different forms (see an official NHS explanation), “red”, “amber”, and “green”, and we are asked to opt out if we don’t want this to happen.

First, opting out is clearly the wrong way around. Compare this to organ donation: “opting out” is still not a socially acceptable solution to this, despite it being more unequivocally medically essential and much less open to potential abuse than this data sharing. Maybe not an appropriate comparison for everyone – I’m probably too much of a “digital citizen”, caring more about my medical data while I’m alive than about my organs once I’m dead …

“Green” data looks relatively safe: it will be published publicly, and will consist of summarised medical info, excluding information (e.g. on rare diseases) that will come close to identifying people.

“Amber data” is pseudonymised, replacing non-medical identity data by meaningless pseudonyms. This is much more reason for worry. What is left is essentially behavioural data, some of which is similar to location data. If I can be uniquely identified from (typically) four locations visited during one day, I can also be uniquely identified from a small number of medical appointments (at given locations…!) Certainly anyone who can get hold of my mobile location data would be able to de-anonymise this. Given I have an Android phone, that list likely includes Google, the NSA, and GCHQ already.
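To make the linkage argument concrete, here is a minimal sketch of how a handful of known visits can single out one pseudonym. Everything in it is made up for illustration: the records, the clinic names, the dates and the `candidates` helper are all hypothetical, and real amber data will certainly look different.

```python
from collections import defaultdict

# Hypothetical pseudonymised records: (pseudonym, place, date).
# None of this reflects the real care.data format.
amber_records = [
    ("p01", "Clinic A", "2014-01-06"),
    ("p01", "Hospital B", "2014-01-13"),
    ("p01", "Clinic A", "2014-01-20"),
    ("p02", "Clinic A", "2014-01-06"),
    ("p02", "Clinic C", "2014-01-14"),
    ("p03", "Hospital B", "2014-01-13"),
    ("p03", "Clinic C", "2014-01-14"),
]


def candidates(records, known_visits):
    """Return the pseudonyms whose records contain every known (place, date) visit."""
    visits_by_pseudonym = defaultdict(set)
    for pseudonym, place, date in records:
        visits_by_pseudonym[pseudonym].add((place, date))
    wanted = set(known_visits)
    return sorted(p for p, visits in visits_by_pseudonym.items() if wanted <= visits)


# What someone holding my location data might already know about me.
one_visit = [("Clinic A", "2014-01-06")]
two_visits = [("Clinic A", "2014-01-06"), ("Hospital B", "2014-01-13")]

print(candidates(amber_records, one_visit))   # ['p01', 'p02']
print(candidates(amber_records, two_visits))  # ['p01']
```

In this toy data set one known appointment still leaves two candidates, and a second one narrows it down to a single pseudonym, at which point the whole medical history attached to that pseudonym is exposed.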

So where will the amber data go? The NHS chief data officer says “many of the most innovative uses of amber hospital data have come from outside organisations, including universities, think tanks and data analytics companies”. Universities – fine, I have to and do believe in them generally.
Think tanks, though? I buy Monbiot’s line that many of these are disguised corporate lobbies, and thus don’t have my best interests at heart. Data analytics companies I have no reason to trust whatsoever.

Any data, including the “red” data (which retains all personal information) will be shared within the NHS, plus in what looks like limited and tightly controlled situations with others, such as researchers. A case against opting out is made by the director of the Wellcome Trust here, but it concentrates on that limited and less controversial scenario of red data for research. If I knew that the NHS would not get privatised in any way during my lifetime, I would be reasonably comfortable with this. The old argument is that you want any A&E to be able to get all your relevant data out immediately if you’re brought in unconscious. I’d still worry about adequate protection, what the NHS do with their laptops and USB sticks, and the spooks tapping in somewhere along the way, of course. Unfortunately, with backdoor privatisation going on and likely to get worse, with dubious oversight and accountability, I’m not sure I can even trust the NHS in the medium term.

The case for opting out is made clearly by Ross Anderson (read the comments too, and see also an earlier story).

Finally, there seems to be an odd gap in data governance going on. I don’t understand the law well enough to see whether this is a real problem or a technicality, but apparently the GPs remain the data controller for your data even after it has been uploaded to the NHS. See this description of a recent Information Commissioner verdict. In terms of exercising your data subject’s rights under the DPA, this would surely be problematic?

Update (29-1-2013) The data controller issue has been resolved by an Information Commissioner’s Office blog post: it’s the GP until the data has been uploaded, and after that HSCIC will be the data controller.
Meanwhile, my worries about this have been written up in a piece at The Conversation.