NHS data sharing: taking stock | Eerke Boiten's blog

I have written in the last few weeks, on this blog and twice in The Conversation, on the NHS care.data sharing scheme. In terms of the “authoritative” information, the picture has become a bit clearer to me, although the information “out there” is hardly getting any clearer. Mindless accusations such as “NHS selling data to the highest bidder” are still floating about, and on the “other side” even yesterday the BBC was still reporting that the data was non-identifiable when in fact it is identifiable. Whatever clarifications I have obtained have come through Twitter chats, especially with Geraint Lewis, the Chief Data Officer for NHS England, who is the best and most engaged advocate for the system. (If you follow @CyberSecKent, which in theory could be used by colleagues but in practice has so far been used only by me, you may have seen some of the @GeraintLewis stuff on your Twitter feed.)

Time to take stock. In my first article, “Outdated laws put your health data in jeopardy”, I described the system and listed a few worries.

The legal set-up: opt-out rather than opt-in, and subsequently reports on rather than requests for use, seems ethically dubious but legally correct. The main chink that is appearing is that the presumption that GPs’ DPA duty to inform patients about sharing has been adequately covered by a junkmail leaflet is being called into doubt. This applies a fortiori to patients with a visual or mental disability. An amazingly readable long paper on the legal background, which encompasses all the shards of legal thought that have appeared in the discussions I have seen (and much more), was pointed out to me by Prof. Julia Hippisley-Cox.
Intelligence services obtaining and re-identifying the data, with possible subsequent use for reputation attacks: the risk, particularly of the former, still stands, especially after Sir David Omand affirmed this week the official GCHQ doublethink that bulk collection of data does not equate to mass surveillance. Ironically, this was on The Day We Fight Back.
The Honeypot Value of such a single huge database of sensitive data is still a worry. HSCIC point at their recent cyber security inspection. They have claimed there have been no breaches of the HES precursor database (collected hospital data) in its long existence, but gloss over the large number of past NHS data breaches, and have so far not responded to requests for disclosure of recent monitoring information. Worryingly, some of the NHS response does not take into account that the new database will be so much more valuable for potential abusers that it warrants higher levels of security than anything that came before it. My worry remains here: data of extremely high value, protected by people who are supposed to be top experts, has been stolen regularly in the last few years. Full medical histories in particular, once out, will never go back in the toothpaste tube.
On potential abuse by commercial companies the situation has become a bit clearer to me. As things stand, Bupa and others have access to the precursor HES (“the data analytics arm” of Bupa rather than their insurance business), and they will not have access to care.data. HSCIC currently state that data will not go to anyone beyond the “commissioners” (NHS England, CCGs, Public Health). I have asked about the status of the care.data addendum, which in section 5.3 clearly lists commercial companies as “additional customers”, contradicting the HSCIC position. Geraint Lewis has asked me to request clarification of this by email, but so far I have not received a reply. Overall it looks like there will still be an opportunity to reconsider the plans for sharing of “amber” data with others besides the NHS and research organisations after the care.data database is established, which is good. That does not reduce the risk of the data going now to parts of the NHS which will become private later, though. Again we can think of Bupa there. Or worse: G4S or Atos.
The weakness of the DPA is still an issue. Even if the sharing of orange data is limited, the potential gains from abuse dwarf both the maximum fines under the Data Protection Act (£500K, no prison), and the money HSCIC assign themselves (“selling at cost”) to monitor against potential abuse. I have been probing to find out how HSCIC could be sure to find out that the data has been abused once it has been passed on to a third party, but have had no convincing answers. Channeling all third party access through HSCIC, on a query by query basis, would be much more secure in that respect. The value of the data to us as its subjects means that it should also be worth the extra expense involved.
In my second article in The Conversation, “Your NHS data is completely anonymous – until it isn’t”, I did not really raise any new worries. I merely articulated what I thought was common knowledge: pseudonymised (“orange”) data is re-identifiable, particularly if you have a lot of it. The remaining worry from this is that Tim Kelsey’s categorically wrong comment on this, “No one who uses this data will know who you are”, made on BBC Radio 4, is still being repeated, e.g. by the BBC. Some of the discussion around the care.data issue concerns trust. The “pro” line goes: you have been trusting the NHS all your life, do not let your distrust of politics get in the way of this great opportunity for the NHS to improve care. However, Tim Kelsey has just shown that despite the HSCIC’s best efforts to do this important job in a responsible way, they are answerable to bosses who are happy to misinform the public. That is a worry that will not go away easily, especially not after they have appointed yet another person with a huge conflict of interest.
In the meantime, another worry appeared: that the data of people who had opted out would still be uploaded onto the system. This has been authoritatively debunked this week by Geraint Lewis; there is a sense that this is a (helpful!) change of policy rather than a clarification. (Ross Anderson’s comments suggest care.data would be used for NHS to pay bonuses to GPs, which would lead to an inconsistency: without the data of opters-out, the information would not be there. GP Neil Bhatia of care-data.info says care.data is not intended for this anyway, and GPs get paid via QoF/CQRS and other submissions which are nearly all anonymised/aggregated.)
Randeep Ramesh in the Guardian raised the above worry (plus MP David Davis’ wonderful 5 broken noses re-identification illustration!) alongside another one: that police would use the HSCIC database to get at medical data without a warrant, in the same way and under the same conditions as they already can through GP practices. This got the medics worried about whether HSCIC would stick to the same strict procedures that GP practices have to follow. The response to this was reassurance from Tim Kelsey that the police would not do so, but it would have been ever so much more reassuring if he’d said they could not do so.
Finally, only on this blog I sketched a nightmare scenario where people would stand out negatively (as having something to hide) through the mere fact of having opted out. That one, at least, has become a lot less likely now.
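For readers who like to see the re-identification worry spelled out, here is a toy sketch in Python of why pseudonymised data becomes identifiable once you know a few facts about someone. Everything in it is invented for illustration: the records, the pseudonyms and the `candidates` function are assumptions of mine and say nothing about the actual care.data format.

```python
from collections import defaultdict

# Invented toy records: names are replaced by a pseudonym,
# but each row still carries an event type and a month.
RECORDS = [
    ("p1", "broken nose", "2001-05"),
    ("p1", "broken nose", "2004-11"),
    ("p1", "knee surgery", "2010-02"),
    ("p2", "broken nose", "2001-05"),
    ("p2", "asthma review", "2007-08"),
    ("p3", "broken nose", "2004-11"),
    ("p3", "knee surgery", "2010-02"),
]


def candidates(records, known_events):
    """Return the pseudonyms whose history contains every known event."""
    history = defaultdict(set)
    for pid, event, month in records:
        history[pid].add((event, month))
    wanted = set(known_events)
    return sorted(pid for pid, events in history.items() if wanted <= events)


# One publicly known event still leaves more than one candidate...
one_fact = candidates(RECORDS, [("broken nose", "2001-05")])
# ...but a second known event narrows it down to a single pseudonym.
two_facts = candidates(
    RECORDS, [("broken nose", "2001-05"), ("broken nose", "2004-11")]
)
print(one_fact, two_facts)
```

Once a single pseudonym is left, everything else stored under it (in this toy data, the knee surgery) is attached to a known person. The more data per person, the fewer outside facts are needed.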

(edit 14 Mar: Neil Bhatia is care-data.info not medconfidential, sorry. All good people though!)