Significant elements of the initially proposed General Practice Data for Planning and Research (GPDPR) programme to collect patient data from general practitioners in England to help improve frontline care in the NHS have been branded “wrong” and a “mistake” by one of the UK’s most noted scientists.
The GPDPR programme was heavily criticised last year by privacy experts and others who said it presented an unacceptable level of security risk, and that the public had not been adequately informed of the plans.
Had it gone ahead in its initial form, the resulting database would have contained substantial amounts of personally identifiable information (PII) on millions of people, including diagnoses, symptoms, observations, test results, medications, allergies, immunisations, referrals, recalls and appointments.
It would also have included information on physical, mental and sexual health, data on gender, ethnicity and sexual orientation, and data on staff who have treated patients.
Giving evidence this week before the parliamentary Science and Technology Select Committee on the findings of his review into the use of health data for research and analysis, Ben Goldacre, professor of evidence-based medicine and director of the Bennett Institute for Applied Data Science at the University of Oxford, said it had been a mistake to try to launch such an enormous programme without making it clear to the public what safeguards would be put in place.
Goldacre, who is also a prominent science writer and a frequent commentator on science and technology, said: “I think the national GP dataset is so granular and so comprehensive in its coverage that it wouldn’t be appropriate to share it outside of the TRE [trusted research environment].
“I would say it was wrong to try to launch a programme like that before we had made a commitment that all data access would only be through a trusted research environment.”
Goldacre told the committee he had been among those who had withdrawn consent for their data to be included.
“I did, because I know so much about how this data is used and how people can be de-anonymised,” he said, “and also in part because in the past, to a greater extent, I have been in the public eye from doing public engagement work and I have friends who have had their data illegally accessed through national datasets, not health datasets.
“I suppose, because I work in this field, the risks are very salient to me.”
Goldacre said the process of pseudonymising data in order to disseminate it had serious shortcomings that presented privacy risks that had not been properly addressed.
“Because we have taken a rather chaotic and ad hoc approach to access to data through dissemination, we now see a lack of people doing good work at all in data,” he said.
While Goldacre said it was eminently sensible and reasonable to have access to secured patient data for planning and research purposes, he suggested there may have been a tendency among senior decision-makers to fall victim to “groupthink”, although he added that once concerns were raised, they had been quickly dealt with.
“When I and others involved in the review started raising concerns about this, it was striking that often people in more junior and technical roles in the system put their hand up or emailed or called to say, ‘I am actually really glad that you have said that, because I have been surprised at the extent to which people haven’t really engaged with the shortcomings of these approaches before’,” he said.
Goldacre added that the government’s commitment to tighten the rules before commencing the data scrape was to be welcomed, in particular the promise to make available a TRE where approved researchers could work securely on pseudonymised data.
“That, as I understand it, is now policy, so we have a very positive future ahead of us, in part because trusted research environments don’t just mitigate privacy risks – they also make for a much more productive environment for working with data,” he said.
Commenting on a proposed bill currently before lawmakers in New Zealand that would criminalise de-anonymising data, Goldacre said he felt very strongly that legal barriers should be put in place to prevent misuse of patient data within the planned TREs.
“You need to block people misusing data, ensure that you detect it when they do, and make sure that the penalties are so high that they get talked about and people are really afraid,” he said, adding that it may be reasonable and proportionate for those penalties to include prison time.