If you live in the UK then, in among the junk mail that your postman kindly delivers to ensure your waste paper bin gets fed, you will probably have had a leaflet from the NHS entitled "Better information means better care". Essentially this is about the pooling of medical records in order to improve care; or at least that is what the leaflet suggests. The problem is that this sharing of data isn't about ensuring that you get better care, as your medical details are already shared between your GP, hospitals, etc. to provide you with the best care. In fact, as far as I can tell, no one who would ever treat you would have direct access to this shared data. So if it isn't about treating you better, what is it about?
The real aim of pooling the data together is strategic planning and research. Now, given that I work as a researcher, I understand that in most instances the more data you have the better your results will be. While I'm not a medical researcher I have, over the last few years, been involved in a number of large projects which have involved limited access to the medical records of individual patients. So in theory I should, eventually, benefit from this programme, as there will be more data available not just to medical researchers but to people like me (for those who don't know, I'm a Natural Language Processing researcher) who can also help society. Just a few of the projects I've worked on, or know about, have ensured that prescription information is accurately recorded, have helped to find a new cause of cancer, and have built a system that can predict suicide attempts from medical notes. In all cases access to the medical records has been strictly controlled (the number of hoops you have to jump through can be insane), and for good reason: these documents contain a large amount of very personal information. All the data I have seen has either been from patients who have opted in to a study (like the cancer work) or has been used on a purely non-commercial basis.
I certainly have no problem with opt-in medical studies, and research within a hospital to improve patient care seems a worthwhile use of data. Selling data to outside companies, however, seems to cross a line. I'm sure that large pharmaceutical companies have strict controls in place over data, but the wider you share things the harder it is to control. And anyway, why should the Health and Social Care Information Centre (HSCIC -- the people who will collect and hold the data) be allowed to make money by selling your data? If you ask HSCIC they will of course tell you that in most cases the data they release or sell will be fully anonymized.
I certainly have no problem with releasing population statistics like the percentage of the country who suffered from flu last year. I can see that that is useful information and that it can in no way lead back to me, whether I had flu or not. The problem with handling individual medical records is that they are often very private, and the details have been shared on the understanding that they will be treated confidentially. By the time I see a medical record when doing research, it should have been anonymized (like the majority of HSCIC releases will be) to ensure that I can't work out who the patient is. This usually involves removing name, date of birth, address, etc. Unfortunately there are two problems with this. Firstly, no matter how hard you try to anonymize a data set, it is often possible to reverse this and to uncover the people involved; two large examples of this are the AOL search dataset and the Netflix Prize dataset. The more important problem, though, is that large-scale anonymization is usually done automatically, and like any automatic system it will sometimes fail. Some of the worst instances of this I have seen were actually in medical records. The worst case I ever saw was a large set of notes in which the names of the patients had been correctly removed throughout. Unfortunately these were letters and notes regarding social care, and in many cases the names of other family members, including the patient's spouse, had not been removed.
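To make that last failure mode concrete, here is a minimal sketch in Python. It is purely illustrative: the `scrub_patient` helper, the names and the letter are all invented, and it is not the tool used on any real data set. It does what a naive automatic system might do, replacing every mention of the patient's own name, and shows how a relative's name can survive untouched.

```python
import re

def scrub_patient(text, patient_names):
    """Hypothetical helper: replace each known patient name with a placeholder."""
    for name in patient_names:
        pattern = r"\b" + re.escape(name) + r"\b"
        text = re.sub(pattern, "[PATIENT]", text, flags=re.IGNORECASE)
    return text

# An invented social-care style letter; no real person is described.
letter = (
    "Mrs Jane Example attended clinic today. "
    "Her husband, Robert Example, reports that Jane has been sleeping badly."
)

# Scrub the full name first, then the bare first name.
scrubbed = scrub_patient(letter, ["Jane Example", "Jane"])
print(scrubbed)
```

The patient's name is gone, but "Robert Example" is still there, and a spouse's full name is usually enough to work out who the patient is. A scrubber that only knows about the fields in the patient's own record has no way of knowing which other names in the free text need removing.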
So, as the title of this post suggests, I'm torn. On the one hand I know from first-hand experience that the data the HSCIC wants to collect can be extremely useful, while on the other hand I know that keeping it secure can be very difficult. Added to this, while I might be happy for the NHS to use this data internally for research, I'm not convinced it should be sold to commercial third parties, however strict the conditions on its use might be. For example, are we talking about selling to pharmaceutical companies for research or for advertising?
I've still not made a final decision on what to do about my own records, but I thought some of you might find this post an interesting reflection from both sides of the fence. Unfortunately the leaflet that was sent out didn't actually include an opt-out form. If you do decide you want to opt out then you don't need to make a GP appointment (that would be a serious waste of the GP's time). All you need to do is send them a letter explaining that you don't want your records included, and there are even a couple of websites (here and here) that have form letters you can use to make the process easy.