Raptor and Data Protection issues

It is always fascinating in the early weeks of a project to observe how what might seem like a fairly straightforward proposition rapidly spins out its tentacles and gets much more complicated. One of the issues that has arisen already at Kent is that of data protection and long-term data storage. The library staff would like to see data trends over periods of time, ideally as long as five years. Raptor, supplied with log files for the required period, has no problem producing appropriate reports. However, the general advice on keeping log files from the European Directive on Privacy and Electronic Communications says we should not retain data for longer than the time necessary to carry out the specific purpose for which the data was collected. Most people would say that log files are created and kept to help troubleshoot problems and to increase security, since log files can be analysed to spot unusual patterns of activity. It is pretty difficult to justify keeping data for more than a few months if that is indeed the purpose of collecting the data. But if we are using log files to help with projections of usage, or to produce reports and graphs which simply analyse access to e-resources, do we need to state specifically that this is what we are intending to do with the data?

The relationship between a university and its users, generally staff and students, is somewhat different to that between, for instance, an ISP and its users. In the former case there is an assumption, even an expectation, that data will be retained much longer than six months. In fact, in order to comply with education and employment law, organisations are required to keep the data throughout the period of education. But five years is probably beyond what is reasonable.

The guidelines do not state that we cannot keep the data for longer periods, but they stipulate that if we do keep it, it must be anonymised. Anonymised data is less rich: it may not be possible to extract the type of affiliation from anonymised data, for instance, if there is no longer a username to look up. The problem is not insurmountable, but it does bring into focus a more general issue with Raptor and its use by non-technical staff.
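One way to keep the affiliation while dropping the username is to swap one for the other before the logs are archived. The following is only a minimal sketch of that idea, not a description of how Raptor processes logs: the three-column CSV layout (timestamp, username, resource), the `AFFILIATION_BY_USER` lookup table and the function names are all assumptions made for illustration. In practice the lookup would have to come from a directory service while the usernames are still available.

```python
import csv
import io

# Hypothetical lookup table; real data would come from a directory service.
AFFILIATION_BY_USER = {
    "ab123": "staff",
    "cd456": "undergrad",
    "ef789": "postgrad",
}


def anonymise_rows(rows, affiliation_by_user, unknown="unknown"):
    """Yield (timestamp, affiliation, resource), dropping the username.

    Assumes each row has exactly three fields; other rows (for example
    blank lines) are skipped rather than guessed at.
    """
    for row in rows:
        if len(row) != 3:
            continue
        timestamp, username, resource = row
        yield timestamp, affiliation_by_user.get(username, unknown), resource


def anonymise_csv(text, affiliation_by_user):
    """Return the CSV text with usernames replaced by affiliation codes."""
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in anonymise_rows(reader, affiliation_by_user):
        writer.writerow(row)
    return out.getvalue()
```

A log processed this way can still answer questions such as "how many accesses came from undergraduates", but it cannot say anything about an individual. If an affiliation group is very small, the code alone may still point at one person, so some groups might need to be merged.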

An ideal tool for the liaison librarians and library management would be one that could deal with ad hoc reports. By its nature, an ad hoc report should not have to be delayed whilst someone in the Learning & Research Department finds time to create it. The seriousness of this may vary across the sector depending on what level of technical expertise exists among library staff. Some university libraries may well have staff who are comfortable with tinkering with the XML to create new and modified reports, or who can anonymise data and import the modified logs for Raptor to parse. But many libraries will rely on their IT department to do this work for them. Whilst IT departments will be happy to provide this service, it will not always be possible to respond as quickly as the requesters would like.

I would be interested in hearing other Raptor users' views on data protection issues and how they intend to tackle them.

BTW, it feels a little mean criticising Raptor, but I guess that is part of the evaluation. So I would also like to say that we think Raptor is a really useful tool and was much needed.

 


2 thoughts on “Raptor and Data Protection issues”

  1. ancormack says:

    I think you’re right that keeping every person’s access history for five years would be considered excessive. However, anonymisation only means that you have to remove things that would allow information to be linked to an individual. So there should be no problem in keeping, for example

    “Department X accessed N articles from journal M in March 2012”

    (Unless you have departments with only a single user, in which case you need to aggregate across multiple departments)

    The Information Commissioner is currently consulting on an anonymisation Code of Practice: his draft thoughts may give some guidance on the sort of thing that’s needed, though of course they may change in the final version. See
    http://www.ico.gov.uk/about_us/consultations/our_consultations.aspx

    Cheers
    Andrew(Andrew.Cormack(AT)ja.net)

  2. Leo Lyons says:

    Thanks for taking time to comment Andrew and I do agree that anonymisation is not the end of the world. I think it is important to try to anticipate what we might need to report on in the future. If we simply anonymise the data and remove the username we would no longer be able to extrapolate the users’ association – member of staff, undergrad, postgrad etc. I don’t think there would be any objection to us storing that affiliation but we would have to process the logs to perhaps replace the username with a code signifying that affiliation.

    We would have similar problems if we wanted to extract unique users accessing resources too. None of this is insurmountable, even accepting our obligation to stay within the guidelines, but it does require some additional work.
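For the unique-users case, one possible approach is to replace each username with a keyed hash, so that repeat visits by the same person still collapse to one token without the username being stored. This is a sketch under assumptions, not an established Raptor feature: the `(username, resource)` event shape, the function names and the key handling are all hypothetical. Note also that a keyed token is pseudonymised rather than fully anonymised while the key still exists, so whether this satisfies the guidelines would need checking.

```python
import hashlib
import hmac


def pseudonym(username, secret_key):
    """Return a stable keyed token for a username (HMAC-SHA256, hex)."""
    return hmac.new(secret_key, username.encode("utf-8"), hashlib.sha256).hexdigest()


def unique_users_per_resource(events, secret_key):
    """Count distinct users per resource using tokens instead of usernames.

    `events` is an iterable of (username, resource) pairs; this shape is
    an assumption made for the example.
    """
    seen = {}
    for username, resource in events:
        seen.setdefault(resource, set()).add(pseudonym(username, secret_key))
    return {resource: len(tokens) for resource, tokens in seen.items()}
```

The same key must be used across the whole reporting period for the counts to be consistent. Using a different key for each period would stop tokens being compared between periods, at the cost of losing cross-period unique counts.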
