The Big Data Buffet

Over recent decades there has been an exponential increase in the volume of digital data acquired in all research disciplines. These big data research projects have necessitated a paradigm shift in the way researchers make provision for the secure storage and distribution of digitised data to the community. This need is reinforced by funding bodies, publishers and journals, which now expect all data to be made freely available to the research community at large.

The diversity in size and types of files initially provided a challenge to IT departments, but these issues have largely been overcome with the appearance of institutional data depositories. These allow secure, multi-site online storage of data, which can then be made freely available for the global community to access.

The Kent Data Repository (KDR) is such a facility here at the University of Kent. It is a centrally coordinated data storage and distribution hub run by a team of dedicated staff within Information Services. In my own research group's experience, KDR is proving to be an invaluable resource for sharing research data, and thereby increasing its use, impact and value.

My research group in the School of Biosciences generates large quantities of microscopy-based imaging data through our investigations into the dynamic distribution and movement of molecules within living cells. Research within the group is primarily funded by the Biotechnology and Biological Sciences Research Council (BBSRC). The BBSRC, like other research councils, expects all meaningful research data generated through its funding to be shared with the research community.

On a number of occasions over the last couple of years we have needed to deposit significant volumes of data in the Kent depository. Each time it has proven to be a simple and straightforward process. There are clear guidelines and instructions available. In addition, the extremely helpful team can assist you through the simple steps and ensure the files upload correctly. They will even come out to your office to help upload particularly large datasets if required! They will also help you with any copyright or other issues you may not be clear on.

Once the data is uploaded, it can be linked to publications, with access limited to article reviewers. Then, upon acceptance, access can be opened up to the whole community. The data can be highlighted on social media (such as blogs or via Twitter accounts), and during other public engagement activities. You can easily monitor how often the data is accessed and downloaded, which can be used as evidence in impact case development and to strengthen research funding applications. I have found our data is proving to be a useful resource for colleagues who work in research disciplines different from my own, or who lack the necessary research infrastructure and reagents and would otherwise struggle to gain access to such datasets. These interactions have initiated a number of surprising and exciting new cross-disciplinary research opportunities.

I am looking forward to finding out what research and interactions our future dataset deposits will generate.

Open Access Week Guest Blog: Professor Dan Mulvihill, School of Biosciences
