Written by Sarah Slowe and originally published on the OSC Blog on 14th February 2018
This would have been my shortest blog post ever, but for the general feeling that it is bad form to have a post with more words in the title than in the post, especially when you offered to write the post in the first place!
While I am always keen to talk research data, I recognise this is not a universal trait and many of you have plenty of other things to worry about. There are, however, some key times when thinking about research data is important.
Thinking about your project
What data are you collecting? How are you collecting it? Where will you store it? Who needs to access it?
Sometimes the answer to these questions is “But I don’t have data”. Data conjures up a picture of numbers, tables and spreadsheets, but the reality is that research data is much broader and more varied than this. Pictures? Sculptures? Books? Interview transcripts? Lego models? Films? All of these can be, and at Kent have been, the data that underlies a research project. “What information would you need to reproduce your research findings?” is a good question to ask when thinking about what your research data looks like. For more information, see the research support ‘What is data?’ post.
“Research data is defined as recorded factual material commonly retained by and accepted in the scientific community as necessary to validate research findings; although the majority of such data is created in digital format, all research data is included irrespective of the format in which it is created.”
Engineering and Physical Sciences Research Council (EPSRC)
Planning from the start of a project helps you to:
- consider your project as a whole
- think about the steps and different parts
- plan for every eventuality
- keep track of what changes you have made, and why, as your project progresses, so you don’t lose this information
Planning to manage data also gives you the opportunity to explore the data that is already available to researchers, through services such as re3data. For more information, see the research support ‘Where is data?’ post.
Applying for funding
Many research funders ask for a data management plan as an attachment. The Digital Curation Centre has a list of funder policies, an outline resource for creating a data management plan and a very handy checklist that covers many of the questions to consider when managing your research data.
While you are collecting your data
Keep your data safe! Data can be lost in many different ways: through human error, hardware failure, software or media faults, or malicious hacking and virus infection. Digital data files can also be corrupted in storage or through file transfer. Those of you who have heard me talk about research data will have heard the “A USB stick is not an archive” refrain.
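One common way to check that a file has not been corrupted in storage or transfer is to compare checksums of the original and the copy. The sketch below is only an illustration, assuming Python is available; the function names `sha256_of` and `copies_match` are hypothetical and not part of any Kent service.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large data files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, copy):
    """Return True when both files have identical contents."""
    return sha256_of(original) == sha256_of(copy)
```

Recording the digest alongside the data when it is first stored means you can re-run the check later and spot silent corruption.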
Organise your data – How will you name files so they are still meaningful at the end of the project? How will you track versions of files? How will you structure and reference files? This is especially important in collaborative projects, to ensure that everyone is working to the same guidelines – having an agreement about date formats can save a lot of time later on.
FileName2018-09-01? Filename01-09-2018? Filename09-01-2018?
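As one possible convention (an assumption for illustration, not a Kent requirement), writing dates year first, as in 2018-09-01, removes the day/month ambiguity and makes alphabetical order match chronological order. A minimal Python sketch, with a hypothetical helper `dated_filename` and made-up file names:

```python
from datetime import date


def dated_filename(stem, day, version, extension):
    """Build a name such as 'interview_2018-09-01_v01.txt'."""
    return f"{stem}_{day.isoformat()}_v{version:02d}.{extension}"


# Hypothetical example files, listed out of order on purpose.
names = [
    dated_filename("interview", date(2018, 9, 1), 1, "txt"),
    dated_filename("interview", date(2018, 1, 9), 1, "txt"),
    dated_filename("interview", date(2017, 12, 25), 2, "txt"),
]

# With year-month-day order, plain alphabetical sorting is also chronological.
for name in sorted(names):
    print(name)
```

The zero-padded version number (v01, v02) keeps versions in order too, for up to 99 versions.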
Give your data context – Keep information to help you interpret your data and give it context in the short and long term. Record when the data was collected and by whom and keep this information with the data files wherever possible.
Create a ReadMe file: A ReadMe file is used to describe everything someone would need to know to replicate the data, or to use it and understand it properly.
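What goes into a ReadMe varies by project, so the sketch below is only a starting point. The function `build_readme`, the field names, and the example values are all hypothetical; it simply assembles the "who, when and what" described above into plain text.

```python
def build_readme(title, collected_by, collected_on, description, files):
    """Return ReadMe text; the fields used here are illustrative only."""
    lines = [
        f"Title: {title}",
        f"Collected by: {collected_by}",
        f"Collected on: {collected_on}",
        "",
        "Description:",
        description,
        "",
        "Files:",
    ]
    # One line per data file, with a short note on what it contains.
    lines += [f"- {name}: {note}" for name, note in files.items()]
    return "\n".join(lines) + "\n"


# Hypothetical project details, for illustration only.
text = build_readme(
    title="Example interview study",
    collected_by="A. Researcher",
    collected_on="2018-09-01",
    description="Transcripts of semi-structured interviews.",
    files={"interview_2018-09-01_v01.txt": "first interview transcript"},
)
print(text)
```

Saving the result as a plain text file next to the data keeps the context and the files together.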
Keep in touch with research support when you have questions – it may save time later.
Publishing your data
“Do I need to publish everything?” – no!
There are important considerations in conducting reproducible research, but there are also ethical considerations. The aim is to ensure that research data are available for others to access and re-use where legally, ethically and commercially appropriate, taking note of any relevant safeguards.
Some journals require data that underpins a publication to be made openly available on submission or acceptance of the manuscript, but this is often a subset of the data collected throughout the project.
There are many ways of publishing your data; the two key ones are:
- include it within your final published papers – this means researchers already engaged with your research output can reuse the data if it is relevant to their work
- deposit it in a repository (subject or institutional) – this provides a single, authoritative source for your data, which you can then reference when sharing informally among colleagues, using tools such as KUDOS, or on personal, project or school websites.