What are research data?
Research data mean different things to different researchers. They are the information generated by a scholarly project that supports or validates its analysis. Each project produces data with unique characteristics, so research data come in many forms and formats.
Your research datasets may include:
- machine-generated code
- recordings and transcriptions of interviews
- photographs detailing the stages of development of an artwork
- paper notebooks or workbooks
- sketches, prototypes, or models
- manuscripts and early versions of research outputs
- anything else that contributed to the development of your research in its final or published form
What is Research Data Management?
Research Data Management (RDM) is the practice of organising your data so that it stays secure from creation, through publication, and beyond. RDM also includes making sure your data is available to others who want to replicate your findings or build on them in further research.
What is a Research Data Management Plan?
Management of your research data needs to start as soon as you begin to plan your project. You will need to create a plan to help ensure you have the resources and processes to keep your data safe. Use your plan to record all the decisions you make about your research data. RCUK funders ask for Research Data Management Plans as part of funding bids (e.g. AHRC guidelines).
Your plan will include:
- Administrative metadata – basic information about your project and its context.
- Data collection – what is your data and where does it come from? Will its format enable long-term preservation, access and re-use? Are you re-using any existing data?
- Documentation and metadata – what information will you provide to help users to understand your data?
- Ethics and legal compliance – does your data include information about people? Do you have consent to use your data, and to preserve and share it? Who owns your data?
- Storage and backup – how and where will you store your data during your project? Are there recovery and backup procedures in place? Will there be costs associated with this storage?
- Selection and preservation – what data will you archive? Which data is useful to other scholars, and which supports your findings? Where will you archive your data? What data must be destroyed for ethical and legal reasons?
- Data sharing – what data will you share? How will scholars and the public find your data? What anonymisation or other procedures does your data need before you can store and share it safely? How will you licence it for reuse?
- Responsibility and resources – who is managing your data? What extra resources will you need? Do you have the skills available? What support does your institution provide?
Use the Digital Curation Centre's DMPonline tool to create your plan with tailored guidance and examples.
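If you draft your plan in a text file or spreadsheet before entering it into a planning tool, a short script can remind you which sections still need attention. The following is a minimal sketch in Python. The section keys mirror the headings listed above, but the key names, the `missing_sections` helper and the sample `draft_plan` are illustrative assumptions, not part of any funder template or planning tool.

```python
# Section keys are hypothetical names based on the plan headings in this guide.
PLAN_SECTIONS = [
    "administrative_metadata",
    "data_collection",
    "documentation_and_metadata",
    "ethics_and_legal_compliance",
    "storage_and_backup",
    "selection_and_preservation",
    "data_sharing",
    "responsibility_and_resources",
]


def missing_sections(plan: dict[str, str]) -> list[str]:
    """Return the plan sections that are absent or contain only whitespace."""
    return [name for name in PLAN_SECTIONS if not plan.get(name, "").strip()]


# An invented draft with two sections filled in and one left blank.
draft_plan = {
    "administrative_metadata": "Project title, funder, principal investigator.",
    "data_collection": "Interview recordings and plain-text transcripts.",
    "storage_and_backup": "",
}

for section in missing_sections(draft_plan):
    print("Still to write:", section)
```

A check like this only tells you that a section has some text in it. It cannot judge whether your answers meet your funder's requirements.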
What is a Research Data Archive?
A research data archive is a digital system that curates datasets according to agreed standards and governance. It is designed to ensure that deposited data is findable, secure and preserved for the long term. It provides secure file storage and a metadata schema for recording information about the data. The archive will have published governance and procedures to ensure that it, and the data it holds, are reliable and trustworthy. There are three main standards for archives of this kind:
- Data Seal of Approval (DSA)
- Audit and certification of trustworthy digital repositories (ISO 16363)
- Criteria for Trustworthy Digital Archives (DIN 31644)
DataCite has set up a global registry of research repositories called re3data.org. re3data.org lists research data archives that are run by a legal entity (such as a University) and have robust governance in place.
What data do I need to archive?
How much of your data you choose to archive is up to you, as long as you follow your funder and publisher guidelines. You may want to archive only the data needed to verify or reproduce your research findings. Alternatively, you may want to preserve all your raw datasets for future reuse. Once you have selected the data you want to archive, you will need to make sure it is ready.
Think about the format your data is created in and whether this file type is appropriate for archiving. Open, interchangeable or standard formats will keep your files accessible over the long term and to a wider audience. Organise your data in a clear and logical structure. Include documentation and contextual metadata to help scholars find and use it. Research data archives have metadata forms designed so that you can record this information easily.
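One way to review a dataset before deposit is to list every file with its size, a checksum and a note on whether its format looks suitable. The sketch below does this in Python using only the standard library. The `PREFERRED_EXTENSIONS` set and the `build_manifest` helper are assumptions made for illustration: check the guidance from your chosen archive for the formats it actually recommends. The demonstration folder and its two files are invented.

```python
import hashlib
import tempfile
from pathlib import Path

# Illustrative only: replace with the formats your archive recommends.
PREFERRED_EXTENSIONS = {".csv", ".txt", ".xml", ".json", ".wav", ".tiff"}


def build_manifest(root: Path) -> list[dict]:
    """List each file under root with its size, SHA-256 checksum and a format flag."""
    manifest = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue  # skip folders; only files are recorded
        manifest.append({
            "path": path.relative_to(root).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            "preferred_format": path.suffix.lower() in PREFERRED_EXTENSIONS,
        })
    return manifest


# Demonstration on a temporary folder holding two invented files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "interviews").mkdir()
    (root / "interviews" / "participant01.txt").write_text("transcript", encoding="utf-8")
    (root / "sketch.psd").write_bytes(b"placeholder")
    manifest = build_manifest(root)

for entry in manifest:
    print(entry["path"], entry["bytes"], entry["preferred_format"])
```

The checksum lets you, or the archive, confirm later that a file has not changed since you recorded it. The format flag is only a prompt to consider converting a file, not a verdict on it.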
What about non-digital data?
You may create research data in physical or analogue formats. These may include handwritten notebooks, models and preparatory sketches. Data of these kinds can be preserved by creating a digital surrogate or copy that you can then store alongside your digital data. Workbooks and small-scale text can be scanned. Analogue audio and video recordings can be converted to digital formats. Interviews can be transcribed and saved in text files.
Take care to ensure that the digital surrogates represent the data accurately. Include information about the digitisation process in the accompanying information and metadata.
Does your data have to be Open Access?
Funder, publisher and data policies encourage you to make your data openly available to other scholars. This is so that your findings can be replicated and your data can be reused. Continued use of the data beyond the lifetime of your project can increase your citations and exposure. This may lead to fruitful collaborations and more funding.
Open sharing may not be possible without intervention if your data contains sensitive information. Sensitive information may be:
- personal details
- commercially sensitive information
- information that may place individuals or groups at risk
- existing datasets with restrictions on onward dissemination
At the planning stage of your project you should include details of how you will share your data without compromising the rights of others. As early as possible consider:
- what consent you need to obtain from your partners and participants
- what anonymisation processes you will need to apply to your data to protect your participants
- what embargo periods you will need to apply to your data
- what redaction you will need to perform to keep your data safe
What is F.A.I.R. data?
Using the concept of F.A.I.R. data instead of Open Data allows us to discuss these issues in more detail. With F.A.I.R. data we can make datasets ‘as open as possible, (but) as restricted as necessary’. F.A.I.R. stands for data that are:
- Findable – accurate descriptions of your data available to search engines; a unique identifier for your data; archived or recorded in a searchable and stable repository
- Accessible – retrievable by anyone through a standard route, with authentication where access needs to be restricted
- Interoperable – in open or standard formats; with contextual information about the data
- Re-usable – with a clear and standard licence; with information about their provenance; with all necessary consent and clearance in place
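The four headings above can be turned into a simple self-check on the descriptive record you keep for a dataset. The following Python sketch is one possible way to do that. The field names under each heading, the `fair_gaps` helper and the sample `record` are all assumptions made for illustration. They are not an official F.A.I.R. schema, and the identifier shown is a placeholder rather than a real DOI.

```python
# Hypothetical field names grouped under the four F.A.I.R. headings.
FAIR_FIELDS = {
    "Findable": ["identifier", "description", "repository"],
    "Accessible": ["access_url"],
    "Interoperable": ["file_format", "context_notes"],
    "Re-usable": ["licence", "provenance", "consent_cleared"],
}


def fair_gaps(record: dict) -> dict[str, list[str]]:
    """Map each F.A.I.R. heading to the fields that the record leaves empty."""
    gaps = {}
    for heading, fields in FAIR_FIELDS.items():
        missing = [name for name in fields if not record.get(name)]
        if missing:
            gaps[heading] = missing
    return gaps


# An invented record with some fields still to be completed.
record = {
    "identifier": "doi:10.0000/example",  # placeholder, not a real DOI
    "description": "Transcripts of interviews with gallery curators.",
    "repository": "Institutional data repository",
    "file_format": "text/plain",
    "licence": "CC BY 4.0",
}

for heading, missing in fair_gaps(record).items():
    print(heading, "is missing:", ", ".join(missing))
```

A record with no gaps is not automatically F.A.I.R., but an empty field is a useful prompt to go back to the relevant part of your plan.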
What help is available at the University of Kent?
At the University of Kent the Research Support team provides guidance in Managing your Research Data. This includes:
- guidance for creating a research data management plan
- information about sharing and preserving your data
- help with understanding the policy, legal and ethical context of your research data
We also provide the Kent Data Repository (KDR). KDR is a secure and accredited archive for research data created by Kent staff and students. It is also a registry of Kent’s research data that is stored in external archives. KDR was launched earlier this year, and we are continuing to develop its features and functionality.
Come and talk to us about how we can help you with your Research Data Management.