A robust Research Data Management Plan (DMP) is required to demonstrate and ensure good research practice and procedures. It helps protect Intellectual Property Rights (IPR) and supports the proper recording, maintenance, storage and security of Research Data, which in turn supports compliance with relevant legislation and regulations on data usage and data rights. It can also help ensure that common law confidentiality obligations and appropriate access to Research Data are maintained.
A basic DMP is required by the University's Research Data Management policy and is recommended in the Concordat on Open Research Data. A DMP is a condition of UKRI funding and is likely to be mandated by other funding bodies, Government and institutions in the near future.
Even if your project funder does not require planning, it may be useful to write a DMP because time spent reflecting on roles and options at the start can save time later and provide additional benefits, for example:
A DMP will bring most benefit if it is referred to and updated throughout the project and viewed as an integral part of the research process.
A Data Management Plan (DMP) is a document that describes:
If your funder requires you to write a data management plan (DMP), follow that funder's current advice (see below). If your funder has no specific requirements or your research is unfunded, this checklist is for you.
1. What data will be created?
2. Who will create the data?
3. Roles and Responsibilities
4. Software and Services required
5. Naming and describing your data
6. Data Sharing with Collaborators
7. Storage - short & long term
8. Dissemination
9. Restrictions to Sharing
10. Permissions to share
A more detailed DMP checklist is available from the Digital Curation Centre (DCC).
The DCC has a list of “real life” DMPs at http://www.dcc.ac.uk/resources/data-management-plans/guidance-examples
We also have some Southampton DMPs which we can share on request. Contact researchdata@soton.ac.uk
You can also download the generic Southampton DMP information (Word file, sign in required).
University policies related to data management and sharing:
If you are writing a data management plan in support of a grant application, the Research Data team can review your plan before you submit. We would prefer at least two weeks' notice before the deadline. Email us at researchdata@soton.ac.uk
DMPonline is a platform for creating data management plans and provides access to templates for all the major UK and EU funding
DMP guidance for PGRs can be found on the library's thesis guide.
Most major research funders require some form of documentation at the application stage, to explain how research data will be managed.
Funder requirements should be followed alongside the existing University Research Data Management Policy.
UK Research and Innovation (UKRI) expects research data arising from its funding to be made as open as possible and as restricted as necessary. Each Council has developed its own specific policies and requirements to take account of the disciplines involved, but there is a common expectation that researchers will:
The DCC keeps a broad list of funder template DMPs including guidance and sample plans.
Thought needs to be given at an early stage to the costs of preserving data, so that these can be included in the funding application.
The largest share of data costs is incurred in preparing the data and ingesting it into the selected storage service, as shown by the costing tool provided by the UK Data Archive.
It may be necessary to budget for additional time and effort to prepare data for preservation, and some data archives levy a charge for deposits. Funders will usually only pay for costs incurred during a project, so archival storage costs will have to be invoiced during the grant period, when the data is deposited, rather than charged as a rolling annual cost.
Most funding bodies will cover reasonable costs. Check with your funder what support is available.
The FAIR Data Principles were first defined in a 2016 article in Scientific Data. They are designed to promote:
Having machine-readable metadata allows for the discovery of datasets and services.
The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for the automatic discovery of datasets and services, so this is an essential component of the FAIRification process.
F1. (Meta)data are assigned a globally unique and persistent identifier
F2. Data are described with rich metadata (defined by R1 below)
F3. Metadata clearly and explicitly include the identifier of the data they describe
F4. (Meta)data are registered or indexed in a searchable resource
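As an illustration of the Findable principles above, the following is a minimal sketch of a machine-readable metadata record serialised as JSON. The field names, the DOI and the repository URL are hypothetical and chosen only for illustration; a real repository or metadata schema will define its own fields.

```python
import json


def build_metadata(identifier, title, creators, keywords, data_url):
    """Return a minimal machine-readable metadata record as a dict.

    The field names here are illustrative assumptions, not a standard schema.
    """
    if not identifier:
        # F1: every record needs a globally unique, persistent identifier
        raise ValueError("a persistent identifier is required (F1)")
    return {
        "identifier": identifier,    # F1/F3: identifier of the data described
        "title": title,              # F2: descriptive metadata
        "creators": list(creators),
        "keywords": list(keywords),
        "data_url": data_url,        # where the data itself can be requested
    }


record = build_metadata(
    identifier="10.1234/example-dataset",  # hypothetical DOI
    title="Example survey responses",
    creators=["A. Researcher"],
    keywords=["survey", "example"],
    data_url="https://repository.example.org/datasets/42",  # hypothetical URL
)

# Serialising to JSON gives a form that indexing services can parse (F4).
metadata_json = json.dumps(record, indent=2, sort_keys=True)
print(metadata_json)
```

A record like this would normally be created for you when you deposit data in a repository; the sketch only shows what "machine-readable" means in practice.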
After finding data, users must know how data can be accessed. Metadata must be accessible even when data is no longer available.
Once users find the required data, they need to know how the data can be accessed, possibly including authentication and authorisation.
A1. (Meta)data are retrievable by their identifier using a standardised communications protocol
A1.1 The protocol is open, free, and universally implementable
A1.2 The protocol allows for an authentication and authorisation procedure, where necessary
A2. Metadata are accessible, even when the data are no longer available
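Principle A1 can be illustrated with DOIs, which resolve over HTTPS, an open and standardised protocol. The sketch below prepares (but does not send) a request for a dataset's metadata through the doi.org resolver using HTTP content negotiation. The DOI is hypothetical, and the media type shown is one that DOI registration agencies such as Crossref and DataCite are understood to support; check your repository's documentation for the formats it actually offers.

```python
import urllib.request


def doi_metadata_request(doi, media_type="application/vnd.citationstyles.csl+json"):
    """Build an HTTPS request asking the DOI resolver for metadata.

    The Accept header asks for machine-readable metadata instead of
    the human-readable landing page.
    """
    url = "https://doi.org/" + doi
    return urllib.request.Request(url, headers={"Accept": media_type})


request = doi_metadata_request("10.1234/example-dataset")  # hypothetical DOI
print(request.full_url)
print(request.get_header("Accept"))
```

To fetch the metadata for a real DOI, the prepared request can be passed to `urllib.request.urlopen`, which requires network access. If the dataset itself is restricted, the metadata should still be retrievable in this way (A2).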
Ensuring that the data can communicate and exchange with other data, applications, or workflows for processing, analysing and storing.
The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.
I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
I2. (Meta)data use vocabularies that follow FAIR principles
I3. (Meta)data include qualified references to other (meta)data
Having metadata and data well described so that they can be replicated or combined in various settings.
The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings.
R1. (Meta)data are richly described with a plurality of accurate and relevant attributes
R1.1. (Meta)data are released with a clear and accessible data usage license
R1.2. (Meta)data are associated with detailed provenance
R1.3. (Meta)data meet domain-relevant community standards
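The Reusable sub-principles lend themselves to a simple self-check before deposit. The sketch below reports which reuse-related attributes are missing from a metadata record. The field names (`license`, `provenance`, `standard`) are hypothetical and used only for illustration; they are not taken from any particular schema.

```python
# Hypothetical field names mapped to the FAIR sub-principle each one supports.
REUSE_FIELDS = {
    "license": "R1.1",     # clear and accessible data usage licence
    "provenance": "R1.2",  # how, when and by whom the data was produced
    "standard": "R1.3",    # community standard the data follows
}


def missing_reuse_fields(record):
    """Return the reuse attributes that are absent or empty in a record."""
    return [
        f"{field} ({principle})"
        for field, principle in REUSE_FIELDS.items()
        if not record.get(field)
    ]


record = {
    "title": "Example survey responses",
    "license": "CC-BY-4.0",
    "provenance": "",  # empty values count as missing
}

print(missing_reuse_fields(record))
# prints ['provenance (R1.2)', 'standard (R1.3)']
```

A check like this only confirms that the fields are filled in; whether the licence, provenance and standards are appropriate still needs human judgement.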
FAIR principles are increasingly important as we use computational support to find and deal with data. The principles advocate using rich metadata, persistent identifiers (such as DOIs), licences and, where they exist, shared community standards. See the full FAIR principles below.
Datasets can still be FAIR even if they are not openly accessible. If they are more findable because of rich metadata and it is clear how potential users can request and access the dataset, then even a sensitive dataset which cannot be made openly available can achieve a high degree of FAIRness.
The principles refer to three types of entities: data (or any digital object), metadata (information about that digital object), and infrastructure. For instance, principle F4 defines that both metadata and data are registered or indexed in a searchable resource (the infrastructure component).
The FAIR principles are further defined on the GO FAIR website.
A data protection impact assessment (DPIA) is a process to help identify and minimise the data protection risks of a project.
You must do a DPIA for certain listed types of processing, or any other processing that is likely to result in a high risk to individuals' interests. It is also good practice to complete a DPIA for any other major project which will require the processing of personal data.
Under the Data Protection Act 2018, a DPIA (the new term for a Privacy Impact Assessment) is compulsory for any project that is likely to be 'high risk' to the rights and freedoms of individuals. The GDPR does not define what high risk is; however, examples include 'large-scale' processing, so it is likely that a DPIA will be required for some research projects.
Even sensitive research data can often be shared legally and ethically by using informed consent, anonymisation and controlled access. To be able to do this, it is important to consider potential data sharing and re-use scenarios well before the ethics process and data collection. Be explicit in your consent forms and participant information sheets (PIS) about your plans to make data available, who will be able to access the data, and how the data would be accessed and potentially re-used.
You should complete an Initial Data Protection Review (IDPR) (serviceline form), and you may also need to undertake a full Data Protection Impact Assessment. You can find guidance on this process on the Information Governance & Data Protection SharePoint site.