Research data will typically be stored, managed and shared during a project, but may also need to be retained longer for a variety of reasons. This guidance will help you understand some of the factors that influence the length of the retention period for data, so you can make the correct decisions when planning, selecting storage and depositing data.
Storing data costs money, as shown by the costing tool provided by the UK Data Archive. Extended data retention periods may have some additional costs that will impact your project directly or they may be covered by the full economic costing included in your proposal. Invariably data retention periods will outlive projects, so you may want to consider how this will be funded as part of your data management plan and/or in your proposal – check with your funder’s guidelines. In some cases the costs and benefits of data storage and retention decisions may need to be assessed and justified for funding purposes. The KRDS (Keeping Research Data Safe) Benefits Analysis Toolkit may be used for this.
More information about Storing Sensitive Data
See more information about retention.
See more information on destruction of data.
It is not a requirement that all research data be held within the University, but it is recommended that you keep copies of your data on the University's networked storage. If your research data contains personal information from living individuals, you must store your data securely within the University's systems; see Research Data & GDPR for more information.
There are different types of networked storage:
Research Data Storage (University of Southampton Service Statement) - available from iSolutions KnowledgeBase
At the end of the research project, in a timely manner and in accordance with any funding requirements, research data should be deposited in an appropriate data repository. The best repository to choose for your research data will be a national data centre or discipline specialist data repository because they have the expertise and resources to deal with particular types of data.
You can deposit small datasets (gigabytes in size) in ePrints Soton for long-term storage. If your data is a terabyte or more in size, contact email@example.com to discuss how to deposit your data.
Data can be open or on request depending on the nature of the data.
You can request a DOI for your dataset to include in the funder acknowledgement and data access statement in your publications; we can organise this for you. Ideally, you should request the DOI before submitting your manuscript. See DOI for Data.
Where possible, we recommend using discipline-specific data repositories such as the Archaeology Data Service. You can find one for your subject via Re3data.org. Some funders expect data to be deposited in specific data centres; for example, ESRC and NERC support dedicated data centres. Also consider whether any agreements with your collaborators include requirements for data deposit. If you have the option to deposit in a repository associated with your funder, or your publisher will pay for deposit in Dryad, it is worth considering.
If you deposit your data elsewhere, please create a dataset catalogue entry in Pure with a link to where the data is stored.
You may be able to publish your data in a data journal. This is a growing and fast-moving area. Some publishers now require supporting data to be deposited with the article, while others require that a link to the data is provided. You will need to take this into account when considering how long to retain the data, and it may influence your choice of storage location.
Deciding what data to keep is not always easy, as multiple factors and interests may bear on whether data should be preserved and the means that should be used. The Digital Curation Centre has published a Checklist for appraising research data that can help you approach this question in a considered, systematic fashion.
The University's RDM policy states that 'significant' data should be kept. But how do you tell what is significant? One way of defining significant is if it falls into one of these categories:
These are some of the criteria you should apply when going through data appraisal and selection:
It is important that, as well as planning for the curation of your data, you consider how it will be destroyed where this is required for legal or other reasons. Guidance on when research data may be destroyed, and who authorises its destruction, is given in our section on Retention Periods and in the University Research Data Management Policy.
The largest share of data costs is incurred during preparation and ingest into the selected storage service, as shown by the costing tool provided by the UK Data Archive.
Extended data retention periods may incur additional costs, which will either impact your project directly or be covered by the full economic costing included in your proposal. Data retention periods will invariably outlive projects, so you may want to consider how this will be funded as part of your data management plan and/or in your proposal – check your funder’s guidelines. Funders will usually pay only for costs incurred during a project, so archival storage costs will have to be invoiced during the grant, when the data is deposited, rather than charged as a rolling annual cost.
Over time costs will be incurred for storage, typically based on the volume of data stored for a given retention period, and for additional services, for example active data management such as reformatting to counter possible format obsolescence. The latter is now regarded as less of a problem for popular formats, but may need to be considered for specialised data formats.
In some cases the costs and benefits of data storage and retention decisions may need to be assessed and justified for funding purposes. The KRDS (Keeping Research Data Safe) Benefits Analysis Toolkit may be used for this.
Thanks to the University of Reading for the list of selection criteria.