
Research Data Management: Storing

Guidance and support to staff, researchers and students at the University of Southampton

Storing, retaining & destroying

Research data will typically be stored, managed and shared during a project, but may also need to be retained longer for a variety of reasons. This guidance will help you understand some of the factors that influence the length of the retention period for data, so you can make the correct decisions when planning, selecting storage and depositing data.

Storing data costs money, as shown by the costing tool provided by the UK Data Archive. Extended data retention periods may have some additional costs that will impact your project directly or they may be covered by the full economic costing included in your proposal.  Invariably data retention periods will outlive projects, so you may want to consider how this will be funded as part of your data management plan and/or in your proposal – check with your funder’s guidelines. In some cases the costs and benefits of data storage and retention decisions may need to be assessed and justified for funding purposes.

More information about Storing Sensitive Data

Storing your data

While it is not a requirement that all research data must be held within the University, it is recommended that you keep copies of your data on the University's networked storage. If your research data contains personal information from living individuals you must store your data securely within the University's systems - see Research Data & GDPR for more information.

There are different types of networked storage with different snapshot and replication schedules.

Access required by a group

  • Research Filestore (AKA \\xxx.files.soton.ac.uk\<SHARE>\) 
  • Sharepoint
    • How to set up a Sharepoint
    • Adding non-Southampton people
    • Allows all collaborators access to files, and dependent on rights, to edit files
    • Individual file limit of 250GB
    • Sits on the Office 365 platform and has same security as OneDrive
    • Data is encrypted at rest
    • All data held in secure centres which are within the UK
  • High Performance Computing (HPC)
    • Any researcher who is limited by the computing capability of their desktop PC may find HPC beneficial. Contact serviceline if you think you may need to use this facility.
  • J:\ drive (AKA \\soton.ac.uk\resource\) N.B. use of J drive is being phased out.
    • Snapshot on primary every 2 hours, retained for 30 days
    • Sync to replica every 6 hours (at 04:00, 10:00, 16:00, 22:00), retained for 3 months
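
To make the snapshot and replication schedule concrete, the following is a minimal Python sketch. It assumes the schedule runs exactly as listed above; the function names are illustrative and not part of any University tool, and the snapshot count is only an approximation of what the storage system actually holds.

```python
from datetime import datetime, timedelta

# Values taken from the schedule listed above.
SNAPSHOT_INTERVAL_HOURS = 2
SNAPSHOT_RETENTION_DAYS = 30
SYNC_HOURS = (4, 10, 16, 22)  # replica sync times, 24-hour clock


def snapshots_retained() -> int:
    """Approximate number of 2-hourly snapshots held over the 30-day window."""
    return (24 // SNAPSHOT_INTERVAL_HOURS) * SNAPSHOT_RETENTION_DAYS


def next_replica_sync(now: datetime) -> datetime:
    """Return the first scheduled replica sync strictly after `now`."""
    for hour in SYNC_HOURS:
        candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
        if candidate > now:
            return candidate
    # At or after 22:00: the next sync is 04:00 on the following day.
    tomorrow = now + timedelta(days=1)
    return tomorrow.replace(hour=SYNC_HOURS[0], minute=0, second=0, microsecond=0)
```

In practice this means a file saved just after a sync may wait up to 6 hours before it is copied to the replica, while a primary snapshot of it should exist within 2 hours.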

If you need to share data during research

If you need to share data with collaborators either within or external to the University, you should use Sharepoint or Teams through Office 365 (see above).

If you need to send data to collaborators, we recommend using Safesend rather than email.

  • Safesend
    • Can deal with files too large for email
    • Held on university services
    • Receipts sent when users pick up the file(s)
    • Files deleted after set period of no more than 32 days
    • safesend.soton.ac.uk

Access required only by an individual

  • OneDrive through Office 365
    • Cloud based storage
    • 5TB of storage (can be increased to 25TB)
    • Maximum file size 15GB, rising to 100GB in the near future
    • Restrictions on filenames
    • Data is encrypted at rest
    • All data held in secure centres which are within the UK
    • Data will be automatically deleted 30 days after user leaves the University
    • http://go.soton.ac.uk/365login
    • Do NOT use personal OneDrive
  • Network Home Directories (AKA \\filestore.soton.ac.uk\Users\<username>\, AKA “My Documents”):
    • Snapshot on primary every 2 hours, retained for 30 days
    • Sync to replica every 6 hours (at 04:00, 10:00, 16:00, 22:00), retained for 3 months
    • To access network storage from Linux machines see http://linuxdesktops.soton.ac.uk/mount.html 
    • Data will be deleted a maximum of 90 days after user leaves the University.
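
Before moving a folder of research data into OneDrive, it can help to check for files above the 15GB per-file limit quoted above. The following Python sketch shows one way to do this. The function names are hypothetical, and treating 1GB as 10^9 bytes is an assumption; it does not check the filename restrictions, which should be confirmed against current Microsoft guidance.

```python
from pathlib import Path

# 15GB per-file limit quoted above; decimal gigabytes are an assumption.
MAX_FILE_BYTES = 15 * 10**9


def oversized(sizes: dict[str, int], limit: int = MAX_FILE_BYTES) -> list[str]:
    """Return the names of files whose size in bytes exceeds the limit, sorted."""
    return sorted(name for name, size in sizes.items() if size > limit)


def scan(folder: str) -> dict[str, int]:
    """Map each file under `folder` (recursively) to its size in bytes."""
    root = Path(folder)
    return {
        str(path.relative_to(root)): path.stat().st_size
        for path in root.rglob("*")
        if path.is_file()
    }
```

For example, `oversized(scan("my_project_data"))` would list any files in a (hypothetical) `my_project_data` folder that need to be split, compressed or stored elsewhere.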

Related University Policies

University of Southampton Research Data Management Policy

Research Data Storage (University of Southampton Service Statement) - available from iSolutions KnowledgeBase

Bookmark

Bookmark this page as https://library.soton.ac.uk/researchdata/unistorage

At the end of the research project, in a timely manner and in accordance with any funding requirements, research data should be deposited in an appropriate data repository. The best repository to choose for your research data will be a national data centre or discipline specialist data repository because they have the expertise and resources to deal with particular types of data.

Depositing @ Soton

You can deposit small datasets (gigabytes in size) in ePrints Soton for long-term storage. If your data is a terabyte or over in size, contact researchdata@soton.ac.uk to discuss how to deposit your data. Data can be open or on request depending on the nature of the data. See our detailed guide and information on how to deposit data at Southampton.

You can request a DOI for your dataset to include in the funder acknowledgement and data access statement in your publications and we can organise this.  You should request the DOI prior to submitting your manuscript.  See DOI for Data.

 

Depositing elsewhere

Where possible we recommend using discipline-specific data repositories such as the Archaeology Data Service; you can find one for your subject via Re3data.org. Some funders expect data to be deposited in specific data centres, e.g. ESRC and NERC support dedicated data centres. Also consider whether any agreements with your collaborators include requirements for data deposit. If you have an option to deposit in a repository associated with your funder, or your publisher will pay for deposit in Dryad, it is worth considering this.

If you deposit your data elsewhere, please create a dataset catalogue entry in Pure with a link to where the data is stored.

You may be able to publish your data in a data journal. This is a growing and fast moving area. Some publishers are now requiring the deposit of supporting data with the article, while others require that a link to the data is provided. You will need to take this into account when considering how long you will need to retain the data, and it may influence your choice of storage location.

Bookmark

Bookmark this page as https://library.soton.ac.uk/researchdata/deposit

Deciding what data to keep is not always easy, as multiple factors and interests may bear on whether data should be preserved and the means that should be used. The Digital Curation Centre has published a Checklist for appraising research data that can help you approach this question in a considered, systematic fashion.

The University's RDM policy states that 'significant' data should be kept.  But how do you tell what is significant? One way of defining significant is if it falls into one of these categories:

  • Data that directly underpin research findings which have been or will be published, and that must be retained in order that these findings can be independently substantiated;
  • Data that have enduring value independent of any published findings. This may be value to yourself, in view of possible future research or the exploitation of any IPR; to other researchers, who might usefully interrogate them to generate further insights; or to other stakeholders such as industrial collaborators for whom the data may hold commercial interest;
  • Data that must be managed according to legal, ethical or contractual obligations irrespective of any research value they possess. This may affect what can be retained under what conditions, and what must be destroyed.

Selection criteria

These are some of the criteria you should apply when going through data appraisal and selection:

  • Do the data directly support research findings that have been or will be published? 
  • What are the funder's or sponsor's requirements? Many funders require the preservation and sharing of key data underpinning research publications for set periods after the completion of the research. Check any relevant policies and contract terms;
  • What are the publisher's requirements? Some publishers, e.g. the Nature Publishing Group, require authors to make data available as a condition of publishing;
  • Do the data have long-term research value? Would you use them in future research? Would other researchers be able to use them?
  • What is the likely cost of retaining the data in relation to the cost of recreating the data or the impact of losing the data? The results of simulations may be easily reproducible; experimental data may be reproducible cheaply or only at great cost, depending on the nature of the experiment; time series data or survey findings are by their nature irreplaceable, and may be highly valuable;
  • Do you have permission to keep/publish the data? It may not be possible to retain or publish secondary data obtained under licence;
  • Are you legally entitled to keep all the data, or publish or otherwise process them? The Data Protection Act, for example, and the scope of any consents for collection and processing of personal data, might affect what data can be kept, and what can be done with them;
  • Can you afford to keep the data? Can the cost of archiving the data be recovered from the sponsor? The Research Councils as well as some other funders accept that data management costs may be recovered through grants;
  • Do you have space to keep the data? What preservation resources are available to you?

What to destroy and how to do it

It is important that, as well as planning for the curation of your data, you give consideration to how it will be destroyed where this is required for legal or other reasons. Guidance on when research data should be destroyed and who authorises its destruction is covered in our section on Retention Periods and in the University Research Data Management Policy.

See more information on destruction of data.

Other guides on selection

[Thanks to the University of Reading for the list of selection criteria]

 

Bookmark

Bookmark this page as https://library.soton.ac.uk/researchdata/selection

The largest share of costs for data are incurred in preparation and ingest to the selected storage service, as shown by the costing tool provided by the UK Data Archive. 

Extended data retention periods may have some additional costs that will impact your project directly or they may be covered by the full economic costing included in your proposal. Invariably data retention periods will outlive projects, so you may want to consider how this will be funded as part of your data management plan and/or in your proposal – check with your funder’s guidelines. Funders will usually only pay for costs incurred during a project, so archival storage costs will have to be invoiced during the grant, when the data is deposited, rather than charged as a rolling annual cost.

Over time costs will be incurred for storage, typically based on the volume of data stored for a given retention period, and for additional services, for example active data management such as reformatting to counter possible format obsolescence. The latter is now regarded as less of a problem for popular formats, but may need to be considered for specialised data formats.
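
As a rough illustration of how a volume-based storage cost scales with the retention period, the Python sketch below multiplies volume, years and an annual rate. The function name and the rate used in the example are hypothetical placeholders, not University or repository prices; a real estimate should use figures from your chosen storage service and include any additional services such as reformatting.

```python
def retention_storage_cost(volume_tb: float, years: int, rate_per_tb_year: float) -> float:
    """Estimate storage cost as volume x retention period x annual rate per TB.

    `rate_per_tb_year` is a hypothetical figure supplied by the caller;
    it is not a published price.
    """
    if volume_tb < 0 or years < 0 or rate_per_tb_year < 0:
        raise ValueError("inputs must be non-negative")
    return volume_tb * years * rate_per_tb_year


# Example: 2 TB kept for 10 years at a made-up rate of 100 currency units per TB per year.
example_total = retention_storage_cost(2, 10, 100.0)
```

Because the whole retention period usually has to be costed while the grant is still active, a single up-front total of this kind is what would typically need to appear in the proposal.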

In some cases the costs and benefits of data storage and retention decisions may need to be assessed and justified for funding purposes. The KRDS (Keeping Research Data Safe) Benefits Analysis Toolkit may be used for this.

Useful Links

Bookmark

Bookmark this page as https://library.soton.ac.uk/researchdata/costs

The University of Southampton Research Data Management Policy has a requirement that all significant Research Data should be held for a minimum of 10 years, and possibly longer where the data is actively used. Funders also have retention requirements and some research data will also be subject to legal requirements.

The Digital Curation Centre (DCC) has a summary of the requirements by the major funders. If your funder is not on this list please use the Sherpa Juliet service to check if your funder has any requirements for research data.

It may be tempting to keep everything but that has drawbacks as it can be more difficult to find the truly important material. It is also worth remembering that research data can be subject to Freedom of Information (FOI) requests.

You may have your own view on how long you need or want to retain data. This will be influenced by the discipline you are working in, the type of data created and whether further work or publications will be based on it. Factors that may influence retention include:

  • Research impact
  • Academic reputation
  • Derived and linked publications
  • Statutory/legal obligations
  • University and/or Funder policy requirements
  • On-going or further research
  • Validation and testing by others

Data which you decide not to keep should be destroyed securely.

Further guidance is available:

Retention & Repository

It is not a requirement that all research data must be held within the University. Discipline specific repositories and funder requirements may mean that research data will be held elsewhere. You should consider what services you may require to meet the retention requirements applicable to your data.

You can find discipline-specific data repositories for your subject via Re3data.org.

If you are funded, check your funder policy for recommended data repositories e.g. ESRC and NERC support dedicated data centres. Also consider whether any agreements with your collaborators include requirements for data deposit.

Regardless of where you deposit your data, you need to add a dataset record to our data catalogue (via Pure).

Data and Publication

This is a growing and fast moving area. Some publishers are now requiring the deposit of supporting data with the article, while others require that a link to the data is provided. You will need to take this into account when considering how long you will need to retain the data, and it may influence your choice of storage location.

You can deposit small datasets (gigabytes in size) in Pure for long-term storage. If your data is a terabyte or over in size, contact researchdata@soton.ac.uk to discuss how to deposit your data. For further guidance see Depositing data.

You can request a DOI for your dataset to include in the funder acknowledgement and data access statement in your publications and we can organise this.  Ideally you should request the DOI prior to submitting your manuscript.  See DOI for Data.

You may be able to publish your data in a data journal with an accompanying data paper (a separate entity from any research papers based on the data).


Costs of Retention

The largest share of costs for data are incurred in preparation and ingest to the selected storage service, as shown by the costing tool provided by the UK Data Archive. Extended data retention periods may have some additional costs that will impact your project directly or they may be covered by the full economic costing included in your proposal.  Invariably data retention periods will outlive projects, so you may want to consider how this will be funded as part of your data management plan and/or in your proposal – check with your funder’s guidelines.

Over time costs will be incurred for storage, typically based on the volume of data stored for a given retention period, and for additional services, for example active data management such as reformatting to counter possible format obsolescence. The latter is now regarded as less of a problem for popular formats, but may need to be considered for specialised data formats.

In some cases the costs and benefits of data storage and retention decisions may need to be assessed and justified for funding purposes. The KRDS (Keeping Research Data Safe) Benefits Analysis Toolkit may be used for this.

  • Data Management costing tool and checklist (UK Data Service)
    This resource, developed by the UK Data Archive, is a simple activity-based tool that can be used to cost the additional expenses associated with the need to make data shareable beyond the primary research group. Available from the Create & Manage Data Planning for Sharing - http://data-archive.ac.uk/create-manage/planning-for-sharing/costing
  • Keeping Research Data Safe
    A website set up to support dissemination of information on the "Keeping Research Data Safe (KRDS)" cost/benefit studies, tools and methodologies that focus on the challenges of assessing costs and benefits of curation and preservation of research data.

Expired Retention

Research data represents an investment not just from the funder and the University but also by the individual researcher. However, as part of the deposit process you will be asked to consider what should happen at the end of the retention period and who is responsible for carrying this out. Under the University policy the review process will be the responsibility of the lead PI’s Faculty (see also Secure Destruction of data).

Bookmark

Bookmark this page as http://library.soton.ac.uk/researchdata/retention

There is a wide range of data held in the University, in many different formats, with research data being one of the most significant. It is important that, as well as planning for the curation of your data, you give consideration to how it will be destroyed where this is required for legal or other reasons. Guidance on when research data should be destroyed and who authorises its destruction is covered in our section on Retention Periods and in the University Research Data Management Policy.

It should be noted that:

On Your Computer

  • University Windows build computers: deleting from My Documents will delete from the server (deleted items may end up in the Recycle Bin, which will then need to be emptied)
  • University Windows build computers: for locally stored data on desktop or C drive, data will be deleted and moved to Recycle Bin which then must be emptied. 
  • Deleting from J Drive will delete files with no copy retained in Recycle Bin.
  • Mac laptops/desktops: deleted items are moved to Trash which must be emptied. The "Secure Empty Trash" utility is available from the "Finder" menu

N.B. when data stored on an iSolutions server or on SharePoint/OneDrive/Teams is deleted, it will still exist on backups for a period of time (usually 90 days).

In Your Email

To delete emails: empty mailbox and then purge.

Outlook/Exchange has a facility to allow deleted emails to be recovered up to 30 days after they have been deleted.   There is also a facility within Outlook to purge deleted emails altogether so that they cannot be recovered.

Physical Equipment

To enable the secure disposal of electronically held data, it is not sufficient to simply delete the folder or file(s) from your PC or other device, because the data remains on the media but the space previously allocated is now available to be overwritten at some point in the future. Even reformatting the storage is not guaranteed to make the data unavailable. There are several methods to ensure that electronic data held on magnetic and solid state media is secure from unauthorised access.

  1. A secure wipe should be carried out. This process can take some time to complete.
    • DBAN is a popular and free disk wiping utility and is available for Macintosh, Linux and Windows PCs
    • On Macintosh (10.3 or greater) the "Secure Empty Trash" utility is available from the "Finder" menu
    • Digital voice recorders/tape cassettes/video cassettes: follow the manufacturer’s instructions to carry out a hard reset of the device
  2. Physical destruction of the disk
    • The University operates the WEEE (Waste Electrical and Electronic Equipment) regulation to ensure equipment is disposed of safely and securely. The University retains an external company that specialises in destroying magnetic and solid state media and will provide a "Certificate of Assurance". To arrange for the disposal of any electronic equipment please complete the IT Disposal request form.

CDs/DVDs

It is recommended that optical media such as DVDs and CDs are physically destroyed. A simple method is to use a suitable shredder. Note that not all shredders are capable of destroying optical media; please check the suitability of your shredder before using this method.

Printed Materials

It is important that any data that is identified as sensitive and/or confidential and is not to be retained, whether for legal, ethical or other reasons, is destroyed carefully. The University Estates and Facilities provide a service for the removal of confidential waste. Requests for the removal of confidential waste should be made via Planon on SUSSED.

If your data is highly sensitive you should seek advice from within your Faculty/Research/Academic group to confirm that the confidential waste service is appropriate for your material. If you are shredding your sensitive material you should use cross-cut shredders with a minimum standard of DIN 4.

 

Bookmark

Bookmark this page as http://library.soton.ac.uk/researchdata/destruction

Requesting a DOI

We can register a DOI for your dataset through DataCite - this gives a persistent link and can make it easier to cite.

For more details see our DOI for data page.

 

Research Support Guide