Theses: Thesis Data Deposit

Information about theses at Southampton: thesis templates, guidance on e-theses, how to find theses

Data Deposit Guide

Deposit Guide

Data should be deposited via Pure. Data that can be made publicly available will be visible in ePrints Soton.


It is good practice to have a README file to accompany your dataset. A README file should be a txt file.

We would recommend the following guides to writing README files:

Thesis Data Deposit

To deposit your data and request a DOI that you can cite in your thesis and future publications:

  • Create a dataset record in Pure and complete all fields marked with a red asterisk.
  • Save the record to ‘For Validation’ and email to request a DOI.
    • You will find the status at the foot of the record.  If you are not ready to request the DOI you can save the record to 'Entry in Progress' and come back to it later.
    • You will be notified of the assigned DOI by a member of the research data team who will also offer further advice on completing the record.  They will ask about the nature of the data to help you decide on the right level of access and embargoes.
    • The record will be set to ‘Entry in Progress’ and will remain in Pure until the record is ready.
  • Complete the record, adding files and other details including a README file.
  • Send a scanned copy of the completed and signed ‘Permission to Deposit Thesis’ form OR ask your supervisor to email confirming that the data can be released and approving any agreed embargo period.
  • When the data is final, resubmit the data record to ‘For Validation’ and send a confirmation email.

On receipt of confirmation that the data is final and can be released, the research data record will be checked and validated, and the DOI will be registered the next working day.

Dataset Title*:

The title should reflect the content of the dataset.  It can reference the title of the thesis, but should not be identical.


This should provide more information on the content of the dataset and should include a reference to the thesis.

Date of data production:

This can be a specific date or a range of dates to reflect the period of time during which you collected and analysed the data.


Add anyone who helped to create or contributed to the data and, generally, your supervisor.  Change the default ‘role’ under the edit option once the name has been added.  Only Creators will be in the citation on ePrints Soton.

Dataset Managed by*:

This will reflect the organisational unit that you are attached to on the University system and is filled in automatically.


University of Southampton is the default publisher and should not be altered.

Digital Object Identifier (DOI):

Leave this field blank.  This field will be completed by the Research Data team when the DOI is to be registered.  A temporary note giving the assigned DOI will be added to the description, then removed before the DOI is registered.

Electronic data:

Upload files by browsing or dragging and dropping.  Add any agreed embargo to the files.  Do not change visibility of the files.

  • Files should be prepared in advance and zipped if there are more than 10 files.
  • File format should be as open as possible, for example .csv, .docx – see UKDA guidance
  • Max file or zip size is 4GB.  Contact if dataset is bigger.
    • Max number of 4GB files or folders to be attached is 10
  • A README file should be included.  A thesis README template is available.
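
The checks in the list above can be run locally before you upload anything. The following is a minimal Python sketch, not an official tool: the function name `check_deposit_folder` is hypothetical, and it assumes that 4GB means 4 × 1024³ bytes and that a README is any file whose name starts with "readme". It looks only at files directly inside the folder.

```python
from pathlib import Path

# Limits quoted in this guide. Treating 1GB as 1024**3 bytes is an assumption.
MAX_FILE_BYTES = 4 * 1024**3   # "Max file or zip size is 4GB"
MAX_LOOSE_FILES = 10           # "zipped if more than 10 files"


def check_deposit_folder(folder):
    """Return a list of human-readable warnings for the files in `folder`."""
    files = sorted(p for p in Path(folder).iterdir() if p.is_file())
    warnings = []

    # More than 10 loose files: the guide recommends zipping them first.
    if len(files) > MAX_LOOSE_FILES:
        warnings.append(f"{len(files)} files found: zip them before upload")

    # Any single file or zip above the size limit cannot be uploaded.
    for p in files:
        if p.stat().st_size > MAX_FILE_BYTES:
            warnings.append(f"{p.name} is larger than 4GB")

    # A README should accompany the dataset.
    if not any(p.name.lower().startswith("readme") for p in files):
        warnings.append("no README file found")

    return warnings
```

For example, `check_deposit_folder("my_thesis_data")` (a hypothetical folder name) returns an empty list when none of the three checks is triggered.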

The year should be the year that data will be released openly or available on request.  If all files are embargoed the year will be the year the embargo lifts.

Temporal coverage:

Only complete this field if the data is about a specific period of time.  For example, if the data relates to the London 2012 Olympics you would add the dates on which this took place (27 July to 12 August 2012).


This should only be completed if the data is about a specific location.  For example, if the data is about the Mekong Delta you could add this in words, as polygon coordinates, or as point coordinates (10.033333, 105.783333).


If the data collected is subject to any legal or ethical constraints this information should be added.

Data protection:

  • Add anything that has been done to the data to make it shareable, for example ‘all data has been pseudonymised’.

Ethics approval:

  • Add ERGO number and details of any other ethics approval received, for example NHS. 
  • Upload a copy of the Participant Information Sheet and blank consent form as part of the data set.

Commercial partner:

  • Provide the name of the partner and the type of restriction, for example, an embargo for a limited period (such as 10 years) or a permanent restriction.


  • Add information on why this is sensitive

It is helpful to add a few words that describe the subject of the data.


The default ‘Public – no restriction’ should be accepted for data that is going to be:

  • assigned a DOI,
  • openly available,
  • available on request, or
  • available after an embargo.

For all other visibility settings contact to discuss before adding files to the record.

Please see our research data deposit videos at

*Mandatory fields

The fields marked with an asterisk (red asterisk in Pure) are mandatory fields and must be completed before the record can be saved. Once the DOI has been registered they cannot be changed.

The research data that supports your thesis is an essential part of the work that you have carried out.  Research data can be an output of research that can bring its own rewards via citation, and open doors to future collaboration. By depositing your data you are able to properly cite the data in publications, including your thesis, using a Digital Object Identifier (DOI).  Publishers now often require that research data is deposited in support of publications and funders view it as good research practice to include a data access statement on papers stating how the data can be obtained - see for further guidance. 

The University of Southampton Research Data Management Policy, which applies to postgraduate research students, requires that research data underpinning publications, including doctoral theses, should be deposited in the Institutional Repository (via Pure).

What data?

Research data, defined as "material intended for analysis", that directly underpins the assertions, figures and diagrams in your thesis should normally be deposited as a dataset, accompanied by any necessary documentation. This applies to data or research materials that you have collected or created yourself. 

Even if you have presented the data in your thesis, you should still deposit the data separately so that it meets the FAIR principles, meaning it should be findable, accessible, interoperable and reusable. In particular, data which is hidden in a thesis PDF is not reusable.

Do I need to deposit all my data?

You do not necessarily need to deposit every piece of data that you have collected.  It is important to evaluate the data, something you should do throughout the research process, and remove from the final dataset any data covering aspects of your research that no longer form part of your thesis. This need not mean discarding that data, just removing it from the dataset supporting your thesis. See for further guidance.

If you have extracted data from commercial databases, obtained data under licence, collected data from archives for personal research purposes only, are using confidential data (for example, MRC or NHS data), or are using material still under copyright, you may not be able to deposit the data.  Contact for guidance.

My data is covered by ethics, do I need to deposit this?

Data should be anonymised or pseudonymised before deposit, unless you sought permission from your participants to include their names.
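
As an illustration of what pseudonymising can look like in practice, the Python sketch below replaces participant names with stable codes derived from a secret key. It is one possible approach, not a University requirement: the function names, the `P-` prefix and the 8-character code length are all assumptions made for the example. Replacing names alone may not be enough to prevent re-identification, so other identifying details in the data still need to be reviewed, and the secret key must be kept out of the deposited files.

```python
import hashlib
import hmac


def pseudonym(name, secret_key, prefix="P"):
    """Map a participant name to a stable code that does not reveal the name."""
    # HMAC-SHA256 with a secret key: the same name always gives the same code,
    # but the code cannot be recomputed without the key.
    normalised = name.strip().lower().encode("utf-8")
    digest = hmac.new(secret_key, normalised, hashlib.sha256)
    return f"{prefix}-{digest.hexdigest()[:8]}"


def pseudonymise_rows(rows, name_field, secret_key):
    """Return copies of `rows` with `name_field` replaced by a pseudonym."""
    return [
        {**row, name_field: pseudonym(row[name_field], secret_key)}
        for row in rows
    ]
```

Because the same name always maps to the same code, records belonging to one participant can still be linked across files after the names have been removed.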

Does my data need to be open?

We strongly encourage research data to be as "open as possible, and closed as necessary".  However, not all data can be made open.  If your data is subject to a commercial confidentiality clause, is sensitive, or there is too high a risk that the data can be re-identified, then the data does not need to be open. It is possible to make the data available:

  • After an embargo period.
    • For example, at the end of the confidentiality period.
  • On request.
    • For example, to bona fide researchers.
  • Closed.
    • For example, the data contains sensitive data that cannot be anonymised or third party copyright.  Note that if the dataset is permanently closed, it is not possible to provide a DOI.

Further guidance is available from and  Contact for advice if required.

Is there a file size limit to the data I can deposit?

Our Institutional Repository can only accept files or zips up to 4GB in size.  We recommend that no more than 4 such files are uploaded to a single record.  If your data is much larger than this, you should contact for advice.

I have lots of files, do I need to upload them individually?

If you have more than 10 files we would recommend that the files are placed in a zip folder.  It can be useful to zip files of the same type together.  For example, if you have audio files and analysed data files, these could be in two separate zip files as this could make them easier to describe. However, the data should be zipped in a way that helps to keep the data together in a logical way. See our guide on how to zip files.
In all cases the README should explain the content, the relationship between files and any software required.
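
Grouping files of the same type into separate zips can be scripted. The Python sketch below is one way to do this using only the standard library; the function name `zip_by_extension` and the `<extension>_files.zip` naming are illustrative assumptions, and grouping by file extension is only one possible reading of "same type". If your files belong together for other reasons (for example, by experiment or participant), group them that way instead.

```python
import zipfile
from collections import defaultdict
from pathlib import Path


def zip_by_extension(source_dir, output_dir):
    """Create one zip per file extension found under `source_dir`."""
    source = Path(source_dir)
    output = Path(output_dir)
    output.mkdir(parents=True, exist_ok=True)

    # Group every file (including those in subfolders) by its extension.
    groups = defaultdict(list)
    for path in sorted(source.rglob("*")):
        if path.is_file():
            key = path.suffix.lower().lstrip(".") or "no_extension"
            groups[key].append(path)

    archives = []
    for ext, paths in groups.items():
        archive = output / f"{ext}_files.zip"
        with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
            for path in paths:
                # Store paths relative to the source folder to keep its layout.
                zf.write(path, arcname=path.relative_to(source).as_posix())
        archives.append(archive)
    return archives
```

Choose an output folder outside the source folder so that zips from an earlier run are not picked up and zipped again. Each resulting zip still has to respect the 4GB limit, and the README should describe what each zip contains.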

Further guidance

Further guidance is available on the main Research Data Management Pages - 