Research Data Management: Working

Guidance and support to staff, researchers and students at the University of Southampton

Requesting a DOI

We can register a DOI for your dataset through DataCite - this gives a persistent link and can make it easier to cite.

For more details see our DOI for data page.


Data Lifecycle

Good research data management doesn’t end with a Data Management Plan or begin when you want to share your data; thinking about how to organise your data throughout your research project will allow you to comply with funder requirements and make your research more efficient. Data management is an essential building block for good research.  It will help you collect data that underpins your work in a way that allows it to be used with confidence, both now and in the future.  Basic principles, such as filenaming and version control, will help you locate and understand your data.  Ultimately, well managed data can add to the credit you receive and impact of your research alongside other research outputs.

See further information if you are working with sensitive data (data that is security or commercially sensitive, or that contains information on living human subjects).


Working with Data

By describing and documenting your data you will be able to:

  • return to data created earlier in a project and be reminded of what work or processes have been applied to the data
  • revise or review the data should you need to do so
  • extend your original work at a later date
  • allow others to add to your work rather than repeating it

How will my research data be used? This will depend on the type of data and any requirements, or restrictions, placed on you by funders, ethical or commercial considerations.  UK funding councils and others are increasingly requiring details of how and where data will be shared, while acknowledging some limitations need to be imposed for reasons of commercial interest or confidentiality.

Three kinds of data reuse

1. Author consultation and reuse: the data must be meaningfully named and located so that the originator of the data can find and use it on any future occasion. Use the document properties tools to describe Microsoft Office files.

2. Non-author consultation: for other researchers to access your work, the metadata must be consistent and discoverable, and assigned according to international standards where these exist, for example, Dublin Core or Data Documentation Initiative. Allowing others to see your work gives credit to you, your research team, and your institution.

3. Non-author reuse: the most open form of reuse, enabling other researchers to replicate, develop or enhance your data in their own research. This is increasingly required by funders, and means that the data must be completely and consistently described. For example, the OECD requires publicly funded data to be openly available to the scientific community.

See the Research Councils UK Common Principles on Data Policy.

For further guidance see our section on Funder Expectations.


Metadata are a subset of core data documentation, which provides standardised, structured information that explains:

  • the origin
  • purpose
  • time references
  • geographic location
  • creator
  • access conditions
  • terms of use

of a data collection (UK Data Archive).

The detail and range of the metadata for any research file is in part dependent on the subject, format, and intended reuse:
  • The creation of metadata for the various elements of a project, and for the project as a whole is essential - there must be evidence that the project data is both findable and usable
  • The simplest form of metadata is assigned through meaningful filenames and use of the document properties and tag option in programs such as Word and Excel.
  • At the file level, metadata must include a comprehensive description that enables replication: this varies between disciplines and file type: see the comprehensive overview from the MRC
  • At the resource level, metadata is required for linked files that form part of a complete project, which requires an additional level of metadata: a general overview is available from the Archaeology Data Service
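The seven elements in the UK Data Archive definition above could be recorded in a small machine-readable file kept alongside the data. The following is a minimal sketch only: the class name `DatasetMetadata`, the field names and the example values are all hypothetical, and this layout is not Dublin Core or any other formal standard.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DatasetMetadata:
    """One record per data collection; fields mirror the list above."""
    origin: str
    purpose: str
    time_references: str
    geographic_location: str
    creator: str
    access_conditions: str
    terms_of_use: str


def to_json(record: DatasetMetadata) -> str:
    """Serialise the record as indented JSON, an open text format."""
    return json.dumps(asdict(record), indent=2)


# Hypothetical example values, for illustration only.
example = DatasetMetadata(
    origin="Survey responses collected by the project team",
    purpose="Pilot study",
    time_references="2008-04 to 2008-06",
    geographic_location="Southampton, UK",
    creator="A. Researcher",
    access_conditions="Available on request",
    terms_of_use="Cite the dataset when reusing it",
)
print(to_json(example))
```

A discipline-specific standard, where one exists, should take precedence over an ad hoc layout like this one.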

Why do you need metadata?

Creating metadata is good research practice and enables you to keep track of your own work.  Depositing your metadata with your data will also enable others to discover and understand your data.

For further resources see our section on Metadata Standards and Training




Files are commonly saved within a folder structure.  You should consider whether one big, flat folder for all your files or a hierarchical tree structure would be the most appropriate for the piece of work or project you are doing.  A complex structure can encourage the use of shorter, less meaningful file names that are dependent on that structure.  This may mean that when the folder structure is removed, for example when you provide your data to a collaborator, the file names have little or no meaning. To avoid this, try to use names that match your environment and contain:

  1. Something meaningful to you (such as what you are doing with the file)
  2. Something meaningful to someone else (such as an experiment number or project name)

Develop a system for file naming that works for your project or work, use it consistently and make sure it is part of the assigned metadata. The UK Data Archive has a useful guide: Format your data.

Ideally all data items related to a project, with associated metadata, should be grouped, and deposited with a summary of contents and relationships, itself in an appropriate open format.  You may want to consider using a database or spreadsheet to track data. Data analysis software such as NVivo can be used to describe and document data.
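One way to produce a summary of contents in an open format is a CSV manifest generated from the project folder. The sketch below is an illustration under stated assumptions: the function name `write_manifest`, the column names and the empty `description` column (to be filled in by hand) are hypothetical choices, not a University requirement.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path


def write_manifest(data_dir: Path, manifest: Path) -> int:
    """List every file under data_dir in a CSV manifest; return the row count."""
    rows = []
    for path in sorted(data_dir.rglob("*")):
        # Skip folders, and skip the manifest itself if it lives in data_dir.
        if path.is_file() and path.resolve() != manifest.resolve():
            stat = path.stat()
            modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
            rows.append([
                path.relative_to(data_dir).as_posix(),  # path relative to the project
                stat.st_size,                            # size in bytes
                modified.date().isoformat(),             # last modified (UTC date)
                "",                                      # description, filled in by hand
            ])
    with manifest.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["file", "size_bytes", "modified", "description"])
        writer.writerows(rows)
    return len(rows)
```

Calling `write_manifest(Path("my_project"), Path("my_project/manifest.csv"))` (a hypothetical folder) would list each file once, with a blank description ready to be completed.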

Tips for filenames

Best practice, with examples:

  • Limit file names to 32 characters
  • Don't use special characters or spaces
      NO   name&date@location.txt
      NO   name-datelocation.txt
      NO   VI .2.txt
      YES  name_date_location.txt
  • Use versioning
      NO   ProjID_latest.txt
      YES  ProjID_v02.txt
  • Use leading zeros in sequential numbering to allow for multi-digit versions
      For a sequence of 1-10: 01-10
      For a sequence of 1-100: 001-010-100
      NO   ProjID_1.csv
      YES  ProjID_01.csv
  • Don't use generic data file names that may conflict when moved from one location to another
      NO   MyData.csv
      YES  ProjID_date.csv
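The tips above can be checked automatically before files are shared. The following is a rough sketch, not an official tool: the function names, the set of "generic" names and the decision to allow letters, digits, underscores, hyphens and dots are assumptions made for illustration.

```python
import re
from pathlib import PurePath

MAX_LENGTH = 32  # limit taken from the tips above
# Assumption: letters, digits, underscore, hyphen and dot count as safe.
SAFE_NAME = re.compile(r"^[A-Za-z0-9._-]+$")
# Illustrative list only; extend it to suit your project.
GENERIC_STEMS = {"mydata", "data", "untitled", "new"}


def check_filename(name: str) -> list[str]:
    """Return a list of problems found in a file name (empty if none)."""
    problems = []
    if len(name) > MAX_LENGTH:
        problems.append("longer than 32 characters")
    if not SAFE_NAME.match(name):
        problems.append("contains spaces or special characters")
    stem = PurePath(name).stem
    if stem.lower() in GENERIC_STEMS:
        problems.append("generic name")
    if re.search(r"latest|final", stem, re.IGNORECASE):
        problems.append("use a version number instead of 'latest' or 'final'")
    return problems


def numbered_name(prefix: str, number: int, total: int, ext: str) -> str:
    """Build a name with enough leading zeros for a sequence of `total` items."""
    width = max(2, len(str(total)))
    return f"{prefix}_{number:0{width}d}.{ext}"


for candidate in ["name&date@location.txt", "name_date_location.txt", "MyData.csv"]:
    print(candidate, check_filename(candidate))
```

`numbered_name("ProjID", 1, 10, "csv")` gives `ProjID_01.csv`, matching the leading-zero example in the tips.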




Software solutions

Developers use version and revision control software to maintain current and historical versions of files such as source code, web pages, and documentation, but it can be used for any sort of digital file.

Git is a free and open source distributed version control system.  The University runs the University of Southampton Git Service.  You can use it to keep private versions of your data and files within the University.

The University also supports: 

Contact Serviceline or your iSolutions Business Relationship Manager for further details.

Including version information in filenames

This can be done in any of the following ways:

  • the date recorded in the file name or within the file, for example HealthTest-2008-04-06
  • version numbering in the file name, for example HealthTest-00-02 or HealthTest_v2. A version table can pair each filename with a description of the change, such as: original document; minor revisions made; further minor revisions; substantive changes

  • a file history, version control table or notes included within a file, where versions, dates, authors and details of changes to the file are recorded
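The date and version-number conventions above can be generated consistently with a few helper functions. This is a sketch under assumptions: the function names are hypothetical, the file extension is added for illustration, and reading `00-02` as a two-part number (substantive changes, then minor revisions) is an interpretation of the HealthTest example rather than a stated rule.

```python
from datetime import date


def dated_name(stem: str, when: date, ext: str) -> str:
    """Append an ISO date, e.g. HealthTest-2008-04-06.csv."""
    return f"{stem}-{when.isoformat()}.{ext}"


def versioned_name(stem: str, major: int, minor: int, ext: str) -> str:
    """Append a zero-padded two-part version, e.g. HealthTest-00-02.csv."""
    return f"{stem}-{major:02d}-{minor:02d}.{ext}"


def next_version(major: int, minor: int, substantive: bool) -> tuple[int, int]:
    """Bump the first number for substantive changes, otherwise the second."""
    return (major + 1, 0) if substantive else (major, minor + 1)


print(dated_name("HealthTest", date(2008, 4, 6), "csv"))
print(versioned_name("HealthTest", 0, 2, "csv"))
print(versioned_name("HealthTest", *next_version(0, 2, substantive=True), "csv"))
```

ISO dates (year, month, day) have the side benefit that an alphabetical listing is also chronological.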

Further information

See  Version control and authenticity - UK Data Service




Data Security covers not just data containing personal information, but also data which may be commercially or otherwise ethically sensitive.

Even sensitive research data can often be shared legally and ethically by using informed consent, anonymisation and controlled access. In order to be able to do this it is important to consider potential data sharing and re-use scenarios well before the ethics process and data collection.

Be explicit in your consent forms about your plans to make data available, who will be able to access the data, and how the data would be accessed and potentially re-used.

For more guidance see the Sensitive Data and Research Data and GDPR pages.

At the University, the Information Security team in iSolutions can provide guidance and help. They can be contacted via


Many of the techniques for dealing with sensitive data involve some form of encryption. Encryption obfuscates the data so that only those with the correct decryption key or password are able to read them. The strength of encryption refers to how difficult it would be for an attacker to decrypt the data without knowing the key in advance, and this depends on both the method and the key used.

The tool you use for encryption should inform you of the method it will use and may give you a choice. The Information Commissioner's Office currently recommends using the AES-128 or AES-256 encryption methods, of which the latter is stronger.

Whenever you set the key to be used by an encryption method, be sure to use a strong password. You must keep the key safe: if it is lost the data will be unrecoverable, and if it is leaked the encryption will cease to offer protection.
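Encryption tools usually turn a password into a fixed-length key internally. The sketch below illustrates that step only, using PBKDF2 from the Python standard library to derive a 256-bit key of the size AES-256 uses; it does not encrypt anything, and the iteration count, salt length and function names are assumptions chosen for illustration. Use an established, approved tool for real data.

```python
import hashlib
import secrets

KEY_BYTES = 32  # 256 bits, the key size used by AES-256


def new_salt() -> bytes:
    """Random salt; it is stored with the data and does not need to be secret."""
    return secrets.token_bytes(16)


def derive_key(password: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 256-bit key from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=KEY_BYTES
    )


salt = new_salt()
key = derive_key("a long passphrase known only to you", salt)
print(len(key) * 8, "bit key derived")
```

The same password and salt always give the same key, which is why losing the password makes the data unrecoverable, and why a weak password undermines even a strong method.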

For more information about encryption contact InfoSec via



The format of research data varies between disciplines:

The Digital Curation Centre has an excellent guide to disciplinary metadata  and tools for implementation. This includes examples of good practice.

Other useful resources:




Parts of this guide on Working with Research Data are based on MIT Libraries Data Management, CC BY