Good research data management doesn’t end with a Data Management Plan or begin when you want to share your data. Thinking about how to organise your data throughout your research project will allow you to comply with funder requirements and make your research more efficient. Data management is an essential building block for good research. It will help you collect the data that underpins your work in a way that allows it to be used with confidence, both now and in the future. Basic principles, such as file naming and version control, will help you locate and understand your data. Ultimately, well-managed data can add to the credit you receive and the impact of your research alongside other research outputs.
See further information if you are working with sensitive data (data that is security or commercially sensitive or contains information on living human subjects).
By describing and documenting your data you will be able to
How will my research data be used? This will depend on the type of data and any requirements, or restrictions, placed on you by funders, ethical or commercial considerations. UK funding councils and others are increasingly requiring details of how and where data will be shared, while acknowledging some limitations need to be imposed for reasons of commercial interest or confidentiality.
Three kinds of data reuse
1. Author consultation and reuse: the data must be meaningfully named and located so that you, the originator of the data, can find and use it on any future occasion.
2. Non-author consultation: for other researchers to access your work, the metadata must be consistent and discoverable, and assigned according to international standards where these exist, for example, Dublin Core or Data Documentation Initiative. Allowing others to see your work gives credit to you, your research team, and your institution.
3. Non-author reuse: the most open form of reuse, enabling other researchers to replicate, develop, or enhance your data in their own research. This is increasingly required by funders, and it means that the data must be completely and consistently described. For example, the OECD requires publicly funded data to be openly available to the scientific community.
See UKRI's Common Principles on Research Data.
For further guidance see our section on Funder Expectations.
Metadata are a subset of core data documentation, which provides standardised structure information that explains:
of a data collection (UK Data Archive).
The detail and range of the metadata for any research file is in part dependent on the subject, format, and intended reuse:
Why do you need metadata?
Creating metadata is good research practice and enables you to keep track of your own work. Depositing your metadata with your data will also enable others to discover and understand your data.
For further resources see Useful links below
Bookmark this page as: https://library.soton.ac.uk/researchdata/description
Files are commonly saved within a folder structure. You should consider whether one big, flat folder for all your files or a hierarchical tree structure would be the most appropriate for the piece of work or project you are doing. A complex structure can encourage the use of shorter, less meaningful file names that are dependent on that structure. This may mean that when the folder structure is removed, for example when you provide your data to a collaborator, the file names have little or no meaning. To avoid this, try to use names that match your environment and contain:
Develop a file-naming system that works for your project or work, use it consistently, and make sure it is part of the assigned metadata. The UK Data Archive has a useful guide: Format your data.
Ideally all data items related to a project, with associated metadata, should be grouped, and deposited with a summary of contents and relationships, itself in an appropriate open format. You may want to consider using a database or spreadsheet to track data. Data analysis software such as NVivo can be used to describe and document data.
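As one way of tracking data in a spreadsheet, the sketch below walks a project folder and writes a CSV inventory with one row per file. It is only an illustration: the function name `build_manifest`, the column names, and the use of a SHA-256 checksum are assumptions made for this example, not a University requirement.

```python
import csv
import hashlib
from pathlib import Path


def build_manifest(data_dir, manifest_path):
    """Write one CSV row per file under data_dir and return the rows."""
    data_dir = Path(data_dir)
    manifest_path = Path(manifest_path)
    rows = []
    for path in sorted(data_dir.rglob("*")):
        # Skip folders, and skip the manifest itself if it lives in data_dir.
        if not path.is_file() or path.resolve() == manifest_path.resolve():
            continue
        # read_bytes() loads the whole file; fine for a sketch, slow for huge files.
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        rows.append({
            "file": path.relative_to(data_dir).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": digest,
            "description": "",  # hypothetical column, to be filled in by hand
        })
    fieldnames = ["file", "bytes", "sha256", "description"]
    with manifest_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

The resulting CSV can be opened in any spreadsheet tool, and the description column gives a place to record what each file contains and how it relates to the others.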
Best Practice | Example |
---|---|
Limit file names to 32 characters | 32CharactersLooksExactlyLikeThis.csv |
Don't use special characters or spaces | NO name&date@location.txt; NO name-date—location.txt; NO name.date V1.2.txt; YES name_date_location.txt |
Use versioning | NO ProjID_latest.txt; YES ProjID_v02.txt |
Use leading zeros in sequential numbering to allow for multi-digit versions. For a sequence of 1-10: 01-10. For a sequence of 1-100: 001-010-100 | NO ProjID_1.csv; YES ProjID_01.csv |
Don't use generic data file names that may conflict when moved from one location to another | NO MyData.csv; YES ProjID_date.csv |
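The practices in the table can be checked automatically. The following is a minimal sketch, not an official tool: the function name `check_filename` and the list of generic names are hypothetical, the 32-character limit is assumed to apply to the name without its extension (as in the table's example), and hyphens are assumed to be acceptable alongside letters, digits, and underscores.

```python
import re

MAX_STEM_LENGTH = 32
# Letters, digits, underscores and hyphens, plus at most one extension.
ALLOWED = re.compile(r"[A-Za-z0-9_-]+(\.[A-Za-z0-9]+)?")
GENERIC_STEMS = {"mydata", "data"}  # hypothetical list of generic names


def check_filename(name):
    """Return a list of problems found in a file name (empty if none)."""
    stem, dot, _ext = name.rpartition(".")
    if not dot:  # no extension present
        stem = name
    problems = []
    if len(stem) > MAX_STEM_LENGTH:
        problems.append("name is longer than 32 characters")
    if not ALLOWED.fullmatch(name):
        problems.append("contains spaces or special characters")
    if "latest" in stem.lower():
        problems.append("use a version number instead of 'latest'")
    if re.search(r"_v?\d$", stem):
        problems.append("pad sequence numbers with leading zeros")
    if stem.lower() in GENERIC_STEMS:
        problems.append("name is too generic")
    return problems
```

For example, `check_filename("name_date_location.txt")` returns an empty list, while `check_filename("name&date@location.txt")` reports the special characters.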
Bookmark this page as: https://library.soton.ac.uk/researchdata/filenaming
Developers use version and revision control software to maintain current and historical versions of files such as source code, web pages, and documentation, but it can be used for any sort of digital file.
Git is a free and open source distributed version control system. The University runs the University of Southampton Git Service. You can use it to keep private versions of your data and files within the University.
The University also supports:
Contact Serviceline for further details.
This can be done in any of the following ways:
Filename | Description |
---|---|
LiteratureReview_1.0 | Original document |
LiteratureReview_1.1 | Minor revisions made |
LiteratureReview_1.2 | Further minor revisions |
LiteratureReview_2.0 | Substantive changes |
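The numbering scheme in the table can be expressed as a small helper. This is a sketch under the assumption that names end in `_major.minor` with no file extension, as in the table; the function name `next_version` and its `substantive` flag are hypothetical.

```python
import re

# Matches names such as LiteratureReview_1.2 (base, major, minor).
VERSIONED = re.compile(r"(?P<base>.+)_(?P<major>\d+)\.(?P<minor>\d+)")


def next_version(name, substantive=False):
    """Return the next versioned name after a minor or substantive change."""
    match = VERSIONED.fullmatch(name)
    if match is None:
        raise ValueError(f"no version suffix found in {name!r}")
    major = int(match["major"])
    minor = int(match["minor"])
    if substantive:
        # Substantive changes start a new major version.
        major, minor = major + 1, 0
    else:
        # Minor revisions only increase the second number.
        minor += 1
    return f"{match['base']}_{major}.{minor}"
```

Calling `next_version("LiteratureReview_1.0")` gives `LiteratureReview_1.1`, and `next_version("LiteratureReview_1.2", substantive=True)` gives `LiteratureReview_2.0`.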
See Version control and authenticity - UK Data Service
Bookmark this page as: https://library.soton.ac.uk/researchdata/versioning
Data Security covers not just data containing personal information, but also data which may be commercially or otherwise ethically sensitive.
Take particular care when working with data away from campus. Please see iSolutions advice on working safely while not on campus.
At the University, the Information Security team in iSolutions can provide guidance and help on data security. They can be contacted via serviceline@soton.ac.uk
See also the Sensitive Data & Data Protection (GDPR) pages.
Many of the techniques for dealing with sensitive data involve some form of encryption. Encryption obfuscates the data so that only those with the correct decryption key or password are able to read them. The strength of encryption refers to how difficult it would be for an attacker to decrypt the data without knowing the key in advance, and this depends on both the method and the key used.
The tool you use for encryption should inform you of the method it will use and may give you a choice. The Information Commissioner's Office currently recommends using the AES-128 or AES-256 encryption methods, of which the latter is stronger.
Whenever setting the key to be used by an encryption method, be sure to use a strong password. You must keep the key safe, as if it is lost the data will be unrecoverable, and conversely if it is leaked the encryption will cease to offer protection.
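As an illustration of what a strong password can look like, the sketch below generates a random one with Python's `secrets` module, which is designed for security-sensitive randomness. The function name `generate_password`, the default length of 20, and the choice of alphabet are assumptions for this example; follow any specific guidance from InfoSec or your encryption tool in preference to it.

```python
import secrets
import string

# Letters and digits only, so the password is easy to type in any tool.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length=20):
    """Return a random password drawn from ALPHABET."""
    if length < 12:
        # Assumed minimum for this sketch; shorter passwords are easier to guess.
        raise ValueError("choose a length of at least 12 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

A generated password still has to be stored safely, for example in a password manager, because losing it makes the encrypted data unrecoverable.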
For more information about encryption contact InfoSec via serviceline@soton.ac.uk
Bookmark this page as https://library.soton.ac.uk/researchdata/security
If you want to share data with external collaborators, even if they are part of the same research project, you must have a data sharing agreement in place. Contact riscontracts@soton.ac.uk for more information. When you share the data with others, they will be data processors, but the University will still be the data controller and therefore responsible for how the data is used.
Extra precautions need to be taken when transferring sensitive data between collaborators:
Bookmark this page as: https://library.soton.ac.uk/researchdata/researchdata/sharing-during-research
The metadata describing your data supports findability, citation and reuse. Rich metadata provides important context for the interpretation of your data and makes it easier for machines to conduct automated analysis. Follow a standard metadata scheme, either a general one such as Dublin Core or one specific to your discipline. The Digital Curation Centre has an excellent disciplinary metadata directory; see also the RDA Metadata Directory and the portal of data standards at FAIRsharing.
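To show what a simple Dublin Core record can look like, here is a minimal sketch that writes a handful of the 15 core Dublin Core elements as XML using the standard `http://purl.org/dc/elements/1.1/` namespace. The function name `dublin_core_record`, the wrapping `metadata` element, and the example values are hypothetical; a repository or discipline-specific scheme may require a different structure.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dublin_core_record(fields):
    """Build an XML string from a dict of Dublin Core element -> value(s)."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # hypothetical wrapper element
    for element, values in fields.items():
        if element not in DC_ELEMENTS:
            raise ValueError(f"{element!r} is not a core Dublin Core element")
        if isinstance(values, str):
            values = [values]  # allow a single value or a list of values
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Example values below are invented for illustration only.
record = dublin_core_record({
    "title": "Example survey dataset",
    "creator": ["Researcher, A.", "Researcher, B."],
    "date": "2024-01-31",
    "format": "text/csv",
})
```

Repeating an element, as with the two creators here, is permitted in Dublin Core, which is why the function accepts a list of values.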
Other useful resources:
Bookmark this page as: https://library.soton.ac.uk/researchdata/disciplines
Bookmark this page as: https://library.soton.ac.uk/researchdata/guides
We can register a DOI for your dataset through DataCite - this gives a persistent link and can make it easier to cite.
For more details see our DOI for data page.
Parts of this guide on Working with Research Data are based on MIT Libraries Data Management (https://libraries.mit.edu/data-management), CC BY, and OpenAIRE, CC BY.