Sensitive research data usually includes one or more of the following:
If you are working with sensitive data, you need to take extra precautions to ensure the data can only be viewed by those with permission to do so. These may include encryption or other special measures when storing, transferring and disposing of data.
Whilst adopting a proportionate, risk-based approach, the entire lifecycle of the research information needs to be considered, from creation to destruction. Minimum controls for highly restricted information to remain secure include user access controls, encryption, identifying and guaranteeing the location of the information, and legitimate sharing under appropriate contracts.
The key actions to reduce your risk are:
You left your laptop on a train or a bag on the bus. Your laptop had the DNA profiles from the participants in your research project, or the bag was full of consent forms. Accidents happen, but the penalties increased substantially when the EU General Data Protection Regulation (GDPR) was incorporated into UK law by the Data Protection Act 2018 in May 2018.
DPA (2018) covers all forms of personal data including genomic and some anonymised data.
It is possible to restrict access to folders on the University's research filestore, so that only certain individuals or groups are allowed to view and edit the contents. A typical configuration for project folders is to allow access only to members of the project team, but it is also possible to set up folders within the project folder that are restricted to fewer users. For more information contact your Faculty's Business Relationship Managers (BRMs).
You can share data stored on your Office 365 OneDrive for Business; however, for data containing direct or indirect personal identifiers, we recommend you use the Research Filestore. See the iSolutions website for more information about Office 365 and guidance on how to share files safely.
While external services such as Dropbox, Google Drive and OneDrive are convenient, they do not comply fully with the University's data policies due to the following issues:
External cloud-based solutions should therefore be avoided for sensitive data. If you are considering using external storage providers nevertheless, perhaps because of conditions imposed by external collaborators, you must only consider those which will allow you to take the following security measures:
Anonymisation is the complete and irreversible removal of any information that could lead to an individual being identified, either from the removed information itself or this information combined with other data held by the University. Once data is truly anonymised and individuals are no longer identifiable, the data will not fall within the scope of the DPA (2018) and GDPR and it becomes easier to use.
Full Anonymisation is the process of removing personal identifiers, both direct and indirect. An individual may be directly identified from their name, address, postcode, telephone number, photograph or image, or some other unique personal characteristic (direct identifiers). An individual may be indirectly identifiable when certain information is linked together with other sources of information, including, their place of work, job title, salary, their postcode or even the fact that they have a particular diagnosis or condition (indirect identifiers).
Full anonymisation is often difficult to attain. In most cases the information can only be partially anonymised and therefore will still be subject to data protection legislation. If you can't fully anonymise the information, it is still good practice to partially anonymise it as this limits the ability to identify people.
Much of what may have been considered anonymised data 20 years ago would now be defined as pseudonymised data due to the increased ability for data-linking, where two or more data sources can be combined to re-identify individuals.
Full anonymisation is often difficult to attain and, for research, often not desirable. In most cases the information can only be partially anonymised or pseudonymised and therefore will still be subject to data protection legislation. Pseudonymisation is defined within the GDPR as “the processing of personal data in such a way that the data can no longer be attributed to a specific data subject without the use of additional information, as long as such additional information is kept separately and subject to technical and organizational measures to ensure non-attribution to an identified or identifiable individual” (Article 4(3b)).
Unlike full anonymisation, pseudonymisation will not exempt researchers from the DPA (2018) altogether. It does however help the University meet its data protection obligations, particularly the principles of ‘data minimisation’ and ‘storage limitation’ (Articles 5(1)(c) and 5(1)(e)), and processing for research purposes for which ‘appropriate safeguards’ are required.
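As an illustration of the principle, pseudonymisation can be as simple as swapping a direct identifier for a random code and keeping the code-to-identifier key in a separate, more tightly controlled location. The sketch below (in Python, with made-up field names) is not a University-endorsed tool, just a minimal demonstration of the idea:

```python
import secrets

def pseudonymise(records, id_field="name"):
    """Replace a direct identifier with a random code.

    Returns the pseudonymised records plus the code-to-identifier key,
    which must be stored separately (and securely) from the data.
    """
    key = {}
    out = []
    for record in records:
        code = "P" + secrets.token_hex(4)  # e.g. 'P9f1c2ab3'
        key[code] = record[id_field]
        new_record = dict(record)          # leave the original untouched
        new_record[id_field] = code
        out.append(new_record)
    return out, key

# Hypothetical interview metadata containing a direct identifier
records = [{"name": "Alice Smith", "age": "42"},
           {"name": "Bob Jones", "age": "57"}]
pseudo, key = pseudonymise(records)
```

Because the key file allows re-identification, the data only counts as pseudonymised (not anonymised) while that key exists anywhere.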
Where 'de-identified' or pseudonymised data is in use, there is a residual risk of re-identification; the motivated intruder test can be used to assess the likelihood of this. Once assessed, a decision can be made on whether further steps to de-identify the data are necessary. By applying this test and documenting the decisions, the study will have evidence that the risk of disclosure has been properly considered; this may be a requirement if the study is audited.
This involves considering whether an ‘intruder’ would be able to achieve re-identification if motivated to attempt this.
The ‘motivated intruder’ is taken to be a person who starts without any prior knowledge but who wishes to identify the individual from whose personal data the anonymised data has been derived. This test is meant to assess whether the motivated intruder would be successful.
In practice this includes but is not limited to:
The ‘motivated intruder’ is not assumed to have any specialist knowledge such as computer hacking skills, or to have access to specialist equipment or to resort to criminality such as burglary, to gain access to data that is kept securely.
UK Data archive text anonymisation helper tool (downloads a zip file) This tool can help you find disclosive information to remove or pseudonymise in qualitative data files. The tool does not anonymise or make changes to data, but uses MS Word macros to find and highlight numbers and words starting with capital letters in text. Numbers and capitalised words are often disclosive, e.g. as names, companies, birth dates, addresses, educational institutions and countries.
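The tool's approach — flagging numbers and capitalised words for a human to review — can be sketched in a few lines of Python. This is only an illustration of the same idea, not a replacement for the UKDA macro; like the macro, it highlights candidates rather than anonymising anything:

```python
import re

def flag_disclosive(text):
    """Flag candidate disclosive tokens: numbers and capitalised words.

    Only highlights candidates for a human to review; sentence-initial
    words are flagged too, just as in the Word-macro approach.
    """
    pattern = re.compile(r"\b(?:\d[\d/.-]*|[A-Z][a-zA-Z'-]*)\b")
    return pattern.findall(text)

# Invented example sentence
sample = "Interviewed Priya at Acme Ltd in Southampton on 12/03/2017."
tokens = flag_disclosive(sample)
```

A reviewer would then decide, token by token, whether to remove, generalise, or pseudonymise each flagged item.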
If you want to share data with external collaborators, even if they are part of the same research project, you must have a data sharing agreement in place. Contact firstname.lastname@example.org for more information. When you share the data with others, they will be data processors, but the University will still be the data controller and therefore responsible for how the data is used.
Extra precautions need to be taken when transferring sensitive data between collaborators:
Data that is to be published should have all direct identifiers removed, those include:
Data for open publication should also not have two or more indirect identifiers (listed below) as that can lead to re-identification through a process called 'triangulation'. You should remove or modify one or more of the indirect identifiers until the risk of re-identification is negligible. If you are unsure or require more advice, please contact email@example.com. Indirect identifiers include:
(List courtesy of University of Bristol (2017), Sharing Data Concerning Human Participants guide)
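One practical way to gauge triangulation risk is to count how many records share each combination of indirect identifiers: any record whose combination is unique in the dataset is a candidate for re-identification. A minimal sketch in Python, using invented field names:

```python
from collections import Counter

def unique_combinations(records, quasi_identifiers):
    """Return records whose combination of indirect (quasi-)identifiers
    is unique in the dataset, i.e. candidates for re-identification
    by triangulation."""
    combos = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return [r for r in records
            if combos[tuple(r[q] for q in quasi_identifiers)] == 1]

# Invented example: two lecturers share a combination, the surgeon does not
records = [
    {"job": "lecturer", "postcode_area": "SO17", "condition": "asthma"},
    {"job": "lecturer", "postcode_area": "SO17", "condition": "none"},
    {"job": "surgeon",  "postcode_area": "SO16", "condition": "none"},
]
at_risk = unique_combinations(records, ["job", "postcode_area"])
```

Records returned by such a check would need their indirect identifiers removed, generalised (e.g. coarser postcode or job category), or aggregated before open publication.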
The more that anonymised data is aggregated and non-linkable, the more feasible it is to publish it. However, this may remove valuable information from the data. Pseudonymised data is often valuable to researchers because of the granularity it affords, but carries a higher risk of re-identification. Instead of making this data openly available, it may be preferable to release the data, on request, to other bona fide researchers using non-disclosure data sharing agreements. This allows more data to be disclosed than is possible with wider or public disclosure. Information security controls still need to be in place and managed. For more information contact firstname.lastname@example.org.
Many of the techniques for dealing with sensitive data involve some form of encryption. Encryption obfuscates the data so that only those with the correct decryption key or password are able to read them. The strength of encryption refers to how difficult it would be for an attacker to decrypt the data without knowing the key in advance, and this depends on both the method and the key used.
The tool you use for encryption should inform you of the method it will use and may give you a choice. The Information Commissioner's Office currently recommends using the AES-128 or AES-256 encryption methods, of which the latter is stronger.
Whenever setting the key to be used by an encryption method, be sure to use a strong password. You must keep the key safe, as if it is lost the data will be unrecoverable, and conversely if it is leaked the encryption will cease to offer protection.
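Most encryption tools derive the actual AES key from your password, so a weak password undermines AES-256 regardless of the algorithm's strength. As an illustration (not the method used by any particular University tool), a 256-bit key can be derived from a passphrase with Python's standard library:

```python
import hashlib
import os

def derive_key(password, salt=None):
    """Derive a 256-bit key from a password using PBKDF2-HMAC-SHA256.

    The random salt is not secret and is stored alongside the
    ciphertext; the password itself must never be stored with the data.
    """
    salt = salt or os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return key, salt

# Use a long, strong passphrase; the example below is deliberately famous
key, salt = derive_key("correct horse battery staple")
```

The many hashing iterations deliberately slow down password guessing; losing the passphrase makes the key, and hence the encrypted data, unrecoverable.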
For more information about encryption contact InfoSec via email@example.com
See the Research Data Management: Destruction webpage for more information on how to securely destroy electronic and printed data.
A data protection impact assessment (DPIA) is a process to help identify and minimise the data protection risks of a project.
You must do a DPIA for certain listed types of processing, or any other processing that is likely to result in a high risk to individuals’ interests.
It is also good practice to complete a DPIA for any other major project which will require the processing of personal data.
Under the Data Protection Act 2018, a DPIA (the new term for a Privacy Impact Assessment) is compulsory for any project that is likely to be 'high risk' to the rights and freedoms of individuals. The GDPR does not define what high risk is, however examples include 'large-scale' processing, so it is likely that a DPIA will be required for some research projects.
Even sensitive research data can often be shared legally and ethically by using informed consent, anonymisation and controlled access. In order to be able to do this it is important to consider potential data sharing and re-use scenarios well before the ethics process and data collection. Be explicit in your consent forms and PIS about your plans to make data available, who will be able to access the data, and how the data would be accessed and potentially re-used.
You should complete an Initial Data Protection Review (serviceline form) and you may also need to undertake a full Data Protection Impact Assessment. You can find guidance on this process on the Information Governance & Data Protection sharepoint site.
Personal data means any information relating to an identifiable person who can be directly or indirectly identified, in particular by reference to an identifier such as a name, identification number, location data or online identifier. Personal data that has been pseudonymised (e.g. key-coded) can fall within the scope of the GDPR depending on how difficult it is to attribute the pseudonym to a particular individual.
Sensitive personal data: data that consists of information about racial or ethnic origin, political opinions, religious beliefs or beliefs of a similar nature, physical or mental health or condition, sexual life, the commission or alleged commission of an offence by the data subject and any related court proceedings, and trade union membership. It also includes genetic data (i.e. inherited or acquired genetic characteristics, e.g. blood type) and biometric data (e.g. fingerprints) where processed to uniquely identify an individual. It has some different grounds for processing and requires the explicit consent of the data subject for its collection.
For more information and links to the relevant provisions in the GDPR, see ICO's GDPR Key definitions: What information does the GDPR apply to?.
All EU Member States have the ability to provide exemptions to the GDPR for data processing "for archiving purposes and for scientific or historical research and statistical purposes". This exemption has been brought into UK law through the Data Protection Act 2018. However, just because there is an exemption on processing and archiving data for research does not mean you should not handle personal data carefully and in accordance with the GDPR.
Under the old Data Protection Act, the ICO advised that it was good practice to undertake a Privacy Impact Assessment. Under GDPR, a Data Protection Impact Assessment ('DPIA') (the new term for a Privacy Impact Assessment) is compulsory for any project that is likely to be 'high risk' to the rights and freedoms of individuals. The GDPR does not define what high risk is, however examples include 'large-scale' processing, so it is likely that a DPIA will be required for some research projects. The ICO has created a series of DPIA checklists that you should use if you are unsure if you will need a DPIA. Even if a DPIA is not necessary, you need to be able to demonstrate that you have proactively addressed data protection implications in your research to comply with the GDPR's requirements for accountability and privacy by design.
If your data includes personal information, even if you started before 25 May 2018, it is necessary to comply with GDPR. You will need to decide if your data are "personal" (which means that they are identifiable in some way, e.g. name, postcode, cookies), and if there is appropriate consent or other legal basis (usually a 'task in the public interest') in place for you to collect, store and analyse the data.
For research undertaken at the University, the legal basis is likely to be a 'task in the public interest'. If you have already gone through Ethics and secured consent for participation, you are unlikely to need to re-consent as this consent is for participation and not for data handling. However you are likely to need to issue a Transparency Notice to your participants. The Research Integrity and Governance Office is actively contacting PIs on a rolling programme from high risk to lower risk studies regarding transparency notices and participant information sheets.
Please note: Health and social care research data have specific requirements that are not covered by GDPR.
GDPR means that you must
You need a legal basis (usually a "task in the public interest" or consent) to process personal data (e.g. name, postcode, cookies) and an additional legal basis to process special categories of personal data, as well as being able to show that additional legal requirements such as fairness and transparency are being met. For research undertaken at the University, the legal basis is likely to be a 'task in the public interest'.
Article 5 (e) of the GDPR states that personal data shall be kept for no longer than is necessary for the purposes for which it is being processed. There are some circumstances where personal data may be stored for longer periods (e.g. archiving purposes in the public interest, scientific or historical research purposes). Recital 39 of the GDPR states that the period for which the personal data is stored should be limited to a strict minimum and that time limits should be established by the data controller for deletion of the records or for a periodic review. Organisations must therefore ensure personal data is securely disposed of when no longer needed, reducing the risk that it will become inaccurate, out of date or irrelevant. For research purposes, once the project is complete, the data should be deleted, unless there is a requirement to archive the data for future reference and validation, in which case the organisation should undertake to periodically check that the data are still required, bearing in mind that they should be accurate, in date and relevant. As the University has a Research Data Management Policy which requires significant research data to be held for at least 10 years, that is the minimum period for which you should archive your significant data.
In short, yes, GDPR does apply if data is collected and processed overseas because the University, as the Data Controller for all research undertaken by University staff, is based in the EU. Article 3 of the GDPR sets the territorial scope of the Regulation to apply to both:
Research data collected by you while you are a member of staff at the University is owned by the University unless otherwise stated in collaboration agreements. See the University's Research Data Management Policy for more information.
To be drafted.
The Data Controller is the organisation that determines the purposes for and the manner in which any personal data is processed, in this case, the University. As a PI you may be taking these decisions on behalf of the University, but the University, rather than you personally, is the Data Controller. Research students or other non-employees who do the processing of personal data on behalf of the data controller are data processors.
If your data are personal data, even if you started before 25 May 2018, it is necessary to comply with GDPR. Personal data are identifiable in some way (e.g. name, postcode, cookies), and you need appropriate consent or another legal basis (likely to be a 'task in the public interest' for University-based research) in place for you to collect, store and analyse the data. You should liaise with the Research Governance Manager and the University Data Protection Officer to ensure that you are compliant with GDPR. Health and social care research data have specific requirements that are not covered by GDPR.
Fully anonymised data is not covered by GDPR. However, it can be complicated to fully anonymise data and doing so may reduce the re-use potential of your research data.
Anonymisation applies to both direct and indirect identifiers. Direct identifiers like name, address, or telephone numbers specify an individual. Indirect identifiers could also reveal an individual when pieced together, for example, cross-referencing occupation, employer, and location. You should be aware that even if you have only one or two indirect identifiers left in your data, they could still be linked to other data sources to allow re-identification. See Research Data Management: Sensitive Data for further information on removing identifiers to share data for publication.
For more information on anonymisation, see:
You must contact firstname.lastname@example.org as soon as possible. Do not delay and do not spend time trying to find the data: email email@example.com or use the online form as soon as you suspect the data loss may have happened. Alternatively, telephone +44(0)23 8059 4684 during office hours or Security +44(0)23 8059 2811 x22811 outside office hours.
The Information Security team have developed a RAG data classification tool. Contact the Information Security team for more guidance; send an email to firstname.lastname@example.org marked for the attention of InfoSec.
Raw data containing personal information should be stored on University servers that require a login with a University username and password, rather than being held locally on laptops or other devices. Raw data should therefore only be stored on the Research Filestore.
Local copies of data should only be held on laptops or other devices after the raw data has been processed to remove personal identifiers.
Data can be encrypted using various software. Contact the Information Security team for more guidance; send an email to email@example.com marked for the attention of InfoSec.
See the Research Data Management: Sensitive Data guide for more information.
The Library can advise on drafting Data Management Plans and have advice on the RDM webpages. You can contact them on firstname.lastname@example.org. Your Faculty Business Relationship Manager can help you with storage requirements for your project, and iSolutions Information Security team gives advice on how best to encrypt data.
Any personal data you hold on paper must be held securely in locked cabinets in lockable rooms. If possible, scan the paper copies and save them securely on the University network, and destroy the paper originals.
If you work in a shared office, do not leave papers containing personal data on your desk when stepping away from your desk, even if you are just popping to the loo. Either dispose in the confidential waste or lock them away in your desk drawers or cabinet.
Contact the Information Security team for more guidance; send an email to email@example.com marked for the attention of InfoSec.
Personal data should not be stored on removable devices if at all possible. If you have to collect data on a laptop, the data and the laptop should be encrypted. When undertaking international travel, ensure any devices are in hand luggage.
Windows laptops should be UoS supplied, UoS build and regularly plugged in to the network to receive updates. Windows laptops use BitLocker.
Apple Mac devices are not supplied with encryption (‘FileVault’) enabled by default, as there is no UoS infrastructure available to manage and store encryption keys. You will need to enable this yourself.
When any other Apple device is issued it is provided with only factory default security settings. In order to make these as secure as possible, please undertake the following steps:
For further guidance, contact the Information Security team; send an email to firstname.lastname@example.org marked for the attention of InfoSec.
You should ensure the personal data is stored securely, following the same guidelines as for new data. Electronic data should be stored on University servers and paper data should be kept in secure, locked cabinets in locked offices. See the Research Data Management: Sensitive Data guide for more information.
For more information see:
There is no limit on how far back you need to go. Before GDPR, you should have been holding personal data securely in accordance with the Data Protection Act.
If you no longer need the personal data that you hold, it should be destroyed (see the Research Data Management: Destruction guide for more information on how to do this securely). Do remember that for some types of research you may need to keep consent records for longer than the research project lasted; contact RIG for guidance. Also consider that the University's Research Data Management Policy requires significant research data to be kept for at least 10 years; ideally these data would be anonymised to a level that would allow sharing, either openly or on request from bona fide researchers.
If at all possible, you should avoid saving or moving raw data on USB sticks but if you have to, you should only use encrypted USB sticks such as the Integra Crypto drive. Only use removable storage supplied by iSolutions: Staff equipment and purchasing.
Under GDPR, you will need: a legal basis to process personal data (e.g. name, postcode, cookies), an additional legal basis to process special category personal data, and to ensure that all additional legal requirements are met (e.g. the need to be fair and transparent, and to comply with the common law duty of confidence). Under the new law, the most relevant legal basis for researchers processing personal data for university research will usually be ‘processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller’. The justification for this should be internally documented by reference to the public research purpose as established by statute or an alternative, e.g. the University Charter.
Consent is also a legal basis for processing and has very specific requirements under GDPR. If you use consent as your legal basis, you may need to re-consent to be GDPR compliant if your research started before 25 May 2018; alternatively you may be able to establish an alternative legal basis to proceed. You should liaise with the Research Governance Manager and the University Data Protection Officer for further information.
Consent forms must be stored in a secure location. At the very least you should store forms in a lockable cabinet in a lockable office. If you are able to scan the consent forms, you should dispose of the paper forms in confidential waste or shred them using a cross-cut shredder (see Research Data Management: Destruction (paper & electronic) for more information).
To be drafted.
Anonymisation is a trade-off against utilisation. What might be a reasonable level of anonymisation in a secure setting due to the high need for utilisation of the data, might not be suitable in an open setting due to the higher risk of re-identification.
"It can be impossible to assess re-identification risk with absolute certainty. [...] The risk of re-identification through data linkage is essentially unpredictable because it can never be assessed with certainty what data is already available or what data may be released in the future. It is also generally unfeasible to see data return (ie recalling data or removing it from a website) as a safeguard given the difficulty, or impossibility, of securing the deletion or removal of data once it has been published. That is why it is so important to take great care, and to carry out as thorough a risk analysis as is possible, at the initial stage of producing and disclosing anonymised data." ICO (2012), Anonymisation: managing data protection risk code of practice
"De-identification – refers to a process of removing or masking direct identifiers in personal data such as a person’s name, address, NHS or other unique number associated with them. De-identification includes pseudonymisation.
"Anonymisation – refers to a process of ensuring that the risk of somebody being identified in the data is negligible. This invariably involves doing more than simply de-identifying the data, and often requires that data be further altered or masked in some way in order to prevent statistical linkage.
"[For] both processes (i.e. de-identification and anonymisation) the purpose is to make re-identification more difficult. Both deidentification and anonymisation are potentially reversible; the data environment in which you share or release data is of critical importance in determining reversibility. In other words, the data environment can either support or constrain reversibility which means you need to think very carefully about the environment in which you share or release data. For example, it may be entirely appropriate to release deidentified data in a highly controlled environment such as a secure data lab but not at all appropriate to release them more openly, for example by publishing them on the Internet.
Re-identification might occur:
From: UKAN (2016), The Anonymisation Decision-Making Framework , pp.15-16
If you want to share data with external collaborators, even if they are part of the same research project, you should have a data sharing agreement in place. Contact email@example.com for more information.
The safest and simplest way to share documents and data with collaborators external to the University so you can all edit the material, is to use a University SharePoint site.
If you are just going to be sharing data with members of the University, you can also request space on the Research drive, contact your iSolutions BRM to set this up.
If you need to send data to a collaborator, internally or externally, you can also use the Safesend service. The service allows you to easily move files of up to 50GB in and out of the University. All files are transferred across the network securely encrypted. Safesend is not a cloud service; everything is stored on equipment directly owned by the University, and managed by its own IT staff. All access to data is very tightly and strictly controlled by the University; all accesses to data on Safesend are logged and can be easily checked if you are ever concerned that a third party might have gained access to your data. Files are automatically deleted from Safesend 32 days after you upload them. No backups are taken of the uploaded data (it is only a transitory stopping point), so after an uploaded file has been deleted, there is no way of recovering it.
If at all possible, you should avoid saving or moving raw data containing direct and indirect identifiers on USB sticks but if you have to, you should only use encrypted USB sticks such as the Integra Crypto drive. Only use removable storage supplied by iSolutions, see: Staff equipment and purchasing
Research data you collect when employed by the University is owned by the University unless other data sharing or collaboration agreements apply to your research. A copy of the data must be left at the University when you leave. Depending on your actual research, you may be able to take a copy of some of the data with you. Contact the Legal Services IP service for a response to your specific situation.
If you use data owned by a third party (copyright material, software or database), you need to understand the terms under which these are obtained and the scope of use. It is necessary to obtain permission from the data owner for re-use of such material, unless conditions of re-use have been explicitly indicated, for example, with a Creative Commons licence. It is your responsibility to ensure you comply with the terms that apply. RIS can assist with drafting/reviewing of data sharing agreements/T&Cs. In most instances, these are not negotiable. However, it may be possible to seek specific use terms or negotiate different licensing arrangements more appropriate to your specific requirements. It may be that in some circumstances, a commercial licence offers more freedom-to-operate than provisions for academic purposes.
Personal data transfers will continue to need to be approved through ERGO.
If data is from UHS patients collected under UHS sponsored studies, UHS will need to be consulted. RIS can assist with data sharing agreements as required. Generally, if the protocol, consent forms and patient information describe the data transfer in detail, an agreement may not be needed for transfers within the UK.
If you want to share data containing direct or indirect identifiers with external collaborators, even if they are part of the same research project, you should have a data sharing agreement in place (contact firstname.lastname@example.org for more information). The PI or local CoI, as representatives of the University, which is the Data Controller, should ultimately give permission for data to be shared. When you share the data with others, they will be data processors, but the University will still be the data controller and therefore responsible for how the data is used.
At the end of the project, significant research data should be archived, and preferably made openly available, as per the University's Research Data Management policy. In order for data to be shared openly it should be thoroughly anonymised, with direct and indirect identifiers removed or modified. See Research Data Management: Sensitive Data for further information.
International research projects should have data sharing agreements if personal data is going to be transferred between countries. RIS can assist with data sharing agreements as required.
Raw data should be stored on University servers and, if possible, should not be transferred at all. Where transfer is necessary, remove personal data and reduce or anonymise identifiers in the data before transferring it to international collaborators. Data should be encrypted for transfer. The University's drop-off service allows you to transfer files of up to 50GB in size securely. To transfer larger files, contact serviceline for advice.
SharePoint provides a safe way to share documents collaboratively with other researchers both internal and external to the University. You simply need to create a new SharePoint site. For more information see:
Instead of Dropbox (which should never be used for transferring personal data), we recommend that you use the University's Safesend service. The service allows you to easily move files of up to 50GB in and out of the University. All files are transferred across the network securely encrypted. Safesend is not a cloud service; everything is stored on equipment directly owned by the University, and managed by its own IT staff. All access to data is very tightly and strictly controlled by the University; all accesses to data on Safesend are logged and can be easily checked if you are ever concerned that a third party might have gained access to your data. Files are automatically deleted 32 days after you upload them. No backups are taken of the uploaded data (it is only a transitory stopping point), so after an uploaded file has been deleted, there is no way of recovering it. Unlike with Dropbox, once a file is deleted in Safesend, it really is deleted.
GDPR prohibits transfer of personal data outside the EEA unless certain conditions are met:
For practical purposes, this means that if you are collaborating with organisations outside the EEA, you must have a data sharing agreement in place. Contact email@example.com for more information.
If you leave to work at an organisation outside the EEA and take research data with you, this will count as a data transfer.
Many cloud services are based outside the EEA. Even if the service in question has signed up to the EU-US Privacy Shield, it may not be appropriate to use such a service, since the terms and conditions tend to be one-sided, and are unlikely to be sufficient to enable the University to meet all its obligations under the GDPR. When sharing data with collaborators, do not use cloud-based services unless they have been approved by the University, for example Office 365's One Drive for Business.
The University has guidance and resources for staff to help them understand their and the University's responsibilities under GDPR.
UKRI has published guidance for researchers:
The Information Commissioner's Office (ICO) has guidance on how GDPR is being interpreted in the UK:
The University has a process to deal with DPIAs for both research and administrative work within the University. Templates for the DPIA and further information can be found on the Information Governance intranet site. Please direct any DPIA queries to DPIA@soton.ac.uk.