Many funders, publishers and other stakeholders encourage researchers to make their data publicly available, but this is not always possible. Reasons for this may include:
- The data contains personally identifiable information and is subject to data protection regulations
- The data is subject to a non-disclosure agreement with a collaborating partner
- Access to the data is limited due to contractual or licensing agreements with third-party data providers
- The data contains commercially sensitive information
- The dataset includes other kinds of sensitive or confidential information, such as matters of national security or ecological sensitivity
Before sharing sensitive or confidential research data, it is important to check whether there are any ethical or legal obligations or regulations preventing you from making the data available. However, even if openly sharing data is not possible, it may still be possible to share sensitive data if appropriate safeguards are in place. Ethical and legal data sharing can be supported by a combination of obtaining consent for data sharing, data anonymisation and implementing data access controls.
- Informed consent
- IP and copyright ownership
- Anonymisation
- Data access controls
- Sharing data from clinical trials
Researchers should plan early in a project how and under what conditions data will be shared. If the research involves human participants, plans for future data sharing should be addressed during ethics approval.
Informed consent obtained from research participants should cover future use and reuse of the data by others. It is important to explain to participants how their data will be shared both during and after the project. Data sharing must always be consistent with the consent agreements signed by participants.
Intellectual Property Rights, such as copyright and patents, significantly influence how you and others can use your research data. Clarifying these rights at the beginning of the research process is crucial to ensure that data sharing does not infringe the rights of IP owners.
When data is collected or created during multi-partner research projects, multiple parties may hold joint copyright and intellectual property rights. Before sharing such data, it is essential to obtain permission from all IP holders.
Data derived from secondary analysis may contain third-party content. It's important to seek permission from the rights holders, who may impose usage restrictions. Ensure that any shared derived data adheres to the relevant terms of use and clearly communicates any conditions to future users.
Researchers need to ensure that any sharing of personal data conforms to relevant data protection legislation such as the UK Data Protection Act 2018 (DPA 2018) and the General Data Protection Regulation (GDPR). The GDPR includes a research exemption that allows the processing of personal data for research and archiving, provided that appropriate safeguards, such as pseudonymisation and anonymisation, are in place.
Pseudonymisation is the process of replacing identifiable information with pseudonymous identifiers and ensuring that the identifiable information or “key” is kept separate. While individuals may not be identifiable from the pseudonymous data itself, they can be identified by referring to other information held separately, so pseudonymised data should still be treated as personal data.
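The pseudonymisation step described above can be illustrated with a minimal Python sketch. The field names (`name`, `email`, `participant_id`), the `P-` prefix and the `pseudonymise` helper are hypothetical and chosen only for illustration; a real project would follow its own data management plan and store the key securely and separately from the shared data.

```python
import secrets

def pseudonymise(records, id_fields):
    """Replace identifying fields with a random pseudonym.

    Returns (pseudonymised_records, key). The key maps each pseudonym
    back to the original identifying values and must be stored
    separately from the pseudonymised data.
    """
    key = {}      # pseudonym -> original identifying values
    lookup = {}   # original identifying values -> pseudonym
    out = []
    for rec in records:
        identity = tuple(rec[f] for f in id_fields)
        if identity not in lookup:
            # Random token, so the pseudonym cannot be derived from the identity
            pseudonym = "P-" + secrets.token_hex(8)
            lookup[identity] = pseudonym
            key[pseudonym] = dict(zip(id_fields, identity))
        # Keep only the non-identifying fields, plus the pseudonym
        cleaned = {k: v for k, v in rec.items() if k not in id_fields}
        cleaned["participant_id"] = lookup[identity]
        out.append(cleaned)
    return out, key

# Illustrative data only (hypothetical participants)
records = [
    {"name": "Alice Example", "email": "alice@example.org", "score": 42},
    {"name": "Bob Example", "email": "bob@example.org", "score": 37},
    {"name": "Alice Example", "email": "alice@example.org", "score": 45},
]
pseudonymised, key = pseudonymise(records, ["name", "email"])
```

Here the same participant receives the same pseudonym across records, so the data remains analysable. Because the key allows re-identification, the output is still personal data, as noted above.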
Anonymisation is the process of turning personal data into anonymous information so that an individual is no longer identifiable. This process allows sensitive research data to be shared while protecting the privacy of participants and complying with data protection regulations. Data protection regulations do not apply to anonymised data.
The GDPR defines personal data as any information relating to a living person that can be used to identify them, either directly or indirectly. Removing direct identifiers (e.g., name, address, telephone number, image) may not guarantee anonymisation. If a dataset includes indirect identifiers - information that could indirectly identify an individual when combined with other data (e.g., age, occupation, postcode, salary) - it may still be considered personal data.
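One way to see why indirect identifiers matter is to count how many records share each combination of them. The sketch below uses a k-anonymity-style check, a technique this guidance does not itself prescribe; the function names, field names, the threshold `k` and the 10-year age bands are all assumptions made for illustration. Passing such a check does not by itself establish that data is anonymous in the legal sense.

```python
from collections import Counter

def small_groups(records, quasi_identifiers, k=5):
    """Return combinations of indirect identifiers shared by fewer than k records."""
    counts = Counter(
        tuple(rec[q] for q in quasi_identifiers) for rec in records
    )
    return {combo: n for combo, n in counts.items() if n < k}

def age_band(age, width=10):
    """Generalise an exact age into a band, e.g. 34 becomes '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"

# Illustrative data only (hypothetical participants)
records = [
    {"age": 34, "occupation": "nurse", "postcode_area": "SW7"},
    {"age": 36, "occupation": "nurse", "postcode_area": "SW7"},
    {"age": 61, "occupation": "pilot", "postcode_area": "SW7"},
]
quasi = ["age", "occupation", "postcode_area"]

# With exact ages, every record is unique on its indirect identifiers
risky_raw = small_groups(records, quasi, k=2)

# Generalising age into bands merges the two nurses into one group,
# but the pilot remains unique and may still be identifiable
generalised = [{**r, "age": age_band(r["age"])} for r in records]
risky_generalised = small_groups(generalised, quasi, k=2)
```

Combinations that remain in the result after generalisation point to records that may need further generalisation, suppression, or sharing under controlled access rather than openly.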
With some datasets, anonymisation may compromise the usefulness or value of the data. Anonymising datasets that contain rich and detailed information might result in the loss of context and relevance, thereby limiting their value for both analysis and subsequent use. Data sharing under controlled access conditions allows researchers to use more detailed data while still protecting individuals' privacy, striking a balance between data utility and confidentiality.
Some data repositories provide tiered access levels to research data. Open access provides unrestricted data availability to all users, whereas restricted or controlled access sets specific limitations. These limitations may include:
- Datasets are placed under embargo for a limited period to allow for delays in publication of the associated article or the filing of a patent application
- Access is limited to registered users to ensure that access is restricted to authorised users who meet specific criteria
- Researchers are asked to sign a data sharing agreement. Data sharing agreements outline the terms and conditions for data access and usage and any limitations on data sharing.
- Requests to access data are reviewed by a data access committee
- Access to highly sensitive data is limited to analysis in a secure environment
While it may be necessary to restrict access to a dataset to protect confidentiality, it is still possible to publish a metadata record in a data repository.
Important: Datasets containing personal data as defined by the GDPR may only be transferred to an external repository if a data processing agreement is in place between the university and the repository. For further information, please contact rdm-enquires@imperial.ac.uk.
If no suitable repository is available, potentially disclosive data should be stored on a secure platform that offers encryption and strict access controls to ensure that only authorised persons can access the data. Procedures for handling data access requests from external researchers should be established, along with data sharing agreements outlining the terms and conditions for accessing and using the data.
Sharing data from clinical trials increases scientific knowledge, leads to better therapies for patients and increases public trust in clinical trials. Data must be anonymised before publication, and participant consent should explicitly state that data can be shared beyond the initial trial.
Although making anonymised clinical trial data publicly accessible promotes transparency and supports research advancement, it also presents risks of disclosure and misuse. Consequently, funding bodies like the MRC recommend using controlled access data-sharing methods. Whenever feasible, data should be stored in a repository that provides restricted access options, such as the UK Data Service, with access restricted to authorised users under a formal data access agreement.
Resources:
The UK Data Service provides comprehensive guidance on consent for data sharing and data anonymisation.
A link to the university’s Intellectual Property (IP) Policy can be found on the Intellectual Property web page.
The OSF maintains a List of Approved Protected Access Repositories.
For assistance with data sharing agreements, please contact your faculty or departmental contracts manager. Relevant links are available on the Material Transfer Agreements (MTAs) web page.
The MRC, in collaboration with other major funders, has published Good practice principles for sharing individual participant data from publicly funded clinical trials.