Data sharing
Aim
Making your final dataset, or parts of your data, available to others and/or collaborating with other researchers at the beginning of your project by aligning and standardizing your datasets.
Responsibilities
Executing researcher:
- To organise and archive the agreement with the third party.
Project leaders:
- To make written arrangements with the third party about the data sharing. The agreement should be signed on behalf of both parties by a mandated person (someone who is authorised by the board to sign agreements for the organisation);
- To verify that the purpose of the third party is compatible with the original study/informed consent provided by the participants whose data is to be shared.
Research assistant: N.a.
Why
Sharing your data with others can:
- increase the transparency of your research
- accelerate scientific discovery by enabling new (types of) research
- enhance the visibility and impact of your research
- create new opportunities for collaboration and broadening your network
- provide the opportunity for replication of your research
How to
- Use a goal-oriented method
Start by defining (within the study team) WHY you want to share your data and consider the potential benefits for you as a researcher and as a team. Awareness of these benefits supports working in accordance with the FAIR principles to enhance the quality of datasets throughout all stages of the project.
- Trust in the quality of the data
A clear and concise data management plan (DMP) is essential when starting a research project, and especially important when you intend to share your data. A well-defined description in your DMP helps build trust among other researchers. Additionally, providing clear metadata makes your data interpretable.
- Transparency
Transparency builds trust and makes the research process more FAIR. You can be transparent by offering codebooks, metadata and logbooks. Also manage expectations when sharing your data: explain what and how much data is missing, what inconsistencies you have found in your data, and any protocol deviations. Note that for clinical evaluations there are guidelines regarding what data need to be included in the ‘per-protocol’ analysis and what data in the additional ‘intent-to-treat’ analysis.
- Communication and agreements
Making agreements with other researchers/parties before sharing data helps to protect your rights and can help prevent issues further down the road. Identify the data you will share. Make sure that you adhere to all legal regulations concerning the storage and sharing of the data, and collaborate with the legal advisor at your institute (e.g. Legal Research Support at Amsterdam UMC).
- Organizing sufficient time and support
Making your data reusable and providing clear metadata takes time. Documenting your research process and data from the beginning of your project saves time. There is a lot of support available (https://www.amsterdamumc.org/research/support.htm), and requesting such support early on will help you to save time throughout your project.
How do you make your data ready for sharing?
First of all, do not try to reinvent the wheel, but involve experts on data management or data sharing from your institute. They can provide you with guidance and information that is tailored to your research project. (VUmc: datamanagement@vumc.nl; AMC and AMC/VUmc combined projects: rdm@amsterdamumc.nl; VU: researchdatamanagement@vu.nl)
Consent for reuse and anonymization
Sensitive data cannot be shared without the explicit consent of participants. The AVG/GDPR provides very limited exceptions, which require additional assessment and documentation in collaboration with a Data Privacy Officer, together with the applicable data sharing agreement drawn up with LRS.
These exceptions may be applicable if the information given to participants prior to their consent for data collection indicated future use of the data, or if all of the following are applicable:
- The opportunity to gain consent no longer exists or is not practically possible;
- The data have been sufficiently de-identified;
- There is no risk that publishing or sharing the data will cause harm or contribute to discrimination towards the research participants or subjects;
- Information sheets and consent forms from the original data collection did not preclude sharing.
Please consult the Research Data Management department and a Data Privacy Officer for advice regarding meeting conditions for reuse of data and data sharing when no explicit consent was provided by participants.
Make sure your data are ‘tidy’
Before you share your data, make sure that others can understand the structure of your dataset and, therefore, can reuse your data more easily. By making your dataset ‘tidy’ (https://www.jstatsoft.org/article/view/v059i10) you provide a standardized way to link the structure of your dataset to its meaning. In tidy datasets, each variable has its own column (so two variables are not combined in one column), each observation has its own row, and each type of observational unit forms a table. For clinical trials there are standards that can be used (e.g. CDISC Study Data Tabulation Model) to ensure uniform evaluation and interoperability of one or more dataset(s).
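As a minimal sketch of what ‘tidying’ can involve, the Python example below reshapes a small, made-up dataset in which two variables (systolic and diastolic blood pressure) share one column and each visit has its own column. The column names, participant codes and values are hypothetical and only serve as illustration; your own dataset will need its own rules.

```python
# Hypothetical wide-format records: one row per participant, with two
# variables combined in a single column ("120/80") and one column per visit.
wide_rows = [
    {"participant_id": "P01", "bp_visit1": "120/80", "bp_visit2": "118/79"},
    {"participant_id": "P02", "bp_visit1": "135/90", "bp_visit2": "130/85"},
]

VISIT_PREFIX = "bp_visit"  # assumed naming pattern of the visit columns


def to_tidy(rows):
    """Return one row per observation (participant and visit),
    with one column per variable."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if not column.startswith(VISIT_PREFIX):
                continue
            systolic, diastolic = value.split("/")
            tidy.append(
                {
                    "participant_id": row["participant_id"],
                    "visit": int(column[len(VISIT_PREFIX):]),
                    "systolic_mmhg": int(systolic),
                    "diastolic_mmhg": int(diastolic),
                }
            )
    return tidy


tidy_rows = to_tidy(wide_rows)
for tidy_row in tidy_rows:
    print(tidy_row)
```

After reshaping, each row describes exactly one measurement moment and each column holds exactly one variable, which is the structure described above.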
Describe your data with metadata
To make sure that others understand what your research project is about and what (type of) data your dataset contains, it is important to add metadata (data about your data) to your data. There are many standards that help you to structure the metadata (examples). If you aim to publish your data in a data repository, check what the standard of the respective repository is. There are multiple levels of metadata: metadata for the overall research project (how, what, why, by whom data is collected) and metadata to describe the actual contents of your dataset (data dictionary).
Also consult: https://libguides.vu.nl/rdm/metadata
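The two levels of metadata can be illustrated with a small Python sketch. It is not based on any particular metadata standard: the field names, the project details and the variables are all hypothetical, and a repository may require a different structure. The sketch also shows a simple check for dataset columns that are not yet described in the data dictionary.

```python
import json

# Project-level metadata: how, what, why and by whom data were collected.
# All values are hypothetical placeholders.
project_metadata = {
    "title": "Example cohort study",
    "purpose": "Illustration only",
    "collected_by": "Example research group",
    "collection_method": "Questionnaire and clinic visit",
}

# Data dictionary: one entry per variable in the dataset.
data_dictionary = {
    "participant_id": {"description": "Pseudonymised participant code", "type": "string", "unit": None},
    "visit": {"description": "Visit number", "type": "integer", "unit": None},
    "systolic_mmhg": {"description": "Systolic blood pressure", "type": "integer", "unit": "mmHg"},
}


def undocumented_columns(columns, dictionary):
    """Return the dataset columns that have no data dictionary entry."""
    return [column for column in columns if column not in dictionary]


# Hypothetical column names of the dataset that is about to be shared.
dataset_columns = ["participant_id", "visit", "systolic_mmhg", "smoker"]
missing = undocumented_columns(dataset_columns, data_dictionary)
print("Not yet documented:", missing)

# Both levels combined into one machine-readable record.
metadata_record = json.dumps(
    {"project": project_metadata, "variables": data_dictionary}, indent=2
)
print(metadata_record)
```

Running a check like this before sharing helps to ensure that every variable a recipient sees is explained somewhere.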
Define access conditions
If you aim to share your data, you should also define the conditions under which others can use your data. The principle of “as open as possible, as closed as necessary” applies here. Your data should be “as open as possible” to allow others to reuse your data and to accelerate (other) research, but at the same time, your data should be “as closed as necessary” to safeguard the privacy of the subjects in your research.
There are three main access conditions:
- Open: everyone has access to the data
- Restricted: special conditions apply; if the conditions are met, external users can access the data
- Closed: external users do not have access to the data
You can also decide to combine the access conditions, by, for example, making a specific part of your data open, and another part closed.
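Combining access conditions per part of a dataset could be recorded along the following lines. This is only a sketch in Python: the part names and the `may_access` helper are hypothetical, and the actual conditions for restricted access have to be defined in your own agreements.

```python
from enum import Enum


class Access(Enum):
    OPEN = "open"
    RESTRICTED = "restricted"
    CLOSED = "closed"


# Hypothetical dataset parts, each with its own access condition.
access_policy = {
    "study_protocol": Access.OPEN,
    "deidentified_measurements": Access.RESTRICTED,
    "contact_details": Access.CLOSED,
}


def may_access(part, conditions_met=False):
    """Decide whether an external user may access a part of the dataset.

    `conditions_met` stands for the outcome of whatever review the
    restricted access conditions require (an assumption of this sketch).
    """
    level = access_policy[part]
    if level is Access.OPEN:
        return True
    if level is Access.RESTRICTED:
        return conditions_met
    return False


for part in access_policy:
    print(part, may_access(part), may_access(part, conditions_met=True))
```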
Journal article: Prepare a data sharing statement
Manuscripts submitted to ICMJE journals (e.g., BMJ, The Lancet, New England Journal of Medicine) that report results of clinical trials must contain a data sharing statement. The trial’s registration must also include a data sharing plan.
Data sharing statements must indicate the following:
- whether individual deidentified participant data (including data dictionaries) will be shared
- what data in particular will be shared; whether additional, related documents will be available (e.g., study protocol, statistical analysis plan, etc.)
- when the data will become available and for how long
- by what access criteria data will be shared, including with whom, for what types of analyses, and by what mechanism
Also consult: http://www.icmje.org/icmje-recommendations.pdf
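The elements listed above can be used as a checklist while drafting. The Python sketch below is one possible way to do that; the class and field names are hypothetical and are not an official ICMJE format, and the example answers are made up.

```python
from dataclasses import dataclass, fields


@dataclass
class DataSharingStatement:
    """Checklist of the elements a data sharing statement must indicate."""

    shares_deidentified_participant_data: bool
    data_shared: str
    related_documents: str
    availability_period: str
    shared_with_whom: str
    permitted_analyses: str
    access_mechanism: str

    def missing_elements(self):
        """Return the names of the text elements that are still empty."""
        missing = []
        for field in fields(self):
            value = getattr(self, field.name)
            if isinstance(value, str) and not value.strip():
                missing.append(field.name)
        return missing


# Hypothetical draft in which the access mechanism has not been decided yet.
draft = DataSharingStatement(
    shares_deidentified_participant_data=True,
    data_shared="Deidentified participant data and data dictionary",
    related_documents="Study protocol, statistical analysis plan",
    availability_period="From publication, for a fixed period",
    shared_with_whom="Researchers with an approved proposal",
    permitted_analyses="Analyses described in the approved proposal",
    access_mechanism="",
)
print("Still to be filled in:", draft.missing_elements())
```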
Share data upon request: Prepare a data sharing agreement (data transfer agreement)
If a third party requests access to your data and you have not published your data in a data repository, you should make a data transfer agreement. The agreement (example in Dutch and English) must be signed by a mandated person (someone who is authorised by the board to sign agreements for the organisation) from both parties. The agreement should contain statements about the terms and conditions under which data are made available. For advice, contact RDM and legal counsel (LRS and IXA).
Add a license to your data
If you want to publish your research data in a data repository, you have to choose a license for your data. The most common licenses are Creative Commons licenses. You can use their online Chooser application to determine which license is appropriate for your data.
Publish your data
Once you have completed all the steps to make your data ‘shareable’, you can publish your data. The most common ways to publish or share your data are the following:
- Sharing data via a data repository
- A data repository allows you to store, share, and publish your data. Location AMC is conducting a pilot with data repository Figshare. OpenAIRE published a guide on how to choose an appropriate repository. Also consult the website of DANS. Which parts of your data can be stored in a repository, and under what conditions, varies widely depending on privacy and IP considerations.
- Add data as supplementary materials to a journal article
- Most journals allow you to upload your data and attach it to your manuscript as supplementary material, or add links to online repositories for supportive data.
- Make data available via the project website
- If you are working on a project or within a consortium that has its own website, you can also decide to provide information on your data there, including your procedures regarding data access.
- Share data upon request
- If you cannot make all your data publicly available (e.g. the protocol can be shared but the collected data cannot), you can decide to share the data only upon request, when specific access conditions are met and the appropriate agreements are drawn up.