Filling a DMP for the SNSF

Since 2017, applicants for an SNSF grant must fill in the DMP form directly in their mySNF account. For your convenience, we present below the 12 questions (also available in a printable form with comments), together with example answers taken from different sources [1, 2].

1 Data collection and documentation

1.1 What data will you collect, observe, generate or reuse?

Questions you might want to consider:
- What type, format and volume of data will you collect, observe, generate or reuse?
- Which existing data (yours or third-party) will you reuse?

Example of answer

This project will work with and generate three main types of raw data.

1. Images from transmitted-light microscopy of Giemsa-stained squashed larval brains.
2. Images from confocal microscopy of immunostained whole-mounted larval brains.
3. Western blot data.

All data will be stored in digital form, either in the format in which it was originally generated (i.e. Metamorph files for confocal images; Spectrum Mill files for mass spectra, with results of mass spectra analyses stored in Excel files; tiff files for gel images; Filemaker Pro files for genetics records), or will be converted into a digital form via scanning to create tiff or jpeg files (e.g. western blots or other types of results).

Measurements and quantification of the images will be recorded in spreadsheets. Micrograph data is expected to total between 100GB and 1TB over the course of the project. Scanned images of western blots are expected to total around 1GB over the course of the project. Other derived data (measurements and quantifications) are not expected to exceed 10MB.
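The volume estimates in this example answer can be added up to check how much storage to request. The sketch below is only an illustration: it assumes decimal units (1 TB = 1000 GB), and the dictionary keys are labels we chose for the three data types.

```python
GB = 10**9  # decimal gigabyte in bytes (assumption: 1 TB = 1000 GB)
MB = 10**6

# (lower bound, upper bound) in bytes, taken from the example answer above
expected_volumes = {
    "micrographs": (100 * GB, 1000 * GB),
    "western blot scans": (1 * GB, 1 * GB),
    "derived measurements": (0, 10 * MB),
}

total_low = sum(low for low, _ in expected_volumes.values())
total_high = sum(high for _, high in expected_volumes.values())

print(f"Expected total: {total_low / GB:.2f} GB to {total_high / GB:.2f} GB")
```

The upper bound is the figure to compare with the available storage capacity (see question 3.1).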

1.2 How will the data be collected, observed or generated?

Questions you might want to consider:
- What standards, methodologies or quality assurance processes will you use?
- How will you organize your files and handle versioning?

Example of answer

All samples on which data are collected will be prepared according to published standard protocols in the field. Files will be named according to a pre-agreed convention. The dataset will be accompanied by a README file which will describe the directory hierarchy and file naming convention.

Each directory will contain an INFO.txt file describing the experimental protocol used in that experiment. It will also record any deviations from the protocol and other useful contextual information.
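A naming convention is easier to follow when it can be checked automatically. The sketch below assumes a purely hypothetical convention (project, sample, method, acquisition date and version, separated by underscores); the pattern and the function names are ours and should be adapted to whatever your group agrees on.

```python
import re
from datetime import date

# Hypothetical convention: <project>_<sample>_<method>_<YYYYMMDD>_v<NN>.<ext>
FILENAME_PATTERN = re.compile(
    r"(?P<project>[a-z0-9]+)_"
    r"(?P<sample>[a-z0-9]+)_"
    r"(?P<method>[a-z0-9]+)_"
    r"(?P<date>\d{8})_"
    r"v(?P<version>\d{2})"
    r"\.(?P<ext>[a-z0-9]+)"
)


def build_filename(project, sample, method, acquired, version, ext):
    """Assemble a file name that follows the hypothetical convention."""
    return f"{project}_{sample}_{method}_{acquired:%Y%m%d}_v{version:02d}.{ext}"


def follows_convention(filename):
    """Return True if the whole file name matches the agreed pattern."""
    return FILENAME_PATTERN.fullmatch(filename) is not None


name = build_filename("brainimg", "larva012", "confocal", date(2024, 3, 5), 1, "tif")
print(name)
print(follows_convention(name))
print(follows_convention("Final version (2).tif"))
```

A check of this kind can be run over a directory before data is archived, and the pattern itself can be copied into the README so that the written convention and the check stay in sync.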

The microscopes capture and store a range of metadata (field size, magnification, lens phase, zoom, gain, pinhole diameter etc.) with each image.

This should allow the data to be understood by other members of our research group and add contextual value to the dataset should it be reused in the future.

1.3 What documentation and metadata will you provide with the data?

Questions you might want to consider:
- What information is required for users (computer or human) to read and interpret the data in the future?
- How will you generate this documentation?
- What community standards (if any) will be used to annotate the (meta)data?

Example of answer

Metadata will be tagged in XML using the Data Documentation Initiative (DDI) format. The codebook will contain information on study design, sampling methodology, fieldwork, variable-level detail, and all information necessary for a secondary analyst to use the data accurately and effectively.

It will be the responsibility of each researcher to annotate their data with metadata, and it will be the responsibility of the Principal Investigator to check weekly (during the field season, monthly otherwise) with all participants to ensure that data is being properly processed, documented, and stored.

All the datasets produced by the project will be published under a GNU licence.
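To give an idea of what the DDI-tagged metadata mentioned in this example answer could look like, here is a minimal sketch that builds a small codebook fragment with Python's standard library. It assumes the DDI Codebook 2.5 namespace and element names (codeBook, stdyDscr, dataDscr, var, labl); the title and variables are placeholders, and the fragment is simplified and has not been validated against the official schema.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace used by DDI Codebook 2.5
DDI_NS = "ddi:codebook:2_5"
ET.register_namespace("", DDI_NS)


def tag(name):
    """Qualify an element name with the DDI namespace."""
    return f"{{{DDI_NS}}}{name}"


def build_codebook(title, variables):
    """Build a simplified codebook: a study title plus variable labels."""
    code_book = ET.Element(tag("codeBook"))
    study = ET.SubElement(code_book, tag("stdyDscr"))
    citation = ET.SubElement(study, tag("citation"))
    title_stmt = ET.SubElement(citation, tag("titlStmt"))
    ET.SubElement(title_stmt, tag("titl")).text = title
    data_description = ET.SubElement(code_book, tag("dataDscr"))
    for name, label in variables:
        var = ET.SubElement(data_description, tag("var"), {"name": name})
        ET.SubElement(var, tag("labl")).text = label
    return code_book


# Placeholder content, for illustration only
codebook = build_codebook(
    "Placeholder study title",
    [("age", "Age of respondent in years"), ("region", "Region of residence")],
)
xml_text = ET.tostring(codebook, encoding="unicode")
print(xml_text)
```

In practice a full codebook also documents study design, sampling and fieldwork, as the example answer states, and is usually produced with a dedicated DDI editor or exported from the statistical software.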

2 Ethics, legal and security issues

2.1 How will ethical issues be addressed and handled?

Questions you might want to consider:
- What is the relevant protection standard for your data? Are you bound by a confidentiality agreement?
- Do you have the necessary permission to obtain, process, preserve and share the data? Have the people whose data you are using been informed or did they give their consent?
- What methods will you use to ensure the protection of personal or other sensitive data?

Example 1 of answer:

Personal data will be anonymized before sharing and dissemination, in line with the recommendations of the Federal Data Protection and Information Commissioner.

Example 2 of answer:

This project will generate data designed to study the prevalence and correlates of DSM-III-R psychiatric disorders and patterns and correlates of service utilization for these disorders in a nationally representative sample of over 8000 respondents. The sensitive nature of these data will require that the data be released through a restricted use contract.

2.2 How will data access and security be managed?

Questions you might want to consider:
- What are the main concerns regarding data security, what are the levels of risk and what measures are in place to handle security risks?
- How will you regulate data access rights/permissions to ensure the security of the data?
- How will personal or other sensitive data be handled to ensure safe data storage and transfer?

Example of answer for people using the NAS:

Our data is stored on the academic NAS managed by the UNIGE IT department (DiSTIC). Access to the data is limited to rights holders (central authentication). The head of the laboratory that owns this disk space manages access personally and can register additional users.

2.3 How will you handle copyright and Intellectual Property Rights issues?

Questions you might want to consider:
- Who will be the owner of the data?
- Which licenses will be applied to the data?
- What restrictions apply to the reuse of third-party data?

Key points for your answer

Research data generated by UNIGE collaborators in the performance of their duties is the property of the institution.

When the data is produced in partnership with a third party, it is strongly recommended to draw up, before the research project starts and with all the parties concerned, an agreement on the use of the research data. In the absence of such a document, the researcher of the University and the third party will have to agree on the use of the data.

When researchers wish to use data produced by a third party, they must comply with the copyright license or, in the absence of such a license, obtain the prior consent of the third party.

If you wish to transfer research data that may represent a commercial interest to a company, outside of an existing research agreement, you can contact Unitec, the technology transfer service. Unitec can answer your questions, assist you in drafting any contracts governing the transfer and the remuneration of the University, and help in negotiations with the third party.

In general, and as part of its mission in knowledge development and sharing, the University encourages free dissemination of data and research results, while respecting the rights and duties of the parties (management of personal or sensitive data, for example). A license must be assigned to data that may be shared, in order to clarify the conditions associated with its use and possible transfer to third parties. Creative Commons licenses, such as CC0 or CC BY, are commonly recommended choices. For any questions about these licenses, the Research Data team is available.

A decision tree to help you choose an appropriate license is available.

Example of answer:

Research data generated by UNIGE collaborators in the performance of their duties are the property of the institution. As the data is not subject to a contract and will not be patented, it will be released as open data under the Creative Commons CC0 license.

3 Data storage and preservation

3.1 How will your data be stored and backed-up during the research?

Questions you might want to consider:
- What is your storage capacity and where will the data be stored?
- What are the back-up procedures?

Example of answer for people using the NAS:

Our data is stored on the academic NAS managed by the University of Geneva's IT department, the Information and Communication Technologies and Systems Division (DiSTIC). This academic NAS follows common protocols and best practices to ensure maximum security, integrity and availability. It extends over two distinct physical locations (UniDufour and Campus Biotech) and automatically takes a snapshot of files every 4 hours, with copies retained for 6 weeks.
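The snapshot schedule in this example answer translates into concrete figures that can be quoted in a DMP. The short calculation below assumes that every 4-hourly snapshot is kept for the full 6 weeks, which the answer implies but does not state explicitly.

```python
from datetime import timedelta

SNAPSHOT_INTERVAL = timedelta(hours=4)
RETENTION = timedelta(weeks=6)

# Dividing one timedelta by another with // gives a whole number
snapshots_per_day = timedelta(days=1) // SNAPSHOT_INTERVAL
snapshots_kept = RETENTION // SNAPSHOT_INTERVAL

# Changes made after the latest snapshot are not yet protected by it
max_unprotected_window = SNAPSHOT_INTERVAL

print(f"Snapshots per day: {snapshots_per_day}")
print(f"Snapshots available at any time: {snapshots_kept}")
print(f"Longest stretch of work without a snapshot: {max_unprotected_window}")
```

Under that assumption, a file can be rolled back to any of 252 points in time, and at most 4 hours of changes fall outside the most recent snapshot.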

3.2 What is your data preservation plan?

Questions you might want to consider:
- What procedures would be used to select data to be preserved?
- What file formats will be used for preservation?

Example of answer:

At the end of the project we will preserve the data and its documentation for 10 years on the university's servers and also deposit it in an appropriate data archive (see section 4.1 below). Where possible, we will store files in open archival formats, e.g. Word files converted to PDF/A or simple text files encoded in UTF-8, and Excel files converted to CSV. Where this is not possible, we will include information on the software used and its version number.
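Re-encoding text files to UTF-8, as mentioned in this example answer, can be scripted with Python's standard library. The sketch below assumes that the original encoding is known (here Latin-1, chosen only for the demonstration); the function name and file names are hypothetical.

```python
import tempfile
from pathlib import Path


def convert_to_utf8(source, target, source_encoding):
    """Re-encode a text file to UTF-8; the source encoding must be known."""
    text = Path(source).read_text(encoding=source_encoding)
    Path(target).write_text(text, encoding="utf-8")


# Demonstration on a temporary file written in a legacy encoding
with tempfile.TemporaryDirectory() as tmp:
    legacy = Path(tmp) / "notes_latin1.txt"
    archived = Path(tmp) / "notes_utf8.txt"
    legacy.write_bytes("Université de Genève\n".encode("latin-1"))

    convert_to_utf8(legacy, archived, "latin-1")
    restored = archived.read_text(encoding="utf-8")
    print(restored, end="")
```

If the original encoding is wrongly declared, accented characters will be silently corrupted, so it is worth opening a few converted files to check them before deleting anything.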

4 Data sharing and reuse

4.1 How and where will the data be shared?

Questions you might want to consider:
- On which repository do you plan to share your data?
- How will potential users find out about your data?

Example of answer 1:

The project data will be stored in Yareta, the University of Geneva data repository, which is also available to researchers from other higher education institutions in Geneva. This will ensure that data archiving and sharing are fully compliant with the FAIR principles.

Example of answer 2:

Datasets from this work which underpin a publication will be deposited in Enlighten: Research Data, the University of Glasgow’s institutional data repository, and made public at the time of publication. Data in the repository will be stored in accordance with funder and University data policies. Files deposited in Enlighten: Research Data will be given a Digital Object Identifier (DOI) and the associated metadata will be listed in the University of Glasgow Research Data Registry and the DataCite metadata store. The retention schedule for data in Enlighten: Research Data will be 10 years from date of deposition in the first instance, with extensions applied to datasets which are subsequently accessed. This complies with both University of Glasgow guidance and funder policies.

Enlighten: Research Data is backed by commercial digital storage which is audited twice a year for compliance with the ISO27001 Information Security Management standard.

The DOI issued to datasets in the repository can be included as part of a data citation in publications, allowing the datasets underpinning a publication to be identified and accessed.
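As an illustration of how a DOI fits into a data citation, the sketch below assembles a citation string from a few fields. The layout (creators, year, title, publisher, identifier) is one commonly used pattern for dataset citations, not a requirement of the SNSF or of any particular repository; the function name is ours and all values, including the DOI, are placeholders.

```python
def format_data_citation(creators, year, title, publisher, doi):
    """Assemble a simple dataset citation ending with a resolvable DOI link."""
    authors = "; ".join(creators)
    return f"{authors} ({year}). {title}. {publisher}. https://doi.org/{doi}"


# Placeholder values, for illustration only
citation = format_data_citation(
    creators=["Doe, J.", "Roe, R."],
    year=2024,
    title="Placeholder dataset title",
    publisher="Placeholder repository",
    doi="10.1234/placeholder",
)
print(citation)
```

Journals often prescribe their own citation style, so the exact punctuation should follow the publisher's guidelines; what matters is that the DOI is included.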

Metadata about datasets held in the University Registry will be publicly searchable and discoverable and will indicate how and on what terms the dataset can be accessed.

4.2 Are there any necessary limitations to protect sensitive data?

Questions you might want to consider:
- Under which conditions will the data be made available (timing of data release, reason for delay if applicable)?

Example 1 of answer:

Astronomical data will be released, but under a one-year embargo so that the project team has priority in exploiting it.

Example 2 of answer:

Personal data will be anonymized before sharing and dissemination, following the recommendations of the Swiss Federal Data Protection and Information Commissioner.

4.3 I will choose digital repositories that conform to the FAIR Data Principles. [CHECK BOX]

You can find certified repositories in the Re3data.org catalog of repositories.

4.4 I will choose digital repositories maintained by a non-profit organisation. [RADIO BUTTON yes/no]

--> If the answer is no: “Explain why you cannot share your data on a non-commercial digital repository.”

Useful documents

Filling the SNSF DMP: Key elements

DMP template by the DLCM project

DMP model for clinical research projects (HUG, CRC)

DMP guidance by SPHN

DMP Canevas Generator (SIB)

How to choose a license?

FAQ

What data to deposit?

The SNSF expects researchers to share at least the data underlying their publications, but only to the extent needed to make the published results reproducible. This data should be shared as soon as possible, and at the latest together with the relevant scientific publication.

Some data cannot be shared because applicants are bound by legal, ethical, copyright, confidentiality or other clauses. They will be asked to explain their specific constraints.

Please refer to the Digital Curation Centre website: How to Select What Data to Keep.

When may I opt out?

A DMP is not necessary in two cases:

- Some research projects do not produce or reuse any data. If this is the case, applicants do not have to complete a DMP. However, they are asked to explain why they do not expect to generate or reuse any data in their proposed research.
- If your project is co-funded by the SNSF and another funder, and a DMP has been submitted to the latter, you do not have to provide a new one for the SNSF. You can simply link to the existing one.

Anything special with personal data?

If your research involves human subjects, you need to contact the Commission Universitaire pour une Recherche Ethique à l'Université de Genève (CUREG). In most cases, confidentiality clauses will prevent you from sharing sensitive data.

Which license should I choose?

You can check on the Creative Commons website to see which license is the most appropriate for your data.

Shall I use the academic NAS?

The academic NAS is a centralized storage space offered by the University's IT department. It is especially useful if you need to give UNIGE colleagues access to the data, if you do not (or do not want to) make automatic and regular backups of your PC, or if the data is too large for your computer.
The main constraint of the academic NAS is that you need to be connected to access the data.

Does the SNSF fund the costs of data sharing?

It is possible to ask the SNSF for up to CHF 10'000, in addition to the amount allocated to the project, for the preparation and validation of the data and for the costs of uploading the data to a non-commercial FAIR repository. This request must be made in myfns.ch when submitting the funding application. See art. 2.13 of the SNSF's General implementation regulations for the Funding Regulations.