Glossary

Is the definition you need missing? Write to us at researchdata-info(at)unige.ch

A

Anonymisation

Definition

Anonymisation is the processing of research data containing personal or sensitive information so that the data can no longer be attributed, directly or indirectly, to the identified or identifiable persons they concern. Anonymisation is an irreversible process.

Archiving of research data

Definition

Archiving research data means putting it on a secure platform for medium to long-term preservation. The retention period varies from one discipline to another, but the SNSF recommends archiving research data for a minimum period of 10 years.

It is important to distinguish between archiving and storage of research data. These two actions pursue different objectives, so each has its own dedicated solutions and platforms.

C

Citation of data

Definition

Citation of data means providing a reference to the data, in the same way as for publications such as articles, reports and conference papers.

Copyright

Definition

"Copyright protects the authors of literary and artistic works. It is the way in which an idea is expressed that is protected, not the idea or concept itself. Copyright protection therefore applies to the form of the work and not its content. For example, Einstein’s essay “The Foundation of the General Theory of Relativity” in the “Annals of Physics” is protected by copyright. The theory of relativity itself, however, may be freely used, just not with the same words as in Einstein’s original text."

Source: Swiss Federal Institute of Intellectual Property

Creative Commons

Definition

Creative Commons (CC) licences are a type of licence that can be applied to works to specify the conditions for their re-use and distribution. Creative Commons offers six licences to meet specific needs.

These licences can be applied to almost any type of work, for example: music, databases, photographs or educational resources. The only categories of works for which CC does not recommend its licences are software and hardware.

Source: Creative Commons

D

Data Journal

Definition

A data journal is a scientific journal dedicated to the publication of data papers.

Data lifecycle

Definition

The research data life cycle describes the stages that data go through, from their creation to their archiving. It can be represented in several ways. The UK Data Service, for example, divides the life cycle into 6 stages:

  1. Planning data creation
  2. Data collection
  3. Data preparation and analysis
  4. Data publication and sharing
  5. Preparing data for preservation
  6. Reuse of data

For each of these stages, actions and processes can be put in place to safeguard the quality, integrity and security of research data.

Source: UK Data Service

Data Management Plan (DMP)

Definition

A Data Management Plan (DMP) is a formal document that outlines how research data will be handled during and after a research project.

Most funding agencies now require the submission of a DMP when applying for a grant. The Swiss National Science Foundation (SNSF) introduced this requirement in autumn 2017.

Data Paper

Definition

A data paper is a scientific article whose primary purpose is to describe in detail one or more datasets produced during a research project, typically with the help of metadata and without going into the analysis of the datasets themselves.

Data papers can be published in "traditional journals" or in dedicated journals called data journals, and are in principle peer-reviewed.

Source: Chavan & Penev (2011) and IRDData

Data Repository

Definition

A data repository is a space dedicated to the uploading of research data for the purpose of archiving and/or making it available for transparent or peer-to-peer re-use.

Repositories can be open or closed and can be classified into 3 main types:

  • generic or multidisciplinary: open to all types of data
  • disciplinary: open to data from a specific field of study
  • institutional: managed by an institution and open to its members only

Dataset

Definition

A dataset is "a coherent set of data produced within the framework of the same project, on the same object of study and/or collected at the same location. All the data in a dataset can therefore be described with a majority of common metadata".

Source: IRDData

DORA declaration

Definition

The San Francisco Declaration on Research Assessment (DORA) is a text published in 2013 by the American Society for Cell Biology and a group of scientific journal editors. It calls for questioning and improving the way research, scientific journals and researchers are evaluated, in particular the use of bibliometric indicators such as the Journal Impact Factor or the h-index.

DLCM

Definition

The DLCM (Data Life-Cycle Management) project, a national project on the lifecycle management of research data, was launched in 2015 by 8 Swiss partner universities and funded by swissuniversities. Its aim is to provide researchers with resources to support them in the various aspects of research data management and archiving.

DOI

Definition

Digital Object Identifiers (DOIs) are a type of persistent identifier consisting of a string of alphanumeric characters. They identify published digital objects, such as articles and research datasets, persistently and unambiguously.

Source: DataCite

E

Electronic Laboratory Notebook

Definition

Electronic Laboratory Notebooks (ELNs) are the digital equivalents of traditional laboratory notebooks. Many software programs are available on the market.

Embargo

Definition

In the context of knowledge dissemination, an embargo is a period of time during which access to a research product is restricted and permitted only under strict conditions. Embargoes can, for example, be requested by publishers in order to reserve exclusive rights to disseminate the publications concerned and thus give exclusive access to people who have subscribed to their services.

F

FAIR principles

Definition

The FAIR principles aim to establish data sharing standards so that humans and computer systems can easily find, interpret and use data.

The acronym FAIR stands for:

  • FINDABLE: Data and supplementary documents have sufficiently rich metadata and a unique and persistent identifier.
  • ACCESSIBLE: Metadata and data are understandable to humans and machines. The data are deposited in a reliable repository.
  • INTEROPERABLE: Metadata use a formal language that is accessible, shared and applicable to all forms of knowledge representation.
  • REUSABLE: The data and collections are clearly licensed for use and provide accurate information about their provenance.

Source: Wilkinson, M., Dumontier, M., Aalbersberg, I. et al.

File format

Definition

A file format is the way software encodes the information contained in a file. For each type of file (images, text, audio, spreadsheet, etc.), a number of specific formats are available.

In most cases, the format of a file can be identified by its extension, a suffix preceded by a dot at the end of the file name.

Example: "content.txt" --> .txt indicates a text file.

There are two types of file formats:

  • Open or free formats, which can be used by anyone because the file specifications are publicly available.
  • Proprietary or closed formats, which only work with the vendor's software. Once that software is no longer supported, files in these formats usually become unreadable.

The file format greatly affects the accessibility and potential reusability of research data. For this reason, the choice of format must be made in an informed and carefully considered manner.
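As a rough illustration of the points above, the following Python sketch reads the extension from a file name and looks it up in two small sets. The function names and both sets are assumptions made for this example. The sets are illustrative and far from exhaustive, and they are not an official classification.

```python
from pathlib import Path

# Illustrative assumptions only: short, non-exhaustive example sets.
OPEN_FORMATS = {".txt", ".csv", ".json", ".xml", ".png", ".odt"}
PROPRIETARY_FORMATS = {".doc", ".xls", ".psd", ".sav"}


def file_extension(name: str) -> str:
    """Return the lower-cased suffix of a file name, including the dot."""
    return Path(name).suffix.lower()


def classify_format(name: str) -> str:
    """Classify a file name as 'open', 'proprietary' or 'unknown'."""
    ext = file_extension(name)
    if ext in OPEN_FORMATS:
        return "open"
    if ext in PROPRIETARY_FORMATS:
        return "proprietary"
    # No extension, or an extension missing from the example sets.
    return "unknown"
```

With these example sets, "content.txt" is reported as open, while a file without an extension is reported as unknown. The extension is only a hint: it says nothing about what the file actually contains.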

FNS (SNSF) requirements

Definition

"The SNSF values research data sharing as a fundamental contribution to the impact, transparency and reproducibility of scientific research. In addition to being carefully curated and stored, the SNSF believes research data should be shared as openly as possible.

The SNSF therefore expects all its funded researchers

  • to store the research data they have worked on and produced during the course of their research work,
  • to share these data with other researchers, unless they are bound by legal, ethical, copyright, confidentiality or other clauses, and
  • to deposit their data and metadata onto existing public repositories in formats that anyone can find, access and reuse without restriction."

Source: SNSF

Free formats

Definition

Free formats or open formats are file formats whose encoding is transparent and whose technical specifications are public, accessible and unconditionally usable. These formats are interoperable, i.e. files in them can be opened and modified by any software designed to process that type of file (text, audio, images, etc.).

Open formats should be favoured as much as possible for data preservation and sharing, since they ensure the readability and reusability of these files over time while keeping them independent of a single technology.

L

Licences

Definition

Licences are legal instruments of intellectual property law. They aim to protect a resource by specifying the terms and conditions of its use, in particular those governing access, distribution and reuse.

LIMS

Definition

LIMS is an acronym that stands for Laboratory Information Management System. LIMS works by being "connected directly to scientific measuring instruments (spectrometer, MRI, scanner or electron microscope) and by capturing data at source via an interface and ensuring their management and traceability".

Nowadays, LIMS are often combined with ELNs in a single application, although historically they were developed independently. These combined systems allow the entire laboratory workflow to be supported within a single tool.

Sources: DLCM and Campus no. 132

M

Metadata

Definition

Metadata (literally data about data) is information that describes the basic characteristics of a data item, regardless of its medium (physical or digital).

For example:

  • Its author(s)
  • Its contents
  • Its creation date
  • The place of capture/production
  • The reason the data was generated
  • How the data was created

These different specifications are called metadata fields.

The metadata therefore places the data in context, making it easier to understand, process and potentially reuse in the future.

To decide what information to include in metadata, you can rely on metadata standards, i.e. sets of specific fields designed to describe datasets simply, such as Dublin Core or DataCite.
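The metadata fields listed above can be sketched as a simple record in Python. Everything in this example is hypothetical: the field names, the sample values and the two helper functions. The mapping covers only the four fields with an obvious counterpart among the Dublin Core elements (creator, description, date, coverage).

```python
# Field names chosen for this example, following the list above.
REQUIRED_FIELDS = (
    "authors", "contents", "creation_date", "place", "purpose", "method",
)

# Hypothetical sample record, not real data.
record = {
    "authors": ["A. Example"],
    "contents": "Hourly water temperature readings",
    "creation_date": "2021-06-15",
    "place": "Lake Geneva",
    "purpose": "Monitoring seasonal temperature variation",
    "method": "Automated sensor, one reading per hour",
}

# Partial mapping to Dublin Core element names (an assumption of this sketch).
DUBLIN_CORE_MAP = {
    "authors": "creator",
    "contents": "description",
    "creation_date": "date",
    "place": "coverage",
}


def missing_fields(rec: dict) -> list:
    """List the required fields that are absent or empty in a record."""
    return [field for field in REQUIRED_FIELDS if not rec.get(field)]


def to_dublin_core(rec: dict) -> dict:
    """Rename the mapped fields to their Dublin Core element names."""
    return {dc: rec[field] for field, dc in DUBLIN_CORE_MAP.items() if field in rec}
```

Checking a record with missing_fields before deposit is one simple way to make sure that no field has been left empty.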

N

Naming convention

Definition

In the context of file and data organisation, naming conventions are standardised, systematic ways of naming the files produced during research in order to make them easy to identify, in particular by using short, descriptive names.

A naming convention is particularly important in the case of data managed within a team or laboratory.
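As an illustration, the Python sketch below builds file names following one possible convention: date, project, short description and version number, separated by underscores. This pattern and the function name are assumptions chosen for the example, not a convention prescribed here. A team would agree on its own pattern.

```python
import re
from datetime import date


def build_file_name(day: date, project: str, description: str,
                    version: int, extension: str) -> str:
    """Build a name such as 20210615_project_description_v02.csv."""

    def slug(text: str) -> str:
        # Lower-case the text and replace anything that is not a letter
        # or a digit with a hyphen, so that names stay short and portable.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    ext = extension.lstrip(".").lower()
    return f"{day:%Y%m%d}_{slug(project)}_{slug(description)}_v{version:02d}.{ext}"
```

For example, build_file_name(date(2021, 6, 15), "Lake Survey", "water temperature", 2, ".CSV") returns "20210615_lake-survey_water-temperature_v02.csv". Putting the date first in year-month-day order means that files sort chronologically by name.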

NAS academic storage

Definition

The academic NAS (Network Attached Storage) is a storage space service for UNIGE researchers. It enables active research data to be stored on equipment that is easily accessible, fast and secure (authentication and integrated backup). It is suitable for data that need to be regularly consulted, exploited, modified and shared.

O

OLOS

Definition

Developed by the DLCM project, OLOS is a national solution that provides archiving, long-term preservation, publication and access for research data to all Swiss Higher Education Institutions.

Source: OLOS

OpenAIRE-H2020

Definition

OpenAIRE (Open Access Infrastructure for Research in Europe) is a European project funded by Horizon 2020. It is organized around two main poles of action: networking experts in open science and leveraging their expertise for the creation of training courses and the development of an open technical infrastructure for the centralization, management and sharing of scientific publications and research data to support the work of European scientists.

Source: OpenAIRE

Open Research Data

Definition

Open Research Data aims to make publicly funded research data freely and permanently accessible to researchers and citizens. This data must be FAIR (Findable, Accessible, Interoperable and Reusable) in order to be freely accessed, used, modified and shared.

Open Research Data is considered an essential element in the evolution of scientific research, particularly with regard to its transparency, reproducibility and measurement of its impact.

Source: Open HES-SO, UNIL and SNSF

Open Science

Definition

Open Science is an umbrella term for a set of initiatives and policies aimed at reforming the way in which scientific research is conducted, evaluated and disseminated. These initiatives have notably given rise to Open Access and Open Research Data. Open Science emphasizes the importance of transparency, replicability and collaboration among all stakeholders in science.

ORCID

Definition

ORCID, an acronym for Open Researcher and Contributor IDentifier, is a free, international system of persistent digital identifiers. An ORCID identifier uniquely identifies a researcher and thus distinguishes him or her from peers, in particular from namesakes. It can be linked to all of a scientist's outputs, such as publications, grants and other contributions.

Source: ORCID

P

Persistent identifier

Definition

A persistent identifier, also known as a perennial identifier, is a string of characters and/or numbers used to uniquely identify a resource, irrespective of its location and with a long-term perspective.

The best known are: DOI, URI, ORCID and ARK.

These identifiers are generally structured in 3 parts. DOIs, for example DOI: 10.13097/archive-open/unige:27916, are structured as follows:

  • a prefix corresponding to the type of identifier used
  • the designation of the entity which has assigned the identifier ("10.13097" for UNIGE)
  • the specific name of the resource ("archive-open/unige:27916")
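The three parts listed above can be separated with a few lines of Python. This is a minimal sketch: the function name and the dictionary keys are made up for the example, and it only handles the written form shown here, with an optional "DOI:" label followed by a designation starting with "10.", a slash and the resource name.

```python
def split_doi(identifier: str) -> dict:
    """Split a DOI string into the three parts described above."""
    text = identifier.strip()

    # Drop the optional "DOI:" label that names the type of identifier.
    label, colon, rest = text.partition(":")
    if colon and label.strip().lower() == "doi":
        text = rest.strip()

    # The first slash separates the assigning entity from the resource name.
    registrant, slash, resource_name = text.partition("/")
    if not (registrant.startswith("10.") and slash and resource_name):
        raise ValueError(f"not a DOI: {identifier!r}")

    return {
        "identifier_type": "DOI",
        "registrant": registrant,
        "resource_name": resource_name,
    }
```

Applied to "DOI: 10.13097/archive-open/unige:27916", the function returns "10.13097" as the registrant and "archive-open/unige:27916" as the resource name.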

Sources: Espasandin et al. 2018 and Cevey & Raemy 2020

Personal data

Definition

"Personal data: all information relating to an identified or identifiable person;"

Pseudonymisation

Definition

Pseudonymisation is "the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person".

Source: General Data Protection Regulation
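The idea in the definition above can be illustrated with a short Python sketch: identifying values are replaced by pseudonyms, and the table linking pseudonyms back to persons (the "additional information") is returned separately so that it can be stored apart from the data. The function name, the field name and the sequential pseudonym scheme are assumptions made for this example. It is a sketch of the principle, not a procedure that guarantees legal compliance.

```python
def pseudonymise(records: list, id_field: str):
    """Replace the identifying field by a pseudonym.

    Returns the pseudonymised records and the key table. The key table
    must be kept separately and protected, since it allows
    re-identification.
    """
    key_table = {}
    pseudonymised = []
    for record in records:
        original = record[id_field]
        if original not in key_table:
            # Sequential pseudonyms keep the sketch simple. Note that
            # they reveal the order of first appearance.
            key_table[original] = f"P{len(key_table) + 1:04d}"
        # Copy the record so that the input data are left untouched.
        pseudonymised.append({**record, id_field: key_table[original]})
    return pseudonymised, key_table
```

The same person always receives the same pseudonym, so records can still be linked to one another without revealing who they belong to.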

R

Re3Data

Definition

Re3data or Registry of Research Data Repositories is a global directory of research data repositories launched in 2012.

It allows you to use several criteria to find research data repositories, such as:

  • Name
  • Topic
  • Country
  • etc.

This directory offers colored thumbnails assessing compliance with specific criteria, particularly in terms of accessibility. This makes it possible to assess whether a repository is compliant with FAIR principles.

Source: Re3Data

Research Data

Definition

"Research data are defined as factual records (numerical scores, textual records, images and sounds) used as primary sources for scientific research, and that are commonly accepted in the scientific community as necessary to validate research findings".

Source: OECD

S

Sensitive Data

Definition

"Sensitive data is a specific category of personal data containing information about :

  1. religious, ideological, political or trade union-related views or activities,
  2. health, the intimate sphere or the racial origin,
  3. social security measures,
  4. administrative or criminal proceedings and sanctions;"

Source: Federal Act of 19 June 1992 on Data Protection (FADP), art. 3, let. c

Storage of research data

Definition

The storage of research data concerns so-called active data, i.e. data that is still in use. Storage must be on secure platforms, whose contents are backed up regularly, to ensure data integrity and security.

It is important to distinguish storage from archiving of research data. These two actions pursue different objectives, so each has its own dedicated solutions and platforms.

U

Unitec

Definition

Unitec is a UNIGE unit specialising in the valorisation of research and the promotion of technology transfer between the University, the University Hospitals, the Geneva University of Applied Sciences, the City and the business community.

Y

Yareta

Definition

Yareta is a Data Repository developed within the framework of the national "DLCM" project of swissuniversities and the cantonal bill "Digital Infrastructure for Research".

This platform complies with the FAIR principles for the management of research data. It is therefore in line with the requirements of the funders (SNSF, Horizon 2020) for the archiving and preservation of research data.

It is available to all researchers of Geneva's Higher Education Institutions.

Source: eresearch
