Data and Digital Outputs Management Plan

www.biodiversa.eu/2024/04/15/motivate

This DDOMP was developed and is maintained by Stephan Kambach, Ilona Knollová, and Francesco Maria Sabatini (with support from all data managers).

Contacts: stephan.kambach@gmail.com, francescomaria.sabatini@unibo.it, ikuzel@sci.muni.cz

I. DATA MANAGER(S)

Data compilation and distribution are managed by the following members of the MOTIVATE project.

Dr. Ilona Knollová is the data manager responsible for the compilation and distribution of raw EVA and ReSurvey data.
Dr. Gabriella Damasceno is the data manager responsible for the compilation and distribution of sPlot data.
Dr. Stephan Kambach is responsible for the coordination and data sharing among MOTIVATE members.
Within the different WPs, the following WP-curators are responsible for the compilation and distribution of the respective data subsets.
- WP1: Dr. Stephan Kambach (MLU)
- WP2: Dr. Michael Glaser (Uni Vienna)
- WP3: Dr. José Manuel Álvarez Martínez (UNIOVI), Dr. Alicia Valdés (UNIOVI), Dr. Borja Jiménez-Alfaro
- WP4: M.Sc. Georg Hähn (UNIBO), Dr. Manuele Bazzichetto (UNIBO), Prof. Francesco Maria Sabatini (UNIBO)
- WP5: Dr. Ilona Knollová (MUNI)
- WP6: Dr. Tracy Hruska (UOULU)
- WP7: Dr. Stephan Kambach (MLU)

Stephan Kambach, Francesco Maria Sabatini, and Ilona Knollová coordinate the data management practices among WPs and ensure that all MOTIVATE partners comply with the DDOMP.

II. DATA IDENTIFICATION & DESCRIPTION

Purpose: MOTIVATE will leverage a database of vegetation-plot time series (EVA + ReSurvey Europe) and ongoing monitoring under the EU HD to produce both habitat- and species-specific assessments of plant biodiversity status and trends. Openly available remote-sensing data will be used to upscale the recorded trends in plant diversity to regional and European scales, acting as a methodological bridge between in situ vegetation data and spatial upscaling, ensuring consistency with EUNIS habitat typologies and enabling cross-border analyses. Biodiversity modelling techniques, such as Generalized Dissimilarity Models, will be used to attribute the observed changes in plant diversity to their drivers on local, regional, and European scales. MOTIVATE will establish novel pipelines to foster future collections of vegetation-plot time series and to help decision-makers put monitoring data into practice. For this, MOTIVATE will work with national conservation agencies to co-design a data information platform for the collection and reporting of biodiversity change indicators that allows local time series to be linked to spatial information on habitat extent and potential drivers from remote sensing. Within this framework, WP3 specifically focuses on the integration of vegetation-plot time series with Earth Observation (EO) data to generate spatially explicit and reproducible indicators of habitat extent, temporal change, and environmental drivers.

Openly available datasets that will be compiled for analyses.

· Raw taxonomic backbone data is compiled from the WFO database and the taxonomic backbone of the sPlot database.

· Raw species phylogenetic relationships are extracted from Open Tree of Life.

· Raw climatic data is compiled from CHELSA Climatologies.

· Raw EUNIS habitat classification and taxon harmonization algorithms are downloaded from the most recent publication (zenodo.org/records/4812736).

· Raw life-form information is extracted from FloraVeg.

· Raw Ellenberg indicator values are extracted from FloraVeg.

Primary datasets are initial data products that will not be published due to owner rights and/or data restrictions.

· Raw vegetation-plot data (EVA & ReSurvey) is provided by the European Vegetation Archive as requested by project # 200–2024-03-01 (euroveg.org/eva-database/projects).

· Raw plant trait data is provided by the sPlot database in the format of a gap-filled list of 35 plant traits.

· Raw European red list status information is provided by Laura Méndez (UFZ).

· Raw surveyor interview data, consisting of audio recordings, video recordings, and transcripts, will be collected by Tracy Hruska.

· Raw online questionnaire data, consisting of electronic responses from nature conservation agencies, stakeholders, and scientists, will be collected by Tracy Hruska.

· Photos will be taken in the field and at workshop meetings by different members of the MOTIVATE project.

· R-code will be generated for the manipulation and analysis of all data formats.

Secondary datasets are intermediate data products from processing and/or analysis steps that will not be published.

· Harmonized vegetation-plot data (EVA & ReSurvey) will be extended with additional taxa columns following different harmonizations according to EUNIS habitat types, sPlot and WFO backbones, and manual correction of potential errors.

· Subsets of vegetation-plot data (EVA & ReSurvey) will be filtered and created according to the needs of the different WPs (e.g., regarding plot locations, survey dates, and other quality criteria).

· Plot-level climate and topographic variables will contain extracted values of (bio-) climatic variables, altitude, slope, and terrain roughness.

· Intermediate remote-sensing products contain spatially and temporally aggregated information derived from openly available remote sensing products, including spectral indices, phenological metrics and spatial summaries aligned with EUNIS habitat typologies.

Produced datasets represent final data products that are used for statistical tests, are distributed to stakeholders, and will be published (for instance with scientific publications).

· Plot-level plant traits contain community-level aggregated means and/or variances of univariate and/or multivariate plant trait distributions.

· Species (co-)occurrence maps contain grid-based spatial distributions of species (co-) occurrence probabilities.

· Vegetation dissimilarity maps show spatial patterns of vegetation dissimilarity across European ecoregions.

· Hyperspectral habitat libraries contain information on the hyperspectral indices among EUNIS habitat types (summarised into means, variances, and phenological trajectories).

· Habitat distribution maps contain grid-based spatial distributions of EUNIS habitat types.

· Remote-sensing-derived habitat indicators include spatially explicit metrics describing habitat extent and temporal dynamics, generated through the integration of vegetation-plot data and remote-sensing products.

· Survey summaries contain aggregated responses and/or data on vegetation (re-) surveyors and stakeholders.

· Interview transcript segments are portions of audio and/or video recordings of interviews that have been transcribed and, where necessary, translated.

· The interactive EVA and ReSurvey map is an online tool to filter and export header data from the two vegetation-plot databases, with the aim of fostering collaborations and the future collection of vegetation-plot data (publicly accessible at evamap.eu).

· Publications include manuscripts published in peer-reviewed journals, non-reviewed journals and books, as well as talks at conferences and stakeholder meetings.
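
The community-level aggregated means listed above for plot-level plant traits can be illustrated with a short sketch. It assumes that aggregation is done as a cover-weighted (community-weighted) mean, which is a common choice but is not specified in this DDOMP. The function name, the record layout, and the example values are hypothetical, and the sketch is written in Python for illustration only, whereas the project itself works in R.

```python
from collections import defaultdict


def community_weighted_means(cover, traits):
    """Return the cover-weighted mean trait value per plot.

    cover:  iterable of (plot_id, taxon, cover_value) records
    traits: dict mapping taxon -> trait value (taxa without a value are skipped)
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for plot_id, taxon, cover_value in cover:
        if taxon not in traits:
            continue  # no trait value available; excluded from the mean
        weighted_sum[plot_id] += cover_value * traits[taxon]
        weight_total[plot_id] += cover_value
    return {
        plot_id: weighted_sum[plot_id] / weight_total[plot_id]
        for plot_id in weighted_sum
        if weight_total[plot_id] > 0
    }


# Hypothetical example records (not MOTIVATE data)
example_cover = [
    ("plot_A", "Taxon one", 60.0),
    ("plot_A", "Taxon two", 40.0),
    ("plot_B", "Taxon two", 10.0),
    ("plot_B", "Taxon three", 30.0),  # has no trait value below
]
example_traits = {"Taxon one": 2.0, "Taxon two": 4.0}

cwm = community_weighted_means(example_cover, example_traits)
```

For the hypothetical records above, plot_A yields (60 × 2 + 40 × 4) / 100 = 2.8, and plot_B yields 4.0 because only one of its taxa has a trait value. Skipping taxa without trait values is a design choice of this sketch; other treatments of missing values are possible.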

Data formats specify the amount and organisation of the different datasets. Distinct software can be used to read and view the following data formats.

· Tabular data consists of tab-delimited .csv-files, .txt-files, .xlsx-files, and .Rdata files which can be read and viewed with R or common spreadsheet software.

raw vegetation-plot data (EVA and ReSurvey)
raw plant trait data
raw taxonomic backbone data
raw European red list status data
raw life-form information
raw Ellenberg indicator values
harmonized EVA and ReSurvey data
aggregated plot-level plant traits
aggregated plot-level climate variables
hyperspectral habitat libraries
textual and numerical survey responses
interview transcripts

· Spatial pixel data consists of georeferenced .tiff or .Rdata-files which can be read and viewed with ArcGIS, QGIS or R.

raw climatic data
species (co-) occurrence maps
vegetation dissimilarity maps
Spatial pixel data are designed to support integration with other spatial biodiversity indicators.

· EUNIS expert system data consists of JUICE-exported .txt-files which can be read and viewed with R and common text editors.

raw EUNIS habitat classification and taxon harmonization algorithms

· RData is a binary file format. It can be read and viewed with R.

raw species phylogenetic relationships
subsets of vegetation-plot data
species (co-) occurrence maps
vegetation dissimilarity maps
hyperspectral habitat libraries

· Online resources can be viewed with common browser software.

Interactive map of EVA and ReSurvey databases

· Audiovisual data consists of media files, including .wav, .mp4, .mp3, and .mov files, that can sometimes be read with standard media players but require more advanced software for editing (e.g., Adobe Premiere, Insta360 Studio, Adobe Audition).

field video recordings of interviews and related fieldwork
field and virtual audio recordings of interviews and related fieldwork

Other types of material include samples, specimens, and data collected during field campaigns. For these other types of data, the corresponding researcher will ensure that data collection, processing, and publication adhere to the MOTIVATE DDOMP.

III. DATA ORGANISATION & EXCHANGE (INTERNALLY, DURING THE PROJECT)

Data management is jointly organised among all data managers and WP-curators (via email and monthly meetings of the MOTIVATE group). All data will be stored on private computers and/or password-secured cloud storage solutions (from universities, research institutions, or private companies). Within each WP, the WP-curators are responsible for the storage and protection of the respective data subsets. Google cloud services will not be used for the storage or distribution of primary datasets.

Data sharing is realized via hyperlinks that allow access to and download of cloud-stored data. Together with data description and metadata files, these hyperlinks are hosted on a private and password-protected MOTIVATE GitHub repository. In this way, all data managers have access to the following shared data: raw and harmonized vegetation-plot data, raw taxonomic backbone data, raw European red list status data, raw species phylogenetic relationships, plot-level climate variables, species (co-) occurrence maps, and vegetation dissimilarity maps. Data manipulations, including data additions, deletions, and/or updates, are to be marked with different version numbers and documented in the respective data description and metadata files. Data organization, regarding file names and folder structures, is the responsibility of the WP data manager. Information on data organization is to be noted in the corresponding data description and metadata files. For remote-sensing-related analyses, data organisation emphasises traceability between vegetation-plot inputs, Earth Observation-derived variables and resulting spatial products. Versioning of scripts and intermediate outputs ensures that analytical steps can be reconstructed and verified internally throughout the project.
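
As an illustration of how version numbers and documented changes could be recorded in a data description or metadata file, the following minimal sketch builds one record per file version and adds a checksum, so that a version number can be tied to the exact file content. The record fields, the file name, the version scheme, and the example contents are hypothetical and are not prescribed by this DDOMP. The sketch uses Python for illustration only.

```python
import hashlib
import json


def describe_version(file_name, content, version, change_note):
    """Build a metadata record that ties a version number to the exact file content."""
    return {
        "file": file_name,
        "version": version,
        "sha256": hashlib.sha256(content).hexdigest(),
        "size_bytes": len(content),
        "change": change_note,
    }


# Hypothetical file contents standing in for two versions of a data file
v1 = b"plot_id\ttaxon\tcover\nP1\tTaxon one\t60\n"
v2 = b"plot_id\ttaxon\tcover\nP1\tTaxon one\t55\n"

changelog = [
    describe_version("example_plots.csv", v1, "1.0.0", "initial compilation"),
    describe_version("example_plots.csv", v2, "1.0.1", "corrected cover value in P1"),
]

# The log could be stored next to the data description and metadata files
changelog_json = json.dumps(changelog, indent=2)
```

Because the checksum changes whenever the file content changes, a data manager who downloads a file via its hyperlink can confirm that it matches the documented version.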

Data consistency and quality are reviewed by multiple data managers. Raw EVA and ReSurvey data are compiled and checked by Ilona Knollová, then sent to Stephan Kambach, who applies additional taxa harmonizations and checks for potential “doppelgaenger” taxa among consecutive vegetation surveys. The resulting harmonized EVA and ReSurvey data are made available to all data managers via the MOTIVATE GitHub repository. Within each WP, the respective data managers are responsible for the review and correction of their subsets of EVA and ReSurvey data. Recognized data errors and applied corrections will be communicated to Stephan Kambach, who will then update the respective data and hyperlinks in the MOTIVATE GitHub repository.
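
This DDOMP does not define how “doppelgaenger” taxa are detected. The following sketch shows one possible screening step under an explicit assumption: a candidate pair consists of a taxon that disappears between two consecutive surveys of the same plot and a taxon of the same genus that newly appears. The function name, the rule itself, and the example species lists are illustrative and do not describe the procedure actually used in MOTIVATE. The sketch uses Python for illustration only.

```python
def doppelgaenger_candidates(earlier, later):
    """Pair taxa lost and gained between two surveys of one plot that share a genus.

    earlier, later: sets of non-empty taxon names ("Genus epithet")
    recorded in two consecutive surveys of the same plot.
    """
    lost = earlier - later    # recorded only in the earlier survey
    gained = later - earlier  # recorded only in the later survey
    pairs = []
    for old_name in sorted(lost):
        for new_name in sorted(gained):
            # Assumed rule: same genus (first word of the name) is suspicious
            if old_name.split()[0] == new_name.split()[0]:
                pairs.append((old_name, new_name))
    return pairs


# Illustrative species lists (not MOTIVATE data)
survey_1 = {"Festuca ovina", "Carex flacca", "Briza media"}
survey_2 = {"Festuca rubra", "Carex flacca", "Galium verum"}

candidates = doppelgaenger_candidates(survey_1, survey_2)
```

Such a screen only flags pairs for manual review. Whether a flagged pair reflects a real change in species composition or an inconsistent identification remains a decision for the data manager.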

IV. DATA STORAGE AND BACK-UP (WITH INTERNALS)

Within the MOTIVATE project, data will be collected, processed, and generated between 02.02.2024 and 28.02.2028. The processing and publication of EVA and ReSurvey data must comply with the EVA Data Property and Governance Rules (euroveg.org/download/eva-rules.pdf).

Data storage and backup of raw and processed EVA and ReSurvey data will be guaranteed by using password-secured, cloud-based data storage solutions that conduct regular backups. Raw EVA and ReSurvey data are stored at MUNI (by Ilona Knollová). Data file sizes vary among WPs. With the exception of gridded spatial data, the size of most data files ranges between a few kilobytes and ten gigabytes, so that they can be stored on personal computers and cloud-storage solutions. Spatial pixel data, as processed by WP2, WP3, and WP4, can reach 50 gigabytes or more and will be stored and shared via cloud-based solutions or physical hard drives. Audiovisual data from WP6 can include very large files, most of which will be protected and not shared; where permissions and needs allow, large files will be shared via cloud-based solutions.

V. DATA SHARING, STANDARDS & METADATA (WITH EXTERNALS)

Primary and secondary data will not be shared with externals (i.e., anyone outside of the MOTIVATE project). Joint publications, such as opinion papers or resampling protocols, might be collaboratively developed using Word files (.docx) or Google Docs.

Although primary and secondary datasets are subject to data ownership and access restrictions, analytical workflows and programming scripts, such as R and Earth Engine code, will be documented, shared among MOTIVATE members and cooperation partners and, where possible, shared alongside scientific publications. This approach supports transparency, reproducibility, and alignment with FAIR principles, without compromising data protection constraints.

VI. DATA RESTRICTIONS

Due to data owner rights and/or data restrictions, the raw EVA and ReSurvey data, raw spatial pixel data, raw red list data, and raw taxonomic backbone data will not be made publicly available. For the ReSurvey datasets “CH_0008” and “DE_0037” the updated precise coordinates (from 2024) will not be published due to data owner rights. To protect sensitive and personal data from interviews and questionnaires, each interviewee will grant consent prior to recordings. Recordings will not be shared with researchers other than those directly involved in the collection without prior consent of the respective person. Other than the images and voices themselves, personal data will not be recorded in audio or visual recordings. The use and publication of personal data will generally comply with the EU GDPR. Personal data and/or photos will only be published after agreement of the respective person.

VII. DATA PUBLISHING, LICENSING & DATA ARCHIVING (AFTER THE PROJECT ENDS)

Data processing and publication will comply with the EVA Data Property and Governance Rules (euroveg.org/download/eva-rules.pdf).

Whenever possible, the plot- or pixel-level aggregated data that accompany published articles will be made openly available, after consultation with all data owners, under the Creative Commons licence CC BY-NC-SA (i.e., credit must be given to the creator, only non-commercial uses of the work are permitted, and adaptations must be shared under the same terms) and together with a Digital Object Identifier (DOI). To safeguard the long-term use of data, any published data will be stored in openly available repositories, such as Dryad (datadryad.org) or the iDiv data repository (idata.idiv.de). Each published data file will be accompanied by supporting documentation and metadata according to the requirements of the respective data repository (including Darwin Core and/or Ecological Metadata Language metadata). Whenever possible, the supporting aggregated data will be made available as soon as the results of the research have been published.

After the project ends, raw EVA and ReSurvey data will be archived at MUNI by Ilona Knollová (under project 200 – 2024-03-01 MOTIVATE). Mobilization of new vegetation-plot data (EVA and ReSurvey) will continue after the end of the MOTIVATE project, either via the project webpage (www.motivate-biodiversity.eu, which will be hosted until at least 2030) or by contacting Ilona Knollová and all other WP PIs.

After the termination of the MOTIVATE project, and in case data managers take up employment in a new project, the primary and secondary data must be either published or permanently deleted after a transition phase of 3 years.

Spatial analyses conducted over vegetation plots rely on open and operational Earth Observation data sources to ensure temporal consistency, scalability and long-term applicability of derived biodiversity indicators beyond the lifetime of the project. This facilitates future reuse of methods and indicators in policy and management contexts.

VIII. COSTS

During the MOTIVATE project (i.e., until 31.03.2027), the positions of (most) data managers are funded via Biodiversa+ (www.biodiversa.eu/2024/04/15/motivate) or their respective universities. Data storage costs, with regard to cloud storage or data repositories, are to be covered by the Biodiversa+ funds. As soon as specific costs for data storage become apparent, they will be added to this DDOMP.

IX. ACKNOWLEDGEMENTS

In any emergent publication from the MOTIVATE project, the following funding sources must be acknowledged:

“The MOTIVATE-project (motivate-biodiversity.eu) was funded by Biodiversa+, the European Biodiversity Partnership, under the 2022-2023 BiodivMon joint call. It was co-funded by the European Commission (GA No. 101052342) and the following national funding organisations: Austrian Science Fund FWF (MOTIVATE, pr.no. I 6846-B), Deutsche Forschungsgemeinschaft (DFG) project number 532411638, Research Council of Finland (RCF) grant number 359866, Spanish PCI2023-2 International Collaboration Projects number MCINN-24-PCI2023-146014-2, Technology Agency of the Czech Republic (TAČR) project number SS73020008, and Italian Ministry of University and Research (MUR) project number BIODIV22_00086”

0. LIST OF ABBREVIATIONS

ArcGIS	Geographic information system software developed and maintained by Esri (www.arcgis.com)
CHELSA	Climatologies at high resolution for the earth’s land surface areas (chelsa-climate.org)
DDOMP	Data and digital outputs management plan
EML	Ecological Metadata Language (eml.ecoinformatics.org)
EU GDPR	European General Data Protection Regulation (gdpr-info.eu)
EU HD	EU Habitats Directive (environment.ec.europa.eu/topics/nature-and-biodiversity/habitats-directive_en)
EUNIS	European Nature Information System (eunis.eea.europa.eu)
EVA	European Vegetation Archive (https://euroveg.org)
FloraVeg	Online database of European vegetation, habitats and flora data (www.floraveg.eu)
iDiv	German Centre for Integrative Biodiversity Research Halle-Jena-Leipzig (www.idiv.de)
JUICE	Non-commercial software package for editing and analyses of phytosociological data (www.sci.muni.cz/botany/juice)
MLU	Martin Luther University Halle-Wittenberg (www.uni-halle.de)
MUNI	Masaryk University (www.muni.cz)
Open Tree of Life	Comprehensive, dynamic and digitally available tree of life synthesized from published phylogenetic trees (tree.opentreeoflife.org)
PI	Principal investigator
QGIS	Free and open-source geographic information system software (qgis.org)
R	Language and environment for statistical computing and graphics (www.r-project.org)
ReSurvey Europe	Initiative within the European Vegetation Archive (EVA) that aims at mobilizing vegetation-plot data with repeated measurements over time (euroveg.org/resurvey)
sPlot	Global vegetation plot database (www.idiv.de/reseach/projects/splot)
Turboveg	Program designed for storing, selecting, and exporting vegetation plot data (www.synbiosys.alterra.nl/turboveg3)
UFZ	Helmholtz Centre for Environmental Research (www.ufz.de)
UNIBO	University of Bologna (www.unibo.it)
UNIOVI	University of Oviedo (www.uniovi.es)
Uni Vienna	University of Vienna (www.univie.ac.at)
UOULU	University of Oulu (www.oulu.fi)
WFO	World Flora Online database (www.worldfloraonline.org)
WP	Work package within the MOTIVATE project (WP1–WP7)
