Session 1. Showcasing uses of open biodiversity data
Bridging the gap between science and policy through open data
Open biodiversity data has been the core of Artsdatabanken’s existence for nearly two decades. We share lessons learned, challenges met and our thoughts on future needs and possibilities.
Bjarte Rambjør Heide
Bjarte Rambjør Heide has been the general director of the Norwegian Biodiversity Information Centre (Artsdatabanken) since 2021. He holds a Master’s degree in natural sciences from the Norwegian University of Life Sciences, focusing on population genetics. From 2007 to 2020 he held several positions at the Norwegian Environment Agency, including Head of Section from 2016 to 2020.
Crop wild relatives Digital Twin: where data flow meets computation to revolutionize agricultural research
While the human population is rapidly increasing and is expected to reach 11 billion by the end of this century, global agricultural production is challenged by climate change. To meet the UN Sustainable Development Goal targets and achieve zero hunger, we need to boost food production. For this, we need crops with higher yields, greater nutritional value, and the ability to resist diseases and adapt to changing environments. Untapped genetic resources to meet these goals are often harbored in crop wild relatives. Digital twinning represents a promising technology for the model-based identification and use of these resources by facilitating: 1) data flow and fusion from distributed data sources, 2) dynamic model updating, 3) automated model uncertainty analysis (validation against real-life data), and 4) provision of automated alerts for new genetic resources with predicted target genetic properties for plant breeders, conservation scientists and policymakers. Here, we showcase the importance of data flow in developing a digital twin that supports improving the nutritional quality of grasspea and can be transferred to other traits and crops. Grasspea is a climate-smart crop that requires only residual moisture to complete its life cycle and is considered a lifesaver during severe droughts in tropical and subtropical regions. While it is protein-rich and could potentially help combat protein malnutrition in the future, it also contains a neurotoxin that can cause paralysis of the lower limbs in adults and brain damage in children if consumed over long periods. Specific goals of our work include modelling geographic areas that harbor populations of grasspea wild relatives and landraces with alleles that could lower the toxin content of grasspea to a non-harmful level.
Desalegn Chala is currently working as a researcher with a focus on biodiversity digital twins at the Natural History Museum, University of Oslo. He is a botanist and geo-informatician (GIS and remote sensing) by training and uses these fields as tools in research. He has extensive experience in ecological modelling, with a special interest in ecology and biodiversity digital twins. He is keen to develop his research career by combining disciplines such as GIS, remote sensing, ecological niche modelling and molecular genetics to develop biodiversity digital twins and to answer questions related to food security, biogeography and conservation.
The use of open biodiversity data in the private sector; what, how and what next?
Open biodiversity data is not only used for research; it is also an integral part of the impact assessments performed by consultants in the private sector. Here we share how and for what purposes we use this data, and lay out thoughts on how we can facilitate better collaboration between data providers, the public sector, and consultancy firms.
Tanja Kofod Petersen
Tanja K. Petersen is an ecologist with experience in using open biodiversity data for biostatistical modelling. Her PhD at NTNU (Norwegian University of Science and Technology) dealt with the effects of urbanization on biodiversity, primarily using data from GBIF. She has subsequently been employed as a researcher at NTNU and as an advisor at the Norwegian Biodiversity Information Centre (Artsdatabanken). She currently works as an advisor at Multiconsult, one of the leading firms of consulting engineers and designers in Norway and Scandinavia.
Traits, bioclimate and extinction risk of European bryophytes
Extinction risk is not randomly distributed among species but depends on the one hand on species traits and on the other on environmental factors and threats. While knowledge of which factors influence extinction risk is increasingly available for some taxonomic groups, this is still largely lacking for bryophytes. The Red List of European Bryophytes provides an opportunity to assess which factors affect the extinction risk of bryophytes on a continental scale. To do this, we compiled trait data and bioclimatic variables for the European bryophytes. We used random forest models to study which traits and bioclimatic variables are important predictors for extinction risk. In addition, we predict the extinction risk of bryophytes categorized as Data Deficient (DD) in the European Red List.
Kristel Van Zuijlen
Kristel Van Zuijlen is a plant ecologist with a special interest in cryptogams (lichens and bryophytes) and traits. Her PhD at NMBU (Norwegian University of Life Sciences) was about traits and ecosystem processes of plants and cryptogams in alpine environments. More recently, her postdoc at the Swiss Federal Research Institute WSL was about ecology and conservation of European bryophytes, linking trait data to extinction risk.
More speakers to be announced soon!
Session 2. Emerging opportunities in open biodiversity data
Diversifying the GBIF Data model
GBIF’s flagship resource, GBIF.org, is an open-data infrastructure allowing institutions to share datasets that document evidence of species occurrence. A simple data model built around the Darwin Core data standard has enabled the integration of 2.3B records of evidence from more than 2,000 institutions, growing at a rate of around 10 records per second. As the GBIF network has grown, so too has the desire for GBIF to accommodate more complex data and to enable richer questions to be asked of it. This presentation will introduce the case-study approach currently underway to explore expanding this common data model, and will share early results from three cases: eDNA, physical material, and abundance and absence.
Tim Robertson has a background in engineering and data infrastructure and leads the informatics activities at the secretariat of the Global Biodiversity Information Facility (GBIF). GBIF operates an open data infrastructure that allows anyone, anywhere in the world to search and access evidence-based species occurrence data. Tim has been involved in drafting and maintaining data standards, developing large-scale data-ingestion pipelines and establishing a citation tracking system.
The importance of propagating parameter uncertainty in matrix population models
Matrix population models (MPMs) are widely used population projection models. They use estimated vital rates to project future population size and structure. Large open-source databases of MPMs stored in a standardized way, such as COMPADRE (https://compadre-db.org/), are great tools for making such models more accessible and for opening up opportunities for comparative studies. However, uncertainty accumulates at multiple levels in these predictive models, and failing to account for all sources means that variability in predicted outcomes is ignored. Approximately half of papers using MPMs do not fully consider and propagate uncertainty, and the importance of such omissions is not yet known. I will present a simulation study showing how propagation of vital rate uncertainty can impact estimates of derived quantities such as population growth rate. I also assess whether uncertainty omission alters the conclusions we draw.
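As a rough illustration of what propagating vital rate uncertainty can look like, the sketch below compares a point estimate of the population growth rate with an interval obtained by repeatedly drawing vital rates from assumed sampling distributions. It is not the speaker's simulation study: the two-stage life cycle, the vital rate values, the Beta and Gamma distributions and all function names are hypothetical choices made only for this example.

```python
import math
import random
import statistics


def growth_rate(fecundity, juvenile_survival, adult_survival):
    """Dominant eigenvalue of the 2x2 projection matrix [[0, F], [Sj, Sa]].

    The characteristic equation is lambda^2 - Sa*lambda - F*Sj = 0,
    so the larger root has a closed form.
    """
    discriminant = adult_survival ** 2 + 4.0 * fecundity * juvenile_survival
    return (adult_survival + math.sqrt(discriminant)) / 2.0


def draw_survival(rng, mean, effective_n):
    """Draw a survival probability from a Beta distribution (assumed form)."""
    return rng.betavariate(mean * effective_n, (1.0 - mean) * effective_n)


def propagate(n_draws=5000, seed=1):
    """Monte Carlo propagation of vital rate uncertainty into the growth rate."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        sj = draw_survival(rng, mean=0.3, effective_n=50)  # hypothetical estimate
        sa = draw_survival(rng, mean=0.8, effective_n=50)  # hypothetical estimate
        f = rng.gammavariate(25.0, 1.5 / 25.0)  # mean 1.5, CV 0.2 (hypothetical)
        draws.append(growth_rate(f, sj, sa))
    draws.sort()
    lower = draws[int(0.025 * n_draws)]
    upper = draws[int(0.975 * n_draws) - 1]
    return statistics.mean(draws), lower, upper


# Growth rate from the point estimates alone, with no uncertainty attached.
point_estimate = growth_rate(1.5, 0.3, 0.8)
mean_rate, lower_rate, upper_rate = propagate()
print(f"point estimate: {point_estimate:.3f}")
print(f"propagated:     {mean_rate:.3f} ({lower_rate:.3f} to {upper_rate:.3f})")
```

Reporting only the point estimate would hide the width of the propagated interval; whether that interval includes a growth rate of 1 (a stable population) is the kind of conclusion that omitting uncertainty could change.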
Emily Simmonds is a researcher currently on an International Mobility Fellowship jointly at the University of Edinburgh and NTNU. Emily’s research interests are centred on how we can predict responses to environmental change. She has most recently been exploring how well we are reporting model-related uncertainty across the sciences and how important uncertainty propagation is in predictive population models.
Building a reproducible workflow for the field of integrated species distribution models
There has been an exponential increase in the quantity and types of biodiversity data available in the 21st century, driven mostly by citizen science initiatives. Integrated species distribution modelling has emerged as a powerful tool for ecologists, allowing us to make best use of all these new data sources at our disposal. We therefore present a reproducible workflow in the form of an R package to assist ecologists in developing such models, using open-source tools and data. This presentation is illustrated through a case study using red-listed vascular plants located around Norway.
Philip Mostert is a PhD student at the Department of Mathematics at the Norwegian University of Science and Technology (NTNU). His work focuses on developing methods and software to further research on integrated species distribution models.
Risk of bias in big biodiversity data: assessment, mitigation and communication
Ecology has entered the era of big data, characterized by previously unimaginable quantities of data and the near death of the random sample. Unfortunately, for many scientific questions, having a large sample size does not make up for the lack of a random sample, and I will begin my talk by explaining why. It can even make matters worse by providing false confidence in the wrong answer. This does not mean that ecologists should stop using non-random samples for research, but they must be treated with special care. Particularly important is to assess the risk of bias, to mitigate it (where possible) and to communicate any remaining risk to readers. Researchers in other disciplines are well ahead on each of these fronts, and I will discuss how ecologists can catch up.
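The claim that a large non-random sample can give false confidence in the wrong answer can be sketched with a small simulation. This is an illustrative toy, not the speaker's method: the site population, the true occupancy rate, the reporting probabilities and the function names are all assumptions made for the example.

```python
import math
import random


def summarize(sample):
    """Return sample size, estimated proportion and a normal-approximation 95% interval."""
    n = len(sample)
    p = sum(sample) / n
    half_width = 1.96 * math.sqrt(p * (1.0 - p) / n)
    return n, p, p - half_width, p + half_width


def simulate(seed=42, population_size=200_000, true_rate=0.30):
    """Compare a small random sample with a large sample whose inclusion depends on the outcome."""
    rng = random.Random(seed)
    # 1 = species present at a site, 0 = absent (hypothetical population of sites).
    population = [1 if rng.random() < true_rate else 0 for _ in range(population_size)]
    truth = sum(population) / population_size

    # Small but genuinely random sample of sites.
    random_sample = rng.sample(population, 500)

    # Large non-random sample: sites where the species is present are assumed
    # to be twice as likely to be reported as sites where it is absent.
    biased_sample = [y for y in population if rng.random() < (0.4 if y else 0.2)]

    return truth, summarize(random_sample), summarize(biased_sample)


truth, random_result, biased_result = simulate()
print(f"true occupancy: {truth:.3f}")
for label, (n, p, low, high) in (("random", random_result), ("biased", biased_result)):
    print(f"{label:>6}: n={n:>6}  estimate={p:.3f}  95% interval=({low:.3f}, {high:.3f})")
```

Under these assumptions the biased sample is roughly a hundred times larger and produces a much narrower interval, yet that interval sits far from the true occupancy, while the small random sample stays close to it.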
Rob Boyd is a quantitative ecologist and methodologist with the UK Centre for Ecology & Hydrology. He mostly works on methods to assess, mitigate and communicate the types of bias that are typical of biodiversity data.
Session 3. Where to go from here: Future directions in open biodiversity data
Data integration increases the value of open biodiversity data: lessons from the past for the future
Climate change, habitat degradation, exploitation, pollution, disease, and invasive species are all contributing to global declines of biodiversity. However, identifying which species are declining and where, as well as the specific spatiotemporal effects of stressors on species’ trends, is difficult because of gaps in available data. Open data has the potential to solve this problem. By simultaneously analyzing multiple open data sources and types within a single analytical framework, data integration approaches have the potential to provide enhanced understanding of when, where, and why biodiversity is impacted by anthropogenic disturbance. This talk will highlight past data integration research using open data and discuss exciting avenues for future research.
Elise Zipkin is an Associate Professor in the Department of Integrative Biology and Program Director in Ecology, Evolution, and Behavior at Michigan State University. As a quantitative ecologist, Elise connects the complexities of natural communities with the precision of mathematics to shine light on mysteries in ecology and conservation. Elise and her team develop analytical frameworks to address grand challenges in the study of biodiversity loss and the effects of anthropogenic activities, such as climate change. She harnesses empirical data (big and small) to understand fine and subtle interactions in the natural world, revealing the causes and consequences of species’ declines and biodiversity loss while charting pathways to mitigate and reverse these alarming trends.
SPI-Birds – synthesising and FAIRifying long-term bird studies for global avian biodiversity
SPI-Birds is a global and growing network of long-term observational data on individually marked breeding birds. The initiative was established to increase the visibility of study sites and the coordination between researchers, and provide a standardisation service to facilitate collaboration and data reuse. Recently, SPI-Birds has been branching out in new directions, linking to other databases, and developing ways through which other users can interact with our rich source of long-term data.
Stefan Vriend is a computer ecologist and data juggler, enthusiastic about synthesizing and integrating data for open, transparent, and accessible science. As a postdoc and developer at SPI-Birds and the Netherlands Institute of Ecology, he develops data standardization pipelines for long-term bird datasets, creates tools to facilitate metadata harmonization and exchange, and helps increase the accessibility of the FAIR data landscape in the Netherlands.
More than data: Citizen Science as a means for opening science to society
Citizen science is often considered a challenge due to potential biases or a lack of structure in citizen science data. Based on experiences with citizen science in water and environmental fields, this presentation offers a social science perspective on citizen science to explore the opportunities that it presents to change science-society-policy interfaces. It concludes with an outlook on how open science policy can create an enabling environment for holistic citizen science.
Uta Wehn is Associate Professor of Water Innovation Studies at IHE Delft and Visiting Professor of Citizen & Community Science at the University of Sonora, Mexico. A social scientist from the field of innovation studies with a background in ICTs, she works at the intersection of data & knowledge co-creation, digital innovation, and water and environment. Drawing on more than 20 years of combined experience in industry, research and international development, her Action Research activities on diverse water, climate & environmental challenges aim to help mainstream community-based approaches for monitoring and achieving the SDGs. She currently co-chairs the global Citizen Science & Open Science Community of Practice and holds various advisory roles for international research bodies, funders and policy makers.
More speakers to be announced soon!
Workshop 1. Accessing, handling, and referencing open biodiversity data using the Global Biodiversity Information Facility (GBIF)
In an age of change and threats to the ecosphere at macroecological scales, big data has crystallized as a go-to tool for ecological research. To facilitate the creation of, access to, and referencing of contributors to such big data repositories, the Global Biodiversity Information Facility (GBIF) has established data streams that make a wealth of biodiversity data readily available. Navigating and accessing such large data sets, and ensuring fair use and accreditation of observations, can seem daunting. In this workshop, we will introduce GBIF, show how to navigate its data portal, demonstrate data streams to obtain and handle data programmatically, and communicate accreditation procedures.
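For readers who want a feel for programmatic access before the workshop, here is a minimal sketch that queries the public GBIF occurrence search API using only the Python standard library. It is not the workshop material, which may use different tools. The helper names are hypothetical, and the example species and country code are arbitrary choices.

```python
import json
import urllib.parse
import urllib.request

# Public GBIF occurrence search endpoint.
GBIF_OCCURRENCE_SEARCH = "https://api.gbif.org/v1/occurrence/search"


def build_occurrence_url(scientific_name, country=None, limit=20):
    """Build a search URL; helper name and defaults are choices made for this sketch."""
    params = {"scientificName": scientific_name, "limit": limit}
    if country is not None:
        params["country"] = country  # ISO 3166-1 alpha-2 code, e.g. "NO"
    return GBIF_OCCURRENCE_SEARCH + "?" + urllib.parse.urlencode(params)


def fetch_occurrences(scientific_name, country=None, limit=20):
    """Fetch one page of results; requires network access."""
    url = build_occurrence_url(scientific_name, country, limit)
    with urllib.request.urlopen(url, timeout=30) as response:
        payload = json.load(response)
    return payload["count"], payload["results"]


def summarize_record(record):
    """Keep a few fields; datasetKey identifies the dataset that contributed the record."""
    fields = ("key", "scientificName", "decimalLatitude", "decimalLongitude", "datasetKey")
    return {field: record.get(field) for field in fields}


# Example usage (commented out because it needs a network connection):
# total, records = fetch_occurrences("Lathyrus sativus", country="NO", limit=5)
# print(total, [summarize_record(r) for r in records])
print(build_occurrence_url("Lathyrus sativus", country="NO", limit=5))
```

The search endpoint is suited to small exploratory queries. For analyses that need to be cited, GBIF's download service issues a DOI for each download, which is the usual basis for crediting data contributors.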
Erik Kusch is a data scientist and macroecologist focusing on integrating different sources of biological and environmental big data into complex biostatistical analyses. Throughout his work, Erik establishes complex data workflows to facilitate interdisciplinary work. Most recently, his research has explored the implications of environmental change for ecological networks at macroecological scales. In his current position at the University of Oslo, Erik is working towards the integration of European biodiversity research infrastructures, including eLTER and GBIF.
Workshop 2. Mapping Norwegian Biodiversity: What are we doing, and how can we avoid duplicating effort?
This workshop will survey existing projects working to map Norwegian biodiversity, with a focus on avoiding replication and providing tools to avoid duplication of effort in the future.
Bob O’Hara’s work is at the interface of ecology and statistics. At the moment his main focus is on developing models for the large-scale distribution of species. The problems here revolve around correctly modelling the sampling processes, and effectively modelling the distributions of species, and their dynamics, in both space and time. As well as thinking about single species, we also have to consider multiple species, and how they may respond in similar ways to the environment and interact with each other.