Living Norway Ecological Data Network – Status and vision
Open science and open data present applied ecologists with new challenges, but also with opportunities that are unprecedented in the history of science. Living Norway Ecological Data Network has a core mission to promote open data based on the FAIR guiding principles, but also to engage with and work for an open scientific culture across the ecological and related sciences. Although Living Norway Ecological Data Network is – as the name implies – based in Norway, the ideas, data, tools and workflows we share and develop are open and universal. In this opening talk, we will present the main vision, activities and motivation behind our project, and argue why we believe that building a strong culture for open data, and for open science in general, is needed to increase the societal impact of applied ecological research in the years and decades ahead.
Recent paper: Mandeville, Caitlin; Koch, Wouter; Nilsen, Erlend Birkeland; Finstad, Anders Gravbrøt.
Open Data Practices among Users of Primary Biodiversity Data. BioScience 2021.
Gold standards for expensive data: streamlining archiving, distribution, and integrated analysis of long-term individual-based data
Improved accessibility and transparency around ecological data are central to tackling both the biodiversity crisis and concerns related to research reproducibility. Much progress has been made in recent years with regard to open access policies, development of data standards (e.g. FAIR principles, Darwin Core Standard), and possibilities for archiving in large databases and reliable repositories. This progress has, however, not yet permeated all fields of ecological research equally and has – at least so far – been focused heavily on data, which constitute only the first step of research workflows.
Costly long-term data on marked individuals have for decades been the backbone of demographic studies aimed at understanding and predicting population dynamics and their responses to environmental change. While research environments working with such data have – at times – been reluctant to embrace certain open science practices, recent initiatives (e.g. COMADRE/COMPADRE, SPI-Birds) have highlighted the great potential for integrative studies resulting from standardisation and “FAIR-ification” of such data.
The Darwin Core (DwC) Standard provides enough flexibility for capturing the intricacies of data from marked individuals (capture-mark-recapture data). Working out a protocol for mapping data sets to the DwC standard within the research community and providing tools for converting data to and from Darwin Core will greatly facilitate storing, sharing, and analysing these valuable data on scales not previously possible.
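As an illustration of what converting to and from Darwin Core could look like, the sketch below renames the columns of a made-up ringing table to Darwin Core terms and then rebuilds 0/1 capture histories from the resulting records. The terms used (organismID, eventDate, scientificName, locationID, occurrenceID, basisOfRecord) are existing Darwin Core terms, but the field-book column names, the identifier scheme and the choice of mapping are assumptions made for this example, not a community-agreed protocol.

```python
from collections import defaultdict

# Hypothetical field-book column -> Darwin Core term.
# The column names on the left are invented for this example.
FIELD_TO_DWC = {
    "ring_number": "organismID",
    "capture_date": "eventDate",
    "species": "scientificName",
    "site": "locationID",
}


def to_darwin_core(rows):
    """Rename field-book columns to Darwin Core terms and add record-level terms."""
    records = []
    for i, row in enumerate(rows, start=1):
        record = {FIELD_TO_DWC[key]: value for key, value in row.items()}
        record["occurrenceID"] = f"example:occ:{i}"  # made-up identifier scheme
        record["basisOfRecord"] = "HumanObservation"
        records.append(record)
    return records


def capture_histories(records, occasions):
    """Turn Darwin Core records back into 0/1 capture histories per individual."""
    seen = defaultdict(set)
    for record in records:
        seen[record["organismID"]].add(record["eventDate"])
    return {
        organism: [1 if date in dates else 0 for date in occasions]
        for organism, dates in seen.items()
    }


# Invented example data: two ringed birds over three capture occasions.
field_rows = [
    {"ring_number": "A001", "capture_date": "2019-05-01", "species": "Parus major", "site": "plot1"},
    {"ring_number": "A001", "capture_date": "2021-05-01", "species": "Parus major", "site": "plot1"},
    {"ring_number": "B002", "capture_date": "2020-05-01", "species": "Parus major", "site": "plot2"},
]
occasions = ["2019-05-01", "2020-05-01", "2021-05-01"]

dwc_records = to_darwin_core(field_rows)
histories = capture_histories(dwc_records, occasions)
print(histories)
```

Keying the histories on organismID is what lets repeated captures of the same individual be linked; a real mapping would also need to handle occasions that span several dates, and individuals that are re-marked.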
Standardising raw data also paves the way for applying the gold standard of open science to the remainder of the scientific process. So far, best practices for accessible and reproducible code and workflows have received less attention than those for data, and it is about time to step up our game there as well.
Recent paper: Antica Culina, Frank Adriaensen, Liam D. Bailey, Malcolm D. Burgess, Anne Charmantier, Ella F. Cole, Tapio Eeva, Erik Matthysen, Chloé R. Nater, et al. Connecting the data landscape of long-term ecological studies: The SPI-Birds data hub. Journal of Animal Ecology.
Modelling the heterogeneity within citizen science data for biodiversity research
Diana E. Bowler, Nick J.B. Isaac and Aletta Bonn
Large amounts of species occurrence data are compiled by platforms such as GBIF but these data are collected by a diversity of methods and people. Statistical tools, such as occupancy-detection models, have been developed and tested as a way to analyze these heterogeneous data and extract information on species’ population trends.
However, these models make many assumptions that might not always be met. More detailed metadata associated with occurrence records would help to better describe the observation/detection submodel within occupancy models and improve the accuracy and precision of species’ trend estimates. Here, we present examples of occupancy-detection models applied to citizen science datasets, and typical approaches to account for variation in sampling effort and species detectability.
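For readers unfamiliar with occupancy-detection models, the sketch below shows the simplest version: one season, a constant occupancy probability psi and a constant per-visit detection probability p. It is a teaching example with invented detection histories and a crude grid-search fit, not the models or data used in this talk. Models applied to citizen science data would typically let p vary with covariates such as sampling effort.

```python
import math


def site_likelihood(history, psi, p):
    """Likelihood of one site's detection history (1 = detected, 0 = not detected).

    psi: probability that the site is occupied.
    p:   per-visit detection probability, given that the site is occupied.
    """
    # Probability of this exact history if the site is occupied.
    if_occupied = math.prod(p if y else 1.0 - p for y in history)
    # An unoccupied site can only produce a history without detections.
    if_unoccupied = 0.0 if any(history) else 1.0
    return psi * if_occupied + (1.0 - psi) * if_unoccupied


def log_likelihood(histories, psi, p):
    return sum(math.log(site_likelihood(h, psi, p)) for h in histories)


def fit_grid(histories, steps=99):
    """Crude maximum-likelihood fit by grid search over (psi, p) in (0, 1)."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(
        ((psi, p) for psi in grid for p in grid),
        key=lambda params: log_likelihood(histories, *params),
    )


# Invented data: six sites, three visits each.
histories = [
    [1, 0, 1],
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]

naive_occupancy = sum(any(h) for h in histories) / len(histories)
psi_hat, p_hat = fit_grid(histories)
print(f"naive occupancy: {naive_occupancy:.2f}")
print(f"estimated psi: {psi_hat:.2f}, estimated p: {p_hat:.2f}")
```

The second term in `site_likelihood` is where imperfect detection enters: a site without detections may be unoccupied, or occupied but missed on every visit. This is why the estimated occupancy can exceed the naive proportion of sites with at least one detection.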
Using results from a recent questionnaire in Germany, we also characterize the different approaches that citizen scientists take to sample and report species observations. We use our findings to highlight examples of key metadata that are often missing in data sharing platforms but would greatly aid modelling attempts of heterogeneous species occurrence data.
An Editor’s perspectives on data archiving
In this talk, I will briefly summarize the history of the data repository Dryad and the early adoption of data archiving policies by evolutionary journals such as The American Naturalist.
I will then describe the changing community response to data archiving over the past decade, during which it has become increasingly clear that archiving has multiple tangible benefits, ranging from facilitating meta-analyses to detecting fraud. I will discuss the weaknesses of the present system, and options to rectify those weaknesses in the near future.
How should we share research data? Recommendations from the Norwegian committee on sharing and reuse of research data
In 2020, the Ministry of Education and Research asked the Research Council of Norway and UNIT to set up a committee to examine issues related to rights and licensing of research data. To ensure Open and FAIR data, a licence should be applied when making data publicly available. But how do you decide what licence to use? Who has the rights to publicly funded research data? Maybe more importantly: How do we enable sharing and reuse of research data, and support best practice? In this presentation, Ingrid Heggland, one of the members of the Norwegian committee, will highlight some of the recommendations in the final report.
LivingNorwayR – Creating a Darwin Core Standard compliant data archive (“a data package”) for your biodiversity data
Although data sharing is quickly becoming standard practice in ecology, the nature of typical workflows, funding and time constraints often result in it being the last thing on a long list of tasks to reach the end of a project. The workflow facilitated by our new LivingNorwayR package allows users to create a “data package” immediately after the data has been digitised and cleaned and to make use of this packaged data for analysis.
The package provides a workflow for creating a Darwin Core standard-compliant data archive. This facilitates FAIR (Findable, Accessible, Interoperable, Reusable; https://www.go-fair.org/fair-principles/) data sharing and uploading of Darwin Core archives (data packages) to repositories such as GBIF.
The LivingNorwayR package also provides tools for processing and manipulating the metadata associated with Darwin Core archives, and for importing and exporting metadata according to the EML (Ecological Metadata Language; https://eml.ecoinformatics.org/) standard. We will demonstrate some key functionality of the package and ask for feedback on which features researchers would find useful in future versions.
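To make the idea of a data package concrete, here is a minimal sketch of what a Darwin Core archive contains: a zip file with a tab-delimited core data table, a meta.xml descriptor that maps its columns to Darwin Core terms, and an eml.xml metadata document. The sketch is written in Python with the standard library only. It does not use or reproduce the LivingNorwayR API, and the function names, file contents and example record are assumptions for illustration. The EML document in particular is a bare placeholder that a repository would likely reject as incomplete.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

DWC = "http://rs.tdwg.org/dwc/terms/"
DWC_TEXT = "http://rs.tdwg.org/dwc/text/"
FIELDS = ["occurrenceID", "scientificName", "eventDate", "basisOfRecord"]

# Placeholder metadata only: a real EML document also needs creator, contact, etc.
EML_XML = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<eml:eml xmlns:eml="https://eml.ecoinformatics.org/eml-2.2.0" '
    'packageId="example-package" system="example">\n'
    "  <dataset>\n"
    "    <title>Example occurrence records</title>\n"
    "  </dataset>\n"
    "</eml:eml>\n"
)


def occurrence_table(rows):
    """Write the core data file: tab-delimited text with one header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def meta_xml():
    """Describe the core file and map each column to a Darwin Core term."""
    field_lines = "\n".join(
        f'    <field index="{i}" term="{DWC}{name}"/>' for i, name in enumerate(FIELDS)
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<archive xmlns="{DWC_TEXT}" metadata="eml.xml">\n'
        '  <core encoding="UTF-8" fieldsTerminatedBy="\\t" linesTerminatedBy="\\n" '
        f'ignoreHeaderLines="1" rowType="{DWC}Occurrence">\n'
        "    <files><location>occurrence.txt</location></files>\n"
        '    <id index="0"/>\n'
        f"{field_lines}\n"
        "  </core>\n"
        "</archive>\n"
    )


def build_archive(rows):
    """Bundle data table, descriptor and metadata into one zip archive (as bytes)."""
    payload = io.BytesIO()
    with zipfile.ZipFile(payload, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("occurrence.txt", occurrence_table(rows))
        archive.writestr("meta.xml", meta_xml())
        archive.writestr("eml.xml", EML_XML)
    return payload.getvalue()


def core_file(archive_bytes):
    """Read the archive back and return the core data file named in meta.xml."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        root = ET.fromstring(archive.read("meta.xml"))
    path = f"{{{DWC_TEXT}}}core/{{{DWC_TEXT}}}files/{{{DWC_TEXT}}}location"
    return root.find(path).text


# Invented example record.
rows = [
    {
        "occurrenceID": "example:occ:1",
        "scientificName": "Lagopus lagopus",
        "eventDate": "2021-08-15",
        "basisOfRecord": "HumanObservation",
    }
]
archive_bytes = build_archive(rows)
print(core_file(archive_bytes))
```

Because the descriptor travels with the data, any tool that understands meta.xml can work out which column holds which term without guessing from column names.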
Navigating the landscape of sequencing databases and concepts for machine readable data use restrictions
The landscape of databases for sequencing data, and the interplay between them, can be difficult to navigate. This presentation provides an overview of different databases, their potential relevance for ecological research, the standards they use, their interconnections and how curated services utilize them.
The second part provides a starting point for further discussion on how conditions for accessing sensitive ecological data can be made machine readable, building on earlier developments in health research.
What challenges does an ordinary researcher face in large integrative research projects?
Research funding is increasingly aimed at large integrative projects. Such projects use a variety of methods to acquire or generate data, which must be processed and structured before they can be used to address research questions. In addition, many institutions from different sectors are involved, each with its own traditions, strategies and policies for data storing and sharing.
In this presentation I will highlight two such research projects, the Vega moose project and Ecosystem Trøndelag. These projects involve a combination of “old-fashioned” ecological data and modern technologies for acquiring information at different organismal levels, but also collect non-ecological data that are important for addressing relevant research questions. I will point out some challenges that researchers face both in planning the project and in the daily work.
Standardizing inventory data capture, sharing and re-use for biodiversity modeling and assessment with the Humboldt extension to Darwin Core
Access to high-quality ecological data is pivotal to assessing and modelling biodiversity and its change through space and time. Inventory data (i.e., records of multiple species at a specific place and time) are particularly relevant for monitoring species distribution and potentially abundance, but their reliability for use in downstream models depends on reporting the methodology implemented, and the associated sampling effort and completeness. Inventory processes are often either unreported or described in an unstructured manner, greatly limiting the potential re-use of the data in larger-scale analyses. In order both to support the re-use of previous inventories and to assure better standardization of newly collected inventory data, we have developed a framework to standardize inventory data reporting that is general enough for broad use.
The Humboldt Core was introduced in Guralnick et al. (2018) as a proof of concept. In 2021 the TDWG Humboldt Core Task Group was established to review how best to integrate the terms proposed in the original publication with existing standards and implementation schemas. Through the ratification of Humboldt as a Darwin Core Event extension, we expect to provide the community with a usable solution, tied to well-established data publication mechanisms, for sharing and using inventory data. This effort promises to overcome a key bottleneck in the sharing of critically important ecological data, enhancing data discoverability, interoperability and re-use while lowering reporting burden and data and metadata heterogeneity.
Is the peer-reviewed literature a source or sink for open primary biodiversity data?
Openly sharing, accessing, and analysing primary biodiversity data is easier than ever, and an enormous amount of open biodiversity data is available to support research and conservation. But how much has the peer-reviewed literature shifted to reflect and support the growing normalization of open data?
We used the idea of data sources and sinks as a lens to consider the peer-reviewed literature in a systematic review. Publications that generate or perpetuate openly shared data can be considered sources of open data, and publications that report on unshared data or obscure the source of data so it cannot be easily reused can be considered data sinks.
We examined both successes and barriers to the integration of open data practices within the biodiversity literature, and we identified some trends that indicate promising directions for further integration of open data practices into the peer-reviewed literature.
Recent paper: Mandeville, Caitlin; Koch, Wouter; Nilsen, Erlend Birkeland; Finstad, Anders Gravbrøt. (2021) Open Data Practices among Users of Primary Biodiversity Data. BioScience.
Applying and promoting open science in ecology – surveyed drivers and challenges
Open Science (OS) comprises a variety of practices and principles that are intended for improving research, and the concept is gaining traction. Since OS has multiple facets and still lacks a unifying definition, it may be interpreted quite differently among practitioners. Moreover, successfully implementing OS on a wide scale requires a better understanding of the conditions that facilitate or hinder OS engagement, and in particular, how practitioners learn OS in the first place.
We addressed these themes by surveying OS practitioners who attended a workshop hosted by the Living Norway Ecological Data Network in 2020. The survey contained scaled-response and open-ended questions, allowing for a mixed-methods approach. Of 128 registered participants, 60 responded to the survey. Responses indicated that usage and sharing of data and code, as well as open access publication, were the OS aspects most frequently engaged with. Men and those affiliated with academic institutions reported more frequent engagement than women and those with other affiliations.
When it came to learning OS practices, only a minority of respondents reported having encountered OS in their own formal education. Consistent with this, a majority of respondents viewed OS as less important in their teaching than in their research and supervision. Even so, many of the respondents’ suggestions for what would help or hinder individual OS engagement included more knowledge, guidelines, resource availability and social and structural support, indicating that formal instruction can facilitate individual OS engagement. Taken together, these results may be indicative of tendencies in the wider OS movement. We suggest that incorporating OS in teaching and learning can yield substantial benefits to OS practitioners, student learning, and ultimately, the objectives advanced by the OS movement.
Recent paper: Halbritter, A.H., Telford, R.J., Enquist, B.J., Vandvik, V. et al. (2021). Next generation field courses: integrating Open Science and online learning. Ecology and Evolution 11, 3577–3587.
Novel tools for reproducible and transparent workflows in ecology and evolution
Workshop on day 2 of the colloquium
Disseminate your data: How to make your data human-readable
Workshop on day 2.