Research Course on Open, Reproducible and Transparent Science in Ecology

Manage, produce, use, and reuse data with reproducible workflows

Universities, journals, and funding bodies increasingly demand open and reproducible research practices across the scientific community. Research data needs to be FAIR (findable, accessible, interoperable, and reusable), workflows need to be reproducible, and science needs to be transparent. This aims to improve the efficiency and quality of research and thus to increase the credibility of science. 

Do you want to gain new skills in data management, making your data FAIR, and your workflows reproducible? Do you want training in open-source tools such as GitHub and R that can be used to achieve this?

We offer hands-on training on methods and technologies to make research more open, reproducible and transparent. The course is centred on the data life cycle: planning, managing, collecting, curating, analysing, publishing, storing, sharing and reusing data. It is aimed at PhD students (who will be prioritised) as well as early-career researchers in ecology who produce their own data (i.e. collect data in the field or lab), use data from others (e.g. from databases), or both.

The course will be held at Hjerkinn Vandrerhjem Dovrefjell from 13 to 18 November 2023. Students will be introduced to open science, data management, data repositories/databases, data standards (e.g. FAIR, CARE), best practice and reproducible workflows (e.g. data curation, analysis, GitHub, reporting results). We will invite experts to give lectures on these topics and provide hands-on training that gives the students opportunities to practise new skills. We will provide examples, but also encourage the students to bring their own data and problems to work on.

The course fee (including accommodation) is 7450 NOK. The course participants have to organise and pay for the travel to Hjerkinn themselves.

Application deadline: 3 September 2023

The course is worth 2.5 ECTS credits. Sufficient preparation, active participation, and completion of the exercise are expected. A certificate for participating in the course will be issued by the University of Bergen.

To apply, please send a short description (¼ page) of your research and how this course fits into your career plan. Send your application to Aud Halbritter (aud.halbritter@uib.no) and please mark the subject with APPLICATION OS Course 2023 and your name.

For more information, including details of previous courses, see:

https://open-science-course.github.io/course_website/

The course is organised by Living Norway in collaboration with the University of Bergen (UiB), Nord University, Norwegian Institute for Nature Research (NINA), Norwegian University of Science and Technology (NTNU), Norwegian University of Life Sciences (NMBU), University of Oslo (UiO) and the Norwegian GBIF node.

Living Norway Colloquium 2023

Open biodiversity data for a better world

Welcome to this year's colloquium!

The Colloquium theme is Open biodiversity data for a better world. See the Colloquium website for additional information.

As in past years, the Colloquium aims to shine a spotlight on and create dialogue about the opportunities offered by open ecological data. Day 1 (23 May) will feature a lineup of invited speakers showcasing opportunities opened up by open ecological data. Day 2 (24 May) will feature two hands-on workshops, the first focusing on skills in accessing open ecological data, and the second exploring current efforts to map Norwegian biodiversity.

Learn more, view the event program, and register (cost-free) here: https://livingnorway.no/colloquium/living-norway-colloquium-2023/ 

We kindly ask you to confirm your participation via the online registration form. You are also welcome to forward the invitation to everyone interested.

We look forward to welcoming you to this exciting and inspiring event!

Living Norway data portal is launched

At Living Norway Ecological Data Network, we are working to facilitate open, reproducible and transparent research. How we manage and share data is an important aspect of this activity. We are happy to announce that we have now launched our brand new data portal.

Since we started our network back in 2019, a central goal has been to build tools and data infrastructure to help researchers manage their data in a FAIR yet efficient manner. Along the way, we have built tools such as the LivingNorwayR package for R, and an app built in R Shiny that can be used to visualize the content of a Darwin Core Archive data package. Now, we are happy to announce that we have also launched our own data portal. Please have a look at it, explore the data, and provide us with any feedback you might have.

Living Norway is working closely with GBIF, including the Norwegian GBIF node, and we use the GBIF infrastructure as a foundation: all data that you can find in the data portal are published using the GBIF publishing tools and registered with GBIF. You can read more about the data flow model here. We find this approach very rewarding, and it allows us to focus our efforts on building additional tools rather than duplicating the great job that is already done by GBIF and other organizations.

One goal of our data portal is to give data providers visibility for their published data sets. We also hope the portal will help researchers and other data users find relevant data and judge to what extent the data are useful for their use case. Once you have found relevant datasets, you can use our LivingNorwayR package to import data directly into R. As you will see, each dataset is accompanied by an R code chunk that can be used to import exactly that dataset. You can read more about how to access and use the data here. We will develop richer functionality in the future, but if you have specific ideas for functionality that would be useful, please let us know. You can, for example, post them as an issue on the portal GitHub repository.

Thanks to GBIF for hosting our portal!

When we were in the process of screening options for a dedicated data portal for our network, we were happy to learn that GBIF was just about to launch a hosted portal program (read more about this here). This is a service where GBIF offers to host a portal for networks and organizations that need their own data portal. The portals are then developed in collaboration between the GBIF Secretariat in Copenhagen and the respective organizations. Living Norway would like to take this opportunity to thank the GBIF Secretariat and the Norwegian GBIF node for fantastic help and support all the way.

Visit our data portal here:

New research addresses the role of the peer-reviewed literature in open biodiversity data

Author: Caitlin Mandeville

In a recent article published in the journal BioScience, we aimed to dig into one big question: Is the peer-reviewed literature a source or sink for open biodiversity data?

As authors of the study, we were motivated by the importance of open data stewardship. Open biodiversity data are increasingly mainstream; a quick glance through the Global Biodiversity Information Facility (GBIF) makes it clear that the digital world is teeming with biodiversity data. Historical data from museums, herbaria, and file drawers are given new life in digital repositories, and new data—especially from citizen science—are continually added to the global collection of open biodiversity data.

These open data are a powerful resource for biodiversity science and conservation. The reusability of digital data opens new doors for data synthesis, meta-analysis, and reproducibility, and the inventory of existing biodiversity data can be used to prioritize the collection of new data that will have the biggest impact.

The growing volume of open data suggests that more researchers than ever are sharing their data. But to really measure progress in data sharing, we also need to understand how often data are held back from being shared. We thought that a better understanding of unshared biodiversity data could shed light on the barriers that keep them behind closed doors, so we set out to see what we could learn about biodiversity data that are reported in the literature but not shared openly.

The first challenge was to develop a way to survey instances where data could be shared, but aren’t. It’s a twist on an age-old challenge in biodiversity science: it’s easier to demonstrate the presence of a process (in this case, data sharing) than its absence. We took on this challenge with a broad review of the peer-reviewed literature, systematically searching through the Web of Science to find all articles that shared one common characteristic: all of them contained reports of species occurrence, which are among the simplest, most unstructured types of biodiversity data (and arguably the easiest to share). Our search yielded thousands of articles spanning much of the earth’s taxonomic diversity. We developed a systematic review process (inspired by the excellent PRISMA protocol), rolled up our sleeves, and got to work reading articles and collecting data.

The diversity of biodiversity data

It turned out that 42% of our collected articles relied on data from an open database. Another testament to the impact of open data! But we were surprised to find that 81% of the articles used at least some data from a source other than an open database. Of these, about 40% reported new data collected by the study authors. Others reported previously unpublished data, data from government agencies and private organizations, and data gathered from other published sources. Roughly a third of the articles attributed some of their data to citizen science.

Figure shows the sources directly accessed by the authors of articles that we reviewed.

The articles included a great variety of analysis approaches. Species distribution modeling was very common, but so were various descriptive statistics, species richness studies, and a range of other inferential approaches. In the many applications of biodiversity data, we found evidence that good data stewardship supports creativity and innovation—studies that reported metadata about data structure, as well as those that integrated unstructured data with different data types, were more likely to take on uncommon, innovative analysis approaches.

So already, one fundamental question was answered. Unstructured biodiversity data in the literature are not only derived from sources that are already openly accessible—to the contrary, new unstructured biodiversity data are being collected, reported, and analyzed in the literature all the time.

The peer-reviewed literature: source or sink for open data?

The ultimate question we wanted to address was: what happens to these data after they’re reported in the peer-reviewed literature? Do authors who report newly collected data go on to make them openly available—in other words, is the literature a source of open data? Or does it act as a sink, where data are reported once but then locked away from future reuse?

We found that most newly generated biodiversity data reported in the literature are not openly shared. This was true even when the data users were familiar with open data—authors who integrated data from open sources with other data were no more likely to share their new data than authors who did not use any open data. Clearly, there are still significant barriers keeping many data collectors from sharing their data.

Figure shows the proportion of articles using data accessed from each direct source that go on to openly share their data. Grey bars indicate articles that integrated data from multiple sources, which were not accounted for in this figure.

Solutions to these barriers will vary depending on characteristics of the unshared data, including source, structure, and ownership. For example, authors of some articles in our review compiled historical data from dozens or even hundreds of non-digitized sources. Variation in data ownership and structure may make it difficult for these authors to share their compiled data, so solutions in these cases will probably involve the efforts of institutions, including continued data digitization and provision of DOIs to facilitate data citation.

In other cases, barriers might be more straightforward for individual researchers to overcome. Original data collected by study authors are the most straightforward to share, but just 27% of authors in our review who had collected original biodiversity data shared those data after publication in the literature. Encouragingly, the sharing rate was about twice as high for original data from citizen science.

Looking at the 42% of reviewed articles that obtained data from open sources, we saw that practices for engaging with open data were similarly variable. Articles in our study accessed data from 117 different open data aggregators, ranging from well-known data aggregators like GBIF to small databases with a narrow geographic or taxonomic scope. Studies show that the digital infrastructure supporting small databases is likely to become obsolete over time, so our finding that these small databases underlie so much research points to the importance of finding ways to preserve these data in the long term.

When open data were used in the literature, they were rarely cited in a way that would allow a reader to replicate the dataset. The practice of clearly citing data with a DOI is a small area for improvement that will make a big difference for reproducibility.

Communities of data users leading the way to data sharing

Despite many remaining hurdles to overcome, our results illustrate that data sharing is on the rise. The proportion of reviewed articles that share biodiversity data has increased over the last few years. And even more optimistically, our review of the literature led us to many recent articles unequivocally showing that perceptions of open data sharing are largely positive.

Figure shows the changing rates of data sharing in our dataset over time.

These trends suggest that, if practical barriers can be overcome, many more researchers will be ready to join the growing movement towards open data sharing. There are many existing resources that can help with this. One thing that especially stood out from the papers we reviewed was the importance of community. Because the barriers to sharing data are often localized and specific, the solutions often must be too. Communities of practice focused on supporting data sharing within subdisciplines or geographic regions can support the growth of community norms around open data and address specific barriers to data sharing.

We ended our article by pointing readers to some resources that we have found helpful for those just getting started with sharing biodiversity data. If that’s you, we encourage you to take a look! The shift to a culture of open data sharing relies on both institutional change and individual actions, and every small step can make a difference.

Read the study

Open Data Practices among Users of Primary Biodiversity Data. 2021. Caitlin P Mandeville, Wouter Koch, Erlend B Nilsen, Anders G Finstad. BioScience 71:11. https://doi.org/10.1093/biosci/biab072

Join our “Open Science Lab”

 

Join the Living Norway Ecological Data Network through our “open science lab”. The aim of the lab is to reach out to students and researchers in ecology, conservation science and evolution who share an interest in open, transparent and reproducible science. Sharing and reusing research data will be a particular focus area, but the lab will be a place for discussions about all kinds of aspects of open science practices.

Establishing a network of people

The Living Norway Ecological Data Network promotes management and sharing of ecological data based on best principles. Open sharing of (ecological) data is embedded in a deeper understanding of the value of open science. Therefore, we also explore and promote these wider open science opportunities (and challenges) for our research community and society at large.

We now launch our “Open Science Lab” to bring our activities closer to the wider research community in ecology, conservation and evolution. The “Open Science Lab” is an initiative to connect people across career stages and institutions, in ecology and related disciplines, and in Norway and abroad. Members can participate in monthly webinars, join our discussion forums, learn from others, and contribute to developing and improving tools and procedures. The input from the research community will be immensely important for how Living Norway develops in the future.

The Open Science lab is launching

The open science lab connects people who share a common interest and passion for open, transparent, and reproducible ecological research centered around open and FAIR data. There are many unresolved questions and challenges (including data infrastructures and software, use of standards for ecological data, etc.), meaning there are also ample opportunities to make a real contribution if that is what you want. However, you are just as welcome to join the lab if you are just starting out on your journey towards more open science, and would simply like to learn more about FAIR data management and transparent and reproducible research practices.

What do you gain by joining our open science lab?

  • Access to monthly webinars & lab meetings
  • Access to networking across institutions and career stages
  • Possibility to contribute to code and software development, testing & review – for improved management of ecological data
  • Updated know-how and best practices guidelines for open science practices, reproducible workflows, and FAIR data management in ecology and beyond
  • Get help and advice about how to publish your ecological data (using the Darwin Core standard)
  • Possibility to contribute to or even guest-edit the Living Norway blog
  • Suggest open science activities and hackathons, and get support from Living Norway to arrange those 
  • Contribute to and join discussions about the wider work of Living Norway Ecological Data Network

Is the open science lab something for you?

We certainly think so! The only expectation from you is that you have an interest in open, transparent and reproducible ecological research. Everybody across career stages, institutions, and backgrounds is welcome, and joining and participating are entirely free. It is completely up to you how much you want to invest in the lab – we appreciate your attendance in any case. If this sounds interesting, sign up and join us for this webinar series starting March 3rd.

To join our open science lab you can register by following this link: Registration. Further information about the open science lab can be found here.

Join our Open Science Lab

Note: Our open science lab is open to anyone. Joining does not require your institution to sign Living Norway’s network agreement, and you do not need to be associated with a Norwegian institution to join us.

Invitation to workshop – Managing and publishing data using LivingNorwayR

Living Norway Ecological Data Network will host a workshop on December 7th – 8th, focusing on the management and publishing of ecological data using R combined with other tools.

We hereby invite you to participate in our workshop about data management and data publishing using R and other tools. In the workshop, we will present how you can use our newly developed package LivingNorwayR (for the statistical software R) to manage ecological data, map data into the Darwin Core standard, document your data with rich metadata, and zip it all together into a Darwin Core Archive that can be archived or published with e.g. Living Norway and GBIF.

During the workshop, we will first present a general introduction to data management, data standardisation, data documentation, data archiving and data publishing. Then we will present the main functionality of the LivingNorwayR package. It will be possible to follow these lectures online. Then we will have a hands-on workshop session, where workshop participants will work on a set of pre-defined exercises. Finally, workshop participants are invited to bring their own data sets to work on, with the goal of producing a complete Darwin Core Archive. The workshop organisers will be available to provide supervision during this session. This data package can then be stored locally on your computer, or it can be published and registered with Living Norway and GBIF using a specific tool provided by GBIF. The exercises will be made available to all participants, but we will only be able to provide practical help to participants attending in person.
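To make the end product of that session concrete, here is a minimal sketch of what a Darwin Core Archive contains: a data table whose columns are Darwin Core terms, a meta.xml file describing the table layout, and a metadata document, all zipped together. It is written in plain Python with the standard library only and does not use or imitate the LivingNorwayR API. The two records, the file names and the function names are invented for illustration, and the eml.xml content is a placeholder rather than a valid EML document.

```python
import csv
import io
import zipfile

DWC = "http://rs.tdwg.org/dwc/terms/"  # namespace of the Darwin Core terms

# Column names are Darwin Core terms; the two records are invented examples.
FIELDS = ["occurrenceID", "scientificName", "eventDate",
          "decimalLatitude", "decimalLongitude", "basisOfRecord"]
RECORDS = [
    ["obs-001", "Lagopus lagopus", "2021-08-14", "62.22", "9.54", "HumanObservation"],
    ["obs-002", "Rangifer tarandus", "2021-08-15", "62.25", "9.60", "HumanObservation"],
]

# Placeholder only: a publishable archive needs a complete EML metadata document.
EML_STUB = "<eml><dataset><title>Example dataset (placeholder)</title></dataset></eml>\n"


def occurrence_table(records):
    """Write the records as a tab-separated table with one header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(FIELDS)
    writer.writerows(records)
    return buffer.getvalue()


def meta_xml():
    """Describe the table layout: which column holds which Darwin Core term."""
    fields = "\n".join(
        f'    <field index="{i}" term="{DWC}{name}"/>' for i, name in enumerate(FIELDS)
    )
    return (
        '<archive xmlns="http://rs.tdwg.org/dwc/text/" metadata="eml.xml">\n'
        '  <core encoding="UTF-8" fieldsTerminatedBy="\\t" linesTerminatedBy="\\n"\n'
        f'        ignoreHeaderLines="1" rowType="{DWC}Occurrence">\n'
        "    <files><location>occurrence.txt</location></files>\n"
        '    <id index="0"/>\n'
        f"{fields}\n"
        "  </core>\n"
        "</archive>\n"
    )


def build_archive(records):
    """Zip the data table, its descriptor and the metadata into one archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("occurrence.txt", occurrence_table(records))
        archive.writestr("meta.xml", meta_xml())
        archive.writestr("eml.xml", EML_STUB)
    return buffer.getvalue()
```

Writing the bytes returned by build_archive(RECORDS) to a .zip file gives a package with the same three-part layout. In practice, most of the work lies in mapping your own columns to the right terms and writing the metadata, which is what the workshop tools are meant to help with.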

You do not need to be an expert in data standardisation or data management to participate in the workshop, but familiarity with R is preferable if you want to take part in the practical exercises.

Tentative program:

  
December 7th
10:30 – 12:00  Introductory lectures (available for online participants)
12:00 – 12:45  Lunch
12:45 – 13:15  Introduction to exercises (available for online participants)
13:15 – 14:00  Exercises
14:00 – 15:00  Start to work on participants’ data

December 8th
9:00 – 11:00  Work on participants’ data
11:00 – 11:30  Short recap
11:30 –  Lunch

To register for the workshop, follow this link. Deadline for registration is December 1st.

Getting ready for the 3rd Living Norway Colloquium

We are getting ready for the 3rd Living Norway Colloquium – and hope you are also getting ready to join us! The theme for this year’s conference is “The ethics and technical know-how of open science in ecology and evolution” – a theme we are sure will be relevant across the ecological research community.

On October 25th – 26th, we invite fellow ecologists and others who are interested in open science, open data and research ethics to join our annual colloquium. Following on from last year’s event, we have decided that this event will also be hybrid. You can therefore either join the event “live” at NINA-huset in Trondheim, or online from anywhere in the world. Last year, our colloquium attracted participants from more than 20 countries, and we hope this year’s event will attract a comparable number of participants.

There is no fee to attend the event – and should you decide to join us live in Trondheim, you even get free lunch! This has been made possible through event support from the Research Council of Norway and in-kind contribution from NINA – and we take the opportunity to thank both for their kind support.

More information about the colloquium, including detailed program and link to the registration form can be found here.

Transforming science education to meet the needs of today’s students and tomorrow’s science

Elizabeth Law (NINA), Vigdis Vandvik (UiB), Matt Grainger (NINA), Erlend B. Nilsen (NINA)

As open science and reproducible research practices are becoming mainstream across the scientific community, we are becoming increasingly aware that this ‘FAIR open revolution’ in how science is planned, conducted, reported, communicated, and assessed must also transform the way we teach and learn science. At the Living Norway 2020 ‘FAIR open education’ workshop, we shared experiences and plans, were presented with some interesting and inspiring case studies, and discussed opportunities and ways forward. We are working towards a publication of the workshop outcomes, but in the meantime, here are some of the main take-home messages.

LivingNorway recently partnered with NINA, UiB bioCEED, and 128 registered participants from 23 countries (as part of the LivingNorway 2020 colloquium) to workshop how open science could and should change how we teach ecology. A recording of the talks is freely available, and the slides are being collated in a Zenodo community. Much gratitude to the organising team (alphabetical order: Sehoya Cotner, Dagmar Egelkraut, Aud Halbritter, Anette Havmo, Kelly Lane, Elizabeth Law, Chloe Nater, Erlend Nilsen, Christian Strømme, Richard Telford, Vigdis Vandvik), and the amazing invited speakers (with links to their slides: Hannah Fraser, Rob Salguero-Gómez, Aud Halbritter and Tanya Strydom, Luis Verde Arregoitia, and Vigdis Vandvik’s opening). While we are working towards writing a manuscript that goes into more detail based on the survey sent out to the participants, here is a summary of the day’s presentations and discussions.

A wordcloud representing initial thoughts on open science and education, developed from the first discussion groups using the R package InteractiveWordcloud (available on GitHub).

Open Science is rapidly and dynamically transforming how science is done 

Open science is transforming how we think about, do, and communicate science. No longer constrained to discussion of free and open access to read research articles (open access) and to download and use data (open data, e.g. LivingNorway and GBIF), the emerging open science landscape includes openness in all stages of research, from openness about research planning (e.g., pre-registered reports) via methods (open protocols) to data (open data, FAIR data practices) and analyses (open code, e.g. in R and on GitHub) to research outputs (open access, open peer review, open research synthesis). Associated with this development are new platforms for sharing and participating in all these different aspects of open science (e.g. OSF). As this landscape has evolved, the classical view of ‘access’ as the main benefit of open science has broadened into a recognition that openness in science is key to promoting quality, reproducibility, efficiency, and broad sharing of research both within and beyond the scientific community.

The Living Norway Open Educational Workshop emerged from a realization that this ongoing transformation of how science operates should have profound implications for how science should be taught and learnt, as Vigdis Vandvik emphasised in the opening of the workshop. Students need to learn these Open Science skills – and therefore we have a responsibility to teach them the principles and practice. Not only are these skills becoming required for best practice and ethical research, but they also are increasingly essential to gaining funding and publications, building meaningful networks, answering emerging large-scale and integrative questions, and developing careers both inside and outside scientific research (these skills are also highly transferable to professional work environments). We need to do our best in transforming science education to meet the needs of today’s students and tomorrow’s science.

Opening the opportunity for next-level science

As a founder and coordinator of the COMADRE/COMPADRE matrix population model global database (currently containing 1177 species and 11231 models), Roberto Salguero-Gómez (Oxford University; slides here) knows a fair bit about the challenges but also the great benefits of being able to synthesise knowledge across species and regions. These databases have led to important new insights, such as uncovering worldwide patterns in plant demography, developing demographic theory, as well as forecasting impacts of global change (see the complete list of papers here). But getting to this point has involved years of hard work with the data, for example developing appropriate metadata, standardising protocols, and ensuring accessibility and credit to contributors.

Open Science brings many new opportunities for teaching ecology. For example, Rob and his team have focused on making their data accessible through workshops and teaching material, allowing students a fast track to use open data and code developed for the databases (e.g. in the form of R packages for COMADRE/COMPADRE) and learn the concepts of matrix demography through open education materials. More generally, there is increasing availability of free and engaging online resources, particularly for complex topics, such as those available for learning R or for learning complex math (like eigen-stuff).

Opening doors for next-generation scientists

Open science offers learning opportunities beyond classical educational settings. Students can learn open science by integrating their classes into real science workflows. For example, Aud Halbritter (UiB) and Tanya Strydom (Université de Montréal) gave a coordinator and student perspective (respectively) on how Open Science integrates into the UiB Plant Functional Traits course. This started as a necessity: improving workflows to enhance the quality of the data being collected (via standardised measurement protocols), but quickly evolved to also include students in best practice data processing, management and publication. Over the last five years, the course has blossomed from being ‘just another field ecology course’ to one where the students collect and manage real data which are shared openly through real scientific data publications, and later used in real science (e.g. papers listed here). Through their participation, students learn both the principles and practice of open and reproducible research (e.g. standardised protocols, best practice data management, use and contribution to open data, use and development of open code with R and GitHub). Collaborative and cross-cultural communication skills emerged as important added learning outcomes from the real research participation in the courses.

Open Science expanding the student experience in the UiB Plant Functional Traits course –  CC-BY Strydom, Tanya, & Halbritter, Aud H. (2020, October). Taking FAIR and open science to the field. The evolution of the PFTC field course. Zenodo. http://doi.org/10.5281/zenodo.4117504 (slide 2)

Dynamic change and challenges

These Open Science skills, technologies, resources, and practices are dynamic, however, leaving us with a moving target. How do we as educators keep up with the times without overloading ourselves (or our students), and what will be relevant in the future?

An example of this question came through the discussions, highlighting how R is often both the solution and the problem. While we commonly document code and workflows (aiming for transparency and repeatability) through R scripts and packages, the general lexicon evolves and branches from base R to tidyverse. What to teach first? Both dialects are ostensibly required, and while some prefer base because it is fundamental and used by most other packages, others find tidyverse a more intuitive introduction. Also, many packages (particularly with the ongoing development of tidyverse) get periodically updated, and these updates can often lead to code breaking or becoming “buggy” over time. Luckily, more open-source software comes to the rescue, such as packrat, renv or conda. But a balance needs to be found: participants agreed that Docker is possibly ‘best practice’, but it is currently more challenging to use (although see a friendlier introduction here) and possibly overkill in many contexts; as one commenter put it, “using a sledgehammer to hammer a nail”.
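The core idea behind these tools can be shown in a few lines: record the exact package versions an analysis was run with, and later check whether the current environment has drifted from that record. The sketch below illustrates only this idea, in Python rather than R. It is not how renv, packrat or conda are implemented, and those tools do considerably more, such as restoring the recorded versions. The function names snapshot and find_drift are made up for this example.

```python
from importlib import metadata


def snapshot(packages):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def find_drift(locked, current):
    """List packages whose current version differs from the recorded one.

    Returns {package name: (recorded version, current version)}.
    """
    return {
        name: (locked[name], current.get(name))
        for name in sorted(locked)
        if current.get(name) != locked[name]
    }
```

A typical use would be to store the result of snapshot(...) next to the analysis code when the results are produced, and to run find_drift(stored, snapshot(...)) before re-running the analysis. An empty result means the recorded versions still match.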

Did we lose you there with all that discussion of R tools many of us have never heard of? This is something we need to address: acknowledging the risk that Open Science simply preaches to the converted, alienates those who are not, and widens the existing equity gap in student experience rather than narrowing it. While Open Science has the core values of reproducibility, accountability, and FAIRness, it also “inherits many systematic barriers that already exist in mainstream science”. This elephant in the room might need to be a question for future LivingNorway colloquiums. For now, let’s return to the Open Science values of reproducibility and accountability.

Cherry-picking: the lowest-hanging fruit

The first step in change is recognising there is a problem. For example, the UiB Plant Functional Traits course recognised they had an issue with poor-quality data that was preventing them from doing quality science. But these problems are far from unique.

Hannah Fraser (University of Melbourne) (slides from her talk) revealed how common questionable research practices are in ecology and evolution: at levels similar to those causing alarm in education and psychology (also here). Of these, she focused on cherry-picking, which is often unintentional: a result of the field-ecology culture of going out and measuring a bunch of different variables, analysing them in many different ways, and selectively reporting only the ‘best’ results. Hannah also highlighted the reproducibility crisis (very few studies are repeated, and when they are, different answers are often found). She discussed two emerging solutions: pre-registration to address cherry-picking, and repeat studies to address the issues and implications of reproducibility.

In terms of cherry-picking, pre-registration – a statement of the analysis intention and hypotheses prior to collecting or collating the data – is arguably the lowest-hanging fruit. We already do this in the form of project proposals, but there is a lack of impetus and culture to be stringent about revisiting these when writing up for publication, or to make them publicly available. In the case of student theses/dissertations, Hannah points out there is even the cultural expectation that projects will change. And this is reasonable, Hannah notes: plans can be updated. We just need to change the culture to make it clear when deviations occur, and thereby distinguish exploratory from confirmatory research. Several participants in the workshop agreed that there are many benefits to improved pre-registration, particularly for graduate students, including clarifying research questions and hypotheses and spreading the workload of writing across the project. Going one step further, developing analysis scripts prior to getting the data can be really useful in terms of focusing the analysis on getting the methods right without getting lost trying to get the ‘right’ results (and is ‘best practice’ science). All of this can help keep students on task, and on track.

The infrastructure for pre-registration is there, so what is stopping us? Do we fear that commitment to a predefined hypothesis may preclude explorations of interesting unexpected results? This fear is unfounded, as pre-registration does not preclude exploratory analyses (see above). If we fear being “scooped”, this fear is also unreasonable, as pre-registration actually does the opposite: it establishes precedence, a foot in the door even before our results are in and ready. Do we fear the possibility of negative results impacting publishability? Again, pre-registration may actually provide a solution: many journals now accept or encourage pre-registrations, typically with a commitment to publish from the journal’s side (Hannah mentions Ecology and Evolution, and Conservation Biology as examples). Also, pre-registration gives us valuable access to peer-reviewers’ comments at a stage in the research when changes can still be made to plans and protocols. Or do we fear being wrong, and fear this could damage our reputations as scientists? We need to rise above this: a negative or contradictory result is rarely ‘wrong’. Indeed, a study suggests that, in science, “admitting wrongness … is less harmful to one’s reputation than not admitting”.

A glimpse into the workshop, collated from workshop output, the LivingNorway twitter stream, and photos by Vigdis Vandvik.

A cumulative practice

Which brings us to replication studies because, to be fair, that study on the value of admitting wrongness was in the context of replication studies. But it should extend to any scientific research endeavour, because most of the experiments that we do are indeed quasi-replications (testing an existing hypothesis in a novel context), since science is, after all, a cumulative practice. Yet amid the overwhelming emphasis on ‘novelty’ and ‘innovation’ pushed by funders, publishers, and ourselves, we are apparently extremely loath to admit this, with only 0.023% of studies self-identifying as replications (typically those that seek to exactly replicate prior work). In ecology, exact replications may be impossible in all but some rather extreme cases, but conceptual replications (aiming to replicate prior work as closely as possible) are more likely to be possible.

Why? Perhaps with our cultural predilection for novelty, our entrancement with the complexity of ecology, and our belief that our study subject is so unique and special, the thought of repeating work is horrifying. But considering the natural variability in our natural world, and the massive rapid changes our environment is facing, replication studies – including exact, conceptual, and quasi-replications – are essential to confirm knowledge and develop theory. It is exactly this lack of self-identified replication studies that should horrify us.

Replication studies, particularly in undergraduate and postgraduate contexts, can be useful teaching tools as well as important contributions to knowledge. For example, see the open Collaborative Replications and Education Project. Given constraints on time and funding, these provide blueprints, but also data to analyse if none can be collected in the appropriate frame. Students will also learn the value of standards, transparency, and metadata for ensuring repeatability (essential for replication). Replication studies, or even ‘just’ repeat analyses of existing data, can lead to highly cited and citable papers (including finding simple errors in the work leading to seminal theories). Explicit acknowledgement of how new work repeats (or else distinguishes itself from) past work is also key to avoiding research waste. It is clear we need to change our perception of replication studies as ‘boring’ or difficult to publish, and to value knowledge over novelty.

Start small, simple, salient, and supportive

“As the technical barriers to data sharing continue to fall, we face a more intimate, and perhaps more complicated, obstacle to open data – the one in our minds.”

Overall, within the workshop there was general agreement that starting small, simple, and salient is helpful, because often the most challenging and intimidating part is simply to start. Pre-registration and repeat studies are two really great examples of how we can potentially affect big and meaningful change through relatively small changes in our perceptions and practices, particularly in the context of teaching. But they also highlight how it is typically our (mostly unfounded, but still felt) fears that may hold us back.

For the student, the teaching environment gives us a fantastic opportunity to help quell those fears. For example, Tanya Strydom suggested that while committing code to GitHub can feel scary, doing this in a safe environment, with support from her teachers, made it more approachable. This is all well and good for the proficient student, but what about those with less coding experience? And how about the teacher? Integrating Open Science can see many of us feeling out of our depth. Here Aud Halbritter emphasised starting incrementally, and focussing on the changes that will give the most benefit (and indeed, focussing on the benefits). With the recognition that not all students are necessarily so comfortable with coding and technology, the course coordinators chose to make many of the more ‘advanced’ Open Science aspects of the course optional extras. This allows all students to at least be aware of the possibilities, and is a great option for both teachers and students to learn together (though perhaps requiring some adaptability and humility of the teacher).

Luis Verde Arregoitia presented another approach to starting small, in safe, supportive environments: blogging. Blogging, he argues, is a really approachable way to start interacting with Open Science. This is true in the case of both learners and teachers. Luis started blogging as a way to help fellow students, inspired by this advice, but his blog is now an access point for many opportunities in teaching, learning, and collaborating. Blogs can include several elements to make them effective for teaching, including being self-contained interactive exercises, well structured, and engaging. But blogs can take many forms, from formal tutorials to a more mutual learning experience (e.g. ‘today I learned’), and it is the informal tone that makes them so approachable when learning. Another participant agrees, “blogs can teach you that you are not alone in your problems”. And you don’t have to get them perfect the first go: being editable, they are a ‘low commitment’ way of starting a journey to open science.

So no excuses! Go forth and (teach and learn) open science!

Getting ready for the second Living Norway colloquium!

In a couple of days, the Living Norway Colloquium 2020 will take place as a hybrid event. With close to 150 participants from more than 15 different countries spread across several time zones, we are a bit overwhelmed by the interest in the conference. But this is surely a luxury that we should be able to handle!

The main reason for the great interest is obviously the many excellent presenters that we have managed to fit into the program on both days. The updated colloquium program can be found here. On day 1 (October 12th), we will have a series of lectures on key topics related to open and reproducible science for the 21st century. On day 2 (October 13th), we will organize two workshops that both will include group discussions and other group tasks. We hope all participants are ready to join in on the discussions!

Although the conference is just a few days ahead of us, it is still possible to sign up to attend the conference online (unfortunately, we cannot host more people in-house in Trondheim). You can find the link at the conference web page here. If you just want to view the presentations without attending the conference and joining the discussions, you can do so via the livestream embedded in the conference web page (here).

We are looking forward to meeting you all early next week – be it in person in Trondheim (Norway) or online from anywhere in the world!

Living Norway Colloquium 2020

Publishing complex ecological data using the Darwin Core standard

Wouldn’t it be nice if we could share our ecological data using a common format, in a common place, freely available for everyone? In his blog post on Living Norway’s technical blog site, researcher Jens Åström from the Norwegian Institute for Nature Research (NINA) discusses how you can use the Darwin Core standard to publish complex ecological data.

Making your data publicly available is quickly becoming a standard task for researchers. It is increasingly demanded by journals when publishing your research findings, or even by funding agencies when applying for grants. Journals have traditionally accepted data in file format, which can be reached through their websites along with the paper. Wouldn’t it be nice if we could store our ecological data using a common format, in a common place, freely available for everyone?

Photo: Jens Åström

In his blog post, researcher Jens Åström from the Norwegian Institute for Nature Research (NINA) discusses how he formatted and published a multi-year observation data set of ~80 species with a hierarchical survey scheme, while incorporating all collected environmental covariates and metadata, into GBIF. The data set is similar in structure to many other data sets that typically arise from ecological monitoring and research programs. Read the blog post here.

Publishing complex ecological data