Overview

How to start practicing open data science!

Open data science means that methods, data, and code are available so others can access, reuse, and build on them without much fuss. We use a variety of programs, tools, and practices to do reproducible research.

Our workflow depends on R, RStudio, RMarkdown, and Git/GitHub. These resources will get you started.

Introduction to Open Data Science is a hands-on training book that introduces the tools, practices, and workflows that underpin our work (Lowndes et al. 2017).

The learning hub at NCEAS has so many excellent trainings and resources. It is worth scrolling through the materials to see what is available. For example:

Learn R (and other skills) using swirl

More about Git: Happy Git with R by Jenny Bryan (short course)

Improving collaboration

Collaboration is messy! Working with people can be challenging! But working together is one of the most rewarding things we can do in science (and, maybe, as humans). And, besides, the problems facing our world and the challenges of doing good science are too big to solve as individuals. So collaborate we must!

In addition to R, RStudio, RMarkdown, and GitHub, we use these resources to improve collaboration:

Dealing with spatial data

If you are dealing with ecological data, at some point you will need to embrace spatial data.

Additional Training

The eco-data-science study group at the University of California, Santa Barbara has created a number of useful tutorials. So much goodness there!

Understanding relational data is very important. This chapter in Hadley Wickham & Garrett Grolemund’s R for Data Science provides a good explanation.

Dealing with color can be painful; this color guide should help.