
Introduction to Reproducible Analyses in R

A Royal Society of Biology one-day Continuing Professional Development course held 9 December 2019.

Overview

The increasing complexity and scale of biological data mean that biologists increasingly need the data skills to design reproducible workflows for the simulation, collection, organisation, processing, analysis and presentation of data. Developing these skills requires at least some coding, also known as scripting. Scripting means that everything you do with your raw data is explicitly described, transparent and reproducible. However, learning to code can be a daunting prospect for many biologists. That’s where Introduction to Reproducible Analyses in R comes in!

R is a free and open-source language that is especially well suited to data analysis and visualisation, and it has a relatively inclusive and newbie-friendly community. R caters to users who do not see themselves as programmers, but then allows them to slide gradually into programming.

Who is this course for?

Introduction to reproducible analyses in R is aimed at biologists at all stages of their careers interested in experimenting with R to make their analyses and figures more reproducible.

Prerequisites

No previous coding experience will be assumed. Pre-course instructions for participants are given below.

Learning outcomes

After this workshop the successful learner will be able to:

  • Find their way around the RStudio windows
  • Create and plot data using base R and the ggplot2 package
  • Explain the rationale for scripting analysis
  • Use the help pages
  • Make additional packages available in an R session
  • Reproducibly import data in a variety of formats
  • Explain what is meant by the working directory and by absolute and relative paths, and apply these concepts to data import
  • Summarise data in a single group or in multiple groups
  • Recognise tidy data format and carry out some typical data tidying tasks
  • Develop highly organised analyses including well-commented scripts that can be understood by future you and others
  • Use R Markdown to produce reproducible analyses, figures and reports
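As a flavour of what several of these outcomes look like in practice, here is a minimal sketch using only base R. The folder name `my-analysis`, the file `data/chaff.csv` and the values in it are all invented for illustration and are not part of the course materials. The sketch creates a small project folder, imports a file through a path written relative to the working directory, and summarises the data in two groups.

```r
# All names and values below are hypothetical, for illustration only.

# Create a throwaway project folder containing a data subfolder
project_dir <- file.path(tempdir(), "my-analysis")
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)

# Make the project folder the working directory.
# setwd() returns the previous working directory, so it can be restored later.
old_wd <- setwd(project_dir)

# Write some example data so that the sketch is self-contained
chaff <- data.frame(
  sex  = rep(c("females", "males"), each = 3),
  mass = c(20.1, 21.4, 19.8, 22.3, 23.0, 21.9)
)
write.csv(chaff, "data/chaff.csv", row.names = FALSE)

# Import using a relative path: it is resolved from the working directory,
# so the same line works on any computer that has the same project folder
chaff_in <- read.csv("data/chaff.csv")

# Summarise in multiple groups: mean mass for each sex
group_means <- aggregate(mass ~ sex, data = chaff_in, FUN = mean)
print(group_means)

# Restore the original working directory
setwd(old_wd)
```

In day-to-day work an RStudio Project usually sets the working directory for you, so calls to `setwd()` are rarely needed in a script; it is used here only to keep the example self-contained.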

Pre-course instructions for participants

Computing requirements

Laptops should have the following installed prior to attending the workshop:

  • R (version 3.6)
  • RStudio (version 1.2)
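If R is already installed on your laptop, one way to check which version you have is to run the following in the R console. This is a minimal sketch; the variable name `r_ok` is just an illustrative choice.

```r
# Print the full version string of the R installation currently running
print(R.version.string)

# Compare the running version with the course requirement (3.6 or later);
# r_ok is TRUE when the installed version is at least 3.6.0
r_ok <- getRversion() >= "3.6.0"
print(r_ok)
```

If `r_ok` is `FALSE`, follow the installation instructions for your operating system.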

Installing R

Download the pre-compiled binary for your OS from https://cloud.r-project.org/ and install. More specifically:

For Windows

Click “Download R for Windows”, then “base”, then “Download R 3.6.1 for Windows”. This will download an .exe file; once it has downloaded, open it to start the installation.

For Mac

Click “Download R for (Mac) OS X”, then “R-3.6.1.pkg” to download the installer. Run the installer to complete installation.

For Linux

Click “Download R for Linux”. Instructions on installing are given for Debian, Redhat, Suse and Ubuntu distributions. Where there is a choice, install both r-base and r-base-dev.

Installing RStudio

Downloads are available from https://www.rstudio.com/products/rstudio/download3/ (scroll to the end of the page to see the downloads).

For Windows with no admin rights

Download the .zip source archive under “Zip/Tarballs”. Extract the files to a folder where you have write access, e.g. C:\Users\username\RStudio. In this folder, open the bin directory and find the RStudio program: it is named rstudio.exe, but the file extension will typically be hidden, so look for rstudio. Right-click this executable to create a desktop shortcut. Double-click the executable or use the shortcut to open.

Issues

If you have problems setting up your laptop we will try to help at the start of the workshop.

Slides

Slides

Creative Commons License
Royal Society of Biology CPD: An Introduction to Reproducible Analyses in R by Emma Rand is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.