We are rarely handed tidy data for an analysis. Most of the time, we need to clean up untidy data and make it tidy before we can analyze and plot it in R. This process of tidying is what we will focus on for the next part of today's tutorial. (See this paper by Hadley Wickham for a lengthier discussion of tidy data.)
However, if you are designing an experiment or another way to collect data, you will save yourself time later by storing the resulting data in a tidy format from the start.
The tidyr package contains several helpful functions:
gather()
: collapses multiple columns into fewer columns, based on a key-value pair
spread()
: spreads a dataframe into more columns, based on a key-value pair
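To make the two definitions concrete, here is a minimal sketch of both functions on a small made-up data frame. The data frame `wide` and its column names (`subject`, `congruent`, `incongruent`, `condition`, `rt`) are hypothetical and chosen only for illustration; they are not part of the lecture script.

```r
library(tidyr)

# Hypothetical untidy data: one row per subject, one column per condition,
# with reaction times (in ms) stored in the cells.
wide <- data.frame(
  subject     = c("s1", "s2"),
  congruent   = c(510, 480),
  incongruent = c(620, 575)
)

# gather(): collapse the two condition columns into a key-value pair.
# The old column names go into "condition" and the cell values into "rt".
long <- gather(wide, key = "condition", value = "rt", congruent, incongruent)

# spread(): the reverse step. Each distinct value of `condition` becomes
# its own column again, filled with the matching `rt` values.
wide_again <- spread(long, key = condition, value = rt)

print(long)
print(wide_again)
```

Here `long` has one row per subject and condition (four rows in total), which is the tidy form, while `wide_again` recovers the original two-row layout. Recent versions of tidyr also provide `pivot_longer()` and `pivot_wider()` as successors to these two functions, but `gather()` and `spread()` remain available.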
Let's practice in R: open RStudio, navigate to the script for this lecture, and copy the script text into a new R script.