R for Researchers: Data preparation solutions
April 2015
This article contains solutions to exercises for an article in the series R for Researchers. For a list of topics covered by this series, see the Introduction article. If you're new to R we highly recommend reading the articles in order.
There is often more than one approach to the exercises. Do not be concerned if your approach differs from the solution provided.
Exercise solutions
Do the following exercise in the AlfAnalysis script.
Import the dataset in alfalfa.txt from the course Dataset folder.
#######################################################
#######################################################
##
##   Import Data
##
#######################################################
#######################################################

alfalfaIn <- read.table("Datasets/alfalfa.txt", header=TRUE)
str(alfalfaIn)

'data.frame':   25 obs. of  4 variables:
 $ shade     : int  1 1 1 1 1 2 2 2 2 2 ...
 $ irrigation: int  1 2 3 4 5 1 2 3 4 5 ...
 $ inoculum  : chr  "A" "B" "D" "C" ...
 $ yield     : num  33.8 33.7 30.4 32.7 24.4 37 28.8 33.5 34.6 33.4 ...
alfalfa <- alfalfaIn
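If the course file is not at hand, the same import pattern can be tried on a few rows typed inline. This is only a sketch: the `txt` string and the `demoIn` and `demo` names are made up for illustration, and the four rows simply mirror the first values shown in the `str()` output above.

```r
# Hypothetical inline stand-in for Datasets/alfalfa.txt (first four rows only)
txt <- "shade irrigation inoculum yield
1 1 A 33.8
1 2 B 33.7
1 3 D 30.4
1 4 C 32.7"

# read.table() accepts the data as a string through its text argument
demoIn <- read.table(text = txt, header = TRUE)
str(demoIn)

# Work on a copy so the imported object stays untouched
demo <- demoIn
```

Keeping the imported object separate from the working copy makes it possible to start the preparation steps over without reading the file again.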
Change the variable names to "shade", "irrig", "inoc", and "yield".
colnames(alfalfa) <- c("shade","irrig","inoc","yield")
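Assigning the whole vector of names relies on the columns being in the expected order. As one possible alternative, the columns can be renamed by matching their current names. The sketch below uses a small made-up data frame called `demo`, not the course data.

```r
# Hypothetical two-row data frame with the original column names
demo <- data.frame(shade = 1:2, irrigation = 1:2,
                   inoculum = c("A", "B"), yield = c(33.8, 33.7))

# Rename selected columns by matching their current names
names(demo)[names(demo) == "irrigation"] <- "irrig"
names(demo)[names(demo) == "inoculum"] <- "inoc"

names(demo)
```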
Create a new variable for shade level (shadeLev) from the shade variable by setting 1 to "full", 5 to "none", and the rest to "part".
shadeLev <- cut( alfalfa$shade,
                 c(0,1.5,4.5,6),
                 labels=c("full","part","none")
                 )
str(shadeLev)
Factor w/ 3 levels "full","part",..: 1 1 1 1 1 2 2 2 2 2 ...
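To see how the break points map each shade code to a label, it can help to run `cut()` on the five codes alone. This is a minimal sketch; the `shade` and `lev` objects here are stand-alone illustrations rather than columns of the alfalfa data.

```r
# The five shade codes used in the exercise
shade <- 1:5

# The breaks define the intervals (0,1.5], (1.5,4.5], and (4.5,6]
lev <- cut(shade, breaks = c(0, 1.5, 4.5, 6),
           labels = c("full", "part", "none"))

# Pair each code with the label it received
data.frame(shade, lev)

# Check the type of the result
is.factor(lev)
```

Code 1 falls in the first interval ("full"), codes 2 through 4 in the second ("part"), and code 5 in the third ("none").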
Change the type of shadeLev to factor.
shadeLev <- factor(shadeLev)
Include shadeLev in the alfalfa data.frame.
alfalfa <- data.frame(alfalfa, shadeLev=shadeLev )
Change inoculum level E to "cntrl" (control). This is a more challenging problem.
alfalfa$inoc <- ifelse(alfalfa$inoc=="E","cntrl",
                       as.character(alfalfa$inoc)
                       )
alfalfa$inoc <- factor(alfalfa$inoc)
str(alfalfa)

'data.frame':   25 obs. of  5 variables:
 $ shade   : int  1 1 1 1 1 2 2 2 2 2 ...
 $ irrig   : int  1 2 3 4 5 1 2 3 4 5 ...
 $ inoc    : Factor w/ 5 levels "A","B","C","cntrl",..: 1 2 5 3 4 5 4 2 1 3 ...
 $ yield   : num  33.8 33.7 30.4 32.7 24.4 37 28.8 33.5 34.6 33.4 ...
 $ shadeLev: Factor w/ 3 levels "full","part",..: 1 1 1 1 1 2 2 2 2 2 ...
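Another possible approach is to convert the variable to a factor first and then rename the level through `levels()`. The sketch below uses a short made-up vector called `inoc`, not the course data.

```r
# Hypothetical inoculum values, not the course data
inoc <- factor(c("A", "B", "D", "C", "E", "E", "A"))

# Rename the level in place; the underlying integer codes are unchanged
levels(inoc)[levels(inoc) == "E"] <- "cntrl"

levels(inoc)
table(inoc)
```

With this approach the renamed level keeps the position that "E" had, so the level order can differ from the one produced by rebuilding the factor with `factor()`.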
Commit your changes to AlfAnalysis.
There is no code associated with the solution to this problem.
Return to the Data preparation article.
Last Revised: 2/9/2015