R for Researchers: Data preparation solutions

April 2015

This article contains solutions to exercises for an article in the series R for Researchers. For a list of topics covered by this series, see the Introduction article. If you're new to R, we highly recommend reading the articles in order.

There is often more than one approach to the exercises. Do not be concerned if your approach differs from the solution provided.

Exercise solutions

Do the following exercises in the AlfAnalysis script.

  1. Import the dataset in alfalfa.txt from the course Datasets folder.

    #######################################################
    #######################################################
    ##
    ##   Import Data
    ##
    #######################################################
    #######################################################
    
    alfalfaIn <- read.table("Datasets/alfalfa.txt", header=TRUE)
    str(alfalfaIn)
    'data.frame':   25 obs. of  4 variables:
     $ shade     : int  1 1 1 1 1 2 2 2 2 2 ...
     $ irrigation: int  1 2 3 4 5 1 2 3 4 5 ...
     $ inoculum  : chr  "A" "B" "D" "C" ...
     $ yield     : num  33.8 33.7 30.4 32.7 24.4 37 28.8 33.5 34.6 33.4 ...
    alfalfa <- alfalfaIn

  2. Change the variable names to "shade", "irrig", "inoc", and "yield".

    colnames(alfalfa) <- c("shade","irrig","inoc","yield")

  3. Create a new variable for shade level (shadeLev) from the shade variable by setting 1 to "full", 5 to "none", and the rest to "part".

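    # The breaks give the intervals (0,1.5], (1.5,4.5], and (4.5,6],
    # so shade 1 is "full", 2 through 4 are "part", and 5 is "none".
    shadeLev <- cut(alfalfa$shade,
                    breaks=c(0,1.5,4.5,6),
                    labels=c("full","part","none")
                    )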
    str(shadeLev)
     Factor w/ 3 levels "full","part",..: 1 1 1 1 1 2 2 2 2 2 ...

  4. Change the type of shadeLev to factor.

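    # cut() already returns a factor, so this call leaves shadeLev unchanged.
    # It matters when shadeLev was built as a character vector instead.
    shadeLev <- factor(shadeLev)

  5. Include shadeLev in the alfalfa data.frame.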

    alfalfa <- data.frame(alfalfa,
                          shadeLev=shadeLev
                          )

  6. Change inoculum level "E" to "cntrl" (control). This is a more challenging problem.

    alfalfa$inoc <- ifelse(alfalfa$inoc=="E","cntrl",
                           as.character(alfalfa$inoc)
                          )
    alfalfa$inoc <- factor(alfalfa$inoc)
    str(alfalfa)
    'data.frame':   25 obs. of  5 variables:
     $ shade   : int  1 1 1 1 1 2 2 2 2 2 ...
     $ irrig   : int  1 2 3 4 5 1 2 3 4 5 ...
     $ inoc    : Factor w/ 5 levels "A","B","C","cntrl",..: 1 2 5 3 4 5 4 2 1 3 ...
     $ yield   : num  33.8 33.7 30.4 32.7 24.4 37 28.8 33.5 34.6 33.4 ...
     $ shadeLev: Factor w/ 3 levels "full","part",..: 1 1 1 1 1 2 2 2 2 2 ...

  7. Commit your changes to AlfAnalysis.

    There is no code associated with the solution to this problem.
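
Steps 3 through 5 can also be approached with nested ifelse() calls, which produce a character vector and so make the conversion in step 4 necessary. The sketch below is one possible alternative, not the solution intended by the series. The toy data frame is a made-up stand-in for the alfalfa data so that the code runs on its own.

```r
# Illustrative stand-in for the shade column; not the alfalfa data itself.
toy <- data.frame(shade = c(1L, 2L, 3L, 4L, 5L))

# Nested ifelse() recodes 1 to "full", 5 to "none", and the rest to "part".
# The result is a character vector, not a factor.
shadeChr <- ifelse(toy$shade == 1, "full",
                   ifelse(toy$shade == 5, "none", "part"))

# factor() does the conversion; listing the levels fixes their order
# (the default order would be alphabetical).
toy$shadeLev <- factor(shadeChr, levels = c("full", "part", "none"))

str(toy)
```

Assigning to toy$shadeLev adds the new column in place, which is an alternative to rebuilding the data.frame with data.frame().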

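For step 6, if inoc is already stored as a factor, relabeling the level directly is another option. This is only a sketch under that assumption; if inoc is a character vector, it would need to be converted with factor() first. The inoc vector below is made up for illustration.

```r
# Illustrative stand-in for the inoc column; not the alfalfa data itself.
inoc <- factor(c("A", "B", "D", "C", "E", "E", "A"))

# Replace the level label "E" with "cntrl".
# Every observation that had level "E" now shows "cntrl".
levels(inoc)[levels(inoc) == "E"] <- "cntrl"

table(inoc)
```

With this approach the relabeled level keeps its original position among the levels rather than being re-sorted.
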
Return to the Data preparation article.

Last Revised: 2/9/2015