R for Researchers: Data exploration solutions

April 2015

This article contains solutions to exercises for an article in the series R for Researchers. For a list of topics covered by this series, see the Introduction article. If you're new to R, we highly recommend reading the articles in order.

There is often more than one approach to the exercises. Do not be concerned if your approach differs from the solution provided.

These solutions require that the solutions from the prior lesson have already been run in your R session.

Exercise solutions

These exercises use the alfalfa dataset and the work you started on the alfAnalysis script. Open the script and run all the commands in the script to prepare your session for these problems.

  1. Do a summary of the alfalfa data.frame.

    summary(alfalfa)
         shade       irrig      inoc       yield       shadeLev 
     Min.   :1   Min.   :1   A    :5   Min.   :24.40   full: 5  
     1st Qu.:2   1st Qu.:2   B    :5   1st Qu.:33.20   part:15  
     Median :3   Median :3   C    :5   Median :34.60   none: 5  
     Mean   :3   Mean   :3   cntrl:5   Mean   :34.11            
     3rd Qu.:4   3rd Qu.:4   D    :5   3rd Qu.:36.90            
     Max.   :5   Max.   :5             Max.   :39.10

  2. Do a cor of the shade, irrig, and yield variables. Use only the variables which have a sensible order.

    cor(alfalfa[, -c(3, 5)])
              shade      irrig      yield
    shade 1.0000000  0.0000000  0.5205662
    irrig 0.0000000  1.0000000 -0.2233344
    yield 0.5205662 -0.2233344  1.0000000
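
    The solution drops columns 3 and 5 by position, which silently picks the wrong columns if the column order ever changes. As a minimal sketch, the same kind of selection can be made by name. The alf_demo data.frame below is a hypothetical stand-in with made-up yield values, not the alfalfa data, so its correlations differ from those shown above.

    ```r
    # Hypothetical stand-in for the alfalfa data.frame; the yield values are made up.
    alf_demo <- expand.grid(shade = 1:5, irrig = 1:5)
    alf_demo$inoc <- factor(rep(c("A", "B", "C", "cntrl", "D"), times = 5))
    alf_demo$yield <- 30 + alf_demo$shade - 0.5 * alf_demo$irrig

    # Select the ordered, numeric columns by name.
    by_name <- cor(alf_demo[, c("shade", "irrig", "yield")])

    # Equivalent selection by position: drop column 3 (inoc) of this stand-in.
    by_position <- cor(alf_demo[, -3])

    print(by_name)
    ```

    Selecting by name takes a little more typing, but the command documents which variables are used.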
  3. Create a frequency table for shade and irrig.

    table(alfalfa$shade, alfalfa$irrig)
    
        1 2 3 4 5
      1 1 1 1 1 1
      2 1 1 1 1 1
      3 1 1 1 1 1
      4 1 1 1 1 1
      5 1 1 1 1 1
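
    The table above does not show which variable is in the rows and which is in the columns. One option, sketched here, is xtabs, which labels both dimensions with the variable names. The alf_demo data.frame is a hypothetical stand-in built with expand.grid so that each shade and irrig combination occurs once.

    ```r
    # Hypothetical stand-in: every combination of shade and irrig appears once.
    alf_demo <- expand.grid(shade = 1:5, irrig = 1:5)

    # The formula interface carries the variable names into the table's dimnames.
    tab <- xtabs(~ shade + irrig, data = alf_demo)

    print(tab)
    ```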
  4. Use aggregate to find the mean yield for each inoc group.

    aggregate(alfalfa$yield, by = list(alfalfa$inoc), FUN = mean)
      Group.1     x
    1       A 35.76
    2       B 35.04
    3       C 35.68
    4   cntrl 29.16
    5       D 34.90
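
    aggregate also has a formula interface, which keeps the variable names in the result instead of Group.1 and x. The following is a minimal sketch of that form. The alf_demo data.frame and its yield values are hypothetical and are not taken from the alfalfa data.

    ```r
    # Hypothetical stand-in with made-up yields for three inoc groups.
    alf_demo <- data.frame(
      inoc  = factor(rep(c("A", "B", "cntrl"), each = 2)),
      yield = c(35, 36, 34, 35, 29, 30)
    )

    # Read the formula as "yield grouped by inoc".
    means <- aggregate(yield ~ inoc, data = alf_demo, FUN = mean)

    print(means)
    ```

    With the alfalfa data.frame from your session, the equivalent call would use data = alfalfa.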
  5. Commit your changes to AlfAnalysis.

    There is no code associated with the solution to this problem.

  6. Use plot to create pairwise plots for the alfalfa data.frame.

    plot(alfalfa)

  7. Use ggplot to create a scatter plot of yield vs. inoc. Use a white background and color the observations based on shade level.

    ggplot(alfalfa) +
      geom_point(aes(x=inoc, y=yield, color=shadeLev)) +
      theme_bw()
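
    This solution assumes the ggplot2 package was already loaded in your session. A self-contained sketch of the same plot follows, assuming ggplot2 is installed. Here the aesthetics are set in the ggplot call rather than in geom_point, which is an alternative placement and produces the same plot for a single layer. The alf_demo data.frame is a hypothetical stand-in with made-up values.

    ```r
    library(ggplot2)  # assumes the ggplot2 package is installed

    # Hypothetical stand-in for the alfalfa data.frame; all values are made up.
    alf_demo <- data.frame(
      inoc     = factor(c("A", "A", "B", "B", "cntrl", "cntrl")),
      yield    = c(35.1, 36.4, 34.2, 35.9, 28.7, 29.6),
      shadeLev = factor(c("full", "part", "part", "none", "full", "none"))
    )

    # Aesthetics given in ggplot() are inherited by every layer added to the plot.
    p <- ggplot(alf_demo, aes(x = inoc, y = yield, color = shadeLev)) +
      geom_point() +
      theme_bw()  # white panel background

    print(p)
    ```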

  8. Commit your changes to AlfAnalysis.

    There is no code associated with the solution to this problem.

Return to the Data exploration article.

Last Revised: 2/9/2015