Spring 2015 Midterm

A chief executive officer (CEO) is the leader of a firm or organization. The CEO leads by developing and implementing strategic policy for the firm and is in charge of a management team responsible for daily operations, financial strength, and corporate social responsibilities.

The CEO also leads the firm in compensation. Generally, a CEO is the most highly paid person in a firm; CEO salaries sit at the top of the pyramid. Although in some industries certain employees, such as sales agents, earn more than the CEO, the broad rule is that CEO salaries form an effective upper bound for employee compensation. Thus, although very few managers ever become chief executive officers, there is a great deal of interest in CEO salaries: CEO compensation indirectly influences salaries for a large portion of the firm's workforce.

CEO salaries in the United States are of interest because of their relationship to salaries in international firms and to salaries of people outside Corporate America. Top managers in the United States have come under a great deal of criticism for being so highly paid compared to their international counterparts. Yet CEO compensation may not be out of line compared to that of top professionals in other fields. For example, Linden and Machan (1992, "Put Them at Risk!" Forbes Magazine, p. 158) compare CEO salaries with those of professionals such as actors, models, surgeons, and sports personalities, and find the compensation comparable.

Measuring annual compensation for a CEO is fraught with difficulties. Compensation clearly includes salary plus bonuses, that is, cash payments that may or may not be performance related. Other compensation is more difficult to measure and may include restricted stock awards and contributions to retirement, health insurance, and other employee benefit plans. Remuneration may also come in the form of stock gains based on the CEO’s stock ownership or exercise of stock options, although we did not consider this source of income.

The data for this study were drawn from the May 25, 1992 issue of Forbes Magazine entitled “What 800 Companies Paid for their Bosses.” This article provides several measures of CEO compensation, as well as characteristics of the CEO and measures of his firm’s performance. We say “his” because of the 800 CEOs studied in this article, only one was a woman. The goal of this report is to study CEO and firm characteristics to determine the important factors influencing CEO compensation.

To understand the determinants of CEO compensation, one hundred observations were randomly selected from the 800 listed in the Forbes article. Although the Forbes article did not cite the basis for a firm to be included in its survey, the 800 companies appear to be the largest publicly traded companies in the United States. Our sample of one hundred CEOs and their firms thus represents a cross-section of America's largest corporations. In our cross-section, the CEO and firm characteristics were based on 1991 measures.

Table 1 provides variable definitions.
$$
{\scriptsize
\begin{matrix}
{\large \text{Table 1. Variable Definitions} }\\
\begin{array}{ll} \hline
\text{Variable} & \text{Definition} \\
\hline
\text{COMP} & \text{Sum of salary, bonus, and other 1991 compensation, in thousands of dollars.} \\
 & ~~~\text{Other compensation does not include stock gains.} \\
\text{AGE} & \text{CEO's age, in years} \\
\text{SALES} & \text{1991 sales revenues, in millions of dollars} \\
\text{TENURE} & \text{Number of years employed by the firm} \\
\text{EXPER} & \text{Number of years as the firm's CEO} \\
\text{VAL} & \text{Market value of the CEO's stock, in thousands of dollars} \\
\text{PCTOWN} & \text{Percentage of the firm's market value owned by the CEO} \\
\text{PROF} & \text{1991 profits of the firm, before taxes, in millions of dollars} \\
\text{EDUCATN} & \text{Education level.} \\
 & 0 \text{ indicates that the CEO does not have an undergraduate degree} \\
 & 1 \text{ indicates that the CEO has only an undergraduate degree} \\
 & 2 \text{ indicates that the CEO has a graduate degree} \\
\text{BACKGRD} & \text{Categorical variable for the professional background of the CEO} \\ \hline
\end{array}
\end{matrix}
}
$$

Part I. Preliminary Summarization.

1. A preliminary examination of the data shows that the 51st observation had an unusually low compensation. This was Craig McCaw, CEO of McCaw Cellular, who reported a salary of $155,000 in 1991, despite a five-year total reported salary of over fifty-three million dollars. As founder of McCaw Cellular, Mr. McCaw received a substantial amount of remuneration outside of the figures reported for 1991. Omit him from the sample.
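A minimal R sketch of this step, assuming the data have been read into a data frame called ceo (an illustrative name, not one used in the exam) with the variables of Table 1 and Mr. McCaw in row 51:

ceo <- ceo[-51, ]   # drop the 51st observation (Craig McCaw, McCaw Cellular)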
Solution

2. Create the variables LOGCOMP, the natural logarithm of COMP, LOGSALES, the natural logarithm of SALES and LOGVAL, the natural logarithm of VAL.

  • 2a. Create histograms of COMP and LOGCOMP; compare the two distributions, commenting in particular on the effect that the logarithmic transformation has on the symmetry.
  • 2b. Do this also for SALES and VAL.
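A sketch of the transformations and the first pair of histograms, continuing with the hypothetical ceo data frame; analogous calls handle SALES and VAL for part 2b.

ceo$LOGCOMP  <- log(ceo$COMP)     # natural logarithm of compensation
ceo$LOGSALES <- log(ceo$SALES)    # natural logarithm of sales
ceo$LOGVAL   <- log(ceo$VAL)      # natural logarithm of stock value

par(mfrow = c(1, 2))
hist(ceo$COMP,    main = "COMP",    xlab = "Compensation (thousands of dollars)")
hist(ceo$LOGCOMP, main = "LOGCOMP", xlab = "Logarithmic compensation")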

Solution

3. Compute summary statistics of the continuous variables COMP, LOGCOMP, AGE, SALES, LOGSALES, TENURE, EXPER, VAL, LOGVAL, PCTOWN, and PROF. Identify the median value of each variable.
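A sketch of the summary statistics; summary() reports the median of each variable directly, and the sapply() line collects the medians alone.

cts <- c("COMP", "LOGCOMP", "AGE", "SALES", "LOGSALES", "TENURE",
         "EXPER", "VAL", "LOGVAL", "PCTOWN", "PROF")
summary(ceo[, cts])
sapply(ceo[, cts], median, na.rm = TRUE)   # medians only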
Solution

Part II. Basic Linear Regression.

1. Plot SALES versus COMP and then LOGSALES versus LOGCOMP. Discuss the difficulties in modeling the relationship between SALES and COMP that are not apparent in the relationship between LOGSALES and LOGCOMP.
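A sketch of the two scatter plots, using the hypothetical ceo data frame:

par(mfrow = c(1, 2))
plot(ceo$SALES,    ceo$COMP,    xlab = "SALES",    ylab = "COMP")
plot(ceo$LOGSALES, ceo$LOGCOMP, xlab = "LOGSALES", ylab = "LOGCOMP")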
Solution

2. Compute correlations among the continuous variables COMP, LOGCOMP, AGE, SALES, LOGSALES, TENURE, EXPER, VAL, LOGVAL, PCTOWN, and PROF. Identify the variable (excluding LOGCOMP) that seems to have the strongest relationship with COMP. Also, identify the variable (excluding COMP) that seems to have the strongest relationship with LOGCOMP.
Solution

3. Fit a basic linear model, using LOGCOMP as the outcome of interest and LOGSALES as the explanatory variable.

  • 3a. Interpret the coefficient associated with LOGSALES as an elasticity.
  • 3b. Provide 90% and 99% confidence intervals for your answer in 3a.
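A sketch of the fit and the two confidence intervals; the object name model1 is illustrative.

model1 <- lm(LOGCOMP ~ LOGSALES, data = ceo)
summary(model1)                             # the LOGSALES slope is the elasticity in 3a
confint(model1, "LOGSALES", level = 0.90)   # 90% confidence interval
confint(model1, "LOGSALES", level = 0.99)   # 99% confidence interval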

Solution

Part III. Multiple Linear Regression - I.

1. Create a binary variable, PERCENT5, that indicates whether the CEO owns more than five percent of the firm's stock. Create another binary variable, GRAD, that indicates EDUCATN=2.
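A sketch of the two binary variables, again using the hypothetical ceo data frame:

ceo$PERCENT5 <- as.numeric(ceo$PCTOWN > 5)     # 1 if the CEO owns more than five percent
ceo$GRAD     <- as.numeric(ceo$EDUCATN == 2)   # 1 if the CEO has a graduate degree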
Solution

2. Run a regression model using LOGCOMP as the outcome of interest and four explanatory variables, LOGSALES, GRAD, PERCENT5, and EXPER.

  • 2a. Interpret the sign of the coefficient associated with GRAD. Comment also on the statistical significance of this variable.
  • 2b. For this model fit, is EXPER a statistically significant variable? To respond to this question, use a formal test of hypothesis. State your null and alternative hypotheses, decision-making criterion, and decision-making rule. Use a 10% significance level.
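A sketch of the four-variable fit and the critical value for the two-sided 10% t-test on EXPER; the object name model_four is illustrative.

model_four <- lm(LOGCOMP ~ LOGSALES + GRAD + PERCENT5 + EXPER, data = ceo)
summary(model_four)
# For part 2b, compare the t-statistic on EXPER with the two-sided 10% cut-off:
qt(0.95, df = df.residual(model_four))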

Solution

Part III. Multiple Linear Regression - I (continued).

We run a regression model using LOGCOMP as the outcome of interest and five explanatory variables: LOGSALES, GRAD, PERCENT5, EXPER, and LOGVAL. Correlations and the fitted regression model appear below.

III.3

  • a. Determine the partial correlation coefficient between EXPER and LOGCOMP, controlling for other explanatory variables.
  • b. Compare the usual (marginal) correlation coefficient between EXPER and LOGCOMP to the partial correlation calculated in part (a). Contrast the different impressions these two coefficients give and describe why differences may arise for this data set.
Table. Correlation Coefficients
> round(cor(cbind(LOGCOMP,LOGSALES,GRAD,PERCENT5,EXPER,LOGVAL)),digits=3)
         LOGCOMP LOGSALES   GRAD PERCENT5  EXPER LOGVAL
LOGCOMP    1.000    0.496 -0.331   -0.181  0.216  0.366
LOGSALES   0.496    1.000 -0.159   -0.034 -0.062  0.114
GRAD      -0.331   -0.159  1.000   -0.256 -0.207 -0.402
PERCENT5  -0.181   -0.034 -0.256    1.000  0.247  0.530
EXPER      0.216   -0.062 -0.207    0.247  1.000  0.535
LOGVAL     0.366    0.114 -0.402    0.530  0.535  1.000

Fitted Regression Model
> model2 <- lm(LOGCOMP ~ LOGSALES+GRAD+PERCENT5+EXPER+LOGVAL)
> summary(model2)

Call:
lm(formula = LOGCOMP ~ LOGSALES + GRAD + PERCENT5 + EXPER + LOGVAL)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.9347 -0.2800  0.0077  0.2019  1.2599 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  4.916566   0.375611  13.090  < 2e-16 ***
LOGSALES     0.246090   0.044939   5.476 3.68e-07 ***
GRAD        -0.239013   0.098382  -2.429    0.017 *  
PERCENT5    -1.011556   0.179882  -5.623 1.95e-07 ***
EXPER        0.005557   0.006291   0.883    0.379    
LOGVAL       0.132218   0.029558   4.473 2.18e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4314 on 93 degrees of freedom
Multiple R-squared:  0.5313,	Adjusted R-squared:  0.5061 
F-statistic: 21.09 on 5 and 93 DF,  p-value: 4.908e-14
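One way to compute the partial correlation in part (a) is sketched below (not the graded solution): correlate the residuals from regressing LOGCOMP and EXPER, respectively, on the remaining explanatory variables. The equivalent t-statistic identity in the final lines uses only the output above.

e_y <- resid(lm(LOGCOMP ~ LOGSALES + GRAD + PERCENT5 + LOGVAL, data = ceo))
e_x <- resid(lm(EXPER   ~ LOGSALES + GRAD + PERCENT5 + LOGVAL, data = ceo))
cor(e_x, e_y)                    # partial correlation of EXPER and LOGCOMP
# Equivalently, with t = 0.883 and 93 residual degrees of freedom:
0.883 / sqrt(0.883^2 + 93)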

Solution

Part IV. Multiple Linear Regression - II.

Professional background (BACKGRD) of the CEO contains eleven categories, such as marketing, finance, accounting, insurance and so on. We use this factor to explain logarithmic compensation (LOGCOMP).

IV.1. The number of observations and the mean of LOGCOMP for each BACKGRD category are given in the table below. A boxplot is given in Figure 1. Describe what we learn from the table and boxplot about the effect of BACKGRD on LOGCOMP.

> cbind(summarize(LOGCOMP,BACKGRD,length),
+      round(summarize(LOGCOMP,BACKGRD,mean,na.rm=TRUE),digits=3))
   BACKGRD LOGCOMP BACKGRD LOGCOMP
       0       1       0   6.690
       1      17       1   7.064
       2       3       2   7.103
       3      13       3   6.916
       4      13       4   6.679
       5      12       5   6.553
       6       7       6   6.764
       7      12       7   7.067
       8       6       8   6.914
       9      14       9   6.621
      10       1      10   5.956
> boxplot(LOGCOMP~BACKGRD,ylab="LOGCOMP",xlab="BACKGRD")

Figure 1. Box plot of logarithmic compensation, by professional background

Solution

IV.2. Consider a regression model using only the factor, BACKGRD; the fitted output is below.

  • a. Provide an expression for the regression function for this model, defining each term.
  • b. Provide an expression for the fitted regression function, using the fitted output. Further, give the fitted value for an observation with BACKGRD = 0 and with BACKGRD = 1, both in logarithmic units as well as dollars.
  • c. Is BACKGRD a statistically significant determinant of LOGCOMP? State your null and alternative hypotheses, decision-making criterion, and your decision-making rule. (Hint: Use the \(R^2\) statistic to compute an F-statistic.)
> summary(lm(LOGCOMP ~ factor(BACKGRD)))

Call:
lm(formula = LOGCOMP ~ factor(BACKGRD))

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)        6.68960    0.60605  11.038   <2e-16 ***
factor(BACKGRD)1   0.37441    0.62362   0.600    0.550
factor(BACKGRD)2   0.41388    0.69980   0.591    0.556
factor(BACKGRD)3   0.22644    0.62892   0.360    0.720
factor(BACKGRD)4  -0.01012    0.62892  -0.016    0.987
factor(BACKGRD)5  -0.13649    0.63079  -0.216    0.829
factor(BACKGRD)6   0.07449    0.64789   0.115    0.909
factor(BACKGRD)7   0.37771    0.63079   0.599    0.551
factor(BACKGRD)8   0.22431    0.65461   0.343    0.733
factor(BACKGRD)9  -0.06845    0.62732  -0.109    0.913
factor(BACKGRD)10 -0.73376    0.85708  -0.856    0.394
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.606 on 88 degrees of freedom
Multiple R-squared: 0.1248,     Adjusted R-squared: 0.0253
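A sketch of the F-statistic computed from the \(R^2\) statistic, as the hint suggests; the 5% significance level used here is an assumption for illustration. With eleven categories there are ten indicator variables and 99 observations.

R2 <- 0.1248
n  <- 99     # 100 CEOs less the omitted observation
k  <- 10     # indicator variables for the eleven BACKGRD categories
Fstat <- (R2 / k) / ((1 - R2) / (n - k - 1))
Fstat
qf(0.95, df1 = k, df2 = n - k - 1)   # critical value at the assumed 5% level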

Solution

Part V. Variable Selection.

V.1 We run a regression model using LOGCOMP as the outcome of interest and four explanatory variables, LOGSALES, GRAD, PERCENT5, and EXPER. Figure 2 shows a set of four diagnostic plots for this model.

  • a. In the upper left-hand panel is a plot of residuals versus fitted values. What type of model misspecification does this type of plot help detect?
  • b. Does the plot of residuals versus fitted values in Figure 2 reveal a serious model misspecification?
  • c. In the upper right-hand panel is a normal qq-plot. Describe this plot and say what type of model misspecification it helps to detect.
  • d. Does the normal qq-plot in Figure 2 reveal a serious model misspecification?
  • e. In the lower right-hand panel is a plot of standardized residuals versus leverages. Describe this plot and say what type of model misspecification it helps to detect.
  • f. Observation 87 appears in Figure 2. Is it a high leverage point? Describe the average leverage for this data set and give a rule-of-thumb cut-off for a point to be a high leverage point. (A sketch of these calculations follows Figure 2.)
  • g. Observation 87 appears in Figure 2. Is it an outlier? Give a rule-of-thumb cut-off for a point to be an outlier.

Figure 2. Diagnostic Plots of a Model of Logarithmic Compensation
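A sketch of the leverage and outlier quantities for parts (f) and (g), reusing the hypothetical model_four fit from the Part III sketch:

h <- hatvalues(model_four)
mean(h)                                # average leverage; equals (k + 1)/n = 5/99 here
3 * mean(h)                            # rule-of-thumb cut-off for a high leverage point
r <- rstandard(model_four)
sort(abs(r), decreasing = TRUE)[1:3]   # largest standardized residuals; compare with 2 (or 3)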

Solution

V.2 We run a regression model using LOGCOMP as the outcome of interest and four explanatory variables, LOGSALES, GRAD, PERCENT5, and EXPER. Figure 3 shows a plot of LOGVAL versus the standardized residuals from this model; the correlation between these two variables is 0.292. (A sketch of how such a plot is produced follows Figure 3.)

  • a. What do we hope to learn from a plot of a potential explanatory variable versus residuals from a model fit?
  • b. What new model does the information in Figure 3 suggest that we specify?

Figure 3. Plot of LOGVAL versus Standardized Residuals from a Model of Logarithmic Compensation
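A sketch of how such a plot and correlation can be produced from the four-variable fit:

plot(ceo$LOGVAL, rstandard(model_four),
     xlab = "LOGVAL", ylab = "Standardized residuals")
cor(ceo$LOGVAL, rstandard(model_four))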

Solution

Part VI. Some Algebra Problems.

VI.1 Regression through the origin. Consider the model \(y_i = \beta_1 z_i^2 + \varepsilon_i\), a quadratic model passing through the origin.

  • a. Determine the least squares estimate of \(\beta_1\).
  • b. Using the following set of \(n=5\) observations, give a numerical result for the least squares estimate of \(\beta_1\) determined in part (a).

\begin{equation*}
\begin{array}{l|rrrrr}
\hline
i & 1 & 2 & 3 & 4 & 5 \\
z_i & -2 & -1 & 0 & 1 & 2 \\
y_i & 4 & 0 & 0 & 1 & 4 \\ \hline
\end{array}
\end{equation*}
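A numerical check for part (b), using the closed form \(\widehat{\beta}_1 = \sum_i z_i^2 y_i / \sum_i z_i^4\) that follows from setting the derivative of the residual sum of squares to zero; the no-intercept lm() call gives the same value.

z <- c(-2, -1, 0, 1, 2)
y <- c( 4,  0, 0, 1, 4)
sum(z^2 * y) / sum(z^4)      # closed-form least squares estimate of beta_1
coef(lm(y ~ I(z^2) - 1))     # regression through the origin agrees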
Solution

VI.2 You are doing regression with one explanatory variable and so consider the basic linear regression model \(y_i = \beta_0 + \beta_1 x_i + \varepsilon_i\).

  • a. Show that the \(i\)th leverage can be simplified to
    \begin{equation*}
    h_{ii} = \frac{1}{n} + \frac{(x_i - \overline{x})^2}{(n-1) s_x^2}.
    \end{equation*}
  • b. Show that \(\overline{h} = 2/n\).
  • c. Suppose that \(h_{ii} = 6/n\). How many standard deviations is \(x_i\) away (either above or below) from the mean?
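A small numerical check of the leverage identity in part (a), on arbitrary illustrative data; hatvalues() returns the diagonal of the hat matrix.

x <- c(1, 3, 4, 7, 10)
y <- c(2, 3, 5, 8,  9)   # arbitrary responses; any values work for this check
n <- length(x)
h_lm      <- hatvalues(lm(y ~ x))
h_formula <- 1/n + (x - mean(x))^2 / ((n - 1) * var(x))
all.equal(as.numeric(h_lm), h_formula)   # TRUE
mean(h_lm)                               # equals 2/n, as in part (b)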

Solution
