Soc 360 Lecture 1, Spring 2000

Practice Exam #2 Answers

A.

1. The best prediction of race of victim is black, since Y=Black has the modal frequency (5715).

2. When X=Other, the best prediction of Y is Other. The error of prediction is 213 – 137 = 76.

3. Lambda is the appropriate PRE measure since both X and Y are nominal variables with more than 2 categories.

4. E1 = 11546 – 5715 = 5831

E2 = (5051-4686) + (6282 – 5393) + (213 – 137) = 1330

Lambda = (E1-E2)/E1 = (5831-1330)/5831 = 0.77

There is a strong association between race of victim and race of offender. Knowing the race of the offender reduces errors in predicting the race of the victim by 77%.
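The lambda arithmetic above can be checked with a short script. This is only a sketch: the totals and modal counts are copied from the E1 and E2 lines of this answer, and the variable names are illustrative, not part of the exam.

```python
# Totals for each category of X (race of offender) and the modal Y count
# within each category, copied from the E2 line of the answer.
column_totals = [5051, 6282, 213]
column_modes = [4686, 5393, 137]

n_total = 11546       # all cases, from the E1 line
overall_mode = 5715   # modal frequency of Y (Black), from answer A.1

# E1: errors when always predicting the overall mode of Y.
e1 = n_total - overall_mode

# E2: errors when predicting the modal Y within each category of X.
e2 = sum(total - mode for total, mode in zip(column_totals, column_modes))

# Lambda: proportional reduction in error.
lam = (e1 - e2) / e1

print(e1, e2, round(lam, 2))
```

The three category totals add up to 11546, which is a quick check that the E2 terms cover every case.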

B.

1. Ns = 218(286+33+107+25)+199(33+25)+205(107+25)+286(25) = 144070

Nd = 56(205+286+189+107)+199(205+189)+33(189+107)+286(189) = 186300

2. Gamma = (Ns-Nd)/(Ns+Nd) = (144070 – 186300)/(144070 + 186300) = –.13

Errors in predicting beliefs about the importance of poor schools and lack of effort as causes of poverty can be reduced by 13 percent by using one variable to help predict the other. There is a negative association between feeling that poor schools are an important cause of poverty and believing that lack of effort is an important cause of poverty.
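The pair counts can also be verified by program. The sketch below assumes a 3x3 table whose cell layout is inferred from the Ns and Nd products in B.1 (the original table is not reproduced in this answer key), with both variables ordered the same way across rows and columns.

```python
# Cell layout inferred from the Ns and Nd sums in B.1 (an assumption).
table = [
    [218, 199, 56],
    [205, 286, 33],
    [189, 107, 25],
]

n_rows = len(table)
n_cols = len(table[0])

ns = 0  # same-order pairs: partner cell is below and to the right
nd = 0  # inverse-order pairs: partner cell is below and to the left

for i in range(n_rows):
    for j in range(n_cols):
        for k in range(i + 1, n_rows):      # rows below row i
            for m in range(n_cols):
                if m > j:
                    ns += table[i][j] * table[k][m]
                elif m < j:
                    nd += table[i][j] * table[k][m]

gamma = (ns - nd) / (ns + nd)

print(ns, nd, round(gamma, 2))
```

Pairs tied on either variable (same row or same column) are skipped, which matches the definition of gamma.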

C. (Note: Xbar = the mean of X, Ybar = the mean of Y)

1.

X     X-Xbar   (X-Xbar)²   Y     Y-Ybar   (Y-Ybar)²   (X-Xbar)(Y-Ybar)
11    4        16          65    6.2      38.44       24.8
9     2        4           59    0.2      0.04        0.4
7     0        0           61    2.2      4.84        0
2     -5       25          47    -11.8    139.24      59
6     -1       1           62    3.2      10.24       -3.2

Xbar = 7, Ybar = 58.8

Σ(X-Xbar)² = 46, Σ(Y-Ybar)² = 192.8

Σ(X-Xbar)(Y-Ybar) = 81

Sxy = Σ(X-Xbar)(Y-Ybar)/(N-1) = 81/(5-1) = 20.25

rxy = Σ(X-Xbar)(Y-Ybar)/(√Σ(X-Xbar)² √Σ(Y-Ybar)²) = 81/(√46 √192.8) = 0.86

The correlation coefficient is the better measure because its magnitude indicates the strength of association in a standardized manner, whereas the size of the covariance depends on the units of X and Y.
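As a check on the hand calculation, the sums, covariance, and correlation can be recomputed from the five (X, Y) pairs in the table. This is a minimal sketch; the variable names are illustrative.

```python
import math

# The five observations from the table in C.1.
x = [11, 9, 7, 2, 6]
y = [65, 59, 61, 47, 62]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squared deviations and of cross-products.
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

covariance = sxy / (n - 1)
r = sxy / (math.sqrt(sxx) * math.sqrt(syy))

print(x_bar, y_bar, round(covariance, 2), round(r, 2))
```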

2. b = Sxy/Sx² = Σ(X-Xbar)(Y-Ybar)/Σ(X-Xbar)² = 81/46 = 1.76

a = Ybar – bXbar = 58.8 – 1.76(7) = 46.48

3. France: X = 9. Plug in X = 9 in the regression equation.

Ŷ = 46.48 + 1.76X = 46.48 + 1.76(9) = 62.32

Error: Y – Ŷ = 59 – 62.32 = -3.32
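The slope, intercept, prediction, and prediction error can be reproduced in a few lines. The sketch below rounds b to two decimals before computing a, as the hand calculation in this answer does; carrying b at full precision would shift the intercept slightly in the second decimal place.

```python
# The five observations from the table in C.1.
x = [11, 9, 7, 2, 6]
y = [65, 59, 61, 47, 62]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Slope rounded to two decimals to match the hand arithmetic.
b = round(sxy / sxx, 2)
a = y_bar - b * x_bar

# Prediction and error for the case with X = 9, Y = 59 (France).
x_france, y_france = 9, 59
y_hat = a + b * x_france
error = y_france - y_hat

print(b, round(a, 2), round(y_hat, 2), round(error, 2))
```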

D.

1. rxy = Σ(X-Xbar)(Y-Ybar)/(√Σ(X-Xbar)² √Σ(Y-Ybar)²) = 1572.108/(√29518.848 √137.976) = 0.78

There is a fairly strong positive linear relationship between the number of poverty residents per nurse and the black infant death rate.

2. b = Sxy/Sx² = Σ(X-Xbar)(Y-Ybar)/Σ(X-Xbar)² = 1572.108/29518.848 = .053

a = Ybar – bXbar = 17.42 - .053(74.31) = 13.48

Ŷ = 13.48 + .053X
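When only the summary sums are given, as in this problem, r, b, and a follow directly from them. The sketch below uses the sums and means quoted in D.1 and D.2. The variable names are illustrative, and b is rounded to three decimals before computing a, to mirror the hand calculation.

```python
import math

# Summary quantities quoted in D.1 and D.2.
sxy = 1572.108    # sum of (X-Xbar)(Y-Ybar)
sxx = 29518.848   # sum of (X-Xbar)^2
syy = 137.976     # sum of (Y-Ybar)^2
x_bar = 74.31
y_bar = 17.42

r = sxy / (math.sqrt(sxx) * math.sqrt(syy))

# Slope rounded to three decimals, as in the answer key.
b = round(sxy / sxx, 3)
a = y_bar - b * x_bar

print(round(r, 2), b, round(a, 2))
```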

3. The slope b is the amount of change in Y associated with a one-unit change in X. As the number of poverty residents per nurse increases by a hundred, the black infant death rate increases by 0.053 per 1,000.

The intercept a is the value of Y when X = 0. In this case, X = 0 means that there are no poverty residents in the city, which is an improbable situation. Therefore, the intercept has no particular meaning.

4. Ŷ = 13.48 + .053(32) = 15.18

Since X = 500 is far outside the range of X in the data, making such a prediction with the regression equation based on the given data would not be valid.

5. r² = (E1-E2)/E1 = (Σ(Y-Ybar)² – Σ(Y-Ŷ)²)/Σ(Y-Ybar)²

Or you can calculate the coefficient of determination simply by squaring the correlation coefficient: r² = (rxy)² = (0.78)² = .608

E1 = SSTO = Σ(Y-Ybar)² = 137.976

E2 = SSE = Σ(Y-Ŷ)² = 54.276

Note: E1 – E2 = SSTO – SSE = SSR = Σ(Ŷ-Ybar)² → reduction of error by regression

In this case, SSR = 137.976 – 54.276 = 83.7

Even when SSE = å (Y- Ŷ)2 is not given, you can calculate SSR directly using the following formula:

SSR = b²Σ(X-Xbar)²
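The two routes to SSR can be compared numerically. The sketch below takes Σ(Y-Ybar)² = 137.976 from the correlation formula in D.1 and SSE = 54.276 from above; the variable names are illustrative. The two results agree to within rounding of the reported sums.

```python
# Quantities quoted in section D.
sxy = 1572.108    # sum of (X-Xbar)(Y-Ybar)
sxx = 29518.848   # sum of (X-Xbar)^2
ssto = 137.976    # sum of (Y-Ybar)^2
sse = 54.276      # sum of (Y-Yhat)^2

# Route 1: reduction in error, SSTO minus SSE.
ssr_from_errors = ssto - sse

# Route 2: b^2 times the sum of squared X deviations (b left unrounded here).
b = sxy / sxx
ssr_from_slope = b ** 2 * sxx

# Coefficient of determination as a proportional reduction in error.
r_squared = ssr_from_errors / ssto

print(round(ssr_from_errors, 1), round(ssr_from_slope, 1), round(r_squared, 2))
```

Squaring the rounded correlation (0.78) gives .608, while SSR/SSTO gives about .607; the small gap comes from rounding r before squaring.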