COR639

Original: November 22, 1995
Revised: June 12, 1996
Revised: November 12, 1996
Revised: May 4, 1999

Construction of 1990-basis Occupation Characteristics and SEI Scores

Update and Recommendations

In their work to produce a socioeconomic index (SEI) for 1990-basis Census Detailed Occupation categories, Hauser and Warren (1997) made several recommendations regarding the construction and use of occupational status scores.

First, they recommend against the use of SEI scores in general. As they argue, "...in rudimentary models of occupational stratification, prestige-validated socioeconomic indexes are of limited value. They give too much weight to occupational earnings, and they ignore intergenerational relationships between occupational education and occupational earnings." They recommend that "we ought to move toward a more specific and disaggregated appraisal of the effects of occupational characteristics on social, psychological, economic, political, and health outcomes." To this end, they recommend using occupational education scores and occupational earnings scores themselves, rather than a weighted average of these scores (i.e., SEI) (see also Warren, Sheridan and Hauser 1998).

Occupational education scores are defined as the percentage of persons in a 1990 Census occupation/industry/class-of-worker category who completed one year of college or more. Occupational earnings scores are defined as the percentage of persons in a 1990 Census occupation/industry/class-of-worker category who earned $14.30 per hour or more in 1989. These percentages were determined through analysis of the 5% PUMS file; see below for a detailed description of these scores.

The second recommendation in the Hauser and Warren (1997) paper is that a transformation of the raw occupational education and earnings percentages is preferable, because in their analyses it reduces heteroscedasticity in the residuals from the regression of prestige on occupational earnings and education.
They recommend transforming the raw percentages into started logit scores:

    started logit = ln((yi + 1) / (100 - yi + 1))

where yi = the percentage in the occupation category who completed one or more years of college (for occupational education scores), or yi = the percentage in the occupation category who earned $14.30 per hour or more in 1989 (for occupational earnings scores).

While coding the WLS occupations into 1990-basis Detailed Occupation and Industry codes, we had to decide which status scores to include in the codebooks. Because of the recommendations made in Hauser and Warren (1997), we decided NOT to include any 1990-basis SEI scales, but rather to include only the components of such scales: occupational education and earnings scores, and the percentage of persons in the 1989 NORC GSS prestige study who rated an occupation category 5 or higher on a 9-point scale (see COR683).

References

Hauser, Robert M. and John Robert Warren. 1997. "Socioeconomic Indexes for Occupations: A Review, Update, and Critique." Pp. 177-298 in Sociological Methodology, edited by Adrian Raftery. Cambridge: Blackwell.

Warren, John Robert, Jennifer T. Sheridan, and Robert M. Hauser. 1998. "Choosing a Measure of Occupational Standing: How Useful are Composite Measures in Analyses of Gender Inequality in Occupational Attainment?" Sociological Methods & Research. 27(1):3-79.

Procedures Used to Create 1980- and 1990-Basis SEI Scores

We used three methods to select samples for these analyses. First, we selected everybody in the employed civilian labor force (ECLF), and coded earnings as the percentage of people in an occupation who earned $25,000 or more in 1989. Second, we selected everybody in the ECLF, but this time we coded earnings as the percentage of people in an occupation who earned $14.30 per hour or more in 1989; to compute the wage rate, we divided annual earnings (REARNING) by the product of the number of hours worked per week (HOUR89) and the number of weeks worked per year (WEEK89).
Third, we selected everybody in the ECLF who worked at least 35 hours per week and at least 50 weeks in 1989. In this case, we coded earnings as the percentage of people in an occupation who earned $25,000 or more in 1989. For all three samples, we coded education as the percentage of people in an occupation who had completed one or more years of college.

Separately for each of the three sample groups, we extracted data from the 1990 PUMS 5% A files for each of the 50 states and the District of Columbia. (This very large file is called COR639.PUM or PUMS90.DAT and is stored elsewhere.) For the two sampling methods which selected everybody in the ECLF, 5,559,121 records were selected; after weighting (by PWGT1), these cases represent 112,169,744 people in the entire ECLF. For the sampling method which selected only those members of the ECLF who worked full-time, year-round in 1989, we selected 3,491,686 records, which represent 70,517,365 people after weighting.

Tables A through O (contained in Lotus 1-2-3 files called TABLE_X.WK4) are used to organize and construct the variables needed for the creation of the socioeconomic indexes. Tables A through E pertain to the ECLF sample with earnings coded as annual earnings; Tables F through J pertain to the sample of full-time, year-round workers; and Tables K through O pertain to the ECLF sample with earnings coded as a wage rate. The word "TOTAL," "MEN," or "WOMEN" below indicates whether the characteristics described in that table pertain to the total population or to one gender group only.

Sample: All Members of Employed Civilian Labor Force (ECLF)

Table A. TOTAL: Frequency of Occupation by Age
Table B. TOTAL: Education (% 1 year College) by Age by Occupation
Table C. TOTAL: Earnings (% $25,000+/yr) by Age by Occupation
Table D. MEN: Frequency of Occupation by Age; Education (% 1 year College) by Occupation; Earnings (% $25,000+/yr) by Occupation
Table E. WOMEN: Frequency of Occupation by Age; Education (% 1 year College) by Occupation; Earnings (% $25,000+/yr) by Occupation

Sample: All Members of ECLF Working 35+ hrs/wk and 50+ wks/yr

Table F. TOTAL: Frequency of Occupation by Age
Table G. TOTAL: Education (% 1 year College) by Age by Occupation
Table H. TOTAL: Earnings (% $25,000+/yr) by Age by Occupation
Table I. MEN: Frequency of Occupation by Age; Education (% 1 year College) by Occupation; Earnings (% $25,000+/yr) by Occupation
Table J. WOMEN: Frequency of Occupation by Age; Education (% 1 year College) by Occupation; Earnings (% $25,000+/yr) by Occupation

Sample: All Members of Employed Civilian Labor Force (ECLF)

Table K. TOTAL: Frequency of Occupation by Age (NOTE: Identical to Table A)
Table L. TOTAL: Education (% 1 year College) by Age by Occupation (NOTE: Identical to Table B)
Table M. TOTAL: Earnings (% $14.30/hr+) by Age by Occupation
Table N. MEN: Frequency of Occupation by Age; Education (% 1 year College) by Occupation; Earnings (% $14.30/hr+) by Occupation
Table O. WOMEN: Frequency of Occupation by Age; Education (% 1 year College) by Occupation; Earnings (% $14.30/hr+) by Occupation

Notes: Indirect standardization in these tables was done using the method described by Duncan (1961, p. 135).

The 1980 and 1990 Census occupational classifications are very similar. There are 503 occupations in the 1980 classification system and 501 occupations in the 1990 classification system. Nakao and Treas, using the 1980 classification system, were able to directly assign prestige ratings from the 1989 NORC Prestige Study to all but three occupation lines: 564, 569, and 635, which are all apprenticeship categories (prestige scores were later imputed for these lines). In Table P we present the NORC prestige measures, the Nakao and Treas male- and total-based SEI scores, and the component data used to construct those SEI scores.
Although Nakao and Treas were working with the 1980 occupational classification system, Table P presents the above-mentioned figures as they pertain to 1990-basis occupation lines. While Nakao and Treas were able to impute prestige scores for lines 564, 569, and 635, they excluded them from subsequent analyses. Likewise, we have excluded these three occupation lines from our analyses. Thus, since we use the 1990 classification, the effective sample size at the outset of our analyses is 498 cases.

Table Q compiles all of the components that are required for constructing a variety of SEI scores. In this table, [T], [M], and [F] indicate whether the components or scores pertain to the total population, just men, or just women. Also, [RW], [IS], and [DS] indicate whether the education and earnings scores are raw, indirectly standardized, or directly standardized. Note that the two sample selection methods which include everyone in the ECLF share the same education scores.

All of the data from this table, and the prestige scores from Table P, were put into an SPSS exportable file called SCORES.EXP for analysis. Variables in that file are described below. Note that "ECLF" stands for employed civilian labor force and "FTFY" stands for full-time, full-year labor force.
Variables in SCORES.EXP

CODE       1990 Basis Occupation Code
PRESTIGE   1989 NORC Prestige Score
PCT5       1989 NORC Prestige Score, % Ranking Job 5 or Higher
ECLF_T     Unweighted Sample Count, Men and Women in the ECLF
ECLF_M     Unweighted Sample Count, Men in the ECLF
ECLF_F     Unweighted Sample Count, Women in the ECLF
FTFY_T     Unweighted Sample Count, Men and Women in the FTFY Labor Force
FTFY_M     Unweighted Sample Count, Men in the FTFY Labor Force
FTFY_F     Unweighted Sample Count, Women in the FTFY Labor Force
NT1        Weighted Sample Count, Men and Women in the ECLF
NM1        Weighted Sample Count, Men in the ECLF
NF1        Weighted Sample Count, Women in the ECLF
NT2        Weighted Sample Count, Men and Women in the FTFY Labor Force
NM2        Weighted Sample Count, Men in the FTFY Labor Force
NF2        Weighted Sample Count, Women in the FTFY Labor Force

Education Variables

EDUCTDS1   Total-based, Directly Standardized, ECLF
EDUCTDS2   Total-based, Directly Standardized, FTFY
EDUCTIS1   Total-based, Indirectly Standardized, ECLF
EDUCTIS2   Total-based, Indirectly Standardized, FTFY
EDUCTRW1   Total-based, Raw Scores, ECLF
EDUCTRW2   Total-based, Raw Scores, FTFY
EDUCMIS1   Male-based, Indirectly Standardized, ECLF
EDUCMIS2   Male-based, Indirectly Standardized, FTFY
EDUCMRW1   Male-based, Raw Scores, ECLF
EDUCMRW2   Male-based, Raw Scores, FTFY
EDUCFIS1   Female-based, Indirectly Standardized, ECLF
EDUCFIS2   Female-based, Indirectly Standardized, FTFY
EDUCFRW1   Female-based, Raw Scores, ECLF
EDUCFRW2   Female-based, Raw Scores, FTFY

Earnings Variables

EARNTDS1   Total-based, Directly Standardized, ECLF (Annual Earnings)
EARNTDS2   Total-based, Directly Standardized, ECLF (Wage Rate)
EARNTDS3   Total-based, Directly Standardized, FTFY (Annual Earnings)
EARNTIS1   Total-based, Indirectly Standardized, ECLF (Annual Earnings)
EARNTIS2   Total-based, Indirectly Standardized, ECLF (Wage Rate)
EARNTIS3   Total-based, Indirectly Standardized, FTFY (Annual Earnings)
EARNTRW1   Total-based, Raw Scores, ECLF (Annual Earnings)
EARNTRW2   Total-based, Raw Scores, ECLF (Wage Rate)
EARNTRW3   Total-based, Raw Scores, FTFY (Annual Earnings)
EARNMIS1   Male-based, Indirectly Standardized, ECLF (Annual Earnings)
EARNMIS2   Male-based, Indirectly Standardized, ECLF (Wage Rate)
EARNMIS3   Male-based, Indirectly Standardized, FTFY (Annual Earnings)
EARNMRW1   Male-based, Raw Scores, ECLF (Annual Earnings)
EARNMRW2   Male-based, Raw Scores, ECLF (Wage Rate)
EARNMRW3   Male-based, Raw Scores, FTFY (Annual Earnings)
EARNFIS1   Female-based, Indirectly Standardized, ECLF (Annual Earnings)
EARNFIS2   Female-based, Indirectly Standardized, ECLF (Wage Rate)
EARNFIS3   Female-based, Indirectly Standardized, FTFY (Annual Earnings)
EARNFRW1   Female-based, Raw Scores, ECLF (Annual Earnings)
EARNFRW2   Female-based, Raw Scores, ECLF (Wage Rate)
EARNFRW3   Female-based, Raw Scores, FTFY (Annual Earnings)

Note that most of the education and earnings variables range from a minimum of 0 to a maximum of 1 in Tables A through O and in SCORES.EXP (the same cannot be said for the prestige measures). Scores higher than 1 or less than 0 were top-coded at 1 or bottom-coded at 0 in all subsequent analyses. The command file TOPCODES.SPS, and its output file TOPCODES.LOG, detail the scores which were altered in this procedure. In addition, for the purposes of our analyses we initially multiplied each earnings and education measure by 100.

Nakao and Treas (1994) provide the variable labelled PRESTIGE for both 1980-basis and 1990-basis occupation lines, but PCT5 for 1980-basis lines only. Converting PCT5 from 1980-basis to 1990-basis occupations is relatively simple, but in one instance PCT5 could not be computed directly.
For 1980-basis line 019, which was split into 1990-basis lines 017, 021, and 022, we computed PCT5 by regressing PCT5 on PRESTIGE for cases in which PRESTIGE fell between 35 and 65; we then used the parameters of this model to predict PCT5 for lines 017, 021, and 022. (It was unnecessary to use this procedure for 1980-basis line 468, which was split into 1990-basis lines 466, 467, and 468, because PRESTIGE was the same for all three 1990-basis lines.)

In addition to the PRESTIGE and PCT5 variables, we constructed another dependent variable, a started logit of PCT5 (with PCT5 expressed as a percentage). Specifically:

    PCTLOG5 = ln((PCT5 + 1) / (100 - PCT5 + 1))

Likewise, we constructed started logits of each of the education and earnings component variables.

The command file FINAL_SEI.SPS (and its log file FINAL_SEI.LOG) produces a wide variety of SEI prediction equations. After consideration of the functional form of the relationships between variables and of the influence of outliers, the final models exclude a number of occupation categories. Table 1 of Hauser and Warren (6/96) lists the categories which were excluded in the total, male, and female samples. The specifications of all of the models, as well as their results, are described in Table 2 of Hauser and Warren (6/96).

After we chose our preferred models and produced scores, all three sets of scores were transformed so as to range between 0 and 100; for all three, we added 2.08 to the predicted value of PCTLOG5 and then multiplied by 17.3. Table V2 presents the 1990-basis TSEI, MSEI, and FSEI scores and component data for all 501 1990-basis occupation lines. Table W2 presents the 1980-basis TSEI, MSEI, and FSEI scores and component data for all 503 1980-basis occupation lines.
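The top-coding, started logit, and 0-to-100 rescaling steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the contents of TOPCODES.SPS or FINAL_SEI.SPS. It assumes the started logit takes the form ln((p + 1)/(100 - p + 1)) for a percentage p on a 0-to-100 scale; the function names are hypothetical, and the fitted regression coefficients are not reproduced here.

```python
import math


def to_percent(proportion):
    """Top-code at 1, bottom-code at 0, then multiply by 100."""
    clamped = min(1.0, max(0.0, proportion))
    return clamped * 100.0


def started_logit(pct):
    """Started logit of a percentage on the 0-100 scale (assumed form)."""
    if not 0.0 <= pct <= 100.0:
        raise ValueError("pct must lie between 0 and 100")
    return math.log((pct + 1.0) / (100.0 - pct + 1.0))


def rescale_sei(predicted_pctlog5):
    """Rescale a predicted PCTLOG5 value: add 2.08, then multiply by 17.3."""
    return (predicted_pctlog5 + 2.08) * 17.3


if __name__ == "__main__":
    # Hypothetical raw education proportion for one occupation line.
    educ_pct = to_percent(0.62)
    print(started_logit(educ_pct))
    # A predicted PCTLOG5 of -2.08 maps to the bottom of the scale.
    print(rescale_sei(-2.08))
```

Adding 1 to both the numerator and the denominator keeps the transformation finite for occupation lines in which the percentage is exactly 0 or 100.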
Note that in converting from the 1990 Census Occupational Classification System to the 1980 Occupational Classification System, six 1990-basis lines were merged into two 1980-basis lines (017, 021, and 022 into 019; 466, 467, and 468 into 468), and six 1990-basis lines were split into twelve 1980-basis lines (353 into 349 and 353; 368 into 368 and 369; 436 into 436 and 437; 674 into 673 and 674; 795 into 794 and 795; 804 into 804 and 805). When a 1990-basis line was split into multiple 1980-basis lines, its values for education, earnings, and SEI scores were applied to each of the new lines. When 1990-basis lines were merged into a single 1980-basis line, we computed a weighted average of the education and earnings scores (weighted by the number of occupational incumbents), and then recomputed the SEI scores for the new line. Note also that many 1990-basis lines were renumbered for the 1980-basis classification, and that some lines were renamed. See Nakao and Treas (1994, pages 40-41) for details.

Finally, we extracted the occupation codes, TSEIs, MSEIs, FSEIs, and component data from Tables V2 and W2 to produce machine-readable versions of the 1980-basis and 1990-basis scores. HWSEI80.DAT contains the 1980-basis scores, and HWSEI90.DAT contains the 1990-basis scores. HWSEI80.CTL is a control file for use with HWSEI80.DAT, and HWSEI90.CTL is a control file for use with HWSEI90.DAT. README.TXT describes how to use these files.
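The rules described above for moving component scores from 1990-basis to 1980-basis lines can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the procedure actually used: the function names are hypothetical, the scores and incumbent counts in the example are invented for illustration, and the recomputation of SEI scores for merged lines (which requires the fitted prediction equations) is omitted.

```python
def merge_scores(scores, incumbents):
    """Weighted average of component scores for lines merged into one line.

    scores:     education or earnings scores of the 1990-basis lines
    incumbents: number of occupational incumbents in each of those lines
    """
    if len(scores) != len(incumbents):
        raise ValueError("scores and incumbents must have the same length")
    total = sum(incumbents)
    if total <= 0:
        raise ValueError("total number of incumbents must be positive")
    return sum(s * n for s, n in zip(scores, incumbents)) / total


def split_line(values, new_codes):
    """Apply one line's values unchanged to each line it is split into."""
    return {code: dict(values) for code in new_codes}


if __name__ == "__main__":
    # Hypothetical scores and counts for three lines merged into one.
    print(merge_scores([40.0, 60.0, 50.0], [100, 300, 600]))  # 52.0

    # 1990-basis line 353 is split into 1980-basis lines 349 and 353;
    # the component values shown here are invented.
    print(split_line({"educ": 35.0, "earn": 20.0}, [349, 353]))
```

Weighting by the number of incumbents means that the merged score equals the score that would have been obtained by pooling all persons in the contributing lines, provided the scores are raw percentages.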