COR723 Construction of the WDPI Dataset that Links to the WLS

Notes and supporting documentation for the high school district data file.

This dataset contains information regarding school staffing and spending for the school districts attended by most respondents in the WLS. Every year, officials from Wisconsin public school districts are required to supply information regarding staffing, spending and receipts to the Wisconsin Department of Public Instruction (WDPI). These data are stored on microfiche in the archives of the Wisconsin State Historical Society. All of it is publicly available.

In 1998, Craig Olson and Deena Ackerman collected this information from the state archives and created a dataset of WDPI information for 336 of the 421 school districts in the state of Wisconsin that had students in grades nine through twelve. Data were collected for 1954-1957, the four years that the WLS respondents were in high school. Districts were excluded from the dataset based on the following criteria:

a) District had fewer than five students in the full WLS.
b) District did not provide reliable information over any of the 4 years of the survey. Many of these districts were in the process of merging with other districts. Twelve districts were excluded for this reason.

The first restriction was imposed due to the costs of data extraction.

The sample used in Olson and Ackerman (2000) contained students in 278 of the 326 districts. (The rest were excluded because no WLS respondents in that district provided sufficient information for inclusion in their study.) Their sample included at least one observation from school districts that collectively enrolled a substantial share of the Wisconsin senior class of 1957. The high schools in their sample enrolled 28,609 seniors, or about 84 percent of the total population of seniors in the state in 1957 (State of Wisconsin 1957).
Thus, this dataset provides a relatively complete picture of school spending in the state of Wisconsin for the period.

This extract contains most of the important variables from the WDPI, as well as additional variables constructed to aid the user. Most variables included are district averages over the years of available data. For these variables, all dollar amounts are in 1957 dollars. Other variables include information provided for one year only. Unless otherwise noted, 1957 values are reported for these variables. A more complete extract with separate observations for each year is available from the authors upon request.

It was often impossible to distinguish missing observations from zeros in the original data. This problem usually arose at the DPI level (the original reports contained blanks or dashes for obvious zeros). However, sometimes it was due to carelessness or misunderstanding on the part of our undergraduate coders. For most questions asked, we were able to make reasonable assumptions regarding the proper assignment of missing versus zero and have corrected the data accordingly. We do not believe that this discrepancy will have any effect on any analysis conducted with the data.

Notes regarding specific variables in the extract

Variables constructed to aid with use of this dataset:

discode - this is a unique district identifier that links the DPI data to the WLS data.
f_merge - this variable flags those districts involved in mergers in this period. It is NOT true that only 13 districts were involved in mergers. Rather, only 13 of the districts involved in mergers provided sufficient WDPI information in this period to be included in the sample.
f_nocf - this variable records the number of coding forms used to construct averages.
f_hsonly - this variable flags those districts that were high school only. This information is based on reported data, and not on any formal knowledge of district composition.
For nine districts, this information was not consistent across years of available data.

years - this variable reports which years supplied data. For example, the district with years=1011 provided data in 1954, 1956 and 1957, but not in 1955. The districts reporting 111 provided information in 1955, 1956 and 1957 (the leading zero for 1954 is dropped). For those variables where information is reported for a select year only, information is for the most recent year available.

enrollment variables - e.g. en_b9_4 = enrollment for ninth grade males in 1954. This section contains enrollment information for ninth graders in all years by sex, as well as totals and seniors in 1957. Missing values are coded ".".

teaching staff variables - e.g. te_clast = teaching staff - number of classroom teachers in grades 9-12. These variables are averaged over the 4 years (or as many years as are available). Some minor cleaning was needed to replace missing information from the DPI forms with accurate information, since te_tea_t + te_supt + te_assup + te_princ should equal te_clast. The remaining observations for which it was still impossible to determine the number of teachers all came from very incomplete reports and are not included in this sample. This cleaning was done at the district/year level.

teacher information - these variables only include information regarding high school teachers, e.g. t_sal_m = mean teacher salary for men. The number of teachers in the school appears multiple times in this section in different guises. For example, t_no_t is the number of teachers for whom salary data are reported; t_dx_no is the number of teachers for whom district experience is reported. We believe that t_no_t is the most accurate count of the number of high school teachers in the district.
The following information is reported:

t_sal = mean teacher salary
t_dexp = mean teacher experience within district
t_text = mean teacher experience in total
t_tsch = mean years of post-secondary education per teacher
t_psl4 = mean percent of teachers with less than a bachelor's degree
t_psg4 = mean percent of teachers with more than a bachelor's degree
days = mean days in the school year

receipts - these variables report receipts to the school districts (high school receipts only). Missing values are coded ".". These are defined in the codebook, e.g. r_staidh = total state aid for the district.

disbursements - these variables report disbursements by the school districts (high school disbursements only). Missing values are coded ".". These are defined in the codebook, e.g. d_sal = disbursements for teacher salaries.

rural district variables - these are reported for the most recent year available. All missing values are assumed to be zeros, e.g. ru_no1rm = number of one-room schoolhouses in district.

district administration - these variables contain information regarding how the district is administered and funded. ct = city district, cs = common school district.

union status - this variable reports whether the teachers are unionized. Information is for the most recent year available.

Citations

Olson, Craig A. and Deena Ackerman. 2000. "High School Inputs and Labor Market Outcomes for Male Workers in Their Mid-Thirties: New Data and New Estimates from Wisconsin." Institute for Research on Poverty Discussion Paper no. 1205-00.

State of Wisconsin. 1957. "Thirty-eighth Report of the Superintendent of Public Instruction of the State of Wisconsin." Madison, WI.
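
Illustrative note on the years variable

The years variable described in the notes above packs four yes/no indicators into one number. The following is a minimal sketch, not part of the original extract, of how it might be decoded in Python. It assumes that years is stored as a number with one digit per year from 1954 through 1957 and that leading zeros are dropped, as the examples 1011 and 111 suggest. The function name decode_years and its parameters are hypothetical.

```python
def decode_years(flag, first_year=1954, n_years=4):
    """Return the calendar years in which a district supplied data.

    Assumption: `flag` is the `years` value, one 0/1 digit per year
    starting at `first_year`, with leading zeros dropped (111 == 0111).
    """
    # Restore any dropped leading zeros so each position maps to one year.
    digits = str(int(flag)).zfill(n_years)
    if len(digits) != n_years or set(digits) - {"0", "1"}:
        raise ValueError(f"unexpected years flag: {flag!r}")
    return [first_year + i for i, d in enumerate(digits) if d == "1"]


# The two examples given in the notes:
print(decode_years(1011))  # 1954, 1956 and 1957, but not 1955
print(decode_years(111))   # 1955, 1956 and 1957
```

The length of the returned list gives the number of years of data available for a district, which can be compared with f_nocf as a rough consistency check, if one coding form corresponds to one year (an assumption not stated in these notes).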