MANUAL FOR CODING OCCUPATIONS IN THE WISCONSIN LONGITUDINAL STUDY 1992-94 FOLLOW-UPS

COR476.WP5

S C Noah Uhrig*

August 1994

* Robert M. Hauser was Principal Investigator of the Wisconsin Longitudinal Study. Taissa S. Hauser was responsible for general direction and oversight of data collection, coding, management, and documentation. Survey data were collected by the staff of the Letters and Science Survey Center, UW-Madison. Linda Jordan was responsible for project computing and programming. Jeff Bloom and SC Noah Uhrig assisted her in the management of occupational data. Heidi Gilbertson-Ostermeier managed the occupational coding team including: O. Colby Carr, Chris Celi, Mike Cik, Amy Esler, Peggy Glynn, Ellen Hopkins, Carrie LaFond, Victoria Rateau, Patricia Richards, and Yin-Yeng Chen.

Table of Contents

The WLS questions ascertaining job descriptions
    QUESTIONS FOR RESPONDENTS
    WORK ASPIRATION QUESTIONS
    JOB QUESTIONS IN THE SIBLING STUDY
From Interview to Database
    INTERVIEWS AND FILES
    OCCUPATIONAL RECORDS
    OCCUPATIONAL CODING FILES
Assigning Codes
    CODER TRAINING
    1970 AND 1990 CODES
    JOB CHANGES WITHIN AN EMPLOYER
    MEMOS USED FOR CODING
    RELIABILITY ASSESSMENTS
Data Management Techniques
    CODE FLAGS
        BACK-TO-THE-FIELD
        ALLOCATION
        COMMENT
        WAITING
        DONE
    FILE MOVEMENT
    DATA CLEANING
    BACKUP PROCEDURES
Validating Occupational Codes
Lost Records and Coding Problem Cases
    LOST RECORDS
    PROBLEM CASES
Appendix A - MEMOS
    MEMO 79 - Typical problems in occupational coding and general guidelines to improve interviewing and reliability
    MEMO 83 - Scanning for occupational call-backs
    MEMO 88 - Managers and foremen
    MEMO 111 - More on managers and foremen
    MEMO 93 - Secretaries, nec, truck drivers, multidimensional firms, and retail/wholesale trade
    MEMO 96 - Military jobs, foremen, volunteers, housespouses, and lower priced department stores
Appendix B - CODING SCREEN FORMAT
Appendix C - COMPUTER REQUESTS
    COR 447 - Programs used to match and merge the data received from the survey center and create files to be used for the purpose of coding occupations
    COR 497 - Programs used to merge, update, and to do reliability checking of all occupation coding
    COR 466 - Search for coding errors by major occupational group comparison

Social scientists are sometimes challenged by capturing the concept of work in easily measured and manipulated variables.
This document describes the procedures and processes used by the Wisconsin Longitudinal Study (WLS) to translate qualitative job descriptions into numeric variables. The WLS assigned, to all job reports in the 1992-94 round of interviews, 1970 U.S. Census seven digit occupation and industry codes: a three digit industry code, a three digit occupation code, and a one digit class of worker code. The WLS used 1970 Census codes for congruency with previous WLS data files. The seven digit codes can be recoded into a variety of other classifications. For example, the WLS staff recoded 1970 Census job codes into variables representing major industry group, major occupation group, and socioeconomic index values. We expect to use the seven digit codes to sort the text, and eventually code to 1990 U.S. Census code standards. These seven digit codes are, therefore, the basis for all subsequent classification and scaling of occupation and industry reports. The WLS is archiving not only the seven digit code, but also the original text of the occupation and industry reports for other potential uses. These text files are not available to the public. The WLS job questions (See Appendix F for Employment Histories) QUESTIONS FOR RESPONDENTS The WLS used a slightly altered form of the U.S. Census question series to obtain codable job descriptions. In gathering information about the respondents' jobs in 1975, the WLS employed two open-ended questions to secure task descriptions: What kind of work were you doing at that company in that year ? (FOR EXAMPLE: ELECTRICAL ENGINEER; STOCK CLERK; FARMER) Second, What were your most important activities or duties ? (FOR EXAMPLE: KEPT ACCOUNT BOOKS; FILED; SOLD CARS; OPERATED PRINTING PRESS; FINISHED CONCRETE) Coders assigned a three digit occupational code to these question responses. In order to capture the industry in which the respondent worked in 1975, the WLS asked two questions. What kind of business or industry was this ? 
(FOR EXAMPLE: ELEMENTARY SCHOOL; TV AND RADIO MANUFACTURING; RETAIL SHOE STORE; STATE LABOR DEPARTMENT; FARM)

    Was this mainly manufacturing, wholesale trade, retail trade or something else? (MANUFACTURING, WHOLESALE, RETAIL, or SOMETHING ELSE)

These question responses were translated into a three digit industry code.

To obtain the class of worker for the respondents' 1975 job, each respondent was asked a series of questions as appropriate. First,

    Were you employed by government, by a private company or organization, or were you self-employed or working in a family business? (GOVERNMENT, PRIVATE COMPANY OR ORGANIZATION, INCLUDING NON-PROFIT FIRMS, SELF-EMPLOYED, WORKING IN FAMILY BUSINESS, OTHER SPECIFY)

If the respondent indicated they were self-employed or worked for a family business, then the respondent was asked, "Was this business incorporated?" Then only those respondents indicating that they worked in a family business were asked, "Were you working for pay?" This information allowed WLS coders to determine whether the person worked for the government, an independent organization, one's self, or a family business without pay.

The WLS thus obtained detailed information about the job of each respondent. The above series of questions was used to obtain information about respondents' occupation, industry and worker classification. The same series of questions was also asked about the current spouse, the selected child, the selected sibling, and other respondent jobs. For example, rather than asking "What were your chief activities or duties?", when referencing the respondent's spouse the WLS asked "What were your spouse's chief activities or duties?"

WORK ASPIRATION QUESTIONS

The WLS was also interested in obtaining information about the respondents' aspirations, both as a recollection of what respondents in 1975 wanted to be doing today and as a report of what they would want to be doing 10 years in the future.
For example, to obtain a detailed description of respondents' 1975 aspirations, the WLS asks: Now, we are interested in the plans that people make about their lives. Think back to 1975, which would be about 18 years ago. What did you want to be doing today ? First, did you want to be working or not working ? If the respondent said they did not want to be working, they were asked, "What did you want to be doing ? (RETIRED, KEEPING HOUSE, SPECIFY SOMETHING ELSE)" However, if the respondent said they wanted to be working, the WLS asks, "At that time, what kind of work did you want to be doing today ? (FOR EXAMPLE: ELECTRICAL ENGINEER; STOCK CLERK; FARMER)" Followed by, "Was this the same kind of work as you were doing in 1975?", and "Was this the same kind of work as you are doing today?" If the respondent answered "No" to either of these questions, then the WLS queried the respondent more about what they claimed they wanted to be doing: You said that you wanted to ... Can you tell me more about that kind of work ? (PROBE: WHAT WOULD BE YOUR MOST IMPORTANT ACTIVITIES OR DUTIES?) (FOR EXAMPLE: KEPT ACCOUNT BOOKS; FILED; SOLD CARS; OPERATED PRINTING PRESS; FINISHED CONCRETE) Therefore, the WLS was able to obtain a more detailed description of the respondents occupational aspiration. To ascertain a more detailed description of the industry in which the respondent hoped to work, the WLS asked: What kind of business or industry would that be in ? (FOR EXAMPLE: ELEMENTARY SCHOOL; TV AND RADIO MANUFACTURING; RETAIL SHOE STORE; STATE LABOR DEPARTMENT; FARM) Finally, to capture class of worker, the WLS asked "Would that be working for yourself or for someone else?" Additionally, a similar series of questions was used to obtain a detailed description of what respondents wanted to be doing 10 years from the time of interviews: Now we would like to ask a few more questions about your plans for the future. 
If you were free to choose, what would you like to be doing 10 years from now, in terms of your work ? Would you like to be working full-time, working part-time, not working, retired, or something else ? If the respondent would like to be working, the WLS asks, "Would this be the same kind of work that you are doing now?" If the respondent answers "YES" then the current job is entered as the aspired job. However, if this is not what the respondent wants to be doing, then the WLS probes, "What kind of work would you like to be doing?" and "What would be your principal activities or duties?" For an industry description, the WLS asks "What kind of business or industry would that be in?" To obtain class of worker, "Would that be working for yourself or for someone else?". JOB QUESTIONS IN THE SIBLING STUDY After finishing respondent production coding, WLS staff undertook coding of job descriptions from the WLS Sibling Study. Similar to the WLS respondents and WLS 1975 non-respondents, jobs were coded in the WLS Sibling Study to the 1970 U.S. Census standards of occupation, industry and class of worker. First, siblings were asked, "Have you ever held a full-time or part-time job?" If so, they were asked about their first job since completing their last full year of school: You told us that you completed your highest grade or year in school in some year. We would like to know about the first full-time civilian job you had after you completed your highest grade in school. What kind of work were you doing? Do include full-time work in a family business or farm, even if you were working without pay. If the sibling had not held a full time job since they finished their last year of school, or if they had not finished school yet, they were not asked this series of questions. However, those that were able to describe their first occupation since completing school were asked next, "What were your most important activities or duties?" 
Then to ascertain industrial descriptions the sibling was asked, "What kind of business or industry was this?" and "Was this mainly manufacturing, wholesale trade, retail trade or something else?" Then to obtain class of worker information, "Were you employed by government, by a private company or organization, or were you self-employed or working in a family business?", "Was this business incorporated?", and "Were you working for pay?" From Interview to Database INTERVIEWS AND FILES With one exception noted below (see Occupational Call Backs), all of the occupation and industry data in the 1992-94 round of the WLS were obtained through computer-assisted telephone interviews (CATI), using the CASES system, by the Letters and Science Survey Center (LSSC). Therefore, descriptive text, as well as closed-ended responses were obtained in machine- readable form. The LSSC delivered all interview data to the WLS in "batches". These "batches" of data contained information for approximately 100 interviews in two files. The LSSC placed fixed-length data from closed-ended questions in one file, called an output file, and variable-length data from open-ended questions in another file, known as the history file. As with respondents, the LSSC sent data in "batches" of approximately 100 sibling interviews each. Data for the occupational data base come from both history and output files. OCCUPATIONAL RECORDS WLS staff used a series of programs that assembled the information from both history and output data files (see WLS COR447 and the COR's adjoining programs 48, 49, 50, and 51 for details) into an occupational coding file. Each respondent could have maximum of 18 jobs including selected sibling, spouse, selected child and job aspiration. Siblings could have a maximum of six jobs. But the average number of job reports per respondent interview was six, and sibling interview was three. Hereafter, each job is referred to as a record. 
A separate data line was written for each job in ASCII form. Each line contained a unique identification number (known as "IDSWL") and a question value indicating the job to which the question series referred (known as "QUES"). Possible "QUES" values are:

ORIGINAL RESPONDENTS:

    CHILD     Selected child's current job.
              Data come from questions 338z-380x
    RESP1     Respondent's first employer, job in 1975 or first job since 1975.
              Data come from questions cc01-cc40
    RESP11    Respondent's first employer, second job since 1975.
              Data come from questions cc79-cc80
    RESP2     Respondent's second employer, first job.
              Data come from questions dd01-dd40
    RESP21    Respondent's second employer, second job.
              Data come from questions dd79-dd80
    RESP3     Respondent's third employer, first job.
              Data come from questions ee01-ee40
    RESP31    Respondent's third employer, second job.
              Data come from questions ee79-ee80
    RESP4     Respondent's next to last employer, first job.
              Data come from questions gg79-gg80
    RESP41    Respondent's next to last employer, second job.
              Data come from questions gg79-gg80
    RESP5     Respondent's current/last employer, first job.
              Data come from questions hh01-hh40
    RESP51    Respondent's current/last employer, second job.
              Data come from questions hh79-hh80
    RESP6     Respondent's current employer, first job.
              Data come from questions ii01-ii40
    Sib A     Selected sibling's current job (sib alive).
              Data come from questions 404n-406q
    Sib D     Selected sibling's last job (sib deceased).
              Data come from questions 401R-401g
    Spouse    Spouse's current job.
              Data come from questions 78mz-80x
    FUTR.Q    Respondent's future job.
              Data come from questions 934f-934s
    EMPL.Q    Respondent's aspired job.
              Data come from questions aa01-aa21

FOR 1975 NON-RESPONDENTS ONLY:

    Respa     Respondent's first job after completing school.
              Data come from questions z2-z2aa
    HeadS     Job of head of household when respondent was a senior in high school.
              Data come from questions z5-z5g
    MotherS   Mother's job when respondent was a senior in high school.
              Data come from questions z9a-z9h

FOR SIBLINGS ONLY:

    Sib F     Sibling's first job since last full year of school.
              Data come from questions bb04-bb24
    Sib77     Sibling's job in 1977 if different from Sib F.
              Data come from questions cc05-cc24
    Sib L     Sibling's current or last job.
              Data come from questions dd05-cc24
    Respd     Job of original respondent if dead or missing.
              Data come from questions 401R-401q
    Respl     Current or last job of original respondent if otherwise not interviewed.
              Data come from questions 402q-402x

Each line in the occupational coding file contained all text describing occupation, industry, and class of worker, as well as several other pieces of information: group number, respondent's (or sibling's) name, 1977 interview status, and company name. However, coders could not view all of this information. See Occupational Coding Files for a description of the data base used to assign codes.

OCCUPATIONAL CODING FILES

WLS staff named each occupational coding file based on the date the LSSC created the output and history files. Each ASCII file prepared for coding received a name of the form "Merxxxx.dat", where "xxxx" represents the date the data were prepared by LSSC. In the case of siblings, the name was "Smerxxxx.dat". These are temporary, intermediate files, which were aggregated and archived after coding was completed.

WLS staff converted the "Merxxxx" and "Smerxxxx" files into dBASE III for use with PC-File. This file conversion was accomplished in one of two ways.
WLS staff either "imported" the ASCII file into an existing PC-File data base or used a special command in the PC version of SAS (see WLSPG 93).

WLS staff designed a unique PC-File format that displays each record and allows coders to enter industry and occupational codes. See Appendix B for a description and display of this format.

When the ASCII files were prepared for coding, portions of text were occasionally too long for the fields holding the text in PC-File. This "extra" text was printed during preparation of the "Merxxxx" or "Smerxxxx" file and referenced to the QUES, IDSWL, and field where the text belonged. After conversion to dBASE III format, a coding supervisor reviewed and paraphrased the text portions that were too long.

Assigning Codes

This document is not meant to be a thorough review of how coders arrive at a seven digit code for text describing respondents' jobs. WLS attempted to follow the cognitive coding process of the 1975 occupation coders. However, this document outlines two things: 1) how coders were trained to use the 1970 U.S. Census codes and 2) how WLS managed the data and entry of codes.

CODER TRAINING

After an introduction to the concepts and principles of industry and occupation classification and coding, training consisted of working through coding examples. First, novice coders worked through examples in a large group and then in successively smaller groups until each individual was coding on their own. Coding examples were taken from the 1992 WLS occupation file. Training also consisted of reading the 1975 coding handbook.

1970 AND 1990 CODES

For comparability with earlier measurements, the WLS used 1970 U.S. Census Industry and Occupation codes. These were the same codes used in the 1975 Wisconsin Study project. As in 1975, coders used the following reference manuals:

    U.S. Bureau of the Census, 1970 Census of Population Alphabetical Index of Industries and Occupations, U.S. Government Printing Office, Washington, D.C. 1971.

    U.S. Bureau of the Census, 1970 Census of Population Classified Index of Industries and Occupations, U.S. Government Printing Office, Washington, D.C. 1971.

    U.S. Department of Labor, Dictionary of Occupational Titles (Volume 1, 3rd Edition), U.S. Government Printing Office, 1965.

The United States job market, however, changed between 1970 and 1990. Market activity typically generates new jobs and new firms as time passes. Consequently, the U.S. Census changed its occupational classification system in both 1980 and 1990. In instances where the 1970 manuals did not extend to modern jobs and firms, WLS coders were instructed to search for a 1990 code and then translate this code into a legal 1970 code. To do this, the coders looked up a code in the 1990 alphabetic index, then used the outline version of the 1990 classification system to identify the code's descriptive title (see Appendix E for outlines of 1970 and 1990 codes). That title was then carried over to the 1970 listing of descriptive titles, and the 1970 code that most closely matched the 1990 title was selected. References for this activity were:

    U.S. Bureau of the Census, 1990 Census of Population and Housing Alphabetic Index of Industries and Occupations, U.S. Government Printing Office, Washington, D.C. 1992.

    U.S. Bureau of the Census, 1990 Census of Population and Housing Classified Index of Industries and Occupations, U.S. Government Printing Office, Washington, D.C. 1992.

JOB CHANGES WITHIN AN EMPLOYER

For original respondents, the WLS obtained information about job changes within an employer. For example, a respondent may have been hired as a technical writer for a computer software manufacturing company. This respondent could have been promoted to a different job within the same employer, for example a budget analyst. Thus two lines in the occupation coding file would be written for this respondent.
Both lines have identical industry information (computer software manufacturing) but different occupation information (technical writer versus budget analyst).

WLS endeavored to code jobs separately and independently of one another for any given respondent. Therefore, WLS coders used a special sorting sequence based on whether the job had actually been coded (see Code Flags below), the date of the interview, and the "question" (QUES, described above). In this way, all respondents' first jobs were coded at the same time, then all the second jobs, then third, and so forth. Because the first job came earlier in the sort sequence, a respondent's first job with a given employer was coded before the second job with that employer. To ensure that the industry code was the same for jobs at the same employer, the text "See Resp(n) for this IDSWL for INDCODE" was printed in the job change record, where (n) was replaced with the appropriate job number. Coders then retrieved the earlier industry code for the second job.

MEMOS USED FOR CODING

As novice coders became more proficient at coding, their questions became more general, as did the answers they sought. Several memos generated throughout the coding process outline the "official" WLS ruling on how to code certain jobs. Please see Appendix A for the memos circulated during coding.

RELIABILITY ASSESSMENTS

During the first few months of new coders' tenure with the WLS, supervisors closely monitored intercoder reliability. A random sample of approximately 5 percent of respondents was coded twice: once by a check coder and a second time by a production coder. The two sets of codes were checked to see if they matched on one digit industry, one digit occupation, three digit industry, three digit occupation, and the full seven digit code. In this way, WLS was able to assess whether we were creating and using a reliable coding instrument.
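The comparison of check codes with production codes can be illustrated with a short sketch. It is not the WLS program (see WLS COR 497 for the programs actually used). It assumes that the seven digit code is stored as a string with the industry digits first, the occupation digits next, and the class of worker digit last; that ordering, the function names, and the sample codes are all hypothetical. The one digit comparisons are omitted because they depend on a major group recode that is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobCode:
    industry: str      # three digit industry code
    occupation: str    # three digit occupation code
    worker_class: str  # one digit class of worker code


def split_code(code: str) -> JobCode:
    """Split a seven digit code; the digit order is an assumption."""
    if len(code) != 7 or not code.isdigit():
        raise ValueError(f"not a seven digit code: {code!r}")
    return JobCode(code[:3], code[3:6], code[6])


def agreement_rates(pairs):
    """Return the share of (check, production) pairs that match at each level."""
    total = industry = occupation = full = 0
    for check, production in pairs:
        a, b = split_code(check), split_code(production)
        total += 1
        industry += a.industry == b.industry
        occupation += a.occupation == b.occupation
        full += a == b
    if total == 0:
        raise ValueError("no code pairs supplied")
    return {
        "industry3": industry / total,
        "occupation3": occupation / total,
        "full7": full / total,
    }


# Hypothetical double-coded records: (check coder, production coder).
sample = [
    ("8572451", "8572451"),
    ("8572451", "8573011"),
    ("0178011", "0178012"),
    ("6092451", "6092451"),
]
rates = agreement_rates(sample)
```

In this made-up sample the coders agree on every industry code, on three of four occupation codes, and on two of four full codes.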
Aspiration descriptions tended to be more ambiguously articulated by respondents; therefore, they were coded separately and were not involved in reliability checks.

WLS supervisors elaborated the reliability analysis to identify specific problem codes based on a more stringent statistical procedure. All mismatching codes were analyzed by occupation and industry group, and those codes that significantly reduced reliability were selected for retraining (please see WLS COR 466). We maintained a target of 80 percent reliability on the three digit occupation code and 85 percent reliability on the three digit industry code. Once coders attained these levels, supervisors performed reliability assessments less frequently.

Data Management Techniques

CODE FLAGS

Coders assigned a flag to each record as they coded it. These flags reflected the action that was taken on the record. For example, the coder could have completed a code or sent the record back to the field for more information. In assigning flags, the coder had five options:

    "BACK-TO-THE-FIELD" (Flag "B")
        The record needs to be sent back to the LSSC, where interviewers then phone the respondent to ask further questions about the job. These questions are written by the coders and are subject to supervisor approval and editing.

    "ALLOCATION" (Flag "A")
        Some portion of the seven digit code could be one of several codes. The coder flags the record "A" if a special allocation process should be used to assign a code (see "Coding job aspirations and achievement").

    "COMMENT" (Flag "C")
        The coder could not identify any code at all or felt uncertain about their first choice code. This is called commenting.

    "WAITING" (Flag "W")
        The coder could not complete the code because a job change had occurred and the prior job was sent back to the field, allocated or commented.

    "DONE" (Flag "D")
        The coder was able to successfully assign a code during production coding. No special action needed to be taken with the record.
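For readers who work with the archived files, the five flags can be represented as a small enumeration. This is only an illustrative sketch: the WLS entered flags as single letters in PC-File, and the class name, the helper function, and the tolerance for lowercase input are assumptions made for the example.

```python
from enum import Enum


class CodeFlag(Enum):
    """The five flags a coder could assign during production coding."""
    BACK_TO_THE_FIELD = "B"
    ALLOCATION = "A"
    COMMENT = "C"
    WAITING = "W"
    DONE = "D"


def needs_special_action(flag_letter: str) -> bool:
    """True for every flag except "D" (hypothetical helper).

    Raises ValueError if the letter is not one of the five flags.
    """
    flag = CodeFlag(flag_letter.strip().upper())
    return flag is not CodeFlag.DONE
```

For example, needs_special_action("B") is true, while needs_special_action("D") is false.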
Once production coders completed a "Merxxxx" or "Smerxxxx" file, the file was sorted by IDSWL and all records marked "A", "B", "C", or "W" were printed to paper using the report utility in PC-File. This paper report was reviewed by a supervisor, who determined whether records marked "B" contained adequate questions, whether "A" or "C" records should be returned to the field, whether records marked "B" should instead be allocated, and so on. The LSSC wanted all call back records for each respondent or sibling kept together, hence the sort by IDSWL. In this way, all further questions for a respondent or sibling could be asked in a single call. Furthermore, WLS endeavored to have all call backs completed soon after the interview.

For a typical collection of 500 occupations, roughly 30 were returned to the field, 15-20 were allocated, and 10 were commented. As coders became more experienced, however, the distribution of allocations and comments changed. Allocations became more frequent as coders accumulated experience, and comments became less frequent.

Once these special case records were printed, the flag was changed: "B" became "S", "C" became "R", and "A" became "N". Allocations and comments were completed by coders trained in these activities. The assigned codes were entered by a supervisor, who simultaneously checked the code for accuracy. Once respondent call backs were returned from the field, a coder assigned codes and a supervisor entered the code and any additional text into the electronic version of the record. All paper copies of comments, allocations and "back-to-the-fields" were archived in three ring binders.

FILE MOVEMENT

Several techniques were tested throughout the coding process to manage accumulating files. During the initial coding of respondent job reports, three computers housed occupational data sets to which supervisors added "Merxxxx" files as they became available. This technique, however, proved to be inconvenient.
First, only three coders could work at a time. Second, as each data base grew, movement from record to record slowed. Third, when records were finished, they were sporadically removed and no clear system of completing "back-to-the-fields", "allocations" and "comments" existed. Often, job change lookups could not be completed because the corresponding job had been removed from the data base. Furthermore, this system generated over 600 missing records. We attribute missing records to PC-File's "export" utility. When records were finished, a supervisor would "export" the record from the data base. Even though the information was removed, the physical space for the record remained in the data base and a supervisor was to delete this physical space via a search and delete algorithm. If the supervisor was not careful to export the same number of records as they deleted, records were sure to have inadvertently become missing. Consequently, WLS changed the file management strategy completely. WLS supervisors initiated several radical changes in the coding process prior to completing respondent job coding and commencing sibling job coding. First, each "Merxxxx" file was reviewed individually, not from a "heap", by coders before records were printed for allocation, comment or call-back. Second, all data entry resulting from special case procedures was completed before the data was cleaned (see Data Cleaning below). Third, the PC version of SAS became the preferred method of translating dBASE III files into ASCII. Fourth, supervisors moved "Merxxxx" files from directory to directory depending on their coding status. Files that coders had reviewed were moved from a production coding directory to a data entry directory. After data entry, the files were moved from the entry directory to a cleaning directory. And fifth, supervisors also began using a Novell based network to move files and allow coders access to files more easily. 
This final process was firmly established as coders began work on the sibling job reports. Consequently, no records became missing during sibling production coding. For further detail on this process, including the early coding process and the change in process, please see the flow chart in Appendix D.

DATA CLEANING

Once coders completed each "Merxxxx" or "Smerxxxx" file, a supervisor ran a series of cleaning programs on the file before aggregating these data with previously coded occupational data. First, all alphabetic codes were translated into numeric codes (several common three digit industry and occupation codes have one-letter equivalents that coders used to increase efficiency). Second, all codes were checked against the list of legal U.S. Census codes to be sure only meaningful industry and occupation codes were assigned. Job records with illegal occupational or industry codes were written to a file and later reviewed and corrected. Third, each file was reviewed for duplicate records, which were subsequently deleted.

Once occupational data were cleaned at this level, they were copied into an accumulating record file named "Allocc". Sibling interview job reports were copied to a file named "Socdone". "Allocc" and "Socdone" contained one record for each job report in the WLS. Each file was then transposed into a file that contained all jobs for each respondent or sibling in one record, including spouse, child, sibling, and aspiration jobs as appropriate. Intermediate "Merxxxx" and "Smerxxxx" files were compressed and saved in the event that data needed to be recovered. However, at this point, "Merxxxx" and "Smerxxxx" files were essentially obsolete.

BACKUP PROCEDURES

All "Merxxxx" and "Smerxxxx" files were backed up daily, nightly, and weekly. Coders saved a version of the file to a floppy disk (1.4 MB) when their shift was over. The file was also saved to a different floppy at the end of the work day, and onto a third disk at the end of the work week.
We cycled through four disks on a daily basis, and two different disks for the nightly and weekly back up. Because of this process, we could recover all work completed for a given day, were the computing system ever to malfunction or any other errors were to occur. The final occupational coding file was also backed up to 3480 tape, and supervisors cycled through three different tapes for this task as well. Validating Occupational Codes WLS staff suspected that several codes were often assigned incorrectly. WLS staff had several sources of information about this. First, coders and coding supervisors reported continuing problems with certain coding decisions. Second, we obtained additional diagnostic information from our reliability checks, by classifying disagreement by code (rather than coder) and looking for typical patterns of disagreement. Third, the 1992-94 interviews with 1975 respondents included a retrospective report of industry and occupation in 1975. By classifying the code for this retrospective report by the code for the contemporaneous (1975) report, we were able to locate specific occupations and occupation groups where disagreement occurred frequently. Using the information from these three sources, we chose specific occupations or groups of occupations for detailed review. For example, "managers and administrators, not elsewhere classified," the 1970 occupation code "245", was one suspected misidentified group. All job records coded "manager" were selected, sorted by class of worker, and industry codes. Then, within class of worker and industry, a special "sounds like" sort was carried out on the respondent's answer to "What type of work do you do?". All vowels were removed from the WORK1 field (see coding fields in Appendix B). Then the remaining consonant string was used in the "sounds like" sort, in ascending order. In this way, all records were aligned by text that seemingly described similar tasks within industry and class of worker. 
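The "sounds like" sort described above can be sketched in a few lines. This is a reconstruction for illustration only, not the program WLS staff ran. The field names other than WORK1 are hypothetical, and three details are assumptions: that case is ignored, that spaces and punctuation are dropped along with the vowels, and that the letter "Y" is kept as a consonant.

```python
import re

_VOWELS = re.compile(r"[aeiou]", re.IGNORECASE)
_NON_LETTERS = re.compile(r"[^A-Za-z]")


def consonant_key(text: str) -> str:
    """Remove vowels, then non-letters, and fold to upper case."""
    without_vowels = _VOWELS.sub("", text)
    return _NON_LETTERS.sub("", without_vowels).upper()


def sounds_like_sort(records):
    """Sort by class of worker, then industry, then the consonant string."""
    return sorted(
        records,
        key=lambda r: (r["worker_class"], r["industry"], consonant_key(r["work1"])),
    )


# Hypothetical records already selected as coded "manager".
records = [
    {"id": 1, "worker_class": "1", "industry": "609", "work1": "store manager"},
    {"id": 2, "worker_class": "1", "industry": "609", "work1": "Office Manager"},
    {"id": 3, "worker_class": "1", "industry": "609", "work1": "office mgr"},
    {"id": 4, "worker_class": "1", "industry": "017", "work1": "farm manager"},
]
ordered_ids = [r["id"] for r in sounds_like_sort(records)]
```

With these made-up records, "office mgr" and "Office Manager" reduce to "FFCMGR" and "FFCMNGR" and land next to each other, which is the alignment of similar descriptions that the procedure relied on.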
These selected job records coded "manager" were written to a file. Four fields were added to this file: three held the valid occupation, valid industry, and valid class of worker codes, and the fourth held a special flag that indicated the action taken with the record during editing. A coder reviewed each record and determined whether the assigned code accurately reflected the tasks described. The concept was similar to the children's song "One of these things is not like the other". If the record, as sorted, was similar to those records grouped near it, then the code remained. If, however, the text described a job that was very different from the jobs grouped near it, the code was suspect, and the job description was recoded. Once reviewed, the file was cleaned as described above and added to "Allocc" or "Socdone". This technique of verifying records was also performed on "Foremen, not elsewhere classified" ("441") and clerical workers ("301" to "394").

Lost Records and Coding Problem Cases

LOST RECORDS

During the respondent coding process over 600 records went missing. The finished file, known as "Allocc", was compared with a master file containing all possible occupational records for respondents. WLS staff believe that these records went missing for one of two reasons. First, while removing respondent records from the coding data bases during the initial production coding process, staff deleted records from the data base once they had been "exported" through PC-File. We believe that a large number of records that were not completed also were deleted during these instances via either a PC-File identification error, or a supervisor typo, or both. Second, we believe that PC-File would sometimes indicate that a record was "exported" when in fact it was not. PC-File would sometimes fill records with a set of "Z's"; consequently, we were not sure whether all records were actually complete.
During cleaning, these records would be deleted as not holding appropriate values in some portion of the record. At that time, the actual coding record would have been deleted. Although we realized this may have been occurring, we could not identify an effective means of translating the dBASE III coding files into ASCII files for analysis. We found the command within the PC version of SAS that performs this task only half-way through coding of respondent data. After changing to the PC version of SAS to translate files into ASCII format, and using the "batch" file processing scheme, we were able to maintain the proper number of cases throughout the coding process. Consequently, and fortunately, no sibling records went missing during production coding of sibling job descriptions.

Those records that were determined to be missing were re-written to a file and coded by production coders under normal production coding procedures. However, the coders were instructed not to send those records back to the field for call backs. This may have violated WLS policy to perform callbacks in a timely relation to the interview. This file of lost records was cleaned and then copied into "Allocc".

PROBLEM CASES

A small number of records turned into very large problems. We find identifying or generalizing these problems very difficult. Therefore, here we discuss several examples of problem cases and how they were remedied.

1) The job described was not a job at all:
   Work: student///
   Duties: going to class, taking tests////
   Kind: University of Wisconsin-Eau Claire///
   Decision: All variables that indicate that this is a job were given blank values in the data file. The text remained in the history files; however, no record was written to "Allocc" or the final occupational data tape.

2) Person was not employed at all, i.e.
the section should have been skipped:
   Work: not employed due to arthritis///
   Duties: none///
   Kind: none///
   Decision: This case was originally returned for a call back. However, the call back was returned to us indicating the person was not employed at all. Therefore, all variables that indicate that this is a job were given blank values in the data file. The text remained in the history files; however, no record was written to "Allocc" or the final occupational data tape.

3) The responses were uncodable.
   Text for first job in an employer (Resp1):
   Work: ///
   Duties: same as before only now I work with newer equipment///
   Kind: University///
   Text for second job in the employer (Resp21):
   Work: ///
   Duties: see above, department is closed so they're getting rid of books, equipment, keeping records///
   Kind: See QUES - Resp1 for this IDSWL for INDCODE
   Decision: All of the respondent's previous jobs were reviewed to find the job most recent to this one. The occupation code from that job was assigned to the first job. The second job was given the same code as the first job.

4) Test cases: For about 20 respondents, LSSC used our "real" data for pre-testing the questionnaire and interviewer training. The data they gave us contained both phony and real answers.
   Work: attorney///waitress///stock broker///
   Duties: litigating in the court room///bringing people their eggs or whatever they wanted///selling stocks and bond///
   Kind: legal firm///diner///brokerage///
   Decision: Coders were instructed to code the last job listed. However, this was not uniformly practiced. All test cases were dumped and the last job listed was validated as the actual job coded.

5) Records that were updated.
   Work: Bank Manager/// cm:os: disregard, wrong kid cm: travel agent///
   Duties: Managing the bank/// cm:os: disregard, wrong kid cm: creating travel itineraries///
   Kind: Bank/// cm:os: disregard, wrong kid cm: travel agency///
   Decision: The wrong child was selected in earlier versions of the WLS questionnaire.
   Consequently, all respondents were called back to ascertain the information for the correct child. Here coders were instructed to code the job indicated as the correct job. This was not uniformly practiced. All such cases were dumped and the job was validated as the actual job coded.

6) The child, spouse or sibling was unemployed.
   Text for a child:
   Work: still looking for an attorney job///
   Duties: lawyer///
   Kind: service///
   Decision: The screening questions to get into the job section for selected children censor this person from the occupation questions. The information from the data files was deleted; the text remained in the history files. No record was written to the final occupational data tape.

7) The coder could not find the corresponding job change record. This was remedied through a lookup of the initial job in "Allocc" (see WLSPG 89). If the corresponding job wasn't found then it was assumed to be "lost" and the record was held in a special file until all "lost" records were coded. Then the job change records were completed and added to "Allocc". Lost job change records occurred with approximately 189 records during production coding of respondents.

8) The person in question was retired.
   Work: Retired///
   Duties: ///
   Kind: ///
   Decision: LSSC should have asked what the job was before retirement. When timely, this was returned for a call back; when not timely, this was treated as a refusal and all data points were entered as such. This occurred with approximately seven occupational records.

9) No text was entered during the interview.
   Work: ///
   Duties: ///
   Kind: ///
   Decision: These cases, when encountered by the coders, were treated as refusals and coded appropriately.

10) Interviewer used vague references to previous jobs.
   Work: see above///
   Duties: see above///
   Kind: ///
   Decision: Interviewers were retrained to indicate exactly which job the respondent was referring to. Otherwise, these cases were called back for the appropriate information.
   If that failed to produce codable information, refusal codes were assigned.

11) Respondent was never contacted on a call back due to a move or refusal, or the interviewer reconstructed the interview from memory. The information as returned to us was either allocated or commented if possible. If the information was completely uncodable, or the interviewer's recollections were all WLS coders had available, the job was coded as a refusal.

12) Call back information did not answer the questions asked by the coder. Either the respondent misinterpreted the questions or the interviewer did not probe for appropriate information. For example,
   Work: RN///
   Duties: administering medication and caring for the infermed///
   Kind: Health Care///
   Question: Does this RN work at a hospital, a doctors office, a doctors clinic, a school, university, manufacturing firm or what?
   Response: R is an registered nurse.
   Decision: The text available was allocated between all possible responses.

These are actual examples of problem cases. Only problems 11) and 12) persisted through sibling production coding. Once repaired, these cases were added back into "Allocc" or "Socdone".

Appendix A

MEMOS

WLS staff generated the following memos to assist coders with the task of assigning 1970 occupation and industry codes to text.

Memo 79: Typical problems in occupational coding and general guidelines to improve interviewing and reliability.
Memo 83: Scanning for occupational call-backs
Memo 88: Managers and Foremen
Memo 111: More on coding managers
Memo 93: Secretaries, nec and lists of tasks, Truck Drivers, Multidimensional firms, and retail/wholesale trade.
Memo 96: Military Jobs, Foremen, Volunteers, Housespouses, and lower priced department stores.

MEMO 79

Purpose: Typical problems in occupation coding and general guidelines to improve reliability.
Uses of Additional Information Code-First-Listed Procedure WORK Works for an architectural firm doing word processing and some graphic design/// DUTIES word processing and graphic design/// KIND architectural firm for universities or hospitals/// MAINLY retail TYPEEMP private company or organization coder1 coder2 888-190-2 888-391-2 The correct answer is 888-391-2. The 1970 Alphabetic Index yields: INDUSTRY - 888 - Engineering and Architectural Services OCCUPATION - 190 - 'Graphic Designer' 391 - 'Typist--word processor' Code the First Activity Mentioned This case demonstrates an instance when the coder must rely on the first activity mentioned. The respondent lists tasks that are different from each other; they do not enrich the understanding of a person's main activities or duties. The task listed first is coded because it is very different from other tasks consistently listed after it and the second task seems less frequently performed, "...some graphic design." - allocation would normally be used if the respondent reports performing both tasks equally. Using the Classified Index to Eliminate Incorrect Codes WORK directing a youth program/// DUTIES recruiting teachers, providing leadership training and programming/// KIND church/// MAINLY something else TYPEEMP private company or organization coder1 coder2 877-101-2 877-090-2 The correct answer is 877-090-2. The 1970 Alphabetic Index yields: INDUSTRY - 877 - Religious Organizations OCCUPATION - 101 - 'Recreation worker', although incorrect, could be arrived at by looking up "Youth Program Director." Yet The 1970 Classified Index indicates that this code for 'youth program director' is reserved for YMCA and YWCA workers. 240 - 'School Administrator' might apply, yet The 1970 Classified Index suggests that this code is reserved for industry 857 (K) and the respondent reported working for a church not a school. 090 - 'Religious workers' is correct. The 1970 Classified Index lists 'Religious-education director.' 
This code was found under both "Director" and "Religious-education" in The 1970 Alphabetic Index. Clarify Uncertainties through other References The Dictionary of Occupational Titles describes the difference between ADMINISTRATORS and DIRECTORS. The 1970 Classified Index lists occupations and industries occurring within a given code. The Index groups by similarities, hence this resource can offer insight into the appropriateness of a questionable code. Multiple Occupational Entries WORK works for the milwaukee railroad, he's a caller he calls together the train crews/// DUTIES telephoning/// KIND railroad/// MAINLY something else TYPEEMP private company or organization coder1 coder2 407-394-2 407-333-2 The correct code is 407-394-2. The 1970 Alphabetic Index offers: INDUSTRY - 407 (D) Railroads and railway express services OCCUPATION - 333 - 'Caller' . . . . . . . . . . . . for industries C, 107-398 394 - 'Caller' . . . . . . . . . . . . for industries D, 408, 409, 419, 427 932 - 'Caller' . . . . . . . . . . . . for industries 777, 809 Identifying the Appropriate Occupational Entry Multiple occupational codes can be identified as appropriate, given industry has been coded. Since industry 407 (D), Railroad and railway express service, has been selected. The appropriate entry 394 (caller) is the matching occupation code. The other entries for 'caller' have different industries specified, therefore are incorrect. - Allocation procedures would be used if the selected industry is not specified in The 1970 Alphabetic Index. Uses of Additional Information WORK i managed the laundry/// DUTIES overseeing the operation of the laundry (ae) i hired and fired, managed the hospital laundry and housekeeping depts/// KIND hospital/// MAINLY something else TYPEEMP private company or organization coder1 coder2 838-441-2 838-950-2 The correct code is 838-245-2. Neither of the above codes is correct. 
The 1970 Alphabetic Index suggests: INDUSTRY - 838 (J) - Hospitals OCCUPATION - 441 - Under 'Laundry' we find 'Laundry-Foreman,' but no code for 'laundry manager.' If the difference between manager and foremen is unclear, consult the Dictionary of Occupational Titles. 950 - Under 'Housekeeping,' we do not find a code for 'Housekeeping-Manager.' Code 950 (Housekeepers) is incorrect. Although the respondent manages the housekeeping department, they do not perform housekeeping duties, per se. 245 - Under 'Manager' there is a code for 'Manager- launderette,' yet the wrong industry is specified [see Multiple Occupational Entries]. No other entries listed seem to fit, therefore the "any not listed above" code of 245 is assigned. Determine the Application of Additional Information This case demonstrates how the additional information adds to the coders understanding of the occupation. The tasks are complementary, not different. Allocation or Code-First-Listed procedures are not appropriate because the additional information enriches activities already mentioned. Weighing the General Trend of a List WORK Fixes vaults, and money machines at bank. Technical job. Safety deposit boxes--fixes them if they are broken/// DUTIES Maintenance of bank vaults, deposit boxes, safety deposit boxes, and money machines--mosler company/// KIND service repair business/// MAINLY something else TYPEEMP private company or organization coder1 coder2 759-492-2 758-430-2 The correct code is 759-492-2. The 1970 Alphabetic Index suggests: Industry - Various types of repair businesses are listed under 'Repairing or repair shop'--vaults, safes, bank deposit boxes, and money machines are not listed. Hence, 758 - 'Any not listed above, electric' 759 - 'Any not listed above, non-electric' Occupation - 'Repairer' and 'Repairman' are referred to 'Mechanic.' Mechanics are coded by the material that they repair, yet vaults, safes, bank deposit boxes, and money machines are not listed. 
492 - 'Safe-and-Vault Serviceman.' The 1970 Classified Index lists under code 492, 'Safe-and-Vault Serviceman' as well as 'Safe Expert.' 430 - 'Electrician' is incorrect. It is consistent with the electrical repair business route, however this is a non-electrical repair business. Eliminating Options The debate is about whether to assign an electrical or non- electrical repair business industry code. Employing the Code- First-Listed rule, we can assume this is a non-electric repair business. Although electrical, money machines are listed later. In addition, majority of the items listed are non-electrical. Industrial Specifications WORK teachers aide grade school/// DUTIES teachers aide elementary school/// KIND elementary school/// MAINLY something else TYPEEMP government coder1 coder2 857-114-1 857-942-1 The correct code is 857-382-1. The 1970 Alphabetic Index lists: INDUSTRY - 857(K) - Elementary and Secondary Schools OCCUPATION - 382 - "Teacher's Aide," for an elementary school that is not a nursery school. 942 - "Teacher's Aide," for an elementary school that is also a nursery school. 114 - "Psychology Teacher," is incorrect. This code can be found under "Child Development," however The 1970 Classification Index suggests this is not the right area. Interpreting the Industrial Specifications For a complete discussion see pages V-VI in The 1970 Alphabetic Index. Generally, occupational listings will specify industries within which a given occupation will occur. Single industries or groups of industries can be specified. In the above code, the specification reads "Nursery school K" or "Exc. Nursery school K." Since a nursery school is not reported by the respondent the correct code is 382. 
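The idea that an index entry is valid only for the industries it specifies can be sketched as a small lookup. The sketch below uses the three 'Caller' entries quoted earlier in this memo and the one-letter equivalents the manual states explicitly, 407 (D) and 857 (K). The data layout and the function names are hypothetical and are not part of any WLS program. The manual does not give the numeric equivalent of the letter "C", so this sketch simply never matches on it.

```python
# One-letter industry equivalents stated in this manual: 407 (D), 857 (K).
# Other letters (for example "C") are not resolved here because the
# manual does not give their numeric equivalents.
LETTER_EQUIVALENTS = {"D": 407, "K": 857}

# 'Caller' entries as quoted from The 1970 Alphabetic Index in this memo.
# Each entry: (occupation code, list of industry specifications).
# A specification is a single code, a (low, high) range, or a letter.
CALLER_ENTRIES = [
    (333, ["C", (107, 398)]),
    (394, ["D", 408, 409, 419, 427]),
    (932, [777, 809]),
]


def spec_matches(spec, industry):
    """Return True if one industry specification covers the industry code."""
    if isinstance(spec, tuple):
        low, high = spec
        return low <= industry <= high
    if isinstance(spec, str):
        # Unknown letters map to None and therefore never match.
        return LETTER_EQUIVALENTS.get(spec) == industry
    return spec == industry


def matching_occupations(entries, industry):
    """List the occupation codes whose specifications include the industry."""
    return [occ for occ, specs in entries
            if any(spec_matches(s, industry) for s in specs)]


if __name__ == "__main__":
    # Industry 407 (D), railroads: only entry 394 applies.
    print(matching_occupations(CALLER_ENTRIES, 407))
    # An industry that no entry specifies returns an empty list,
    # which is the situation where allocation would be considered.
    print(matching_occupations(CALLER_ENTRIES, 500))
```

Coding the industry first, as the memo instructs, is what makes this lookup possible: the industry code narrows several same-titled entries to the one that applies.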
Construction Codes WORK roofing construction commercial residential put on roofs shingles tar gravel/// DUTIES putting roofs on buildings/// KIND construction of buildings/// MAINLY something else TYPEEMP private company or organization coder1 coder2 077-534-2 069-534-2 The correct code is 067-534-2. The 1970 Alphabetic Index offers: For Industry - 067 - 'General building contractors,' the respondent reported construction of buildings as industry. 069 - 'Special trade contractors,' this is incorrect. This code is generally reserved for individuals indicating that they are self-employed or are employed by a company or individual that is very specific, such as plumbing. 077 - 'Not specified construction,' this is incorrect. The respondent reports constructing buildings as the industry. For occupation - 534 - "Roofers and Slaters," this is a craftsmen code and clearly captures the duties of this individual. 751 - "Construction Laborers," although accurate this code is reserved for general workers in construction. This respondent reports putting roofs (etc) on buildings. Code as Specific as Possible Only opt for the general industry or occupation codes when limited information is available. Search for key words or phrases [see Key Words and Phrases]. Use general building for 'home construction' or 'residential and commercial....' Use special trade if a trade is specified as the industry "I work for a plumber," or "Its a plumbing business." Use 'not specified' codes if you cannot discern what is being built or how. Sales Clerks, Cashiers, and others WORK secretarial and sales clerk/// DUTIES is suppose gathering monies and making deposit slips, keeping all the tickets for business done and entering it on ledgers/// KIND retail lumber store/// MAINLY retail TYPEEMP private company or organization coder1 coder2 607-280-2 607-305-2 The correct code is 607-280-2. The 1970 Alphabetic Index offers: 280 - "Salesmen and sales clerks." 
The index specifies this code under alphabetic listings salesmen, clerk, sales clerk, and store clerk based on the determined industry, 607.
305 - "Bookkeepers," The Dictionary of Occupational Titles outlines the differences between bookkeepers and the bookkeeping activities of salesmen. The respondent reports 'gathering monies' while outlining the bookkeeping activities of a sales clerk. Bookkeepers' duties are more archival and do not entail money collection.
372 - "Secretaries, nec," generally perform numerous office tasks. The duties this respondent reports are specific. This code does not fit as well as "Salesmen and sales clerks".
395 - "Clerk, not listed above," performs a variety of duties. Since the duties are listed, this is not appropriate.

Coding the Occupation not the Title(s)

Often a respondent will list a title that is dissimilar to the duties they perform. The key to this code is the gathering of monies, a task that secretaries and bookkeepers do not perform [see The Dictionary of Occupational Titles]. Although this respondent reports being a secretary first, the duties listed suggest sales. Lastly, this occupational code is specified for this industry in several places.

Coding the Occupation Not the Title

WORK head of the physiology department at penn state.///
DUTIES chairman of the dept. at the university. Retired but is still finishing the research projects that he had already started.///
KIND university///
MAINLY something else
TYPEEMP private company or organization

coder1        coder2
858-113-1     858-235-1

The correct code is 858-044-1.

INDUSTRY -
858 - Colleges and Universities
OCCUPATION - Three coding avenues could be investigated in The 1970 Alphabetic Index:
113 - 'Physiology Professor.' This is an assumption. Teaching is not reported. Code 113 is reserved for individuals currently teaching. This respondent reports currently doing research only.
235 - 'College Administrator.'
Although reporting department head, this person is retired and is now finishing research projects that they have started. He is clearly performing one activity. 044 - 'Physiologist,' A physiologist does research. See 'physiologist' in The Dictionary of Occupational Titles. This is the correct answer. Code the Activity Performed Often respondents list various titles they have accumulated. Some are suggestive of various occupation codes while the activities listed imply something else. This respondent may have been faculty or administration at various times but currently does not engage in these activities. This person does physiology research. Code the activities that the respondent currently performs. Helpers and Assistants WORK she's a medical assistant in a clinic/// DUTIES she works in a family practice clinic, putting patients in the room and preparing them for when the doctor comes in, taking blood pressure, etc/// KIND medical/// MAINLY something else TYPEEMP private company or organization coder1 coder2 828-075-2 828-925-2 The correct code is 828-075-2. INDUSTRY - 828 - Offices of Physicians Two codes are possible - 075 - 'Nurse' 925 - 'Nurse Assistant, Attendant and Orderly' The Dictionary of Occupational Titles lists Nurse and Nurse Assistant with different activities. In addition, The 1970 Classified Index suggests the difference as well. Generally, Nurses perform patient prep work and Nurse Assistants perform less technical duties. Who is assisting whom With assistants or helpers, figure out who the person is assisting. In this case, the respondent is assisting the doctor, a nurse activity. A Nursing aide, orderly or attendant would be assisting the nurse or performing duties that are less technical. 
Identifying Type of Industry

WORK screening computer boards///
DUTIES he works at a computer place, they make computer boards, and i don't know if he stays on the same job or not, he doesn't talk to me about his work (screener) silk screening, i don't
KIND computer industry (specific) components for computers (model) no, i don't know///
MAINLY wholesale
TYPEEMP private company or organization

coder1        coder2
189-690-2     539-690-2

The correct code is 189-690-2. The 1970 Alphabetic Index lists:

INDUSTRY -
189 - "Computers and computing equipment, mfg." Although the respondent reports 'wholesale,' this company makes computer boards, or computer components.
539 - "Machinery equipment and supplies, wholesale." Wholesaling is reported in the MAINLY field, although the reported description implies manufacturing.
OCCUPATION -
690 - 'Machine Operatives, miscellaneous, specified.' This person operates a silk screening machine, and The Alphabetic Index indicates this code for a silk screener.

Identifying the Real Industry

Manufacturing, Wholesaling, and Retailing are terms to be used to assist in assigning a code. It is not necessary to restrict code assignment to comply with the MAINLY field. Assign codes that match the respondents' qualitative description. Use the information in the MAINLY field when in doubt.

KEY WORDS OR PHRASES--OCCUPATION

Group 1:

Operatives
Operates a machine of some sort; phrasing will describe the operation of a specific device. Code by device operated. {operates a drill press, in a furniture manufacturing company}

Craftsmen
Creates products; phrasing will describe the person's activities as creating a specific item. Code by product crafted. {hand builds furniture, in a furniture manufacturing company}

Laborers
Use codes in this category when phrasing indicates that no special machine is operated and no specific product is crafted. This category also covers unskilled or less skilled tasks.
{moves furniture around, in a furniture manufacturing company} Group 2: Service Workers and Assistants Attendants, assistants, waiters, hair stylists and guards. Phrasing describes duties of assisting other workers or providing personal services. {Midwife or birthing assistant, in a hospital} Clerical Workers Perform general office, or office related functions: payroll, accounts receivable or payable, secretaries, office machine operators (different from manufacturing machine operators) cashiers and clerks. {admitting clerk, in a hospital} Technical and Professional Workers Skilled individuals performing technical tasks. Including doctors, lawyers, sociologists, writers (etc). Phrasing describes applying skills and performing tasks that seem relatively non-routine. {charge nurse, RN, or neonatal nurse, in a hospital} Group 3: Managers Managers, assistant managers, administrators, directors. - Managers of people are different from managers of budgets and school operations, and health care policy. - University Department heads are coded separately from faculty within that department. - Heads of technical departments are coded separately from the technical workers within those departments. - Self-employed, and owners are generally coded managers regardless of the tasks they perform, e.g. Self-Employed Carpenters, Owner of a Maid Service. Sales Workers Sales, marketing, telemarketing. Phrasing indicates carrying out economic transactions and/or MINOR bookkeeping. Cash-register operators are clerical workers as opposed to sales workers. KEY WORDS OR PHRASES--INDUSTRY Wholesale "...Distribute..", "...they supply...", "sell supplies...", "sell to other businesses"--identify the parties of the transaction. Use a wholesale code indicating the product being sold. Consulting and Repairs They advise and provide technical or professional services to businesses and individuals. Use a business and consulting code indicating the task or object of consultation or repair. 
Manufacturing
"...Make...", "...They make...", "...They build...cars, trains, machines, engines etc..." Use a manufacturing code indicating what is being made.

Retail
They sell the END product or service--identify the parties of the transaction. Use a retail code indicating service or product being sold.

Construction
They build buildings, they install fixtures, etc... Try to figure out what they build. Use a code describing the specificity of what is being constructed.

Health Care (Codes 828-848)
- Identify the client, how the client is being treated (residential or outpatient, specialist or general practitioner).
- Public health agencies (County Health Inspector), health insurance, and health clubs are different from health care services (hospitals, clinics).
- Use a code that describes the client and how they are being treated.

GENERAL NOTES

- Codes 997 and 999
  - When the respondent reports "Don't know," use code 997 for both industry and occupation. USE ONLY IF OTHER OPTIONS HAVE BEEN EXHAUSTED.
  - When the information simply cannot be adequately coded use code 999 for both industry and occupation. USE ONLY IF OTHER OPTIONS HAVE BEEN EXHAUSTED.
  - TAKE CARE WITH THIS, this procedure is different from the US Census and is not outlined in the Alphabetic Index or The Classified Index.
- Begin by coding the industry, THEN the occupation.
- Proceed through the books:
  1. Consult The 1970 Alphabetic Index.
  2. Consult The 1970 Classified Index.
  3. Discern differences between occupations with The Dictionary of Occupational Titles.
- Always check The 1970 Classified Index for related occupations or industries to be sure you've chosen the correct code.
- Code as specifically as possible and use 'nec' codes and 'miscellaneous' codes sparingly.
- Only when you cannot decide between two or more codes should you allocate.
- Sending cases back to the field for further information is an option, but should also be used sparingly.
If you really cannot decide a code to assign, phrase a clear and complete question for the interviewer to ask the respondent. - Work together....AND ASK QUESTIONS! WATCH FOR THIS SPECIFIC PROBLEM - The fields WORK1, DUTIES1, KIND1 should NOT be empty - /// counts as an entry. *If any field is empty - indicate this with an L in the CODEFLG. *Make a note of the IDSWL and QUES, then alert Linda Jordan to the problem. WHEN REENTERING THE RETURNS FROM THE FIELD... - When data returns from the field, after having been sent back, please update the WORK, DUTIES, and KIND fields with the new information. *Interviewer comments, unless entered into these fields, cannot be used for future coding, or rationale for the assigned codes. RE: Revised version of 31 pointers for probing occupation/industry questions ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ INDUSTRY and OCCUPATION are TWO SEPARATE CONCEPTS Think: Industry is where a person performs an Occupation. Note: Industry and Occupation are coded in different ways. NOTES TO HELP PROBE INDUSTRY Manufacturing: Find out WHAT IS MADE EXACTLY, then OUT OF WHAT IT IS MADE, what it is used for is often extraneous information, but non-the-less helpful as a third component. Examples: Goodwill/Handicapped Shops - What does the shop make? Foundries - Brass? Bronze? Steel? Iron? What? Processes used to make metal products - Casting? Stamping? Fabrication? Canned Goods - What is canned? Motors, what kind - Aircraft Motors? Electric Motors? Outboard Motors? Rocket Motors? Printing - What is printed? On what is it printed? Paper - Refined or structural? Engines, what kind - Diesel Engines? Steam Engines? Turbines? Clothing, made of - Knit Cotton? Woven Cotton? Knit Acrylic? Woven Acrylic? What? Shoes, made of - Felt? Canvas? Leather? Rubber Soled? Mining and Oil: Specify WHAT is mined, then HOW it is treated - Metal, what type? Coal, crude petroleum, natural gas. 
Oil industries are of various types, including retailing crude oil, extracting crude oil, wholesaling crude oil, oil storage, oil refining, refined oil retailing, refined oil wholesaling.

Wholesale and Retail Trades: WHAT exactly is being sold? TO WHOM is it being sold?
Examples:
Restaurant Supplies - Food Products, Cleaning Products, Plates, Glasses, Cutlery, what?
Health and Beauty Aids - Bobby Pins, Shoe Strings, Shampoo, Soap, Toothbrushes, Antacids, what?
Appliances - household, commercial, electric or non-electric?

Utilities: Electric Light and Power? Water? Electric and Gas? Gas and Steam? Telephone?

Business, Management and Repair Services:
Consulting - About What?
Management Service - what is managed?
Service Shop - What is serviced? What does the respondent mean by service?
Repair Shop - what is repaired?
Holding Company? Investment Firm?

Professional Services: DO NOT GIVE THE NAME OF THE COMPANY WITHOUT FOLLOWING WITH THE GOODS PRODUCED OR SERVICES PROVIDED BY THE PLANT/OFFICE!!!
Additional Examples:
Schools - High Schools, Junior High, Elementary, Middle School, College/University/Vocational.
Health Care - Hospital? Clinic? Visiting Nurses Association? Charitable Nursing Home, Non-Charitable Nursing Home, Charitable Non-Nursing Home, Non-Charitable Non-Nursing Home? Doctors Office? Dentists Office? Doctors clinic? or Hospital Clinic?
Welfare Services - Residential, Non-Residential?

Government: What LEVEL of Government - Local (City or County?) State? Federal? Also, what Department?

NOTES TO HELP PROBE OCCUPATION

Professional and Technical Workers: Often elusive categories follow-
Examples:
Nurses, RN's or LPN's (Registered Nurse or Licensed Practical Nurse)
By 1970 US Census codes, College Professors do research, College Teachers teach subject matter; in both cases get the subject matter.
Engineers - Chemical, Material, Process, Electrical, Mechanical, etc?

Management: Managerial tasks are confusing, PLEASE SPECIFY WHAT/WHO IS MANAGED.
Individuals, Groups, a segment of the organization, objects (e.g. Stage Props, Budgets)? Also, what level of management? Administrative, Departmental, or Crew?

Repairmen: What is repaired? Specify to the same level of detail as you would with any manufacturing industry.

Babysitters: The home of the respondent or the home of others?

Military: Try to determine whether commissioned, non-commissioned, or enlisted.

Clerical: Focus in on a main activity or duty, e.g. "...I do typing, filing, data entry, and dictation..." (but mostly they do data entry).

Craftsmen, Operatives, Laborers: (Skilled, Semi-skilled, or Unskilled Labor) Be careful when describing these; each is distinctive.
  Craftsmen: What CRAFT is the person trained in? e.g. glaziers, painters, carpenters, roofers, masons. Persons managing craftsmen are called foremen.
  Operatives: What machine does the person operate? Included are Assemblers, Cutting Machine Operatives, Lathe Operators, Drill Press Operators, etc.
  Laborers: Unskilled labor; distinguish between unskilled labor and other types of jobs.
  Examples:
    Stock Room Workers vs Store Shelf Stockers
    Animal Caretakers vs Veterinarians or Ranchers
    Construction Laborers/Helpers vs Skilled Craftsmen

Teachers: What level: High School, Middle School, Junior High, Elementary, Pre-Kindergarten? What Subject?

NOTES:
IF THE RESPONDENT INDICATES SEVERAL ACTIVITIES, PLEASE NOTE WHICH ACTIVITY RECEIVES THE MOST TIME OR ATTENTION.
IF AN OCCUPATION OR INDUSTRY ENTRY IS EXACTLY THE SAME AS ONE MENTIONED PREVIOUSLY, PLEASE EITHER INDICATE THE QUESTION NUMBER TO WHICH WE CAN REFER OR FILL IN ALL INFORMATION.
OCCUPATION AND INDUSTRY ENTRIES FOR THE SAME RESPONDENT ARE CODED SEPARATELY. THUS, IF ONE ENTRY IS NOT AS FULLY DESCRIPTIVE AS ANOTHER, IT MAY BE CODED INCORRECTLY.

MEMO 83
Scanning for occupational callbacks

In order to do callbacks in a timely manner, we have been scanning new data received from LSSC and identifying cases which will need to go back to the field.
The procedure for this is as follows:

1) Computer output from a file is examined for cases in which a response was too long to fit in the allotted PC-File field. For such cases "too long" is printed adjacent to the line in the printout, indicating both which field was too long and the answer from the history file. These screens are subsequently edited so that all relevant information will be included in the screen seen by occupational coders.

2) Comments added to the file in the scrubbing process are not included in the screen seen by occupational coders. These remarks are scanned on the printout, and in cases where the quality of the occupational code may be improved, the relevant information is added to the PC-File field. This is most often done by redistributing information between the work and duties fields, as both are used in assigning the code. (Note: This step will no longer be necessary for future files, as these comments will be incorporated by computer.)

3) The file is next scanned for cases which will require callbacks, and these are assigned a "B" in the code flag field. The specific information needed to assign a code is typed in the comment field, and these cases are printed out and sent to LSSC for immediate callbacks.

This procedure appears to be working fairly well and is increasing the speed with which many callbacks are made. It is not yet certain what proportion of cases requiring callbacks are identified in the initial scan, but for more experienced coders this is suspected to be fairly large.

MEMO 88
Managers and Foremen for occupational coding

There might be some confusion about how to code supervisors, managers, and foremen. Craftsmen and kindred workers are skilled laborers; persons who supervise craftsmen and kindred workers are foremen. They are coded by the specific craft or kindred worker they are supervising. If the specific craft cannot be discerned they are given the not elsewhere classified code of 441.
For example, the supervisor of plumbers is coded as a plumber, whereas the supervisor of workers for a general construction company is a foreman, nec.

Some coders assign the code 245 to persons in supervisory roles. Although this code (managers and administrators, nec) might seem to apply, it does not. This code is reserved for persons in administration who are not directly connected with crafting the end product or service. An example, again from the construction industry, would be the person who describes their duties as lining up the contracts (contractor) or being the president of a construction company, or who gives the impression in describing their duties of NOT being in contact with the skilled workers. The distinction is between the person in charge of getting a task done, utilizing specialized workers to do it, and the person in charge of running some portion of the company, making business or administrative decisions.

This same concept can be noticed in other industries as well. For example, the supervisor of secretaries still gets the secretary code if they describe their duties as supervising the typists, hiring and firing typists (for example), or assigning jobs to filers, typists, or other office machine operators. Compare this to the duties of the Vice-President of the same company, who might be in charge of acquiring new accounts through negotiations with other businesses. The duties are very different, and so the 245 code is really reserved for persons doing business management rather than work force management. Janitors, sextons, and cleaning supervisors are coded the same way.

Watch for people who describe their duties as managing a department; these key words tip you off to their administrative role (e.g. 245 Managers and Administrators). BUT we are still trying to capture their activities, so if for any reason worker supervisory responsibilities seem to be clearer, stick with the coding procedure outlined above.
Further, I would like to remind coders to focus on the person's activities when coding, specifically with clerical workers. There are many types of secretaries and clerical workers, and office duties are often divided up. When someone calls themselves a secretary but describes their duties as filing and typing, code them as either a filer or a typist, or allocate between the two. Other common activities with specific codes are receptionists, data entry, billing clerks, order clerks, information clerks, filers, typists, dictaphone operators, and other office machine operators.

MEMO #111
A reminder for coding "Managers"
*************************************************************
While doing validity coding for occupations that have been given a 245 by production coders, I have encountered some problems with occupations being given a 245 when they should be given a different managerial code. The problem areas are federal, state, and local officials, hospital administrators, restaurant managers, apartment managers, and school officials, all of which have managerial occupation codes specific to their industries and do not receive the 245 code. This problem can easily be resolved if coders pay more attention to the industries and look up the occupation in the book to double-check their codes.

I have also found that some inventory or stock managers, and especially sales managers, are given 245's incorrectly. Clearly this is a matter of not reading the occupation information carefully and not re-checking in the book.

We are still having trouble with the 245 code being given to occupations that should be coded as foremen, 441. The difference between managing operations and managing people needs to be re-emphasized, as well as the fact that the word "manager" does not necessitate the 245 code.

I also found some tricky areas that need clearing up, particularly the occupations of financial officers and managers.
Listed under 245 are financial director and financial officer, except L,M, 907, and 927, and financier-709. However, under 202, Bank Officers and Financial Managers, is manager, financial, with no industries listed. Many coders seem to overlook the 202 code and code financial managers as 245's.

These seem to be the major trends in the problems with the 245 code. They should mostly be cleared up with a few reminders about special managerial codes and the distinction between managing/supervising people and managing the operations or production, as well as a reminder to always look in the book before coding.

MEMO #93
RE: Secretaries and "nec" titles, truck drivers, multidimensional firms, and retail/wholesale trade

The problem with "Secretaries, nec"

The "nec" following many entries means "not elsewhere classified". Often the US Census simply assigns titles an "nec" code, especially if the specific product or service, or the set of tasks the individual performs, is unique enough that it cannot be grouped into another area.

Many times, individuals will report performing various clerical tasks that can be described as "secretarial". Since 1970, many of the "secretarial" specifics have become combined through the introduction of office computing systems and various other time and space saving devices. Further, many once-specific items such as "paper files" and "adding machines" have disappeared, and thus personnel who would "file" or use the "adding machine" simply don't exist. Consequently, we are stuck with an antiquated coding mechanism. Despite technological advances, we will try our best to represent the work of the 1990 work force using 1970 codes. Keep in mind that the 1970 code book was still in effect for roughly half of the occupations we are coding.

What to do - When the respondent lists a series of duties, and makes no mention of which might be the most frequently occurring duty, we will consider the first duty listed.
Sometimes, the last duty listed will in fact be a rough title describing the preceding duties. If this occurs, go with what is listed last - otherwise, code the first duty listed. However, if the listed duties follow a trend of some sort that has to do with a series of specific office duties (not all duties listed have to fit the list), then code according to the trend.

Although "secretaries" is the title given to us by the respondent, and there is a "secretaries, nec" code, all clerical workers are considered secretaries of some sort by the US Census. Therefore, use "Q" when assigning another code is out of the question.

MEMO 96
RE: Military Jobs, Foremen, Volunteers, Housespouses, 'Shopkos'

1. For military occupations, use the following codes, which do not appear in the 1970 Census O&I Classification:

  AFC (970) Armed Forces Commissioned Officer
  AFN (971) Armed Forces non-Commissioned Officer
  AFE (972) Armed Forces Enlisted
  AFX (973) Armed Forces no information on rank

Class of worker for all military codes will be inappropriate. Following is a list of the specific titles for each category:

  AFC Armed Forces, Commissioned Officer
    Army & Air Force: Warrant Officer, 2nd Lt., 1st Lt., Captain, Major, Lt. Colonel, Colonel, Br. General, Major General, Lt. General, General
    Navy: Ensign J.G., Ensign S.G., Lt. J.G., Lt. S.G., Vice Commander, Commander, Captain, Rear Admiral, Vice Admiral, Admiral, Fleet Admiral

  AFN Armed Forces, non-Commissioned Officer
    Army, Marines, Air Force: Sgt, Staff Sgt, Tech Sgt, Master Sgt, Chief Msr. Sgt.
    Navy: Petty officer, Petty officer 2nd class, Petty officer 1st class, Chief Petty officer

  AFE Armed Forces, Enlisted Man
    Army & Marines: Private (PFC), Corporal, Lance Corporal, "Spec's" 1-2-3-4-5-6
    Navy: Seaman apprentice, Seaman, Seaman first class
    Air Force: Airman, Airman 1st class

  AFX Armed Forces, no information on rank

All occupations in the armed forces have an industry code of L.

The following was the decision about military jobs in 1975.
In coding military jobs in 1993, we will use the same guidelines.

"-----Since the Alpha Index lists codes for civilian labor force occupations only, codes for Armed Forces occupations had to be devised. They were: AFC (Armed Forces, commissioned officer), AFN (Armed Forces, non-commissioned officer), AFE (Armed Forces, enlisted man) and AFX (Armed Forces, no information on rank), see Appendix F of Social Factors in Aspirations and Achievement: Occupation and Industry Coding Handbook 1974-1975. If a respondent specified a civilian job in addition to his rank in the Armed Forces, the Armed Forces code was assigned. All Armed Forces codes were preceded by an industry code of L (Federal Public Administration)." (From Social Factors in Aspiration and Achievement: Occupation and Industry Coding Handbook 1974-1975.)

2. Foremen
When the respondent describes a role in which the person is in direct supervision of some type of craft worker, they are considered a foreman. When looking up both foremen and supervisor, we see:

  "...Foremen, specified occupation--If craft(codes R.S.401-575), code by craft--for 'Foreman carpenter', see Carpenter. --If not craft, code 441."

Under 'supervisor' we find:

  "...Supervisor, if craft trades (codes R.S,401-575)-- see Foreman"

Code 441 is for Foremen and supervisors, not elsewhere classified. This indicates that the person is a foreman, but their job cannot be classified in any other way. Sometimes this is the preferred code and will be specified as we search the occupation index.

Other types of supervisory roles are often coded similarly to those that the person is supervising. Some such codes include food service supervisors, supervisor of typists, Police Chief and others. Keep your eyes open for other similar cases.
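The Foreman index entry quoted above amounts to a small decision rule. The Python sketch below is an illustration only and is not one of the WLS programs: the function name foreman_code and the sample code 450 are hypothetical, and the sketch ignores the "R.S." qualifier in the index entry as well as the other supervisory codes mentioned in this memo. Only the craft range 401-575 and the code 441 are taken from the memo.

```python
# Hypothetical helper; only the 401-575 craft range and code 441 come from the memo.
CRAFT_CODES = range(401, 576)  # craft occupation codes 401-575 inclusive
FOREMAN_NEC = 441              # foremen and supervisors, not elsewhere classified


def foreman_code(supervised_code=None):
    """Return the occupation code for someone who directly supervises workers.

    supervised_code is the three-digit code of the workers being supervised,
    or None when the specific craft cannot be discerned.
    """
    if supervised_code is not None and supervised_code in CRAFT_CODES:
        return supervised_code  # code the foreman by the craft supervised
    return FOREMAN_NEC          # craft unknown or not a craft: foremen, nec


print(foreman_code(450))   # 450 (made-up craft code, so coded by craft)
print(foreman_code(None))  # 441
```

A coder would still consult the occupation index for each case; the sketch only shows the order in which the two questions (is it a craft, and which one) are asked.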
Managers, unlike foremen or supervisors, may not directly supervise production workers, but typically have the authority and power to hire and fire, to plan and assign work to be accomplished (and supervised directly) by others, and are usually responsible for oversight of more than one kind of productive activity. Managerial jobs are usually in the 200 series of occupation codes.

3. Volunteers
When a person describes a volunteer role, code as per usual -- by activities and duties. If there is not enough information to code, send it back to the field for clarification, just like any other incomplete report for which you can pose additional questions.

4. Housework
When a person describes a housewife or househusband type role, assign industry code HHH and occupation code HHH. These codes will not be found in the 1970 Occupation-Industry Classification System. When assigned, complete the CODEFLG with D. In this way we can process these codes separately later. (These roles are not occupations in the paid labor force, and interviewers discourage respondents from reporting them as "jobs." However, some respondents do report them, and these additional codes will make such cases easier to identify.) The numeric codes created are industry (768), occupation (985) and class of worker (inappropriate).

5. Shopko, Walmart, Farm & Fleet, etc.
Retail stores with departments are department stores; stores without departments are coded by the product they sell. Thus, Shopko, Walmart, K-Mart, Farm & Fleet, etc., are department stores, just like Sears, Penneys, Younkers, Marshall Field, and the like.

6. Over the road truck drivers vs. delivery truck drivers
Over the road truck drivers start with a full truck and take the load from point A to point B. At point B they drop off their load and pick up another load, which they then bring back to point A. The point here is that over the road truck drivers seldom return from point B with an empty truck.
Delivery truck drivers start with a full truck and cover a certain area, making many stops and emptying their trucks little by little, until they reach the last stop on their route and their truck is empty. They then return to their starting point with an empty truck.

7. 3M, RJ Reynolds, Kohler, .......
You must have specific information when coding large multi-faceted manufacturing firms. A description of the industry as simply "manufacturing" is not enough, since all such corporations manufacture many products. If the information given is not specific enough for you to assign a code, and provided there are no circumstances which would prevent you from sending it back, we suggest you do the following: Send the case back to the field and find out what specifically is being manufactured at the plant where the person actually works. Keep in mind that the person may not work at a mfg. plant but in a corporate office or research lab owned by the company.

8. Wholesale trade vs. retail trade
Wholesale trade is whenever a product is being transferred from one business to another business, regardless of whether the second business resells the product. The product could be used by the second business as a raw material, for example. So there are two parts to wholesale: 1) the company that acts as a middle man, buying wholesale and selling wholesale; 2) key resources and supplies that are sold wholesale. Retail trade is simply the transaction between an individual, non-corporate entity (you or me) and some business. The important thing is to know who is engaging in the transaction.

9. Unpaid workers
If someone owns a business but reports that he/she is not paid, do not assume that this occupation is uncodable. If the person lists duties and work such that you are able to assign a code, then this job is codable. The same holds true for agricultural workers. If they work on the family farm without pay, but still list duties that are codable, then please code.
There is even a specific code for "farm laborers, unpaid family workers". The "moral of the story" is to exhaust all possible sources before giving up on any occupation.

Appendix B
CODING SCREEN FORMAT

This is the layout of the PC-File data entry screen. This is what coders look at when assigning codes during production coding.

IDSWL(1)  QUES(2)
  Work:[ (3)                                   ]
       [                                       ]
       [                                       ]
Duties:[ (4)                                   ]
       [                                       ]
       [                                       ]
       [                                       ]
  Kind:[ (5)                                   ]
       [                                       ]
       [                                       ]
Mainly:[ (6) ]  Type:[ (7) ]  Pay?[ (8) ]
Company Name:[ (9)                             ]
INDCODE (10)  OCCCODE (11)  CLSWRK (12)  CODEFLG (13)  CODER (14)
Comment:[ (15)                                 ]
        [                                      ]
        [                                      ]
        [                                      ]
        [                                      ]
        [                                      ]
        [                                      ]
        [                                      ]

The parts of the screen are:
 1) Respondent's IDSWL.
 2) QUES - indicating what series of questions this is.
 3) Response to "What type of work do you do?"
 4) Response to "What are your chief activities or duties?"
 5) Response to "What kind of business or industry is this?"
 6) Response to "Is this mainly manufacturing, retailing, wholesaling or something else?"
 7) Response to "Were you employed by government, a private company or organization....."
 8) Response to "Were you working for pay?"
 9) The name of the company for whom the respondent worked.
10) The coder-assigned industry code.
11) The coder-assigned occupational code.
12) Class of worker code.
13) Code flag.
14) Coder's initials.
15) Space for coders to put questions for callbacks, possible codes for allocation, or other comments.

Appendix C
COMPUTER REQUESTS

To conserve space, this section does not include the specific programs that are referenced in the overall document. However, complete descriptions of the programs are included. Please consult Wisconsin Longitudinal Study staff for copies of the specific programs requested in this document.
*****************************************************************
COMPUTER OPERATIONS REQUEST 447

The following programs were run to combine the data received from LSSC so far into one file which has all questions for occupation coding:

ALLQUES.FOR   Reads the OTHSP.DAT file, which has all open-ended
WLSPG048      questions, and writes a record for each set of variables
              which need to be coded for occupation. Each record has
              the IDSWL and QUES plus the response for WORK, DUTIES
              and KIND.

ALLEMP.SAS    Reads the REFCUM.DAT file and writes a record for each
WLSPG049      set of variables which would need to be coded for
              occupation. Each record has IDSWL and QUES plus MAINLY,
              TYPE, INCORP, PAY, COMPNAM, FIRSTNM, LASTNM, and NAMEQ.

ALLMER.SAS    Merges datasets from the previous two programs by IDSWL
WLSPG050      and QUES, and writes an ASCII file which can be input
              to PC-File.

CHKMER.SAS    Writes a list of all records that are either in the
WLSPG051      OTHSP.DAT file but not REFCUM.DAT or in REFCUM.DAT but
              not in OTHSP.DAT.

COMPUTER OPERATIONS REQUEST 497

This document is divided into five parts. Part "A" is a listing of the programs used to update the master occupational data file. Part "B" is a listing of the programs used to locate lost records. Part "C" lists the programs used in the initial stages of the clean-up operation. Part "D" lists miscellaneous "tool" programs and describes why or when they may be used. Finally, Part "E" describes and lists programs used in assessing coder reliability.

***************************************************************
PART "A"

Following is a list of programs used for updating the occupational coding file. After the master occupational data base was updated, a version was copied to the VMS side of the SSC cluster. All SAS programs are UNIX based; however, the DCL programs for backing up the data are for use on VMS.

1) WLS_TOGETHER (wlsprg077) (SAS)
This program concatenates several files together into one file known as "update". The unified file is further processed.
2) WLS_ALPH2NUM (wlsprg078) (SAS)
This program converts all alphabetic occupation and industry codes into valid numeric occupation and industry codes in the "update" file.

3) WLS_CHKCODES (wlsprg079) (SAS)
This program checks the "update" file and verifies that the industry and occupational codes are valid. Lists of records with invalid codes are written to a file for review.

4) WLS_CHKDUPS (wlsprg080) (SAS)
This program checks the "update" file for duplicate records and removes those that are duplicates. It also checks for blank records.

5) WLS_UPDATED (wlsprg081) (SAS)
This program updates the grand occupational data file with the "update" file. The program writes a new occupational data file.

6) BAK192, BAK193, BAK194 (wlsprg082) (DCL)
These three programs are identical, except that each copies the grand occupational data file to a different 3480 tape.

*****************************************************************
Part "B"

The programs described in this section are of two types. First, I describe a series of programs that was used during July of 1993 to initially locate missing records. This series takes into account records that were in the coding process at the time; because of them, a straight comparison of the occupational file and the data file would not have yielded the correct missing records. The second set of programs was used to identify records missing from the occupational file by simply comparing this file to the raw data records.

* SET 1 *

1) GETCUR0714 (wlspg046) (SAS)
Collects and concatenates data currently being coded on the Industry and Occupation Coding Machines.

2) GETSYS (wlspg047) (SAS)
Collects and concatenates data waiting to be processed prior to addition to the Industry and Occupation Coding Machines.

3) ALLQUES (wlspg048) (FOR)
Reads the OTHRSP.DAT file, which has all open-ended questions, and writes a record for each set of variables which needs to be coded for occupation.
Each record has the IDSWL and QUES plus the response for WORK, DUTIES and KIND. ALLQUES.COM has the names of files and can be submitted to run this program.

4) ALLEMP (wlspg049) (SAS)
Reads the REFCUM.DAT file and writes a record for each set of variables which would need to be coded for occupation. Each record has IDSWL and QUES plus MAINLY, TYPE, INCORP, PAY, COMPNAM, FIRSTNM, LASTNM, and NAMEQ.

5) ALLMER (wlspg050) (SAS)
Merges datasets from the previous two programs by IDSWL and QUES, and writes an ASCII file which can be input to PC-File.

6) CHKMER (wlspg051) (SAS)
Writes a list of all records that are either in the OTHSP.DAT file but not REFCUM.DAT or in REFCUM.DAT but not in OTHSP.DAT.

7) LOST (wlspg052) (SAS)
Merges the results of GETCUR0714, GETSYS, and the currently stored data that has been coded. LOST then compares this file to the results of ALLMER and writes a file containing all records not occurring in the master data set.

8) INTDAT (wlspg053) (SAS)
Sorts the results of LOST, eliminating duplicate records. Writes a program "LOCATION" with the results of the sort.

9) LOCATION (wlspg054) (SAS)
Searches all "mer" files for the dates of interview and location of each missing record. Writes a file containing the date we received each case from LSSC.

10) TOAD (wlspg055) (SAS)
Merges the identifiers from the lost industry and occupation records with the remainder of the backup data we have for each. Then it writes the complete record to an ASCII file for a special coder to assign codes.

11) WH_DEL (wlspg056) (SAS)
Reads the records from LOST that are to be deleted and divides them into three temporary working files, each representing one of the three locations where a record could currently be located. Writes programs "CUR_DEL" and "SYS_DEL".

12) CUR_DEL (wlspg057) (DCL)
Searches all 'mer' files for the dates of interview and location of each record to be deleted. Writes a file containing the date we received each case from LSSC.
13) SYS_DEL (wlspg058) (DCL)
Searches all 'mer' files for the dates of interview and location of each record to be deleted. Writes a file containing the date we received each case from LSSC.

* SET 2 *

NOTE: The above programs 3, 4, 5, and 6 were re-run after the LSSC had delivered all data for respondents and 1975 non-respondents. The remaining programs identified records that were lost.

1) GETREC (wlsprg083) (SAS)
Compares the master occupational coding file (known as Merques.dat) with the master occupational file that is completely coded (known as Allocc.done). All records that appear in Merques.dat but not in Allocc.done are considered lost and were written by this program to a file for recoding.

2) DESCRIPTIVES (wlsprg084) (SAS)
Records that remained uncoded from the July 1993 attempt to recode missing records were also discovered to be missing after the production coding was complete. This program determines which records were lost in the summer and which records were "recently" lost. The records lost in the summer were then deleted from the "recently" lost record file, so as not to create duplicate records and to simplify file management.

*****************************************************************
Part "C"

The following programs were initially used in identifying records for validity checking and other forms of minor cleaning prior to analysis.

1) GETVALID (wlsprg068, also known as GET245) (SAS)
This program selects all persons with a specified occupation or industry code from the accumulated occupational coding file (known as Allocc). In addition, if the selected record is a job change, this program selects the matching former job for the output file. The records in the output file are sorted by CLSWRK (ascending), INDCODE (ascending), and finally WORK1 (ascending, without spaces or vowels).

2) LAWYER (wlsgrg085) (SAS)
This program dumps records coded lawyer from the accumulated occupational coding file. The 1970 U.S.
Census Occupation codes do not capture the tasks of a "Para-Legal". Initially coders gave a "Para-Legal" a lawyer code; however, consultation with the 1990 U.S. Census Occupational codes showed that such jobs would translate into a completely different occupational code.

3) CHECK (wlsprg087) (SAS)
This program checks whether the accumulated occupational coding file (Allocc) contains bad CODEFLG or bad QUES values.

*****************************************************************
Part "D"

This section describes several "tool" programs that could be used for any end project.

1) FIXIT (wlsprg088) (SAS)
Many "fixit" programs of this genre were written. This program removes records specified by the unique identifiers IDSWL and QUES from the accumulated occupational coding file (Allocc). These records were later replaced with updated data from LSSC.

2) LOOKUP (wlsprg089) (AWK)
This Awk program reads the accumulated occupational coding file (Allocc) and writes a file containing sorted IDSWL, QUES, OCCCODE and INDCODE. This output file was referenced for job change lookups during the coding process.

3) IDSWL (wlsprg090) (AWK)
This Awk program prints all the IDSWLs from the accumulated occupational coding file (Allocc).

4) CHANGEIT (wlsprg091) (AWK)
This Awk program systematically identifies and changes codes for easily identifiable records.

5) DBF2ASC (wlsprg092) (SAS for PCs)
This program changes any occupational coding file from dBASE III form into ASCII form. This is faster and less prone to error than the export utility in PC-File.

6) ASC2DBF (wlsprg093) (SAS for PCs)
This program changes any occupational coding file from ASCII form into dBASE III form. This is faster and less prone to error than the import utility in PC-File.

7) SIB (wlsprg094) (SAS)
This program identifies records that contained illegal QUES values for siblings and writes their unique identifiers, IDSWL and QUES, to an output file.
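As an illustration of what tools such as FIXIT and LOOKUP do, the following Python sketch removes records by their IDSWL and QUES identifiers and then writes a sorted lookup listing. It is not the original code (those programs were written in SAS and Awk and are not reproduced in this document). The record layout as a list of dictionaries, the function names, and every sample value, including the QUES labels, are assumptions made for the example.

```python
# Illustrative only: records are plain dicts, and every value below is made up.

def fixit(records, bad_keys):
    """FIXIT-style step: drop records whose (IDSWL, QUES) pair is in bad_keys."""
    bad = set(bad_keys)
    return [r for r in records if (r["IDSWL"], r["QUES"]) not in bad]


def lookup(records):
    """LOOKUP-style step: sorted (IDSWL, QUES, OCCCODE, INDCODE) tuples."""
    return sorted(
        (r["IDSWL"], r["QUES"], r["OCCCODE"], r["INDCODE"]) for r in records
    )


allocc = [
    {"IDSWL": "900002", "QUES": "Q1", "OCCCODE": "441", "INDCODE": "100"},
    {"IDSWL": "900001", "QUES": "Q2", "OCCCODE": "245", "INDCODE": "200"},
    {"IDSWL": "900001", "QUES": "Q1", "OCCCODE": "202", "INDCODE": "300"},
]

# Remove one record that is to be replaced later, then list what remains.
cleaned = fixit(allocc, [("900001", "Q2")])
for row in lookup(cleaned):
    print(*row)
```

Keying every operation on the pair (IDSWL, QUES) mirrors the way the programs above identify a record, since one respondent can have several coded job descriptions.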
*************************************************************
Part "E"

Programs in this section were used to assess reliability early in the coding process, and in training new coders.

1) FIVE_PERCENT (wlsprg095) (SAS)
This program generates a randomized 5% sample of codes for reliability coding.

2) GETCHK (wlsprg096) (SAS)
This program gets the latest coding done by established reliability coders. It recodes the alphabetic codes to numeric codes and adds SEI scores.

3) GETPROD (wlsprg097) (SAS)
This program gets the latest production coding. It recodes letters to numbers and adds SEI codes.

4) COMBCODES (wlsprg098) (SAS)
This program merges the production and check coding by IDSWL and QUES. Furthermore, this program creates the variables INDSAME, OCCSAME, IND1, OCC1, and SEIS. INDSAME and OCCSAME indicate whether the three digit industry and occupation codes assigned by production and check coders match. IND1 and OCC1 indicate whether the one digit industry and occupation codes match. SEIS indicates whether the determined SEI scores match. Finally, this program prints a frequency distribution of each variable as an indicator of reliability.

5) ECCHECK and ECPROD (wlsprg099 and wlsprg100) (SAS)
ECCHECK compares each production coder's code assignments to all check coders' code assignments. ECPROD compares each check coder's assignments to all production coders' assignments.

6) OUTPUT (wlsprg101) (SAS)
Managers and clerical workers were hypothesized to be problem areas for coders. This program outputs manager and clerical codes that were not reliably coded, for analysis of the text that led to poorly coded records.

7) FREQ (wlspg060) (SAS)
This program tests the hypothesis that major group determines how reliably the occupational data are coded overall. Specific major occupational and industry codes were analyzed for their effect on overall reliability. I found a significant association between major group and reliability.
"Managers and Administrators" were the least reliably coded occupations. "Business Organizations" were the least codable for industry. 8) BAD_CASES (wlsprg102) (SAS) This program dumps records that seemed typical of the sorts of errors coders were making. The records were used for retraining coders. Occupation and Industry codes were deleted from this file. 9) ANSWERS (wlsprg103) (SAS) This program prints the IDSWL, ques, occcode and indcode for each record written in BAD_CASES. 10) RAN_SETS (wlsprg086) (SAS) This program generates small sets of records so that new occupational coders may practice coding. COMPUTER OPERATIONS REQUEST 466 RE: Search for coding errors by major occupational group comparison ***************************************************************** WLSPG060.LIS was used to diagnose an occupational coding error. A set of occupational codes prepared by production coders was compared to the same records as coded by non-production, independent coders. A variable was constructed to capture the nature of the disagreement between the three digit industry or three digit occupation code. Matches for industry codes and matches for occupation codes were viewed separately. The variable was coded "1" for matches on all three digits, "2" for disagreement on three digits but matches on major groups, and "3" for disagreement on three digits and on major group. By taking the cross tabulation of this variable and major group, we obtained a listing of major groups with higher proportions of mismatches between production and independent coders. Codes were further sorted by production code, and then reliability code, as standards. Comparisons of cross-tabs generated after sorting supported the proposition that major group misidentification by production coders contributes significantly less than adequate reliability. We were then able to focus training efforts on those major groups that were more likely to contain major group mismatches. 
The most frequently missed scenario was the craft foreman who was assumed to be in administrative management.

Following is an example of the output from this program that led to a more focused training about coding occupations. This cross-tab is the analysis of coding after focused training efforts.

TABLE OF OCS BY OCP

OCS(Occ Same)     OCP(Check Codes)

Frequency  |
Percent    |
Row Pct    |
Col Pct    |Professi|Managers|Sales   |Clerical|Craftsme|
           |onal    |        |        |        |n       |  Total
-----------+--------+--------+--------+--------+--------+
Codes Same |     14 |     13 |     11 |      9 |      6 |     72
           |  15.56 |  14.44 |  12.22 |  10.00 |   6.67 |  80.00
           |  19.44 |  18.06 |  15.28 |  12.50 |   8.33 |
           |  82.35 |  72.22 |  91.67 |  64.29 |  85.71 |
-----------+--------+--------+--------+--------+--------+
Diff, Same |      2 |      1 |      0 |      4 |      0 |      8
           |   2.22 |   1.11 |   0.00 |   4.44 |   0.00 |   8.89
           |  25.00 |  12.50 |   0.00 |  50.00 |   0.00 |
           |  11.76 |   5.56 |   0.00 |  28.57 |   0.00 |
-----------+--------+--------+--------+--------+--------+
Diff, Diff |      1 |      4 |      1 |      1 |      1 |     10
           |   1.11 |   4.44 |   1.11 |   1.11 |   1.11 |  11.11
           |  10.00 |  40.00 |  10.00 |  10.00 |  10.00 |
           |   5.88 |  22.22 |   8.33 |   7.14 |  14.29 |
-----------+--------+--------+--------+--------+--------+
Total            17       18       12       14        7       90
              18.89    20.00    13.33    15.56     7.78   100.00
(Continued)

TABLE OF OCS BY OCP

OCS(Occ Same)     OCP(Check Codes)

Frequency  |
Percent    |
Row Pct    |
Col Pct    |Operativ|Transpor|Farmers |Service |
           |es      |t       |        |Workers |  Total
-----------+--------+--------+--------+--------+
Codes Same |      6 |      1 |      2 |     10 |     72
           |   6.67 |   1.11 |   2.22 |  11.11 |  80.00
           |   8.33 |   1.39 |   2.78 |  13.89 |
           |  75.00 | 100.00 | 100.00 |  90.91 |
-----------+--------+--------+--------+--------+
Diff, Same |      0 |      0 |      0 |      1 |      8
           |   0.00 |   0.00 |   0.00 |   1.11 |   8.89
           |   0.00 |   0.00 |   0.00 |  12.50 |
           |   0.00 |   0.00 |   0.00 |   9.09 |
-----------+--------+--------+--------+--------+
Diff, Diff |      2 |      0 |      0 |      0 |     10
           |   2.22 |   0.00 |   0.00 |   0.00 |  11.11
           |  20.00 |   0.00 |   0.00 |   0.00 |
           |  25.00 |   0.00 |   0.00 |   0.00 |
-----------+--------+--------+--------+--------+
Total             8        1        2       11       90
               8.89     1.11     2.22    12.22   100.00


The SAS System                                        4
                                                      15:59 Monday,

TABLE OF OCS BY OCR

OCS(Occ Same)     OCR(Prod Coders)

Frequency  |
Percent    |
Row Pct    |
Col Pct    |Professi|Managers|Sales   |Clerical|Craftsme|
           |onal    |        |        |        |n       |  Total
-----------+--------+--------+--------+--------+--------+
Codes Same |     14 |     13 |     11 |      9 |      6 |     72
           |  15.73 |  14.61 |  12.36 |  10.11 |   6.74 |  80.90
           |  19.44 |  18.06 |  15.28 |  12.50 |   8.33 |
           |  73.68 |  86.67 | 100.00 |  56.25 |  85.71 |
-----------+--------+--------+--------+--------+--------+
Diff, Same |      2 |      1 |      0 |      4 |      0 |      8
           |   2.25 |   1.12 |   0.00 |   4.49 |   0.00 |   8.99
           |  25.00 |  12.50 |   0.00 |  50.00 |   0.00 |
           |  10.53 |   6.67 |   0.00 |  25.00 |   0.00 |
-----------+--------+--------+--------+--------+--------+
Diff, Diff |      3 |      1 |      0 |      3 |      1 |      9
           |   3.37 |   1.12 |   0.00 |   3.37 |   1.12 |  10.11
           |  33.33 |  11.11 |   0.00 |  33.33 |  11.11 |
           |  15.79 |   6.67 |   0.00 |  18.75 |  14.29 |
-----------+--------+--------+--------+--------+--------+
Total            19       15       11       16        7       89
              21.35    16.85    12.36    17.98     7.87   100.00
(Continued)


The SAS System                                        5
                                                      15:59 Monday, October 25, 1993

TABLE OF OCS BY OCR

OCS(Occ Same)     OCR(Prod Coders)

Frequency  |
Percent    |
Row Pct    |
Col Pct    |Operativ|Transpor|Laborers|Farmers |Service |
           |es      |t       |        |        |Workers |  Total
-----------+--------+--------+--------+--------+--------+
Codes Same |      6 |      1 |      0 |      2 |     10 |     72
           |   6.74 |   1.12 |   0.00 |   2.25 |  11.24 |  80.90
           |   8.33 |   1.39 |   0.00 |   2.78 |  13.89 |
           | 100.00 | 100.00 |   0.00 | 100.00 |  90.91 |
-----------+--------+--------+--------+--------+--------+
Diff, Same |      0 |      0 |      0 |      0 |      1 |      8
           |   0.00 |   0.00 |   0.00 |   0.00 |   1.12 |   8.99
           |   0.00 |   0.00 |   0.00 |   0.00 |  12.50 |
           |   0.00 |   0.00 |   0.00 |   0.00 |   9.09 |
-----------+--------+--------+--------+--------+--------+
Diff, Diff |      0 |      0 |      1 |      0 |      0 |      9
           |   0.00 |   0.00 |   1.12 |   0.00 |   0.00 |  10.11
           |   0.00 |   0.00 |  11.11 |   0.00 |   0.00 |
           |   0.00 |   0.00 | 100.00 |   0.00 |   0.00 |
-----------+--------+--------+--------+--------+--------+
Total             6        1        1        2       11       89
               6.74     1.12     1.12     2.25    12.36   100.00

Frequency Missing = 1
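
The four numbers printed in each cell of these cross-tabs (Frequency, Percent, Row Pct, Col Pct) can be recomputed from the frequencies alone. The following is a minimal Python sketch of that arithmetic, using the frequencies of the OCS by OCP table above. It is an illustration only, not part of the original SAS programs, and the names `FREQ`, `ROWS`, `COLS`, and `cell_stats` are hypothetical.

```python
# Frequencies from the OCS by OCP table (check codes as the standard).
ROWS = ["Codes Same", "Diff, Same", "Diff, Diff"]
COLS = ["Professional", "Managers", "Sales", "Clerical", "Craftsmen",
        "Operatives", "Transport", "Farmers", "Service Workers"]
FREQ = [
    [14, 13, 11, 9, 6, 6, 1, 2, 10],   # Codes Same
    [2, 1, 0, 4, 0, 0, 0, 0, 1],       # Diff, Same
    [1, 4, 1, 1, 1, 2, 0, 0, 0],       # Diff, Diff
]


def cell_stats(freq, i, j):
    """Return (percent, row_pct, col_pct) for cell (i, j), rounded to 2 places.

    Assumes the row and column containing the cell have nonzero totals.
    """
    total = sum(sum(row) for row in freq)
    row_total = sum(freq[i])
    col_total = sum(row[j] for row in freq)
    n = freq[i][j]
    return (
        round(100 * n / total, 2),
        round(100 * n / row_total, 2),
        round(100 * n / col_total, 2),
    )


# Codes Same x Professional: 14 of 90 records, 14 of 72 in the row,
# 14 of 17 in the column.
print(cell_stats(FREQ, 0, 0))   # (15.56, 19.44, 82.35)

# Diff, Diff x Managers: the cell behind the "craft foreman" problem.
print(cell_stats(FREQ, 2, 1))   # (4.44, 40.0, 22.22)

# Overall three-digit agreement: share of records in the "Codes Same" row.
overall = round(100 * sum(FREQ[0]) / sum(sum(row) for row in FREQ), 2)
print(overall)                  # 80.0
```

The column percentage is the figure of interest when comparing major groups: it gives the share of records within a major group that fall into each agreement level.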