Author: James Geffert
Title: Transformation of Multi-dimensional Arrays to One-dimensional Arrays
Date: August 24, 1965
Paper: WAIS paper 656-018
Keywords: Data Processing; General Papers (Regarding WAIS)

James Geffert
WAIS Paper 656-018
August 24, 1965

Transformation of Multidimensional Arrays to One-Dimensional Arrays*

The problem of efficient computer treatment of multidimensional arrays of data is essentially a problem of storing and referencing elements of such arrays in an unambiguous fashion without wasting storage space. In general, the problem is one of transforming the subscripts of elements in multidimensional arrays into single subscripts of a one-dimensional array such that each element of the multidimensional array has a unique subscript in the one-dimensional array.

Let p be a k-dimensional array in which the typical element is $p(s_1, s_2, \dots, s_k)$, where

$s_1 = 1, 2, \dots, n_1$
$s_2 = 1, 2, \dots, n_2$
...
$s_k = 1, 2, \dots, n_k$

Let a be a one-dimensional array in which the typical element is $a(i)$. Elements of p can be converted to elements of a by the following method:

$$a(i) = p(s_1, s_2, \dots, s_k)$$

where

$$i = (s_1 - 1)(n_2 n_3 \cdots n_k) + (s_2 - 1)(n_3 n_4 \cdots n_k) + \dots + (s_{k-1} - 1)(n_k) + s_k$$

* A less general method of transformation can be found in Loniello, WAIS 656-017, August 24, 1965.

http://www.ssc.wisc.edu/wais/WAIS656018.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656018.txt

Author: James Geffert
Title: Format of Kahn Records
Date: January 4, 1966
Paper: WAIS paper 656-036
Keywords: Kahn Output

James Geffert
WAIS 656-036
January 4, 1965

FORMAT OF KAHN RECORDS

Card number  Position  Label (400 char. M.F.)  Item
ALL          1-8       B9                      WAIS ID #
ALL          9-10      B11                     Year of Return
ALL          11                                Card number
1            12-14     B14                     Res

Author: James Geffert
Title: Kahn Tape File Description
Date: January 4, 1966
Paper: WAIS paper 656-035
Keywords: Kahn Output

James Geffert
WAIS 656-035
January 4, 1965

Kahn Tape File Description

I.
Physical tapes in extract file:

SSRI 297  Reel 1 of 11
SSRI 291  Reel 2 of 11
SSRI 168  Reel 3 of 11
SSRI 341  Reel 4 of 11
SSRI 211  Reel 5 of 11
SSRI 142  Reel 6 of 11
SSRI 202  Reel 7 of 11
SSRI 162  Reel 8 of 11
SSRI 203  Reel 9 of 11
SSRI 128  Reel 10 of 11
SSRI 359  Reel 11 of 11

II. Record Counts

Each of the first 10 reels contains 28,080 80-character records followed by a dummy record containing 8's, followed by a tape mark. The eleventh and last reel contains 10,680 80-character records followed by a dummy record containing 9's. The dummy 9's record is not followed by a tape mark.

III. Recording Information

1. Tape density: 556 characters per inch
2. Record length: 80 characters
3. Blocking: 1 record per block
4. Inter-record gap: .75 inch
5. Record mode: Binary-coded decimal, even parity, move mode

IV. Record Order

Primary sorting field: 1-6
First secondary field: 9-10
Second secondary field: 7

V. Fortran Considerations

1. Decimal places: In all fields containing items of income, expense or deduction, the decimal point is assumed to be two places to the left of the low-order position of the field. It is suggested that a FORTRAN format specification of F9.2 be used to read these fields.
2. Signs: Negative quantities are indicated by a minus sign (-) in the high-order position of the field. Positive quantities are unsigned.

http://www.ssc.wisc.edu/wais/WAIS656035.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656035.txt

Author: Ron Durant
Title: Operating Instructions for the Updating of Individual (9-Digit) Fields in the 400 Character Post-Consistency Master File
Date: August 27, 1965
Paper: WAIS paper 656-013
Keywords: Data Processing; Maintenance System - Files, Data, Etc.; Programs

Ron Durant
WAIS Paper 656-013
August 27, 1965

Operating Instructions for the Updating of Individual (9-Digit) Fields in the 400 Character Post-Consistency Master File

OUTLINE:
I. E.A.M. Sort of Updating Data
II. Card-to-Tape Updating Data
III. TAX-08
Appendix A: Systems Flowchart

I. E.A.M.
Sort of Updating Data.

1. Sort updating data on columns 12 thru 1.

II. Card-to-Tape with Updating Data.

1. Load SSRI C/T (Blocked 50) program in Card Reader. [Located in SSRI (DURANT) Drawer # 1 - 1410 Room].
2. Load Updating data behind program deck.
3. Mount Scratch Tape on Unit 1.
4. Perform Standard 1410 Processor-108 Initialization Routine on Console.
5. C/T Output will be on Unit 1 at end of job.

III. Update of Individual (9-Digit) Fields in Post-Consistency Master File: (TAX-08)

1. Load TAX-08 Object Deck in Card Reader. [Program Deck is located in SSRI (DURANT) Drawer # 2 - 1410 Room]. (Standard Processor-108 Initialization).
2. Mount Sorted PRE-TAX-08 Output on Unit 1.
   Mount 1st Reel of WAIS Master I/P on Unit 3.
   Mount Scratch Tape on Unit 2.
   Bring Printer to Ready.
3. During the course of the program, console messages will notify the operator as to the mounting of additional input and output reels of tape.
4. TAX-08 Output is as follows:
   (a) Updated Master File on Unit 2.
   (b) Edit Listing on the Printer.

Appendix A: Systems Flowchart for the Updating of Individual (9-Digit) Fields in the 400 Character Post-Consistency Master File*

Step I: EAM Sort of Updating Data
Step II: Card to Tape with Sorted Updating Data (produces Sorted Updated Data)
Step III: TAX-08 (reads Sorted Updated Data and Input Master File; produces Edit Listing and Updated Master File)

*Logic of Flowchart for TAX-08 is the same as TAX-07.

http://www.ssc.wisc.edu/wais/WAIS656013.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656013.txt

Author: Ron Durant
Title: Operating Instructions for the Updating of Individual Fields in the 310 Character Pre-Consistency Master File
Date: August 27, 1965
Paper: WAIS paper 656-014
Keywords: Data Processing; Maintenance System - Files, Data, Etc.; Programs

Ron Durant
WAIS Paper 656-014
August 27, 1965

Operating Instructions for the Updating of Individual Fields in the 310 Character Pre-Consistency Master File

OUTLINE:
I. Card-to-Tape - Updating Data
II. Sort Updating Data
III. TAX-07
Appendix A: Systems Flowchart
Appendix B: TAX-07 Flowchart

I. Card-to-Tape with Updating Data: (50 Records Per Block)

1. Load SSRI C/T (Blocked 50) program in Card Reader. [Located in SSRI (DURANT) Drawer # 1 - 1410 Room].
2. Load Updating data behind program deck.
3. Mount Scratch Tape on Unit 1.
4. Perform Standard 1410 Processor-108 Initialization Routine on Console.
5. C/T Output will be on Unit 1 at end of job.

II. Sort Tape created in I; Sorting Sequence is: (1) WAIS Identification Number (Cols 1-8) (2) Year (Cols 9-10) (3) Entry Code (Cols 11-12).

1. Load PRE-TAX-07 Sort Control Cards in Card Reader. [Located in SSRI (DURANT) Drawer # 2 - 1410 Room]. (Standard Operating Systems Initialization).
2. Mount SOF [Tape # SSRI. 1,30A (HALF REEL)] on Unit 0.
3. Mount I/P Tape to be Sorted on Unit 1.
   Mount Scratch Tape on Unit 2.
   Mount Scratch Tape on Unit 3.
   Mount Scratch Tape on Unit 4.
4. After Halt signaling that the Input Tape has been read in and unloaded: Mount Scratch Tape on Unit 1.
5. At end of job a console message will inform the operator as to which unit the sorted output is located on.

III. Update of Individual Fields in Pre-Consistency Master File: (TAX-07)

1. Load TAX-07 Object Deck in Card Reader. [Program Deck is located in SSRI (DURANT) Drawer # 2 - 1410 Room]. (Standard Processor-108 Initialization).
2. Mount Sorted PRE-TAX-07 Output on Unit 1.
   Mount 1st Reel of WAIS Master I/P on Unit 3.
   Mount Scratch Tape on Unit 2.
   Bring Printer to Ready.
3. During the course of the program, console messages will notify the operator as to the mounting of additional input and output reels of tape.
4. TAX-07 output is as follows:
   (a) Updated Master File on Unit 2.
   (b) Edit listing on the Printer.
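The update pass described in section III can be pictured as a single sequential merge of the sorted change file against the sorted master file. The following Python sketch is only an illustration under stated assumptions, not the TAX-07 program itself: the function name `update_master`, the record layout (a tag of WAIS ID and year, plus a dictionary of fields keyed by entry code) and the listing messages are all hypothetical. The two error conditions it models, a change addressed to a non-existent master and an out-of-sequence change file, are taken from box labels in the paper's flowchart.

```python
def update_master(master_records, changes):
    """Apply sorted change records to a sorted master file in one pass.

    master_records: list of (tag, fields), ascending by tag, where
        tag = (wais_id, year) and fields maps entry code -> value.
    changes: list of (tag, entry_code, value), ascending by
        (tag, entry_code).  Both layouts are assumptions.
    Returns (updated_master, edit_listing).
    """
    changes = list(changes)

    # Halt if the change file is out of sequence.
    for prev, cur in zip(changes, changes[1:]):
        if (cur[0], cur[1]) < (prev[0], prev[1]):
            raise ValueError(f"sequence error on change input at {cur[0]}")

    updated, listing = [], []
    pos = 0
    for tag, fields in master_records:
        # Changes with a lower tag than this master have no master record.
        while pos < len(changes) and changes[pos][0] < tag:
            listing.append(f"ERROR no master for {changes[pos][0]}")
            pos += 1
        fields = dict(fields)  # work on a copy of the master record
        # Apply every change addressed to this master record.
        while pos < len(changes) and changes[pos][0] == tag:
            _, entry_code, value = changes[pos]
            fields[entry_code] = value
            listing.append(f"UPDATED {tag} entry {entry_code}")
            pos += 1
        updated.append((tag, fields))

    # Anything left over is addressed past the end of the master file.
    for change in changes[pos:]:
        listing.append(f"ERROR no master for {change[0]}")
    return updated, listing


master = [(("00000001", "62"), {"34": 100}), (("00000002", "62"), {"34": 200})]
changes = [(("00000001", "62"), "34", 150), (("00000003", "62"), "41", 5)]
updated, listing = update_master(master, changes)
```

Because both inputs are in the same sequence, each file is read once from start to end, which is why the sort in section II has to precede the update.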
Appendix A: Systems Flowchart for the Updating of Individual Fields in the 310 Character Pre-Consistency Master File

Step I: CARD TO TAPE WITH UPDATING DATA (produces UPDATED DATA)
Step II: SORT DATA INTO SEQUENCE OF MASTER FILE (produces SORTED UPDATED DATA)
Step III: TAX-07 (reads SORTED UPDATED DATA and INPUT MASTER FILE; produces EDIT LISTING and UPDATED MASTER FILE)

Appendix B: TAX-07 Flow Chart

[Flowchart. Box labels, in extracted order.]

Page B-1: START; GET MASTER TO MASTER I/P AREA (GENMASTIN); ABLE1; GET DATA CHANGE I/P TO CHANGE I/P AREA & EDIT; REENTRY; CMP MASTER TAG TO NINES (done in GENMASTIN routine); END OF JOB ROUTINE; COMPARE DATA TAG TO MASTER TAG; PUT OUT ERROR MSG: TRYING TO UPDATE NON-EXIST MASTER; SETON M SWITCH; ABLE1; SETON M & C SWITCHES; M SW. (ON / OFF); SETOFF SW.M; MOVE MASTERIN RCD TO MAREA; SETON SW.W; C SW. (ON / OFF); GO TO PAGE 2; SETOFF SW.C; REVISE MAREA AFTER GETTING ADDRESS OF ENTRY CODE; GET DATA CHANGE I/P TO CHANGE I/P AREA & EDIT; COMPARE CURRDATATAG TO PREVDATATAG; HALT - SEQ. ERROR ON DATA CHANGE I/P.

Page B-2: FROM PAGE 1; W SW. (ON / OFF); REENTRY; SETOFF SW.W; MOVE MAREA TO MAST O/P AREA & CHECK IF COMPLETE; BRANCH TO GENMASTOUR ROUTINE; BRANCH TO GENMASTIN ROUTINE.

http://www.ssc.wisc.edu/wais/WAIS656014.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656014.txt

Author: James Geffert
Title: Format, Card Record
Date: September 16, 1964
Paper: WAIS paper 645-004
Keywords: Master File- Tax Records; Formats

Jim Geffert
September 16, 1964
Format, Card Record
Draft 645-004

TYPE 1 CARD 1

Card Pos.  Card Desig.  Master Desig.  Information
1          C1           None           Card number 1
2-9        C9           M9             Identification number
10-11      C11          M11            Year
12-27      C27          M27            Data sheet
28         C28          M307           Form type 1
29         C35          M251           Block or column
30-35                                  Personal exemption allowance
36-42      C42          M258           Net normal or total tax
43-49      C49          M265           First installment
50-56      C56          M34            Largest wage or salary
57-63      C63          M41            Second wage or salary
64-70      C70          M48            Total other wage and salary
71-77      C77          M55            Interest received
78         C78          M294           Number of sources of wages and salaries
79         C79          None           Blank
80         C80          M295           Farm schedule, profit and loss

TYPE 1 CARD 2

1          A1           None           Card number 2
2-9        A9           M9             Identification number
10-11      A11          M11            Year
12-18      A18          M62            Dividend received
19-25      A25          M69            Rents
26-27      A27          None           Blank
28         A28          M307           Form type 1
29-35      A35          M76            Gain or loss, sale of assets
36-42      A42          M83            Profit or loss from business
43-49      A49          M90            Trustee income
50-56      A56          M97            Partnership income
57-63      A63          M104           All other income
64-70      A70          M111           Total income
71-77      A77          M139           Standard deduction net total income
78         A78          M296           Stock dividend received?
79         A79          M297           Auto exemption itemized?
80         A80          M298           Other enclosures

TYPE 1 CARD 3

1          A1           None           Card number 3
2-9        A9           M9             Identification number
10-11      A11          M11            Year
12-18      A18          M146           Wisconsin income tax paid
19-25      A25          M153           Union dues
26-27      A27          None           Blank
28         A28          M307           Form type 1
29-35      A35          M160           Medical-dental deductible
36-42      A42          M167           Interest paid
43-49      A49          M181           *Ordinary business expenses or dividends deductible
50-56      A56          M188           Other deductions
57-63      A63          M209           Total deductions before federal tax and donations
64-70      A70          M216           Net income before federal tax and donations
71-77      A77          M223           Federal tax and social security deductible

*both to M181

TYPE 1 CARD 4

1          A1           None           Card number 4
2-9        A9           M9             Identification number
10-11      A11          M11            Year
12-18      A18          M230           Net income before donations
19-25      A25          M237           Donations
26-27      A27          None           Blank
28         A28          M307           Form type 1
29-35      A35          M244           Net taxable income
36-42      A42          M174           Business interest paid
43-49      A49          M272           Military, etc.
50-56      A56          M279           Social security received

TYPE 2 CARD 1

Card Pos.  Card Desig.  Master Desig.  Information
1          C1           None           Card number 1
2-9        C9           M9             Identification number
10-11      C11          M11            Year
12-27      C27          M27            Data sheet
28         C28          M307           Form type 2
29-35      C35          M34            Largest wage and salary
36-42      C42          M41            Second wage and salary
43-49      C49          M48            Total other wage and salary
50-56      C56          M62            Dividend received
57-63      C63          M55            Interest received
64-70      C70          M104           All other income or receipts
71-77      C77          M111           Total income
78         C78          M294           Number of sources of wages and salaries
79         C79          None           Blank
80         C80          M295           Farm schedule, profit and loss

TYPE 2 CARD 2

1          A1           None           Card number 2
2-9        A9           M9             Identification number
10-11      A11          M11            Year
12-18      A18          M258           Net tax
19-25      A25          M265           First installment
26-27      A27          None           Blank
28         A28          M307           Form type 2
29-35      A35          M272           Military, etc.
36-42      A42          M279           Social security received

TYPE 3 CARD 1

Card Pos.  Card Desig.  Master Desig.  Information
1          C1           None           Card number 1
2-9        C9           M9             Identification number
10-11      C11          M11            Year
12-27      C27          M27            Data sheet
28         C28          M307           Form type 3
29         C35          M251           Block or column
30-35                                  Personal exemption allowance
36-42      C42          M258           Total tax
43-49      C49          M265           First installment
50-56      C56          M34            Largest wage and salary
57-63      C63          M41            Second wage and salary
64-70      C70          M48            Total all other wage and salary
71-77      C77          M55            Interest received
78         C78          M294           Number of sources of wages and salaries
79         C79          None           Blank
80         C80          M295           Farm schedule, profit and loss

TYPE 3 CARD 2

1          A1           None           Card number 2
2-9        A9           M9             Identification number
10-11      A11          M11            Year
12-18      A18          M62            Dividend received, total
19-25      A25          M69            Rent received, total
26-27      A27          None           Blank
28         A28          M307           Form type 3
29-35      A35          M76            Gain or loss, sale of assets
36-42      A42          M83            Profit or loss on business
43-49      A49          M90            Trust income
50-56      A56          M97            Partnership income
57-63      A63          M104           All other income
64-70      A70          M111           Total income
71-77      A77          M118           Schedule F expenses
78         A78          M296           Stock dividend
79         A79          M297           Auto expenses itemized?
80         A80          M298           Other enclosures

TYPE 3 CARD 3

1          A1           None           Card number 3
2-9        A9           M9             Identification number
10-11      A11          M11            Year
12-18      A18          M125           Adjusted gross income
19-25      A25          M139           Net taxable income, standard deduction basis
26-27      A27          None           Blank
28         A28          M307           Form type 3
29-35      A35          M146           Wisconsin tax paid
36-42      A42          M153           Union dues
43-49      A49          M160           Medical-dental expenditures
50-56      A56          M167           Interest paid
57-63      A63          M188           Other deductions
64-70      A70          M209           Total deductions before federal taxation
71-77      A77          M216           Net income before federal taxation

TYPE 3 CARD 4

1          A1           None           Card number 4
2-9        A9           M9             Identification number
10-11      A11          M11            Year
12-18      A18          M223           Federal tax and social security deductible
19-25      A25          M230           Net income before donations
26-27      A27          None           Blank
28         A28          M307           Form type 3
29-35      A35          M237           Donations
36-42      A42          M244           Net taxable income
43-49      A49          M174           Business interest paid
50-56      A56          M272           Military, etc.
57-63      A63          M279           Social security received

TYPE 4 CARD 1

Card Pos.  Card Desig.  Master Desig.  Information
1          C1           None           Card number 1
2-9        C9           M9             Identification number
10-11      C11          M11            Year
12-27      C27          M27            Data sheet
28         C28          M307           Form type 4
29-35      C35          M34            Largest wage or salary
36-42      C42          M41            Second wage or salary
43-49      C49          M48            Total other wage and salary
50-56      C56          M62            Dividend received
57-63      C63          M55            Interest received
64-70      C70          M83            Net profit and loss on business
71-77      C77          M69            Net rents
78         C78          M294           Number of sources of wages and salaries
79         C79          None           Blank
80         C80          M295           Farm schedule, profit and loss

TYPE 4 CARD 2

1          A1           None           Card number 2
2-9        A9           M9             Identification number
10-11      A11          M11            Year
12-18      A18          M104           All other income
19-25      A25          M111           Total income
26-27      A27          None           Blank
28         A28          M307           Form type 4
29-35      A35          M251           Personal exemption allowance
36-42      A42          M258           Total tax
43-49      A49          M265           First installment
50-56      A56          M272           Military, etc.
57-63      A63          M279           Social security received

TYPE 5 CARD 1

Card Pos.  Card Desig.  Master Desig.  Information
1          C1           None           Card number 1
2-9        C9           M9             Identification number
10-11      C11          M11            Year
12-27      C27          M27            Data sheet
28         C28          M307           Form type 5
29-35      C35          M34            Largest wage and salary
36-42      C42          M41            Second wage and salary
43-49      C49          M48            Total other wage and salary
50-56      C56          None           All other income
57-63      C63          M111           Total income
64-70      C70          M118           Business expense
71-77      C77          M125           Adjusted gross income
78         C78          M294           Number of sources of wages and salaries
79         C79          M299           Spouse income
80         C80          M295           Farm schedule, profit and loss

TYPE 5 CARD 2

1          A1           None           Card number 2
2-9        A9           M9             Identification number
10-11      A11          M11            Year
12-18      A18          M306           Net taxable income
19-25      A25          M251           Personal exemption allowance
26-27      A27          None           Blank
28         A28          M307           Form type 5
29-35      A35          M258           Total tax
36-42      A42          M265           First installment
43-49      A49          M55            Interest income
50-56      A56          M62            Dividends income
57-63      A63          M69            Rent income
64-70      A70          M76            Gain or loss, sale of assets
71-77      A77          M83            Profit or loss, business
78         A78          M296           Stock dividend
79         A79          M297           Auto expenses itemized
80         A80          M298           Other enclosures

TYPE 5 CARD 3

1          A1           None           Card number 3
2-9        A9           M9             Identification number
10-11      A11          M11            Year
12-18      A18          M97            Partnership income
19-25      A25          M90            Estate or trust income
26-27      A27          None           Blank
28         A28          M307           Form type 5
29-35      A35          M104           Other income
36-42      A42          M174           *Business interest paid
43-49      A49          M167           *Non-business interest paid
50-56      A56          M160           Medical-dental deductions
57-63      A63          M146           Wisconsin income tax paid
64-70      A70          M153           Union dues
71-77      A77          M195           Alimony paid

*Add A42 and A49 and move to M167

TYPE 5 CARD 4

1          A1           None           Card number 4
2-9        A9           M9             Identification number
10-11      A11          M11            Year
12-18      A18          M202           Forest crop land
19-25      A25          M209           Total deductions before federal taxes and donations
26-27      A27          None           Blank
28         A28          M307           Form type 5
29-35      A35          M216           Net income before federal tax and donations
36-42      A42          M223           Federal tax and social security deductible
43-49      A49          M230           Net income before donations
50-56      A56          M237           Donations
57-63      A63          M244           Net taxable income
64-70      A70          M272           Military, etc.
71-77      A77          M279           Social security received

TYPE 6 CARD 1

Card Pos.  Card Desig.  Master Desig.  Information
1          C1           None           Card number 1
2-9        C9           M9             Identification number
10-11      C11          M11            Year
12-27      C27          M27            Data sheet
28         C28          M307           Form type 6
29         C35          M251           Block or column
30-35                                  Personal exemption allowance
36-42      C42          M258           Net normal or total tax
43-49      C49          M265           First installment
50-56      C56          M306           Taxable income

Taxable income for this form is assumed to be taxable income, Incomplete Long Form.

TYPE B CARD 9

1          C1           None           Card number 9
2-9        C9           M9             Identification number
10-11      C11          M11            Year
12-27      C27          M27            Changed data sheet information

TYPE B CARD 5

1          A1           None           Card number 5
2-9        A9           M9             Identification number
10-11      A11          M11            Year
12-18      A18          None           Previous income
19-25      A25          M286           Adjusted taxable income
26-27      A27          None           Blank
28         A28          None           Form type B
29-35      A35          None           Additional normal tax
36-42      A42          None           Interest computed
43-49      A49          None           Additional 25% (20%) tax
50-56      A56          None           Interest computed
57-63      A63          None           Additional teacher surtax
64-70      A70          None           Interest
71-77      A77          M293           Total additional tax

FORMAT DATA RECORD

Pos.       Master Desig.  Information
1          M1             1 (2 if missing card or cards)
2-9        M9             Identification number
10-11      M11            Year of return
12-14                     Residence location
15-16                     County prior year
17                        Address change
18-19                     Occupation
20                        Occupation change
21         M27            Return reason
22                        Partnership
23                        Spouse separate income
24                        Marriage details
25                        Head of family
26-27                     Number of dependents
28-34      M34            Largest wage or salary
35-41      M41            Second wage or salary
42-48      M48            Total other sources wage or salary
49-55      M55            Total interest received
56-62      M62            Dividends received, total
63-69      M69            Rent received, total
70-76      M76            Gain or loss on sale of assets
77-83      M83            Profit or loss from business
84-90      M90            Income from trustees or fiduciaries
91-97      M97            Partnership income
98-104     M104           Other income
105-111    M111           Total of sources of income
112-118    M118           Auto or business expenses
119-125    M125           Income (adjusted gross) less auto expense
126-132    M132           Standard deduction allowed
133-139    M139           Net taxable income, standard deduction basis
140-146    M146           Wisconsin tax paid
147-153    M153           Union dues
154-160    M160           Medical-dental expenses
161-167    M167           Total interest paid
168-174    M174           Business interest paid
175-181    M181           Dividend deductible
182-188    M188           Other deductions
189-195    M195           Alimony paid
196-202    M202           Forest crop land
203-209    M209           Total deductions before federal tax and donations
210-216    M216           Net income before federal tax and donations
217-223    M223           Federal tax and social security deductible
224-230    M230           Net income before donations
231-237    M237           Donations
238-244    M244           Net taxable income, itemized basis
245                       Block or column
246-251    M251           Personal exemption allowance
252-258    M258           Net normal or total tax
259-265    M265           First installment
266                       Type of item in 267-272
267-272    M272           Miscellaneous information
273-279    M279           Social security received
280-286    M286           Assessed taxable income
287-293    M293           Total additional taxes
294        M294           Number of sources wages and salaries
295        M295           Farm schedule, profit and loss
296        M296           Stock dividend
297        M297           Auto expenses
298        M298           Other enclosures
299        M299           Spouse income 59 or 60
300-306    M306           Taxable income, incomplete form, or net taxable income, type 5 form
307        M307           Type of form
308        M308           1 incomplete, 0 complete
309        M309           Empty
310                       Record mark

http://www.ssc.wisc.edu/wais/WAIS645004.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645004.txt

Author: Gene Moyer
Title: Methods of Devising a Distribution of Unrealized Capital Gains
Date: November 9, 1964
Paper: WAIS paper 645-005

[The extracted text of this paper begins partway through. Equations (10) and (11), which express the capital gains $C_{i(t+y)}$ in terms of the bases $B_{4i}$ and $B_{5i}$, are illegible in the source.]

But for some purposes a point estimator of $B_i$ is required.
Three such point estimators which may be used with various degrees of bias and efficiency are:

(12) $B_{6i} = \tfrac{1}{2}\,(\bar{P}_i + P_i^*)$

(13) $B_{7i} = \frac{1}{Y} \sum_{j=1}^{Y} P_{i(t-j)}$, where $Y$ = (the number of years from taxpayer's age 25 to year $(t-1)$)

(14) $B_{8i} = \frac{1}{Y} \sum_{j=1}^{Y} P_{i(t-j)}\, W_j$, where $W_j$ = some weight factor

$B_{6i}$ is an appealing estimator because of its ease of computation. When so many computations are required, ease of computation is an important consideration. It is probably a biased estimator, however, depending upon the distribution of prices between $\bar{P}_i$ and $P_i^*$. Since $B_{6i}$ does not take this distribution into account, and since there seems to be little reason for believing the distribution of prices from $\bar{P}_i$ to $P_i^*$ to be symmetric, the probability of bias seems great.

The bias might be reduced by using $B_{7i}$, since it does take the distribution of prices into account. But since it gives as much weight to prices in time $(t-Y)$ as it does to prices in time $(t-1)$, and since the probability is great that a taxpayer holds an asset only a short time, one would expect that $E(B_{7i} - B_i) < E(B_{6i} - B_i)$ but still not zero. The choice of age 25 as the cutoff point of $Y$ is arbitrary, but self-accumulation of capital does not ordinarily begin until about that age, and inheritances would not ordinarily occur until that age or later, so it seems to be justifiable. It is, however, arbitrary, and one might use any other age up to 30 with equal justification.

$B_{8i}$ is by far the best choice in the opinion of this writer, but the choice of weights is difficult. If

(15) $W_j$ = [illegible in source],

then the weight given to recent years is very high and at $j = Y$ (for an old taxpayer) is near zero. Another weighting scheme might be

(16) $W_j$ = [illegible in source].

This too monotonically decreases as one goes back into time, but the speed of decline is much less. For a taxpayer of age 65, $W_j = 39$ when $j = 1$ and 2 when $j = 10$. Both these weighting schemes, however, suffer from the deficiency that they are arbitrary and divorced from the data.
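The contrast between the three point estimators can be made concrete with a small numerical sketch. It rests on assumptions that go beyond what is legible in the source: B6i is read as the midpoint of the highest and lowest price, B7i as the unweighted mean of the prices over the Y years, and B8i as a weighted mean that is normalized here by the sum of the weights so that the result stays on the price scale. The function names, prices and weights are hypothetical.

```python
def midrange_basis(high_price, low_price):
    """B6i under the assumed reading: midpoint of the two extreme prices."""
    return (high_price + low_price) / 2


def mean_basis(prices):
    """B7i under the assumed reading: unweighted mean of past prices.

    prices[j - 1] is the price j years before year t.
    """
    return sum(prices) / len(prices)


def weighted_basis(prices, weights):
    """B8i under the assumed reading: weighted mean of past prices.

    Dividing by the sum of the weights is an assumption made for this sketch.
    """
    if len(prices) != len(weights):
        raise ValueError("prices and weights must have the same length")
    return sum(p * w for p, w in zip(prices, weights)) / sum(weights)


# Hypothetical prices for the last three years, most recent first.
prices = [10.0, 12.0, 20.0]
# Hypothetical weights that favor recent years.
weights = [3, 2, 1]

b6 = midrange_basis(max(prices), min(prices))
b7 = mean_basis(prices)
b8 = weighted_basis(prices, weights)
```

With these made-up figures the weighted estimate lies closest to the most recent price, which is the behavior the text asks for when assets are usually held only a short time.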
A bettor scheme which is more realistic is to compute the value of assets in the class to which the ith asset belongs on which capital gains have been realized during the years of our tax sample (R). Since each of these are accompanied by the date the asset was purchased , the value of those which were held j years (H i ) may be computed thus (17) W j Another scheme which also has a basis in the real world is to form a distribution of the prices of the ith asset over the Y years thus f (P) I P. 1 p where f(p) = a discrete variable indicating the ratio of the total number of periods the asset sold at a given price to the total number of periods.* Therefore the sum of the f(p) = 1. The indicated distribution is a mythical example. Having formed such a distribution, then, the proper weight would be (18) Wj = f(p) Both (17) and (18) have the advantage of having bases in the real world even though there is no assurance that they are actually behaviorally sound.. Also, they are not necessarily monotonic. If the assumption in (15) and (16) that the probability that an asset will be held for j years decreases as j increases is in error, the fact that (17) and (18) are not monotonic may be a great advantage. *The writer is indebted to Martin H. David for the idea of forming the probability distribution of prices to use as a weight. Time and money may ultimately be the most important factors in determining which estimator to use or (if B8i is chosen) which weighting scheme to use. Estimating Capital Gains on Assets in Strata When only asset class and annual earnings are available, the problem becomes one of estimating prices as well as the taxpayer's basis. One simple method which would seem to have a small bias is to determine the ratio of Mean earnings paid to asset owners in year t for the class of Mean value of assets in year t 9 t assets to which the ith asset belongs. 
_ d Vt ( E ~ P~T,)t where P is the price and N is the number of issues KKl outstanding of the d assets in that class of assets. -1 Then ( ) where Eit is the actual earnings received by the Vt taxpayer. If it is also assumed that Bi = Pit, then one can use Bt t V B9i as the estimator of Bi. The matrix of estimated unrealized capital gains for this stratum will be -1 (20) Ci(t+y) Ei( y) B91 (y = 0, 1, 2,..., (t+y}l when y - 0, of course, Ci(t+y) = 0. This procedure has only the fact that it is possible to recommend it, but that fact is of great significance when so little information is available. Estimating Capital Gains for Stratum V Where not even the asset class is available, one might determine M- I (21) Sigma St Vt K-1 M- 1 Vt K for the m-i ascertained classes of assets defined in the study. Then Emt =Pmt where Em is the earning of the non ascertained assets of the taxpayer in year t. If one further assumes as in Stratum IV that Bi ft Pit, one can use -1 (23) B10i = Emt as the estimator of BV The estimated capital gains for assets in Stratum V, then, will be t -I (24) ci(t+y) _ - i(t+y) B1oi (y 10 o, 1, 2,..., n) e~y) again, when y = 0, Ci(t+y) = 0. Estimating Capital Gains for Stratum VI. Estimating capital gains for assets in this stratum is in some ways the most difficult of any of the strata because one gets a few discrete observations but never a series of observations. For example, suppose one had an observation on an asset sold in 1959 which the taxpayer had owned since 1950. The basis of the asset is ascertained, but one has no idea about the series of capital gains except for the final realized gain. Perhaps the only thing which can be done is to determine the proportion of the asset's sale value which the earnings in the sale year represent (ri) and assume that this proportion is constant over the years since the taxpayer's first return. Then would seem to be the best estimator of Ci(t+y). 
It would seem that each case of this nature should be examined and an individual determination made for each.

(25) $P_{it} = \frac{1}{r_i} E_{it}$

(26) $C_{i(t+y)} = P_{i(t+y)} - B_i$ (y = 0, 1, 2, ..., n)

http://www.ssc.wisc.edu/wais/WAIS645005.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645005.txt

Gene Moyer, 1964. Gathering a List of Prices. October 13, 1964. WAIS paper 645-006. Property File.

Gene Moyer
WAIS 645-006
October 13, 1964
DRAFT

Gathering a List of Prices

1. Sources of Information

The source of this information will be the Stock Exchange listings for the year-end closing (Summary) from the Wall Street Journal. These listings are generally grouped by the following exchanges*.

Numerical designation
1 New York Stock Exchange (STOCKS AND BONDS)
2 American Stock Exchange (STOCKS AND BONDS)
3 Midwest Stock Exchange
4 Large city markets (Canada and far West)
Over the counter markets
5 Industrials and utilities
6 Bank stocks
7 Insurance stocks
8 Corporate bonds
9 Public authority bonds
Mutual funds
Not on any list

2. Modus Operandi

A. Investigate the list of companies in the "Non-Chicago list" (those stocks not on the Chicago tape) to find the exchange on which the company is listed (if it is listed on any).
B. Divide the list of companies by the exchange on which each company is listed by punching the number designation on the Identification Card and sorting on this number.
C. Get price information on all the issues in each exchange.

3. Information Needed

A. Year's high
B. Year's low
C. Year's closing
D. Year's dividend
E. Bond rates
F. Capital gains distributions from mutual funds

4. Years for Which Information Needed

A. The year the firm name first appeared on our tax forms will be punched on the Identification Cards.
B.
Price information should be gathered (if possible) for the ten years preceding that date and for the succeeding years to 1963.

http://www.ssc.wisc.edu/wais/WAIS645006.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645006.txt

John de Vries, 1966. Check for Presence of Proper Cards in the Survey File. September 28, 1966. WAIS paper 667-009. Survey Data and File.

John de Vries
WAIS 667-009
September 28, 1966

Check for presence of proper cards in the survey file

As a sequel to WAIS 667-007 (in which the validity of the survey data was discussed), I have modified an earlier paper by Geffert dealing with the presence of proper cards in the survey file (WAIS 656-061, dated May 4, 1966). Although the information in Geffert's paper is largely correct, it is also incomplete. It specifies which cards should be present and under which conditions; it does not specify under which conditions certain cards should not be present. Naturally, if the conditions required for the presence of a card are not met, that card should not be present. There are, however, also a number of cards for which only one per respondent can be present, and there are situations where the presence of one card demands the absence of another card.

Card | Required | More than 1 allowed? | Remarks
02 | Yes | No |
03 | Yes | No |
04 | Yes | No |
05 | Yes | No |
06 | If R male and ever married (card 4, col. 48 & 5) | No | Cannot concur with card 7; should not exist if ID # indicates that R is female
07 | If R female and ever married (card 4, col. 48 & 5) | No | Cannot concur with card 6; should not exist if ID # indicates that R is male
08 | Yes | No |
09 | Except when card 8, col. 13=2, col. 14=0 and cols. 15-16 indicate that R never had a job | No |
10 | Yes | No |
11 | Never | No! | Cards 11, 12 and 13 were not included; presence of any of these therefore suspect
12 | Never | No! | (see card 11)
13 | Never | No! | (see card 11)
14 | Sometimes; conditions for presence are contained on the card itself; legitimacy of absence can therefore not be verified | No | Only present cards can be checked
15 | Yes | No |
16 | If card 15, cols. 63-64 not 00 or blanks | Yes | 2-digit counter
17 | If card 15, col. 65 not 0 or blank | Yes | 1-digit counter
18 | If card 15, col. 67 not 0 or blank | Yes | 2-digit counter
19 | If card 15, col. 68 not 0 or blanks | Yes | 1-digit counter
20 | If card 15, col. 69 not 0 or blank | Yes | 1-digit counter
21 | Yes | No |
22 | If card 23, col. 53=3 and col. 54=1 | No |
23 | If card 21, col. 53=3 and a) col. 54=2 or 3, or b) col. 54=1 and card 22, col. 80=1 | No |
24 | If card 21, col. 53=1 or col. 55=1 | No | Cannot concur with card 26
25 | If card 21, col. 53=1 or col. 55=1 | No | Cannot concur with card 26
26 | If card 21, col. 55=2 or 3 | No | Cannot concur with cards 24 and 25
27 | If card 21, col. 53 not blank | No |
28 | Yes | No |
29 | Yes | No |

The following cards all deal with information in the assets booklet; although, officially, several cards are always required, non-response to the assets booklet may be considered a "legitimate" reason for absence of a card. (There were 1105 assets booklets for 1300 respondents; therefore, 195 absences are legitimate.)

Card | Required | More than 1 allowed? | Remarks
30 | Yes | No |
31 | Yes | No |
32 | If card 29, cols. 41-43 not zeroes or blanks | Yes | 3-digit counter
33 | If card 29, cols. 44-46 not zeroes or blanks | Yes | 3-digit counter
34 | If card 29, cols. 47-49 not zeroes or blanks | Yes | 2-digit counter
35 | If card 29, cols. 50-51 not zeroes or blanks | No |
36 | If card 29, cols. 52-53 not zeroes or blanks | Yes | 2-digit counter
37 | If card 29, cols. 54-56 not zeroes or blanks | Yes | 3-digit counter
38 | If card 29, cols. 57-58 not zeroes or blanks | Yes | 2-digit counter
39 | If card 29, cols. 59-60 not zeroes or blanks | Yes | 2-digit counter
40 | If card 29, cols. 61-62 not zeroes or blanks | Yes | 2-digit counter
41 | If card 29, cols. 63-64 not zeroes or blanks | Yes | 2-digit counter
42 | If card 29, cols. 65-67 not zeroes or blanks | Yes | 3-digit counter
43 | If card 29, cols. 65-67 not zeroes or blanks and if card 42, col. 72=1 | Yes | Counter must match counter for card 42
44 | If card 29, cols. 68-70 not zeroes or blanks | Yes | 3-digit counter
45 | If card 29, col. 71 not 0 or blanks | |
46 | If card 45, col. 73=1 | No |
47 | If card 29, cols. 72-73 not zeroes or blanks | Yes | 1-digit counter
48 | Yes | No |

http://www.ssc.wisc.edu/wais/WAIS667009.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667009.txt

Richard Bauman, 1965. Editing Conventions Used in Interview Coding. March 30, 1965. WAIS paper 645-043. Survey Data and File.

Richard A. Bauman
WAIS 645-043
March 30, 1965

Editing Conventions Used in Interview Coding

This list describes the editing actions which were taken in a number of cases and which may, therefore, be important in analysing the data. The number of recorded instances may be less than the estimated total number because of editing during keypunching.

Each entry gives the questions involved, the number of recorded cases, the estimated number of total cases, and the action taken.

(a) Questions involved: Q.25, 26, 27. Recorded cases: 9. Estimated total cases: 15.
    Action taken: We made the number of children described in Q's 26 and 27 agree with the answer to Q.25 (e.g. R sometimes noted that he had children by a previous marriage but didn't answer 26 and 27 about them. We then marked the appropriate rows "NA" for Q's 26 and 27.)

(b) Questions involved: Q.56c, 67, 68, 76. Recorded cases: 10. Estimated total cases: 10.
    Action taken: When R had duplicated job information (e.g. same job in Q.56 and Q.68, or same job in Q.67 and Q.76) we crossed out the information which seemed least appropriate.

(c) Questions involved: Q.94a, 98a, 106, 107, 108, 119, 136, 144, 145, A145, A148, 196, 197, (B)Q.12a, (B)13a, (B)36b, (1b)41e, (3)41, (b) 44b. Recorded cases: 37. Estimated total cases: 50.
    Action taken: When R gave high and low estimates of an amount (e.g. $2000-$3000) we edited the response to the average of the 2 figures (if the new answer was also compatible with any other pertinent answers).

(d) Questions involved: Q.98a. Recorded cases: 1. Estimated total cases: 20.
    Action taken: R often said he gave a single amount to several persons (e.g. mother and father, $1200). We edited this to equal amounts for each person.
(e) Questions involved: Q.101f. Recorded cases: 9. Estimated total cases: 9.
    Action taken: We crossed out health insurance benefits answered in this question.

(f) Questions involved: Q.105, 114, 123, 130. Recorded cases: 5. Estimated total cases: 25.
    Action taken: When several legatees, beneficiaries, or donees were mentioned for one inheritance, trust, or gift, we divided same equally among them.

(g) Questions involved: Q.140a. Recorded cases: 0. Estimated total cases: 25.
    Action taken: If more than 3 places of residence were listed here, we had the first two and the last one punched.

(h) Questions involved: Q.144-148b, A143-A148. Recorded cases: 34. Estimated total cases: 34.
    Action taken: When R had a multiple unit dwelling or commercial-dwelling combination we divided the property according to the editing instructions. (This was usually done strictly according to the number of units but some exceptions based on a priori information are described in our editing records.)

(i) Questions involved: Q.A147a-A147b. Recorded cases: 10. Estimated total cases: 10.
    Action taken: When R prepaid or increased his mortgage 2 or more times, we crossed out the several dates in A147a and the several amounts given in A147b and entered "77" for A147a.

(j) Questions involved: Q.175. Recorded cases: 6. Estimated total cases: 10.
    Action taken: The interviewer's instructions following Q.174 erroneously led interviewer to skip Q.175. We marked Q.175 NA whenever Q.167 or Q.168 were answered "yes."

(k) Questions involved: Q.196, 197, 198a, 198b, 200, 201. Recorded cases: 9. Estimated total cases: 9.
    Action taken: When it was apparent that a type of income was answered to one of these questions (e.g. social security in Q.198a, farm income in Q.196), we shifted the income to the appropriate answer.

(l) Questions involved: Q.196, 197, 198a, 198b, 200, 201. Recorded cases: 26. Estimated total cases: 26.
    Action taken: When these questions were inconsistent with marginal notes or other "income questions", an "I" was coded and punched.

(m) Questions involved: Q.197, 198b, 201. Recorded cases: 7. Estimated total cases: 7.
    Action taken: When it appeared that these income amounts were gross income, a "G" was coded and punched.

(n) Questions involved: Q.1 10b. Recorded cases: 11. Estimated total cases: 11.
    Action taken: When R answered appliances owned by someone else (landlord, relative, friend) we crossed them off.

(o) Questions involved: Q.14-17. Recorded cases: 12. Estimated total cases: 12.
    Action taken: U.S. Savings Bonds were crossed out if answered to Q.17 and were added to Q's 14-16 if not already answered there.

(p) Questions involved: Q.21. Recorded cases: 14. Estimated total cases: 20.
    Action taken: When R gave an answer to Q.21f and an answer to Q.21c which was equal to Q.21b, we crossed out the answer to Q.21c.

(q) Questions involved: Q.44. Recorded cases: 58. Estimated total cases: 65.
    Action taken: In many cases, the responses to these questions were obviously in error (e.g. response to Q.44d or Q.44f greater than response to Q.44b). Therefore several actions were taken by the editors.
    (a) When the total term insurance held (Q.44f) was greater than the answer to Q.44b, and it was impossible to ascertain what was included in Q.44b, Q.44b was marked "NA".
    (b) When Q.44d-44g was left blank, these were marked "NA".
    (c) When it was possible to ascertain that a portion of Q.44d was non-term, the answers were adjusted accordingly.

http://www.ssc.wisc.edu/wais/WAIS645043.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645043.txt

Gene Moyer, 1965. ID Numbers Assigned to Individuals from the Unmatched State File. January 14, 1965. WAIS paper 645-024. Survey Data and File. (The data in this document is restricted.)

Some of the data in this document contains confidential information. The confidential information has been removed and preserved in the secure data archive (OLDR). Contact CDHA Data Archivist at cdha_data@ssc.wisc.edu for more information.

Gene Moyer
WAIS 645-024
January 14, 1965
Draft

Numbers Assigned to Questionnaires in A = 3, 4, 5, 6 (Unmerged State File).
James Geffert, 1964. Proposed Method of Merging Social Security Information with Wisconsin Income Tax Data. October 23, 1964. WAIS paper 645-009. Master File - Tax Records; Social Security.

James Geffert
Draft
October 23, 1964
WAIS 645-009

Data Processing Document: Proposed method of merging Social Security information with Wisconsin income tax data.

Contents:
1. Flow diagram
2. Description of procedure

General Flow Outline (flow diagram; box labels only)
Phase A: Soc. Sec., 14,000 records; SORT (?); Soc. Sec., 14,000 records.
Phase B: Fixed Format ID Information, 18,000 two-card records; Card Image ID File; SORT on SS#, CD#; Sorted Card Image ID File; EDIT Program; EDIT Listing; EDIT Cards; ID File, 122-character records.
Phase C: MERGE; Claim and Other Cases; Final Merged Info. to Master File; EDITS; Unmatched; EDIT Listing.

Desired Information

We wish to add two kinds of information to the existing master file at this point: a) fixed format identification information for all individuals in the sample; and b) data on individuals from the Social Security Administration.

In general, the procedure will involve matching the individuals in our file of fixed format identification information with individuals in the social security information file, creating a file of merged information, and updating the master record file from the merged identification and social security data file.

With reference to the diagram, Phase A involves ordering the social security tapes by social security account number if necessary. Phase B involves putting the fixed format identification information on tape, sorting the records, and editing them for mispunches and missing cards. The output of Phase B will be a tape of 122-character records ordered by social security account number. Phase C will merge social security and identification information into a single record for each person.
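The matching step of Phase C can be sketched as a two-way merge over two files that are already ordered by social security account number. This is only an illustration of the idea: the field names (`ssn`, `wis_id`, `earnings_1963`) and the sample records are assumptions made up for the example, not the project's actual record layout or program.

```python
def merge_by_ssn(id_records, ss_records):
    """Two-way merge of two lists that are both sorted by 'ssn'.

    Returns (merged, unmatched): merged holds one combined record per person
    found in both files; unmatched holds identification records for which no
    social security record exists.
    """
    merged, unmatched = [], []
    j = 0
    for rec in id_records:
        # Advance through the social security file until it catches up.
        while j < len(ss_records) and ss_records[j]["ssn"] < rec["ssn"]:
            j += 1
        if j < len(ss_records) and ss_records[j]["ssn"] == rec["ssn"]:
            merged.append({**rec, **ss_records[j]})
        else:
            unmatched.append(rec)
    return merged, unmatched


# Hypothetical sample data, both lists ordered by account number.
id_file = [
    {"ssn": "111", "wis_id": "00000042"},
    {"ssn": "222", "wis_id": "00000007"},
    {"ssn": "333", "wis_id": "00000019"},
]
ss_file = [
    {"ssn": "111", "earnings_1963": 4200},
    {"ssn": "333", "earnings_1963": 3100},
    {"ssn": "444", "earnings_1963": 900},
]

merged, unmatched = merge_by_ssn(id_file, ss_file)
# Matched records are re-sorted by identification number for the master-file update.
merged.sort(key=lambda r: r["wis_id"])
```

A single pass over sorted inputs mirrors how a tape merge would have to work, since neither file can be searched at random.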
At this point we expect to find a number of unmatched individuals for whom no social security information exists. Therefore, we create a tape file of these unmatched individuals to send to the Social Security Administration for further processing. The matched information will be sorted by identification number and merged into the master file.

http://www.ssc.wisc.edu/wais/WAIS645009.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645009.txt

James Geffert, 1964. Proposed Identification and Social Security Record Format. November 16, 1964. WAIS paper 645-012. Social Security Earnings Data - 805; Fixed Format Identification - 805 File.

James Geffert
645-012
November 16, 1964
Draft

Proposed Identification and Social Security Record Format

Position | SS Tape Position | Item
1 | | Alpha I
2-9 | | Wisconsin Identification number
10-11 | | Blank
12-20 | | Social Security Account number
21-37 | | Last name
38-39 | | Title
40-52 | | First name
53-64 | | Middle name
65-74 | | Street or box number
75-77 | | RR, Rt. or RFD number
78-94 | | Street name or "box"
95-98 | | Street class or type
99-119 | | Post office (city)
120-121 | | Zone number
122-123 | | County code
Line 1
124-124 | 13 | Multiple account number indication
125-125 | 26 | Indication that name on record does not agree with finder card
126-130 | 31-35 | Month and year of birth
131-131 | 38 | Race indication
132-138 | 40-46 | Sex (alpha)
Line 2
139-143 | 76-80 | Indication of railroad activity
144-145 | 84-85 | Newly posted credit earnings item
146-147 | 87-88 | Additional earnings indication
148-149 | 90-91 | Active earnings discrepancy
150-154 | 94-98 | Account in benefit status other than disability
155-158 | 100-103 | Benefit status other than disability was terminated
159-161 | 107-109 | Account in disability benefit status or disability freeze status
162-165 | 111-114 | Disability status was terminated
166-169 | 117-120 | Credit indication
170-174 | 123-127 | Earnings statement issued in year indicated
175-177 | 130-132 | Indication of self-employment activity
178-180 | 133-135 | Indication of delinquent self-employment item
181-182 | 138-139 | Indication of agricultural activity
Line 4
183-191 | 223-231 | Earnings, 1937 to date
192-193 | 234-235 | Wage quarters of coverage, 1947 to date
194-195 | 239-240 | Self-employment quarters of coverage 1951 to date
196-197 | 242-243 | Agricultural quarters of coverage 1955 to date
198-206 | 259-267 | Earnings 1951 to date
207-208 | 270-271 | Wage quarters of coverage 1951 to date
209-210 | 275-276 | Self-employment quarters of coverage 1951 to date
Line 5
211-218 | 294-301 | 1951 earnings
219-219 | 310 | 1951 self-employment quarters of coverage
220-227 | 330-337 | 1952 earnings
228-228 | 346 | 1952 self-employment quarters of coverage
Line 6
229-236 | 364-371 | 1953 earnings
237-240 | 374-377 | 1953 quarterly wage quarters of coverage pattern
241-241 | 380 | 1953 self-employment quarters of coverage
242-249 | 400-407 | 1954 earnings
250-253 | 410-413 | 1954 quarterly wage quarters of coverage pattern
254-254 | 416 | 1954 self-employment quarters of coverage
Line 7
255-262 | 434-441 | 1955 earnings
263-266 | 444-447 | 1955 quarterly wage quarters of coverage pattern
267 | 450 | 1955 self-employment quarters of coverage
268 | 453 | 1955 agricultural quarters of coverage
269-276 | 470-477 | 1956 earnings
277-280 | 480-483 | 1956 quarterly wage quarters of coverage pattern
281 | 486 | 1956 self-employment quarters of coverage
282 | 489 | 1956 agricultural quarters of coverage
Line 8
283-290 | 504-511 | 1957 earnings
291-294 | 514-517 | 1957 quarterly wage quarters of coverage pattern
295 | 520 | 1957 self-employment quarters of coverage
296 | 523 | 1957 agricultural quarters of coverage
297-304 | 540-547 | 1958 earnings
305-308 | 550-553 | 1958 quarterly wage quarters of coverage pattern
309 | 556 | 1958 self-employment quarters of coverage
310 | | Record mark

Position | SS Tape Position | Item
1 | | Alpha K
2-9 | | Wisconsin ID number
10-11 | | Blank
12-12 | 559 | 1958 agricultural quarters of coverage
Line 9
13-20 | 574-581 | 1959 earnings
21-24 | 584-587 | 1959 quarterly wage quarters of coverage pattern
25-25 | 590 | 1959 self-employment quarters of coverage
26-26 | 593 | 1959 agricultural quarters of coverage
27-34 | 610-617 | 1960 earnings
35-38 | 620-623 | 1960 quarterly wage quarters of coverage pattern
39 | 626 | 1960 self-employment quarters of coverage
40 | 629 | 1960 agricultural quarters of coverage
Line 10
41-48 | 644-651 | 1961 earnings
49-52 | 654-657 | 1961 quarterly wage quarters of coverage pattern
53 | 660 | 1961 self-employment quarters of coverage
54 | 663 | 1961 agricultural quarters of coverage
55-62 | 680-687 | 1962 earnings
63-66 | 690-693 | 1962 quarterly wage quarters of coverage pattern
67 | 696 | 1962 self-employment quarters of coverage
68 | 699 | 1962 agricultural quarters of coverage
Line 11
69-76 | 714-721 | 1963 earnings
77-80 | 724-727 | 1963 quarterly wage quarters of coverage pattern
81 | 730 | 1963 self-employment quarters of coverage
82 | 733 | 1963 agricultural quarters of coverage
83-309 | | Blank for additional information
310 | | Record mark

http://www.ssc.wisc.edu/wais/WAIS645012.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645012.txt

Gene Moyer, 1964. The Locked-In Effect and the Aging Investor. December 1, 1964. WAIS paper 645-015. Analysis Proposals - For Analyses, Theses, etc.

Gene Moyer
WAIS Paper 645-015
December 1, 1964
2nd Revision

The Locked-In Effect and the Aging Investor

For many years the New York Stock Exchange has kept up a running stream of invective against the capital gains tax. One reason for its animosity toward the tax is the fact that certain countries, notably England, do not levy such a tax. The exchange feels, then, that American investors are the victims of discrimination. But since tax systems among nations are in general different, the exchange is willing for American investors to put up with this "cross" of discrimination. The exchange contends, however, that the tax has other effects which are much more insidious than that mentioned above. The most important of these is the "locked-in" effect.

This theoretical structure involves the contention that investors can be divided into several classes graded according to the amount of risk they are willing to take in order to get a specified amount of yield. At the top of this hierarchical structure are individuals who are willing to take great risks in order to get high profits (including capital gains). These individuals are the primary purchasers of the equity issues of new and untried companies. Without the funds which these high-risk, high-yield investors provide, new companies would have a very difficult time getting started and the new products they provide would be introduced into the economy at much less frequent intervals. Therefore the economy would have a much greater tendency to go toward a Schumpeterian static state.

The tax "locks in" an investor by denying new, risky investments the profit differential over older investments which they would have in the absence of the tax. Also, over time, investments either become "safe," low-yield investments or become capital losses. Therefore an investor who owns an investment security must trade (i.e. sell his "mature" assets) in order to maintain his risk/yield position. But suppose, for example, that an investor owns an asset worth $1000 which he purchased some years ago for $500. The security pays (dividends and expected gain) 10% annually on his original $500, or 5% annually on the current $1000 value. Another asset is available on which the expected return is 5.5% annually. The investor would trade for this new asset except that he must give up part of his capital to pay the capital gains tax upon selling his old asset. The 5.5% return, then, must be reduced by the rate of tax which the investor must pay to buy that asset. This rate is the tax rate (25% if the investor is in the 50% marginal ordinary income bracket) times the accrued gain as a proportion of the old asset's current value.
In this case this amounts to

25% x (1000 - 500)/1000 = 25% x 500/1000 = 12.5%

Therefore

5.5% x (100% - 12.5%) = 5.5% x 87.5% = 4.8%

which is less than the 5% the old asset pays, and the investor is forced by the tax to keep his old asset. Thus, the exchange says, the investor is locked in by the tax.

But economists have objected to this analysis for several reasons. For one thing, the tax rate on capital gains is at most half that on ordinary income. This gives taxpayers a large incentive to enter the market to get capital gains and so to get additional funds which are taxed at the preferential rate. Since these people desire capital gains to the exclusion of dividends, they are willing to invest in the equities of corporations which may not be in a position to pay dividends until a long period of growth has passed. Therefore this desire for capital gains produces what Walter Heller has termed a "flock-in" effect, in that new funds "flock" into the market, and these new funds are concentrated in the "risky" equity issues about which the exchange (and others) are so concerned.

Therefore, the "locked-in" effect as the exchange formulates it would seem to be of diminished importance, because the people whose funds "flock into" the market are not only not deterred from selling by the tax, but are encouraged to sell by the difference between the rate on ordinary income and the rate on gains. This difference in rate gives the investor a much greater amount for consumption or for further investment if he realizes gains than if he gets even quite high dividends. Therefore, Heller and others say, the "flock-in" effect and the preferential rate probably render the "locked-in" effect as the exchange formulates it practically valueless for explaining market behavior. But, they say, there may be a locked-in effect which is of importance in making markets less fluid than they would otherwise be.
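The exchange's numerical example above (an asset worth $1000 bought for $500, a 25% gains tax rate, and a 5.5% alternative) can be checked with a short calculation. The function and variable names below are illustrative only; the figures are the ones given in the text.

```python
def after_tax_alternative_return(alt_return, gains_tax_rate, basis, current_value):
    """Return on the alternative once the gains tax on the old asset is paid.

    The tax due, as a fraction of current value, is the gains tax rate times
    the accrued gain divided by the current value.
    """
    tax_share = gains_tax_rate * (current_value - basis) / current_value
    return alt_return * (1.0 - tax_share)


old_asset_return = 0.05  # 5% on the current $1000 value
effective = after_tax_alternative_return(
    alt_return=0.055, gains_tax_rate=0.25, basis=500.0, current_value=1000.0
)
# The investor is "locked in" when the alternative, net of tax, pays less.
locked_in = effective < old_asset_return
print(f"effective alternative return: {effective:.4%}, locked in: {locked_in}")
```

With these inputs the tax takes 12.5% of the capital, so the 5.5% alternative is worth a little over 4.8% on the original capital, which is the comparison the exchange's argument rests on.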

This "locking-in" is possibly caused by two facets of the tax law: the six-month "holding period" and the forgiveness of the tax at death. The "holding period" (the period an investor must hold an asset before he can get the preferential rate on capital gains) is significant because it may cause an investor who would prefer to sell an asset after, say, three months to hold it an extra three months before he sells it. To the extent that this consideration keeps him from trading for an asset with a higher expected yield (of either dividends or capital gains), he is "locked in" by the tax.

Forgiveness of the tax at death may significantly impede capital flow because older investors may decide that avoiding the loss of capital caused by paying the tax is a better course of action than trading for a stock with better prospects. They will allow the assets they have to pass into their estates so that their heirs may trade these same assets without paying the tax. Thus as the investor ages, this provision should cause his capital to be less and less mobile.

This last is the major hypothesis the writer wishes to test in his thesis. In order to test it adequately, however, it is necessary to test many other theses which are significant in the literature. This is necessary because the effects of these may otherwise be "mixed" with the effects of age.

A Test of the Locked-In Effect

Before one can talk about the locked-in effect at all, it is necessary to have a model which explains the investor's determination of whether or not to sell an asset. The "simple model" of WAIS Paper 645-011 (revised and included as an appendix to this paper) has no behavioral parameters, but it does provide many of the variables which an investor would presumably consider in determining whether or not to sell the ith asset. These variables are:

1. the dividend rate on the ith asset ($D_i / (P_i x_i) = D_i / V_i$)
2. the dividend rate on alternatives ($a_1$)
3. the investor's marginal ordinary income tax rate ($S$)
4. the investor's marginal capital gains tax rate ($r$)
5. the appreciation rate of the ith asset ($(P_{it} - P_{i,t-1}) / P_{i,t-1}$)
6. the appreciation rate on alternatives ($a_2$)
7. the investor's age (from Z)
8. the amount of risk associated with the ith asset.

There is at least one consideration which the model does not take into account, because recourse to the model at all assumes some interest and acumen. But some investors do have more acumen and interest in maximizing their portfolio than others. One would assume, then, that some variable needs to be included to capture this. The Federal Reserve study indicated that the occupation of the investor was a very significant indication of the amount of acumen an investor has, so that it should be included in any behavioral model.

The problem of testing the locked-in effect can be viewed as a problem of estimating the probability that the investor will sell an asset. Let the dependent variable have the value 1 if an asset with a capital gain was sold and 0 if the asset was not sold. If a variable with a negative sign explains a significant amount of the variation in this dependent variable, then one would suppose that the variable was very important in determining the amount of lock-in inherent in an asset and the investor who holds it. But the variables in the model are not at all independent and probably work together in encouraging or discouraging the investor's sale of an asset. Therefore the model which explains the probability of sale of an asset should use certain of these variables in combination with others. If it is postulated that the relationship between these variables and the dependent variable is linear, then the parameters to be estimated are the $B_i$ in a regression of the form

(a) $\Pr(\text{Sale}) = B_0 + B_1 X_1 + B_2 X_2 + \dots + B_n X_n + U$

It is only necessary, then, to specify the variables which form the $X_i$ and the expected signs of the $B_i$.
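A regression of form (a) with a 0/1 dependent variable is what is usually called a linear probability model. The following is a minimal sketch for a single regressor only, fitted by ordinary least squares; the data are made up for illustration and the function name is hypothetical. The full specification in the text has several regressors and would need a multiple-regression routine.

```python
def fit_linear_probability(xs, sold):
    """Least-squares fit of sold = b0 + b1 * x + u for one regressor.

    `sold` holds 1 if the asset was sold and 0 otherwise, so a fitted value
    is read as an estimated probability of sale.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(sold) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, sold))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Made-up observations: x is some lock-in variable, sold is the 0/1 outcome.
x_values = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
sold_flags = [1, 1, 1, 0, 1, 0, 0, 0]

b0, b1 = fit_linear_probability(x_values, sold_flags)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

A negative slope here plays the role of a negative $B_i$ in (a): larger values of the variable go with a lower estimated probability of sale. One known limitation of the linear form is that fitted values can fall outside the interval from 0 to 1.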
The $X_i$, as was mentioned earlier, are variables from the "Simple Model," singly or in combination. The first variable in the regression should be the age of the taxpayer weighted by the proportion of the value of the asset which is a capital gain to the taxpayer. This variable is important because the present value of the tax saving is primarily a function of two variables: age and the amount of gain which has accrued on the asset. The more advanced the age of the taxpayer and the higher the proportion of the asset's value which is capital gain, the greater is the value of the tax saving through forgiveness at death. Since the higher the present value of the tax saving, the less likely the taxpayer is to sell the asset, the expected sign of $B_1$ is negative. The first variable, then, is

(b) $X_1 = A_t G_i / P_{it}$

where $A_t$ = the age of the taxpayer in time t, $G_i$ = the amount of capital gain which has accrued on one unit of the ith asset up to time t, and $P_{it}$ = the price of the ith asset in time t.

Another variable which should be extremely important in encouraging an investor to sell or to keep an asset is the ratio of the yield from the asset to the yield from good alternative investments. Accepting the argument that a reasonable yield standard for investments to meet is that the yield be as great as the return on "yield" mutual fund shares, one should use (at first glance)

(c) $r_{it} / a_{1t}$

as $X_2$, where

$r_{it}$ = (dividends in time t from the ith asset) / [(price of the ith asset) x (number of units of the ith asset)]

and

$a_{1t}$ = (total dividends paid by "yield" mutual fund shares in time t) / (total value of "yield" shares in time t).

But this variable suffers from the defect that $a_{1t}$ is not the real rate of return the investor can get from alternatives, because he must pay a capital gains tax in order to buy those alternative investments. This can perhaps be best seen if we equate the total income the investor gets from the ith asset and the total income from alternatives.
For convenience, let us set $x_{it} = 1$. The total income from the ith asset in time t (after taxes) is $r_{it} P_{it} (1 - S_t)$, where $S_t$ = the investor's marginal ordinary income tax rate. The total income from alternatives to the ith asset is the rate of return from alternatives times the amount of capital available for investment. This is

$a_{1t} (P_{it} - r_t G_i) = a_{1t} (P_{it} - T_{it})$

where $r_t$ is the investor's marginal capital gains tax rate in time t and $T_{it}$ is the amount of the tax due on the ith asset if the investor sells it. Equating these two amounts of income, then, we have

(d) $r_{it} P_{it} (1 - S_t) = a_{1t} (P_{it} - T_{it})$

Therefore

(e) $r_{it} (1 - S_t) = a_{1t} \frac{P_{it} - T_{it}}{P_{it}} = a_{1t} \left(1 - \frac{T_{it}}{P_{it}}\right)$

and $a_{1t} (1 - T_{it}/P_{it})$ is the true alternative to $r_{it} (1 - S_t)$. The variable which should be used, then, is

(f) $X_2 = \frac{r_{it} (1 - S_t)}{a_{1t} (1 - T_{it}/P_{it})}$

But investors may react slowly to yield changes on assets they hold for the "long run." In other words, they may base their decisions on yield records in times previous to t. Therefore one should also include (g) $X_3$, the corresponding variable based on the yield record of earlier periods, where n is the number of periods the investor has held the asset. The effect of $X_2$ and $X_3$ on the probability of selling an asset is obviously that the higher $X_2$ or $X_3$ is, the less likely the investor is to want to sell the asset. Therefore the coefficient of each of these variables should be negative.

Another determinant of whether an investor will want to sell an asset is the rate of appreciation of the value of the asset in relation to the rate of appreciation on other assets. Again the simple rate of appreciation on "growth" mutual funds (a reasonable standard alternative) is not by itself indicative of the two alternatives open to the taxpayer. This can be shown by an argument similar to that given on alternative yield rates. The rate of accumulation for a single unit of the ith asset is

$a_{it} = \frac{P_{it} - P_{i,t-1}}{P_{i,t-1}}$

The rate of accumulation on "growth" mutual fund shares is

$a_{2t}$ = (total capital gains on "growth" mutual fund shares in time t) / (total value of "growth" mutual fund shares in time t)

Equating the total amount of capital gains in time t on a single unit of the ith asset with that of its alternative, then,

(h) $a_{it} P_{it} = a_{2t} (P_{it} - T_{it})$

as in equation (d). Therefore

(i) $a_{it} = a_{2t} \frac{P_{it} - T_{it}}{P_{it}} = a_{2t} \left(1 - \frac{T_{it}}{P_{it}}\right)$

and $a_{2t} (1 - T_{it}/P_{it})$ is the true alternative to $a_{it}$. The variable which should be used, then, is

(j) $X_4 = \frac{a_{it}}{a_{2t} (1 - T_{it}/P_{it})}$

Again, however, investors may react more slowly to changes in appreciation rates. Therefore (k) $X_5$, the corresponding variable for earlier periods, should also be included. Since an investor would presumably not want to sell an asset with a high $X_4$ or $X_5$ value, the coefficient of each should be negative.

The relative value of the marginal ordinary income tax rate and the marginal capital gains tax rate poses a problem. Both are functions of income and should be well approximated by an income variable. But $\Pr(\text{Sale}) = f(\text{Income})$ is probably a quadratic function. The need to maintain consumption would encourage investors with low income (especially low transitory income) to sell their assets and consume out of the gain or even their original capital. At the same time, a person with a high income is encouraged to realize capital gains because the ratio of marginal capital gains tax rates to marginal ordinary income tax rates falls from .5 over most of the income scale to .2747 when marginal ordinary income tax rates reach 91%, as they did during the years of our study. Therefore one is hard put to determine what the expected sign of the coefficient should be. Atkinson, in an earlier study of Wisconsin tax forms, showed, however, that people who owned publicly traded non-local equity issues tended to have above-average incomes. In addition, Lampman showed that half of the common stocks on the New York Stock Exchange were owned by the top 1% of income recipients.
Therefore the sign of the coefficient will probably be positive in the ranges of our sample. The probability of a positive sign will be closer to 1 if the population of assets sampled is limited to stocks listed on the New York Stock Exchange, as is planned. At any rate,

(l) $X_6 = I_t$

where $I_t$ = the income (not including capital gains) of the investor in time t.

The best measure of the riskiness of the asset would seem to be the one used as the index in the "simple model,"

(m) $X_7 = R_{it} = \frac{\sigma_i^2 - u_i}{\sigma_m^2 + \bar{\sigma}^2 - u_m}$

where the variances $\sigma^2$ and the price-trend slopes $u$ are those defined in Appendix A. A high value of $R_{it}$ would indicate a very high level of risk. Therefore, if the investor does wish to minimize the risk in his portfolio, an asset with a high $R_{it}$ will have a high probability of being sold, and the expected coefficient should be positive.

The remaining variable to be specified is an indication of the acumen and interest of the investor in his portfolio. The Morgan study showed that professional people and executives in the investment industry are much more cognizant of the various factors which influence the value of their portfolios or which should influence their investment decisions. But our data do not include occupation information detailed enough for us to discriminate between executives in various industries and do not include investment analysts as a separate profession. However, if one assumes that the education of professional people and executives better qualifies them for making investment decisions than nonprofessionals or non-executives, then a variable

(n) $X_8$ = 1 if the investor's occupation is coded as 18 or lower or if he is a non-farm self-employed business man (i.e., if he is a professional, an executive, or a non-farm business man);
    $X_8$ = 0 if the investor's occupation is coded as any other category

should provide some index of business acumen. The more acumen an investor has, the greater is the probability of a trade, because he will always be looking for new opportunities to invest his capital to better advantage.
The person with less acumen and interest would presumably be much more prone to leave his capital in its present form rather than go to the trouble of selling the asset he owns and reinvesting the capital in alternative assets. Also, he may well, in his ignorance of alternatives, decide to "leave well enough alone" and retain the asset in his portfolio. Therefore the expected sign of the coefficient is positive.

These variables should account for most of the variation in P(S). The disturbance term, however, will include the "taste" of the investor in preferring to hold one asset over another for some reason which is not related to the financial characteristics of the asset. For example, a minister might prefer to hold stock in a religious supply house simply because he wishes to encourage such enterprises rather than because it is a really good investment. Or a lineman of the telephone company might decide to hold telephone company stock merely because he can "watch after his investment," although one can hardly understand how much of the company's operation he can oversee from the top of a telephone pole. Or a person may decide to hold telephone company stock because "almost everybody has one" even though the market has fully capitalized this growth in the years before he purchased his stock.

At any rate, these variables should account for enough of the variation in P(S) so that the true coefficient of the most important variable in this study should be estimated well. If the coefficient for age is significant and negative, one can say that age does affect the probability of sale and impedes sale. In that case the forgiveness of the capital gains tax at death probably does lock in investors. If it is significant and positive, one will have to seek out some better theoretical or statistical explanation for this result.
It may be that the relationship is quadratic or of higher power so that a multiplicative estimation process will give a better fit to the data or that age should be divided into categories and a coefficient estimated for each age group because of some behavior peculiar to some age group which is not explained by the regression. Only running a regression will show whether such problems exist or not. The Population of Assets to be Sampled This study is basically a study in specific market imperfections. Because of the desire to study specific market imperfections, it is necessary to delete assets whose markets are always very imperfect or whose nature gives rise to other imperfections included in the disturbance term. An example of a generally imperfect market is that for small businesses or real estate. To talk about the amount of unrealized gain on such an asset is somewhat difficult if the current price of the asset is unknown. In addition one can hardly equate the actual sale of such an asset with the propensity of an investor to sell it if the investor must look for a buyer for six or seven months after he decides to sell. Therefore the asset chosen must be one which is in many respects similar to its fellows and for which an organized market exists so that the price can be determined with certainty and so that sale can follow the desire to sell almost immediately. In addition it is desirable that the "taste" differentiation among single assets be minimized. While the section which discussed the disturbance term indicated that the "taste" factor can enter into the desire to sell any asset, the "taste" factor is probably most important in selling real assets. The investor may hold an asset because it was a business started by his grandfather, because it was the first house he and his wife lived in after their marriage, or because he spent his boyhood there. 
To talk about estimating the probability of selling such an asset with the data available in the project is probably meaningless. Since these things are true, the asset chosen should be one for which the most perfect market possible is available and which has as little of the "taste" factor associated with it as possible. Common stocks of companies listed on the New York Stock Exchange would seem to meet these criteria better than any other class of assets. In addition, more complete data are available for this class of assets than for any other. Therefore these assets would seem to be ideal for this study.

APPENDIX A

A Simple Model of Optimal Asset Management Including the Investor's Estate Motive

In traditional economic analysis of asset management, the analysis begins with a utility function of the investor similar to

(1) $U_t = U(V_1, V_2, \ldots, V_E)$

where $U_t$ = utility in time t and $V_i$ = the E various service flows from assets used for present consumption as well as for future consumption (i.e., from investment assets).

Given this utility function, the investor's problem is to maximize $U_t$ subject to an income constraint of the form

(2) $R_t = Y_t - E_t = 0$

where $Y_t$ = the total income of the investor in time t and $E_t$ = the total spent by the investor on consumption goods in time t plus the total amount saved (= the amount invested) in time t.

When one introduces an estate motive into the analysis so that the problem of the "locked-in" effect of forgiveness of the capital gains tax at death can be considered, the analysis becomes almost hopelessly complicated. Holt and Shelton, therefore, try to solve the problem analytically by formulating some decision rules an investor might follow in deciding whether it was optimal to sell or to hold an asset.* Their analysis, however, was admittedly inadequate in that differences in risks among assets were not considered.
Also, while their analysis tells the investor how much of a difference in yield is required in order to justify switching assets of equal risk, it does not provide any simple method of determining the optimality of the entire portfolio. The analysis in this paper provides a simple way of accomplishing this objective.

The analysis stems from the utility maximization model (equations 1 and 2) given above. If the budget constraint (equation 2) is binding, the investor can increase his utility by increasing his income and his wealth. The following linear programming model provides a simple way of maximizing the current value of the portfolio. While there is some question about whether this is what the investor is trying to maximize when the estate motive is present, the investor must decide what he wants to maximize, and if the present value of future wealth is taken into account, maximizing present value seems most sensible.

*Holt, Charles C. and Shelton, John P., "The Lock-in Effect of the Capital Gains Tax," National Tax Journal, XV (December, 1962): 337-352.

In order to allow the model to concentrate on investor decisions, and especially upon the decision to exchange an asset presently held for another, we shall assume that non-property income and consumption expenditures are exogenously determined. The first "process" in the model, then, is

(3) $W_{1t} = E(Y_{1t}) - E(C_{1t})$

where
$E(Y_{1t})$ = the expected non-property income of the investor during time period t (the determination is made at the beginning of time t) and
$E(C_{1t})$ = the expected consumption expenditures of the investor during time t.

We shall assume throughout the analysis that $E(Y_{1t}) = Y_{1t}$ and that $E(C_{1t}) = C_{1t}$. The rest of the processes in the model, then, are concerned with the n-1 assets in the investor's portfolio and are the really relevant processes.
For the ith asset (i = 2,3,...,n), the investor must determine

(4) $W_{it} = E(y_{it}) + E(g_{it}) + E(h_{it})$

where
$E(y_{it})$ = the expected yield of the ith asset during time period t less the expected yield from that amount of money in its best alternative use,
$E(g_{it})$ = the expected amount of capital gain from the ith asset during time period t less the expected gain from that amount of money in its best alternative use,
$E(h_{it})$ = the expected value of the tax saving if the investor decides not to sell the ith asset, and
$R_{it}$ = a risk index associated with the ith asset.

In order to analyze the decisions of the investor, some assumption must be made about the way the investor forms his expectations. Decisions under three different methods of expectation formation will be examined in the following analysis. First we shall examine the case in which

(a) $E(\theta_{it}) = \theta_{it-1}$

where $\theta_{it-1}$ stands for $y_{it-1}$, $g_{it-1}$, or $h_{it-1}$. Then we shall examine the case in which

(b) $E(\theta_{it}) = \sum_{R=1}^{L} \frac{1}{L}\,\theta_{i(t-R)}$

where L = the number of periods during which the investor has held the ith asset. Finally we shall examine the case in which

(c) $E(\theta_{it}) = \sum_{R=1}^{L} \frac{1}{L}\,\theta_{i(t-R)}\left(1 - \frac{R-1}{L}\right)$

Obviously each of these methods of expectation formation is simpler than the process an investor might use to form his own expectations. He (or his investment analyst) might use highly complex and esoteric formulae to determine his optimal investment decisions. On the other hand, these esoteric methods used by investment analysts often involve computations much like (a), (b), or (c), and many investors base their expectations on such items as a chance conversation with the manager of the supermarket. In view of these facts, the expectations formed by these methods may not be much different from the expectations actually generated by successful investors.

The term "best alternative use" needs better definition before beginning the analysis.
There are several ways of determining the best alternative use. One might decide that the experience of the stock market as a whole would be a good "yardstick" by which to measure the optimality of the investor's portfolio. If the investor's assets are not doing as well as the market, one might decide that these assets are not so good as the best alternative. On the other hand, perhaps matching the market is too difficult a feat for most investors to manage. And, given that they cannot do as well as the market, many investors would have little idea about how to do better. Therefore, perhaps the experience of professional managers of assets is a better yardstick by which to judge the performance of individual investors. If they are doing better than they could do if they put their money in the hands of professional investors (i.e., in mutual funds), they are doing well. If they are doing less well than they could do in mutual funds, they should switch to other assets.

To some large extent, then, we have implicitly defined the model. It only remains to state the "naive expectation" model (case a), because the less naive models (b and c) are only functions of that model. In case (a), then,

$W_{it} = y_{i,t-1} + g_{i,t-1} + h_{i,t-1}$

$y_{i,t-1} = \frac{D_{i,t-1}(1 - S_{t-1}) - a_{1,t-1}\left[P_{i,t-1}x_{i,t-1} - \tau_{t-1}(P_{i,t-1}x_{i,t-1} - B_i)\right]}{P_{i,t-1}x_{i,t-1}}$

where
$D_{i,t-1}$ = the total amount of dividends (or interest) paid on the ith asset in time t-1,
$S_{t-1}$ = the investor's marginal income tax rate in time t-1,
$P_{i,t-1}$ = the price of the ith asset in time t-1,
$x_{i,t-1}$ = the number of shares of the ith asset the investor owned in time t-1,
$a_{1,t-1}$ = (the total amount of dividends paid to the owners of "yield" mutual fund shares in t-1) / (the total value of those "yield" shares in t-1),
$\tau_{t-1}$ = the marginal capital gains tax rate of the investor in time t-1, and
$B_i$ = the basis of the ith asset for capital gains tax purposes.
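The excess-yield term for the naive case can be checked numerically. The following is only a minimal sketch under stated assumptions: the function name, the parameter names, and the sample figures are hypothetical and do not come from the paper; the arithmetic follows the expression for the excess yield given above, with the alternative return earned only on the value left after the capital gains tax is paid.

```python
def excess_yield(dividends, price, units, basis, s_rate, tau_rate, a1):
    """Excess after-tax yield per dollar of current value (naive case).

    dividends : total dividends paid on the holding in t-1
    price     : price per share in t-1
    units     : number of shares held in t-1
    basis     : tax basis of the holding
    s_rate    : marginal ordinary income tax rate
    tau_rate  : marginal capital gains tax rate
    a1        : dividend rate on "yield" mutual fund shares
    All names are illustrative assumptions, not the paper's notation.
    """
    value = price * units
    # Tax that would fall due if the holding were sold now.
    tax_if_sold = tau_rate * (value - basis)
    # Income the after-tax proceeds would earn in the alternative.
    alternative_income = a1 * (value - tax_if_sold)
    # After-tax income from keeping the holding.
    own_income = dividends * (1.0 - s_rate)
    return (own_income - alternative_income) / value


# Hypothetical holding: 100 shares at 10, basis 400, dividends 40.
y = excess_yield(dividends=40.0, price=10.0, units=100.0, basis=400.0,
                 s_rate=0.5, tau_rate=0.25, a1=0.03)
print(round(y, 6))
```

With these made-up figures the holding earns 20 after tax, while the 850 left after the capital gains tax would earn 25.5 in the alternative, so the excess yield is negative and this term alone favors a sale.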
Similarly, the excess capital gain is

$g_{i,t-1} = \frac{(P_{i,t-1} - P_{i,t-2})\,x_{i,t-1} - a_{2,t-1}\left[P_{i,t-1}x_{i,t-1} - \tau_{t-1}(P_{i,t-1}x_{i,t-1} - B_i)\right]}{P_{i,t-1}x_{i,t-1}}$

where $a_{2,t-1}$ = (total gain made by selected "growth" mutual funds in t-1) / (total value of stock at the beginning of t-1 in such funds).

$h_{i,t-1}$ is somewhat more difficult to define. The taxpayer can effect a tax saving in several ways by waiting to realize gains. If he holds an asset over six months, he can realize a gain and be taxed at the preferential rate. If he holds the asset until he dies, he can avoid being taxed on his gains at all. If he waits until he has some losses against which to balance his gains, he can similarly avoid being taxed on the gain. In addition, he receives imputed interest on the amount of the tax liability for as long as he waits before realizing the gain.

The present value of the tax saving effected by not realizing gains until the six months waiting period is over is given by

$A_{t-1}$ = [equation illegible in the source; it is a sum from period j to period K of terms involving $S_{t-1}$, the unrealized gain $(P_{i,t-1}x_{i,t-1} - B_i)$, and the discount factor $(1 + a_{1,t-1})$]

where K = the number of periods which must elapse before the asset becomes eligible for the preferential rate and j = t. $a_{1,t-1}$ is used as an imputed interest rate.

The present value of the tax saving effected by holding the asset until death is given by

$B_{t-1}$ = [equation illegible in the source; it is a sum running to period Z of terms involving $\tau_{t-1}(P_{i,t-1}x_{i,t-1} - B_i)$ and the factor $(1 + a_{1,t-1})$]

where Z = the time in which the taxpayer reaches his current life expectancy.

The riskiness of the asset ($R_{it}$) is basically that the price of the asset may fall so that gains are wiped out or losses are incurred. One measure of this change in prices is the variance ($\sigma_i^2$) of prices over time. But this in itself is meaningless without some standard of riskiness which the asset might reasonably attain. One such standard of riskiness is the variance ($\sigma_m^2$) of the common stock market average over time. If $\sigma_i^2 = \sigma_m^2$, the riskiness of the ith asset is rather high, since common stocks are probably among the riskiest of assets.
However, Markowitz and Farrar have shown that the riskiness of the portfolio is probably more important to the investor than that of a single asset, because the portfolio can be, by diversification, less risky than many assets within the portfolio. Yet the investor is probably still wise to compare the variance of any asset to the mean variance of the assets in his portfolio ($\bar{\sigma}^2$) as well.

In addition to the variance of prices, another measure of the riskiness of the asset is the trend of prices over time. An asset with a small price variance but whose prices have steadily decreased is probably very risky. Again, the trend of the stock market over the same number of years would seem to be an intelligent standard to use. Therefore $u_i/u_m$ would also seem to be a good ratio to include in any index of the riskiness of the asset, where $u_i$ is the slope of $P_i = a + u_i t$ and $u_m$ is the slope of $m = a + u_m t$, m being the stock market average. The complete risk index, then, should probably be

$R_{it} = \frac{\sigma_i^2 - u_i}{\sigma_m^2 + \bar{\sigma}^2 - u_m}$

since the effect of the slope on the riskiness of an asset is the reverse of its sign.

Having computed all the $W_i$'s, the investor should, if he is astute, sell the assets for which $W_i$ is negative and use the proceeds of the sale plus the "proceeds" from $W_1$ to buy more units of the assets for which $W_i$ is positive. If he does this, he will have maximized $\sum_{i=1}^{n} W_i$. In comparing these sums among various taxpayers, one would compare only $\sum_{i=2}^{n} W_i$, since to compare $W_1$ would cause one to make interpersonal utility comparisons among taxpayers. Only with some very restrictive assumptions about investors' utility functions can one do this.
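The three expectation rules (a), (b), and (c) and the sell rule just stated lend themselves to a short numerical sketch. This is an illustration under assumptions, not the paper's procedure: the function names and the sample figures are hypothetical, and each history is assumed to be ordered with the most recent period first, so that `history[R - 1]` holds the value for period t-R.

```python
from typing import Dict, List, Sequence


def expect_naive(history: Sequence[float]) -> float:
    """Case (a): the expectation is last period's value."""
    return history[0]


def expect_mean(history: Sequence[float]) -> float:
    """Case (b): unweighted mean over the L periods the asset was held."""
    return sum(history) / len(history)


def expect_weighted(history: Sequence[float]) -> float:
    """Case (c): each term is scaled by (1 - (R - 1) / L), so older
    periods count for less. As in the formula stated in the text, the
    weights are not renormalized to sum to one."""
    L = len(history)
    total = 0.0
    for R, theta in enumerate(history, start=1):
        total += theta * (1.0 - (R - 1) / L)
    return total / L


def assets_to_sell(w_by_asset: Dict[str, float]) -> List[str]:
    """Sell rule: assets whose computed W is negative are sold."""
    return sorted(name for name, w in w_by_asset.items() if w < 0)


# Hypothetical four-period history, most recent first.
history = [4.0, 3.0, 2.0, 1.0]
print(expect_naive(history), expect_mean(history), expect_weighted(history))
print(assets_to_sell({"A": 0.02, "B": -0.01, "C": 0.0}))
```

Because rule (c) is not renormalized, it yields a smaller figure than rule (b) for any history of positive values; whether that shrinkage is intended is not stated in the text.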
Having formulated the model using this most naive form of expectation formation, one need only determine the $W_i$ for the L periods during which the investor has held the ith asset, find the mean of each value over those L periods (raw in case b and weighted in case c), and maximize the computed $W_i$ using the same algorithm as in case (a).

http://www.ssc.wisc.edu/wais/WAIS645015.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645015.txt

Ron Durant, 1964. Description of Tax-04 Program. November 20, 1964. WAIS paper 645-014. Programs.

Ron Durant
WAIS Paper 645-014
November 20, 1964

Report

To: WAIS Staff: Bridges, David, Groves, Lampman, Miller, Moyer, Bauman, Geffert, Roubal, Seavey, Wiegner
From: Ron Durant
Document: Description of Tax-04 Program

I. Tax-04 has been written and run in order to perform the following:

1. Consolidate the 5 reels of WAIS Tax Master Data created in Tax-01.
2. Drop all "ID" records from the Master File and thereby retain data records only.
3. Provide a listing of identification numbers for which data now exist on the Master File.

[Flow diagram of the Tax-04 run]
Tape labels shown: TAPE-SSRI-259, TAPE-SSRI-266, TAPE-SSRI-293, TAPE-SSRI-294, TAPE-SSRI-296, TAPE-SSRI-262, TAPE-SSRI-263, TAPE-SSRI-298
Inputs: PHASE I MAST. 1 OF 2; PHASE I MAST. 2 OF 2; PHASE II MAST. 1 OF 2; PHASE II MAST. 2 OF 2; PHASE III MAST. 1 OF 1
Program: TAX-04
Outputs: LISTING OF ID NUMBERS ON DATA FILE; DATA MAST. 1 OF 3; DATA MAST. 2 OF 3; DATA MAST. 3 OF 3

    MASTER RECORDS IN             151,914
    MASTER ID RECORDS DROPPED      19,282
    MASTER DATA RECORDS OUT       132,632
    IBM 1410 TIME              85 MINUTES

http://www.ssc.wisc.edu/wais/WAIS645014.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645014.txt

Ron Durant, 1964. Edit Procedures (Keypunching). December 2, 1964. WAIS paper 645-016. Programs.

Ron Durant
WAIS Paper 645-016
December 2, 1964

Edit Procedures (Keypunching)

To: Hilde Roubal and Keypunching Staff
Info: WAIS Staff: Bridges, David, Groves, Lampman, Miller, Bauman, Geffert, Moyer, Rishpan, Seavey, Wiegner
From: Ron Durant
Re: Resubmission of edits from Tax-01 (Wisconsin Income Tax Master Creation Run) and General Correction Procedure.

Corrections to identification records (I in col. 1) may be ignored, since new ID records will be forthcoming from the merged Social Security and Wisconsin Income Tax data. All ID records previously on the Tax Master tape were deleted in the Master Consolidation Run (Tax-04).

If in analyzing the edits it is determined that there is currently on the Tax Master tape an incorrect year data record for an individual, the following card entry will drop such a record from the Master File during the Edit Master Updating Run (Tax-03).

    Column(s)   Data
    1           P
    2- 9        Identification number
    10-11       Year
    12-80       Blank

If it is further desired to drop all year data records for an individual, the following card entry will be required.

    Column(s)   Data
    1           3
    2- 9        Identification number
    10-80       Blank

If any data are dropped in this manner, then new data must be repunched from the individual's folder.

Note: In the rare instance where a 4 data card Master Record for a year or years was created instead of a 2 data card Master Record, it will be necessary to delete the existing Master Record and resubmit the 2 data cards, which will create a corrected New Master Record.

If it is desired to change an identification number for a Master Record in the Edit Master Updating Run (Tax-03), the following card entry will be required.
    *Column(s)  Data
    1           C
    2- 9        Old identification number
    10-17       New identification number

Note: This entry will place the new ID number in place of the old for all of an individual's Master Records. It is very important to make sure that a Master Record for the individual under the new ID number does not already exist on the Master Tape and will not be created by data records that update the Master File in the same run that changes the ID numbers. This is crucial because all these new ID number Master Records will be put out as a supplementary Master File, sorted to sequence, and merged back in with the basic Master File in a subsequent processing. Thus, if the above stipulation is not followed, duplicate Master Records could result in the merging. Therefore, should it be determined that a Master Record already exists under the new ID number, delete the old ID number Master Record and resubmit the basic data with the new ID number punched in it, and it will be added to the already existing new ID Master Record.

In order to call records off the Edit Tape created in Tax-01, the following entry will be required. Edits will be called off the Edit Tape in the Edit Call Run (Tax-05).

    **Column(s) Data
    I Card
    2- 9        Identification number
    10-11       Year

However, care should be taken in the use of this entry to make sure that there are not duplicates on the Edit Tape. If there are, the corrections should be repunched rather than selected from the Edit Tape.

*Cards of this type (C in column 1) should be kept separately from all the other corrections and deletions, because these ID change cards will be submitted to the Edit Master Updating Run (Tax-03) as a separate file.

**Cards of this type should be kept separately from the other corrections and deletions.
These edit call cards will be used in a separate run (Tax-05) to pull off corrections from the Edit Tape.

http://www.ssc.wisc.edu/wais/WAIS645016.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645016.txt

Richard Bauman, 1964. Methods of Constructing a Life Income Measure. December 11, 1964. WAIS paper 645-019. Proposals - For Analyses, Theses, etc.

Richard Bauman
WAIS Working Paper 645-019
December 11, 1964
Draft

Methods of Constructing a Life Income Measure

The WAIS has data from the Wisconsin Tax Study covering approximately 18,000 individuals over the period 1946-1960. The number of years of observation for any given individual may vary from 1 to 15. Our sample may be described, therefore, as 1) a series of cross-sectional samples as well as 2) a variable-length sample of "panel data" with the length of successive years' observations ranging from 2 to 15. It is also possible to have nonsuccessive observations on individuals for several reasons, e.g., housewives who intermittently enter and leave the labor force.

Both of the above types of data are desirable for testing hypotheses concerning economic relationships. Cross-sectional data are valuable since they abstract from dynamic relationships which may or may not be due to individual characteristics. Successive cross-sectional data are often questioned because, while such data do measure dynamic relationships of some sort, it is impossible to relate the dynamics in a meaningful way to individuals. It is therefore desirable to have the second type of data if one is going to say anything about individual-related changes. So-called "panel data" are a solution to the problem of gaining an insight into dynamic relationships. The Wisconsin Income Tax Sample offers a comparatively long period of successive observations on a random sample of taxpayers.

For some purposes, it is desirable to look at the entire "life income" of an individual. This may be done in one of two ways:

A.
Given a large enough cross-sectional type sample, the incomes of successively older groups with a given set of characteristics may be combined into a hypothetical "life income" for those with the given characteristics. This type of "constructed" life income is based on the assumption that each age group is homogeneous with respect to the other characteristics.

B. Given a long enough time series of panel data, a "life income" for an individual or group of individuals can actually be observed. This writer is not aware of any large scale attempt to measure this.*

*Data with many years of income observations for the same individual were used by J.R. Walsh in "Capital Concept Applied to Man," Quarterly Journal of Economics (February, 1935) 255-285, but only for a few limited classes, i.e., doctors, lawyers, or engineers, most of whom graduated from the same university.

The following is a suggested treatment for the WAIS data which is intended to combine the more desirable aspects of cross-sectional and panel data in a meaningful way.

Before presenting the proposed method for deriving a measure of life income, it may be helpful to present a condensed outline of the usefulness of having such a measure. In this writer's opinion, the main areas where it would be helpful fall into 3 categories:

1. Testing the hypothesis of a difference between
   a. the distribution of income "over time"; and
   b. the distribution of income as it is usually measured.
   This would include getting some notions about the magnitude of "hard-core" poverty or "hard-core" wealth.

2. Identifying and analysing the influence of various factors, such as education, occupation, region, marital status, number in family unit, the relationship between earned and non-earned income, and others, on a broader measure of income than simply that income which flows to an individual during a one year period.

3.
Providing a basis for testing the effects of various policy parameters which are intended to have a "long-term" effect, such as:
   a. Some form of income tax averaging
   b. Public policy toward those persons who are at either end of the age-income distribution, specifically those over age 65 and those under age 25.

This writer intends to be primarily concerned with category 3.b. A thorough analysis of that category would, of course, involve paying considerable attention to the influence of those variables mentioned in category 2 as well as other variables important to these age groups.

A Proposal for Forming Estimates of Life Incomes Based on the WAIS Data

Several qualities appear desirable in the formulation of a scheme to construct "life incomes" from data such as those gathered by the WAIS.

1. The model should make use of all observations while retaining the identity of the micro-units.
2. The model should allow for the use of successive observations on the same micro-unit in order to estimate the size and importance of dynamic relationships which may be present.
3. It is also desirable to formulate a model which retains most of the simplicity of construction inherent in the constructions of life incomes based on cross-sectional data.

There are several possible disadvantages of the proposed method as it is now envisioned.

1. Suppose an entire group (those with relative deviations in Xh in the following presentation) were subject to "abnormally depressed conditions," e.g., due to structural unemployment, with respect to the overall mean, Xo, for a given period of years (less than the total number of years studied). It would seem that the resultant vector, *h, would not be a good representation of the true value of *h. However, if such a group could be described as "normally depressed," e.g., due to seasonal unemployment characteristic of the selected group's occupation, the vector *h would seem to reflect the true value of *h.

2.
The proposed method, by its construction, would eliminate the overall trend in the period covered by the sample. Trends for individuals or for the specified group, however, would be captured. In this respect the treatment would be similar to the construction of life incomes through cross-sectional data. A meaningful overall trend (if it were to be included in the * vector) would seem conceptually invalid because of the shift from a "year" time reference to an "age" time reference.

DESCRIPTION OF THE METHODS

The observed values for the incomes of a micro-unit over the years it reports may be presented in a vector:

$Y_i = [y_{i1} \; y_{i2} \ldots y_{ij} \ldots y_{ir}]$

where $y_{ij}$ is the income of the ith micro-unit (i = 1,...,n) in the jth year (j = 1,...,r).

We may compute means for each year:

$\bar{Y} = [\bar{y}_1 \; \bar{y}_2 \ldots \bar{y}_j \ldots \bar{y}_r]$ = grand mean vector of incomes.

Defining a vector of deviations as $Z_i$:

$Z_i = Y_i - \bar{Y} = \{y_{ij} - \bar{y}_j\}$ = vector of deviations from the grand means for the ith micro-unit.

Suppose that there are l groups in the sample, $1 \le l \le n$. We can define:

$Z_h = \sum_{i \in h} Z_i$ (h = 1,...,l), the sum running over all i in the hth class.

It is also useful to define a mean vector of the $Z_h$'s:

$\bar{Z}_h = H_h' Z_h$, where $H_h = [1/n_1 \; 1/n_2 \ldots 1/n_j \ldots 1/n_r]$.

The above procedure has the advantage that the most general economic conditions affecting incomes, such as those reflected in changes in the factor price level, are subtracted out. It seems reasonable to proceed, however, on the assumption that the $Z_i$'s are proportional to the $\bar{y}_j$'s rather than independent of them, therefore by defining

$x_{ij} = z_{ij} / \bar{y}_j$, with $z_{ij} = y_{ij} - \bar{y}_j$,

$X_i = [x_{i1} \; x_{i2} \ldots x_{ij} \ldots x_{ir}]$ = a vector of the relative deviations from the grand means.

The $X_i$'s may be grouped like the $Z_i$'s by defining

$X_h = \sum_{i \in h} X_i$ (h = 1,...,l), the sum running over all i in the hth class.

We may now proceed to find some descriptions of life patterns of income.

Case I - A simple life income measure

Step 1 - Find the youngest group of individuals in the sample, say all with age = k.
Step 2 - Take the mean value of this group's $x_{ij}^{(k)}$'s, where the superscript k denotes the earliest age for filing in our sample:

$\frac{1}{n_k} \sum x_{ij}^{(k)}$ ($n_k$ denotes the number sampled with age = k).

This mean, multiplied by some arbitrary mean value of income, would provide a value for the first term of

*1 = [*k  *k+1 ... *k+d ... *k+f]

where *1 is a vector of mean incomes for individuals at successive ages (k, k+1,..., k+f) and k+f is the age of the oldest group filing returns. The overall mean $\bar{Y} = \frac{1}{s} \sum_i \sum_j y_{ij}$, where s denotes the total number of observations in all years, seems to be a good value for the multiplier, since it would tend to minimize the bias inherent in Paasche and Laspeyres type indices.

Two observations on the values of *k+d at the extremes of the vector *1 should be made. 1) There would seem to be an exceedingly small number of observations. 2) While it may be expected that the value of the $x_{ij}$'s near *k+d = *k is negative, at a very low k this value would tend to be influenced highly by "property" income. This, then, is just one reason for pursuing a more sophisticated life income measure.

Step 3 - To obtain a value for *k+1, the product of $\frac{1}{n_{k+1}} \sum x_{ij}^{(k+1)}$ and $\bar{Y}$ can be taken. There are two groups which will be included in this summation: a) those individuals who filed at the kth age and the (k+1)st age, and b) those who began filing at the (k+1)st age. Those who filed at the kth age but dropped out at the (k+1)st age would have no $x_{ij}^{(k+1)}$, of course.

Step 4 - Values for *k+2 and succeeding ages may be obtained in a similar way.

Case II - A simple life income measure

It may be that Xh is found to be significantly different from Xo = [0 0 ... 0] (r zeros). It would then be desirable to form a *h for such groups. One way to proceed would be to form a *h exactly as in Case I while restricting attention to those individuals in the hth class. In this way a number of life income measures (l - a, where a is the number of groups with Xh not significantly different from Xo) will be found.
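The Case I construction, which Case II repeats within each class, can be sketched as follows. This is a minimal illustration under assumptions: the record layout `(unit, year, age, income)`, the function name, the `in_group` argument, and the sample figures are all hypothetical. Following the steps as stated, the value for each age is the mean relative deviation at that age multiplied by the overall mean income, so it is negative for ages that fall below the yearly means. The yearly means and the overall mean are always taken over the full sample, even when attention is restricted to one class.

```python
from collections import defaultdict


def life_income_profile(records, in_group=None):
    """Age profile built from relative deviations (Case I sketch).

    records  : iterable of (unit, year, age, income) tuples
    in_group : optional predicate on unit; if given, only those units
               enter the age averages (the Case II restriction)
    Returns {age: mean relative deviation at that age * overall mean}.
    """
    records = list(records)

    # Yearly means over the whole sample (the grand mean vector).
    incomes_by_year = defaultdict(list)
    for _, year, _, income in records:
        incomes_by_year[year].append(income)
    year_mean = {yr: sum(v) / len(v) for yr, v in incomes_by_year.items()}

    # Overall mean over all s observations in all years.
    overall_mean = sum(r[3] for r in records) / len(records)

    # Relative deviations from the yearly mean, collected by age.
    deviations_by_age = defaultdict(list)
    for unit, year, age, income in records:
        if in_group is not None and not in_group(unit):
            continue
        x = (income - year_mean[year]) / year_mean[year]
        deviations_by_age[age].append(x)

    return {age: overall_mean * sum(xs) / len(xs)
            for age, xs in sorted(deviations_by_age.items())}


# Hypothetical sample: two units observed in two years.
sample = [
    (1, 1950, 20, 100.0), (1, 1951, 21, 200.0),
    (2, 1950, 30, 300.0), (2, 1951, 31, 200.0),
]
print(life_income_profile(sample))
print(life_income_profile(sample, in_group=lambda unit: unit == 1))
```

Units that stop filing simply contribute nothing at later ages, and units that begin filing later enter at their first reported age, which matches the two groups described in Step 3.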
However, this method seems to be undesirable for two reasons: 1) The groups must be defined prior to constructing the measure. In order to avoid missing a "good explainer" of significantly different life incomes, it would be nice to have very homogeneously defined groups. 2) This would seem to lead to an astonishingly large and cumbersome volume of statistics even if only the mean vectors (*_k) and the covariance matrices ($\sigma^2_{*k}$) were computed.

Case III - A simple life income measure

It is possible to conceive of the observations in the (k+d)th age as one "subsample." By defining a dependent variable such as

$$w_i^{(k+d)} = x_{ij}^{(k+d)} - \frac{1}{n_{k+d}} \sum x_{ij}^{(k+d)} = x_{ij}^{(k+d)} - *_{k+d} / \bar{Y}$$

it would be possible to select those independent variables, or groups of variables, which are not significant in explaining variation in $w_i^{(k+d)}$ by using the testing procedures of linear regression analysis. It may prove fruitful to estimate several regressions of this sort for the ages k+5, k+10, ..., k+70, ..., in order to check for the consistent appearance of some variables or sets of variables as "significant" or not "significant." For those variables which are adjudged significant, stratification and reversion to the method of Case II would seem to be a good approach.

http://www.ssc.wisc.edu/wais/WAIS645019.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645019.txt

Gene Moyer, 1964. A Simple Model of Optimal Asset Management Including the Investor's Estate Motive. November 4, 1964. WAIS paper 645-011. Analysis.

Gene Moyer
WAIS 645-011
November 4, 1964
Draft

A Simple Model of Optimal Asset Management Including the Investor's Estate Motive

In traditional economic analysis of asset management, the analysis begins with a utility function of the investor similar to

(1) $U_t = U(V_1, V_2, \ldots, V_E)$

where $U_t$ = utility in time t and the $V_i$ = the E various service flows from assets used for present consumption as well as for future consumption (i.e., from investment assets).
Given this utility function, the investor's problem is to maximize $U_t$ subject to an income constraint of the form

(2) $R_t = Y_t - E_t = 0$

where $Y_t$ = the total income of the investor in time t and $E_t$ = the total spent by the investor on consumption goods in time t plus the total amount saved (= the amount invested) in time t.

When one introduces an estate motive into the analysis so that the problem of the "locked-in" effect of forgiveness of the capital gains tax at death can be considered, the analysis becomes almost hopelessly complicated. Holt and Shelton, therefore, try to solve the problem analytically by formulating some decision rules an investor might follow in deciding whether it was optimal to sell or to hold an asset.* Their analysis, however, was admittedly inadequate in that differences in risks among assets were not considered. Also, while their analysis tells the investor how much of a difference in yield is required in order to justify switching assets of equal risk, it does not provide any simple method of determining the optimality of the entire portfolio. The analysis in this paper provides a simple way of accomplishing this objective.

The analysis stems from the utility maximization model (equations 1 and 2) given above. If the budget constraint (equation 2) is binding, the investor can increase his utility by increasing his income and his wealth. The following linear programming model provides a simple way of maximizing the current value of the portfolio. While there is some question about whether this is what the investor is trying to maximize when the estate motive is present, the investor must decide what he wants to maximize, and if the present value of future wealth is taken into account, maximizing present value seems most sensible.

*Holt, Charles C. and Shelton, John P., "The Lock-in Effect of the Capital Gains Tax," National Tax Journal, XV (December, 1962): 337-352.
In order to allow the model to concentrate on investor decisions, and especially upon the decision to exchange an asset presently held for another, we shall assume that non-property income and consumption expenditures are exogenously determined. The first "process" in the model, then, is

(3) $W_{1t} = E(Y_{1t}) - E(C_{1t})$

where $E(Y_{1t})$ = the expected non-property income of the investor during time period t (the determination is made at the beginning of time t), and $E(C_{1t})$ = the expected consumption expenditures of the investor during time t. We shall assume throughout the analysis that $E(Y_{1t}) = Y_{1t}$ and that $E(C_{1t}) = C_{1t}$.

The rest of the processes in the model, then, are concerned with the n-1 assets in the investor's portfolio and are the really relevant processes. For the ith asset (i = 2,3,...,n) the investor must determine

(4) $W_{it} = E(y_{it}) + E(g_{it}) + E(h_{it})$

where
$E(y_{it})$ = the expected yield of the ith asset during time period t less the expected yield from that amount of money in its best alternative use;
$E(g_{it})$ = the expected amount of capital gain from the ith asset during time period t less the expected gain from that amount of money in its best alternative use;
$E(h_{it})$ = the expected value of the tax saving if the investor decides not to sell the ith asset.

In order to analyze the decisions of the investor, some assumption must be made about the way the investor forms his expectations. Decisions under three different methods of expectation formation will be examined in the following analysis. First we shall examine the case in which

(a) $E(\theta_{it}) = \theta_{it-1}$, where $\theta_{it-1} = y_{it-1}$, $g_{it-1}$, or $h_{it-1}$.

Then we shall examine the case in which

(b) $E(\theta_{it}) = \sum_{R=1}^{L} \frac{1}{L}\, \theta_{i(t-R)}$

where L = the number of periods during which the investor has held the ith asset.
Finally we shall examine the case in which

(c) $E(\theta_{it}) = \sum_{R=1}^{L} \frac{1}{L}\, \theta_{i(t-R)} \left(1 - \frac{R-1}{L}\right)$

Obviously each of these methods of expectation formation is simpler than the process an investor might use to form his own expectations. He (or his investment analyst) might use highly complex and esoteric formulae to determine his optimal investment decisions. On the other hand, these esoteric methods used by investment analysts often involve computations much like (a), (b), or (c), and many investors base their expectations on such items as a chance conversation with the manager of the supermarket. In view of these facts, the expectations formed by these methods may not be much different from the expectations actually generated by successful investors.

The term "best alternative use" needs better definition before beginning the analysis. There are several ways of determining the best alternative use. One might decide that the experience of the stock market as a whole would be a good "yardstick" by which to measure the optimality of the investor's portfolio. If the investor's assets are not doing as well as the market, one might decide that these assets are not so good as the best alternative. On the other hand, perhaps this is too difficult a feat to manage for most investors. And, given that they cannot do as well as the market, many investors would have little idea about how to do better. Therefore, perhaps the experience of professional managers of assets is a better yardstick by which to judge the performance of individual investors. If they are doing better than they could do if they put their money in the hands of professional investors (i.e., in mutual funds), they are doing well. If they are doing less well than they could do in mutual funds, they should switch to other assets.

To some large extent, then, we have implicitly defined the model.
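The three expectation rules (a), (b), and (c) can be written out as a short Python sketch. The function names and the convention that `history[0]` holds the value for period t-1, `history[1]` the value for t-2, and so on back L periods, are assumptions made here for illustration; the formulas themselves are those stated in the text.

```python
def expect_naive(history):
    """Rule (a): the expectation equals last period's observed value."""
    return history[0]


def expect_mean(history):
    """Rule (b): simple mean of the L periods the asset has been held."""
    L = len(history)
    return sum(history) / L


def expect_weighted(history):
    """Rule (c): each lag R gets weight (1/L) * (1 - (R - 1)/L).

    As written in the text, these weights decline linearly with the lag
    and sum to (L + 1) / (2L) rather than to 1.
    """
    L = len(history)
    return sum(theta * (1 - (R - 1) / L)
               for R, theta in enumerate(history, start=1)) / L


# Made-up values of one component (yield, gain, or tax saving) for t-1 and t-2.
history = [4.0, 2.0]
naive = expect_naive(history)
mean = expect_mean(history)
weighted = expect_weighted(history)
```

With only one period of history (L = 1) all three rules give the same expectation.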
It only remains to state the "naive expectation" model (case a) because the less naive models (B and C) are only functions of that model. In the case (a), then Wit S yit-l + git-1 + hit-1 Dit-i(l-st-i) - alt-1[Pit-1xit-1 - Tt-1 (Pit-iRit-1 Bi)] it-1 Pit-lxit-l where Dit-1 the total amount of dividends (or interest) paid on the ith asset in time t-l St-l , the investor's marginal income tax rate in time t-a. Pit-l = the price of the ith asset in time t-1 xit-1 , the number of shares of the ith asset the investor owned in time t-1. The total amount of dividends paid to the owners of "yield" sit-1 mutual fund shares in t-l The total value of those "yield" shares in t-1 Tt-l M the marginal capital gains tax rate of the investor in time t-1. Bi - the basis of the ith asset for capital gains tax purposes. (Pit-1 - Pit-2) xit-1 alt-ilpit-lxit-1 -/T(Pit-lxit-i - Bi)] it-i Pit-lxit-l where alt = total gain made by selected "growth" mutual funds in t-l Total value of stock at the beginning of t-1 in such funds hit-1 is somewhat more difficult to define. The taxpayer can effect a tax saving in several ways by waiting to realize gains. If he holds an asset over six months, he can realize a gain and be taxed at the preferential rate. If he holds the asset until he dies he can avoid being taxed on his gains at all. If he waits until he has some losses against which to balance his gains, he can similarly avoid being taxed on the gain. In addition, he receives imputed interest on the amount of the tax liability for as long as he waits before realizing the gain. The present value of the tax saving effected by not realizing gains until the six months waiting period is over is given by At-i '~ El j (st-l{pit-1%it-l - Bi) (l + alt-l)l ~l it-I where K - the number of periods which must elapse before the asset becomes eligible for the preferential rate and j - t. 
tilt l is used as an imputed interest rate.The present value of the tax-saving effected by holding the asset until death is given by L Bt-1 " 4+1 (l+rit-1)Z-J 1r t-1(pit-lXit-1 - Bi) (1 + Ctlt-1)] where Z - the time in which the taxpayer reaches his current life expectancy. But there is same level of risk involved in holding this asset for even a short period. This risk is basically that the price of this asset may fall so that gains are wiped out. One measure of this change in prices is the variance (a it) of prices. The ratio of this to the variance of the stock market (apt) average provides a measure of the riskiness of the asset. The higher is alt, however, the riskier is the given asset. (1 - 2 )therefore () provides a measure of the lack of risk in a given asset. This measure is defective, however, in that some assets whose variance is high may still provide some security in that their prices tend to vary in some way different from that of the stock market average. They may be, then, "hedges" against inflation or deflation. (1 - pit) provides some measure of this where piMt is the correlation coefficient between the prices of the ith asset and the stock market average. The smaller is the correlation coefficient, the smaller is the riskiness of the asset. Therefore the complete variable is 2 (At-1 + Bt-1) (1 alt-) (1 + pmt-1) 8(h it s hit-i ~' amt-l Pit-lXit-l Having computed all the Wi s, the investor should if he is astute, sell the assets for which Wi is negative and use the proceeds of the sale plus the "proceeds" from Wi to buy more units of the Wi which are positive. If he does this, he will have maximized the iEl WiW. In comparing the (IEIWi) among various taxpayers, one would only compare ( I E 2 Wi) since to compare Wi would cause one to make interpersonal utility comparisons among taxpayers. Only with some very restrictive assumptions about investors' utility functions can one do this. 
Having formulated the model using this most naive method of expectation formation, one need only determine the Wi for the L periods during which the investor has held the ith asset, find the mean of each value over those L periods (raw in case b and weighted in case c), and maximize the computed Wi using the same algorithm as in case a.

http://www.ssc.wisc.edu/wais/WAIS645011.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645011.txt

Gene Moyer, 1965. A Report on the U.W. Economic Study, Summer, 1964. January 13, 1965. WAIS paper 645-025. Analysis; General Papers (Regarding WAIS).

Gene Moyer
WAIS 645-025
January 14, 1965
Draft

A Report on the University of Wisconsin Economic Study, Summer, 1964

This report has two main objectives. The first, of course, is to acquaint you with the type of data gathered by the study. The second is more ambitious. We hope that the statistics in this report will acquaint you with the people of Wisconsin. Where did they grow up? Was it in the state or was it in some other part of the United States or the world? Was it on a farm, in a small town, or in a large city? How much education do they have? In what occupation grouping are most of them employed? How much do they earn? How much do they expect to earn in the future? What are their attitudes toward investing money? In what assets do they have their money invested at the present time? These and many other questions are answered in this report.

One word of caution is necessary, however. As we mentioned before, this sample of individuals was selected according to certain characteristics which the income tax rolls indicated. A greater percentage of people with high taxable incomes were selected than would have been found if we had sampled without regard to the characteristics which people possessed. Therefore to some extent our respondents represented a group which was better off than most of Wisconsin's population. This fact must always be kept in mind when interpreting any of our statistics.
This report is divided into four general categories which describe our respondents as follows: I. The Lives of Our Respondents from Birth to Marriage II. The Work of Our Respondents and The Rewards from Their Work. III. The Homes Our Respondents Live In IV. Our Respondents Look Toward the Future I. The Lives of Our Respondents from Birth to Marriage Our respondents were born between __ and __ years ago; their average age is __ years. __% of them are men __% are women. We know very little about their parents, but we do know that __% of their fathers were working as (majority occupation of group) while our respondents were growing up. The rest of their fathers worked at: ______________ __% ______________ __% ______________ __% ______________ __% Most of their fathers were rather highly educated for their own time, although they were not so well educated as our respondents. The largest group of respondents' fathers had completed grade school. The rest had No schooling __% Some High School __% Finished High School __% College __% __% of our respondents did not know how much education their fathers had completed. Our respondents grew up in the following locations: Northern Wisconsin __% Southwestern Wisconsin __% Southeastern Wisconsin __% Minnesota __% Illinois __% Iowa __% Michigan __% Rest of United States __% Rest of the World __% Most of these places were farms, although only __% of our respondents were part of the great migration from the farm to the city which has taken place in the U. S. during the last thirty to fifty years. The rest of these places were: Small towns (under 2500 population) __% Small cities (2500 - 50,000 population) __% Large cities (over 50,000 population) ___% Most of our respondents had completed (some college)(?)(__%). The rest had completed: Less than eighth grade __% Eighth grade __% Some high school __% High school __% College (B.A.) 
__% Graduate or Professional degrees __% (Rank by frequency) In addition __% of our respondents had completed from 1 to 24 months of vocational schooling which in many cases allowed them to enter their present occupations. Of those who had been to college, __% attended the University of Wisconsin during their undergraduate days. __% attended one of the several state universities, called state normal schools or state teachers colleges or state colleges when our respondents were in residence. __% attended one of Wisconsin's several fine private colleges. The remainder, __% attended a college outside Wisconsin. Most of our respondents (__%) were married at the time we interviewed them. The average age at which they were married was __ years, although this figure may be a little misleading since this is the age at which they married their present spouses. Many of them had been married to someone else earlier in their lives. Of the remaining respondents, __% were widows or widowers, __% were divorced or separated, and __% had never married. From __ to __ children had blessed these marriages; the average number (sometimes from more than one marriage) was __. The wives of our male respondents were (their husbands said) from __ to __ years old. Their average age was __ years. Most of them (__%) also grew up on farms and evidently followed (or possibly preceded) their husbands to the city. The rest of them grew up in: Small towns (under 2500 population) __% Small cities (2500-50,000 population) __% Large cities (Over 50,000 population) __%. Their educations were not so complete as those of their husbands since they completed (on the average) __ years of school. Still __% had a college degree and __% had completed some college. Many of them (__%) had completed from 1 to 24 months of vocational schooling. After their marriage at the average age of __, __% of those wives worked outside the home for an average of __ years. 
For many of them, (__%) their husbands said that the extra money they earned was not really important, but __% said that the money the wife earned was very important to the economic well being of the family. __% of these wives are still working outside their homes. II. The Work of Our Respondents and the Rewards from Their Work Most (__%) of our respondents were employed in 1963 even though state unemployment rates were quite high. The percentages in each general occupation category were: Professional and technical workers __% Business executives __% Nonfarm self-employed __% Farm Owners and Managers __% Clerical workers __% Service workers __% Skilled workers __% Unskilled laborers __% Not ascertained which category __% Of those with supervisory duties, our respondents had from one to 5,000 employees responsible to them. Most (__%) had from __ to __ employees responsible to them. They worked from __ to __ hours per week although many (__%) had such varied hours that they could not give any absolute number which they worked per week. In 1963, __% of our respondents also had second (or even third) jobs in addition to their main jobs. These jobs ranged from working weekends at the corner gasoline service station to running a profitable manufacturing firm. On the average our respondents worked __ hours per week on each "second" job in 1963. Of those who were unemployed at the time of our study (__%), the most usual cause of their unemployment was ______________. The fact that so large a number were employed at the time of the study marks the fact that __% of our respondents (including those unemployed in 1964) said that they were unemployed every year or every few years. Thus unemployment was a threat to many of our respondents. Some of our respondents (__%) had left the labor force because of retirement, but a sizeable number, after a short period of inactivity, had taken new jobs and in some cases had new careers. 
The rewards from the duties which our respondents performed were above the state average. Our respondents averaged $__ income from work in 1963. Business or farm income is not included in this figure because it is a combination of income from work and from investment. In addition to current money income, however, __% of our respondents received health insurance paid for in whole or in part by their employers, __% received life insurance for which their employers paid some part, and __% received a retirement nest egg from their employers.

III. The Homes Our Respondents Live In

Our respondents were in general well satisfied with the housing arrangements they had last summer. __% said that they were either "satisfied" or "very satisfied" with their living arrangements. __% of our respondents lived in nonfarm houses and __% owned their homes or were in the process of paying off mortgages on them. Most of these dwellings were single family dwellings, although __% had apartments or rooms in the house they owned and lived in. The majority of these dwellings were estimated by our respondents to be worth from $__ to $__ at the time of the interview. The average monthly rent paid by those who did not own their homes was $__, but many people paid some of this in the form of work or by paying for utilities, mortgages, and like expenses. Of the farmers, __% owned their farms, which ranged from __ to __ acres.

IV. Our Respondents Look Toward the Future

Our respondents protected themselves against an uncertain future in several ways. One way was by owning various assets. The following table indicates the percentages of people who owned various types of assets. (The percentages do not sum to 100 because many of our respondents owned more than one type of asset.)

U. S. Government Bonds                                  __%
State and Local Government Bonds                        __%
Corporate Bonds                                         __%
Savings Accounts                                        __%
Common Stocks                                           __%
Preferred Stocks                                        __%
Real Estate (in addition to the respondent's home)      __%
A Business                                              __%

The total net worth of our respondents can be seen in the following table:

Total Net Worth          % of our respondents
Under $1,000             __%
$1,000 - $4,999          __%
$5,000 - $9,999          __%
$10,000 - $24,999        __%
$25,000 - $49,999        __%
$50,000 - $74,999        __%
$75,000 - $99,999        __%
$100,000 - $199,999      __%
$200,000 or more         __%

In addition to the assets they own, __% of our respondents were members of a retirement plan run through their place of business and __% had a program in which they collected only part of their salary in 1964, with the balance to be paid them at some time in the future. Our respondents also had invested in life insurance. __% of our respondents had from $__ to $__ worth of insurance on the head of the family and many had insurance on other members of the family as well.

Of course our respondents also looked to government or their families for aid when their own resources were not adequate to meet the demands of living. __% of our respondents have lived with relatives during the last fifteen years, although in many cases this was during childhood. __% of our respondents were receiving or had received in the last fifteen years pensions, education subsistence, social security benefits, public assistance, or workmen's compensation, all of which are governmental programs. At the same time __% of our respondents supported one or more persons outside their homes during 1963 and __% have had at least one other person staying in their homes during the last five years. Our respondents were also generous in other ways. __% of them gave some money or goods to charity in 1963.

Despite their concern for security, our respondents are optimistic about the future. __% of our respondents felt that the total family income would be greater in 1968 than it had been in 1963.
__% thought their income would be 50% more in 1968 than it had been in 1963. Only __% thought it would be less, and these usually listed retirement as the cause of the probable decrease. One of the major causes of optimism about their future family income level was that our respondents felt that the economy of the United States would continue to grow and that their incomes would keep pace with that growth.

In addition, our respondents were both optimistic about their employment possibilities and ambitious to improve them. __% of our respondents have worked for more than one company in the last five years and __% said these job changes had helped their ability to get ahead. __% said that they had looked for new job opportunities at least once during the last five years and __% said that there was at least a possibility that they would make an important job change during the next five years. Of these, __% felt that ______________ would be the most important advantage they would get from the change and __% felt that ____________ would be the most important advantage they would get from the change.

To summarize, then, our respondents are mobile in that they have come from all parts of the earth to Wisconsin and they welcome the prospect of changing to new and better employment opportunities. They desire certainty in an uncertain world and yet they do not fear a future for which they have made plans. They are well housed, well fed, and healthy. In short, they represent a group of which the nation can be proud.

http://www.ssc.wisc.edu/wais/WAIS645025.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645025.txt

Gene Moyer, 1964. Response Rates for the Survey. December 17, 1964. WAIS paper 645-018. Survey Data and File.

Gene Moyer
WAIS 645-018
December 17, 1964
First Revision

Response Rates for the Survey

The survey is now complete. Scanlon's two interviews have come in, and a determination has been made on each of the ten partially completed schedules.
Nine of the partially completed interviews were counted as interviews and one was counted as a refusal. The tables on pages 2-5 are revisions of the table given in the Memorandum of November 27, 1964, which is the first draft of this paper. The eleven cover sheets added by the Wisconsin Survey Research Laboratory because of recent family breakups are still included separately, although all totals include these eleven wives' cover sheets.

The number of possible interviews is a matter of controversy among our staff and the Survey Laboratory. Each staff proposal is given as a separate table, and the relevant percentages have been computed for it. The difference concerns the categories of cover sheets which should be termed "non-sample" and those which should be termed "non-participants". The following chart indicates the differences among the various proposals.

Non-Sample Cover Sheets

Proposal A: Deceased; Not an eligible respondent.
Proposal B: Deceased; Not an eligible respondent; Unable to participate.
Proposal C: Deceased; Not an eligible respondent; Moved out of state; Moved within state, no other address known; Moved, no address known.
Proposal D: Deceased; Not an eligible respondent; Moved out of state; Moved within state, no other address known; Moved, no address known; Unable to participate; Unable to contact.

In each table below, "Number" is the number of cover sheets, the first percentage column is the percent of the 2080 original cover sheets, and the second is the percent of possible interviews under that proposal.

Response Rates, Proposal A

    Category                                  Number  % of 2080  % of 1958
 1. Original cover sheets
 2. Number in original sample                   2069
 3. Added by Survey Laboratory                    11
 4. Total original cover sheets                 2080      100.0
 5. Non-sample cover sheets
 6. Deceased                                      27        1.3
 7. Not an eligible respondent                    95        4.6
 8. Total non-sample cover sheets                122        5.9
 9. Non-participating respondents
10. Moved out of state                            80        3.8        4.1
11. Moved within state, no other address known     6        0.3        0.3
12. Moved, no address known                       59        2.8        3.0
13. Refused                                      446       21.5       22.8
14. Unable to participate                         33        1.6        1.7
15. Could not contact                             34        1.6        1.7
16. Total non-participating respondents          658       31.6       33.6
17. Interviews
18. Interviews with booklets                    1139       54.8       58.1
19. Interviews without booklets                  161        7.7        8.2
20. Total interviews                            1300       62.5       66.4
21. Total possible interviews (16 + 20)         1958       94.1      100.0

Response Rates, Proposal B

    Category                                  Number  % of 2080  % of 1925
 1. Original cover sheets
 2. Number in original sample                   2069
 3. Added by Survey Laboratory                    11
 4. Total original cover sheets                 2080      100.0
 5. Non-sample cover sheets
 6. Deceased                                      27        1.3
 7. Not an eligible respondent                    95        4.6
 8. Unable to participate                         33        1.6
 9. Total non-sample cover sheets                155        7.5
10. Non-participating respondents
11. Moved out of state                            80        3.8        4.2
12. Moved within state, no other address known     6        0.3        0.3
13. Moved, no address known                       59        2.8        3.1
14. Could not contact                             34        1.6        1.8
15. Refused                                      446       21.5       23.1
16. Total non-participating respondents          625       30.0       32.5
17. Interviews
18. Interviews with booklets                    1139       54.8       59.1
19. Interviews without booklets                  161        7.7        8.4
20. Total interviews                            1300       62.5       67.5
21. Total possible interviews (16 + 20)         1925       92.5      100.0

Response Rates, Proposal C (WSRL)

    Category                                  Number  % of 2080  % of 1813
 1. Original cover sheets
 2. Number in original sample                   2069
 3. Added by Survey Laboratory                    11
 4. Total original cover sheets                 2080      100.0
 5. Non-sample cover sheets
 6. Deceased                                      27        1.3
 7. Not an eligible respondent                    95        4.6
 8. Moved out of state                            80        3.8
 9. Moved within state, no other address known     6        0.3
10. Moved, no address known                       59        2.8
11. Total non-sample cover sheets                267       12.8
12. Non-participating respondents
13. Refused                                      446       21.5       24.6
14. Unable to participate                         33        1.6        1.8
15. Could not contact                             34        1.6        1.9
16. Total non-participating respondents          513       24.7       28.3
17. Interviews
18. Interviews with booklets                    1139       54.8       62.8
19. Interviews without booklets                  161        7.7        8.9
20. Total interviews                            1300       62.5       71.7
21. Total possible interviews (16 + 20)         1813       87.2      100.0

Response Rates, Proposal D

    Category                                  Number  % of 2080  % of 1746
 1. Original cover sheets
 2. Number in original sample                   2069
 3. Added by Survey Laboratory                    11
 4. Total original cover sheets                 2080      100.0
 5. Non-sample cover sheets
 6. Deceased                                      27        1.3
 7. Not an eligible respondent                    95        4.6
 8. Moved out of state                            80        3.8
 9. Moved within state, no address known           6        0.3
10. Moved, no address known                       59        2.8
11. Unable to participate                         33        1.6
12. Could not contact                             34        1.6
13. Total non-sample cover sheets                334       16.0
14. Non-participating respondents
15. Refusals                                     446       21.5       25.5
16. Total non-participating respondents          446       21.5       25.5
17. Interviews
18. Interviews with booklets                    1139       54.8       65.2
19. Interviews without booklets                  161        7.7        9.3
20. Total interviews                            1300       62.5       74.5
21. Total possible interviews (16 + 20)         1746       84.0      100.0

http://www.ssc.wisc.edu/wais/WAIS645018.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645018.txt

Ron Durant, 1964. Proposed Format of an Extraction Record to be Used in WAIS Cross Tabulations. December 15, 1964. WAIS paper 645-021. Cross Tabulations Formats.

Ron Durant
WAIS Working Paper 645-021
December 15, 1964
DRAFT

Proposed Format of an Extraction Record to be Used in WAIS Cross Tabulations

To: WAIS Staff: Bridges, David, Groves, Lampman, Miller, Bauman, Geffert, Moyer, Ryshpan, Roubal, Seavey, and Weigner
From: Ron Durant
Re: (1) R.F. Miller's Draft (7/28/64) - WAIS Memorandum on Initial Tabulations of Tax Return Data. (2) WISTAB (Wisconsin Tabulator) Users Manual.

An extraction run (TAX-06) will be written to extract two files from the WAIS Master File. The extracted output files will consist of: (1) individual return filers; and (2) single persons and combined husband and wife units. The following is a proposed record format for these extracted files. For extracted output [file (1)] there will be one of the following records for every individual in the WAIS sample.
For extracted output [file (2)] there will be one record for every single person and one record for every combined husband and wife in the WAIS sample. Record Position Data 1- 8 Identification No. (individual or husband) 9-10 Number of years filed (Ni) for the ith individual or husband during the period 1947-1959. (xx) 11-15 AGI average, 1947-59 (AGIi) (xxxxx) $ only 16-20 NTI " " (NTIi) " " 21-25 W&S " " (Wi&Si) " " 26-30 D&G " " (Di&Gi) " " 31-35 B&P " " (Bi&Pi) " " Record Position Data 36-40 D average, 1947-59 (Di) (xxxxx) $ only 41-45 G " " (Gi) 46-52 Standard Deviation (AGIi) (xxxxx.xx) or (e.g.) 53-59 Standard Deviation (NTIi) (xxxxx.xx) 60-66 " " (Wi&Si) " 67-73 " " (Di&Gi) " 74-80 " " (Bi&Pi) " 81-87 " " (Di) " 88-94 " " (Gi) 95-101 Trend of (AGIi) over time (xxxxx.xx) (e.g.) 102-108 Trend of (NTIi) over time (xxxxx.xx) 109-115 " " (Wi&Si) " " " 116-122 " " (Di&Gi) " " " 123-129 " " (Bi&Pi) " " " 130-136 " " (Di) " " " 137-143 " " (Gi) " " " 144-150 Constant term (AGI i) (xxxxx.xx) (e.g.) ai = AGIi - bi Ti 151-157 " " (NTIi) " 158-164 " " (Wi&Si) " 165-171 " " (Di&Gi) " 172-178 " " (Bi&Pi) " 179-185 " " (Di) " 186-192 " " (Gi) " 193-199 AGIi Sum of Residual variation from trend (xxxxx.xx) (e.g.) 200-206 NTIi Sum of residual variation from trend (xxxxx.xx) 207-213 Wi&Si " " " " " " " " 214-220 Di&Gi " " " " " " " " 221-227 Bi&Pi " " " " " " " " 228-234 Di " " " " " " " " 235-241 Gi " " " " " " " " 242-248 Coefficient of variation of (AGIi) (xxxxx.xx) (e.g.) 
Record Position   Data
249-255   Coefficient of variation of (NTIi) (xxxxx.xx)
256-262   Coefficient of variation of (Wi&Si) (xxxxx.xx)
263-269   Coefficient of variation of (Di&Gi) (xxxxx.xx)
270-276   Coefficient of variation of (Bi&Pi) (xxxxx.xx)
277-283   Coefficient of variation of (Di) (xxxxx.xx)
284-290   Coefficient of variation of (Gi) (xxxxx.xx)
291-292   Number of job changes 1947-1959 (xx)
293       Sex of filer (M or F)
294       Marital status (S = single, M = married)
295-299   AGIi 1947 (xxxxx)
300-304   NTIi 1947 (xxxxx)
305-309   Wi&Si 1947 (xxxxx)
310-314   Di&Gi 1947 (xxxxx)
315-319   Bi&Pi 1947 (xxxxx)
320-324   Di 1947 (xxxxx)
325-329   Gi 1947 (xxxxx)
330-334   AGIi 1953 (xxxxx)
335-339   NTIi 1953 (xxxxx)
340-344   Wi&Si 1953 (xxxxx)
345-349   Di&Gi 1953 (xxxxx)
350-354   Bi&Pi 1953 (xxxxx)
355-359   Di 1953 (xxxxx)
360-364   Gi 1953 (xxxxx)
365-369   AGIi 1959 (xxxxx)
370-374   NTIi 1959 (xxxxx)
375-379   Wi&Si 1959 (xxxxx)
380-384   Di&Gi 1959 (xxxxx)
385-389   Bi&Pi 1959 (xxxxx)
390-394   Di 1959 (xxxxx)
395-399   Gi 1959 (xxxxx)
400-401   Occupation 1947 (xx)
402-403   Occupation 1953 (xx)
404-405   Occupation 1959 (xx)
406-407   No. of dependents 1947 (xx)
408-409   No. of dependents 1953 (xx)
410-411   No. of dependents 1959 (xx)
412       Record mark (F)

General Notes On Data Fields:

1. If any component of an "average" field has an "MI" (Missing Information) for a given year, the data field for that year is bypassed. E.g., an individual has filed for t = 47, 48, 49, 50, 51; however, the AGI field in year '49 has an MI in the units position. Then

$$\overline{AGI}_i = \frac{\sum_{t=47}^{48} AGI_{it} + \sum_{t=50}^{51} AGI_{it}}{4}$$

In this case the Ni (positions 9-10) for the individual would be 5; however, the Ni used in $\overline{AGI}_i$ is 4. It seems that an Ni = 5 would not be distorting in any subsequent calculation of weights, since $\overline{AGI}_i$ would be an appropriate estimate of the particular AGI.

2. If an individual did not file in any one or two or all of the years 1947, 1955 and 1959, an "NF" (not filed) will appear in the units position of the appropriate fields in the extracted data record. Likewise, for missing information an "MI" will appear in the units position of a particular field. WISTAB can either keep a separate count or no count at all of these items in both the frequency and value added options of the cross-tabulation program.

Timing and Blocking Considerations

We now have a 412-character record for each of our 18,000 individuals [file (1)].
For file (2), we will have somewhat fewer than 18,000 412-character records (depending on the number of combined husband-and-wife units). Both of these files are input to WISTAB. One IBM high density tape will hold approximately 19,000 412-character records. It takes approximately 12 minutes to read one reel (with 18,000 such records) on an IBM 729 II tape drive. Blocking the 412-character records could cut down on the read time (which is not crucial here), but it would also leave less memory available in WISTAB for tables, since a read-in area larger than 412 characters would have to be provided. Since we desire to do as many tables as possible in one pass, it is recommended that we leave the extracted data records unblocked.

Referring to the WISTAB write-up (page 16), we can adapt the "Number of Tables Per Pass of Data" formula to our two-dimensional needs as follows:

(1) $$\sum_{i=1}^{n} \left[(X_i + 1)(Y_i + 1)\right] C \le \text{Machine Core Size} - \left[11{,}000 + (412 - 80)\right]$$

where $X_i$ is the number of X intervals in the ith table, etc., C is the counter (or cell) size specified, and n is the number of tables to be run. Since Machine Core Size = 40,000 we have:

(2) $$\sum_{i=1}^{n} \left[(X_i + 1)(Y_i + 1)\right] C \le 28{,}668$$

Now for a straight bivariate frequency table with each variable having 12 classes we have $X_i = 12$, $Y_i = 12$ and $C = 5$:

(3) $$n\left[(13)(13)\right] 5 \le 28{,}668, \qquad n \le 28{,}668/845, \qquad n < 34 \text{ tables per pass}$$

Now for a straight bivariate value added table with each variable being 5 digits in length and having 12 classes we have $X_i = 12$, $Y_i = 12$ and $C = 6$:

(4) $$n < 28{,}668/1014, \qquad n < 28 \text{ tables per pass}$$

There is no timing formula for the length of a WISTAB pass, and we have no experience in predicting the running time with our volume of data. We only know that it will take approximately 12 minutes of read time for each pass. In regard to timing, the following is quoted from page 16 of the WISTAB write-up:
"Timing is affected by many factors, the number of positions in each variable, the number of axes defined in X,Y,Z control cards and the number of intervals specified for each axes, so that a formula would be too complicated to use. Users of WISTAB have in general found it to be an efficient means of tabulating."hahttp://www.ssc.wisc.edu/wais/WAIS645021.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645021.txt? 85421 Ron Durant 1965$Description of Tax-06 ProgramnFebruary 1, 1965 WAIS paper645-027Programs,&Ron Durant, WAIS Paper 645-027 February 1, 1965 Report To: WAIS Staff: Bridges, David, Groves, Lampman, Miller, Moyer, Bauman, Geffert, Ryshpan, Roubal, Seavey, Wiegner. From: Ron Durant Document: Description of Tax-06 Program. I. Tax-06 has been written and run in order to perform the following: 1. Edit and consolidate Resubmitted Edit File (From Tax-01 Master Creation Run) and New Folder Data File (additional data which never got into Tax-01 Run). TAPE-SSRI-306 SORTED EDITS TAPE-SSRI-307 SORTED NEW FOLDER DATA TAX-O6 CONSOLIDATED O/P I/P to TAX-03 Master Updating Run TAPE-SSRI-316 LISTING OF EDITS NEW FOLDER RECORDS 2,748 EDIT I/P 10,470 RECORDS REJECTED 646 CONSOLIDATED O/P 12,571 IBM 1410 TIME 25 MINUTES II. Edit Listing has been handed over to the keypunching group for processing.hahttp://www.ssc.wisc.edu/wais/WAIS645027.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645027.txt  Ron Durant 1965:4Description of Tax-03 (Edit and Master Updating Run)February 1, 1965 WAIS paper645-028ProgramsRon Durant WAIS Paper 645-028 February 1, 1965 Report To: WAIS Staff: Bridges, David, Groves, Lampman, Miller, Moyer, Bauman, Geffert, Ryshpan, Roubal, Seavey, Wiegner. From: Ron Durant Document: Description of Tax-03 (Edit and Master Updating Run). I. Tax-03 has been written and run in order to perform the following: 1. Update existing Master Records. 2. Add new Master Records. 3. Delete Master Records. 4. Change I.D.# and write out recycled Master Records. 5. 
Perform editing on updating input. MASTER IN FILE TAPES-SSRI-262, 263 & 298 TAPE-SSRI-316 RESUBMITTED DATA TAX-03 NEW MASTER FILE TAPES-SSRI-291, 168 & 307 EDIT LISTING RECYCLED MASTER RECORDS TAPE-SSRI-297 EDITS TAPE-SSRI-292 DATA IN RECORDS 12,571 MASTER RECORDS IN 132,633 EDIT RECORDS REJECTED 7 MASTER RECORDS DROPPED 42 MASTER RECORDS OUT 134,754 RECYCLED MASTER RECORDS OUT 207 NEW COMPLETE MASTER FILE IBM 1410 TIME 100 MINUTES II. Edit Listing has been handed over to the keypunching group for processing.hahttp://www.ssc.wisc.edu/wais/WAIS645028.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645028.txte Ron Durant 1965NGProposed Format of an Extraction Record to be Used in Cross TabulationsFebruary 3, 1965 WAIS paper645-031i Cross Tabulations FormatsrRon Durant WAIS Working Paper 645-031 February 3, 1965 Draft To: WAIS Staff: Bridges, David, Groves, Lampman, Miller, Bauman, Geffert, Moyer, Ryshpan, Roubal, Seavey, Wiegner. From: Ron Durant. Document: Proposed Format of an Extraction Record to be used in WAIS Cross Tabulations. I. The following record is proposed to be extracted for all individual return filers. 
Record Position Data 1- 8 Identification Number 9-10 Year 11 Sex 0 = Male 1 = Female 12 Marital Status 0 = Single 1 = Married 13-14 Occupation 15-18 Year of Birth 19-20 County of Residence (Prior Year) 21-22 Number of Dependents 23 Race 24-29 AGI (Adjusted Gross Income) 30-35 NTI (Net Taxable Income) 36-41 W & S (Wages and Salaries) 42-47 Total Dividends Received 48-53 Gain at Loss on Sale of Assets 54-59 Profit or loss from Business & Partnership Income 60-65 Total Dividends Received & Gain or Loss on Sale of Assets 66-71 Total Interest Received 72-77 Total Rent Received 78-83 Income from Trustees or Fiduciaries 84 Blank 85 Record Mark (not =)hahttp://www.ssc.wisc.edu/wais/WAIS645031.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645031.txt Ron Durant 1965Revision of WAIS Working Paper 645-031-- Proposed Format of an Extraction Record for Extract 01 File to be Used in WAIS Cross Tabulations and Regressions March 2, 1965 WAIS paper645-040LExtract 01 Formats Ron Durant WAIS Working Paper 645-040 March 2, 1965. Revision To: WAIS Staff: Bridges, David, Groves, Lampman, Miller, Bauman, Geffert, Moyer, Ryshpan, Roubal, Seavey, Wiegner. From: Ron Durant Document: Revision of WAIS Working Paper 645-031 Proposed Format of an Extraction Record for EXTRACT #1 FILE to be used in WAIS Cross Tabulations and Regressions. I. The following record is proposed to be extracted for all individual return filers. All amount fields are in dollars only. Extracted Record Position Extracted Data Source Record Position Source File A=WAIS Income Master File B=ID & Social Sec. File 1- 8 Individual Identification If cols. 8-9 of A = 00 *then col. 7 of Extract Rcd = +0 If cols. 8-9 of A = 10 *then col. 
7 of Extract Rcd = 1 2- 9 A 9-10 Year 10-11 A 11 Sex Male - 0 Female - 1 8 A 12 Marital Status (Spouse Separate Income) 0 - no spouse 1 - spouse, no separate return 2 - spouse, with separate return 3 - spouse, died during year 23 A 13-14 **Age 129-130 B Addition of "+" one will enable husband and wife to sort ahead of the remainder of the family. ** Blank(s) will appear in these fields if no social security information exists. Extracted Record Position Extracted Data Source Record Position Source File A=WAIS Income Master File B=ID & Social Sec. File 15 ** Race 131 B 16-17 Number of Dependents 26-27 (excluding spouse if there is one) A 18-19 Occupation 18-19 A 20 Return Reason 1 - yes 2 - no, insufficient income 3 - no, student 4 - no, lack of knowledge 5 - no, to military service 6 - no, not Wisconsin resident 7 - no, unemployed 8 - no, reason unknown 9 - unknown (item not completed by taxpayer) 21 A 21-22 Current Year County 12-13 A 23 Third Digit Current Residence Location 14 A 24-25 Prior Year County 15-16 A 26-32 AGI (Adjusted Gross Income) 145-151 A 33-39 NTI (Net Taxable Income) 163-169 or 298-304 A whichever is smaller 40-46 W & S (Wages & Salaries) 28-34 37-43 46-52 Sum of these W&S fields 47-53 Total Dividends Received 64-70 A 54-60 Gain or Loss on sale of Assets 82-88 A 61-67 Profit or Loss from Business & Partnership Income 91-97 109-115 Extracted Record Position Extracted Data Source Record Position Source File A=WAIS Income Master File B=ID & Social Sec. 
File 68-74 Total Interest Received 55-61 A 75-81 Total Rent Received 73-79 A 82-88 Income from Trustees or Fiduciaries 100-106 A 89-90 **Year of birth 129-130 B 91 ( )Record Markhahttp://www.ssc.wisc.edu/wais/WAIS645040.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645040.txt \Victor Cassidy 1965{A Proposal to Determine Birth and Death Date Data for Persons in the Master File Who Have No Social Security Account Number 1965NGIdentification Code for Social Security Administration Punch Card FilesOctober 21, 1965 WAIS paper656-027e("Social Security Earnings Data- 805hahttp://www.ssc.wisc.edu/wais/WAIS656027.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656027.txt2Mike VonSchneidemesser VonSchneidemesser WAIS 656-027 October 21, 1965 Identification Code for Social Security Administration Punch Card Files Along with the tape files of 805 information, the SSA sent us punch card files for those cases which for some reason or another are not contained on the tape file. These card files are described in the letters by Robert Heller to Roger Miller (March 6, 1964) and for the second delivery, Ira Rifkind to Gene Moyer (September 2, 1965). The layout of these card files is given in the "Description of tape file of form 805 information and punch card files", and for the second delivery in the above-mentioned letter by Ira Rifkind. In order to identify the different kind of cards if necessary an identifying code was entered in columns 74-80 of each card. Codes with less than seven positions are right justified. Code Explanation INC3 Incorrect Social Security account number was specified. IMP4 Impossible case; no SS# like that was ever given. DUP2 Duplicate number, no record of claims case; i.e. two WAIS numbers with the same social security number were specified. DUPINC5 Duplicated incorrect SS& DUPCLM8 Duplicate claim; i.e. two WAIS numbers were associated with the same SS# in claims status. 
MULCLM7 Multiple claim
NIF9 Not in file; the SS# does not appear in the files of the Social Security Administration.
DUPIMP6 Duplicate impossible; an impossible SS# was given a second time.
PINCO Previously incorrect; a SS# which was already identified as incorrect with the first delivery.

To separate cards of the first delivery from those of the second delivery (the layouts are different!), a zone punch in row eleven of column 80 was put in the cards of the second delivery. Thus PINCO will always have a zone punch, and the numeric characters in column 80 of the second delivery cards will appear as alphabetic characters when interpreted.

Note: Normal claims cards do not have an identifying code in columns 74-80. However, the second delivery of claims cards has the eleven zone punch in column 80.

Ashok Bhargava 1967Selection File Format July 14, 1967 WAIS paper667-054Selection File
Ashok Bhargava WAIS 667-054 July 14, 1967

SELECTION FILE FORMAT

The following is the format of the selection file. It covers both the old MF and the new MF (i.e. from 1946-1964). The location of fields of MF records is based on the old MF, and will be changed to the new (integrated) MF fields when they are available. The population of the Selection File includes persons in the MFs and women (spouses of males in the sample) who may not have filed at all. Dummy records will be created for these women when there is an indication on the male's record that he is married. Dummy records will also be created for males who do not file while their spouses do file.

FORMAT Field Input Location in Size Item Description File Input File 8 WAIS ID # 9 Social Security # sample, high income (1,0001s) benefit it (70's) Record present/absent (gap plugging) indicators (1946-64: 2 each year). 2-9 10-18 2-4 10-11 ea. yr.
2 First year f Had- return 2 Vast year filed return First year filed (not 1946) 2 I Total year records appear NF 33 ;Binary code for interest, dividends, capital gains, rent, business (1946-64) I 2 10-11 10-11 10-11 10-11 55-99 ea. yr. 19 Spouse separate *income 19 Return reason code (previous year) 57 Residence location. 1 38 1 County prior year 19 Marriage details 7 Year 46-52 indicator 0 - Total income 1 3500 1 HP (1946148) Total income 1 3500 (1949-52) 1 - Otherwise 7 ;Form type (1946-52) 2 Month of death 2 23-e4,: yr.. yr - 12-14 ea. yr. 15-16 lei ea. yr, 127-135 ea yr: .387 ea. yr. Card 2/Sequence -Field 56-61 (BF) 52-57 (Age) 2 Year of last known address I FFID 1 122-123 Field Size 2 County Code 2 Month of birth 2 Year of birth 1 Absent - present indicator 1 Race 1 Chosen in survey 1 Response 1 Asset Booklet 19 Absent-present indicator 2 First year filed 2 Last year filed 19 Absent - present indicator 2 First year filed 2 Last year filed 1 "N" (husband-Wife) 131 ID# By hand By hand 10-11 ea. 10-11 10-11 27-28 ea. 27-28 27-28 yr Yr . 18 1 "WN" Indicator Indicator Indicator of type of dummy record MENCit (Entry code) Primary year for MENCit Secondary year for MENCit it I it "Entry code) Year of IENCit MEXCit (Exit code) 667-012 667-012 667-012 667-012 667-012 667-012 667-012 667-012 667-012 19 I 20 1 21 1_ 22 2 23 2 24 I 25 2 26 1 27 Field Item Description Input Location in Size File input file 2 Primary year for MEXCit 667-012 28 2 Secondary year for MEXCit 667-012 27 1 IEXCit (Exit code) 667-012 28 2 Primary year for IEXCit 667-012 29 2 Secondary year for IEXCit 667-012 30hahttp://www.ssc.wisc.edu/wais/WAIS667054.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS667054.txt 3  James Geffert 19654-Proposed Flow Diagram of Weighting ProcedureseFebruary 1, 1965 WAIS paper645-029pSurvey Data and File Jim Geffert WAIS Paper 645-029 February 2, 1965. To: WAIS Staff: Bridges, David, Groves, Lampman, Miller, Moyer, Bauman, Durant, Ryshpan, Roubal, Seavey, Wiegner. 
From: Jim Geffert Document: Proposed flow diagram of weighting procedures. SOURCE FILES a) 1962 Last Name Groups Identified With Name and Address and SS#. b) MASTER Income Information Identified With WAIS ID#. c) ID FIXED FORMAT Identification Information (with Social Security) by SS# and WAIS ID# and Name and Address. d) SAMPLE & KEYS Interview Sample information before Survey by Name and Address, WAIS ID, SS# depending on Key A. e) COVER SHEET CARDS Interview information after survey, WAIS ID#, Sequence # and Keys. f) SAMPLE CARDS (DATA) Respondent Information, WAIS ID# or dummy WAIS ID# g) LISTINGS RE-KEY PROCEDURES A FIXED FORMAT ID MASTER FILE COMPUTE KEYS, MATCH MERGE PUT IN COMMON FORM LISTING OF NO MATCHES, IF ANY. NO MATCH ID NOMATCH MASTER CORRECTIONS AND ADDITIONS ID FIXED FORMAT & KEYS TO B B ID FIXED FORMAT & KEYS 1 ID FF & KEYS 2 MERGE ID FIXED FORMAT & KEYS TO REMATCH REMATCH PROCEDURE 1962 PUT IN COMMON FORMAT & COMPUTE KEYS SORT BY SS# ID FIXED FORMAT & KEYS FROM B PUT IN COMMON FORMAT SORT BY SS# C MATCH MERGE UNMATCHED 1962 MATCH SS# TO E UNMATCHED FIXED FORMAT ID SORT BY NAME, ADDRESS SORT BY NAME, ADDRESS D MATCH MERGE UNMATCHED 1962 NAME & ADD. UNMATCHED FIXED FORMAT TO E MATCH SS# NAME & ADD. MERGE MATCHED FILE TO F FINDING THE SAMPLE COVER SHEET CARDS MATCHED FILE FROM E F MATCH ON ID# RESIDUAL COVER SHEET CARDS NON SAMPLED ITEMS SAMPLED ITEMS COVER SHEET CARDS CORRESPONDING TO SAMPLED ITEMS SAMPLE WITH COVER SHEET INFORMATION G MATCH ON ID# UNMATCHED FIXED FORMAT ID's SAMPLED ITEMS RESIDUAL COVER SHEET CARDS NON SAMPLED ITEMS TO H COVER SHEET CARDS CORRESPONDING TO SAMPLED ITEMS SAMPLE WITH COVER SHEET INFORMATION MERGE MATCHED FROM MASTER TO J FINDING THE SAMPLE (cont.) 
RESIDUAL COVER SHEET CARDS LISTING OF RESIDUAL COVER SHEET CARDS LISTING OF ASSIGNED ID's H MATCH BY HAND MATCHED LIST UNMATCHED LIST HAND MATCH NAME AND ID# 62 SAMPLE I MATCH ON NAME MERGE ID# MATCHED FROM MASTER MATCHED 62 NON SAMPLE J SAMPLE FILE TO K AND M WEIGHTING PROCEDURE RESIDUAL SAMPLE SAMPLE FILE MATCHED TOTAL MASTER FILE TOTAL 1962 STATE FILE GROUP NAME 62 FILE etc. K CROSS TAB CROSS TAB CROSS TAB CROSS TAB CROSS TAB L ASSIGN WEIGHTS ON BASIS OF KEYS KEYS AND ASSOCIATED WEIGHTS SAMPLE FILE M MERGE WEIGHTS INTO SAMPLE FILE COMPLETE WEIGHTED SAMPLEhahttp://www.ssc.wisc.edu/wais/WAIS645029.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645029.txtF @@  !"#$)%&(+/0123456789:;=<>?@ABDEFGHIJKLMNOPQRSTUVWYZ[\]^_`abcdghijklmnopqrstuvwxyz{|}~      !"#$%&'()*+,-./012345679:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~D701-0181Master File- Tax Recordshahttp://www.ssc.wisc.edu/wais/WAIS701018.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS701018.txt Bob Gay" 1970 WAIS Meeting: Oct. 1, 1970October 5, 1970u WAIS paper701-017{ Miscellaneous\hahttp://www.ssc.wisc.edu/wais/WAIS701017.pdf httpK Roger Miller 1965.(Format of 1964 Tax Averaging Law File #1April 28, 1965 WAIS paper645-051p Averaging Studies Formats{Roger F. Miller WAIS 645-051 Working Paper April 28, 1965. FORMAT OF 1964 TAX AVERAGING LAW FILE # 1 Note: This is a revision of Barger's WAIS WP 645-048. Description of the File creation, new codes and variable changes is in R.F. Miller's WAIS WP 645-050. Record Position Symbol Item Remarks 1-8 WAIS Identification Number of filer Husband if a "joint record" 9 Record Type Code 10 Race (if available) of filer 11 Number of "Interpolated" years of filer " " " " " 12 " " " " of later spouse 13 " " " " of earlier spouse Records of Filer (or of husband if a "joint record") All Record Types 14-15 58 (i.e. computation year) t = 5 16 L Person filed Last year? and if not, why not? 
Code i = 1 17-18 Occupation code applicable to this year 19-20 Age in this year (if available) 21-22 E Number of motions 23 M Marital status in this year (recoded) 24 V Value of this year' a record code 25-31 G+ Sum of necessarily positive sources of income 32-38 G Adjusted Gross Income per Wisconsin 39-45 N Net Taxable " " " 46-52 C Capital Gains Net " " " 53-91 Same as 14-52 for preceding year (1957) t = 4 92-130 " " " " 1956 t = 3 131-169 " " " " 1955 t = 2 170-208 " " " " 1954 t = 1 209-403 Same as 14-208 for the later spouse to i = 2 the extent such records are available or "creatable". Record types 3-7 only. 404-598 Same as 14-208 for the earlier spouse to i = 3 the extent, such records are available or "creatable". Record types 3, 5-7 only. 599 Blank 600 Record markhahttp://www.ssc.wisc.edu/wais/WAIS645051.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645051.txtRZVictor Cassidy 1965{A Proposal to Determine Birth and Death Date Data for Persons in the Master File Who Have No Social Security Account Number May 24, 1965 WAIS paper645-065o4.Age Data Proposals- For Analyses, Theses, etc.Victor M. Cassidy Gene Moyer WAIS 645-065 June 18, 1965 Revised A Proposal to Determine Birth and Death Date Data for Persons in the Master File who have no Social Security Account Number The problem of gathering age and death data for those people for whom #805 data is not available is the major reason for this paper, but it seems reasonable to combine the parts of this operation with preparations for a test of the effect upon our name groups of having only the 1958 Madison tax rolls available when the sample was drawn. The following procedure allows both these operations to be performed at once: (1) Pull from the 1962 rolls all persons whose last names are the same as those in our name groups. Check each name list for size. If it is less than 220, pull off persons with the next last name until this list of names is at least 220. 
If the number of persons on the Madison roll is less than 40, pull off all persons in the next last name group until the total number from the Madison roll is at least 40. Both criteria must be met. If either is not met, pull off enough next last names until the criteria are met.

(2) With this selected tape, perform the operations required in the forthcoming WAIS paper on a test of whether the method of choosing name groups influenced the sample. These operations are described in that paper and need no further explanation. Extract from this tape those persons whose last names are the same as those of our name groups. This is the 1962 master file. At the same time, for each person whose name places him actually in one of our name groups, construct a nine-column key whose values are the same as those found in WAIS paper 645-002, corrected February 12, 1965. Place this in the last nine columns of each record of the 1962 master file (only digits 1, 8 and 9 will be non-blank at this stage).

(3) Match the FFID file with this 1962 master file in the following ways:
(a) Do Social Security number, last name, and first initial compare?
(b) Do first initial, last name, and address compare?
If the answer to either of these questions is yes, overlay the "I" or "N" FFID with a "J" FFID.

(4) Match the FFID file, the Extract #01 file, and the 1962 master file above on ID number and perform the following operations:
(a) In 79-80 of card 2 of the FFID: compare this figure with the largest year record the person possesses (10-11 of Extract 01). If the figure in 79-80 is greater than the figure in 10-11, do not change the record. If it is less than the figure in 10-11, overlay the figure in 79-80 with that given in 10-11.
(b) As Extract 01 is passed, compute a complete key for each person in the FFID file who has a "J" record, and place this in the blank columns of the key in the 1962 master file.
(c) Count the number of persons in the 1962 master file which have each given key.
Print out this count. (5) Having completed operations 1-4, (a) Store the select tape of the 1962 master file for other uses. (b) Print out the complete FFID (both cards) for all persons (at most 1527 as of 4-27-65) who had no Social Security number and who had other than 98 in the county code field (76-77 of card 2) in the FFID. (6)Take this printout to the State Office Building, 1 W. Wilson St. (contact Miss Taylor 266-1371). Death certificates there are filed alphabetically by year and the date of birth is written on the death certificate. (a) If the printout shows a terminal filing year, check under that year for a death certificate. Check also years following the terminal year (I propose a five-year limit to such checking). If no death certificate is found, look for a birth certificate. (b) If the terminal year is 1959 or 1960 and if the subject is not found in 1962 tax tape, check likewise for a death certificate. (c) If the terminal year is 1959 or 1960 and if the subject is found in the 1962 tax tape, then check for a death certificate in 1963, 1964 or 1965. If no death certificate is found, then look for a birth record.hahttp://www.ssc.wisc.edu/wais/WAIS645065.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645065.txtGHFH Ron Durant 1965.'Format of SSRI WAIS Tax Extract #1 FileeApril 12, 1965 WAIS paper645-046pAveraging Studies5LFRon Durant WAIS Working Paper 645-056 April 12, 1965. To: WAIS Staff: Bridges, David, Groves, Lampman, Miller, Barger, Bauman, Geffert, Moyer, Ryshpan, Roubal, Seavey, Wiegner. From: Ron Durant Document: Format of SSRI WAIS TAX EXTRACT #1 FILE I. The following record will be extracted for all individual return filers. The tape file will consist of two reels -- 3 records per block. All amount fields will be DOLLARS ONLY. 
Record Position Data 1- 8 WAIS Identification Number 9-10 Year 11 Sex 0 - Male 1 - Female 12 *Marital status 13-14 Age 15 Race 16-17 Number of dependents 18-19 Occupation 20 Return filed in previous year or reason why not: 21-22 County of Residence (Current Year) 23 City Designation (Current Year) 24-25 County of Residence (Prior Year) 26-32 AGI (Adjusted Gross Income) 33-39 NTI (Net Taxable Income) 40-46 W & S (Wages and Salaries) 47-53 Total Dividends Received 54-60 Gain or Loss on Sale of Assets 61-67 Self-employment Income [Includes Profit or Loss from Business & Partnership Income]. Record Position Data 68-74 Total Interest Received 75-81 Total Rent Received 82-88 Income from Trustees or Fiduciaries 89-90 Year of Birth 91 Record Mark ( ) * See Gene Moyer, The Coding of the Wisconsin State Tax Forms (1946-60) WAIS Paper 645-038 (1st Revision) p. 19, for an explanation of the marital status code.hahttp://www.ssc.wisc.edu/wais/WAIS645046.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645046.txt Ron Durant 1965ztAssignment of Entry Codes to Update and Correct Existing WAIS Master Records (7 Digit Pre-Consistency Master Record)August 12, 1965 WAIS paper645-047rFormatsdhahttp://www.ssc.wisc.edu/wais/WAIS645047.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645047.txtRon Durant WAIS Working Paper 645-047 August 12, 1965 1st Revision To: Hilde Roubal and WAIS Keypunching Staff Info: WAIS Staff: David, Groves, Lampman, Miller, Bauman, Geffert, Loniello, Moyer, Roubal, Ellis, Wiegner, VonSchneidemesser From: Ron Durant Document: Assignment of Entry Codes to Update and Correct Existing WAIS Master Records (7 Digit Pre-Consistency Master Record). I. The following are the format of the WAIS Master income Record and an assignment of entry codes which may be used to correct the appropriate fields in this master record. 
FORMAT DATA RECORD ENTRY CODE Pos 1 M 1 1 2 - 9 M 9 Identification number 10 - 11 M 11 Year of return 12 - 14 Residence location 15 - 16 County prior year 17 Address change 18 - 19 Occupation 20 Occupation change 21 M 27 Return reason 09 22 Partnership 23 Spouse separate income 24 Marriage details 25 Head of family 26 - 27 Number of dependents 28 - 34 M 34 Largest wage or salary 10 35 - 41 M 41 Second wage or salary 11 42 - 48 M 48 Total other sources wage or salary 12 49 - 55 M 55 Total interest received 13 56 - 62 M 62 Dividends received, total 14 63 - 69 M 69 Rent received, total 15* 70 - 76 M 76 Gain or loss on sale of assets 16* FORMAT DATA RECORD ENTRY CODE CODE 77 - 83 M 83 Profit or loss from business 17* 84 - 90 M 90 Income from trustees or fiduciaries 18 91 - 97 M 97 Partnership income 19* 98 -104 M104 Other income 20 105 -111 M111 Total of sources of income 21* 112 -118 M118 Auto or business expenses 22 119 -125 M125 Income (adjusted gross) less auto expense 23* 126 -132 M132 Standard deduction allowed 24 133 -139 M139 Net taxable incomes, standard deduction basis 25* 140 -146 M146 Wisconsin tax paid 26 147 -153 M153 Union dues 27 154 -160 M160 Medical-dental expenses 28 161 -167 M167 Total interest paid 29 168 -174 M174 Business interest paid 30 175 -181 M181 Dividend deductable 31 182 -188 M188 Other deductions 32 189 -195 M195 Alimony paid 33 196 -202 M202 Forest crop land 34 203 -209 M209 Total deductions before federal tax and donations 35 210 -216 M216 Net income before federal tax and donations 36* 217 -223 M223 Federal tax and social security deductable 37 224 -230 M230 Net income before donations 38 231 -237 M237 Donations 39 238 -244 M244 Net taxable income itemized 40* 245 M251 Block or column 246 -251 Personal exemption allowance 41 252 -258 M258 Net normal or total tax 42 259 -265 M265 First installment 43 266 M272 Type of item in 267-272 267- 272 Miscellaneous information 44 273 -279 M279 Social security received 45 280 -286 M286 Assessed 
taxable income 46 287 -293 M293 Total additional taxes 47 FORMAT DATA RECORD ENTRY CODE 294 M94 Number of sources wages and salaries 48 295 M295 Farm schedule, profit and loss 49 296 M296 Stock divided 50 297 M297 Auto expenses 51 298 M298 Other enclosures 52 299 M299 Spouse income 59 or 60 53 300-306 M306 Taxable income incomplete form or net taxable income, type 5 form 54* 307 M307 Type of form 55 308 M308 1 incomplete, 0 complete 309 M309 Blank 310 Record mark *Indicates fields where negative amount entries are possible. II. The following card entries will be necessary for the updating of each field in the master record. Only one field may be updated with any one card entry. Therefore, multiple field updating within one master record will require multiple card entries with the same identification number and year. Columns Data 1- 8 Identification number 9-10 Year 11-12 Entry Code 13 One digit information entries (example - Entry Code 48) 14-29 Coded Data (Entry Code 09 only) 30-36 Amount fields (Right justified - minus sign (if any) in column 30) 37-74 Blank 75-80 TAX-07 III. The above correction card entries will serve as input to TAX-07 which will update existing master records only. Therefore, this above procedure will not be used to add or delete master records or change Identification numbers. The procedure for adding and deleting master records and changing Identification numbers will remain as stated in WAIS Paper 645-016 (December 2, 1964). Such entries will be input to TAX-03. In view of these separate possibilities, any correction entries should be clearly marked (externally) as input and to TAX-03 or TAX-07 stored in separate locations.(Marshall Seavey, 1965&Proposal for Interview AnalysisrJanuary 6, 1965 WAIS paper645-023p@:Proposals- For Analyses, Theses, etc. 
Survey Data and File
Marshall Seavey
WAIS Working Paper 645-023
January 6, 1965
Draft

Proposal for Interview Analysis

Introduction

Individuals allocate their savings among various assets, and their investment behavior may be seen as a process designed to accomplish a set of goals in their investment portfolio. A person has perceptions regarding the qualities offered by his present portfolio, and in addition there are certain goals or ideal asset qualities which he may hope to achieve in the disposition of his savings. Therefore, it would appear that the important aspect of this behavioral description is whether or not the individual feels that his actual portfolio offers him the qualities he desires. If his desires are not satisfied, then he will want to make an alteration in either the size or distribution of his portfolio, if this is possible given his financial, family and other constraints. If this information can be obtained, it may be possible to know either the magnitude or direction of the changes to be made in the composition of consumer savings.

One line of analysis would entail developing objective measures designed to reveal the asset qualities characterizing any individual's portfolio. These measures would have to be constructed in such a fashion as to enable the analyst to describe and compare a given quality common to all assets. This would mean that the riskiness of a certain type of real estate must be compared with the riskiness of other dissimilar assets such as stocks, bonds, insurance and so forth. The difficulties in constructing such measures are obvious, but assuming it were possible, these measures would then be applied to the portfolio of any respondent to show objectively what qualities were provided by his assets. Then one would simply compare these measured qualities to those qualities which the respondent indicated he considered most important, to determine whether or not he was satisfied with his asset position.
But the problem here is that the objective measures which were developed may not measure qualities such as risk and liquidity as the respondent perceives these qualities. In other words, a respondent's perception of the degree of riskiness that characterizes his portfolio may not be equal to our objective measurement of this quality. Therefore a simpler and more fruitful approach to the problem may be pursued. Since our schedule contains questions which ask the respondent which of his assets provide him with current return, appreciation in value, and liquidity, it would seem advisable to use the respondent's answers to these questions in assessing the asset qualities of his portfolio. The advantage of this approach is that we obtain the respondent's own perception of what qualities his portfolio affords him. Even if the respondent's conception of riskiness is erroneous, it is only necessary to know how the respondent describes his actual portfolio through his own assessment of it in determining whether or not he is satisfied. If, for instance, a respondent perceives a high degree of liquidity in his portfolio and is satisfied because this is his most important goal, then it makes no difference if he has misconstrued the meaning of liquidity, because we are only interested in knowing that he is satisfied and hence will not desire a change in the allocation and size of his savings.

Method of Analysis

For each respondent, find the proportion of his overall investment portfolio that is held in each of the following assets:

A. Insurance
B. Equity in home
C. U.S. government bonds
D. State and local government bonds
E. Corporate bonds
F. Savings accounts in banks, savings and loan associations, or credit unions
G. Common stock
H. Preferred stock
I. Other real estate (besides home)
J. A business (other than real estate or a professional practice)
A hypothetical respondent's portfolio may consist of 10% A, 20% B, 25% C, 5% F, 30% G and 10% H. Questions 4, 5, and 6 of the interview schedule ask the respondent what investments are owned for current return, capital appreciation and liquidity respectively. By weighting the investments he gives in answer to these questions, his perception of how his portfolio ranks with regard to the three characteristics mentioned above is determined. For example, let:

x = current return in question 4
y = capital appreciation in question 5
z = liquidity in question 6

Suppose his answers were:

.30 G, .10 H, .25 C for x
.30 G, .10 H for y
.05 F for z

Assuming that the respondent gives importance to a particular investment in proportion to its share of his total portfolio, the importance of qualities x, y, and z can be weighted in the same proportions as the assets to which the respondent assigns these qualities. Summing the proportions of the assets named in response to each question gives the weight of importance the respondent gives each quality x, y, and z. Adding the weights gives the following asset quality ranking: .65 x, .40 y, .05 z. This represents the respondent's assessment of the ranking of asset qualities of his actual portfolio as he perceives them.

In question 14 of the schedule the respondent is asked which asset qualities he feels are most important, second most important, and least important. The problem here is that the respondent is offered 5 qualities to choose from, and his first and second choices may not always include qualities x, y, and z. However, where they do, it is possible to compare his first and second choices with the first and second ranked qualities his portfolio actually offers him. By comparing these two categories we can determine whether or not a respondent has reached his portfolio qualitative goals.
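The weighting described above can be sketched in a few lines. This is only an illustration using the hypothetical respondent from the text; the function names and the stated preferences standing in for question 14 are assumptions made for the example, not part of the interview schedule.

```python
# Portfolio shares of the hypothetical respondent (asset letter -> share).
portfolio = {"A": 0.10, "B": 0.20, "C": 0.25, "F": 0.05, "G": 0.30, "H": 0.10}

# Assets the respondent names in answer to questions 4, 5, and 6.
answers = {
    "x": ["G", "H", "C"],  # current return (question 4)
    "y": ["G", "H"],       # capital appreciation (question 5)
    "z": ["F"],            # liquidity (question 6)
}


def quality_weights(portfolio, answers):
    """Sum the portfolio shares of the assets named for each quality."""
    return {
        quality: round(sum(portfolio[asset] for asset in assets), 2)
        for quality, assets in answers.items()
    }


def top_two(weights):
    """Return the two qualities with the largest weights, largest first."""
    return sorted(weights, key=weights.get, reverse=True)[:2]


weights = quality_weights(portfolio, answers)
actual_ranking = top_two(weights)

# Hypothetical first and second choices from question 14 (assumed here).
desired_ranking = ["x", "y"]
goals_reached = actual_ranking == desired_ranking

print(weights, actual_ranking, goals_reached)
```

A respondent whose first two stated choices fall outside x, y, and z cannot be compared this way and would have to be set aside before the comparison is made.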
Thus where the ranking of the first two categories of actual portfolio qualities equals the ranking of the first two categories of the desired portfolio, the respondent has reached his goals (RG). Where the ranking differs, the respondent has not reached his goals (NRG). A third grouping of non-comparable qualities would be a residual. In aggregating over all respondents, or over particular homogeneous groups of respondents, we can obtain the proportions and numbers of those who have reached their investment goals and presumably will not desire a change in their present portfolio position. Of those who have not reached their goals, it can be stated that they would change the distribution and perhaps the size of their portfolio if their constraints permitted them. Of the NRG group, three subgroups could be computed showing those who considered x, y, and z respectively to be their most desired investment quality. This would show the direction in which the dissatisfied investors would move their funds if they were given the opportunity. Tests of this analysis could be conducted by observing the actual movements in the components of savings of Wisconsin residents for an appropriate period immediately following the interview period.

The obvious limitation of this analysis is that it does not show the magnitude of dissatisfaction of the NRG group.
Another limitation is that our study does not reveal the priorities of consumption in relation to savings, and it may not show other constraints which limit the flexibility of the respondent in his allocation of funds for alternative investments.

http://www.ssc.wisc.edu/wais/WAIS645023.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645023.txt

Gene Moyer, 1965. Table Specifications for the Social Security "Covered Earnings" Project. July 28, 1965. WAIS paper 656-007. Social Security Tables

Gene Moyer
WAIS 656-007
July 28, 1965
Revised August 4, 1965

Table Specifications for the Social Security "Covered Earnings" Project

Some months ago, W.L. Buckler of the Social Security Administration gave to Martin David a set of specifications for tables the SSA desired. The tables Buckler designed looked deceptively simple because his eight tables were really a long series of three (or more) dimensional tables. The following pages represent an attempt to transform Buckler's multidimensional tables into control card specifications for Wistab tables. While Wistab has certain limitations to its usefulness for multivariate problems, the attractiveness and readability of its tables make it the appropriate program to use as often as possible. A copy of Buckler's original table specifications is available in Room 353 if anyone wishes to check them.

I. General Considerations

A. Negatives in WAIS variables will have a blank in the first column of the field as well as a negative sign over the units position of the field.

B. In order that all years can be run with the same decks of control cards, 1959 intervals have been combined with 1955-1958 intervals, except for tables 7 and 8, where considerations of core size seem to make changing cards necessary.

C. Tables 7 and 8 (S.S. table numbers) use fewer brackets on 805 earnings than tables 2-6, but they ask for column variables divided according to whether the taxpayer had wages only or wages and self-employment (SE) income.
There are no negatives in SSA variables, so WAIS should denote 805 earnings for those persons who have wages only (coded 2 in 76) as blank in the highest order position (column 52) and for those persons who have wages and SE income (coded 1 or 3 in 76) as 0 in column 52. The positions from 53 up to the first digit of the amount should be filled in with zeros. The 805 brackets for tables 7 and 8, then, will be (lower bound only):

   1955-1958: b000000, b004200, b004201, 0000000, 0004200, 0004201, TOT (7 intervals)
   1959:      b000000, b004800, b004801, 0000000, 0004800, 0004801, TOT (7 intervals)

D. Brackets on 805 earnings (53-59) are (preceded by zeros):
   1. Tables 2-4: 0, 1, 600, 1200, 1800, 2400, 3000, 3600, 4200, 4201, 4800, 4801, TOT (13 brackets)
   2. Tables 5-6: 1, 400, 600, 1200, 1800, 2400, 3000, 3600, 4200, 4201, 4800, 4801, TOT (13 brackets)

E. Brackets on all WAIS variables are (preceded by zeros, except for blank):
   bbbbbbb, 1, 400, 600, 1200, 1800, 2400, 3000, 3600, 4200, 4201, 4800, 4801, 5400, 6000, 6600, 7200, 7800, 8400, 9000, 9600, 10,200, 10,800, 11,400, 12,000, 12,600, 13,200, 13,800, 14,400, 15,000, TOT (31 intervals including TOT)
   (Note: Negative values are included in the interval from blank to $1.)

F. Each year should be put on a separate reel so that reading time is kept to a minimum. When all the runs are complete, the reels can be combined so that only one reel is stored.

G. A note on Wistab cards and their uses.
   1. *XTAB is the first card in each run. It gives the name of the table, the size of the counter and the specifications of the input data.
   2. *C eliminates all records with values greater than (G), less than (L), unequal to (U), or equal to (E) the value of the variable given in the card. It eliminates those records in all succeeding tables specified in the run.
   3. *L acts as *C except that it only eliminates records in a single succeeding table.
   4. *A adds the values of the variable specified in the card and places these sums in each cell of the table.
*C, *L, and *A must be placed immediately before the *X card.
   5. *X specifies the column variable of the table. It is limited to 13 intervals including the total.
   6. *Y specifies the row variable of the table.
   7. *Z specifies the page variable of the table.
   8. *END indicates the end of the run.
   9. The maximum size of any run is given by

      $\sum_i [(C)(X)(Y)(Z)]_i < 28{,}000$

      where C = counter size, X = number of intervals in the *X card, Y = number of intervals in the *Y card, Z = number of intervals in the *Z card, all for the ith table in the run.

H. The method of computing each table, 1-8. (Note: each year will be run separately)
   1. Population: all workers in class
      Column: 805 earnings
      Row: WAIS earnings
      Page: sex
      Entry: frequency, amount of 805 earnings
   2. Population: all workers in class
      Column: sex
      Row: WAIS earnings
      Entry: amount of wages, amount of SE income, amount of total earnings
   3. Repeat 1 and 2 for wage (or SE) only workers
   4. Repeat 1 and 2 for wage and SE workers
   Table 2 only
   5. Population: all workers in class
      Column: 805 earnings
      Row: WAIS earnings
      Page: age
      Entry: frequency, amount of 805 earnings
   6. Population: all workers in class
      Row: WAIS earnings
      Entry: amount of wages, amount of SE income, amount of total earnings

II.
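The run-size rule in I.G.9 can be checked mechanically before a deck is punched. The sketch below is only an illustration: the function names are invented for the example, and it assumes that a table with no *Z card counts as one page, which is consistent with the figure of 837 (9 x 3 x 31) given for the sex-by-WAIS-earnings tables in the run specifications. The table dimensions are those listed for Run 2.

```python
RUN_LIMIT = 28_000  # ceiling on the summed table sizes in one run


def table_size(counter, x, y, z=1):
    """Size of one table: counter size times the X, Y, and Z interval counts."""
    return counter * x * y * z


def run_size(tables):
    """Sum the sizes of all tables requested in a single run."""
    return sum(table_size(**table) for table in tables)


# Run 2: counter size 9; 805 earnings (13) by WAIS earnings (31) by sex (3),
# followed by amount tables, three of which use sex (3) by WAIS earnings (31).
run_2 = [
    dict(counter=9, x=13, y=31, z=3),  # frequencies
    dict(counter=9, x=13, y=31, z=3),  # *A amount of 805 earnings
    dict(counter=9, x=3, y=31),        # *A amount of WAIS earnings
    dict(counter=9, x=3, y=31),        # *A amount of WAIS wages
    dict(counter=9, x=3, y=31),        # *A amount of WAIS SE income
]

total = run_size(run_2)
print(total, total < RUN_LIMIT)
```

The total agrees with the 24,273 shown for Run 2, so that run fits within the limit.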
Extract Record # 02 (Extract for running Social Security Earnings in the "covered earnings" project) Data Number of New Master File Columns Columns Columns 1- 8 9-10 11-12 13 2-9 Identification number 408-409 Year of birth 1011 Year of record 0 - male 1 - female Age bracket-, 0 if age is not ascertained (blank) I if (11-11) less (9-10) < 25 2 if (11-12) less (9-10) 25, but:< 49 3 if (11-12) less (9-10) >; 50, but :5 664 8 2 2 4 if (11-12) less (9-10); 65 14 1 1 Where (12-13) and (9-10) are the columns in this record which contain the year of the record and the year of birth of the taxpayer Type of WAIS earnings 16 1 381 1 - wages only 2 - wages and SE income 3 - SE income only Number of sources of wages and 17-23 7 28-34 salaries (0.1,2,3,4,5 or more) 37-43 Total WAIS wages and salaries 46-52 24-30 7 91-97 Profit and Loss from business 31-37 7 109-115 (blank in col. 24 for negatives) Partnership income 38-44 7 (blank in col. 31 for negatives) Total self-employment income 91-97 (blank in col. 38 for negatives) 7109-11 45-51 7 28-34 37-43 Total (SSA definition) WAIS earnings 46-52 I (blank in col. 45 for negatives) 91-97 109-115) Columns Number of New Master File Data Columns Columns 52-58 7 416-424 Social Security (805) earnings (blank in 52 if 76 - 2n 0 in 52 if 76 - I or 3) 59 1 - Blank 60-63 4 437-440 Quarterly wage coverage pattern 64 1 441 Agricultural coverage 65 1 436 Self-employment coverage 66-74 9 399-407 Social Security account number 75 1 410 Race 76 1 - 0 if 60-65 are blank or Type of 845 1 if 60-64 are NNNNO 77 1 442 earnings 2 if 65 = 0 3 otherwise (Record Mark) This record will be constructed for each taxpayer in the masterfile who filed a return with earnings in any of the years 1955-1959. Each taxpayer will have, then, a maximum of five such records. * SE only - 1; Wage Only - 2; Wage and SE = 3. III. 
Table and Run Specifications Run # Wistab Table I # Intervals in card Data Table & Run intervals XTAB ter size = 9 C Age VA (ED in 14 C 805 earnings N& CEO in 76) X Y Type of 805 earnings (l, 2,3,TOT in 76) 4 Type of WAIS earnings (15) 4 I Sex-age (13-14) 9 1296 A 805 earnings (53-58) 4 X Y as above 4 Z 9 1296 Table 1 (continue) Wistab # # Intervals in Run # Card Data Intervals Table (from last page) *A WAIS earnings (45-51) *X *Y as above *Z *A WAIS wages (17-23) *X *Y as above *X *Z *A WAIS SE income (38-44) *Y as above *Z 1296 4 4 9 1296 4 4 9 7296 4 4 9 1296 6480 Table 2 Wistab # # Intervals in Run # Card Data Intervals Table & Run 2 *XTAB Counter size = 9 *X 805 earnings (53.58) 13 *Y WAIS earnings (45-51) 31 *Z Sex (13) 3 10,881 *A Amount of 805 earnings (53-58) 10,881 *X as above *Y *Z *A Amount of WAIS earnings (45-51) 3 *X Sex (13) *Y WAIS earnings (45-51) 31 837 * A Amount of WAIS wage, (17-23) 3 *X as above *Y 31 837 * A Amount of WAIS SE income (38-44) 3 *X *Y as above 31 837 * END 24,273 Table 2 Run # Wistab Data # # Intervals in Card Intervals Table & Run 3 *XTAB Counter size = 9 *X 805 earnings (52-58) 13 *Y WAIS earnings (45-51) 31 *Z Age (14) 6 *END 21,762 4 *XTAB Counter size = 9 *A Amount of 805 earnings (52-58) *X as run # 3 *Y *Z *A Amount of WAIS earnings (45-51) 21,762 *X Age (14) 6 *Y WAIS earnings (45-51) 31 *A Amount of WAIS wages (17-23) 1944 *X as above 6 *Y 31 *A Amount of WAIS SF income (3844) 1944 *X as above 6 *Y 31 1944 * END 27,594 Table 3 Run Wistab Data Intervals # intervals in Card Table & Run *XTAB Counter size - 9 *C Self-employment only and not 13 *C ascertained cases (L1 in 76) *X WAIS self-employment only (E3 in 15) 805 earnings (53-59) *Y WAIS earnings (45-51) 31 * Z Sex (13) 3 10,881 * A 805 earnings (53-58) 13 * X * Y as above 31 * Z 3 10,881 *A WAIS earnings (45-51) 3 *X Sex (13) *Y WAIS earnings (45-51) 31 837 *A WAIS wages (17-23) 3 *X as above 31 *Y 837 *A WAIS SE income (38-44) 3 *X as above *Y 31 837 24,273 6 
Repeat Run 3 adding a * C (before the first * X) (U I in 15) 7 Repeat Run 5 adding a * C (before the first * X) (U 2 in 15) Table 4 Wistab # # Intervals in Run # Card Data Intervals Table & Run Counter size = 9 SE only., wages and SE, and in 805 earnings (U 2 in 76) SE only in WAIS records (E 3 in 15) 13 805 earnings (53-58) WAIS earnings (45-51) 31 Sex (13) 3 10,881 805 earnings (53-58) 13 as above 31 3 10,881 WAIS earnings (45-51) 3 Sex (13) WAIS earnings (45-51) 31 837 WAIS wages (17-23) 3 as above 31 837 WAIS SE income (38-44) 3 as above 31 837 24273 9 Repeat 8 adding *C (E 2 in 15) before first * X card 10 Repeat 8 adding *C (E 1 in 15) before first * X card Run # Wistab Table 5 # Intervals in 11 Card # Table & Run Data Intervals * XTAB Counter size 9 *C Wage only and NA in 805 earnings (E 2 in 76) *C *X * Wage only in WAIS earnings (E 1 in 15) 13 805 earnings (53-53) *Y WAIS earnings (45-51) 31 *Z Sex (13) 3 10,801 * A 805 earnings (53-52) 13 * X *Y as above 31 *Z 3 10,881. *A HIS earnings (45-51) 3 *X Sex (13) *Y WAIS earnings (45-51) 31 837 * A WAIS wages (17-23) 3 *X *Y as above 31 837 *A WAIS SE income (38-44) 3 *X *Y as above 31 837 * END 24273 12 Repeat 11 adding * C (E 2 in 15) before the first * X card 13 Repeat 11 adding * C (E 3 in 15) before the first * X card Table 6 Wistab # # Intervals in Run Card Data Intervals Table & Run 14 * XTAB Counter size = 9 *C Wage only, wage and SE, and NA in 13 *C 805 data (U I in 76) *X Wage only in WAIS data (E I in 15) 805 earnings (53-58) *Y WAIS earnings (45-51) 31 *Z Sex (13) 3 10,831 *A 805 earnings (53-58) 13 *X *Y as Above 31 *Z 3 10,881 *A WAIS earnings (45-51) 3 * X Sex (13) * Y WAIS earnings (45-51) 31 837 *A WAIS wages (17-23) 3 *X *Y as above 31 837 *A WAIS SE income (38-44) 3 *X *Y as above 31 837 * END 24,273 15 Repeat 14 changing the record *C to (U 3 in 15) before the first * X card 16 Repeat 14 changing the record * C to (U 2 in 15) before the first * X card Run # Table 7 # Intervals in 17 Wistab 
Table & Run Card Data Intervals * XTAB Counter size = 9 * C SE only workers and NA in 805 data (L 1 in 76) * C * X Wage amounts (U 1 in 16) 7 805 earnings (52-58) * Y WAIS earnings (45-51) 31 * Z Sex (13) 3 5859 *A 805 earnings (52-58) 7 * X *Y as above 31 *Z 3 5859 *A WAIS earnings (45-51) 3 *X Sex (13) *Y WAIS earnings (45-51) 31 837 *A WAIS wages (17-23) 3 * Y as above 31 837 * A WAIS SE income (38-44) 3 * X *Y as above 31 837 * END 14,229 18 Repeat 17 adding * C (E 2 in 15) before the first * X card 19 Repeat 17 adding * C (E 1 in 15) before the first * X card 20 21 Repeat 17, 18, 19 substituting * C (L 2 in 16) for the * C (U I in 16 card) 22 Table 8 Run # Wistab # # Intervals in Card Data Intervals Table & Run 23 * XTAB Counter size = 9 *C SE and NA workers in 805 data (L 2 in 76) *C SE only in WAIS data ( 3 in 15) *X 805 earnings (52-58) 7 Type of WAIS earnings (15)(1,2,TOT) 3 * X Sex (13) 3 *A 805 earnings (52-58) 567 *X *Y as above 3 *Z 3 *A WAIS earnings (45-51) 567 *X Sex (13) 3 *Y Type of WAIS earnings 3 * A WAIS wages (17-23) 81 * X as above 3 *Y 3 *A WAIS SE income (38-44) 81 as above 3 81 Page total 1377 Table 8 (continued) Wistab # Intervals in Run # card Data Intervals Table & Run 23 (from last page) 1377 (*from SE only in WAIS earnings (U 1 in 15) L Number of wage amounts in WAIS data (U 1 in 76 X 805 earnings (52-58) 7 y WAIS earnings (45-51) 31 Z Sex (13) 3 L 5859 Number of wage amounts in WAIS data (U 1 in 16) A 805 earnings (52-58) 7 x f as above 31 Z 3 L Number of wage amounts in WAIS data (0 1 in 1:b) 5859 * A WAIS wages (17-23) X Sec (13.) 
3 y WAIS earnings 31 837 * C Number of wage amounts in WAIS data (U 2 in 16) X 7 y 31 *Z 3 5859 A 7 X Y as above 31 Z 3 5859 A 3 X y 31 837 * END 27,287 Table 8 (continued) Run # Wistab # # Intervals Card Data Intervals Table & Run 24 XTAB Counter size = 9 0 SE and NA workers in 805 data (L 2 in 76) * 0 SE in WAIS data (U 1 in 15) L Wage amounts (U 3 in 16) *X 805 earnings (52-58) 7 *y WAIS earnings (45-51) 31 *Z Sex (13) 3 5859 A 805 earnings (52-58 X 7 * y as above 31 *Z 3 5859 * A WAIS wages (17-23) * X Sex (13) 3 * WAIS earnings (45-51) 31 837 Repeat above substituting * L (U 4 in 16) for the above * L card 12,555 * END 25,110 Table 8 (continued) Run # Wistab # # Intervals in Card Intervals Table & Run Data 25 * XTAB Counter size = 9 * L SE only and NA workers in 805 data (L 2 in 76) * L SE in WAIS data (U I in 15) * L Wage amounts (L 5 in 16) * X 805 earnings (52-58) 7 * Y WAIS earnings (4551) 31 * Z Sex (13) 3 5859 A 805 earnings (52-58) *X 7 * Y as above 31 *Z 3 5859 * A WAIS wages (17-23) * X Sex (13) 3 * Y WAIS earnings (45-51) 31 837 C All except wages and SE in 805 data (U 3 in 76) * C SE only and wages only in WAIS data (U 2 in 15) *C Wage amounts (U 1 in 16) * g 805 earnings (52-58) 7 *Y WAIS earnings (45-51) 31 * Z Sex (13) 3 5859 * A 805 earnings (52-58) X 7 * Y as above 31 *Z 3 5859 A WAIS earnings (45-51) * X Sex (13) 3 * Y WAIS earnings (45-51) 31 837 Wistab Run # Card Data 2S (from last page) 18 # # Intervals is Intervals Table & Run 837 3 31 837 3 31 837 26,784 Table 8 (continued) A WAIS wages (17-23) X Y as above A WAIS SE income (38-44) X y as above 19 Table 8 (continued) Run # Wistab # # Intervals in a ,Card Data Intervals Table & Run 26 * XTAB Counter size 9 * C All except wages and SE in 805 data (U in 76) * C SE only and wages only in WAIS data (U 2 in 15) L Wage amounts (U 2 in 16) X 805 earnings (52-58) 7 Y WAIS earnings (52-58) 31 * 2 Sex (13) 3 5859 A 805 earnings (52-58) 7 *8 * y as above 31 *' 3 5859 A WAIS earnings (45-51) 3 * X Sex 
(13) * Y WAIS earnings (45-51) 31 837 *A WAIS wages (17-23) 3 *g *Y as above 31 837 *A WAIS SE income (38-443, 3 *X *y as above 31 837 * C Wage amounts (U 3 in 16) 7 R Y 31 *2 3 5859 *A as above 7 *R *y 31 *2 3 5859 *A 3 *g **Y 31 837 25,110 Wistab Run # Card Data 26 (from last page) Table 8 (continued) 20 # # Intervals in Intervals Table & Run 25,110 3 31 as above 837 3 31 837 26,784 27 Repeat Run 26 changing * L wage mats (0 2 in 16) to (U 4 in 16) and * C wage its (U 3 in 16) to (U 5 in 16) 21 SS Wistab # # Intervals in Run Count Cards Data Place B I directly after the AB card in run #1 SE income (U 1 in 15) Wages (17-23) (TOT only) Wage only (E 1 in 15) Profit or loss from business (38-44) (b000000, 0000000, 0000001, TOT) Partnership income (31-37) (b000000, 0000000, 0000001, TOT) Type of WAIS earnings (2,3,TOT)(15) SE income (38-44) (bbbbbbb, 400, TOT) WAIS SE income (38-44) (L 0000001) Wages (17-23) (0, 4200, 4201, 4800, 4801, TOT) SE income (38-44) (1, TOT) Intervals Table & Run 1 9 4 3 432 3 27 6 3 172 Wages (El in 15) Partnership income (L 0000001 in 31-37) P & L from business (G 0000000 in 24-30) SE income (38-44) (BBB's 0000000, TOT) (The blank interval (negatives) will contain the cases of Partnership income (positive) < loss from business) 3 (A can be calculated by s wing the totals of BI and the table including B2, B3, and El.) 27 667 22 Draft - 3/18/65 W.L. Buckler:bjc Appendix A: Counts to be obtained from the file of Wisconsin individual income tax returns For each year separately (1955-59): A. Total number of records B. Number of records with earnings 1/ for the year 1. with wages only 2. with SE income only (a) of profit from business only (b) of loss from business only (c) of partnership income only (d) of both profit from business and partnership income (e) of both Loss from business and partnership income 3. 
with both wages and SE income
      (a) with SE income of profit from business only
      (b) with SE income of loss from business only
      (c) with SE income of partnership income only
      (d) with SE income of both profit from business and partnership income
      (e) with SE income of both loss from business and partnership income
C. Number of records with no wages and SE income of < $400
D. Number of records with wages of > $4,200 (for 1955-58) or > $4,800 (for 1959) with SE income of $1 or more
E. Number of records with loss from business
   1. with no partnership income
   2. with partnership income < loss from business

1/ Consider only wages or salaries, profit or loss from business and partnership income in determining earnings.

Tabulation requirements from file of Wisconsin tax returns and SSA data

Definitions of terms

I. Wisconsin Data (from Wisconsin tax returns)
   A. Type of Employment (on yearly basis)
      1. Wage only - having at least one wage or salary amount shown, but having no profit or loss from business or partnership income.
      2. SE (self-employment) only - having a profit or loss from business or partnership income shown, but having no wage or salary amount shown.
      3. Wage and SE - having at least one wage or salary amount and also having a profit or loss from business or partnership income.
   B. Earnings (on yearly basis)
      1. Total earnings - the sum of all wage or salary amounts, profits from business and partnership income minus any losses from business.
      2. Wages - the sum of all wage or salary amounts.
      3. SE income - the sum of profits from business and partnership income minus any losses from business.

II. SSA Data (from SSA earnings records)
   A. Type of Employment (on yearly basis)
      1. Wage only - having earnings from wage and salary employment reported and no self-employment earnings that were taxable for social security purposes.
      2.
SE only - having earnings from self-employment that were taxable for social security purposes and no earnings from wage and salary employment reported.
      3. Wage and SE - having both earnings from wage and salary employment reported and earnings from self-employment that were taxable for social security purposes.
   B. Taxable earnings (on yearly basis)
      The earnings amounts shown in the SSA earnings record are those that are taxable for social security purposes only.
      1. For wage only workers the earnings shown will be taxable wages.
      2. For SE only workers the earnings shown will be the taxable SE income.
      3. For wage and SE workers the earnings shown will be a combination of the reported wages and salaries and that portion of the SE earnings that were taxable for social security purposes.

Draft - 3/19/63 W.L. Buckler:bjc

Logic for determining wages, SE income and earnings from Wisconsin records

A. Total wages = largest wage or salary + second wage or salary + all other sources, wage or salary
B. 1) SE income = partnership income + profit from business
   or
   2) SE income = partnership income - loss from business
C. Total earnings = Total wages + SE income

Draft - 3/18/65 1 W.L.
Buckler:bjc

Logic for determining type of employment from SSA earnings records

For years in which earnings are shown, determine the type of employment from the pattern of quarterly wage QC and the number of SE or AGQC in the earnings record as follows:

   Quarterly wage pattern    SE QC     AGQC      Type of employment
   O/T NNNN                  0         ANY       Wage only
   NNNN                      O/T 0     0         SE only
   O/T NNNN                  O/T 0     ANY       Wage & SE
   NNNN                      O/T 0     O/T 0     Wage & SE

http://www.ssc.wisc.edu/WAIS656007.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656007.txt

Gene Moyer, 1965. A Comparison of Earnings Reported on the Income Tax with that Reported to the Social Security Administration. March 18, 1965. WAIS paper 645-041. Analysis Social Security

Gene Moyer
WAIS 645-041
March 18, 1965
Draft

A Comparison of Earnings Reported on the Income Tax with that Reported to the Social Security Administration

The Social Security Administration has asked us for some indication of the amount of income in covered occupations which is never reported to the Social Security Administration because the recipient has earned more than the limit of Social Security coverage. The subset of years which Mrs. Merriam requested contains 1957, 1958, and 1959. For those years the Form 805 tape contains the total earnings of the account holder and the individual quarters in which he worked in a covered occupation. These quarters of coverage have the following codes (from a letter of 11-23-1964 from Ira Rifkind):

   C - wages of $50 or more were reported for the quarter
   N - no wages reported for the quarter
   [ ] - wages of less than $50 reported or a minus adjustment made to wages previously recorded

These are in positions 514-517, 550-553, and 584-587 on the Form 805 record during the three years under consideration.
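Reading the quarters-of-coverage fields could be sketched as follows. This is a minimal illustration under two assumptions that the letter quoted above does not settle: that the three position ranges correspond to 1957, 1958, and 1959 in that order, and that the code shown as [ ] is a blank character on the tape. The function names are invented for the example.

```python
# Meaning of each quarter-of-coverage code; " " stands for the [ ] code
# (assumed here to be a blank on the tape).
QUARTER_CODES = {
    "C": "wages of $50 or more reported",
    "N": "no wages reported",
    " ": "wages under $50 reported, or a minus adjustment to earlier wages",
}

# 1-based, inclusive positions on the Form 805 record.
# The pairing of ranges with years is an assumption (taken in order).
QUARTER_POSITIONS = {1957: (514, 517), 1958: (550, 553), 1959: (584, 587)}


def quarters_field(record, year):
    """Cut the four quarter codes for one year out of a Form 805 record."""
    start, end = QUARTER_POSITIONS[year]
    return record[start - 1:end]


def decode_quarters(field):
    """Translate each of the four codes into its description."""
    return [QUARTER_CODES[code] for code in field]


def covered_quarters(field):
    """Count quarters with reported wages of $50 or more."""
    return field.count("C")


# A made-up record: blanks everywhere except the 1957 quarters.
record = " " * 513 + "CCNC" + " " * 83
field_1957 = quarters_field(record, 1957)
print(field_1957, covered_quarters(field_1957))
```

Keeping the positions in one table means a change in the tape layout touches only `QUARTER_POSITIONS`.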
In addition, the record contains a position of one digit which indicates the self-employment quarters of coverage with the following code:

   0 - no creditable self-employment
   4 - creditable self-employment of $400 or more
   [ ] - other than "0" or "4". If earnings are maximum, it means creditable self-employment of $1-$399. If earnings are not maximum, this code should be ignored.

Using these two fields, the earnings fields for each year, and the total wages and salaries reported by the taxpayer on the income tax returns in the master file, we should be able to provide the following relevant data to the Social Security Administration:

I. Tables

Mrs. Merriam asked that we give them tables showing earnings reported to the Social Security Administration run against the total wages and salaries reported on the tax form. Persons with self-employment should be deleted from this computation. (In case our records and the 805 record do not agree on whether the person in question is self-employed, the designation on the 805 will determine whether or not the person is treated as self-employed.) The following are proposed as intervals on the column variable, earnings reported to the Social Security Administration (SSE): (SSE = 0, 0 < SSE < $500, $500 < SSE < $1000, $1000 < SSE < $1500, $1500 < SSE < $2000, $2000 < SSE < $2500, $2500 < SSE < $3000, $3000 < SSE < $3500, $3500 < SSE < $4000, $4000

First Summary Report, Consistency Check Error Tabulations. April 1, 1965. WAIS paper 645-044. Consistency of Data

Roger F. Miller
WAIS 645-044
April 1, 1965

First Summary Report, Consistency Check Error Tabulations*

The consistency run looked for seven types of within year discrepancies in the tax return data on the WAIS Master File.** These discrepancies were recorded in locations B392-B398 of the Master File Tape Record. The final digits 2-8 will serve to identify the types of discrepancies in the rest of this report, according to the following table:
Table I: Types of within year data discrepancies checked in the consistency run, and codes.

   Code   Type of Discrepancy
   2      Totalling of Income Sources
   3      Subtraction of Automobile Expenses
   4      Subtraction of Standard Deduction
   5      Totalling of Deductions before Federal Tax and Donations
   6      Subtraction of Deductions before Federal Tax and Donations
   7      Subtraction of Federal Tax Deductible
   8      Subtraction of Donations

A total of 20,328 persons were in the Master File. For any person for whom an error was found, an indication of the type and magnitude of the discrepancy was printed. The tabulations summarized in this report were made from an extract having the frequency with which persons' records contained each type of discrepancy, the total number of years the person filed, his maximum, minimum, and 1959 reported incomes, and some information on whether or not he itemized his medical-dental expenses. This extract did not contain information about the magnitudes of the errors.

*All programming of the consistency check runs on the WAIS Master File to date was performed by James Geffert.
**Other types of checks were included in the program, but are not reported here.

Tables II-IV are solely concerned with the discrepancies found in the records. Their purpose is to explore the magnitude of the correction task should we attempt it in any part. Table II displays on the principal diagonal the "marginal percentages": the percentage of all persons having an error of the kind indicated at some time or other. The range of these, from 3.7% to 19.6%, tells us little about the total number of returns in error, or about the number of persons making errors of any kind. Indeed, all we know is that the total number of persons having errors in their tape records lies somewhere between 19.6% and 61.3% of the 20,328 persons in the sample (61.3% is the sum of the marginal percentages; see Table IV).
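The two limits just stated follow directly from the marginal percentages: the share of persons with any error can be no smaller than the largest single marginal (all other errors nested inside it) and no larger than their sum (no person with two kinds of error). A small sketch, using the diagonal of Table II as given in this report; the variable names are illustrative only:

```python
# Marginal percentages by error code (the principal diagonal of Table II).
marginals = {2: 19.6, 3: 3.7, 4: 4.9, 5: 11.9, 6: 7.7, 7: 6.9, 8: 6.6}

persons = 20_328  # persons in the Master File

# Lower limit: every other error type falls on persons who also have the
# most common error. Upper limit: no person has more than one error type.
lower_pct = max(marginals.values())
upper_pct = round(sum(marginals.values()), 1)

# The same limits expressed as approximate counts of persons.
lower_persons = round(persons * lower_pct / 100)
upper_persons = round(persons * upper_pct / 100)

print(lower_pct, upper_pct, lower_persons, upper_persons)
```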
Above the principal diagonal, Table II presents the percentages of all sample persons in the pairwise error intersections: 1.65% of the 20,328 persons have errors of both of the kinds coded 2 and 3. Below the principal diagonal, Table II displays the percentages of all sample persons in the pairwise error unions: 21.7% of the persons have either a 2-error or a 3-error in their record (or both). The maximum of these union percentages is 27.2% (2-error union 5-error), so at least this percentage have errors of some kind. Thus knowledge of the bivariate distributions is helpful in limiting the total range, raising the minimum from 19.6% to 27.2%. In the absence of any more-than-bivariate distributions, can the range be limited further?

Table II: Pairwise Intersection and Union Error Percentages.
(Above the diagonal: intersection percentages. On the diagonal: marginal percentages. Below the diagonal: union percentages.)

   Code      2      3      4      5      6      7      8
   2      19.6   1.65   1.89   4.33   3.00   2.52   2.58
   3      21.7    3.7   0.37   0.97   0.85   0.55   0.64
   4      22.6    8.2    4.9   1.13   0.82   0.50   0.63
   5      27.2   14.6   15.6   11.9   3.31   2.57   2.48
   6      24.3   10.5   11.7   16.2    7.7   2.71   1.71
   7      24.0   10.0   11.3   16.2   11.8    6.9   2.49
   8      23.7    9.6   10.8   16.0   12.5   11.0    6.6

Table III looks at the same data as appears in Table II from two different angles. We wish to know the degree to which the errors are correlated: does one error indicate another? If certain persons were extremely "error prone", it is possible that no more than 27.2% of the persons would have errors in their records. Above the principal diagonal, Table III presents the intersection percentages as percentages of the union percentages: of the 21.7% of persons having either a 2-error or a 3-error, 7.6% had both types of errors (1.65 / 21.7 = 7.6%). The pattern of variation is somewhat different from that displayed in the intersections themselves (compare the 2-error-8-error and 5-error-8-error percentages in the two tables). Generally, the intersections seem to be only moderate percentages of the unions.
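The union and ratio figures can be reproduced from the marginal and intersection percentages, since a pairwise union is the sum of the two marginals less their intersection. The sketch below is illustrative only and is restricted to the pairs involving code 2 for brevity; the helper name `r1` is invented, and because the marginals are themselves rounded, a computed entry may differ from the printed table in the last digit.

```python
from decimal import Decimal, ROUND_HALF_UP


def r1(value):
    """Round a Decimal to one place, halves rounding up."""
    return value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# Marginal percentages (diagonal of Table II).
marginal = {code: Decimal(pct) for code, pct in {
    2: "19.6", 3: "3.7", 4: "4.9", 5: "11.9", 6: "7.7", 7: "6.9", 8: "6.6",
}.items()}

# Intersection percentages for the pairs involving code 2 (first row).
intersection = {pair: Decimal(pct) for pair, pct in {
    (2, 3): "1.65", (2, 4): "1.89", (2, 5): "4.33",
    (2, 6): "3.00", (2, 7): "2.52", (2, 8): "2.58",
}.items()}

# Union = marginal of the first + marginal of the second - intersection.
union = {
    (a, b): r1(marginal[a] + marginal[b] - both)
    for (a, b), both in intersection.items()
}

# Intersection as a percent of the (rounded) union.
share = {pair: r1(100 * intersection[pair] / union[pair]) for pair in union}

# The largest pairwise union is a floor on the share of persons with any error.
floor_pct = max(union.values())

print(union[(2, 3)], share[(2, 3)], floor_pct)
```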
This is reinforced by looking at the percentages below the diagonal in Table III: they show the percentage of all persons making just one of the two types of indicated errors (20.0% of the 20,328 persons made either a 2-error or a 3-error, but not both kinds of errors: 21.7 - 1.65 = 20.0 after rounding). These numbers tend to be substantially higher than those above the diagonal in Table II, indicating weakness of correlation among errors.

Table III: Error Intersections as Percent of Total Errors and Unions Less Intersections.
(Above the diagonal: Intersection as a percent of all errors, both kinds. Below the diagonal: Exclusive Union Percentages.)

Code     2      3      4      5      6      7      8
2        -     7.6    8.4   15.9   12.0   10.5   10.9
3      20.0     -     4.6    6.6    8.1    5.5    6.6
4      20.7    7.8     -     7.2    7.0    4.4    5.8
5      22.8   13.6   14.5     -    20.6   15.8   15.5
6      21.3    9.7   10.9   12.9     -    22.9   13.6
7      21.5    9.5   10.8   13.6    9.1     -    22.7
8      21.1    9.0   10.2   13.5   10.8    8.5     -

The more detailed tables underlying Tables II and III reinforce this conclusion: Persons whose records have one type of error more than once or twice are generally unlikely to have any one type of other error present. However, the bivariate tables cannot tell us directly how likely it is that a person with one kind of error also has some other type of error in his records.

In the absence of the full multivariate error distribution, we can further narrow the range of total persons with errors somewhere in their records. The reasoning which allows us to do this is involved, and is aided by deriving Table IV from Table II. Column (2) in Table IV presents new information: the maximum frequencies with which persons have errors of each kind. Naturally, the largest of these, 10, does not prevent a person from having errors in each of 15 years if he commits errors of more than one type. For example, the detailed tables show 2 persons each having a 2-error in seven or eight years and a 6-error in seven or eight years. One of these persons only filed in seven or eight of our 15 years, but the other filed in at least 13 of our years.
The latter could, by having no more than one type of error in each year, have 14 or 15 years of erroneous tape returns. Both persons are rare in having committed more than one type of error, each kind many times.

Table IV: Selected Error Statistics.

(1)     (2)         (3)        (4)            (5)            (6)      (7)
Error   Maximum #   Marginal   Sum of         MAXIMUM        Persons with only this error,
Type    of years    % of all   Intersection   Intersection   if error correlations are at
Code    Committed   Persons                                  the + & - extremes
                                                             Max %    Min %
2       10          19.6       16.0           4.3            15.3      3.6
3        4           3.7        5.0           1.7             2.0     -1.3
4        6           4.9        5.3           1.9             3.0     -0.4
5        8          11.9       14.8           4.3             7.6     -2.9
6        8           7.7       12.4           3.3             4.4     -4.7
7        8           6.9       11.3           2.7             4.2     -4.4
8        6           6.6       10.5           2.6             4.0     -3.9
Total    -          61.3       75.3          20.8            40.5    -14.0

Column (3) of Table IV merely repeats the diagonal from Table II. Column (4) gives the sum, for each error type, of all the intersection percentages in which it is involved. Thus, 16.0 = 1.65 + 1.89 + 4.33 + 3.00 + 2.52 + 2.58 (after rounding) and 5.0 = 1.65 + 0.37 + 0.97 + 0.85 + 0.55 + 0.64. Notice that the 1.65% intersection of 2 with 3-error was included in the 3-error sum as well as the 2-error sum: each entry above the principal diagonal in Table II is added into two sums in Column (4) of Table IV. Thus, if we were to add up all the above diagonal elements in Table II, the sum would be 37.7%, or just half of the 75.4% total of IV(4).

Column (5) of Table IV gives the maximum bivariate intersection percentage for each error type. Columns (6) and (7) are derived by subtracting from Column (3) the entries in columns (5) and (4) respectively. Thus, 15.3 (= 19.6 - 4.3) is the largest possible percentage of persons having only errors of type 2 (and no other types at all), while 3.6 (= 19.6 - 16.0) is the smallest possible percentage of persons having only 2-errors.
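The derivation of columns (4) through (7) can be sketched as follows. This is an illustrative reconstruction, not the program used for the report: the names are hypothetical, and the inputs are the marginal and intersection percentages printed in Table II. Because the inputs carry two decimals, the results match the printed Table IV only after rounding to one decimal.

```python
# Marginal percentages (Table II diagonal) and pairwise intersection
# percentages (Table II, above the diagonal), keyed by discrepancy code.
MARGINAL = {2: 19.6, 3: 3.7, 4: 4.9, 5: 11.9, 6: 7.7, 7: 6.9, 8: 6.6}
INTERSECTION = {
    (2, 3): 1.65, (2, 4): 1.89, (2, 5): 4.33, (2, 6): 3.00, (2, 7): 2.52, (2, 8): 2.58,
    (3, 4): 0.37, (3, 5): 0.97, (3, 6): 0.85, (3, 7): 0.55, (3, 8): 0.64,
    (4, 5): 1.13, (4, 6): 0.82, (4, 7): 0.50, (4, 8): 0.63,
    (5, 6): 3.31, (5, 7): 2.57, (5, 8): 2.48,
    (6, 7): 2.71, (6, 8): 1.71,
    (7, 8): 2.49,
}


def table_iv_row(code):
    """Return columns (4), (5), (6), (7) of Table IV for one error code."""
    shared = [pct for pair, pct in INTERSECTION.items() if code in pair]
    col4 = sum(shared)            # sum of intersections involving this code
    col5 = max(shared)            # largest single intersection
    col6 = MARGINAL[code] - col5  # max % of persons with only this error
    col7 = MARGINAL[code] - col4  # min % of persons with only this error
    return col4, col5, col6, col7


for code in sorted(MARGINAL):
    col4, col5, col6, col7 = table_iv_row(code)
    # A negative column (7) means the true minimum is zero.
    print(code, f"{col4:.2f} {col5:.2f} {col6:.2f} {col7:.2f} {max(0.0, col7):.2f}")
```

Each intersection enters two of the column (4) sums, which is why the column total is twice the sum of the above-diagonal entries.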
The negative numbers in Column (7) for 3-errors through 8-errors imply that the true smallest possible numbers are all zeros.

http://www.ssc.wisc.edu/wais/WAIS645044.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645044.txt

Ron Durant. 1965. Format of SSRI WAIS Tax Extract #1 File. April 28, 1965. WAIS paper 645-057. Extract 01 Formats.

Ron Durant
WAIS Working Paper 645-057
April 28, 1965. 1st Revision

To: WAIS Staff: Bridges, David, Groves, Lampman, Miller, Barger, Bauman, Geffert, Moyer, Ryshpan, Roubal, Seavey, Wiegner.
From: Ron Durant
Document: Format of SSRI WAIS TAX EXTRACT #1 FILE
          Format of SSRI WAIS TAX EXTRACT #1A FILE

I. The following record will be extracted for all individual return filers. The tape file will consist of two reels -- 3 records per block. All amount fields will be DOLLARS ONLY. (EXTRACT #1 FILE)

Record Position   Data
1-8               WAIS Identification Number
9-10              Year
11                Sex  0 = Male  1 = Female
12                *Marital Status
13-14             Age
15                Race
16-17             Number of dependents
18-19             Occupation
20                Return filed in previous year or reason why not.
21-22             County of Residence (Current Year)
23                City Designation (Current Year)
24-25             County of Residence (Prior Year)
26-32             AGI (Adjusted Gross Income)
33-39             NTI (Net Taxable Income)
40-46             W & S (Wages and Salaries)
47-53             Total Dividends Received
54-60             Gain or Loss on Sale of Assets
61-67             Self-employment Income [Includes Profit Loss from Business & Partnership Income].
68-74             Total Interest Received
75-81             Total Rent Received
82-88             Income from Trustees or Fiduciaries
89-90             Year of Birth
91                Was taxpayer married during the year  0 = no marriage  1 = marriage no "detail"  2 = marriage with "details"
92                Record Mark ( )

* See Gene Moyer, The Coding of the Wisconsin State Tax Forms (1946-60) WAIS Paper 645-038 (1st Revision) p. 19, for an explanation of the marital status code.

II. In addition the following record will be extracted for all individual return filers. The tape file will consist of two reels -- 3 records per block.
All amount fields will be DOLLARS ONLY. (EXTRACT #1A FILE)

Record Position   Data
All Individuals:
1-91              Same as EXTRACT #1 FILE
Husbands & Wives:
92-93             "99" MINUS Current Year
94-95             Position 7-8 of Identification Number
All Others:
92-93             Position 7-8 of Identification Number
94-95             "99" MINUS Current Year
All Individuals:
96                Record Mark ( )

http://www.ssc.wisc.edu/wais/WAIS645057.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645057.txt

Ron Durant. 1965. Report on the Second Updating of the WAIS Master File. June 2, 1965. WAIS paper 645-068. Master File - Tax Records.

Ron Durant
WAIS Paper 645-068
June 2, 1965 report

To: WAIS Staff: Barger, Bridges, David, Groves, Lampman, Miller, Moyer, Bauman, Geffert, Ryshpan, Roubal, Wiegner
From: Ron Durant
Document: Report on The Second Updating of the WAIS Master File

I. Tax - 03 (Edit & Master Updating Run)

(Flow chart labels: TAPES SSRI 334, 335 & 336; TAPES SSRI 291, 168 & 307; MASTER OUT FILE; Resubmitted data; TAPE SSRI #151; EDIT LISTING; MASTER IN FILE; TAX-03; Recycled master rcds; TAPE SSRI #297; EDITS; NEW COMPLETE MASTER FILE.)

Resubmitted Data In Records       3283
Master Records In               134962
Edit Records Rejected              307
Master Records Dropped             505
Master Record Out               133981
Recycled Master Records Out        674

II. MERGE:

(Flow chart labels: MASTER IN FILE; Sorted Recycled Master Rcds; TAPE SSRI #259; TAPES SSRI 291, 168 & 307; TAPES SSRI 324, 325 & 326; FINAL MASTER OUT; Listing of Duplicate Recycled Master & Accepted Master.)

Master Records In                            133,981
Recycled Master Records In                       674
Recycled Duplicate Master Records Dropped         68
Final Merged Master Records Out              134,587

III.
Edit Listings and cards from Tax - 03 and MERGE have been handed over to the keypunching group for processing.

http://www.ssc.wisc.edu/wais/WAIS645068.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645068.txt

Ron Durant. 1965. Report on the Running of Extract-01 Which Created Extract 1 and 1A Files as Outlined in WAIS Working Paper 645-057, April 28, 1965 (1st Revision). June 9, 1965. WAIS paper 645-070. Extract 01 Programs.

Ron Durant
WAIS Paper 645-070
June 9, 1965 Report

To: WAIS Staff: Bridges, David, Groves, Lampman, Miller, Barger, Bauman, Geffert, Moyer, Roubal, Ryshpan, Wiegner, Cassidy, Weininger
From: Ron Durant
Document: Report on the running of EXTRACT-01 which created EXTRACT 1 and 1A Files as outlined in WAIS Working Paper 645-057, April 28, 1965 (1st Revision).

(Flow chart labels: EXTRACT-01; (1) SSRI 363 (2) SSRI 362 (3) SSRI 361 (4) SSRI 360; Consistency Master File; EXTRACT-01; LISTING OF REJECTED RECORDS (Rejected in Extract) and 1A Files ONLY; EXTRACT-1 (1) SSRI 128 (2) SSRI 359; EXTRACT-1A (1) SSRI 259 (2) SSRI 301; MASTER FILE (including Social Security Information - Format in WAIS 645-056 Revised 6/7/65) (1) SSRI 291 (2) SSRI 168 (3) SSRI 307 (4) SSRI 297 (5) SSRI 341.)

Master Records In                    134,587
Social Security Records In            14,836
EXTRACT-1 Records Out                130,025
EXTRACT-1A Records Out               130,025
Master Records Out                   134,587
Form Type - 6 Records Rejected         4,509
Assessment Records Rejected               53

II. IBM 1410 Computer Running Time - 3 hours.

http://www.ssc.wisc.edu/wais/WAIS645070.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645070.txt

James Geffert. 1965. Proposed History File Format. November 18, 1965. WAIS paper 656-031. Formats History File.

Jim Geffert
WAIS 656-031
November 18, 1965

Document - Proposed History File Format

This paper sets forth a format for our proposed HISTORY file on the persons in the WAIS tax sample. The purpose of this file is to record in a concise way whether information for given individuals is present in the several files of data comprising the WAIS data.
It will be used to edit our data for errors and to aid in the extraction of records for persons who appear in one or more files. The HISTORY file will be created in several stages. Information from each file will be added to it in turn. This paper is not exhaustive, and comments and suggestions are welcome.

History File Format

Position
1-8       WAIS ID number
9-17      Social Security Number
18        Presence in Master File 1946
19        Presence in Master File 1947
20        Presence in Master File 1948
.         . . .
37        Presence in Master File 1965
38-39     First year record appears
40-41     Last year record appears
42-43     Total years records appear
44        Presence in FFID file
45-46     Last year of information
47        Presence in revised 805 file 1946
48        Presence in revised 805 file 1947
49        Presence in revised 805 file 1948
.         . . .
66        Presence in revised 805 file 1965
67-68     First year of information
69-70     Last year of information
71-72     Date 805 record received
73        Presence in interview file
74        Selected for interview, no response (reason)
75        Presence in benefit file 1946
76        Presence in benefit file 1947
77        Presence in benefit file 1948
.         . . .
94        Presence in benefit file 1965
95-96     First year of information
97-98     Last year of information
99        Presence of death record
100-101   Year of death
102       Presence in Property Income file 1946
103       Presence in Property Income file 1947
.         . . .
121       Presence in Property Income file 1965
122-123   First year of information
124-125   Last year of information
126-127   Total years of information
128       Indication of Multiple ID #
129       Indication of Multiple SS #
130-?     Provision for other information

http://www.ssc.wisc.edu/wais/WAIS656031.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656031.txt

Gene Moyer. 1965. Card Format for Multiple Social Security Number Cases. April 8, 1965. WAIS paper 645-045. Formats Social Security.

Gene Moyer
WAIS 645-045
April 8, 1965.

Card Format for Multiple Social Security Number Cases

Card 0 (LINE 1)
Columns  No. of Columns  Variable #  Variable Name
1-11     11              1           Primary S.S.
number
12       1               2           Card number (0)
13       1               3           Multiple account number indication
14-18    5               -           Blank
19-24    6               4           First 6 letters of surname
25       1               -           Blank
26       1               5           Indication that name on record does not agree with name on finder card
27-30    4               -           Blank
31-35    5               6           Month and year of birth (and dash between)
36-37    2               -           Blank
38       1               7           Race indication
39       1               -           Blank
40-46    7               8           Sex
47-55    9               -           Blank
56-63    8               9           Wisconsin identification number
64       1               -           Blank
65-71    7               10          Job Number
72-80    9               -           Blank

Card 1 (LINE 2)
Columns  No. of Columns  Variable #  Variable Name
1-11     11              1           Primary S.S. number
12       1               2           Card number (1)
13-17    5               11          Indication of railroad activity
18-20    3               -           Blank
21-22    2               12          Newly posted credit earnings item
23       1               -           Blank
24-25    2               13          Indication that microfilm showed earnings that were not included in total earnings on form 805.
26       1               -           Blank
27-28    2               14          Active earnings discrepancy
29-30    2               -           Blank
31-35    5               15          Account in benefit status other than disability
36       1               -           Blank
37-40    4               16          Benefit status other than disability was terminated
41-43    3               -           Blank
44-46    3               17          Account in disability benefit status or disability freeze status
47       1               -           Blank
48-51    4               18          Disability status was terminated
52-53    2               -           Blank
54-57    4               19          Credit indication
58-59    2               -           Blank
60-64    5               20          Earnings statement issued in year indicated
65-66    2               -           Blank
67-69    3               21          Indication of self-employment activity
70-72    3               22          Indication of a delinquent self-employment item
73-74    2               -           Blank
75-76    2               23          Indication of agricultural activity
77-80    4               -           Blank

Card 2 (LINE 4)
Columns  No.
of Columns  Variable #  Variable Name
1-11     11              1           Primary SS Number
12       1               2           Card number (2)
13-17    5               24          Description of data (TOTAL)
18-19    2               -           Blank
20-28    9               25          Earnings, 1937 to date
29-30    2               -           Blank
31-32    2               26          Wage quarters of coverage, 1947 to date
33-35    3               -           Blank
36-37    2               27          Self-employment quarters of coverage, 1951 to date
38       1               -           Blank
39-40    2               28          Agricultural quarters of coverage, 1955 to date
41-48    9               -           Blank
49-53    5               29          Description of data (51bTD)
54-55    2               -           Blank
56-64    9               30          Earnings, 1951 to date
65-66    2               -           Blank
67-68    2               31          Wage quarters of coverage, 1951 to date
69-71    3               -           Blank
72-73    2               32          Self-employment quarters of coverage, 1951 to date
74       1               -           Blank
75-76    2               33          Agricultural quarters of coverage, 1955 to date
77-80    4               -           Blank

Card 3 (LINE 5)
Columns  No. of Columns  Variable #  Variable Name
1-11     11              1           Primary S.S. Number
12       1               2           Card Number (3)
13-14    2               34          Description of data (51)
15-17    3               -           Blank
18-25    8               35          1951 earnings 5/
26-33    8               -           Blank
34       1               36          1951 self-employment quarters of coverage
35-36    2               -           Blank
37       1               37          Not applicable (-)
38-48    11              -           Blank
49-50    2               38          Description of data (52)
51-53    3               -           Blank
54-61    8               39          1952 earnings 5/
62-69    8               -           Blank
70       1               40          1952 self-employment quarters of coverage
71-72    2               -           Blank
73       1               41          Not applicable (-)
74-80    7               -           Blank

Card 4 (LINE 6)
Columns  No. of Columns  Variable #  Variable Name
1-11     11              1           Primary S.S. number
12       1               2           Card number (4)
13-14    2               42          Description of data (53)
15-17    3               -           Blank
18-25    8               43          1953 earnings 5/
26-27    2               -           Blank
28-31    4               44          1953 quarterly wage quarters of coverage pattern
32-33    2               -           Blank
34       1               45          1953 self-employment quarters of coverage
35-36    2               -           Blank
37       1               46          Not applicable (-)
38-48    11              -           Blank
49-50    2               47          Description of data (54)
51-53    3               -           Blank
54-61    8               48          1954 earnings 5/
62-63    2               -           Blank
64-67    4               49          1954 quarterly wage quarters of coverage pattern
68-69    2               -           Blank
70       1               50          1954 self-employment quarters of coverage
71-72    2               -           Blank
73       1               51          Not applicable (-)
74-80    7               -           Blank

Card 5 (LINE 7)
Columns  No.
of Columns  Variable #  Variable Name
1-11     11              1           Primary S.S. Number
12       1               2           Card number (5)
13-14    2               52          Description of data (55)
15-17    3               -           Blank
18-25    8               53          1955 earnings 5/
26-27    2               -           Blank
28-31    4               54          1955 quarterly wage quarters of coverage pattern
32-33    2               -           Blank
34       1               55          1955 self-employment quarters of coverage
35-36    2               -           Blank
37       1               56          1955 agricultural quarters of coverage
38-48    11              -           Blank
49-50    2               57          Description of data (56)
51-53    3               -           Blank
54-61    8               58          1956 earnings 5/
62-63    2               -           Blank
64-67    4               59          1956 quarterly wage quarters of coverage pattern
68-69    2               -           Blank
70       1               60          1956 self-employment quarters of coverage
71-72    2               -           Blank
73       1               61          1956 agricultural quarters of coverage
74-80    7               -           Blank

Card 6 (LINE 8)
Columns  No. of Columns  Variable #  Variable Name
1-11     11              1           Primary S.S. number
12       1               2           Card number (6)
13-14    2               62          Description of data (57)
15-17    3               -           Blank
18-25    8               63          1957 earnings 5/
26-27    2               -           Blank
28-31    4               64          1957 quarterly wage quarters of coverage pattern
32-33    2               -           Blank
34       1               65          1957 self-employment quarters of coverage
35-36    2               -           Blank
37       1               66          1957 agricultural quarters of coverage
38-48    11              -           Blank
49-50    2               67          Description of data (58)
51-53    3               -           Blank
54-61    8               68          1958 earnings 5/
62-63    2               -           Blank
64-67    4               69          1958 quarterly wage quarters of coverage pattern
68-69    2               -           Blank
70       1               70          1958 self-employment quarters of coverage
71-72    2               -           Blank
73       1               71          1958 agricultural quarters of coverage
74-80    7               -           Blank

Card 7 (LINE 9)
Columns  No. of Columns  Variable #  Variable Name
1-11     11              1           Primary S.S.
number
12       1               2           Card number (7)
13-14    2               72          Description of data (59)
15-17    3               -           Blank
18-25    8               73          1959 earnings 5/
26-27    2               -           Blank
28-31    4               74          1959 quarterly wage quarters of coverage pattern
32-33    2               -           Blank
34       1               75          1959 self-employment quarters of coverage
35-36    2               -           Blank
37       1               76          1959 agricultural quarters of coverage
38-48    11              -           Blank
49-50    2               77          Description of Data (60)
51-53    3               -           Blank
54-61    8               78          1960 earnings 5/
62-63    2               -           Blank
64-67    4               79          1960 quarterly wage quarters of coverage pattern
68-69    2               -           Blank
70       1               80          1960 self-employment quarters of coverage
71-72    2               -           Blank
73       1               81          1960 agricultural quarters of coverage
74-80    7               -           Blank

Card 8 (LINE 10)
Columns  No. of Columns  Variable #  Variable Name
1-11     11              1           Primary S.S. number
12       1               2           Card number (8)
13-14    2               82          Description of data (61)
15-17    3               -           Blank
18-25    8               83          1961 earnings 5/
26-27    2               -           Blank
28-31    4               84          1961 quarterly wage quarters of coverage pattern
32-33    2               -           Blank
34       1               85          1961 self-employment quarters of coverage
35-36    2               -           Blank
37       1               86          1961 agricultural quarters of coverage
38-48    11              -           Blank
49-50    2               87          Description of data (62)
51-53    3               -           Blank
54-61    8               88          1962 earnings 5/
62-63    2               -           Blank
64-67    4               89          1962 quarterly wage quarters of coverage pattern
68-69    2               -           Blank
70       1               90          1962 self-employment quarters of coverage
71-72    2               -           Blank
73       1               91          1962 agricultural quarters of coverage
74-80    7               -           Blank

Card 9 (LINE 11)
Columns  No. of Columns  Variable #  Variable Name
1-11     11              1           Primary S.S. Number
12       1               2           Card number (9)
13-14    2               92          Description of data (63)
15-17    3               -           Blank
18-25    8               93          1963 earnings 5/
26-27    2               -           Blank
28-31    4               94          1963 quarterly wage quarters of coverage pattern
32-33    2               -           Blank
34       1               95          1963 self-employment quarters of coverage
35-36    2               -           Blank
37       1               96          1963 agricultural quarters of coverage
38-48    11              97          First secondary S.S. number
49-59    11              98          Second secondary S.S. number
60-80    21              -           Blank

Card 11 (CLAIMS CASES ONLY)
Columns  No. of Columns  Variable #  Variable Name
1-11     11              1           Primary S.S.
number
12-73    69              -           Blank
73-79    7               99          "MUL CLM"
80       1               100         "7"

http://www.ssc.wisc.edu/wais/WAIS645045.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645045.txt

Martin David. 1965. Notes on the Use of Wisconsin Assets and Incomes Data in Studies of Retirement. February 23, 1965. WAIS paper 645-037. Proposals - For Analyses, Theses, etc.

Martin David
WAIS Paper 645-037
February 23, 1965.

NOTES ON THE USE OF WISCONSIN ASSETS AND INCOMES DATA IN STUDIES OF RETIREMENT

1. A file of tax records including reports on some 18,000 individuals, a file of form 805 wage earnings records on 14,000 of these individuals, and an interview survey of 1,300 heads of taxpaying units constitute the data currently available for analysis in the Wisconsin Assets and Incomes Study (WAIS). Income sources reported under the Wisconsin personal income tax in the years 1947-1959 are recorded in the tax record file. Age data and supplementary data on transfer incomes are recorded in the latter two files. Taken together these files make possible a detailed scrutiny of the factors related to (1) the date of retirement, (2) the change in sources of income at retirement, and (3) the relative role of private sources of pension income for persons who retired during the years for which income data are available.

On the basis of 1960 Census age data we estimate that about ten percent of our sample must have faced the decision to retire during the period for which we have collected data. For perhaps half that group we will be able to observe at least five years' income history following retirement. Supplementary investigations can be undertaken to determine the extent to which persons 55 and over maintain connections with the labor market after retirement from their major lifetime occupation by adopting new occupations, self-employed businesses and the like. Interview data will show the extent to which retirement is precipitated by unemployment or illness.
However, the small size of that group limits the extent to which definitive conclusions can be drawn from the interview sample. (Probably no more than 150 heads of taxpaying units that are interviewed will have retired during the period for which tax record data are available.)

2. The addition of data on social security benefits received by the individuals represented in the tax records will make it possible to obtain some crude estimates of the change in well-being of aged persons on retirement. The amount of OASDI benefits can be added to private sources of income to approximate current inflows of income. Such inflows can be studied to determine the extent to which income deficiencies result in dissaving that is then reflected in smaller interest and dividend payments over the course of the years.

http://www.ssc.wisc.edu/wais/WAIS645037.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645037.txt

Roger Miller. 1965. Functions Derived from 1964 Tax Averaging Law File No. 1. April 28, 1965. WAIS paper 645-052. Averaging Studies.

Roger F. Miller
WAIS Working Paper 645-052
April 28, 1965

Functions Derived from 1964 Tax Averaging Law File No. 1

(1) Definitions of symbols
(a) Subscripts:
(i) i => ith person (i=1,2,3) in the record
(ii) k => kth record (k=1, ..., ?)
(iii) t => tth year (t=1,2,3,4,5), 5 is 1958 (comp. yr.)
(iv) j => jth type of legal definition (j=1, ..., 4).
(b) Variables in Barger output (input to this program):
(i) Ekit = Number of exemptions
(ii) Mkit = Marital status
(iii) Mkit = "Value" of tth year's record for ith person
(iv) G+kit = Sum of necessarily positive sources of income
(v) Gkit = Adjusted Gross Income per Wisconsin
(vi) Nkit = Net Taxable Income per Wisconsin
(vii) Ckit = Capital Gains Net Income per Wisconsin
(viii) Tk = Record Type
(c) Variables created in this program:
(i) Akit = Allocated (Deductions + Personal Exemptions)
(ii) Bkitj = Basis
(iii) Dkit = Deductions Per Wisconsin
(iv) Fkitj = Federal Net Taxable Income
(v) Jkitj = Joint Adjusted Federal Net Taxable Income with Later Spouse
(vi) Kkitj = Joint Adjusted Federal Net Taxable Income with Earlier Spouse
(vii) Pkitj = Proportion of Aggregate Adjusted Gross Income with Later Spouse
(viii) Rkitj = Proportion of Aggregate Adjusted Gross Income with Earlier Spouse
(ix) Qkitj = Qualifying Parameter, Tth Record Type, jth Legal Definition
(x) Skitj = Separate Adjusted Federal Net Taxable Income from Later Spouse
(xi) Ukitj = Separate Adjusted Federal Net Taxable Income from Earlier Spouse
(xii) Xkitj = Y - aB for Y > B (Potentially averageable income excess)
(xiii) Ykitj = Allowable Federal Net Taxable Income
(xiv) Zkitj = yB - Y for Y < B (Potentially averageable income deficiency)
(d) Other symbols used:
(i) IDk# = WAIS Identification Number
(ii) Hkit = Husband
(iii) Wkit = Wife
(iv) aj = A multiplier parameter in computing Xkitj
(v) yj = A multiplier parameter in computing Zkitj
(e) Types of Legal Definition:
(i) j=1 => Non-negative net taxable income exclusive of capital gains.
(ii) j=2 => Non-negative net taxable income inclusive of capital gains.
(iii) j=3 => Any-signed net taxable income exclusive of capital gains.
(iv) j=4 => Any-signed net taxable income inclusive of capital gains.
(f) Groupings of Record Types:
(i) Group I => T = 1, 2
(ii) Group II => T = 3, 4
(iii) Group III => T = 5
(iv) Group IV => T = 6, 7
Groups III and IV are each treated (a) as joint filers and (b) as separate filers.

In what follows we will avoid the use of subscripts where they are not helpful in avoiding ambiguities. Often, instead, they will be indicated in section headings. Add B as a parameter, treated as 0 in first crosstabs.

(2) Determination of qualification status:
(a) General rule: The support test (subject to some qualifications which cannot concern us) says a person does not qualify unless for the computation year* and every base year the person contributed at least 50% of his expenditures from his own incomes, except in the case of a husband and wife who qualify if they jointly provide at least 50% of their joint support. The law appears considerably more complicated than this, but for the most part the complications are either picayune and apply to very few cases, or else they exist in almost self-cancelling pairs (such as in the case of the non-self-supporting girl who marries during the 5 years a man who supplies more than 75% of their computation year joint income).

*Note: The law does not appear to specify the support test for the computation year, probably because it is only concerned with averaging of upward fluctuations; we include the computation year because our tabulations will be concerned with fluctuations in both directions.

If a couple desires to file jointly and also elect averaging, both must qualify (but see the 75% exception rule above). In any event, we do not have data on the amount it takes to support anyone in any year, and must use ad hoc proxy rules.
(b) Special Ad Hoc Rules: General Form:
(i) Group I: a single person is deemed not to qualify unless he passes both of the following tests: (A) max ( ) > Q1 and (B) his "number of interpolated years" < Q2
(ii) Group II: a necessarily separate filing formerly married person does not qualify unless both: (A) same rule as in (i)(A) above and (B) the sum of the number of interpolated years for himself and his wives during the period < Q3
(iii) Group III(a): an H wishing to file jointly with a W (not having had other W's): they don't qualify unless both: (A) max ( ) > Q4 and (B) same rule as in (ii)(B) above
(iv) Group IV(a): H wishes to file jointly with present W (he had another W during the period): they do not qualify unless both: (A) max ( ) > Q5 and (B) same sum as in (ii)(B) above < Q6
(v) Groups III(b) and IV(b): married couples wishing to file separately in computation year: each person having a computation year record is "on his own" and must meet the tests in (ii)(A) and (B) above.
(c) Specialization: Suggested values of Q's: Q1 = $3,000, Q2 = 3, Q3 = 4, Q4 = $5,000, Q5 = $6,000, Q6 = 5.
(d) We will refer to the above tests as A-tests and B-tests. While a person in any group has to pass both the A-test and the B-test, we want in our tabulations counts (by N classes) of those who fail the A-test only, fail the B-test only, or fail both tests.

(3) Treatment of Group I qualifiers:
(a) General considerations: no spouse ever is relevant, the person never had a chance to file a joint return in a base year and cannot in the computation year. No reconstructions are necessary. Compute Ft = Nt - 600 Et for t = 1,...,5.
(b) For j = 1 legal definition: Record F5 = N5 - 600 E5 for t = 5, for tabulating purposes.
Find Bt = max (B ; [Ft - Ct]), t = 1,...,5, where B = 0 in this case,
and B = sum of Bt over t = 1,...,4.
Let Y = B5.
If Y > B compute X = Y - aB
If Y < B compute Z = yB - Y
(c) For j = 2: As in (b) above except that Bt = max (B ; Ft).
(d) For j = 3: As in (b) above except B = -(infinity).
(e) For j = 4: As in (c) above except B = -(infinity).

(4) Treatment of group II qualifiers:
(a) General considerations: these separate filers are considered to have filed jointly for the years they were married, and must "reconstruct" for those years. These are the only years in which anything differs from (3) above, and it is the definition of Bt that is modified. Only these modifications are presented here. In what follows, let i = 1 represent the filer (present in t = 5) and i = 2 represent the filer's base year spouse (no matter how many spouses may have intervened), for the tth base year.
(b) For ALL j = 1, 2, 3, 4:
Define D = G1 + G2 - N1 - N2, E = max (E1, E2)
Compute P1 = G1 / (G1 + G2)
Define A1 = 0               if P1 < 0.15
       A1 = P1 (D + 600E)   if 0.15 < P1 < 0.85
       A1 = (D + 600E)      if P1 > 0.85
(c) For j = 1: Define
S1 = G1 - A1 - C1
J1 = N1 + N2 - C1 - C2 - 600E
Then Bt = max (B ; S1 ; 1/2 J1) where B = 0
(d) For j = 2: as in (c) above except
S1 = G1 - A1
J1 = N1 + N2 - 600E
(e) For j = 3: as in (c) above except B = -(infinity).
(f) For j = 4: as in (d) above except B = -(infinity).
(g) The modifications in (b) - (f) above only serve to redefine Bt for certain base years (those in which the filer is actually married). All else proceeds precisely as in (3) above.

(5) Treatment of group III qualifiers:
(a) General considerations: Because of our "gap plugging" operation, reconstruction is possible and necessary for every year, whether or not the couple was actually married in every year, and whether or not they desire to file jointly in the computation year. However, the method of reconstruction differs slightly according to whether or not they were married and whether or not they desire to file jointly. This reconstruction also takes place for the computation year (in which they are, of course, married). In general the reconstructions follow the same rules as in (4) above with some modifications. Only the modifications are presented here.
(b) The couple is treated as filing separately. Then for either one who qualifies (both may do so):
(i) For any year in which they were actually married, proceed precisely as in (4) above (separately for each).
(ii) For any year in which they were not married, proceed as in (4) above (separately for each) EXCEPT if Ci is negative for either person, it should not be subtracted (or added) in arriving at Si or Ji in the case where j = 1.
(c) The couple is treated as filing jointly (and so qualify):
(i) For any year in which they are married, including the computation year, redefine Bt = max (B ; J1) and proceed precisely as in (4) above.
(ii) For any year in which they are not married, the same exception as in (b)(ii) above applies.

(6) Treatment of Group IV qualifiers:
(a) General considerations: the only difference between group III and group IV is the existence of a prior spouse to whom one of the couple was married in a base year. Due to the nature of our data, this should only be true of the husband in our records as they stand (multihusband wives may be picked up when the cross-index file has been verified).
(b) If the couple is treated as filing separately in the computation year, the wife is treated precisely as in (5)(b)(ii) above. For the husband, however, we have to reconstruct his situation with his former wife, precisely as was done for unmarried but formerly married persons in (4) above, arriving at U1 (instead of S1) and K1 (instead of J1 -- S1 and J1 are the corresponding values arrived at by reconstructing with the current wife). Then we redefine Bt for this year in which he was married to his former spouse:
Bt = max (B ; U1 ; 1/2 J1 ; 1/2 K1)
Notice that S1 is not included in the argument! Also, there is another difference, in the computation of J1: the determination of "separate deductions" for the husband is determined by his income and deductions from his (presumed joint) record with his former wife.
If i = 3 denotes the earlier wife we get
D = G1 + G3 - N1 - N3
E = max (E1, E3)
R1 = G1 / (G1 + G3)
A1 = 0               if R1 < 0.15
A1 = R1 (D + 600E)   if 0.15 < R1 < 0.85
A1 = (D + 600E)      if R1 > 0.85
Then for j = 1
J1 = G1 - A1 - C1 + N2 - 600E2 - C2 (subject to the same exception as in (5)(b)(ii) above)
whereas
U1 = G1 - A1 - C1
and
K1 = N1 + N3 - C1 - C3 - 600E (compare (4)(c) above).
For j = 2, J1, U1, and K1 are the same as for j = 1 except that the Ci's are not subtracted (see (4)(d) above).
(c) If the couple is treated as filing jointly in the computation year they are treated precisely as in (5)(c) above except in the years in which the husband was married to his former wife. For those years, their joint base is
Bt = max (B ; U1 + S2 ; 1/2 K1 + S2)
where
S2 = N2 - 600E2 - C2   for j = 1
S2 = N2 - 600E2        for j = 2
and U1 and K1 are defined as in (b) above, subject to the same exception as (5)(b)(ii) with respect to subtraction of C1 and C2 (but not C3).

(7) General notes to the programmer
(a) If possible the program should allow easy modification beyond changes in the a, y, and Q parameters. In particular, it should be possible to specify different values of B besides 0 and -(infinity), and it should be possible to define Bt as min (...) rather than max (...). We should also be able to work with a 3 year base period.
(b) Be sure to allow for the alternative treatments of groups III and IV (separate vs joint filing) in a single production run.
(c) In processing groups II, III and IV be sure to catch for distinct treatment the years in which the taxpayer is single even though a subsequent spouse's record is present.
(d) The functions computed should be written onto tape for use in tabulations even if the computations are done in a driver program to the initial tabulations. For greatest flexibility the functions output should be integrated with the original input. You are free to specify the format.
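As an illustration of the flexibility asked for in (7)(a), the Group I computation of section (3) might be sketched as below. This is a hedged sketch, not the WAIS program: the function name, argument names, and dictionary keys are hypothetical; `alpha` and `gamma` stand for the multipliers written a and y in the text; `floor` stands for the lower bound B used inside max (...). The sketch also assumes that the basis is the plain sum of Bt over the four base years, which is only one reading of the paper, and it always uses a four-year base.

```python
EXEMPTION_DOLLARS = 600  # per-exemption allowance in Ft = Nt - 600 Et


def group1_functions(n, e, c, alpha, gamma,
                     floor=0.0, include_gains=False, pick=max):
    """Sketch of the section (3) functions for one Group I person.

    n, e, c: five yearly values (t = 1..5, t = 5 is the computation year)
    of net taxable income, exemptions, and capital gains net income.
    floor:   lower bound inside max(...); 0 for j = 1, 2 and -infinity
             for j = 3, 4.
    pick:    max by default; min may be passed, as (7)(a) requests.
    """
    if not (len(n) == len(e) == len(c) == 5):
        raise ValueError("expected five yearly values for n, e and c")
    f = [n_t - EXEMPTION_DOLLARS * e_t for n_t, e_t in zip(n, e)]
    # j = 1, 3 exclude capital gains; j = 2, 4 include them.
    income = [f_t if include_gains else f_t - c_t for f_t, c_t in zip(f, c)]
    b = [pick(floor, inc_t) for inc_t in income]
    basis = sum(b[:4])  # assumption: plain sum over the four base years
    y = b[4]            # computation-year value
    x = y - alpha * basis if y > basis else None   # potential excess
    z = gamma * basis - y if y < basis else None   # potential deficiency
    return {"F": f, "Bt": b, "B": basis, "Y": y, "X": x, "Z": z}


# The four legal definitions of (1)(e), expressed as keyword settings.
LEGAL_DEFINITIONS = {
    1: dict(floor=0.0, include_gains=False),
    2: dict(floor=0.0, include_gains=True),
    3: dict(floor=float("-inf"), include_gains=False),
    4: dict(floor=float("-inf"), include_gains=True),
}

example = group1_functions(
    n=[1600, 1600, 1600, 1600, 10600], e=[1] * 5, c=[0] * 5,
    alpha=1.0, gamma=1.0, **LEGAL_DEFINITIONS[1])
print(example["B"], example["Y"], example["X"])
```

Passing the legal definition as keyword settings keeps the four cases in one code path, so a new floor value or a min-based rule needs no new branch.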
(8) Final Remarks:
(a) The present program makes no direct provision for computation of the alternative tax relative to capital gains net income.
(b) The method of creating the input outlined in WAIS 645-050 may introduce biases, or may overcome biases (e.g., the plugging in of spouse's records may essentially simply use a second husband as a proxy for an unobtainable first husband).

http://www.ssc.wisc.edu/wais/WAIS645052.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645052.txt

Mike VonSchneidemesser 1967
Correcting and Updating the Various WAIS Files - an Appropriate Sequence of Steps
January 12, 1967
WAIS paper 667-021
4.656-033 Maintenance System - Files, Data, Etc.

Von Schneidemesser, Mike
WAIS 667-021
January 12, 1967

Correcting and Updating the Various WAIS Files - an Approximate Sequence of Steps

Step                                                                        Completion Est.
1. Corrections from
   (a) Multiple ID Checks (Esterly, Sahay)                                  completed
   (b) Filing of new returns (Aldrich)                                      January 29
2. List out master records which either seem to be missing, incomplete or inconsistent with other records in the WAIS files (Von Schneidemesser)    January 23
   Make appropriate correction cards for affected files (Aldrich, Esterly, Von Schneidemesser)    January 27
3. Run card-edit on all available FFID-UPDATE cards, make corrections (Esterly, Sahay)    January 27
4. Update FFID file - involves 2-3 machine runs, elimination of duplicate and rejected cards (Sahay, Von Schneidemesser)    February 10
5. Run card-edit on MA-UPDATE cards, make corrections (Von Schneidemesser)    January 31
6. Update Master file - involves 1-2 machine runs, elimination of duplicate and rejected cards (Sahay, Von Schneidemesser)    February 20
7. Using updated FFID file (4.) go through editing procedures of Richard Bauman's Benefit File Updating system (Duddeston)    February 20
8. Update Property File with cards coming from card edit and later file consistency (PROPNON) checks (Loniello, Mastronarde)    February 20
9. Make ID# changes on interview file (Lieberman)    February 20
10.
Match interview file with FFID file, clear up nonmatches by making necessary update cards, and repeat step 9 (Lieberman, Loniello)    February 28

At this stage all the WAIS tape files will be preliminarily updated. It will be desirable to make the necessary ID# changes on source documents of the Interview and Benefit files and possibly integrate the Benefit files and the interview cover sheets with the Master file folders.

Now follow some steps which will assure the correct correspondence between the records in the various files.

Step                                                                        Time Needed
11. Rerun programs KEY, BEN 23, FFKEY which will yield nonmatching FFID, Benefit and Master Records. Make update cards (Von Schneidemesser, Others)    2 Weeks
12. Recreate History File (Loniello)    2-4 Weeks
13. Run PROPNON which will yield nonmatching property file records. Make update cards (Loniello, Mastronarde)    1 Week
14. Repeat steps 3-6, 8.
15. Rerun various programs of the FFID-file maintenance system (WAIS 656-019) among which is "Select 805" which will create the list of nonmatching "Form 805" data. If this number is considerably reduced compared to the previous 114 records we may want to rerun the Ext.-01. (Von Schneidemesser)    1-4 Weeks
16. Repeat 12.

Note 1. Steps 4, 6, and 7 may result in a number of duplicate records, inconsistent and false changes, which will have to be corrected in a subsequent round of updates.
Note 2. The sequence of steps is irrelevant to some extent since (a) the matter of correction is an iterative process, and (b) there will always be some additional errors to be corrected, which were detected during the use of the files.

http://www.ssc.wisc.edu/wais/WAIS667021.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS667021.txt

Richard Bauman 1966
Summary and Timetable for Completion of Benefit Year Records and Social Security Data Records
November 10, 1966
WAIS paper 667-018
Benefit File Social Security

Richard A.
Bauman
WAIS 667-018
November 10, 1966

Summary and Timetable for Completion of Benefit Year Records and Social Security Data Records

We recently prepared a detailed outline of the necessary steps to be taken in preparing the final versions of the benefit data tapes (card-image benefit data tape and the benefit year record). Since the procedure is identical to that found in WAIS 667-001 and differs only in application, this paper will merely summarize the remaining processing steps and give estimated completion dates.

Description of Job                                                          Estimated Completion Date
1. Prepare card deck of benefit data not included in existing benefit year records.    11/18/66
2. Apply "single card edit" to 1, correct errors.    11/25/66
3. Prepare list of benefit data source documents not received from SSA.    11/14/66
4. Program and produce error lists resulting from match of FFID, logging cards, and the then up-to-date version of the benefit data tape, make corrections.    12/16/66
5. Apply pre-edit and card edit to result of 4, checking for inter-card inconsistencies not covered in 6 below, make corrections.    12/30/66
6. Rerun series of programs which produce computed benefit year records.    1/13/67
7. Merge with output of 6 all "special case" benefit year records.    1/20/67
8. Make necessary additions of selected fields to benefit year records.    1/26/67

The results of steps 5 and 8 should be a complete and up-to-date benefit data card image tape (step 5) and a complete record tape. The possibility of receipt of source documents from the SSA between now and that a different up-dating procedure be used. Due to the small number of such cases, I propose to have them hand-coded both as basic benefit card records and as benefit year records, hand-edited, and submitted directly to the output of steps 5 and 8.

Indicated time lags are primarily a function of the time necessary to get computer output or basic machine output for editors to work on.
An estimated total of 60 hours of clerical time and 20 hours of programming time is involved.

http://www.ssc.wisc.edu/wais/WAIS667018.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS667019.txt

Ron Durant 1965
Programming Systems Involved in the Creation and Updating of the WAIS Master Income File
April 28, 1965
WAIS paper 645-054
Master File- Tax Records Programs

Ron Durant
WAIS Paper 645-054
April 28, 1965
Draft

Programming Systems Involved In The Creation And Updating of the WAIS Master Income File

The WAIS Master Income File is a magnetic tape record file of a one percent sample of Wisconsin income tax returns for the years 1946-1960. A single master record consists of income and tax information for an individual in a given calendar year. Thus any individual can have from one to fifteen records depending on the number of years filed during the 1946-1960 period. The sequence of the WAIS Income File is identification number (i.e., individual) and year.

Programs which have the capabilities of editing, creating, updating and consistency checking this Master file have been written, tested and are currently in operation. It is the purpose of this paper to generally outline the programming systems involved in creating and updating this WAIS Master Income File.

All programs were written in IBM 1410 Autocoder language and run on an IBM 1410 (40K) Computer. The use of a character machine (IBM 1410) and a machine orientated language (IBM Autocoder) as opposed to a scientific type word machine and a "higher-order" computer language is essential to the efficient and economical performance of a data processing application.

In addition to the WAIS programs, extensive use was made of the various IBM Utility programs (i.e., Card-To-Tape, Tape-To-Card, etc.) and the recently developed IBM 1410/7010 Operating System (1410-PR-155) -- Generalized Tape Sorting and Merge Program which has resulted in increased sorting speed over previously existing canned sort programs.
The degree of speed of the IBM 1410/7010 Operating System Sort program can be readily determined by reference to the number of records processed and resulting sorting times for the WAIS Income data in WAIS Draft dated October 21, 1964.

I. Edit and Master Creation Run: (TAX-01)

After sorting the WAIS Detail Income Data to the sequence of the Master file (i.e., identification number-year) this Detail Data served as input to an edit and master creation run. An individual's year master record could consist of either two or four detail data records depending on the form type or year. The general procedure was to edit the data, determine the year and form type and build a master record accordingly, while checking for missing information. In addition, control totals were kept as to the number of records in and out of the run. The editing involved:
a) Checking for duplicate input records.
b) Checking for numeric or alphabetic characters in the proper columns of the input record.
c) Checking for compatible year, form type and record type.
d) Checking to determine if all the detail records were present in order to build a particular individual's master record.

All edited or rejected records were put out on tape and messages and rejected records were printed. The messages specified the reason for rejection and, in the case of illegal characters, indicated the column in the record where the first illegal character was found. Such a procedure kept to a minimum the time necessary for the correction and resubmission of the edits.

More detailed information as to the totals (number of records processed) and processing times of the sorts, merges and TAX-01 is contained in the previously cited WAIS Draft dated October 21, 1964.

II. Edit and Master Updating Run: (TAX-03)

Once the initial WAIS Master Income File is created, TAX-03 is used to perform the following operations on the master file:
a) Update existing master records.
b) Add new master records.
c) Delete existing master records.
d) Change identification numbers in existing master records.
e) Perform the same editing (as outlined in TAX-01) on the updating input.

After sorting the WAIS Detail Income Updating Data to the sequence of the Master file (i.e., identification number-year), this data serves as input to an edit and master updating run. The general procedure is to determine whether there is updating data for an input Master record and then to write out the new output Master record. If there is only updating data for an individual in a given year and no master input record, a new master record is written out on the basis of the updating data input.

Prior to the writing out of a master record, a table is searched to determine if there is an identification number change entry for the master output identification number. If so, the identification number is changed and the master record is written out on a separate recycled master file; otherwise, the master record is written on the normal master output file. The recycled master file is later sorted into the sequence of the master file and merged into the master file.

It is this TAX-03 program which can be used to add additional individuals and/or additional year data to the WAIS Master Income File.

More detailed information as to the control totals (number of records processed) and the processing time of TAX-03 is contained in WAIS Paper 645-028 dated February 1, 1965.

III. Correction of the WAIS Master Income Record: (TAX-07)

In the course of processing edits and making corrections to the WAIS Master Income File, it was determined that a program was needed which would allow the updating of a single field within a master record and which would also serve as a vehicle for readily adding additional types of information (such as Social Security, etc.) to the existing master record. TAX-07 was written to serve this purpose.
Each field in the master record is assigned a two digit entry code and detail cards are submitted specifying the pertinent entry code and the corrected field. The detail entries are sorted into the sequence of the Master file (identification number-year-entry code) and serve as input to TAX-07. In TAX-07 these entries are edited and applied to the appropriate master record and the corrected master is written out. The editing features of this program include checking the detail entries for legal entry codes, correct record format, entry code-field compatibility, and duplicate entries. As in the previous runs, edited records and messages are printed as output.

More detailed information as to the assignment of specific entry codes and the format of the WAIS Master Income record is contained in WAIS Paper 645-046 dated April 20, 1965.

IV. Consistency Check of the WAIS Master Income File

A separate paper will outline the features of the Consistency program, which checked the consistency of various fields within a WAIS Master Income record.

http://www.ssc.wisc.edu/wais/WAIS645054.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645054.txt

James Geffert 1965
WAIS Wisconsin Income Distribution
February 2, 1965
WAIS paper 645-030
Analysis

http://www.ssc.wisc.edu/wais/WAIS645030.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645030.txt

Some of the data in this document contains confidential information. The confidential information has been removed and preserved in the secure data archive (OLDR). Contact CDHA Data Archivist at cdhadata@ssc.wisc.edu for more information.

WAIS Wisconsin Income Distribution

Contents
1. General comments
2. Classification outline
3. Code explanations
4. Distributions

James Geffert
Original: 19 May 1964
Ditto: 2 February, 1965
WAIS 645-030

WAIS Wisconsin Income Distribution

1.
General Comments:

These distributions were derived from the file of "ALL GOOD" summary records prepared by the State of Wisconsin from individual income tax returns for the year 1962. They are subject to omissions due to the inability of the Assessor of Incomes to include returns which could not be precoded in a normal manner. No reliable figures are available on the number of returns omitted, but a horseback estimate based on experience with the quality of returns and data processing methods would be that corrections for omissions would increase the totals by between 5 and 10 percent.

These distributions involve only income from sources taxable in Wisconsin and should not be interpreted as if based on total personal income. Interest on federal obligations, some pension income, social security payments and the like are not included in gross income for Wisconsin income tax purposes. Note also that persons with incomes below $600 are not required to file returns.*

The tabulations included were prepared for a survey conducted by Professor Lutterman of the Department of Sociology and as background information for WAIS, both projects under the control of the Social Systems Research Institute. Because of the specific nature of these projects the tabulations appear in fragmented form.

----------------
* For detailed information see an instruction sheet for filling out 1962 returns or the Wisconsin Statutes, Chapter 71.
----------------

3. Income classifications

Code  Income
00    y < -$1 (none observed - included in 01)
01    -$1 < y < +$1
02    $1 < y < $1,000
03    $1,000 < y < $2,000
21    $19,000 < y < $20,000
22    $20,000 < y < $25,000
23    $25,000 < y < $50,000
24    $50,000 < y < $75,000
25    $75,000 < y < $100,000
26    $100,000 < y < $150,000
27    $150,000 < y < $200,000
28    y > $200,000

James Geffert 1965
Master File Format
April 27, 1965
WAIS paper 645-056
Master File- Tax Records

James Geffert
WAIS 645-056
April 27, 1965
Revised June 7, 1965.
MASTER FILE FORMAT

Position   Old Label  New Label  Item
1          M1         B1         1
2-9        M9         B9         ID #
10-11      M11        B11        Year of return
12-27      M27        B27        Data (coded)
28-36      M34        B36        Largest Wage
37-45      M41        B45        Second wage
46-54      M48        B54        Total other wages
55-63      M55        B63        Total interest received
64-72      M62        B72        Total dividends received
73-81      M69        B81        Rent
82-90      M76        B90        Gain or loss, assets
91-99      M83        B99        Profit or loss, business
100-108    M90        B108       Income from trustees
109-117    M97        B117       Partnership
118-126    M104       B126       Other income
127-135    M111       B135       Total of sources of income
136-144    M118       B144       Auto or business expense
145-153    M125       B153       Adjusted gross income
154-162    M132       B162       Standard deduction allowed
163-171    M139       B171       Net taxable income, standard deduction basis
172-180    M146       B180       Wisconsin tax paid
181-189    M153       B189       Union dues
190-198    M160       B198       Medical-dental expenses
199-207    M167       B207       Total interest paid
208-216    M174       B216       Business interest paid
217-225    M181       B225       Dividend deductible
226-234    M188       B234       Other deductions
235-243    M195       B243       Alimony paid
244-252    M202       B252       Forest crop land
253-261    M209       B261       Total ded. bef. F. tax and don.
262-270    M216       B270       Net income bef. F. tax and don.
271-279    M223       B279       Fed. tax and Soc. Sec. deduction
280-288    M230       B288       Net income bef. donations
289-297    M237       B297       Donations
298-306    M244       B306       Net tax. income itemized basis
307-315    M251       B315       Personal exemp. allowance
316-324    M258       B324       Net normal or total tax
325-333    M265       B333       First installment
334-342    M272       B342       Misc. information
343-351    M279       B351       Soc. Sec. received
352-360    M286       B360       Assessed taxable income
361-369    M293       B369       Total additional taxes
370-378    M306       B378       Tax.
income incomplete form
379        M245       B379       Block or column
380        M266       B380       Type of item in B342
381        M294       B381       # sources wage and salary
382        M295       B382       Form schedule, P & L
383        M296       B383       Stock dividend
384        M297       B384       Auto expense
385        M298       B385       Other enclosures
386        M299       B386       Spouse income 59 or 60
387        M307       B387       Type of form
388        M308       B388       Incomplete 1, complete 0
389                   B389       Medical-dental indicator
390                   B390       Federal tax indicator
391                   B391       Donation indicator
392                   B392       Income addition key M111
393                   B393       Compute auto key M125
394                   B394       Std. ded. inc. key M139
395                   B395       First phase deduction key M209
396                   B396       Net inc. before Fed. tax Key M216
397                   B397       Net inc. before donations Key M230
398                   B398       Net inc. item basis Key

Position   Old Label  New Label  Item
399-407    -          B407       Social Security Number
406-409    -          B409       Year of birth
410        -          B410       Race
411-415    -          B415       Account in benefit status other than disability
416-424    -          B424       Social Security earnings 51 to date
425-426    -          B426       Wage quarters of coverage 51 to date
427        -          B427       Claims indication
428-435    -          B435       Earnings for the year (SS)
436        -          B436       Self-employment quarters of coverage
437-440    -          B440       Quarterly wage coverage pattern
441        -          B441       Agricultural quarters of coverage
442        B400       B442       (Record mark)

http://www.ssc.wisc.edu/wais/WAIS645056.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645056.txt

Mike VonSchneidemesser 1967
Improving the 1946-60 Master File
April 7, 1967
WAIS paper 667-034
Master File- Tax Records

Mike VonSchneidemesser
WAIS 667-034
April 7, 1967

IMPROVING THE 1946-60 MASTER FILE TAPE

The following suggested actions on the Master File Tape are arranged as to their approximate benefit cost ratio. Benefits in improving the Master File Tape are of two kinds:
a. Improving the correspondence between the tape file and the tax returns -- make the tape represent as accurately as possible the information given on the returns.
b.
Filling in data for key items (AGI, NTI, etc.) not given, and for years where no tax returns were filed, by computing or interpolating these data from other items or returns.

Generally improvements of type (a) should be done before one works extensively on (b). Points 2, 3, 4, 6, 8, and 10 are primarily of nature (a) while points 7 and 11 are primarily of nature (b).

1. Eliminate "legitimate" Multiple ID Numbers (except those of divorced women who married back into the sample).
   Improvement: Longer series of returns for some individuals.
2. Find tax returns not on Master File Tape through methods explained in Robert Esterly's WAIS 667-033 "Note on Integration of Files".
   Improvements: Eliminate cases where a tax return or person is only in folder and not on tape or only on tape and not in folder.
3. Correct all records with a computation indicator (B392-B398) greater than 0 (or 5), that is, an inconsistency in amounts (greater than $100).
   Improvements: Elimination of transcription errors, omitted items, missing deduction indicators.
4. Try to microfilm again tax returns with missing information indicators (B388 = 1).
   Improvement: Will produce data for most of 203 returns previously having "MI" codes.
5. Compare all 4,700 Form Type 6 records (B387 = 6) with tax returns.
   Improvements: In some cases more information than just taxable income is available on tax returns and could be put on the tape.
6. Run a consistency check between Master File Tape and Property File (the Master Detail File).
   Improvements: Detect erroneous income amounts and omissions.
7. For MI cases and Form Type 6 compute AGI and NTI from B378, B360, and other amounts. Revise the code of B388 to reflect these interpolations or educated guesses.
   Improvements: The creation of certain key amounts will make more records and longer time series available for analysis.
8. Run a consistency check between B388, B387 and all amount fields to assure correct coding of these indicators.
   Improvements: Will make records available for analysis more accurate and simplify programming.
9. Run a consistency check on the 442 Char Master Tape between AGI and social security covered earnings.
   Improvements: For incomes below the SS limit possible mismatches between 805 and Master Records and incorrectly coded AGI may be detected.
10. Husband-Wife checks.
   Improvements: Better coded data, elimination of possible incorrect marriages by Tax Department or WAIS, detection of records not yet on Master Tape.
11. Gap plugging.
   Improvements: See Richard Bauman's WAIS 667-012.

http://www.ssc.wisc.edu/wais/WAIS6676034.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS667034.txt

Mike VonSchneidemesser 1967
Plan of Operations for Generating the Longitudinal-Analysis-of-Income File
June 16, 1967
WAIS paper 667-046
Longitudinal Analysis

Michael von Schneidemesser
WAIS 667-046

Notes explaining Plan of Operations

1) Format in WAIS 645-056.
2) Includes cards to correct Coded Data, to make ID changes to eliminate both valid and invalid Multiple ID's (see WAIS 667-023), to correct various fields on selected records, to add records not previously on Master vital, etc.
3) See WAIS 667-003.
4) Format in WAIS 645-063.
5) See WAIS 656-012.
6) Format in WAIS 645-057.
7) Listing of ID#, YR with appropriate message.
8) Includes all records for male individuals if:
   (a) 47 < year filed < 59.
   (b) At least four pairs of consecutive years are available for a person.
   (c) All records put out have a nonzero AGI amount - either given by taxpayer or imputed. (See R. Bauman's description)
   Blocked 3 x 442.
9) Tape containing all records for which AGI and/or NTI was blank in the 442 Character Master, and for which an AGI and NTI amount could be imputed from taxable income incomplete form (B378). This tape may be merged into the 442 Character Master to replace the corresponding records without AGI and NTI if desired. Blocked 5 x 442.
10) Records for which no AGI (or NTI) is available and also could not be imputed are listed by ID# and Year with message 'two AGI for.'
11) Format described in WAIS 667-031 and this paper (Table 3 3-1).
12) Appends parameters of model A, B, and C to Ext-03 and puts out the whole file in binary form.
13) Binary tape, unblocked. Format in WAIS 667-031.

Michael von Schneidemesser
WAIS # 667-046

PLAN OF OPERATIONS FOR GENERATING THE LONGITUDINAL-ANALYSIS-OF-INCOME FILE

[Flow chart; the numbers 1-13 on the chart refer to the notes above. Labels recovered from the chart, in order: EDITS; 400 Char MA-F; MAUPDATE; IDT FORM 805; improved 400 Char MA-F; EXT-01; LIST OF FORM-TYPE 6, NEG. AGE, NO NTI; 442 Char MA-F; EXT-01; PREDAVID; FORM-TYPE 6, AGI/NTI IMPUTED; RECORDS REJECTED WITHOUT AGI; PRE-EXT-03; MDAVID; MVS EXT-03; DAVID; RRG; Miler-67; TABULATORS (XTAB); TABLES, GRAPHS, etc.]

http://www.ssc.wisc.edu/wais/WAIS667046.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS667046.txt

MISSING
701-023 is MISSING
701-023

Gene Moyer 1964
Sampling Procedures for the 2,000 Name Interview Samples
July 17, 1964
WAIS paper 645-002
Survey Data and File

Gene Moyer
WAIS Draft
July 17, 1964

Sampling Procedures for the 2,000 Name Interview Sample

(1) Sources of the Data

The two sources of names were the "Master File" and the 1962 Wisconsin State Income Tax Rolls, hereafter called the "State File." The Master File is a list of some 18,000 taxpayers representing about 1% of the total taxpayers in the state. These 18,000 taxpayers are grouped into 51 name groups of various sizes. For these taxpayers tax forms for the years 1947-1959 are available (given that a taxpayer filed in each of those years). The State File has the names of all taxpayers for the year 1962, but with one exception we were concerned only with those individuals who were in our 52 name groups.

(2) Preparation of the Data

There were many duplicates on the two lists since the individuals in the Master File also (in most cases) filed a return in 1962.
Therefore by comparing social security numbers, the last name and the first five letters of the first name, and the address, the two lists were merged; and a basic list of names appearing in both files was compiled. This was known as the "merged list." A rather large residual group was left in both files. This was the result of moves in and out of the state, of errors in photographing the tax forms from which the Master File was compiled, and errors in coding and punching both files. Another somewhat separate problem was the fact that a "floating field" format was used in coding the entries on the Master File. This made the comparisons of the two files difficult and resulted in some duplicates in the residual lists. Hereafter, these residual lists will be called the "Unmerged Master File" and the "Unmerged State File."

These four lists were further divided into those taxpayers whose county of residence is a "Primary Sampling Unit" (PSU) of the Wisconsin Survey Research Laboratory, the institution which is doing the actual interviewing. These PSU's are counties chosen at random from strata divided according to population. These PSU's are listed below:

BROWN          POLK
CLARK          PRICE
DANE           RACINE
DODGE          ROCK
DOUGLAS        SAUK
EAU CLAIRE     SHEBOYGAN
GRANT          TREMPEALEAU
KENOSHA        WAUKESHA
MAN.-CAL.      WALWORTH
MARATHON       WASHINGTON
MILWAUKEE      WAUPACA
OCONTO         WINNEBAGO
OUTAGAMIE      WOOD

(3) The Stratification Strategy

Our aim in this study is to get information on as representative a group of the state's taxpayers as possible. It was extremely important that we get people of varying income and of varying amounts of net worth as well as both farmers and non-farmers. Some of these needed data were not explicitly available.
The strata chosen to do this are as follows:

Stratum  Description                                                      Code   Actual   Expected (& Number Crossed Out)

         Marital Status (most recent classification available)
           Married Wives                                                  0      0
           Merged Master and State File in PSU's                          1      1201     1452
           Unmerged Master File in PSU's                                  2      232      (12)
           Unmerged State File in Name Cluster and in PSU's               3      246      222 (25)
           Unmerged State File in Name Cluster and outside PSU's          4      14       14 (0)
           Unmerged State File Not in Name Cluster and in PSU's           5      1315     125 (6)
           Unmerged State File Not in Name Clusters and outside PSU's     6      22       22 (0)
           Unmerged Master File outside PSU's                             7      22       14 (13)
           Merged Master File outside PSU's                               8      25       26 (0)

B        Occupation (most recent classification available)
           Non-Farm                                                       1
           Farm                                                           0

C        Did taxpayer ever report dividend or capital gains income?
           yes                                                            1
           no                                                             0

D        In how many of the years 1954-1959 did the taxpayer file returns?
           0                                                              0
           1-2                                                            1
           3-4                                                            2
           5-6                                                            3

E        How much was the taxpayer's average "property" income (total income less income from wages and salaries) for the years 1954-1959? (Yp)
           Yp < $200                                                      0
           $200 < Yp < $500                                               1
           $500 < Yp < $1000                                              2
           $1000 < Yp < $2000                                             3
           $2000 < Yp                                                     4

F        How much was the taxpayer's average gross income, 1954-1959? (Yq)
           See the classes and codes at the bottom of the page.

G        How much was the taxpayer's gross income in 1959? (Y59)
           See the classes and codes at the bottom of the page.

H        How much was the taxpayer's gross income in 1962? (Y62)
           See the classes and codes at the bottom of the page.

J        What percent of tax paid was withheld from the taxpayer's 1962 income?
(Tw%)
           Tw% < 10                                                       0
           10 < Tw% < 20                                                  1
           20 < Tw% < 30                                                  2
           30 < Tw% < 40                                                  3
           40 < Tw% < 50                                                  4
           50 < Tw% < 60                                                  5
           60 < Tw% < 70                                                  6
           70 < Tw% < 80                                                  7
           80 < Tw% < 90                                                  8
           90 < Tw%                                                       9

Gene Moyer
WAIS
November 18, 1964

APPENDIX A to "Sampling Procedures For the 2,000 Interview Sample," WAIS Paper 645-002, July 17, 1964

A Description of the Universes from which the Names of Individuals to be Interviewed Were Chosen

A1 Individuals in the Master file (1947-1959) who were still taxpayers in 1962 and who lived in PSU counties chosen by the Wisconsin Survey Research Laboratory.
A2 Individuals in the Master file (1947-1959) whose 1962 taxpayer status could not be ascertained from the 1962 tax rolls and who lived in PSU counties.
A3 Individuals on the 1962 tax rolls who were in Master file name clusters but who were not in our Master file and who lived in PSU counties. (Evidently these people had entered taxpayer status after 1959.)
A4 Individuals on the 1962 tax rolls who were in our Master file name clusters but who were not in our Master file and who lived outside PSU counties.
A5 Individuals on the 1962 tax rolls who were not in Master file name clusters but who lived in PSU counties.
A6 Individuals on the 1962 tax rolls who were not in our Master file name clusters and who lived outside PSU counties.
A7 Individuals in our Master file who were not on the 1962 tax rolls and who lived outside PSU counties.
A8 Individuals in our Master file who were still taxpayers in 1962 and who lived outside PSU counties.

http://www.ssc.wisc.edu/wais/WAIS645002.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645002.txt

Gene Moyer, Jahanara Begum 1965
Report on the Checking of Ron's Inconsistent Coded Data Messages with Recommendations
September 7, 1965
WAIS paper 656-020
Consistency of Data

Gene Moyer
Jahanara Begum
WAIS 656-020
September 7, 1965

Report on the Checking of Ron's Inconsistent Coded Data Messages with Recommendations

Jahanara Begum recently checked several of the inconsistencies Ron Durant found in the coded data. Before she checked any of these with the actual folders, she checked 100 ID numbers to see how much overlap there was, i.e., to see how many inconsistencies were contained in a single person's records. Tables I and II contain the results of this check:

Table I
Number of Inconsistencies in each ID checked (100 ID's)

x                                     1   2   3   4   5   6   7   8   9  10  11  12   Totals
y = # ID's with x inconsistencies    32  32  19   5   6   2   2   0   0   1   0   1   100
xy                                   32  64  57  20  30  12  14   0   0  10   0  12   251

Table II
Number of ID's and Year Records with each Error Type

Inconsistency     001  002-3  004  005-6  007-8  009-10  101  102  103  104  105   Total
# ID's             14      5   58     35      3       2   38    6    5    1    7    174
# Year Records     22      6   82     69      3       2   42   12    5    1    7    251

These two tables indicate that each inconsistent ID has about 2 1/2 inconsistent records and that the inconsistencies with the greatest probability of occurring twice (or more) during a taxpayer's tenure in our sample are 001, 004, 005-6, and 102.

In addition, Jahanara checked several types of inconsistency with the folders to see the probable cause.
The results of these checks follow with recommendations:

Type of inconsistency                                                                   # Records

001 Wife says husband has separate income, husband return not present                   1070
    Of the 50 Jahanara checked,
      22 - Rejected from extract-01 because they were incomplete
      19 - No husband return present
       5 - Coding error
       4 - Keypunching error
    Recommendation:
      1. Check rest for Rejected ID's (by machine)
      2. Mark rest as having no husband return present by changing 2 in marital status to 4.

002 Wife states married, husband says not married                                       147
    Out of 14 cases,
       5 - Coder recorded in error (in year of marriage - 2)
       4 - Taxpayer (husband) was really married
       3 - Keypuncher error
       1 - Separation in which each recorded it as coded
       1 - Husband died and she remarried into our sample - the husband who died was not the one in our sample.
    Recommendation: Check each one for error and change codes according to what really occurred.

003 New wife states not married during the year                                         7
    NOT CHECKED
    Recommendation: Check each by hand and correct according to apparent situation.

004 Husband says wife has separate income, no wife return                               2826
    Out of 50 cases,
      16 - Rejected incomplete forms
      22 - Wife's return was not present
       3 - Wife's returns not punched (one person)
       7 - Coding errors
       2 - Keypunching errors
    Recommendation: Check these against form 6's and recode the rest to "4" in marital status.

005 Wife says husband has no separate income, husband return present                    382
    NOT CHECKED
006 Husband says wife has no separate income, wife return present                       1330
    NOT CHECKED
    Recommendation:
      1. Check to see that in year t-1 there is no marital status code 3.
      2. If there is none, change the marital status code to 2.

007 Wife in year t says married during t, however already married in t-1                29
008 Husband in year t says married during t, however already married in t-1             18
009 Dead husband resurrected                                                            38
    None of these were checked.
    Recommendation: Check each one and correct.
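The folder checks above cover only a handful of records for each type, so a rough projection onto the full record counts may help in sizing the correction work. The sketch below does that arithmetic for types 001, 002 and 004, using only the figures given in the report. It assumes the checked cases are representative of all records of that type, which the report does not claim; the function and variable names are illustrative.

```python
# Sample breakdowns copied from the report (types 001, 002 and 004).
CHECKS = {
    "001": {
        "total_records": 1070,
        "sample": {
            "rejected from extract-01 (incomplete)": 22,
            "no husband return present": 19,
            "coding error": 5,
            "keypunching error": 4,
        },
    },
    "002": {
        "total_records": 147,
        "sample": {
            "coder recorded in error": 5,
            "husband was really married": 4,
            "keypuncher error": 3,
            "separation, each recorded as coded": 1,
            "husband died, wife remarried into sample": 1,
        },
    },
    "004": {
        "total_records": 2826,
        "sample": {
            "rejected incomplete forms": 16,
            "wife's return not present": 22,
            "wife's returns not punched": 3,
            "coding errors": 7,
            "keypunching errors": 2,
        },
    },
}


def extrapolate(total_records, sample):
    """Scale each cause's share of the checked sample up to all records.

    Assumes the checked cases are representative of the whole type.
    """
    n_checked = sum(sample.values())
    return {cause: total_records * count / n_checked
            for cause, count in sample.items()}


for code, info in CHECKS.items():
    print(code)
    for cause, estimate in extrapolate(info["total_records"], info["sample"]).items():
        print(f"  {cause}: about {estimate:.0f} records")
```

With samples of 14 to 50 cases these projections are very rough, and they are best read as orders of magnitude for planning the hand checks rather than as estimates of error rates.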
Individual Inter-Year Checks

101 Indicates filed during t-1, but return not present (8580 records)
Out of 100,
32 - Rejected form 6
64 - Previous return not in folder (7 have 2 ID's)
2 - Return in folder but not punched
1 - Wife said in 1960 that she had filed in 1959, but the family's 1959 return showed no 1959 income for her.
1 - Man married a new wife in 1960. New wife was recorded as his old wife - we got no earlier returns for the second wife.
Recommendation:
1. Check against rejected form 6
2. Check against persons with two ID's
3. Residual: change "Previous year filed" to the following code:
0 yes, return present
1 yes, return not present
2 no, but return in folder
3-6 (unchanged)
7 no, insufficient income or unemployed
8 unchanged
9

102 Indicates did not file in t-1; yet t-1 return present (887 records)
NOT CHECKED
Recommendation: Recode these as "2" (see last page). These people are probably late filers and someone may wish to study them.

103 Indicates married during t, but already married in t-1 (111 records)
Out of 14,
6 - Coding error
5 - Taxpayer changed wife
3 - Taxpayer recorded an old marriage
Recommendation: Check each one and see if we don't get some additional second wives - we do not now have enough.

104 Indicates not married in t, but married during t (56 records)
Out of 20,
11 - Coding error
7 - Keypunching error
2 - Taxpayer recorded an old marriage
Recommendation: Change marital status code from 0 to 1 or 2.

105 Indicates not married in t, but says married in t + 1, but no evidence of marriage in t + 1. (619 records)
Out of 63,
43 - Taxpayer did not record new marriage
15 - Coding error
5 - Keypuncher error
Recommendation:
1. Check t - 1
2. If t-1 and t+1 both indicate married, change marital status code to "1" if wife's return not present, "2" if wife's return is present
3.
If t-1 and t indicate not married, change marriage details to "1" (marriage, no details).hahttp://www.ssc.wisc.edu/wais/WAIS656020.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656020.txt"J# Roger Miller 1965LECreation of 1964 Tax Averaging Law File #1 -- Computation Year = 1958lApril 27, 1965 WAIS paper645-050Averaging Studiesl""Roger F. Miller WAIS 645-050 Working Paper April 27, 1965 Creation of 1964 Tax Averaging Law File#1 --Computation Year = 1958 Input: SSRI WAIS TAX EXTRACT #1 FILE as specified in Ron Durant's WAIS Working Paper 645-046 with the following MODIFICATIONS: (1) A sort key to allow sorting of wives into proper sequence. (2) Additional Field: Code for Taxpayer's answer to "Was the taxpayer (newly) married during the tax year? Were marriage details given?" 0 no new marriage 1 & 2 yes new marriage output: 1964 TAX AVERAGING LAW FILE #1 as specified in R.F. Miller's WAIS Working Paper 645-051 (a revision of R. Barger's WAIS Working Paper 645-048). The remainder of this present Working Paper is a description of how the records in the input are to be manipulated to produce the output records and serves as a specification of all coding and variable changes. Working Paper 645-051 defines the symbols used for the variables in this Working Paper. (1) General Considerations (a) Except for the substitution of the personal exemption of the Federal law for the tax credit of the Wisconsin law, no effort is made to adjust the Wisconsin data to reconstruct the Federal data. This transformation toward the Federal concept will be more readily accomplished after the asset detail data is integrated with the basic return data, at which time we will be capable of distinguishing long and short term capital gains, and Federally exempted bond interest. 
(b) Until we have completed and verified the cross-index file and developed the system for its use, we cannot pick up the previous years' records for a woman who was married in the previous year to a man not still her spouse, unless she is an unremarried widow.

(c) For the above reasons it is not at present profitable to attempt to distinguish "head of household" filers from others.

(d) For the purposes of this file, we assume that all married persons in base years filed jointly, and treat persons married in the computation year so that they may file either jointly or separately (and we will tabulate them both ways).

(e) At worst our tabulations will show something about the magnitude of potentially averageable income (subject to definitional differences and the difference in years). At best they will shed considerable light on the potential effects of varying the legal definition of averageable income in certain ways. The difficulties that we have experienced in attempting to develop workable rules for processing returns of persons in different circumstances, in the face of the terribly complex legislation involved, lead us to believe that the legislation is administratively difficult or impossible, and almost incomprehensible to the taxpayer. The actual number of persons electing averaging is certain to be far fewer than the number who could qualify to do so and benefit from doing so.

(f) Our data give "sources of income" not "sources of support", so that we will have to use some ad hoc rules of thumb to approximate the qualifying support tests.

(2) Record Type Code

Code  Description
1     Male separate filer, never married in 1954-1958
2     Female separate filer, never married in 1954-1958
3     Male separate filer, ever married in 1954-1957, not in 1958
4     Female separate filer, ever married in 1954-1957, not in 1958
5     Male with spouse in 1958, no other wives in 1954-1957
6     Male with spouse in 1958, one other wife in 1954-1957
7     Male with spouse in 1958, more than one other wife in 1954-1957.
(3) Record Selection and Classification Procedure (a) Have the file sorted so that principal male filer is followed by his wives in order, all for one year, followed by the similar records in the preceding year (all other household members such as children being segregated to the end of the household's record). (b) Read in the 1954-1959 records for the first household member (generally male head) and the similar years records of the last and next to last spouses actually married to the person during the years 1954-1958 (or, if 1958 records are missing: during 1954-1959). It may be necessary to check the last two digits of the ID #'s, the marital status codes, and the "did you become newly married" codes of all wives and the husband to determine the appropriate wives. Watch out for a wife who never filed and then was replaced by another. (c) If the principal earner or either of the above spouses has records in either 1958 or 1959, retain their records -- otherwise we do not want any records for them in this output. (d) If we have a husband's record for 1958, then records for a wife he married in 1959 or 1960 should be processed as a separate single female. If the husband has a 1959 record but no 1958 record, any records for a wife he married in 1960 should be separately processed. Similarly, if a husband has neither 1958 or 1959 records, a wife he married in either of the years 1959 or 1960 should be processed as a separate single female. In any of these cases, if the husband had no other wife in the years 1954-1959, he is to be processed as a separate single male. (e) The appropriate record type code can now be easily determined. 
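The record type codes of (2), applied once the selection steps of (3) have settled who was married to whom, can be sketched as a small function. This is an illustrative sketch only, not the project's program: the name `record_type_code` and its three inputs are hypothetical. The code table has no entry for a wife with a spouse in 1958, so the sketch assumes such a record is carried with her husband's record and rejects it.

```python
def record_type_code(sex: str, spouse_in_1958: bool, other_spouses_1954_1957: int) -> int:
    """Return the record type code (1-7) from the table in (2).

    sex: "M" or "F" (hypothetical encoding).
    spouse_in_1958: True if the person has a spouse in 1958.
    other_spouses_1954_1957: number of spouses in 1954-1957 other than
        the 1958 spouse (hypothetical input, assumed already resolved).
    """
    if sex not in ("M", "F"):
        raise ValueError("sex must be 'M' or 'F'")
    if other_spouses_1954_1957 < 0:
        raise ValueError("spouse count cannot be negative")

    if spouse_in_1958:
        if sex == "F":
            # Assumption: the table has no code for this case; the wife's
            # record is taken to travel with her husband's record.
            raise ValueError("no separate code for a wife with spouse in 1958")
        if other_spouses_1954_1957 == 0:
            return 5
        if other_spouses_1954_1957 == 1:
            return 6
        return 7

    # Separate filers: codes 1/2 if never married, 3/4 if ever married in 1954-1957.
    base = 1 if sex == "M" else 2
    return base + (2 if other_spouses_1954_1957 > 0 else 0)
```

For example, a male with a spouse in 1958 and one earlier wife is classified as type 6, and a female separate filer who was married at some point in 1954-1957 as type 4.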
(4) Filling Record Gaps, and creating the variable V (a) Code for variable V: 1 Record there 2 Record not present, zeros plugged in data 3 " " " one year later plugged in data 4 " " " two years " " " " 5 " " " three years" " " " 6 " " " four " " " " " 7 " " " five " " " " " (b) Procedure (working back in time from 1959): (i) A current spouse did file in the year the person's record is missing, and says that "spouse had no separate income" (M = 1) or that "spouse died during year" (M = 3). Plug zeros into data fields (G +, G, N, C) of person's missing year and set M = 2, V = 2, L = 0, Occupation = 99, E = 0 and leave age blank unless obtainable from other years' data. (ii) No current spouse filed in the year of the person's record is missing. In the next year that is available, look at the field for "Did you file last year and if not, why not" (designated L symbolically). Action described below: (Set L to 0, make age appropriate). L Code Plug into Data fields G+ G N C M V Occupation 0 Later year's data 7 3-7 99 1 " " " (Note 1) 3-7 99 2 Zeros " 2 99 3 " " 2 36 4 Later year's data " 3-7 99 5 " " " " 3-7 28 6 " " " " 3-7 99 7 Zeros " 2 38 8 Later year's data " 3-7 99 9 " " n " 3-7 99 Note 1: If the next year available is an original record year, with V = 1, then check the "did you become newly married field": If yes (codes 1 and 2) set M = 1; if no (code 0) copy M from later year. For exemptions: check "newly married"? field. Yes- later year's data minus one; No - later year's data. (iii) A current spouse did file in the year the person's record is missing and says that "spouse did have separate income", (i.e. M = 2). Plug in the next year that is available (but don't look beyond 1959), setting V as 3, ..., 7 as is appropriate, M = 2, L = 0, and making the age correct for the missing year. 
(c) The sequencing of the above operations is important: (i) precedes (ii) which in turn precedes (iii), of course, but they are mutually exclusive anyway; the important thing is to begin with 1958 and work backwards in time to 1954. 1959 is of interest only as a source of a "later year" when 1958 is missing. "Interpolated" years may themselves be interpolated back into a previous year.

(d) Each time that V is set to one of the numbers 3-7, increment a counter for this person's "number of interpolated years" which, after all years have been dealt with, gets entered in the output record in position 11, 12 or 13 as is appropriate.

(5) Number of Exemptions (E)

The number of exemptions to be entered is
the number of dependents
+ 1 for the person
+ 1 if he is married
+ 1 if the person is over 65 by the end of the year
+ 1 if the person's spouse is over 65 by the end of the year.

(6) Interpretation of V

See (4) (a) above. When plugging in data from a later year which itself plugged in data, simply increment V by one in determining V for the earlier years.

(7) Counts Generated during Extraction (other than those in the records themselves).

(a) Total number of persons' records considered (1954-1959 only)
(b) Total number of persons whose records were considered (all years)
(c) Total number of output records of type 1
(d) Total number of output records of type 2
(e) Total number of output records of type 3
(f) Total number of output records of type 4
(g) Total number of output records of type 5
(h) Total number of output records of type 6
(i) Total number of output records of type 7
(j) For each of (c)-(i) above, the number of records for which V = 1, 2, 3, 4, 5, 6, 7, for (I) head filer or male, (II) later spouse, (III) earlier spouse.

(8) Further word on exemptions.

The number of dependents referred to in (5) above is the total for a married couple and is entered into the records of both persons. Any discrepancies that arise are thus due to the plugging-in procedures outlined in (4) above.
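The exemption count of (5) is a simple sum and can be sketched as below. The function name `number_of_exemptions` and its boolean inputs are hypothetical; the sketch takes the age tests as already-evaluated flags rather than deciding what "over 65 by the end of the year" means for a given birth date, and it assumes the spouse's age exemption applies only when the person is married.

```python
def number_of_exemptions(dependents: int, married: bool,
                         person_over_65: bool, spouse_over_65: bool = False) -> int:
    """Exemptions E as described in (5): dependents plus personal and age exemptions."""
    if dependents < 0:
        raise ValueError("dependents cannot be negative")
    count = dependents + 1          # dependents + 1 for the person
    if married:
        count += 1                  # + 1 if he is married
        if spouse_over_65:
            count += 1              # + 1 if the spouse is over 65 (assumed to require marriage)
    if person_over_65:
        count += 1                  # + 1 if the person is over 65
    return count
```

A married filer under 65 with two dependents and a spouse under 65 gets 2 + 1 + 1 = 4 exemptions under this rule.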
(9) If spouse died in 1958 (1959 if 1958 is missing) as is indicated by M = 3 then survivor is entitled to file a joint return for that year and falls in one of categories T = 5-7.hahttp://www.ssc.wisc.edu/wais/WAIS645050.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645050.txtA& Gene Moyer 196582A Proposal for a Thesis on the Methodology of WAISMarch 29, 1965 WAIS paper645-042\4.Analysis Proposals- For Analyses, Theses, etc.$$Gene Moyer WAIS 645-042 Draft March 29, 1963. A Proposal for a Thesis on the Methodology of WAIS Since the basic sources of WAIS data are income tax forms and an interview survey, it seems logical to discuss the methodology according to the following outline: I. The Design and Processing of the Income Tax Sample A. Objectives of the tax sample B. Initial sample design C. Units of analysis D. Problems in gathering the data E. Data processing systems 1. Phase I, Demographic and income summary variables 2. Phase II, Detail information on income from assets and from realization of capital gains and losses F. Possible weights for the data G. (Appendix A) Document: The Coding of the Income Tax Returns H. (Appendix B) Document: The Keypunching of the Income Tax Returns II. The Design and Processing of the Interview Sample A. Objectives of the interview B. a priori stratification scheme and response rates C. Weighting the data 1. Possible populations the data should represent 2. Weighting according to the county of the respondent's residence 3. A posterior stratification and response rates 4. The weighting function D. (Appendix C) The interview schedule and assets booklet III. Validating Key Variables A. Materials 1. Census reports, 1950 and 1960 2. Tax department summaries, 1946-1963 3. Statistics of Income, 1946-1962 4. Frequency and amount compilations of income tax data 5. Frequency and amount compilations of interview data B. Major variables to be compared with similar distributions or amounts 1. 
Gross Income over time (GI) with state total Adjusted Gross Income (AGI) from Statistics of Income 2. Distributions of gross income by size class in each year with similar distributions of Wisconsin AGI from Statistics of Income 3. Net taxable income by county with net taxable income by county from tax department summaries in each year, 1946-1960. 4. Total net taxable income over time with total state net taxable income over time from the state tax department summaries. 5. Distribution of filers in each county in 1959 with income recipients in each county from census tabulations 6. Total number of filers over time with 1950 population with income plus the change in population, 1950-1959 7. Filers in each occupation group in 1959 with census distributions of occupation groups 8. Distributions of family size with comparable census distributions 9. Distributions of gain and loss realized in 1959 by size class with national distributions from a 1959 Statistics of Income supplementary report 10. 1963 gross income from the interview by size class with 1963 tax department summary. 11. Total 1959 gross income from all state tax forms (estimated) with Wisconsin AGI in 1959 and with census estimates in the same year. IV. Conclusions on the validity of the study's findings An Example of a Comparison of One Major WAIS Variable with Another Distribution The first comparison listed in IIIB of the outline is the comparison of total Gross Income in year t (GIt) from WAIS' sample with Wisconsin Adjusted Gross Income (AGIt, t = 1947, 1948, ..., 1959.) The magnitudes of these two time series are vastly different and it is not obvious what the relationship between these two distributions should be. GIt is the sum of the incomes of tax filers in name clusters selected at random from the 1958 Wisconsin State Tax Rolls. Having selected the name clusters, all persons in those name clusters were included in the sample in all years. 
The sample represents approximately one percent of persons in the 1958 tax rolls and probably of persons who filed Wisconsin State Income taxes in all years from 1947-1959. The hypothesis we wish to test is that this sample design did not bias the GIt for all t.

AGIt is the sum of the incomes on Nt Federal tax returns selected at random in each year from strata based on income size, type of income, and type of return. The percentages at which the strata were sampled varied from 0.3% to 8% for returns with income under $150,000, and 100% for all returns with $150,000 of income or more. Each tax district in the nation sampled returns filed in its district and state totals were weighted by a weighting system unique to the state. Therefore the distribution accurately represents the population of tax filers in Wisconsin.

The important consideration here is that both AGIt and GIt are statistics from random samples from the population of tax filers in Wisconsin. Both samples contain, at least conceptually, many of the same persons. Because of strict filing requirements, all persons who file a Federal return also file a state return except for those whose income is completely composed of the interest on Federal obligations, income from businesses located outside Wisconsin, alimony, armed forces pay, certain pension payments, or other types of income exempt in Wisconsin but taxable under Federal Law. Even if taxpayers' incomes are composed entirely of exempt income, however, the state tax department often asks them to file. Therefore for all practical purposes, persons who file a Federal return also file a state return.

The units which were sampled are different because the state tax return lacks the "joint return" feature of the federal return. At the same time we collected the returns of all husbands and wives in the name clusters.
Since practically all husbands and wives file a joint return, our sample can be thought of as a sample of Mt Federal returns instead of Lt state returns (Lt > Mt).

In order to provide a conceptual framework for comparing these two distributions, let (for demonstration purposes only)

$y_{it}$ = the income reported on the return of the ith taxpayer (or married couple) in year t, i = 1, 2, ..., V-1, V.

$\bar{y}_i = \frac{1}{T}\sum_{t=1}^{T} y_{it}$ = the mean income of the ith taxpayer (or couple) over time

$e_{it} = (y_{it} - \bar{y}_i)$ = a disturbance term which we shall assume is distributed $N(0, \sigma_{e_{it}})$, at least asymptotically.

Then the mean of AGI over time is

(1) $\overline{AGI} = \frac{1}{T}\sum_{t=1}^{T} AGI_t = \frac{1}{T}\sum_{t=1}^{T}\sum_{i=1}^{N_t} y_{it} = \sum_{i=1}^{N_t}\frac{1}{T}\sum_{t=1}^{T} y_{it} = \sum_{i=1}^{N_t} \bar{y}_i$

and in a given year

(2) $E(AGI_t) = E\left(\sum_{i=1}^{N_t} y_{it}\right) = \sum_{i=1}^{N_t} E(y_{it}) = \sum_{i=1}^{N_t} \bar{y}_i = \overline{AGI}$

By a like process,

(3) $\overline{GI} = \sum_{i=1}^{M_t} \bar{y}_i$

and

(4) $E(GI_t) = \sum_{i=1}^{M_t} \bar{y}_i = \overline{GI}.$

Also, because the sums of normally distributed variables are normally distributed, AGIt is normally distributed around $\overline{AGI}$ and GIt is normally distributed around $\overline{GI}$. If these two statistics are the sums of incomes of persons in two independently drawn samples and if the two samples are unbiased,

(5) $E(GI_t) = \overline{GI} = K\,\overline{AGI} = K\,E(AGI_t)$

and

(6) $V = \sum_{t=1}^{T} \frac{\left(\frac{GI_t}{AGI_t} - K\right)^2}{K}$

should be distributed as $\chi^2$ with T-1 degrees of freedom. With V we can test either the hypothesis that the two samples are independently drawn or that the two samples are unbiased, but not both. Since bias in the two samples is the crucial question, let us investigate the question of the independence of the two samples by using probability theory.
Let us assume that the following chart represents the population of tax filers in a given year:

[Chart: income on the vertical axis, marked at $0 and $150,000; taxpayer name on the horizontal axis, divided into "Name Clusters" and "Outside Name Clusters".]

Let (again for demonstration purposes)

P(A) = the probability that a return in the under $150,000 income group was chosen in the Federal sample. For convenience let us assume that P(A) = α, a single sampling rate (α < .08)

P(B) = the probability that a return in the under $150,000 group was chosen in the WAIS sample. P(B) = β (β ≈ .01)

P(C) = the probability that a return in the over $150,000 group was chosen in the WAIS sample. P(C) = β

P(D) = the probability that a return in the over $150,000 group was chosen in the Federal sample. P(D) = 1

The samples are independent if

(7) P(A|B) = P(A) and P(B|A) = P(B)

and if

(8) P(C|D) = P(C) and P(D|C) = P(D).

Since the Federal sample contained α % of persons sampled without regard to name and WAIS' sample contained β % of taxpayers sampled without regard to income, they sampled (conceptually) α % of WAIS persons and WAIS sampled β % of their persons, so

(9) P(A|B) = α = P(A)

(10) P(B|A) = β = P(B)

In the lower income ranges, then, the two samples are independent. In the section of the population with incomes over $150,000,

(11) P(C|D) = β = P(C), and

(12) P(D|C) = 1 = P(D),

so the two samples are independent at least in a given year. Since independence in a given year is all that is required for V to be a test statistic of bias in the two samples, we would reject the hypothesis that AGIt and GIt are both samples from the same population if

(13) $V > \chi^2_{T-1}$

at the 5% significance level. If the hypothesis is accepted, we can continue under the assumption that our sample is representative of the total income of state tax filers over time. If the hypothesis is rejected, we will know that the difference between the two samples is significant and we must search out the sources of the difference.
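As a worked instance of rule (13), suppose the comparison uses T = 13 annual observations, the years 1947 through 1959 named earlier for this comparison. That count is an assumption read from the range of years, not a figure the proposal states. The tabled 5% critical value of chi-square with 12 degrees of freedom is about 21.03, so the rule would read:

```latex
\[
T = 13, \qquad T - 1 = 12, \qquad \chi^2_{0.95}(12) \approx 21.03,
\qquad \text{reject the hypothesis if } V > 21.03 .
\]
```

With a different number of usable years, the degrees of freedom and the critical value change accordingly.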
Our sample was randomly chosen in 1958 and chosen dependent upon the 1958 choices in all other years. The Federal sample was chosen independent of the persons involved in each year. Therefore one possible source of these differences is associated with time. In order to test this, we might run the regression (14) GIt/AGIt = f (t,t2) and reject the hypothesis of no relationship between the ratios and time if (15) R2/(1-R2)/T-3 > F2dT-3 f d f as tabled at the 5% significance level. If the hypothesis is accepted, we must seek further, but this should wait until these tests have been run.hahttp://www.ssc.wisc.edu/wais/WAIS645042.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645042.txt  Ron Durant 1964HBActual Computer Times and Flow of Sorting and Master Creation RunsOctober 21, 1964 WAIS paper645-008Data ProcessingRon Durant 645-008 Draft October 21, 1964 To: WAIS Staff: Bridges, David, Groves, Lampman, Miller, Moyer, Bauman, Geffert, Roubal, Seavey, Wiegner Reference: Proposal for Wisconsin Income Tax Data Processing Procedures (Draft) dated July 17, 1964 Document: Actual computer times and flow or sorting and master creation runs (Steps III and IV of reference). All times apply to 1410 computer. I. Sort Runs: Modified Sort (Tax - A2) Run (1) Data 1 TAPE-SSRI-258TAPE-SSRI TAX-A2 (SORT) TAPE-ECON 19 Sorted Data 1 Run (2) Data 2 TAX-A2 (SORT) TAPE-SSRI-260 Data 3 Sorted Data 2 Run (3) TAPE-SSRI-261 TAX-A2 (SORT) Sorted Data 3 Run (4) Data 4 TAX-A2 (SORT) Sorted Data 4 RCDS.-165,000 TIME -90 min. TAPE -SSRI-264 RCDS.-164,950 TIME -90 min. TAPE -SSRI-284 RCDS.-52,800 TIME -32 min. TAPE -SSRI-267 RCDS.-123,050 TIME -60 min. TAPE -SSRI-265 Run (5) Correct 5 TAPE-SSRI-165 RCDS.-7,750 TIME -10 min. TAPE -SSRI-257 TAX-A2 (SORT). Sorted Correct 5 II. 
Merge Runs: Merge-3 Way (Tax-A2) Run (6) Sorted Data 1 Sorted Data 3 TAX-A2 MERGE-I MERGE I 1 of 2 Sorted Data 4 MERGEI of 2 Run (7) MERGE I1 of 2 MERGE II 1 of 3 MERGE I 2 of 2 TAX-A2 MERGE-II MERGE II 2 of 3 RCDS.-*340,800 TIME -43 min. *50 nines padding rcds. dropped. TAPE -SSRI-148 TAPE -SSRI-151 RCDS.-505,750 TIME -60 min. TAPE -SSRI-285 TAPE -SSRI-288 sorted Data 2 MERGE II 3 of 3 TAPE -SSRI-289 III. Master Creation Runs: in view of the volume of input, this run was split into three phases with the master output as follows: Run (8) Phase I: TAPS-SSRI-259 TAPE-SSRI-294 PHASE II MAST. RCDS. OUT-57,041 TIME-4 hrs. 15 min. TAPE-SSRI-296 Run (10) Phase III: TAPE-SSRI-266 PHASE I MAST. RCDS. OUT-56,452 TIME-4 hours TAPE-SSRI-293 Run (9) Phase II: TAX-01 TAX-01 TAX-0l Master 1 of 2 Master 2 of 2 Master l of 2 Master 2 of 2 Master 1 of 1 WAIS Master File (5 Reels) Total Master RCDS, Out 151,914 Total Time 2 hrs. 15 min. PHASE III MAST. RODS. OUT-38,421 TIME-4 hours In the course of creating the above master file, 30,334 edits were encountered. The major portion of these edits involved duplicate records and will require no corrections. However, until the volume of valid edits are determined a qualified use of this master file seems warranted.ehahttp://www.ssc.wisc.edu/wais/WAIS645008.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645008.txtu Ron Durant 1964Proposal for an Econometric Analysis of the Earnings Dynamics of Taxpayer Units for a Constant Sample of Wisconsin Income Taxpayers for the Period 19XX-19XXNovember 2, 1964 WAIS paper645-010K,%Proposals- For Analyses, Theses, etc.{Ron Durant 645-010 Draft November 2, 1964 Proposal for an econometric analysis of the earnings dynamics of taxpayer units for a constant sample of Wisconsin income taxpayers for the period 19XX - 19XX. I. Data WAIS tax data covering the period 1946-1960. Micro-data secured from Wisconsin State Income Tax Returns for 18,000 individuals. II. Stratification of Data A. Occupation 1. 
Professional 2. Semiprofessional 3. Managerial - official 4. Businessmen 5. Farmers 6. Clerical 7. Sales 8. Service 9. Skilled 10. Semi-skilled and unskilled B. Age groups 1. Under 35 years of age 2. 35 to 44 years of age 3. 45 to 54 years of age 4. 55 to 65 years of age 5. Over 65 years of age C. Marital status at beginning of period 1. Married tax paying unit a. Husband b. Wife 2. Single tax paying unit a. Male b. Female III. Statement of the Relationship The relationship describing the earnings dynamics of a taxpaying unit can be assumed to be of the general form: (3.1) Ei, t+l = g [xi1,xi2,...,xik, 0i, t=l' Ai t=1, Mi, t=l, Ei, t+l] i = 1, 2,..., n Tax paying units t = 1, 2,..., T Time periods where Ei, t+l = Primary earnings ith tax unit during t+l time period a) Largest wage or salary b) Profit or loss from business Depends on occupation c) Partnership income xij[j = 1,2,..., k]= Appropriate explanatory variables (see Appendix A) Stratification Variables Oi, t=l = Occupation of ith tax unit at beginning of relevant time period Ai, t=l = Age group of ith tax unit at beginning of relevant time period Mi t=1 = Marital status of ith tax unit at beginning of relevant time period ei, t=l = Value assumed by the ith tax unit in the t+1 period of a stochastic variable which is independent xij[j = 1, 2,..., k]. It will further be assumed that ei, t+l ~ N(O, o2) IV. Selection of an appropriate functional form for the above stated relationships and the estimation of the parameters of the functional form. A. Possible functional forms for each strata: final form chosen is the one which best fits the data as there is no a priori information or theoretical construct for choosing a particular functional form in advance. 1. Ei, t+l = ao + al xil + a2 xi2 + .... + ak xik + Ei, t+1 2. Ei, t+l/Ei, t = ao + a1 xil + a2 xi2 + .... + ak xik + Ei, t+1 3. lnEi, t+l = ln ao + al lnil + a2 ln xi2+... + ak-hln xi,k-h + ak-h+1 xi,k-h+1 + ak-h+2 xi,k-h+2 + ... + ak xi,k + ln ei,t+l 4. 
Ei,t+l = ao + a1 Ei,t + ei, t+1 5. Ei,t+1/yi,t = ao + a1, Ei,t + ei,t+l 6. lnEi, t+1 = ln ao + a1 lnEi,t + ln ei, t+1 V. Analysis of residuals in order to determine the degree of fulfillment of the assumptions of the Multiple Linear Normal Regression Model. VI. Presentation of statistical findings and economic consequences. APPENDIX A- Some Appropriate Explanatory Variables I. Ei,t = Lagged earnings of taxpayer Ei, 2t= Squared lagged earnings of taxpayer AR i,t+l " Ei,t# - Bi, t,~.f Eist~ - Bist-> 0 . 0 otherwise 1 Asi,t+l = Eisto - Bit-+ f Bist~ - E14 1< 0 - 0 otherwise Si,t = Lagged earnings of taxpayer's spouse Silt a Squared lagged earnings of taxpayer's spouse aSi,t+l = 8iot~ - Si,t--If Si t* - Sisw> 0 - 0 otherwise ASi,t+l Si,t+i - Si,t i Si,tri - Si,t < 0 - 0 otherwise Wilt .~ Lagged secondary wage or salary of taxpayer W L 2 t squared lagged secondary wage or salary of taxpayer 0i,t - Lagged total other sources wage or salary of taxpayer 0121 ~. Squared lagged total other sources wage or salary of taxpayer I i,t 9 Lagged total interest received of taxpayer I 1 2 t - Squared lagged total interest received of taxpayer Di,t - Lagged total dividends received of taxpayer Di2t = Squared lagged total dividends received of taxpayer Ri t a Lagged total rent received of taxpayer Ri2t w Squared lagged total rent received of taxpayer Ai,t = Lagged gain or loss on sale of assets of taxpayer A 1 2 t ~. Squared lagged gain or loss an sale of assets of taxpayer Fi,t s Lagged income from trustees or fiduciaries of taxpayer Fi 2 t = Squared lagged income from trustees or fiduciaries of taxpayer , N Number of dependents of the taxpayer N2 s Squared number of dependents of the taxpayer Z w 1 if spouse had income 0 otherwise C Yi.t01 E-= t Family per capita income exclusive of the taxpayer's primary earnings Yi't - Lagged total family income Bi,t a Lagged taxpayer's primary earnings N' M Number in taxpayer's household is M ss 1 if taxpayer moved in current period (t+l) ~. 
0 otherwise * 0 1 if taxpayer changed occupation in current period (t+1) n 0 otherwise A S r 1 if taxpayer changed marital status in current period (t+l) . 0 otherwise 0 s 1 if taxpayer in major urban area (Madison, metropolitan Milwaukee area,. 0 otherwise Kenosha, Racine, and Greenbay)hahttp://www.ssc.wisc.edu/wais/WAIS645010.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645010.txt Harold Grovesl 1966Eligibility TablesJanuary 14, 1966 WAIS paper656-038Averaging Studies Tables Harold M. Groves WAIS Paper 656-038 January 14, 1966 I. Eligibility Tables Table I - Should show the number of taxpayers eligible for averaging under present federal standards in our sample of Wisconsin tax returns for 1955 to 1 James Geffert@ 1966Format of Kahn RecordsJanuary 4, 1966a WAIS paper656-0365 Kahn Outputr James Geffert WAIS 656-036 January 4, 1965 FORMAT OF KAHN RECORDS Card Posistion Label Item Number (400 char. M.F.) ALL 1-8 B9 WAIS ID # ALL 9-10 B11 Year of Return ALL 11 Card number 1 12-14 B14 Resident location 1 15-16 B16 County prior year 1 17 B17 Address change 1 18-19 B19 Occupation 1 20 B20 Occupation change 1 21 B21 Return reason 1 22 B22 Partnership 1 23 B23 Spouse separate income 1 24 B24 Marriage-details 1 25 B25 Head of family exemption 1 26-27 B27 Number of dependents 1 28-36 B36 Largest wage 1 37-45 B45 Second Wage 1 46-54 B54 Total other wages 1 55-63 B63 Total interest received 1 64-72 B72 Total dividends received 2 12-20 B81 Rent 2 21-29 B90 Gain or loss on sale of assests 2 30-38 B99 Profit or loss, business 2 39-47 B108 Income from trustees 2 48-56 B117 Partnership 2 57-65 B126 Other income 2 66-74 B135 Total sources of income 3 12-20 B144 Auto or business expense 3 21-29 B153 Adjusted gross income 3 30-38 B162 Standard deduction allowed 3 39-47 B171 Net taxable income standard deduction basis 3 48-56 B180 Wisconsin tax paid 3 57-65 B189 Union dues 3 66-74 B198 Medical-dental expenses 4 12-20 B207 Total interest paid 4 21-29 B216 Business interest paid 4 
30-38 B225 Dividend deductible 4 39-47 B234 Other deductions 4 48-56 B243 Alimony paid 4 57-65 B252 Forest crop land 4 66-74 B261 Total deductions before federal tax and donations 5 12-20 B270 Net income before federal tax and donations 5 21-29 B279 Federal tax and social security deduction 5 30-38 B288 Net income before donations 5 39-47 B297 Donations 5 48-56 B306 Net taxable income itemized basis 5 57-65 B315 Personal exemption allowance 5 66-74 B324 Net normal or total tax 5 75 B394 Standard deduction income key 5 76 B395 First phase deduction key 5 77 B396 Net income before federal tax key 5 78 B397 Net income before donations key 5 79 B398 Net income item basis key 6 12-20 B333 First installment 6 21-29 B342 Miscellaneous information 6 30-38 B351 Social security received 6 39-47 B360 Assessed taxable income 6 46-56 B369 Total additional taxes 6 57-65 B378 Taxable income incomplete (6) form 6 66 B379 Block or column 6 67 B380 Type of item in miscellaneous information 6 68 B381 Number of sources of wage or salary 6 69 B382 Farm schedule, profit and less 6 70 B383 Stock dividend 6 71 B384 Auto expenses 6 72 B385 Other enclosures 6 73 B386 Spouse income 59 or 60 6 74 B387 Type of form 6 75 B388 Incomplete 1, complete 0 6 76 B389 Medical-dental indicator 6 77 B390 Federal tax indicator 6 78 B391 Donation indicator 6 79 B392 Income addition key 6 80 B393 Compute auto keyhahttp://www.ssc.wisc.edu/wais/WAIS656036.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656036.txta $ Martin David 1967("Preliminary- Benefit File Analysis July 21, 19675 WAIS paper678-007Benefit Analysis.'A copy of this paper could not be found5hahttp://www.ssc.wisc.edu/wais/WAIS678007.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS678007.txt. Martin David 1967B7Jonathan Ryshpan April 27th, 1965. 
WAIS 645-061 Project Description The purpose of this project is to take several files of data from the Social Security Administration, combine them with another file generated here at the Income Tax study, and put the output in a form that can be used easily. Data Flow Diagram Ordinary 805 Multiple 805 Claims Prog 1 Social Security Fixed ID Prog 2 Selected Social Security Not on Social Security Not on Fixed ID Data Files 1. Ordinary 805: This is a tape sent to us by the Social Security Administration containing social security data for people in the study. 2. Multiple 805: The Social Security Administration also sent us printout sheets for people who had more than one social security No. All the data for a single person was combined by hand onto a single sheet, was then punched on cards and transferred to this tape in the same form as the Ordinary 805 file. 3. Claims: This is a card file with one card for each person who had ever filed for social security benefits. 4. Social Security: This contains the Ordinary 805 and Multiple 805 files merged, in a new format. Each record has an indicator (0 or 1) to show whether a claims card was present for that social security No. 5. Fixed ID: This contains the ID number, social security No., name, and address for each person in the study. It acts as a link between the social security data and the rest of the study. 6. Selected Social Security: Each record on this file consists of a Fixed ID record followed by the Social Security record with the same social security No. 7. Not on Social Security: This is a file of all the Fixed ID records having social security No's that do not appear on Social Security. 8. Not on Fixed lD: This is a list of all the social security No's that appear on Social Security but not on Fixed ID. Programs 1. Merges and reformats the Ordinary 805 and Multiple 805 files; and puts in the claims indicator. 2. 
Combines the Social Security and Fixed ID files and indicates What records appear on each file for which there is no corresponding record on the other.ahahttp://www.ssc.wisc.edu/wais/WAIS645061.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645061.txteJonathan Ryshpan 196560Identification and Social Security Record FormatApril 29, 1965 WAIS paper645-063e0*Formats Social Security Earnings Data- 805 Jonathan Ryshpan WAIS 645-063 April 29, 1965 Columns No. of. Cols. Var. No. Var. Name 1- 1 1 1 "I" 2- 9 8 2 Wisc. I.D. No. 10- 18 9 3 Social Security No. 19- 35 17 4 Last Name 36- 37 2 4A Title 38- 50 13 5 First Name 51- 62 12 6 Middle Name 63- 72 10 7A Street or Box No. 73- 75 3 7B RR, RT or RFD No. Only 76- 92 17 7C Street Name or "BOX" 93- 96 4 7D Street Type ("Ave," "St.," etc.) 97-117 21 8 Post Office 118-119 2 9 Zone No. 120-121 2 10 County code 122-123 2 11 Last Year of Data 124-124 1 12 Multiple Account Number Indication 125-125 1 13 Indication that Name on Record does not agree with Finder Card 126-130 5 14 Month and Year of Birth 131-131 1 15 Race Indication 132-138 7 16 Sex (alpha) 139-143 5 17 Indication of Railroad Activity 144-145 2 18 Newly Posted Credit Earnings Item 146-147 2 19 Additional Earnings Indication 148-149 2 20 Active Earnings Discrepancy 150-154 5 21 Account in Benefit Status- Other than Disability 155-158 4 22 Benefit Status Other than Disability was Terminated 159-161 3 23 Account in Disability Benefit Status or Disability Freeze Status 162-165 3 24 Disability Status was Terminated 166-169 4 25 Credit Indication 170-174 5 26 Earnings Statement Issued in Year Indicated 175-177 3 27 Indication of Self-Employment Activity Columns No. of Cols. Var. No. Var. 
Name 178-180 3 28 Indication of Delinquent Self-Employment Item 181-182 2 29 Indication of Agricultural Activity 183-191 9 30 Earnings, 1937 to Date 192-193 2 31 Wage Quarters of Coverage, 1947 to Date 194-195 2 32 Self-Employment Quarters of Coverage 1951 to Date 196-197 2 33 Agricultural Quarters of Coverage 1955 to Date 198-206 9 34 Earnings 1951 to Date 207-208 2 35 Wage Quarters of Coverage 1951 to Date 209-210 2 36 Self-Employment Quarters of Coverage 1951 to Date 211-218 8 37 1951 Earnings 219-219 1 38 1951 Self-Employment Quarters of Coverage 220-227 8 39 1952 Earnings 228-228 1 40 1952 Self-Employment Quarters of Coverage, 229-236 8 41 1953 Earnings 237-240 4 42 1953 Quarterly Wage Quarters of Coverage Pattern 241-241 1 43 1953 Self Employment Quarters of Coverage 242-249 8 44 1954 Earnings 250-253 4 45 1954 Quarterly Wage Quarters of Coverage Pattern 254-254 1 46 1954 Self-Employment Quarters of Coverage 255-262 8 47 1955 Earnings 263-266 4 48 1955 Quarterly Wage Quarters of Coverage Pattern 267 1 49 1955 Self-Employment Quarters of Coverage 268 1 50 1955 Agricultural Quarters of Coverage 269-276 8 51 1956 Earnings 277-280 4 52 1956 Quarterly Wage Quarters of Coverage Pattern 281 1 53 1956 Self-Employment Quarters of Coverage 282 1 54 1956 Agricultural Quarters of Coverage 283-291 8 55 1957 Earnings 291-294 4 56 1957 Quarterly Wage Quarters of Coverage Pattern 295 1 57 1957 Self-Employment Quarters of Coverage 296 1 58 1957 Agricultural Quarters of Coverage Columns No. of Cols. Var. No. Var. 
Name 297-304 8 59 1958 Earnings 305-308 4 60 1958 Quarterly Wage Quarters of Coverage Pattern 309 1 61 1958 Self-Employment Quarters of Coverage 310-310 1 62 1958 Agricultural Quarters of Coverage 311-318 8 63 1959 Earnings 319-322 4 64 1959 Quarterly Wage Quarters of Coverage Pattern 323-323 1 65 1959 Self-Employment Quarters of Coverage 324-324 1 66 1959 Agricultural Quarters of Coverage 325-332 8 67 1960 Earnings 333-336 4 68 1960 Quarterly Wage Quarters of Coverage Pattern 337-337 1 69 1960 Self-Employment Quarters of Coverage 338-338 1 70 1960 Agricultural Quarters of Coverage 339-346 8 71 1961 Earnings 347-350 4 72 1961 Quarterly Wage Quarters of Coverage Pattern 351-351 1 73 1961 Self-Employment Quarters of Coverage 352-352 1 74 1961 Agricultural Quarters of Coverage 353-360 8 75 1962 Earnings 361-364 4 76 1962 Quarterly Wage Quarters of Coverage Pattern 365-365 1 77 1962 Self-Employment Quarters of Coverage 366-366 1 78 1962 Agricultural Quarters of Coverage 367-374 8 79 1963 Earnings 375-378 4 80 1963 Quarterly Wage Quarters of Coverage Pattern 379-379 1 81 1963 Self-Employment Quarters of Coverage 380-380 1 82 1963 Agricultural Quarters of Coverage 381-381 1 83 Claims Indication ("0" or "1") 382-382 1 84 ( )hahttp://www.ssc.wisc.edu/wais/WAIS645063.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645063.txt2$William Duddleston 1968<5Description of the Revised Supplmentary Age Data FileJanuary 12, 1968 WAIS paper678-042Age DataWilliam Duddleston WAIS 678-042 January 12, 1968 Description of the Revised Supplementary Age Data File In the various analyzes of the social security benefit data that had been planned, it was deemed necessary to procure age and death data from all sources and to transfer to a general extract tape (See WAIS 678-003 Benefit Analysis Format for E1(805)). Age data and date of death data existed for about 9570 of the claims cases in the Benefit File (4766). 
The Supplementary Age Data File contained 1877 cases where the age and death data were collected from State of Wisconsin administrative records (see WAIS 656-048 for the "search" and WAIS 656-055 for the format of the SAD file). Bill Bathe wrote a program, SAD Revision, which in effect merged the age and death data from the Benefit File with the SAD file; the SAD Revised file contains 6319 records in total. Of course, he weeded out any matches (324 cases), and he listed any matches where there were discrepancies between the two birth and/or death dates. We used UPDATEAL to correct any errors (9 corrections). The format of this tape, SAD Revision, follows the one presented in WAIS 656-055, except that accounts taken from the SAD file have an "A" in column 1 and Benefit File accounts are distinguished by a blank in column 1.

The only other source of age data available is the 805 file. The actual integration of the SAD Revised file with the 805 file will be briefly described in a subsequent WAIS paper dealing with the creation of the general extract, E1(805).

Supplement to: WAIS 678-042, 3-1-71

[Flow diagram: Supplementary Age Data (1877) and Benefit (4766) pass through Extract Age and Death Data and SAD Revision to give SAD Revised (6319); SAD Revised (6319) and 805 enter E (805) Creation.]

    4766  Benefit Records
  + 1877  SAD Records
    6643
  -  324  Matches between benefit records and SAD records
    6319  SAD Revised

3087 records w/age data but no 805 data

Ron Durant, 1964. Proposal for Wisconsin Income Tax Data Processing Procedure. July 17, 1964. WAIS paper 645-001. Master File - Tax Records Data Processing.

Ron Durant
Draft 645-001
July 17, 1964

Document: Proposal for Wisconsin Income Tax Data Processing Procedures

The attached Steps I-V contain a proposed file flow procedure directed at creating and maintaining a Wisconsin Income Tax Master File for the years 1946-1960. New programs that will be required are TAX-A1, TAX-A2, TAX-01 and TAX-02, whose functions are outlined in the following sections.
In the interest of expediency in the creation of an initial Master Tape, it is suggested that current effort be concentrated on Steps I-IV. This will enable a Master Tape (from which data can be extracted) to be ready at the earliest possible date. I. Utility Card-Tape Operations: (3 Runs) (1) ( )= approximate numer of records I.D.# CHANGES "C"In COL. 1(500) C/T 50 R.P.B. I.D.# TABLE TAPE 50x80 (2) (3) ADDITIONAL ORIGINAL DATA (10,000) DATA FROM RE-PUNCHED FOLDERS (3800) ADDED DATA TAPE 50x81 DATA CORRECTIONS (5000) NEW I.D. INFO"H" IN COL. 1(100) C/T50 R.P.B. CORRECTS -CHGS TAPE 50x81 DATA CHANGE &/ORADD "9" IN COL. 1(100) II. Pre-Sort Screening Runs: A.. General Information: Maximum File Size (Single Operation Merge 1 Sort 12) for 81 char. records (50 R.P.B.) is 121, 286. Currently Data Tapes SSRI 178, 154 and 166 are multiple file reels containing more than 121, 286 records. B. Purposes of these Runs: 1. To create single file reels consisting of 120,000 records or less. (Rough estimate 4 reels of 120,000 records with residual amount on 5th reel.) 2. To use a table search method in order to make the necessary I.D. # changes before going into the sort. (Saves resorting changed I.D. numbers) 3. To establish record counts for control purposes from run to run. Run (1): ADDED TAPE DATA ID# TABLE TAPE SSRI 178, 154 &166 (ORIGINAL CARD DATA) 50x81 TAX-A1 DATA 1 DATA 2 DATA 3 DATA 4 DATA 5 I.D. Extract 1* Run (2) CORRECTS -CHGS TAPE ID# TABLE TAPE TAX-A1 CORRECT 6 I.D. EXTRACT 2* 50x81 *Note: I.D. 1 and 2 will be merged with I.D. 2 taking precedence in case of duplicates... Then merged tape will be sorted on Social Security number. III. 
Sort Runs: (Modified Sort (TAX-A2) - see note on Sort Requirements) Run (1) DATA 1 TAX-A2 (SORT) SORTED DATA 1 Run (2) DATA 2 TAX-A2 (SORT) SORTED DATA 2 Run (3) DATA 3 TAX-A2 (SORT) SORTED DATA 3 Run (4) DATA 4 TAX-A2 (SORT) SORTED DATA 4 Run (5) DATA 5 TAX-A2 (SORT) SORTED DATA 5 Run (6) DATA 6 TAX-A2 (SORT) SORTED CORRECT 6 Run (7) SORTED DATA 1 SORTED DATA 2 SORTED DATA 3 SORTED DATA 4 TAX-A22 WAY(MERGE) MERGE 1 MERGE 2 MERGE 3 Run (8) Run (9) Run (10) MERGE 1 MERGE 2 MERGE 3 MERGE 4 SORTED DATA 5 TAX-A22 WAY(MERGE) MERGE 5 Sort Requirements: (3 Control Words) (1) Identification Number (Cols. 2-9) (2) *Year (Cols. 10-11) SORT CONTROL TAG (3) Card Form (Col. 1) *In order for the Identification ("I") Card to collate low in our File sequence, it will be necessary to modify Phase I of the Sort/ Merge 12 Program. This modification will involve placing a "-" zone above the number in column 10 (high order position of Social Security Number field) of all I cards. This "-" sign should be retained throughout the sort and carried on to the Master Tape so as to facilitate the sequence checking for equal or ascending sequence as records are passed. Any extracting or matching on Social Security numbers will not be hindered since the appropriate program would strip the zone before matching or extracting and then replace zone on the master record to maintain sequence checking procedures. Additional Sort Requirements: Put "+" sign in col. 10 of "H" Cards. (Want ID change card to collate ahead of ID card) IV. Edit and Master Creation Run: (TAX-01) INITIAL MASTER FILE Basic Data- Continuous multi-reel file sorted to sequence of run. TAX-01 MERGE4 EDITS MERGE (1) Drop Card Detail for which there is a correction. EDIT LISTING can be punched out from tape Furnish supplementary comments as to error SORTED CORRECT 6 50u81 Edits will be resubmitted in Edit and Master Updating Run i.e. TAX-02 V. 
Edit and Master Updating Run: (TAX-02)

[Flow diagram; labels: MASTER IN (NOTE A), RECYCLE MASTER IN, TRANSACTIONS, TAX-A2 (SORT), SORTED TRANSACTIONS, MASTER OUT, RECYCLE MASTER OUT, LISTING, EDITS OUT, REJECTED EDITS]

Possible Examples:
1. Add a Master
2. Delete a Master
3. Change I.D.#
4. Change Variables
5. Add New Data

NOTE A: Check for duplicate master. If duplicates, drop and print out recycled master.

http://www.ssc.wisc.edu/wais/WAIS645001.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645001.txt

James Geffert, 1964. Proposal for Consistency Check. December 15, 1964. WAIS paper 645-020. Master File - Tax Records Consistency of Data.

James Geffert
WAIS Working Paper 645-020
December 15, 1964
Draft

Consistency of the Data File Specifications (General)

I. Single year record
A. The items of income must, when totaled, agree with the indicated total income.
B. The deduction items, when the deductible portion is totaled, must agree with the indicated totals.
C. Taxable income by taxpayer's method must agree with his reported taxable income.
D. The tax liability, when computed, must agree with the indicated tax liability.
E. Items of coding such as the number of sources of wages and salaries, the farm schedule, profit and loss codes and the auto expenses itemized code must be consistent with the types of income reported.
F. Items of income, deduction and expense should not appear for the years in which they do not appear on the tax forms.
Include in record indications of location and nature of inconsistencies.

II. Multi year record for individual
A. Items of coding (such as residence location and residence change, county prior year, etc.) must agree from year to year.
B. Items of coding must reflect missing records in a sequence if gaps are present.

III. Household record
A. Items of coding (number of dependents, spouse codes, etc.) must agree with information in husband-wife income tax forms when considered as a unit.
B. Items of coding for dependents in the household must conform to dependent status.
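The single-year checks on income and deduction totals in the general specifications above both compare a sum of items with a reported total and note where and how a record fails. A minimal sketch of that pattern in Python follows. The function name check_total, the field names (wages, interest, dividends, total_income) and the amounts are hypothetical and chosen only for illustration; they are not taken from the WAIS record layout or from any WAIS program.

```python
from typing import Mapping, Optional, Sequence


def check_total(
    record: Mapping[str, int],
    item_fields: Sequence[str],
    total_field: str,
) -> Optional[str]:
    """Compare the sum of the item fields with the reported total.

    Returns None when the record is consistent; otherwise returns a short
    message naming the field (location) and the mismatch (nature).
    """
    if total_field not in record:
        return f"{total_field}: reported total is missing"
    # Items absent from the record are treated as zero.
    computed = sum(record.get(name, 0) for name in item_fields)
    reported = record[total_field]
    if computed != reported:
        return f"{total_field}: items sum to {computed}, reported {reported}"
    return None


# Hypothetical field names and amounts, for illustration only.
INCOME_ITEMS = ["wages", "interest", "dividends"]

good = {"wages": 4000, "interest": 150, "dividends": 50, "total_income": 4200}
bad = {"wages": 4000, "interest": 150, "dividends": 50, "total_income": 4300}

print(check_total(good, INCOME_ITEMS, "total_income"))
print(check_total(bad, INCOME_ITEMS, "total_income"))
```

Returning a message rather than a bare true/false is one way to keep an indication of the location and nature of each inconsistency with the record, as the specification asks.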
James Geffert
WAIS
December 3, 1964
R. Draft

Consistency of the Data File Specifications (General)

I. Single year record
A. The items of income must, when totaled, agree with the indicated total income.
B. The deduction items, when the deductible portion is totaled, must agree with the indicated totals.
C. The tax liability, when computed, must agree with the indicated tax liability.
D. Items of coding such as the number of sources of wages and salaries, the farm schedule, profit and loss codes and the auto expenses itemized code must be consistent with the types of income reported.
E. Items of income, deduction and expense should not appear for the years in which they do not appear on the tax forms.

Consistency of the Data File Specifications (Specific)

I. Single year record
A. Addition of items of income
   1. M34 through M104 must total to M111
      a. all years
B. Deduction items (deductible portion)
   1. M146 + M153 + M160 + (M167 - M174) + M181 + M118 must equal M209
      a. 1947 through 1952
   2. M146 + M153 + M160 + (M167 - M174) + M188 + M118 must equal M209
      a. 1953 through 1955
   3. M146 + M153 + M160 + (M167 - M174) + M188 must equal M209
      a. 1956 through 1958
   4. M146 + M153 + M160 + (M167 - M174) + M188 + M195 + M202 must equal M209
      a. 1959 through 1960
   5. M111 - M209 must equal M216 if M209 is not blank
      a. 1947-1955
   6. M125 - M209 must equal M216 if M209 is not blank
      a. 1956-1960
   7. M216 - M223 must equal M230
      a. 1947-1960
   8. M230 - M237 must equal M244
      a. 1947-1960
C. Determination of taxable income (TI)
   1. 1947 only
      a. If M244 is blank, TI is equal to M111
      b. If M244 is not blank
         1) M111 > M244
         2) TI = M244
   2. 1948 through 1955
      a. If M244 is blank, TI = M111 - M132 + M139
      b. If M244 is not blank and if
         1) M139 > M244
            a) TI = M244
         2) M139 < M244
            a) TI = M139
   3. 1956 through 1960
      a. If M244 is blank
         1) TI = M125 - M132
      b. If M244 is not blank and if
         1) (M125 - M132) > M244
            a) TI = M244
         2) (M125 - M132) < M244
            a) TI = (M125 - M132)
D. Determination of Net Normal or total tax
   1. 
Normal tax (based an TI) 1947 Income Bracket (TI) Rate Normal Tax lst 1,000 1.00% 2nd 1,000 1.24% N1 3rd 1,000 1.50% N3 4th 1,000 2.00% N4 5th 1,000 2.50% NS 6th 1,000 3.00% N6 7th 1,000 3.50% N7 8th 1,000 4.00% N8 8th 1,000 4.50% N9 10th 1,000 5.00% N10 11th 1,000 5.50% N11 12th 1,000 6.00% N12 over 12,000 7.00% N13 NORMTAX = 13 E i=1 Ni= M251 a. If NORMTAX <= 37.50 and M265 is blank M258 = NORMTAX - .02 NORMTAX b. If NORMTAX > 37.50a and M265 is blank M258 - (NORMTAX - .02 NORMTAX) - (NORMTAX - 37.50)/6 - .02((NORMTAX-- 37.50/6) c. If NORMTAX < 37.50 and M265 is not blank M258 = NORMTAX d. If NORMTAX > 37.50 and M265 is not blank M258 = NORMTAX - (NORMTAX-37.30)/6 2. Normal Tax (based on TI) 1948-1950 a.NORMTAX is computed as in C(1). James Geffert Revision Proposed Dec. 3 Revision Accomplished Format Field Date Record Position Length the of Information 1 1 1 2-9 8 Identification number 10-11 2 Year of return 12-14 3 Residence location 15-16 2 County prior year 17 1 Address change 18-19 2 Occupation 20 I Occupation change 21 I Return reason 22 1 Partnership 23 I Spouse separate income 24 I Marriage details 25 1 Head of family 26-27 2 Number of dependents 28-36 9 Largest wage or salary 37-43 77 Second wage or salary 44-50 7 Total other sources wage or salary 51-59 9 Total interest received 60-68 9 Total dividends received 69-77 9 Total rent received 78-86 9 Total gain or loss on sale of assets 87-95 9 Total profit or loss from business 96-104 9 Income from trustees or fiduciaries 105-113 9 Partnership income 114-120 7 Other income 121-129 9 Total of sources of income 130-136 7 Auto or business expense 137-145 9 Income (adjusted gross) less auto expenses 146-150 5 Standard deduction allowed 151-159 9 Net taxable income, standard deduction basis Field Position Length Item of Information 160-166 7 Wisconsin tax paid 167-171 5 Union dues 172-178 7 Medical-dental :expenses 179-185 7 Total interest paid 186-192 7 Business interest paid 193-199 7 Dividend deductible 
200-206 7 Other deductions 207-213 7 Alimony paid 214-220 7 Forest crop land 221-227 7 Total deductions before Federal tax and donations 228-236 9 Net income before Federal tax and donations 237-243 7 Federal tax and social security deductible 244-252 9 Net income before donations 253-259 7 Donations 260-268 9 Net taxable income itemized basis 269-269 1 Block or column 270-274 5 Personal exemption allowance 275-281 7 Net normal or total tax 282-288 7 First installment 289-289 1 Type of item in 290-295 6 Miscellaneous information 296-301 6 Social security received 302-308 7 Assessed taxable income 309-314 6 Total additional taxes 315-315 1 Number of sources of wages + salaries 316-316 1 Farm schedule, profit and loss 317-317 1 Stock dividend 318-318 1 Auto expenses 319-319 1 Other enclosures 320-320 1 Spouse income 59-60 321-329 9 Taxable income incomplete form or net taxable income type 5 form 330-330 1 1 incomplete 0 complete 331-331 1 Type of form James Geffert WAIS December 3, 1964 R. Draft 1947 Master Record Consistency Blank M188 Move M111 to M125 Subtract M118 from M123 Compute Std. Ded. allowed M132 and M139 1948-1952 Auto inclusive Blank M188 Move M111 to M125 Subtract M118 from M125 Std. Ded. Have 9% of M111 to M132 if M111 is less than $5,000. Otherwise move $450 to M132. 1953-1955 Auto inclusive Blank M181 Move M111 to M125 Subtract M118 from M125 Std. Deduction Move 9% of M111 to M132 if M111 is less than $5,000. Otherwise move $450 to M132. 1956-1960 Std. Deduct. Move 9% of Mill to M132 if M111 Is less then $5,000. Otherwise move $450 to M132 All years 2. Data sheet information in each record in place of ampersandhahttp://www.ssc.wisc.edu/wais/WAIS645020.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645020.txtK DBarbara Aldrich 1969:4Possible Tapes to Scratch in Reducing Tape InventoryJanuary 13, 1969 WAIS paper689-022LFGeneral Papers (Regarding WAIS) Maintenance System - Files, Data, Etc. 
Barbara Aldrich January 13, 1969 WAIS 689-022 Possible Tapes To Scratch In Reducing Tape Inventory The following is a list of 40 WAIS Master tapes that might be scratched in an effort to reduce our tape inventory. I have included on my list some "questionable" tapes (i.e. there was a doubt in my mind whether these tapes should be scratched), as well as those which obviously should be scratched. Appendix A indicates the sequence involved with some of our tapes. All tapes which are not in the circle are on the list of tapes which might be scratched. The final decision on which tapes should be scratched will be made after discussion at a Friday meeting. Possible Candidates for Scratch Reel # Data on Tape Decision? 629, 630 631, 632 400 Char Master 7/22/67 306, 323, 338, 342 442 Char Master 4/16/68 474, 678 152, 140, 139 Sorted EXT01 9/14/65 109, 166 Extract 01A 10/23/66 600, 577, 597 Extract 01F Binary at 800 BPI 120, 303 EXT01 out from EXTOL 4/6/68 419 Tables-Zeromaster Freq of AGI 147, 295, 108 and other tables Post-Tax07 12/8/65 288, 604 Pre-David EXT03 from 442 241 (Intermediate Stage) Master FFYR and 805 on ID 8/4/66 502, 504 ID on ID# last version before elimi- 130 ID on SS# nation of valid mult. 
ID# FFID without 805 on SS# 8/1/66 155 FFID Updated 10/16/67 345 Death Extract Sorted on Name 193 Sorted Card 1 and 4 for prop file 6/26/64 WAISI & WAISII Force 805 Data from SSA 2/24/64 NEW 805 DATA 9/8/65 366 651 Benefit File Unblocked-No Rcrd Marks 9/8/67 650 Benefit Year Records before Merge 9/12/67 292 Form 805 Reformatted 11/16/65 627 Form 805 Reformatted With Good ID's 8/21/67 492 Geffert Extract 4X/67 107 805 (with ID)& each previous record 8/21/67 195, 369 Happiness File (Preparing 1410 version of EXT01 for conversion to binary) APPENDIX A EXTRACT SERIES Sequence on Tapes(Those circled should not be scratched) Sorted Extract 01 9/14/65 Extract 01A Happiness File 10/23/66 (1967) Extract 01F Binary 800 BPI Extract 01F Binary 556 *Extract 01F Binary at 556 can also be scratched when we re-create the Extract with correct age data MASTER SERIES 400 Char Master 7/22/67 442 Char Master 4/16/68 (Bad Age Data) Extract 01 out from Extract 01 4/16/68 (Bad Age Data) Pre-David EXT03 from 442 Master (Bad Age Data) 805 SERIES Form 805 Data from SSA 2/4/64 New 805 Data 9/8/65 Form 805 Reformatted 11/16/65 Form 805 Reformatted With Good ID's 8/21/67 Non-matching 805 With ID 8/21/67 E1-805 8/1/68 Reformatted 805 12/1/67hahttp://www.ssc.wisc.edu/wais/WAIS689022.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS689022.txt $Edward Wiegner Richard Bauman0 1966ZTSummary of Specifications for Age-Occupation Tables from 1964 Tax Averaging Law FileAugust 22, 1966u WAIS paper667-004sAveraging StudieswEdward A. Wiegner Richard A. 
Bauman WAIS Paper 667-004 August 22, 1966 Summary of Specifications for Age-OccupI Martin David 196882JOBPLAN: Master File on 1108 and XTAB CapabilitiesDecember 23, 1968 WAIS paper689-020.(Data Processing Master File- Tax RecordsMartin David WAIS 689-020 December 23, 1968 JOBPLAN Master File on 1108 and XtAB Capabilities RUN 1 MF 400 char version Note 1) Cobol description of MF in BNANCRE 2) SIGN FIX Subroutine provides a means for removing alphabetics and placing sign in correct position for 3600. New Control Cards XTAB Driver 1108 Version RUN 2 Identical run can be executed using Extract 01F on Loniello PROGRAMDAVID for timing comparison. RUN 3 Alternatively HAPPINESS file is 1410 COBOL version of Extract 01F which could be input into flow chart above. A. We need to know which of 1, 2, 3 is relatively optimal: 1) Time for production 2) Time for extracting 3) Programming considerations We also need to discover what (if anything) is wrong with current version of EXT01F (2). B. This ties in with other programming efforts: 1) Hirsch Marital unit integration If 1108 experience in reading MF in RUN I is good we can shift Hirsch's program to MF, rather than EXT01F as now specified. 2) 805-MF integration As an alternative driver we may wish to tabulate a file that includes the NEW-805 (2) file and MF records. Two alternatives are available: MF NEW 805 NEW XTAB DRIVER OR MF NEW 805 INTEGRATION Revised 442 MF XTAB DRIVER C. Other considerations We would like to be able to vary input to the analysis program by varying the file description. Alternatively we may wish to call MULTILIN rather than XTAB, in RUN 1. New Programs XTAB driver (1108)* NEW XTAB driver ** or INTEGRATION** MARITAL UNIT INTEGRATION + *3600 Fortran version exists **COBOL elements required for integration exist. 
+Partially completed

File Status
MF updated ?/68
EXT01F updated 1/67, copied 11/68 on 3600
NEW 805 to be updated and corrected

http://www.ssc.wisc.edu/wais/WAIS689020.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS689020.txt

Janet Whitaker, 1969. Two 805 Errors. April 21, 1969. WAIS paper 689-025. Social Security Earnings Data - 805.

Janet Whitaker
WAIS 689-025
Revised April 21, 1969

Two 805 Errors

There are many discrepancies in the sex indicator between our data and the Social Security data. A program written by Dennis Alley (output is listing number 51 in the library) checked 15,971 of the 805 records and indicated the following errors:

Errors between WAIS ID and sex indicator
Of the 264 errors located, 128 are clear discrepancies (the ID indicates one sex, the 805 indicator the other). The remaining 136 cases are less clear; messages include "SEX IS UNK NO" and "SEX IS FEM NO."

Errors between age indicator and age data
There are 427 cases where the age indicator is 2 (indicating SAD availability) and the age variable is blank. Of these, 211 are "unknown death recipients", or institutional recipients (ID type xx xxxx 7x). See WAIS 678-019, page 3.

http://www.ssc.wisc.edu/wais/WAIS689025.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS689025.txt

ctions: GRPWFINC and CNGOCCUP. Tables B,D,E,F,I,J,K,P,Q,R,S,T do not include these variables. It may be desirable to produce these tables first. For these, E Dk(21 Tki) = 19,590. N.B. The residual may require two runs.

E. Copies
We plan to run 3 copies of all tables.

F. Checkout
Some test cases will be run first (similar to the Treasury Runs) to check out the driver.
Passengers will not be checked out.

http://www.ssc.wisc.edu/wais/WAIS667004.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667004.txt

Gene Moyer, 1965. Approximate Costs for the SSA Tables. July 28, 1965. WAIS paper 656-008. Social Security Data Processing Tables.

Gene Moyer
656-008
July 28, 1965

Approximate Costs for the SSA Tables (See WAIS 656-007)

(1) Preparation of WAIS 656-007. This is hard to calculate because of the many false starts. I suppose that I must have spent 30 hours altogether, but I do not have good records on this.
    30 hours @ $6.00 = $180
(2) Programming time for making the extract:
    30 hours @ $6.00 = $180
(3) Machine time for making the extract:
    3 hours @ $25.00 = $75
(4) Machine time for Wistab runs:
    25 runs on 5 year tapes = 125 runs
    125 runs @ 1/4 hour per run = 31.25 hours
    31.25 hours @ $25 = $781
TOTAL $1216

These are estimates and their error variance may be quite large. The runs (even with 12000 record tape) may take more than 1/4 hour each, for instance.

http://www.ssc.wisc.edu/wais/WAIS656008.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656008.txt

James Geffert, 1965. Modifications of WISTAB. May 26, 1965. WAIS paper 645-066. Cross Tabulations Data Processing Programs Tables.

James Geffert, Gene Moyer
WAIS Working Paper 645-066
REVISED June 29, 1965

MODIFICATIONS OF WISTAB*

Three new control cards have been added to the WISTAB program. This document describes their use to persons familiar with the program. The three cards are designated C, L, and A and will be explained in that order.

The C Card
Purpose: The purpose of this control card is to specify criteria for ignoring observations in frequency tabulations. The C specification affects all tables defined after its inclusion point in the packet of control cards.

The L Card
Purpose: The purpose of this control card is to specify criteria for ignoring observations in frequency tabulations. The L card affects tables only until the next *X control card is encountered.

Notes
1. 
More than one C or L card can be used in a program.**

The A Card
Purpose: This card instructs the program to add the value of the variable specified into the table rather than adding frequencies of observations.
Format: Same as *X, *Y, *Z cards except that cols. 21-80 are blank and *A appears in cols. 3 and 4.
Note: The A card affects only one table.**

----------------------------
*McCoy and Kenyon, WISTAB User's Manual, University of Wisconsin, School of Commerce Data Processing Center, Madison, July 1964.
** *C, *L, and *A may appear in any order so long as they immediately precede the *X card.

Format of C and L Cards

Card Column   Entry         Explanation
1-2           XX            Sequence Number. Specifies card sequence number to machine operator.
3-4           *C or *L      Control Card Identification. Identifies this control card as a C or L Card.
5-14          XXXXXXXXXXX   Variable Name. A short name which describes the variable.
15-17         XXX           Beginning of Variable. Designates the left-most, or high order, position of this variable in the input record.
18-20         XXX           End of Variable. Designates the right-most, or low order, position of this variable in the input record.
21            G             Elimination Criteria. Instructs program to ignore records in which the variable specified in 15-20 is Greater than the value indicated in cols. 22-80 range specifications.
              L             Instructs program to ignore records in which the variable specified in 15-20 is Less than the value indicated in cols. 22-80 range specifications.
              U*            Instructs program to ignore records in which the variable specified in 15-20 is Unequal to the value or values indicated in cols. 22-80 range specifications.
              E             Instructs program to ignore records in which the variable specified in 15-20 is Equal to the value or values indicated in cols. 22-80 range specifications.
22-80                       Interval Values. As described on page 10 of McCoy and Kenyon, WISTAB User's Manual. (TOT must be the last interval in the sequence.)

*Note: The U option may only be used to eliminate one value from consideration, e.g. 
01 *C Year 010011 U 57 TOT would eliminate all records except those for year 1957. 01 *C Year 010011 U 5758 TOT, however, at the present time eliminates all records (including 57 and 58) from consideration, resulting in blank tables.hahttp://www.ssc.wisc.edu/wais/WAIS645066.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645066.txt`@nVictor Cassidy 1965:4A List of Tables in the Wisconsin Summary Statistics June 10, 19655 WAIS paper645-071b$Analysis Miscellaneous Tables?V?PVictor M. Cassidy WAIS Paper 645-071 June 10, 1965. A List of Tables in The Wisconsin Summary Statistics 1962 Tables: State Totals Table No. Title I. Data from 1962 Individual Income Tax Returns: State Totals by Net Taxable Income Class 1 Number of returns with adjusted gross income. 2 Amount of adjusted gross income. 3 Average adjusted gross income per return with adjusted gross income. 4 Amount of deductions. 5 Average deduction per return with adjusted gross income. 6 Amount of Net Taxable Income. 7 Average Net Taxable Income per return with adjusted gross income. 8 Number of returns with net normal tax. 9 Net normal tax. 10 Average net normal tax per taxpayer. 11 Percent of total returns with adjusted gross income. 12 Percent of total adjusted gross income. 13 Number of returns with deductions. 14 Percent of total number of returns with deductions. 15 Percent of total deductions. 16 Deductions as a percent of adjusted gross income. 17 Number of returns with net taxable income. 18 Percent of number of returns with net taxable income. 19 Percent of total net taxable income. 20 Percent of total number of returns with net normal tax. 21 Cumulative percent of total number of returns with net normal tax. 22 Percent of total net normal tax. 23 Cumulative percent of total net normal tax. 24 Effective tax rate based on net taxable income. 25 Effective tax rate based on adjusted gross income. Table Title No. II. 
Data From 1962 Indvidual Income Tax Short Form Returns: State Totals by Net Taxable Income Class 1 Number of returns with adjusted gross income. 2 Percent of total number of returns with adjusted gross income. 3 Amount of adjusted gross income. 4 Percent of total amount of adjusted gross income. 5 Number of returns with deductions. 6 Percent of total number of returns with deductions. 7 Amount of deductions. 8 Percent of total deductions. 9 Deductions as a percent of adjusted gross income. 10 Number of returns with net taxable income 11 Percent of total number of returns with net taxable income 12 Amount of net taxable income. 13 Percent of total amount of net taxable income. 14 Number of returns with net normal tax. 15 Percent of total number of returns with net normal tax. 16 Amount of net normal tax. 17 Percent of total amount of net normal tax. III. Data from 1962 Individual Income Tax Long Form Returns: State Totals by Net Taxable Income Class (1) Number of returns with adjusted gross income. (2) Percent of total number of returns with adjusted gross income, (3) Amount of adjusted gross income. (4) Percent of total amount of adjusted gross income. (5) Number of returns with deductions. (6) Percent of total number of returns with deductions. (7) Amount of deductions. (8) Percent of total deductions. (9) Deductions as a percent of adjusted gross income. (10) Number of returns with net taxable income. Table No. Title 11 Percent of total number of returns with net taxable income. 12 Amount of net taxable income. 13 Percent of total amount of net taxable income. 14 Number of returns with net normal tax. 15 Percent of total number of returns with net normal tax. 16 Amount of net normal tax. 17 Percent of total amount of net normal tax. IV. Data on Individual Income Tax Returns 1962 by Type of Return: State Totals. 1 Short form returns. 2 Long form returns. 3 Total returns. V. 
Data from 1962 Individual Income Tax Returns State Totals by Income Bracket: State Totals 1 Net taxable income. 2 Percent of total net taxable income by income bracket. VI. Data on Taxable Individual Income Tax Returns 1962: State Totals by Net Taxable Income Class 1 Number of returns with adjusted gross income. 2 Adjusted gross income. 3 Number of returns with deductions. 4 Amount of deductions. 5 Number of returns with net taxable income. 6 Amount of net taxable income. 7 Number of returns with net normal tax. 8 Amount of net normal tax. VII. Data on Non-Taxable Individual Income Tax Returns 1962: State Totals by Net Taxable Income Class 1 Number of returns with adjusted gross income. 2 Adjusted gross income. 3 Number of returns with deductions. 4 Amount of deductions. 5 Number of returns with net taxable income. 6 Amount of net taxable income. VIII. Data on Individual Income Tax Returns 1962 Taxable and Non-Taxable Returns: State Totals 1 Taxable returns. 2 Non-taxable returns. 3 Total returns. 1962 Tables: County Totals I. Data from 1962 Individual Income Tax Returns by County and by District 1 Number of returns with adjusted gross income. 2 Adjusted gross income. 3 Deductions. 4 Number of returns with net taxable income. 5 Net taxable income. 6 Number of returns with net normal tax. 7 Net normal tax. II. Derived Data from Individual Income Tax Returns 1962 by County and by District 1 Average adjusted gross income per return with adjusted gross income. 2 Rank. 3 Average deduction per return with adjusted gross income. 4 Rank. 5 Average net taxable income per return with adjusted gross income. 6 Rank. 7 Average net normal tax per taxpayer. 8 Rank. 9 Percentage of total number of returns with adjusted gross income. 10 Rank. 11 Percentage of total adjusted gross income. 12 Rank. 13 Percentage of total number of returns with net taxable income. 14 Rank. 15 Percentage of total net taxable income. 16 Rank.
17 Percentage of total number of returns with net normal tax. 18 Rank. 19 Percentage of total net normal tax. 20 Rank. 21 Deductions as a percentage of adjusted gross income. 22 Rank. 23 Effective tax rate based on net taxable income. 24 Rank. 25 Effective tax rate based on adjusted gross income. 26 Rank. (No Number) Net Taxable Income and Net Normal Tax for Municipalities by County 1 Net taxable income. 2 Net normal tax. 1963 Tables Table No. Title Entry 1 Gross Taxable Income By Net Taxable Income Class, 1963. (number with Gross Taxable Income, amount in thousands of dollars) 2 Deductions By Net Taxable Income Class, 1963. (number of persons with deductions, amount of deductions in thousands of dollars) 3 Net Taxable Income By Net Taxable Income Class, 1963. (number of persons with net taxable income amount in thousands of dollars) 4 Exemptions by Net Taxable Income Class, 1963. (number of persons with exemptions by net taxable income class, amount in thousands of dollars) 5 Taxes Paid to other States by Net Taxable Income Class, 1963. (number of persons paying taxes to other states by Net Taxable Income Class,amount in thousands of dollars) 6 Net Normal Tax Liability by Net Taxable Income Class, 1963. (number with net normal tax liability by net taxable income class, amount in thousands of dollars) 7 Taxes Paid by Withholding by Net Taxable Income Class, 1963. (number of persons paying taxes by withholding by net taxable income class amount in thousands of dollars) 8 Taxes Paid by Declaration by Net Taxable Income Class, 1963. (number of persons paying taxes by declaration by net taxable income class, amount in thousands of dollars) 9 Taxes Paid with Initial Returns by Net Taxable Income Class, 1963. (number of persons making payment with returns, amounts in thousands of dollars) 10 Spouse Offset 1 by Net Taxable Income Class, 1963. 
(number of persons with spouse offset 1 by net taxable income class, amounts in thousands of dollars) 11 Spouse Offset 2 by Net Taxable Income Class, 1963. (spouse offset 2 by net taxable income class, amounts in thousands of dollars) 12 Declaration Offsets by Net Taxable Income Class, 1963. (number of declaration offsets by net taxable income class, amounts in thousands of dollars) 13 Refunds by Net Taxable Income Class, 1963. (number of refunds by net taxable income class, amount in thousands of dollars) 14 Refund Reductions Made by Office Audit by Net Taxable Income Classes, 1963. (number of refund reductions by office audit by net taxable income class, amount in thousands of dollars) 15 Refund Increases Made by Office Audit by Net Taxable Income Classes, 1963. (number of refund increases by office audit by net taxable income class, amount in thousands of dollars) 16 Delinquent Charges Assessed by Net Taxable Income Class, 1963. (number of delinquent charges assessed by net taxable income class, amount in thousands of dollars) 17 Gross Taxable Income of Individuals Using 10 Percent Standard Deduction (Method 1) by Net Taxable Income Class, 1963. (number of persons using 10% standard deduction, amount in thousands of dollars) 18 Deductions Claimed by Individuals Using 10 Percent Standard Deduction (Method 1) by Net Taxable Income Class, 1963. (number of deductions claimed by individuals using 10% standard deduction, amount in thousands of dollars) 19 Net Taxable Income of Individuals Using 10 Percent Standard Deduction (Method 1) by Net Taxable Income Class, 1963. (number of persons using 10% standard deduction by net taxable income class, amount in thousands of dollars) 20 Exemptions Claimed by Individuals Using 10 Percent Standard Deduction (Method 1) by Net Taxable Income Class, 1963. (number of exemptions claimed by individuals using 10% standard deduction, amount in thousands of dollars) 21 Net Normal Tax Liability of Individuals Using 10% Standard Deduction (Method 1) by Net Taxable Income Class, 1963. (number of net normal tax liabilities by net taxable income class, amount in thousands of dollars) 22 Gross Taxable Income of Individuals Using Itemized Deductions (Method 2) by Net Taxable Income Class, 1963. (number of individuals by net taxable income class, amount in thousands of dollars) 23 Deductions Claimed by Individuals Using Itemized Deductions (Method 2) by Net Taxable Income Class, 1963. (number of individuals using itemized deductions (method 2) by net taxable income class, amount in thousands of dollars) 24 Net Taxable Income of Individuals Using Itemized Deductions (Method 2) by Net Taxable Income Class, 1963. (number of individuals using itemized deductions by net taxable income class, amount in thousands of dollars) 25 Exemptions Claimed by Individuals Using Itemized Deductions (Method 2) by Net Taxable Income Class, 1963. (number of individuals claiming exemptions by net taxable income class, amount in thousands of dollars) 26 Net Normal Tax Liability of Individuals Using Itemized Deductions by Net Taxable Income Class, 1963. (net normal tax liabilities by net taxable income class, amount in thousands of dollars) 27 Gross Taxable Income of Individuals Using $300 Minimum Deduction (Method 3) by Net Taxable Income Class, 1963. (number of individuals using $300 minimum deduction by net taxable income class, amount in thousands of dollars) 28 Deductions Claimed by Individuals Using $300 Minimum Deduction (Method 3) by Net Taxable Income Class, 1963. (number of individuals claiming deductions using $300 minimum deduction, amount in thousands of dollars) 29 Net Taxable Income of Individuals Using $300 Minimum Deduction (Method 3) by Net Taxable Income Class, 1963.
(number with net taxable income, amount in thousands of dollars) 30 Exemptions Claimed by Individuals Using $300 Minimum Deduction (Method 3) by Net Taxable Income Class, 1963. (number of individuals claiming exemptions, amount in thousands of dollars) 31 Net Normal Tax Liability of Individuals Using $300 Minimum Deduction (Method 3), by Net Taxable Income Class, 1963. (number with net normal tax liabilities, amount in thousands of dollars) 32 Gross Taxable Income Reported Resulting in No Net Taxable Liability by Net Taxable Income Class, 1963. (number with gross taxable income, amount in thousands of dollars, number with deductions, amount in thousands of dollars, number with net taxable income, amount in thousands of dollars, number with exemptions, amount in thousands of dollars) 33 Individuals with Gross Taxable Income, Percentages of Totals, Number, and Amounts, Using Methods 1, 2, and 3 by Net Taxable Income Class, 1963. (percentage of total number, percentage of total amount, entries for methods 1, 2 and 3) 34 Individuals Claiming Deductions, Percentages of Totals, Number and Amounts, Using Methods 1, 2 and 3, by Net Taxable Income Class, 1963. (percentage of total number, percentage of total amount, entries for methods 1, 2 and 3) 35 Individuals with Net Taxable Income, Percentages of Totals, Number and Amounts, Using Methods 1, 2 and 3, by Net Taxable Income Class, 1963. (percentage of total number, percentage of total amount, entries for methods 1, 2 and 3) 36 Individuals Claiming Exemptions, Percentages of Totals, Number and Amounts Using Methods 1, 2 and 3, by Net Taxable Income Class, 1963. (percentage of total number, percentage of total amount, entries for methods 1, 2 and 3) 37 Net Normal Tax Liability, Percentages of Totals, Number, and Amounts, by Methods 1, 2 and 3, by Net Taxable Income Class, 1963.
(percentage of total number, percentage of total amount, entries for methods 1, 2 and 3) 38 Percentage of Tax Revenue Paid by Withholding, Declaration and Payments with Returns, by Net Taxable Income Class, 1963. (percentage of total number, percentage of total amount, entries for withholding, declaration and payments with returns) 39 All Individuals, Number and Percentage of Total, Not Taxable Income by Bracket, Amount and Percentage of Total, Rate and Revenue Collected per 0.1 Percent Rate by Net Taxable Income Class, 1963. (number, amount in dollars, rate by bracket, revenue collected per 0.1 percent of (rate thousands of dollars)) 40 Deductions Listed, and Net Taxable Income as a Percentage of Gross Taxable Income by Net Taxable Income Class, 1963. (deductions as percentage, net taxable income as percentage of total) 41 Exemptions Claimed and Normal Net Tax Liability as a Percentage of Net Taxable Income by Net Taxable Income Class, 1963. (exemptions as percentage of net taxable income, net normal tax liability as percentage of net taxable income) 42 Refunds Requested and Exemptions Claimed as Percentages of Net Normal Tax Liability by Net Taxable Income Class, 1963. (refunds as percentages of net normal tax liability exemptions as percentages of net normal tax liability) Tables from the Annual Reports Title Entry Net Taxable Income Assessed 1946 (Individual, corporation, total) (by Counties) Schedule 8. Net Taxable Income Assessed 1947 (Individual, corporation, total) (by Counties) Schedule 8., Net Taxable Income Assessed January 1, (Individual, corporation, total) 1948-June 30, 1948 (by Counties) Schedule 8. Net Taxable Income Assessed (Individual, corporation, total) July 1, 1948-June 30, 1949 (by Counties) Schedule 8:. Net Taxable Income Assessed (Individual, corporation, total) July 1, 1949-June 30, 1950 (by Counties) Schedule 8. Net Taxable Income Assessed (Individual, corporation, total) July 1, 1950 through June 30, 1951 (by Counties) Schedule 8. 
Net Taxable Income Assessed July 1, (Individual, corporation, total) 1951 through June 30, 1952 (by Counties): Schedule 8. Net Taxable Income Assessed (Individual, corporation, total) July 1, 1953 through June 30, 1954 (by Counties) Schedule 8. Net Taxable Income Assessed July 1, (Individual, corporation, total) 1954 through June 30, 1955 (by Counties) Schedule 8. Net Taxable Income Assessed July 1, (Individual, corporation, total) 1955 through June 30, 1956 (by Counties) Schedule 8. Net Taxable Income Assessed (Individual, corporation, total) July 1, 1957 through June 30, 1958 (by Counties) Schedule 8 Net Taxable income Assessed (Individual, corporation, total) July 1, 1958 through June 30, 1959 (by Counties) Schedule 8. Net Taxable Income,Assessed July 1, (Individual, corporation, total) 1960 through June 30, 1961 (by Counties) Schedule 8 Net Taxable Income Assessed July 1, (Individual, corporation, total) 1961 through June 30, 1962 (by Counties) Schedule 8. Net Taxable Income Assessed July 1, (Individual, corporation, total) 1959 through June 30, 1960 (by Counties) Schedule 8.hahttp://www.ssc.wisc.edu/wais/WAIS645071.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645071.txta2 Ron Durant 1965d]Initial Estimates of Earnings--Dynamics Model Based on WAIS Income Data for the Years 1958-59b June 17, 1965; WAIS paper645-072e4.Analysis Proposals- For Analyses, Theses, etc.00Ron Durant WAIS Working Paper 645-072 June 17, 1965 Report To: WAIS Staff: Bridges, David, Groves, Lampman, Miller, Barger, Bauman, Geffert, Moyer, Ryshpan, Roubal, Seavey, Wiegner, and Watts, Cain, and Hansen From: Ron Durant Subject: Initial Estimates of Earnings--Dynamics Model based on WAIS Income Data for the years 1958-59. Outline: I. Procedure. II. Statement of Earnings--Dynamics Model. III. Presentation of Equations And Tables. IV. Comments on Findings. Appendix A--List of Variables. 1. 
Procedure: The general procedure was to estimate the model for the 1959 cross-section data consisting of 2510 observations.(See Equations 1 and 2). Then the data was stratified according to the specified groups and the model was again estimated. (See Equations 3 thru 26). This author anticipates estimating this model for the years 1948 through 1960 in order to analize the earnings dynamics of a sub-group of married male Wisconsin taxpayers. II. Statement of Earnings--Dynamics Model: A. Functional Forms Estimated: t = 1959 (See APPENDIX A for list of Variables). (II.1) Ei,t = ( ) (II.2) Ei,t/Ei,t-1 = Same as (II.1) (II.3) Note--Subtracting Ei,t-1 from both sides of (II.1) allows one to view the dependent variable as Ei,t = Ei,t - Ei,t-1 = with a0' = a0 - 1 where a0' is the coefficient of Ei,t-1 in the transformed equation. (11.4) Note-Subtracting II.1 from both sides of (II.2) allows one to view the dependent variable as Ei,t/Ei,t-1 = % change in earnings with a0' = a0 - 1 where a0' is the constant term in the transformed equation. III. Presentation of Equations and Tables: A. Codes Used: * = significant at 5% significance level ** = significant at 1% significance level y = mean value of regressand R2 = coefficient of determination n = number of observations s = standard error of estimate n.a = no observations for that variable Equation 1: 1959 (Cross-section) Regressand: Ei,t Regressors Regressors Constant 4928 Constant 1591 O(1) 1108 00) 352.3 0(2) 2045 0 (2) 291.1 A(1) 390.3 -356.7 A(2) 45.79 A(2) -263.5' A(3) A(3) -320,6 j -252.2 ,7168 Ei,t-1 .0032 Note effect of occupation; tWi,t .0407 age-earnings (past period) interaction. Stratification :L, t 250.3 by occupation-age groups D(2) 930,0 will allow for these D(3) 322,1 interactions. 
36003 Ni,t -312.3 t t -343.0:K Ri,t -67.34 Ri,t 690.1 t Di,t 393,9 M(u) .,t 50,81 y R2 n 5854 .6353 2510 2259 Equation 2: 1959 (Cross-section) Regressand: Ei,t/Ei,t-1 Regressors Constant 1.340 0(1) -.0862 0(2) -.0153 AM -.1205 A(2) --.150224 A(3) -.1311 Z , C-1 -.00002017'-"* M t -.00000522 9 . - .0000113 9 t OM .008611 ~(2) -.02188 D(3) .05729 Sri t .04121 9 Oi,t .0347 tr : R. t -.1327 R, t .024 Ui,t .07796* 9 iut .9557* 9 iRt .389*' 9 Y 1.132 R2 .0469 n 2510 s .5969 Occupation-Unskilled Age 25-34 Equation 3: Regressand: Ei,t Equation 4: Regressand: Ei,t/Ei,t-1 Regressors Constant 2036 2.375 a Ai, -106.1 -.03946 Ei,t-1 .6856 -.0001913 ,- -.1048 -.0000721 Pi, t .5035 .00001326 U(1) 34.55 -.3016 U(2) -.292.1 95,30 R(3) -16.16 -.2708 Ni,t 11.A LA Oi,t -403.1 -.1879 R -307.8 -.2379 R t 80.82 .01:56 I Ui,t 336.1* . 09309 Mi,t 217.4 .4839 M(R) -243.9 2.066 r i,t y 5101 1.263 R2 ,5243 .2334 n 304 304 s 1078 ,6537 Occupation-Unskilled Age 35-44 Equation 5: Equation 6: Regressand: Ei,t Regressand: Ei,t/Ei,t-1 Regressors Constant 1131** 1.827 Ai,t 171.9 -.01237 Ei,t-1 .7319** -.000182** Wi,t -.03911 .000009562 Pi,t -.1249 -.00007537 D(1) 302.1 .1317 D(2) 390.7* .1746 D(3) 340.0 .3096** Ni,t 171.9 .0879 Oi,t -588.4 -.3067 R*i,t -393.4 -.1192 Ri,t 48.70 -.1370 Ui,t 668.3** .2221** M(u)i,t -117.7 -.2448 M(R)i,t 44.42 .1582 y 5220 1.165 R2 .5975 .1888 n 281 281 s 997.0 .5407 Occupation-Unskilled Age 45-54 Equation 7: Equation 8: Regressand: Ei,t Regressand: Ei,t/Ei,t-1 Constant 1325** 1.331** 0 166.5 .09327 Ai,t .7488 -,00006606 - .0123 .000001419 Ei,t-1 &7, t AP. 
t .3232 -.00003428 D(1) 2.746 -.005609 D(2) -80.82 .00000733 DO) 207.1 .1091 Ni, 401.9 .1130 Oi,t -211.5 -.258 Ri,t -322,2 -.05182 s 666.2 .04646 Ri,t U 283.2* .09878 Ui,t 909.1 1.216 14(U) i, NA NA M(R) it y R2 4825 .6289 281 969.9 1.122 .0707 231 .4394 Occupation-Unskilled Age 55-64 Equation 9: Equation 10: Regressand: Ei,t Regressand : Ei,t/Ei,t-1 Regressors Constant 1302** 1.67 157.1 -.1292 Ai,t .7867 -.0001039 Ei,t-1 -.04996 -.00006 Wi,t -.2312 -.00026221* t &P i, t -67.52 .004616 D(1) D(2) -218.6 -.01738 D(3) 617.4 .1062 b-w -160.6 -.08357 i, t -573.9 .1113 t -12.06 -.2741 Ri,t 160.3 .3660* Ri,t 379.2 002985 Ui,t NA N.A Y (u) N.a N.A i,t M(R) i,t y 4678 R2 .6037 n 232 s 1127 1.161 ,2841 232 .4934 Skilled Age 25-34 Equation 11: Equation 12: Regressand: Ei,t Regressand: Ei,t/Ei,t-1 Regressors Constant 2329 2.166` a A 18.65 . '03519 *k-41, E,t-1 .6575 -.0001971' a4 is t -.1550 -.0001048 16 Pi,t .00928 .000003162 D(1, 119.7 ,04129 DO 384.3 .06792 D(3) 655.8 -.09857 M 131,8 .1179 lei t -6.875 .2505 Ri -638,3 -.3221 R i,t 134.5 -.05625 U i,t 292,9 .1526 Mi,t N. A. ;3 .A ,) t N.A. y 6011 R 2 .4051 1.28 .1561 183 S 1414 .729 Skilled Age 35-44 Equation 13: Regressand: Ei,t Equation 14: REgressand: Ei,t/Ei,t-1 Regressors Constant D(2) (3) M Lit to Oi,t R3. t R ht i,t no 2161'' 47.7 .7533... t -.05086 -.2099 ]..499* .06269 -.0000922 .0001412 -470.1 -109.1 -551.7 380.6 .0000870 127.3 -636.8 5809 233.5 110A. 30.79 .08412 ,1382 .3172 .07859 -.01247 -.1495 .1023 .04779 N.A. -.03095 R 2 6557 216 1267 .5143 U 8 1.120 .1481 216 .4528 Skilled Age 45-54 Equation 15: Equation 16: Regressand: Ei,t Regressand: Ei,t/Ei,t-1 Regressors Constant 1460 1.184 9 Ai, 114.9 .01507 Ei, t-1 .7786` -.00002641 Wi t -.1978 -.00004444 Pi -.2227 -.00002981 9 t Dal, 20.15 .03995 D(2) -167.6 .002784 D(3) 255.7 -,01548 t -146.0 -.02154 L0. 
t 331.0 .07281 x -57.60 -.004298 Ri, Ri, t 472.8 .03873 Ui, t 195.6 .02681 Mi,t(u) w 2.934 264.6 Mi,t (R) 478.8 .03290 I, t y 5969 1.092 R2 .6163 .3152 n 213 213 s 1152 .3011 Skilled Age 55-64 Equation 17: Regressand: Ei,t Equation 18: Regressand: Ei,t/Ei,t-1 Regressors Constant 959.2 1,516 r Ai, t -128.0 -.07021 E.8634 -.00007788 ,w. t -.05167 -.00001554 Pi , t -.2373 -.0001152** D(1) 463.3 .09973 D(2) -2196.0 -,04435 D(3) 2195,0 .2546 Ni, t -1054.0 -.03138 t -714.9 -.3097 Ri,t 173.9 .09972 Ri,t 930.9 ,07889 Ui,t 574.7 .1228 N.A N.A N.A NIA ;u3 2.31t m (R) ,t y R2 n S 5714 1.132 .6073 .2050 134 1.34 1514 .3534 Managerial, Official & Self-Employed (Non-Farm) Age 25-34 Equation 19: Regressand: Ei,t Equation 20: Regressand: Ei,t/Ei,t-1 Regressors Constant 480.3 1.038 v -544.2 .1631 A. t Ei,t-1 .8692 -,00001522 16W t .004776 .00008114 Pi, t *X .0001194 1.51 D(1) 1053 .1142 D(2) 1267 -.1071 D(3) 805.7 .1434 Ni, N. A. N.A. LO. t -429.4 .3410 R -702.7 .02411 R -2384 -.4674 0 792.8 .2018 1,t N.A. N.A. M(R) -213.1 .09672 i,t y 6458 1,178 R2 . 78 -.015 n 94 94 s 2517 .7499 Managerial, Official & Self-Employed (Non-Farm) Age 35-44 Equation 21: Equation 22: Regressand: Ei, t Regressand: Ei,t/Ei,t-1 Regressors Constant 2201 .7725 e Ai, t -1036 .1823 Ei,t-1 .4130 -.00001199 Wi,t -.6145 -.00006102 9 t LP. t -.6539 .00005549 D(1) 2175.0 .2907 D(2) 1284.0 .02761 D 3) 1897.0 -.06740 Ni,t 2786.0 .1547 9 t wit -1313.0 -.0274 9 Ri, t -1581.0 -.1491 Ri,t -.2292 t - Ui, t 2436.0 .275 a M(u) N.A. N.A. M(u) M(R) 6417 .4363 i,t Y 7359 .9966 R2 .4153 .0704 n 198 1.98 s 5044 .7296 Managerial, Official & Self-Employed (Non-Farm) Age 45-54 Equation 23: Regressand Ei, t Equation 24: Regressand: Ei,t/Ei,t-1 Regressors Constant I -1309 ,8719 Ai,t -82.06 -.03487 E. .9306 -.000002859 ,8383 011 -.000001263 ,1200 -.00005139' 16P i, t 1044.0 .1785 D(1) O(2) 111.5 .05909 O(3) -1169 -.05080 596.2 .1620 -741.5 .07093 mss., 331.6 -.08583 t 6.846 -.02427 093.3' .2558 * Btu) N.A. N.A. 
i,t -1977.0 -.05117 M(R) Y 7314 1.009 R2 .8142 .0288 n 222 222 2859 .650. Managerial, Official & Self-Employed (Non-Farm) Age 55-64 Equation 25: Regressand: Ei,t Equation 26: Regressand: Ei,t/Ei,t-1 Regressors Constant 388.6 1.036** 9 -434,1 -.09715 Ai,t .9631 .000006217 Ei, -,0786 -.00001518 t-1 L . t -.06525 .00002262 Pi, 781,7 .02048 t D(1) D(2) 639.6 --.4137 D(3) 252,3 .02098 -298.5 .07860 Ni -901.2 -.2178 t Oi,t 563.6 -.2382 t Ri,t -303.7 .0324 Riot 757.9., .07188 Ui,t N.Ae N,AQ M(u) 1,t M(R) N, A0 N.A. i,t y 6544 .9999 R2 .922 .0598 n 152 152 s 1815 .4931

Table I. 1959 Mean Earnings by Occupation and Age
        A(0)    A(1)    A(2)    A(3)
O(0)    5101    5220    4825    4678
O(1)    6011    6557    5969    5714
O(2)    6458    7359    7314    6544

Table II. 1959 Standard Deviation of Mean Earnings by Occupation and Age
        A(0)    A(1)    A(2)    A(3)
O(0)    1561    1569    1589    1786
O(1)    1829    1814    1855    2407
O(2)    5338    6580    6618    6478

Table III. 1958-1959 Mean Earnings Change by Occupation and Age Group
        A(0)    A(1)    A(2)    A(3)
O(0)    681.9   431.5   423.4   402.5
O(1)    934.3   473.6   419.0   448.5
O(2)    699.9   -503.4  200.2   429.0

Table IV. 1958-1959 Standard Deviation of Mean Earnings Change by Occupation and Age Group
        A(0)    A(1)    A(2)    A(3)
O(0)    1249    1095    1079    1184
O(1)    1479    1340    1232    1582
O(2)    2882    6907    3042    1831

IV. Comments on Findings: Since these estimates are very recent, there has not been sufficient time for adequate analysis of these results. In addition the running of additional years will undoubtedly bring to light the various consistencies or inconsistencies of the data or model or both. Generally we can note in equations 1 and 2 that the significant variables other than age are of the sign that theory would lead us to expect. The problem with the age variable is no doubt due to the interaction with age and past period income. In viewing the separate occupation-age groups we see how past period earnings plays a varied role as a predictor, becoming a larger determinant as we move up the occupational ladder.
In addition, the significance of the remaining regressors, with the possible exception of the urban variable, becomes rather sporadic. Such findings could well prove the basis for emphasizing research on the stochastic model approach to income distribution, that is, within appropriate occupation-age-urban groups. Tables I and II verify existing age-income profile hypotheses concerning the levels and dispersion of earnings. Tables III and IV indicate that while the dispersion of earnings differentials also increases as we move up the occupation scale, the movement of the mean level of earnings differentials is not clear. No doubt additional years' data will shed light on this aspect of earnings dynamics. APPENDIX A LIST OF VARIABLES: Earnings: (Sum of Total Wages and Salaries and Self Employed Income) Ei,t = Current Period Earnings of the ith married male Ei,t-1 = Past Period Earnings of the ith married male Occupation: Oi,t(1) = 1 if skilled; 0 otherwise Oi,t(2) = 1 if Managerial, Official and Self-Employed (Non-Farm); 0 otherwise Oi,t(1) = Oi,t(2) = 0 if unskilled Age: In each occupation-age group: A'i,t = 1 if married male is in the lower 5 years of age bracket; 0 otherwise Ai,t(1) = 1 if Age 35-44; 0 otherwise Ai,t(2) = 1 if Age 45-54; 0 otherwise Ai,t(3) = 1 if Age 55-64; 0 otherwise Ai,t(1) = Ai,t(2) = Ai,t(3) = 0 if Age 25-34 Wife's Earnings: ΔWi,t = Change in earnings of wife of ith married male (Wi,t - Wi,t-1) Property Income: ΔPi,t = Change in property income of the ith married male Dependents: Di,t(1) = 1 if 3-4 Dependents; 0 otherwise. Di,t(2) = 1 if 5-6 Dependents; 0 otherwise. Di,t(3) = 1 if ≥ 7 Dependents; 0 otherwise. Di,t(1) = Di,t(2) = Di,t(3) = 0 if 1-2 Dependents Additional Earners: Ni,t = 1 if change in additional earners > 0 during the period t-1 to t; 0 otherwise. Occupation Change: Oi,t = 1 if change in occupation; 0 otherwise Region: Ri,t* = 1 if the ith married male lives in a county designated for assistance by the Area Redevelopment Administration.
0 otherwise. Race: Ri,t = 1 if the ith married male is nonwhite; 0 otherwise. Urban: Ui,t = 1 if the ith married male is residing in a major Wisconsin urban area (i.e., Madison, Metropolitan Milwaukee, Racine, Kenosha, La Crosse, and Green Bay); 0 otherwise. Mobility: Mi,t(u) = 1 if ith married male moved from Rural to Urban; 0 otherwise. Mi,t(R) = 1 if ith married male moved from Urban to Rural; 0 otherwise. Mi,t(u) = Mi,t(R) = 0 if no move or movement from Urban to Urban or Rural to Rural areas. http://www.ssc.wisc.edu/wais/WAIS645072.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645072.txt Richard Bauman Gene Moyer 1965 Some Preliminary Tests of Whether the Method of Choosing Name Groups Influenced Some Characteristics of the WAIS Tax Sample June 21, 1965 WAIS paper 645-073 WAIS Sample http://www.ssc.wisc.edu/wais/WAIS645073.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645073.txt Some of the data in this document contains confidential information. The confidential information has been removed and preserved in the secure data archive (OLDR). Contact CDHA Data Archivist at cdhadata@ssc.wisc.edu for more information. Richard Bauman Gene Moyer WAIS Paper 645-073 June 21, 1965 Some Preliminary Tests of Whether the Method of Choosing Name Groups Influenced Some Characteristics of the WAIS Tax Sample WAIS chose its sample on the basis of the 1958 Madison District Tax Rolls. We decided upon 50 name groups with an average group size of 50 (~ 52.1) since we wanted a one-percent sample of the 260,121 taxpayers in the Madison District. We devised criteria for limiting name group size which were based upon non-symmetry of last-name distribution.* The following tests are based on the 1962 tax rolls.
If the distribution of names in the Madison District in 1962 remained the same except for a shift in the entire distribution due to a changing number of taxpayers, then the above procedure could be expected to generate a sample in 1962 of the same percentage as we chose in 1958. Differential name group growth may have an effect on the sample chosen in different years; therefore the following procedure is used. A. Revised Criteria for Picking Madison-based Name Groups. 315,786 taxpayers in the Madison District filed returns in 1962. If 50 name groups are desired, the average group size is (315,786 x .01)/50 = 63.2. Since the main purpose of these tests is to compare Madison-based name groups and State-based name groups, and since we originally based our group-size limits on estimates of the underlying name-density distribution, it seems appropriate, to a first order of approximation, to form new criteria which will generate similar group sizes. We may therefore round the desired average group size of 63.2 to 60. The original lower and upper limits for name-group size were 60% and 200% of the average group size. To form similar groups we shall use 35 (~ 36 = .6 x 60) as the lower limit and 120 (= 2 x 60) as the upper limit. ------------------------ * See "A Sample of Wisconsin Individual Income Tax Returns, 1946-1959," Roger F. Miller, unnumbered WAIS Memorandum, March 31, 1965. ------------------------ Thus, the revised criteria are: (a) In order to compare this method of selection with the actual method we will assume that the original randomly chosen "starting" individuals would have been chosen at random from the 1962 Madison Tax Rolls. This assumption is plausible if the "starting" individual exists in the 1962 Madison Tax Roll. If he does not file a return in 1962, it is also necessary to assume that an alternative "starting" individual could have been chosen in such a way that he does not affect the name cluster starting point. (Example: Suppose we chose G***** *.
B**** as a "starting" individual, and on this basis chose all the G. B***** by applying our criteria. If G***** *. B**** also filed in 1962, he could again be chosen as a starting individual. If G***** *. B**** didn't file in 1962, the second assumption implies that a G***** *. B**** would have been chosen rather than a F*** B****.) Therefore the first person in each name group and in the Madison Tax District is equivalent to the actual starting individual, since a sample has been selected and since the criteria called for a "forward rollover" procedure. (b) If the number of individuals having the same last name as a person chosen in (a) above lies between 35 and 120, then take all those persons as one cluster. (c) If the number of individuals having the same last name as a person chosen in (a) above is greater than 120, then consider all those who also have the same first initial. (i) If the latter is at least 35, then all those persons will be taken as a cluster. (ii) If those with the same last name and same first initial number less than 35, proceed to the next first initial in alphabetic sequence, expanding the original grouping until at least 35 persons make up the cluster.** (d) If the number of individuals having the same last name as a person chosen in (a) above is less than 35, then proceed to the next last name in alphabetic sequence, expanding the original groupings until at least 35 persons make up the cluster.** The Madison-based name groups will be chosen as before, by taking all those persons in the State with names in the groups chosen on the basis of the 1962 Madison Tax Roll according to the above revised criteria. There is no reason to believe that name groups selected by the revised criteria will be exactly the same as those chosen originally. Nevertheless, it is felt that the effects of differential name group growth should be eliminated in a study of the effects of Statewide versus Madison Roll book selection. B.
Revised Criteria for Picking Statewide-based Name Groups. Application of the original criteria resulted in an actual sample of .775% of the Wisconsin Taxpayer population. For comparative purposes, it seems desirable to have revised criteria for State-based name groups which will generate a sample more nearly equal in size to a sample based on Madison Tax Roll selection. 1,771,668 taxpayers filed returns in 1962. If 50 name groups are desired, the average desired group size will now be given by: (1,771,668 x .00775)/50 = 274. Setting the lower limit at 165 (~ .6 x 274) and the upper limit at 548 (= 2 x 274) will then enable us to develop criteria parallel to (a) thru (d) above. ---------------------- ** If the procedure of going forward alphabetically from the name chosen to increase the size of the cluster to at least 35 carries the cluster over 120, it will be necessary to consider the next letter in the first name in the added groups (the initial in (d), the letter following the initial in (c)(ii) above). --------------------- (a') In lieu of the assumption in (a), we will assume that the original randomly chosen "starting" individuals would have been chosen at random from the 1962 Wisconsin Tax Rolls. This enables us to use the first person in each name group as defined in our originally chosen groups and on the 1962 State Tax Roll as a "starting" individual. (In order to apply these revised criteria, it will be necessary to consider the 4-district tax rolls "as if" they were alphabetically arranged as a combined roll.) (b'), (c') and (d') are the same as criteria (b), (c) and (d), except that 165 is substituted wherever 35 appears and 548 is substituted wherever 120 appears. C. Summary of Conceptual Samples to be Compared. There are three conceptual samples that may be used in the tests: Sample I is the portion of the 1962 Wisconsin taxpayer population selected on the basis of 1958 Madison Tax Rolls and the original criteria.
Sample II is the portion of the 1962 Wisconsin taxpayer population selected on the basis of 1962 Madison Tax Rolls and the revised criteria (a) thru (d). Sample III is the portion of the 1962 Wisconsin taxpayers population selected on the basis of 1962 Wisconsin Tax Rolls and the revised criteria (a') thru (d'). Utilizing Sample I as a base, Sample II and Sample III may be expressed as differences. List II may be defined as two lists of records: IIA, a list of those taxpayers included in sample II and not included in sample I; IIB, a list of those taxpayers included in sample I and not included in sample II. List III may also be defined as two lists of records: IIIA, a list of those taxpayers included in sample III and not included in sample I; IIIB, a list of those taxpayers included in sample I and not included in sample III. D. A Note on the Mechanical Selection of the Three Samples. Sample I includes all those persons on each District Tax Roll for 1962 included within the name groups as defined by the original criteria. Programs for extracting records for these persons can be applied separately to each District Tax Roll. Variables to be included in the extracted record are found in the tests outlined below. Sample II results from the combination of list II and Sample I. List II will necessitate the application of extract programs only to the records of those persons included in sample II and not in Sample I, since list IIB merely defines a group to be deleted from Sample I. The programs can again be applied to each District Tax Roll separately. Sample III is a combination of List III and Sample I. To get list III, it is necessary to apply criteria (a') thru (d') to the entire State Tax Roll. List III will be blank for those name groups (i = 1...50) with a total number (x) of individuals in Sample I which satisfy 165 < Xi < 548. 
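The difference lists of section C (IIA, IIB, IIIA, IIIB) and the size condition just stated can be sketched as plain set operations. This is only an illustration under stated assumptions: the taxpayer identifiers and the function names below are hypothetical and are not part of the WAIS extract programs.

```python
def difference_lists(base, other):
    """Express `other` as differences from `base` (Sample I).

    Returns (added, dropped): `added` plays the role of list IIA or IIIA,
    `dropped` the role of list IIB or IIIB.
    """
    base, other = set(base), set(other)
    return other - base, base - other


def apply_difference(base, added, dropped):
    """Rebuild a sample from Sample I plus its two difference lists."""
    return (set(base) - set(dropped)) | set(added)


def list_iii_is_blank(group_size, lower=165, upper=548):
    """True when a Sample I name group already satisfies 165 < X < 548."""
    return lower < group_size < upper


# Hypothetical taxpayer identifiers, for illustration only.
sample_i = {"T001", "T002", "T003", "T004"}
sample_ii = {"T002", "T003", "T005"}

list_iia, list_iib = difference_lists(sample_i, sample_ii)
rebuilt_ii = apply_difference(sample_i, list_iia, list_iib)
```

Only the records in the "added" list need new extraction; the "dropped" list merely marks records to delete from Sample I.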
If this condition is not satisfied, an efficient solution for list III may call for either a piecewise application of (a') thru (d') to each District Roll (r) so that 165 < (sum over r of Xir) < 548 (r = 1, 2, 3, 4), or a procedure for adding names to deficient name groups. (An estimate of what the latter procedure would involve can be made on the basis of 1958 actual results. The smallest name group (B*****) had 41 individuals (State-wide) in 1958. A pessimistic estimate would require 548 - 41 = 507 names to be added on a State-wide basis to this name group in order to satisfy criterion (d').)

E. Tests.

One group of questions to which WAIS would like answers involves the kinds of name groups chosen by the three samples.

Let Xijt = the size of the ith name group chosen by the jth criterion in the tth sample (i = 1...50; j = 2, 3, 4; t = I, II, III).

( ) = ( ), the mean group size chosen by the jth criterion and in the tth sample.

( ) = ( ), the mean group size of all groups chosen in the tth sample.

( ) = ( )

( ) = ( )

The first question WAIS would like answered is "Do WAIS's rules give group sizes which are independent of the criterion which was used?" This suggests hypotheses of the form:

( ) = ( ) = ( ) = ( )

If ( ) < ( ) at a 5% level of significance, the appropriate hypothesis (depending on the value of t) would be rejected.

This test can also be performed on WAIS's original name groups. The following table summarizes the calculations for these name groups. Note that in this case, t = IV, where Sample IV is defined as the actual 1958 selection.***

-------------------
*** The writers are indebted to Marshall Seavey for doing the calculations in this test.
-------------------

WAIS GROUPS - Mean and Variance of Group Size Chosen by Each Criterion

Criterion ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )

The ratio ( ) = .5785. Since ( ) = 3.22 at the 5% significance level, the hypothesis that ( ) = ( ) = ( ) = ( ) is accepted for WAIS name groups.
This, of course, is not a surprising result, since it merely implies that the rules we chose were effective.

Another question that might be asked is: "What sort of distribution of name group sizes is implied by the WAIS criteria?" A strong hypothesis may be formulated as follows:

( ) = ( )

and tested for goodness of fit using chi-square procedures, where the test statistic is defined as

( ) < ( )

For WAIS's original name groups, this hypothesis is flatly rejected at the 99% level. This is not too surprising, however, since our criteria did allow for a large amount of fluctuation in name group size, although not as large as exists in the population.

Weaker hypotheses that the distribution of name groups is approximated by some distribution may be developed ad infinitum, but the burden of providing some rationale for the hypothesized relationship still exists. One weaker hypothesis that is fairly easy to rationalize is that WAIS's criteria were such as to modify the actual distribution of names so that group size was normally distributed. In order to test this hypothesis, the specifications can be changed as follows:

( ) ~ ( )

The chi-square statistic is computed according to:

( ) = ( )

where:

( ) = the number of name groups with ( ) in the Kth category; K is chosen so that there is at least 1 observation in each category.

( ) = the expected number of name groups in the Kth category if it were in fact true that ( ) ~ ( )

This procedure was used to test WAIS's original name groups. Eight groups were defined, giving a chi-square value of 15.05. For ( ) the hypothesis that our group size is distributed normally with mean ( ) = 236.92 and variance ( ) = 20,635.8 is rejected at the 5% level of significance, but cannot be rejected at the 1% level. Figure 1 illustrates the actual and theoretical frequencies involved.

NOTE- We are working on a follow-up paper which develops more general methods of testing the above samples and criteria.
This draft is being circulated so the suggestions of the WAIS staff can be incorporated into our tests.

[Graph: # of Groups Chosen plotted against # in Name Group (1958)]

Figure 1. A Graphical Comparison of the Distribution of WAIS's Name Group Size with a Discrete Approximation to the Normal Curve Having the Same Mean and Variance

Gene Moyer 1965. A Proposal for Indexing Tables and Other Tools of Analysis. July 1, 1965. WAIS paper 656-002. General Papers (Regarding WAIS); Maintenance System - Files, Data, Etc.; Proposals - For Analyses, Theses, etc.; Tables.

Gene Moyer
WAIS 656-002
July 1, 1965
Revised July 8, 1965

A Proposal for Indexing Tables and Other Tools of Analysis

The tables, regressions, and other printed results which WAIS will be producing during the next year will require some indexing system because of their large number. The SSRI library is also working on an indexing system using the "KWIC" IBM system. The following card layout makes WAIS' system of indexing compatible with the "KWIC" system of SSRI.

1. As tables are run, give each table (page) or regression output page a ten digit identification number:

   1 - 4   = WAIS
   5 - 6   = number of extract file or abbreviation of other source file (e.g. master file = mf)
   7 - 9   = book number in which table is located
   10 - 12 = consecutive number beginning with 001 (page number of book)

2. For each table, regression, or other output, punch the following card, which corresponds to the "KWIC" "author card" (see attached form for coding these cards).

   Columns   # Columns   Data
   1 - 6     6           Date table etc. run
   7 - 12    6           Source document # 1 (WAIS paper number)
   13 - 18   6           Source document # 2 (WAIS paper number)
   19 - 24   6           Source document # 3 (WAIS paper number)
   25 - 60   36          Blank
   61 - 72   12          Table number (KWIC reference number)
   73 - 74   2           Sequence number within card type (always "-1")
   75        1           Card type (always "1")
   76 - 80   5           Blank

3.
Each table must also have the following "title" card:

   Columns   # Columns   Data
   1 - 60    60          Title of table and remarks
   61 - 72   12          Table number
   73 - 74   2           Sequence number within card type (always "1")
   75        1           Card type (always "2")
   76 - 80   5           Blank

For each variable in the table etc., punch the following card (subject heading or key word card):

   Columns   # Columns   Data
   1 - 6     6           Date table run
   7 - 12    6           Source document # 1 (WAIS paper number)
   13 - 18   6           Source document # 2 (WAIS paper number)
   19 - 24   6           Source document # 3 (WAIS paper number)
   25 - 56   32          Blank
   57 - 60   4           WAIS variable number (L. Appleton's system: see pp. 3-11)
   61 - 72   12          Table number
   73 - 74   2           Sequence number within card type (zone over 73 in last card)
   75        1           Card type (always "4")
   76 - 80   5           Blank

These cards can either be run on basic machines or on the 1410 KWIC program, to make indices for books or to find tables and variables as needed.

A List of WAIS Variables and their Numbers

TAX SAMPLE (Includes Demographic Variable) 01.00

Income Heading
Wages & Salary 01.01
Interest 01.02
Dividends 01.03
Rents 01.04
Capital Gains 01.05
Profit & Loss of Business 01.06
Trustees or Fiduciaries 01.07
Partnership 01.08
Total Income 01.09
Adjusted Gross Income 01.10
Net Income Item Basis 01.11
Net Taxable Income St.D.Basis 01.12
Social Security 01.13
Assessed Tax. Income 01.14

Deductions Heading
Standard Deduction 01.15
Wis. Tax 01.16
Union Dues 01.17
Medical & Dental 01.18
Total Interest 01.19
Business Int. (not deduct.) 01.20
Dividend Deductible 01.21
All Other 01.22
Alimony 01.23
Forest Crop 01.24
Total deduction before Federal Tax 01.25
Donation 01.26
Federal Tax & Social Security 01.27
Miscellaneous Heading
Miscellaneous 01.28
Block or Column Indicator 01.29
Items in Miscellaneous 01.30
Number Wage & Salary Indicator 01.31
Farm Schedule, P & L Indicator 01.32
Stock Dividend 01.33
Auto Expense 01.34
Other Expenditure 01.35
Spouse Income 1959 or 1960 01.36
Form Type Indicator 01.37
Complete or Incomplete 01.38
Medical-Dental Indicator 01.39
Federal-Tax Indicator 01.40
Donation Indicator 01.41

Errors Possible Heading
B392 Source Sum Error 01.42
B393 Subtract Auto Error 01.43
B394 Standard Ded. Error 01.44
B395 First Phase Ded. Error 01.45
B396 Net Income Before Fed. Tax. Error 01.46
B397 Net Income Before Donation Error 01.47
B398 Net Income Item Basis Error 01.48

Demographic Variable Heading
Residence Location 01.49
County Prior Years 01.50
Address Change 01.51
Occupation 01.52
Occupation Change 01.53
Return Reason 01.54
Partnership 01.55
Spouse Separate Income 01.56
Marriage Details 01.57
Head of Family 01.58
Number of Dependents 01.59

Identification Year of Return Heading Parameters 01.60
Name Group 01.61
Household Position 01.62
Sex 01.63

CAPITAL GAINS 02.00

Rental Income from Real Estate Heading
Cost of Land and Buildings 02.01
Cost of Buildings only 02.02
Depreciation Allowed in Each Year 1947-1959 02.03
Property taxes paid, 1947-1959 02.04
Interest, Repairs, and miscellaneous expenses, 1947-1959 02.05
Total gross Income 1947-1959 02.06

Class "B" Real Estate
Amount of Interest Paid on Repairs, and Mortgages on the Property 02.07
Proportion of the Property not inhabited by Taxpayer 02.08

Value of business or Small Farm Heading
Beginning Inventory, 1947-1959 02.09
Closing Inventory, 1947-1959 02.10
Depreciation & Depletion, 1947-1959 02.11
Interest Paid, 1947-1959 02.12
Original Cost of total Fixed Assets, 1947-1959 02.13
Depreciation Deducted in Prior Years, 1947-1959 02.14
Kind of Business 02.15
Identification Number of Asset 02.16

Realized Capital Gains & Losses
Month & Year taxpayer
acquired Asset 02.17
Method of Acquiring Asset 02.18
Month & Year Asset Sold 02.19
Cost of Asset 02.20
Depreciation in Prior Years 02.21
Subsequent Improvements to Property 02.22
Amount Received by Taxpayer from the Sale 02.23
Amount of Gain or Loss 02.24
Residence Rollover 02.25

SOCIAL SECURITY 03.00

Identification Number 03.01
Social Security Number 03.02
Name 03.03
Address 03.04
Multiple Account Number 03.05
Identification Indication of Disagreement between Name on Record and Finder Card 03.06
Birth Date 03.07
Race 03.08
Sex 03.09
Indication of Railroad Activity 03.10
Newly Posted Credit Earnings Item 03.11
Indication of Additional Earnings 03.12
Account in Benefit Status 03.13
Termination Date of Benefits 03.14
Credit Indication 03.15
Earnings statement issued in Year indicated 03.16
Indication of Self-employment Activity 03.17
Indication of Delinquent Self-employment Activity 03.18
Indication of Agricultural Activity 03.19
Earnings, 1937-1963 03.20
Wage Quarters of Coverage 1947-1963 03.21
Self-employment Quarters of Coverage 1951-1963 03.22
Agricultural Quarters of Coverage 1955-1963 03.23
Quarterly Wage Quarters of Coverage Pattern 1955-1963 03.24
Indication that there is a claim against the account 03.25
Disability 03.26

SURVEY 04.00

Demographic Characteristics Heading
Residence: Location & Type, Recent Mobility 04.01
Race, Sex, Year of Birth 04.02
Marital Status, Year married or Divorced, etc.
04.03
Location & Size of Place "R" Reared 04.04
Education & other Training 04.05
Age, Residence, Education of both "R's" parents 04.06
Number of Children had by "R" (natural and legally adopted) 04.07
For each of "R's" children: Age; Place at which living if not home; Year left (if applicable) 04.08

Occupational Characteristics Heading
Employment status; occupation and industry (last had if R unemployed) 04.09
Year started present work, hours worked 04.10
Qualifications considered important by "R" for present job 04.11
List of Relevant Jobs "R" had before present Job 04.12
Job which Best Prepared "R" for Present job 04.13
R's wish and possibilities to change job between 1964 and 1969 04.14
Unemployment history for 1959-1964 04.15

Income & Selected Deductions Heading
Income in 1963 for each member of family 04.16
"R's" estimate of direction of changed income from 1963-1968 04.17
Health, accident, and other insurance, sick payment 04.18
Retirement Program 04.19
Contributions to Social, Religious and other Organizations or institutions in 1963 04.20
Donations of Property or Stock to Charitable Organizations or institutions in 1963 04.21
Governmental and private transfer payments, inheritance, gifts, etc. 04.22

Housing Heading
Size of Home or Farm; Owned or Rented 04.23
Rent Amount, or Property Value 04.24
Property Taxes on the House or on the Farm 1963 04.25
Current and Historical Amount of Debt on the house or Farm 04.26

Investment Assets Heading (no., value and year owned or sold)
Durable Goods 04.27
Liquid Assets 04.28
U.S. Savings Bonds & Other Bonds 04.29
Loans, Interest Bearing Notes, Land Contracts, Mortgages 04.30
Investment Clubs 04.31
Stock option 04.32
Publicly traded Stock and Mutual fund Shares 04.33
Unincorporated Business, Partnership, or Professional Practice 04.34
Closely Held Corporations 04.35
Real Estate 04.36
Life Insurance 04.37
Other Income-Producing Assets 04.38
Future Investment Practice 04.39
Probable use by the "R" of windfall equal to half a year's income 04.40

[Coding form: Date Table Run (columns 1-6); Source Document # 1 (7-12); Source Document # 2 (13-18); Source Document # 3 (19-24); WAIS, extract or other file identification, Book # and Page # (61-72); Sequence within card and Card # (73-75). VARIABLES (57-60). Keypuncher: For each variable, duplicate card 1, columns 1-56 and 61-72, and use the variable numbers and card sequence numbers given below.]

http://www.ssc.wisc.edu/wais/WAIS656002.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656002.txt

Richard Bauman 1965. Analysis of Variance Tests for the Effects of Criteria and Sampling Techniques on Name Group Size in WAIS's Sample. July 8, 1965. WAIS paper 656-003. WAIS Samples.

Richard Bauman
WAIS Paper 656-003
July 8, 1965

Analysis of Variance Tests for the Effects of Criteria and Sampling Techniques on Name Group Size in WAIS's Sample*

For the purposes of these tests, we define a three-category classification by the criterion used. Criteria b' thru d' have been constructed so that they are equivalent to the revised criteria b thru d, and the original criteria 2 thru 4. For simplicity, the criteria are identified by number only in the following presentation.
In an analysis of variance framework, this breakdown can be termed the column classification.

An implicit assumption was made in WAIS Paper 645-073 that the change in name density from 1958 to 1962 was such that proportional changes in the criteria were adequate to get samples of comparable sizes. We also rounded the average group sizes, since persons are not (willingly) divisible into fractions. These approximations should not seriously affect the conclusions of the following tests. The standard deviations due to name group variation within the "cells" can be expected to be many times greater than any possible error due to an inexact approximation. (The following S.D.'s are calculated from WAIS Paper 645-073, p. 7: S.D. (IV|2) = 186.7, S.D. (IV|3) = 68.5, S.D. (IV|4) = 132.5.)

For the purposes of these tests, we also define a three-category classification by the sampling technique used; each category involves a sample drawn in 1962. These samples are, as before:

Sample I: 1962 name groups selected on the basis of 1958 Madison Tax Rolls and the original criteria, 2 thru 4

Sample II: 1962 name groups selected on the basis of 1962 Madison Tax Rolls and the revised criteria, b thru d (2 thru 4)

Sample III: 1962 name groups selected on the basis of 1962 State Tax Rolls and the revised criteria, b' thru d' (2 thru 4)

---------------------
*NOTE: WAIS Paper 645-073 contains introductory material upon which this paper is based.
---------------------

Again referring to the analysis of variance framework, let us call this the row classification. We may present the cross-classification in a bivariate table:

                  Criterion
   Sample    2         3         4
   I         Xi2I      Xi3I      Xi4I
   II        Xi2II     Xi3II     Xi4II
   III       Xi2III    Xi3III    Xi4III

The table shows the cells into which any given Xijt would fall. There may be effects of inclusion in a particular cell which are not simply the sum of a row effect and a column effect.
These may be called row-column interaction effects, and they may be estimated, as well as the effects of row or column classification, in a dummy variable regression model which is, in many respects, equivalent to the two-way analysis of variance. One formulation of the regression model which will enable WAIS to test several important hypotheses about the size of name groups chosen in several ways is as follows:

$X_{ijt} = \alpha_0 z_0 + \beta_1 z_1 + \beta_2 z_2 + \gamma_3 z_3 + \gamma_4 z_4 + \delta_5 z_5 + \delta_6 z_6 + \delta_7 z_7 + \delta_8 z_8 + e_{ijt}$

where the variables are:

$X_{ijt}$ = Name Group Size (as defined above)
$z_0$ = 1 for all observations
$z_1$ = 1 if chosen by criterion 3, 0 otherwise
$z_2$ = 1 if chosen by criterion 4, 0 otherwise
$z_3$ = 1 if chosen in Sample I, 0 otherwise
$z_4$ = 1 if chosen in Sample III, 0 otherwise
$z_5 = z_1 \times z_3$ = 1 if $z_1 = 1$ and $z_3 = 1$, 0 otherwise
$z_6 = z_1 \times z_4$ = 1 if $z_1 = 1$ and $z_4 = 1$, 0 otherwise
$z_7 = z_2 \times z_3$ = 1 if $z_2 = 1$ and $z_3 = 1$, 0 otherwise
$z_8 = z_2 \times z_4$ = 1 if $z_2 = 1$ and $z_4 = 1$, 0 otherwise
$e_{ijt}$ = the stochastic component of $X_{ijt}$.

The "classical" assumptions are made about the behavior of $e_{ijt}$, namely:

$E(e_{ijt}) = 0$ for all i, j, t;  $E(ee') = \sigma^2 I$, where $e$ is the vector of the $e_{ijt}$.

If these assumptions are found to be untenable, the model can be reformulated to take account of the nonspherical properties of the disturbances. The following tests are based on the assumption of spherical disturbances, and the results do not, in general, hold if these stochastic assumptions are not met.

Coefficients

$a_0$, the least-squares estimate of $\alpha_0$, will be the mean group size of those name groups chosen by criterion 2 and chosen in Sample II. The b's, c's, and d's, respectively the estimates of the $\beta$'s, $\gamma$'s, and $\delta$'s, are the mean effects (in this model expressed as differences from $a_0$) of the classification. In other words, we may interpret all the coefficients as simply a cell mean and differences between cell means.
Hence:

$a_0 = \bar{X}_{2,II}$
$b_1 = \bar{X}_{3,II} - \bar{X}_{2,II}$
$b_2 = \bar{X}_{4,II} - \bar{X}_{2,II}$
$c_3 = \bar{X}_{2,I} - \bar{X}_{2,II}$
$c_4 = \bar{X}_{2,III} - \bar{X}_{2,II}$
$d_5 = \bar{X}_{3,I} - \bar{X}_{3,II} - \bar{X}_{2,I} + \bar{X}_{2,II}$
$d_6 = \bar{X}_{3,III} - \bar{X}_{3,II} - \bar{X}_{2,III} + \bar{X}_{2,II}$
$d_7 = \bar{X}_{4,I} - \bar{X}_{4,II} - \bar{X}_{2,I} + \bar{X}_{2,II}$
$d_8 = \bar{X}_{4,III} - \bar{X}_{4,II} - \bar{X}_{2,III} + \bar{X}_{2,II}$

where $\bar{X}_{j,t}$ is the mean group size of the groups chosen by the jth criterion and in the tth sample.

If we were interested in estimating only these coefficients, it would be a relatively simple task to compute the means without using a regression. The regression analysis, in addition, efficiently provides the sums of squared deviations needed to perform the following "F" tests.

Tests Involving the Model

A variety of tests can be performed using the above regression model. These logically involve a decreasing order of generality, e.g., acceptance of the hypothesis that there is no significance in the entire classification scheme would imply that there is no need to proceed to other possible tests. The following table shows the tests which would be most useful to WAIS. For each test the entries are the hypothesis, the specifications on coefficients, the test statistic, and the tabled values; the result column is blank.

(1) No effect of the classification
    Specifications on coefficients: $\beta_j = \gamma_t = \delta_m = 0$ (all j, t, m; j = 1, 2; t = 3, 4; m = 5, 6, 7, 8)
    Test statistic: $F = (SSR/8)/(SSE/141)$
    Tabled values: F = 2.00 (5%), F = 2.62 (1%)

(2) No effect of the criteria
    Specifications on coefficients: $\beta_j = \delta_m = 0$ (all j, m; j = 1, 2; m = 5, 6, 7, 8)
    Test statistic: $F = (SSR_c/6)/(SSE/141)$
    Tabled values: F = 2.16 (5%), F = 2.92 (1%)

$SSR_c$ is defined as the regression sum of squares involving the specified coefficients in test 2.

(3) No effect of the sampling techniques
    Specifications on coefficients: $\gamma_t = \delta_m = 0$ (all t, m; t = 3, 4; m = 5, 6, 7, 8)
    Test statistic: $F = (SSR_s/6)/(SSE/141)$
    Tabled values: F = 2.16 (5%), F = 2.92 (1%)

$SSR_s$ is defined as the regression sum of squares involving the specified coefficients in test 3.

If both hypotheses (2) and (3) are rejected, the next general test that may be performed is the following:

(4) No interaction effects (i.e., the effects of criteria and sampling techniques are additive)
    Specifications on coefficients: $\delta_m = 0$ (all m; m = 5, 6, 7, 8)
    Test statistic: $F = (SSR_i/4)/(SSE/141)$
    Tabled values: F = 2.43 (5%), F = 3.44 (1%)

$SSR_i$ is defined as the regression sum of squares involving the specified coefficients in test 4.

It is not unreasonable to expect that none of the above hypotheses will be accepted.
Some valuable conclusions about WAIS's procedure may still be reached using the following tests.

(5) No difference in sampling technique between Sample I and Sample II
    Specifications on coefficients: $\gamma_3 = \delta_5 = \delta_7 = 0$
    Test statistic: $F = (SSR_I/3)/(SSE/141)$
    Tabled values: F = 2.67 (5%), F = 3.91 (1%)

$SSR_I$ is defined as the regression sum of squares involving the specified coefficients in test 5. This test involves testing the need for making revised criteria for drawing the 1962 sample based on Madison Tax Rolls. If the hypothesis is accepted, then we would be led to conclude that Sample I is "just as good" as Sample II for purposes of comparison of sampling techniques. The possibility of making a type II error, however, would suggest that we retain the distinction for further tests.

(6) No difference in sampling technique between Sample II and Sample III
    Specifications on coefficients: $\gamma_4 = \delta_6 = \delta_8 = 0$
    Test statistic: $F = (SSR_{III}/3)/(SSE/141)$
    Tabled values: F = 2.67 (5%), F = 3.91 (1%)

$SSR_{III}$ is defined as the regression sum of squares involving the specified coefficients in test 6. This test should be a very interesting one for WAIS. If the hypothesis is accepted, we may conclude that Madison-based selection (which in this case was based on the 1962 Madison Rolls) made no significant difference in name group size, when other sources of variation are accounted for.

(7) No difference between groups chosen by criterion 2 and criterion 3
    Specifications on coefficients: $\beta_1 = \delta_5 = \delta_6 = 0$
    Test statistic: $F = (SSR_3/3)/(SSE/141)$
    Tabled values: F = 2.67 (5%), F = 3.91 (1%)

$SSR_3$ is defined as the regression sum of squares involving the coefficients specified in (7).

(8) No difference between groups chosen by criterion 2 and criterion 4
    Specifications on coefficients: $\beta_2 = \delta_7 = \delta_8 = 0$
    Test statistic: $F = (SSR_4/3)/(SSE/141)$
    Tabled values: F = 2.67 (5%), F = 3.91 (1%)

$SSR_4$ is defined as the regression sum of squares involving the coefficients specified in (8).

This list of tests involving hypotheses about name group size does not exhaust the meaningful and interesting hypotheses that can be tested using the above model. For example, one such test would involve comparison of Samples I and III.
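Because the dummy-variable model above has one coefficient for each of the nine criterion-by-sample cells, its least-squares coefficients can be read off directly from cell means. The following is a minimal sketch of that reading, not the WAIS computation: the group sizes are invented for illustration, and the names `cells`, `coef`, `dummies` and `fitted` are hypothetical.

```python
from statistics import mean

# Hypothetical name-group sizes keyed by (criterion, sample); not WAIS data.
cells = {
    (2, "I"): [180, 220], (3, "I"): [260, 300], (4, "I"): [170, 190],
    (2, "II"): [200, 240], (3, "II"): [250, 270], (4, "II"): [210, 230],
    (2, "III"): [300, 340], (3, "III"): [280, 320], (4, "III"): [290, 310],
}
m = {key: mean(values) for key, values in cells.items()}
base = m[(2, "II")]  # reference cell: criterion 2, Sample II

# Main effects are differences from the reference cell mean.
coef = {
    "a0": base,
    "b1": m[(3, "II")] - base,
    "b2": m[(4, "II")] - base,
    "c3": m[(2, "I")] - base,
    "c4": m[(2, "III")] - base,
}
# Interactions are what is left after the two main effects are removed.
coef["d5"] = m[(3, "I")] - base - coef["b1"] - coef["c3"]
coef["d6"] = m[(3, "III")] - base - coef["b1"] - coef["c4"]
coef["d7"] = m[(4, "I")] - base - coef["b2"] - coef["c3"]
coef["d8"] = m[(4, "III")] - base - coef["b2"] - coef["c4"]


def dummies(criterion, sample):
    """Return z0..z8 for one observation, following the model's coding."""
    z1, z2 = int(criterion == 3), int(criterion == 4)
    z3, z4 = int(sample == "I"), int(sample == "III")
    return [1, z1, z2, z3, z4, z1 * z3, z1 * z4, z2 * z3, z2 * z4]


def fitted(criterion, sample):
    """Fitted group size for a cell; equals that cell's mean."""
    names = ["a0", "b1", "b2", "c3", "c4", "d5", "d6", "d7", "d8"]
    return sum(coef[n] * z for n, z in zip(names, dummies(criterion, sample)))
```

With these coefficients every fitted value reproduces its cell mean, which is why the hypotheses in tests (1) through (8) amount to statements that particular differences between cell means are zero.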
It is felt that the above list is sufficient insofar as it should indicate the direction that further testing on name group size must take.

http://www.ssc.wisc.edu/wais/WAIS656003.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656003.txt

Roger Miller 1965

B (which implies X is the appropriate potentially averageable income, a the appropriate parameter).
(b) y < B (implying Z, y are appropriate).

(5) Legal definition type: expand each of the six qualifying group types [3 each in (4)(a) and (4)(b)] into four groups by segregating the data appropriate for each of the types of legal definition: j = 1, ..., 4.

(6) Percentage change variant: expand each of the 24 qualified and "directed" legal definition type groups into four groups according to the value of a or y (whichever is appropriate; see (4) above) used in computing potentially averageable income. Values of a and y are, respectively:
    (a) a : 1.00; 1.25; 1.33; and 1.50.
    (b) y : 1.00; 0.80; 0.75; and 0.67.
    Note that the values of y are each the reciprocal of the value of a directly above.

(7) Yield: (3 x 2) + (3 x 2 x 4 x 4) = 6 + 96 = 102 pages.

C. Detailed description of cell defining variables on a page:

(1) Rows: classes of "Federal Net Taxable Income" for the computation year (F5) as follows (in dollars):

    Class lower bounds (excluded)   Class upper bounds (included)
    -99,999                         -1,000
    -1,000                          0
    0                               1,000
    1,000                           3,000
    3,000                           5,000
    5,000                           7,500
    7,500                           10,000
    10,000                          15,000
    15,000                          20,000
    20,000                          25,000
    25,000                          35,000
    35,000                          50,000
    50,000                          75,000
    75,000                          99,999
    over 99,999
    Total

(2) Columns:
    (a) Nonqualifiers (plus qualifier totals):
        (i) Failed A-test only
        (ii) Failed B-test only
        (iii) Failed both tests
        (iv) Failed neither test (total of qualifiers)
        (v) Totals
    (b) Qualifiers only, by classes of potentially averageable income:

        Class lower bounds   Class upper bounds
        -99,999              -1,000
        -1,000               0
        0                    1,000
        1,000                2,000
        2,000                3,000
        3,000                4,000
        4,000                5,000
        5,000                10,000
        over 10,000
        Total

D.
Detailed specification of cell entries:

(1) Number of returns:
    (a) In tabulations dealing with separate filers, each person filing counts as one return even if two persons are a married couple.
    (b) In tabulations dealing with joint filers, each couple filing counts as one return (thus there are precisely twice as many returns for married couples filing separately as there are for married couples filing jointly).

(2) Row percentages (each column entry as a percent of its row total)

(3) Column percentages (each row entry as a percent of its column total)

(4) Means of associated variables:
    (a) For the 6 "nonqualifier" pages, give the mean of G (adjusted gross income).
    (b) For the 96 "qualifier" pages, give the mean of X or Z (potentially averageable income) as is appropriate.

(5) Remarks:
    (a) Tables with cell entries of types (1) and (4) above will also have the following statistics printed:
        (i) Standard deviation
        (ii) Variance
        (iii) Chi-square (and the degrees of freedom).
    (b) There will be 102 pages as described in B. and C. above for each of 4 different types of cell entry (as described in (1)-(4) above), for a total yield of 408 pages.

E.
Future replications: After examining the output of the crosstab specified above plus the crosstabs to be generated by Ron Durant, we will make detailed descriptions of further crosstabs, similar to those above, each further broken down by one or more of the following variables:
    (a) Occupation
    (b) Age group
    (c) Sex
    (d) Previous and present marital status
    (e) Race

http://www.ssc.wisc.edu/wais/WAIS645053.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645053.txt

Gene Moyer 1965. An Estimate of Untaxed Wisconsin Income in 1959 and of Non-Filing Wisconsin Individuals in 1962. March 10, 1965. WAIS paper 645-035. Analysis.

Gene Moyer
WAIS 645-035
March 10, 1965
First Revision

An Estimate of Untaxed Wisconsin Income in 1959 and of Non-Filing Wisconsin Individuals in 1962

One of the questions which is always of concern to persons investigating income taxation is that of the amount of income which escapes the tax base and the number of individuals who manage in some way or other to stay off the tax rolls. Recently this writer was able to get certain tabulations from the Wisconsin State Department of Taxation which allow comparisons to be made of the Wisconsin state income tax base, the Federal tax base in Wisconsin, and the total income reported to the eighteenth decennial census of 1960. In addition, these tabulations allow a comparison to be made of the number of Wisconsin state income tax filers and the number of persons in Wisconsin who reported to the census that they had income in 1959. These tabulations are in three parts [9;10;11], but throughout the paper the entire group will be called the "Wisconsin Statistics."

These Wisconsin Statistics are not so complete as one would desire. In the first place, for the years 1946-1960 [9] they give net taxable income for each year (hereafter to be called NTIt), which is defined as gross taxable income (GTIt) less the amount which taxpayers deducted on their tax forms.
These deductions could take the form of a "standard deduction," (9% of GTIt if GTIt were $5,000 or less or $450 if GTI were more than $5,000) or of itemized deductions very similar to those allowed by Federal income tax law [13;p.15]. In addition to the fact that the "Wisconsin Statistics" covers only NTIt, they do contain distributions of NTIt by county for the years 1946-1960 but there may be a bias in these because if an individual earned income in two or more counties, his income was all counted as having been earned in his home county. In addition, each assessor in the four Wisconsin tax districts compiled his own county statistics and it is not at all clear that they used the same methods of arriving at the county totals. The state totals, of course, contain no such bias, and this paper is concerned with them. Beginning in 1961, the Wisconsin State Tax Department began recording their tax rolls on magnetic tape. As do all groups who shift from IBM cards to magnetic tape, the tax department had trouble with the system at first, and so had to spend almost the entire year 1961 in perfecting their data processing system. By 1962 and 1963, though, better, more detailed statistics were available. It is well to emphasize that these statistics have not been published and copies of them are not available generally to researchers. It was only because of the special relationship between the Wisconsin Assets and Incomes Studies and the State Department of Taxation that these statistics became available to this writer. Before any comparisons can be made among these three sources of income statistics (Wisconsin Statistics, Statistics of Income, and the Census), it is well to recognize that each source uses a unique definition of income in compiling its statistics and that definitional differences alone account for rather large differences in the total amounts of income the three sources report. Table I allows a comparison of these conflicting definitions. 
Table 1. Definitions of Income: Wisconsin State Tax Department, U. S. Internal Revenue Service, Bureau of the Census. Sources: [13] and [4]

(A) Income item; (B) Definition in which (A) appears (Wisconsin State Tax Depart.; Internal Revenue Service; Bureau of Census)

1. Wages, salaries, commissions, fees: X(1) X(2) X
2. Dividends received: X X X
3. Profits from businesses or professional practice: X(3) X X
4. Profits from partnership: X(3) X X
5. Rents and royalties: X(3) X X
6. Gains and losses from the sale of assets: X X(4)
7. Interest received: X(6) X(7) X
8. Annuities and Pensions: X(8) X(8) X
9. Gambling winnings, prizes, awards, bonuses: X X X
10. Labor union strike benefits: X X X
11. Board, lodging, other accommodations furnished employees: X X
12. Alimony received: X X
13. Unemployment compensation: X X
14. Social security receipts, veterans payments, government assistance, other similar governmental receipts: X
15. Other unspecified items: X X X

(1) $1,000 of armed service pay is excluded.
(2) The first $50 of dividends from domestic corporations was excluded.
(3) From property located in Wisconsin.
(4) One-half of gains on capital assets are excluded or the entire gain is taxed at 25%. Losses are deductible up to a limit of capital gains + $1,000. If losses > (capital gain + $1,000), they may be carried forward for five or fewer years.
(5) Items 1-6 constituted 96% of the income reported to the U. S. Internal Revenue Service in 1959 [5, p. 3].
(6) Interest on U. S. Government obligations is exempt along with that on bonds purchased before Wisconsin's first income tax law was passed in 1911.
(7) Interest on state and local bonds is exempt; some part of interest on pre-1941 bonds is exempted by a tax credit.
(8) Only the non-contributing portion of regular payments is taxable.

Notice that the main difference between the state and Federal definitions is that Wisconsin includes all reported capital gains net of losses in the tax base, while the Federal law allows much of this income to be excluded.
Beyond that the differences are in items which constitute four percent or less of the income reported to the Internal Revenue Service. The main differences among the census definition and the tax definitions are the fact that all receipts from the sales of property are excluded from the census definition while both tax definitions include at least some of the gains and losses on sales of property, the fact that the census excluded income "in kind" while both taxing authorities included board, lodging and other accommodations furnished employees by employers, and that the census included many government transfer payments which the tax departments excluded for equity or efficiency reasons. Because of these differences in definition, we will make two comparisons of these three "incomes" in this paper: First we will compare the "incomes" exactly as they are reported by their respective sources (except that the figure from Wisconsin statistics must be estimated) and then we will attempt to add and subtract amounts from each until all three approximate the state definition and again compare the three resulting totals. Before doing any of this, however, we must estimate gross taxable income in Wisconsin in 1959 (GTI59) on the basis of the net taxable income statistic (NTI59) which is reported in the Wisconsin statistics. In order to do this estimation let us recognize that: (1) GTI59 = a NTI59 (a> 1), and (2) a = GTI59/NTI59 provide us with a method of determining GTI59 precisely if a is known. 
Since a is not known, we must look for an estimator, a~, which will allow us to determine GTI~59, the estimated value of GTI59.* Two such estimates of a appear in the Wisconsin statistics,
(3) a1 = GTI62/NTI62, and
(4) a2 = GTI63/NTI63.
Ordinarily one would use
(5) a~ = (a1 + a2)/2
as the estimator of a, but both a1 and a2 possess some upward bias because Wisconsin state law was changed between 1959 and 1962 to allow a 10% standard deduction up to a limit of $1,000 in place of the older 9% or $450 standard deduction [13;p.17]. Therefore a1 > a. Before the 1963 income tax year, the law was further modified to allow a minimum standard deduction of $300 [14;p.17]. This further decreased NTI and so made a2 > a1. Therefore it seems sensible to ignore a2 in the estimation of GTI59.
(6) a1 = 1.1395 [10] and
(7) NTI59 = 5,892,059,358 [9].
Therefore
(8) GTI~59 = a1 NTI59 = 6,714,001,638.
Because Statistics of Income rounds all its figures to the nearest thousand, we shall consider GTI59 to be $6,714,002,000 in all our comparisons with Federal and census income statistics.
*Throughout this paper, if (theta) is some statistic, (theta~) will be our estimate of it.
The U. S. Internal Revenue Service estimates the adjusted gross income of Wisconsin taxpayers in 1959 (AGI59), as defined in Table I, at $6,691,462,000 [5;p.110]. Although both the Federal and the state tax laws call their concepts of gross income "adjusted gross income," the Division of Research calls the state concept "gross taxable income." In order to avoid confusion we will use GTIt whenever we are talking about the state's definition of gross taxable income and AGIt whenever we are talking about the U. S. Internal Revenue Service's definition of gross taxable income. We will use Ct when we are talking about the definition used by the Bureau of the Census. The census does not give its estimate of C59 as such.
It gives the mean income of persons over fourteen years old and the number of such persons as $3,654 and 1,994,198 [4;p.195]. Therefore
(9) C59 = ($3,654)(1,994,198) = $7,286,799,492
or $7,286,799,000 when rounded. This understates income in that it leaves out the income of persons under fourteen, but the techniques used in taking the eighteenth census did not provide for getting the income of these young persons. When WAIS gets its income and age tables run, an estimate of this omitted income may be made as
(10) Omitted income = (Master file income of those under 14 in 1959 / Total master file income in 1959) C59.
This is probably not a very great amount of income, however, and so will not make much difference in the result. The following table shows the relationships among these three definitions of income:

Table II
Some Relationships Among Three Definitions of Income

x        y        x - y           (x - y)/x    (x - y)/y
GTI59    AGI59    $22,540,000     .00336       .00337
C59      GTI59    $572,797,000    .0786        .0853
C59      AGI59    $595,337,000    .0817        .0890

As the table shows, the amounts of income reported on Wisconsin tax forms and on U. S. Federal tax forms are almost identical. Sampling and estimation error could have accounted for that large a difference. In addition, the two taxing authorities could increase their yields by about 8-9% if they adopted the census definition. Equity would not generally allow such a course of action, however. Of much more importance, however, is the question of the differences among the three reported figures if the definitions of all were the same. As Table 1 shows, the major differences between the census definition and the definitions of the two taxing authorities are that the census definition allowed social security, veterans payments, and other similar receipts to be counted as income and did not allow any money from the sale of property (except by those persons in the business of selling that property) to be counted as income.
This exclusion also excluded income from capital gains and losses. The major difference between GTI and AGI is the fact that GTI requires capital gains to be reported in their entirety and allows all capital losses to be deducted, while AGI requires only half of certain capital gains to be reported and allows capital losses to be deducted only to the extent of the amount of reported capital gains plus $1,000. Any further differences amounted to less than 2% of the total amount of AGI59 reported by all U. S. Federal taxpayers [5;p.3]. The total amount of capital gains net of losses which were realized by Wisconsin taxpayers is not available, but the following model indicates a way of estimating the magnitude of these realizations: Let
WG = the total amount of gains less losses realized by Wisconsin taxpayers in 1959.
WH = the total amount of gains less losses (after exclusions and limitations) realized by Wisconsin taxpayers in 1959 and reported on their Federal income tax returns.
UG = the total amount of gains less losses realized by all U. S. taxpayers in 1959.
UH = the total amount of gains less losses (after exclusions and limitations) realized by all U. S. taxpayers in 1959 and reported on their Federal income tax returns.
We wish to estimate WG from a combination of WH, UG, and UH. To do this we make the assumption that:
(11) WG/WH = UG/UH.
Multiplying both sides by WH, then, we get
(12) WG = (UG/UH)(WH).
According to the Internal Revenue Service, WH, UG, and UH have the following values:
(13) WH = 109,851,000 [5;p.67]
(14) UG = 12,331,867,000 [8;p.11]
(15) UH = 6,286,266,000 [5;p.67]
Therefore
(16) WG~ = (12,331,867/6,286,266)(109,851,000) = (1.9617)(109,851,000) = $215,495,000.
The major component which must be deleted from C59 so that it will approximate the state definition is the amount of social security receipts, veterans payments, and other similar transfer payments.
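The proportional scaling in (11) through (16) can be retraced with a few lines of arithmetic. The following is only a sketch under the paper's own assumption that WG/WH = UG/UH; the function name and the choice to round the ratio to four places (as the text does when it writes 1.9617) are ours, not the author's.

```python
# Sketch of the capital-gains scaling in equations (11)-(16).
# The three input figures are those quoted in the text from the
# Internal Revenue Service; the function and variable names are illustrative.

WH = 109_851_000       # Wisconsin net gains as reported on Federal returns
UG = 12_331_867_000    # U.S. net gains realized
UH = 6_286_266_000     # U.S. net gains as reported on Federal returns


def scale_up(reported_state, realized_us, reported_us, ratio_digits=4):
    """Estimate state realized gains, assuming WG/WH = UG/UH."""
    # Round the national ratio first, mirroring the 1.9617 shown in (16).
    ratio = round(realized_us / reported_us, ratio_digits)
    return ratio, ratio * reported_state


ratio, wg_estimate = scale_up(WH, UG, UH)
print(f"UG/UH = {ratio:.4f}")
print(f"WG estimate = ${wg_estimate:,.0f}")
```

To the nearest thousand dollars the result agrees with the $215,495,000 used in the text. The estimate is only as good as the assumption that Wisconsin taxpayers' reported share of realized gains matches the national share.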
Unemployment compensation must be added to AGI59 because it was taxable in Wisconsin but not under Federal law. The Statistical Abstract for 1960 lists $241,320,000 as the amount of old age and disability payments made to Wisconsin citizens in 1959 [3;p.274]; $75,319,000 in veterans payments were received by Wisconsin residents in 1959 [3;p.253]. In addition, 33,027 persons received an average of $45.48 during the month of December, 1959 as aid to dependent children, 994 persons received an average of $85.49 in December for aid to the blind, and 8,057 persons received an average of $82.43 in December for general welfare assistance [3;p.291]. If one assumes that the number of recipients and the monthly amounts were constant over 1959, the total annual payment for ADC would have been $18,025,000 (rounded); the total annual payment for aid to the blind would have been $1,020,000; the total for general assistance would have been $7,970,000. The total of the three would have been $26,995,000. Certainly the total of these three and of the old age and veterans payments should make up the largest proportion of the government assistance components of C59. Therefore:
(17) $241,320,000 + $75,319,000 + $26,995,000 = $343,634,000
will be used as the estimate of this government assistance component (TP) of C59. Unemployment compensation payments (UC) were made to Wisconsin citizens in the amount of $37,016,000 in 1959 [3;p.283]. The three estimates of GTI59, then, are
(18) GTI59 (State) = GTI59 = $6,714,002,000.
(19) GTI59 (Federal) = AGI59 + (WG - WH) + UC = $6,691,462,000 + ($215,495,000 - $109,851,000) + $37,016,000 = $6,834,122,000.
(20) GTI59 (Census) = C59 + WG - TP = $7,286,799,000 + $215,495,000 - $343,634,000 = $7,158,660,000.
The following table indicates the relationships among these three estimates of income.
Table III
Some Relationships Among Three Estimates of GTI59

x                 y                 x - y           (x - y)/x    (x - y)/y
GTI59(Federal)    GTI59(State)      $120,120,000    .0175        .0178
GTI59(Census)     GTI59(State)      $444,658,000    .0621        .0662
GTI59(Census)     GTI59(Federal)    $324,538,000    .0453        .0463

It is difficult to ascertain the meaning of the first row of Table III. Certainly it is interesting that the additional components in the definitions of Wisconsin income are sufficient to make the income reported by the Wisconsin Department of Taxation appear to be larger than that reported by the Internal Revenue Service for Wisconsin. At the same time, if it were possible to compute confidence intervals for the two estimates, it is probable that the intervals would show that the two estimates are actually the same. The last two rows of Table III indicate that only about 4 1/2 to 6% of income as the state defines it is lost to the tax rolls. By far the largest proportion of this is probably income earned by those persons who earn less than the filing requirement of $600 for an individual or $1,400 for a husband-wife family. It is not possible to determine the amount of income earned by these people very precisely, but it may be estimated in the following tortuous manner. Statistical Abstract, 1960 gives the total amount of income of U. S. families and "unattached" individuals who earned less than $2,000 in 1959 as $8,612,000,000 [3;p.317]. There were 7,622,000 such families and individuals [3;p.317]. The mean income of those families and unattached individuals, then, is
(21) US = $8,612,000,000/7,622,000 = 1129.89.
Let the mean income of the same group in Wisconsin in 1959 be W and assume
(22) W = US.
The 1960 Census says that 257,868 families and single persons in Wisconsin earned under $2,000 in 1959. Of these 98,985 were families and 158,883 were unrelated individuals [4;p.188]. The number of families with 1959 income of less than $1,000 was 37,457 [4;p.188].
Their total income was
(23) (1129.89)(37,457) = $42,322,000 (rounded).
While $1,000 is somewhat less than the $1,400 filing requirement for families, the $1,400 criterion is only relevant in those cases in which one spouse has over $800 and the other spouse has income of less than $600, because the spouse with the lower income would not otherwise be required to file. Therefore the major criterion is that the husband and wife each have $600 of income. The $1,000 limit understates the amount of non-reported income, but this seems preferable to certainly overstating it as using a $2,000 limit would do. In addition to the low income families there were 106,985 unrelated individuals who had less than $1,000 in income and who contributed $120,881,000 to the total income of families and unrelated individuals with income under $2,000. There is no way of knowing how these persons were distributed between those earning under $600 per year and those earning more than $600 per year. Among all persons over 14, however, 277,360 persons received less than $499 (including negative) and 235,301 persons received from $500 to $999. If unrelated persons were distributed in the same way, those below the filing requirement received
(24) (277,360/(277,360 + 235,301))($120,881,000) = (.541)($120,881,000) = $65,396,000.
Total unreported income because of filing requirements, then, is probably somewhat larger than
(25) $65,396,000 + $42,322,000 = $107,718,000,
but less than
(26) $120,881,000 + (98,985)(1129.89) = $120,881,000 + $111,842,000 = $232,723,000.
This accounts for from
(27) 107,718/444,658 = 24% to 232,723/444,658 = 52%
of the non-reported income in Wisconsin. This still leaves $211,935,000 or 2.9% of Census income unaccounted for. This is probably composed of small remaining differences in definition, evasion which was reported to the Census, and a conglomerate sampling and estimation error.
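The bounds in (21) through (27) involve several intermediate roundings, so it may help to retrace them in one place. The sketch below simply repeats the paper's arithmetic with the figures quoted above; the variable names are ours, and the intermediate rounding (mean income to the cent, the under-$500 share to three places) is an assumption made to mirror the figures printed in the text.

```python
# Retracing equations (21)-(27): income missed because of filing requirements.
# All input figures are quoted from the text; names are illustrative.

US_MEAN = round(8_612_000_000 / 7_622_000, 2)   # (21): mean income, about 1129.89

FAMILIES_UNDER_1000 = 37_457         # families with income under $1,000
FAMILIES_UNDER_2000 = 98_985         # families with income under $2,000
INDIVIDUALS_INCOME = 120_881_000     # income of unrelated individuals under $1,000
UNDER_500, FROM_500_TO_999 = 277_360, 235_301   # persons over 14, by income class
GAP = 444_658_000                    # Census-based less State estimate, Table III

# (24): share of low-income individuals assumed to fall below the $600 line
share = round(UNDER_500 / (UNDER_500 + FROM_500_TO_999), 3)

lower = share * INDIVIDUALS_INCOME + US_MEAN * FAMILIES_UNDER_1000   # (25)
upper = INDIVIDUALS_INCOME + US_MEAN * FAMILIES_UNDER_2000           # (26)

print(f"lower bound: ${lower:,.0f} ({lower / GAP:.0%} of the gap)")
print(f"upper bound: ${upper:,.0f} ({upper / GAP:.0%} of the gap)")
print(f"left unexplained at the upper bound: ${GAP - upper:,.0f}")
```

The printed bounds differ from (25) and (26) by less than a thousand dollars, because the text rounds each component to the nearest thousand before adding.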
The number of non-filing persons with income cannot be estimated for Federal returns at all because the joint return provision does not allow any tabulation of the number of filers. The number of returns filed in each year is available, but there is no way of knowing the relationship between that number and the number of filers. The number of Wisconsin filers is not available for 1959, either, but it is available for 1962. The Wisconsin statistics indicate that 1,733,743 taxpayers with GTI filed a 1962 income tax form [10]. Unfortunately the Census does not give even the total number of persons with income in Wisconsin in 1959. It gives only the number of persons over 14 years old who had income in 1959. As was noted above, they numbered 1,994,198 persons. Let
(28) E62 = the number of persons with income in Wisconsin in 1962 = 1,994,198 + B59 + Δ60-62,
where B59 = the number of persons under 14 with income in 1959 and Δ60-62 = the change in the number of persons with income from 1960, when the Census was taken, to 1962. The statistic we wish to know is the fraction of persons with income who did not file,
(29) N62 = 1 - 1,733,743/E62.
A good estimator of B59 is the number of persons under 14 who received work permits in 1959. Unfortunately even this figure is not available. A telephone call to the Wisconsin Industrial Commission, Division of Woman and Child Labor, revealed that the statistics on work permits for all persons are available, but these persons range in age up to 18 and the statistics are not divided by age. The only statistic available for the number of work permits issued for children under 14 is the number issued for work in the "street trades," e.g. as newsboys. In 1959, 15,323 of these permits were issued. This excludes permits issued for children to work as caddies and in school lunch programs, the other two main categories of children's occupations.
Even if these numbers double the 15,323 figure, however, the work permit holders would still number only about two percent of persons under 14 and an even smaller percent of the number of persons over 14 who have income. While it will have a negligible effect on the outcome, we will use
(30) (2)(15,323) = 30,646
as the estimator of B59. Δ60-62, the change in the number of persons with income from 1960 to 1962, is not available either. There are several estimators of it available, but it is difficult to know which one to use. One possible estimator of Δ60-62 is the change in the number of Federal returns filed by Wisconsin citizens between 1960 and 1962 (Δ~60-62). Statistics of Income gives 1,407,472 as the number of returns filed by Wisconsin citizens in 1962 [7, table 4] and 1,389,916 as the number of returns filed by Wisconsin citizens in 1960 [6;p.105]. Therefore
(31) Δ~60-62 = 17,456.
But Δ~60-62 is probably a very poor estimator of Δ60-62 because many Federal returns are joint returns, some of which have two income earners and some of which have only one. Even if one knew the number of joint returns, then, he would not know the number of filers under the Wisconsin requirement, and so it seems sensible to reject Δ~60-62 as an estimator of Δ60-62. Still, substituting (30) and (31) into (28) gives
(32) E~(1)62 = 1,994,198 + B59 + Δ~60-62 = (1,994,198 + 30,646 + 17,456) = 2,042,300
and substituting (32) into (29) gives the first estimate of the fraction of persons with income who did not file,
(33) N~(1)62 = 1 - 1,733,743/2,042,300 = 1 - .8489 = .1511.
Another possibility is to estimate Δ60-62 by the following model: Let
P(t, t+x) = the change in population from time t to time t + x, where x = 1, 2, 3, ... n, and
y = a real number greater than zero,
and note that
(34) Δ60-62 = y P60-62.
The problem of estimating Δ60-62 then becomes a problem of estimating y, because
(35) P60-62 = 67,223 [1;p.6].
In order to do this, let us note that in addition to (34),
(36) y = Δ60-62/67,223.
If we assume that
(37) Δ60-62/P60-62 = Δ62-63/P62-63,
then
(38) y = Δ62-63/P62-63 = 66,346/(4,066,000 [2;p.6] - 4,019,000 [1;p.6]) = 66,346/45,000 = 1.4743.
Therefore, substituting (35) and (38) into (34),
(39) Δ~(2)60-62 = (1.4743)(67,223) = 99,106.
Substituting (30) and (39) into (28) gives
(40) E~(2)62 = (1,994,198 + 30,646 + 99,106) = 2,123,950
and substituting (40) into (29) gives the second estimate of the fraction of persons with income who did not file,
(41) N~(2)62 = 1 - 1,733,743/2,123,950 = 1 - .8163 = .1837.
A third way of estimating Δ60-62, the change in the number of persons with income from 1960 to 1962, is to recognize that it reflects a period which is twice as long as that of Δ62-63. Therefore:
(42) Δ~(3)60-62 = (2)(Δ62-63) = (2)(66,346) = 132,692.
Substituting (30) and (42) into (28) gives
(43) E~(3)62 = (1,994,198 + 30,646 + 132,692) = 2,157,536
and substituting (43) into (29) gives
(44) N~(3)62 = 1 - 1,733,743/2,157,536 = 1 - .8036 = .1964.
In summary, Wisconsin collected taxes in 1959 from an income base of about the same magnitude as that of the Federal Internal Revenue Service. Wisconsin's base is somewhat larger, but when definitional differences are removed, Wisconsin's base becomes slightly smaller than that of the Federal income tax. These two tax bases fall short, by about 4 1/2% to 6%, of the total income of Wisconsin when income is measured by the total reported by the census sample adjusted to the Wisconsin income tax definition. Of those persons in Wisconsin who had income in 1962, about 80% to 85% filed Wisconsin income tax forms. While the median estimate of the three calculated is 82%, the estimating techniques make this estimate only a little more reliable than the other two.

Bibliography
1. U. S. Bureau of the Census, Current Population Reports: Estimates of the Population of States and Selected Outlying Areas, July 1, 1962. (Series P-25, No. 272), Washington, D. C., September 20, 1963.
2. ________, Current Population Reports: Estimates of the Population of States by Age: July 1, 1963. (Series P-25, No. 294), Washington, D. C., November 5, 1964.
3. ________, Statistical Abstract of the United States 1960. (Eighty-second edition) Washington, D. C.: U.S.G.P.O., 1960.
4. U. S. Census of Population 1960, General Social and Economic Characteristics, Wisconsin. Final Report PC(1)-51C. Washington, D. C.: U.S.G.P.O., 1962.
5. U. S. Internal Revenue Service, Statistics of Income: Individual Income Tax Returns, 1959. Washington, D. C.: U.S.G.P.O., 1961.
6. ________, Statistics of Income: Individual Income Returns, 1960. Washington, D. C.: U.S.G.P.O., 1962.
7. ________, Statistics of Income: Individual Income Tax Returns, 1962, (Preliminary Report) Washington, D. C.: U.S.G.P.O., 1963.
8. ________, Statistics of Income: Sales of Capital Assets Reported on Individual Income Tax Returns, 1959. Washington, D. C.: U.S.G.P.O., 1961.
9. Wisconsin Department of Taxation, Annual Report, 1960, "Net Taxable Income Assessed, July 1, 1959 through June 30, 1960, by County," unpublished.
10. Wisconsin Department of Taxation, Division of Research, "Statistics of Income - Wisconsin Individual Income Tax Returns 1962 - State Totals," unpublished.
11. ________, Division of Research, "Statistics of Income - Wisconsin Individual Income Tax Returns, 1962 - State Total," unpublished.
12. Wisconsin Taxpayers' Alliance, Taxes, 1960, Madison, Wisconsin: Wisconsin Taxpayers' Alliance, December, 1959.
13. ________, Taxes, 1963, Madison, Wisconsin: Wisconsin Taxpayers' Alliance, December, 1962.
14. ________, Taxes, 1964, Madison, Wisconsin: Wisconsin Taxpayers' Alliance, December, 1963.

http://www.ssc.wisc.edu/wais/WAIS645035.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645035.txt

Harold Groves, Averaging Monograph (Averaging Studies). WAIS paper 656-005, July 20, 1965.

Harold Groves
WAIS 656-005
Revised June 28, 1967
Confidential

CHAPTER I
GENERAL CONSIDERATIONS

The Need for Averaging. Ability to pay or taxpaying capacity is a difficult concept at best. Leaving aside all criteria except income, the consensus would probably accept the proposition that two taxpayers have equal ability when their receipts are equal and they have equal family obligations. The analysis to establish this, if any is needed, may simply note that the two are equal because they can maintain the same scale of living.
Or it may resort to psychology contending that (except for differences in temperament which must be ignored) equal taxes from these two parties means equivalent sacrifices. Some may prefer to gauge the situation in terms of social consequences (social sacrifice) and again it will be contended that taxing the two alike will incur equal costs or losses. Whichever defense is taken, the incomes that are compared must have a time dimension. No one would be satisfied with comparisons that used a day or a week for the gauge. This, so to speak, is not a representative sample of the individual's experience. The week for one may be vacation without pay while that for the other may be a regular work week; one's week may include a pay-day and the other's an accrual of unrealized income looking toward pay day at the end of the month. For most people, using a year's experience will be sufficient to iron out these small differences in the time-patterns of income flow. But there will also be a considerable number whose incomes fluctuate from year to year. Over a number of years incomes may add up to the same totals as those of their neighbors but in any one year they are above or below the steady incomes. A taxpayer with fluctuating income may be able to live as well as his neighbor; he may simply draw on reserves built up in good years to support his consumption in the poor years. The case for averaging arises out of the fact that with a progressive scale applied annually the taxpayer with fluctuating income will usually pay more taxes than his neighbor. The mechanics of this may be seen in a simple diagram (below):

[Diagram 1: brackets of income, by year (years 1 and 2), for taxpayer A (steady) and taxpayer B (fluctuating), with the average income marked.]

It shows that B with fluctuating income pays at the fourth bracket rate on part of his income and that to establish parity on this part of his income, he should pay at the third bracket rate instead of the fourth.
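The mechanics described for Diagram 1 can be checked with a short calculation. The sketch below is illustrative only: the bracket width ($2,000), the rates (20, 23, 26 and 29 percent) and the two income streams are hypothetical values chosen for the example, not figures taken from the diagram.

```python
# Illustration of why a progressive scale applied annually taxes a
# fluctuating income more heavily than a steady one of the same total.
# Bracket width, rates and incomes are hypothetical.

def tax(income, width=2_000, rates=(20, 23, 26, 29)):
    """Annual tax: each successive `width` dollars is taxed at the next
    rate (whole percents); income above the last threshold is taxed at
    the top rate."""
    total = 0
    for i, rate in enumerate(rates):
        floor = i * width
        if income <= floor:
            break
        is_top = i == len(rates) - 1
        ceiling = income if is_top else min(income, floor + width)
        total += (ceiling - floor) * rate // 100
    return total


steady = [6_000, 6_000]        # taxpayer A: three brackets filled each year
fluctuating = [4_000, 8_000]   # taxpayer B: same two-year total

tax_a = sum(tax(y) for y in steady)
tax_b = sum(tax(y) for y in fluctuating)
# Tax B would owe if each year were assessed on his average income.
tax_b_averaged = len(fluctuating) * tax(sum(fluctuating) // len(fluctuating))

print(tax_a, tax_b, tax_b - tax_a, tax_b_averaged)
```

With these assumed figures B pays $60 more than A over the two years: $2,000 of his income is taxed at the fourth-bracket rate in the good year instead of the third-bracket rate. Assessing B on his average income removes the difference.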
On any standard it would be difficult to argue that B is being treated fairly. The conclusion that follows is that an equitable tax system with progressive rates requires at least in some instances a longer standard of measurement than one year.* Moreover, income associated with risk no doubt fluctuates more than non-risk income. Thus to avoid a special penalty on risk-taking, some averaging is required. The case is even clearer where income realized in one year is the result of effort expended and expectations accrued in previous years. The simplest case is that of back-pay for services previously rendered. Here the punishment of progressive rates is not only a matter of fluctuation of income but in some sense failure of accounting practice to reflect the true facts. It was allowance for cases of this sort which crept into the statutes before 1964 and which are now replaced by a general system of averaging. The problem of equitable treatment for fluctuating income is essentially one of progressive taxation. If all income were taxed proportionately, the two taxpayers in Diagram 1 would pay equal taxes without any averaging. And yet the problem is also essentially one of horizontal equity. We cannot measure differences for progressive taxation properly until we recognize the situations in which no real differences exist. We may also note in passing that the problem of averaging is not only related to progression, but also the degree of progression and the widths of brackets within which no progression occurs. We shall forego discussion of remedies which might seek to alter our practices in these matters on the ground that this is not our subject. We note, however, that the British system which covers a broad span of income under a single rate, offers much less need for averaging than that of the United States. The Case Against Averaging. While the case for general averaging seems compelling to many students of taxation, there are also grounds of opposition. 
Particularly troublesome is the factor of leisure, especially where the taxpayer selects it in preference to a role in the labor force. --------------- *The overpayment may not be large; if the bracket width is $2,000, and the step-up in rates is 3 percent, and all brackets are filled, it will in this case be $60 ($2,000 x 3 percent). At the present (1965) scale for single taxpayers, this sum is about 9 percent of the tax liability at the $4,000-level of income; it is only 1 percent of such liability at the $20,000-level of income. But if graduation itself is worth having at these levels of income it would appear worth correct application, at least if precision can be had by a simple procedure without undue compliance and administrative cost. If there are only a few cases of substantial fluctuation and injustice so much the better; perhaps they can be corrected with minimal administrative cost and loss of revenue. Two illustrations will suffice. The first is the case of where married women move into and out of the labor force on a close balance between the benefits of leisure and those of extra income. The averaging of zero or negative incomes and let us say $5,000 in alternate years procures a drastic reduction of tax partly because effective progression is very high in the starting brackets. It is at least doubtful that this is the sort of fluctuation that averaging would be inaugurated to alleviate. The second is the case of the person who voluntarily retires at the age of 55. He may be compared with the person who works twenty years longer before retiring. Let us assume that the life-time earnings of the two are the same. Under lifetime averaging they would pay the same tax at least in terms of present value of the payments (discounting for time of contribution). Yet the person retiring early received better pay for his efforts and 20 years of extra leisure. Can we say that over the long run the two have equal capacities to pay? 
One can argue that the rate of pay if not leisure is entitled to some weight in assessing ability to pay; that intensity as well as amount is a relevant aspect of income flow. Of course we cannot adjust the income tax to take account of the non-monetary aspect of jobs. But in considering reform on other grounds we need not ignore this aspect. The problem is further complicated by the fact that intermittent employment and early retirement may be involuntary. Under these circumstances leisure may or may not have a positive value. As we shall see later there is no way of tracing individual income through a change in marital status which is clearly rational and right. A bachelor with an income of $50,000 hardly incurs a reduction in his income by half simply because he marries. Similarly a person with a $50,000 income all subject to tax in this country hardly incurs a drop in his income to zero simply because of a transfer of residence to some other country. One might conclude from these examples that averaging could create as much mischief as it relieves and all at a cost of a vast amount of paper work. At least during certain periods such as all-out war what a man earns currently is deemed by many as more significant than his long-time earnings. Taxation, say these skeptical critics, is a business of rough justice, and further so-called refinements in an over complicated law had best be confined to articles in academic journals. Such limited policing resources as we now have had better be devoted to ironing out inequities due to uneven administration. We do not deem these objections conclusive but they are worth pondering. Several of them deal with specific situations that are not beyond the competence of the law to isolate and manage with tolerable satisfaction. Against the above case are too many claims that are patently valid. 
At least if they can be relieved by a system whose price in terms of complication, compliance and administration is moderate, we should accept it. Periods for Averaging. If a period longer than a year is to be taken into account in reckoning ability to pay, how long should the period be: two years, a lifetime, or some period in between these extremes? Clearly the precision with which the ultimate income flow is measured will increase with the length of time taken into account. Any period short of a lifetime is bound to be somewhat arbitrary and to miss significant variations in income patterns. But compulsory lifetime averaging for everybody would be a formidable undertaking, and price changes, rate changes, choice of leisure as against augmented earnings, and in some cases residence abroad make lifetime comparisons more difficult and less precise than might appear. Moreover, some critics regard the short horizon as mainly significant for ability to pay. It can be argued that most people adjust their habits and their outlooks to intermediate periods, say of three, five or at most ten years' duration. If A and B have equal incomes over this span of time, their welfare levels and their sacrifices to pay equal taxes can be regarded as equal. This is to argue that we can disregard the facts that one of these taxpayers may be headed for adversity in the distant future while the other may experience unforeseen good fortune. And we may also ignore such long-past vicissitudes that have left no mark on present incomes. Lifetime averaging itself is not entirely free from arbitrary features; where one taxpayer dies at 60 years of age and another at 80, it is not self-evident that one should enjoy an averaging period of 60 years and the other a period of 40 years. But the short-horizon approach to averaging is not very persuasive either.
Probably different people have different horizons and the same people have different horizons for different consumption, investment and borrowing decisions. When a young couple builds a house it presumably looks forward at least twenty years and when it buys a suit of clothes it looks forward perhaps two years. A well-situated business executive could comfortably sell short for ten years but an unskilled worker might well hesitate to pledge a week's wages. Though the income of A and B may now be the same, the fact that one earned more in the past is likely to have affected their relative savings accounts. The difference between transitory and permanent income is not so much a matter of time as of permanency of expectations. One could make a case for the proposition that nothing which ever happens to the taxpayer or at least which he can anticipate is strictly irrelevant to his present, not to speak of his ultimate, welfare. Welfare is a very tricky and fuzzy concept. It is surely not irrelevant to the present welfare of the taxpayer that he is now accruing capital gains which he may not choose to realize for many years. And so on. Proponents of a short averaging period do better perhaps to rely on the practical advantages associated with this choice. Diverse Forms of Fluctuation and Ad Hoc Averaging. Some empirical work has been (is being) done to inform us as to the nature and extent of fluctuations in income. Some of the results of our own efforts in this area are presented in Chapter II. But ordinary observation tells us that fluctuations (at least gentle ones) are a diverse and pervasive phenomenon. We have mentioned two cases or types of fluctuation: (1) the realization of present income accrued over a longer period than one year; and (2) the consequence of good and bad years in the course of business. There are many other situations which will occur to anyone who attempts to make a list. First thought turns to windfalls as where one wins a prize.
Slightly different is the case of inheritance; here one may jump from a lower to a higher plateau of income. A special case is the rise and fall of income as one enters or leaves the labor force either at his own option or because of unemployment. The in-and-out employment of married women is a conspicuous sub-class in this category. The opposite of windfalls are the cases of personal misfortune -- the loss of a job, illness, disability and the like. Most people most of the time experience some fluctuation (or trend) in income due to the general factor of economic growth and/or creeping inflation. Since per-capita income has been increasing several percentage points per year we can assume that most of these fluctuations most of the time are upward. However these upward trends may be interrupted or distorted by the business cycle. Finally there are age cycles of income: most people no doubt experience rising incomes as they pass from youth to middle age; and for some, income continues to rise up to and perhaps past retirement age. And so on. We could attempt to rate the species of fluctuation according to need and suitability for averaging. We shall confine ourselves to the sort of analysis involved. For instance take the case of a sudden jump of income from a lower to a higher level at which it remains relatively constant. This type of situation is quite commonly experienced by one who has not been in the labor force (at least fully) and for whom the turning point is his first full-time job. It also includes the person who receives a large gift or inheritance (while the receipt itself is not classed as income, the yield from new investments made possible by the receipt is so classed). His case for relief from progressive taxation annually applied after the leap upward is surely not very compelling. His revised expectations are permanent and all his adjustments to changes in income are agreeable.
The present federal averaging provision seeks to exclude such income from averaging. But these grounds could be advanced for a considerable number of cases of this sort which are not so easily defined. Or take the somewhat opposite case of a one-shot windfall gain such as a prize, an athlete's bonus, or a one-time capital gain. (The capital gain is distinguishable on the score that it not only involves a one-shot gain; in its case there may also have been a long-time accrual.) A case could be made for the view that windfalls of this sort should be singled out for especially severe treatment because they are unexpected, provide scant basis for a permanent improvement in consumption, and could be sacrificed in large part without heavy social or individual loss. On the other hand most of these gains have nothing to do with one year's accounting except that they were realized during this period. We exclude inheritances and gifts from the income tax partly on the ground that they would create "bunchiness" in the income-tax structure. Perhaps a society should be in a position to offer prizes without undue punishment from the income tax. Negative windfalls -- the unexpected mishaps of life -- would call perhaps for a somewhat different analysis. Here the overpayment, if it may be attributed to a bad year, comes at a high price in terms of personal sacrifice. In his bad year the taxpayer will probably be having a "hard time" maintaining his consumption at a level somewhere near to that to which he is accustomed. And so on. If there were only a few types and sources of income fluctuation or of fluctuations which result in an unfair incidence of a progressive tax, a case could be made for ad hoc averaging. That is we might single out these areas and make them the special beneficiaries of income-spreading. Such indeed were the provisions in our federal law prior to 1964.
The Canadians and Australians have compromised with general averaging by confining its use largely to extractive industries such as farming and fishing and mining. But as we have suggested above the phenomenon of fluctuation is far too diverse and pervasive to cover the problem adequately with a special listing of cases. Steger estimated* from some empirical work that 20 to 30 percent of taxpayers incur fluctuations which are significant for their tax burdens and our own studies confirm his conclusion. However, the anatomy of fluctuation is not entirely irrelevant in the choice of a system of averaging. It has some bearing on the choice of an averaging period; the choice between optional and compulsory averaging; and the possible exclusion of small gains from eligibility. The use of a short period along with some exclusion could largely eliminate consideration for those small fluctuations which most people experience most of the time. (A taxpayer starting with a $10,000 income and experiencing a five percent upward trend in income each year for five years would, if we ignore compounding, have a low of $10,000 and a high of $12,500 in his series.) Along with an optional feature this could eliminate a considerable part of the very large volume of paper work that would attend the general use of averaging. On the other hand it must be conceded that it would not compensate adequately for the considerable differences in patterns of life-time income flows -- the case of the athlete for instance, and of good and bad fortune during retirement.
--------------
*Wilbur A. Steger, "Averaging Income for Income Tax Purposes," Tax Revision Compendium, Committee on Ways and Means, 1959, Vol. I, pp. 590-591.

The Marital Unit: A great embarrassment for all averaging systems is a change in marital status. How do we splice single and joint returns where they both occur in an averaging series?
Several situations will readily suggest themselves: (1) A was single during the first years of the series and was married to B during the latter years. A has a $50,000 job throughout the period while B had no independent income either before or after the marriage. (2) Reverse the time-sequence of the above and assume that B dies during the period leaving A a widower. (3) Assume the conditions of (1) above except that B had a $10,000 job which she relinquished at the time of marriage. (4) Assume she retains the $10,000 job. (5) Assume that A marries and is divorced and remarries a different spouse during the period. (6) Assume that A and B are consistently married and make joint returns throughout the period either on a single income or some combination of two supports for the family budget. Since 1948 the federal tax law has treated a married couple's income for purposes of graduation as though each spouse had an equal share in the joint income; that is, graduation is applied to half the income of the marital unit and the tax is then multiplied by two. This is the phenomenon generally described as "splitting". Logically it should be applied to averaging, in which case we would splice the incomes of single people to half their joint income after marriage. But obviously this would produce some bizarre results. In case (1) above it would indicate that A has suffered a drastic set-back in his income although psychologically he may and probably does feel that his economic circumstances have deteriorated little if any. An alternative procedure which seems more realistic to these authors would look to the legal title to income after marriage and splice the records as though the parties had remained single. But this does not appear very satisfactory either. In the third case mentioned above it would show B suffering a serious loss of income when as a matter of fact she has no reason to complain about her fortune.
Anyway it contradicts the spirit of the 1948 legislation and involves the artificial proration of deductions. Or we could use both approaches, allowing relief only if a change in income is apparent both on the assumption of equal sharing and on that of independent status. This has much to commend it and appears to give the "right answer" at least in cases like (4) above. It will call for a maze of calculations especially where there are multiple changes in marital status as in (5) above. The case of constant marital status, (6) above, appears to present no problems, but even here a choice of procedure is involved. Perhaps joint returns hide the fact that during the period, A's income has dropped one-half and the family income has remained stable only because B took a job to make good A's losses. Should we disregard these individual components of the joint return? Again we can say with confidence that the changes in income that are due mainly to a change in marital status are not the type that averaging should be inaugurated to alleviate. They can be avoided only by a return to ad hoc averaging that limits the privilege to certain types of income such as entrepreneurial gains, back pay, capital gains and the like. If we choose to average all total net taxable income we can minimize the consequences where marital status changes during the averaging period. Much of the problem is due to the ambiguities in applying graduated taxes to the marital unit where in some sense and some degree sharing does reduce ability to pay and in some sense and some degree it does not. Administrative Considerations and Other Reasons for Disregarding Small Gains from Averaging. Any generous system of averaging is bound to add substantially to the cost of income tax administration and compliance. It involves the keeping and checking of records for the several years of the series. 
Cumulative averaging purports to avoid some of this by requiring the taxpayer to keep the record current as he moves along; he calculates a cumulative total each year and copies last year's figure on this year's return. But cumulative averaging as will later be explained has administrative difficulties of its own. Administrative and compliance costs are important facts that cannot be ignored in income tax policy. If the taxpayer saves $50 in taxes from an averaging system and it costs him an extra $50 in the fee he pays his lawyer to complete his return, the taxpayer has gained nothing and the lawyer presumably could have turned his time to greater advantage. Thus, it may be necessary in devising legislation for averaging to compromise between the benefits of greater equity and the costs of administration and compliance. The balance of these considerations may not be easily recognized at least without experimentation but the idea indicates some passing over of minor inequities to keep traffic within bounds. The simpler the averaging scheme and the more readily it is understood by the taxpayer or his agent the less compromising of this sort may be necessary. There are other more important reasons for disregarding small gains from averaging in some systems. Where a moving average is used and the taxpayer is allowed to count one year's experience more than once in successive series he may enjoy a degree of multiple relief from one fluctuation.* Moreover, averaging may make no provision for changes in the rate scale (the new federal law does not). Where rates as well as incomes fluctuate there is no sure support for the proposition that a taxpayer with fluctuating income always suffers inequities from progression. 
Generally where a rate reduction has occurred, a simple averaging system that makes no allowance for rate changes will favor the taxpayer with a positive fluctuation.**
---------------
*Thus if the taxpayer presents four $10,000 incomes and a zero income in a five-year series he gets full relief from his fluctuation in his fifth year and if he may count the zero in subsequent series he will pay less taxes over the years than a taxpayer with steady income. Under certain circumstances and if there is no option not to average, he may also pay more than a person with steady income (further discussed later).
**The postwar years have been characterized by long periods of stable rates, but this may not continue; particularly if the administration were given power to manipulate tax rates in the interest of economic stability, rate changes might occur frequently and become a highly significant factor in averaging. A calculation which we need not present in detail here indicates that a single taxpayer at the $10,000-level of income incurring a 50 percent positive fluctuation in a future year when 1965 rates were reduced 10 percent, would under 5-year averaging and due attention to the rate fluctuation pay approximately the same tax as he would without averaging. His gain from the lucky timing of his big income in the one case compensates for the benefit of averaging in the other. If, however, no attention is paid to the rate change and he is allowed to average as though present reduced rates had prevailed over the 5-year period, he gets a double benefit and will be very substantially undertaxed. In the case of a negative fluctuation in the year of rate reduction, failure to recognize rate changes means that relief afforded by averaging will be inadequate.

Perhaps all of these considerations influenced the federal law-makers to constrict the averaging privilege quite severely. Whether the constrictions of the new law are appropriate for their purpose is another matter.
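The comparison drawn in the second footnote can be made concrete with a small sketch. The bracket schedule below is a hypothetical one chosen for round numbers (it is not the statutory scale), and the names `tax`, `rates_recognized` and `rates_ignored` are ours. The series is four $10,000 years followed by a $15,000 year in which every rate is cut 10 percent. The sketch contrasts (a) no averaging, (b) averaging that applies each year's own rates to average income, and (c) averaging that treats the reduced rates as though they had prevailed throughout.

```python
# Hypothetical graduated schedule: (lower bound of bracket, marginal rate).
# Illustrative assumption only; not the statutory rate scale.
BRACKETS = [(0, 0.14), (2_000, 0.17), (4_000, 0.20), (6_000, 0.23),
            (8_000, 0.26), (10_000, 0.29), (12_000, 0.32), (14_000, 0.35),
            (16_000, 0.38)]


def tax(income, scale=1.0):
    """Tax on income under BRACKETS, every marginal rate multiplied by scale."""
    total = 0.0
    for i, (lower, rate) in enumerate(BRACKETS):
        upper = BRACKETS[i + 1][0] if i + 1 < len(BRACKETS) else float("inf")
        if income > lower:
            total += (min(income, upper) - lower) * rate * scale
    return total


incomes = [10_000, 10_000, 10_000, 10_000, 15_000]
scales = [1.0, 1.0, 1.0, 1.0, 0.9]  # rates cut 10 percent in the final year
average = sum(incomes) / len(incomes)

# (a) No averaging: each year's income taxed at that year's rates.
no_averaging = sum(tax(y, s) for y, s in zip(incomes, scales))

# (b) Averaging with due attention to the rate change:
#     average income taxed at each year's own rates.
rates_recognized = sum(tax(average, s) for s in scales)

# (c) Averaging as though the reduced rates had prevailed throughout:
#     the final-year liability is figured entirely on the new scale,
#     although the first four years were actually paid on the old one.
paid_earlier = sum(tax(y, s) for y, s in zip(incomes[:4], scales[:4]))
final_year = 5 * tax(average, 0.9) - sum(tax(y, 0.9) for y in incomes[:4])
rates_ignored = paid_earlier + final_year

print(round(no_averaging), round(rates_recognized), round(rates_ignored))
```

Under these assumed figures (a) and (b) come out within a few dollars of each other, while (c) is noticeably lower. That is the pattern the footnote describes, though the magnitudes depend entirely on the schedule assumed.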
Net Income versus Net Taxable Income. Should we select net income (adjusted gross on the returns) or net taxable income (net of deductions and exemptions) as the base for averaging? The presumption favors the latter since it is the basis for measuring relative ability to pay. Moreover, for all we know a priori, deductions and exemptions may fluctuate quite as much as positive net income. A taxpayer may donate a third of his income in one year and none at all in the next. He may be single in one year and married to a widow with four children in the next. It is a well-known fact that medical expenses can and do fluctuate widely. Some of these items may be more manipulable as to timing than ordinary income. But what of it? Averaging simply makes manipulation unnecessary. Exclusions. Some comments have been made on this subject and more will be added in Chapter V on the federal law. Here we may confine ourselves to the major issue: whether capital gains should be excluded. In our view capital gains are an example par excellence of income for which averaging is especially appropriate: they are characterized by much irregularity and accrual over long periods. The reason assigned for their exclusion in the recent federal law namely that gains are favored by other concessions is unpersuasive. Rate concession to capital gains affords no distinction between the taxpayer who receives them regularly and the taxpayer who receives one once in a lifetime. Nor do they afford distinction between the taxpayer who is in a position to manipulate realization -- "do-it-yourself" averaging -- and the one who is not. A better program for equity would allow averaging for capital gains in exchange for the elimination or mitigation of other concessions. As will be observed later the inclusion of capital gains in the averaging scheme (directly) would do much to simplify the present law. 
It might be argued that the inclusion of capital gains in averaging would tend to encourage the deferral of realization. But the answer is indeterminate: averaging might encourage deferral because if and when the latter were abandoned and bunched income realized, the penalty on bunching would be removed. On the other hand averaging also mitigates the penalty on early realization. The main effect would appear to be the relief of penalty taxes on irregular receipts of capital gains whenever they are realized. Built-in averaging by manipulation would become unnecessary. Negative Income and Fluctuation. There are several kinds of negative income and fluctuation that lead to problems in averaging. First, there is negative adjusted gross income usually associated with business losses. We now provide independent averaging for such income with loss carry-overs and carry-backs. The system provides more generous benefits than substitution of general averaging is likely to afford. So far as the authors are aware, the present practice works well and need not be disturbed by other innovations in averaging. Second, there is the negative net taxable income which arises when adjusted gross income is positive but deductions and exemptions exceed it. Allowing these negatives to count in averaging is described as a carry-over of unused exemptions and deductions. Such allowance seems amply warranted in principle and it is necessary to achieve parity between some fluctuating as against some steady income. Moreover, it is undoubtedly of great importance to farmers and small businessmen whose experience in any one year is quite likely to fall midway between a loss and a taxable amount. Furthermore, this is the only feature of averaging that is sure to help the poor. But we must recognize at least that full allowance for negative taxable income is a potent relief involving many returns, much administrative difficulty, and a considerable threat to the revenue. 
A genuine carry-over gives the taxpayer double mileage, so to speak, once as an additional deduction, and once as a factor in rate determination.* The feature would probably necessitate more stringent filing requirements and the auditing and retention of many more returns of taxpayers with negligible income. If a self-support test were used as a basis of eligibility, as in the present federal law, the feature would place an additional strain on its administration. Suppose the taxpayer has no positive income and borrows the funds to cover his annual living expenses. Currently no attention need be given his return or his failure to file a return. All that must be established is that he has no taxable income in which case he may count the year's experience in the averaging series as zero. Were he to be allowed a negative figure in the series, attention to the detail of his return would become important. The case for carry-overs is considerably weakened by the fact that transfers, with or without a means test, are generally not subject to tax. Thus a person who has no adjusted gross income may nevertheless have uncounted net income. It seems wise to conclude that if negative net taxable income is to be recognized in averaging, it had better be with qualification. One such
--------------
*It would be possible to separate the two. Thus if the taxpayer has a series of four $2,000 incomes followed by a negative $2,000 income, his algebraic average income (in thousands) would be (2 + 2 + 2 + 2 - 2)/5 = 1.2; his arithmetic average, counting the negative year as zero, would be (2 + 2 + 2 + 2 + 0)/5 = 1.6. Without averaging, his tax would be that on $2,000 multiplied by 4; with a complete carry-over it would be that on $1,200 multiplied by five; with a partial carry-over, the tax would be that on $1,200 multiplied by five plus a third of this intermediate figure. The third calculation allows a rate relief but not a base relief.

In the case of inflation we have not only a rise in many money incomes that involves no advance in real incomes; we also have advances that are probably permanent; and differential gains. If the taxpayer has been receiving a steady income of $10,000 and now because of inflation he jumps to a $20,000 level and remains there, few would nominate him for tax relief. Of course if everybody had the same experience we could manage the problem by changes in the rate scale. But as in the case of war there will be differential gains to consider. A Note on the Mechanics of Progression. Our system of graduation does not apply a consistent incremental tax rate to each incremental dollar of income. On the contrary it provides for a step-up in rates at bracket intervals. This feature means that the penalty of graduation (benefits of averaging) will depend considerably upon the position of the taxpayer's marginal income with respect to bracket boundaries. Fluctuation entirely within bracket limits can involve no penalty from progressive rates (benefit from averaging). This accounts for the fact that at the top of the scale where brackets are broad, large absolute fluctuations can occur without penalty and yet some modest fluctuations (over bracket boundary lines) nevertheless do involve progression. Thus a single taxpayer at the $50,000 level of income can incur a $10,000 fluctuation without gain from averaging but the same taxpayer at the $58,000 level would profit considerably from the privilege.
Since under our system of splitting income for spouses, brackets for married taxpayers are in effect wider than those for single taxpayers, it follows that the latter have more to gain from averaging than the former. At the $12,000 level of income averaging out a $12,000 increment (over 5 years) involves for the single taxpayer (1966 rates) a potential saving of $1522; for the married taxpayer, the comparable figure is $400. To those who hold the view that the concessions of splitting unduly penalize the single taxpayer, this aspect of averaging is salutary. Distributive Effects. The distributive effects of averaging may be viewed from either of two angles: (1) the importance to a taxpayer of averaging adjustments at different levels of income; and (2) the effect of an averaging privilege on the distribution of income by classes. As to the first of these considerations, two variables are important: (1) the percentage decrease or increase of income that it takes to fill a bracket; and (2) the percentage of income (or of taxes) that any given tax adjustment constitutes. Thus at the $2,000-income level for single taxpayers it takes a 100-percent positive fluctuation to fill a bracket whereas at the $20,000-level it takes only a 10-percent fluctuation to do so. On the other hand, a $100-adjustment constitutes 5 percent of a $2,000-income and only 1/2 of 1 percent of a $20,000-income. A $100-adjustment is nearly a third of the tax liability of the small taxpayer and less than 2 percent of the taxes due from the affluent payer. Thus for an equivalent adjustment in absolute size, averaging is far more important to the small than to the large taxpayer; but equivalent percentage fluctuations are less likely to afford adjustment. A third variable is the step-up in rates between brackets; generally the 3 percent step-up is consistent; but there are two 4 percent increments in the middle brackets and 2 and 1 percent mark-ups at the top and bottom of the scale. 
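The dependence of the averaging gain on bracket boundaries, and the effect of splitting, can be illustrated with a short sketch. It assumes a hypothetical schedule of uniform $2,000 brackets with a 3-point step-up (not the 1966 scale, so the dollar results will not match the figures quoted above), models splitting as twice the tax on half the income, and measures the gain from simple five-year averaging as the sum of the annual taxes less five times the tax on average income. The function names are ours.

```python
# Hypothetical schedule: (lower bound of bracket, marginal rate).
# Uniform $2,000 brackets with a 3-point step-up; an assumption for
# illustration, not the statutory 1966 scale.
BRACKETS = [(0, 0.14), (2_000, 0.17), (4_000, 0.20), (6_000, 0.23),
            (8_000, 0.26), (10_000, 0.29), (12_000, 0.32), (14_000, 0.35),
            (16_000, 0.38), (18_000, 0.41), (20_000, 0.44), (22_000, 0.47),
            (24_000, 0.50)]


def tax(income):
    """Tax of a single taxpayer under the hypothetical schedule."""
    total = 0.0
    for i, (lower, rate) in enumerate(BRACKETS):
        upper = BRACKETS[i + 1][0] if i + 1 < len(BRACKETS) else float("inf")
        if income > lower:
            total += (min(income, upper) - lower) * rate
    return total


def joint_tax(income):
    """Splitting: graduate on half the joint income, then double the tax."""
    return 2 * tax(income / 2)


def averaging_gain(incomes, tax_fn):
    """Annual taxes actually due, less n times the tax on average income."""
    n = len(incomes)
    return sum(tax_fn(y) for y in incomes) - n * tax_fn(sum(incomes) / n)


# A fluctuation that stays inside the 12,000-14,000 bracket: no gain.
within_bracket = averaging_gain([12_000] * 4 + [14_000], tax)

# A $12,000 increment at the $12,000 level, single versus married.
single_gain = averaging_gain([12_000] * 4 + [24_000], tax)
married_gain = averaging_gain([12_000] * 4 + [24_000], joint_tax)

print(round(within_bracket), round(single_gain), round(married_gain))
```

With this assumed schedule the within-bracket fluctuation yields no gain at all, and the single taxpayer gains more than the married couple from averaging the same increment, because splitting places the couple's halved incomes in a lower and narrower span of the rate scale. A different schedule could change the size of the difference.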
It can be said with considerable confidence that averaging is not a very potent device for helping the poor. Obviously it will not help a family that never pays any income taxes. A family at the $4,000 income level has the following potential gains from five-year averaging: if its income doubles in one year it gets no relief at all since the entire fluctuation is within one bracket. If its income drops to zero in one year (unemployment) it could save $60 which is 1.9 percent of its average income and some 12 percent of the annual tax on average income. If negatives are counted and the drop is to - $1000, the carry-over would save the family $230 which is nearly half the taxes on its average annual income. Moreover the carry-over -- first cousin to a "negative income tax" -- is the one sure relief; it reduces taxes for the reason that it reduces the base of the tax. On the other hand at the $20,000-income level, doubling the income involves potential relief of $1360 which is 5.7 percent of average annual income and nearly 24 percent of the taxes on this figure. A negative fluctuation of 50 percent could save $240 which is 1.3 percent of average income and more than 6 percent of the annual tax on such income. The figures assume a married couple, simple averaging, a 5-year period, and 1966 rates. In the matter of the distribution by classes of income, another variable enters the picture and that is the number of people (percentage of population) within each income class that experience significant fluctuations. This will depend in considerable degree on the composition of income by source and the score of variability for each source. Some empirical evidence on these matters will be presented later in this monograph. However, it is known from previous studies that property income, entrepreneurial income, and wages and salaries are all of substantial importance at the lowest levels of income. 
Of these at least entrepreneurial income would be expected to indicate substantial fluctuations. Wages and salaries are usually stable except for unemployment and the intermittent entry of wives into the labor force. It is a well-known fact that the incidence of unemployment is highest among the relatively unskilled. In the lower middle class of incomes, wages and salaries are a highly predominant source but entrepreneurial income still provides an unstable element. As we proceed up the scale dividends and capital gains begin to play a major role. They are among the more volatile elements of income. However, where diversification of investments develops and capital gains can be manipulated, these sources can become relatively stable. While the effects of averaging on distribution are of interest, we do not regard them as relevant in the decision to allow or not to allow averaging. The progressive rate scale itself can be manipulated to achieve any desired distribution. Cyclical Effects. In general all averaging tends to be pro-cyclical in the recovery and boom phases of the business cycle. This is due to the fact that it creates a lag in the adjustment of effective tax rates to rising income. Thus with a series of incomes consisting of 10, 10, 10, 10, and 15 (in thousands) the effective rate on the $15,000-item might be reduced by averaging to the level appropriate for $11,000 and it might take five years to completely adjust to the new level. On the other hand, at the other end of the cycle some systems may relieve and others aggravate the recession. Any system which gives parity treatment to negative fluctuations and which in effect reopens past returns will result in reduced tax or a refund in bad years and this will bolster "built-in flexibility." Anticipating a more intensive review of averaging devices in Chapters 3 and 4 we may note that not all averaging systems do this. The 1964 federal law does not do it because it excludes negative fluctuations.
A moving average which casts its effect forward would not do it. Thus if the series were 10, 10, 10, 10 and 5 (in thousands), the taxpayer might be required to pay in the final year of the series on a $9,000-base at a rate appropriate for that base. Any system which requires more of the taxpayer in a bad year than would have been assessed on an annual basis is clearly perversely pro-cyclical. Again while the cyclical effects of averaging are of some interest we do not regard them as crucial in decisions as to averaging. Even if the revenue at stake were sufficient to make averaging a major factor in the cycle, there are tools of cyclical control, including the progressive rate scale itself, that should be adequate to cope with the cycle. It seems hardly necessary to insist on an unfair application of the progressive scale in order to temper the cycle. Summary. In this introductory chapter we presented the case for averaging: many people with fluctuating income pay more tax than people with steady income although the former clearly have no advantage in ability to pay. We also gave some attention to the case against averaging: many fluctuations are counter-balanced by preferred extra leisure; many situations such as changing marital status defy a precise calculation of income through long periods of time; anyway during some phases of national experience such as all-out war immediate receipts seem more significant for taxes than prolonged experience. General averaging involves adding to already high administrative and compliance costs. Despite the conceded validity of some of these criticisms, the present authors conclude that a general system of averaging is well worth the effort. The longer the period of averaging, the more precise is the adjustment of burden between fluctuating and steady income; any period short of a lifetime is bound to be somewhat arbitrary and to miss significant variations in the pattern of income flows.
But compulsory lifetime averaging would be a formidable undertaking; and price changes, rate changes, changes in marital status, choice of leisure as against taxation, and in some cases residence abroad, among other factors make lifetime comparisons more difficult and less precise than might at first appear. Moreover, some critics regard the short horizon as mainly significant for ability to pay. Fluctuations in income are sufficiently diverse and pervasive to render any ad hoc system of averaging or one that is confined to particular areas inadequate and unsatisfactory. However, the averaging system may well be tailored to exclude for the most part certain fluctuations that are quite widely experienced and fairly insignificant in amount. Changing marital status is a great embarrassment for all averaging systems and where family income must be spliced to the income of single persons before marriage, neither income splitting nor legal title to shares gives an entirely satisfactory answer. We accept the solution in the federal law which in effect calculates with each alternative and applies the one least unfavorable to the revenue. No adequate case has been made for the exclusion of specific sources from the averaging privilege and the goal should be universal coverage. Capital gains, in particular, seem especially suited for averaging. Parity treatment for negative fluctuations and for negative net taxable income and negative adjusted gross income is in accord with the algebraic principle on which the net income tax is founded and thus enjoys a strong presumption in its favor. The carry-over of unused exemptions and deductions (negative net taxable income) is the feature of averaging with most potential for the relief of poverty. However, this feature does involve substantial administrative and revenue costs and its adoption warrants careful consideration and probably some qualification. An eligibility requirement that would reduce the paper work of averaging seems desirable.
Its use to compensate for limitations in the averaging system such as failure to recognize rate changes and multiple counting (in some systems) impresses us as highly doubtful. Averaging will have distributional and cyclical effects and we present some analysis to indicate what these might be. However, we are of the opinion that, with counteracting remedies available, including the progressive rate scale itself, these considerations should carry little or no weight in decision-making about averaging.

CHAPTER III

THE MECHANICS AND CRITERIA OF AVERAGING SYSTEMS

There are three main types of averaging systems discussed in the literature. All but cumulative averaging have been applied in some country at some period. The applications have developed a variety of special features that will be discussed more fully in the next chapter. The first of the types is so-called simple averaging. It says to the taxpayer, you may take some number of years and recalculate the taxes that you paid on an annual basis, adjusting them to what you would have paid had your income been steady. Note that the adjustment comes at the end of the period. As prescribed by Henry Simons* and applied in Canadian law the years averaged must be consecutive ones and no one year may be counted in successive series. This avoids the multiple weighting of a given fluctuation that plagues the operation of moving averages. Negative fluctuations are automatically given treatment symmetrical to that of positive ones and they may result in a refund for the taxpayer. Moreover the system gives the "right answer" by definition but it is right for the limited perspective of the given period only. Where changes in the rate schedule have occurred during the averaging period, the calculation should apply the proper rates to average income as they have been applied to annual income.
Where there are rate changes, averaging need not always favor the taxpayer, but with this system the stake is so small that the privilege may be made optional. That is, for instance, the taxpayer may average a series of five consecutive years, drop a year, and then average another five-year series. To further limit the traffic involved, a limit on eligibility may be prescribed. Logically it should be related to processing costs and perhaps take the form of a dollar amount disallowed in the adjustment. Some have advocated a percentage of the tax on annual income as a suitable threshold.

-----------------
*Henry Simons, Personal Income Taxation, University of Chicago Press, Chicago, 1938, p. 154.

A second type of averaging is called cumulative. It uses as a period or series a span of years with a fixed initial point and expanding as it goes along. Each year's adjustment is complete, so that two taxpayers starting at the same point of time will always have paid the same tax if their aggregate incomes since the beginning add up to the same totals.* The series must have a terminal point and a point of initiation; logically it should include all of the years from one's majority until his death. Over identical periods the system does not differ in principle or result from a simple average. In some sense it is simple averaging over the life of the taxpayer. It cumulates taxes and income and adjusts perfectly to the taxes that would have been due had average income prevailed throughout the time to date. But there is a crucial difference and advantage in cumulative averaging. The difference is in the timing of the taxpayer's adjustment. It adjusts the taxes of the past years annually so that the taxpayer is always paid up, no more and no less than as though his income had been steady up to that point. Thus he gets relief from fluctuations annually and does not have to wait until the end of an averaging period.
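The timing of the cumulative adjustment can be illustrated with a short Python sketch. It is an illustration only and not part of the original study: the three-bracket schedule and the names `tax` and `cumulative_payments` are hypothetical, chosen simply to make the arithmetic easy to follow.

```python
def tax(income):
    """Hypothetical three-bracket schedule (an assumption, not from the text)."""
    brackets = [(10_000, 20), (15_000, 30), (float("inf"), 50)]  # (upper bound, rate in %)
    total, lower = 0.0, 0
    for upper, rate in brackets:
        if income <= lower:
            break
        total += (min(income, upper) - lower) * rate / 100
        lower = upper
    return total


def cumulative_payments(incomes):
    """Payment (or refund, if negative) due each year under cumulative averaging."""
    payments, paid, income_to_date = [], 0.0, 0.0
    for years, income in enumerate(incomes, start=1):
        income_to_date += income
        # Tax that would be due to date had income been steady at its average.
        due_to_date = years * tax(income_to_date / years)
        payments.append(due_to_date - paid)
        paid = due_to_date
    return payments


steady = cumulative_payments([20_000, 20_000])  # [6000.0, 6000.0]
late = cumulative_payments([0, 40_000])         # [0.0, 12000.0]
early = cumulative_payments([40_000, 0])        # [16000.0, -4000.0]
```

Under these assumptions all three taxpayers have paid 12,000 after two years, as the text requires of taxpayers with equal aggregate incomes, and the taxpayer whose income came early receives his refund in the second year rather than at the end of a fixed period.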
We should not leave the impression that cumulative averaging could not be applied over shorter periods than a lifetime. It would be possible to allow the taxpayer to cumulate at his option for any number of consecutive years, say up to ten. Any year in which the option was not exercised would start a new series. An exclusion to rule out minor adjustments is not incompatible with this scheme. The taxpayer could calculate his own average without the use of tax tables, simply averaging for more years as he goes along. The compliance load would not be formidable since the taxpayer could carry forward the cumulative totals from one year to the next. The application of changing rate scales would be no more difficult than the same problem in the case of simple averaging. However, while it is by no means always more advantageous for the taxpayer to average out a given fluctuation over short periods rather than over long periods, he gets his relief sooner under the former alternative and the system in practice might degenerate into two-year simple averaging. And the prospect of thousands of taxpayers averaging with different periods is not one to delight an administrator. The option could of course be confined to a definite period beginning at a signal from the taxpayer and compulsory for him thereafter to the end of the period. But this would involve distasteful speculation as to future incomes, and if the taxpayer were allowed to change his mind and select a different period retroactively we might as well combine this privilege with a simple average. Again this says nothing about administrative difficulties.

-------------------------
*The Vickrey proposal, discussed in the next chapter, does differentiate between early and late payments by means of an interest factor. But this is not essential to a system of cumulative averaging.
Probably cumulative averaging is in its optimum role as the vehicle for lifetime averaging: universal, compulsory, and calculating tax from tables provided for each set of cohorts by the government.

The third technique of averaging is described as a moving average. This system selects a limited time for averaging, allows the taxpayer to average each year, but with a constantly shifting series of years, each series adding the current year and dropping the most remote year of the previous series. Thus the taxpayer might use the incomes of 1962-1966 for one series and those of 1963-1967 for the next. The adjustment of tax may be effected either by averaging the base itself or by manipulating the rate on the current base.* The 1964 federal statute provides a form of moving average that ostensibly does not open past returns but accomplishes the same result by manipulating the rate on current income. The past four years are averaged, and the excess of the current year's income over that average is divided by five to calculate the tax on the increment, which is then multiplied by five and added to the tax on the four years' average income. With some awkwardness the system could be adapted to allow for negative fluctuations. But without reopening past returns it could never allow the taxpayer a refund. If his current taxable income is zero, no rate applied to this base gives a negative answer. Of course, as we have said, a simple average easily accommodates negatives, and we could adopt a simple average that moves, abandoning the qualification that no year is to be used in more than one series. However, if we were to use this last proposed alternative we would confront the main difficulty with moving averages, namely that they lead to erratic results by weighting a single fluctuation several times in successive series.

---------------
*Radically different consequences hinge on the technique of application but a full discussion of them is reserved for the next chapter.
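The federal-type computation described above (average the past four years, tax one-fifth of the current year's excess, multiply that tax by five) can be sketched as follows. This is a simplified reading of the description in the text only; it omits the statute's eligibility threshold and exclusions, and the three-bracket schedule and function names are hypothetical.

```python
def tax(income):
    """Hypothetical three-bracket schedule (an assumption, not from the text)."""
    brackets = [(10_000, 20), (15_000, 30), (float("inf"), 50)]  # (upper bound, rate in %)
    total, lower = 0.0, 0
    for upper, rate in brackets:
        if income <= lower:
            break
        total += (min(income, upper) - lower) * rate / 100
        lower = upper
    return total


def federal_type_tax(prior_four, current):
    """Current-year tax under the simplified rate-manipulation method."""
    base = sum(prior_four) / 4          # average of the past four years
    excess = current - base             # fluctuation above that average
    if excess <= 0:
        # No relief for negative fluctuations: past returns are not reopened.
        return tax(current)
    # Tax on one-fifth of the excess, stacked on the base, then multiplied by five.
    tax_on_increment = 5 * (tax(base + excess / 5) - tax(base))
    return tax(base) + tax_on_increment


averaged = federal_type_tax([10_000] * 4, 20_000)  # 5000.0
unaveraged = tax(20_000)                           # 6000.0
```

With four prior years at 10,000 and a current year of 20,000, the sketch taxes the excess at the rate applying just above the base rather than at the higher rates the full excess would reach, which is the sense in which the method "manipulates the rate on current income."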
A few simple calculations (presented below) indicate that the system can produce highly erratic results. Like the cumulative average, the moving average offers the advantage of annual availability, but the former involves no problems of double counting.

For the sake of completeness we may mention a fourth technique recommended in Canada and discussed more fully in the next chapter. It would allow the taxpayer to reduce his tax base in good years by depositing extra income (or any income) with the government and be taxed on it in a year of withdrawal at his discretion. This proposal involves saving as well as averaging and it has some kinship to an expenditure tax. As to averaging, if the taxpayer could chart his future accurately and realized his high incomes early in life, he could manipulate this system to accomplish much the same result as lifetime averaging. Both assumptions are at best only partly true to fact, and no doubt there are other difficulties. It is safe to conclude that the idea has some merit and is worth a good deal of further study.

Criteria for an Averaging System.

In Chapter I we arrived at certain conclusions regarding averaging which we may amplify here into a list of criteria or desirable characteristics. An averaging system should:

1. Use net taxable income as its base.
2. Grant parity treatment to negative fluctuations.
3. Apply to all income--no exclusions.
4. Apply to a consistent family unit; where the unit changes, family income should be divided in a manner least inimical to the revenue.
5. Avoid cumbersome administration and compliance.
6. In accordance with (5) above, be optional and ignore minor fluctuations.
7. Provide some threshold of eligibility related to cost of administration and one that the taxpayer can discern without time-consuming calculation.
8. Be available annually, not postponing taxpayer's relief for long periods.
9. Avoid the imposition of extra burdens in bad years.
10. Keep the taxpayer paid up, avoiding remnants of obligation when the taxpayer dies or otherwise leaves the system.*
11. Above all, give rational empirical results, at least moving the taxpayer closer to the norm of burden on steady income than annual payment.

The last criterion listed above is worthy of further analysis that may be presented in the form of a numerical example as follows: Suppose that the taxpayer has a series of incomes consisting of 10,10,20,10,10 (in thousands of dollars). Assume that he is single and subject to the 1966 schedule of rates. On an annual basis he would pay $14,830 in tax. Let us assume that we regard the tax he would pay with a 5-year simple average as a correct and rational amount for him to pay. It comes to $14,150. In other words the taxpayer pays $680 too much under annual payment. Under a 3-year simple average he would pay the equivalent of a tax on $13,333 for the first three years and on the annual figure of $10,000 for each of the next two years, for an aggregate tax of $14,309. His relief falls short by $159 of what he would get under 5-year averaging, but the result is at least rational in the sense that he is moved closer to the assumed correct figure than he would be without averaging. But suppose we use a moving average (federal type): he now pays on $13,333 during each of the five years for a tax of $16,550. This is further away from the "correct" answer of $14,150 than he would be if he paid on an annual basis. In effect he pays on more income than he actually received. This absurdity is obscured in the federal law by the taxpayer's option, the non-averageability of negatives, and an exclusion from averaging of one-third of his fluctuations. The absurdity arises from the fact that the aberration is corrected in the first series yet is carried over to subsequent series.

----------------------
*Points 9 and 10 will be further discussed in the next chapter.
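The figures in this example can be checked with a short Python sketch. The bracket table below is our reconstruction of the lower part of the 1966 schedule for a single taxpayer and should be treated as an assumption; it is used here because it reproduces the $14,830 and $14,150 totals given in the text. The function names are hypothetical.

```python
# Assumed 1966 schedule, single taxpayer, lower brackets only:
# (upper bound of bracket in dollars, marginal rate in percent).
BRACKETS_1966_SINGLE = [
    (500, 14), (1000, 15), (1500, 16), (2000, 17), (4000, 19),
    (6000, 22), (8000, 25), (10000, 28), (12000, 32), (14000, 36),
    (16000, 39), (18000, 42), (20000, 45), (22000, 48),
]


def tax(income):
    """Tax on one year's taxable income under the assumed schedule."""
    if income > BRACKETS_1966_SINGLE[-1][0]:
        raise ValueError("income above the brackets listed in this sketch")
    total, lower = 0.0, 0
    for upper, rate in BRACKETS_1966_SINGLE:
        if income <= lower:
            break
        total += (min(income, upper) - lower) * rate / 100
        lower = upper
    return total


def simple_average_tax(incomes):
    """Total tax for the period had income been steady at its average."""
    years = len(incomes)
    return years * tax(sum(incomes) / years)


incomes = [10000, 10000, 20000, 10000, 10000]
annual = sum(tax(y) for y in incomes)       # 14830.0
five_year = simple_average_tax(incomes)     # 14150.0
# Three-year simple average, then two years taxed annually.
three_year = simple_average_tax(incomes[:3]) + sum(tax(y) for y in incomes[3:])
# The text's moving-average case: tax on the $13,333 average in each of five years.
moving = 5 * tax(sum(incomes[:3]) / 3)
```

The three-year and moving-average results come out within a dollar or two of the text's $14,309 and $16,550; the small differences arise because the text rounds the average to $13,333 before applying the rates.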
Now take the reverse situation where a negative fluctuation is involved. Say his series is 10,10,0,10,10. Under annual payment his tax is $8,760 and under 5-year simple averaging $8,150. He pays $610 too much. Under a simple three-year average he pays $8,271, which is $121 too much but much closer to what he should pay than on an annual basis. However, on a three-year moving average he pays only $6,485, which is $1,665 too little and well wide of the mark.

One way to avoid the erratic result implicit in a moving average is to substitute average income for actual income in the second and subsequent series averaged. Thus in the second illustration above the second series becomes 3.333, 3.333 and 10 instead of 10, 0 and 10; the third series would become 5.555, 5.555 and 10 rather than 0, 10, 10. The taxpayer would thus be continually revising the income and taxes on the duplicate years in successive series. The scheme yields acceptable results, though less precise ones than a cumulative or simple average. The cumulative average constantly revises all the records of all the years since the starting point. The scheme we are considering drops a year with each series. But the principal objection to it is the complexity and cumbersomeness that it imposes on the taxpayer. Those who place a high value on simplicity as a criterion of averaging will respond to the proposal unfavorably.

While the result of simple averaging is never perverse, it may be capricious in the choice of years averaged. One may use up his years, so to speak, in a mildly favorable combination when a much more propitious combination is destined to become available in the near future. Thus he might have a series of 20,20,20,20,10 followed by several years of zero income. Perhaps he will average the first five years when it would have been more advantageous to average 20,20,10,0,0.
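The cost of an unlucky choice of years can be put in dollars with a short Python sketch. It is offered as an illustration under assumptions: the bracket table is our reconstruction of the lower part of the 1966 schedule for a single taxpayer, "several years of zero income" is taken to mean two, and the function names are hypothetical.

```python
# Assumed 1966 schedule, single taxpayer, lower brackets only:
# (upper bound of bracket in dollars, marginal rate in percent).
BRACKETS_1966_SINGLE = [
    (500, 14), (1000, 15), (1500, 16), (2000, 17), (4000, 19),
    (6000, 22), (8000, 25), (10000, 28), (12000, 32), (14000, 36),
    (16000, 39), (18000, 42), (20000, 45), (22000, 48),
]


def tax(income):
    """Tax on one year's taxable income under the assumed schedule."""
    if income > BRACKETS_1966_SINGLE[-1][0]:
        raise ValueError("income above the brackets listed in this sketch")
    total, lower = 0.0, 0
    for upper, rate in BRACKETS_1966_SINGLE:
        if income <= lower:
            break
        total += (min(income, upper) - lower) * rate / 100
        lower = upper
    return total


def simple_average_tax(incomes):
    """Total tax for the period had income been steady at its average."""
    years = len(incomes)
    return years * tax(sum(incomes) / years)


incomes = [20000, 20000, 20000, 20000, 10000, 0, 0]

# Option A: average the first five years; the two zero years bear no tax.
option_a = simple_average_tax(incomes[:5]) + sum(tax(y) for y in incomes[5:])
# Option B: pay annually on the first two years, then average 20,20,10,0,0.
option_b = sum(tax(y) for y in incomes[:2]) + simple_average_tax(incomes[2:])
saving = option_a - option_b
```

Under these assumptions the first choice costs $25,850 over the seven years and the second $23,090, so the taxpayer who averages too early forgoes $2,760 of relief.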
The capriciousness, if serious, could be mitigated by allowing the taxpayer the option of changing his period of averaging by paying up his taxes on an annual basis for the first years of a previously selected series. This, however, would be at a cost of considerable additional administration.

It will be noted from the previous discussion that none of the systems of averaging meets all of these criteria perfectly. Only cumulative averaging adjusts the taxes paid annually and perfectly for an indefinite period. But it offends the simplicity-of-administration criterion badly and is unacceptable to those who hold that only short periods are relevant in assessing ability to pay. A moving average that casts its influence both forward and backward fails the empirical test of rationality. The simple average fails to give the taxpayer annual relief and involves capriciousness in the selection of periods. It thus appears that there is no ideal system of averaging.

CHAPTER IV

LITERATURE AND EXPERIENCE

This chapter will be devoted to a survey of the considerable attention to averaging in the literature and the much more limited experience of several countries with application of averaging statutes. The literature* has concentrated heavily on the design of averaging systems; philosophical analysis (in terms of the relation of averaging to progression) and empirical studies of actual data have been few and much less significant.

Some attention both in the literature and the laws has focused on limited types of averaging such as proration of "bunched earnings"** and carry-overs of personal exemptions and deductions. The first concerns itself with immediate realization of income that has in fact accrued over longer periods than one year. Here specific averaging has been suggested either by proration over the period of accrual or over a number of years, presumably according to some notion of a limited time horizon relevant to the taxpayer's ability to pay.
Much of the attention in this area has focused on taxing capital gains, where irregular realization and long accrual have argued against the full application of the progressive scale, and arbitrary concessions unpopular with the critics have resulted. Typical here are the recent works of Martin David and Richard Goode, the former indicating some favor for proration over accrual, the latter suggesting a procedure similar to that in the federal law (1964).* It is generally agreed that "constructive realization" of gains at time of death would require some kind of spreading, and the more radical proposal to include all inheritance in taxable income would make some mitigation of "bunching" even more imperative.

-----------------------
*Wilbur A. Steger, in "Averaging Income for Income Tax Purposes," Tax Revision Compendium, Washington, D.C., 1959, pp. 589-620, presents the single best review of the development of various systems, and it is used extensively in this chapter. A more recent review than Steger's and a more complete review than our own will be available in a doctoral dissertation by E.A. Wiegner, University of Wisconsin, Madison, 1967.
**See for instance J.B.C. Woods, "Proposed Simplified Method of Rebracketing Taxes," Taxes, Vol. 32, 1954, pp. 426-428, who suggests proration of all substantial items of income received in one year but matured over two or more years. J.A. Pechman, "A Practical Averaging Proposal," National Tax Journal, Vol. 1954, pp. 261-263, similarly suggests proration treatment for a broad range of items. See also J. Willis, "The Mitigation of the Tax Penalty on Fluctuating and Irregular Incomes," Canadian Tax Foundation, Toronto, 1951, p. 71, and S.S. Surrey and W.C. Warren, "Income Tax Project of the American Law Institute," Harvard Law Review, Vol. 66, pp. 761-833.
Negative net taxable income carry-overs, as previously explained, are a separable feature of general averaging, and they have proved an attractive prospect to a considerable number of critics.** Such carry-overs are of concern especially to low-income taxpayers. The critics have noted that marginal tax progression is especially steep at low income levels, jumping from zero to 14 percent (formerly 20 percent) with the first dollar of taxable income. One can find here a middle ground between a negative income tax and present practice; it could make a considerable dent on the incidence of taxation in terms of poverty. However, no jurisdiction in the United States or elsewhere, to our knowledge, has applied the idea. Its cost in terms of revenue would undoubtedly be substantial.

The U.S. Code prior to the Revenue Act of 1964 had in I.R.C. sections 1301-1307 several very limited averaging privileges. They were mainly proration privileges to meet accounting-realization problems. Lump sum compensation for long-term work, certain damage awards for injuries extending more than one year, and related income were allowed these privileges.

---------------------
*Martin David, Alternative Approaches to Capital Gains Taxation, The Brookings Institution, Washington, D.C., 1966, pp. 207-224; Richard Goode, Individual Income Taxation, The Brookings Institution, Washington, D.C., 1964, pp. 199-204. Goode's proposal would be simpler in terms of administration. Moreover, a five-year spread would probably eliminate most of the tax differential involved and is in accord with a welfare-horizon approach such as we have discussed in Chapter I.
**See William Vickrey, Agenda for Progressive Taxation, The Ronald Press Company, New York, 1947, p. 192; Charles Volt, "Averaging of Income for Tax Purposes: Equity and Fiscal Considerations," National Tax Journal, Vol. II, 1949, pp. 358-359; Harold M. Groves, Postwar Taxation and Economic Progress, McGraw-Hill Book Company, Inc., New York, 1946, pp. 228-234; Australian Royal Commission on Taxation, First Report, Australia, 1921; reprinted in Accountant, Vol. 12, 1925, pp. 669-684 and 709-720.

The United States also has several broader provisions that involve a degree of averaging. Capital gains and losses are accorded certain privileges that are akin to averaging, and certain other lump sum payments are accorded capital gains treatment. Carry-overs of business losses, installment reporting, and deferred compensation are also related to the period of progression, as is averaging.*

Simple Averaging.

The simple average (explained in the previous chapter) has commended itself to many critics because of its simplicity, ease of administration, and the precision with which it adjusts tax burdens on fluctuating income to equal those on steady income. The first recommendation to apply simple averaging to fluctuating income that we have discovered was one made to an Australian Committee in 1921.** Some twenty-five years ago Henry Simons*** proposed the following system of averaging: permit the taxpayer to sum his taxes over a period of years; calculate what his tax bill would have been had his income been distributed evenly over these years; determine the difference between the two; and claim the difference as a refund or tax credit. To prevent minor refunds he suggested that relief be confined to cases where the actual taxes exceeded the calculated taxes by 5 or 10 percent. The taxpayer might thus be permitted to average the income of any 5 or 10 successive years at his option, subject to the limitation that no year could appear in more than one averaging computation.

--------------------
*See W.A. Klein and E.A. Wiegner, "Income Averaging for Tax Purposes-Sources of a Statutory Solution," Northwestern Law Review, May-June, 1965, pp. 150-151.
**
***Henry Simons, Personal Income Taxation, University of Chicago Press, Chicago, 1938, p. 154. See also Harold M. Groves, Taxation and Economic Progress, op. cit., pp. 223-228.
The proposal has had a wide range of support.

http://www.ssc.wisc.edu/wais/WAIS656005.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656005.txt

Martin David, 1965. Preliminary Tabulations SSRI WAIS Tax Extract File #1. April 27, 1965. WAIS paper 645-049. Averaging Studies Extract.

Martin David
WAIS 645-049
Working Paper
April 27, 1965
Revised August 3, 1965

Preliminary Tabulations
SSRI WAIS TAX EXTRACT FILE #1

1. General
1.1 It is assumed that the file will be available according to WAIS Working Paper 645-046.
1.2 Modified WISTAB (permitting calculation of means) will be required in
1.3 1410 regression will be required in
1.4 Intervals are indicated by the lower limit of the range (i.e. -00, 1000, 2000 would be interpreted as less than 1000, 1000 to less than 2000, 2000 or more).

2. Highest priority is **; next priority is *.

INCOME (L Tables)

**Table L1
Population: All records
Column: Adjusted Gross Income (26-32): -00; 0; 1; 1000; 2000; 3000; 4000; 5000; 6000; 7000; 8000; 10,000; 15,000*; 25,000*
Row: Year, sex, and marital status (9-12): 46 0 0; 46 0 1; 46 1 0; 46 1 1; 47 0 0; 47 0 1; 47 1 0; 47 1 1; repeat for each year to 1960
Page: Decade of birth year (89): blank, 0, 1, 2, 3, 4, 6, 8, 9
Entry: (a) number of records (b) percent of records (c) number of dependents (for Chris Green)
*Deleted to bring number of intervals down to the limit of 13 (including TOT) for Wistab.
Table L2
Population: All records
Column: Adjusted Gross Income (26-32): -00; 0; 3000; 5000; 7000; 10,000; 15,000
Row: Year - sex (9-11): 460; 461; 470; 471 (repeat to 1960)
Page: Decade of birth year (89): blank, 0, 1, 2, 3, 4, 6, 8, 9
Entry: (a) Mean Wages and Salaries (40-46) (b) Number of non-zero cases (if available)

Tables L3 - L9 have the same classifying matrix as L2.

**Table L3
Entry: (a) Mean Net taxable income (33-39) (b) Number of non-zero cases (if available)

Table L4
Entry: (a) Mean dividends received (47-53) (b) Number of non-zero cases (if available)

Table L5
Entry: (a) Mean gain or loss (54-60) (b) Number of non-zero cases (if available)

Table L6
Entry: (a) Mean self-employment income (61-67) (b) Number of non-zero cases (if available)

Table L7
Entry: (a) Mean interest received (68-74) (b) Number of non-zero cases (if available)

Table L8
Entry: Mean Rent received (75-81)

Table L9
Entry: (a) Mean Adjusted Gross Income (26-32)

*Table L10
Population: All records
Column: Total dividends received (47-53): blank, -00; 0; 1; 50; 100; 250; 500 (corrected to blank, 0, 1, 50, 100, 250, 500)
Row: Gain or loss on sale of assets (54-60): -00; -2000; -1000; -500; -250; 0; 1; 250; 500; 1000; 2000 (corrected to blank, 250, 500, 1000, 2000, 0, 1, 250, 500, 1000, 2000)
Page: Year
Entry: (a) Number (b) Percent

*Table L11
Population: All records
Column: Self-employment income (61-67): -00; -5000; -2000; 0; 1; 2000; 5000; 7000; 10,000; 15,000 (corrected to blank, 2000, 5000, 0, 1, 2000, 5000, 7000, 10,000, 15,000)
Row: Gain or loss on sale of assets (54-60) [use same intervals as L10]
Page: Year (9-10)

DEMOGRAPHIC (D Tables)

*Table D1
Population: All records
Column: Sex-marital status (11-12): 00, 10, 11 (changed to 0b, 1b, 11)
Row: Occupation (18-19) (Tabulate all detail)
Page: Birth decade (89): blank, 0, 1, 2, 3, 4, 6, 8, 9

Table D2
Population: All records
Column: Occupation (18-19): -00, 01, 20, 23, 25, 27, 30, 35, 36, 37, 38, 39
Row: Year, sex, marital status (9-12): 46 00; 46 10; 46 11; 47 00; 47 10; 47 11 (repeat to 1960)
Page: Birth year (89-90): blank, 00; 05; 10; 15; 20; 25; 30; 40; 60; 80; 85; 90; 95

Table G1
Population: All records
Column: Year (9-10)
Row: County of Residence (21-22)
Entry: Amount of Net Taxable Income (33-39)

Table G2
Population: All records
Column: Year (9-10)
Row: Adjusted Gross Income (26-32): blank, 0, 1, 500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, 5500, 6000, 6500, 7000, 7500, 8000, 8500, 9000, 9500, 10,000, 15,000, 20,000, 25,000, 50,000, 100,000, 150,000, 200,000, 500,000
Entry: (a) Number (b) Percent (c) Amount of AGI (26-32)

http://www.ssc.wisc.edu/wais/WAIS645049.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645049.txt

Mike Von Schneidemesser, 1965. Identification Code for Social Security Administration Punch Card Files. October 21, 1965. WAIS paper 656-027. Social Security Earnings Data - 805.

von Schneidemesser
WAIS 656-027
October 21, 1965

Identification Code for Social Security Administration Punch Card Files

Along with the tape files of 805 information, the SSA sent us punch card files for those cases which for some reason or another are not contained on the tape file. These card files are described in the letters by Robert Heller to Roger Miller (March 6, 1964) and, for the second delivery, Ira Rifkind to Gene Moyer (September 2, 1965). The layout of these card files is given in the "Description of tape file of form 805 in

Mike VonSchneidemesser, 1965. To Use the Program FFYR. October 21, 1965. WAIS paper 656-025. Programs.

http://www.ssc.wisc.edu/wais/WAIS656025.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656025.txt

VonSchneidemesser
WAIS 656-025
October 21, 1965

To use the program FFYR.

HEADR: Redate Last Year
TITLE: FFYR 1410 Program

Place the program followed by the "REDATE" cards into the read-hopper and compile. When the message "Install SSRI 305 on 5, Blank Tape on 6" appears on the console, place the old FFID tape in ID # sequence on unit 5 and the tape for the updated output on unit 6.
Enter $50. The program will then produce the updated FFID tape on 6. It will print out all cards with an ID # not contained on the FFID tape together with the statement "This ID not on FFID." It will also compare the SS# on the card, if given, with the SS# on the tape. In case these two differ, the tape record and the card will be printed out together with the message "SS# not indentical, year not changed." The FFID record will be put on the new tape, but not updated. At the end of the program counters will be printed, which specify the number of records read in and written on the output tape.

Ron Durant, 1965. Operating Instructions for the Updating of Individual (9-Digit) Fields in the 400 Character Post-Consistency Master File. August 27, 1965. WAIS paper 656-013. Data Processing Maintenance System - Files, Data, Etc. Programs.

Ron Durant
WAIS Paper 656-013
August 27, 1965

Operating Instructions For the Updating of Individual (9-Digit) Fields in the 400 Character Post-Consistency Master File

OUTLINE:
I. E.A.M. Sort of Updating Data
II. Card-to-Tape Updating Data
III. TAX-08
Appendix A Systems Flowchart

I. E.A.M. Sort of Updating Data.
1. Sort updating data on columns 12 thru 1.

II. Card-to-Tape with Updating Data.
1. Load SSRI C/T (Blocked 50) program in Card Reader. [Located in SSRI (DURANT) Drawer # 1 - 1410 Room].
2. Load Updating data behind program deck.
3. Mount Scratch Tape on Unit 1.
4. Perform Standard 1410 Processor-108 Initialization Routine on Console.
5. C/T Output will be on Unit 1 at end of job.

III. Update of Individual (9-Digit) Fields in Post-Consistency Master File: (TAX-08)
1. Load TAX-08 Object Deck in Card Reader.
[Program

Richard Bauman, 1965. Codes for "Unknown" Lump Sum Death Payments. October 22, 1965. WAIS paper 656-028. Benefit File.

Richard Bauman
WAIS 656-028
October 22, 1965

Codes for "Unknown" Lump Sum Death Payments

In the Social Security benefit data, there are a number of lump sum death payments to beneficiaries whose identity is quite impossible to ascertain. In such cases, no fixed format ID will be made. The WAIS ID number that should be assigned for the purposes of recording the data is:

Digits 1-6: Same as family unit ID number
Digits 7-8:
71 for LSDP to unknown male relatives
72 for LSDP to unknown female relatives
73 for LSDP to a funeral home
74 for LSDP to the estate
79 for LSDP not identified by the SSA

http://www.ssc.wisc.edu/wais/WAIS656028.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656028.txt

Gene Moyer and Jahanara Begum, 1965. Summary of Wistab Cards for the Preliminary Tabulations. August 3, 1965. WAIS paper 656-010. Analysis Tables.

Gene Moyer
Jahanara Begum
WAIS 656-010
August 3, 1965

Summary of Wistab Cards for the Preliminary Tabulations
(See M. David's WAIS 645-049 - attached with additions)
(Run Numbers do not reflect the order of running, merely the order given in 645-049)

In the listings below, the number following each * X, * Y, and * Z card is its number of intervals*, and the number closing each table is its core requirement.

Tables L1

Run 1
* XTAB Counter size = 6
* C Year G54
* X AGI (26-32)  13
* Y Year, Sex, m. status (9-12) (460b, 4601, 461b, 4611, ..., 5411)  36
* Z Decade of birth year (89)  10
* END Total core for run 28,080

Run 2
* XTAB Counter size = 6
* L Year L55
* X AGI (26-32)  13
* Y Year, sex, m. status (9-12) (550b, 5501, 551b, 5511, ..., 6011)  24
* Z Decade of birth (89)  10
  18,720
Table L10
* X Dividends Received (47-53)  8
* Y Gains and Losses (54-60)  12
* Z Year (9-10)  16
  9216
* END Total core for run 27,936

* The number of intervals given includes TOT, so that the size of the ith table = Ti = (C)(X)(Y)(Z), where C = counter size, X = the number of intervals in the * X card, Y = the number of intervals in the * Y card, and Z = the number of intervals in the * Z card.
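The table-size formula can be checked against the figures listed for Runs 1 and 2 with a few lines of Python. This is an illustrative sketch only; the function name `table_core` and the dictionary labels are ours, not WISTAB's.

```python
def table_core(counter_size, x, y, z):
    """Core needed by one table: Ti = (C)(X)(Y)(Z), intervals counted with TOT."""
    return counter_size * x * y * z


# Interval counts as listed on the * X, * Y and * Z cards above.
tables = {
    "Run 1, Table L1": table_core(6, 13, 36, 10),   # 28,080
    "Run 2, Table L1": table_core(6, 13, 24, 10),   # 18,720
    "Run 2, Table L10": table_core(6, 8, 12, 16),   # 9,216
}

# A run's core requirement is the sum of its tables.
run_2_total = tables["Run 2, Table L1"] + tables["Run 2, Table L10"]  # 27,936
```

The products agree with the per-table and per-run totals printed on the run summary.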
The core size limit for a run is Σi Ti < 28,000.

Tables L1A
Run 3: Duplicate run 1 inserting * A adding up number of dependents (16-17)
Run 4: Duplicate run 2 (Table 1 only) inserting * A adding up number of dependents (16-17)

Tables L2
Run 5
* XTAB Counter size = 9
* X AGI (26-32)  8
* Y Year, sex, 1946-1960 (9-11)  31
* Z Decade of birth (89)  10
* END Total core for run 22,320

The results of run 5 will be put on SSRI TAPE # 306 and used in calculating means. Tables L2-L9 require this basic set of cards with the following cards inserted before the * X card:

Run 6: * A adding up wages and salaries (40-46). Run 6 = Dividend, Run 5 = Divisor to get means
Run 7: * L Wages and salaries = 0000000

Tables L3
Run 8: * A adding up Net Taxable Income (33-39). Run 8 = Dividend, Run 5 = Divisor to get means
Run 9: * L Net Taxable Income = 0000000

Tables L4
Run 10: * A Dividends received (47-53). Run 10 = Dividend, Run 5 = Divisor to get means
Run 11: * L Dividends = 0000000

Tables L5
Run 12: * A Gains or Losses (54-60). Run 12 = Dividend, Run 5 = Divisor to get means
Run 13: * L Gains or Losses = 0000000

Tables L6
Run 14: * A Self-employment income (61-67). Run 14 = Dividend, Run 5 = Divisor
Run 15: * C Self-employment income = 0000000

Tables L7
Run 16: * A Interest received (68-74). Run 16 = Dividend, Run 5 = Divisor
Run 17: * C Interest received = 0000000

Table L8
Run 18: * A Rent received (75-81). Run 18 = Dividend, Run 5 = Divisor

Table L9
Run 19: * A AGI (26-32). Run 19 = Dividend, Run 5 = Divisor

Table D1
Run 20
* XTAB Counter size = 6
* X Sex-Marital status (10-11)  4
* Y Occupation (18-19)  43
* Z Decade of birth (89)  10
  10,320
Table L11
* X Self-employment Income (61-67)  11
* Y Gains and Losses (54-60)  12
* Z Year (9-10)  16
  12,672
* END Total core for run 22,992

Table D2
Run 21
* XTAB Counter size = 5
* C Year G53
* X Occupation (18-19) (General Categories)  13
* Y Year, Sex, marital status (9-12) (1946-53)  25
* Z Birth year ranges (89-90)  15
* END Total Core for run 24,375

Run 22
* XTAB Counter size = 5
* C Year L54
* X Occupation (18-19) (General Categories)  13
* Y Year, Sex, Marital status (9-12) (1954-1960)  22
* Z Birth year ranges (89-90)  15
* END Total Core for run 21,450

Additions to runs

Table G1
* XTAB Counter size = 9
* L Year G52
* A Net Taxable Income (33-39)
* X Year (9-10) (1946-1952)  8
* Y County of Residence (21-22)  76
  5472 (Added to Run 7: Total Core = 22,320 + 5472 = 27,792)

* L Year L53
* L Year G59
* A Net Taxable Income (33-39)
* X Year (9-10) (1953-1959)  8
* Y County of Residence (21-22)  76
  5472 (Added to Run 9: Total Core = 22,320 + 5472 = 27,792)

* C Year G59
* A Net Taxable Income (33-39)
* X Year (9-10) (1960)  2
* Y County of Residence (21-22)  76
  1368 (Added to Run 11: Total Core = 22,320 + 1368 = 24,688)

Table G2
* L Year G52
* X Year (9-10) (1946-1952)  8
* Y AGI (26-32)  31
  2232
* C Year L53
* X Year (9-10) (1953-1960)  9
* Y AGI  31
  2511 (Added to Run 13: Total Core = 22,320 + 4743 = 27,063)

* A AGI (26-32)
* X Year (9-10) (1953-1960)  9
* Y AGI (26-32)  31
  2232
  2511 (Added to Run 15: Total Core = 22,320 + 4743 = 27,063)

Approximate Machine Costs
1 1/4 hours per run + 10 minutes for each mean = 22 x 1 1/4 + 1/6 x 8 = 27 1/2 + 1 1/3 = 29 hours @ 25.00 = 725
Time from first running (approximate): 23 runs @ 1 1/4 = 28 3/4; 28 3/4 x 25.00 = 717.75
Time owed WAIS from Wistab problems: 4 1/2 x 25.00 = -112.50
Additional time which WAIS should get from lack of negative signs: 5 1/2 @ 25.00 = -137.50
Total Cost 1192.75

Martin David
WAIS 645-049
Working Paper
April 27, 1965
Revised August 3, 1965

Preliminary Tabulations
SSRI WAIS TAX EXTRACT FILE #1

1. General
1.1 It is assumed that the file will be available according to WAIS Working Paper 645-046.
1.2 Modified WISTAB (permitting calculation of means) will be required in
1.3 1410 regression will be required in
1.4 Intervals are indicated by the lower limit of the range (i.e. -00, 1000, 2000 would be interpreted as less than 1000, 1000 to less than 2000, 2000 or more).

2.
Highest priority is **
Next priority is *

INCOME (L Tables)

**Table L1
  Population: All records
  Column: Adjusted Gross Income (26-32): -00; 0; 1; 1000; 2000; 3000; 4000; 5000; 6000; 7000; 8000; 10,000; 15,000*; 25,000*
  Row: Year, sex, and marital status (9-12): 46 0 0; 46 0 1; 46 1 0; 46 1 1; 47 0 0; 47 0 1; 47 1 0; 47 1 1; repeat for each year to 1960
  Page: Decade of birth year (89): blank, 0, 1, 2, 3, 4, 6, 8, 9
  Entry: (a) Number of records
         (b) Percent of records
         (c) Number of dependents (for Chris Green)
  *Deleted to bring number of intervals down to the limit of 13 (including TOT) for Wistab.

Table L2
  Population: All records
  Column: Adjusted Gross Income (26-32): -00; 0; 3000; 5000; 7000; 10,000; 15,000
  Row: Year - sex (9-11): 460; 461; 470; 471; ... (repeat to 1960)
  Page: Decade of birth year (89): blank, 0, 1, 2, 3, 4, 6, 8, 9
  Entry: (a) Mean Wages and Salaries (40-46)
         (b) Number of non-zero cases (if available)

Tables L3 - L9 have the same classifying matrix as L2.

**Table L3
  Entry: (a) Mean Net Taxable Income (33-39)
         (b) Number of non-zero cases (if available)
Table L4
  Entry: (a) Mean dividends received (47-53)
         (b) Number of non-zero cases (if available)
Table L5
  Entry: (a) Mean gain or loss (54-60)
         (b) Number of non-zero cases (if available)
Table L6
  Entry: (a) Mean self-employment income (61-67)
         (b) Number of non-zero cases (if available)
Table L7
  Entry: (a) Mean interest received (68-74)
         (b) Number of non-zero cases (if available)
Table L8
  Entry: Mean rent received (75-81)
Table L9
  Entry: (a) Mean Adjusted Gross Income (26-32)

*Table L10
  Population: All records
  Column: Total dividends received (47-53): blank; -00; 0; 1; 50; 100; 250; 500
          corrected to blank, 0, 1, 50, 100, 250, 500
  Row: Gain or loss on sale of assets (54-60): -00; -2000; -1000; -500; -250; 0; 1; 250; 500; 1000; 2000
          corrected to blank, 250, 500, 1000, 2000, 0, 1, 250, 500, 1000, 2000
  Page: Year
  Entry: (a) Number
         (b) Percent

*Table L11
  Population: All records
  Column: Self-employment income (61-67): -00; -5000; -2000; 0; 1; 2000; 5000; 7000; 10,000; 15,000
          corrected to blank, 2000, 5000, 0, 1, 2000, 5000, 7000, 10,000, 15,000
  Row: Gain or loss on sale of assets (54-60) [use same intervals as L10]
  Page: Year (9-10)

Demographic (D Tables)

*Table D1
  Population: All records
  Column: Sex-marital status (11-12)
  Row: Occupation (18-19) (Tabulate all detail)
  Page: Birth decade (89): blank, 0, 1, 2, 3, 4, 6, 8, 9

Table D2
  Population: All records
  Column: Occupation (18-19): -00, 01, 20, 23, 25, 27, 30, 35, 37, 38, 39
  Row: Year, sex, marital status (9-12): 46 00, 46 10, 46 11, 47 00, 47 11 (repeat to 1960)
  Page: Birth year (89-90): blank, 00; 05; 10; 15; 20; 25; 30; 40; 60; 80; 85; 90; 90; 95

Table G1
  Population: All records
  Column: Year (9-10)
  Row: County of Residence (21-22)
  Entry: Amount of Net Taxable Income (33-39)

Table G2
  Population: All records
  Column: Year (9-10)
  Row: Adjusted Gross Income (26-32): blank, 0, 1, 500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, 5500, 6000, 6500, 7000, 7500, 8000, 8500, 9000, 9500, 10,000, 15,000, 20,000, 25,000, 50,000, 100,000, 150,000, 200,000, 500,000
  Entry: (a) Number
         (b) Percent
         (c) Amount of AGI (26-32)

http://www.ssc.wisc.edu/wais/WAIS656010.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656010.txt

Ron Durant, 1965. Results of Consistency Checks on Averaging Records. August 2, 1965. WAIS paper 656-009. Averaging Studies; Consistency of Data.

Ron Durant
WAIS 656-009
August 2, 1965

Results of Consistency Checks on Averaging Records

I. Number of Records involved:
     Total Number of Records                        130,025
     Total Number of Households                      13,046
     Total Number of ID No's. (00's)     87,655
       "     "    "   "   "   (X0's)     36,998     124,653
       "     "    "   "   "   (XX's)                  5,372
     where (1 < X < 9)

II. Husband-Wife Checks:
     Type of Error                                                      Number of Year Records
     001 - Wife ID # XXXXXXXX YEAR t states married with separate
           husband income - no husband return.                                   1,070
     002 - Wife ID # XXXXXXXX YEAR t states married - husband
           states not married.                                                     147
     003 - New Wife ID # XXXXXXX YEAR t states not married during
           the year.                                                                 7
     004 - Husband ID # XXXXXXXX YEAR t states married with separate
           wife income - no wife return.                                         2,826
     005 - Wife ID # XXXXXXXX YEAR t states married with no separate
           husband income - husband return present.                                382
     006 - Husband ID # XXXXXXXX YEAR t states married with no separate
           wife income - wife return present.                                    1,330
     007 - Wife ID # XXXXXXXX YEAR t states married during t; however
           married before t to the same husband.                                    29
     008 - Husband ID # XXXXXXXX states married during t; however
           married before t to the same wife.                                       18
     009 - Dead Husband ID # XXXXXXXX resurrected in the year t.                    38

III. Individual Inter-Year Checks:
     Type of Error
     101 - ID # XXXXXXXX indicates that in t a previous year return
           was filed - (t - 1) file not present.                                 8,580
           Additional 101 Errors due to missing 1946 Returns.                   (4,667)
     102 - ID # XXXXXXXX indicates no return in previous year t -
           Return present.                                                         887
     103 - ID # XXXXXXXX indicates married during t, also indicates
           already married in (t - 1).                                             111
     104 - ID # XXXXXXXX indicates not married in t, but also
           indicates married during t.                                              56
     105 - ID # XXXXXXXX indicates not married in t, but married in
           (t + 1) - however not married during (t + 1).                           619

IV. Multiple Wife Checks
     Type of Error                                                      Number of Year Records
     Multiple wives                                                                 15
     Wife Entering with No. Less than Previous Wife                                 18

http://www.ssc.wisc.edu/wais/WAIS656009.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656009.txt

Ron Durant, 1965. Assignment of Entry Codes to Update Existing WAIS Post-Consistency 400 Character Master Record File (9 Digit Amount Fields). August 12, 1965. WAIS paper 656-011. Master File - Tax Records; Formats.

Ron Durant
WAIS 656-011
August 12, 1965

To:   Hilde Roubal and WAIS Keypunching Staff
Info: WAIS Staff: David, Groves, Lampman, Miller, Bauman, Geffert, Loniello, Duchan, Moyer, Roubal, Ellis, Barger, Wiegner, VonSchneidemesser.
From: Ron Durant

Document: Assignment of Entry Codes to Update Existing WAIS Post-Consistency 400 Character Master Record File (9 Digit Amount Fields). Entry Codes will be used in program TAX-08.

I. The following is the format of the selected fields in the WAIS Master Income Record, together with the entry codes which may be used to correct these fields in this master record. This procedure should only be used to update amount fields which require more than seven digits of detail.

   Format Pos.   Data Record                                                        Entry Code
   1             M 1     1
   2-9           M 9     Identification Number
   10-11         M11     Year of Return
   28-36         M34     Largest wage or salary                                         10
   64-72         M62     Dividends received, total                                      14
   82-90         M76     Gain or loss on sale of assets                                 16
   91-99         M83     Profit or loss from business                                   17
   100-108       M90     Income from trustees or fiduciaries                            18
   127-135       M111    Total of sources of income                                     21
   145-153       M125    Income (adjusted gross) less auto expense                      23
   163-171       M139    NTI (standard deduction basis)                                 25
   262-270       M216    Net income (before fed. tax & donations)                       36
   280-288       M230    Net income before donations                                    38
   298-306       M244    NTI (itemized basis)                                           40
   370-378       M306    Taxable income incomplete form or net taxable income,
                         type 5 form                                                    54

II. The following card entries will be necessary for the updating of each field in the master record. Only one field may be updated with any one card entry. Therefore, multiple field updating within one master record will require multiple card entries with the same identification number and year.

   Columns   Data
   1-8       Identification number
   9-10      Year
   11-12     Entry Code
   13-21     Amount field (right justified; minus zone (if any) in column 13)
   22-74     Blank
   75-80     TAX-08

http://www.ssc.wisc.edu/wais/WAIS656011.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656011.txt

Ron Durant, 1965. Operating Instructions for the Running of EXT-01 (EXTRACT-01). August 27, 1965. WAIS paper 656-012. Extract 01; Maintenance System - Files, Data, Etc.;
Programs

Ron Durant
WAIS Paper 656-012
August 27, 1965

Operating Instructions for the Running of EXT-01 (EXTRACT-01)

OUTLINE:
   I.  Sort of Social Security Input Data
   II. EXT-01
   Appendix A  Systems Flowchart
   Appendix B  EXT-01 Flowchart

I. Sort of Social Security Input Data; Sorting Sequence is: (1) WAIS Identification Number
   1. Load PRE-EXT-01 Sort Control Cards in Card Reader. [Located in SSRI (DURANT) Drawer # 2 - 1410 Room]. (Standard Operating Systems Initialization).
   2. Mount SOF [Tape # SSRI 130 A (HALF REEL)] on Unit 0.
   3. Mount Social Security Input to be sorted on Unit 1.
        "       "        "      "    2.
        "       "        "      "    3.
        "       "        "      "    4.
   4. After Halt signaling that the Input Tape has been read in and unloaded: Mount Scratch Tape on Unit 1.
   5. At end of job a console message will inform the operator as to which unit the sorted output is located.

II. EXT-01: The object of this program is outlined in WAIS Working Papers 645-056, 645-057 and 645-070.
   1. Load EXT-01 Object Deck in Card Reader. [Program Deck is located in SSRI (DURANT) Drawer # 2 - 1410 Room]. (Standard Processor 108 Initialization).
   2. Mount Sorted Social Security Data on Unit 3.
      Mount 1st Reel of WAIS Master I/P on Unit 1.
      Mount Scratch Tape on Unit 2.
      Mount Scratch Tape on Unit 4.
      Mount Scratch Tape on Unit 6.
      Bring Printer to Ready.
   3. During the course of the program, console messages will notify the operator as to the mounting of additional input and output reels of tape.
   4. EXT-01 output is as follows:
        EXTRACT - 1 File on Unit 2.
        EXTRACT - 1A File on Unit 4.
        MASTER FILE OUT on Unit 6.
        Listing of Rejected Records on printer.

Appendix A (A-1)
Systems Flowchart for EXT-01
   Social Security Data
   Step I: Sort Social Sec. data into WAIS "ID" Sequence
   Sorted Social Security Data
   Consistency Master File
   Step II: EXT-01
   EXTRACT #1
   Extract 1A
   Master File
   Listing of Rejected Records

Appendix B (B-1)
Extract #1 flowchart
   START
   OPEN MASTERIN, EXTRACT 1, MASTEROUT, SOC SECIN, PRINTOUT
   BEGIN
   RD SOC. SEC.
   ALPHA 1
   RD MASTER
   CMP MASTER TO SOC. SEC. IN
   RD SOC SEC
   SETON SW.A, SW.E
   BUILD EXTRACT RECORD (EXCL SOC SEC)
   SW.E ON / OFF
   SETOFF SW.E
   MOVE ADDED FIELDS FROM SOCSEC IN TO EXTRACT RCD
   A1
   PUT MASTER OUT
   A + 1 CNT
   BRANCH TO ALPHA1
(B-2)
   PUT EXTRACT RCD.
   ADD +1 TO RCDCNT
   BLANK EXTRACT
   MLCB ENDMASTIN, ENDINOUT (i.e. OLD MASTIN TO MASTEROUT AREA)
   SW.A ON / OFF
   A1
   MOVE ADDED FIELDS FROM SOCSEC IN TO MASTOUT

http://www.ssc.wisc.edu/wais/WAIS656012.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656012.txt

Ron Durant, 1965. Operating Instructions for the Maintenance and Updating of the WAIS Master Income File. August 27, 1965. WAIS paper 656-015. Data Processing; Maintenance System - Files, Data, Etc.; Programs.

Ron Durant
WAIS Paper 656-015
August 27, 1965

Operating Instructions for the Maintenance and Updating of the WAIS Master Income File

OUTLINE:
   I.   Card-to-Tape - Updating Data
   II.  Sort Updating Data
   III. TAX-03
   IV.  Sort Recycled Master File
   V.   MERGE
   Appendix A  Systems Flowchart
   Appendix B  TAX-03 Flowchart
               MERGE Flowchart

I. Card-to-Tape with Updating Data: (50 Records Per Block)
   1. Load SSRI C/T (Blocked 50) Program in Card Reader. [Located in SSRI (DURANT) Drawer # 1 - 1410 Room].
   2. Load Updating data behind program deck.
   3. Mount Scratch Tape on Unit 1.
   4. Perform Standard 1410 Processor-108 Initialization Routine on Console.
   5. C/T Output will be on Unit 1 at end of job.

II. Sort Tape Created in I; Sorting Sequence is:
      (1) WAIS Identification Number (Cols 2-9)
      (2) Year (Cols 10-11)
      (3) Card Number (Col. 1)
   1. Load PRE-TAX-03 Sort Control Cards in Card Reader. [Located in SSRI (DURANT) Drawer # 2 - 1410 Room]. (Standard Operating Systems Initialization).
   2. Mount SOF [Tape # SSRI 130 A (HALF REEL)] on Unit 0.
   3. Mount I/P Tape to be Sorted on Unit 1.
      Mount Scratch Tape on Unit 2.
      Mount Scratch Tape on Unit 3.
      Mount Scratch Tape on Unit 4.
   4. After Halt signaling that the Input Tape has been read in and unloaded: Mount Scratch Tape on Unit 1.
   5. At end of job a console message will inform the operator as to which unit the sorted output is located.

III. Edit and Master Updating Run: (TAX-03)
   A. For a detailed description of the scope and procedure employed in this program, see WAIS Paper 645-054, Programming Systems Involved in the Creation and Updating of the WAIS Master Income File (April 28, 1965).
   B. Operation:
      (1) Load TAX-03 Object Deck in Card Reader. If there are any "ID" C (Change) Cards, load them behind the program deck. [Program Deck is located in SSRI (DURANT) Drawer # 2 - 1410 Room]. (Standard Processor-108 Initialization).
      (2) Mount Sorted PRE-TAX-03 Output on Unit 1.
          Mount 1st Reel of WAIS Master I/P on Unit 3.
          Mount Scratch Tape on Unit 0.
          Mount Scratch Tape on Unit 2.
          Mount Scratch Tape on Unit 4.
          Bring Printer to Ready.
      (3) During the course of the program, console messages will notify the operator as to the mounting of additional input and output reels of tape.
      (4) TAX-03 output is as follows:
          (a) Recycled Master Record File on Unit 0.
          (b) Updated Master File on Unit 2.
          (c) Edit Tape on Unit 4. (Edit Tape consists of 80 character records which can be punched out using an IBM Utility program.)
          (d) Edit Listing on the Printer.

IV. Sort Recycled Master Record File (created in TAX-03). Sorting Sequence is:
      (1) WAIS Identification Number (Cols 2-9)
      (2) Year (Cols 10-11)
   1. Load POST-TAX-03 Sort Control Cards in Card Reader. [Located in (DURANT) Drawer # 2 - 1410 Room]. (Standard Operating Systems Initialization).
   2. Same as II-2 through II-5.

V. MERGE
   A. The object of this program is to merge the Recycled Master File into our Updated Master File, thereby creating a Final Updated Master File. If a recycled master record is found to be a duplicate of an existing Master record, the recycled master record is dropped and both the accepted Master record and the dropped recycled master record are printed out.
   B. Operation:
      (1) Load MERGE Program Deck in Card Reader. [Program Deck is located in SSRI (DURANT) Drawer # 2 - 1410 Room].
          (Standard Processor-108 Initialization).
      (2) Mount Sorted Recycled Master File on Unit 1.
          Mount Update Master File on Unit 3.
          Mount Scratch Tape on Unit 2.
          Bring Printer to Ready.
      (3) During the course of the program, console messages will notify the operator as to the mounting of additional input and output reels of tape.
      (4) MERGE output is as follows:
          (a) Final Updated Master File on Unit 2.
          (b) Edit listing on the printer.

Appendix A (A-1)
Systems Flowchart for the Maintenance and Updating of the WAIS Master Income File
   Step I:   CARD TO TAPE WITH UPDATING DATA
   Step II:  SORT DATA INTO SEQUENCE OF MASTER FILE
             SORTED UPDATED DATA
   Step III: EDIT LISTING
             INPUT MASTER FILE
             TAX-03
             UPDATED MASTER FILE
   Step IV:  SORT INTO SEQUENCE OF MASTER FILE
             RECYCLED MASTER FILE
             SORTED RECYCLED MASTER FILE
   Step V:   MERGE
             FINAL UPDATED MASTER FILE
             LISTING OF DUP. RECYC. MASTER RCDS & ACCEPTED MASTER RCDS.

APPENDIX B
TAX-03 Flow Chart
(B-1)
   START
   READ IN ID CHANGE ENTRIES & BUILD TABLE
   RD MASTIN
   BEGIN
   RD DATA
   CMP DTAG TO PREVDTAG
   PUT OUT DUP. DATA MSG.
   PUT DUP. RCD. OUT AS EDIT
   BRANCH TO BEGIN
   HALT SEQUENCE ERROR
   MOVE DTAG TO PREVDTAG
   MOVE DRCD TO PREVDRCD
   MOVE DTAG TO CURRTAG
   MOVE CURR. DATA RCD TO CHECK AREA
   BRANCH TO EDIT RTN
(B-2)
   EDITRTN
   CHECK FOR CURRTAG = "NINES"
   SETON PROCESS DATA SW. FOR LAST RCD & SETON EOJ SW.
   BRANCH TO BEGBALLINE
   CHECK FOR "J" ENTRY, "P" ENTRY
   BRANCH TO PROCESSRTN
   ZERO OUT Xi
   CHECK FOR COMPATIBLE YR - CARD # - CARD TYPE - SEARCH TABLE
   PUT OUT MSG. INCOMPATIBLE YR-NO-TYPE; PUNCH OUT REJECT
   DATARETURN
   A + 1, Xi
   CMP BASE - 1 + Xi, @ Z @
   BRANCH TO UPCNTRTND
   NUMERIC / NON-NUMERIC
   MLCB Xi, SAVX. #2
   BRANCH TO 3A
(B-3)
   3A
   CMP SAVEXi, @ 12 @
   BRANCH TO ALPHA1
   PUT OUT MSG. COL. XX NONNUMERIC. PRINT & PUNCH REJECT
   BRANCH TO BEGIN
   IS CURR RCD A 1 OR 9 RCD (COL. i): NO / YES
   IS COL. 12 A "+": YES, BRANCH TO UPCNTRTND / NO
(B-4)
   ALPHA1
   CMP BASE - 1 + Xi
   CHECK FOR M, R, S, & C
   SEARCH MIL. DED. TABLE
   CHECK FOR "-" OR "NEG. ZONE" - SEARCH TABLE
   PUT OUT MSG. COL. XX ILLEGAL CHAR - PRINT & PUNCH OUT REJECT
   BRANCH TO BEGIN
   UPCNTRTND
   CMP SAVE Xi TO 79
   BRANCH TO DATARETURN
   PROCESSRTN
(B-5)
   PROCESSRTN
   CMP CURRTAG TO PREVTAG
   HALT SEQUENCE ERROR
   ID#-YR CHANGE: NO, SETON DATA SW. / YES, SETON PROCESS DATA SW.
   LOOK AT CURR TAG AND DETERMINE WHETHER TO SETON DATA OR "P" OR "J" SW.
   BRANCH TO BEGBALLINE
(B-6)
   BEGBALLINE
   PROCESS DATA SW.: OFF, BRANCH TO BAKER1 / ON
   SETOFF PROCESS DATA SW.
   PROCESS INDICATOR: OFF / ON
   SETOFF PROIND
   CMP PREVIOUS "ID#-YR" TO CURR MASTIN "ID#-YR"
   MOVE CURR MASTIN RCD TO MAST O/P AREA
   MOVE CURR MASTIN RCD TO MAREA
   MLCS @ 0 @ TO OP
   MLCS @ 1 @ TO OP
   BRANCH TO GENMASTOUT ROUTINE
   ADD ADDITIONAL INFO TO MAST RCD (DEPENDING ON FORM TYPE ETC) IN MAREA
   CHECK FOR MISSING CARDS & PUT OUT APPROPRIATE MSGS.
   BUILD DATA MAST RCD (DEPENDING ON FORM TYPE ETC) IN MAREA
   MOVE MAREA TO MAST O/P AREA AND WRITE OUT MAST. RCD
   BRANCH TO GENMASTIN ROUTINE
   BLANK OUT: DATA ACCUM AREA, MAREA, MAST O/P AREA; RD IN MASTIN
(B-7)
   BAKER1
   J SW.: OFF, BRANCH TO ABLE1 / ON
   SETOFF J SW.
   CMP CURR "ID" TO CURR MASTIN "ID"
   PUT OUT MSG. J ENTRY ID# (CURRTAG) - NO MASTER EXISTS; PUNCH OUT EDIT
   CMP CURR "ID" TO CURR MASTIN "ID"
   ADD +1 TO MAST DRP CNTR
   MOVE CURR MASTIN RCD TO MAST O/P AREA
   SETON JPR SW.
   BRANCH TO GENMASTOUT ROUTINE
   BRANCH TO GENMASTIN ROUTINE
   SETOFF JPR SW.
   JPR SW.: ON / OFF
(B-8)
   ABLE1
   P SW.: OFF, BRANCH TO BAKER2 / ON
   SETOFF P SW.
   CMP CURR "ID#-YR" TO CURR MASTIN "ID#-YR"
   PPR SW.: OFF
   PUT OUT MSG. P ENTRY ID# YR# (CURR TAG) NO MASTER; PUNCH OUT EDIT
   ON: SETOFF "PPR" SW.
   MOVE CURR TO MAST O/P AREA
   BRANCH TO GENMASTOUT ROUTINE
   BRANCH TO GENMASTIN ROUTINE
   ADD +1 TO MASTDRP CNTR
   SETON PPR SW.
(B-9)
   BAKER2
   DATA SW.: OFF
   EOJ SW.: OFF
   MOVE CURRTAG TO PREVTAG
   BRANCH TO BEGIN
   ON: SETOFF DATA SW.
   MOVE CURR RCD INTO APPROP DATA ACCUM AREA
   SETON PROCESS INDICATOR
   ON: CMP CURR MASTIN ID# TO @ 9999999999 @
   HALT SEQUENCE ERROR
   MOVE CURR MASTIN RCD TO MAST O/P AREA
   PUT OUT CONTROL TOTAL & CLOSE FILES
   BRANCH TO GENMASTOUT ROUTINE
   BRANCH TO GENMASTIN ROUTINE
   END OF JOB HALT 99999

Merge Flow Chart
(B-10)
   BEGIN
   RD MASTER RCD
   RD RECYCLED MASTER RCD
   RD MASTER RCD
   CMP MASTER TO RECYCLED MASTER "ID-YR"
   WRITE MASTER RCD OUT
   WRITE RECYCLED MASTER RCD OUT
   LIST DUPLICATE RECYCLED MASTER & ACCEPTED MASTER
   WRITE MASTER RCD OUT
   BEGIN

http://www.ssc.wisc.edu/wais/WAIS656015.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656015.txt

Jonathan Ryshpan, 1965. Fixed Format ID File Maintenance System. September 3, 1965. WAIS paper 656-019. Data Processing; Fixed Format Identification File (FFID); Maintenance System - Files, Data, Etc.

Jon Ryshpan
WAIS 656-019
September 3, 1965

Fixed Format ID File Maintenance System

NB All programs print a record count as a last page; this may appear on the same page as the monitor cards for the next program.

NB The numbers in the boxes in the flow chart correspond to the first number of the page code, i.e. the Update FID program, numbered "1", is described on page 1-1. The letters on the tape reels correspond to the letters to the far left on the tape descriptions.

NB Wherever the programs herein described are not assembled into a multiprogram JOB, the appropriate MONITOR cards are included in the program deck.

Flow chart (box and reel labels):
   These are all in Job 9
   Prog 1
   Sort by ID
   3
   FID A
   Update Cards
   List of Duplicates
   FID B
   Sort of Social Security No.
   FID C
   Multiple 805 Sheets
   Prog 6
   Mult 805 Tape
   Reg. 805 D
   Claims Cards
   Prog 5
   Reformatted Form 805 E
   Prog 4
   Not on Soc Sec. F
   Soc Sec. "We have tried to find them" cards
   Job 7 / Package 7
   Printed Output, both tapes
   Tape for Soc. Sec.
   FID recs for people sent G
   People not on FID
   Selected Soc. Sec.
   Package 8
   FID C
   Reformatted Form 805 E
   Job 8
   People with Changed ID No's
   There is a program for printing selected records from the Regular Form 805 Tape.

HEADR Update the FID File
TITLE UPDATEFID                                                  1410 Program

The purpose of this program is to make corrections to the FID file according to the scheme set forth in WAIS 645-055 and also to enter new I records.

To use the program, place the program deck in the read hopper (with the appropriate Monitor cards), followed by the corrections cards sorted on cols. 2-9 and 80.

The operation of the program consists of two phases. In phase 1, the computer reads the correction cards, checks them for format and sequence, and writes them out on tape. If any errors are found in the corrections cards by phase 1, the program prints error messages on the printer and cancels the execution of phase 2. In phase 2 the actual updating of the tape is carried out.

At the start of phase 2, the console typewriter types "Mount old FID on MR1, blank tape on MR2." and enters a waiting loop to allow the operator to do these things. If during the updating the program encounters any C, J, or N cards whose ID No's do not match any on the tape, or I cards whose ID No's do match any on the tape, it prints out these cards and bypasses them in the updating operation. The old FID should be sorted by ID No.

At the end of phase 2, the updated FID file is on MR2. It may have records out of sort and multiple records with the same ID, so it must be sorted and have duplicates eliminated.

JOB Sort 124 Digit Records                                       1410 Program

These are control cards for the operating Sort program for sorting the FID file. In the listing there are two "E CNTFLDS" cards. Only one of these is used at a time, each in its proper place just before the end card. The first, which reads:
   NUMBER-1, LENGTH-0009, 1LOC-0018, 1LEN-009
sorts the FID file by Social Security No. The second, which reads:
   NUMBER-1, LENGTH-0008, 1LOC-0009, 1LEN-008
sorts by ID number.
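
The effect of the two control-field cards can be illustrated with a small sketch. It rests on assumptions that the text does not state outright: that the location value on each card names the low-order (rightmost) column of the key, which would match an ID in cols. 2-9 and a Social Security No. in cols. 10-18, and that the record carries a type code in col. 1. The helper names `control_field`, `sort_fid`, and `make_record` are hypothetical and are not part of the Sort program.

```python
RECORD_LEN = 124  # the text calls these "124 Digit Records"


def control_field(record: str, loc: int, length: int) -> str:
    """Return the sort key of `length` columns ending at 1-based column `loc`.

    ASSUMPTION: `loc` is the low-order (rightmost) column of the field.
    """
    return record[loc - length:loc]


def sort_fid(records, loc, length):
    """Sort fixed-width records on a single control field."""
    return sorted(records, key=lambda rec: control_field(rec, loc, length))


def make_record(id_no: str, ssn: str) -> str:
    """Hypothetical layout: col 1 type code, cols 2-9 ID, cols 10-18 SSN."""
    return ("I" + id_no + ssn).ljust(RECORD_LEN)


records = [
    make_record("00000002", "111111111"),
    make_record("00000001", "222222222"),
]

# First card: location 18, length 9 (Social Security No. order)
by_ssn = sort_fid(records, loc=18, length=9)
# Second card: location 9, length 8 (ID number order)
by_id = sort_fid(records, loc=9, length=8)
```

Only one key is active per run, which mirrors the instruction that only one of the two cards is placed in the deck at a time.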

HEADR Find and Eliminate Duplicate ID Numbers
TITLE SIMPLICATE                                                 1410 Program

The purpose of this program is to eliminate all but one of a set of records with the same ID No. from the FID file.

To use the program, place the program deck in the read hopper (with the appropriate Monitor cards). When execution starts, this message will appear on the console typewriter: "Mount input on MR1, Output on MR2". The operator should put the FID tape, sorted by ID No., on MR1 and a blank tape on MR2, and then enter "$50" on the console typewriter.

The program produces a list of records having multiple ID No's. These are printed in groups by ID No. The record selected to be put on the output is one of the first two read and is, in order of preference:
   The N record
   The J record
   The 2d record

The output is an FID tape, sorted by ID No. and without duplicate ID No's, on MR2.

HEADR Select Reformatted Form 805
TITLE SELECT805                                                  1410 Program

The purpose of this program is to create a file of FID information together with the Form 805 data from Social Security wherever both are present, make a tape of people for whom we have FID data but no Form 805 data, and make a list of the Social Security No's of those people for whom we have Form 805 data but no FID record.

To use the program, place the program deck in the read hopper (with the appropriate Monitor cards). When program execution starts, this message appears on the console typewriter: "Mount Inputs on MR1-2, Outputs on MR3-4," and the program enters a waiting loop. The operator should then place the FID tape, sorted by Social Security No., on MR1, the Reformatted Form 805 Tape on MR2, and blank tapes on MR3 and MR4; then enter $50.

At the end of program execution, the file of combined Social Security and FID data is on MR3, the file of people for whom we have FID data but no Form 805 data is on MR4, and the list of people for whom we have Form 805 data but no FID data is on the printer.
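
The three outputs of SELECT805 correspond to a sequential match of two files sorted on the same key. The following is a minimal sketch of that idea, not the 1410 program itself. It assumes each file is a list of (Social Security No., data) pairs with at most one record per number, and the function name `select805` is hypothetical.

```python
def select805(fid_recs, form805_recs):
    """Match two lists of (ssn, data) pairs, each sorted by ssn.

    Returns (combined, fid_only, no_fid_ssns), standing in for the MR3 file,
    the MR4 file, and the printed list. ASSUMPTION: ssn is unique per file.
    """
    combined, fid_only, no_fid_ssns = [], [], []
    i = j = 0
    while i < len(fid_recs) and j < len(form805_recs):
        fid_ssn, fid_data = fid_recs[i]
        soc_ssn, soc_data = form805_recs[j]
        if fid_ssn == soc_ssn:
            # Both present: write the combined record
            combined.append((fid_ssn, fid_data, soc_data))
            i += 1
            j += 1
        elif fid_ssn < soc_ssn:
            # FID record with no Form 805 data
            fid_only.append(fid_recs[i])
            i += 1
        else:
            # Form 805 record with no FID record: list its number
            no_fid_ssns.append(soc_ssn)
            j += 1
    # Whatever remains on either file is unmatched
    fid_only.extend(fid_recs[i:])
    no_fid_ssns.extend(ssn for ssn, _ in form805_recs[j:])
    return combined, fid_only, no_fid_ssns


fid = [("111111111", "fid-a"), ("222222222", "fid-b"), ("444444444", "fid-d")]
soc = [("222222222", "805-b"), ("333333333", "805-c")]
combined, fid_only, no_fid = select805(fid, soc)
```

Because both inputs are read once in key order, the match needs no table in core, which is presumably why both tapes must be sorted by Social Security No. beforehand.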
The file on MR4 is input to a set of programs that make the tape to be sent to Social Security.

HEADR Reformat Form 805 and Add Claims
TITLE FIX805                                                     1410 Program

The purpose of this program is to change the Social Security Form 805 data from the form in which it came to us into the form described in WAIS 645-063 and add a claims indication for those people who have claims cards.

To use the program, place it in the read hopper (with the appropriate Monitor cards), followed by the claims cards sorted by Social Security No. The claims cards must be followed by a trailer card with "999999999" in card cols. 1-9.

When program execution starts, this message will type out:
   "Mount Inputs on MR1-2, Output on MR3."
   "Put claims cards in hopper then 999-card."
The operator must then place the first reel of the Regular 805 data, sorted by Social Security No., on MR1, the reel of Multiple Accounts data, sorted by Social Security No., on MR2, and a reel of blank tape on MR3, and then enter "$50". (Note: The program is designed to take two reels of Regular Form 805 data, of which the first has a tape label, and one reel of Multiple Account Form 805 data. If some other configuration of tape files exists, the program will have to be changed.)

After the first reel of Regular Form 805 data has been read, the computer will type out: "Mount 2D Reel of 805's on MR1" and enter a waiting loop. The operator should mount the second reel of the Regular Form 805 data on MR1 and enter $50.

The program prints out two different kinds of error messages:
   "Unreferenced Claims Card" and the card, if a claims card does not have the same Social Security No. as any tape record, and
   "Both Files have the Same Social Security No." and the Social Security No., if both the Regular and Multiple Account Form 805 files have records with the same Social Security No.
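
The claims handling and the two error messages can be sketched as follows. This is only an illustration under stated assumptions: the files are held as dictionaries keyed by Social Security No. rather than read from tape, the "I"/"J" type codes are borrowed from the tape notes later in this paper, and the choice to keep the Regular record when both files share a number is a guess, since the text only says that a message is printed. The name `fix805` is hypothetical.

```python
TRAILER = "999999999"  # trailer card value given in the text


def fix805(regular, multiple, claim_cards):
    """Merge Regular and Multiple Account records and flag claims.

    regular, multiple: dicts mapping ssn -> data.
    claim_cards: iterable of ssn strings, ending with the trailer card.
    Returns (records sorted by ssn, error messages).
    """
    messages = []
    out = {}
    for ssn, data in regular.items():
        out[ssn] = {"type": "I", "data": data, "claim": False}
    for ssn, data in multiple.items():
        if ssn in out:
            messages.append(f"Both Files have the Same Social Security No. {ssn}")
            continue  # ASSUMPTION: the Regular record is the one kept
        out[ssn] = {"type": "J", "data": data, "claim": False}
    for ssn in claim_cards:
        if ssn == TRAILER:
            break  # the 999-card ends the claims deck
        if ssn in out:
            out[ssn]["claim"] = True
        else:
            messages.append(f"Unreferenced Claims Card {ssn}")
    records = [dict(ssn=ssn, **out[ssn]) for ssn in sorted(out)]
    return records, messages


records, messages = fix805(
    regular={"111111111": "reg-a", "222222222": "reg-b"},
    multiple={"222222222": "mult-b", "333333333": "mult-c"},
    claim_cards=["111111111", "555555555", TRAILER],
)
```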

Card to Tape for Multiple Soc-Sec                                1410 SPS Program

The purpose of this program is to transfer the cards punched from Multiple Form 805 sheets, according to the format described in WAIS 645-045, to tape in the same format as the Regular Form 805 tape from the Social Security Administration.

To use the program, place it in the read hopper, followed by the data cards sorted by Social Security No. and Card No. (i.e. card cols. 1-12). Put a blank tape on tape unit 1. Press the card load key. At the end of execution, the output is on tape unit 1. N.B. Leave Sense Switch A on.

JOB Prepare the Tape to Send to Social Security                  1410 Monitor Job Package

This is a set of 3 programs with their associated Monitor control cards:

1. HEADR Eliminate Extra Work
   TITLE HELPSOCSEC
   The purpose of this program is to eliminate those people whom the Social Security Adm. has already tried to find from the tape to be sent to them.

2. HEADR List and Reformat Missing Items
   TITLE EXPORT
   List the tape of FID records of people who will be sent to Social Security and prepare the actual tape of 80 char. unblocked records.

3. HEADR Print 80 Character Blocks
   TITLE PRINT80
   List the tape to be sent to Social Security.

The job deck consists of two parts: a thick blue deck followed by a thin red deck. To run the job, place the deck of cards of people that Social Security has tried to find between the blue and red job decks, and put the whole works in the read hopper. Mount the "Not on Soc-Sec" tape on Unit 1, blank tapes on Units 2 and 3, and start the job.

At the end of the run, the tape of FID records for people being sent to Social Security is on Unit 2, and the tape to be sent is on Unit 3.

JOB Find and Print Changed ID Numbers                            1410 Monitor Job Package

This is a package of two programs and a Sort for the purpose of finding out when a record on the FID file has the same Social Security No. as a record on the Reformatted 805 file, but the two have different ID No's.
To operate the package, prepare yourself by having the Reformatted Form 805 file and the FID file, sorted by Social Security No., on hand. Put the package in the read hopper, start the job, and watch the console typewriter for detailed operating instructions. The package is constructed of:

1. HEADR: Find Records with ID Changes
   TITLE: FINDCHANGE
   The purpose of this program is to find the changed ID No's and put them out on tape.

2. SORT: To sort the records by the ID No. on the FID file.

3. HEADR: Print out Changed ID No's
   TITLE: PRINTID
   To print out the records found by FINDCHANGE.

JOB Update FID, Sort on ID, Eliminate Duplicates                 1410 Monitor Job Package

This is a package of two programs and a sort for the purpose of updating the FID file, including sorting it into ID No. order and eliminating any duplicate records that may have arisen in the process. These are:

1. HEADR Update the FID File
2. JOB Sort 124 Digit Records
3. HEADR Find and Eliminate Duplicate ID Numbers

All of which are described earlier.

The job deck consists of two parts: a green deck followed by a pink deck. To run the job, place the deck of update cards, sorted on columns 2-9 and 80, between the green and pink decks and put them in the read hopper. Have the old FID file on hand. Start the run, and watch the console typewriter for detailed operating instructions. At the end of job, the updated FID file will be on Unit 1.

Print and Punch Selected Soc-Sec Records From the 805 Tape       1401 SPS Program

The purpose of this program is to print and punch the first line of records selected from the Form 805 tape sent to us by Social Security.

To use the program, place in the read hopper the program deck followed by cards with the Social Security No's of the people who are to be printed out punched in cols. 2-10, this deck sorted in Social Security No. order, and mount the first reel of the Form 805 file on tape unit 1. Press the computer reset and card load keys. When the first reel of the file is exhausted, mount the second reel on unit 1 and press the start key. N.B. Leave Sense Switch A on.

Reel No.  Label     Notes

A         SSRI154   FFID Tape of May 11th. Sorted by ID No. Blocked 8 x 124.
                    20224 Recs. May 31, 1965.
                    This is the last FID file before the run of the standard update program.

B         SSRI305   FFID File with C*N*I*J corrections made + duplicate ID's removed. Blocked 8 x 124.
                    (This is input to the next update run.) Sorted on ID No.
                    20134 Recs. June 3rd, 1965.

C         SSRI288   FID file with C:N:I-J corrections made + duplicate ID's eliminated. Blocked 8 x 124.
                    Sorted by Soc-Sec No. (This is input to the selection run.)

D         --------  Form 805 Data from Social Security (1 of 2). Blocked 1 x 738. Feb 24th.
          --------  Form 805 Data from Social Security (2 of 2). Feb 24th.
                    These are the 2 original reels of 805 data sent to WAIS by the Social Security Adm.

E         SSRI178   Form 805 Data, Reformatted with a Claims Indication. Blocked 10 x 404.
                    14627 Recs. April 22, 1965.
                    Note: The format of this tape is similar to that of WAIS 645-063. The differences are that cols. 1-123 are all blank except for:
                       1        "I" if Regular 805, "J" if Multiple
                       2-9      Wisc ID No. (from the Form 805 file)
                       10-18    Social Security No.
                    and tape cols.:
                       382-403  Multiple Social Security No's (on "J" records)
                       404      "not equal"

F         SSRI121   Items on the FID file that do not appear on the SECUR file. Sorted on Soc-Sec. Blocked 8 x 124.
                    The SECUR file is the Reformatted 805 file.
                    3456 Recs. June 2, 1965.

G         SSRI166   Items on FID that are not on SECUR, less those that Soc-Sec Adm. has sent us "We have tried to find them" cards. Blocked 8 x 124.
                    These are the FID recs for the people we sent to Soc-Sec for further processing.
                    2,800 Recs. June 8, 1965.

          SSRI--    OK Recs. Jim Geffert made this tape, from which the Sorted Keys + Random "J" recs were made.
          SSRI129   ------  Scratch tape.

http://www.ssc.wisc.edu/wais/WAIS656019.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656019.txt

Mike VonSchneidemesser, 1965. Purpose and Operation of the Programs FFID, FFIDS, FFIDB. October 21, 1965. WAIS paper 656-026. Programs.
http://www.ssc.wisc.edu/wais/WAIS656026.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656026.txt

VonSchneidemesser
WAIS 656-026
October 21, 1965

Purpose and Operation of the Programs FFID, FFIDS, FFIDB

These two programs serve to extract certain records specified by cards from the FFID tape sorted on ID number and print these out. The tape remains unchanged.

FFID: This program prints out each input card with the ID number in columns 2-9 with its corresponding FFID record from the tape. If no FFID record with the specified ID number can be found on the tape, the card will be printed with the message "This card not on FFID tape."

To operate the program, place it in the read hopper followed by the card deck sorted on ID number. When the message "Get tape set up" appears, mount the FFID tape on unit 5 and enter $50. At the end of the program, counters will be printed out.

FFIDS: This program does the same as FFID except that the card input file has to conform to the following layout:

   Columns   Entry
   1-9       SS#
   10-73     other entries or blank
   74-80     identifying code (not required)

These cards should then be sorted on SS#. The tape which has to be installed on unit 5 is the FFID tape sorted on SS#.
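
The card-against-tape lookup performed by FFID can be sketched as a single forward pass over both sorted files. This is an illustration only: it assumes the ID number occupies columns 2-9 of the tape record as well as of the card, and the function name `ffid_lookup` and the two counters it returns are hypothetical, since the text does not say which counters the program prints.

```python
NOT_FOUND = "This card not on FFID tape."  # message quoted in the text


def ffid_lookup(cards, tape):
    """Pair each card with its tape record, or with the not-found message.

    cards, tape: lists of fixed-width strings, both sorted on cols 2-9.
    ASSUMPTION: the tape record also carries the ID in cols 2-9.
    Returns (listing, counters).
    """
    listing = []
    pos = 0      # current position on the tape; it only moves forward
    found = 0
    for card in cards:
        key = card[1:9]  # columns 2-9
        while pos < len(tape) and tape[pos][1:9] < key:
            pos += 1
        if pos < len(tape) and tape[pos][1:9] == key:
            listing.append((card, tape[pos]))
            found += 1
        else:
            listing.append((card, NOT_FOUND))
    counters = {"cards_read": len(cards), "cards_found": found}
    return listing, counters


tape = ["I00000001AAA", "I00000003CCC", "I00000004DDD"]
cards = [" 00000001", " 00000002", " 00000004"]
listing, counters = ffid_lookup(cards, tape)
```

The same pass would serve FFIDS if the key slice were changed to columns 1-9 of the card and the Social Security No. columns of a tape sorted on SS#.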
FFIDB: This program will produce a listing of the input cards plus every FFID record which shares the same first 6 digits of its WAIS ID number with those of the card ID number. In other words, this program prints the records of a whole family. The card layout requires only that the first 6 digits of the ID number are punched in columns 11-16. To operate: put the program in the hopper, followed by the card deck sorted on columns 11-15. On the message "Get tape set up", install the FFID tape sorted on ID number on unit 5. Enter $50. The program stops when the counters for cards and tape records read are printed.

Martin David. 1965. Notes on Future Work in the Area of Averaging Studies. November 3, 1965. WAIS paper 656-029. Averaging Studies.

M. David WAIS Paper 656-029 November 3, 1965

NOTES ON FUTURE WORK IN THE AREA OF AVERAGING STUDIES

I. Present Capabilities

At present WAIS has a partially edited five-year file of tax records required to reconstruct income for 1958

Gene Moyer. 1965. Report on the Construction of the "Average of Variables" Tape. September 17, 1965. WAIS paper 656-021. Programs.

Gene Moyer WAIS 656-021 September 17, 1965

Report on the Construction of the "Average of Variables" Tape

Recently James Geffert wrote a program which extracted records from Extract - 01 and averaged appropriate variables for each person on that tape. This tape has been preserved and is available on SSRI #301. It contains observations on 19,525 individuals. The format of that tape is on page 2.

While the tape was on the computer, the two runs listed on page 3 were run and are available in 353. Later, we merged the records of persons with two or more ID's and reran Table 3. This is also available in 353. In the second running, we also computed mean AGI (the mean of means) for the table.

Suggestions for other tables to be run from this tape will be appreciated.
Format of the "Average of Variables" Tape - SSRI #301

Columns  Number of Columns  Source Columns in Extract #01  Contents
1-8      8                  (1-8)                          WAIS Identification number
9        1                  (11)                           Sex: 0 = male, 1 = female
10-16    7                  (26-32)                        Mean Adjusted Gross Income [$= \frac{1}{N}\sum_{t=1}^{N} \mathrm{AGI}_t$]
17-23    7                  (33-39)                        Mean Net Taxable Income
24-30    7                  (40-46)                        Mean Wages and Salaries
31-37    7                  (47-53)                        Mean Total Dividends Received
38-44    7                  (54-60)                        Mean Gain or Loss on sale of assets
45-51    7                  (61-67)                        Mean Self-Employment Income
52-58    7                  (68-74)                        Mean Total Interest Income
59-65    7                  (75-81)                        Mean Total Rent Income
66-72    7                  (82-88)                        Mean Income from Trustees and Fiduciaries
73-74    2                  ---                            Total number of years filed
75-76    2                  (89-90)                        Year of birth
77-78    2                  ---                            First year filed (year this person began to file)
79-80    2                  ---                            Last year filed (year this person stopped filing)
81-87    7                  ---                            Mean "Property" Income = mean AGI less mean wages and salaries (incorrect)
88       1                  92                             Record Mark

The blocking factor is 10.

Tables So Far Run from "Average of Variables" Tape on August 23, 1965

Run #  Table and control cards  (Intervals)

1  Table 1
   *XTAB Counter size = 9
   *X Total years filed (73-74) (01,02,03,04,06,08,09,10,11,12,13,14,TOT)  13
   *Y Mean AGI (10-16) (blank; 0; 1; 1000; 2000; 3000; 4000; 5000; 6000; 7000; 8000; 9000; 10,000; 15,000; 20,000; 25,000; 50,000; TOT)  18
   *Z Last year filed (79-80) (46, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, TOT)  12
   *END
   Total core for run 23,472

2  Repeat Table 1 adding *A, amount of mean AGI

3  Table 3
   *XTAB Counter size = 9
   *C Last year Filed (L55)
   *X Total years filed (73-74) (01,03,04,05,06,07,09,10,11,12,13,14,TOT)  13
   *Y Mean AGI (10-16) (as above)  18
   *Z Decade of Birth Year (75) (blank, 0,1,2,3,4,6,8,9,TOT)  10
   *END
   Total core for run 21,060

   Repeat Table 3 adding *A, amount of mean AGI

Suggested Tables to be Run from the "Average of Variables" Tape

Run #
05-07  Repeat Table 1 adding the following *A cards and computing means - also add a *C card eliminating women (E 1 in 9)
05  *A = Mean NTI (17-23)
06  *A = Mean "Property" income (81-87)
07  *A = Mean
Wages and Salaries (24-30)

Table 3
08  *XTAB Counter size 9
    *C Last year filed (L 55)
    *C Sex (E 1 in 9)
    *X Mean SE income (45-51) (b000000, b000001, 000000, 0000001, 0002000, 0005000, 0007000, 0010000, 0015000, TOT)  10
    *Y Mean Gain or Loss on sale of assets (38-44) (b000000, b000001, b000250, b0000500, b001000, b002000, 0000000, 0000002, 0000250, 0000500, 0001000, 0002000, TOT)  13
    *Z Total years Filed (01,02,05,07,09,10,11,12,13,14,TOT)  11
    12,879

Table 4
    Repeat Table 3 using *Y Mean Dividends Received (blank, 0000000, 0000001, 0000050, 0000100, 0000250, 0000500, TOT)

Approximate Costs for Average of Variables Tape

I.   Cost of compiling the program and extracting the tape: 3/4 hour @ $25.00 = 18.75
II.  Cost of four Wistab runs already made: 1/4 hour x 4 = 1 hour @ $25.00 = 25.00
     Subtotal 43.75
III. Programmer time: 1 hour @ $7.00 = 7.00
IV.  Cost of merging persons with two or more ID's: 12.50
V.   Additional Wistab run: 8.00
     Total 71.25

http://www.ssc.wisc.edu/wais/WAIS656021.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656021.txt

1965. An Additional Capability

1965. The Updating of Extract 01. October 21, 1965. WAIS paper 656-023. Extract 01. Mike VonSchneidemesser.
http://www.ssc.wisc.edu/wais/WAIS656023.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656023.txt

Mike VonSchneidemesser WAIS 656-023 October 21, 1965

The Updating of Extract 01

This paper describes the function and operation of the program FUDGE.

HEADR Fudge the Extract, TITLE FUDGE, 1410 Program

Purpose: To make changes in the Extract -01 (WAIS 645-057) records by use of entry codes, or to delete whole records.

(a) Updating: The cards with the updating information should generally conform to the layout specified in Ron Durant's working paper WAIS 645-047. However, for all entry codes but 09 entry codes it is necessary that the cards have the following layout: Cols.
Entry
1-8    ID number
9-10   year
11-12  entry code
13-19  amount field (right justified, minus sign - if any - in column 13)
20-80  blank or anything (not required)

The program will then overlay the field specified by the entry code with the data given in card cols. 13-19.

Note: The program FUDGE has provisions for the major entry codes. Before using any entry code, make sure the program checks for this code; if not, an additional routine for the entry code has to be inserted.

(b) Deleting: To delete a record, punch up a card with the ID number and year of the record to be deleted in columns 1-10.

To operate the program FUDGE, first write the update cards, sorted on ID number and year, on a tape, 50 records per block. Place the program in the read hopper followed by the Delete cards sorted on ID number. When the message "Get tapes set up" appears on the console, install the first reel of the Extract -01 on unit 5, the tape with the update information on unit 7, and a blank tape for the output on unit 8. When units 5 and 8 unload, install the second reels on 5 and 8, but change the assigned numbers to 6 and 9 respectively.

On the printer you will get a list of all delete cards, update cards (identified as ADDITIONAL), and all the records which have been updated. Also, the message "This ID not on update nor on Extract -01" will be printed with cards whose ID number was not found on the Extract.

The program FUDGE has been set up for a two reel Extract -01 file.
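As an illustration of the update-card layout described above, the following Python sketch splits an 80-column card image into its fields. It is not the FUDGE code itself: the class and function names are hypothetical, and treating a blank amount field as zero is an assumption made here for convenience.

```python
from dataclasses import dataclass


@dataclass
class UpdateCard:
    wais_id: str     # cols. 1-8
    year: str        # cols. 9-10
    entry_code: str  # cols. 11-12
    amount: int      # cols. 13-19, right justified, minus sign (if any) in col. 13


def parse_update_card(card: str) -> UpdateCard:
    """Split one card image according to the layout given in the paper."""
    card = card.ljust(80)      # pad short card images with blanks
    field = card[12:19]        # cols. 13-19
    negative = field[0] == "-"
    digits = field[1:] if negative else field
    # ASSUMPTION: a blank amount field is read as zero.
    value = int(digits.strip() or "0")
    return UpdateCard(
        wais_id=card[0:8],
        year=card[8:10],
        entry_code=card[10:12],
        amount=-value if negative else value,
    )
```

Which record field an entry code overlays is defined in WAIS 645-047 and is deliberately not modeled here.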
If a three reel file is to be used, changes in the tape unit assignments have to be made and statement number 05030 has to be changed to MLCS @ 3 @, OLDTEXT-9.

Bill Gates. 1968. B5500 Program Document: Philosophy. August 21, 1968. WAIS paper 689-009. Administration; Data Processing.

Bill Gates WAIS 689-009 21 August 1968

B5500 PROGRAM DOCUMENT: PHILOSOPHY

The intent of this paper is to clarify the rationale for using the form appended to this paper and those which are being collected in WAIS paper 689-008, "B5500 PROGRAM DOCUMENTATION". This paper was to have been written prior to 689-008 as an explanation, but since we had the form and programs to document which we felt were important, and the need to demonstrate the ease with which pertinent information could be collected, we published first and decided to validate second.

The rationale spoken of relates to other developments: (1) a B5500 disk file called WAIS/TAPEL1B; (2) a digital scheme for collecting program listings and card decks based on the WAIS/TAPEL1B; (3) cognizance of identification problems spoken of in 689-006, "NAMING FILES ON THE B5500"; and (4) a disk file called WAIS/TAPES.

The document or form referred to and attached here consists of the following items:

1. Program document number and deck number
2. Source name
3. Object name
4. Date
5. File declarations
6. Data format declarations
7. Sample execution control cards
8. Related programs
9. Program series
10. Purpose and description of execution.

1. The "PROGRAM DOCUMENT NUMBER AND DECK NUMBER" will have a four-digit code as explained in the preface of 689-007, corresponding to areas of file classification of the disk file WAIS/TAPEL1B.

2. The "SOURCE NAME" refers to the name of the source code for the program and should be named according to those standards outlined in 689-006.

3. The "OBJECT NAME" should follow as in 2.

4. "DATE" should be that of documentation.

5. "FILE DECLARATIONS" may be copied verbatim from the source program for ALGOL source programs. COBOL and FORTRAN standards have not been established.

6. "DATA FORMAT DECLARATIONS" should not necessarily include all format declarations, but only those dealing with the data. Error messages, for example, need not be included, though they may be.

7. "SAMPLE EXECUTION CONTROL CARDS" are included so that anyone may execute the object program, given the required information as to input files and output files, without requiring the programmer to do it for them.

8. "RELATED PROGRAMS" refers by document number to those programs from which this program was adapted or those programs which served as an example for this program.

9. "PROGRAM SERIES" refers to a set of programs which have a chronological or input-output requirement that ties them to another program.

10. "PURPOSE AND DESCRIPTION OF EXECUTION" includes all those incidentals not revealed explicitly by the previous nine fields. Furthermore, it may include information to provide perspective as to the scope of the program, i.e., the program's flexibility, position in a series, etc.

To repeat what has been said in 689-008, program listings and program source decks will be identified by the "DOCUMENT NUMBER" and are therefore available if requested by that number. We hope that the last item (DESCRIPTION) will help provide perspective to a degree further than we have achieved to date. It is felt that collecting the program documentation would serve this end.
PROGRAM DOCUMENT NUMBER AND DECK NUMBER
SOURCE NAME:
OBJECT NAME:
DATE:
FILE DECLARATIONS:
DATA FORMAT DECLARATIONS:
SAMPLE EXECUTION CONTROL CARDS:
RELATED PROGRAMS:
PROGRAM SERIES:
PURPOSE AND DESCRIPTION OF EXECUTION:

http://www.ssc.wisc.edu/wais/WAIS689009.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS689009.txt

Bill Gates. 1968. A File Support System: An Example of Implementation on the B-5500. September 25, 1968. WAIS paper 689-012. Data Processing; Maintenance System - Files, Data, Etc.

Bill Gates WAIS 689-012 September 25, 1968

A FILE SUPPORT SYSTEM: AN EXAMPLE OF IMPLEMENTATION ON THE B-5500

To define "file support system", we may say it is the sum of those manual and machine operations necessary to make a file or set of files usable or more usable. Each of the activities must be directed and controlled so as to maximize efforts to make the file(s) more usable. The potential for making incorrect "corrections" is too great to be assumed away.

With this introduction in mind I will outline a preliminary file support system as applied to the summary data of the new master tax data, 1959-1964. Briefly, we have done as follows:

1. At all times maintained several stages of backup for problem recovery.
2. Devised a "Change Form" to serve as a permanent record of the corrective process, in addition to the computer or other machine output.
3. In working with corrections, dealt with a subset of the file until a final feedback loop verified that the changes made were the changes intended.
4. At this stage the corrections are merged and we have a new updated file.

A more specific example follows:

1. Form 4 card to tape.
2. Sorting of form 4 (B-5500 processing begins here).
3. Execute a program to detect missing or superfluous cards.
4. Check the diagnostic output against the original documents, at the same time filling in the "Change Form" when appropriate.
5.
Execute GENERAL/20BLKS, which retrieves twenty data blocks (maximum) from the form 4 data tape which contain records to be corrected.
6. Make changes to the card-type field with EDITOR where necessary to provide for proper handling of the data format.
7. Execute MASTER/REF50, which expands any one card to several seventy-two character records with descriptors prefixing each field to aid in making corrections.
8. Make corrections using the "Change Form" and EDITOR on the output file of the last step.
9. When the corrections are completed, execute MASTER/F4AGAIN in order to return expanded records to their eighty character form.
10. Execute GENERAL/FEEDBAC, comparing these twenty blocks with the original twenty blocks. One output file produced shows: (1) on its first line the new record; (2) on its second line the old record blanked out except in those characters differing from the new record; (3) on its third line the record number relative to the beginning of the file. The second output file contains the new records to be merged, and the number of records in this file must be equal to the number of "Change Forms".
11. The first file described above should then be compared to the original computer output or source documents. If it is compared only to the "Change Form", one may be simply verifying an error, not a correction.
12. The "Add, Merge, Delete" program is then executed, providing more feedback to be verified as to appropriateness. It will give information as to all actions taken, including sequence checking.

Some of the steps above seem rather awkward, but this is a preliminary system. It has supplied a great deal of insight with respect to what problems will be encountered and to how it may be improved upon.
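The three-line feedback display described in step 10 can be illustrated with a short Python sketch. It is a reconstruction from the prose only, not the B-5500 program GENERAL/FEEDBAC; the function name is hypothetical, and padding records of unequal length with blanks is an assumption.

```python
from typing import List


def feedback_lines(new: str, old: str, record_number: int) -> List[str]:
    """Return the three display lines for one corrected record.

    Line 1: the new record.
    Line 2: the old record, blanked except where it differs from the new one.
    Line 3: the record number relative to the beginning of the file.
    """
    # ASSUMPTION: records of unequal length are padded with blanks.
    width = max(len(new), len(old))
    new_padded = new.ljust(width)
    old_padded = old.ljust(width)
    differing = "".join(
        old_char if old_char != new_char else " "
        for new_char, old_char in zip(new_padded, old_padded)
    )
    return [new_padded, differing, str(record_number)]
```

A second line that is entirely blank means the record was not changed, which is one quick way to spot a "Change Form" that was never applied.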
To close with an example, steps 6, 7, 8, and 9 might exist within one program that we could write given the priority.

http://www.ssc.wisc.edu/wais/WAIS689012.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS689012.txt

e during which that person (either spouse if couple) had at least one return. A 5-year period with no returns will be output if there are any returns both before and after the period.

1964 TAX AVERAGING PROCESSING TAX MASTER EXTRACT SHORT RECORD WITH SORT KEY 1410 SORT ID #, YEAR 1410 GROUP INTO 5 YEAR PERIOD 1410 COMPUTED VARIABLES FOR FUTURE PROCESSING 3600 COMPUTE VARIABLES, TABULATE CROSS TABULATIONS

http://www.ssc.wisc.edu/wais/WAIS645048.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645048.txt
Schneidemesser WAIS Paper 656-052 April 4, 1966

A Definition and Description of the Fixed Format ID File

During the updating of the Fixed Format Identification File (FFID file) frequent questions arose as to what address and person are relevant for the FFID file. From this I concluded that the purpose and meaning of the FFID file should once and for all be rigorously defined so that (a)

Mike VonSchneidemesser. 1966. Utility Print Programs for the Fixed Format Identification File. March 9, 1966. WAIS paper 656-045. Fixed Format Identification File (FFID); Programs.

M. von Schneidemesser WAIS paper 656-045 March 9, 1966

Utility Print Programs for the Fixed Format Identification Files

All these programs are designed to accept a 124 position record as described in WAIS 645-058.

To list out the whole FFID files with 60 records per page there exist two 1401 object deck programs.

Program: FFIDL JOB LIST FFID TAPE ID SEQU ON 1401 M. V. S.
This will list the FID file sorted on ID numbers.
It also prints out counters for the number of records in each name group.

Program: FFSSL JOB LIST FFID TAPE SS# SEQUENCE ON 1401 M.V.S.
This will list the FFID file sorted on Social Security number.

Each program needs about 30 minutes to list out about 20,000 records. A listing of the file - which will greatly facilitate coding and updating work - therefore will cost about $2.

To print single records specified by cards, three programs for the 1410 are available:

The program FFID will list single records from the tape sorted on ID number. An object deck exists for this program.

The program FFIDS will do the same for the tape sorted on SS-number and input cards specifying the SS-number.

The program FFIDB will list out the whole family unit, i.e., all the records which have the same first 6 digits of the ID number as the input card.

These last three programs are described in WAIS 656-026.

http://www.ssc.wisc.edu/wais/WAIS656045.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656045.txt

Gene Moyer. 1966. An Additional Comment on the Survey Weights. September 28, 1966. WAIS paper 667-008. Survey Data and File.

Gene Moyer WAIS 667-008 September 28, 1966

An Additional Comment on the Survey Weights

One of the factors which worried this writer about the weights was that the weighted proportion of records with "Joint Return Incomes" of $10,000 - $14,999 seemed high when compared to the distribution of Federal joint returns for 1963. Therefore we ran a table of respondents' incomes and their associated weights to see if records in the $10,000 - $15,000 class had high weights. The table on this page shows the number of records in each respondent's income class and the mean weight associated with the records in that class. Notice that the distribution of mean weights is inversely related to income, as it should be, and that the mean weight for records in the $10,000 - $14,999 income class is 3.78, not extremely large.
Therefore, the high weighted proportion in the $10,000-$14,999 income class seems to be the result of the underlying distribution, not of the weights; the weights seem proper.

The Relationship Between the Weights and Respondent's (Head's) Income in the Survey

Head's Income Class              Number of Records  Mean Weight
Under 1000                       92                 9.61
1000 - 1999                      65                 18.05
2000 - 2999                      82                 9.39
3000 - 3999                      86                 12.71
4000 - 4999                      83                 12.38
5000 - 5999                      114                10.48
6000 - 6999                      104                9.59
7000 - 7999                      88                 13.65
8000 - 8999                      64                 9.28
9000 - 9999                      56                 4.09
10000 - 14999                    196                3.78
15000 - 19999                    81                 3.10
20000 - 24999                    32                 2.21
25000 - 49999                    60                 1.75
50000 or more                    12                 1.85
Not Ascertained                  3                  3.56
Unweighted (Not in name groups)  82                 00.00
Total                            1300               7.97

http://www.ssc.wisc.edu/wais/WAIS667008.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS667008.txt

Richard Bauman. 1966. Card Format for Supplementary Age Data. April 19, 1966. WAIS paper 656-055. Age Data Formats.

Richard Bauman WAIS Paper 656-055 Revised by V. Toop July 1, 1966

CARD FORMAT FOR SUPPLEMENTARY AGE DATA

Column(s)  DATA
1-1        "A"
2-9        WAIS ID #
10-18      Social Security ID #
19-20      BLANK
21-22      Last Year Filed
23-24      BLANK
25-25      Code for primary source of age data:
             1 - Wisconsin death records
             2 - Wisconsin birth records
             3 - Benefit data file
             4 - Parent's file
             5 - Motor Vehicle Department
             6 - Chance reference to age in tax file
             7 - Age-unidentified residual file
             9 - No number given or 7 written down
26-27      BLANK
28-28      Marital Status:
             1 - M
             2 - NM
             3 - W
             4 - D
             9 - NA
29-30      BLANK
31-36      Date of birth Month/Day/Year
37-38      BLANK
39-41      Age in years at death
42-43      BLANK
44-45      Usual Occupation
46-47      BLANK
48-49      County of Death
50-51      BLANK
52-57      Date of Death
58-59      BLANK
60-60      Race:
             1 - White
             2 - Non White
             9 - NA
61-80      BLANK

If Column 25 = 1 the entire card is relevant. Punch every field, filling NA fields with 9's.

If Column 25 = 2-6 only the first 36 columns are relevant. Punch every field to the date of birth field (31-36) and skip out 37-80. Fill NA fields (col 1-36) with
9's.

http://www.ssc.wisc.edu/wais/WAIS656055.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656055.txt

James Geffert. 1966. Computerized Error Correction Applied to Income Tax Returns. February 24, 1966. WAIS paper 656-044. General Papers (Regarding WAIS).

Larry Schroeder WAIS 690-013 September 2, 1969

A Report on the Re-editing of the Master File

This short paper describes the re-editing process on the Master File using the program MASTER/REDIT. The basic program was that described in WAIS 690-012,

Gene Moyer. 1966. 1962 Deductions of Wisconsin Taxpayers. February 21, 1966. WAIS paper 656-042.

Notes on the Processing of Social Security Benefits Data. November 18, 1965. WAIS paper 656-032. Benefit File; Data Processing.

Richard A. Bauman WAIS Paper 656-032 November 18, 1965

Notes on the Processing of Social Security Benefits Data

1.0 General Comments
2.0 Data Card Edits
3.0 Inter Card Edits
4.0 Social Security Benefit Record
5.0 Appendix - Format for Benefit Record

1.0 General Comments

2357 of the 2727 identified claims cases were received as of November 15, 1965. These were logged according to WAIS 645-074 and the data was keypunched according to 656-006. New FFID's were assigned to those beneficiaries not in the Master File (at present) and updated FFID's were made for some persons in the Master File. The ID assignment instructions are also found in WAIS 656-006. Most of the cases not received as of November 15, 1965 are expected to involve disallowed or denied claims.

Our card files (Logging Cards, Benefit Data Cards, FFID cards) from the benefit data contain essentially 4 types of data:

1.1 Identification data
1.2 Benefit income data
1.3 Supplementary data on non-benefit earnings
1.4 Built-in redundant data

1.1 The identification data is keyed into our existing FFID system.
This should facilitate the combination of our various files. The SSA benefit account number appears on all benefit data cards, allowing a cross reference by source of benefits. Supplemental identification by type of beneficiary, type of claim, etc. is included on the benefit data cards.

1.2 Benefit income data appears on the Benefit data cards in monthly income amounts for the years 1946-1965.

1.3 Supplementary data on non-benefit earnings appears on the data cards in several forms including:

1.31 WIC - work indication code (per month) for years 1962-1965
1.32 S - benefits suspended during month for years 1946-1961
1.33 TEC - type of earnings code for years 1962-1965
1.34 PSC - payment status code for years 1963 and 1964
1.35 Annual report data for 1963 and 1964

1.4 The built-in redundant data, such as the PIA (primary insurance amount), along with certain logical necessities, allows for the checking of the internal consistency of our benefit files.

The following parts of this paper outline a proposal for handling the card files and producing a record which satisfies the needs of WAIS. The present requirements of WAIS call for a benefit record which:

- Is consistent - see 2.0, 3.0
- Contains yearly, not monthly, benefit incomes - see 4.0, 5.0
- Can be matched and edited with other records - see 4.0, 5.0
- Allows for inferences about non-benefit, non-taxed earnings - see 4.0, 5.0

2.0 Data Card Edits

The General Card Edit Program XS- will be used to edit the benefit data cards. In addition to the elementary edits for permissible characters, we are also able to check for:

2.1 Consistency of date of birth and date of entitlement for
2.11 certain old-age beneficiaries
2.12 child beneficiaries
2.2 Consistency of amount fields and explanatory codes.
3.0 Inter-Card Edits

Since the Benefit data for any one beneficiary is found on a variable number of cards (depending upon both the number of changes in his history and the form on which the history is recorded) and since there are items that should be consistent on all cards for a beneficiary, some inter-card edit system is necessary.

There are two alternative ways this can be done. Edit cards can be formed from the benefit data cards and logging cards and submitted to the General Card Edit Program XS-, or a special program can be written to perform the edits. The latter would be combined with a creation run for the Benefit Record Tape.

The important inter-card edits are those that check for:

3.1 Duplicate or missing cards
3.2 ID or SS # mistakes
3.3 Agreement of logging cards, Type 1 cards, and Type 3 cards
3.4 Chronological order of history entries
3.5 Consistent identifying information on Type 1 and 2 cards

The General Card Edit Program XS- does not have the capability of comparing 2 data fields directly; therefore it seems that 3.2 through 3.5 could not be handled most efficiently by preparing an edit card(s).

4.0 Social Security Benefit Record

The Benefit Record (Sec. 5.0) will be constructed from edited data cards and FFID cards (or tape records) for the beneficiaries.

Record Positions  Source
1-8               Columns 13-20 of cards 2 and/or 3
9-17              Columns 10-18 of FFID cards
18-26             Columns 4-12 of cards 2 and/or 3
27-50             (Must be calculated)

A record will be constructed for each year in which a beneficiary received benefits. The monthly payment record contains a one character code for each month in the record year. This code is primarily useful in estimating the individual's earnings during the year, if any. It also indicates the extent of benefits accrued but not paid in a given year.
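The step from monthly card data to one record per year can be sketched as follows. This Python fragment is only an illustration of the calculation implied above, not the proposed creation run: the function name and the input tuple shape are hypothetical, the one-character monthly codes are treated as opaque values, and reading "received benefits" as "yearly total greater than zero" is an assumption.

```python
from typing import Dict, Iterable, List, Tuple

# One monthly entry: (year, month 1-12, amount paid, one-character code).
MonthlyEntry = Tuple[int, int, int, str]


def yearly_records(monthly: Iterable[MonthlyEntry]) -> Dict[int, Tuple[int, str]]:
    """Collapse monthly entries into {year: (yearly amount, 12-character code string)}."""
    totals: Dict[int, int] = {}
    codes: Dict[int, List[str]] = {}
    for year, month, amount, code in monthly:
        totals[year] = totals.get(year, 0) + amount
        # Months with no entry stay blank.
        codes.setdefault(year, [" "] * 12)[month - 1] = code
    # ASSUMPTION: a year gets a record only if some benefit was actually paid.
    return {
        year: (totals[year], "".join(codes[year]))
        for year in sorted(totals)
        if totals[year] > 0
    }
```

The 12-character string keeps the month-by-month detail that the paper says is useful for estimating earnings, while the yearly amount satisfies the requirement for yearly rather than monthly benefit incomes.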
Monthly Payment Record Code

Code   Reason
blank  Benefits not paid during month because not yet entitled or terminated in previous month
1      Benefits paid during month
2      Benefits not paid during month because of retroactive payment later
3      Benefits not paid during month because of suspension or work indication code > 0 and beneficiary payment designation = 0
4      Benefits not paid during month because they were withdrawn for adjustment
5      Benefits not paid during month because they were terminated
6      Benefits not paid during month because of a previous entitlement to another type of benefit

The beneficiary ID code(s) are useful as an additional source of demographic data and also as a possible check against our other records. The code is explained in Appendix A of WAIS 656-006.

5.0 Appendix

Format for Benefit Record

Positions  No. of Positions  Data
1-8        8                 WAIS ID number for Beneficiary
9-17       9                 SSA Account number for Beneficiary
18-26      9                 SSA Benefit account number
27-28      2                 Year of Record
29-34      6                 Amount of Benefits received during year
35-46      12                Monthly payment record
47-48      2                 BIC - Beneficiary ID code - end of year
49-50      2                 Earlier BIC - Beneficiary ID code
51         1                 # - Record Mark

http://www.ssc.wisc.edu/wais/WAIS656032.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656032.txt

Richard Bauman, Marshall Seavey. 1965. Report on Check-Coding of Interview Schedule and Booklet. January 4, 1965. WAIS paper 645-022. Consistency of Data; Property File; Survey Data and File.

Richard Bauman Marshall Seavey WAIS Working Paper 645-022 January 4, 1965

Report on Check-Coding of Interview Schedule and Booklet

In order to obtain a rough estimate of the size of error involved in the coding of the interview schedule and booklet, 200 interviews were checked for the accuracy of coding. The interviews checked were those numbered 0001-0050, 0301-0350, 651-700, and 1051-1100. Each interview schedule and booklet, if it existed, was coded by the check-coders and then this check coding was compared with the original coding.
The question by question comparison of the coding for the 200 interviews is summarized in the tables included in this report.

In the opinion of the check coders there were three major factors which accounted for coding errors. Often the coder did not discriminate well among the alternative codes available for a particular question. There was often a tendency for certain codes to be used more frequently than others when the others were more appropriate. A second important factor was a lack of understanding of the codes' meanings on the part of the coders. And the third important problem in coding was the large number of vague responses.

Interpretation of Table

Row 1. The entry here represents the number of times a given question had an answer which was or should have been coded.

Row 2. This entry indicates the number of times both the original coder and check coders were in complete agreement.

Row 3. This is the percentage of total codes for a particular question which were in complete agreement.

Row 4. This category counts cases where one code pertaining to a question differed between the original coding and the check coding.

Row 5. Here the original coder placed more codes for a question than did the check coder.

Row 6. In this case the check coder placed one code more in response to a question than the original coder. In addition, where the original coder did not code a question which should have been coded, an indication of the omitted code was included in this category.

Row 7. The questions which were coded by the check coders with two or more codes than the original coding are enumerated here.

Row 8. This category includes instances in which combinations of the above mentioned errors occurred in the coding of a question and errors which did not fit any of the preceding categories.

The following remarks indicate types of errors which characterized the coding of certain questions.
Question 1 - The main reason for the differences here was the failure to follow the instructions to "code questions one and two together" Question 2 - A large source of the differences here was the apparently automatic imputation that R felt "pressure from creditors" whenever he said he would use the money to pay off debts. Another source was the failure to distinguish between "general desires" and specific desires. Most, however, were the results of vagueness of the responses, and the inconsistencies are not as great as they may appear. Questions 13, 13a, 13b - Most of the differences here seem to be due to laziness - coding a general reason where specific reasons are given - much like Question 1. Question 60 - The first coder confused fees and salary. Question 66 - One reason for differences here was the coding of a response like "you have to know how to handle people" as "love of people" rather than "polish, glamour". Another reason was the use of "other qualification" for those already ascertained in Question 65. Question 87c - One large reason for the differences here was an apparent misunderstanding of the code for the second digit. Question 138 - On this question the original coder would often code a response which apparently was merely repetitive of the answer to Question 137 as "Floor space per person". GRAND TOTALS Question No. Schedule 13 13a 58a 1 2 7 9 11 13b 27a 35 41a 42a 53 56 59 60 61 62a 66 70b 70c 71a 73b 74a 75a 1. No. of schedules coded 200 200 14 42 13 196 148 28 28 5 7 16 172 28 110 46 190 3 27 61 6 .13 3 2. No. coded in complete agreement by both coders 168 114 11 37 7 147 148 27 21 4 6 15 166 22 91 32 142 3 20 45 5 10 3 3. % of above 84 57 79 88 54 75 100 96 75 80 86 94 97 79 83 70 75 100 74 74 83 77 100 4. No. coded with one different response 16 42 3 1 5 27 1 7 1 1 1 6 6 10 34 3 14 1 2 5. No. Coded with less responses than originals 4 15 1 2 1 6. No. coded with one additional response 9 10 2 1 10 5 13 3 7 1 2 1 7. No. 
coded with several additional responses 1 3 1 8. Compound differences in coding 2 16 2 10 1 1 5 2 GRAND TOTALS Question No. Schedule 79 80 85 86 87a 89 90 98a 99a 100 103 105 107f 112 114 116 123 124 125E 128 129 130 1. No. of schedules coded 40 81 145 146 77 7 5 43 42 46 55 52 1 9 9 8 11 11 1 8 7 7 2. No. coded in complete agreement by both coders 30 72 125 128 67 6 5 35 35 44 53 44 1 9 9 7 9 9 1 6 4 5 3. % of above 75 89 86 88 87 86 100 81 83 96 96 85 100 100 100 88 82 82 100 75 57 71 4. No. coded with one different response 7 8 20 18 10 1 4 6 2 1 4 1 2 1 5. No. Coded with less responses than original 2 1 6. No. coded with one additional response 4 1 4 1 2 1 2 1 1 7. No. coded with several additional responses 1 8. Compound differences in coding 1 GRAND TOTALS Question No. Schedule 202 202a 138 140b 154 157 160 165 181 202c 205 206 1. No. of schedules coded 200 23 15 3 9 4 1 154 18 18 2. No. coded in complete agreement by both coders 153 23 11 3 9' 4 1 103 15 10 3. % of above 76 100 73 100 100 100 100 67 83 56 4. No. coded with one different response 25 4 29 3 5. No. coded with less responses than original 6 6. No. coded with one additional response 10 12 3 3 7. No. coded with several additional responses 2 2 1 8. Compound differences in coding 4 8 1 GRAND TOTALS Question No. Booklet 9a 10 23c 35a 35b 35c 35d 39a 40d 41a 42a 42i 48a 49a 50a 52 53a 54 55 1. No. of booklets coded 148 70 3 7 8 20 9 3 4 30 8 3 1 18 10 30 5 75 86 2. No. coded in complete agreement by both coders 126 57 2 5 2 14 3 2 1 19 6 1 8 6 17 4 40 56 3. % of above 85 81 67 71 25 70 33 67 25 63 75 100 0 44 60 57 80 53 65 4. No. coded with one different response 3 5 1 5 3 6 3 1 5 2 6 1 19 13 5. No. coded with less responses than original 1 1 3 4 1 6. No. coded with one additional response 19 5 1 1 1 3 6 1 1 1 1 6 10 12 7. No. coded with several additional responses 1 1 2 8. 
Compound differences in coding 2 1 2 1 1 2 1 1 2

http://www.ssc.wisc.edu/wais/WAIS645022.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645022.txt

Barbara Aldrich. 1969. Format for the Extract of Selected Variables from the Master File and 805. July 17, 1969. WAIS paper 690-006. Extract 79 Social Security Earnings Data- 805.

Barb Aldrich
WAIS Paper No. 690-006
July 17, 1969

Format for the Extract of Selected Variables from the Master File and 805

Column     Contents
1-8        WAIS ID number
9-10       year
11-13      residence location
14-15      county prior year
16         address change
17-18      occupation
19         occupation change
20         return filed and reason
21         partnership
22         spouse separate income
23         marriage details
24         "Head of Family"
25-26      number of dependents
27-34      largest wage
35-42      second largest wage
43-50      other wage
51-58      interest received
59-66      dividends
67-74      rent
75-82      capital gains
83-90      profit, loss from business
91-98      trust income
99-106     partnership income
107-114    other income
115-122    total of all sources of income
123-130    automobile and business expense
131-138    adjusted gross income
139-146    standard deduction
147-154    *NTI
155-156    NEWOC
157-164    property income transformation
165-172    wages
173-180    self employment income
181-188    earnings
189-196    earnings - expenditures
197-200    birth year
201-208    net taxable income (standard deduction basis)
209-216    net taxable income (itemized deduction basis)
217-224    805 ID number
225-239    blank
240        record mark

*This amount was derived from NTI (standard deduction basis) or NTI (itemized deduction basis). When both fields had an amount in them, the lesser amount was put in NTI. This decision was necessary since machine calculation of all standard deductions has been made and put in the NTI (standard deduction) field for every record, whether the standard deduction was taken or not.

The amount fields on this tape are to the nearest dime, e.g.
32924 on the tape would be $3292.4X.

http://www.ssc.wisc.edu/wais/WAIS690006.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS690006.txt

Barbara Aldrich. 1969. Format for the Alley-805 Tape. September 4, 1969. WAIS paper 690-014. Formats.

Barb Aldrich
WAIS 690-014
September 4, 1969

Format for the Alley-805 Tape

Column     # of Cols.   Var. No.   Var. Name
1          1            1          Indicator
2-9        8            2          WAIS IN
10-18      9            3          Social Security #
19-20      2            4          County Code
21-22      2            5          Last Year of Data
23         1            6          Multiple Account Indicator
24         1            7          Name Disagreement Indicator
25-26      2            8          Month of Birth
27         1            9          Blank
28-29      2            10         Year of Birth
30         1            11         Race
31-36      6            12         Sex (MALE, FEMALE)
37         1            13         Blank
38-42      5            14         Railroad Activity Indicator
43-44      2            15         Newly Posted Credit Earnings Item
45-46      2            16         Additional Earnings Discrepancy
47-48      2            17         Active Earnings Discrepancy
49-53      5            18         Account in Benefit Status other than Disability
54-57      4            19         Benefit Status Other Than Disability was Terminated
58-60      3            20         Account in Disability Benefit Status or Disability Freeze Status
61-64      4            21         Disability Status was Terminated
65-68      4            22         Credit Indication
69-73      5            23         Earnings Statement Issued in Year Indicated
74-76      3            24         Indication of Self-employment Activity
77-79      3            25         Indication of Delinquent Self-employment Item
80-81      2            26         Indication of Agricultural Activity
82-90      9            27         Earnings, 1937 to Date
91-92      2            28         Wage Quarters from 1947
93-94      2            29         Self-employment Quarters of Coverage, 1951 to Date
95-96      2            30         Agricultural Quarters of Coverage, 1955 to Date
97-105     9            31         Earnings, 1951 to Date
106-107    2            32         Wage Quarters of Coverage, 1951 to Date
108-109    2            33         Self-employment Quarters of Coverage, 1951 to Date
110-117    8            34         1951 Earnings
118        1            35         1951 Self-employment Quarters of Coverage
119-126    8            36         1952 Earnings
127        1            37         1952 Self-employment Quarters of Coverage
128-135    8            38         1953 Earnings
136-139    4            39         1953 Quarterly Wage Quarters of Coverage Pattern
140        1            40         1953 Self-employment Quarters of Coverage
141-148    8            41         1954 Earnings
149-152    4            42         1954 Quarterly Wage Quarters of Coverage Pattern
153        1            43         1954 Self-employment Quarters of Coverage
154-161    8            44         1955 Earnings
162-165    4            45         1955 Quarterly Wage Quarters of Coverage Pattern
166        1            46         1955 Self-employment Quarters of Coverage
167        1            47         1955 Agricultural Quarters of Coverage
168-175    8            48         1956 Earnings
176-179    4            49         1956 Quarterly Wage Quarters of Coverage Pattern
180        1            50         1956 Self-employment Quarters of Coverage
181        1            51         1956 Agricultural Quarters of Coverage
182-189    8            52         1957 Earnings
190-193    4            53         1957 Quarterly Wage Quarters of Coverage Pattern
194        1            54         1957 Self-employment Quarters of Coverage
195        1            55         1957 Agricultural Quarters of Coverage
196-203    8            56         1958 Earnings
204-207    4            57         1958 Quarterly Wage Quarters of Coverage Pattern
208        1            58         1958 Self-employment Quarters of Coverage
209        1            59         1958 Agricultural Quarters of Coverage
210-217    8            60         1959 Earnings
218-221    4            61         1959 Quarterly Wage Quarters of Coverage Pattern
222        1            62         1959 Self-employment Quarters of Coverage
223        1            63         1959 Agricultural Quarters of Coverage
224-231    8            64         1960 Earnings
232-235    4            65         1960 Quarterly Wage Quarters of Coverage Pattern
236        1            66         1960 Self-employment Quarters of Coverage
237        1            67         1960 Agricultural Quarters of Coverage
238-245    8            68         1961 Earnings
246-249    4            69         1961 Quarterly Wage Quarters of Coverage Pattern
250        1            70         1961 Self-employment Quarters of Coverage
251        1            71         1961 Agricultural Quarters of Coverage
252-259    8            72         1962 Earnings
260-263    4            73         1962 Quarterly Wage Quarters of Coverage Pattern
264        1            74         1962 Self-employment Quarters of Coverage
265        1            75         1962 Agricultural Quarters of Coverage
266-273    8            76         1963 Earnings
274-277    4            77         1963 Quarterly Wage Quarters of Coverage Pattern
278        1            78         1963 Self-employment Quarters of Coverage
279        1            79         1963 Agricultural Quarters of Coverage
280        1            80         Claims Indication ("0" or "1")
281        1            81         BLANK
282        1            82         805 and Age Data Indicator
283        1            83         Benefit Indicator
284        1            84         Survey Indicator
285-288    4                       BLANK

http://www.ssc.wisc.edu/wais/WAIS690014.pdf
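A column layout like the Alley-805 format above maps directly onto string slicing. The following is a minimal sketch, assuming each record arrives as one 288-character string; the field names, the helper functions, and the sample values are hypothetical and chosen only for illustration. Only a handful of the columns are included, and values are returned as raw text because the layout does not state how amounts are scaled.

```python
# Columns are 1-based and inclusive, copied from the layout table.
# The field names on the left are hypothetical labels chosen for this sketch.
FIELDS = {
    "indicator": (1, 1),
    "wais_id": (2, 9),
    "social_security_no": (10, 18),
    "county_code": (19, 20),
    "year_of_birth": (28, 29),
    "earnings_1951": (110, 117),
}
RECORD_LENGTH = 288  # last column named in the layout


def parse_record(record):
    """Slice one fixed-width record into a dict of raw strings."""
    if len(record) != RECORD_LENGTH:
        raise ValueError(f"expected {RECORD_LENGTH} characters, got {len(record)}")
    return {name: record[start - 1:end] for name, (start, end) in FIELDS.items()}


def build_record(values):
    """Build a blank-padded record from {field name: text}; for demonstration only."""
    chars = [" "] * RECORD_LENGTH
    for name, text in values.items():
        start, end = FIELDS[name]
        if len(text) != end - start + 1:
            raise ValueError(f"{name} must be {end - start + 1} characters wide")
        chars[start - 1:end] = text
    return "".join(chars)


# Invented values, not taken from any WAIS tape.
sample = build_record({
    "indicator": "1",
    "wais_id": "00001234",
    "social_security_no": "123456789",
    "county_code": "13",
    "year_of_birth": "21",
    "earnings_1951": "00032924",
})
parsed = parse_record(sample)
```

Converting a 1-based inclusive range `a-b` to the slice `[a - 1:b]` keeps the table and the code in the same notation, so a layout correction only needs to touch the `FIELDS` dictionary.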
http://www.ssc.wisc.edu/wais/textFiles/WAIS690014.txt:Janet Whitaker 1970WAIS ResidualsSeptember 1, 1970x WAIS paper701-010 &General Papers (Regarding WAIS)B<701-010 Janet Whitaker September 1, 1970 WAIS RESIDUALS In social science research one word may have many different meanings, depending on the context of its usage. For example, the word EDIT means error detection to some WAIS-ers and error correction to others. The word RESIDUAL has also had different meanings for WAIS. When talking about the property file, RESIDUAL refers to stock or dividend issuing companies or corporations in addition to those in our joint LFF-CS listing. Another use of RESIDUAL refers to eligible tax sample members whose returns were, for one reason or another, not microfilmed with the rest of the sample members. A third use of RESIDUAL refers to the Survey File sample; here, a RESIDUAL is a person who is a high income supplement member and whose first name or initial is not in the tax sample name group. Two file drawers in the file room contain 147 new tax data folders labelled RES. The data for these taxpayers are presently in 80-character card image tape form; these data have not been processed with the new Master File. This group of 147 household units can be divided into 4 sub-groups: 1. SURVEY FILE RESIDUALS n = 107. High income supplement members who are not in the name group sample. 2. High income supplement members who are also name group sample members; n = 13. First names of these women places them in name group sample, but first names of their husbands exclude them. 3. High income supplement members who are also name group sample members; n = 3. 
These people should be treated like other supplement people who are in the Master File.hahttp://www.ssc.wisc.edu/wais/WAIS701010.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS701010.txto) Alan Duchan 1965NHDescription of Initial Crosstabulations from 1964 Tax Averaging Law FileDecember 8, 1965 WAIS paper656-03450*Averaging Studies Cross Tabulations Tables(V(OAlan I. Duchan WAIS paper 656-034 December 8, 1965 Description of Initial Crosstabulations from 1964 Tax Averaging Law File A set of tables designed to give an indication of the effects of varying the definition of averagable income has just been completed. The basic description of the tables is in WAIS paper 645-053, the data used as input to the tables are described in WAIS papers 645-050, 051, and 052. The following covers additions and modifications to these papers. 1. The variables that were used. The following variables were used on one or more of the tables. (The abbreviation used to identify the variable on the tables is shown in parenthesis.) A) Group Type (GRPTYPE) which has the following values: 1 => Filers who were single in 1958 (i.e., group types I, and II in WAIS papers 645-052, 053). 2 => Filers who were married in 1958 (group types III and IV in WAIS papers 645-052, 053) treated as filing separately. 3 => Filers who were married in 1958 treated as filing jointly. B) Legal Definition (LEGALDEF) which takes on the values l, 2, 3, and 4 corresponding to the four legal definitions defined in WAIS paper 645-052. Differences among the four pertain to how federal net taxable income is defined and, in part, are an effort to overcome the problem of simulating the federal definition of income with WAIS data. The definitions are briefly tabulated below; but for a full description see WAIS paper 645-052. Capital gains Capital gains excluded included Federal net taxable income is set equal to zero if the com- Legal Def. Legal Def. puted figure is negative 1 2 The computed figure is used Legal Def. 
Legal Def. whether positive or negative 3 4 C) Direction of Fluctuation 0 => Y > B ; i.e., Federal Net Taxable income for the computation year (1958) is greater than average Federal Net Taxable income for the base period (1954 through 1957) 1 => Y < B D) Qualification Status (QUALTYPE) 0 = Passed both tests; i.e., qualifies 1 = Failed A test only 2 = Failed B test only 3 = Failed both tests The A test refers to an ad hoc rule for deciding whether a person was self supporting. If a person's (or a couple's) income was above an arbitrary amount, he was considered self-supporting and thus qualified. The B test refers to the completeness of a person's record. If the number of years for which records were available were above an arbitrary figure, he passed the B test, On half the tables, the number of interpolated years a filer could have and still pass the B test was 2, 3, or 4 depending on marital status. On the other tables, filers passed the B test only if they had no interpolated years. Thus, a comparison of any pair of tables will permit judgments on the validity of the interpolation procedure. Whether a particular table does or does not allow a filer with some interpolated years to qualify is discussed below and is shown on the title of the table. E) Percentage Change Variant (PCV) PCV is the percentage of federal net taxable income in the base period used in computing potentially averageable income. The different percentages used on these tables are tabulated below. Value of Percentage used Percentage used PCV shown if direction of if direction of on the table fluctuation is fluctuation is Positive (Y > B) negative (Y < B) 1 100 % 100 2 125 80 3 133 75 4 150 67 The following variables are continuous; the values shown on the tables are upper bounds of each interval except that all values higher than the largest upper bound are included in the last interval. 
For example:

Let the values shown      Then the values included
on the table be           in the interval are
-10                       -(infinity) < x < -10
0                         -10 [...]

1) If Direction of Fluctuation is positive (Y > B), Potentially Averageable Income is X defined by X = Y - [PCV]B.
2) If Direction of Fluctuation is negative (Y < B), Potentially Averageable Income is Z defined by Z = [PCV]B - Y,
where PCV is the constant described under (E) above.

II. Description of the tables

A) General - Tables A, B, C and D allow filers to pass the B test if they have no more than 2, 3, or 4 interpolated years depending on marital status. Tables E, F, G and H are identical to tables A, B, C and D, respectively, except that only those people with no interpolated years passed the B test.

B) Description of each table

1) Tables A and E are five-dimensional tables with the following variables:
a) Group Type
b) Legal Definition
c) Direction of Fluctuation
d) Qualification Status
e) Federal Net Taxable Income in 1958
Note that WAIS paper 645-053 does not ask for a division by "Legal Definition" on this table. However, Direction of Fluctuation can only be determined after a particular Legal Definition is assigned; therefore Legal Definition has to be one of the variables.

2) Tables B and F are the same as tables A and E, respectively, except that Adjusted Gross Income replaces Federal Net Taxable Income.

3) Tables C and G are six-dimensional tables that include Qualifiers only. The variables are:
a) Group Type
b) Legal Definition
c) Direction of Fluctuation
d) Percentage Change Variant
e) Potentially Averageable Income
f) Federal Net Taxable Income in 1958

4) Tables D and H are the same as tables C and G except that Adjusted Gross Income replaces Federal Net Taxable Income.

The rest of this paper concerns corrections to WAIS papers 645-050, 051, 052, and 053.

III.
Clerical corrections to WAIS papers

A) For paper 645-052

Under (3) - (b) on page 5, change:
X = y - ( ) B to X = Y - ( ) B
$B = \sum_{t=1}^{4} B_t$ to $B = \frac{1}{4} \sum_{t=1}^{4} B_t$
That is, B is the average of $B_t$ over the base period rather than the sum.

Under (3) - (c) on page 5, change:
j = z to j = 2

B) For paper 645-050

Under Note I at the bottom of page 4, change "If yes .... set M=1" to "If yes .... set M=0". M = 1 implies the person is married while M = 0 implies the person is single. Thus, as corrected, the instructions are to treat a person as single in year t if he was newly married in year t + 1.

IV. A note on deciding marital status when a record is missing

WAIS paper 645-050 (pages 4-5) explains the procedure for assigning marital status for filers in years in which their returns are missing. The procedure utilized the same code as was used in the original coding of returns, i.e.:

M=0 = Single person
M=1 = Married; spouse had separate income
M=2 = Married; spouse did not have separate income
M=3 = Married, but spouse died during the year

In addition, whenever a person's record was missing for some year t and for the next later year (t + 1), and no current spouse had filed in year t, the person's marital status in year t was set equal to "7". Thus "7" implies that a filer's marital status in year t is in doubt, signaling caution when treating this person. The paper, however, does not present a specific rule for treating such a person. For these tables, it was decided to treat the person as single unless evidence of being married was very strong. Specifically, the following ad hoc decision rule was used. Examine later years (t + 2, t + 3, etc.) until either the latest year is reached or until a year is found where M does not equal 7. Examine earlier years (t - 1, t - 2, etc.) until either the earliest year is reached or until a year is found where M does not equal 7.
If both an earlier and a later year in which M does not equal 7 are found and if M = 1 or 2 in both these years, treat the person as married in year t. In all other cases, treat the person as single in year t.

V. A note on capital gains

One of the problems in using WAIS data to simulate taxable income for Federal tax purposes is the treatment of capital gains. Wisconsin simply includes 100% of capital gains and losses in current taxable income, while the Federal government employs a much more complex procedure. To recreate "Federal" capital gains with WAIS records would demand a treatment far more complex than the one used and, more important, would require data that are not at present available: the separation of short and long term gains and losses. The special treatment of capital losses for Group III people in years in which they were not married, in paragraph (5) (b) (ii) of WAIS paper 645-052, presumably was written to alleviate this problem. Unfortunately, we can no longer remember exactly what good it was supposed to do. It was decided, therefore, to ignore the special treatment.

VI. Treatment of people who were married in 1958 and had a different spouse sometime during 1954-57

Although a proper treatment of a twice-married man is complex indeed, we originally planned to consider both spouses. However, when it was discovered that only 18 records out of 8,300 fell into this category, it was decided that the earlier wife be ignored. (We hope she will forgive us.)

VII. Cross-references between variables used in the tables and WAIS papers 645-050, 051, 052, and 053.
Tabular Symbol    Symbol Used in     Described In Or Generated By
of Variable       WAIS 645-052       Functions Described In
1) GRPTYPE        "Group"            Section (2) of WAIS 645-050 and Section (1) - (f) of WAIS 645-052
2) LEGALDEF       j                  Sections (3), (4) and (5) of WAIS 645-052
3) DIRFLUC        No symbol used     Sections (3), (4) and (5) of WAIS 645-052 and Section II-B-4 of WAIS 645-053
4) QUALTYPE       No symbol used     Section (2) of WAIS 645-052
5) PCV            ( ) or ( )         Section II-B-6 of WAIS 645-053
6) FNTI           Y                  Sections (3), (4) and (5) of WAIS 645-052
7) AGI            G                  Self-explanatory. Not explicitly described.
8) POTAVINC       X or Z             Section (3)-(b) of WAIS 645-052

http://www.ssc.wisc.edu/wais/WAIS656034.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656034.txt

Martin David. 1964. Note on a Recursive Model of Income Determination for the Wisconsin Assets and Incomes Study. October 16, 1964. WAIS paper 645-007. Analysis Proposals- For Analyses, Theses, etc.

M. David
WAIS 645-007
October 16, 1964

Note on a recursive model of income determination for the Wisconsin Assets and Incomes Study

The purpose of this note is to outline some of the ideas that I would like to pursue in analyzing the Wisconsin Assets and Incomes Study data.

1. Stratification of the data for analysis

In order to study the impact of short-run fluctuations and business conditions and changes in the economic circumstances of families on the earnings of family members, it would be useful to stratify the Wisconsin Assets and Incomes data file into a number of homogeneous subgroups. Initially these groups would be defined by age and occupation as follows:

Occupation Group                                      Age Groups
1. Farm                                               1. Under 35 years of age
2. Self employed proprietors                          2. 35 to 45 years of age
3. Non-self employed managers and professionals       3. 45 to 55 years of age
4. Clerical and sales                                 4. 55 to 65 years of age
5. Semi-skilled                                       5. 65 years of age and over
6. Unskilled and service workers

(When survey data have been coded other dimensions can be added.)
In addition the data would have to be stratified on the basis of the number of years for which returns have been filed, as a longer time series offers a greater number of degrees of freedom for analysis of the micro-unit earnings time series.

2. Models to be estimated

The basic plan envisioned in the analysis would be to estimate regressions for individuals within the above strata. Each regression would contain predetermined variables, current family income situation variables, and current market condition variables formulated in such a way that the regressions on earnings of the head and wife would recursively define the earned income of the taxpaying unit. One such set of relationships that might be estimated could be formulated as follows:

Model I

$E_{ih} = \gamma_1 \, E_{ih,-1}^{a_1} \, Z_{ih}^{b_1} \, N_{i,-1}^{c_1}$

where
$E_{ih,-1}$ = Earnings of head of ith tax unit during t-1
$Z_{ih}$ = Wages paid in the occupation-industry of this individual during t
$N_{i,-1}$ = Non-earned income of the ith tax unit during the previous period

$E_{iw} = \gamma_2 \, E_{ih}^{a_2} \, Z_{iw}^{b_2} \, N_{i,-1}^{c_2}$

where
$E_{iw}$ = Earnings of the wife of the taxpayer during t
$Z_{iw}$ = Wages paid in the occupation-industry of the wife during t

An additive expression of the same model might prove equally interesting. The former earnings of the wife can easily be incorporated in that model in spite of the fact that they will be zero in many cases. Such a formulation would give rise to Model II:

Model II

$E_{ih} = \gamma_3 + a_3 E_{ih,-1} + b_3 Z_{ih} + c_3 N_{i,-1}$

$E_{iw} = \gamma_4 + a_4 E_{ih,-1} + b_4 Z_{iw} + c_4 N_{i,-1} + d_4 E_{w,-1}$

For both models it is clearly necessary to have auto-regressive moments of earnings upon themselves and the moments showing co-variation of various indicators of labor market earnings paid with the earnings reported by the individuals in the tax sample.
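To make the additive form concrete, here is a minimal sketch that evaluates the two Model II equations for one tax unit and sums them into the unit's earned income. The coefficient values and inputs are invented for illustration and are not estimates from WAIS data; the class and function names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Coefficients:
    """Hypothetical coefficients for one Model II equation."""
    gamma: float      # intercept
    a: float          # on head's earnings in t-1
    b: float          # on occupation-industry wages in t
    c: float          # on non-earned income in t-1
    d: float = 0.0    # on wife's earnings in t-1 (wife equation only)


def head_earnings(k, e_head_prev, z_head, n_prev):
    """E_ih = gamma + a*E_ih,-1 + b*Z_ih + c*N_i,-1"""
    return k.gamma + k.a * e_head_prev + k.b * z_head + k.c * n_prev


def wife_earnings(k, e_head_prev, z_wife, n_prev, e_wife_prev):
    """E_iw = gamma + a*E_ih,-1 + b*Z_iw + c*N_i,-1 + d*E_w,-1"""
    return (k.gamma + k.a * e_head_prev + k.b * z_wife
            + k.c * n_prev + k.d * e_wife_prev)


# Invented numbers, purely to show the arithmetic.
head_k = Coefficients(gamma=500.0, a=0.8, b=0.1, c=0.05)
wife_k = Coefficients(gamma=200.0, a=-0.02, b=0.3, c=-0.1, d=0.6)

e_head = head_earnings(head_k, e_head_prev=6000.0, z_head=5000.0, n_prev=400.0)
e_wife = wife_earnings(wife_k, e_head_prev=6000.0, z_wife=3000.0,
                       n_prev=400.0, e_wife_prev=1000.0)
unit_earned_income = e_head + e_wife
```

Because the wife's lagged earnings enter additively, a zero in that term simply drops out, which is why the additive form accommodates the many cases where the wife had no earlier earnings.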
The models might be elaborated to include hypotheses of the effect of the marginal tax rate on work effort by the inclusion of marginal tax rates in the relationship and ought of course to be elaborated for a number of personal characteristics that can be measured through exemptions and deductions such as deductions for medical care. Initially, however, I should think it would be most useful to focus on the variables outlined above and suppress some of the demographic factors.hahttp://www.ssc.wisc.edu/wais/WAIS645007.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645007.txt7Mike VonSchneidemesser 1966^XReport on the Matching of the FFID File with Master, SSA Form 805 and State Roll Extract June 16, 1966  WAIS paper656-062jcSocial Security Earnings Data- 805 Fixed Format Identification File (FFID) Master File- Tax RecordsdF@M. von Schneidemesser WAIS Paper 656-062 June 16, 1966 Report on the Matching of the FFID File with Master, SSA Form 805 and State Roll Extract At the writing of this report, all changes and updates on the FFID Marshall Seavey 1965`ZSuggested Additions to Coding and Punching of Stock and Bond Information in Assets BookletFebruary 8, 1965 WAIS paper645-033Survey Data and FileMarshall Seavey, WAIS 645-033, February 8, 1965, Draft. Suggested Additions to Coding and Punching of Stock and Bond Information in Assets Booklet. After surveying about 200 assets booklets at random, it appeared that certain added categories of codes to holdings and sales of stock and bond might be helpful in capturing information provided by respondents that does not fit the present questions. Proposed additions: 1. A code for questions 17c, 19f, 25d, and 33h, which would indicate means of acquisition other than through a purchase by Respondent. Respondents often indicated that they acquired their stocks and bonds by inheritance, reinvestment of capital gains in a mutual fund, through a profit sharing scheme, stock splits and other means. 2. 
Stock received in a profit sharing plan should be left in questions 25 and 35. Most of this stock was in the American Motors profit-sharing plan and is left in trust for two years. Presumably the respondent is free to use the stock as he wishes after this period of time.

3. When the respondent indicates that he has purchased stocks and bonds over a period of time, the beginning and last dates of this period might be coded where they could be ascertained.

4. Sometimes "per share" prices are given where a lot price is asked for. In such instances a coder could multiply the "per share" price by the number of shares in the lot and enter this figure in the appropriate category.

5. In one case out of 200 interviews a respondent had a lot of 50 shares of stock. He exercised options on 2 shares of this lot. There may not be enough cases of this to warrant a special code for this type of exception, which does not fit the "yes" or "no" responses to questions 25e and 33g.

http://www.ssc.wisc.edu/wais/WAIS645033.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645033.txt

Martin David. 1965. Notes on Future Work in the Area of Averaging Studies. November 3, 1965. WAIS paper 656-029. Averaging Studies.

M. David
WAIS Paper 656-029
November 3, 1965

NOTES ON FUTURE WORK IN THE AREA OF AVERAGING STUDIES

I. Present Capabilities

At present WAIS has a partially edited five-year file of tax records required to reconstruct income for 1958 under the 1964 tax averaging provisions (WAIS 645-050, 051). It also has available a partially edited file of individual tax records and social security wage earnings data for the period 1947-1959 (WAIS Paper 645-057). A program to compute the amount of averageable income in 1958 under the 1964 provisions is almost completed (WAIS Paper 645-052).

II.
Short-run output: Stage 1 Computation of averageable income under various variants of the 1964 tax averaging provisions for the year 1958 will yield information on the role of absolute dollar and percentage limitations on the amount of averageable income, the effect of full inclusion of capital gains, and negative income amounts. (WAIS Paper 645-053; Treasury contract document, 6/21/65). (Conceptually the file is not adequate for determining the full impact of exchanging negative-income averaging for carryovers as the amounts of business loss carryovers are not identified on the tax returns. Effects of averaging negative income due to excess deductions and exemptions is limited as low income individuals did not generally have to file for refunds during the sample period. However, simulation of carryovers for capital losses could be attempted.) III. Intermediate-run output: Stage 2 Prerequisites: (a) Financing (b) decisions on the extent of useful replication of II (for 1951-57) (c) resolution of questions associated with the choice of an appropriate income base. Using the five-year file created for II, simulate federal income tax liability with and without averaging under the assumption that capital gains are treated as ordinary income. Compute tax savings under an alternate plan in which simple annual averaging is permitted with limitations on the refund (Groves-Simon). Repeat the variant with alternative provisions for reconstruction of income: A. Reconstruct income as actual earnings in single years, half of joint income in married years. B. Reconstruct income as actual earnings in single years, the greater of separate or half of joint income in married years. Varying the principle of reconstruction vastly simplifies the computation and assembly of records required and makes it possible to deal with individual income and tax histories without becoming enmeshed in changing tax unit structures. IV. 
Intermediate-run output: Stage 3 Using reconstruction under IIIA or IIIB it would be possible to apply (a) periodic averaging, (b) cumulative averaging, and (c) the simple annual averaging plan in III to investigate the cumulative history of tax payments of individuals over an indefinite period. Comparison of (a) and (c) would provide information on the importance of duplication of years in successive averaging computations. Variation in the averaging period could be investigated on these files; use of the existing five-year file implies investigation of shorter averaging periods only. Comments We assume little interest in a moving average method. Treatment of capital gains and reconstruction as suggested are devices to simplify the tax computation. Treatment of negative income is assumed adequately studied by the planned output for II, but some effort could be put on conceptual linking of averaging negative tax credits, and carryovers, using the file for whatever illustrative purposes may be useful. A program such as this would require 18-24 months of funding.hahttp://www.ssc.wisc.edu/wais/WAIS656029.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656029.txtL R. F. Miller 1964@9WAIS Memorandum on Initial Tabulations of Tax Return Datae July 28, 1964r WAIS paper645-003sf`Analysis Cross Tabulations Master File- Tax Records Proposals- For Analyses, Theses, etc. TablesR.F. Miller Draft 7-28-64 645-003 WAIS Memorandum on Initial Tabulations of Tax Return Data Objectives: (1) Obtain data distributions comparable to those in other sources for determination of the relationship of the sample to its universe; (2) Obtain distributions of variables which give insight into the problems we intend to investigate more fully at later stages. 
Limitations: (1) Use only tax return data, none of the Social Security or interview data; (2) Can obtain joint husband and wife income only for those cases in which we have made the matching; (3) Keep to univariate and bivariate distributions at this stage.

Units of Analysis: (1) Individual return filers, regardless of household status; (2) Combined husbands and wives as (2a) single units, and as (2b) separate individuals each having half the combined income; (3) "Households," combining incomes of husbands with wives and other household members where these are identifiable.

For the initial tabulations it is proposed that we use only two types of units: (1) Individual return filers; and (2) Single persons and combined husband and wife units. Each tabulation should be done using both types of units, and should give both actual frequencies and percentage distributions.

Variables for Analysis:
(1) Adjusted Gross Income (AGI) and its various components
(2) Net Taxable Income (NTI)
(3) Year of filing (t)
(4) Number of dependents (n)
(5) Spouse filing separately? (m)
(6) Occupation (O)
(7) Averages of the variables in (1) and (2) over time
(8) Variances of the variables in (1) and (2) over time
(9) Trends of the variables in (1) and (2) over time
(10) Residual variation from trends
(11) V = sq. root((8)) ÷ (7)
(12) Sex of filer
(13) Number of years filed
(14) Years filed continuously?
(15) Marital status

The separate components of AGI of most immediate interest are (16) Wages and salaries (W & S) (for comparisons with Census data); (17) Dividends and capital gains realizations (D & G) (for the NBER study); and (18) Business and professional income (B & P) (for Harry Kahn).
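Items (7) through (11) can be computed per filer from a short series of yearly amounts. The sketch below is one possible reading, not the memo's stated procedure: it assumes the trend is an ordinary least-squares line on year, that variances divide by the number of years filed, and that item (11) is the coefficient of variation (square root of the variance divided by the average). The function name and sample figures are hypothetical.

```python
from math import sqrt


def summarize(years, amounts):
    """Per-filer average, variance, linear trend, residual variance and V.

    Assumptions of this sketch: population variance (divide by n) and an
    ordinary least-squares trend of amount on year.
    """
    n = len(amounts)
    mean = sum(amounts) / n
    variance = sum((v - mean) ** 2 for v in amounts) / n

    year_mean = sum(years) / n
    sxx = sum((y - year_mean) ** 2 for y in years)
    if sxx == 0:
        # A single year filed: no trend can be observed.
        slope = 0.0
    else:
        slope = sum((y - year_mean) * (v - mean)
                    for y, v in zip(years, amounts)) / sxx
    intercept = mean - slope * year_mean

    residual_variance = sum(
        (v - (intercept + slope * y)) ** 2 for y, v in zip(years, amounts)
    ) / n
    v_coefficient = sqrt(variance) / mean if mean != 0 else None

    return {
        "average": mean,
        "variance": variance,
        "trend": slope,
        "residual_variance": residual_variance,
        "V": v_coefficient,
    }


# Invented AGI series for one filer.
stats = summarize([1947, 1948, 1949], [1000.0, 2000.0, 3000.0])
```

For a series that rises by the same amount each year, as in the sample, all of the variance is absorbed by the trend and the residual variance is zero, which is the distinction items (8) and (10) are meant to capture.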
It is also desirable to have the following aggregates for Wisconsin available for each year prior to extracting the individual data (aggregates to be obtained from Personal Income by States): (19) Personal income less transfer payments; (20) Wages and salaries plus other labor income; (21) Property income; (22) Proprietors' income.

Proposed Distributions: Row and column totals on bivariate distributions will give most of the univariate distributions, in which cases the latter need no separate tabulations. For each of the variables we ask for values in selected years as well as the averages, etc. Appropriate years suggested are 1947, 1955, 1959, and 1962 (the latter contains no information on the components of AGI).

(1) Bivariate distributions:

Table No.   Row Classes Variable              Column Classes Variable
I           AGI average, 1947-59              NTI average, 1947-59
II          AGI average, 1947-59              Variance of AGI, 1947-59
III         D & G average, 1947-59            W & S average, 1947-59
IV          B & P average, 1947-59            D & G average, 1947-59
V           W & S average, 1947-59            B & P average, 1947-59
VI          Dividends average, 1947-59        Capital gains average, 1947-59
VII         Dividends average, 1947-59        Dividends variance, 1947-59
VIII        Capital gains average, 1947-59    Capital gains variance, 1947-59
IX          W & S average, 1947-59            W & S variance, 1947-59
X           B & P average, 1947-59            B & P variance, 1947-59
XI          NTI average, 1947-59              NTI variance, 1947-59

(2) Other distributions and summary measures:

Table No.   Description
XII         For each occupational category that is distinguished in the data, give
            (1) weighted average of AGI 1947-59 averages, weights to be number of years filed
            (2) number of distinct units included
            (3) average number of years filed
            (4) weighted average of variances of AGI
            (5) weighted average of residual variances from trends
            (6) weighted average of trend coefficients
            (7) weighted average of coefficients of variation.
XIII        For each AGI class, for each year 1947, 1953, 1959, 1962 and for the average 1947-59, give the number of units and the aggregate income of the units in that class (for Lorenz curves), and give the weighted averages of variances, residual variances, trend coefficients and coefficients of variation for all persons whose average 1947-59 income is in that class.
XIV         Repeat XIII for NTI classes.
XV          Repeat XIII for W & S classes, omitting 1962.

Summary Measures: In addition we should obtain the overall mean and variance of each of our variables, plus certain selected covariances. The covariance should be obtained between each pair of variables in the bivariate distributions. Also, the following covariances can be computed during the extraction run: (2) with (19); (16) with (20); (17) with (21); (18) with (22).

Single Year Filers: Persons filing in only one year in the entire sample period obviously have zero observed variances and trend coefficients. These observations should not be included in computing averages of these measures in tables XII-XV. The number of single year filers in each class or occupation category should be separately tabulated.

Class Boundaries for AGI, NTI, W & S, B & P and Their Averages:
Upper limit (included in the class): -$100; -1; +1; 1,000; 3,000; 5,000; 7,000; 9,000; 11,000; 13,000; 15,000; 20,000; 22,000; 30,000; 32,000; 50,000; 52,000; 75,000; 77,000; 100,000; 102,000; 150,000; 152,000; infinity

Class Boundaries for Dividends and for all Variances:
$0; 100; 300; 500; 700; 900; 1,100; 1,300; 1,500; 2,000; 5,000; 10,000; 20,000; 50,000; 100,000; 200,000; 400,000; 1,000,000; infinity
Separate classes: For variances: Filed in only one year. For dividends: Never reported any dividends or capital gains.
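Because each upper limit is included in its class, assigning an amount to a class is a search for the first limit that is not smaller than the amount. The sketch below illustrates this with the AGI limits listed above, read as -100, -1, +1, 1,000 and so on (that reading of the first three entries is an assumption). The names `AGI_UPPER_LIMITS`, `class_index` and `tabulate` are hypothetical and are not taken from any WAIS program.

```python
from bisect import bisect_left
from math import inf

# Upper class limits for AGI, NTI, W & S, B & P and their averages.
# Assumption: the first three entries are read as -100, -1 and +1.
AGI_UPPER_LIMITS = [
    -100, -1, 1, 1_000, 3_000, 5_000, 7_000, 9_000, 11_000, 13_000,
    15_000, 20_000, 22_000, 30_000, 32_000, 50_000, 52_000, 75_000,
    77_000, 100_000, 102_000, 150_000, 152_000, inf,
]


def class_index(amount, upper_limits=AGI_UPPER_LIMITS):
    """Index of the class whose upper limit is the first one >= amount."""
    # bisect_left returns the first position whose limit is >= amount,
    # which is exactly "upper limit included in the class".
    return bisect_left(upper_limits, amount)


def tabulate(amounts, upper_limits=AGI_UPPER_LIMITS):
    """Number of units and aggregate income per class (as used for Lorenz curves)."""
    counts = [0] * len(upper_limits)
    totals = [0.0] * len(upper_limits)
    for amount in amounts:
        i = class_index(amount, upper_limits)
        counts[i] += 1
        totals[i] += amount
    return counts, totals
```

Single year filers and units that never reported dividends or capital gains would need their own separate classes, as the memo specifies; the sketch leaves that step out.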
Class Boundaries for Capital Gains Realizations:
Upper limits of classes: -900; -700; -500; -300; -100; 0; +100; 300; 500; 700; 900; 1,200; 1,500; 2,000; 5,000; 10,000; 20,000; infinity
Separate Class: Never reported any dividends or capital gains

http://www.ssc.wisc.edu/wais/WAIS645003.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645003.txt

James Geffert, 1966. Check for Presence of Proper Cards in the Survey File. May 4, 1966. WAIS paper 656-061a.
Survey Data and File

James Geffert
WAIS paper 656-061a
May 4, 1966

Check for Presence of Proper Cards in the Survey File

Cards
02  Required
03  Required
04  Required
05  Required
06  Not required if col. 48 card 4 = 5; not required if R is female
07  Not required if col. 48 card 4 = 5; not required if R is male
08  Required
09  Required
10  Required
11, 12, 13, 14  Not included
15  Required
16  Not required if col. 63-64 card 15 = bb or 00
17  Not required if col. 65 card 15 = b or 0
18  Not required if col. 67 card 15 = b or 0
19  Not required if col. 68 card 15 = b or 0
20  Not required if col. 69 card 15 = b or 0
21  Required
22  Required only if card 21 col. 53 = b or 3 and card 21 col. 54 = 1
23  Required if card 22 col. 80 = 1 or if card 21 col. 54 does not equal 1 and card 21 col. 53 = b
24  Required if card 21 col. 55 = 1
25  Required if card 21 col. 55 = 1
26  Required if card 21 col. 55 = 2 or 3
27  Required if card 21 col. 53 does not equal b
28  Required
29  Required
30  Required
31  Required
32  Required if card 29 col. 40-42 does not equal 000 or ( ) or ( )
33  Required if card 29 col. 43-45 does not equal 000 or ( ) or ( )
34  Required if card 29 col. 46-48 does not equal 000 or ( ) or ( )
35  Required if card 29 col. 49-50 does not equal 00 or ( ) or ( )
36  Required if card 29 col. 51-52 does not equal 00 or ( ) or ( )
37  Required if card 29 col. 53-55 does not equal 000 or ( ) or ( )
38  Required if card 29 col. 56-57 does not equal 00 or ( ) or ( )
39  Required if card 29 col. 58-59 does not equal 00 or ( ) or ( )
40  Required if card 29 col. 60-61 does not equal 00 or ( ) or ( )
41  Required if col. 62-63 card 29 does not equal 00, ( ), ( )
42  Required if col. 64-66 card 29 does not equal 000, ( ), ( )
43  Required if col. 64-66 card 29 does not equal 000, ( ), ( ) and card 49 col. 72 = 1 and card 42 col. 78-80 matches card 43 col. 78-80
44  Required if col. 67-69 card 29 does not equal 000, ( ), or ( )
45  Required if col. 70 card 29 does not equal 0 or ( )
46  Required if col. 73 card 45 = 1
47  Required if col. 71-72 card 29 does not equal 00, ( ), ( )
48  Required
NOTE: b is equivalent to ( ) and means an alpha blank.

http://www.ssc.wisc.edu/wais/WAIS656061a.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656061a.txt

Alan Duchan and Gene Moyer, 1966. Changes in Computation of Income and Deductions for the Averaging Tables. February 3, 1966. WAIS paper 656-041.
Averaging Studies; Tables

Alan Duchan and Gene Moyer
WAIS Paper 656-041
February 3, 1966

Changes in Computation of Income and Deductions for the Averaging Tables
(See Miller, WAIS 645-052, April 28, 1965)

Because WAIS desires to approximate the Federal definition of Taxable Income as closely as possible, we are redefining certain parameters in Miller's original paper on the averaging tables.

1. The Problems

1.1 Differences between standard deductions allowed by State and Federal law.
At present, the averaging tables use the State system of deductions for 1953-1959, i.e., persons who actually filed State income tax returns for 1954-1958 have either the State standard deduction (min.[.09SAGIt, $450]) or itemized deductions. Married persons whose income is interpolated and who did not file have zero deductions, and in general, persons who did file have a smaller deduction than Federal law allows. The Federal law allows a standard deduction for joint return filers or single persons of either 10% of joint AGI up to $1,000 or $200 + $100 for each exemption up to $1,000. For couples married but filing separately, the problem compounds itself because both persons must take the same type of deduction; i.e., both must itemize, both must take $200 + $100 per exemption, or both must take 10% of AGI. In the last two cases the deduction is limited to $500 each. Ideally, the type of deduction that minimizes total taxes should be chosen, but to do this requires computation of tax for each of the three different deductions and assuming a particular tax law - a difficult task, indeed.
An alternative is to use the type of deduction that maximizes total deductions. Usually the two procedures would yield the same result, but even where they would not, the latter should not significantly distort aggregate figures.

1.2 Reconstruction of income when a person files a separate return in the computation year but a joint return in base period years.
For people married in the computation year (Group III people) and treated as filing separately, Miller allocated their joint deductions and exemptions, as required by the law, in approximate proportion to each person's contribution to total AGI. He used the allocation procedure for all base period years, whether or not the couple was married, and for the computation year. The law, however, requires such allocation only when the couple filed a joint return. For years in which the couple filed separate returns, each person's deductions are those shown on his own return. Since WAIS assumed the couple filed a joint return for all base period years in which they were married, our procedure is correct for married base period years. For base period years in which the couple was single, Miller's allocation procedure is incorrect. What will be done now is to use the deductions and exemptions shown on each individual's State return, adjusted to meet the Federal definition. When the couple is treated as filing separately in the computation year, allocation of deductions for that year is also incorrect. A further problem arises, however. Due to the nature of Ellis' input plus our gap plugging operation, the only exemption information available for a married couple is the total for the couple. One reasonable (but nevertheless arbitrary) way to allocate exemptions is to distribute them according to the ratio of AGI's.

2. Proposed modifications

2.1 Group I - People single throughout period.
2.1.1 Using Wisconsin definition of deductions.
Present procedure is correct.
2.1.2 Using Federal definition of deductions.
Compute
Ds = max[min(1,000; .1G); min(1,000; 200 + 100E); G - N]
The first two arguments represent the allowable Federal standard deduction: the larger of 10% of AGI or $200 + $100 per exemption, but in no case more than $1,000. The third argument is the total deductions shown on the State return. If G - N is larger than the Federal standard deduction, the taxpayer must have itemized deductions and G - N becomes an approximation of Federal itemized deductions. It will be off only to the extent that Wisconsin itemized deductions are not allowed by the Federal government or vice-versa.
Compute Ns = G - Ds and proceed as in WAIS 645-052, replacing N with Ns.

2.2 Group II - People single in the computation year. Married in one or more base years.
2.2.1 Using Wisconsin definition of deductions.
Present procedure is correct.
2.2.2 Using Federal definition of deductions.
2.2.2.1 For all years in which the person was single (must include the computation year), the treatment is the same as for a Group I person (2.1.2).
2.2.2.2 For years in which the person was married, compute
DJ = max[min(1,000; .1(G1 + G2)); min(1,000; 200 + 100E); G1 + G2 - N1 - N2]
Redefine J as
J = G1 + G2 - DJ - C1 - C2 - 600E for j = 1,3
J = G1 + G2 - DJ - 600E for j = 2,4
Proceed as in WAIS 645-052, replacing D with DJ and using the new definition of J.

2.3 Group III-a - People married in computation year and treated as filing jointly in computation year.
2.3.1 Using Wisconsin definition of deductions.
Present procedure is correct, except when single, J = N1 + N2 - 600(E1 + E2)
2.3.2 Using Federal definition of deductions.
2.3.2.1 For base period years in which single.
Compute
Dsi = max[min(1,000; .1Gi); min(1,000; 200 + 100Ei); Gi - Ni]   (i = 1,2)
For j = 1, compute
Si = Gi - Dsi - 600Ei - Ci   (i = 1,2)
J = S1 + S2
Then Bt = max(B; J) where B = 0
For j = 2, same as j = 1 except Si = Gi - Dsi - 600Ei
For j = 3, same as j = 1 except B = -(infinity)
For j = 4, same as j = 2 except B = -(infinity)
2.3.2.2 For the computation year and base period years in which married.
Compute
E = max(E1, E2)
DJ = max[min(1,000; .1(G1 + G2)); min(1,000; 200 + 100E); G1 + G2 - N1 - N2]
For j = 1, compute
J = G1 + G2 - DJ - 600E - C1 - C2
Bt = max(B; J) where B = 0
For j = 2, same as j = 1 except J = G1 + G2 - DJ - 600E
For j = 3, same as j = 1 except B = -(infinity)
For j = 4, same as j = 2 except B = -(infinity)

2.4 Group III-b - People married in computation year and treated as filing separately.
2.4.1 Using Wisconsin definition of deductions.
2.4.1.1 For base period years in which single.
For j = 1, compute
Si = Ni - 600Ei - Ci   (i = 1,2)
J = S1 + S2
Bit = max(B; 1/2 J; Si)
For j = 2, same as j = 1 except Si = Ni - 600Ei
For j = 3, same as j = 1 except B = -(infinity)
For j = 4, same as j = 2 except B = -(infinity)
2.4.1.2 For base period years in which married.
The present procedure is correct.
2.4.1.3 For the computation year.
Compute
E = max(E1, E2)
Pi = Gi/(G1 + G2)   (i = 1,2)
Xi = 0     if Pi < .15
Xi = PiE   if .15 < Pi < .85
Xi = E     if Pi > .85
For j = 1
Si = Ni - 600Xi - Ci
Bt = max(B; Si) where B = 0
For j = 2, same as j = 1 except Si = Ni - 600Xi
For j = 3, same as j = 1 except B = -(infinity)
For j = 4, same as j = 2 except B = -(infinity)
2.4.2 Using Federal definition of deductions.
2.4.2.1 For base period years in which single.
Bit = max(B; Si; 1/2 J)   (i = 1,2)
B, S, and J vary with j as described in 2.3.2.1 covering married people filing jointly.
2.4.2.2 For base period years in which married.
Proceed as in section 2.2.2.2, above, covering Group II people in married base period years.
2.4.2.3 For the computation year.
Let d' =G1+G9 -N1-N2 d'' = min[500;.1G1] min[500;.1G2) d''' = min[ 500; 100+100X1 ] + min[ 500; 100+100X2] Find max[ d' ; d''; d''' ] Based on which argument is largest, compute D for each person using the following definition of DIi. If d' is max. If d'' is max. If d''' is max. DIi = Gi - Ni min[500;.1Gi] min(500;100+100Xi) For j = 1 Compute Si = Gi = DIi - 600Xi - Ci and Bt = max (B;Si) B = 0 For j = 2, same as j = 1 except Si = Gi-DIi - 600Xi j = 3, same as j = 1 except B = -(infinity) j = 4, same as j = 2 except B = -(infinity) 3.0 Summary of adjustments 3.1 Table I below indicates the steps needed to obtain a person's income for averaging purposes from his Wisconsin return. The table incorporates the changes given in Section 2.0, but based on the Wisconsin definition of deductions. Further changes needed to convert to the Federal definition of deductions are given after the table. The third row (married base period years, filing separately) is included for completeness. It is not needed for the current work since we assume joint returns are filed for all married base period years. Table I: Steps Needed to Obtain Averageable Income Using Wisconsin Definition of Deductions Marital Status and How Filing in Computation Year i Group IIIb - Married Group IIa Married in Base period in computation year. computation year and treated year-single Treated as filing separate- as filing jointly in com 1 in computation year putation near 600E Si = Ni - 60OEi J = N1 + N2 - 600(E 1 + EE B =max.- (13;-F) J = S1 + S2 = max, Bi = max. (f3 ;;S; 1/2J) Base period Not B = max. (E1; E2) max. (E ; ) year-married applicable D = G1 + G2 - N1 3 = N1 + N2 -, 600E joint return Si - Gi - Ai B - max. (P ; J) where is the parts of D + 600E allocated to each person in accordance with his (her) contribution to AGI, J --N1 - N2 - 600E B - max. ; S 1 2J Base period Not . E ; E2 max. 
(B ; ) years-married applicable Xi % of E allocated to each person in accordance J = N1 + N2 - 600E separate B - max. (f3 ; J) return with his contribution to JOINT AGI Si = Gi - Di - 60OXi 3 = S1 + S2 = N1 - 600E Bi = max. (P ; S; 1/23) Computation Same as base period year year-joint Not Not Not with joint return t n applicable licable applicable Computation Same as married base period year-separate year with separate return not return Same as single base period year except B1 = max. (P ; Si) applicable For simplicity capital gains have. been included so that the formulas apply to Legal Definitions 1 and 3. 10 3.2 To convert to the Federal definition of deductions 1) For all single or joint return years, redefine a) ' D as the minimum of: 1) Deductions shown on Wisconsin return, 2) $200 + $100 per exemption up to $1,000, 3) .1 AGI up to $1,000. b) Redefine F, S, and J accordingly. 2) For separate return married_ years a) Find type of deduction that maximizes total deduction for couple; i.e., find max. of: 1) Sum of deductions shown on Wisconsin returns, 2) Sum of $100 + $100 per exemption up to $300. 3) Sum of .1 AGI up to $500. b) Separately for each person, redefine Di according to the type of deduction that maximized total deductions. c) Redefine S and J accordingly. 3.3 As an indication of what is required to utilize the adjustments suggested in Section 2.0, Table II describes the eresent procedure used to find income for averaging purposes. The table is a summary of the procedure outlined in WAIS 645-052. 11 Table II - Procedure outlined in WAIS 645-052 for obtaining averageable income croup I-Single Group II-Single Group IlIb-Married Group IIIa throughout in computation in computation year period ear. Married Treated as filing Married in com previously separately in com- putation year putation year. Treated as filir jointly in com utation year. 
Base period Same as Table I Same as married Same as married year single joint return base joint return period year in base period year Table I in Tabl I Base period Not Same as Table I Same as Table I Six tm#r applicable e juld ' Base period Not discussed in WAIS 645-052 year married 'separate re- turn Computation Not Not not Same as Table I year joint applicable applicable applicable return Computation Same as Table I Same as married Not year-separate joint return base applicable return -eriod year in Table Ihahttp://www.ssc.wisc.edu/wais/WAIS656041.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656041.txtv Gene Moyer 1966.'A Weighting Function for WAIS InterviewFebruary 22, 1966\ WAIS paper656-043xSurvey Data and Filec,c&Table 5 Rates at Which the Unrecognized Matched Respondents Were Drawn A= 2 A = 3 True True -1 Number of Respondents Rate Rate Rate (Rate) .01 .05 .0505 19.8019 10 .02 .05 .0690 14.4927 2 .01 .10 .1090 9.1743 2 .02 .10 .1180 8,4745 4 .10 .10 .1900 5.2631 2 .20 .05 .2400 4.1667 1 .20 .10 .2800 3.5714 2 .25 .10 .3475 2.8777 1 .01 .50 .5050 1.9807 7 .02 .50 .5100 1.9607 3 .20 .50 .6000 1.6667 5 Low Income Sample .01 1.00 1.0000 1.0000 8 High Income Sample .02 1.00 1.0000 1.0000 5 .10 1.00 1.OOOQ 1.0000 3 .20 1.00 1.0000 1.0000 2 .25 1.00 1.0000 1.0000 5 1.00 1.00 1.0000 1.0000 35 97 interviews involved by key designation grouping, Table 6 indicates a proper way to group these 68 interviews. The significant key digit which divides these interviews is key E, average property income. When key E is 0 or 1, the interviews were drawn at 1 or 2 percent. When key E is 2-4, the interviews were drawn at 10-25 percent. Therefore, this writer suggests that there be three groups of interviews with the same weight, those drawn at 1-2 percent, those drawn at 10-25 percent and those drawn at 100 percent. This gives a k i of 15, 37 and 16. 
One may argue that 15 or 16 is too small as a k value, but this division seems to reflect best the population from which the 68 interviews were drawn. The 19 interviews remaining in A = 3 must, it seems to this writer, be considered as a single group.

In addition to our ai adjustment of A = 1, we need to recognize that the population values for A = 2 and A = 3 are larger than the true values because the values in Table 2 include persons who are in both lists. Furthermore, A = 3 values in Table 2 are smaller than they should be because some persons in A = 5 really should be in A = 3. Let BiPi = the number of persons in Pi who are really matches. Then
Pi - BiPi = Ni
Pi(1 - Bi) = Ni
We showed for A = 1 that Ni(1 - ai) = Mi, so for A = 2,
Pi(1 - Bi)(1 - ai) = Mi

Table 6
Interviews in A = 2: A Priori Sampling Rate and Key Designation

Key Class                                   Key Class     Sampling     Number of
A    B    C     D     E     F     G         Population    Rate         Interviews
2    1    0     0-3   0-1   0-4   0-4       2145          1            10
2    1    1     0-3   0-1   0-4   0-4       296           2            5
2    1    0     0-3   2-4   0-4   0-4       162           10           7
2    1    0-1   0-3   2-4   0-4   0-4       166           20           18
2    1    1     0-3   4     0-4   0-4       94            25           12
2    all  all   all   all   0-4   0-4       45            100          16
Sum                                         2808                       68

For the A = 3 group, let
Qi = the number of persons shown in Table 2 for the ith division
yiQi = the number of persons in A = 5 who should be in A = 3.
Then Qi + yiQi = Qi(1 + yi) = Pi
therefore Qi(1 + yi)(1 - Bi)(1 - ai) = Mi.
Therefore in addition to estimating ai as we did for A = 1, we must estimate yi and Bi for A = 3 or A = 2. Let us consider the estimation of yi first. For A = 3, of course, all the interviews are in the ith group. The easiest route to take is to recognize that we have an estimate of Qi(1 + yi) in the population of A = 2 as shown in Table 2. Since WAIS limited the population to those who filed a 1959 or a 1962 income tax return, we know that the A = 2 population represents the number of persons who filed a 1959 income tax return (in name groups and in PSU) but who did not file a 1962 return.
The A = 3 population should represent the number of persons who filed a 1962 income tax return (in name groups and in PSU) but who did not file a 1959 return. While the A = 3 group should be somewhat larger than the A = 2 group because the number of 1962 returns was probably greater than the number of 1959 returns, if we use 2808 (the A = 2 low income population) instead of 1603 (the A = 3 low income population of Table 2) we should approximate Qi(1 + yi) rather closely, or at least significantly improve the estimation over an estimation which used yi = 0.

Two possible estimators of Bi exist. The first makes use of the fact that 277 of the 2328 members of the Unmatched State File in name groups (A = 3 and A = 4) were actually matches. We might use, then, 277/2328 = .114 as an estimate of Bi. This estimate seems too low.* In this writer's judgment, a better method of estimating Bi is to assume that the proportion of true matches in the unmatched populations (A = 2 and A = 3) was the same as that in the sample as originally drawn. There were 50 hand matches in the Low Income Sample from A = 3 and 4 from A = 2. Since these were matches, they must have appeared in both groups. The A = 3 low income sample was 151, the A = 2 sample was 149. Thus our estimate of Bi (using the average of the A = 2 and A = 3 samples) is Bi = 54/150 = .3673. This seems much more realistic than the .114 of the preceding estimator.
------------------------------------------
*The denominator is low because 2328 was lower than the true population size of A = 3 and A = 4. The numerator was low because listings of A = 5 and 6 were never available for matching. The effect of changes in the numerator and the denominator cannot be estimated without some idea of the magnitude of the changes.

For the A = 2 and A = 3 groups, then, Wi = Mi/ki as before. As we mentioned before, this weight is based on the groups in the low income population which had a positive probability of being drawn.
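The chain of adjustments leading to Wi can be written out as a short calculation. The sketch below is only an illustration: the function names are hypothetical, and the value used for ai in the usage lines is a made-up placeholder, since no estimate of ai is given in this passage. The other inputs (2808 for Qi(1 + yi), .3673 for Bi, and 19 interviews in A = 3) are the figures quoted in the text.

```python
def adjusted_population(q_i, y_i=0.0, b_i=0.0, a_i=0.0):
    """Mi = Qi(1 + yi)(1 - Bi)(1 - ai).

    q_i -- population count shown in the table for the ith division
    y_i -- share of q_i to add for persons listed elsewhere who belong here
    b_i -- share of the listed population that is really a match
    a_i -- the further adjustment applied as for A = 1
    """
    return q_i * (1.0 + y_i) * (1.0 - b_i) * (1.0 - a_i)


def interview_weight(m_i, k_i):
    """Wi = Mi / ki, the number of population members each interview stands for."""
    return m_i / k_i


# Usage with the figures quoted above. Using 2808 directly stands in for
# Qi(1 + yi), so y_i stays at zero here. The value 0.10 for a_i is a
# placeholder chosen only for illustration.
m_a3 = adjusted_population(2808, y_i=0.0, b_i=0.3673, a_i=0.10)
w_a3 = interview_weight(m_a3, 19)
```

Keeping the three adjustments as separate arguments makes it easy to see how sensitive the weight is to each estimate, for example by rerunning with Bi = .114 in place of .3673.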
The groups which had a zero probability of being drawn were those who resided outside PSU's. Let us now consider this group of persons. There is no a priori reason to believe that these persons are significantly different from persons who reside in PSU's because of the way PSU's were chosen. The method of choosing PSU's was to divide Wisconsin's 72 counties into 12 "self-representing" and 59 "non-self-representing" counties or county groups on the basis of the 1960 Census of county population. The "self-representing" counties are Brown, Dane, Kenosha, Manitowoc and Calumet (combined), Marathon, Milwaukee, Outagamie, Racine, Rock, Sheboygan, Waukesha and Winnebago counties. These are "self-representing" basically because they contain the largest cities in the state and so have by far the largest populations. The remaining counties were divided into fourteen strata based on population size and were formed so that each stratum contained about 100,000 persons in the 1960 Census. One county in each of these strata was chosen as the PSU from the stratum. Thus the probability of choosing the jth "non-self-representing" PSU in the kth stratum is pj/Σpj, where pj is the population of the jth PSU in the kth stratum and Σpj is the population of the kth stratum.* Thus the weight given to each interview from the jth PSU is Sj = Σpj/pj. In self-representing PSU's, of course, Sj = 1. Table 7 gives the PSU's and the weights attached to them as well as the additional counties in each stratum where such exist.

In effect, what WAIS did by giving a zero sampling rate to residents of counties which are not PSU's was to make Mi the size of the population in PSU's. If we were weighting to represent the entire state population (1960 Census) we would probably use a weight Fij = (Ti/ki)Sj (where Ti is the population of the ith class in the state) for an interview from a person in the ith class who resided in the jth PSU.
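The PSU weight and its combination with a class weight amount to two divisions and a product. The sketch below illustrates them with invented county names and populations; `psu_weight` and `combined_weight` are hypothetical names, and reading Fij as (Ti/ki) multiplied by Sj is an assumption about how the printed formula groups its terms.

```python
def psu_weight(stratum_populations, psu):
    """Sj = (population of the stratum) / (population of the chosen PSU).

    stratum_populations -- mapping of county name to Census population for
                           every county in the stratum, the PSU included
    psu                 -- name of the county chosen as the PSU
    """
    return sum(stratum_populations.values()) / stratum_populations[psu]


def combined_weight(class_population, interviews_in_class, s_j):
    """(class population / interviews in class) * Sj.

    With the state-wide class population Ti this is Fij; the same form
    applies to any other class population used for weighting.
    """
    return (class_population / interviews_in_class) * s_j


# Invented figures, for illustration only.
stratum = {"County A": 25_000, "County B": 45_000, "County C": 30_000}
s_a = psu_weight(stratum, "County A")             # the PSU holds a quarter of the stratum
s_self = psu_weight({"Big County": 500_000}, "Big County")  # self-representing case
f_ij = combined_weight(3_000, 15, s_a)
```

A self-representing county is its own stratum, so the same function returns a weight of 1 for it without any special case.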
WAIS is weighting to an alphabetic sample of husbands and single persons in the state who filed tax returns in 1954-1959 or 1962 and not to the 1960 Census estimate of the entire population, but there is some evidence that the Master File contains some bias toward the Madison Tax District.** Therefore this writer would argue that Gij = (Mi/ki)Sj = WiSj is likely to remove some of this bias and to make interview distributions more like population distributions rather than less. Once interviews are weighted by Gij, sample aggregates should represent population aggregates at least asymptotically in repeated sampling.
----------------------
*This is basically from "The WSRL Sample of Wisconsin Housing Units," Wisconsin Survey Research Laboratory, University of Wisconsin, December 1963, pp. 1-2. (Unpublished)
**This evidence is examined in my Ph.D. Thesis, The Longitudinal Income Tax Return Sample of the Wisconsin Assets and Income Studies and Wisconsin's Income-Receiving Populations: The Validity of Income Distributions, which should be available soon.

Table 7
Primary Sampling Units, Their Weights, and the Stratum from Which Each Was Chosen

Self-Representing Counties
County               Weight
Brown                1.0000
Dane                 1.0000
Kenosha              1.0000
Manitowoc-Calumet    1.0000
Marathon             1.0000
Milwaukee            1.0000
Outagamie            1.0000
Racine               1.0000
Rock                 1.0000
Sheboygan            1.0000
Waukesha             1.0000
Winnebago            1.0000

Non-Self-Representing Counties
County (Other Counties in Stratum)                                       Weight
Clark (Barron, Rusk, Taylor, Dunn)                                       3.9519
Dodge (Chippewa)                                                         1.7139
Douglas (La Crosse)                                                      2.6100
Eau Claire (Fond du Lac)                                                 2.2879
Grant (Iowa, Lafayette, Green, Crawford)                                 2.8005
Oconto (Kewaunee, Door, Marinette)                                       3.9322
Polk (Buffalo, Pepin, Pierce, St. Croix)                                 3.9318
Price (Bayfield, Ashland, Iron, Vilas, Burnett, Washburn, Sawyer,
       Florence, Forest, Oneida)                                         8.5524
Sauk (Monroe, Richland, Vernon)                                          3.0616
Trempealeau (Green Lake, Juneau, Waushara, Adams, Marquette, Jackson)    4.3211
Walworth (Jefferson)                                                     1.9566
Washington (Columbia, Ozaukee)                                           2.6295
Waupaca (Lincoln, Langlade, Shawano)                                     3.1677
Wood (Portage)                                                           1.6254

4. Weighting the High Income Sample

While the weighting scheme for the High Income Sample should probably be similar to the system devised for the Low Income Sample, provision must be made for differences in the way the two samples were drawn. Let us review these differences. The sampling frame for the High Income Sample contained all 8 lists of people represented by key A. One can look at this frame as having three parts: key A = 1, 2 and 3, a sample of high income people from the name groups (Master File) in PSU; key A = 4, 7, and 8, a supplementary sample of persons from the Master File outside PSU; and key A = 5 and 6, a supplementary sample of persons from outside the name groups. We have already noted that some people from the A = 5 and 6 supplementary sample actually belonged in the two name group samples. One original suggestion for handling these interviews from the supplementary samples was to leave them out of the weighting scheme altogether (and so out of any analysis of the entire sample). We would like to reject this suggestion because these interviews contain the most significant of the interview data and because the variance of most variables in which WAIS is interested is greater among high income-high wealth respondents than among other groups in the population or sample.

To use the notation of the weighting model, we noted that in repeated sampling E(a) = A. We did not mention that in a sample of size m from a population of size M without replacement, the variance of a is σ²a = σ²A (M - m)/(M - 1). Thus as m approaches M, σ²a approaches zero. While WAIS drew its sample so that m = M (at 100%), non-response makes the number of actual respondents (k) less than m. If there is no response bias, this will have the effect of making ( ). If we can increase k to near m = M, however, σ²a should fall to near zero as we desire. Thus we would like to include all response from the High Income Sample in the name group sample if it is appropriate. In addition, we would like to be able to use the Wisconsin Survey Research Laboratory's PSU weights for the High Income Sample if possible. Therefore let us investigate the appropriateness of including all high income response in the name group sample and including it in the PSU sample by investigating the differences between A = 1, 2, 3, A = 4, 7, 8 and A = 5 and 6.

We begin by considering the differences between PSU counties and non-PSU counties. For the self-representing PSU's, of course, there is no difference and it is not possible for anyone to be in the non-PSU category from those counties. The relevant cases, then, are among those who reside in counties in strata from which non-self-representing PSU's were drawn. Table 8 shows the number of respondents to the interview who reside outside PSU's by the county and stratum in which they actually reside. Notice that the 44 respondents residing outside the PSU counties resided in 23 counties. More important is the fact that PSU strata were devised so that each stratum was relatively homogeneous. There is no apparent a priori difference between a resident of Iowa county, for example, and a resident of Grant county. Therefore it seems reasonable to treat the persons who reside outside PSU's as if they resided in the PSU county and to give them the same weight one would give a resident of the PSU county. This procedure, then, allows us to use Sj in our weight for the High Income Sample. The question of how to define Wi remains. There is one problem in defining Wi which needs solution.
Table 8
Respondents to the Interview from Non-PSU Counties by County and Stratum of Residence

PSU stratum          County in the stratum in which     Number of respondents
                     WAIS respondents reside            in county
Grant                Iowa                               1
                     Lafayette                          1
                     Green                              2
Oconto               Door                               1
                     Marinette                          8
Eau Claire, Dodge    Fond du Lac                        [not legible]
                     Chippewa                           2
Sauk                 Monroe                             2
                     Vernon                             5
Walworth             Jefferson                          [not legible]
Washington           Columbia                           2
                     Ozaukee                            4
Wood                 Portage                            1
Trempealeau          Green Lake                         2
                     Jackson                            1
Polk                 Pierce                             1
                     St. Croix                          1
Clark                Barron                             2
                     Dunn                               1
Waupaca              Lincoln                            1
                     Shawano                            1
Price                Ashland                            1
                     Oneida                             1

[The source does not make clear which of the Eau Claire and Dodge strata contains Fond du Lac and which contains Chippewa.]

The A = 5(6) group is not in the Master File name groups. The possibility of persons existing in A = 5 and 6 arose because of the decision to keep the number of persons in each name group in the Master File approximately the same. Thus when Miller drew a person from the 1958 Madison Tax Roll whose last name was B**** and there were 757 B****s on the 1958 Madison Tax Roll, he limited the "B****" name group to persons who also had the same first initial as the person drawn, G, so that the name group identity was all persons in the state whose initial and last name was G. B****. The basic question, then, is "What is the difference between the G. B****s and persons in the supplementary sample named H. B**** or T. B****?" WAIS does not really know the answer. First initials are probably not equally likely to appear in the population and may not be independent of the last name (e.g., there may be many more G. B****s than there are G. W****s), but if the name chosen at random had been someone named H. B****, the entire "B****" name group would have been "H. B****" or possibly "H-I B****." Thus while the name group was G. B**** by chance, H. B**** or T. B**** probably had an equal or nearly equal chance of being included.
Therefore it does not seem any more unreasonable to consider the persons in A = 5(6) as members of A = 1, 2, 3 than it did to consider members of A = 6, 7, 8 as members of A = 1, 2, 3, 5. Still we do not wish to have a sample which is larger than the population for any grouping because this would force us to have a weight less than 1 for that group. This might be all right for some purposes, but it may lead to unrealistic results for others. Therefore, let us resolve to include the A = 5(6) group with the A = 1, 2, 3 (4, 7, 8) group unless this procedure results in weights less than 1.

Having made this resolution, let us consider the groupings we wish to make. All these interviews were drawn at 100 percent, but non-response may cause sample distributions to be significantly different from population distributions merely because ki < mi. It seems safer, then, to try to form some groupings of persons so that we can be sure (at least to some extent) that the High Income Sample is representative of income and wealth groupings in the population. There is no need to be concerned about the unrecognized matches because one cannot draw samples at a rate greater than one.

In order to be sure that High Income Sample distributions do reflect income and wealth groupings in the population, let us consider placing all interviews in total income and property income groupings. All interviews in the High Income Sample have a key G or H = 5, 6, 7, 8, 9 and a J value = 0-9 or an E value = 0-4. Table 9 indicates the way interviews in the sample are distributed over these values. The way to combine the A = 2, 7 group with the other interviews is to recognize that persons with J = 9 and E = 0 have almost no property income and that J = 0-8 and E = 1-4 indicate substantial amounts of property income; we can then combine them as in Table 10. The small number of interviews in G or H = 6, 7, 8-9 also makes combining them advisable.
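The resolution above (include the A = 5(6) cases unless a weight falls below 1) can be checked with a few lines of arithmetic. The following is only a sketch: it takes the four cell counts that Table 10 reports for the population and for the two versions of the sample, and computes weight = population / sample for each cell. The dictionary layout and the labels "none" and "some" (for J = 9 or E = 0, and for J = 0-8 or E = 1-4) are illustrative assumptions, not part of the original paper.

```python
# Cell counts as read from Table 10. Keys are
# (income class by key G or H, property-income class); the labels
# "none" / "some" are hypothetical shorthand for J = 9, E = 0 and
# J = 0-8, E = 1-4 respectively.
population = {("5", "none"): 138, ("6-9", "none"): 65,
              ("5", "some"): 113, ("6-9", "some"): 124}
sample_all = {("5", "none"): 99, ("6-9", "none"): 49,
              ("5", "some"): 97, ("6-9", "some"): 173}
sample_name_groups = {("5", "none"): 99, ("6-9", "none"): 32,
                      ("5", "some"): 97, ("6-9", "some"): 108}


def cell_weights(pop, samp):
    """Weight for each cell = population count / number of interviews."""
    return {cell: pop[cell] / samp[cell] for cell in pop}


w_all = cell_weights(population, sample_all)
w_name_groups = cell_weights(population, sample_name_groups)

# Cells in which including the A = 5(6) interviews would force a weight below 1.
below_one = [cell for cell, w in w_all.items() if w < 1]

for cell in population:
    print(cell, round(w_all[cell], 3), round(w_name_groups[cell], 3))
print("cells with weight < 1 when A = 5(6) is included:", below_one)
```

Under these counts only one cell drops below 1 when the A = 5(6) interviews are included, which is the situation the two-scheme proposal is meant to handle.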
Table 10 shows the population (non-sample cases) of these four income less property income categories and the sample including and excluding the A = 5(6) cases. Unless WAIS wants weights less than 1, this writer would suggest two weighting schemes. In the first, A = 5(6) would have a zero weight. This scheme would be used whenever an analyst wished distributions from the entire income-wealth spectrum. In the second, all members of the Low Income Sample would be given a zero weight. This scheme would be used whenever an analyst wished to isolate as many high income cases as he could and to analyze them separately. Having entered these two weights on cards, WAIS should be able to run distributions for the Respondent Report and for other analysis. Table 9 Interviews in the High Income Sample Income and Property income Indication A = 1, 3, 4, 5, 6, 8 A = 2, 7 dt 5 6 7 8-9 Sum 5 6 7 8-9 Sum J 0 30 29 49 8 116 0 2 -- 2 1 4 2 4 - 10 1 2 1 1 2 4 2 1 -- - -- 1 3 4 4 10 2 20 3 2 -- -- -~ 2 4 5 2 2 1 10 4 15 11 .3 1 30 5 2 2 4 - 8 S 20 11 3 1 35 6 7 2 4 3 16 7 11 4 3 18 8 18 8 10 1 37 9 95 17 29 3 144 Sum 176 71 116 20 383 Table 10 The High Income Population and Sample by 1959 or 1962 Income Category and by Property Income Category The Population (A = 1, 2, 3) Key G or H 5 6-9 Sum J .9,E = 0 138 65 203: 3 = 0-8,4 = 1-4 113 124 237 251 299 440 The Sample (Actual lnteruierws): (A =1,2,3,4,5,6,,7,8) 5 Key G or H Sum 6-9 :- 9.,E - 0 99 49 148 0-8,E = 1-4 97 173 270 196 222 418 M Actual interviews from the name groups (A = 1, 2, 3, 7, 8) 5 Key G or H Stan 6-9 9,E m 0 99 32 131 J.=:0-8,9 - 1-4 97 108 205 196 140 336 (A The Weights For the High Income Sample A = 5-6 Vie 1, 2,. 3,:_ 4, 7, 8) 5 Key G or H 6-9 Key G or H 5 0-9 J' = 9,E 0 138 .- 65 2.031 3 - 9,E = 0 0 0 99 1.394 32 J. - O-8, E = 1-4 124 3 - 0-8,E = 1-4 0 0 197 -'1,165 - 1.148 108 Summary of Computations I. Low Income Sample A. A = 1 1. From Table 2, 2. Compute 3. 3. 4. Ri , ai , mi for the seven groups. 
On a listing of the cover sheet cards from the intern ews only, record 1. L(1-a) / mi - Wi for each interview 2. Sj i (1-ai )\ Compute Gig m J (S3) = weight 1 and record it next to i each interview in A ~= 1 Record weight 2 = 0 13. A = 2 1. Compute Pi, Pi, ai, nii for the three groups 2. Record for each interview 3. (1-ai) m M Wi x 2. Sj Compute and record 4. Gij a Wi,Sj = Weight 1 Record weight 2 - 0 C. A 3 1. Compute -gi(1:~a iX-Pi for A - 2 (3 a m,9 for the single group 2. Record for each interview I Q1 (l+ai) 3 (x-~i) (1-a.) 1. '~ Wi 2. 6 1 3. Compute and record Gig W. Si = weight 1 2. 4. Record weight 2 , 0 II. High Income Sample A. Record Wi from Table 10 B. Record Sj C. Compute and record Gij = W i S j D. Record weight 2 = 1 weight 1 Card Format for Recording Weights Columns Number of Data Columns 1 1 "W" 2-9 8 WAIS Identification number 10-13 4 Interview number (0001-1300) 14 1 1 = Booklet returned 0 = Booklet not returned 15-23 9 Key 24 1 Key A designation from old key or zero 25-31 7 Weight 1 (3 places to right of decimal, 4 places to left) 32 1 Weight 2 33-80 48 0 = Low Income Sample 1 = High Income Sample Blank Appendix A The question of Non-Response Bias: New Thoughts on an Old Error This writer suggested in a paper written about a year ago (645-039, March 2, 1965) that there was a non-response bias among persons in the interview sample who had a known 1962 income because of evidence in the following table: Table 1 Response and Non-Response by Income Bracket1962 Income Class 100,, 000 $1 1000 5000 10,000 15,000 25,000 50,000 and 0 999 4999 9999 14 999 24 999 49,999 99,999 over E Responding 34 113 304 357 176 71 113 12 8 1188 Non-responding 15 91 196 151 73 37 64 13 3 643 (all reasons) E 49 204 500 508 249 108 177 25 11 183.1 A contingency test on this table showed a computed Chi-square of 25.924, significant at the .001 level. Therefore, I concluded that there was some kind of bias and proceeded to see if the bias was correlated with income. 
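The contingency test on Table 1 can be reproduced directly from the printed counts. The sketch below computes the ordinary Pearson chi-square statistic for the 2 x 9 table; the function name and data layout are illustrative assumptions, and no significance level is computed here. With the counts as printed, the statistic should land close to the value reported above.

```python
# Counts copied from Table 1: one entry per 1962 income class,
# from the lowest class to "100,000 and over".
responding = [34, 113, 304, 357, 176, 71, 113, 12, 8]
non_responding = [15, 91, 196, 151, 73, 37, 64, 13, 3]


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of response and income class.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


chi2, dof = pearson_chi_square([responding, non_responding])
print(f"chi-square = {chi2:.3f} with {dof} degrees of freedom")
```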
The following table shows the response and non-response percentages for each income bracket.

Table 2
Response and Non-Response Rates by Income Bracket (percent)

1962 Income Class             0   $1-   1,000-  5,000-  10,000-  15,000-  25,000-  50,000-  100,000     Σ
                                  999   4,999   9,999   14,999   24,999   49,999   99,999   and over
Response                     70    55      61      70       70       66       64       48         73    65
Non-response (all reasons)   30    45      39      30       30       34       36       52         27    35
Σ                           100   100     100     100      100      100      100      100        100   100

There were two major problems with the analysis in that paper. It is questionable whether one can trust Chi-square with such small cells, and every person with a 1962 income was included in the non-response even if he were drawn in error. The first problem, that of Chi-square, is not really important. The computed Chi-square is significant at the 1 percent level even when the small cells are combined.

The problem of which groups to include in the non-response is important. The Survey Laboratory divided non-response into eight categories, but last year's analysis ignored this knowledge. These eight categories were Ineligible Respondents (wives living with their husbands); Dead; Moved out of state; Moved, last known address within state; Moved, last address not ascertained (all these movers probably belong in one category); Refused; Incapacitated; and Away (for the duration of the study). Table 3 shows how the original sample of persons with a 1962 income divided into response and non-response categories.

Some of these persons in the non-response categories should not have been in the sample because they were actually outside the population. The ineligibles and the dead are clearly in this group.
The refusals, the incapacitated, and those who were on vacation clearly belong in the sample. One might quibble about the incapacitated because we determined not to interview someone else in their place, but they could file a 1964 tax return so they probably should be included in the sample. Movers represent a special case. Including them would be tantamount to assuming that those who remain in a given income or wealth category are like those who move. Excluding them adds another qualification to the population and sampling frame. The first is rather clearly not true and so the latter course seems preferable. Therefore, we exclude them from the sample. Table 4 shows the sample respondents and non-respondents with the non-sample cases (ineligibles, dead, moved) removed from the computation.

Table 3
Response-Non-Response by 1962 Income Category

1962 Income                   0   $1-   1,000-  5,000-  10,000-  15,000-  25,000-  50,000-  100,000      Σ
                                  999   4,999   9,999   14,999   24,999   49,999   99,999   or over
Number of persons
Response                     34   112     304     357      176       71      116       12          8   1190
Non-Response:
  Ineligible to respond       1    25      33       3        3        2       10        1         --     78
  Dead                        2     5       4      --        2       --        1       --         --     14
  Moved                       2    14      24      26       10        4        3        1         --     84
  Refused                    10    34     117     113       54       29       46       10          2    415
  Incapacitated              --     7      10      --        1        2        1        1         --     22
  Away for the duration
    of the study             --     6       8       9        4       --        4       --          1     32
Σ                            49   203     500     508      250      108      181       25         11   1835

Percent
Response                     70    55      61      70       70       66       64       48         73     65
Non-Response:
  Ineligible                  2    13       6       1        1        2        6        4         --      4
  Dead                        4     2       1      --        1       --        1       --         --      1
  Moved                       4     7       5       5        4        4        2        4         --      5
  Refused                    20    17      23      22       21       26       24       40         18     22
  Incapacitated              --     3       2      --        1        2        1        4         --      1
  Away for the duration
    of the study             --     3       2       2        2       --        2       --          9      2
Σ                           100   100     100     100      100      100      100      100        100    100

Table 4
Response-Non-Response by 1962 Income Category (Non-Sample Cases Removed)

1962 Income                   0   $1-   1,000-  5,000-  10,000-  15,000-  25,000-  50,000-  100,000      Σ
                                  999   4,999   9,999   14,999   24,999   49,999   99,999   or over
Number of persons
Response                     34   112     304     357      176       71      116       12          8   1190
Non-Response                 10    47     135     122       59       31       51       11          3    469
Σ                            44   159     439     479      235      102      167       23         11   1659

Percent
Response                     77    70      69      75       75       70       69       52         72     72
Non-Response                 23    30      31      25       25       30       31       48         28     28
Σ                           100   100     100     100      100      100      100      100        100    100

If we divide the income distribution into those with incomes less than $25,000 and those with incomes over $25,000, we get a response rate of 72.49 for the first group and 70.46 for the second. A t test of the difference with one degree of freedom is t = ( ). This is not significant at the .05 level in a one-tailed test. We conclude that non-response bias is not very great if it is present at all.

http://www.ssc.wisc.edu/wais/WAIS656043.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656043.txt

Some of the data in this document contains confidential information. The confidential information has been removed and preserved in the secure data archive (OLDR). Contact CDHA Data Archivist at cdhadata@ssc.wisc.edu for more information.

Gene Moyer
WSRL #237
WAIS 656-043
February 22, 1966

A Weighting Function for WAIS Interview

Formulation of a weighting function for the interview sample is not an easy task because of all the decisions WAIS had to make about unknown values
(some are still not known) while drawing the interview sample, because of the resulting errors which were made, and because of the complicated sampling frame. This paper is an attempt to enumerate the problems in weighting the sample and to suggest a weighting scheme which offers solutions to as many of the problems as possible.

1. General Considerations

1.1 The Population

Originally WAIS planned to draw the interview sample from its Master File, a sample of Wisconsin individuals who filed state income tax returns during the years 1947-1959. These persons had last names included in name groups chosen by a random process from the 1958 tax roll for the Madison district. "Persons in the Master File" was not a satisfactory population for two basic reasons.

Wisconsin tax returns are all filed by individuals whether these individuals are married or not. Therefore to have sampled individuals would have resulted in interviews from wives but not their husbands, and husbands but not their wives. WAIS was convinced that the proper unit to be sampled was the husband-wife unit and not the spouses individually.

A second reason for finding this population unsatisfactory was that the latest return these people had filed was for the 1959 income year, filed in 1960. This was four years before the interviewing was to be done. In that time many persons had died, had moved out of the state, or had moved to some different address within the state. Therefore, WAIS secured the tape record of the 1962 State Income Tax Roll and matched persons on that tax roll to persons with returns in the Master File. This gave addresses for early 1963, only one year before the interviewing was to be done. Because persons who filed only in the early years of the Master File were less likely to be available for interview, anyone who filed only 1947-1958 returns was deleted from the population.
The population from which the sample was chosen, then, was "husbands and single persons who filed a State income tax return in 1959 or 1962, and whose last names were in Master File name groups."

1.2 The Sampling Frame

Several considerations dictated the choice of the sampling frame. The skewed nature of the distribution of persons over the income range caused WAIS to draw two samples, a low income sample (persons with 1959 or 1962 incomes of under $10,000) and a high income sample (persons with a 1959 or 1962 income of $10,000 or over).

Cost considerations caused WAIS to give a zero probability of inclusion in the low income sample to persons outside the Primary Sampling Unit (PSU) counties in which the Wisconsin Survey Research Laboratory maintains trained interviewers. These PSU's are Brown, Clark, Dane, Dodge, Douglas, Eau Claire, Grant, Kenosha, Manitowoc-Calumet, Marathon, Milwaukee, Oconto, Outagamie, Polk, Price, Racine, Rock, Sauk, Sheboygan, Trempealeau, Waukesha, Walworth, Washington, Waupaca, Winnebago, and Wood counties.

A third consideration was that WAIS matched persons with returns in the Master File with persons on the 1962 state tax roll. Most persons in the alphabetic name groups matched, but a significant number in both groups did not. WAIS drew its sample from the residual groups as well as from the matched group. For the high income sample, WAIS also drew persons whose last name was the same as that of a name group, but whose first initial disqualified them from membership in the name groups; e.g., WAIS drew A. B****s for this supplementary high income group, while only G. B****s are members of the name groups.

Finally, Roger Miller devised a nine-digit "key" which divided the population into several groups of people with approximately the same characteristics. WAIS sampled randomly from some of these groups, at a 100 percent rate from others, and at a zero rate from the remainder.
Table 1 gives these keys and the possible values each may take. These keys were used as single nine-digit numbers. For example, one key might be

A B C D E F G H J
1 1 0 1 1 4 5 5 7

This configuration of keys means that this person was in the matched group who lived in PSU's (key A), was a non-farmer (key B), never reported dividends or capital gains (key C), filed one or two returns during the years 1954-1959 (key D), had an average "property income" of $201-$500 during the years 1954-1959 (key E), had an average gross income of $5,000-$9,999 during the years 1954-1959 (key F), had a 1959 gross income (key G) and a 1962 gross income (key H) of $10,000-$14,999, and had 71-80% of his 1962 tax liability withheld from his 1962 wages (key J). All persons with keys (or a subset of key digits) similar to this person's key were in a single group, and this group was sampled at some rate from zero to 100%. Graph 1 indicates the population of income tax filers divided according to key A.

1.3 The Matching Process

Because WAIS' Master File is a name cluster sample, there was no need to try to match it with the entire 27-reel state tax roll. Rather, the first step in the matching process was to form a file of persons on the state tax roll with last names the same as persons with returns in the Master File. There were 39,115 such persons. Let us call this the State File. This entire State File was matched with the Master File in two ways. In the first pass, two records were considered to be matched if the social security number, the last name, and the first initial were the same. In the second pass, two records were considered to be matched if the full name and the address were the same. These matching processes resulted in a Matched File, an Unmatched Master File, and an Unmatched State File. These files were further divided according to whether the person's last known address was in a PSU or not.
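The two-pass match described above can be sketched as follows. This is a minimal illustration, not the program WAIS actually ran: the record fields (`ssn`, `first`, `last`, `address`), the one-to-one pairing rule, and the sample records are all hypothetical assumptions introduced here.

```python
def match_on(master, state, key):
    """Pair each master record with one unused state record that has the same key.

    Records whose key is None (for example, a missing social security
    number) are never matched in this pass.
    """
    pool, state_left = {}, []
    for rec in state:
        k = key(rec)
        if k is None:
            state_left.append(rec)
        else:
            pool.setdefault(k, []).append(rec)
    pairs, master_left = [], []
    for rec in master:
        k = key(rec)
        candidates = pool.get(k) if k is not None else None
        if candidates:
            pairs.append((rec, candidates.pop(0)))
        else:
            master_left.append(rec)
    state_left.extend(r for group in pool.values() for r in group)
    return pairs, master_left, state_left


def pass1_key(rec):
    # First pass: social security number, last name, and first initial.
    if not rec["ssn"]:
        return None
    return (rec["ssn"], rec["last"].upper(), rec["first"][:1].upper())


def pass2_key(rec):
    # Second pass: full name and address.
    return (rec["first"].upper(), rec["last"].upper(), rec["address"].upper())


def two_pass_match(master, state):
    pairs1, master_left, state_left = match_on(master, state, pass1_key)
    pairs2, master_left, state_left = match_on(master_left, state_left, pass2_key)
    return pairs1 + pairs2, master_left, state_left


# Hypothetical records, for illustration only.
master_file = [
    {"ssn": "111", "first": "George", "last": "Brown", "address": "1 Main St"},
    {"ssn": None, "first": "Helen", "last": "Clark", "address": "2 Oak Ave"},
    {"ssn": "333", "first": "Tom", "last": "Dale", "address": "3 Elm Rd"},
]
state_file = [
    {"ssn": "111", "first": "G", "last": "Brown", "address": "9 New St"},
    {"ssn": "222", "first": "Helen", "last": "Clark", "address": "2 Oak Ave"},
    {"ssn": "444", "first": "Ann", "last": "Evans", "address": "4 Pine Ln"},
]

matched, unmatched_master, unmatched_state = two_pass_match(master_file, state_file)
print(len(matched), "matched;",
      [r["last"] for r in unmatched_master], "left in the master file;",
      [r["last"] for r in unmatched_state], "left in the state file")
```

The three returned lists correspond to the Matched File, the Unmatched Master File, and the Unmatched State File; a record that fails both passes only because of a clerical difference would end up as an unrecognized match of the kind discussed later in the paper.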
The Unmatched State File was also divided into those actually in the alphabetic name groups and those in the last name groups only. This resulted in the eight groups shown in Table 1 and Graph 1.

Table 1
Key Designations for WAIS Interview Sampling Frame

Key  Description                                                      Code

A    List from which the person's name was drawn
       Matched file in PSU's                                           1
       Unmatched Master File in PSU's                                  2
       Unmatched state file (tax roll) in name groups and in PSU's     3
       Unmatched state file in name groups and outside PSU's           4
       Unmatched state file in last name groups only and in PSU's      5
       Unmatched state file in last name groups only and not in PSU's  6
       Unmatched Master File not in PSU's                              7
       Matched file not in PSU's                                       8

B    Occupation (most recent classification available)
       Non-Farm                                                        1
       Farm                                                            0

C    Did taxpayer ever report dividend or capital gains income?
       Yes                                                             1
       No                                                              0

D    In how many of the years 1954-1959 did the taxpayer file returns?
       0                                                               0
       1-2                                                             1
       3-4                                                             2
       5-6                                                             3

E    How much was the taxpayer's average "property" income (total income
     less income from wages and salaries) for the years 1954-1959? (Yp)
       Yp < $200                                                       0
       $200 < Yp < $500                                                1
       $500 < Yp < $1000                                               2
       $1000 < Yp < $2000                                              3
       $2000 < Yp                                                      4

F    How much was the taxpayer's average gross income, 1954-1959? (Yq)
     (See the classes and codes at the bottom of the page)

G    How much was the taxpayer's gross income in 1959? (Y59)
     (See the classes and codes at the bottom of the page)

H    How much was the taxpayer's gross income in 1962? (Y62)
     (See the classes and codes at the bottom of the page)

J    What percent of tax paid was withheld from the taxpayer's 1962 income? (Tw%)
       Tw% < 10                                                        0
       10 < Tw% < 20                                                   1
       20 < Tw% < 30                                                   2
       30 < Tw% < 40                                                   3
       40 < Tw% < 50                                                   4
       50 < Tw% < 60                                                   5
       60 < Tw% < 70                                                   6
       70 < Tw% < 80                                                   7
       80 < Tw% < 90                                                   8
       90 < Tw%                                                        9

Class Bounds and Codes for F, G, and H.
Class Bounds             Code
Y < -1                   0
-1 < Y < +1              1
+1 < Y < 1,000           2
1,000 < Y < 5,000        3
5,000 < Y < 10,000       4
10,000 < Y < 15,000      5
15,000 < Y < 25,000      6
25,000 < Y < 50,000      7
50,000 < Y < 100,000     8
100,000 < Y              9

Graph 1
The Population and Sampling Frame of Persons who Filed a 1949-1950 or a 1962 State Income Tax Return

[Graph not reproduced. Panel labels: In Name Groups; In Last Name Groups Only; Outside Name Groups; Outside the State File. Recoverable box labels and counts follow.]

In PSU
  A = 1 (Original Matches): 4415 Men, 1499 Women, 5914
  Unrecognized Matches
  A = 2 Unmatched Master File: 4563 Men
  A = 3 Unmatched State File: 1079 Persons
  A = 5 Unmatched State File: 21,550

Out of PSU
  A = 8 (Original Matches): 1507 Men, 405 Women, 1912
  Unrecognized Matches
  A = 4 Unmatched State File: 649 Persons
  A = 7 Unmatched Master File: 2442 Men, 1893 Women, 4039
  A = 6 Unmatched State File: 7140

3108 of these men filed 1959 returns.

There were errors in these designations for several reasons. Probably the major reason was that WAIS' identification files were in poor order and incomplete. WAIS has since provided a better format for identification and has tried to make the file complete, but none of this had been done at the time the sample was drawn. Because of the errors in matching, some of the persons in A = 2 and A = 3 should have been in A = 1; some of the persons in A = 4 and 7 should have been in A = 8.

A second source of error lay in the program which divided the residual state file into A = 3, 4, 5, and 6. For some reason, in the name groups alphabetically after (C******, M-P), persons whose first initial was A-L were excluded from the A = 3 and A = 4 lists regardless of the name group to which they belonged. Thus many persons in the C****** - C******, D****** - D*******, and other name groups which include an entire last name group or several entire last name groups were included in A = 5 and A = 6 when they should have been in A = 3 and A = 4.
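As an illustration of how the nine-digit key and the class bounds for F, G, and H fit together, the following sketch decodes a key string and assigns an income code. The function names are hypothetical, and the treatment of values that fall exactly on a class bound (assigned to the higher class) is an assumption, since the table uses strict inequalities on both sides.

```python
from bisect import bisect_right

KEY_LETTERS = "ABCDEFGHJ"  # the nine key positions of Table 1, in order

# Lower bounds of income codes 1 through 9 for keys F, G, and H.
INCOME_BOUNDS = [-1, 1, 1_000, 5_000, 10_000, 15_000, 25_000, 50_000, 100_000]


def income_code(y):
    """Code 0-9 for a gross income y (boundary values go to the higher class)."""
    return bisect_right(INCOME_BOUNDS, y)


def decode_key(key):
    """Split a nine-digit key string into a dict of key letter -> code."""
    if len(key) != len(KEY_LETTERS) or not key.isdigit():
        raise ValueError("a key must be exactly nine digits")
    return {letter: int(digit) for letter, digit in zip(KEY_LETTERS, key)}


def is_high_income(fields):
    """High income sample: 1959 (G) or 1962 (H) income of $10,000 or over."""
    return max(fields["G"], fields["H"]) >= 5


example = decode_key("110114557")  # the example key given in section 1.2
print(example, "high income:", is_high_income(example))
print("code for an income of 12,000:", income_code(12_000))
```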
1.4 A Priori Sampling Rates Table 2 contains the sampling rates for the groups as WAIS thought they existed before the sample was drawn. This includes all the misspecifications and mismatches which have since been discovered except for some duplicates removed from the sample before the interviewing began.* Notice that the numbers of persons given for the population of each key value group is the population of men only for the Matched File and the unmatched Master File. The number of persons given for the unmatched State File include both men and women because there was no way of differentiating between them. The sampling for each key A group which was sampled at a positive rate less than 100 percent was done by giving each person a random number sorting the group on the random number; and selecting every n record with the specified configuration of keys where n is the reciprocal of the rate shown in Table 2. Notice that only a few rates were used. All of them were constructed so that their reciprocals (except the zero rates which are not shown) are whole numbers. -------------------------- * The a priori sample includes all (or almost all) these duplicates, but available records are not good enough to allow certainty. The a posteriori sample should include no duplicates. Population Sizes, Sampling Rates, Sample Size and Response to the Interview Subsample by Key Class Key Class Key Same A Priori iori Hand A Non Sample Response Response Response Response Cress ling Matches Posteriori Sample Size to Rate to - Rate Popula ,t Cases Less. Interview Booklet lo t_.,_ Non Sam a Sample Size Sample S all all 100 6 100, 6 100 ~, 50 50 4;0 5 8 5 62: 62 IN . 50 80, 78 10 75 :~ 75 M' DW' all . 100 2 100 100, 2 4. 100 l 100. 100 4 100 - 75 11. 50 X 100 , . _ . ?5 61,E 10 ~ ~ :. 3:. 13: 50 ~~~~ 100'. 5 00 1 0 all othe 39 . 100 38 39 #; 29 71 1 0 76`10 10 10 7 ~ 51 8 ' 88 - 88, 0 all 4 100 33 33 other 5-6 3-5 JIM 19 69 - 69 3-4 5-6 100: 7 0-4 0-4 26 90 3 ~ 50 all other 281. 
79 79 all 11 80 80 17 67 67 117 . ~ ~ 58` . 4 58 ' 10 10 6 5 46 4 9 7 78 MM _!W' ,~ 20 91 20 91 Key Class Key Samp A Priori Hand A Non Sample Response Response Response Response A B C D B F Class ling Matches Posteriori Sample Size to Rate to Rate Popula Rate Sample Cases Less Interview Booklet 7a tion Size Non S , Sample Size 1-4 3 3 20.1 251 5 1, - 5 5 _. 4 80 4 all other 32__L__- j 30 30 1 26 87 24 80 30 .1 1001 2 3 1.3 50 . 1 7 1 1 I. 5 f 83 1 5 1 33- 3 2 .48 1; 20 9 I_ - 9 9 - 9 6 67 6 67 ~31 I 3 403 2 8 4 12 1 11 9 82 9 82 3 4 547 2 11 4 15 15-1. 14 1 . 93, 1 '14 .1 93 4 3 55 20 11 - 11 ,~ 1_ 64 1 7 I 64 11 j 7 4 4 506 2 10 2 l2 - 12 8 67 8 67 al.11other 79 100 79 112 91 10 81 58 72 52 64 0 3 1 3 37. 25 10 - 11 1 ,~ 8 73 . 8 J 73 4 24 1 50 12 12 - 12 9 75 9 75 4 4 41 25 10 . ` 10 - 10 8 80 I , 8 . I 80 3 3 I 21" ! 50I 11 t 11 11.1 8 1 73 8. 1 73 3 4 14, 501 7 - { 7 - 1 5 1 71 I 4. 57 4 4 X20 50 10 10 10 10 100 f 9 90 3 3 3 31 50 15 1 16 16 1 11 69 I 11 I 69 4 3 3 33 50 17 - 17 16 10 62 L 10 1 62 0 3 all other 176 100 176 1 187 5 182 123 68 109 60 1 0-2 a 1 3 3 25 .. 1 0 25 1 26 1 25. 17 68 15 60 4 4 19 100 19 - 19 3 16 14 88 9 56 3 0 3 16 - 16 L 12 I 75 I 12 I 7 3 4 20 18 19 18 11 61 I 11 I 61 4 3 22 100 22 - 22 14 64 13 59 4 4 138'.- 33 - 33 27 82 25 76 1 13 3 27 100 / 27 15 56 14 52 4 4 50 50 24 24 17 15 62 2 4 3 3 105, 50 52 54 50 36 72 70 4 4 88 ~, 50~ 43 ~1 ;;4 44 1 43 32 74 28. 65 1 1 - a o t h - 441 100 441 51 492 472 323 68 246 52 subtotal A 1 I 25 121 15 6 1 0 A B C D E F G Key Samp A Priori Non Sample Response Response Response Response Class ling Matches Posteriori Sample Size to Rate to Rate Popula RateL Cases Less Interview Booklet % tion Non Sam 1e Sample Sample Size Size al1kL 2 2 - 0 all 0-4 0- 117 2 2 0 0 0 J 0 2 _, , 0 0--2 0-1.0-4 0-4 1466 1 21 - 21 12 6 50 6 50 fta ' 4 0-4 96 10 10 24 - 1 9 3 33 3 33 0 4 679 1 7 7 2 5 4 80 4. 80 0-1 0-4 2-3 0 4 0-4 66 10 5 - 80 4 80 4 0-4 0 4 31 20 83 4 67 1 0-3 0-4 0-4 296. 2 10 - 9 5 71 . 
0-4 0-4 20 26 - 6 19 = 68 11 58 ~. OW 0-4 0-4 6 17 71 9 53 1 all 0-4 5-9 100 3. 10. 9 60 rI, JM5-9 0-4 100 46 19 32 16 50 41 =.I .~'5-9 5-9 68 100 59 61, 18 44 b ota F 3108 165 102 62 83 50 4 13 54 7 54 50 66 42 * 19 6 :F 3 0-4 9 970 '~ 18 16 6 38 38 5-8 all 70. 100 70 -50 20 16 ~~WJ19 ub of 1673 -100 own 38. 34 100 4 4 ~ 25 7-9 7 6-9 6-9 N. 00 9 ~ 20 20 8 6-9 all 6-9 ~~ENEO 100 ONEN 36 26 81 66 Total 1-8 9375 - 2070 0 2070 208 1862 1290 69 1095 59 wives who had never filed _ +10 + 0 10 0 100 10 100 Total sample 2070 +10 2080 208 1872 1300 69 1105 59 so 1.5 Response Rates Table 2 also shows the response rates for the study by key class grouping. While it is obvious that the A = 3-8 groups have a lower total response rate than the A = 1-2, there is no particular evidence that this is correlated to some other key variable because two cells in the A = 3 category had response rates of 32 and 38 percent and yet were in the middle income brackets. In order to better ascertain the relationship between response and the key variables, we present Table 3 which gives response rates for each of the 9 key variables. Notice that the relatively low (and high) response rates occur when the number of persons in the class is small. The values for A = 6, 7, and 8, D = 0, G = 1, 8 and 9, H = 8 and 9, and J = 1 and 2 are examples. The A = 3 key value also has a low response rate, but this seems to be a function of our hand matching procedure. Table 3 shows the sample as it exists after the hand matching was completed. Eighty of the A = 1 respondents came from A 3; only 6 of the A 1 non-respondents came from A = 3. If these numbers are added to the appropriate numbers of Table 3, A = 3 should have 104 respondents and only 41 non-respondents for a 72 percent response rate. Another possible non-response bias may be among groups with a given total income, and a given amount of property income or percentage of tax withheld. 
Either of the latter two variables, of course, are proxies for wealth. To explore this possibility, Table 4 shows the response rate for each income and property income class. Table 4 shows considerable variation in response rates over keys H and J and keys E and G, but the F ratios are not significant at the .05 level. We conclude, then, that the non-response bias is insignificant if it exists at all. 2. The Weighting Model A simple model which at least embraces the basic qualities one expects from a weighting system is the following: Let -------------------- *We used interview data to help match respondents to their tax forms. No interview data existed for non-respondents, and the matching was unquestionably less complete for them. Table 3 Response Rates to the Interview by "Key" VariableAfter Removal of Non-Sample Cases) Number Key Number W M ROOM Responding -/0 % Responding Responding % not Total Responding Key Number Number not Responding A 1063 383 1446 74 26 G = 0 225 86 311 .72 2 102 37 139 73 27 1 1, 0 1 100 3 24 35 59 41 59 2 70 24 94' 74 1 3 4 25 75 3 292 120 412 7.1 66 41 107 62 38 4 374. 127 501 ' 75 5 16 5 21 76 24 5 123 34 157 78 . 3 4 25 75 6 69 23 92- 7 $ 26 6 32 81 19 7 29 39. 7A 15 1299 513 1812 72 28. 8 8 13. 62. 1109 408 1517 73 27 1 0 1 100 *0 83 21 104 80 .20 1192 429 1621 74 E 1192 429 1621 74 26 u-0 i - C 645 262 907 71 29 1 34 10 44 77 *0 547 167 714 77 23 2 112 47 159 70 E 1192 429 1621 73 27 3 304 135 439 69 =;O 18 2 26 90 10 4 357 322 479 74. :L. 1.15 34 149 77 23 5 176 59 235 75 2 146 43 189 77 .23 6 71 31 102 70 913 350 1263 72 28 7 116 51 167 69 .1192 429 160-. 73 27 8 12 11 23 52 E = 407 121 528 77 23 8 3 11 73 145 49 194 75 25 E 1190 469 1639 72 134 40 174 77 23 j 0 461 214 675 68 148 57 205 72 28 1 18 7 25 72 4 358 162 520 69 31 2 12 9 21 57 il 1192 429 1621 73 27 3. 
22 4 26 85 F 0 37 3 46 92 8 4 16 7 23 70 -0 -- - 5 18 6 24 75 94 34 123 73 27 6 31 13 44 70 3 417 160 57.7 72 28 7 50 19 69 72 4 300 133 523 74 26 8 93 26 119, 78 9 167 50 6 21 1 R 29 1190 484 6 72 B 1113 B 90 469 16531 7 30 12 a 7 3 10 70 30 9 Σ 1192 429 1621 73 27 28 0 26 29 25 22 25 26 38 0 26 23 30 31 26

*Not all persons had every key value. These sums include only those persons who had a non-blank value for the specific key in question.

Table 4
Response Rates for Income and Property Income Groups

Total squared error KEY G (1959 Income) 0-1 Key 0 .78 1-3 .72 (Average property income) 4 .60 Mean .70 .0168 F(2, . .F (4., 10) "= .35 L 6 of 7 cases .. 2 3 4 5-9 Mean Total Squared Error .78 .71 .78 -.85 .78 .0093 .72 .75 .78 .72 .73 .0032 1 .36 .65 .62 .77 .70 .0494 .79 JO a. .78 .74. .0074 ..'73 .0099 - .0051 11013-1 .0086 .0033 Z E y to OW" . MY 0 1 .73. 1 1-8 .501 (Percent of 1962 tax withheld) 9 Mean .61 Total squared error .0276 1962 An c.6 m 6 3 5-9 Mean Total Squared Error .66 .66 74 .70 o0097. .76 ..88 169 .71 .0759 .71 .77 .66 .69 .0158 .71 .77, .70 .70 .0171 .0050 .0242 .0032 .0002 11 of 2 cases 29 of 15 cases

$A$ = the total aggregate value of some variable in the population. For ease of exposition let us consider $A$ to be total income for a given year, although such specification is not really necessary.

$a$ = the total aggregate value of the same variable in a sample.

$A_i$ ($a_i$) = the value of $A$ ($a$) in the ith stratum into which persons with some given value of $A$ were divided for sampling ($i = 1, 2, 3, \ldots, n-1, n$).

$m_i / M_i$ = the rate at which the ith stratum was chosen, where $m_i$ is the number of persons drawn for the sample and $M_i$ is the number of persons in the population.

$W_i$ = some weight to be attached to values in the ith stratum.

$\bar{A}_i = A_i / M_i$ = per capita value of $A$ in the ith stratum.

$\bar{a}_i = a_i / m_i$

We abstract from the response rate problem by assuming a response rate of 100%. It is obvious that

(1) $A = \sum_i A_i = \sum_i M_i \bar{A}_i$

(2) $a = \sum_i a_i = \sum_i m_i \bar{a}_i$

Since the sample was drawn randomly from the population of each stratum, we know that in repeated sampling

(3) $E(\bar{a}_i) = \bar{A}_i$

by the central limit theorem. We want to construct $W_i$ so that

(4) $E\left(\sum_i W_i a_i\right) = A$

By substitution at the limit

(5) $\sum_i W_i m_i \bar{A}_i = \sum_i M_i \bar{A}_i$

For the ith category,

(6) $W_i m_i \bar{A}_i = M_i \bar{A}_i$

and

$W_i = M_i / m_i$

Notice that so long as there is no consistent bias, $W_i$ is the proper weight, at least asymptotically in repeated sampling. If we relax the assumption of 100% response, so that $k_i$ ($k_i < m_i$) persons responded, everything holds (so long as there is no group which consistently fails to respond) if $k_i$ is substituted for $m_i$ in the appropriate equations.

Having ascertained weights for the sample on the basis of some known variable, we assume that $E(\bar{a}_i) = \bar{A}_i$ when $A$ is some variable whose population parameters are unknown and that $W_i$ is also relevant to that variable.

3. Weighting the Low Income Sample

The model of section 2 indicates that there are two major questions we must ask of our sample values: "Are the mean values of some known variable in the population the same in each class as the mean in the sample, or is there some bias which keeps this from being true?" and "What are the proper values for $M_i$ and $m_i$?"

One reason for expecting a bias might be the errors in the division of the state tax rolls and Master File into A = 1, A = 2, A = 3 (only these had a non-zero probability of being chosen for the low income sample). Another reason might be a non-response bias, but we have already seen that there is little reason to expect one for the interview. We have not investigated the problem for the assets booklet as yet.

The errors in the specification of the lists of names from which the sample was drawn are somewhat more serious. For one thing, persons who should have matched but did not constitute a group sampled at a higher rate than persons with similar keys, because they had a probability of being drawn from the A = 2 list as well as from the A = 3 list.
They were still drawn randomly (independently) (or at 100%) from both lists, however, so provided that we know the a priori rates at which they were sampled, we can compute the actual rate. Let

P(drawing an unrecognized match from A = 2) = P2
P(drawing an unrecognized match from A = 3) = P3

Then

P(drawing an unrecognized match) = P2 + P3 - P2P3

A population can be divided into any set of strata the sampler wishes without inserting any bias into the data. Therefore let us consider that these unrecognized matches constitute a fourth stratum and consider each of the A values (A = 1, A = 2, A = 3) and the unrecognized matches as the four strata from which the low income sample was drawn.

The interviews from A = 1 were drawn at seven rates: .02, .05, .10, .20, .25, .50, and 1.0. To weight them at the reciprocal of the rate at which they were drawn, however, would be to weight them at Ni/mi (Ni > Mi). The difference (Ni - Mi) is the result of our including persons in the population (and the sample) who were not eligible to be drawn for the sample because they had died, moved out of the state, or were married women in 1964. WAIS knows nothing about the proportion of these persons in the population figures of Table 2, but it seems reasonable to assume that the proportion is the same in the population as in the sample. Therefore if

ai = the number of ineligibles in the ith sample group / the number of persons in the a priori sample of the ith group,

Mi = Ni (1 - ai)

The number of interviews in the ith class (ki) is in general less than mi and raises a problem in defining classes. If the ith class contains only persons drawn from a given key classification, the introduction of ki (and to a lesser extent the introduction of Mi) will lead to a proliferation of weights and so an unstable weighting system. Therefore let us divide A = 1 into seven classes, including in each class all persons drawn at the same rate.
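The two adjustments described above (the combined chance of drawing an unrecognized match, and the eligible population Mi = Ni (1 - ai)) can be illustrated with a minimal Python sketch. The function names and the numbers in the example are hypothetical and are not taken from Tables 2 or 5; the sketch assumes, as the text does, that the draws from the two lists are independent.

```python
def combined_draw_probability(p2: float, p3: float) -> float:
    """Chance that a person on both lists is drawn at least once.

    Assumes independent draws from the A = 2 and A = 3 lists,
    so that P = P2 + P3 - P2*P3.
    """
    for p in (p2, p3):
        if not 0.0 <= p <= 1.0:
            raise ValueError("sampling rates must lie between 0 and 1")
    return p2 + p3 - p2 * p3


def eligible_population(n_listed: float, ineligible_share: float) -> float:
    """M_i = N_i * (1 - a_i): listed population net of ineligible persons.

    ineligible_share is the share of ineligibles observed in the a priori
    sample, assumed (as in the text) to hold in the population as well.
    """
    if not 0.0 <= ineligible_share <= 1.0:
        raise ValueError("ineligible share must lie between 0 and 1")
    return n_listed * (1.0 - ineligible_share)


if __name__ == "__main__":
    # Hypothetical rates and counts, for illustration only.
    print(combined_draw_probability(0.10, 0.25))
    print(eligible_population(2000, 0.08))
```

With a rate of 1.0 on either list the combined probability is 1.0, which matches the remark that some cases were drawn "at 100%".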
This gives us seven groups of persons, each one of whom had an equal chance of being included in the sample, and each group will be large enough to give the stability a sound weighting system should have. Thus the weight for an interview in the ith of those seven groups would be

Wi = Ni (1 - ai)/ki

Notice that Ni (1 - ai) is only the number of persons who had a positive probability of being drawn. We shall consider the problem of the zero probability cases later in this section (pages 21-23).

The unrecognized matches present additional problems. There were 97 actual interviews in this group, which were drawn at the twelve rates given in Table 5. Even twelve groups are too many for a total of 97 interviews, of which only 39 are in the low income sample. Notice that each of these rates is almost the same as one of the rates given to some interviews in A = 1. Therefore this writer suggests that the unrecognized matches be included in the k of groups in A = 1 drawn at about the same rate. The problem with this schema is that there were no groups drawn at rates which approximate .0690, .3475, or .6000 very closely. Only eight interviews were involved, and it seems reasonable to give them weight equal to the reciprocal of the rate at which they were drawn.

The interviews which remain in A = 2 and A = 3 present a different problem. WAIS does not know the true size of the population of either of these groups because there are other matches in the populations. Only 68 low income interviews were taken with persons still in the A = 2 group. Nineteen low income interviews were taken with persons who remain in the A = 3 group. To divide the interviews in A = 2 into six groups according to the rates at which they were drawn does not seem reasonable because this would result in some extremely small groups.
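The class weight Wi = Ni (1 - ai)/ki given earlier in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the `RateClass` structure, the field names and the example counts are hypothetical, and none of the figures come from the WAIS tables.

```python
from dataclasses import dataclass


@dataclass
class RateClass:
    """One group of A = 1 persons who were all drawn at the same rate."""
    rate: float              # a priori sampling rate shared by the class
    n_listed: int            # N_i, persons on the list (eligible or not)
    ineligible_share: float  # a_i, share of ineligibles in the a priori sample
    interviews: int          # k_i, completed interviews


def class_weight(c: RateClass) -> float:
    """W_i = N_i * (1 - a_i) / k_i."""
    if c.interviews <= 0:
        raise ValueError("a class needs at least one interview to be weighted")
    return c.n_listed * (1.0 - c.ineligible_share) / c.interviews


# Hypothetical classes, for illustration only.
EXAMPLE_CLASSES = [
    RateClass(rate=0.02, n_listed=5000, ineligible_share=0.10, interviews=80),
    RateClass(rate=0.50, n_listed=200, ineligible_share=0.05, interviews=76),
]

if __name__ == "__main__":
    for c in EXAMPLE_CLASSES:
        w = class_weight(c)
        # The weighted interviews add back up to the eligible population M_i.
        print(c.rate, w, w * c.interviews)
```

In the first hypothetical class the weight exceeds neither more nor less than what the formula implies: it differs from the simple reciprocal of the rate (50) because it nets out ineligibles and compensates for interviews not obtained.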
Table 6 shows the rates at which A = 2 interviews in the low income sample were drawn and the number of

James Geffert. 1966. Logical Construction of the 400 Character Master File. January 7, 1966. WAIS paper 656-037. Master File- Tax Records.

James Geffert
WAIS 656-037
January 7, 1966

Logical Construction of the 400 Character Master File*

I. Overall construction of long forms B387 = 1, 3, 5

  B36, B45, . . . , B126   Income Sources (added)
  B135                     TOTAL INCOME
- B144                     Auto or Business Expense
  B153                     Adjusted Gross Income
- B162                     Standard Deduction Allowed
  B171                     Net Taxable Income, Standard Deduction Basis

The above computation is performed for all years 1956-1960. Prior to 1956 auto or business expense is an itemized deduction and thus for 1946-1955:

B153 = B135 - B144
B162 = 9% (B135) or $450
B171 = B135 - B162

If deductions are itemized:

  B180   Wisconsin Tax Paid
  B189   Union Dues
  B198   Medical Dental Expenses
  B207   Total Interest Paid
- B216   Business Interest Paid
  B225
  . . .
  B252   Forest Crop Land
  B261   Total Deductions before Federal Tax and Donations

If Medical Dental Expense is the deductible portion, B389 will contain a D. If Medical Dental Expense is the total amount paid, B389 will contain a T.

-----------
*All labels are standard 400 character Master labels defined in 656-036 and earlier WAIS documents.
-----------

Medical Dental limits
1946-1952   $50 - $500
1953-1960   $75 - $1,500

Prior to 1956, B144 auto or business expense is included in B261.

B270 = B135 - B261   (1946-1955)
B270 = B153 - B261   (1956-1960)

  B270   Net Income before Federal Tax and Donations
- B279   Federal Tax and Social Security Deduction
  B288   Net Income before Donations

If B279 is the deductible portion of Federal Tax and Social Security, B390 will contain a D. If B279 is the total amount of Federal Tax and Social Security paid, B390 will contain a T. The maximum deductible = 3% (B270) in all years.
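The standard-deduction chain described above (B135, B144, B153, B162, B171) can be sketched in Python. Two points are assumptions and not statements from the paper: the phrase "9% (B135) or $450" is read here as the smaller of the two amounts, and for 1956-1960 the 9% is applied to B153 (the paper gives the 9% base only for 1946-1955). The function name is hypothetical.

```python
def standard_basis(total_income: float, auto_or_business_expense: float,
                   year: int, rate: float = 0.09, cap: float = 450.0) -> dict:
    """Return the B-labelled amounts for the standard deduction basis."""
    if not 1946 <= year <= 1960:
        raise ValueError("the paper covers tax years 1946-1960 only")

    b135 = total_income
    b144 = auto_or_business_expense
    b153 = b135 - b144                      # Adjusted Gross Income

    # 1946-1955: the deduction is taken against Total Income (B135).
    # 1956-1960: taken against Adjusted Gross Income (B153).
    base = b135 if year <= 1955 else b153

    # ASSUMPTION: "9% ... or $450" means whichever amount is smaller,
    # and (for 1956-1960) that the 9% applies to B153.
    b162 = min(rate * base, cap)
    b171 = base - b162                      # Net Taxable Income, standard basis

    return {"B135": b135, "B144": b144, "B153": b153, "B162": b162, "B171": b171}


if __name__ == "__main__":
    # Hypothetical returns, for illustration only.
    print(standard_basis(4000.0, 200.0, 1950))
    print(standard_basis(6000.0, 500.0, 1958))
```

The year switch mirrors the paper's own consistency rule, which compares B171 with B135 - B162 for 1946-1955 and with B153 - B162 for 1956-1960.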
  B288   Net Income before Donations
- B297   Donations
  B306   Net Taxable Income, Itemized Basis

If B297 is the deductible portion of Donations, B391 will contain a D. If B297 is the total donation exceeding 10% of B288, B391 will contain a T. The amount of Donations which can be deducted cannot exceed 10% (B288) in all years.

II. Construction of Short Forms B387 = 2, 4

Short forms are constructed in the same way as long forms except that deductions cannot be itemized. Thus nothing should be expected in B180 through B306 on short forms.

III. Other Forms B387 = 6

This is a special type of form which contains no source or deduction items but does have Taxable Income in B378 (by whatever basis) and exemption and tax liability figures.

Inconsistency Indications

a) Addition of Sources of Income: If the sum of B36 through B126 does not equal B135, the size of discrepancy is indicated in B392.
b) Subtract auto: If B135 - B144 does not equal B153, the size of discrepancy is indicated in B393.
c) Standard Deduction: If (1946-1955) B135 - B162 does not equal B171, the size of discrepancy is indicated in B394. If (1956-1960) B153 - B162 does not equal B171, the size of discrepancy is indicated in B394.
d) First Phase Deduction: If B180 thru B252 (as explained under (I) above) does not equal B261, the size of discrepancy is indicated in B395 and B389 is blank.
e) Net Income before Federal Tax and Donations: If (1946-1955) B135 - B261 does not equal B270, the size of discrepancy is indicated in B396. If (1956-1960) B153 - B261 does not equal B270, the size of discrepancy is indicated in B396.
f) Net Income before Donations: If B270 - B279 does not equal B288, the size of discrepancy is indicated in B397 and B390 is blank.
g) Net Income Itemized Basis: If B288 - B297 does not equal B306, the size of discrepancy is indicated in B398 and B391 is blank.

d, f and g above imply that neither the total nor the deductible portion interpretation yields the proper result.

IV.
Discrepancy Scale

Code   Amount
0      $0.00 - $4.99
1      5.00 - 20.00
2      20.01 - 40.00
3      40.01 - 60.00
4      60.01 - 80.00
5      80.01 - 100.00
6      100.01 - 200.00
7      200.01 - 300.00
8      300.01 - 400.00
9      400.01 - or over

http://www.ssc.wisc.edu/wais/WAIS656037.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656037.txt

James Geffert. 1966. Notes on Portfolio Construction. January 21, 1966. WAIS paper 656-039. Property File.

Jim Geffert
WAIS Paper 656-039
January 21, 1966

Notes on Portfolio Construction

Intangible Assets

The problem before us is that of determining the value of taxpayer portfolios at points in time from information about certain transactions and receipts of income from financial property. In general the procedure for determining total money value of a given holding at a point in time involves the use of several items of information:

1. Income information by source
   (a) Interest - yearly
   (b) Dividend - yearly
   (c) Capital Gain
2. Dates of acquisition and sale if asset is sold.
3. Values of asset on acquisition and sale dates if asset is sold.

Note that items 1c, 2 and 3 are not present unless the asset is sold. Note also that 1a and 1b will not be present unless the asset pays a dividend or interest. There are thus two critical questions which must be answered: "Did this issue pay dividends or interest?" and "Was this issue sold?"

If the asset is not sold at some time during our sampled years and does not pay interest or dividends, we cannot know of its existence from tax form information. If the asset is not sold at some time during our sampled years but does pay interest or dividends, we can determine a probable value for the asset. If the asset is sold but does not pay interest or dividends, we have a good estimate of the value of the asset sold; however, we don't know if more than the amount sold is held. If the asset is sold and pays dividends or interest, we have enough information to closely approximate the value of this asset in the portfolio.
Assuming that one knows the rate of interest or the dividend per share, an estimate of the value of the holding can be made by solving one of the following equations:

V = (Dr / Dps) P   for dividend paying stock
V = Ir / I         for interest paying items

where V is the dollar value of the asset, Dr the reported dividend received, Dps the yearly dividend per share, P the price of the stock, Ir the interest reported, and I the yearly rate of interest. Note that this is only an estimate, which may be more or less accurate depending on the actual transactions occurring in the taxpayer's portfolio. In order to get some "feel" for the reliability of this estimate let us examine several cases.

The cases we will investigate assume a semi annual dividend or interest payment rather than a yearly payment. We will also assume that the price P, the dividend per share Dps and the interest rate I do not vary over time. These assumptions are made for the sake of clarity of presentation of the central issue, that of estimating value without complete transaction information.

Case 1. The simplest and most obvious case we shall encounter is that of the holding which is acquired (or is being held) at the beginning of the year and held without purchases or sales over time and which pays dividends or interest. In this case we know Dr or Ir, the reported income, we know Dps, P, and I from other data, and we solve for V.

(graphic)

Case 1a. Let us modify case 1 as follows. Instead of acquiring the asset at the beginning of the year, our taxpayer acquires the asset at some time during the year such that he receives only the second semi annual dividend for the first year. He continues to hold the asset through time. In this case, instead of observing Dr we are actually observing 1/2 Dr. If we use the observed dividend reported in our formula we shall undervalue the taxpayer's holding by 50% in the latter part of the first year and impute to him a fictitious holding in the first part of the first year.
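The two valuation formulas and the 50% undervaluation of Case 1a can be checked with a short Python sketch. The function names and all figures (100 shares, a $2.00 yearly dividend per share, a $50 price, a 4.5% interest rate) are hypothetical values chosen for illustration; as in the text, P, Dps and I are assumed constant.

```python
def value_from_dividends(reported_dividend: float,
                         dividend_per_share: float,
                         price: float) -> float:
    """V = (Dr / Dps) * P: implied number of shares times price."""
    if dividend_per_share <= 0:
        raise ValueError("the issue must pay a positive dividend per share")
    return reported_dividend / dividend_per_share * price


def value_from_interest(reported_interest: float, yearly_rate: float) -> float:
    """V = Ir / I."""
    if yearly_rate <= 0:
        raise ValueError("the yearly rate of interest must be positive")
    return reported_interest / yearly_rate


if __name__ == "__main__":
    shares, dps, price = 100, 2.00, 50.0        # hypothetical holding
    true_value = shares * price

    # Case 1: held all year, both semi annual dividends are reported.
    full_year = value_from_dividends(shares * dps, dps, price)

    # Case 1a: acquired mid-year, only the second semi annual dividend
    # is reported, so the observed dividend is half of Dr.
    half_year = value_from_dividends(shares * dps / 2, dps, price)

    print(true_value, full_year, half_year)
    print(value_from_interest(45.0, 0.045))
```

In the Case 1a line the estimate comes out at half the true year-end holding, which is the undervaluation the text describes.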
(graphic)

Case 2. In this case we will assume that the taxpayer acquires the asset (or has it already) and adds to his holding by acquiring more and selling none. If he is courteous enough to acquire each new lot at the first of each year, our estimate of his total holding is good when we use the dividend he reports as Dr.

(graphic)

Case 2a. If the taxpayer in case 2 acquires additional lots such that he does not receive the first dividend payment for the additional lot in the year of acquisition, we have a situation similar to case 1a. We observe a dividend which is composed of the total dividend paid by his former holding plus the semi annual portion of his new holding. We thus overvalue his holding in the first part of the year of acquisition and undervalue his holding at the latter part of that year.

(graphic)

Case 3. In this case, we assume that the taxpayer acquires an amount of stock and later sells the same amount, making (we think) a gain or a loss on the capital transaction. We know (if he did not hold any of the issue prior to his purchase of this lot) his exact holding through time because the taxpayer must report dates of purchase and sale and value of his holding at the two dates.

(graphic)

Case 3a. Here we modify case 3 to allow the taxpayer to acquire an amount of stock and later sell some, but not all, of the total acquired. In this case, if the issue pays dividends, we know that that which he sells he bought at time t.
The amount sold could have paid a dividend Ds: Dr - Ds is the dividend from the additional holding which was not sold.

http://www.ssc.wisc.edu/wais/WAIS656039.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656039.txt

Richard Bauman. 1966. Preliminary Report: Cohort Changes in the Level of Incomes Around Retirement Ages. January 31, 1966. WAIS paper 656-040. Master File- Tax Records.

Dick Bauman
WAIS Paper 656-040
January 31, 1966

Preliminary Report: Cohort Changes in the Level of Incomes Around Retirement Ages

This paper is an exploration of the Wisconsin Assets and Incomes Studies data with particular attention to its usefulness for studying changes in retirement-age incomes. This analysis is restricted to the use of grouped data for individuals in lieu of tax or family units and/or micro unit time series. While these alternative groupings are no doubt illuminating, the grouped individual analysis is a convenient starting point since it affords a basis for comparison with other studies.

I.

Table I shows WAIS age groups (numbers of individuals) as a percentage of the corresponding Wisconsin population age group.* Besides age, individuals are classified according to sex, year of filing, and income. For selected years, there is a further breakdown by marital status.

WAIS age groups and Wisconsin population age groups are not strictly comparable. The WAIS data is from a Wisconsin State Income Tax sample; therefore inclusion in the sample depends upon whether or not an individual files a state tax form in the year under consideration. Let Xit be the probability that individual i files a return in year t + 1 for year t.

Xit = X (It, Lt-1, Rt, At+1, Dt, Ft, Wt)

where
It = Income in year t
Lt-1 = Whether or not filed return in year t-1

--------------------
* Table I is not reproduced herein. It is used as an input to Table II which is summarized on page 2.
--------------------

Rt = Resident of state in year t
At+1 = Whether or not i alive in year t + 1
Dt = Definition of taxable income in year t
Ft = Tax Forms for year t
Wt = Non-resident in Wisconsin income in year t

Let Ck,t = the number in the kth class in Wisconsin in year t, where k is an age, sex class, and Wj,k,t = the number of WAIS individuals in the j,kth class in the tth record year, where j is an income class. The ratios in Table I are ( ) and since Wj,k,t = ( ) and ( ) only, the ratios are only approximations of the true proportions of sampled individuals in the j,k,t class. ( ) was chosen instead of ( ) primarily because the filing deadline corresponds closely with the enumeration period. Moreover, estate income tax returns are excluded from the WAIS sample, so Wj,k,t and Ck,t+1 are more comparable. Are non-resident returns included?

Previous filing history, migration, non-resident returns, and variations in tax forms all contribute to incomparability of the two figures. Their net effect is probably quite small, especially when comparing year-to-year changes in the ratios (with the possible exception of the change to "joint" forms in 1959). Adjustments for the effects of Dt, involving included and excluded items of income, will be considered below.

SUMMARY OF TABLE II
(ESTIMATED WISCONSIN TAXPAYER AGE GROUPS AS PERCENT OF CENSUS AGE GROUPS)
(Not corrected for missing age data)

AGE              MALES      ALL MALE   FEMALES    ALL FEMALE
GROUP     YR     AGI $1000  FILERS     AGI $1000  FILERS
38-47     47     61.3       62.7       17.4       [missing]
48-57     47     49.0       51.1       10.1       12.8
58-67     47     37.3       39.2       5.4        6.8
68-80     47     11.1       [illegible] .9        1.2
39-48     48     64.0       65.4       15.2       20.1
49-58     48     52.1       53.4       11.2       14.6
59-68     48     38.8       40.6       6.6        8.1
69-81     48     11.6       12.6       1.0        1.2
20-29     49     50.8       54.0       17.7       22.3
30-39     49     60.2       62.9       11.9       16.3
40-49     49     61.6       64.5       15.3       19.5
50-59     49     51.6       54.3       11.1       13.0
60-69     49     36.4       33.6       5.7        7.7
70-82     49     10.1       11.5       1.0        1.0
41-50     50     66.3       68.6       15.7       21.3
51-60     50     53.4       56.0       12.5       15.2
61-70     50     36.2       38.4       6.8        8.4
71-83     50     10.2       11.9       .8         1.2
42-51     51     69.6       71.1       17.4       22.
52-61     51     55.4       57.8       12.5       17.2
62-71     51     38.8       40.6       6.8        8.1
72-84     51     11.7       12.6       .2         1.5
43-52     52     72.2       74.2       19.7       25.5
53-62     52     58.2       60.4       14.2       18.1
63-72     52     39.2       41.7       8.3        9.2
73-85     52     10.7       12.9       .9         1.3
44-53     53     69.8       73.0       18.6       25.0
54-63     53     56.0       60.4       13.9       18.6
64-73     53     35.1       39.6       5.7        7.9
74-86     53     10.3       13.4       .9         1.3
45-54     54     67.6       71.6       19.0       24.9
55-64     54     54.8       61.6       14.6       18.6
65-74     54     35.1       38.4       5.4        7.5
75-87     54     11.6       13.4       1.0        1.4
36-45     55     66.7       69.3       16.5       24.6
46-55     55     67.6       71.5       20.5       25.9
56-65     55     54.3       59.1       14.3       19.0
66-75     55     29.9       36.8       4.8        6.8
37-46     56     68.4       71.1       18.2       26.6
47-56     56     68.1       72.0       20.5       26.1
57-66     56     55.3       60.7       13.3       18.7
67-76     56     29.5       37.0       5.2        7.4
38-47     57     70.9       74.0       21.4       29.8
48-57     57     70.8       74.4       23.0       29.1
58-67     57     56.1       62.2       15.1       20.5
68-77     57     28.8       36.5       5.8        7.4
39-48     58     70.9       74.0       21.9       30.4
49-58     58     71.7       75.1       23.5       29.0
59-68     58     53.8       60.6       14.7       20.2
69-78     58     26.1       34.4       5.3        7.7
30-39     59     70.4       73.5       16.5       25.7
40-49     59     72.0       75.4       27.7       37.1
50-59     59     70.3       74.8       24.1       31.0
60-69     59     51.6       60.7       15.3       20.6
70-79     59     24.1       34.6       5.3        7.5
80-92     59     10.1       15.9       1.9        2.7

SUMMARY OF TABLE IIA
(ESTIMATED WISCONSIN TAXPAYER AGE GROUPS AS PERCENT OF CENSUS AGE GROUPS)
(Corrected for missing age data)

AGE              MALES      ALL MALE   FEMALES    ALL FEMALE
GROUP     YR     AGI $1000  FILERS     AGI $1000  FILERS
38-47     47     78.4       81.4       20.3       27.6
48-57     47     62.7       65.8       15.2       20.3
58-67     47     47.7       50.5       8.1        10.8
68-80     47     14.2       14.9       1.5        1.9
39-48     48     81.6       84.3       23.1       30.0
49-58     48     65.2       67.4       17.1       21.8
59-68     48     48.5       51.2       9.5        12.1
69-81     48     14.6       15.9       1.6        1.8
20-29     49     62.8       67.3       24.2       31.5
30-39     49     74.7       78.0       16.1       23.2
40-49     49     76.2       80.4       21.0       27.7
50-59     49     63.7       67.0       15.2       19.7
60-69     49     44.9       48.1       7.8        10.9
70-82     49     12.6       14.3       1.5        1.5
41-50     50     81.3       84.5       21.7       30.0
51-60     50     65.4       69.0       17.3       21.5
61-70     50     44.3       47.3       9.4        11.9
71-83     50     12.5       14.6       1.1        1.7
42-51     51     84.8       87.0       23.9       31.3
52-61     51     67.5       70.7       17.4       23.7
62-71     51     47.3       49.1       9.3        11.4
72-84     51     14.2       15.5       1.7        2.2
43-52     52     87.5       90.2       26.7       35.5
53-62     52     70.5       73.5       19.2       25.1
63-72     52     47.5       50.7       11.1       12.8
73-85     52     13.1       15.7       1.3        1.8
44-53     53     84.5       88.8       25.5       35.5
54-63     53     67.9       73.5       19.1       26.3
64-73     53     42.6       48.2       7.8        11.2
74-86     53     12.7       16.3       1.3        1.9
45-54     54     81.4       86.8       26.0       35.0
55-64     54     66.0       74.3       20.0       26.1
65-74     54     41.4       46.6       7.4        10.5
75-87     54     14.0       16.0       1.4        2.0
36-45     55     79.6       83.3       22.3       34.1
46-55     55     80.9       85.6       27.6       35.9
56-65     55     64.9       71.1       19.3       26.3
66-75     55     35.8       44.3       6.4        9.5
37-46     56     81.7       85.4       24.6       36.6
47-56     56     81.3       86.2       27.5       36.0
57-66     56     66.0       72.7       17.9       25.8
67-76     56     35.2       42.5       6.9        10.2
38-47     57     84.4       88.4       27.9       40.5
48-57     57     84.2       88.9       30.0       39.5
58-67     57     66.8       74.3       19.2       28.1
68-77     57     34.2       43.6       7.5        10.1
39-48     58     84.5       90.1       28.9       41.4
49-58     58     85.4       89.9       31.0       39.5
59-68     58     64.1       72.6       19.2       27.6
69-78     58     31.1       41.2       7.0        10.5
30-39     59     83.9       88.2       22.3       36.3
40-49     59     85.9       90.5       36.3       40.8
50-59     59     83.8       89.8       32.6       43.3
60-69     59     61.4       72.9       20.6       29.1
70-79     59     28.7       41.6       7.1        10.4
80-92     59     11.7       19.1       2.7        3.9

Charts I and II (not
included) are based on the ratios in Table I for males and females with Wisconsin Adjusted Gross Income greater than or equal to $1000.** Including those persons with AGI less than $1000 would result in a slight decline in the differences between age groups as well as a slight upward shift for all points. Most of the non-filing of tax returns occurs at low levels of incomes; therefore exclusion of persons with AGI less than $1000 does not significantly affect the filing percentages.

Table II contains estimates of Wisconsin taxpayer groups expressed as a percentage of corresponding census groups. Entries in Table II are equal to the entries in Table I multiplied by the reciprocal of the WAIS sampling rate (.00775). Thus Charts I and II apply to Table II with a simple transformation of the vertical axes.

In Chart I, the several sets of lines describe various aspects of age differences in incomes. The lines labelled "B" show the effect upon filing rates of aging a particular group by one year. The year-to-year fluctuations are most likely due to individual fluctuations and sampling variability. Some trend effects are present; the most obvious is the peak for 1952 for all groups except the oldest. The lines labelled "A" connect cross-sectional differences for the selected years '47, '49, '55, and '59. The "C" lines show the different filing rates which occur when a particular age group is aged 10 years rather than one. A comparison between one set of "A" and "C" lines is shown as the shaded areas on the Charts.

-----------------
** Copies of these graphs are available for anyone who would like to see them.
-----------------

Since incomes rose during the years 1947-1959, and the state tax filing requirements remained relatively the same, this is the sort of comparison one would expect.
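The conversion from a Table I entry to a Table II entry stated above (multiply by the reciprocal of the sampling rate, .00775) can be sketched in Python. The function name is hypothetical. The two input values used in the example, .617 and .396, are the WAIS/Census ratios printed for males in 1959 in Table III later in this paper, where the same scaling by 1/s is applied.

```python
SAMPLING_RATE = 0.00775  # WAIS sampling rate given in the text


def taxpayers_as_percent_of_census(wais_percent_of_census: float,
                                   sampling_rate: float = SAMPLING_RATE) -> float:
    """Scale a sample-based percentage up by 1/s to estimate all taxpayers."""
    if not 0.0 < sampling_rate <= 1.0:
        raise ValueError("sampling rate must be in (0, 1]")
    return wais_percent_of_census / sampling_rate


if __name__ == "__main__":
    # WAIS males of any income, 1959: .617 percent of the Census count.
    print(round(taxpayers_as_percent_of_census(0.617), 1))
    # WAIS males with $1-$999 or loss, 1959: .396 percent.
    print(round(taxpayers_as_percent_of_census(0.396), 1))
```

Rounded to one decimal, the two results agree with the 79.6 and 51.1 printed in column 4 of Table III.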
In addition to those shown in the summaries of Tables I and II, the percentages for the following groups were calculated: (age) 2-19, 1949; 76-88, 1955; 77-89, 1956; 28-37, 1957; 78-90, 1957; 29-38, 1958; 79-91, 1958; 12-29, 1959. Total percentages were also calculated for each age group and year. For the years 1949, 1957, 1958, and 1959, comparable percentages were calculated for a sex-marital status breakdown.

Table IIA is based on a simple correction of the data for the absence of age data on certain persons. The distribution of persons without age data is assumed to be identical to the distribution of persons with age data in the WAIS sample for each year. This procedure probably increases the differences in age-specific filing rates over the true rates, whereas an alternative approach (assuming non-age persons are distributed the same as the census distribution) would decrease the differences in age-specific filing rates. Therefore the 90.5% filing rate for males age 40-49 is probably overstated, and the 28.7% rate for males ages 70-79 in the same year is probably understated.

II. Treatment of Persons for Whom No Age Data Is Currently Available

The above statistics exclude all persons for whom WAIS currently*** does not have age data. An indication of the extent of WAIS's age data is in the following table:

---------------------
*** This excludes additional age data from the "Supplementary 805 Data". An indication of the magnitude of the improvement from this source is that the change in persons in the entire sample with age data will be from 74% to 86%.
---------------------

PERCENTAGE OF WAIS PERSONS WITH AGE DATA BY YEAR, SEX

Year   Total   Male   Female
1947   73.3    77.6   63.0
1948   75.9    79.2   67.1
1949   77.3    80.3   69.0
1950   78.4    81.2   70.8
1951   78.7    81.7   70.9
1952   79.2    82.3   71.9
1953   78.8    82.2   70.5
1954   79.2    82.5   71.2
1955   79.9    83.2   72.2
1956   80.1    83.5   72.6
1957   80.5    83.7   73.9
1958   80.3    83.5   73.4
1959   79.1    83.4   70.7

Exclusion of such a large number of persons undoubtedly has an effect upon any calculations involving age groups. Whether the effect is a proportional one or something else is difficult to estimate. Some persons simply do not have Social Security account numbers. This is dependent upon age as well as occupational status. Other persons have numbers but did not volunteer them on the tax forms. Another group put incorrect account numbers (consistently) on their tax forms.

WAIS is now involved in supplementing its age data with additional social security data and birth and death records. As soon as this job is done, the tables of filing percentages by age groups can be redone, and the resulting age differences should be more meaningful.

III. The Effect of Non-Filing of Tax Returns

Each entry in Table II (if it is adjusted for the absence of some age data), subtracted from one hundred percent, will yield a percentage which represents the extent of non-filing for a particular group. For example, 24.6% of the males age 40-49 in 1960 in Wisconsin did not file state tax returns for 1959 (or did file, but did not have any age data in WAIS files). Non-filing is clearly concentrated at the low and non-income levels; however, there are other reasons described above. If we ignore reasons for filing other than the income level, it seems reasonable to form a simple hypothesis that filing percentages approach 100% asymptotically with increasing income.**** Table III and Chart III were constructed to get some idea of the shape of the non-filing-income relationship.
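The non-filing arithmetic in section III (one hundred percent minus the Table II entry) is simple enough to check in a few lines of Python. The function name is hypothetical. The input 75.4 is the "all male filers" entry for ages 40-49 in 1959 from the Summary of Table II; it appears to be the figure behind the 24.6% example in the text, although the text does not say so explicitly.

```python
def non_filing_percent(filing_percent: float) -> float:
    """Percent of a census group with no matching return: 100 - filing rate."""
    if not 0.0 <= filing_percent <= 100.0:
        # Entries above 100 occur in Table III for very small cells; they
        # cannot be read as filing rates, so they are rejected here.
        raise ValueError("a filing percentage must lie between 0 and 100")
    return 100.0 - filing_percent


if __name__ == "__main__":
    # All male filers, ages 40-49, 1959 (Summary of Table II, uncorrected).
    print(round(non_filing_percent(75.4), 1))
```

As the text notes, this residual mixes true non-filers with filers who lack age data in the WAIS files, so it is an upper bound on non-filing for the uncorrected entries.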
In Table III, columns 1 and 5 show the numbers of persons in the stated income classes, columns 2 and 6 show the percentage of the Census number of persons in the same income class, columns 3 and 7 show the cumulative percent of persons with incomes greater than or equal to the lower boundary for each class, and columns 4 and 8 show the estimated percentage of Wisconsin taxpayers in corresponding Census income classifications (s = sampling rate).

Definitional differences between income for Wisconsin income tax purposes and income included in the Census enumerations are discussed by Gene Moyer in "An Estimate of Untaxed Wisconsin Income in 1959 and of Non-Filing Individuals in 1962", WAIS 645-035, March 10, 1965. The largest quantitative

-----------------------------
**** This doesn't take account of the fact that evasion is more valuable as income level increases. This is imprecisely offset by the gains to be made by the State in catching high income evaders.
-----------------------------

are not taxable in Wisconsin but are included in the Census income definition. The treatment of capital gains accounts for some individual differences, although aggregate capital gains amount to much less than five percent of Adjusted Gross Income in Wisconsin. Capital gains are included in taxable income in Wisconsin but are excluded from the Census definition.***** OASDI benefits and other transfers are generally paid to persons with low incomes from other sources, and capital gains incomes are generally concentrated in the upper income brackets. The net effect of these differences upon the hypothesized relationship can be expected, therefore, to increase differences in filing rates for income classes, i.e., even lower rates should be found for low income persons, and higher rates for higher income persons.
Chart III illustrates the relationship between income class and filing rates shown in Columns 4 and 8 of Table III. The extreme scatter of points for women and at the extremes of the income distribution is due to small numbers of persons in these categories. The expected asymptotic properties do not consistently appear. The estimated filing rates for men in 1959 actually drop consistently after reaching a peak in the $4000-4999 income class. The large number of points in the 80-90% filing range indicates a non-filing rate for higher income persons of about 15%, which is surprisingly high.

----------------------
***** For example, in 1959, aggregate capital gains in the WAIS sample accounts for only 2.3% of the sample aggregate AGI.
----------------------

TABLE III
NUMBERS OF WAIS PERSONS IN SELECTED INCOME CLASSES AND RATIOS TO NUMBERS OF CENSUS PERSONS IN CORRESPONDING INCOME CLASSES BY SEX, YEAR (EXPRESSED AS PERCENTAGES)

Columns 1-4 are for MALES and columns 5-8 for FEMALES:
(1), (5) No.
(2), (6) WAIS/Census
(3), (7) (Cum.) WAIS/Census
(4), (8) T(Y): Col. 2 x 1/s (males), Col. 6 x 1/s (females)

Income            Yr.   (1)    (2)    (3)    (4)     (5)    (6)    (7)    (8)
Any               59    7617   .617   xxxx   79.6    3789   .498   xxxx   64.2
$1-$999 or loss   59    673    .396   xxxx   51.1    1172   .342   xxxx   44.1
1000-1999         59    799    .583   .653   75.2    790    .565   .627   72.9
2000-2999         59    739    .643   .663   82.9    747    .685   .658   88.3
3000-3999         59    841    .660   .666   85.1    523    .633   .641   81.6
4000-4999         59    1196   .715   .667   92.2    282    .611   .648   78.8
5000-5999         59    1235   .679   .651   87.6    132    .648   .692   83.6
6000-6999         59    850    .664   .635   85.6    62     .718   .737   92.6
7000-9999         59    871    .627   .618   80.9    41     .551   .753   71.1
10000 or more     59    413    .600   xxxx   77.4    40     1.204  xxxx   155.3
Any               49    5811   .546   xxxx   70.4    2084   .398   xxxx   51.3
$1-999 or loss    49    362    .174   xxxx   22.4    523    .193   xxxx   24.9
1000-1999         49    883    .490   .637   63.2    842    .619   .619   79.8
2000-2999         49    1637   .663   .676   85.5    518    .629   .619   81.1
3000-3999         49    1610   .681   .683   87.8    130    .614   .594   79.2
4000-4999         49    693    .710   .685   91.6    30     .519   .560   66.9
5000-5999         49    254    .618   .659   79.7    10     .394   .595   51.3
6000-6999         49    112    .644   .690   83.1    8      .656   .715   84.6
7000-9999         49    127    .668   .711   86.2    15     1.079  .736   139.2
10000 or more     49    133    .758   xxxx   97.8    8      .461   xxxx   59.5

(CHART III: Income - Census and WAIS AGI. Series: Males 1959, Females 1959, Total 1959, Males 1949, Females 1949, Total 1949.)

http://www.ssc.wisc.edu/wais/WAIS656040.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656040.txt

Richard Bauman. 1966. Thesis Proposal - The Extent and Determinants of Temporal Variations in Taxable Incomes. March 10, 1966. WAIS paper 656-046a. Proposals- For Analyses, Theses, etc.

Richard Bauman
WAIS 656-046a
March 10, 1966

Thesis Proposal - The Extent and Determinants of Temporal Variations in Taxable Incomes

Proposal Outline

1.0 Background
    1.1 Usefulness of studies of income variability
    1.2 Review of sources of major empirical studies
2.0 Relationships Between an Income Fluctuation Model and Other Models of Individual Incomes
    2.1 Income determination models
    2.2 Income dynamics models
    2.3 Lifetime income models
    2.4 Tax averaging models
3.0 Effects of Varying the Time Period
    3.1 General
    3.2 Application to tax sample data
4.0 Choosing an Appropriate Dependent Variable
    4.1 Fluctuations vs. trends
    4.2 Total income vs. components
    4.3 Individuals vs. units
    4.4 Effects of stratification
5.0 Suggested Bivariate Analyses of Income Fluctuations
6.0 A Multivariate Model for Investigating Income Fluctuations
7.0 Critique

1.0 BACKGROUND

1.1 Usefulness of Studies of Income Variability

Studies of the temporal variability of individual incomes have been limited by the unavailability, cost, and cumbersomeness of large bodies of data over long periods of time. WAIS has eliminated the first of these limitations and now possesses a sample which enables us to observe income streams for individuals covering up to fifteen years.

Why should we want to investigate income variability? In general we would like to answer some of the following questions:

(a) What type of person experiences fluctuating income? Are fluctuating incomes more prevalent in some groups than others?

(b) What is the extent of fluctuation, i.e., how much do incomes fluctuate in general and for groups distinguished in (a)?

It is important to point out at the outset the rather obvious fact that the use of tax data to investigate those questions only gives a partial answer. Variations which allow taxpayers to file returns intermittently, as well as fluctuations in incomes that are never taxable, cannot be estimated without recourse to inferences of non-taxable incomes. Nevertheless the writer feels that a study of the variability of taxable incomes is not without merit. Tax equity considerations alone are important reasons for studying income fluctuations. Giving empirical content to questions (a) and (b) should permit informed evaluation of whether corrective measures (some form of tax averaging) are necessary and effective. This of course does not resolve the question of whether other forms of relief are needed.
The dangers inherent in using a tax sample may be illustrated by the following example: Suppose we conclude that self employed persons are more susceptible to income variation than unskilled workers over, say, a three year consecutive filing period. This conclusion is valid in general only if all of those persons in each group filed tax returns in each year. The comparison is invalidated not only by the presumably different filing rate for the two groups but also by complications such as an additional incentive for the self employed person to file a non-taxable return because of loss carryover provisions. The only valid conclusion, then, is that the difference in income variability is one due to occupational characteristics conditional upon consecutive filing.

(c) What is the relation between average size of fluctuation and average income? There is some evidence [Kravis, p. 255] that there is a direct relation between proportional deviations and average income. This conclusion is subject to qualifications and is based on a relatively small sample.

(d) Do the determinants (or correlates) of fluctuating incomes have an additive effect? A multivariate analysis using some index of income fluctuations as a dependent variable and incorporating suitable interaction variables will enable estimation of the magnitude of nonadditivity.

(e) Are measured fluctuations really fluctuations or just trends? Discussion of this question is rather meaningless without a specific formulation of the measurement technique. A cursory treatment involves the following specific questions. (1) What portion of income fluctuations can be related to general changes in income level? (2) What portion of income fluctuations can be attributed to cohort changes relative to general economic condition? (3) Do incomes tend to stabilize with maturity?

(f) Are income fluctuations more prevalent in certain components of income, such as capital gains?

(g) What is the long run trend of income variability?
Are incomes more or less variable now than ten years ago? Why?

(h) Can an appropriate measure of permanent income be found on the basis of individual average income over a period of years?

(i) To what extent does measured income inequality decrease due to changes in income and extension of the measurement period?

(j) What implications for consumption-savings behavior are there from evidence of differential income variability? Do units with varying degrees of income uncertainty adjust their behavior accordingly? See, for example, the demand study by Lee [ ].

The above survey produces an ambitious agenda for empirical studies of income variability. A less ambitious proposal follows. An evaluation of the writer's proposal in terms of these global goals is in the last section of this paper.

1.2 Other Empirical Work

Two different approaches have been taken to studies of income variation. The first is essentially bivariate analysis of survey panel data. Survey of Consumer Finances panel data have been analysed in various Federal Reserve Bulletins [Kravis, pp. 279-281], by Katona and Fisher [ ], by Bristol [ ], and most recently by Morgan [ ]. Typically, the analyses apply to income changes over two or three periods. Bureau of Labor Statistics data on income change have been compared with the earlier SCF studies by Kravis [ ]. Kravis also presents some original findings on income variability drawn from Market Research Corporation of America data, with four observations over a five year period.

A second approach involves correlation analyses of incomes in adjacent and nearby years. This approach has been applied to very broad groupings of individuals and to quite homogeneous groups. A summary of the work of Friedman and Kuznets [ ], Mendershausen, Hanna [ ], and Reid (SCF data) appears in Friedman [pp. 187-89]. A recent application of this technique appears in Huang and Myers [ ].
Friedman outlines a more sophisticated approach which separates income variance into permanent, quasi-permanent and transitory components.

2.0 Relationships Between an Income Fluctuations Model and Other Models of Individual Incomes

The general form of a model of income fluctuation (sections 3 and 4 cover these relationships in somewhat more detail) can be written:

$V_{it} = V_{it}(T_{it}, Y_{it}, P_{it}, F_{it}, U_{it}, D_{1it}, \ldots, D_{kit}, \ldots, M_{1t}, \ldots, M_{ht}, \ldots)$

where
$V_{it}$ = the variation of income (total or selected sources) of individual i over period t
$T_{it}$ = individual trend in income
$Y_{it}$ = individual income level
$P_{it}$ = income "mix" by source - the structure or pattern of income
$F_{it}$ = income level of other persons in i's unit
$U_{it}$ = variation of income of others in i's unit
$D_{kit}$ = a "demographic" characteristic
$M_{ht}$ = a macroeconomic characteristic of period t

For simplicity, these can be grouped into three categories which we shall call
Income Earning characteristics - $T_{it}, Y_{it}, P_{it}, F_{it}, U_{it}$
Demographic characteristics - $D_{1it}, \ldots, D_{kit}, \ldots$
Macroeconomic characteristics - $M_{1t}, \ldots, M_{ht}, \ldots$

http://www.ssc.wisc.edu/wais/WAIS656046a.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656046a.txt

Mike VonSchneidemesser 1966
Report on the Matching of the FFID File with Master, SSA Form 805 and State Roll Extract
June 16, 1966
WAIS paper 656-062
Social Security Earnings Data- 805 Fixed Format Identification File (FFID) Master File- Tax Records

M. von Schneidemesser
WAIS Paper 656-062
June 16, 1966

Report on the Matching of the FFID File with Master, SSA Form 805 and State Roll Extract

At the writing of this report, all changes and updates on the FFID file have been made.
The main programs involved in this matching sequence are:

JR Job 9 - UPDATEFID see WAIS 656-019
JR Program 4 - SELECT805
MVS - FFKEY see Program Catalog
MVS - SSRFF

The number of matches and nonmatches are as follows:

File        Total #     Not matching with
            of Rcds     Master    FFID       FORM 805   Rollextract
Master      20,150      -         49
FFID        21,523      191       -          1,163      11,514
FORM 805    17,001
Rollext.    44,170                34,348+

This number should be zero when the necessary changes on the Master have been made. There are 11 individuals who have to be added, 25 individuals who have to be dropped and 15 individuals whose ID# has to be changed. Almost all are Benefit FFIDs which are not yet on the Benefit file. Also 8 individuals could not be entered on the Master (doomage returns, etc.). We decided to drop these 8 cases. In addition there are 2,900 individuals who do not have a SS# and thus no FORM 805 data either.

+These high residuals are due to the fact that they could not be matched on SS#, last name, and first initial.

The number of matches between FFID (less the 191 nonmatches) and Rollextract records is 9,822. By hand, approximately 250 additional individuals could be matched.

Some 42 FORM 805 cases could not be matched even after intensive hand checking using income figures. Some may be people who actually do not belong in our sample. There are also a considerable number of duplicate FORM 805 cases appearing as nonmatches. This is due to the fact that we got these data in two separate deliveries.

For references to these files and listings of nonmatches, see the file catalog.

http://www.ssc.wisc.edu/wais/WAIS656062.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656062.txt

Mike VonSchneidemesser 1966
The WAIS Master File Maintenance System
August 8, 1966
WAIS paper 667-003
Maintenance System - Files, Data, Etc. Master File- Tax Records

**This document could not be translated to basic text.
Please view the PDF file.**
**Terms and topics from paper, listed for searching purposes**
The WAIS Master File Maintenance System entry codes fixed field updating system file merge program duplicate records Program MA Update Cobol Card-edit program inter-card edit checks inconsistent records incomplete records Alter C-cards "The Layout of MA UPDATE input cards" "Master File Layout and Entry Codes" "Duplicate entry code, last used" "No coded data - rcd accepted" "No amount fields - rcd accepted" "No indicators - rcd accepted" ID # J-card Negative amounts Field Consistency 1410 Leading Zeros M 312 M 314 M 703 M 708 M 711 - M720 Considerations for Conversion for an Expanded Master File FD Master File Layout and Entry Codes Position Old Label New Label Entry Code for MA UPDATE Item 1 ID # Year of Return Date (coded) Largest Wage Second Wage Total other wages total interest received dividends Rent Gain or Loss assets Profit or loss business Income from Trustees Partnership other income auto or business expense adjusted gross income standard deduction allowed net taxable income Wisconsin tax paid Union dues paid Medical-Dental expenses Total interest paid Business interest paid dividend deductible other deductions alimony paid forest crop land N-cards I-cards

http://www.ssc.wisc.edu/wais/WAIS667003.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667003.txt

Mike VonSchneidemesser 1966
A Proposal for Documenting at the Macro-Level the Programs and Files of WAIS
April 19, 1966
WAIS paper 656-054
Data Processing

M. Von Schneidemesser
WAIS paper 656-054
April 19, 1966

A Proposal for Documenting at the Macro-Level the Programs and Files of WAIS

As a response to Max Ellis' memo of February 22, 1966, I suggest a simple method of getting WAIS started towards a more systematic way of keeping track of its programs and files.
Max Ellis has already given some reasons for the need for documentation, and in anticipation of the great turnover of personnel which WAIS will experience in the near future this seems to be an urgent matter.

There are three areas which need documentation:
(A) The work and information files which are the basis of all WAIS work.
(B) The programs having to do with these files.
(C) Files and programs which have been created from and with (A) and (B) for specific research projects.

The problem with (A) and (B) is that we have so far no inventory system of the files and programs. For (C) a system of classification or identification seems useful which would associate these outputs with their sources under (A) and (B).

It seems to me that WAIS cannot hope to install an overall system of documentation which is completely integrated from the variable name to the final output table; our operations are too much of a nonroutine character to make this possible. But we should agree on a few basic procedures and rules and then strictly implement these. Various steps have been taken already in this direction, but they were either too limited -- like Jim Geffert's erasable tape use chart -- or were simply forgotten -- like Gene Moyer's KWIC indexing system (WAIS 656-002, 1st revision).

Area A: Documentation of Files

SSRI is already valiantly striving to keep track of its tapes. We can support their effort by filling out their tape logging sheets, making a xerox copy of each for our own use, and filing this together with a filled out form WAIS 000-001 (appended to this paper) in a "File Reference Manual," which has to be set up. This way we will have the technical and data descriptions accessible to everyone. But the "File Reference Manual" should also contain descriptions of card and printed or written files, which also can be logged on form WAIS 000-001.
Files of this kind are for example: multiple ID #, multiple SS account #, Benefit data, the Death data file, coversheet cards, etc. Such a file would be of invaluable help whenever a new problem has to be tackled.

The erasable tape use chart installed by Jim Geffert could be used to keep track of tapes which do not contain permanent files, i.e., all tapes available for output operations. When an intermediate product has been put on a tape the operator should enter this information in the tape use chart, and erase it again as soon as the tape becomes available again. The tape's number should be taken off the chart completely when a permanent file has been written on it.

Area B: Documentation of Programs

Here documentation is most urgent. We already have all sorts of descriptions for various programs and two file maintenance systems. That, however, is not sufficient for someone who wants to find out whether a program suitable for a certain purpose already exists. Therefore, analogous to the "File Reference Manual," a "Program Reference Manual" should be established. For every program, already at the time of its inception, the programmer should fill out a form WAIS 000-002 (appended to this paper) and file it in that manual.

This "Program Reference Manual" should be organized along the files we have. With each file usually three types of programs will be associated:
(a) file generation programs.
(b) updating and maintenance programs.
(c) utility programs, like print, select, match and other programs.
Since programs of type (a) and (c) are often connected with more than one file, xerox copies of the form WAIS 000-002 should be filed under each file where applicable.

Area C: Identification Codes for Programs, Files and Outputs Thereof

Such an identification system is needed to allow cross-referencing between files and programs. Also such a code tells on inspection which program created the file and what kind of a file we are dealing with.
The suggested procedure is completely compatible with the IBM KWIC system which is being used by SSRI, and which has been proposed by Moyer (WAIS 656-002) for our output tables. It uses the ideas presented by Fisher in his monograph "Data Documentation and Decision Tables" (Comm. of the ACM, Jan. '66).

The suggested identification code is:

pos. 1    "W". This indicates WAIS.

pos. 2-4  Program name. Either alphabetic or numeric. Special codes for pos. 2 could indicate the type of the program. Example:
          "G" for a file generation program
          "M" for update or maintenance programs
          "U" for utility print and select programs
          etc.

pos. 5-6  Name or number of output. If the output is a permanent file, use an alphabetic code like "MA" for master file, "FF" for identification file, etc. All non-permanent outputs, like extraction tapes, tables, prints, should be designated by a numeric code. Even numbers could designate on-line outputs, while odd numbers indicate off-line outputs. Various other conventions could be introduced. A different number in pos. 6 only could indicate the same output in a different sequence, i.e., 5 = sorted on ID #, 7 = sorted on SS #, etc. We may want to decide that position 5 is always an alphabetic code for every tape or punched card file. Then we could use the file designation code for the first two positions of every variable name or label of that file, which will be of importance when we start to program in Cobol, where a standard data division is of great help.

pos. 7-9  Blank, or designation of book number or other filing information, if needed.

Programs then will be completely identified with positions 1-4 only. Any file will be identified by positions 1-6; that also tells the program which created the output or file. For files which were not set up by a program, a special code for positions 2-4 may be designated. Maybe "OOR" or simply blank.

This is just one possible way of labeling our products systematically.
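The position layout proposed above can be illustrated with a short sketch. This is only one possible reading of the proposal, not a program WAIS used: the function name `parse_wais_code`, the `WaisCode` record, and the example codes are hypothetical, and the even/odd rule for on-line and off-line outputs is applied only because the text suggests it as a possible convention.

```python
from dataclasses import dataclass
from typing import Optional

# Program-type letters suggested for position 2 in the proposal.
PROGRAM_TYPES = {
    "G": "file generation",
    "M": "update or maintenance",
    "U": "utility",
}


@dataclass
class WaisCode:
    program: str             # positions 2-4
    program_type: str        # derived from position 2
    output: str              # positions 5-6, empty for a bare program code
    permanent_file: bool     # alphabetic output code means a permanent file
    on_line: Optional[bool]  # even numeric output = on-line, odd = off-line
    filing: str              # positions 7-9, optional filing information


def parse_wais_code(code: str) -> WaisCode:
    """Split a hypothetical identification code into its positional parts."""
    if len(code) < 4 or len(code) > 9 or code[0] != "W":
        raise ValueError("expected 4 to 9 characters starting with 'W'")
    padded = code.ljust(9)
    program = padded[1:4]
    output = padded[4:6].strip()
    filing = padded[6:9].strip()
    # Numeric output codes are non-permanent; parity is the suggested convention.
    on_line = int(output) % 2 == 0 if output.isdigit() else None
    return WaisCode(
        program=program,
        program_type=PROGRAM_TYPES.get(program[0], "unspecified"),
        output=output,
        permanent_file=output.isalpha(),
        on_line=on_line,
        filing=filing,
    )
```

For instance, under these assumptions `parse_wais_code("WM01MA")` would describe a maintenance program `M01` writing the permanent master file `MA`, while `parse_wais_code("WU0205")` would describe an off-line, non-permanent output of utility program `U02`.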
Especially one may want more than just positions 5-6 for the file name. But then we should check if this is still compatible with KWIC and if we really need this compatibility in every case. Also one may want to make the files the basis of classification and not the programs, since our files are less numerous than our programs, and of a more permanent nature. But in any case, it seems essential that, if we agree on such an identification system, we stick to it. Otherwise cross-referencing will be far more difficult and it will be impossible to set up cross-reference tables and decision tables as aids to programming -- if they should become desirable with a growing number of files and programs.

It would be useful if the identification code is always used as the first entry in the job card and page headings. It also could be used in title cards or subroutine names.

WAIS 000-001
File Description Sheet

Information Type                                            Entries
(1) File name code system
(2) Type of File
(3) Sorted on
(4) Changes, updates
(5) Format described
(6) Other relevant papers
(7) Location of file
(8) Labels of boxes, tape numbers, color of cards etc.
(9) Short description
(10) Comments, damages, missing items, etc.
(11) Date generated

WAIS 000-002
Program Description Sheet

Information Type                                            Entries
(1) Program name (code)
(2) Job or Header card entries
(3) Title or Subroutine name if different from (1)
(4) Author and date checked out
(5) Inputs (a) code + description (b) (c)
(6) Outputs (a) code + description (b) (c)
(7) WAIS papers in which described
(8) Other papers (layouts of outputs, etc.)
(9) Physical form of program
(10) Where to find
(11) Labeling of deck or calling information, color of cards
(12) Short description
Comments

http://www.ssc.wisc.edu/wais/WAIS656054.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656054.txt

James Geffert 1966
Completion Estimates
March 15, 1966
WAIS paper 656-046
Administration

James Geffert
WAIS 656-046
March 15, 1966
March 23, 1966

Completion Estimates

1.
Weighting Survey and Report to Respond
   (a) Theory Paper - March 11
   (b) Coding and Punching - March 25 (includes program)
   (c) Wistabs - April 15 (includes control cards, running tables)
2. Verification of Age
   (a) Death Records - March 11
   (b) Motor Vehicle
       1. fixing input - March 18
       2. lookup at Motor Vehicle Department - March 25
       Information back - March 25
   (c) Coding and Punching - April 1
   Age will be included in files just prior to EXT 01
3. Treasury Tables
   (a) Revision and Testing - March 18
   (b) Production Finished - April 8
4. History File
   (a) Pre-edit Program checkout
   (b) Control Cards
   begin History file - April 1
5. Roll Extract
   (a) Complete - March 11
6. Social Security Tables
   (a) FFID Update and checkout complete - March 25
   (b) Run Jons programs - April 1
   (c) Additional age data, EXT 01 and EXT 02 - April 8
   (d) Wistabs - checkout and production - April 29

http://www.ssc.wisc.edu/wais/WAIS656046.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656046.txt

James Geffert 1966
History File Job Plan (General)
April 21, 1966
WAIS paper 656-057
History File

James Geffert
WAIS paper 656-057
April 21, 1966

HISTORY FILE JOB PLAN (GENERAL)

1. Purpose of this job will be to create a file which will be a compilation of information from the basic WAIS files.

2.1 The job will progress in stages using as input:
   a) The WAIS master file
   b) The WAIS ID and social security file
   c) The WAIS property income file
   d) The WAIS survey file
   e) SSA benefit data file
   f) Additional age data (cards)
   g) Two ID and two SS# cases (cards)

2.2 The operation in general will be to extract certain information for each person from the several files and merge it into one file. Extensive use will be made of George Loniello's pre-edit program. Since this is a multiphase job the descriptions and flow charts which follow are by phase number.
Phase 1, Step 1

Input - 400 character master file
   SSRI 423 Reel 1 of 4
   SSRI 259 Reel 2 of 4
   SSRI 277 Reel 3 of 4
   SSRI 129 Reel 4 of 4

File description (input): 10 x 400, RM in 400, 556 c.p.i., BCD. 134,411 recs, record description attached.

Record description (output of Pre-edit)
Pos. 1      Level number 1
Pos. 2      Record number 1
Pos. 3-10   WAIS ID
Pos. 11     Record for 1946 (1 = yes, no error; (check) = yes, error; 0 = no)
Pos. 12     Interest for 1946 (1 yes, 0 no)
Pos. 13     Dividend for 1946
Pos. 14     Capital gain 1946
Pos. 15     Rent 1946
Pos. 16     Business 1946 (1 yes, 0 no)
Pos. 17     Record for 1947
Pos. 18     Interest for 1947
Pos. 19     Dividend for 1947
Pos. 20     Capital Gain 1947
Pos. 21     Rent 1947
Pos. 22     Business 1947
Pos. 23     Record for 1948
Pos. 24     Interest 1948
Pos. 25     Dividend 1948
Pos. 26     Capital Gain 1948
Pos. 27     Rent 1948
Pos. 28     Business 1948
Pos. 29-34  like information 1949
Pos. 35-40  like information 1950
Pos. 41-46  like information 1951
Pos. 47-52  like information 1952
Pos. 53-58  like information 1953
Pos. 59-64  like information 1954
Pos. 65-70  like information 1955
Pos. 71-76  like information 1956

Pos. 1      Level number 1
Pos. 2      Record number 2
Pos. 3-10   WAIS ID#
Pos. 11     Record for 1957
Pos. 12     Interest for 1957
Pos. 13     Dividend for 1957
Pos. 14     Capital Gain for 1957
Pos. 15     Rent 1957
Pos. 16     Business 1957
Pos. 17-22  like information 1958
Pos. 23-28  like information 1959
Pos. 29-34  like information 1960

Step 2
Input - Records created in step 1 above
Output - The two records merged into 1 record with three additional fields:
   a) last two digits of first year record appears,
   b) last two digits of last year record appears,
   c) total number of records for this person.

FLOW OF PHASE 1
STEP 1: 400 Character Master -> Pre-Edit Program -> Condensed 2 record output
STEP 2: Merge and count -> Merged 1 record output of phase 1

History Phase 2

Phase 2 Step 1
Input: Input record is the combined FFID and 805 record (WAIS 645-063 and revisions).
Output from Pre-edit Program:
Pos. 1      Level 1
Pos. 2      Record number 1
Pos. 3-10   WAIS ID#
Pos. 11-19  Social Security Number
Pos.
20     1951 earnings present?
Pos. 21     1952
Pos. 22     1953
Pos. 23     1954
Pos. 24     1955
Pos. 25     1956
Pos. 26     1957
Pos. 27     1958
Pos. 28     1959
Pos. 29     1960
Pos. 30     1961
Pos. 31     1962
Pos. 32     1963
Pos. 33-37  Month and year of birth
Pos. 38     Race
Pos. 39-45  Sex

Phase 2 Step 2. Combine the output of this stage with the output of stage 1. Construct three variables:
   a) last two digits of first year SS earnings present
   b) last two digits of last year SS earnings present
   c) variable indicating only FFID portion of record present (examine 33-45 of Step 1 record above)
For nonmatching records leave nonmatched area entries blank.

History Phase 3

Phase 3 Step 1
Input. WAIS Property Income File, Comm 115
File Description (input): 50 x 31, RM in 31, 556 c.p.i., BCD.
Record Description: WAIS 645-036, 1st Revision, Feb. 25, 1965, pp. 40-46
Action. Sort Records on
   Pos. 2-9    ID#
   Pos. 10-11  Year
   Pos. 1      Card type
   Pos. 12-13  Card # or Asset Type

Step 2
Input - a) Sorted Records from Step 1
        b) C cards, J cards, P cards. Format WAIS 645-016, Dec. 2, 1964
        c) Additional card records
Action. Write and run program(s) to make the C, J and P changes and to accept the additional card records. Program should list any duplicate records and indicate which were accepted and which rejected.

Step 3
Input. Final product of Step 2.
Action. Perform Pre Edit Program to get the following output.
Pos. 1      Level Number 1
Pos. 2      Record number 1
Pos. 3-10   WAIS ID#
Pos. 11     Interest for 1946 (G yes, E no)
Pos. 12     Dividend for 1946 (G yes, E no)
Pos. 13     Capital Gain for 1946 (1 yes, 0 no)
Pos. 14     Rent for 1946 (1 yes, 0 no)
Pos. 15     Business for 1946 (1 yes, 0 no)
Pos. 16-20  Like Information 1947
Pos. 21-25  1948
Pos. 26-30  1949
Pos. 31-35  1950
Pos. 36-40  1951
Pos. 41-45  1952
Pos. 46-50  1953
Pos. 51-55  1954
Pos. 56-60  1955
Pos. 61-65  1956
Pos. 66-70  1957
Pos. 71-75  1958
Pos. 76-80  1959

Pos. 1      Level Number 1
Pos. 2      Record Number 2
Pos. 3-10   WAIS ID#
Pos. 11     Interest for 1960 (G yes, E no)
Pos. 12     Dividend for 1960
Pos. 13     Cap. Gain 1960
Pos. 14     Rent 1960
Pos.
15     Business 1960

Step 4. Merge the records created in Step 3 above with the output of Stage 2. At merge time create 3 variables:
   a) last two digits of first year property income present
   b) last two digits of last year property income present
   c) total years property income appears.

http://www.ssc.wisc.edu/wais/WAIS656057.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656057.txt

Alan Duchan 1966
Description of WAIS' Tax Averaging Tables Using the Federal Definition of Personal Deductions
May 12, 1966
WAIS paper 656-058
Averaging Studies Tables

Al Duchan
WAIS Paper 656-058
May 12, 1966

Description of WAIS' Tax Averaging Tables Using the Federal Definition of Personal Deductions

1.0 Introduction

WAIS has completed a set of tables designed to show the effects of varying certain provisions of the tax averaging section in the 1964 Internal Revenue law. Although no attempt will be made here to describe the intricacies of the law, a minimal understanding of its basic provisions is required to comprehend the development of the tables.

Very briefly, taxpayers who experience unusually large incomes in some year as compared to average income over the previous four years may, under the averaging law, have part of their income taxed at a lower rate than it would be without the averaging law. A simple formula will make this clear. Let

FNTI = current taxable income
B = average taxable income over the previous four years

Under current law, FNTI-1.33B is a person's (or a married couple's, when filing a joint return) Potentially Averagable Income - potentially because a person may average only if FNTI-1.33B is greater than $3,000.*

---------------------
*The taxpayer must meet two other requirements in order to average. One of these, which WAIS has ignored, concerns a person's citizenship or residence status. The other requires that the taxpayer be self-supporting throughout the period. See Section 3.3 for the ad hoc rule used to simulate this test.
The expression given above can be generalised to allow varying the parameter. Replace 1.33 with PCV, then Potentially Averagable income is FNTI-[PCV]B. The tables show the effects of changing PCV and also of changing the requirement that Potentially Averagable Income be at least $3,000 in order to average. The tables also furnish information for evaluating the effects of enlarging the scope of the law to allow some type of averaging for taxpayers who experience a negative income fluctuation. That is, if FNTI is less than B, Potentially Averagable Income is defined as [PCV]B-FNTI. Again, the tables show the effects of varying PCV. In addition to showing the implications of varying PCV the tables also show what may happen under different definitions of income. One reason for using several income definitions is that current law does not allow net income to fall below zero. WAIS wanted to see what happens when this restraint is removed. The second reason is that the data used to develop the tables consist of information gathered from Wisconsin state income tax returns for which the definition of income is different from the Federal definition of income for averaging purposes which in turn is different from the conventional Federal definition of income. The major discrepancy between the Federal and Wisconsin bases affects people with capital gains. In the hope of partially overcoming this difference, the tables show the effects of totally including and totally excluding capital gains. 2.0 Outline of the Computational Procedure In order to easily understand the tables, the reader should have an overview of the computation of Potentially Averagable Income. We discuss the input data, the different definitions of income and how to compute Potentially Averagable Income. 
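The rule stated in the Introduction, FNTI-[PCV]B for a positive fluctuation and [PCV]B-FNTI for a negative one, can be sketched in a few lines. The function names and the example figures below are hypothetical and chosen only for illustration; treating the case FNTI = B as a positive fluctuation is an assumption, since the paper does not say how ties were handled.

```python
def potentially_averagable_income(fnti: float, base: float, pcv: float = 1.33) -> float:
    """Potentially Averagable Income for one taxpayer.

    fnti : current taxable income (FNTI)
    base : average taxable income over the previous four years (B)
    pcv  : the parameter the tables vary (1.33 under the 1964 law)
    """
    if fnti >= base:               # assumption: a tie counts as positive
        return fnti - pcv * base   # positive fluctuation
    return pcv * base - fnti       # negative fluctuation (proposed extension)


def may_average(fnti: float, base: float, pcv: float = 1.33,
                minimum: float = 3000.0) -> bool:
    """Eligibility under the dollar test only; other legal tests are ignored."""
    return potentially_averagable_income(fnti, base, pcv) > minimum


# Hypothetical taxpayer: $20,000 this year against a $10,000 four-year average.
example = potentially_averagable_income(20000.0, 10000.0)
```

With these made-up figures the result is about $6,700, which exceeds the $3,000 floor, so this taxpayer would pass the dollar test. Lowering `minimum` or changing `pcv` reproduces the kind of variation the tables explore.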
2.1 Input

The input includes adjusted gross income, marital status, and exemptions (year by year from 1954 through 1958) for 2,577 persons single in 1958 and 5,723 couples married in 1958 whose State income-tax returns are in WAIS files.** All the data are taken directly or derived from the state returns. When a person's return was missing, data were interpolated (in accordance with WAIS 645-050) from the next later return available or from the return of the person's spouse if he was married. Because Wisconsin does not allow joint returns, the two state returns for a married couple were combined to create a simulated joint return.

Before computing Potentially Averagable Income, two tests were given to each person or couple. The first, called the A test, was an ad hoc substitute for the Federal requirement that a person must be self-supporting to be eligible to average his income. The second test, called the B test, refers to the completeness of a person's record. If the number of years for which records were missing was less than an arbitrary figure, he passed the B test. Although not at present done, it is possible to redo the tables with a B test requirement of zero interpolated years. This would permit judgment on the validity of the interpolation procedure. The specific requirements for passing the two tests are given below in Section 3.3.

-------------------
**WAIS has returns for approximately 20,000 Wisconsin taxpayers, but only those people for whom we had a return in 1958 or 1959 were used.

2.2 The Definitions of Income

After giving a person the two tests, the next step was to compute, in each year, the person's net income for averaging purposes. We call this "Allowable Net Income for Averaging Purposes" or Bt. As was noted, to compensate for the difference between State and Federal definitions of capital gains income and to see the effects of allowing negative net income, the program computes Bt in four ways.
They are differentiated by calling them Legal Definitions 1, 2, 3 and 4 as defined below. Let

It = Adjusted Gross Income including capital gains in year t
     less $600 per exemption in year t
     less personal deductions in year t
Ct = Capital Gains or Losses in year t
Bt = Allowable Federal Net Taxable Income for Averaging Purposes in year t

Using this notation, the differences between the four Legal Definitions are

Legal Definition 1: Bt = max (It - Ct; 0), i.e., (a) capital gains and losses are excluded and (b) negative net income is not allowed.
Legal Definition 2: Bt = max (It; 0), i.e., (a) capital gains and losses are included and (b) negative net income is not allowed.
Legal Definition 3: Bt = It - Ct, i.e., (a) capital gains and losses are excluded and (b) negative net income is allowed.
Legal Definition 4: Bt = It, i.e., (a) capital gains and losses are included and (b) negative net income is allowed.

2.3 Computation of Potentially Averagable Incomes

The average of Bt for the years 1954-57 is compared to Bt for 1958. Let

FNTI = B58, i.e., Allowable Federal Net Taxable Income for Averaging Purposes in 1958.
B = Average Bt for the base period.

Then if FNTI-B>0, the person or couple has a positive income fluctuation and his Potentially Averagable Income is FNTI less a percentage of B. That is, Potentially Averagable Income equals FNTI-[PCV]B. Potentially Averagable Income is computed four times using PCV = 1.00, 1.25, 1.33 and 1.50. If FNTI-B<0 [...] those with positive fluctuations (FNTI-B>0) and those with negative fluctuations (FNTI-B<0). Potentially Averagable Income, designated by POTAVINC, is the column variable. The values shown on the table are the upper bounds for each interval except for the last interval where the upper bound is + (infinity). The row variable, designated by AGI, is Adjusted Gross Income in 1958 per the Wisconsin returns. Again, the values shown are upper bounds except for the last where the upper bound is + (infinity).
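As an aid to reading the tables, the four Legal Definitions of Section 2.2 and the comparison of 1958 with the 1954-57 base period in Section 2.3 can be sketched as follows. This is an illustrative reading of the definitions, not the WAIS program itself: the function names and the five-year input figures are hypothetical, and the interpolation and joint-return steps described elsewhere in the paper are left out.

```python
from typing import List, Tuple


def allowable_net_income(i_t: float, c_t: float, legal_def: int) -> float:
    """Bt under Legal Definitions 1-4.

    i_t : It, AGI including capital gains less exemptions and deductions
    c_t : Ct, capital gains or losses in year t
    """
    if legal_def not in (1, 2, 3, 4):
        raise ValueError("legal_def must be 1, 2, 3 or 4")
    include_gains = legal_def in (2, 4)    # definitions 2 and 4 keep Ct
    allow_negative = legal_def in (3, 4)   # definitions 3 and 4 keep negatives
    value = i_t if include_gains else i_t - c_t
    return value if allow_negative else max(value, 0.0)


def fnti_and_base(years: List[Tuple[float, float]], legal_def: int) -> Tuple[float, float]:
    """Return (FNTI, B) from five (It, Ct) pairs ordered 1954 through 1958."""
    if len(years) != 5:
        raise ValueError("expected five years, 1954 through 1958")
    b = [allowable_net_income(i_t, c_t, legal_def) for i_t, c_t in years]
    return b[4], sum(b[:4]) / 4.0  # FNTI = B58, B = average of 1954-57


# Hypothetical record: steady income, then a capital gain in 1958.
record = [(4000.0, 0.0), (4000.0, 0.0), (4000.0, 0.0), (4000.0, 0.0), (9000.0, 5000.0)]
```

With this made-up record, Legal Definition 1 excludes the 1958 gain, so FNTI equals the base and there is no fluctuation, while Legal Definition 2 includes it and shows a positive fluctuation. This is the kind of sensitivity to the income definition that the LEGALDEF dimension of the tables is meant to expose.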
The first page of the table shows, for example, three taxpayers whose AGI was between $5,001 and $7,000 and whose POTAVINC was between $3,001 and $5,000. From the rightmost column and bottom row, which are marginal totals, one can see that 388 had AGI between $3,000 and $5,001 and that no one had POTAVINC of $10,000 or more.

The table is divided into 64 sections, each section showing a different combination of (1) marital status; (2) Legal Definition of income used; (3) positive or negative income fluctuation; and (4) percent of average base period income used in computing POTAVINC. Pages are identified by the values of the four variables GRPTYPE, LEGALDEF, DIRFLUC and PCV.

GRPTYPE defines marital status in 1958 and has the following values:
(a) Sections with GRPTYPE = 1 cover persons who were single in 1958.
(b) Sections with GRPTYPE = 2 cover couples married in 1958. The observations on these pages represent couples filing joint returns.

LEGALDEF defines the Legal Definition of income used and has the values 1, 2, 3, and 4 corresponding to the four Legal Definitions described on page 4. These are summarized below:

                                              Capital gains and      Capital gains and
                                              losses are excluded    losses are included
Allowable Federal Net Taxable Income for
Averaging Purposes is set equal to zero
if computed figure is negative.               LEGALDEF = 1           LEGALDEF = 2

Computed figure is used whether positive
or negative.                                  LEGALDEF = 3           LEGALDEF = 4

DIRFLUC defines positive or negative fluctuations.
(a) Sections with DIRFLUC = 0 cover people with FNTI-B>0, i.e., people whose Allowable Federal Net Taxable Income for Averaging Purposes in 1958 was greater than their average Allowable Net Taxable Income in 1954-57.
(b) Sections with DIRFLUC = 1 cover people with FNTI-B<0.

PCV defines the percentage of B used in computing POTAVINC. The percentages used on these tables are tabulated below.***

Value of PCV shown    Percentage used if DIRFLUC = 0;    Percentage used if DIRFLUC = 1;
on the table          i.e., if Direction of              i.e., if Direction of
                      Fluctuation is positive            Fluctuation is negative
1                     100%                               100%
2                     125                                80
3                     133                                75
4                     150                                67

Each section is divided into six parts. The first is a frequency count. The next two show row and column percentages. The fourth shows the mean of POTAVINC and the last two show row and column percentages for the mean of POTAVINC.

----------------------
***In other WAIS papers and in the complete description of computations at the end of this paper, PCV is referred to as a if the direction of fluctuation is positive and as y if the direction of fluctuation is negative.

Table J1 shows the marginals, summed over marital status (GRPTYPE), for Table H1. That is, any page in Table J1 is the sum of the two analogous pages in Table H1.

3.2 Tables E1 and F1, Qualifiers Only, Potentially Averagable Income by FNTI

Tables E1 and F1 show Potentially Averagable Income (POTAVINC) by Allowable Federal Net Taxable Income for Averaging Purposes in 1958, designated by FNTI, for all who passed the A and B tests. Except for replacing AGI with FNTI, they are identical to Tables H1 and J1, respectively. Note, however, that while a person's (couple's) AGI is invariant, his FNTI varies from one Legal Definition to another.

3.3 Tables A1, B1, C1 and D1 - Qualifying Status by FNTI and by AGI

Tables A1 through D1 classify people by whether they passed the A test, the B test, both or none. The variable QUALTYPE defines the person's or couple's Qualification Status.

QUALTYPE = 0 means the person or couple passed both tests; i.e., qualifies.
QUALTYPE = 1 means the person or couple failed the A test only.
QUALTYPE = 2 means the person or couple failed the B test only.
QUALTYPE = 3 means the person or couple failed both tests.

The specific requirements for passing the two tests are:

A test (support test)

For GRPTYPE 1. A person single in 1958 passes if
max [person's total adjusted gross income during 1954-58; total earnings from necessarily positive sources of income during 1954-58] > $3,000.

For GRPTYPE 2. A couple married in 1958 passes if
max [couple's total adjusted gross income during 1954-58; total earnings from necessarily positive sources of income during 1954-58] > $5,000.

B test (completeness of record test)

For GRPTYPE 1: A person single throughout the period passes if no. of interpolated years < 3. A person single in 1958 but married previously passes if sum of interpolated years for him and his wife < 4.

For GRPTYPE 2: A person married in 1958 passes if sum of interpolated years for him and his wife < 5.

Table A1 shows QUALTYPE by AGI. Observations are divided by marital status (GRPTYPE) and direction of income fluctuation (DIRFLUC), and each person or couple is treated four times, once for each Legal Definition of income (LEGALDEF). For example, for single persons (GRPTYPE 1) and using LEGALDEF 1, 1203 people had a positive income fluctuation (DIRFLUC=0) and of those, 947 passed both tests (QUALTYPE=0). Under the same Legal Definition, there were 1374 single people who had a negative income fluctuation (DIRFLUC=1). Of these, 614 passed both tests.

Table B1 shows the marginals summed over marital status (GRPTYPE) for Table A1.

Tables C1 and D1 are the same as A1 and B1, respectively, except people are classified by FNTI instead of AGI.

The first three parts of each section of these tables show frequency counts and row and column percentages. (As before, for married people (GRPTYPE=2), the observations represent number of couples, not number of people.) The last three parts show the means of AGI and row and column percentages for the mean.

4.0 Specific treatment of each taxpayer

Although the reader now has sufficient information to read the tables, an understanding of some of the inconsistencies and interpolations in the data is advisable. What follows is a detailed description of how the data from the state income tax returns were used to determine each person's Potentially Averagable Income.
For reference, we first present a list of the symbols used. Next is a discussion of some of the problems that arose and how they were solved. The final section of this part gives the exact computational procedure.

4.1 Definition of symbols

4.11 Subscripts:
(i) i = ith person (i = 1, 2) in the record
(ii) k = kth record (k = 1, ..., ?)
(iii) t = tth year (t = 1, 2, 3, 4, 5); 5 is 1958 (comp. yr.)
(iv) j = jth type of legal definition (j = 1, ..., 4)

4.12 Variables in the input to this program
(i) E_kit = Number of exemptions
(ii) Ê_t = An estimate of exemptions used when E = 0 for married people
(iii) M_kit = Marital status
(iv) V_kit = "Value" of tth year's record for ith person
(v) G~ = Sum of necessarily positive sources of income
(vi) G_kit = Adjusted Gross Income per Wisconsin [on the tables, G for 1958 is designated by AGI]
(vii) N_kit = Net Taxable Income per Wisconsin
(viii) C_kit = Capital Gains Net Income per Wisconsin
(ix) T_k = Record Type

4.13 Variables created in this program:
(i) A_kit = Allocated (Deductions + Personal Exemptions)
(ii) B_kitj = Allowable Federal Net Taxable Income
(iii) Ds_kit = Deductions for a single person per Federal basis
(iv) Dj_kit = Deductions for a married couple per Federal basis
(v) F_kitj = Federal Net Taxable Income
(vi) J_kitj = Joint Adjusted Federal Net Taxable Income with Later Spouse
(vii) P_kitj = Proportion of aggregate adjusted gross income with later spouse
(viii) Q_ki = Qualifying Parameter
(ix) S_kitj = Separate Adjusted Federal Net Taxable Income from later spouse
(x) Symbols used for Potentially Averagable Income [POTAVINC]
    (1) X_kij = FNTI - αB̄ if FNTI > B̄ (positive income fluctuation)
    (2) Z_kij = γB̄ - FNTI if FNTI < B̄ (negative income fluctuation)
(xi) FNTI_kij = Allowable Federal Net Taxable Income for Averaging Purposes in 1958

4.14 Other symbols used:
(i) ID_k# = WAIS Identification Number
(ii) H_kit = Husband
(iii) W_kit = Wife
(iv) Symbols used for PCV, the percentage of B̄ used in computing Potentially Averagable Income
    (1) α = Percentage used for positive income fluctuations; α = 100, 125, 133, and 150%.
    (2) γ = Percentage used for negative income fluctuations; γ = 100, 80, 75, and 67%.

4.15 Types of Legal Definition: [LEGALDEF]
(i) j = 1 => Non-negative net taxable income exclusive of capital gains and losses.
(ii) j = 2 => Non-negative net taxable income inclusive of capital gains and losses.
(iii) j = 3 => Any-signed net taxable income exclusive of capital gains and losses.
(iv) j = 4 => Any-signed net taxable income inclusive of capital gains and losses.

4.16 Groupings of Record Types.
(i) The code for Record Types (T) taken from the input is

Code for T    Description
1             Male, never married during 1954-58
2             Female, never married during 1954-58
3             Male, married sometime in 1954-57, not in 1958
4             Female, married sometime in 1954-57, not in 1958
5             Male, married in 1958, no other spouse in 1954-57
6             Male, married in 1958, one other spouse in 1954-57
7             Male, married in 1958, two other spouses in 1954-57

(ii) For tax treatment, the seven types are divided into four groups
(a) Group I => T = 1, 2
(b) Group II => T = 3, 4
(c) Group III => T = 5
(d) Group IV => T = 6, 7

Only 18 out of 8300 cases had more than one spouse. Because of the complex rules for treating these people, we decided to ignore the earlier spouse and treat them as Group III.

4.2 Inconsistencies and Interpolations

To make the sample as large as possible, we interpolated income data when we did not have a person's return.
The source of interpolation data was the person's next later available return and (when he was married and the spouse filed) his spouse's return. The key to the interpolation is the answers to two questions on the returns. The first is whether the person filed in the previous year and, if not, why not. The second, asked of married filers, is whether the person's spouse had income. There were three mutually exclusive situations: (a) a person's spouse filed and said the person did not have income, (b) a person's spouse filed and said the person had income, and (c) the spouse did not file or the person was not married.

Let P be the person whose return is missing and let S be the person's spouse. Then the three possibilities and their treatment are:
(a) P's return was missing, and S filed and said P had no income or P died during the year. P's income is set equal to zero.
(b) P's return was missing, and S filed and said P had income. P's income was set equal to the income shown on P's next later available return.
(c) P's return was missing and he was not married or S's return was also missing. Then:
    (1) If on P's next later available return he said that he had not filed previously for a reason that implied little or no income (e.g., a student), then his income in the year in which his return was missing was set to zero.
    (2) If on P's next later available return he said that he had not filed for a reason that did not imply little or no income (e.g., just moved into state), or if P did not answer this question, then the income shown on the next later available return was used for the year in which the return was missing.

WAIS paper 645-050 (April 27, 1966) is a full description of the method of interpolation summarized above. When, however, we began calculating Potentially Averagable Income, we found that further interpolation was needed. We also found errors in the computer program developed to carry out the interpolation procedure.
Presented below are the additional interpolations, the errors found, and the ad hoc rules used to deal with the errors.

4.21 Marital Status

Errors in the original coding of WAIS's sample of state tax returns have caused inconsistencies in the marital codes of the people in a record. To determine marital status for these calculations, only the head filer's marital status was examined. That is, for Group II and III persons (married sometime in 1954-57), the sole determinant of whether the head filer is married in a year t is his own marital status. No problem occurs for Group I persons since they are single throughout the period.

WAIS paper 645-050 (pages 4-5) explains the procedure for assigning marital status for filers in years in which their returns are missing. The procedure utilized the same code as was used in the original coding of returns, i.e.:
M = 0 -> Single person
M = 1 -> Married; spouse had separate income
M = 2 -> Married; spouse did not have separate income
M = 3 -> Married; but spouse died during the year

In addition, whenever a person's record was missing for some year t and for the next later year (t + 1), and no current spouse had filed in year t, then the person's marital status in year t was set equal to "7". Thus "7" implies that a filer's marital status in year t is in doubt, signaling caution when treating this person. The paper, however, does not present a specific rule for treating such a person. For these tables, it was decided to treat the person as single unless evidence of being married was very strong. Specifically, the following ad hoc decision rule was used.

Examine later years (t + 2, t + 3, etc.) until either the latest year is reached or a year is found where M ≠ 7.
Examine earlier years (t - 1, t - 2, etc.) until either the earliest year is reached or a year is found where M ≠ 7.
If both an earlier and a later year in which M ≠ 7 are found and if M = 1 or 2 in both these years, treat the person as married in year t. In all other cases, treat the person as single in year t.

4.22 Exemptions

Due to an error in the creation of the input to this program, in some records the exemptions for both the husband and wife were set equal to zero. This occurred when a husband did not file, and his wife did file and stated that her spouse did not have separate income. There were 108 such cases divided among Group II and Group III people. In these cases, we created an estimator for E, Ê, and set it equal to 2. In effect, we lost all exemptions except the two for the taxpayer and his spouse.

Another possible error forces us to make a second approximation for exemptions. In years in which a Group III couple was single, the husband's exemptions were set equal to 1 for the person + his dependents + 1 if he is over 65. In addition, his future wife's dependents may have been included in his exemptions, but we are not certain of this. In any case, the future wife's exemptions were set equal to the head filer's. Thus if the wife's dependents were not included in the husband's exemptions, we have lost them.

To obtain a person's minimum standard deduction ($200 plus $100 per exemption) in a year in which he was single requires knowing his personal exemptions. This information is not available for single, base period years of people who were married in the computation year. All that is known is the couple's total exemptions. For an estimate, each person was given one-half of the total exemptions. This procedure will be accurate whenever each person had the same number of exemptions, and this includes the most common situation where each person had only one exemption.

4.3 Computational Procedure

WAIS 645-052 (April 28, 1965) was the first paper to suggest a program for computing averagable income under Federal Law from State returns.
A great number of changes and additions have been made since then, so it seemed best to give the complete new program.

4.31 Simulation of the Federal Basis of Personal Deductions

The major difference between the approach described herein and the one proposed in Miller's WAIS 645-052 (April 28, 1965) is that he used a filer's deductions exactly as given on the state return. Here, an attempt is made to simulate the current Federal definition of deductions.

A person's state deduction will be either itemized deductions or the State standard deduction (min[.09 State AGI; $450]). Federal law allows a standard deduction for joint return filers or single persons of either 10% of joint AGI up to $1,000 or $200 plus $100 for each exemption up to $1,000. To simulate Federal deductions using state tax returns, compute:

For a single person
    Ds = max[min(1,000; .1G); min(1,000; 200+100E); G-N]
For a married couple filing jointly
    Dj = max{min[1,000; .1(G1+G2)]; min[1,000; 200+100E]; G1+G2-N1-N2}

The first two arguments represent the allowable Federal standard deduction: the larger of 10% of AGI or $200 + $100 per exemption, but in no case more than $1,000. The third argument is the total deductions shown on the State return. If (G-N) is larger than the Federal standard deduction, the taxpayer must have itemized deductions and (G-N) becomes an approximation of Federal itemized deductions. It will be off only to the extent that Wisconsin itemized deductions are not allowed by the Federal government or vice versa.
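The two deduction formulas of section 4.31 can be sketched in a few lines of Python. This is only an illustration of the formulas as printed, not the original WAIS program; the function and argument names are hypothetical, and dollar amounts are taken as plain numbers.

```python
def federal_deduction_single(agi, net_taxable, exemptions):
    """Simulated Federal deduction Ds for a single person (formula of section 4.31).

    agi         -- G, Adjusted Gross Income per Wisconsin
    net_taxable -- N, Net Taxable Income per Wisconsin
    exemptions  -- E, number of exemptions
    (Argument names are illustrative assumptions.)
    """
    ten_percent = min(1000, agi / 10)                # 10% of AGI, at most $1,000
    minimum_std = min(1000, 200 + 100 * exemptions)  # $200 + $100 per exemption, at most $1,000
    state_total = agi - net_taxable                  # G - N, total deductions on the State return
    return max(ten_percent, minimum_std, state_total)


def federal_deduction_joint(agi_1, agi_2, net_1, net_2, exemptions):
    """Simulated Federal deduction Dj for a married couple filing jointly."""
    joint_agi = agi_1 + agi_2                        # G1 + G2
    ten_percent = min(1000, joint_agi / 10)
    minimum_std = min(1000, 200 + 100 * exemptions)
    state_total = joint_agi - net_1 - net_2          # G1 + G2 - N1 - N2
    return max(ten_percent, minimum_std, state_total)


if __name__ == "__main__":
    # A single filer whose State deductions (G - N = 500) fall below 10% of AGI.
    print(federal_deduction_single(6000, 5500, 2))
    # A couple whose State deductions (1,500) exceed the capped standard deduction.
    print(federal_deduction_joint(8000, 4000, 7000, 3500, 4))
```

In the first call the 10% standard deduction governs; in the second the State figure exceeds $1,000, so the couple is treated as having itemized.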
4.32 Treatment of Group I Qualifiers

(a) These people were single throughout the base period and computation year.
(b) For all Legal Definitions, j = 1, 2, 3, 4
    Compute: Ds_t as described above (t = 1, 2, 3, 4)
             F_t = G_t - Ds_t - 600E_t
(c) For j = 1
    Compute: B_t = max(B̲; F_t - C_t), where the floor B̲ = 0
    Let FNTI = B_5
    If FNTI > B̄, compute X = FNTI - αB̄
    If FNTI < B̄, compute Z = γB̄ - FNTI
(d) For j = 2: as in (c) above except B_t = max(B̲; F_t)
(e) For j = 3: as in (c) above except B̲ = -(infinity)
(f) For j = 4: as in (d) above except B̲ = -(infinity)

4.33 Treatment of Group II Qualifiers

(a) These people were single in the computation year, but married in one or more base period years. Their treatment in single years is the same as for Group I people [section 4.32 above]. In married years, we treat them as filing jointly. A person who is filing separately in a computation year must reconstruct his income for any base period year in which he filed a joint return. For averaging computations, the person's net income is defined as his own AGI less a fraction of joint deductions. The fraction of total deductions that the person is allowed is defined by the expressions for P and A, below. His separate net income, however, may not be less than one-half the couple's joint net income as shown on their return for the base period year. To satisfy this requirement, 1/2 J must be made one of the arguments for determining B_t.
(b) For all married years
    Compute E = max(E1; E2). If E = 0, define Ê = 2 and use Ê in place of E for the rest of the computations.
    Compute Dj as defined on page 17.
    Compute P, the person's proportion of the couple's aggregate adjusted gross income, and
        A = ...                if P < 0.15
        A = P(Dj + 600Ê)       if 0.15 < P < 0.85
        A = (Dj + 600Ê)        if P > 0.85
(c) For j = 1
    Compute: S1 = G1 - A1 - C1
             J = G1 + G2 - Dj - C1 - C2 - 600Ê
             B_t = max(B̲; S1; 1/2 J) with B̲ = 0
(d) For j = 2: as (c) above except S1 = G1 - A1 and J = G1 + G2 - Dj - 600Ê
(e) For j = 3: as (c) above except B̲ = -(infinity)
(f) For j = 4: as (d) above except B̲ = -(infinity)

4.34 Treatment of Group III Qualifiers

(a) These people were married in the computation year. We assume they filed jointly in all years during which they were married.
In all single years, their separate incomes are computed and summed to obtain joint income for averaging purposes.
(b) For base period years in which single
    (1) For all j, compute E = max(E1; E2) + 1 and Ds_i as defined on page 16, using 1/2 E for each person (i = 1, 2)
    (2) For j = 1, compute J = G1 + G2 - Ds1 - Ds2 - 600E - C1 - C2 and B_t = max(B̲; J)
    (3) For j = 2, same as (2) above except J = G1 + G2 - Ds1 - Ds2 - 600E
    (4) For j = 3, same as (2) above except B̲ = -(infinity)
    (5) For j = 4, same as (3) above except B̲ = -(infinity)
(c) For the computation year and base period years in which married
    (1) For all j, compute E = max(E1; E2). If E = 0, define Ê = 2 and use Ê in place of E for all further calculations. Compute Dj as defined on page 17.
    (2) For j = 1, compute J = G1 + G2 - Dj - 600E - C1 - C2 and B_t = max(B̲; J), where B̲ = 0
    (3) For j = 2, same as j = 1 except J = G1 + G2 - Dj - 600E
    (4) For j = 3, same as j = 1 except B̲ = -(infinity)
    (5) For j = 4, same as j = 2 except B̲ = -(infinity)

4.35 Treatment of Group IV Qualifiers

These people were married in 1958 and had a different spouse sometime in 1954-57. Because there were only 18 such cases and because their treatment is difficult to program, they were treated as Group III people with the earlier wife ignored.

http://www.ssc.wisc.edu/wais/WAIS656058.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656058.txt

Roger Miller, 1966. Portfolio Evaluation from Wisconsin Individual Income Tax Returns: I. General Considerations. March 23, 1966. WAIS paper 656-050. Property File.

Roger F. Miller
WAIS paper 656-050
March 23, 1966

Portfolio Evaluation from Wisconsin Individual Income Tax Returns: I. General Considerations

A. Objectives

Using data about the time flows of incomes from property of individuals, we propose to construct a time series of portfolios, or periodic balance sheets giving the composition of their wealth in considerable detail.
These data are to be used in subsequent studies requiring that the result of our data manipulation be a fairly accurate record distinguishing assets by type of asset, and by specific asset for certain types. With respect to any asset so identified, we require a time series of asset values, each value accompanied by an estimated proportion of the value which is in the nature of a capital gain; the time stream of income derived from the asset broken down into money payments and capital gains (whether realized or not); and the yield of the asset. In distinguishing assets, it is particularly important to be able to keep separate assets whose values fluctuate relatively independently of other assets, and also assets whose yields are subject to taxation differentially from ordinary wage and salary income.

B. Distinguishing Data Sources

Since we have data from many different sources, of varying completeness and reliability even in one source, and of varying complexity as among asset types, it is important to keep these features separately recorded through special codes. The codes will be separately recorded for each distinguishable asset at each point in time, indicate the sources of data, and provide an index of the quality of the data with respect to whether or not interpolation or estimation was required, etc. Below is a proposed set of codes for identifying data sources:

"Sources of data" codes
(i) Internal sources:
    (i) Tax Return Master Files             11
    (ii) Tax Return Asset Income Files      12
    (iii) Interview File                    13
    (iv) Self Enumeration Booklet File      14
(ii) External sources:
    (i) Merrill (N.Y.S.E.) Data File        21
    (ii) Compustat (S & P) Data File        22
    (iii) Wisbank File*/                    23
    (iv) Wisloan File*/                     24
    (v) Residual File*/                     25

*/ These files correspond to the "lists" described in Moyer's WAIS paper on the coding of the asset data, lists 1, 2 and 4 respectively.
These lists simply give the asset identification number and the name of the company, but other data are available and should be processed as soon as possible to put them into a form similar to the Merrill and Compustat data, for each company. In some cases this will require additional data gathering; in some cases there will be no useful information available and the data file should have in it a record which so indicates.

C. Distinguishing Reliability

Each item relating to a particular asset needs, in addition to the source indication, a code which indicates the reliability of the item. Reliability here is a complex concept which cannot be fully and precisely defined until after the specific procedures for data manipulation are spelled out. In general terms, however, the "reliability rating" of an item will be a function of:

(1) Completeness of source. (Was the item intact in an original record, or is it an interpolation or an estimate? If it is a derived measurement, how much interpolation or estimation is involved in the data from which it is derived, and in the procedure of derivation itself?)
(2) Quality of source. (Is the file from which the item was taken generally of high, medium or low quality, accuracy or precision? Has the particular record or set of records to which the item pertains been subjected to tests for internal consistency? If so, which tests were passed and which failed? Did any of the tests involve this particular item in the record? If so, which tests were passed and which failed?)

These codes are important partly because they will embody within them an indication of the procedure that we will have used in deriving our final data. As we did with the Tax Averaging specifications, we will be able to run the data selectively on data of different degrees of reliability in order to see whether any significant bias is indicated. The assignment of codes can be done electronically for the most part, during the processing of the data.
An exception to this is the set of external files 23-25, especially 25, where the individual records and items will have varying degrees of completeness and quality but may not be subject to testing once transcribed.

It is obvious that these reliability ratings could be made so sophisticated as to make their cost and their difficulty of use prohibitive. I believe that the scheme I will present will avoid this pitfall without being unduly wasteful of information. The best reliability rating should be attached automatically to complete items in the Merrill and Compustat Files (21 and 22). The internal source files will have mixed ratings because we are capable of testing the consistency of some of the data, which will vary from item to item, record to record, and year to year as well as from taxpayer to taxpayer.

D. General Strategy

There are a number of distinct steps which can be identified in the entire procedure. Some of these are logically prior to others, and some can proceed in parallel. All of these require record-by-record processing of many records, sometimes in conjunction with records from other sources.

(1) File completion. Presumably Files 11-14 and 21-22 are complete; Files 23-25 have hardly even been begun. There is already provision for updating File 22.
(2) Within-source consistency checking and correcting. This is presumed not necessary for Files 21 and 22, and not possible for Files 23-25. Much work along these lines has already taken place on File 11. Files 12-14 should immediately be investigated as to the extent to which such internal checking is feasible. Preliminary assignment of reliability ratings.
(3) Integration of Files 11-14, testing for interfile consistency, and completing the reliability rating of the integrated data. This sentence calls for an extensive amount of work which has hardly begun.
(4) Use of Files 11-14 to complete Files 23-25.
Where in File 12 we have a sale of an asset not included in Files 21-22, the File 12 information can be considered an observation on our missing variables and transferred to the appropriate location in Files 23-25. This is also true with the assets booklet, and may be true for some categories of assets in the interview (Files 14 and 13 respectively). Assignment of reliability ratings to Files 23-25.
(5) Bring the integrated internal file into conjunction with the external files and make all necessary calculations for creating a final output file, integrating the information in the files where possible. Assign reliability ratings to combined data or newly created items.
(6) Obtain analysis files from the output of stage (5). This stage may require additional integration with other files. Additional calculations also could be performed at this stage.

It will be noted that stages (1)-(4) above require a considerable input of programming and analytic effort before the crucial stage (5) can be performed. Having gained considerable experience with the testing of File 11 and with problems of file integration connected with the Social Security contract, we are close to being able to begin this work provided funds and manpower are available. In what follows I will assume stages (1) and (2) are otherwise provided for, except for the reliability coding in (2), and confine myself to the details concerning stages (3)-(6).

E. Budgetary, Staffing and Timing Problems

It seems clear to me that the entire package above falls within the purview of our N.S.F. grant, so that financing is presumably adequate for the current work. Even if that is so, however, it will be difficult to complete the work outlined within one year without some first-rate programming assistance, probably two full-time persons. This may be unduly pessimistic, but I write from bitter experience. There are excellent reasons for not engaging in a crash effort along these lines.
It is doubtful if we would be successful in so doing because there are too many obstacles in the way. Even if adequate staff time were available, however, and if we could be fairly sure we wouldn't run into the great difficulties encountered in previous efforts, it might be desirable to develop the programs involved (and test them thoroughly) but not rush to a final running of them, since the additional data from 1959-1965 would be available soon thereafter. These new data would enhance greatly the value of the present data, especially of the interview and booklet data. I therefore suggest we proceed with the recent-year data gathering posthaste, and put a major effort into the processing of those persons' records having property in the past, or who were interviewed, or who begin to show property income in the new data, or who are in the high income supplement. This would also give us some breathing room to fulfill our other commitments regarding income fluctuations, etc., and some time to pursue additional financing for the capital gains studies which are contingent on the completion of the portfolio construction. It seems clear to me that it will be difficult to obtain such financing until we have been able to bring forth and publish some substantial studies based on what we already have ready for analysis.

http://www.ssc.wisc.edu/wais/WAIS656050.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656050.txt

Gene Moyer, 1966. Some Evidence of the Validity of the Weighting Function. September 6, 1966. WAIS paper 667-005. Survey Data and File.

Gene Moyer
WAIS Paper 667-005
September 6, 1966

Some Evidence of the Validity of the Weighting Function

Certain portions of the rematch operation to check the parameters of the weighting function described in WAIS 656-043 have been completed.
The rematch will probably not be completed in the near future because the new data will allow a much better idea of the population of persons in the name groups who filed a tax return in 1959 or 1962 than anything WAIS can do with data currently available. In addition to the limited rematch data, however, we have certain data from the survey itself which provide some evidence about the validity of the weighting function.

The available comparisons which allow some check on the weights are:
1. Total number of persons in each "key A" group (the basic population division) before and after the rematch.
2. Percentage distributions for each key B-J for the matched sample before and after the rematch.
3. The percentage of persons in each Joint Return Income Class (the sum of husband's and wife's income) from the survey and from the Statistics of Income, 1963 for Wisconsin.
4. The distribution of education of respondents to the survey and a similar distribution from the 1960 Census.
5. The distribution of the occupations of respondents to the survey and a similar distribution from the 1960 Census.

While many more comparisons with distributions from other sources are possible, these would seem to be the most interesting and the most revealing of the validity of the weights. We will discuss each of these five comparisons in the order given above.

Table 1 shows the number of persons in the population from which the survey was drawn and in each of the strata into which the sampling frame was divided.

Table 1
Persons in the Name Groups Who Filed a Tax Return in 1959 or in 1962, by Division of the Sampling Frame (Key A)

Key A                                                      (1) Number in the    (2) Number in the    Difference
                                                           Population Before    Population After
                                                           the Rematch          the Rematch
1. Matched Sample in PSU                                    5,914                7,718
8. Matched Sample not in PSU                                1,912                2,881
   Subtotal                                                 7,826               10,599
2. Unmatched Master File in PSU                             8,863
7. Unmatched Master File not in PSU                         4,039
   Subtotal                                                12,902               10,520*              - 2,382
3. Unmatched State File in Name Groups in PSU               1,679                2,961                 1,282
4. Unmatched State File in Name Groups not in PSU             649                1,181                   532
Unmatched State File not in Name Groups in PSU             21,550               22,418                   868
Unmatched State File not in Name Groups and not in PSU      7,140                7,787                   647
Sum                                                        51,746               55,460               + 3,720

*Each of these includes or excludes about 1,000 hand matches which were discovered after the rematch but which have not been punched. These hand matches were not subtracted from the State file numbers given above.

The salient point about Table 1 is that the population "increased" by about 2,700 persons (3,700 less the 1,000 hand matches) during the period between the rematches. One reason for this is that George Loniello pulled more people off the State Rolls by getting those with "titles" such as "Jr." or "Sr." Another important reason is that WAIS found about 3,000 more persons in its sample during the spring 1965 edit round. This writer feels that most of the large increase in the matched sample is caused by the inclusion of these persons in the FFID file.

The effect of this increase is probably to cause the weighted aggregates from the survey to be low. At the same time, the sum of the weights is 10,363 while the number of 1959 male filers is 7,617. This does not suggest that aggregates are terribly low, but they still may be. If the aggregates are low, this will cause the survey weights to be invalid for the analyses concerned with comparing survey totals to totals from other data sources. Low aggregates do not necessarily affect the percentage distributions, however, and initial analysis will be limited to percentage distributions anyway. Therefore the important consideration is whether the rematch changed the distributions of the other "key" variables in the population.

The rematch did not progress to the point that we can be sure about whether these "key" variable distributions were changed or not, but some evidence is available. The matched sample distributions before and after the rematch are available.
Unfortunately the new "matched sample" does not include about 1,000 matches which could not be made by machine. Table 2, then, compares the distributions of the "key" variables from the old matched sample to now matched sample distributions which do not include these hand-matches. Table 2 The Distribution of Key Variables Before and After the Rematch In PSU Not in PSU Key B - Occupation (Old Match) (New Match) (Old Match) (New Match) 0 (Farm) 5 5 16 16 1 (Non-Farm) _15_ 84 84 Total 100 100 100 100 69 Key C - Did Taxpayer Report Capital Gains or Dividends, 1954-1959? 0 (No) 71 69 71 1 (Yes) 29 31 29 31 Total 100 100 100 100 Key D - In How Many Years (1954-1959) Did the Taxpayer File? 4 5 0 (0) 7 7 1 (1-2) 14 15 14 15 0 2 (3-4) 12 12 12 11 3 (5-6) 67 66 70 69 Total 100 100 100 100 Key E - Average Property Income, 1954-1959 (Yp) 65 65 75 0 Yp < $200 77 1 $ 200 < Yp < $ 500 7 7 8 2 500 < Yp < 1,000 5 5 7 7 3 1,000 < Yp < 2,000 4 5 8 8 4 2,000 < Yp 7 13 12 Total 100 100 100 100 Table 2 (Contd.) 5 In PSU Not in PSU Key F - Average Gross Income, % % % % 1954-1959 Yg (Old Match) (New Match) (Old Match) (New Match) 0 Negative 0.14 0.21 0.52 0.54 1 0 2 1 - 999 7.73 8.01 4.16 5.18 3 1,000 - 4,999 20.46 20.92 23.90 23.94 4 5,000 - 9,999 49.72 49.25 59.41 56.99 5 10,000 - 14,999 19.59 18.91 11.43 12.07 6 15,000 - 24,999 1.32 1.48 0.40 0.86 7 25,000 - 49,999 0.64 0.80 0.06 0.31 8 50,000 - 99,999 0.39 0.40 0.06 0.08 9 100,000 and over 0.02 0403 0.06 0.04 Total 100.00 100.00 103.00 100.00 Key G - 1959 Gross Income 0.52 0.51 1.04 .97 0 Negative 1 0 15.34 16.41 12.99 14.71 2 1 - 999 7.96 8.10 10.80 11.02 3 1,000 - 4,999 41.27 40.79 54.56 51.11 4 5,000 - 9,999 31.08 29.89 18.88 19.70 5 10,000 - 14,999 2.15 2.36 1.39 1.75 6 15,000 - 24,999 1.01 1.24 0.23 0.58 7 25,000 - 49,999 0.52 0.52 0.06 0.12 8 50,000 - 99,999 0.14 0.17 0.06 0.04 9 100,000 and over -- -- -- Total 100.00 100.00 100.00 100.00 Table 2 (Cont'd.) 
In PSU Not in PSU Key N - 1962 Gross Income (old Match) (New Match) (01d Match) (New match) 0 Negative 1 0 0.97 1.05 1.33 1.67 2 1 - 999 10.39 11.35 14.84 14.64 3 1,000 -4,990 42.24 42.34 54.27 52.16 4 5,000 - 9,999 40.57 38.86 26.10 27.68 5 10,000 - 14,999 3.51 3.81 2.37 2.37 6 15,000 - 24,999 1.20 1.46 0.87 1.17 7 25,000 - 49,999 0.87 0.92 0.17 0.27 8 50,000 - 99,999 0.17 0.14 9 100,000 and over 0.06 0.06 0.06 0.04 Total 100.00 100.00 100.00 100.00 Key J - 1962 Percent of Tax 19.82 21.80 36.87 35.81 Withheld 10% or under 1 11-20% 0.62 0.61 0.81 0.62 2 21-30% 0.45 0.47 0.64 0.58 3 31-40% 0.37 0.43 0.52 0.51 4 41-50% 0.72 0.72 0.75 0.78 5 51-60% 0.68 0.74 0.92 0.82 6 61-70% 1.32 1.39.. 0.92 0.93 7 71-80% 2.12 2.22 2.08 9.91 8 81-90% 5.01 4.94 3.70 3.46 9 91% or more 66.59 64.35. 52.08 53.29 N.A. 2.31 2.33 1.21 1.28 Total 100.00% 100.00% 100.00% 100.00% The distributions of Table 2 are almost too good. None of the differences are more than 3.5 percent, and only one is that great. Thus while we may not be able to trust either set of percentages precisely, they are at least consistent. Of more interest is the way weighted survey distributions compare with similar independently generated distributions. Table 3 compares the distribution of respondents by "Joint Return income" class (the sum of husband's and wife's incomes) with the distribution of Federal joint returns in Wisconsin by 1963 AGI class. 
Table 3
The Distribution of Respondents by "Joint Return Income" Class and the Distribution of Wisconsin Federal Joint Returns by 1963 Adjusted Gross Income Class

                       Weighted Survey            Percentages of 1963 Joint
                       Percentages                Federal Returns in Wisconsin*
Income class           %      Cumulative %        %      Cumulative %
under 1,000            8           8              4           4
1,000 - 1,999         11          19              7          11
2,000 - 2,999          6          25              6          17
3,000 - 3,999          7          32              7          24
4,000 - 4,999         12                          8          32
5,000 - 5,999         10          54             11          43
6,000 - 6,999          8          62             14          57
7,000 - 7,999         10          72             11          68
8,000 - 8,999          6          78             10          78
9,000 - 9,999          2          80              6          84
10,000 or more        20         100%            16         100%
Total                100%                       100%
----------------
*Source: U. S. Internal Revenue Service, Statistics of Income, 1963, p. 116.

The distributions compare very well with each other in that the largest difference in the proportion in any bracket is six and the modal difference is four. The cumulative percents, however, point up a more significant difference. WAIS has 54 percent of its respondents in the below $5,000 class while the state has only 32 percent of its taxpayers in that class. At the same time, a chi-square goodness-of-fit test using state frequencies as expected values gives a chi-square value of 9.87 with 6 degrees of freedom (both end categories were combined). This is not significant at the .05 level. Therefore WAIS' distributions should not show a significant income bias.

Table 4 shows WAIS' distribution of respondents' education compared to the 1960 Census distribution of the education levels achieved by males in the state who were 25 years old or older.

Table 4
Grades of School Completed: WAIS Survey and the 1960 Census for Wisconsin

                                  (1)          (2)                   (3)
Grade of School                   WAIS         1960 Census: males    Difference
Completed                         Survey       over 25 years old*    (1-2)
0                                   1%            1                     0
4                                   1             5                    -4
6                                   6             7                    -1
7                                   7             7                     0
8                                  19            26                    -7
9 - 11                             17            16                    +1
12                                 35            23                   +12
Some college                        6             7                    -1
College degree or above             8             8                     0
Total                             100%          100                    --
Number of cases                10,363
---------------------------------
*Source: U.S. Bureau of the Census. U.S.
Census of Population: 1960, General Social and Economic Characteristics, Wisconsin. Final Report PC(1)-51C. U.S.G.P.O., Washington, D. C., 1961, p. 174.

While the proportions of tax filers in the survey and of males 25 years old or older who have some college are almost exactly the same, WAIS has significantly fewer persons with an eighth grade education and more who have a ninth to twelfth grade education. This is somewhat disturbing, but a difference in the characters of the two populations may account for the difference in the percentage distributions. A chi-square test of goodness-of-fit using the census percentages as the expected values yields a chi-square value of 10.27 with 4 degrees of freedom (0, 4, and 6 were combined), significant at the .05 level but not at the .02 level. Thus there does seem to be a significant difference, even though we cannot be sure whether the weights or differences in the population definitions caused the statistical differences.

Table 5 compares the survey distribution of the usual occupation of respondents to the distribution of the occupations of men 25 years old or older in Wisconsin and to the occupations of men in the WAIS tax sample for 1959.

Table 5
WAIS Survey Distribution of Usual Occupation and Similar Census and WAIS Tax Sample Distributions

Columns: Occupation Class; Survey; Wisconsin Males 25 years old or older*; WAIS Tax Sample Males in 1959**

Professional-Technical 6 35
Manager-Official
Self employed 18
Clerical 9 16
Sales 7
Service 15
Craftsmen, foremen 21 48
Labor 12
Others, N. A. --***
6 6 12 10 14 56 32 4
Total 100 100
----------------------
*Source: 1960 Census 51(c), p. 183
**Source: WAIS Preliminary Tables
***Less than 0.5 percent

Table 5 indicates that there may be a slight bias toward self-employed persons, service persons, and craftsmen and foremen and away from laborers.
Differences in definition may account for the apparent bias toward service persons and away from laborers, but this writer suspects that flaws in the weights may account for the other two. At the same time, the two distributions are probably similar enough to allow WAIS to use the weights as they presently exist.

Summary

The population from which the sample was drawn is larger than we thought. This probably causes aggregates to be lower than they should be. The sum of the weights (10,363) is considerably larger than the number of 1959 male filers (7,611), but total increases in the population since 1959 may cause the weighted aggregates to be too low anyway.

The rematch made no discernible difference in the distribution of persons in the population over each of the key variables.

The weighted percentage distribution of "joint return income" receivers is not significantly different from that of the number of Federal joint returns filed in the state in 1963. Distributions of education and occupation are different from the distributions given in the 1960 Census for males 25 years old and older, but differences in time and in the population could account for the differences in the distributions.

This writer suggests that the weights be used in their present form.

http://www.ssc.wisc.edu/wais/WAIS667005.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667005.txt

John deVries, 1966. Some Thoughts on the WAIS Survey File and its Readiness for Analysis. September 26, 1966. WAIS paper 667-007. Analysis; Survey Data and Files.

John de Vries
WAIS Paper 667-007
September 26, 1966

Some Thoughts on the WAIS Survey File and its Readiness for Analysis

1. Introduction

During the last few months, a number of papers have been written regarding the survey data; we also have numerous tables run on most of the variables in the survey extract.
In this paper, I would like to express some thoughts on a) the weighting system, and b) the validity of the data as they currently exist on the survey extract file.

2. Some thoughts on the weighting system

We now have two papers dealing with the weighting system, as well as cross tabulations of "head's income" versus "weight 1" and "weight 2".

2.1. Remarks regarding the first paper (WAIS 656-043)

It appears that table 10, the second part, carries an error: if I am correct in assuming that this part was derived from table 9 by adding the contents of the appropriate cells, and that table 9 is correct, segment two of table 10 should read:

Table 1
The sample (actual interviews) (A = 1,2,3,4,5,6,7,8)

                        Key G or H
                        5       6-9      Sum
J=9, E=0               97        49      146
J=0-8, E=1-4           99       173      272
                      196       222      418

This affects the fourth segment of the table; this should now read:

Table 2
The weights of the High Income Sample

                   (A = 1,2,3,4,7,8)                      (A = 5,6)
                   Key G or H                             Key G or H
                   5                6-9                   5       0-9
J=9, E=0           138/97=1.423     65/32=2.031           0       0
J=0-8, E=1-4       113/99=1.141     124/108=1.148         0       0

This alteration should have only a slight effect on the weights for respondents with key G or H equal to 5 (i.e. Joint Return Income in 1959 and 1962 between $10,000 and $14,999).

2.2 Remarks regarding the second paper (WAIS 667-005)

A number of remarks can be made about this paper. The author claims, first of all, that the rematch of the State Roll and the Master File did not affect the weights greatly; this I will not debate. Then, however, the author continues by comparing the distribution of Joint Return Income in the weighted survey results with that of Federal Returns in Wisconsin for 1963 (table 3, page 7). Regarding this presentation I would like to point out the following things:

2.2.1. The presentation of the survey data in the table.

Dr. Moyer has a dislike of presenting percentage distributions where, due to rounding, the total does not add up to exactly 100%.
His method to circumvent this, however, is to add the missing percent(s) (or to subtract if the total adds up to more than 100%) on a rather subjective basis. I would suggest that, instead of this admittedly rather efficient technique, adding a decimal is more accurate and scientifically more justifiable. When we do this on the basis of the marginals for table 73 (X-TAB, run September 7), we can construct the following table:

Table 3
Weighted Joint Return Income

                   Weighted     Weighted       Weighted survey    Weighted survey
                   survey       survey         percentages        percentages
Income class       responses    percentages    (from table 73)    (as in WAIS 667-005)
<$1,000               808          7.8              8                  8
1,000-1,999          1140         11.0             11                 11
2,000-2,999           663          6.4              6                  6
3,000-3,999           676          6.5              7                  7
4,000-4,999          1235         11.9             12                 12
5,000-5,999           966          9.3              9                 10
6,000-6,999           761          7.3              7                  8
7,000-7,999          1003         10.5             10                 10
8,000-8,999           649          6.3              6                  6
9,000-9,999           214          2.1              2                  2
10,000 or more       2165         20.9             20                 20*
                    10363        100.0             98                100

The elements with the asterisk are the cases where the difference between the more accurately calculated results and those in table 3 of WAIS 667-005 exceeds .5%.

2.2.2. The design of the income categories

This objection concerns the fact that the last income category combines all people with income of $10,000 and over. Their number (2165) and percentage (20.9) suggest that a finer differentiation should have been made (especially when we regard the rather detailed differentiation of those with income under $10,000, where one category contained only 2%). This finer differentiation (again from table 73 marginals) then reveals an interesting fact: one category is clearly overrepresented (see table 4 below).

Table 4
Weighted survey marginals for Joint Return Income (high-income sample)

Joint Return Income     Weighted frequency
9,000-9,999                   214
10,000-14,999                1588
15,000-19,999                 341
20,000-24,999                  87
25,000-49,999                 114
50,000 and over                22
N/A                            13

It is obvious from this table that the category of people with income over $10,000 should have been split into at least two categories.
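As a rough check on this point, the Table 4 marginals can be turned into shares of the $10,000-and-over group. The sketch below is illustrative only: the frequencies are the ones listed in Table 4, while the variable and function names are made up for the example and are not part of any WAIS program.

```python
# Weighted frequencies for Joint Return Income of $10,000 or more,
# copied from Table 4. The 9,000-9,999 row is left out because it
# lies below the $10,000 cut-off.
high_income = {
    "10,000-14,999": 1588,
    "15,000-19,999": 341,
    "20,000-24,999": 87,
    "25,000-49,999": 114,
    "50,000 and over": 22,
    "N/A": 13,
}


def shares(freqs):
    """Return each category's percentage share of the group total, to one decimal."""
    total = sum(freqs.values())
    return {label: round(100.0 * count / total, 1) for label, count in freqs.items()}


group_total = sum(high_income.values())
group_shares = shares(high_income)

print(f"group total: {group_total}")
for label, pct in group_shares.items():
    print(f"{label:>16}  {pct:5.1f}%")
```

On these figures the six rows add up to 2,165, the same number reported for "10,000 or more" in Table 3, and the $10,000-$14,999 class alone accounts for roughly 73 percent of that group.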
In this context, the combination of the end categories (WAIS 667-005, p. 8) cannot be justified on any methodological basis.

2.2.3. Improper use of chi-square

It is my contention that chi-square cannot be used on percentage distributions, but should always be applied to raw frequencies instead. If we apply chi-square to the raw expected and observed frequencies of the data in WAIS 667-005, table 3, the observed weighted distribution of income is obviously significantly different from the income distribution of the state population. I would therefore conclude that the WAIS distribution does show a significant income bias: it appears that the very low income people (i.e. those with Joint Return Income less than $5,000) and the high income people (especially those with income between $10,000 and $15,000) are overrepresented, at the expense of the medium-income people (those with Joint Return Income between $5,000 and $10,000).

The same criticism (re improper use of chi-square) can be applied to the data in table 4 (WAIS 667-005, page 8); here again, inspection of the raw frequencies shows us that the sample distribution on education is significantly different from that of the state population.

2.3 Possible causes for the differences in distribution

There are a number of factors which could explain why the distribution of weighted responses is not as close a fit as would be desirable:

2.3.1. The range of the weights

While the lowest weight is theoretically 1.00, the highest weight shows up as 622.96. In other words, the responses of this particular person (1 out of 1300) account for 6% of the weighted responses; in many cases, omission of this particular response will change a relationship between two or more variables completely. The effect of any coding error in this particular response could be totally disastrous. There are, in total, twelve respondents (i.e. ca.
1% of the total number of respondents), all in the low income sample, who together account for 27% of the weighted results (all these people have weights exceeding 100.00).

2.3.2. A priori weights vs. a posteriori weights

While there may have been some justification for setting up a priori weights initially, the actual responses to the survey indicate that these weights should have been adjusted to a posteriori weights as soon as the survey responses were available. Differences between a priori and a posteriori weights can arise if respondents who were given "low-income" weights show up in the survey with high incomes (the cut-off point between low and high income was taken as $10,000), or the reverse (which is less likely to happen, but not impossible). There are various possible causes for this "switching":

1. The a priori weights were based on income information for 1962 or 1959, while survey responses deal with 1963 income. There are obviously some people who earned less than $10,000 in 1959 or 1962 and more than $10,000 in 1963.

2. There are probably some respondents who moved into the state during 1959 or during 1962; they therefore reported only part of their earnings over that year for Wisconsin Income Tax purposes. It is quite conceivable that this part was less than $10,000, while their 1963 earnings exceeded $10,000.

3. Underreporting of income for tax purposes is a widespread phenomenon; it is quite possible that some "under-reporters" stated their income on the tax returns to be below $10,000, while their responses to the survey might have been more truthful and over $10,000.

4. Various other inaccuracies in the responses to the survey (rounding!) could also have had effects.

Causes 1, 2 and 3 all have the effect of overstating the survey response in comparison to the incomes reported for tax purposes, while the effect of cause 4 is not known.

2.4.
Conclusions regarding the weights

Now, all the above considerations become important when we realize that, generally, low-income weights are designed to be higher than high-income weights, and that, therefore, the above elements have the effect of overweighting at least some high-income respondents. Concrete proof of this assumption can be found in table 83 (X-TABS, run September 7, second group), which tabulates the weights vs. the income of the head of the household. From WAIS 656-043 we can deduce that the range of weights for the high-income respondents is only between 1.14 and 17.38. Obviously, therefore, no person with a Joint Return Income of $10,000 or over should have a weight larger than 17.33. Since the income of the head of the household cannot be larger than the Joint Return Income for that same household (provided, that is, that our data are correct), we should not find any high-income people in table 83 with weights over 20. There are, however, 7 people who clearly have weights that are too high. This does not exclude the possibility that other high-income respondents have weights that are too high without exceeding the value of 20.00; these cases obviously cannot be detected by such a quick inspection of marginals. A reduction of the weights for the seven respondents mentioned above to an average value of 10.00 would result in a reduction of 300 in the total weighted responses in the $10,000-$14,999 income category.

My conclusion is that a validation study on income data and a subsequent reconsideration of the weights would be desirable.

3. Some observations regarding the validity of the survey data

3.1. Materials available

We now have a set of marginals on practically all the variables on the survey extract (see WAIS 656-061 for a description of this file), run on WISTAB, as well as a number of cross-tabulations run in various groups on X-TAB.
To the best of my knowledge, these tables were all run after all the corrections to the file, supposedly, were made (i.e. if we were to run the same tables again, with the same specifications, we would get output identical to the output we now have). If, contrary to this assumption, changes were made after the WISTAB marginals were run, new WISTAB tables are obviously required and most of the observations below will require revision.

The checks conducted regarding the validity of the data consisted mainly of a quick inspection of marginals to ensure that no "invalid" codes existed; some comparisons between tables, as well as some inspection of two-dimensional tables, were also conducted.

3.2. The effect of "blanks"

Before I proceed to a detailed discussion of errors and inconsistencies, I would like to discuss the effect of "blank" codes on the validity of coded data. A "blank" can be "produced" as a result of one (or more) of the following events (in chronological order):

1. The interviewer may fail to ask a certain question or fail to record a response. A blank can result if sufficient checks in the editing, coding and keypunching stages are lacking.
2. The coding specifications can allow "blanks" under certain circumstances.
3. A coder may fail to code a response to a given question; insufficient checks at subsequent stages will cause resulting blanks.
4. A keypuncher may fail to punch a particular field.
5. A keypuncher may fail to punch a particular card.
6. A card may be missing for other reasons.

The above listing may seem to be trivial; the significant observation to be made, however, is that only one of the above causes is legitimate (#2). Since, once the data have been punched, it becomes difficult to distinguish between "legitimate" and "illegitimate" blanks, we should, if at all possible, design our coding system in such a way that we have no legitimate blanks at all, or restrict the number of "legitimate" blanks to a minimum.
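The recommendation can be illustrated with a small sketch. It assumes a hypothetical coding scheme in which every field has an explicit "inapplicable" code, so that any blank left on a record is an error by construction. The field names, the codes, and the sample records are invented for illustration and do not come from the WAIS code book.

```python
# Hypothetical records: each maps a field name to its punched code.
# An empty or all-space string stands for a blank field.
records = [
    {"id": "0001", "marital_status": "1", "n_children": "3"},
    # "0" is an assumed code meaning "inapplicable", punched instead of a blank.
    {"id": "0002", "marital_status": "2", "n_children": "0"},
    # A blank: always an error under the assumed scheme.
    {"id": "0003", "marital_status": "1", "n_children": " "},
]


def find_blanks(records, fields):
    """Return (record id, field) pairs for every blank field.

    Because the assumed scheme has no legitimate blanks, every pair
    returned points at an interviewing, coding, or punching error.
    """
    problems = []
    for rec in records:
        for field in fields:
            if rec.get(field, "").strip() == "":
                problems.append((rec["id"], field))
    return problems


blank_fields = find_blanks(records, ["marital_status", "n_children"])
print(blank_fields)
```

With such a scheme an edit run needs no information from other tables: listing the blanks is the same as listing the errors.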
(This practice will facilitate our finding illegal and invalid codes and, therefore, facilitate the checking on validity.) Since, however, the survey under discussion allowed many blank codes, the convenient method of editing on blanks had to be used in a modified form:

1. Certain fields should still not be blank; in all these cases, all blanks are, therefore, invalid.
2. For fields where blanks are allowed under certain circumstances, we can sometimes calculate (on the basis of information from other tables) how many blanks are valid; any difference between the calculated number and the actual number indicates the existence of invalid data.

With this in mind, let us proceed to a summary of findings.

3.3. Summary of findings

Obviously, inspection of marginals can only reveal "impossible" codes if the boundary specifications are set up in such a way that all invalid codes fall in separate categories. Since this was not always consistently done for the tables under discussion, it is quite conceivable that additional erroneous items exist. Nevertheless, the following findings can be reported:

1. Blanks (clearly illegitimate). Table 5 summarizes the fields where blanks were found and the number of cases affected.

Table 5
Blanks located in WISTAB marginals

Field                                        # of blanks
Race                                             1300
Sex                                              1300
Use of windfall income                              6
Reasons for use of windfall income                  9
Quality of portfolio                                7
Highest school grade finished                       2
Vocational school attended                          3
Highest degree held                                 4
State code (of college education)                   3
State of graduate degree                            3
Field of highest degree                             3
Year of birth                                       3
State where R grew up                               3
Size of the place where R grew up                   4
R's father's educational level                      3
R's father's occupation                             5
R's employment status                               5
# of employees responsible to R                    56
R's occupation                                     10
R's industry                                       53
Method of income payment                          428
Overtime payment arrangements                     432
Total number of companies worked for              119
Number of hours worked per week                    13

Table 5 (Cont.)
Field                                        # of blanks
Number of important jobs ever had                  14
Most important occupation                          55
Industry of most important job                    714
Frequency of unemployment                         122
Social security coverage                            9
# of friends, relatives helped by R                 6
# of persons living with R                          6
# of trusts to R                                    5
Probability of receiving an inheritance             5
Satisfaction with home                              6
Number of rooms                                     5
Sign of expected income change                      9
Head's income                                       5
Total family income                                 5
Net worth class                                     9
Fixed return investment                            10
# of parcels of real estate owned                   9
Joint return income                                 5
R's wife's income                                   5
Income of other family members                      5
Trusts established by R                             5
# of transfer payments received                     6

2. Blanks (possibly illegitimate) were found for:
-- number of children: 1159 blanks were found, while the number of people never married was only 106 -- this leaves 1053 "unexplained" blanks.

3. Impossible codes (i.e. codes not specified by general or specific coding instructions) were found for:
-- use of windfall income: 4 were coded "26";
-- wife's educational level: 1 coded "23";
-- number of friends and relatives helped: code "0" is not specified as a valid code in the coding instructions, yet there are 962 respondents with this code;
-- number of persons who lived with R: code "0" not specified in the coding instructions, yet 971 respondents were coded "0";
-- own freezer: 1 coded "4";
-- own hi-fi: 1 coded "6".

4. Unlikely codes (these may be correct, but deserve some investigation):
-- highest grade of school finished: 5 respondents coded "00";
-- age: 1 person born between 1860 and 1864 (marriage took place in 1945);
-- occupation: 11 coded "00";
-- number of important jobs R had (for three years or longer!): 1 coded "09";
-- hours per week spent on part-time job: 39 people spent more than 35 hours per week on a part-time job;
-- trusts established by R: 1 coded "10", 1 coded "20".

5. Combinations of information from more than one table yielded, furthermore, the following inconsistencies:

a) educational level vs.
highest degree received:

                      by educational level    by highest degree received
bachelor's level              156                      157
master's level                 43                       27
doctoral level                 46                       62
other degrees                   -                        5
no degree                    1050                     1049

b) state where first degree received: information implies that 448 respondents had degrees; a tabulation of educational level, however, indicates that only 250 respondents received degrees, and only 413 had at least some college education;

c) a comparison of the information regarding the state where R's highest degree was received with educational level yields similar inconsistencies;

d) tabulation of the field of R's highest degree reveals 1051 coded either "00" or "blank" (ergo, did not have a degree), which is inconsistent with the findings by educational level and by highest degree attained;

e) while only 17 people responded that they were "looking for work", 20 answered the next question (reason for unemployment), contrary to coding and keypunching instructions;

f) 5 respondents indicated that they had made donations in property or stock in 1963, yet 13 answered that there was (or was no) tax advantage connected with these donations;

g) a comparison of the number of inheritances received by R with the amount of inheritance received reveals that 13 respondents received inheritances for "blank" amounts; the year when the last inheritance was received appeared to be "blank" for three respondents;

h) various tables giving information regarding the respondent's housing point up the following inconsistencies: 31 neither own nor rent; 192 rent furnished or unfurnished; therefore, 1027 respondents either owned their homes or did not answer the question. Yet:
391 blanks for "year when home was purchased";
437 blanks for "value of home";
403 blanks for "investment in home";
143 blanks for "amount of mortgage", 349 with amount

6.
Confusing codes (codes where the meaning is ambiguous):
-- number of relatives and friends helped: code "3" is ambiguous, can either mean "no" or "three mentioned";
-- number of persons who lived with R: same comment.

7. Inspection of two-dimensional weighted cross-tabulations (X-TAB) revealed an additional number of inconsistencies or unlikely situations:

a) table 60 (run August 23) reveals that 101 (weighted) people with an educational level of 30 or less (i.e. did not enter high school or did not finish first year high school) indicated a state where they had attended college; in addition, 542 (weighted) people with educational level 9-12 (i.e. entered high school, did not finish one year of college) are in a similar situation. Also, 23 (weighted) respondents with a Bachelor's degree indicated a state code "00" (which is possible, but questionable).

b) table 5 (run September 7): it is questionable that people (with the possible exception of service workers, farmers, and farm laborers) work more than 60 hours per week on their jobs (involves 237 respondents, weighted).

c) table 6 (run September 7) indicates 6 people with occupation code "00" who are, nevertheless, paid fees for their services.

d) table 11 (run September 7) indicates 6 people (weighted) with occupation code "00", but with 11-50 employees responsible to them.

e) table 31 (run September 7, second group) reveals 41 people (unweighted!) with head's income < $6,000, yet with weight 2 (see WAIS 656-043) not equal to 0. (It is unlikely that for all or even most of these people the Joint Return Income is larger than $9,999.)

4. Conclusions

1. It is the writer's opinion that:

a) while the weighting system itself might be acceptable, certain revisions to individual cases will be required (i.e.
cases with incomes "straddling" the $10,000 cut-off point as well as cases with extremely high weights) to at least approach a reasonable distribution;

b) while the evidence regarding the validity (or, rather, lack of validity) of the survey data is already substantial, there is, clearly, a large number of as yet undetected errors (e.g. if a mispunch did not result in an impossible or inconsistent code) in addition to the ones specified above;

c) the validity of the survey data, after we consider the remarks sub a) and b) above, is not as high as would be desirable for scientifically worthwhile analyses.

2. Therefore, it is recommended that:

a) a new check be made on missing cards in the survey file;
b) a new set of edits on the cards be made;
c) a new extract tape be constructed;
d) marginals for all extract variables be run with the specific purpose of locating all existing "impossible" codes;
e) "trick-tabulations" be designed with the specific purpose of catching inconsistencies;
f) each of the above stages be repeated until the results are satisfactory;
g) none of the above stages be started before the preceding ones are satisfactorily completed.

http://www.ssc.wisc.edu/wais/WAIS667007.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667007.txt

Mark Lieberman, 1967. Coversheet Format. February 6, 1967. WAIS paper 667-026. Survey Data and File.

Mark Lieberman
WAIS 667-026
February 6, 1967

COVERSHEET FORMAT

The following information is the first of the "hidden outputs" discussed in WAIS 667-022. It is an updated coversheet format to be appended to the WAIS survey code book. Unlike all previous card formats, it is a descriptive format instead of a prescriptive one. After the coversheets were listed, it became clear that the then existing formats did not really describe the cards. This format is intended to do so. Since it may not be the format that was used, it is subject to further change.
This may be the place to update the time estimates of 667-022 in order to measure the progress of the survey in the two weeks since that paper. The following table will summarize these changes:

P A Pr Cl Total time added
Step 1 10 10 11 1 1 IV 25

Hopefully, this will underline the necessity for new help on the survey (see WAIS 667-022).

Coversheet Code

Card Column

1        "C" - Identifying punch
2-9      ID number
         Note: 152 respondents were chosen from the State Tax File and were not previously in our files. These people will have new I.D.'s assigned starting with folder number 1000 for each name cluster.
10       Blank
11       1 if a substitute respondent answered. Blank otherwise.
12       Blank
13-16    Interview/Non-interview number
         0001 - 2999 = Interviews taken
         3000 - 3999 = No eligible R - married out of sample (divorced)
         4000 - 4999 = R deceased (no widow)
         5000 - 5999 = R moved out of state
         6000 - 6999 = R moved, no further data available
         7000 - 7999 = R moved within state - no address available or cannot be contacted by an interviewer
         8000 - 8999 = Refusal to grant interview
         9000 - 9499 = Unable to participate due to illness
         9500 - 9999 = Away for duration of study
17       Blank
18-21    Office Number: Interviews serially numbered before they were sent out
22       Blank
23       N = Nothing returned
         I = Interview returned
24       B if first assets booklet was returned, blank otherwise.
25       B if second assets booklet was returned, blank otherwise.
26       1 if interview was complete
         2 if interview was partially complete
         3 if interview was not taken

Columns 27 - 29 will only be punched if the respondent returned data but was for some reason outside the sample, i.e. columns 13 - 16 will have a number greater than 2,999, but an interview is still present. If both these conditions are not met, skip to 40.

27       Race of R
         1. White
         2. Negro
         3. Other
         9. N.A.
28       Sex of R
         M - male
         F - female
29       Marital status
         1. Married
         2. Single
         3. Widowed
         4. Divorced
         5. Separated
         9. N.A.
40-41    Month interview or non-interview noted received
42-43    Day interview or non-interview noted received
44-45    Year interview or non-interview noted received
         e.g. March 27, 1964 will be coded as 032764

If Assets book was never received or was received at the same time as the interview, skip to 61.

46       Blank
47-48    Month first booklet received
49-50    Day first booklet received
51-52    Year first booklet received
53       Blank
54-55    Month second booklet received
56-57    Day second booklet received
58-59    Year second booklet received
60       Blank

9 digit key

61       Sampling data
         1. Merged Master and State File in PSU
         2. Unmerged Master in PSU
         3. Unmerged state in name cluster and in PSU
         4. Unmerged state in name cluster and outside PSU
         5. Unmerged state not in name cluster and in PSU
         6. Unmerged state not in name cluster and outside PSU
         7. Unmerged Master outside PSU
         8. Merged Master and State file outside PSU
62       0 - Farm occupation
         1 - Non-farm occupation
63       0 - Taxpayer never reported dividends or capital gains
         1 - Taxpayer did report dividends or capital gains
64       How many years between 54-59 did R file tax returns?
         0 = None
         1 = 1-2
         2 = 3-4
         3 = 5-6
65       How much was taxpayer's average property income (Yp) (total less that from wages and salaries), 1954-1959?
         0   Yp < $200
         1   $200 < Yp < $500
         2   $500 < Yp < $1000
         3   $1000 < Yp < $2000
         4   $2000 < Yp
66       How much was taxpayer's average gross income (Yg), 1954-1959? (Coded as column 65 above)
67       How much was taxpayer's gross income in 1959 (Y59)? (Coded as column 65 above)
68       How much was taxpayer's gross income in 1962 (Y62)?
         Class bounds for 6, 7, and 8:
         0   Y < -1
         1   -1 < Y < +1
         2   +1 < Y < 1,000
         3   1,000 < Y < 5,000
         4   5,000 < Y < 10,000
         5   10,000 < Y < 15,000
         6   15,000 < Y < 25,000
         7   25,000 < Y < 50,000
         8   50,000 < Y < 100,000
         9   100,000 < Y
69       What percent of tax paid was withheld from the taxpayer's 1962 income (TW%).
         0   TW% < 10
         1   10 < TW% < 20
         2   20 < TW% < 30
         3   30 < TW% < 40
         4   40 < TW% < 50
         5   50 < TW% < 60
         6   60 < TW% < 70
         7   70 < TW% < 80
         8   80 < TW% < 90
         9   90 < TW%

If the key was not altered, skip to 79. Otherwise code 70-78.

70       Codes same as for 61
71       Occupation - codes same as for 62
72       Dividends - codes same as for 63
73       Returns? - codes same as for 64
74       Yp - codes same as for 65
75       Yg - codes same as for 66
76       Y59 - codes same as for 67
77       Y62 - codes same as for 68
78       TW% - codes same as for 69
79       1. Probably has mother or father in master file
         2. Has father or mother in sample
         3. Should be in master file but isn't
         (If none of these apply leave blank)
80       How is hand matching done?
         1. Name and address
         2. Name and occupation
         3. Name and dependent ages
         4. Combination of 2 and 3
         5. Portfolio
         6. Other
         7. Not ascertained
         (If none of these apply leave blank.)

http://www.ssc.wisc.edu/wais/WAIS667026.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667026.txt

Gene Moyer, 1966. A Profile of the Wisconsin Income Taxpayer, 1964. September 29, 1966. WAIS paper 667-011. General Papers (Regarding WAIS).

Gene Moyer
WAIS Paper 667-011
September 29, 1966

A Profile of the Wisconsin Income Taxpayer, 1964

The income tax has been the center of much controversy in recent years. During the Kennedy administration, the Council of Economic Advisors suggested that the rates should be cut so that more money would be available for personal consumption. This increase in personal consumption would, then, generate new investment and employment opportunities, and Gross National Product would approach full employment levels. The tax cut was passed in 1964, and personal consumption, investment, employment, and Gross National Product did increase as the Council had predicted. In fact, they increased to such a large extent that inflationary pressures existed in the economy.
Therefore early in 1966 some members of the Council of Economic Advisors suggested that taxes should be raised somewhat, possibly back to pre-1964 levels.

When the tax-cut controversy began, Professors Harold M. Groves, Martin H. David and Roger F. Miller decided that the reason for much of the controversy was that economists and the general public did not really know very much about the personal and economic characteristics of taxpayers. Therefore, in 1964, these professors asked the Wisconsin Survey Research Laboratory to interview a sample of persons on the Wisconsin State income tax rolls (and so presumably on the Federal income tax rolls) to determine their personal and economic characteristics. This report summarizes certain of the results of that study under four headings:

1. The Early Lives and Educations of Wisconsin Taxpayers
2. The Work of Wisconsin Taxpayers and the Rewards from Their Work
3. The Homes of Wisconsin Taxpayers
4. Wisconsin Taxpayers Look to the Future

1. The Early Lives and Educations of Wisconsin Taxpayers

Most taxpayers who responded to the study were in their forties and fifties, but respondents ranged in age from under 18 years old to over 100. Almost all of them were men because husbands gave information about their wives. Thus there was no need for a separate interview with the wife.

Most taxpayers (70 percent) were married at the time of the interview. The median length of time they had been married was 10 years.* This figure may be a little misleading, however, because it represents the length of time respondents had been married to their (then) current spouses. Many respondents had been married to other spouses earlier in their lives. Table 1 shows the complete breakdown of the marital status of respondents.
Table 1
Marital Status of Respondents at the Time of Interview

Marital Status                Percent of Respondents
Married                               70%
Widowed                               13
Divorced or separated                  6
Never married                         11
Total                                100%
Number of cases                      1300

Respondents had as many as thirteen children (some from more than one marriage), but the average (median) number was four.

-----------------------
*The median divides a distribution in half. Thus half had been married for 10 years or less; half had been married 10 years or more.
-----------------------

We know very little about the kind of life taxpayers lived when they were young, but we do know the jobs their fathers held while respondents were growing up, the size of the place in which respondents grew up, and the education respondents received. Table 2 shows the size of place in which respondents grew up and Table 3 shows the occupations of fathers while respondents were growing up.

Table 2
The Size of Place in Which Respondents Grew Up

Size of Place                             Percent of Respondents
Farm                                              31%
Village (under 2,500 population)                  20
Small city (2,500-50,000 population)              25
Large city (over 50,000 population)               24
Total                                            100%
Number of cases                                  1300

Table 3
Occupations of Fathers of Respondents While Respondents Were Growing Up

Occupation                                Percent of Respondents
Professional and Technical                         4%
Managerial                                         4
Farmers                                           41
Self-employed (excluding farmers)                  7
Clerical-Sales                                     3
Craftsmen-Foremen                                 27
Service, Labor                                    13
NA                                                 1
Total                                            100%
Number of cases                                  1300

Fathers of respondents were not so highly educated as their children, but they were not lacking in formal education. Table 4 shows the distribution of the highest grade respondents finished and the highest grade their fathers finished.
Table 4
The Highest Grade of School Finished by Respondents and by Their Fathers

Highest Grade                         Percent of       Percent of
Finished                              Respondents      Respondents' Fathers
Grade School (none-grade 8)               34%              67%
Some High School (grade 9-12)             52                6
Some College (1-3 years)                   6               11
College degree or graduate work            8                5
Don't know how much education
  father had                              --               11
Total                                    100%             100%
Number of cases                          1300             1300

In addition to the education shown in Table 4, 54 percent of respondents attended a vocational school. In many cases this vocational schooling allowed them to enter their present occupations.

Although most taxpayers grew up in Wisconsin, a significant proportion (16 percent) grew up in other states or in a foreign country. Interestingly enough, those persons who grew up outside Wisconsin were more likely to have completed at least one year of college than were taxpayers who grew up inside the state. Table 5 shows the distributions of highest grade finished by persons who grew up in Wisconsin and by persons who grew up somewhere else.

Table 5
Highest Grade of School Completed by the Place Taxpayers Grew Up

                                      Percent of Taxpayers Who Grew Up
Highest Grade of                      In             Outside          All
School Completed                      Wisconsin      Wisconsin        Respondents
Grade 8 or less                          37%            11%              34%
9-12 (graduation from High School)       50             56               52
1 year of College to College degree      11             18               11
Some graduate work                        2              7                3
Total                                   100%           100%             100%
Number of cases                        1012            288             1300

Table 6 shows that these persons who grew up outside Wisconsin were not educated at the expense of Wisconsin taxpayers.
Table 6
Respondents Who Had Some College: The State in Which Respondents Received College Educations by the State in Which Respondents Grew Up

                                State in Which Respondent's
State in Which                  College Was Located                      Number
Respondent Grew Up              Wisconsin      Other        Total        of Cases
Wisconsin                          92%            8%         100%          295
Other                               8%           92%         100%          121
All Respondents with
  some college                     65%           35%         100%          416

Notice that when we concern ourselves only with those respondents who had some college, 92 percent of those who grew up in Wisconsin received their educations from Wisconsin colleges; 92 percent of those who grew up elsewhere received their educations from colleges located outside Wisconsin. Thus while students from Wisconsin colleges do, no doubt, leave the state in great numbers to find employment after graduation, Wisconsin is attracting educated persons from other states to replace those who leave. In other words, while the "brain drain" from Wisconsin may exist, a "brain influx" also exists. The data do not allow us to measure the relative sizes of these two forces, but since one quarter of the respondents who had some college received their educations in other states, the "brain influx" must be substantial.

2. The Work of Wisconsin Taxpayers and Their Rewards from Work

Over 85 percent of our respondents were employed in 1963 even though state unemployment rates were quite high. Many respondents (14 percent) were not working at the time of the study, so we asked them to give their occupations in the last job they had. Thus the occupation data are an index of the latest occupation of taxpayers rather than of their occupations at the time of the study. Table 7 gives the distribution of respondent taxpayers by general occupation classification.
Table 7
Respondents to the Interview by General Occupation Class (Latest occupation)

Occupation Class                        Percentage of Respondents
Professional, Technical Workers                  11%
Business, Executives, Officials                   6
Non-Farm Self-Employed                            5
Farm Owners and Managers                         13
Clerical Workers                                  9
Sales Workers                                     7
Craftsmen, Foremen                               21
Laborers, Service Workers                        27
Not ascertained or not employed                   1
Total                                           100%
Number of cases                                 1300

About one in four of our respondents had from one to over 5,000 employees directly or indirectly responsible to them. About half of those who had any supervisory duties had from one to five employees responsible to them.

Respondents worked fairly long hours. The median number of hours worked per week was 53.7. Many respondents (24 percent), however, had such varied hours that they could give no absolute number of hours they worked per week. One in five respondents had second or even third jobs in addition to their main jobs. These jobs ranged from working weekends at a gasoline station to running a profitable manufacturing firm. Most persons who had second jobs worked from 10 to 20 hours per week at the second job in 1963.

Only one percent of our respondents said they were looking for work (i.e. unemployed) although unemployment rates were substantially higher than that in 1963. At the same time, one in four respondents said they had been unemployed "every year" or "every few years" since they had started working. Thus unemployment was a threat to many of our respondents.

Some respondents (7 percent) said they were retired in 1961, but this proportion is low because many indicated that after a short period of inactivity, they had taken new jobs and in some cases had new careers.

Average family income of our respondents was $6,824, approximately the same as the state average in 1963. Table 8 shows the average 1963 family income for each general occupation class.
Table 8
Average 1963 Family Income of Respondents by Latest Occupation Category

                                          Average Family
Occupation Class                          Income in 1963
Professional, Technical                      $10,432
Executives, Administrators, Officials         11,624
Non-Farm Self-Employed                        11,069
Farmers, and Farm Managers                     6,068
Clerical Workers                               5,158
Sales Workers                                  4,618
Craftsmen, Foremen                             7,637
Laborers, Service Workers                      4,470
Not ascertained or no previous job             3,190
All Respondents                                6,824
Number of Cases                                1,300

In addition to current money income, however, our respondents received many fringe benefits. Seven in ten received health insurance paid for at least in part by their employers, over one half received sick pay from their job, and over one third received a retirement nest egg from their employers.

3. The Homes of Wisconsin Taxpayers

Wisconsin taxpayers were in general well satisfied with the housing arrangements they had at the time of the interview. Nine respondents in ten said that they were either "very satisfied" with their living arrangements or "satisfied" with them.

Eighty percent of respondents lived in non-farm houses and 49 percent of respondents owned their own homes or were paying off mortgages on them. The bulk of these dwellings were single family dwellings, but 3 percent of respondents had rented apartments or rooms in the homes they owned. The average estimated value of the homes our respondents owned was $15,000 at the time of the interview.

Of those who rented homes, 85 per cent rented them unfurnished. The average monthly rent paid was $79 per month in 1963.

Of the farmers, 67 per cent owned their farms, most of which included 100 to 200 acres. Many farmers also rented land. Of these, 80 per cent rented 50 acres or fewer in 1963.

4. Wisconsin Taxpayers Look to the Future

Our respondents protected themselves against an uncertain future in many ways. One way was by owning assets. Table 9 shows the distribution of the net worth of respondents.
Table 9
The Percentage of Respondents in Each Net Worth Class

Net Worth Class               Percent of Respondents
Under $5000                           32%
$5000 - $9,999                        15%
$10,000 - $24,999                     27%
$25,000 - $49,999                     12%
$50,000 - $74,999                      7%
$75,000 - $99,999                      2%
$100,000 or more                       2%
Not Ascertained                        3%
Total                                100%
Number of Cases                     1,300

The proportion of respondents who own each of several assets asked about is shown in Table 10.

Table 10
Percentages of Respondents Who Owned Assets of Various Kinds in 1963

Asset Type                                      Percentage of Respondents
Assets with fixed rates of return (bonds,
  savings accounts, and so forth)                       82%
Common and Preferred Stocks                             27%
Holding in a business                                   25%
Real Estate (in addition to residence)                  15%
Home equity                                             62%
Insurance                                               81%
Total                                                   T
Number of cases                                        1300

T Total does not equal 100% because some taxpayers owned more than one kind of asset.

In addition to these assets, 36 percent of respondents were members of a retirement plan or had a program in which they only collected part of their salary in 1964 with the balance to be paid in the future.

Of course, respondents also looked to government for aid when their own resources were not adequate to meet the demands of living. Almost half (47 percent) of our respondents were receiving (or had received in the fifteen years 1949-1963) pensions, educational subsistence, social security payments, public assistance, workmen's compensation, or some other payment from government.

While our respondents were concerned for security, they were generous and optimistic about the future. One quarter of them supported someone outside their homes in 1963 or had persons other than those in their family living in their homes from 1959 - 1963. A substantial number gave goods or money to organized charities as well.

Our respondents were optimistic in that 57 per cent thought that their family income would be greater in 1968 than in 1963. Only 17 per cent felt that the income of their families would be less.
The major reason given for feeling that incomes would rise was that respondents felt the economy of the United States would grow and that their incomes would keep pace with that growth. Retirement was the major reason given by those who felt their family incomes would fall.

In addition, respondents were optimistic about their employment possibilities and were anxious to improve them. Thirty percent of our respondents had worked for more than one company during 1959 - 1964 and three-quarters of these felt that these job changes had improved their chances for success. One-quarter of those interviewed said that there was at least a possibility of their making a job change before 1968. In general these people were looking for a better future in this new job.

In summary, the "average" taxpayer in 1963 was forty to sixty years old, was married, and had four children. He grew up in a rural area and his father farmed. He had more formal education than his father. Most fathers of taxpayers had completed no more than eight grades of school while better than half of the taxpayers had completed grades nine to twelve. If he grew up outside the state, his chances of having gone to college were better than if he grew up inside the state, and he probably attended a school outside Wisconsin before (or even possibly after) coming to Wisconsin.

He probably worked as a laborer or service worker in 1963, although the chances were about as good that he worked as a craftsman or foreman. He probably worked fairly long hours on his job. If he didn't work long hours on his main job, he may have increased the total number of hours worked by taking a second (or even a third) job. His income was around $7,000 in 1963; but if he worked as a professional person or executive, it was probably around $11,000. If he worked as a laborer or service worker, it was probably around $4,500. These amounts do not include fringe benefits, however, and so are lower than his true income.
He lived in a non-farm house and probably owned it. If he owned it, it was probably worth about $15,000 in 1963. If he rented, he probably paid about $80 per month in rent.

If he owned his house, he was probably worth from $10,000 - $25,000; if he did not, he was probably worth less than $5,000. He owned several assets, the most valuable being the equity in his home.

He probably felt that his income would increase during the period 1963-1968 because the economy would grow and his income would grow apace. He was probably settled into a job he was going to keep for a long time, although he was possibly looking around for a better one if an opportunity could be found. This job was not the first job he had held after leaving school.

In short, Wisconsin taxpayers work long hours for good wages, and welcome new job opportunities. They desire certainty in an uncertain world, but they do not fear a future for which they have made plans.

http://www.ssc.wisc.edu/wais/WAIS667011.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667011.txt

Richard Bauman. 1966. A Plan for Preparation of a Longitudinal Selection File. October 6, 1966. WAIS paper 667-012. Longitudinal Analysis.

Richard A. Bauman
WAIS 667 - 012
October 6, 1966
Revised November 15, 1966

A Plan for Preparation of a Longitudinal Selection File

I. Introduction

A. Need for Longitudinal Selection File.

A meaningful analysis of individual or unit income time series from WAIS' Master File should include some explanations of why certain data years for persons are missing.* It is often possible to exploit coded items in the existing records to explain missing records.

-------------------------
*See Gene Moyer's Monograph 1, § 8.4 for a description of the problem. Also see Roger Miller's WAIS 645-050 for a description of the "gap-plugging" used in creating the 1964 Tax Averaging Law File #1.
-------------------------
An elementary breakdown of the causes of missing year records for persons involves two possible explanations:

1) The person was not eligible for WAIS's sample in the missing years,** e.g. a) was not in the WAIS name groups, b) was a resident of another state who did not have taxable income in Wisconsin, or c) was not a future wife of a person who was included in WAIS's sample. Or

2) The person was eligible and a) did not file because his income was below the filing limits in the missing years or b) filed a return which was not included in the sample or c) cheated.

-------------------------
**See below I. C. 5. d. for a definition of sample eligibility
-------------------------

Our data (on persons who appear somewhere in the sample) can be used to indicate somewhat different reasons for missing years:

1) Records indicate the person was not in WAIS's sample in certain missing years, e.g. moved into or out of the state;

2) Records indicate the person was eligible for the WAIS sample but did not have income over the filing limit in certain years, e.g. he indicates in a subsequent year that he had no or little income in the preceding year;

3) Records and a priori knowledge indicate it is highly improbable that the person would file in certain years, e.g. he may begin filing at the age of 16, earlier years may be missing because of no or little income, or

4) Records show only "low-quality" or "contradictory" information re missing years, e.g. the person claims he filed in a preceding year but no record is found in WAIS files.

B. Summary of Procedure

The logic used in preparing the longitudinal selection file is divided into 4 basic sets of rules. The first set treats the selection of the population. The extension of the population from all persons who filed a return in WAIS's sample to those persons plus their spouses is designed to provide a more representative population of females. The second set of rules treats the assignment of selected "sample entry" codes.
The third set deals with "sample exit" codes. The last set of rules is designed to allow "record" gaps to be filled according to coded information in existing records.

C. Summary of Classifications Used

1. Selection of the Extended Population

The Selection File should include all persons in the 1946-1960 Master File and all of their spouses during the same period. In the rules which follow in part II of this paper, we present a sequential series of tests which divides History File Records (supplemented by selected demographic codes from the Master, Benefit, FFID, and Age files) into six groups, depending upon the type of husband-wife unit the person belongs to (according to our records).

Group 1 Persons are those who filed a tax return or group of returns in WAIS's sample during the years 1946-1960 and for whom there is no indication of an existing or new marriage during the years they were in the sample.

Group 2 Persons are those who give some information about a spouse during the years they were in the WAIS sample; at least one person in the unit filed during 1946-1960; and there is no indication of more than one wife in the unit.

Group 3 Persons are those who do not indicate that they have a spouse yet both persons in the unit filed in at least one (and the same) year and the coders assigned a husband-wife ID# to the unit.

Group 4 Persons are those who indicate a change in spouses during the years 1946-1960, at least the husband filed at some time during the period, and there is no indication of more than two wives in the unit.

Group 5 Persons are those in units in which both the last wife (more than 1) and the husband filed during 1947-1960, variables indicating change in spouse status are not inconsistent with the assumed number of wives, and the coders found it necessary to assign "multiple-wife" ID#s.

Group 6 Persons are those in units in which 1 or more persons filed and which have a questionable or not easily defined number of persons in them.
These include all units which do not pass any of the tests for Groups 1-5. The most common cases would appear to be:

a.) "Single person" units which indicate a marriage but never indicate a spouse.
b.) "Single person" units that have the 7th digit of the WAIS ID equal to 2 or more.
c.) Units which have 2 or more marriage details codes (not 0) unless the number of indicated persons is consistent with information on a last wife who filed.
d.) Units which have 2 or more spouse death indications unless the number of persons is consistent with information on a last wife who filed.
e.) "Husband-wife" (per ID#) units in which both filed but never in the same year and neither ever indicates a spouse.
f.) Units in which 2 or more wives are indicated and both a marriage and subsequent spouse death are indicated and the last wife did not file.

Note that Groups 3 and 5 depend on certain (weaker) assumptions about the accuracy of the process of coding - specifically in assigning WAIS ID's. Explicitly, these assumptions are:

Group 3 - Although the coder coded the spouse indicator "0" (no wife's name given) he found sufficient information (address match - enclosures - marriage details) to assign husband-wife ID#'s to the unit.

Group 5 - Since the coder found it necessary to assign at least one wife ID (other than a first wife) we assume he gave the correct wife number designation to the "last" wife.

2. Sample Entry Codes

Entry into WAIS's sample is defined by the earliest year for which we have information for a person. Because of the special rules for inclusion of wife's returns (See I, C, 5, d. below), there is a marriage entry code (MENCit) which applies only to wives, as well as an individual entry code (IENCit) which applies to all persons in the extended population.
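The definition of sample entry given above can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the WAIS program: the function name `entry_year` and the per-year list `has_info` are hypothetical names introduced here. The only rule taken from the text is that entry is the earliest year, within 1946-1960, for which any information on the person exists.

```python
from typing import Optional, Sequence

FIRST_YEAR = 1946  # first year covered by the Master File
N_YEARS = 15       # one slot per year, 1946 through 1960


def entry_year(has_info: Sequence[bool]) -> Optional[int]:
    """Return the earliest calendar year with information, or None.

    has_info is a hypothetical 15-element sequence, one flag per year
    1946..1960, true when a filed return or spouse-supplied information
    exists for the person in that year.
    """
    if len(has_info) != N_YEARS:
        raise ValueError("expected one flag per year 1946-1960")
    for offset, present in enumerate(has_info):
        if present:
            return FIRST_YEAR + offset
    return None  # no information in any year
```

The same scan run from the last year backward would give the exit year, since exit is defined symmetrically as the latest year with information.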
Four codes are used to designate the type of entry by marriage:

Code 1 indicates that details concerning a (new) marriage are given,
Code 2 is used to indicate, in the absence of marriage details, a new marriage that must have occurred because of the existence of a prior wife in the household,
Code 3 indicates a probable new marriage when there is an indication that the unit was not married throughout the time it appears in the sample,
Code 4 identifies wives who are married when the unit begins to file in the sample.

When the first year of information for a person is a year in which he filed, the individual entry code (IENCit) is merely a recoding of the return-reason code. When the first year of information for a person is the result of information given by the spouse, the entry code (IENCit) is a recoding of the spouse information. The codes are classified according to the "quality" of the data.

3. Sample Exit Codes

Exit from WAIS's sample is defined by the latest year for which we have information for a person. Exit codes are the marriage exit code (MEXCit) and the individual exit code (IEXCit). Four codes are used to designate an "exit" of the wife related to marriage:

Code 1 indicates a marriage dissolution because the husband died,
Code 2 indicates a marriage dissolution that must have occurred because of the entry of another wife,
Code 3 indicates a probable marriage dissolution when there is an indication that the unit was not married throughout the time it appears in the sample,
Code 4 designates wives who are married when the unit stops filing.

Individual exit codes are assigned when a person dies, leaves the state, or does not file for some other reason (unknown), or, in the case where the last year of information is from the spouse's data, are a recoding of the spouse information in the last year for the unit. Again, the codes are classified according to the "quality" of the data.

4.
Gap Plugging Codes

The gap-plugging codes are an extension of the binary indicator of records present in the Master File. Three types of codes are combined in the gap-plugging code: a.) type of gap: pre-entry, post-exit or "gap", b.) information concerning the gap, and c.) a "distance" or extent of extrapolation index for selected gap information types. The "b" part of the code is the most relevant part. Spouse information in a current year always has priority over information from later years. Otherwise, later-year information is used extensively in deriving gap-plugging codes, except where post-exit gaps exist. The gap-plugging codes therefore are essentially recodings of the return-reason and spouse information codes. These codes also have some "quality" indication.

5. Notes

a.) In selecting the extended population, the tests must be applied in sequence, i.e. Group 2 persons must fail the test for Group 1; Group 3 persons must fail the test for Group 2, etc.

b.) In assigning sample entry, exit, and gap-plugging codes, sequencing is important in some cases. Priorities are specified in the rules.

c.) Each wife in the extended population should have a MENCit and a MEXCit and at least one year designation for each. Each person in the extended population should have an IENCit and an IEXCit (and associated years) and 15 gap-plugging codes.

d.) The definition of sample eligibility. By persons eligible for the WAIS sample, we mean all persons in the name group sample during the years 1946-1960 who were subject to the Wisconsin State personal income tax.*** Persons subject to the tax therefore include resident persons, and nonresident persons having income earned within the state. Note that this is a more inclusive definition than persons who filed a return during the years 1946-1960. Many persons subject to the tax were never required to file returns. In addition, some persons were presumably required to file, but did not.

--------------------------
***From the Wisconsin Statutes, 1965, Chap. 71.01(1): "...There shall be assessed..., and paid, a tax on all net income ... by every natural person residing within the state; by every non-resident, natural person upon such income as is derived from property located or business transacted within the state, and also by every nonresident natural person upon such income as is derived from the performances of personal services within the state.... Every natural person domiciled in the state shall be deemed to be residing within the state for the purposes of determining liability for income taxes...." Wisconsin Statutes Annotated, on 71.01(1), shows only minor changes in previous definitions during the years 1946-1965, with the following exceptions: (1) All persons in the Armed Forces were excluded during 1942 - 1940, and (2) Prior to 1963, a 7 month "domicile" test was included in the law (See last sentence in 71.01(1) above).
--------------------------

The choice of a name group sample and the Tax Department's filing system for return archives are the other relevant considerations in defining "Persons eligible for the WAIS sample". The complete definition is all persons subject to the Wisconsin Personal Income Tax who:****

--------------------------
****See also WAIS 656-033 "The Treatment of Wives in WAIS Tax Sample"
--------------------------

1. were single or married males having names in the selected name groups during 1946-1960; or
2. were wives of males in 1; or
3. were single females having names in the selected name group during 1946-1960 who did not subsequently (before 1961) marry males not in the selected name groups during 1946-1960; or
4. were unmarried females who subsequently (before 1961) married males in the selected name groups during 1947-1960; or
5.
were former wives of males in the selected name groups who did not subsequently (before 1961) marry males not in the selected name groups during 1947-1960.

e.) An illustration of the development of the Selector File.

A. (Unknown) All persons ever eligible for the WAIS Sample
B. (Output of the "Extended population") - Persons who filed a return in the WAIS sample plus identified spouses who did not file.
C. (Unknown) Eligible persons not sampled. Includes:
   1. Not required to file.
   2. Required to file but didn't.
   3. Filed a return(s) which was not included in the sample.
D. (WAIS Master File Individuals) Persons who filed a return which was included in the WAIS sample.
E. (Added persons resulting from extension of population) Spouses who did not file.
F. (Input to coding of Entry, Exit, and Gap-Plugging Information.)

II. Suggested Rules for Selection File Development

A. Definitions

1. Data Files, Abbreviations
   a. Longitudinal Selection File (Subfile)   SF
   b. History File                            HF
   c. Master File                             MF
   d. Benefit Data File                       BF
   e. FFID File                               FFID
   f. Supplementary Age Data File             Age

Note: Further recoding and verification of the SF may involve extensive use of data files such as the Social Security Earnings (805) File of the Benefit Data File. At present, this paper deals primarily with coded items in the Master File supplemented by one or two data fields in the BF, FFID, or AGE files (See 5 below).

2. Subscripts
   a. i => individual
   b. j => spouse
   c. k => individual or spouse or both if both filed
   d. t => year [1(46), 2(47) --- 15(60)]
   e. t*, t' => given values for t
   f. a = (1, 2, --- 60-t*)
   g. b = (1, 2, --- t*-46)

3. Number of Adults in Unit
   N = the number of WAIS ID#'s in a household that have digit 8 = 0 (Source: HF); 1 otherwise

4. Wife Number and Sex Indicator
   WN = 7th digit of WAIS ID#
   0 => male
   1 => 1st wife or single female
   2 => 2nd wife
   3 => 3rd wife
   etc.

5. Existing Data Used in Processing

   Label                   Meaning, Codes                                   Abbr.    Source(s)
a  Birth Year              --------                                         BYi      BF, AGE, HF
b  Death Year              --------                                         DYi      BF, MF, AGE
c  Spouse-Separate         Answers to "Does Spouse have separate Income?"   SSYjt    MF
   Income Code             0 - No spouse (no answer)
                           1 - Spouse - no separate income
                           2 - Spouse - yes separate income
                           3 - Spouse died during year
d  Return Reason Code      Answers to "Did you file last year - Why not?"   RRkt     MF
                           1 - Yes
                           2 - No - insufficient income
                           3 - No - student
                           4 - No - lacked knowledge
                           5 - No - in military service
                           6 - No - not resident of Wisconsin
                           7 - No - unemployed
                           8 - No - no reason given
                           9 - Not ascertained
e  County Prior Year       Answers to "In what county did you live          CPYkt    MF
                           last year?"
                           01-71 Wisconsin counties
                           98 Out of state
                           99 Not ascertained
f  Residence Location      Residence location in current year               RLkt     MF
   Codes                   010-719 Wisconsin residence
                           980 Out of State
                           999 Unknown
g  Marriage Details Code   Answers to "Were you newly married in t?         MDkt     MF
                           Details?"
                           0 - No marriage
                           1 - Marriage, no details
                           2 - Marriage, details
h  Total of sources        $ Total of sources of Income                     TSkt     MF
i  Type of Form            Type of form on which taxpayer(s) filed          FTkt     MF
j  County of Latest        Code for last known address                      CCkt     FFID
   Address Code            01-71 Wisconsin Counties
                           98 Out of State
                           99 Not Ascertained
k  Record Present          0 - Master file record not present               RPit     HF
   Indicator               1 - Master file record present

6. Other variables developed in SF processing are specified in parts C - F below; these are Group Type, MENCit (t*, t'), IENCit (t*), MEXCit (t*, t'), IEXCit (t*, t'), GPit.

B. Proposed Format for SF

The SF should eventually become part of the WAIS HF. We may find that creation of the SF is facilitated by forming a "work file" which is later integrated with the HF.

Output File                                                         Input    Location in
Field Label  Size  Item Description                                 File     Input File
 1. SFID       8   WAIS ID number                                   HF       1-8 plus dummy ID's created
 2. SFRP      15   History File Master Record Present Indicators    HF       9,15,21,27,33,39,45,51,57,63,69,75,81,87,93
 3. SFFY       2   First year filed return                          HF       99-100
 4. SFLY       2   Last year filed return                           HF       101-102
 5. SFGY       2   First year filed return (not 1946)               BF       103-104
 6. SPRY       2   Last year filed return (not 1960)                         105-106
 7. SFBY       4   Mo., Yr. of Birth                                BF       132-136 plus additions
                   Note: Birth year information in the BF at present is incomplete; neither AGE nor BF BYi data is in the BF
 8. SFSI      15   SSYit                                            MF       23 (each year filed)
 9. SFRR      15   RRit                                             MF       21 (each year filed)
10. SFCP      15   CPYit indicator                                  MF       15-16 (each year filed)
                   0 - 01-72
                   1 - 98
                   2 - 99, other
11. SFRC      15   RLit indicator                                   MF       12-14 (each year filed)
                   0 - 010-729
                   1 - 980
                   2 - other
12. SFMD      15   MDit indicator                                   MF       24 (each year filed)
                   0 - 0
                   1 - 1, 2
                   2 - other
13. SFTS       7   TSit (1946-1952 only) indicator                           127-135 (each year filed 46-52)
                   0 - TS > 3500 (46-48)
                   0 - TS > 5000 (49-52)
                   1 - TS < 3500 (46-48)
                   1 - TS < 5000 (49-52)
14. SFFT       7   FTit (1946-1952 only) indicator                           387 (each year filed 46-52)
                   0 - FTit not = 2
                   1 - FTit = 2
15. SFCC       1   CCit indicator                                   FFID     120-121
                   0 - 01-72
                   1 - 98
                   2 - other
16. SFAY       2   Year of Last Known Address                       FFID     122-123
17. SFDY       4   Mo., Yr. of Death                                Age, MF, BF   23 (MF)
18. SFNH       1   "N"                                              *
19. SFWN       1   WN                                               *
20. SFGT       1   Indicator                                        *
                   1 - Passed test for Group 1
                   2 - Passed test for Group 2
                   3 - Passed test for Group 3
                   4 - Passed test for Group 4
                   5 - Passed test for Group 5
                   6 - Passed test for Group 6
                   0 - Failed all tests
21. SFDR       1   Indicator of type of Dummy Record                *
                   0 - No Dummy - in MF
                   1 - No Dummy - not in MF but in BF
                   2 - Dummy created
22. SFME       1   MENCit                                           *
23. SFM1       2   Primary year for MENCit                          *
24. SFM2       2   Secondary year for MENCit                        *
25. SFIE       2   IENCit                                           *
26. SFI1       2   Year of IENCit                                   *
27. SFMX       1   MEXCit                                           *
28. SFM3       2   Primary year for MEXCit                          *
27. SFM4       2   Secondary year for MEXCit                        *
28. SFIX       1   IEXCit                                           *
29. SFI2       2   Primary year for IEXCit                          *
30. SFI3       2   Secondary year for IEXCit                        *
31. SFGP      30   Gap Plugging codes - 2 char. each year 1946-1960 *

* Calculated in SF Program (See Rules)

C.
Rules for Selection of the Extended Population

1. Preliminaries

It is assumed in what follows that the integration of records for persons with more than one ID number has been accomplished. It would also be helpful if discoverable inter-year and inter-person inconsistencies (see WAIS 656-020) were resolved before the selection file was created. The following rules are designed to minimize the effect of existing inconsistencies, both discoverable and unknown. The number of "impossible" and "not ascertained" cases could be reduced if we were willing to attempt to resolve the discoverable inconsistencies.

2. General

The SF should include all persons in the MF and all of their spouses for the time period covered by the MF. N is derived from the HF because I feel the number of dummy records to be created for wives who never filed but existed during the period covered by the MF will be substantially reduced by the number of wives who appear in the BF and thus the HF. We define six types of units based on the variable N and the coded information RPit, SSYit, MDit and WN. In many cases it will be necessary to consider coded data for more than one person in a given household. Unless otherwise stated below:

use of the "i" subscript means only the individual himself must be considered in determining his status;
use of the "j" subscript means only the person's (then-current, given t) spouse should be considered;
use of the "k" subscript means both (current spouses') coded items should be considered; if both records are available, either record can be used to determine the unit's status.

3. Creation of "Dummy" SF Records where indicated in Groups 2, 4, and 5

Indicated missing wives (or husbands from Group 2) should be given their own SF Record so that the appropriate entry, exit and gap-plugging codes can be assigned in later processing.
The following information can be recorded at the time of creation of the record; all other fields will, of course be unknown: 1. SFID 18. SFNH ("N does not include this person) 19. SFWN (same as 7th position of 1.) 20. SFGT (=2, 4 or 5) 21. SFDR (=2) 4. Test for Group 1 - Consistently Single Persons All persons in the HF should be given the following test: If 1. N=1 and 2. RPit = 1 (at least one t) and 3. SSYit = 0 RPit 1 (all t) and 4. MDit = 0|RPit=1 (all t) and 5. WN=0 or 1 we will treat the person as being consistently single throughout the years he was included in the sample. All persons who fail the above test should be given the following test. 5. Test for Group 2 - Married persons - only one wife If 6. N.1 or 2 and 7. RPkt 1 (at least one t) and 8. SSYkt>0|RPkt = 1(at least one t) and 9. (the # of MD 1 (all t) and kt>0)< 10. (the # of SSYkt 3)5 1 (all t) and 11. (RP 0|SSYjt* 3 => RPjt*=1 (All a) and 12. it*+a SSYit*+6 0|SSYit*=3 => RP it*=1 (all a, if any, where RP 1) and it*+a 13. SSYkt*-b=0|MDkt*>0 => RP and kt*=1 14. WN=0or1|N=1 and 15. WN=0 and 1|N=2 we will treat the persons in this unit as being married but only indicating one wife. If N=1, a dummy SF record for the spouse should be created. Males should be assigned an ID in the household ending in 00, females should be assigned an ID in the household ending in 10. It should be noted that these tests, of course, are not foolproof - we can only hope to specify a strong, but meaningful and manageable group of tests. An example - A man who had two wives in our sample could pass the above test (and be misclassified) if A. Wife number 1 died and(1.) there was no indication of her death (a.) because the man didn't file that year, or,(b.) the man didn't say wife died when she did or (c.) SSYit was miscoded and (2.) both wives did not file and (3.) the second wife was given an erroneous ID and (4.) there was no indication of a remarriage. Or B. Wife number 1 was divorced and (1.) 
both wives did not file and (2.) the second wife was given an erroneous ID and (3.) there was no indication of a remarriage. Possibility B is obviously more likely than A; however the tests appear strong enough to preclude errors of this sort in most cases. 6. Test for Group 3 - Married Persons - Only one Wife Persons who fail the test for Group 2 should be given the following test: If 16. N=2 and 17. (RP it) . (RP j t) =1 (at least one t) and 18. Same as 9 and 19. Same as 15, We will treat the persons in this unit as being married but only having one wife. Note that SSYkt (must) =0 (all t) in this case but the coder assigned husband-wife ID's to this unit anyway. 7. Test for Group 4 - Married Persons - Only two Wives Persons who fail the test for Group 3 should be given the following test: If 20. N=1 or 2 or 3 and 21. Same as 7 and 22. Same as 8 and 23. Same as 9 and 24.. Same as 10 and 25. SSYit*+a 31 MDkt*>0 (All a, if any, where RP it*+a 1) 26. Either SSYkt*+a>0 or MD SSYkt* 3 => RP kt*=1 (at least one a => RP 1, some a) and kt*+a 27. SSYkt*-b>0|MDkt*>0 =>RPkt*= 28. (at least one b =>RPkt*-b=1, some b) and WN=2 in t+.a|RPit*+a1, SSYjt* 3 = . * 1 29. (all a, if any, where RP 1) and it*+a W161 in t*-b t RPit*-b= 1, MDkt*> 0 =>RPkt* 1 30. (all b, if any, where RPit*-b=1) and WN 0 J N_ 1 and 31. WN=0 and I or 21N=2 and 32. WN=0 and 1 and 21 n=3, We will treat the unit as one in which there are two wives. If N =l, dummy SF records for both spouses should be created (wives 10 and 20). If N=2, a dummy SF record for the "missing" wife should be created depending upon the outcome of rules 30 or 31 above. 8. Test for Group 5 - Married Persons Two or Three Wives - Last Wife Filed. Persons in units which fail the test for Group 4 should be given the following test: If 33. Max WN 2 or 3 and 34. RP 1, RP 1|WNt Max WN (at least one t) and st jt 35. (the # of MDkt>0) 5 Max WN (all t) and 36. (the # of SSYjt=3)_< Max WN-1 (all t) and 37. 25N5 Max WN+1 and 38. 
WN=0 and 2 or 0 and 31 N=2 and 39. WN=0 and 1 and 2 or 0 and 1 and 3 or 0 and 2 and 31N=3 and 40. WN=0 and 1 and 2 and 31N=4, We will treat the unit as one in which the number of wives is equal to the largest wife number. If N=2, dummy SF record(s) for the missing wife (wives) should be created. If NO, a dummy SF record for the missing wife should be created if WN=either 0 and 1 and 3 or 0 and 2 and 3. 9. Test for Group 6 - Persons in Units which filed but did not pass preceding tests. Any unit which did not pass the preceding tests but has 41. RP kt=1 (at least one t) should have all existing HF information for all known persons in the unit printed out for clerical determination of the proper unit size. D. Rules for the assignment of Sample Entry Codes There are two types of codes to be used for determination of entry into the sample: Marriage Entry Codes (MENCit) which apply only to all wives in the extended population, and Individual Entry Codes (IENCit) which apply to every person in the extended population. 1.) The Four codes to be used for MENCit*are: MENCit* Meaning Required Values of t 1 Positive information on occurance of marriage- Primary and Secondary marriage details given 2 Bounded information on occurance of marriage-- Primary and Secondary no marriage details given 3 Proximate information on occurance of marriage Primary and Secondary no marriage details given 4 Unit was married prior to (unit) entry Primary only Code 1 should be assigned when marriage details relavent to this wife are given, i.e. MD kt*>0. The primary t value should be the indicated year that the marriage took place, t*. Since we may have returns for the wife in years preceding the marriage, it is also necessary to indicate a secondary t value, t', the year (if any) in which this wife filed her first return. of t10. The secondary t value indicates (as in code 1) the year (if any) in which this wife filed her first return. 
Code 3 should be assigned when marriage details relavent to this wife are not given, no prior wife exists, but there is some indication that the unit was not always married. The primary t value should be the first year for which SSYkt*0 if RPkt*-a = 1. The secondary t value should again be the year (if any) this wife filed her first return. Code 4 should be assigned if, in the first year for the unit i.) SSYkt*>0 and ND kt* 0 The primary t value should be the first year the unit appears in the sample. 2.) The codes to be used for IENCit* are: IENCit* Meaning 01 Positive information concerning entry 02 Proximate information concerning 03 Proximate information concerning 04 Proximate information concerning 05 Proximate information concerning entered state entry previously military entry previously student entry previously unemployed entry previously had low or no income 06 Proximate information concerning entry spouse says no income in t* 07 Proximate information concerning entry filed jointly with spouse in t* 08 Proximate information concerning entry spouse says died in t* 09 Low quality information concerning entry- lacked knowledge 10 Low quality information concerning entry- gave no reason 11 Low quality information concerning entry- did not answer 12 Inconsistent information concerning entry- claims filed previously 13 Inconsistent information concerning entry- spouse says i filed in t* 14 Claimed filed previously but filed before 1948 IENCit* is assigned in the first year information is available for a person - t* is the earliest year either RPit 0, SSYjt>0 (codes 6-C or 1 ) or RPit=1 (codes 1-5,9-12 , or 14). Code 1 should be assigned if either RRit* 6 or CPYit*=98 This code has priority over codes 2-5, 9-12, 14. 
Code 2 should be assigned Code 3 should be assigned Code 4 should be assigned Code 5 should be assigned Code 6 should be assigned if RR it*5 if RRit*3 if RR it*= 7 if RR it*=2 if SSYjt* 1 if SSYjt*=2 and t*=47 or 48, FTj t = 2, TSjt< 3500 or 49< t* s 52, FTjt= 2, TSjt < 5000 Code 8 should be assigned if SSYjt* 3 Code 9 should be assigned if RRit*4 ,Code 10, should be assigned Code 11 should be assigned Code 12 should be assigned RR it* 1, t** 46, 47 Code 13 should be assigned Code 14 should be assigned Code 7 should be assigned if RR it*" if RRit*= 9 if SSYj t* =2 and person failed tests for Code 7 if the person fails the second part of the test for Code 12. It may be desirable to add a Code 15 to take care of invalid e.g. (RR it* b) source codes, provided that no editing is done prior to this stage. IEXCit and the gap-plugging codes should also be modified to allow for invalid source codes. E. Rules for Assignment of Sample Exit Codes As in the coding of Sample Entry, we have both Marriage Exit Codes (MEXCit) which apply only to all wives in the extended population and Individual Exit Codes (IEXCit) which apply to every person in the extended population. Unfortunately we are unable to discover quite as much information concerning sample exit, consequently there are fewer codes. MEXC it* 1.) The Four codes to be used for MEXCit* are: Meaning 1 Positive information concerning dissolution of primary and marriage husband died secondary 2 Bounded information concerning dissolution of primary and marriage husband remarries secondary 3 Proximate information concerning dissolution primary and of marriage secondary 4 Unit was married at time of (unit) exit primary Code 1 should be assigned when SSYit*=3. The primary t value is t*, the year the husband died; the secondary t value, t' is the last year this wife filed a return. Note that t' must fit* in this case. Code 2 should be assigned only when a subsequent wife is indicated. 
The primary t value is the last year for which SSYkt>0. The secondary t value designates the year (if any) in which this wife filed her last return. Code 3 should be assigned when a subsequent wife is not indicated but SSYkt=0 in the last year available for the unit. If the above is true, then the primary t value, t* , is the last year for which SSYkt>0, and t' designates the year (if any) in which this wife filed her last return. Code 4 should be assigned if, in the filed, SSYkt*>0 but SSYit*+3 2.) The Eight codes for IEXCit are: Required IEXC it Meaning Values of t 1 Positive information concerning sample exit-death primary 2 Positive information concerning sample exit left state primary 3 Proximate information concerning sample exit spouse says primary no income in t* 4 Proximate information concerning sample exit filed jointly with spouse in t* primary 5 Bounded information concerning sample exit-- primary and left state secondary 6 Inconsistent information concerning sample exit-- primary spouse says i filed in t* 7 Low quality information concerning sample exit-- primary 1958 or earlier - no reason 8 No exit information but filed after 1958 primary IEXCit* is assigned in the last year information is available for a person - t* is the last year either RP it 0, SSYjt>0 (codes 1,3,4, 5 or 7) or RP it 1 (codes 1,2,5,6,8). last year either spouse Code 1 should be assigned if either SSYjt*=3 or DYi* exists. Code 2 should be assigned if, either RLi 980 or CCit*=98. This code has priority over code 1. Code 3 should be assigned if SSYjt*= 1 Code 4 should be assigned if SSYjt*=2 and t *=47 or 48, FT. 2, TSjt< 3500 or 49t*,the primary t value is the last year of information, t*, the secondary t value is t', the year associated with CC it. This code has priority over code 1. Code 6 should be assigned if SSYjt*=2 and person failed test for code 4. 
Code 7 should be assigned if none of the above codes were assigned and the last year of information for the person is 58 or earlier.
Code 8 should be assigned if none of the above codes were assigned and the last year of information for the person is 59 or 60.

3.) Primary and Secondary Years (MENCit*) or (MEXCit*).

MENCit* codes 2 and 3 and MEXCit* codes 2 and 3 will be assigned to persons who never have an indication of a current spouse but are considered married (e.g. Group 3 persons). For these persons a primary entry (exit) year (t*) cannot be assigned; the secondary entry (exit) year (t') will be assigned since the person must have filed. In all other cases, primary entry years are required and secondary entry years are required only if the wife filed. These codes may be useful input to an analysis routine which generates summary marital status indicators (for both husband and wife) and which incorporates some information on the probability of being married but not reporting it (e.g. thru $ tax credit--a sample?). A summary marital status indicator may be of the following form:

a.) indicates married in t - treated as married in t
b.) does not indicate married - treated as married in t
c.) does not indicate married - treated as single in t

F. Rules for the Assignment of Gap-Plugging Codes

1.) A gap-plugging code should be assigned in each of the years 1946-1960 for each person in the extended population. The gap-plugging codes are:

GPit     Meaning
01       Positive information concerning yr. - MF Record present
02       Positive information concerning gap yr. - spouse indicates null income
03       Proximate information concerning gap yr. - filed jointly with spouse
04       Inconsistent information concerning gap yr. - spouse claims i filed
05       Inconsistent information concerning gap yr. - spouse doesn't indicate married
06       Indefinite information concerning gap yr. - later year only spouse filed - this year neither filed
1S-JS    Positive information concerning gap or post-exit yr.
- left state 2S -BS- Bounded information concerning yr. or pre-entry yr. - left state or entered K S Bounded information concerning yr. or pre-entry yr. - left state 4S-DS- Bounded information concerning gap or pre-entry year - student 5S-ES- Bounded information concerning gap or pre-entry year - military 6S-FS Proximate information concerning gap or pre-entry year - low income or unemployed 7S-GS Inconsistent information concerning gap or pre-entry year - i claims i filed 8S-HS Indefinite information concerning gap or pre-entry year - lacked knowledge, NR, NA 90 Indefinite information concerning pre-entry year - later year only spouse filed 91 Indefinite information concerning post-exit year earlier year only spouse filed 92 Positive information concerning year - died(but did not file) 93 Positive information concerning post-exit year - died previously 94 Indefinite information concerning post-exit year - no indicated reason 95 Entered sample in 1947 only used in 46 if otherwise inconsistent 96 Left sample in 1959 only used in 60 if otherwise inconsistent Codes 01 - 06 and 90-95 are unique Codes 11 - H9 designates classes of gap-plugging codes wherein the first of the two digits designate the "type of gap" - i.e. 1,2,4,5,6,7,8 indicate a gap which is post-entry and pre-exit; B,D,E,F,G,H indicate a pre-entry "gap"; and J,K, indicate a post-exit "gap". The second digit of codes 11-H9 is a "distance" code which indicates if t is the gap year: Second digit of GP it Indicates information given in Or 1 (Not post-exit) to (Post-exit) t-1 2 (Not post-exit) t+2 (Post-exit) t-2 9 (Not post-exit) t4-9 or more (Post-exit) t-9 or more Code 01 should be assigned whenever RP it=1 Codes 02-96, should be assigned if RPit=0 and: Code 02 SSYjt 1 Code 03, - SSYjt =2 and a.) t=47, 48, FT it 2, TS it <3500 or b. 49 t 52 FT =2 TS. <5000 jt t Code 04 - SSYjt 2 but failed test for code 03 Code 05 - primary MENCktIENCit yr. 
and in first year following gap, RP it =0, RPjt = 1 Code 1S should be assigned to the years t*+1 to t*.-a-1 for which a,) all RPit*+1,..., RP and it*+a-1=0 b.) RP 1 and it* c.) RLit*= 980 and d.) RP i t*+a = 1 and e.) CPYit*+a = 98 or f.) RR it*+a 6 Code JS should be assigned to the years t* l ,..., 60 if IEXCit*=2 Codes 2S-BS should be assigned if, in the first year following the gap, RP it 1, and RR it 6 or CPYit=98 (Assign code BS if IENCit=01) Code LS should be assigned if IEXCit 5, to all years following the primary exit year Codes 4S-DS should be assigned if, in the first year following the gap, RPit=1, and RRit = 3 (Assign code DS if IENCit=03) Codes 5S-ES should be assigned if, in the first year following the gap, RP it 1 and RR it 5 (Assign code ES if IENCit 02) Codes 6S-FS should be assigned if, in the first year following the gap, RP it 1 and RRit=2 or 7 (Assign code FS if IFNCit=05) Codes 7S-GS should be assigned if, in the first year following the gap, RPit=1 and RR it 1 (Assign code GS if IENCit=12) and t>47 Codes 85-HS should be assigned if, in gap, RP it 1, and RR it 4,8 or 9 (Assign code HS if IENCit 09,10 or 11) Code 90 should be assigned where t< yr of IENCit and IENCit 06-08 or Code 91, should be assigned where t> yr of IEXCit and IEXCit=3,4, or 7 Code 92 should be assigned if t=yr of IEXCit and IEXCi t =1 and RP it =0 Code 93 should be assigned in all years following the year coded 92 Code 94 should be assigned where t> yr of IEXCit and IENCit 7 and t< 59 be assigned if RP 146= 0, RP i47'-- 1, RR W= 1, RP J467 0 the first year following the Code 95 should 13 Code 96 should be assigned if RP 160=0, RPi59=1, IEXCit 8, 1 j60=0hahttp://www.ssc.wisc.edu/wais/WAIS667012.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS667012.txt~ John deVries 1966RKProcessing 1959-1964 Wisconsin Income Tax Returns: Keypunching InstructionslNovember 7, 1966 WAIS paper667-013.(Data Processing Master File- Tax RecordsOOJohn de Vries WAIS 667-013 First Revision November 
23, 1966

PROCESSING 1959-1964 WISCONSIN INCOME TAX RETURNS
KEYPUNCHING INSTRUCTIONS
SUMMARY DATA AND IDENTIFICATION CARDS

TABLE OF CONTENTS

1. Introduction
2. The keypunching of the identification cards
3. The keypunching of the summary cards
   3.1. Multiple ID numbers
   3.2. Multiple Social Security numbers
   3.3. Summary cards
        3.3.1. Miscellaneous amounts cards
        3.3.2. Assessment cards
4. Conventions
5. Card Layouts
   5.1. Identification cards
   5.2. Multiple ID numbers cards
   5.3. Multiple Social Security numbers cards
   5.4. Summary cards, form 1
   5.5. Summary cards, form 2
   5.6. Summary cards, form 3
   5.7. Summary cards, form 4
   5.8. Miscellaneous amounts cards
   5.9. Assessment cards
6. Examples of coding-filing log sheet and codesheet
7. Detailed description of coded demographic data

1. Introduction

WAIS is presently engaged in gathering data from the 1959-1964 Wisconsin Income Tax Returns of a 1% sample of Wisconsin taxpayers. So far, a number of stages prior to the keypunching have been partly or fully completed:

(a) the microfilming of the returns and the preparation of prints;
(b) the filing of the prints and the assignment of identification numbers;
(c) the coding of the demographic information.

Further descriptions of the above three phases can be found in WAIS paper 667-002, together with an overview of the total processing system required to incorporate the new data into WAIS' existing data files.

The keypunching of the summary data for the 1959-1964 Wisconsin Income Tax Returns will consist of two separate jobs:

(a) the punching of the identification cards;
(b) the punching of the summary data, plus a number of additional cards, required only under special circumstances.

Exact requirements for the two jobs are given in sections 2 and 3 of this paper, card layouts in section 5.

The information to be punched will be found in folders, each of which carries one or more gummed labels on the tab.
These labels carry the name of the person(s) whose income tax returns are filed in the folder, as well as their ID number, social security number and latest address. Most labels also provide information specifying in which years during the 1946-1960 period the taxpayer(s) filed a return. The people whose returns are contained in the folder can be either a single person or a married man with his current wife and/or any former wife (or wives) who filed income tax returns during the 1959-1964 period.

Inside each folder you will find:

a) A coding-filing log sheet (see example in section 6 of this paper). On this sheet, all prior actions taken on this folder will be recorded (carrying the initials of the coder or filer and the date when the action was taken). The entries on this sheet should indicate that the "demographic coding" was completed; if this indication is missing, see the coding supervisor. You should also notify the supervisor if you find a log sheet which clearly does not belong in the folder you are working on, or if the folder you are working on does not contain a log sheet at all. After you have punched all the cards for the information contained in the folder, mark this by putting your initials and the date in the appropriate column on the log sheet: the one marked "summary information keypunching" for the identification cards, the one below that for the summary data cards.

b) A code sheet (see example in section 6 of this paper) for every person whose income information is contained in the folder. These code sheets contain, besides coded information from the tax returns, data from other sources which WAIS already had. The code sheet consists of two parts: the top section, which is coded once for every person whose returns were collected or about whom WAIS wants to collect information, and the bottom section, containing specific information for each of the years for which the taxpayer filed a return.
c) Income tax returns for all years in the period 1959-1964 for which the taxpayer(s) whose name(s) you find on the outside of the folder filed a Wisconsin Income Tax return. The returns should be filed in the folder in chronological order for each person separately, preceded by the code sheet for that person. The order of the sheets for one year's return should be: first, the front and back pages (for the "short" forms, the front side); second, the second and third pages (for the "short" forms, the back side). Then follow, if present, the information returns, then schedules, assessment notices and any correspondence between the taxpayer and the Taxation Department. If, in any folder, this order is seriously disturbed, or if you have the feeling that certain parts are missing, see the coding supervisor. Try to put everything in the folder in the correct order before you return the folder!

The labels on the tab of the folder refer either to a single person or to a married man, his current wife and/or his former wife or wives (if any). See the coding supervisor if you find that the folder contains information about a person whose name does not appear on a label on the tab, or does not contain any information about a person whose name appears on the label, or if there are returns in the folder for two persons who definitely are not (and were never) married to each other.

2. The Keypunching of the Identification Cards

For every person whose information is contained in the folder, one set of two identification cards has to be punched. All fields to be punched are indicated on the code sheet or have specific contents depending on the number of the card. If, for any field which you are going to punch, the code sheet does not give any information, punch a "0" (zero) in the first column for that field, then skip to the next field, except if the missing field is the identification number (in that case: see the supervisor).
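The missing-field rule above can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the helper name `punch_field` is hypothetical, a field is modelled as a fixed-width string, and the columns that are skipped after the "0" are assumed to stay blank. The `numeric` switch follows convention 3 of section 4 (alphabetic fields left-justified, numeric fields right-justified).

```python
def punch_field(value, width, numeric=False):
    """Return the card columns for one identification-card field.

    Hypothetical helper, not part of the original instructions.
    - Missing value: "0" in the first column, remaining columns blank
      (blank fill is an assumption about what skipping leaves behind).
    - Present value: left-justified if alphabetic, right-justified if
      numeric, per convention 3 in section 4.
    """
    if value is None or value == "":
        return "0" + " " * (width - 1)
    if len(value) > width:
        raise ValueError("value does not fit in the field")
    return value.rjust(width) if numeric else value.ljust(width)


# Example: a missing 5-column zone field and a 17-column city field.
zone = punch_field(None, 5, numeric=True)
city = punch_field("MADISON", 17)
```

A missing identification number is deliberately not handled here, since the instructions send that case to the supervisor rather than to the keypunch.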
Card 2 is to be done on the alternate program, with columns 1-18 duplicated from the first identification card.

3. The Keypunching of the Summary Cards

This job consists of a number of separate jobs, some of which are only to be done when specific conditions are satisfied; the remainder have to be done for all years for which the taxpayer filed a return.

3.1 Multiple ID Numbers Cards (M-cards)

To determine whether you must punch an M-card for a particular person, check field 1A on the code sheet. If the contents of this field are "0" (zero), no M-cards have to be punched for this person. If the contents of field 1A are not zero, check the back of the code sheet. There should be at least one additional ID number written down for this person. Punch as many M-cards as there are additional ID numbers for this person.

3.2 Multiple Social Security Numbers Cards (S-cards)

Check the code sheet, field 2A. If the contents of this field are "0" (zero), no S-cards have to be punched for this person. If the contents of field 2A are not zero, check the back of the code sheet. There should be at least one additional social security number written down for this person. Punch as many S-cards as there are additional Social Security numbers for this person.

3.3 Summary Cards

For each year for which you find coding on the code sheet, summary cards have to be punched. There are four different types of forms:

(1) 1959, 1960 and 1961 long forms (marked "Form 1"),
(2) 1962 long forms (marked "Form 1"),
(3) 1963, 1964 long forms (marked "Form 1"),
(4) 1962, 1963, 1964 short forms (marked "Form 1W").

Each form is described in the appropriate part of Section 5. If you find that for any given year there is coding on the code sheet but no return in the folder, check field 9 on the code sheet. If the contents of this field show "0" or "2", search the folder once more for the return; if you still don't find it, see the coding supervisor.
If the contents of field 9 do not show "0" or "2", punch only the first card for that year and punch a "Z" in column 77.

Besides the standard cards (5 for the long forms, 3 for the short forms) you may have to punch additional cards, depending on the information given on the tax returns.

3.3.1 Miscellaneous Amounts Cards (A-cards)

A-cards are necessary when amounts are indicated on the return for which no fixed position is given on the form or on the standard card layouts, e.g. non-taxable income, unclassified deductions, etc. A-cards have the same format for all returns.

3.3.2 Assessment Cards (L-cards)

For some returns you will find assessment notices from the Taxation Department. One L-card has to be punched for every year for which an assessment was made. Be careful! Field 4 (cols. 11-12) should contain the year in which the income was received, not the year in which the assessment was made!

4. Conventions

1. In some cases you will find that the taxpayer filed an amended return after he filed the original return. Whenever you find this situation, punch the information for the amended return rather than for the "original" return.

2. In some cases you will find that the taxpayer filed a return for a 12-month period which did not coincide with a calendar year. For all of these cases, see the supervisor.

3. For the identification cards, all alphabetic fields are left-justified, all numeric fields (e.g. street number, zone number) are right-justified. If you notice any case where a field is justified the wrong way but is otherwise correct, punch it the corrected way. If the coding seems to be incomplete or otherwise incorrect, see the coding supervisor.

4. All amount fields are to be right-justified and contain dollars and cents. If an amount was not filled out, punch a "0" in the leftmost column for that field and skip to the next field. For negative amounts, punch a "__" in the leftmost column for that field, followed by the amount (right-justified!)

5.
The demographic data, which are to be punched, for all forms, in Cols. 24-66 of card 1, should all be coded when you punch them, except if field 9 is not coded "1" or "2". You will also find that field 25 for years other than 1959 is not coded. Leave one column blank for these cases. If you notice that fewer than 43 digits were coded, see the supervisor.

6. When you have trouble reading information from the code sheet or the tax returns, see the supervisor. Do not guess!!

7. A note on "continuation indicators". They are a device which enables us, after the keypunching has been completed, to check for missing or superfluous cards within a record. The system is, basically, quite simple:
   - if another card of the same type for the same person follows, this is indicated by a "C" in col. 77;
   - if another card, of a different type, for the same person follows, this is indicated by a specific letter in col. 77: "A" for A-cards, "L" for L-cards, etc.;
   - if no card for that year for that person follows, this is indicated by a "Z" in col. 77.
   You will notice that the only type of card without a continuation indicator is the L-card.

8. You will notice that, in the summary cards, you have to punch several amounts for wages (if the taxpayer had more than one source of wages in a year). If there are only two sources of wages, you will find them as indicated on the tax return. If there are more than two sources, follow this procedure: the largest and second largest wage will be checked with a red pencil; if there is only one more entry, you will find this on the tax return, to be punched as "total other wages"; if there is more than one remaining entry, you will find the sum on a little slip of paper stapled to the tax return. Punch the total amount as you find it there.

5.1 Identification Cards
Card 1

    Name of Field                    Contents or Source            # of Cols.   Cols.
 1. Identification letter            "I" (as in "Iceberg")              1         1
 2. WAIS Identification number       Code sheet, field 1                8        2-9
 3.
    Social Security number           Code sheet, field 2                9       10-18
 4. Last name                        Code sheet, field 3               17       19-35
 5. Title                            Code sheet, field 3A               3       36-38
 6. First name                       Code sheet, field 4               13       39-51
 7. Middle name                      Code sheet, field 4A              12       52-63
 8. Street/Box number                Code sheet, field 5               10       64-73
 9. RR, RFD number                   Code sheet, field 5A               3       74-76
10. Continuation indicator           "C"                                1        77
11. No data                          Blanks                             2       78-79
12. Card number                      "1"                                1        80

Identification Cards
Card 2 (On alternate program)

    Name of Field                    Contents or Source            # of Cols.   Cols.
 1. Identification letter            "I" (as in "Iceberg")              1         1
 2. WAIS Identification number       Code sheet, field 1                8        2-9
 3. Social Security number           Code sheet, field 2                9       10-18
 4. Street name                      Code sheet, field 6               17       19-35
 5. Street class                     Code sheet, field 6A               4       36-39
 6. Post Office (city)               Code sheet, field 7               17       40-56
 7. Zip Code or Zone                 Code sheet, field 7A               5       57-61
 8. County code                      Code sheet, field 7B               2       62-63
 9. Age in 1964                      Code sheet, field 8                2       64-65
10. Date of taxpayer's death         Code sheet, field 8A               6       66-71
    (if any)
11. No data                          Blanks                             5       72-76
12. Continuation indicator           "Z"                                1        77
13. No data                          Blanks                             2       78-79
14. Card number                      "2"                                1        80

5.2 Multiple Identification Numbers (M-cards)

    Name of Field                    Contents or Source                          Cols.
 1. Identification code                                                            1
 2. Card number                                                                    2
 3. WAIS Identification number       Code sheet, field 1                         3-10
 4. No data                          Blanks                                     11-13
 5. Social Security number           Code sheet, field 2                        14-22
 6. Additional ID number             Numbers on back of code sheet              23-30
                                     (1 per card)
 7. No data                          Blanks                                     31-76
 8. Continuation indicator           "Z", unless another M-card for this          77
                                     person follows - then punch a "C" instead
 9. No data                          Blanks                                     78-80

5.3 Multiple Social Security Numbers (S-cards)

    Name of Field                    Contents or Source            # of Cols.   Cols.
 1. Identification code                                                 1         1
 2. Card number                      "1" for first card, "2" for        1         2
                                     second card, etc.
 3. WAIS Identification number       Code sheet, field 1                8        3-10
 4. No data                          Blanks                             3       11-13
 5. Social Security number           Code sheet, field 2                9       14-22
 6. Multiple Social Security         Code sheet, field 2A               1        23
    indicator
 7. Additional Social Security       Social Security number on back     9       24-32
    number                           of code sheet (only one per card)
 8. No data                          Blanks                            44       33-76
 9.
Continuation indicator "Z", unless another S-card is 1 -77 required for this person. - then punch a "C" instead 10. No data Blanks 3 78-80 5.4 Summary Cards, Form 1 Returns 1959, 1960, 1961 Card 1 Name of Field 1. Identification code 2. Card number 3. WAIS Identification number 4. Year of return 5. Multiple 10 indicator 6. Social Security number 7. Multiple Soc. Sec. indicator 8. Demographic data (See Section 7) 9. Largest wage 10. No data 11. Continuation indicator 12. No data log Code sheet, field 1 Last page Code Code Code Code two digits 1) sheet, field 1A sheet, field 2 sheet, field 2A sheet, fields 9-31 of year (Lop Page 1, line 1- largest entry Blanks "C", unless this is the last card for this return then punch a 12f Blank 1 -1 1 -2 8 3-10 2 11-12 1 -13 9 14-22 1 -23 43 24-66 8 67-74 2 75-76 1 -77 3 78-80 0 Summary Cards, Form 1 Returns 1959, 1960, 1961 Card 2 (on alternate program) 1. Identification code 2. Card number 3. WAIS Identification number 4. Year of return 5. Second largest wage 6. Total other wages 7. All other income 8. Total income 9. Business expenses 10. Adjusted gross income 11. Standard deduction 12. Net taxable income 13. Continuation indicator 14. No date 11111 11211 Duplicate from card #1 Page 1, line 1 - second largest entry Page 1, line 1 sum of remaining entries or slip from adding chine Page I, Iine 2 Page 1, :ine 3 Page ., lire 4 Page 1, line 5 Page l., line 6 Page 1, line 7 "C", unless this is the last card '"Z" for this return - then punch Blanks 1 -1 1 -2 8 3-10 2 11-12 8 13-20 8 21-28 8 29-36 8 37-44 8 45-52 8 53-60 8 61-68 8 69-76 1 -77 3 78-80 Summary Cards, Form 1 Returns 1959, 1960, 1961 Card 3 (On alternate progrmn) # of Name of Field Contents or Source Cols. 1. Identification code "1" 1 -1 2. Card number 1130 1 d2 3. WAIS Identification number 8 3-10 4. Year of return Duplicate 2 11-12 5. Exemptions Page 1, line B 8 13-20 6. Total tax Page 1, tine E 8 21-28 7. Tax to other states 1959, 1960; blanks 8 29-36 O 8. 
First installment 1961: page 1, line ii 8 37-44 Page 1, line 9. Non-business interest paid Page 2, ssme of entries for 8 45-52 line 2 10, Medical and Dental expenses. Page 2, line 3 8 53-60 11. Wisconsin income tax paid Page 2, line 4 8 61-68 12. Union dues Page 2, line 5 8 69-76 13. Continuation indicator "C", unless this is the last card 1 -77 for this return.-then use "Z" 14. No data Blanks 78-80 Summary cards, Form 1 Returns 1959 1960 1961 Card 4 (On alternate program) Name of Field Contents or Source 1. Identification code "1" 2. Card number "4" 3. WAIS Identification number Duplicate 4. Year of return 5. Alimony paid Page 2, line 6 6. "Forest crop land" expenditures Page 2, line 7 7. Total deductions Page 2, line 8 8. Net Income before Federal I income tax 'Page 2, line 9 9. Federal income, social security tax deductions Page 2, 1ine 10 10. Net income before donations Page 2; Line 11 11. Donations deductible Page 2, line 12 12. Interest income Page 3, line 1 13. Continuation Indicator "C", unless this is the last card for this return - then punch "Z" 14. No data Blanks # of Cols Cols 1 -1 1 -2 8 3-10 2 11-12 8 13-20 8 21-28 8 29-36 37-" 8 45-52 1 8 53-60 8 61-68 8 69-76 1 -77 3 78-80 Summary Cards Form 1Returns 1959, 1960, 1961 Card 5 (On alternate program) Name of Field 1. Identification code 2. Card number 3. WAIS Identification number 4. Year of return 5. Dividend income 6. Rent income 7. Gain and loss, property 8. Profit or loss, business 9. Partnership income 10. Estate income 11. Other income 12. Interest paid 13. Continuation indicator Page 3, line 2 Page 3, line 3 Page 3, line 4 Page 3, line 5 Page 3, line 6 (sum entries if more than one) Page 3, line 7 (sum entries if more than one) Page 3, line 8 (sum entries if more than one) M Page 3, below Schedule F of -1 I 8 2 8 8 8 -2 3-10 11-12 13-20 21-28 29-36 37-44 45-52 8 53-60 8 61-68 "Z", unless an "A". card is needed -- then punch "A" instead or unless an "L" card is needed then punch nL" instead. 
14, No data Blanks 3 78-80 8 69-76 1 -77 NOTE: if an "A" card as well as an "L" card are required, punch col. 77 as "K" 5.5 Summary Cards, Form 2 Returns 1962 ("Long" forms) # of of Field Contents or Source Cols. Cols. 1. Identification code "2" 1 -1 2. Card number "1" 1 -2 3. WAIS identification number Code sheet, field 1 8 3-10 4. Year of return 'Last two digits of year 2 11-12 (top, page 1) 5. Multiple ID indicator Code sheet, field 1A I -13 6. Social Security number Code sheet, field 2 9 14-22 7. Multiple Social Security indicator Code sheet, field 2A 1 -23 8. Demographic data Code sheet, fields 9-31 43 24-66 (See Section 7) 9. Total income Page 1, line 1 8 67-74 10. No data Blanks 2 75-76 11. Continuation indicator "C", unless this is the last card for this return- I -77 then punch "Z" instead 12. No data Blanks 3 78-80 Summary Cards, form 2 Returns 1962 ("Long" forms) Card 2 (On alternate program) Name of field 1. Identification code 2. Card number 3. WAIS Identification number 4. Year of return 5. Business expenses 6. Adjusted gross income 7. Standard deduction 8. Total deductions 9. Net income before contributions 10. Contributions 11. Net taxable income 12. Personal exemptions 13. Continuation Indicator 14. No data # of Contents or Source Cols. "2 "2 it 1 1 Co" a -1 -2 Duplicate Page 1, line 2 Page 1, line 3 Page 1, line 4 Pagel, line 5a Page 1, line 5b 8 3-1 ~ 2 11-`2 8 13,20 8 2.-28 8 9-36 8 37-64 8 45-52 Page 1, line 5c Page 1, line 6 Page 1, line 8 8 1 53-60 "C", unless this is the last card for this return - then punch "Z" instead. 8 I 61-6869-76-77 Blanks 3 78-80 Summary Cards form 2 Returns 1962 ("Long" forms) Card 3 (On alternate program) 1. Identification code 2. Card number 3. WAIS Identification number 4. Year of return 5. Net normal tax 6. Tax paid to other states 7. Total payments and credits 8. Largest wage 9. Second largest wage 10. Total other wages 11. Dividend income 12. Interest income 13. Continuation indicator 14. 
No data "2" "VI Duplicate Page 1, line 9 Page 1, line 10 Page 1, line 14 Page 2, line 1 - largest entry Page 2, line 1 - second largest entry Page 2, line I - sum of remaining entries or slip from adding machine. Page 2, line 4 (total on bottom line) Page 2, line 5 (total on bottom line) "C", unless this is the last card for this return - then punch "Z" instead--Blanks Name of field Contents or Source # of Cols. Cols. 1 -1 1 -2 8 3-10 2 11-12 1 8 13-20 8 21-28 8 29-36 8 37-44 8 45-52 8 53-60 8 61-68 9 69-76 1 -77 3 78-80 Summary Cards, form 2 Returns 1962 ("Long" forms) Card 4 (On alternate program) of field Contents or Source 1. Identification code 2. Card number 3. WAIS Identification number 4. Year of return 5. Rent income 6. Gains and losses, property 7. Profit or loss, business 8. Partnership income 9. Estate or trust income 10. Other income 11. Non-business interest paid 12, Net medical expense 13. Continuation indicator 14. No data 1%" Duplicate Page 2, line 6 (total on bottom line) I 2 8 Page 2, line Page 2, line 8 Page 2. line 9 (sum entries if more than 1) 8 8 8 21-28 29-36 37-44 Page 2, line 10 (sum entries if more than 1) 45-52 Page 2, line 1. 8 53-60 Page 3, line 1 (total on bottom line) 8 61-68 Page 3, line 4 8 69-76 "'C", unless this is the last card for this return - then punch "Z" instead 1 -77 Blanks 3 78-80 4 Summary Cards form 2 Returns 1962 ("Long" forms). , 1. Identification code "2" 1 -1 2. Card number 01501 1 -2 3. WAIS Identification number 8 3-10 4. Year of return Duplicate 11-12 5. Casualty losses Page 3, line 5 8 13-20 6. Wisconsin Income Tax Paid Page 3, line 6 8 21-28 7. Union dues Page 3, line 7 8 29-36 8. Alimony paid Page 3, line 8 8 37-44 9. "Forest crop land" Page 3, line 9 8 45-52 10. expenditures Blanks 24 53-76 No data 11. Continuation indicator "2", unless an "A" card is 1 -77 12, No data needed - then punch an "A" 3 78-80 instead Blanks (On alternate program) # of Name of field Contents or Source Cols. 
Col NOTE: If an "A" card as well as an "L" card are required, punch col. 77 as "1E". 5.6 Summary Cards, Form 3 Returns 1963, 1964 ("Long" forms) 1. identification code 2. card number 3. WAIS Identification number 4. Year of return 5. Multiple ID indicator 6. Social Security number 7. Multiple Social Security indicator 8. Demographic data (See Section 7) 9. Adjusted gross income 10. No data 11. Continuation indicator "1" Codesheet, field 1 Last two digits of year (top, page 1) Code sheet, field id Code sheet, field 2 Code sheet, field a Code sheet, fields 9-31 Page 1, line 1 Blanks ", unless this is the only card for this return - then punch "Z" instead 1 1 2 75-76 1 -77 12. No data Blanks 3 78-80 Summary Cards, form 3 Returns 1963= 1964 ("Long" forms) Card 2 (On alternate program) # of of field Contents or Source 1 1. Identification code 2. Card number 3. WAIS Identification number. 4. Year of return 5. Standard deduction 6. Total itemized deductions 7. Net income before contributions 8. Contributions 9. Net taxable income 10. Personal exemptions 11. Net normal tax 12. Tax paid to other states 13. Continuation indicator 14. No data -1 "2" 1 8 5. Page 1, Page 1, line 3a Page 1, line 3b Page 1, line 3c Page 1, line 4 Page 1, line 6 Page 1, line 7 Page 1, line 8 line 2 2 8 8 8 a 8 8 8 "C", unless this is the last card for this return- then punch "Z" instead 8 -2 3-10 11-12 13-20 21-28 29-36 37-44 45-52 53-60 61-68 69-76 -77 Blanks 3 78-80 Summary Cards form 3 Returns 1963, 1964 ("Long" forms) 1. Identification code 2. Card number 3. WAIS Identification number 4. Year of return 5. Total payments and credits 6. Largest wage "3" 1 5. 7. Second largest wage 8. Total other wages 0 9. Unemployment Compensation 10. Total dividend income 11. Total interest income 12. Total rent income 13. Continuation indicator 14. No data 0 1 8 2 Page 1, line 12 8 Page 2, line I (largest entry. 
8 Page 2, line 1 (second largest 1 8 entry) Page 2, line I (sum of remaining entries) or slip from adding machine 1963 leave blank (not specified on form) 1964: Page 2, line I (bottom entry) Page 2, line S (total on boct(rm line) 8 - 1 -2 3-10 11-12 13-20 21-28 29-36 I 8 37-44 45-52 8 53-60 8 61-68 "C", unless this is the last card for this return - then punch "Z" instead 1 69-76 3 -77 78-80 Summary Cards, form 3 Returns 1963, 1964 ("Long" forms) Card 4 (On alternate program), l Name of Field Contents or Source 1. Identification code 2. Card number 3. WAIS Identification number 4. Year of return 5. Gains and losses, property 6. Profit car loss, business 7. Partnership income 8. Estate or trust income 9. Refund of Wisconsin Income Taxes 10. Other income 11. Total income 12. Business expenses 13. Continuation Indicator 14. No data "3" "4" Duplicate Page 2, line 7 Page 2, line 8 Page 2, line 9 (sum entries if more than one) Page 2, line 10 (sum entries if more than one) Page 2, line 11 Page 2, line 12 Page 2 line 13 Page 2, line 14 "C", unless this is the last card for this return - then punch "Z" instead Blanks 5 Summary Cards, form 3 Returns 1963, 1964 ("Long" forms) Card 5 (on alternate program) of Name of Field Contents or Source Cols. .ols A 1. Identification f01300 1 -1 code 2, Card number 195 11 1 3. WAIS Identification number 8 3-10 4. Year of return Duplicate 2 11-12 5. Total non-business interest Page 3, line 1 (total on bottom 13-20 6. Net medical expense line) 8 21-28 Page 3, line 4 7. Casualty losses Page 3, line 5 8 29-36 8. Wisconsin Income Tax withheld Page 3, line 6d 8 37-44 9. Union dues Page 3, line 7 8 45-52 10. Alimony paid Page 3, line 8 S 53-60 11. "Forest crop land" Page 3, line 9 8 61-68 12. No data expenditures Blanks 8 69-76 13. Continuation indicator "Z", unless an "A" card is 1 -77 14. No data required - then punch an 3 78-80 "A" instead, or unless an "L" card is required-then punch "L" instead. 
Blanks NOTE: if an "A" card as well as an "L" card are required, punch col. 77 as "K". 5.7 Summary Cards Form 4 Returns 1963, 1964 ("Long" Forms) 1. Identification code 2. Card number 3. WAIS Identification member 4. Year of return 5. Multiple ID indicator 6. Social Security number 7. Multiple Social Security 0 Indicator 8. Demographic data 9. (See Section 7) Largest wage 10. No data 11. Continuation indicator 12. No data 114" "1" Code sheet, field 1 Last two digits of year (top, front) Code sheet, field lA Code sheet, field 2 -1 -2 3-10 11-12 1 -13 9 14-22 -23 1962: back, line 6 (largest entry) 1963-1964: front, line 4 (first entry) 24-66 67-74 1'C", unless this is the only card for this return - then punch "2"1 instead 2 75-76 1 -77 3 78-80 Summary Cards, form 4 Returns 1962, 1963, 1964 ("Short" forms) Card 2 (on alternate program) Name of Field 1. Identification code 2. Card number 3. WAIS Identification number 4. Year of return 5. Second largest wage 6. Total other wages 7. Interests, dividends and other wages 8. Total income 9. Standard deduction 10. Net taxable income 11. Total Wisconsin Income Tax Withheld 12. Personal exemptions 13. Continuation indicator 14. No data "4"" "21' 1962: back, line 6 (second largest entry) 1963-1964: front, line 4 (second largest entry) 1962: back, line 6 (sum of remaining entries) 1963-1964: front, line 4 (sum of remaining entries) or slip from adding machine 1962= back, line 7 1963-1964: front, line 5 1962: back, line 8 1963-1964: front, line 6 1962: back, line 9 1963-1964: front, line 7 1962: back, line 10 1963-1964: front, line 8 1962: back, line 12 1963-1964: front, line 9 1962: back, line 14 1963-1964: back, line 12 "C", unless this is the last for this return - then punch instead 1 1 1 8 2 8 8 8 8 8 8 8 card 1 21-28 O 29-36 37-44 45-52 53-60 61-68 69-76 "VO -77 78-80 Summary Cards, form 4, Returns 1962, 1963, 1964 ("Short",forms) (On alternate program) Name of field . 1. Identification code 2. Card number 3. 
WAIS Identification number 4. Year of return 5. Net normal tax 6. Tax to other states 7. No data 8. Continuation indicator 9. No data Contents or Source Duplicate 1962: back, line 15 1963-1964: back, line 13 1962: back, line 16 1963-1964: back, line 14 Blanks "Z", unless an "A" card is is needed - then punch an "A" instead, or unless an "L" card is needed - then punch "L". Blanks V M: If an "A" card as well as an "L" card are needed punch co. 77 as "K" 5.8 Miscellaneous Amounts All Forms Returns 1962, 1963, 1964 ( On Alternate Program) Name of Field 1. Identification code 2. Card number 3. WAIS Identification number 4. Year of return S. Social Security received 6. "Other" deductions (not 7. elsewhere classified) Non-taxable income O 8. Non-taxable income class 9, 10. Same as 7, 8 if more than 11, o" source of non-taxable income 12. Do 13, 14. Do 15, 16. Do. 17, 18. Do. 19. Continuation indicator 20. No data "1" for first card, "2" for second card, etc. 1 8 3-10 2 11-12 Usually written on page 1 of return Usually written beside the "itemized deductions", page 2 or 3. Code indicating source of income above military exemptions student income "R" - retirement income "N" - source not ascertained "Z", unless another "A" card is needed; then use "C" 8 13-20 8 21-28 7 29-35 1 -36 8 37-44 8 45-52 8 53-60 8 61-68 8 69-76 1 -77 3 78-80 J 5.9 Assessments (all years) name of field # of Cols. Contents or Source ols 1. Identification code fOLOO 1 -1 2. Card number "1" for first card, '12" for 1 second card, etc. 3. WAIS Identification number Code sheet, field 1 8 3-'0 4. Year of return (i.e, year when Last two digits of income 2 K-12 the income was received) year 5. Income previously taxed 8 13-20 6. Adjusted taxable income 8 21-28 7. Additional normal tax 8 29.36 8. Interest computed (on 8 '37-44 additional tax only) 9. Additional surtax 8 45-52 10. Interest on surtax 8 53-60 11. Total additional tax 8 6168 12. No data 12 69-80 6. 
Examples of coding-filing log sheet and codesheet.

1959-1965 Income Tax Returns
Coding-filing Log

Name of Head
Social Security Number of Head
Identification Number of Head

Function                                        Date    Initials    Remarks
Returns filed
ID Number Assigned
Demographic Coding
Summary Information Keypunching
Summary Information Edit Correction (1)
Summary Information Edit Correction (2)
Summary Information Edit Correction (3)
Details and Assets Coding
Details and Assets Keypunching
Details and Assets Edit Correction (1)
Details and Assets Edit Correction (2)
Details and Assets Edit Correction (3)

7. Detailed description of coded demographic data

Field #   Field Name                             Type               # of Cols.  Cols.
9         (illegible)                            Numeric            1           24
9A        Consistency ind.                       Numeric            1           25
10        Residence location                     Alpha              4           26-29
11        County code                            Numeric            2           30-31
12        County prior year                      Numeric            2           32-33
13        Address change                         Numeric            1           34
13A       Non-resident indicator                 Numeric            1           35
14        Occupation code                        Numeric            2           36-37
15        Industry code                          Numeric            4           38-41
16        Occupation change                      Numeric            1           42
17        Return previous year                   Numeric            1           43
17A       Labor force previous year              Numeric            1           44
18        Marital status                         Numeric            1           45
18A       Marital status consistency             Numeric            1           46
19        Spouse separate income                 Numeric            1           47
19A       Separate income reliability            Numeric            1           48
20        Recent marriage                        Numeric            1           49
20A       Dissolution of marriage                Numeric            1           50
21        Number of dependents                   Numeric            2           51-52
22        Dependents' age code                   Numeric            4           53-56
23        Dependents' address code               Numeric            1           57
23A       Students in college                    Numeric            1           58
24        Total sources of wages                 Numeric            1           59
25        Div. paid in stock                     Numeric or blank   1           60
26        "Head of Family" exemption             Numeric            1           61
27        Auto expense indicator                 Numeric            1           62
28        Supplementary schedules                Numeric            1           63
29        Enclosures                             Numeric            1           64
30        Occupation and industry reliability    Numeric            1           65
31        Next year filed                        Numeric            1           66

http://www.ssc.wisc.edu/wais/WAIS667013.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667013.txt

John de Vries
WAIS 667-015
October 19, 1966

OUTLINE FOR FURTHER PROCESSING OF THE 1959 - 1964 WISCONSIN INCOME TAX DATA

Table of Contents

1. Introduction
2. Further processing of new data
2.1. Separation of files
2.2. Elimination of error cards
2.3. Processing of individual files
2.3.1. The identification file
2.3.1.1. Creation of sorted file
2.3.1.2. Check on missing or superfluous cards
2.3.1.3. Card edits
2.3.1.4. Creation of tape records
2.3.1.5. Intra-file consistency checks
2.3.1.5.1. Check on multiple Soc. Sec. numbers
2.3.1.5.2. Sort on ID number
2.3.1.5.3. Husband - wife checks
2.3.1.6. Inter-file consistency checks
2.3.1.6.1. Check on presence or absence of records
2.3.1.6.2. Inter-record consistency checks
2.3.2. The M - file
2.3.2.1. Creation of sorted file
2.3.2.2. Check for missing or superfluous cards
2.3.2.3. Card edit
2.3.2.4. Creation of tape record
2.3.2.5. Check for duplicate social security numbers
2.3.2.6. Check for duplicate ID numbers
2.3.3. The S - file
2.3.3.1. Creation of sorted file
2.3.3.2. Check on missing or superfluous cards
2.3.3.3. Card edit
2.3.3.4. Creation of tape record
2.3.3.5. Check for duplicate social security numbers
2.3.4. The A - file
2.3.5. The L - file
2.3.6. The summary files
2.3.6.1. Creation of sorted file
2.3.6.2. Check on missing or superfluous cards
2.3.6.3. Card edits
2.3.6.4. Creation of tape records
2.3.6.5. Merge of separate summary tape files
2.3.6.6. Intra-record checks
2.3.6.7. Inter-record, intra-file checks
2.3.6.8. Inter-file checks
2.4. Inter-file processing (new files)
2.4.1. [TI7] against all other files (check)
2.4.2. [TP2] against all other files (check)
2.4.3. Integration of [TA1] and [TL1] with [TP2]
3. Preparation of old files
3.1. General procedures
3.1.1. Elimination of multiple ID-numbers
3.1.2. Various inter-record consistency checks
3.1.3. The "gap-plugging" process
3.2. Preparation of the FFID-file
3.3. The preparation of the master file
4. Integration of old and new files
4.1. Updating of [FFID]
4.2. Merge of [MF] with [TP3]
4.3. Updating of the history file

1. Introduction

So far, fairly definite outlines have been set up for the first stages in the processing of identification and summary data: the filing of returns, assigning of ID numbers and coding of the demographic data are described in WAIS 667 - 002 (revised). In the meantime, instructions for the keypunching of the identification cards and the summary data have been written and, although slight revisions are always possible and even likely, the main structure of that part of the processing effort is clearly defined (see WAIS 667 - 013). The next stages will deal with the checking of the data, the creation of tape files and the integration of the old and the new data. This paper is an initial attempt to analyse and plan the various stages required for the eventual integration of the data.

The basis underlying the proposed procedure consists of the following two assumptions:
1. It is preferable to catch any error at the earliest possible moment.
2.
Since the sample is already ordered by name-group, it is desirable for most of the following jobs to be done by name-group (rather than waiting for all name-groups to be completed).

The paper consists of several different sections: the processing of the new data (leading up to creation of new, but still segregated files); the processing of the old data (elimination of inconsistencies, preparation of a segregated old file which is compatible with the new data); the integration of the old and the new files.

A Note on Terminology: In the following sections, certain abbreviations have been employed which require some explanation. Any group of numbers and/or letters between square brackets [ ] indicates a file. The symbol ":[xyz]" (where xyz can be any combination of letters and/or numbers) indicates that the process described leads to the creation of file XYZ. Old files are [MF], the master file, and [FFID], the old FFID file. Newly created files are usually symbolized by three characters: the first character is either C (for a card file) or T (for a tape file); the middle character can be M (for multiple ID numbers), S (for multiple soc. sec. numbers), A (for miscellaneous amounts), L (for assessments), N (collective for the four different forms of summary data), 1, 2, 3 or 4 (for the separate types of summary data), E (for error files), I (for the identification file) or P (for preliminary combined summary data); the last character is in most cases a numeric, indicating the file's position in a sequence of intermediate files.

[CER] is an error file of cards which, at any given stage, are marked as incorrect, but where the error cannot be determined. Per name-group, this file will probably be fairly small. It is suggested that for each operation the unexplained error cards be put in [CER] and that during each operation [CER] be checked for possible missing cards.
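The naming convention in the terminology note can be made concrete with a small decoder. The following is only a sketch under stated assumptions: the function `parse_file_symbol`, the `FileSymbol` tuple and the lookup tables are hypothetical names introduced here for illustration, and the treatment of a non-numeric last character (as in [CER]) as "no sequence number" is an interpretation of "in most cases a numeric", not something the paper specifies.

```python
import re
from typing import NamedTuple, Optional

# Codes copied from the terminology note; the table and function names
# are illustrative only and do not appear in the paper.
MEDIA = {"C": "card file", "T": "tape file"}
CONTENTS = {
    "M": "multiple ID numbers",
    "S": "multiple soc. sec. numbers",
    "A": "miscellaneous amounts",
    "L": "assessments",
    "N": "summary data (all four forms)",
    "1": "summary data, type 1",
    "2": "summary data, type 2",
    "3": "summary data, type 3",
    "4": "summary data, type 4",
    "E": "error file",
    "I": "identification file",
    "P": "preliminary combined summary data",
}
OLD_FILES = {"MF": "master file", "FFID": "old FFID file"}


class FileSymbol(NamedTuple):
    medium: str
    contents: str
    sequence: Optional[int]  # None when the last character is not numeric


_SYMBOL = re.compile(r"\[([CT])([MSALN1234EIP])([0-9A-Z])\]")


def parse_file_symbol(symbol: str) -> FileSymbol:
    """Decode a symbol such as '[CM1]' or ':[TI4]' (leading colon optional)."""
    text = symbol.strip().lstrip(":")
    # The two old files do not follow the three-character convention.
    if text.startswith("[") and text.endswith("]") and text[1:-1] in OLD_FILES:
        return FileSymbol("old file", OLD_FILES[text[1:-1]], None)
    match = _SYMBOL.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognised file symbol: {symbol!r}")
    medium, contents, last = match.groups()
    sequence = int(last) if last.isdigit() else None
    return FileSymbol(MEDIA[medium], CONTENTS[contents], sequence)
```

For example, `parse_file_symbol(":[CM1]")` decodes to a card file of multiple ID numbers, first in its sequence, while `parse_file_symbol("[CER]")` decodes to a card error file with no sequence number.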
This paper is set up in such a way that the total processing effort is broken down into relatively small, but distinct analytical procedures. This approach facilitates a systematic analysis of all possibilities; it is not necessary to follow this exact breakdown during the actual processing itself. In other words, several of the separate phases may very well be combined into one operation; several files which are mentioned in the analysis may never physically occur. Another point which has to be kept in mind is that almost every checking operation should be performed at least twice: once to locate errors and inconsistencies and once (after the errors have been corrected) to verify that all old errors have been corrected and that no new errors have been introduced.

2. Further Processing of New Data

2.1. Separation of Files

As is evident from the keypunching instructions, the identification cards will be keypunched as a separate job; these cards will therefore be separated from the other cards. The other files will be split off by sorting on column 1. This will result in the creation of the following:
a) the M - file (multiple ID numbers) :[CM1]
b) the S - file (multiple soc. sec. numbers) :[CS1]
c) the A - file (miscellaneous amounts) :[CA1]
d) the L - file (assessments) :[CL1]
e) the summary files, types 1-4 (summary data) :[C11], :[C21], :[C31], :[C41]
f) (already separate) the I - file (identification cards) :[CI1]
g) various remaining cards with an incorrect identification code in column 1 (error file) :[CE1]

2.2. Elimination of error cards, [CE1]

[CE1] will contain, theoretically, mispunched cards belonging to other files and "stray" cards not belonging to any of our card-files. If we assume that the I-cards have indeed been kept separately, the best approach to separate these cards (based on other characteristics) will be:
a.) Sort on col. 77. All cards with "blank" in 77 are probably L-cards. If such a card is verified as an L-card, correct and add to [CL1].
If it is not an L-card, place it in the residual file [CER].
b.) Sort the remaining cards on column 31. All those with "blank" in col. 31 are supposedly mispunched M-cards. Every card which can be verified as such can be corrected and added to [CM1]; remaining cards are to be added to [CER].
c.) Sort the remaining cards on column 33. All those with "blank" in col. 33 are supposedly mispunched S-cards. Every card which can be verified as such can be corrected and added to [CS1]; remaining cards are to be added to [CER].
d.) Sort the remaining cards on column 76. All those with either blanks or alphabetics are supposedly mispunched A-cards. Those that can be identified as such can be corrected and added to [CA1]; those that cannot be identified can be added to [CER].
e.) The remainder is, hopefully, a mixture of summary cards, types 1-4. Sort these cards on column 12. Cards with 9, 0 or 1 in column 12 are supposedly mispunched type 1 cards; if so, correct and add to [C11]; if not, add to [CER]. Cards with 2, 3 or 4 in column 12 can be form types 2, 3 or 4. My guess is that this can be determined most easily by checking the source data. If the type can be determined, correct the card and add to [C21], [C31] or [C41]; otherwise, add to [CER].
f.) If, at any of the above stages, [CE1] is exhausted, go immediately to stage g).
g.) If [CER] does not contain any cards, proceed to the next stage, the processing of the individual files (sect. 2.3). Otherwise, try other means to determine for each card in [CER] where it belongs; if additional determinations can be made, add the cards to their respective files (after correction); store the remainder (if any) of [CER].

2.3. Processing of individual files

2.3.1. The identification file, [CI1].

2.3.1.1. Sort on card number (column 80) and ID-number (columns 2 - 9), resulting in a sorted identification file :[CI2] and an error file :[CI3].
Cards in [CI3] will be those with card number other than 1 or 2, and those with other than numeric characters in columns 2 - 9 (except for NN in cols. 8 - 9 for non-filing wives). Some of these can easily be corrected and added (in their proper place) to [CI2]; the remainder is added to [CER].

2.3.1.2. Check on missing or superfluous cards. Each person on [CI2] should have two cards: one with "1" in column 80 and one with "2" in column 80. A check will produce, besides a file of correct cards :[CI4], a file of error cards :[CI5]. This file will contain incomplete records and superfluous cards. Matching of incomplete records with each other and with superfluous cards will locate some additional (mispunched) records; other incomplete records will require punching of the missing parts; cards in [CI5] which cannot be identified will be added to [CER].

2.3.1.3. Card edits. Card edits on cards 1 and 2 can now be done; incorrect cards can be corrected and added to [CI4], replacing the incorrect cards.

2.3.1.4. Creation of tape records. At this stage, we can create a tape record for the cards in [CI4] and sort this file by ID number :[TI1]. Several checks on this file are required to ensure internal file consistency.

2.3.1.5. Intra-file consistency checks

2.3.1.5.1. Check on multiple soc. sec. numbers. [TI1] is to be sorted by soc. sec. number (eliminating all those persons without a soc. sec. number) :[TI2]. This file can now be checked to ensure that no two (or more) people have the same soc. sec. number. Erroneous records will be printed out; corrections will be made directly to [TI2]. This process should be repeated until no duplication of soc. sec. numbers occurs. Then,

2.3.1.5.2. Sort [TI2] again on ID number :[TI3]. On this file, run:

2.3.1.5.3. Husband - Wife checks.
Wherever identification records are present for a husband and his wife (or wives), several fields must be identical for the spouses (except in cases where they are legally separated or divorced): the last name field and all address fields. The best procedure is to print out all records where differences occur (including records for divorced or separated couples) and correct only the records which are incorrect. After the necessary corrections have been made, the corrected file can be resorted :[TI4]. Several checks can now be run between this file and the old identification file, [FFID].

2.3.1.6. Inter-file consistency checks.

2.3.1.6.1. Check on presence or absence of records. This check will result in three files:

1. Those with records on [TI4] as well as [FFID] :[TI5]. These records are "correct" as far as this check is concerned.

2. Those with records on [TI4], not on [FFID]. Records in this group can belong to one of the following categories:
(i) Wives of husbands whose records are present in [TI4] as well as in [FFID];
(ii) Persons with newly assigned ID-numbers (either because they never filed in the 1946 - 1960 period or because they were given a "dependent's" ID number); all of these cases should have household unit numbers > 4000;
(iii) All persons who cannot be classified in either of the above groups. These cases require human inspection, followed (usually) by correction. Corrected records can be added to [TI5].

3. Those with records on [FFID], not on [TI4]. Records in this group can be classified into the following categories:
(i) People with a "dependent's ID number", i.e. the last digit not equal to "0". This class can be subdivided into: a.) those who were given a "new" ID number as well; b.) those who did not file in the 1959 - 1964 period. We can determine who belongs in which subgroup by a check against the M-file: persons with a record on the M-file belong in subgroup a.) and are correct; persons not on the M-file belong in subgroup b.)
and have to be treated in the same way as those in class (ii) below.
(ii) People who did not file at all during the 1959 - 1964 period; all these cases should be found in the remaining preprinted labels. Persons whose labels are found could be added to a list of persons for whom we may decide to go back to the tax archives; they are correct as far as the checking process is concerned. Those records for which no preprinted label can be located require further investigation and, possibly, correction (to be added to [TI5]).

[TI5] with corrected records will be sorted, again on ID number :[TI6]. On this file, we can run some additional checks:

2.3.1.6.2. Inter-record consistency checks ([TI6] - [FFID]). For records with the same ID number, several fields must be identical: Social Security number and name fields (for females, the last name may be different if a legal divorce took place). Also, if the street number and name are equal, all other address fields must be identical too (except the ZIP code). Corrections are to be added to [TI6]; the result is to be sorted again by ID number :[TI7].

After all these checks, the file will be in satisfactory condition and can be used for further inter-file consistency checks, as well as integration with [FFID].

2.3.2. The M - file [CM1]. The processing of this file follows largely the same steps as that of the identification file:

2.3.2.1. Creation of sorted file. Sort on card number (column 2) and ID number (columns 3-10), resulting in a sorted file :[CM2] and an error file :[CM3]. Cards in [CM3] will be those with card number other than 1 or 2 (we will assume that no single person has more than three ID numbers) and those with incorrect ID numbers (for this file, NN in columns 9 - 10 cannot be accepted). Cards in [CM3] will be corrected where possible and added to [CM2]; the remainder of [CM3] will be added to [CER].

2.3.2.2. Check for missing or superfluous cards.
Although no fixed rules can be given (as was the case for [C12]), several rules can be set up for this file: no card 2 should be present without a card 1 for the same person; if a card 1 has a "C" in column 77, there should be a card 2 for the same person; no person should have more than one card 1 or card 2. Error cards resulting from the check: [CM4] can be corrected and added to [CM2]; cards which cannot be corrected will be added to [CER].

2.3.2.3. Card edit.

Since cards 1 and 2 have basically the same format, there is no need for a split in the file for separate card edits. The edit will result in a file of correct cards: [CM5] and an error file: [CM6]. Error cards can be corrected and added to [CM5]; cards which cannot be corrected will be added to [CER].

2.3.2.4. Creation of tape record.

At this stage a tape file: [TM1] can be produced from [CM5]. This file can be submitted to further checks:

2.3.2.5. Check for duplicate Soc. Sec. numbers.

Sort [TM1] by social security number: [TM2]; check for more than one person with the same social security number (error file: [TM3]). Error corrections are to be added to [TM2].

2.3.2.6. Check for duplicate ID numbers.

[TM2] is to be sorted by ID number: [TM4]. An additional file is to be produced, sorted by "secondary" ID number: [TM5]. Records in error are:
a.) Those where more than one person has the same "secondary" ID number ("repeats" found in [TM5]);
b.) "Matches" on ID number between [TM4] and [TM5].
Corrected records are to be added to [TM4]; the total is to be sorted again on ID number: [TM6]. This file is now ready for inter-file consistency checks.

2.3.3. The S-file [CS1].

The first four stages in the processing of this file are parallel to those for the M-file; for explanation, therefore, see section 2.3.2.
2.3.3.1. Sort on card number (col. 2), ID number (cols. 3-10);
2.3.3.2. Check on missing or superfluous cards;
2.3.3.3. Card edit;
2.3.3.4. Creation of tape record: [TS1].
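The duplicate-ID check of section 2.3.2.6 can be sketched as below; the same pattern would apply to any pair of files keyed on a "primary" and a "secondary" number. This is only an illustrative sketch in Python: the function name and the representation of each person as a (primary ID, secondary ID) pair are assumptions, not part of the procedure as written.

```python
from collections import Counter

def duplicate_id_errors(records):
    """Find the two error types of the duplicate-ID check.

    records: list of (primary_id, secondary_id) pairs, one per person;
             secondary_id may be None. (Hypothetical layout.)
    Returns (repeats, matches):
      repeats - secondary IDs carried by more than one person
      matches - secondary IDs that also occur as someone's primary ID
    """
    # Count how often each secondary ID occurs (the file sorted by secondary ID).
    secondary_counts = Counter(sec for _, sec in records if sec is not None)
    repeats = sorted(s for s, n in secondary_counts.items() if n > 1)

    # Compare against the set of primary IDs (the file sorted by ID number).
    primary_ids = {prim for prim, _ in records}
    matches = sorted(s for s in secondary_counts if s in primary_ids)
    return repeats, matches
```

Records flagged under either heading would then be corrected by hand before the file is sorted again.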
The processing of [TS1] is different, though, from that of [TM1].

2.3.3.5. Check for duplicate social security number.

Sort [TS1] by social security number: [TS2]; also create a file sorted by "secondary" social security number: [TS3]. Matching of the two files will produce an error file: [TS4] containing three types of errors:
a.) "duplicates" found in [TS2];
b.) "duplicates" found in [TS3];
c.) "matches" between [TS2] and [TS3].
Corrections from [TS4] are to be added to [TS2]; sort [TS2] again by ID number: [TS5]. This file will be ready for inter-file checking.

2.3.4. The A-file [CA1].

The processing of this file consists of only four stages, parallel to the first four stages for [CM1]; see section 2.3.2. for further description of:
1. Sort on card number (column 2), year (columns 11-12), ID number;
2. Check on missing or superfluous cards;
3. Card edit;
4. Creation of tape record, sorted by ID number and year of return: [TA1].
This tape file is now ready for inter-file checks.

2.3.5. The L-file [CL1].

All stages run parallel to those for the A-file; see section 2.3.4. and, for more detailed description, section 2.3.2. Eventual result: tape file: [TL1], which has to await further inter-file checks.

2.3.6. The summary files [C11], [C21], [C31], [C41].

The processing of these files will follow parallel lines; I will discuss them under one heading. The stages required for the checking and processing of the summary files are:

2.3.6.1. Creation of sorted file.

Sort on card number (column 2), year of return (columns 11-12) and ID number (columns 3-10). This will result, for each of the summary files, in a sorted file: [CN2] and an error file: [CN3]. The error file contains cards with incorrect card numbers, year identifications and ID numbers. Cards in [CN3] will be corrected and added to [CN2] wherever possible; the remainder is to be added to [CER].

2.3.6.2. Check on missing or superfluous cards.
A number of rules can be set up for card requirements:
(i) Every year-record should at least have a card "1";
(ii) No year-record should contain more than five cards (3 for [C42]);
(iii) No card should occur more than once;
(iv) Within each year-record, no card should follow the card with "Z" in column 77;
(v) There should be as many cards in a year-record as the card number on the card with "Z" in column 77.
A check on these conditions will produce a file of correct records: [CN4] and a file of incomplete records and superfluous cards: [CN5]. Corrections from [CN5] will be added to [CN4]; the remainder will be added to [CER].

2.3.6.3. Card edits.

Separate card edits can be run for each of cards 1-5 for all files concerned (1-3 only for [C44]), resulting in: [CN6] with correct and: [CN7] with incorrect records. Corrected cards from [CN7] together with [CN6] produce: [CN8]; unidentified cards from [CN7] go to [CER].

2.3.6.4. Creation of tape records.

[CN8] can be converted to tape, sorted by ID number and year: [TN1].

2.3.6.5. Merge of separate summary tape files.

The four tape files created above, [T11], [T21], [T31], [T41], can be merged (by ID number, year) into one preliminary tape file: [TP1], on which several further checks are to be done.

2.3.6.6. Intra-record checks.

Several consistency checks within each record will have to be made; this stage will depend heavily on J. Geffert's CONSIST programme.

2.3.6.7. Inter-record, intra-file checks.

Several consistency checks between records can be made. They fall into the following categories:

1. Checks between subsequent years for the same individual. Various checks can be set up for a multitude of fields, especially absence/presence of returns (current year, previous year and following year), address locations (comparing two subsequent years), labor force status, occupation code and industry code, combined with a code indicating the change in labor force status, etc.

2.
Checks between spouses' records, same year. Several fields should be identical, for any one year, for the records of two spouses: residence location, county code, county prior year, address change, non-resident indicator, marital status code, information regarding recent marriage, dissolution of marriage. Other fields should be compatible: spouse separate income vs. presence or absence of spouse's return; marital status code vs. presence or absence of return, etc.

3. Checks between all persons on [TP1]. Extract, for all persons with at least one year-record on [TP1], items containing the ID number as well as the social security number: [TPS]; sort this file on social security number and print out all cases where the same social security number occurs more than once.

After all the corrections indicated by the checks above have been made, [TP1] should be internally consistent and ready for the next stage.

2.3.6.8. Inter-file checks.

[TP1] can be compared with [MF]. A comparison by absence/presence of records will separate items into three classes:

1. Those present on [TP1] as well as on [MF]. Several fields here should match (social security number, address codes vs. address change codes, etc.).

2. Those present on [TP1], absent on [MF]. These people should either be new filers (household unit number > 4000) or filing wives of husbands who filed (either present on [MF] or on [TP1] or on both). Any person who does not fit into either of the above categories requires further investigation.

3. Those present on [MF], absent on [TP1]. These people should either be holders of "multiple" ID numbers, whose returns are on [MF] under a "secondary" number (all these cases should be on [TM6]), or people who really did not file on [TP1] -- all these cases should be identified on the remaining preprinted labels. All cases not identified under either category should be investigated further.
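The three-way separation by presence or absence of records, and the first triage of the records found on [TP1] only, can be sketched as follows. This is a minimal illustration in Python under stated assumptions: each file is represented as a dictionary keyed by ID number, and the field names "household_unit" and "husband_id" are hypothetical stand-ins for whatever the actual record layout provides. The threshold of 4000 for new filers follows the household-unit rule stated for newly assigned ID numbers in section 2.3.1.6.1.

```python
def classify_presence(tp1, mf):
    """Split ID numbers into the three classes of the presence/absence check.

    tp1, mf: dicts mapping ID number -> record (hypothetical representation).
    Returns (on_both, tp1_only, mf_only), each a sorted list of ID numbers.
    """
    on_both = sorted(tp1.keys() & mf.keys())
    tp1_only = sorted(tp1.keys() - mf.keys())
    mf_only = sorted(mf.keys() - tp1.keys())
    return on_both, tp1_only, mf_only


def triage_tp1_only(ids, tp1, mf):
    """Separate explained cases from those needing further investigation.

    A record is explained if it belongs to a new filer (household unit
    number above 4000) or to a filing wife whose husband is on either file.
    The field names used here are assumptions for illustration only.
    """
    explained, investigate = [], []
    for id_number in ids:
        record = tp1[id_number]
        husband = record.get("husband_id")
        if record.get("household_unit", 0) > 4000:
            explained.append((id_number, "new filer"))
        elif husband is not None and (husband in tp1 or husband in mf):
            explained.append((id_number, "filing wife"))
        else:
            investigate.append(id_number)
    return explained, investigate
```

The records on [MF] only would be triaged in the same manner against [TM6] and the remaining preprinted labels.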
Corrections made on the basis of these checks should be added to [TP1], which then will have to be re-sorted by ID number: [TP2]. This file is now ready to be submitted to further inter-file checks.

2.4. Inter-file processing (new files).

All the files which have been produced after all the steps in section 2.3 are now ready for inter-file consistency checks:

2.4.1. [TI7] against all other files:

1. All records on all other files ([TP2], [TM6], [TS5], [TA1], [TL1]) must have a record on [TI7]; every ID number for which there are records on any one of the other files, but not on [TI7], must be investigated. For the records that "match" on ID number, other fields should match too (especially soc. sec. number).

2. All records on [TI7], except non-filing wives (records with ID numbers ending in NN), must be present on [TP2]. Any record on [TI7] but not on [TP2] should be investigated.

2.4.2. [TP2] against all other files:

1. All records on [TM6] must have at least one record on [TP2]; for all these cases, the multiple ID indicator must be "1"; for all records on [TP2] but not on [TM6] the multiple ID indicator should be "0".

2. All records on [TS5] should have at least one record on [TP2]; for all these cases, the multiple Social Security number indicator must be "1"; for all records on [TP2] but not on [TS5] the multiple Social Security number indicator should be "0".

3. All records on [TA1] must have year-records on [TP2] (for the same year, naturally), with "L" in the final continuation indicator; all records on [TP2] with an "L" in the final continuation indicator must have a record on [TA1]; no record with anything else in the final continuation indicator should have a record on [TA1].

4. All records on [TL1] must have a year-record (for the same year) on [TP2]; the "total income" on the year-record in [TP2] must be equal to "income previously taxed" on the record in [TL1].

2.4.3. Integration of [TA1] and [TL1] with [TP2].
After all corrections of errors found in the two preceding stages have been made, files [TA1] and [TL1] are ready to be integrated with [TP2]: [TP3] (integrated new master file), which is now ready for the merging with [MF].

3. Preparation of old files.

Several of our files will require preliminary adaptation, as well as some "cleaning up", before the new data can be integrated. This is due to a) considerations for longitudinal studies (see also WAIS 667-012), and b) the fact that the new data will contain different types of information and, frequently, more detailed information than the old data. The best approach, in my opinion, for the preparation of the existing files will follow the stages as described below.

3.1. General Procedures.

3.1.1. Elimination of multiple ID numbers.

Given the fact that the new data will have integrated all returns for each person under a separate and unique ID number, and given the fact that the new data will not have any records under a "dependent's" ID number (i.e. a number with a digit other than "0" in the last position), and assuming that for longitudinal studies we should have, as far as possible, all returns for each person together, we can conclude that it is desirable that no person in our sample have records filed under two (or more) ID numbers. An initial approach to this problem is currently in progress: all persons who, under the existing rules, have more than one ID number illegitimately, are being investigated and "illegitimate" ID numbers are eliminated. After that job has been completed it seems desirable to eliminate all other multiple ID numbers as well -- file [TM6] could, if desired, serve as a source for the integration of additional records.

3.1.2. Various inter-record consistency checks.

The consistency checks for the presence or absence of records should either be run again, or the available output should be processed. Existing errors should be corrected.
This phase, again, is also essential for the creation of the "longitudinal selection file" (see WAIS 667-012).

3.1.3. The "gap-plugging" process.

The "gap-plugging process", as described in WAIS 667-012, can be executed at two different points during the processing of the new data: before or after file integration. A number of factors will affect a decision about the timing, the most important ones being a) cost, b) speed, and c) accuracy. The ideal approach would be to "plug" after the files have been integrated; although we lose in speed, we gain in accuracy (at a lower cost). We could, however, compromise and do a "preliminary plug" on the old data only, followed by a more elaborate and accurate procedure after the integration of the files.

3.2. Preparation of the FFID-file.

Several fields on the new data differ from the comparable fields on the old data in size and/or form. For the FFID-file, the fields affected are:

1. Title. This field will contain three (3) digits on the new data, vs. two (2) on the old data. The easiest way to make this field compatible is to put a "blank" in the third digit of all "expanded" old records.

2. Post Office (city). This field will contain 17 digits rather than 21. A preliminary check indicated that reducing the size of the field will not result in the mutilation of the name of any city; for out-of-state addresses it may result in the elimination of the state code. A check on the current FFID-file to find all cases where positions 18-21 of this field are not all "blanks" would be useful; if very few records show up or if the records which show up can all be easily corrected, I recommend that we reduce the size of this field on the old FFID-file. If corrections are too numerous, we may consider expanding the new FFID codes.

3. Postal zone. This field will contain five (5) rather than two (2) digits; [FFID] can be expanded by inserting three blanks to the left of the postal-zone field.

4. Age in 1964.
This field is not contained in the current FFID. The best approach seems to be to insert a 2-digit field with "99" following the county code (positions 120-121).

5. Date of death. This field, too, is not contained on the current FFID. The best approach, again, seems to be to insert a six-digit field, filled with "8", after field "age in 1964" (see paragraph 4 above).

After the format of [FFID] is changed, this file is ready for the integration with [TI7] (see section 4.1 of this paper).

3.3. The preparation of the master file.

In the following discussion it is assumed that the elimination of multiple ID numbers and the correction of inter-record inconsistencies, as described in sections 3.1.1. and 3.1.2. of this paper, have taken place. Several fields show up differently on the old and the new master file: amount fields are sometimes dropped, in other cases added. In the case where the old master contains information which is missing on the new data, the new data will show zero amounts; in the case where the new data carry a previously non-existent amount, the old master record format will have to be expanded and carry a zero amount. Significant differences show up, however, in the coded demographic data. There we will probably run into a large amount of clerical checking to make the two sets of data compatible. I nevertheless think that we should try to make most of the fields compatible eventually. The main fields with differences are:

1. Return filed indicator (field 9). This field is not included in the old data. Its main purpose will probably be to produce a list of persons for whom we want to return to the Tax Archives to microfilm missing returns. Since including the field in the final record does not seem very useful, there is no need to expand the old master file for this field.

2. Inconsistency indicator (field 9A). The same remarks apply as above (field 9).

3. Residence location (field 10).
The old data have a 3-digit numeric code, while the new data have a 4-digit alphabetic code. I would suggest a 1-to-1 machine recode for all the cases where this can easily be done; the residual may be listed by machine and subsequently coded manually.

4. Address change (field 13). The code for the new data provides more detail than that for the old data. Codes "0" and "9" do not require recoding. I suggest that all other codes be recoded on the basis of the recoded residence location -- the majority of cases can be done by machine, a residual by hand.

5. Non-resident indicator (field 13A). This field is not contained on the old data. A machine procedure could code all Wisconsin addresses; all records with address codes outside Wisconsin can be coded manually.

6. Occupation code (field 14). The new codes provide more detail. The old codes can be subdivided into the following categories:
a) Those where the new and the old codes are identical (e.g. codes 01-05, 07-08, 99) -- no change necessary;
b) Those where the old and the new codes are different, but with a 1-to-1 relation (e.g. codes 10, 11, 12, 14) -- these can be machine-recoded;
c) Those where the old codes specify more detail than the new codes (e.g. old codes 09, 13 will both be recoded into new code 19) -- these also can be recoded by machine;
d) Those where the new codes give more detail than the old codes -- these cases should be recoded manually.

7. Industry code (field 15). This information is not given on the old file. We may consider a procedure in which those records which can easily be recoded on the basis of their occupation code are coded by machine; the remainder could either be manually recoded or could temporarily be given a distinct code (e.g. "9999 - old data still to be coded") -- after the integration of old and new data several additional subcategories can be coded by machine.

8. Occupation change (field 16). The new code supplies more detail than the old one.
After a recode of the industry and occupation codes (see paragraphs 6 and 7 above), a large subgroup can be recoded by machine. For the time being we should recode this field on the old file to a distinct code (must be alphabetic, since all numeric values are used in the new code).

9. Return filed previous year (field 17). The new data for this particular field provide less detail than the old ones (but additional information is provided in field 17A, see paragraph 10 below). This field can, therefore, quite easily be recoded by machine.

10. Labor force participation for previous year (field 17A). This field is not contained in the old data. Most codes can be assigned by machine on the basis of information in the fields "return filed previous year" and "occupation code". The remainder can be done manually.

11. Marital status code (field 18). The old data do not contain this field. In many cases, the recoding can be done by machine (based on information given in fields "Spouse separate income?", "Marriage during tax year?", and "Head of Family exemption claimed"). The remainder can be coded manually.

12. Consistency indicator (field 18A). This field, also, is not contained in the old data. It can easily be coded by machine once field 18 has been coded (see paragraph 11 above).

13. Spouse separate income (field 19). The specific codes assigned are slightly different for the new data; recode of the majority can be done by machine; a small remainder will have to be done manually.

14. Spouse's income reliability indicator (field 19A). This field, too, is new. The majority can be coded by machine on the basis of the recoded information in field "Spouse separate income"; the remainder will have to be done manually.

15. Information re: recent marriage (field 20). This field is contained in the old data, but the specific codes assigned have different meanings. Most can be recoded by machine; doubtful cases can be verified manually.

16. Dissolution of marriage (field 20A).
Another new field. The majority of cases (i.e. all cases where no dissolution of marriage took place) can be coded by machine; the remainder plus all doubtful cases will have to be coded manually.

17. Dependents' age code (field 22). Another new field. In this case, no coding by machine can be done. The most efficient method is probably to assign a temporary code to all people with dependents; after the integration of the new and the old files we may well be able to "work backwards" and thus code many cases by machine on the basis of information in later years; the remainder can then still be done manually.

18. Dependent's address code (field 23). The same remarks apply here as to field 22 (see paragraph 17 above).

19. Students over 18 in college (field 23A). See remarks for field 23 in paragraph 18 above.

20. Head of family exemption claimed? (field 26). The new code provides more detail than the old code. Recoding would not be necessary if we assumed that every taxpayer always correctly claimed (or correctly did not claim). The best procedure is probably to postpone any recoding until the new data have been processed to a point where we can get some idea about the proportion of people who incorrectly claimed or did not claim. If the proportion is negligible, recoding would not be necessary; if the proportion is large, recoding could still be done manually.

21. Reliability indicator (field 30). Another new field. Machine coding is practically impossible; it may not be worthwhile to code all of our records manually. After integration of old and new files and after recoding of occupation code and industry code we may consider investigating some questionable cases.

22. Next year filed indicator (field 31). Another new field. In most cases, the recoding can be done by machine (e.g. all cases where the following year's return is present); the remainder can be done manually.
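The division of labor between machine and manual recoding that recurs in the list above can be illustrated with the occupation code (paragraph 6). The sketch below, in Python, is only an illustration: the identical codes and the recode of old codes 09 and 13 into new code 19 are taken from paragraph 6, but the new-code targets shown for old codes 10, 11, 12 and 14 are invented placeholders, since the actual 1-to-1 table is not given here.

```python
# Category a): old and new codes identical (from paragraph 6).
IDENTICAL = {"01", "02", "03", "04", "05", "07", "08", "99"}

# Category b): 1-to-1 relation. The targets are HYPOTHETICAL placeholders;
# the real table would have to be supplied from the code books.
ONE_TO_ONE = {"10": "20", "11": "21", "12": "22", "14": "24"}

# Category c): old codes more detailed than new ones (from paragraph 6).
MANY_TO_ONE = {"09": "19", "13": "19"}


def recode_occupation(old_code):
    """Return (new_code, method) for one old occupation code.

    method is "machine" when the new code follows from the old one,
    and "manual" (with new_code None) when the new codes give more
    detail than the old code, i.e. category d).
    """
    if old_code in IDENTICAL:
        return old_code, "machine"
    if old_code in ONE_TO_ONE:
        return ONE_TO_ONE[old_code], "machine"
    if old_code in MANY_TO_ONE:
        return MANY_TO_ONE[old_code], "machine"
    return None, "manual"


def split_for_recoding(old_codes):
    """Partition a list of old codes into machine-recoded pairs and a manual residual."""
    recoded, residual = [], []
    for code in old_codes:
        new_code, method = recode_occupation(code)
        if method == "machine":
            recoded.append((code, new_code))
        else:
            residual.append(code)
    return recoded, residual
```

The residual list corresponds to the cases that would be listed by machine and subsequently coded by hand.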
After the format of the master file has been changed and the necessary recoding has been done (either temporarily or permanently), the file will be ready for integration with the new data, as well as further consistency checks.

4. Integration of old and new files.

4.1. Updating of [FFID].

A merge of [FFID] and [TI7] can take place after completion of the stages as mentioned in sections 2.4.1 and 3.2. Three situations can occur:

1. Both files contain a record for the same ID number. Always place the new record (from [TI7]) on the output file: [FFII]. If there is any difference between the two records, a check-item has to be produced on a separate file: [FFIC], to facilitate checking of the data after the file integration.

2. Only [TI7] contains a record for an ID number; the record can be placed directly on [FFII].

3. Only [FFID] contains a record for an ID number; the record can be placed directly on [FFII].

4.2. Merge of [MF] with [TP3].

If [MF] has been changed to a format compatible with that of [TP3], the merge can take place without any trouble. The later stages of old file recoding, as well as further intra-file consistency checks, will follow after the merge has been completed.

4.3. Updating of the history file.

The last stage in the processing of the summary data is the updating of the history file; information regarding the new data has to be added to the existing information.

http://www.ssc.wisc.edu/wais/WAIS667015.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667015.txt

Mike von Schneidemesser
WAIS 667-016
November 4, 1966
Administration Maintenance System - Files, Data, Etc.

Documentation and Housekeeping Procedures for WAIS Staff

All WAIS Staff members are requested to comply with and implement the following procedures.

I.
Programs and Files

Relevant programs are those written and maintained by WAIS personnel and those specifically written for WAIS by other people. A file for WAIS purposes has a very broad meaning. Every device containing data (but not procedures and instructions) should have an entry in the file catalog. Examples for files can be found in WAIS 667-014 under A. 1.-4., 6.; B. 5.-8.; C. 1.-4.; D. 7., 8.; E. 4., 6.; F. 1.-4.; H. 1.-2., 5.

At the time a new program or file is planned, fill out a program catalog sheet or a file catalog sheet as completely as possible at that stage and file it in the respective catalog. The catalogs are contained in a blue binder and have subdivisions for each file. If there is no subdivision for the file you are setting up or no proper subdivision for your program, either set up this subdivision or file the sheet in the front of the catalog. For programs affecting more than one file, make out additional sheets, filling out only the first two or three rows, and make a reference to that part of the catalog where a complete description of that program can be found.

At the time a program or file has been checked out or generated, fill out the missing items in the catalog sheet. At the time a program or file is being changed or stored at a different place, be sure to make the appropriate entries or corrections on the catalog sheet. When discarding a program or erasing a file, remove the catalog sheet from the catalog.

In the job or header card or similar identification medium of a program the minimal entries should be: a meaningful statement about program purpose or a meaningful program name, accounting information, and the programmer's initials. ("ZZ" is not enough, since meaningless; "ANOTHER MESS" is too general, since it describes about every program.)

The responsibility for the above rests with the staff member who has the file or program produced.
There follows a copy of a file catalog sheet and a program catalog sheet, replete with ideas on what to do with these forms.

FILE CATALOG SHEET

Identification code: Not yet available.
File name: The one(s) used in WAIS papers.
Type of file: e.g. card, tape (density if other than 556), booklets, forms, etc.
Arrangement, sorting sequence: Primary and secondary sort fields; for tape files: blocking factor times record length, internal tape labels if used. Maybe some indication of quality of file.
Date generated: Date and initials of responsible individual.
Date of updates, changes: Format changes, changes of population size, corrections made; location of correction (update) cards if other than WAIS office.
Format described in: Number or description of document where the layout can be found.
Other relevant papers: All documents which mention file, especially those containing descriptions of its generation, having explanations of codes used, etc.
Location of file: Your desk, SSRI Office, card drawer #, room and building.
Labels, tape numbers, color of cards etc.: External tape numbers.
Location of listing: Drawer # or name, room and/or desk, shelf.
Labeling of listing, index: Everything which has been marked on the sides of the paper stack, contents of headings, both machine or longhand.
Summary description of file: Not needed if sufficiently documented in WAIS papers referred to above.
Damages, missing items etc.: Strange records, gaps, things which will cause trouble to the user if not anticipated.

PROGRAM CATALOG SHEET

Identification code: Not yet available.
Program name: Name by which program is referred to in papers, or title card entry, or IDENT-portion in card deck.
Programmer, date checked out: Initials, month and year.
Inputs (1), (2), (3), file name or code: Use file name also used in file catalog. If file has more than one version, state version #; card, or tape with record length x blocking factor.
Outputs (1), (2), (3), file name or code: Same as above.
Program described in: WAIS paper #; if no WAIS # available, describe location of write-up if different from program location.
Other relevant papers:
Physical form of program: Auto, Fortran, Cobol source, object for machine; coding forms, flow chart, if previous things not available.
Location of program decks: Building, room #, drawer # or name.
Labeling of card deck, color of cards, etc.: If not yet labeled, mark it please!
Location of program listing: Shelf, your desk, room #.
Labeling of program listing: State anything which has been marked on the sides of the paper stack, or name of book, or if neither: "see job card".
Summary description of program: One or two sentences about purpose of program; not necessary if program is described in a WAIS paper referred to above.
Changes made and date: Fill in here whenever you modify program, not when you just improve its performance. Do not forget to make corrections or amendments for other entries like location.

II. Magnetic Data Tapes

All tapes used by WAIS (those owned by WAIS or for which WAIS pays rent) must be entered on the erasable tape use chart. If a file is contained on these tapes, then there must also be an entry in the file catalog. The responsibility for this rests with the WAIS programmer, or the staff member who submits the tapes.

III. Error or Correction Cards

For each tape file for which a maintenance program exists, a card box is kept in the WAIS office labeled with the file name and "Update to do". After a successful update run the used cards should go into a box marked with the file name, "Updates done", and the date of the run. If more than one box of these cards accumulates, label it additionally with WAIS and bring it to the SSRI card storage room. Make a note in the "Date of Updates" field of the file catalog sheet pointing out that old update cards can be found in the SSRI storage room.
Responsibility for filing the new update cards rests with the staff member who coded them or had them made up. Filing of used update cards should be done by the user of the maintenance program.

IV. Control Cards

Control cards for WISTAB, XTAB, RGR and other canned utility programs should be kept in the drawer marked with the particular program name. Label each deck with the file with which it is to be used. On the face of each deck write the WAIS paper number in which the tabulation run is documented. For XTAB include the name of the driver program and file this together with the control cards, if owned by WAIS. CARD EDIT control cards have a separate drawer and can usually be identified from the entries on the first "twenty-eight" card. Responsibility for filing these control decks rests with whoever used them last.

V. WAIS Papers

The WAIS "DATA processing" binders will be kept up to date by the WAIS secretary. Each volume of these binders is preceded by a chronological index of all WAIS papers and some of them by a subject and author index. For the use of the personnel and visitors WAIS keeps copies of each paper in the WAIS-paper file ordered by WAIS-paper numbers. If you return a paper to this storage drawer, put it back in its proper sequence. If no paper is available, have some new copies made up by the WAIS secretary. She usually will have the mimeo and a few copies of the paper.

VI. Tabulations, Statistical Extracts

At this point no decision has been made as to how to index these end products of the WAIS work. But in any case we can document each set of tables in the same manner as any other file on a file catalog sheet. There is a special place in the file catalog for tabulations and extracts. The control card deck used for the table generation is to be stored in its respective card drawer (XTAB, WISTAB, RGR etc.), the listing either bound into a book or placed in the computer output shelves, which WAIS will hopefully acquire.
For any but the smallest set of tabulations a WAIS paper like 667-010 should be prepared to supply sufficient information for the interpretation of the tables. It is important that for each set of tabulations or each statistical job a unique name be chosen from the outset, as long as WAIS cannot afford an indexing system. This name should be used to label all listings, control decks, and WAIS papers.

VII. Forms for coding

A special drawer to contain unused coding forms will be set up, as soon as space allows.

http://www.ssc.wisc.edu/wais/WAIS667016.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667016.txt

Richard A. Bauman
WAIS 667-014
October 7, 1966
Administration Maintenance System - Files, Data, Etc.

On a WAIS Inventory and a Proposal for a Finder File

WAIS 667-010 is a significant step in the effort to achieve an indexing system for WAIS's materials. The history files, the file catalog, and WAIS documentation papers are examples of other efforts along this line. Some important needed bits of information, however, always seem to elude classification, and therefore are in danger of:
a) being lost
b) being misused
c) being ignored
For example -- the recent riddle involving what sort of C-Card, P-Card, I-Card file we have was solved in a particularly painful way.

An elementary inventory of WAIS property should include, as a minimum:

A. Input Documents and Forms
1. Files of Source Documents
Examples:
a) Microfilm copies of 1946-1960 Tax Returns (old)
b) Microfilm copies of 1959-1964 Tax Returns (new)
c) Benefit Data printouts and SSA 9249's
d) Interviews, Booklets, Int'r's. Supplements
2. Files of Non-Source Documents
Example: Pretest Interviews, etc.
3. Files of Form 1's, Form S's, etc.
4. Completed Coding Forms
Examples:
a) FFID coding forms
b) "Old Tax Return" Demographic code sheets
c) "New Tax Return" Demographic code sheets
d) Special purpose 80-col.
code sheets 5. Blank (sample) Printed or reproduced Forms Examples: a) Survey "memoranda", Interview, Booklet b) "Old Tax Return" Demographic code sheets c) Survey pretest 1, 2, etc. 6. Microfilm Reels of Tax Return Samples B. Hollerith Card Files l. Program object decks 2. Program source decks 3. General program control card decks Examples: a) card edit control cards b) X-tab, Wistab control cards 4. Test card decks 5. Basic Data Card decks Examples: a) Benefit data cards b) Property file data cards c) Survey cards 6. Single-purpose, data revision cards Examples: a) MAUPDATE N-cards b) Revised benefit data cards c) revised survey data cards 7. Multi-purpose Data Revision Cards Examples: a) Multiple ID cards b) C-Cards, J-Cards, I-cards c) FFID Update cards 8. Other reference cards Examples : a) SSA error cards b) SSA Identified claims cards 9) Card decks which defy description C. Magnetic tape files 1. Card-image tapes 2. Intermediate tapes 3. Final (Permanent) tapes 4. Security tapes 5. output tapes (tables, listings, etc.) 6. Scratch tapes 7. Extracted records on tape 8. Outside files on tape Example: The "compustat" file D. Programs 1. Flows charts 2. Systems flow charts 3. Printed program listings 4. General purpose program write-ups E. Documents 1. Numbered WAIS papers 2. WAIS publications--monograph series 3. Unnumbered WAIS papers--duplicated Examples: 4. a) papers produced prior to numbering system b) lists of Form 1's from the survey Hand-written records Examples: 5. a) New ID's assigned in coding new tax data b) New ID's assigned in coding benefit data Correspondence 6. Card index files 7. Reprints of Journal articles, etc. involving WAIS 8. SSRI papers relevant to WAIS F. Output listings and tables 1. Extract listings 2. Complete file listings 3. Error listings 4. Tape file prints 5. Table input arrays 6. Intermediate or test tables 7. 
Production tables Note: Each of these should also be identified according to: a) Number of original copies b) Account for all copies c) Number of annotated copies d) Account for annotated copies G. Hardware Examples: l. File cabinets 2. Microfilming Equipment H. Library of Outside (Not SSRI) Publications and Duplications Examples: 1. Wisconsin Taxpayer's Alliance "Taxes" 2. Selected standard and Poor's stock listings xeroxed 3. Social security handbook 4. Dictionary of occupational titles 5. Description of Wisconsin Corporations I. Supplies N.E.C. The Inventory Finder File A basic finder file for all of the materials listed above should be developed. A 3 x 5 card file containing the following description for each item is suggested: 1. Major class (letters A through I above) 2. Minor class (numbers 1 through X above) 3. Item description (letters a-n above) 4. Physical location(s) of Item In addition, the following description should be entered wherever applicable: 5. Description of other indexing device applicable to this item 6. Can device in (5) be used now? Why not? 7. Size (e.g. no. of records, cards, copies, etc.) 8. Comments on condition 9. Date (s) promulgated For many of WAIS's materials, the finder file may be the only available and/or necessary indexing system. In almost all cases, it will be the basic reference tool. Supplementary or Special-Purpose Indices The finder file is not intended as a substitute for various other indices and cross-indices. There is a need for these as well, however there seems to be an insurmountable lag in preparing and keeping up-to-date most of the useful indices. Hopefully, the suggested finder file can be prepared and utilized quickly and may be used as a more arduous substitute for other indices. Other useful indices include: 1. Chronological Index of WAIS papers (available now) 2. Author-subject indices of WAIS papers and other documents (proposed) 3. WAIS tables and listings indices (proposed) 4. 
Index of miscellaneous output (available now) 5. The file reference manual (available now) 6. A variable location index (Proposed) 7. The history file and associated output listings (proposed) 8. A correspondence file (available now) 9. A grease-pencil tape use chart (available now) 10. Various project personnel heads (available only when communication is possible)hahttp://www.ssc.wisc.edu/wais/WAIS667014.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS667014.txt(N Richard Bauman Alan Duchan 1966^WOn Interpretation of the Tax Averaging Tables and a Proposal for Further Summary TablestNovember 8, 1966 WAIS paper667-017Averaging Studies Tables' 'Richard A. Bauman Alan I. Duchan WAIS 667 -017 November 8, 1966 On Interpretation of the Tax Averaging Tables and a Proposal for Further Summary Tables 1.) A "production" run of a portion of the Age-Occupation Tables as specified in WAIS 667-004 was recently made. Some errors were discovered in the "new variables" tabulated, i.e. GRPWFINC, AGE, and OCCUPTN, making necessary some revisions before the final Age-Occupation tables are produced. Table "B" of the Age-Occupation table is comparable to several (marginal) distributions in the Treasury tables, e.g. Tables El, Fl, H1, and J1. (See Duchan's WAIS 656-058). Table B is not affected by the above errors and is not inconsistent with the results of the treasury tabulations. Some of the comparable summary statistics are shown in the following table. Note that two different values of PCV (percentage change variant or a and y) were used in the new tables. 

                               Number of Units With POTAVINC > $3000
                           Treasury Tables           Age-Occupation Tables
   DIRFLUC        PCV    LEGALDEF=3  LEGALDEF=4     LEGALDEF=3  LEGALDEF=4
   0-Positive     1.00      241         267            241         267
       "          1.10       *           *             210         233
       "          1.25      175         196             *           *
       "          1.33      164         187            164         187
       "          1.50      162         178             *           *
       "          2.00       *           *             186         191
   1-Negative     1.00      178         188            178         188
       "          0.90       *           *             145         146
       "          0.80      122         118             *           *
       "          0.75      111         102            111         102
       "          0.67       93          79             *           *
       "          0.50       *           *              72          58

   DIRFLUC = direction of fluctuation. PCV = α if DIRFLUC = +, γ if DIRFLUC = -.

2.) We were originally alarmed when we discovered that there was an increase in the number of persons with POTAVINC > $3000 when PCV increased from 1.33 to 2.00, and went off in search of an explanation. This phenomenon also occurs in a small number of cases in the Treasury Tables but not as dramatically, presumably because of the values assigned to α and γ.

3.) We discovered that the assumed relationship between POTAVINC and PCV is a special case of the more general relationship. The relationship depends on the signs of FNTI and B, as is shown below:

   P = POTAVINC    B = BINCOME    F = FNTI    L = LEGALDEF    α, γ = PCV

   P = F - αB   if F > B,   ∞ > α ≥ 1
   P = γB - F   if B > F,   0 ≤ γ ≤ 1

Case Relation of B to F .._ Absolute Value Solved sign 1.i .. c 1 P as a - Case of _ s Sign Expression. for of P Value Function of Occurs h to :F of of for P Of 1.00 when of P a or 7 is: When B _ a,y=1 B = J F > B + P=F - a B P =F B + decreasing 1,2,3,4 F > B + 0 P=F P-P + P=F Single valued 1,2,3,4 F > B P=F . F 18| P=F + 1! 1 + '+' F3 | B - P=a|B P= |B| + |B| < P<+oo increasing 3,4 - - P P=|B|-|F| |B|-|F|<<+w increasing 3,4 F < E + P = yB-F P=B-F + -FOP B-F increasing 1,2,3,4 j F B + P=yB P=B + 0 B, Cases 1 and 2) or is either an increasing or single-valued function of y (F0 may also have B <0.

A second method is to examine the number of people in the zero column of POTAVINC under PCV = 1.0, DIRFLUC negative. These people must have exactly zero POTAVINC, because with PCV = 1.0 they need P = B - F ≤ 0 to fall into this column.
But to be on the DIRFLUC negative page, they must have B - F ≥ 0. For both inequalities to hold, B - F = 0. Under LEGALDEF's 3 and 4, these people did actually have no fluctuation, because negative net taxable incomes are allowed. LEGALDEF's 1 and 2 include these people plus others who had a negative income in at least one base period year and non-positive incomes for all base period years. Thus, a comparison of the people in this category under LEGALDEF's 1 and 3 (and again, under LEGALDEF's 2 and 4) will set a minimum on people with B0, that classify by F - B .

The procedure is the following. For a given LEGALDEF and for DIRFLUC positive, assume that for PCV = 1.5, N2 people have P > 0, and for PCV = 1.33, N1 people have P > 0. Then if all of the people have B > 0, N1 - N2 have 1.50 > F/B > 1.33, or .50 > (F - B)/B > .33. Similar computations will work for people with negative fluctuations.

For persons with B < 0, no classification other than by absolute fluctuation appears possible without rather large changes in Ellis' program. If some other classification is to be made, the values of α and γ selected must allow P to go from positive to negative as α or γ varies. Thus the number of people with P > 0 for one value of α or γ can be compared to the number of people with P > 0 for another value of α or γ. But with B0 for any a>1 For negative fluctuations P=yB-F I FI - y|B| PLO for any y such that 00. Of course, as PCV changes, the numbers of persons on a table will change.

The intervals proposed for POTAVINC are:

      0 < P < 100
      100 < P < 1,000
      1,001 < P < 2,000
      2,001 < P < 3,000
      3,001 < P < 5,000
      5,001 < P

The purpose of the smallest interval is to indicate the people whose POTAVINC is so small that tax savings may not be worth administrative costs. The current high intervals have been dropped in order to increase the number of observations per cell.

The row variable will be B, with the same intervals as currently used for FNTI except that the upper income cells are combined.

      B < -1,000
      -1,001 < B < 0
      1 < B < 3,000
      3,001 < B < 5,000
      5,001 < B < 7,000
      7,001 < B < 10,000
      15,001 < B < 20,000
      20,001 < B

Besides the frequency table, we will call for row percentages of the frequency table, for means of POTAVINC, and for means of B.

Table size is computed as follows.

      Variable     Intervals
      GRPTYPE          2
      LEGALDEF         4
      DIRFLUC          2
      PCV              4
      POTAVINC         6
      B                8

      Cells for basic frequency table = 2 x 4 x 2 x 4 x 6 x 8   =   3,072
      Cells for means of B                                      =   3,072
      Cells for table showing GRPTYPE marginals
         (both frequency count and means)                       =   3,072
                                                                    9,216
      To obtain means of B requires calling for a second
      frequency count. Therefore, the table size given above
      must be doubled (x 2)                                     =  18,432

Note: Computer storage capacity is 32,000 words, so we can modify and enlarge this proposal if desired.

9.) For the programmer, two modifications are required. First, B must be introduced into the X-TAB array. Second, a "P > 0" IF statement must be inserted in the appropriate places. As usual, we will have to run a set of test cases before making the final run.

http://www.ssc.wisc.edu/wais/WAIS667017.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667017.txt

Richard Bauman, 1966. Outline and Timetable for Preliminary Processing of WAIS Files Relevant to Analyses of Property Income Data. November 22, 1966. WAIS paper 667-020. Property File.

Richard A. Bauman
WAIS 667-020
November 22, 1966

Outline and Timetable for Preliminary Processing of WAIS Files Relevant to Analyses of Property Income Data

I. Introduction

A. Scope

Roger Miller's complete outline for the "Portfolio Evaluation from Wisconsin Individual Income Tax Returns", WAIS paper #656-050, March 23, 1966, begins with three steps which are essentially preliminary processing of the several data files. These steps are (from I. D. (1)-(3), pp. 4-5):

   (1.) File completion
   (2.) Within-source consistency checking and correction
   (3.) Integration of files ... testing for interfile consistency and completing the reliability rating of the integrated data.

The outline which follows has two purposes: (1.) to indicate in somewhat more detail the specific steps to be taken in completing the files and checking them for both intra- and inter-record consistency, and (2.) to establish an estimated period for accomplishment of (1.).

B. Identification of Files; Pertinent Documentation

The following file codes are assigned in WAIS 656-050:

      Description                                              File Code
      WAIS Tax Return Master File                              File 11
      WAIS Tax Return Asset Income File                        File 12
      WAIS Interview File                                      File 13
      WAIS Assets Booklet File                                 File 14
      WAIS Data Files re Firms and Financial Institutions      Files 21-25

Basic documentation of File 11 is found in WAIS 645-017 (Keypunching), WAIS 645-038 (Coding) and WAIS 643-056 (Tape Record Format). File 12 is also documented in WAIS 645-017 (Keypunching - Phase II) and WAIS 645-038 (Coding - Part III). A revised and combined version of 645-017 and 645-038 is being prepared as Monograph II in the WAIS Monograph series. Files 13 and 14 are documented in WAIS 656-049 (Card Formats, Coding). Files 21-25 are inadequately documented at present; formats and a description of the files are being prepared.

II. Preliminary Processing of File 11

A. Summary Description of File

The WAIS Tax Return Master File contains information from tax returns of a sample of Wisconsin taxpayers who filed during the years 1946-1960 (approximately 20,000 persons and 154,000 year records). This file includes total amounts of "property income", i.e. total interest received, dividends, capital gains, net rent, etc. The file is complete.

B. Consistency

File 11 has been checked extensively for internal consistency.

C. Revisions

File 11 may be revised if the interfile check (see III. C) with File 12 reveals an error in File 11.

III. Preliminary Processing of File 12

A. Summary Description of File

File 12 contains detail information on asset incomes from the tax returns in File 11. Detail information was collected on:

   1.) Interest
   2.) Dividends
   3.) Net rent
   4.) Capital gains
   5.) Farm income
   6.) Business or professional income

Items 1, 2, and 4 were assigned a code which designated the source of income, i.e. the institution (if any) which was involved. This file is complete.

B. Consistency Checking

The following steps are being taken in checking File 12 for consistency:

   1.) Interfile check #1 - This portion of the interfile check (File 11 - File 12) checks for the existence of proper records; since File 12 contains detail on items found in File 11, certain correspondence requirements can be checked in this stage.
   2.) Internal validity check - This check tests for the presence of valid codes, blanks, characters, contingent required codes and other intra-record consistency.
   3.) Preparation of an extracted record containing summary detail from File 12.
   4.) Interfile check #2 - Comparison of the output of step 3 above with corresponding items in File 11.

C. Revisions

   1. Errors discovered in Interfile check #1 must be resolved by correcting File 11 or File 12.
   2. Errors discovered in the internal validity check must be resolved by correcting File 12.
   3. Some of the errors discovered in Interfile check #2 may be resolved by correcting File 11 or File 12.

Note: "Errors" which are the result of taxpayer inconsistency sometimes cannot (and should not) be corrected, e.g. a taxpayer lists New York Central - $12.00 under "interest" received (File 12) but transfers the total to "dividends" received. Assuming no other information is given (contiguous years, sale of the asset, etc.), it is impossible to establish whether he held New York Central stocks or bonds. "Resolution" of this type of error involves creating additional variables which describe the inconsistency. See Miller's I.C, Distinguishing Reliability.

IV. Preliminary Processing of File 13

A. Summary Description of File 13

The WAIS Interview File contains data from an interview taken in 1964 from a sample (1300 households) which largely includes persons (1157/1300) also in File 11 (and therefore File 12, if they reported income from assets). Interview data includes detail on assets such as owner-occupied homes or farms and summary data for other assets. The file is complete.

B. Consistency Checking

The following steps are being taken in checking File 13 for consistency:

   1. Internal validity check #1 - This portion of the internal check tests for the presence of certain required cards and contingent-required cards.
   2. Internal validity check #2 - This portion of the internal check uses the same card edit routine as III B 2 above.
   3. Internal validity check #3 - This is identical to Step 2 above except that it involves inter-card checks (within a record) rather than intra-card checks.
   4. Interfile check - Compare summary data in File 13 to detail data in File 14.

C. Revisions

   1. Errors discovered in consistency checks 1.)-3.) must be resolved by correcting File 13 or adding reliability codes.
   2. Errors discovered in consistency check 4.) must be resolved by correcting File 13 or File 14 or adding reliability codes.

V. Preliminary Processing of File 14

A. Summary Description of File 14

The WAIS Assets Booklet File includes detail information on assets other than homes or farms owned by most of those who responded to the Interview (1105/1300); most of these are also in File 11 (1005/1153). This file is complete.

B. Consistency Checking

Same steps as III B above.

C. Revisions

Same steps as IV C above.

VI. Preliminary Processing of Files 21-25

A. Summary Description of Files

The following eight files can be usefully distinguished:

      Description                                 Prev. File Code    Revised File Code
      Merrill (N.Y.S.E.) Data File                      21                  21
      Compustat (S & P) Data File                       22                  22
      Stockguide (S & P) Data File                      --                  23
      Wisbank File                                      23                  24
      Wisloan File                                      24                  25
      WisSEC File                                       --                  26
      Resid. (Firm) File prepared before coding         --                  27
      Residual File                                     --                  28

File 21 contains price and earnings data for a large number of firms whose securities were traded on the N.Y.S.E. This file is complete.

File 22 contains balance sheet and income data on what is essentially a subsample of the firms in File 21. This file is complete.

File 23 contains price and earnings and some summary balance sheet data on firms which are not in File 21 but which are held by taxpayers in Files 12 or 14. This file is incomplete. Source documents for a selected period have been coded and are ready for keypunching.

File 24 contains data on Wisconsin banks. This file is complete.

File 25 contains (at present) only a list of savings and loan associations in Wisconsin. Data of questionable value for our purposes is available; therefore this file is incomplete.

File 26 contains (at present) only a list of names of firms or institutions not in Files 21 - 25 but whose securities were approved for sale in the State of Wisconsin by the Securities and Exchange Commission. This file is incomplete. Some price and earnings data may be available.

File 27 contains (at present) only a list of firms not on Files 21 - 26 which may have been held by Wisconsin taxpayers. This file is incomplete but presumably has a different reliability than other files because of its sources.

File 28 contains (at present) only a list of firms not on Files 21 - 27 but which were owned by Wisconsin taxpayers. This file is incomplete.

B. Completion of Files

   1. Files 21, 22, and 24 are complete.
   2. File 23 is incomplete but requires only keypunching, verifying and card-to-tape processing to make it complete.
   3. Files 25 - 28 are incomplete. We do not plan to complete these in the immediate (3 - 4 months) future.

C. Consistency Checks

   1. Interfile check #1 on name and I.D. This check should be made to eliminate possible duplicates in the lists of names and ID#'s of firms.
   2. Intrafile check - File 23.
      A check similar to that in III.B.2 should be made on this file.
   3. Preparation of Files 25 - 28 for check of permissible "asset type" codes. This involves coding of differentiable firms-institutions (e.g. hospital, city, firm).
   4. Interfile check #2 (Files 21 - 28 - Files 12, 14) for permissible "asset type" codes.

D. Revision

   1. Interfile check #1 errors will result in corrections to Files 21 - 28.
   2. Intrafile check - File 23 errors will produce corrections to be made in that file.
   3. Interfile check #2 should result in corrections and reliability codes in Files 12, 14.

VII. Other WAIS Files Relevant to Property Income Analyses

A. New Master File Data. WAIS is currently coding and keypunching data similar to that in Files 11 - 12 for the same name group sample for the years 1959 - 1964. We made special efforts to gather tax return data for persons in Files 13 and 14 so that valuable interfile checks can be made when the "New Master File" is complete.

B. Benefit Data File. This file is nearly complete and contains data on one non-taxable income source for approximately 6000 persons.

C. Age File. This file is complete.

D. Selection File. This file gives coded information on missing year records in Files 11 - 12. The file is incomplete.

VIII. Estimated Completion Dates

A. Programming Requirements

   1. A general card-edit routine (suitable for checks such as III.B.2 above) is available. Only specific control cards are needed.
   2. A conversion routine (suitable for checks such as III.B.3 above) which prepares records for use with 1.) is available. Only specific control cards are needed.
   3. Other programming requirements are basically simple file match routines and simple extraction routines.

B. Timetable

      Steps                        Expected Completion Date
      II. B - C                    Complete
      III. B1 - C1                 12/15/66
      III. B2 - C2                 2/1/67
      III. B4 - C3                 3/1/67
      IV., V. B1 - B3, C1          2/1/67
      IV., V. B4, C1               3/1/67
      VI. B2                       1/1/67
      VI. C1, D1                   1/1/67
      VI. C2, D2                   2/1/67
      VI. C3                       2/1/67
      VI. C4, D3                   4/1/67
      VII. A (File)                6/1/67
      VII. B (File)                2/1/67
      VII. C (File)                Complete
      VII. D (File)                2/1/67

http://www.ssc.wisc.edu/wais/WAIS667020.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667020.txt

Mark Lieberman, 1967. Processing the WAIS Survey. January 19, 1967. WAIS paper 667-022. Survey Data and File.

Mark Lieberman
WAIS 667-022
January 19, 1967

Processing the WAIS Survey

WAIS 667-007 shows the WAIS survey, as it then existed, having too many inconsistencies for meaningful analysis. Most of the errors cited in section 3.3 of that paper still exist, since the present survey file is essentially the same as the one then existing. This paper gives a working outline designed to implement section 4.2 of that earlier paper. The goal of this scheme is an updated, consistent survey file that can be integrated with the rest of the project in as short a time as possible.

Inputs:
      Current survey tape
      Survey booklets
      Current survey tape
      Updated cover sheet cards
      Corrected master tape
      Survey inventory
      Corrected survey tape

Logic:
      Run C-cards, drops, etc. Correct ID's on surveys to correspond with tape ID's -> 10
  10  Check tape for presence or absence of cards
  11  Add or delete cards from tape
      Rerun check for presence or absence
      Errors still exist -> 11; no errors -> 20
  20  Check tape and updated cover sheets
  21  Alter cover sheets to agree with tape
      Do they agree? No -> 21; yes -> 30
  30  Check tape against survey inventory
  31  Alter tape to reflect inventory's content
      Do the surveys and inventory check out? No -> 31; yes -> 40

Outputs:
      Updated surveys
      New survey tape
      Corrected master tape
      Updated cover sheet cards
      Corrected survey tape

Est. time, man-hours:
      Pl   An   Pr   Cl
       1    0    0    3
       1    0    6   15
       1    0    0   20
       1    0    0   15

Inputs:
      Edited weight cards
      Survey master tape
      Card edit specifications
      Survey master tape
      Survey master
      Corrected survey master

Logic:
  40  Check weight cards with survey tape
  41  Alter weight cards or survey
      Do weight cards and surveys now check?
      No -> 41; yes -> 50
  50  Run card-edit programs on all 40 survey cards
  51  Make corrections to tape
      Re-edit survey to double-check corrections
      Errors -> 51; no errors -> 60
  60  Run pre-edit program
  61  Make corrections to survey
      Does tape check out on pre-edit? No -> 61; yes -> 70
  70  Create new survey extract from master -> 80

Outputs:
      Corrected weight cards
      Corrected survey master
      Corrected survey tape
      New survey extract

Est. time, man-hours:
      Pl   An   Pr   Cl
       1    0    0   20
       2    0    0  200
       2    1    3    ?
      10    2   10    0

Inputs:
      New survey extract
      WISTAB cards

Logic:
  80  Run WISTAB check tables and frequency counts
  81  Analysis and editing
      Are check tables consistent? No -> 81; yes -> End

Outputs:
      Frequency counts on extracted variables

Est. time, man-hours:
      Pl   An   Pr   Cl
       4    0    0    ?

Total 23    3   16  273+

      Pl = Planner's time
      An = Analyst's time
      Pr = Programmer's time
      Cl = Clerical time

There are essentially two major categories of errors on the present survey tape, and each category has three subgroupings. The flow chart in this paper was designed with these categories in mind.

I. Card errors

   A. Extra cards - for example, a man having a set of duplicate cards appended to his regular cards.
   B. Missing cards - for example, a farmer whose file does not contain Card 24, the card all farmers must complete.
   C. Inconsistent reference cards - Reference cards include weight cards and cover sheet cards, i.e. cards referring to the survey records. A coversheet card for which there is no survey record is obviously inconsistent.

II. Field errors

   A. Impossible codes, no other data relevant - this would be a code that just does not exist, e.g. a code 4 on a question that should be answered with a code 1 or 3.
   B. Unlikely codes, other data relevant - fields showing a respondent having both ten years of school and a college degree would be unlikely. These are codes that theoretically could exist but are difficult to believe on common sense grounds.
   C. Coding errors, miscoded data, no checks possible - since "Ye have the poor always with you," we cannot avoid these types of errors. If a person was born in 1905, and a coder or keypuncher erred in giving 1915, it is unlikely that this error will be found, since it is a conceivable code (as opposed to II A).

This flow chart breaks the survey processing into nine blocks. The blocks are not totally dependent on one another, however, so that some of the work on a step can be done before the preceding step is complete. Thus, the card edit specifications needed for step VI can be written up while step IV is being completed. In this way, extra increments of labor can substantially speed up the survey processing. However, no step can be entirely completed before the preceding step is done, since the input tape for each step is an output of the step before it.

One "hidden" output of the entire process is an updated survey code book (WAIS 656-049). Apparently, people who did not neatly dovetail into the original survey codes were quite a problem to our coders. Numerous ad hoc codes were used and, while these codes were often recorded, such is not universally true. The most recent codebook updating was November 1966, but mistakes have been found since that time. In the process of the edit, we hope to record codebook errors so a final book will be available at the end of the job.

The following is a step by step explication of the flow chart.

Step I. This will eliminate some type IB errors. The current survey will have, for example, 10 of a man's cards under one ID and 6 of them under another ID. By combining these cards under the proper ID, we create a greater number of complete sets of cards. Although we do not know how many complete sets we will have, it should be slightly less than 1300 (1290 or so seems very possible).

Step II. This step checks for IA and IB type errors.
Essentially what is being done is checking the indicators on each card to tell if another card should follow (see WAIS 667-009). We will then either code new cards by hand (a fairly slow process) and add these cards, or we will delete the superfluous cards. After these corrections, we should have complete records for all respondents. However, before going to the next step, we will rerun the same program to see if the card sets are complete after our corrections. This second check becomes standard procedure for all our other checks. The program needed for this step already exists; it needs minor revisions, however.

Step III. This step must largely be done by hand. Designed with type IC errors in mind, it will give us a final list of what surveys we have, how many we have, and some information concerning each respondent. The program for this check also already exists in final form; however, the changes to the survey and coversheets must be done by hand. Assuming there are not too many errors, step III should be brief.

Step IV. This step is the final check telling us we have an equal number of survey booklets and respondents on our survey tape, a check that was not previously done. This operation must be done entirely by hand, but it should take a relatively small amount of time.

Step V. Another check for IC errors, this step, when completed, will allow us to determine the final weights for the survey. It is hoped that by this stage a decision on the reliability of Gene Moyer's weighting system will have been reached, although this decision is incidental to the processing of the survey. This is another check that must be done entirely by hand.

Step VI. Step VI is the most time-consuming part of the survey edit process. Editing each card requires extracting the card from tape, running the edit program, hand correcting survey errors, overlaying the corrected card on the master tape, re-extracting and re-editing.
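The Step VI cycle just described (extract, edit, hand-correct, overlay, re-extract and re-edit) can be sketched in Python. This is only an illustration under stated assumptions: the record layout, the names `edit_cycle` and `school_check`, the field names, and the idea of passing the hand correction in as a function are all hypothetical and are not taken from the WAIS card-edit program. The example check is the "unlikely code" case mentioned earlier, ten years of school together with a college degree.

```python
from typing import Callable, Dict, List

Card = Dict[str, str]                         # hypothetical layout: field name -> punched code
Check = Callable[[Card], List[str]]           # one edit check; returns error messages
Fix = Callable[[str, Card, List[str]], Card]  # stands in for the hand correction


def edit_cycle(master: Dict[str, Card], checks: List[Check],
               fix: Fix, max_passes: int = 5) -> int:
    """Edit, correct and re-edit until a pass finds no errors.

    Returns the number of edit passes that were run.
    """
    for n in range(1, max_passes + 1):
        # "Extract" each card and run every edit check on it.
        errors: Dict[str, List[str]] = {}
        for rid, card in master.items():
            msgs = [m for chk in checks for m in chk(card)]
            if msgs:
                errors[rid] = msgs
        if not errors:
            return n  # the re-edit came back clean
        # Overlay each corrected card on the master, then loop to re-edit.
        for rid, msgs in errors.items():
            master[rid] = fix(rid, master[rid], msgs)
    raise RuntimeError("errors remain after %d passes" % max_passes)


def school_check(card: Card) -> List[str]:
    """Flag ten years of school combined with a college degree."""
    if card["school_years"] == "10" and card["college_degree"] == "1":
        return ["ten years of school with a college degree"]
    return []


master = {
    "0001": {"school_years": "10", "college_degree": "1"},
    "0002": {"school_years": "16", "college_degree": "1"},
}
# The lambda plays the role of the clerk who looks up the booklet and recodes.
passes = edit_cycle(master, [school_check],
                    lambda rid, card, msgs: {**card, "college_degree": "0"})
print(passes, master["0001"])
```

The second pass is what the paper calls the standard re-run: corrections are never trusted until the same edit has been run again on the corrected tape.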
However, the actual machine work is incidental compared to the manpower needed for this job. Writing an edit for any card in the survey is a very difficult process requiring fairly intricate cross-field checks. While most of the previous clerical work could be done by anyone, this particular job requires a fairly subtle understanding of the survey. When the initial survey card edits were done preceding the summer of 1966, they were done incompletely. When I re-edited card 2, I made about five times as many error checks as were made at that time. It should be realized, however, that these specifications can be written up starting immediately.

Step VII. Unlike previous steps, this will require some programmer time. The pre-edit program will allow us to run checks between fields on different cards. It does this, however, by a very indirect process. It extracts the required characters from the master tape, producing a new tape with only the fields we want. This extract appears as an 80-character card image tape. We then edit this tape with the card edit program. However, at the present time there exists no write-up on the pre-edit program, and George Loniello is the only one who knows how to use it. Hopefully, by the time we are ready to use the program, a write-up will be available. Being unfamiliar with the program, I have made no estimate of the clerical requirement for step VII.

Step VIII. By the time we arrive at step VIII, the WAIS survey should be fairly well cleaned up. Step VIII will create a survey extract tape off which we can run our tabulations. Jim Geffert wrote a program creating such an extract; however, the program has at least one error in it which must be corrected. The format of the extract program is in WAIS 667-006. We could of course decide we want this same information in the future extract. If so, it is a minor job to correct the present program.
If, however, we decide to have different data on this new extract, we still have only a small amount of programming to do, since the logic of the program already exists and the changes are simple. The decision of what should go on the new extract must be made by a planner, and I have allotted time for it. This program, also, should be debugged well before it is needed. In this way, we can have an extract the day after we are done with step VII.

Step IX. Hopefully, Step IX is largely superfluous. It will consist of two things: check tables and frequency counts. The check tables are such things as two-dimensional tables of marital status and number of dependents to see if unmarried R's have children. If errors persist here, we may have to go back to step VI or VII to correct the survey. However, if the previous edits were executed properly, no inconsistencies should show here. Frequency counts on all variables are another output. These will allow more intelligent construction of later multivariable tables. If these are run on WISTAB (which is suggested strongly, since WISTAB is much easier to write than X-TAB and since these counts run into none of WISTAB's shortcomings), we have still another way to check for errors on the extract tape, since the WISTAB symbol ( ) allows one to count all codes other than those specifically indicated as allowable in a field.

After step IX the survey master and extract should be ready for analysis. As the flow charts indicate, the principal requirement is manpower. It is recommended that at least one additional individual be assigned to this work. With this extra 15-20 hours/week we have a good chance to finish the survey before the end of the second semester of 1967.
As an additional benefit, this new job opening would create a good chance for an intelligent undergraduate, 5' 4" blonde girl, to learn about economic research.

http://www.ssc.wisc.edu/wais/WAIS667022.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667022.txt

Ashok Bhargava, 1967. Strategy for Micro-Filming Missing Tax Records. July 14, 1967. WAIS paper 667-053. Master File - Tax Records Missing Data (Master File Records).

Ashok Bhargava
WAIS 667-053
July 14, 1967

STRATEGY FOR MICRO-FILMING MISSING TAX RECORDS

In making any attempt to micro-film the missing tax records, a number of steps are involved, and some decisions have to be made before proceeding.

The first step would be to pull the remaining folder shots. (At present folder shots have been pulled from 29 name groups, which leaves 21 name groups from which the folder shots have to be pulled.) While pulling the folder shots, a list of ID#'s not in the new MF is compiled, called the Residual List. Unmatched FFID's are eliminated from the residual list. ID#'s not in the FFID file are also eliminated from the residual list. If any ID#'s are left on the residual list, these are checked against the old MF's to see whether they exist. (If they do, we will compile a record absent/present key for the next step.) This procedure is outlined in greater detail in 667-029. The time estimates for the whole operation are:

Clerical time   200 hours
Planners time    10 hours

The next step is the actual search. This will have to be carried out in three stages, because the data in the Tax Department is at three different places (1947-53 is on micro-film; 1954-58 is in the purged files; and 1959-64 is in the current files). As a prerequisite, the micro-film with the 1947-53 returns will have to be acquired. Before carrying out the actual search, we will have to decide on the scope of the search, and for this purpose three alternative strategies are outlined.

Alternative Strategies

I. Complete Search For All Missing Records          Cases in 667-032
1.
1947-53:
   (a) Folder shots (old and new MF)               3, 4, 5
   (b) Unmatched FFID's (old MF)                   10, 11, 12
   (c) 4000's (new MF)                             13, 14, 15, 16
2. 1954-58:
   (a) Folder shots (old and new MF)               4, 5, 8, 9
   (b) Unmatched FFID's (old MF)                   10, 11, 12
   (c) 4000's (new MF)                             13, 14, 15, 16
3. 1959-64:
   (a) Folder shots (old and new MF)               6, 7, 8, 9
   (b) Unmatched FFID's (old MF)                   10, 11, 12
   (c) 4000's (new MF)                             14, 15
   (d) All returns missing (new MF)                16
Time required: Clerical 2000 hours; Planners 20 hours

II. Restricted Search Alternative                   Cases in 667-032
1. 1947-53:
   (a) Folder shots (old and new MF)               3, 4, 5
   (b) Unmatched FFID's (old MF)                   10, 11, 12
   (c) 4000's (new MF)                             13, 14, 15, 16
2. 1954-58:
   (a) Folder shots (old and new MF)               4, 5, 8, 9
   (b) Unmatched FFID's (old MF)                   10, 11, 12
   (c) 4000's (new MF)                             14, 15, 16
3. 1959-64:
   (a) Folder shots (old and new MF)               6, 7, 8, 9
   (b) Unmatched FFID's (old MF)                   10, 11, 12
   (c) 4000's (new MF)                             14, 15
   (d) All returns missing (new MF)                16
Time required: Clerical 1660 hours; Planners 20 hours

III. Restricted Search Alternative                  Cases in 667-032
1. 1947-53:
   (a) Folder shots (old and new MF) - selected    4, 5
   (b) Unmatched FFID's (old MF) - selected        10, 11 (M)*
2. 1954-58:
   (a) Folder shots (old and new MF) - selected    8, 9(4), 5(M)
   (b) Unmatched FFID's (old MF) - selected        10, 11 (M)
3. 1959-64:
   (a) Folder shots (old and new MF) - selected    7, 8, 9
   (b) 4000's (new MF)                             16 (M)
   (c) All returns missing (new MF) - all          16
*M - stands for male
Time required: Clerical 614 hours; Planners 20 hours

Note: Time estimates do not include the time of those doing the actual microfilming.

Recommendation: I feel it is only worthwhile to undertake the third alternative. The additional cases in the first two alternatives will have an insignificant pay-off, if any. The advantage of the first two alternatives is that they are comprehensive and thorough.

Some decision will have to be taken with regard to the name change problem outlined on page 6 of 667-043. (As was pointed out in that paper, we found a number of cases with this problem.)
In most cases there is no easy way of deciding what the taxpayer's actual name is, and hence whether he belongs to our sample or not. An arbitrary rule will have to be made (e.g. find additional returns for all those already in our sample, even if their other returns are filed under a different name spelling, as long as we are convinced it is the same person by looking at the address and social security number).

Once the additional returns are micro-filmed they can be coded, key-punched and added on to our tape files. No estimates of the time required can be made for these at present, since it will depend on the number of returns we find.

http://www.ssc.wisc.edu/wais/WAIS667053.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667053.txt

Richard Bauman and Mike VonSchneidemesser, 1966. Benefit Data Processing Plan. July 7, 1966. WAIS paper 667-001. Benefit File Data Processing.

Richard Bauman and M. von Schneidemesser
WAIS Paper 667-001
July 7, 1966

Benefit Data Processing Plan

Table of Contents
1.0 Introductory Remarks
2.0 Data Logging Procedures
3.0 Precoding and Assignment of WAIS ID Numbers
4.0 Data Keypunching
5.0 Editing of Data
6.0 Year Record Creation
7.0 Organization of Benefit Data
8.0 Revision of Benefit Data
9.0 Summary
Appendices
I. Card Format for Logging of Benefit Data
II. Card Formats for Benefit Data
III. Format for Year Records
IV. Source Format (Card 3 Data)
V. Source Formats (Card 2 Data)
VI. SSA Codes
VII. WAIS ID Assignment Procedure for Benefit Data

1.0 Introduction and Scope of Project

This is a revision of WAIS paper 656-032 and an anthology of documents pertaining to the processing of the benefit data. The Benefit Data are data on the payments made by the Social Security Administration to beneficiaries of 3,217 "identified claims cases" found in the Form 805 files shared by the SSA with WAIS. 2,727 of the 3,217 cases are from the original WAIS 805 file; the remaining 490 are from the supplementary 805 file.
As of June 1, 1966, we have received benefit data for 2,810 claims cases. See WAIS 656-006, Parts I and II, for a description of the data. The data cover payments on social security accounts from January 1, 1946 through a variable closing date, which depends upon the date the data were received by WAIS. Data received prior to January 11, 1966 were extracted from the Social Security Master Tape after the April 1965 entries were made. Data received on or after January 11, 1966 were extracted from the Master Tape after the December 1965 entries were made. Columns 25-28 of Card 1 indicate the last month for which monthly payment information is given. Since only a few year records are complete for 1965, the latest year records created from the present data will be for the year 1964.

2.0 Data Logging Procedures

The Social Security Administration sends us benefit data in small groups. Summary information is punched in Logging Cards so that we can know what portion of the file we have on hand and also so that we can make sure that all of the data are processed. Appendix I is a Card Format for the Logging Cards.

2.1 Notes on Card Numbering System

2.11 Three digits are used to specify card types. These are the 1 digit card number (col. 3) and the 2 digit sequence number (cols. 79-80). Card 1 is used for benefit account information. The sequence number for Card 1 is always blank. Card numbers 2 and 3 are used for beneficiary information. These cards always have a numeric sequence number. Card 2 is used when the source document is an SSA Form 9249. Card 3 is used when the source document is an SSA computer printout. Thus an individual's data may be on cards numbered 2 only, cards numbered 3 only, or cards numbered 2 and 3, depending upon the form of the source document(s).

2.12 If Card 3 is used, or if Card 2 is used and there is no change in the beneficiary ID code (BIC), the 00 sequence card, containing basic demographic data on the individual, is always used.
If any monthly payment information is also present, this is recorded (chronologically) on cards with sequence numbers 01, 02, etc. Each sequence card has space for four monthly payment entries.

2.13 If Card 2 is used and there is more than one BIC, data related to the first BIC (in chronological order) is recorded as in 2.12. Data for the second BIC is recorded as follows: demographic data is recorded on a 10 sequence card, and monthly payment information is recorded on cards with sequence numbers 11, 12, etc.

2.14 When a person has data on both card numbers, card 3-01 is a chronological continuation of the highest (sequence) numbered Card 2.

2.15 An example: Suppose a woman starts receiving benefits on her husband's account. After 5 changes in the monthly payment amount, her husband dies and she is entitled to benefits as a widow. After 3 more changes in the monthly payment amount, her benefit payment history is continued on the SSA computer printout, which shows two changes in monthly payments. All data for the woman is recorded on 8 cards as follows:

Card #     Seq. #          BIC         Summary of Data Included
(Col. 3)   (Col. 79-80)    (Col. 21)
1          blank           no          basic account information - primary ben. ID, # of ben., etc.
2          00              B           basic data for wife - ID, DOB, DOE, BIC, etc.
2          01              no          first 4 monthly payment entries
2          02              no          fifth monthly payment entry for wife
2          10              D           basic data for widow since BIC, DOE change
2          11              no          sixth - eighth monthly payment entries
3          00              D           basic data for widow since different format
3          01              no          ninth and tenth monthly payment entries

3.0 Precoding and Assignment of WAIS ID Numbers

Most of the Benefit Data are easily punched from the forms we receive from the SSA. WAIS staff are responsible for the assignment of WAIS ID's to beneficiaries and for the precoding of cases that cannot be punched directly. ID assignment and precoding are done before forms are submitted for data keypunching.
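As a brief aside on the card numbering rules of 2.12-2.15, the following is a minimal Python sketch of how the sequence numbers follow from the number of monthly payment entries. It is an illustration only, not the procedure WAIS used. The function names are hypothetical, and the extension to a third BIC (base 20) is an assumption, since the paper only describes a first and a second BIC.

```python
import math

ENTRIES_PER_CARD = 4  # 2.12: each sequence card holds four monthly payment entries


def sequence_numbers(entry_counts):
    """Sequence numbers (cols. 79-80) for one beneficiary on one card type.

    entry_counts gives the number of monthly payment entries recorded under
    each successive BIC, in chronological order. Bases beyond the second BIC
    (20, 30, ...) are an assumption; the paper only specifies 00 and 10.
    """
    seqs = []
    for bic_index, n_entries in enumerate(entry_counts):
        base = 10 * bic_index
        seqs.append(f"{base:02d}")  # demographic card: 00, or 10 for the second BIC
        n_cards = math.ceil(n_entries / ENTRIES_PER_CARD)
        seqs.extend(f"{base + k:02d}" for k in range(1, n_cards + 1))
    return seqs


def example_2_15():
    """Cards for the widow example in 2.15 as (card number, sequence number)."""
    cards = [("1", "  ")]  # Card 1: the sequence number is always blank
    cards += [("2", s) for s in sequence_numbers([5, 3])]  # Form 9249: BIC B, then BIC D
    cards += [("3", s) for s in sequence_numbers([2])]     # computer printout entries
    return cards


for card_number, seq in example_2_15():
    print(card_number, repr(seq))
```

Under these assumptions the sketch yields the same eight cards as the table in 2.15. Note that more than nine payment cards under one BIC would run into the next base, a case the paper does not discuss.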
The ID Assignment Procedure of WAIS 656-006A, Section III D, has in general worked well. However, Rule 2 has been so indiscriminately applied that the intended classes have become either virtually extinct or unrecognizable mutants. 2(a) and 2(b) of the old rules together have yielded 5 new ID's, while 2(c) contains about 130 ID's, some of which belong in WAIS name groups. The revised rules in Appendix VII involve a simplification at a cost of somewhat more extensive record keeping.

4.0 Data Keypunching

After precoding and ID assignment, the benefit data are keypunched directly from the SSA forms. A revised copy of the card formats for the benefit data is included as Appendix II of this paper. This is a revision of Section III E of WAIS 656-006. There are three basic types of cards. A Card "1" is punched for every account. A Card "2" sequence is punched for every beneficiary, as well as some non-beneficiaries, whose data is found on an SSA Form 9249. A Card "3" sequence is punched for every beneficiary whose data is found on an SSA printout. The same beneficiary may have Card 2 and Card 3 sequences if data appears on both forms. Sequencing of Cards 2 and 3 depends on the number of monthly payment entries for a beneficiary. Card "4" is a special card to be used only where there is a representative payee.

5.0 Editing of Benefit Data

Error checking will be handled in two stages: Single Card Edits and Other Edits.

5.1 Single Card Edits

The General Card Edit Program (developed cooperatively by WAIS and SSRI, the data processing center for the social sciences at U.W.)
is used extensively for error checking of the benefit data cards. The program checks the internal validity, in one or more ways, of the following data and control items:
(1) Study number
(2) Card number
(3) Sequence number
(4) Blanks in the record
(5) WAIS ID numbers
(6) SSA beneficiary ID codes
(7) Date of birth
(8) Date of entitlement
(9) Sex
(10) Type of claim
(11) No record code
(12) Date of disallowance or denial of benefits
(13) Indicator of disallowance or denial of benefits
(14) Date of each payment history entry
(15) Amount for each payment history entry
(16) Explanatory code(s) for each payment history entry

5.2 Other Edits

In addition to checking the internal consistency of the several types of cards, it is also desirable to check the internal consistency of several cards associated with an account or with a beneficiary. In this section, we also treat the checking of consistency of the cards with the source documents.

5.2.1 Internal Consistency Checks

We need to check for:
(1) Duplicate cards in a sequence
(2) Missing cards in a sequence
(3) Proper chronology of history entries
(4) Last history entry for beneficiaries consistent with account data
(5) (Card 3 only) correct number of beneficiaries

5.2.2 External Consistency Checks

We should have:
(1) Data for each account logged in
(2) Data for each beneficiary or non-beneficiary who is assigned a WAIS ID
This can be checked by matching with the Form 805 Data and the Benefit logging file (WAIS 645-074).

6.0 Year Record Creation

The basic reason for transforming the benefit data into year records is to make a record which is compatible with other data collected by WAIS. The year record:
(1) Changes the data unit from an account basis to a WAIS identification number basis
(2) Changes monthly payment information to yearly payment information
(3) Contains codes which permit conversion of records from a cash basis to an accrual basis
(4) Contains codes indicating months beneficiary worked during year
(5) Indicates relationship of beneficiary to primary beneficiary

Appendix III is a Format for the Benefit Year Record. Year records will be made for each individual in the benefit data for each year he received benefits during the 1946-1964 period.

7.0 Organization of Benefit Data

The benefit data may be organized by account numbers or individual ID #'s, or integrated with other WAIS files. Organization by benefit account number is the system used in the basic card data. Organization by individual is the form which is most useful at present, since most of our other files exist in this form. The year record is created for individuals and is set up to integrate data for individuals who receive benefits from several accounts. Integration by tax unit is also possible by using the WAIS ID system. Integration with other WAIS files may be accomplished via a matching process using either beneficiary ID, beneficiary SSA#, or both.

8.0 Revision and Addition of Benefit Data

8.1 Revision

Revision of the data is necessary whenever errors are discovered in Section 5 (Editing) of this paper. All corrections will be made to the benefit data cards before the Year Record is created, since most of the possible corrections would affect the Year Record and in most cases corrected data will facilitate the creation.

8.2 Addition

Addition of data can occur in four ways. Unfortunately, all sources of additional data involve some possible overlaps at this time. It is therefore desirable to incorporate flexibility into several stages of the data processing so that addition of data can be efficiently done. The four sources of additional data are:

8.2.1 Additional account information in data received. Beneficiaries of the identified claims accounts may also be beneficiaries of other accounts (e.g. their own) which were not identified claims accounts.
It may be desirable to expand the benefit data for these beneficiaries by recording this data. This addition would not involve existing account data but would affect the year records for the beneficiaries involved quite substantially.

8.2.2 Receipt of Data on the 407 Outstanding Identified Claims Accounts. This addition will not affect year records created from the data on hand. It will involve possible additional data treated in 8.2.1 and 8.2.4.

8.2.3 New benefit data for new accounts. Social Security account numbers may be discovered in the new Tax Data or other sources in a quantity sufficient to warrant collecting more benefit data. This might overlap some data on beneficiaries in the current sample.

8.2.4 Further benefit information on identified claims accounts. This type of additional data is primarily useful in extending the year records for individuals, i.e., a time expansion rather than a sample expansion.

Once these additional data have been processed and transformed into year records, these new records could be merged with the existing records by use of a generalized file merge program now being prepared by WAIS. This file merge program will check for duplicate or overlapping records.

9.0 Summary of Data Processing for Benefit Data

Description / Flow / Related Items or Outputs

1. Social Security Account numbers in WAIS Files are matched with SSA Form 805's.
   Flow: 1 WAIS-SSA Form 805 File. Output: A Form 805 File.
2. Some of the accounts are identified as claims cases.
   Flow: 2 WAIS-SSA Identified Claims File. Output: B Identified Claim Cards.
3. SSA sends WAIS benefit data documents for some of the claims cases.
   Flow: 3 WAIS Benefit Documents. Output: C WAIS Benefit Documents.
4. A logging card is prepared for each account which is included in 3.
   Flow: 4 Logging of Benefit File. Output: D Logging Benefit Data Cards.
5.
All individuals for whom there is data on the benefit documents are assigned WAIS ID numbers.
   Flow: 5 WAIS ID Assignment. Outputs: E New FFID's; F Updated FFID's; G WAIS FFID File.
6. Difficult items are precoded.
   Flow: 6 Precoding of Benefit Data.
7. Benefit Data are keypunched and verified.
   Flow: 7 Preparation of Benefit Data Cards. Output: H Benefit Data Cards.
8. Card-image tape is made from benefit data cards.
   Flow: 8 Benefit Data Card-to-Tape. Output: I Benefit Data Card Image Tape.
9. Single Card Edit is made.
   Flow: 9 Single Card Edit. Output: J Error Listing.
10. Other Edits made.
   Flow: 10 Other Edits Made. Output: K Error Listing.
11. Corrections are prepared by comparing source documents [C] to listings [J] and [K].
   Flow: 11 Correction of Errors. Output: L Correct Benefit Data Cards.
12. Corrected card-image tape prepared by inserting and removing of error cards.
   Flow: 12 Correct Benefit Data File. Output: M New Benefit Data Tape.
13. Year Records created.
   Flow: 13 Year Record Creation. Output: N Benefit Year Record.
14. Analysis or integration of Benefit Data.
   Flow: 14 Use of Benefit Data.

Appendix I - Card Format for Logging of SSA Benefit Data

Columns   Data
1-9       Social Security Account Number
11-18     Wisconsin ID Number
20-34     Date received; example: June 23, 1965
45-65     Account Holder's Name (and whatever else appeared in 1st 20 columns of our old "floating ID format")
67-69     TOC
71        SSA's code for the first TOC which appears on the printout
72        SSA's code for the second TOC which appears on the printout, if any
73        SSA's code for the 3rd TOC
74        SSA's code for the 4th TOC
75        SSA's code for the 5th TOC
76        SSA's code for the 6th TOC
77        SSA's code for the 7th TOC
78        SSA's code for the 8th TOC

Appendix II - Card Formats for Benefit Data

Card 1 - Benefit Account Information - appears on either Printout or SSA 9249

Columns   Data
1-2       Study number - 06
3         Card number - 1
4-12      SSA benefit account number (SS# in claims status)
13-20     WAIS ID# for primary beneficiary of the above SS account
21-22     NOP (= # of payments) on the account in claims status
          Blank if no account information on printout
          (If cols. 21-22 are blank, skip out rest of card)
23-24     NOB (= # of beneficiaries - primary, secondary) on the account
          Blank if no account information on printout
25-28     Last date for monthly payment information on printout
          Blank if no account information on printout
29-31     PIA = primary insurance amount
          Blank if no account information on printout
32        if no "current pay" shown; enter no. of "current pay"s shown
          Blank if no account information on printout
33        0 if data is on Master Tape printout only
          1 if data is on Master Tape printout and on SSA Form 9249
          Blank if no account information on printout
34-80     blank

Card 2 - Beneficiary Information from SSA Form 9249

Card 2 will be used only for those beneficiaries for whom there is an SSA Form 9249.

Card 2 - Sequence No. 00

Columns   Data
1-2       Study number
3         Card number - 2
4-12      SSA benefit account number
13-20     WAIS ID# for this beneficiary
21        SSA beneficiary ID code, BIC (from cl. sym. column of SSA 9249)
22        SSA beneficiary ID code subscript (if any - otherwise blank)
23-28     DOB - date of birth of beneficiary
29        Sex of beneficiary
          0 male
          1 female
30        blank
31-34     DOE - date of entitlement of beneficiary (Item 8 of SSA 9249)
35        blank
36        1 if OAB checked
          5 if DIB checked, or if "H" appears in cl. sym. column of p. 2, SSA Form 9249
          9 if neither checked
37-47     blank
48        No record indication if "no record" as shown on SSA 9249; otherwise blank
49        blank if "disallowance" or "denied" does not appear under remarks on SSA 9249
          1 if "disallowance" appears on SSA 9249
          2 if "denied"
50-55     Date of disallowance or denial of benefits
56-61     Date of death of beneficiary or account holder if shown, blank otherwise
62-78     blank
79-80     Sequence number within card type - 00

Card 2 - Sequence No. 01

Columns   Data
1-20      Duplicate data for Card 2 - sequence no. 00
21-24     Date of first (chronological) entry on SSA 9249 - Month, Year, e.g. 1163
25-30     Amount shown for first entry (if any - otherwise 000000), incl. cents
31        R if "R" action symbol is shown for first entry
          S if "S" action symbol is shown for first entry
          P if "P" action symbol is shown for first entry
          T if "T" action symbol is shown for first entry
          A if "A" action symbol is shown for first entry
          G if "G" claim
          0 if other action was taken for first entry (see remarks)
          blank otherwise
32-33     blank
34-37     Date of second (chronological) entry on SSA 9249 (excluding months of same benefit)
38-43     Amount shown for second entry (if any - otherwise 000000)
44        Use same code as for column 31
45-46     blank
47-57     Data for third entry on SSA 9249
58-59     blank
60-70     Data for fourth entry on SSA 9249
71-78     blank
79-80     Sequence no. within card type - 01

Card 2 - Sequence No. 02

Columns   Data
1-20      Duplicate data for Card 2 - sequence no. 00
21-31     Data for fifth entry on SSA 9249
32-33     blank
34-44     Data for sixth entry on SSA 9249
45-46     blank
47-57     Data for seventh entry on SSA 9249
58-59     blank
60-70     Data for eighth entry on SSA 9249
71-78     blank
79-80     Sequence no. within card type - 02

Other Card 2's will have the same format as the above cards (sequence no. 01, 02). Sequence cards should be added until all history entries are punched.

Card 3 - Beneficiary Information from SSA Master Tape Printout

Card 3 will be used only for those beneficiaries for whom there is an SSA Master Tape Printout.

Card 3 - Sequence No. 00

Columns   Data
1-2       Study number - 06
3         Card number - 3
4-12      SSA benefit account number
13-20     WAIS ID for this beneficiary
21        SSA beneficiary ID code (Item 12 of printout)
22        SSA beneficiary ID code subscript (if any - otherwise blank)
23-28     Date of birth of beneficiary (Item 14 of printout)
29        Sex of beneficiary (Item 15 of printout)
          0 male
          1 female
30        Race of beneficiary (Item 15 of printout)
          b if blank is shown
          0 if N.A. is shown
          1 if White is shown
          2 if Negro is shown
          3 if other race is shown
31-34     DOE - date of entitlement of beneficiary (Item 16 of printout)
35        PSC - payment status code (Item 17 of printout)
36        PSC subscript, if any
37        TOC - type of claim code (Item 18 of printout)
38        XR - cross reference account number indication
          0 if no XR is shown
          1 if XR is shown
39-42     1963 ARD - annual report data (Item 21a of printout)
43-46     1964 ARD - annual report data (Item 21b of printout)
47        TEC - type of earnings code (Item 22 of printout)
48        Indication of RPS
          0 if no RPS is shown
          1 if RPS is shown
          (If 1 in col. 47, make card type 4)
49-78     blank
79-80     Sequence no. within card type - 00

Card 3 - Sequence No. 01

Columns   Data
1-20      Duplicate data for Card 3 - sequence no. 00
21-24     Date of first entry in history block on printout
25-30     Amount shown for first entry
31        RFD - reason for deduction code
          WARNING - this may be blank - check spacing on printout
32        WIC - work indication code
33        BPD - beneficiary payment designation code
34-37     Date of second entry in history block on printout
38-43     Amount shown for second entry
44        RFD
45        WIC
46        BPD
47-59     Data for third entry on printout
60-72     Data for fourth entry on printout
73-78     blank
79-80     Sequence no. within card type - 01

Card 3 - Sequence No. 02

Columns   Data
1-20      Duplicate data for Card 3 - sequence no. 00
21-33     Data for fifth history entry on printout
34-46     Data for sixth history entry on printout
47-59     Data for seventh history entry on printout
60-72     Data for eighth history entry on printout
73-78     blank
79-80     Sequence no. within card type - 02

Other Card 3's will have the same format as the above cards (sequence no. 01, 02). Sequence cards should be added until all history entries are punched.

Card 4

Card 4 will only be used if there is any representative payee data (RPS). Card 3 sequence no. 00 will have a 1 in col. 47 if there is any RPS.
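As an aside on the Card 3 history cards described above (sequence no. 01 and later), here is a minimal Python sketch of reading the four 13-column history entries from one 80-column card image. It is an illustration under stated assumptions, not a WAIS program. The function name and the returned field names are hypothetical, and it assumes the amount is punched as six digits including cents, which the paper states explicitly only for Card 2.

```python
ENTRY_STARTS = (21, 34, 47, 60)  # first column (1-based) of each history entry


def parse_card3_history(card):
    """Split a Card 3 history card (sequence 01, 02, ...) into its entries."""
    card = card.ljust(80)  # tolerate card images with trailing blanks stripped
    entries = []
    for start in ENTRY_STARTS:
        field = card[start - 1:start + 12]  # 13 columns per entry
        if not field.strip():
            continue  # unused entry slot
        amount = field[4:10]
        entries.append({
            "date": field[0:4],  # date of the history entry
            # Assumed to be six digits incl. cents; None if not numeric.
            "amount": int(amount) if amount.isdigit() else None,
            "rfd": field[10],  # reason for deduction code, may be blank
            "wic": field[11],  # work indication code
            "bpd": field[12],  # beneficiary payment designation code
        })
    return {
        "account": card[3:12],    # cols. 4-12
        "wais_id": card[12:20],   # cols. 13-20
        "sequence": card[78:80],  # cols. 79-80
        "entries": entries,
    }


# Hypothetical card image: study 06, card 3, made-up account and ID values.
head = "06" + "3" + "123456789" + "10001201"
body = "0163" + "008250" + " 01" + "0764" + "009000" + "010"
sample = (head + body).ljust(78) + "01"
parsed = parse_card3_history(sample)
print(parsed["sequence"], len(parsed["entries"]))
```

The account number, ID and payment values in the sample card are invented for the example. The blank RFD in the first entry mirrors the warning in the layout that this column may be blank.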
Columns   Data
1-2       Study number - 06
3         Card number - 4
4-12      SSA benefit account number
13-20     WAIS ID# for this beneficiary
21-62     SSA RPS data from printout - same format as on printout (21-25 is date of selection, 26-62 are codes)
63-80     blank

Appendix III - Tape Format for Benefit Year Record

(1) Format

Positions   Number of Positions   Data
1-8         8                     WAIS ID Number for Beneficiary
9-17        9                     SSA Account Number for Beneficiary
18-26       9                     SSA Benefit Account Number for Beneficiary at beginning of year
27-28       2                     Year of Record
29-34       6                     Amount of Benefits received during year (incl. cents)
35-46       12                    Monthly Payment Record (see code)
47-50       4                     Blank
51          1                     No Record (see code)
52-60       9                     SSA Benefit Account Number for Beneficiary at end of year
61          1                     Number of Benefit accounts for this beneficiary
62-72       11                    SSA Beneficiary ID codes for entire history
73          1                     X if this is last year of SS Benefit history
74          1                     Indicator of method of record creation (see code)

(2) Codes for Monthly Payment Record

Code    Explanation
blank   Benefits not paid during month because not yet entitled or terminated in previous month
1       Benefits paid during month
2       Benefits not paid during month because of retroactive payment later
3       Benefits not paid during month because beneficiary worked
4       Benefits not paid during month because they were withdrawn for adjustment
5       Benefits not paid during month because they were terminated
6       Benefits not paid during month because beneficiary was previously entitled to another type of benefit
7       Benefits paid during month include a lump sum death payment
8       Benefits not paid during month because claim disallowed
9       Benefits not paid during month because claim denied
0       Benefits paid during month include payments for excess deductions in prior month(s)

(3) Codes for No Record

Code    Explanation
N       No information - from "No Record" code
        No payment record

(4) Codes for Record Creation Method

Code    Explanation
0       Some monthly payment information for this year
C       This year record extrapolated from previous year's
        record and no monthly payment information is given this year

Appendix IV - Source Document Format (Card 1, Card 3 Data)

[Facsimile of the SSA Master Tape printout layout. Labels shown on the form: Name - prim. ben.; Account; NOP; NOB; LMOA; PIA; MPA; PCOC; BA #; Payment; SCC; Latest Name and Address of Payee; Beneficiary Name; Ben.; DOB; SAC; CURR PAY; Race; XR; 1963 ARD; 1964 ARD; TEC; History. The beneficiary block is repeated for all beneficiaries. The history block is repeated every time there is a change in payments.]

Legend: the form distinguishes labels, recorded data, coded data, and non-recorded data (shown as ////).

Appendix V - Source Format (Card 1, Card 2 Data)

SSA Form 9249 is a two page form. Page 1 is attached. Page 2 is a large grid that has spaces for each month of the years 1946 through 1964. The portion of the form for 1946 is reproduced below:

[Grid for 1946: Claims Account Number, with one space for each month, Jan. through Dec.]

Action Symbols (a vertical line is drawn thru months of same benefit):
R - Initial Payments - months covered by first check
S - Suspension (benefits withheld for any reason)
P - Payments made for excess deductions in prior year
T - Termination of entitlement

Appendix VI - SSA Codes and File Reference

On second line of printout:
NOP - Number of payments on the account
NOB - Number of beneficiaries on the account
LMOA - Last month of activity on the account
PIA - Primary insurance amount
PCOC - Payment center office code
   1. N. Y.
   2. Philadelphia
   3. Birmingham
   4. Chicago
   5. S. F.
   6. K. C.
   7.
Baltimore

On third line of printout:
PIC - Payment identification code
   A - Primary
   B - Wife
   C - Child
   D - Widow(er)
   E - Widowed mother under age 62 with child(ren) in her care
   F - Parent
   O - Combined husband-wife check payment
NOB - Number of beneficiaries in the payment
MPA - Monthly payment amount
SCC - State and county code
Current pay (only shown in current pay cases)

On fourth and following lines of printout:
Payee name and address

On seventh line of printout:
   A - OAIB - Primary beneficiary
   B - Wife (whose entitlement or benefit amount, at the time of filing, is not dependent on having a child in her care)
   B1 - Husband (dependent)
   B2 - Wife (whose entitlement or benefit amount, at the time of filing, is dependent on having a child in her care)
   C - Child (including disabled child)
   D - Widow
   D1 - Widower
   E - Mother (widow)
   E1 - Mother (divorced wife)
   F - Parent
Beneficiary name
DOB - Date of birth
Sex and race

On eighth line of printout:
DOE - Date of entitlement
Payment status code (PSC)
   A - Adjustment
   C - Current pay
   D - Deferral
   M - Matured deferred
   S - Conditional (suspension)
   T - Termination
TOC - Type of claim
   0 - Death claim
   1 - Life claim
   2 - Reduced claim (retired at 62 - under 65)
   3 - Death claim (for disabled child or for mother entitled solely because of disabled child)
   4 - Life claim (for disabled child or for young wife entitled solely because of disabled child)
   On DIB rolls:
   5 - Disabled claim (other than disabled child's claim)
   6 - Reduced (DIB) disability claim
   7 - DIB claim (for disabled child or young wife entitled solely because of disabled child)
LDA, ARM, SAC - For internal use only

On ninth line of printout:
Hist. - Last history posting
a. RFD - Reason for deduction
   Space - no deductions
   0 - 9 - deductions
   Y - previous entitlement to another type of benefit
   T - termination
   A - withdrawn for adjustment
b. WIC - Work identification code
   0 - No work indication
   1 - Worked
   2
c. BPD - Beneficiary payment designation
   0 - Not paid
   1 - Paid
d.
XR - Cross reference account number indicated

On tenth line of printout:
ARD - Annual report data (incomplete)
   a. 1963 estimated
   b. 1964 estimated or actual
Type of earnings code as reported on latest annual report (TEC)
   1 - Wages
   2 - Self-employment
   3 - Self-employment and wages

On eleventh line of printout:
Representative payee data
   1. Date reflects the date of selection of payee
   2. Codes represent type of payee, custody and guardianship
All history postings for the beneficiary from 1/62 on (other than Baltimore Payment Center, PCOC - code 7)
   a. Date of each change
   b. Monthly benefit amount
   c. RFD, WIC, BPD - refer to item 20

Action Symbols
   R - Retroactive payment later
   S - Suspension of benefits
   P - Payments for excess deductions in prior year
   T - Termination of entitlement

Claim Symbols
Same as BIC codes with the following additions:
   G - indicates lump sum death payment
   H - indicates on DIB (disability) rolls. H will be used in addition to a BIC code for the beneficiary.

Appendix VII - WAIS ID Assignment Procedure for Benefit Data

This replaces Section IIID of WAIS 656-006. Existing ID's assigned under the old system will be reassigned if necessary.

(1) ID's for beneficiaries who are institutionalized will be assigned as before. The extended definition of the WAIS family unit is head, spouse, and unmarried children who continue to live at the same address, and any institutionalized member who could otherwise qualify.

(2) The output of program FFIDB, which lists all FFID's assigned before the processing of the benefit file to all members of a family unit which has at least one benefit account (see WAIS 656-026 for a writeup), is used to establish which beneficiaries should be assigned existing ID's and which should be assigned new ID's. Updating of addresses and dates of address may also be done at this time.

(3) New ID's should be assigned as follows:
(a) Head of unit is in a WAIS name group.
If the unit has already been assigned a household ID (first 6 digits), assign the individual an ID within that household. If the unit does not have a household ID, assign a new ID within the name group.
(b) The name of the head of the unit is not in WAIS name groups. Assign an ID in "name group 70" as follows:
Digits 1-2 - 70 (indicates unit not in WAIS name groups)
Digits 3-6 - consecutive number beginning with 0001
Digits 7-8 - sex-family status
(c) Lump sum death payments are sometimes made to unknown beneficiaries. In such cases, no FFID should be created. The WAIS ID number that should be assigned for the purpose of recording the data is:
Digits 1-6 - Same as family unit ID number
Digits 7-8 - 71 for LSDP to unknown male relative
72 for LSDP to unknown female relative
73 for LSDP to a funeral home
74 for LSDP to the estate
79 for all other unidentified LSDP's
(d) Record all new numbers assigned as well as old WAIS ID's associated with the person's former household. Do not use these where it is possible to establish the identity of the beneficiary through WAIS files - e.g. "LSDP to spouse", where a spouse is identified in the tax returns.

http://www.ssc.wisc.edu/wais/WAIS667001.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667001.txt

Mike Von Schneidemesser
WAIS 667-024
February 3, 1967
WAIS File Maintenance
Maintenance System - Files, Data, Etc.

The chart on the following page gives an overview of the current WAIS files and their structural relationship for the purpose of updating. Since the WAIS files contain time-series data for a panel of individuals, a modification of the population of any one file will often have to be made also on the other files (the "basic" files) to maintain the proper correspondence between any one file and the others.
The data also differ very much by source and content from file to file, which may be reflected in different kinds of changes to the various files. It is therefore necessary to consider all the files simultaneously when making a change to any one person, especially for changes of the identification # (ID), but also for modification of many fields (e.g. changing the amount of rent income in the Master File makes a change of the rent amount in the Property File necessary). To make the proper changes one typically has to visually inspect the relevant records of all the basic files, to see (a) if records are affected by a change and (b) if yes, in what way they are affected. For this it is usually necessary to have printouts of all files available, which so far has not been the case and never can be implemented for voluminous files like the Master. Also, WAIS typically uses most of the source documents in its work, which makes it necessary to search through these unwieldy source document files. Often the correspondence between the source documents and the taped information of the basic files is doubtful or not even given, which leads to incorrect file changes.

[Chart: The WAIS file structure and its maintenance scheme. Legend: documents and files to be updated directly; updated indirectly (= recreated); to be updated or recreated on demand. Columns: source documents; basic files; programs for inter- and intrafile indicator creation; consistency files; permanent analysis files; special analysis files. Source documents shown: microfilm tax records; SSA Form 805; detail code sheets (rent, property, farm, interest); Master folders (tax returns); ID code sheets (FFID); coded age data list; SSA benefit printouts and code sheets; interview and assets booklets, coversheets; coversheet cards. Basic files shown: Property; Master (MA); Identification (ID); SSA Form 805; reformatted SSA benefit; Interview (survey). Programs shown include PROPNON, ALAGE, IDBENE and IDINT.]

To facilitate the task of locating the relevant source documents (which most of the time are those which will have to be updated also), all the source forms should be filed in one place, that is the Master Folder (with the exception of the Interview and Assets booklets, for confidentiality reasons). Similarly it might be ideal to have all the basic files merged into one file (as has been done with the Master and SSA 805 data). This, however, would create some problems in computerized data handling due to the size and format of the file. Also, it would make the maintenance of such a file a first class programming challenge, because of the need to shuffle information back and forth between records. A practicable compromise is to append to the ID file a few numeric indicators for the absence and presence of records in the other basic files. Since the listing of the ID file is the primary tool in making file changes, one always knows immediately which other file is affected. These indicators can easily be generated by the programs which check for inter and intra file consistency. Merged with the ID file they result in the format as described in the appendix. With all source documents for one individual in one place and the ID file with indicators listed, one (a) needs to find at the most two references and (b) knows immediately which other basic files have to be inspected if the source documents are contradictory or doubtful.

Updating of Files

All the basic tape files have to be updated with an updating program allowing for record deletion, ID modification, insertion, and modification.
A generalized program to accomplish this on any tape file is now being prepared by the author (see WAIS 667-025). Two special purpose programs for the Master File and for the Identification File are operational at this time. All three programs have four phases. In phase 1 the input detail data are checked, assembled into the format of the file and put out on tape (this is known as the "transaction" file in commercial systems). In phase 2 the "transaction" file is passed against the basic file and the four different types of changes are made. In phase 3 the records whose ID#'s have been changed are sorted sequentially, and in phase 4 they are merged into the updated version (output from phase 2). During all four phases extensive messages and record prints are provided for all records which are incorrect or give rise to the suspicion that they do not conform to the desired result. It is imperative that these messages are checked after each run; the update run then either has to be repeated with correct input cards, or the resulting updates are saved for the following run. Typically there is such a round of updates more or less simultaneously for all basic files. It then becomes necessary to check the files again for inter and intra consistency, which may result in further updating runs for some files, but will also create an ID file with updated indicator fields. The broken lines in the chart on page 2 reflect the flow of operations for the consistency checks. The unbroken lines indicate the data flow from the source document via the basic files to the analysis files. These analysis files can also be updated by use of the generalized update program, i.e. updated directly. But this method necessitates making update cards parallel to those for the basic files but conforming to the format of the analysis file. Except where the number of update cards is small it is easier to recreate these files from the updated basic files, that is to update indirectly.
Also, one can skip some rounds of updates, if the analysis file is not used between such rounds. By recreating it from the basic file one has the most recent version of the analysis file immediately. It now follows a tabulation of various basic file aspects and manipulation capabilities. Names in UPPERCASE letters are program names, numbers denote WAIS papers where the item is described and the sign ( ) denotes a program only documented in the WAIS file catalog. Appendix BASIC WAIS FILES File name Property File aspect (Master details) Master Identification SSA Form - 805 Age Data ID or FFID Source documents Tax returns Folders with tax Code sheets based 3 printer image Printed list with Coded forms returns. on various files tapes hand coded data Type Card image tape 400 Char tape 124-Char tape or 404 Char tape Punched cards 128-Char including Indicators Format in 645-017 645-056 645-058 645-063 and 656-055 WAIS paper and Monograph-2 667-003 645-063 656-019, p.11-1 (Revision) Updating Update DELETE MA-UPDATE Update UPDATEAL By hand capabilities al CCHG 667-003' FID 667-025 (a) drop 667-025 656-019 (only ID should (b) change ID FFYR 656-024 ever be changed (c) insert (d) modify Printing or dis- Tape dump Whole tape or Whole tape or 1 Selected records Not needed play program selected records selected records specified by SS# in edited form in edited fog from source tape 645-067 656-045 656-019 Intra file PROPNON ( ) J. Geffert's MULTIPID ( ) SELECT 805 ALAGE ( ) consistency consist of 656-037 ALFID 805 656-019 checks ALAGE ( ) p. 
4-1

Appendix: BASIC WAIS FILES

File aspect                      SSA Benefit data                       Interview or Survey
Source documents                 Printed and hand-coded forms           Interview and assets booklets, coversheets
Type                             Card image tape                        Card image tape
Format in WAIS paper #           667-001                                656-049
Updating capabilities            Update-al, STRAN ( ), 667-025          Update-al, CJPCH ( ), 667-025
  (a) drop (b) change ID
  (c) insert (d) modify
Printing or display program      Selective print and punch, SELPT ( )   Tape dump
Intra file consistency checks    IDBENE ( )                             IDINT ( )
                                 6 and 667-022

Appendix: Format of Indicators Appended to ID-File

Cols. 1-123  Same as ID file (WAIS 645-058)
Col. 124 (generated by PROPNON)
  0 = No Master, no property
  1 = Master, no property
  2 = Master + property
  (3 = Property only - to be eliminated)
Col. 125 (generated by ALAGE)
  0 = No Form 805, no age data
  1 = Form 805 present, no age data
  2 = No Form 805, but age data present
  3 = Both Form 805 and age data available
Col. 126 (generated by IDBENE)
  0 = No Benefit file entries
  1 = Benefit file entries present
Col. 127 (generated by IDINT)
  0 = No Interview, not selected
  1 = Interview completed
  2 = Interview refused, i.e. coversheet available
Col. 128  Record mark

http://www.ssc.wisc.edu/wais/WAIS667024.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667024.txt

Gene Moyer, James Geffert, Richard Bauman, John deVries, Marcia Hinckley
Processing 1959-1964 Wisconsin Income Tax Returns: Filing and Coding Manual
July 26, 1966
WAIS paper 667-002
Master File - Tax Records

3.1.3 Codebook

You will work with a codebook containing detailed instructions for every field which has to be coded, plus information which will help you locate the answers on the tax returns. As the coding progresses, changes may have to be made in the coding instructions. If changes occur, you will be told what to change; often it will mean that you have to replace a sheet in the codebook by a revised version.
It is essential that you make all changes immediately and that you always code the returns according to the most recent coding instructions. A master codebook will be kept by the supervisor; it will contain every change in the coding. Your personal codebook must be an exact copy of the master codebook. Every day, before you start coding, take your codebook over and compare it with the master codebook to ensure that you have made all the required changes. Every page in the master codebook containing any change will carry a note immediately below the page number, saying: "Revision (date)." If your personal codebook and the master codebook contain different versions of the same page, write the change in your codebook if the master codebook contains a written change; if, however, you have an older version of the page than the one in the master codebook, do not write the change in your book, but see the supervisor. Do not make any changes on your own initiative; whenever you feel that instructions for a particular field are incomplete or inaccurate, mention this to the coding supervisor, so that, if any changes are to be made, all codebooks can be modified at the same time and in the same way.

3.1.4 Other informational material

Several items will be available for general use; these you may have to share with other coders, or obtain from the supervisor: road maps (to be used if you cannot determine the geographical location of the taxpayer's residence) and the "Dictionary of Occupational Titles" (to be used if you cannot determine the correct occupation code for a taxpayer) are major examples.

3.2 Coding Procedure

3.2.1 Read the codebook and other information

You have received, besides the material required for the coding, some information explaining what happened before and what will happen later with the income tax data that you will code.
Read the other information: it will give you some idea as to where your job fits in the total operation and how important it is that you do an accurate job. Then, read the codebook and all other coding instructions carefully. Make sure that you understand everything; if you have any questions, do not be afraid to ask someone on the supervisory staff. Also make sure that you have all the items you are supposed to have--especially that your codebook is complete--and that you know where to find additional material when you need it.

3.2.2 "Test-coding"

When you are certain that you are sufficiently prepared to start the coding work, notify the supervisor. You will be given a number of returns to be coded. These cases have all been coded by the supervisory staff; several returns will contain special coding difficulties, of which the supervisory staff is aware. In general, the test cases are an adequate example of situations which will occur during the regular coding; they are not selected for their extreme difficulty. Several questions will undoubtedly arise. Don't be afraid to ask the supervisor! When you have completed the coding of these test cases, bring them over to the supervisor, who will check your results, discuss coding problems and explain things which turned out not to be clear. Please note that this is not an "entrance exam"; its main functions are to enable you to discover the specific problems before you start the actual coding and to enable the supervisory staff to ensure that you fully understand your job. After the completion of the test coding, the supervisor will assign a name-group to you and let you begin to code the regular files.

Code sheet fields:
1. ID Number
1A. Multiple ID Indicator
2. Soc. Sec. #
2A. Multiple Soc. Sec. Indic.
3. Last Name
3A. Title
4. First Name
4A. Middle Name
5. Street/Box #
5A. RR, RFD Number
6. Street Name
6A. Street Class
7. Post Office (city)
7A. Postal Zone (ZIP code)
7B. County Code
8. Age in 1964
8A. Date of Death
Coded for each year (1959 1960 1961 1962 1963 1964):
9. Return Filed
9A. Consistency Ind.
10. Residence Location
11. County Code
12. County Prior Year
13. Address Change
13A. Non-Resident Ind.
14. Occupation Code
15. Industry Code
16. Occupation Change
17. Return Previous Year
17A. Labor Force Previous Year
18. Marital Status
18A. Marital Status Consistency Code
19. Spouse Separate Income?
19A. Separate Income Reliability
20. Recent Marriage?
20A. Dissolution of Marriage
21. Number of Dependents
22. Dependents' Age Code
23. Dependents' Address Code
23A. Students in College
24. Total Sources of Wages
25. Dividend Paid in Stock?
Do Not Code
26. "Head of Family" Exemption
27. Automobile Expense Indicator
28. Supplementary Schedules
29. Enclosures
30. Occupation, Industry Reliability Indicator
31. Next Year Filed?

3.2.3 The regular coding

All information pertaining to one taxpayer (for all years filed) should be coded on one coding sheet only; no coding sheet should contain information about more than one taxpayer. Make sure that every person whose returns are (or are supposed to be) in a folder receives a coding sheet; even if no returns are present for any of the years we are concerned with, some information will be required for this person:
a. The top section. Code this from the most recent information you can find, combined with the information contained on the label on the lip of the folder. Watch especially for:
(i) "Multiple" ID numbers. These cases will have two (or more) labels affixed to the lip and/or the face of the folder. For persons with more than one ID number, code field (1A) as "1" (see coding instructions, section 3.5 of this manual, field 1A); record all ID numbers for this person on the back of his coding sheet, beginning with the number you are going to write in field (1); also write "multiple ID" in the remarks column on the filing-coding log sheet;
(ii) "Multiple" Social Security numbers.
For these cases you code field (2A) as "1" (see section 3.5 of this manual, field 2A); record all social security numbers for this person on the back of his coding sheet, beginning with the number which is written at the top of the folder or printed on the gummed label; also write "multiple SS#" in the remarks column on the filing-coding log sheet.
b. Field 9 (Return filed): Even if there is no return for the year you are working on, this field will have to be coded (see coding instructions, section 3.5 of this manual, field 9).
c. Field 31 (Next year filed) should also be coded in all cases, even if both the return for the year you are working on and the next one are missing (see coding instructions, section 3.5 of this manual, field 31).

You will find several married women who never filed during the period 1959-1964, while their husbands did file in at least one of the years. These cases should contain the following:
(i) a folder containing the return(s) plus any other documents pertaining to this taxpayer unit;
(ii) at least one return identifying the husband as well as his wife, and clearly showing that the wife did not earn any income for any of the years for which the returns are present;
(iii) a label containing the ID number for the husband;
(iv) a label containing the ID number for the wife; if you have identified a wife from the returns, but you do not find a label for her, see the supervisor.
For all these cases, do not code anything beyond the areas specified above (top section, field 9 and field 31). Place the code sheet in a box on the supervisor's desk marked "No returns", and write "No wife returns" in the remarks column of the filing-coding log sheet.

3.2.4 Logging

When you have completed all the coding for a folder, write the date and your initials on the filing/coding log sheet that you find in the folder, on the line marked "Demographic Coding." If any problems arose in the coding, make a note under the column "Remarks."
This will enable us later, if any discrepancies are found, to look at the "Coding History" and possibly to explain such discrepancies. 3.3 Miscellaneous instructions 3.3.1 "Form S" file If, during the coding of a return, you have to make a somewhat arbitrary decision about the code to be used, consult with the supervisor; after an agreement has been reached regarding the proper code to be used, write a "Form S", indicating: a) the taxpayer's ID number; b) the year of the return; c) the number of the field giving you trouble; d) the code you used; e) the exact answer as you found it on the tax form. Do not file these "Form S " sheets in the folders, but give them to the supervisor. Make a note in the "Remarks" column in the filing/ coding log sheet, indicating that you coded a "Form S" for this taxpayer. 3.3.2 "Fiscal year" returns Some taxpayers file returns for periods of 12 months which do not coincide with a calendar year. This will be indicated at the top right hand corner on page 1 ("long" forms only): "Calendar year 19XX or income year beginning . . . . ending . . . ." If you spot any such case, see the supervisor. 3.3.3 Maintenance of files The folders are filed in the drawers in a specific order (by identification number). If you take folders out of the drawer to be coded, be careful to file them in exactly the same location where you found them (unless, of course, they were filed out of order). If, for any reason, you have to take one folder out of the files from a drawer on which you are not working, take one of the cards which you will find in the front of the drawer, write the ID number of the taxpayer whose folder you are taking out, the date, your initials and the reason why you are taking this folder out, and place this card at the location where the folder belonged. When you return the folder, take the card out, cross out the information you wrote on it, and place the card again in the front of the drawer. 
3.4 Coding conventions

3.4.1 Last names should always be coded and punched as one word, even when they are written in more than one word on the form (e.g. "V** M****" should be coded as "V**M****"). This convention will not always produce the correct name, but it will produce consistent results, even in cases where there were discrepancies between returns for the same taxpayer.

3.4.2 Alphabetic fields (e.g. names, addresses) should be left justified (i.e. must begin with the leftmost box provided). Numeric fields must be right justified (i.e. must end in the rightmost box provided). In all cases, unfilled positions must be left blank.

3.4.3 If any field exceeds the number of boxes provided, do the following:
a) if the field is numeric: enter "9 . . . . 97" (i.e. all "9"s except for a "7" in the rightmost box) and fill out Form S with the pertinent information;
b) if the field is alphabetic: fill out Form S and see the coding supervisor.

3.4.4 Codes for "unknown", "not ascertained", "miscellaneous", etc. should be used only if none of the other codes fits.

3.4.5 1959 and 1960 returns are to be coded only when the gummed label indicates that 1959 and 1960 returns were previously not in our files. If the label for this person is not preprinted (which indicates that no returns for this person were in the files) and you have returns for years before 1959, see the supervisor; if the returns are all for 1959 and later, code every one of them.

3.4.6 When in doubt: do not lead trump, but see the coding supervisor.

3.5 Codebook

The numbers of the items below correspond to the numbers of the fields on the code sheet. An example of the code sheet is attached at the end of this manual. Husbands and wives must have separate coding sheets.

(1) The identification number
Source: written at the top of the folder or on the gummed address label on the top of the folder. Be careful to check that each ID number is unique to a taxpayer.
For a married couple, the two ID numbers would be the same except for the last two digits, which indicate sex and position in family. Each single person must have an ID number different from any other; for single males, the last two digits of the ID number must always be "00"; for single females, these digits should be "10". It is very important that these numbers be correct before the keypunching begins. As you go through the files, therefore, see the supervisor if you see more than one folder for the same person's returns; also, if you find the wrong label on a folder, see the supervisor. Use the listings which were used in filing (as described in section 2.3.1.4 of this manual) to check the correctness of the match between label(s) and folder. The coding is the last stage at which the correctness can be checked.

PRINT LOG
1. The first numbers written in the first and last counter number spaces were transcribed from the Reel Log.
2. Under these numbers, record the counter numbers from the first and last images in the corresponding bundle.
3. If the print counter numbers do not match the prerecorded numbers, see your supervisor.
4. Record your initials and the date for each bundle you file.
Reel # | First Counter # | Last Counter # | Remarks | Filer - Date

(1A) Multiple ID Indicator: This field is used to indicate any taxpayer who has more than one ID number assigned to him. If you find such cases and you are certain that all numbers pertain to one and the same person, code field (1A) as "1", write all ID numbers on the back of the code sheet and write "Multiple ID" in the "Remarks" column of the filing-coding log sheet. The codes to be used are:
0 - only one ID number for this person
1 - more than one ID number for this person.

(2) Social Security Number
Source: written at the top of the folder or on the gummed address label on the top of the folder. This number must agree with that on the returns themselves.
If it is not identical to the number on any of the returns, see the supervisor. If it is identical to the number on some returns but not all, and you are confident that all these returns were filed by the same person or married couple, turn the code sheet over on its back and record all the social security numbers given on the tax returns for the various years (see also the instructions for field 2A). The first one should be the number written at the top of the folder or printed on the gummed label. Another case where you might have to take action is that where neither the folder nor the gummed label on the folder indicates any social security number. If the returns also do not have a social security number, leave the field blank and continue coding. If, however, you find a social security number on any of the returns and you are reasonably certain that this pertains to the individual whose returns you are coding, take this as his social security number (see also instructions for field 2A).

(2A) Social Security Number Indicator: This field is used to indicate the existence of more than one Social Security number for the same taxpayer. If you find these cases and you are convinced that all numbers belong to one and the same person, code field (2A) as "1", write all Social Security numbers on the back of the code sheet and write "Multiple SS" in the "Remarks" column of the filing-coding log sheet. The codes to be used are:
0 - only one Social Security number (or none at all) for this person
1 - more than one Social Security number for this person.

(3-7B) Name and address fields
Source: the information supplied on the most recent tax returns (in most cases the 1964 returns) at the top of page 1. If you discover a later address (e.g. on correspondence with the taxation department or on a 1965 return), record that address and record the date of the source of the address. For the coding of these fields it is important to follow the "General Coding Conventions" strictly.
Additional instructions for some of the fields:

(3A) Title: various items will be found. Examples of what you will find, and the codes to be used, are:
Junior - JR
Senior - SR
"Estate" - EST
Other cases, to be coded as you find them, are "II" and "III" (code as two and three I's respectively).

(5) Street, or Box Number: in the event that the taxpayer lives on a rural route and has a box number, write only the number in this space; print "Box" in field 6 (street name).

(5A) RR, RT or RFD number: in this field only the rural route number should be coded; it is not necessary to differentiate between RR, RT or RFD.

(6) Street name:
(i) Do not include the street "class" or "type"; this goes in field 6A (see below).
(ii) When a taxpayer has a RR number, a box number and a street name, omit the street name and put "BOX" in the three leftmost positions of field 6.
(iii) When a street name is a number (e.g. 66th Street), put only the number in field 6 (do not record the "th").
(iv) Record the direction portion of the street name as an initial, followed by one blank and the street name, e.g. "West 19th Street" would be coded as "W 19".
(v) If the taxpayer's address is a Post Office Box, code this as: "P.O. BOX"

(6A) Street class: use the standard abbreviations as given below:
Avenue - AVE
Boulevard - BLVD
Court - CT
Circle - CIR
Drive - DR
Highway - HWAY
Lane - LANE
Place - PL
Road - RD
Square - SQ
Street - ST
Terrace - TERR
Trail - TR
Way - WAY

(7A) Postal zone or ZIP code: use the ZIP code when it is available. If only a postal zone is given, record it in the rightmost two digits of the field.

(7B) County code: this must be the county in which the taxpayer lives (most recent address), not necessarily his tax district. Use a road map to check this if necessary. A list of Wisconsin counties and corresponding codes follows in Appendix A.
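
As an illustration only, the street-name rules for fields 6 and 6A can be sketched in Python. This is not part of the manual's procedure: the function name code_street and the DIRECTIONS table are assumptions made for the example (the manual gives only "West" as a direction example), and the rural-route and Post Office Box cases are left out.

```python
import re

# Field 6A abbreviations, copied from the list in the manual.
STREET_CLASS = {
    "AVENUE": "AVE", "BOULEVARD": "BLVD", "COURT": "CT", "CIRCLE": "CIR",
    "DRIVE": "DR", "HIGHWAY": "HWAY", "LANE": "LANE", "PLACE": "PL",
    "ROAD": "RD", "SQUARE": "SQ", "STREET": "ST", "TERRACE": "TERR",
    "TRAIL": "TR", "WAY": "WAY",
}

# Assumption: the four compass words are treated as directions.
DIRECTIONS = {"NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W"}


def code_street(raw):
    """Return (field 6, field 6A) for a street address without house number."""
    words = raw.upper().split()
    street_class = ""
    # Rule (i): the street class goes in field 6A, not in field 6.
    if words and words[-1] in STREET_CLASS:
        street_class = STREET_CLASS[words.pop()]
    # Rule (iv): a leading direction is reduced to its initial.
    if len(words) > 1 and words[0] in DIRECTIONS:
        words[0] = DIRECTIONS[words[0]]
    # Rule (iii): numbered streets keep the number only, without the suffix.
    words = [re.sub(r"^(\d+)(ST|ND|RD|TH)$", r"\1", w) for w in words]
    return " ".join(words), street_class
```

Under these assumptions, code_street("West 19th Street") returns ("W 19", "ST") and code_street("66th Street") returns ("66", "ST").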
(8) Age in 1964
Of all the returns you have to code, only the 1964 returns carry information about the taxpayer's age as responses to a distinct question:
Long forms: part IV, page 3, should have the age(s) of the taxpayer(s) whose information is recorded (single persons or married couples);
Short forms: carry the age information in the answer to question 10 (on the back of the return).
Codes to be used: if the exact age is available, it can be written in this field. If the precise information is not available on the 1964 form:
1. if you find other information in the folder indicating the taxpayer's age for any year, calculate the age as it would be in 1964 and code it;
2. if there is no indication regarding age anywhere in the folder, code 99 (not ascertained, unknown).
In all cases where the age is 97 or over, code 97 and fill out a "Form S". In some cases you will be able to ascertain that the age is 65 or over (e.g. from an indication on a personal exemptions schedule for other years) but not the taxpayer's exact age; in this case, code 98.

(8A) Date of taxpayer's death
If there is an indication, anywhere among the returns, that the taxpayer died, record the date of death if you can find it. The order to be used is: year (last two digits)--month--day. If there is evidence of the taxpayer's death, but no clear indication of the date of death, code all 9's for the elements you cannot locate; e.g. if a document states that the taxpayer died in March 1963, but the day is not given, code field (8A) for this return as "630399"; or if there is evidence of death without any indication of the date, code field (8A) as "999999". If there is no evidence that the taxpayer died, code the field as all 0's.

(9) Return filed
Source: the presence or absence of the return in the folder.
Codes to be used:
0 - Return present
1 - Part of return present
2 - Return missing
3 - Return missing because of field audit.
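
As an illustration only, the field (8A) convention described earlier (year, month, day, with 9's for elements that cannot be located and 0's when there is no evidence of death) can be sketched in Python. The function name code_date_of_death and its parameters are assumptions made for the example, not part of the manual.

```python
def code_date_of_death(died, year=None, month=None, day=None):
    """Return the six-character field (8A) code in year-month-day order.

    died  -- True if there is any evidence of the taxpayer's death
    year, month, day -- the elements that could be located, else None
    """
    if not died:
        # No evidence of death: the field is coded as all zeros.
        return "000000"
    # Each element that cannot be located is coded as 9's.
    yy = "99" if year is None else f"{year % 100:02d}"
    mm = "99" if month is None else f"{month:02d}"
    dd = "99" if day is None else f"{day:02d}"
    return yy + mm + dd
```

For the manual's own example, code_date_of_death(True, 1963, 3) gives "630399", and evidence of death with no date at all gives "999999".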
If a return was out for field audit, the folder shot will indicate this (see Section 2.2.3.2 of this manual). If you find such cases, do not continue the coding of these returns. Write "out for field audit" in the "Remarks" column of the filing-coding log sheet and give the folder to the supervisor.

(9A) Consistency indicator
Source: the presence or absence of the return, as compared with the indication on the "folder shot." If the two indications agree, use code "0"; if they disagree (e.g. the "folder shot" indicates that a return should be present, but you cannot find the return in the folder), use code "1".

(10) Residence location
Source: the tax district, a four digit alphabetic code which is written on the return by the taxation department. On the 1959-1962 returns it can probably be found at the top of page 1; on the 1963-1964 returns it will usually be at the bottom of page 1. On the 1962-1964 short forms it will be found at the left hand side on the front of the return. If no tax district code can be found anywhere on the return, you can compare the return with the prior year's and/or the following year's returns. If there was no address change, take the tax district code as indicated on those returns. If the address is different from all other addresses, or if the tax district cannot be determined for any other reason, code it as "ZZZZ."

(11) County code
Source: page 1 on all forms; written after "TAX DISTRICT." This is a 2-digit numeric code, according to the list in Appendix A; it should pertain to the taxpayer's tax district, as coded in field 10. In most cases, the code will be identical to the one you wrote in field 7B, but this is not necessarily so in all cases; especially if the taxpayer moved from one county to another, there will be differences.

(12) County prior year: again a 2-digit numeric code (Appendix A)
Source: the response to the question "In what county did you reside in 19.. (the previous year)?" on page 1 of the return.
If the taxpayer did not answer the question but the previous year's return is in the folder, take the information from that source. If the question was not answered and the previous year's return is not present, use code 99 (not ascertained).
(13) Address change
Source: the address as indicated at the top of page 1 of the return you are currently working on, as compared with the address on the return for the year before the current one.
The codes to be used are:
0 - No change
1 - Change, within tax district
2 - Change, within county
3 - Change, within state
4 - Change, "interstate"
7 - Possible change, but coder cannot be sure
8 - Change, extent not known
9 - Not ascertained, unknown
Codes 1-4 indicate an increasing "extent" of change, so you have to select the code which is most applicable to the change. E.g. if a person changes address within the same county, but into a different tax district, code "2" is the most applicable (even though code "3" could also be applied). Code "4" includes all address changes where the state of residence changed; it also includes those cases where the taxpayer moved into or out of the country.
(13A) Non-resident indicator
Source: In some cases you will find an indication that this return was filed by someone who was not a Wisconsin resident. Indications can be of the following kinds:
--- the taxpayer may have used a "Non-resident Form" (Form 1N) to file his return;
--- the taxpayer may have used a form from another state to file his return;
--- he may indicate by means of his address (top section of page 1 of the return) that he is not a Wisconsin resident.
Codes to be used:
0 - Resident
1 - Non-resident
9 - Resident status not ascertained
(14) Occupation code: a 2-digit numeric code - see Appendix B.
Source: the answer written by the taxpayer on page 1 of the return, for his "occupation". If you are in doubt about the specific code to be used, use the "Dictionary of Occupational Titles".
You will have to use your ingenuity in several cases: the taxpayer may give a new description of his occupation while in effect no change has taken place (e.g. undertaker - mortician, garbageman - garbologist, etc.). Another case where your imagination will be helpful is that where the taxpayer is not explicit enough about his occupation. If necessary, look at the amount of income to determine whether the proper classification is skilled, semiskilled or unskilled.
(15) Industry code: a 4-digit numeric code
Source: in many cases, you will be able to code this from the answer regarding occupation (field 13). In other cases, evidence regarding the taxpayer's employer will be available (Source of income: page 1, 1959-1961; page 2, 1962-1964; back, 1962 short form; front, 1963-1964 short forms). In other cases, withholding forms with the employer's name will be available; in still other cases, correspondence might carry the employer's letterhead, etc.
Codes to be assigned: if you can identify the taxpayer's employer, check the list of main Wisconsin and U. S. corporations in Appendix D. If you find the employer on this list, use the four-digit code as indicated on the list. If the employer is not on the list of Appendix D, or cannot be identified, determine the industry and find the first two digits of the industry code from Appendix C; the last two digits should be 00 for all those cases which were not identified on Appendix D.
(16) Occupation change
Source: the occupation and employer as indicated on the return you are working on, compared with the same information for the year before the one you are working on. There are three factors which can change; a change in any one of these does not necessarily imply that the other factors have to change too:
change in employer
change in industry
change in occupation.
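The three factors above combine into a single digit. In the scheme the manual sets out for this field, a change in employer counts 1, a change in industry counts 2, a change in occupation counts 4, and the code is the sum of the values that apply. A minimal sketch of that arithmetic follows; the function and argument names are hypothetical, and codes 8 and 9 are deliberately left out because they are assigned by judgment rather than by addition.

```python
def occupation_change_code(employer_changed: bool,
                           industry_changed: bool,
                           occupation_changed: bool) -> int:
    """Field (16): sum of the values of the changes that occurred (0 to 7)."""
    # True counts as 1 and False as 0, so each factor adds its value only if it changed.
    return 1 * employer_changed + 2 * industry_changed + 4 * occupation_changed


# Change in occupation and employer, but not industry: 4 + 1
print(occupation_change_code(True, False, True))
```

Because the three values are 1, 2 and 4, every combination of changes gives a different sum, which is why the result can be checked against the list of codes 0 through 7.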
The way in which this code is constructed is to assign "values" to each of the separate elements: "0" for no change, "1" for "change in employer", "2" for "change in industry", "4" for "change in occupation". An easy way to find the code, then, is to decide which changes (if any) occurred, find the values associated with these changes, and add them. You can check the value you find against the list of codes given below.
The codes to be used are:
0 No change
1 Change in employer only
2 Change in industry only
3 Change in industry and in employer
4 Change in occupation only
5 Change in occupation and employer (but not industry)
6 Change in occupation and industry (but not in employer)
7 Change in occupation and industry and employer
8 Entered labor force, retired, died
9 Not ascertained, unknown
(17) Return filed previous year
Source: the taxpayer's answer to the questions: "Did you file a 19.. (previous year) return?" and "If you did not file a 19.. (previous year) return, why did you not file?" Both questions are on page 1 for all long forms, and on the front for short forms.
Codes to be used:
0 - Yes
1 - No; non-resident
2 - No; less income than the filing requirements
3 - No; reasons other than those above (make out "Form S")
8 - No; reason not ascertained
9 - Not ascertained
N.B. Code "8" is used if the first question was answered (indicating that no return was filed for the previous year) but the second one was not; code "9" is used if neither question was answered.
(17A) Labor force indication for previous year
Source: answers to the same two questions which you used to code field (17), supplemented (if necessary) by information on the previous year's return.
Codes to be used are:
0 - employed
1 - unemployed (but seeking employment)
2 - student
3 - military service
4 - housewife (not employed!)
5 - retired
6 - underage
7 - other status (make out "Form S")
9 - no indication of labor force status
(18) Marital status code
Source:
1 the answers at the top of page 1 (front on short forms) will give some indication about the filer's marital status (we will assume that if a husband and wife file a combined return, they are married, unless there is clear evidence to the contrary). If no indication exists from these answers, check the following:
2 the answers to the questions "for persons claiming head of family exemption", which are absent on the 1962-1964 short forms, can be found on page 4 of the 1959-1961 forms and on page 3 of the 1962-1964 long forms;
3 other returns and/or documents in the folder which may indicate the existence of a spouse;
4 the personal exemptions claimed by the taxpayer.
Codes to be used:
0 - Single
1 - Married
2 - Widowed
3 - Divorced
4 - Separated
9 - Not ascertained
(18A) Consistency indicator
Source: the code you assigned in field (18), combined with the answer given by the taxpayer's spouse (if any) or the absence of a spouse's return. From these sources you can determine whether the answer you coded in field (18) is consistent with other information or whether it is not.
Codes to be used:
0 - consistency (e.g. spouses agree)
1 - inconsistency (spouses do not agree)
(19) Spouse separate income
Source: answer to the question "Do husband and wife each have income?", at the top of page 1 (1959-1964). Note that on the 1963 and 1964 short forms the question is not asked. In these cases you will have to examine the returns and determine whether only one spouse declared income or both of them did.
Codes to be used:
0 - Not applicable (taxpayer not married; therefore no spouse)
1 - Yes
2 - No
3 - Spouse died during the year
9 - Not ascertained
(19A) Spouse's income reliability indicator
Source: the code you assigned to field (19), combined with the actual presence or absence of a return.
Codes to be used:
0 - No inconsistency; not applicable (field 19 coded "0")
1 - Inconsistency
(20) Information re recent marriage
Source: answer to the question "If marriage took place in 19.. (tax year) give full name and address of wife before marriage." This question is at the top of page 1 for all "long" forms; for 1962 short forms it is on the front of the return; for 1963 short forms it is totally absent; for 1964 short forms only the wife's maiden name is asked (top, front of return).
Use the following codes:
0 - No marriage took place during the tax year
1 - Marriage took place during the tax year; complete information
2 - Marriage took place during the tax year; incomplete information
3 - Marriage took place; no information
9 - Not ascertained; no answer
Code "3" can be derived from the codes you assigned to field (19) in the previous year and the current year: if this year was coded as "1" and if the previous year was not coded as "1", you may assume that a marriage did take place, although no information was supplied; hence, you can code field (20) as "3". Information is "complete" if the wife's maiden name and former address are given, "incomplete" if only the maiden name or only her former address is given. 1963 short forms should be coded "9"; 1964 short forms should be coded "0" if no marriage took place, "2" if marriage did take place and the wife's maiden name is given.
(20A) Dissolution of marriage
Source: compare the codes you assigned to field (18) for this year and for next year, as well as any indication on the forms (e.g. "alimony paid") of a dissolved marriage.
Codes to be used:
0 - Not applicable; no indication of dissolution of marriage
1 - Divorce took place during the tax year
2 - Separation took place during the tax year
9 - Not certain whether dissolution of marriage took place during tax year (e.g. although there are no indications of death or divorce, a formerly married male begins to file separate returns).
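The form-by-form rules for field (20) can be sketched as a small decision function. This is an illustration only: the form labels ("long", "short1962", "short1963", "short1964") and the argument names are hypothetical, and the fallback to code 3 for a 1964 short form with a marriage but no maiden name is an assumption, since the manual does not state that case.

```python
from typing import Optional


def recent_marriage_code(form: str,
                         married_this_year: Optional[bool],
                         maiden_name_given: bool = False,
                         former_address_given: bool = False) -> int:
    """Field (20): information re recent marriage."""
    if form == "short1963":
        # The question is absent on 1963 short forms.
        return 9
    if married_this_year is None:
        # No answer and nothing to infer a marriage from.
        return 9
    if not married_this_year:
        return 0
    if form == "short1964":
        # Only the maiden name is asked; 3 when it is missing is an assumption.
        return 2 if maiden_name_given else 3
    if maiden_name_given and former_address_given:
        return 1  # complete information
    if maiden_name_given or former_address_given:
        return 2  # incomplete information
    return 3      # marriage took place; no information


print(recent_marriage_code("long", True, maiden_name_given=True))
```

Here `married_this_year=True` with nothing else supplied corresponds to the inferred case described above, where the marriage is deduced from the previous and current year's codes and no details were given.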
(21) Number of dependents
Source: Count the number of names written under the heading "Dependents" (1959-1961, page 1) or under the heading "Exemptions" (1962-1964, page 3; 1962-1964 short forms, on the back of the return). In several cases a husband will list his wife as a dependent (on the 1962-1964 forms, husband and wife will usually be found on the designated line); neither the wife nor the taxpayer himself should be counted as dependents. If all the lines are filled, make sure that you check for enclosures where additional dependents may be listed (the returns allow for only five dependents). Allot all dependents to the husband if they "belong" to a married couple. In the case of divorced or separated couples: if both parents claim all of the children (or if, in general, any dependent is claimed more than once), all dependents should be allotted to the ex-husband; if the dependents are split up between the ex-spouses (but no duplication of claims took place), each taxpayer will be allotted the number of dependents he (or she) claimed.

http://www.ssc.wisc.edu/wais/WAIS667002.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667002.txt
Some of the data in this document contains confidential information. The confidential information has been removed and preserved in the secure data archive (OLDR). Contact CDHA Data Archivist at cdhadata@ssc.wisc.edu for more information.
Moyer, Geffert, Bauman, deVries and Hinckley
WAIS Paper 667-002
First Revision August 25, 1966

PROCESSING 1959-1964 WISCONSIN INCOME TAX RETURNS
FILING AND CODING MANUAL

TABLE OF CONTENTS
1. Introduction to Processing: An Overview
2. Filing Manual
  2.1 Introduction to Filing
  2.2 Microfilming Résumé
    2.2.1 Returns to be Microfilmed
    2.2.2 Pulling the Folders
    2.2.3 Microfilming a Folder
      2.2.3.1 Microfilming Requirements
      2.2.3.2 Folders Out for Field Audit
      2.2.3.3 The Basic Microfilming Process
      2.2.3.4 Identification of Taxpayers
    2.2.4 Microfilm Logs Kept
      2.2.4.1 File Log
      2.2.4.2 Daily Microfilm Log
      2.2.4.3 Reel Log
    2.2.5 Microfilm Error and Problem File
    2.2.6 General Problems in Microfilming
  2.3 Filing Instructions
    2.3.1 Materials Needed
    2.3.2 Filing Procedure
      2.3.2.1 Assembly
      2.3.2.2 Order of Returns
      2.3.2.3 Problems in Filing
  2.4 Assigning of Identification Numbers
    2.4.1 The Construction of WAIS Identification Numbers
    2.4.2 Assigning ID Numbers to Taxpayers Already in WAIS Files
      2.4.2.1 Finding the Previously Assigned ID Number
      2.4.2.2 Positioning of Gummed Labels on Folders
    2.4.3 Assigning ID Numbers to the Residual Stack
3. Coding Manual
  3.1 Material Needed
  3.2 Coding Procedure
  3.3 Miscellaneous Coding Instructions
  3.4 Coding Conventions
  3.5 Codebook
Appendix A
Appendix B
Appendix B2
Appendix C
Appendix D

1. An Overview of the Processing of the 1959-1964 Income Tax Returns
1.1 WAIS is presently engaged in gathering data from the 1959-1964 Wisconsin Income Tax Returns of a 1% sample of Wisconsin taxpayers. Processing of the data involves the following operations:
1.1.1 Microfilming of returns and preparation of prints
From April 27 until July 14, Tax Department personnel microfilmed 1959-1964 tax returns. The film was then developed and printed by the Recordak Company, and the prints were sent to WAIS.
1.1.2 Filing of prints and assignment of identification numbers
The facsimile returns are to be filed into husband-and-wife or single-person folders; each taxpayer is to be assigned an identification number on the basis of name, sex, and position in family.
1.1.3 Coding of demographic information
Certain information which cannot be directly keypunched will be coded according to detailed instructions and coding conventions.
1.1.4 Keypunching the summary information
Keypunchers will punch directly the income and deduction items on the returns as well as the coded demographic data. The punched data will then be verified for accuracy.
1.1.5 Editing the cards produced; correction of errors
The punched cards will be machine and hand-edited, and all errors detected will be corrected.
1.1.6 Preparing a 1959-1964 Master File
A Master File, containing 1959-1964 information about the income and expenses of the taxpayers, will be created from the data.
1.1.7 Preparing an Identification File
An Identification File will be created from the 1959-1964 data and will be checked against information provided by the Social Security Administration.
1.1.8 Consistency checking
The data will be checked for self-contained inconsistencies via a program written for this purpose. This program generally checks for errors in computation.
1.1.9 Merging the 1959-1964 Master File with the 1946-1960 Master File
The 1959-1964 data will be merged with the 1946-1960 data. The complete merged file will contain income and expense data from 1946-1964.
1.2 This manual contains
1.2.1 An introduction to filing
1.2.2 A résumé of the microfilming process
1.2.3 Detailed instructions for filing the prints and assigning the identification numbers
1.2.4 Detailed coding instructions, including coding conventions and a codebook

2. Filing Manual
2.1 Instructions for Filing Facsimile Tax Returns
During the months of May, June and July 1966, Tax Department personnel microfilmed the 1959-1964 Wisconsin State Income Tax returns of a sample of Wisconsin taxpayers. The microfilm was developed by the Recordak Corporation, half-sized electrostatic prints were made, and the prints were delivered to WAIS. Your job is to sort these prints into husband-wife manila folders and to attach an identification number for each husband, wife or single person in each folder. It is important that this work be done as accurately as possible so that the next step (coding the returns) can be done easily and quickly. Therefore we want you to keep certain records on what you have done. These records will guide the coders in making the decisions necessary for a good job. Similarly, the microfilmers kept certain records which will help you to be sure you are filing the returns properly. These instructions cover the records kept for you as well as the records you are to keep.
2.2 Résumé of Microfilm Procedures and Logs
2.2.1 Returns microfilmed
The following is a list of the name clusters microfilmed:
Name Cluster
****Some of the data in this document contains confidential information. The confidential information has been removed and preserved in the secure data archive (OLDR). Contact CDHA Data Archivist at cdhadata@ssc.wisc.edu for more information.****
2.2.2 Pulling folders
The folders containing returns for the people in the original WAIS sample were pulled, by name group and in alphabetical order, from the files in the state tax archives. These folders were then brought to the microfilm section which was set up within the archives. The folders and their contents were then filmed and returned to the files. During most of the filming process, two machines were used, but care was taken not to split up a name group between the two machines.
A residual group of taxpayers, not in the original sample but interviewed in a special survey, was handled differently from the rest of the WAIS sample. The files were searched for each separate folder in the residual group and the folders were then collected into a single group which was microfilmed last.
2.2.3 Microfilming a folder
2.2.3.1 Microfilming requirements
WAIS wished to (1) microfilm each sheet of paper in each folder, (2) know about each missing return, and (3) be able to determine the returns which belong to the folder. Therefore, before microfilming a folder the microfilmers checked the returns inside and noted any missing returns for the husband or single person. Signs bearing the caption "H-19--return missing" were placed on the front of the folder to denote any missing years for the husband or single person ("H"). The folder (with the signs if any returns were missing) was then photographed. The label, which the Tax Department affixes to each folder, provides WAIS with the name of the husband or single person, spouse's name, social security numbers and address.
2.2.3.2 Folders out for field audit
When the returns were missing from a folder for field audit, the folder was ordinarily empty except for an "out" card telling, hopefully, where the returns were. In this case, the "out" card and an "all returns missing" sign were filmed on the folder. In the future, WAIS hopes to be able to return to the tax files and film the returns back from field audit.
2.2.3.3 The basic microfilming process
1. Folder shots
The folder was shot in the center of the microfilm stage, with the signs indicating what years, if any, were missing. (Note the counter number at the bottom of each image. This number changes with each shot.)
  FOLDER   <- image
  00370    <- counter #
2. Long form returns
The returns were laid out flat so that page one and page four could be photographed in one exposure image.
If page four of the return was blank, any extra sheet was filmed by placing it on page four for the first shot. Pages two and three were similarly microfilmed in one image.
[Diagram: first image shows P4 and P1 side by side; second image shows P2 and P3 side by side]
3. Additional enclosures
Any additional items were microfilmed with page one of the return or with some other information bearing the year, the name, social security number and address of the taxpayer.
[Diagram: example images combining enclosed items, P1 and Wt9s]
4. Short forms
Short forms (card forms) were filmed in order to furnish certain taxpayer identifications in each image.
1. First shot - front of card, schedules, letters, Wt9s, etc.
2. Last shot - back of card, Wt9s.
[Diagram: example images combining the card (front), card back, items and Wt9s]
2.2.3.4 Identification of taxpayers
In all this work identification was the crucial factor. The decision was made to waste film rather than destroy identification. If the microfilmer was in doubt about whether her hand had been in front of the camera, or some other such accident had occurred, she took the shot again.
2.2.4 Microfilm logs kept
Three logs kept during the microfilming process contain most of the basic information about the actual microfilming.
2.2.4.1 The file log notes the dates for pulling and returning the folders, as well as the names of the first and last taxpayers in each name group.
2.2.4.2 The daily microfilm log notes the daily progress of the filming, by giving the names of the first and last taxpayers filmed each day.
2.2.4.3 The reel log is the most important of the three logs and is probably the only log which will prove really useful in the processing of the data. It notes the name group or groups on each reel, the date microfilmed, the machine, and the counter numbers for the first and last images. (Most of this information was also recorded on the box containing each reel.
The first and last taxpayers, the year of the first and last returns and the first and last counter numbers on each reel are recorded on the box for that reel.) If a reel contains more than one name group, the last taxpayer in the first name group, the year of his last return, and the last counter number in the filming of that name group are all recorded, as well as similar information for the first taxpayer in the second name group, i.e., the year of his first return and the first counter number in filming the second name group.
2.2.5 Microfilm error and problem file
In addition to the three basic logs, a log of microfilming problems and errors was kept. This was actually a series of hastily written notes about specific problems encountered in microfilming. These notes were later transcribed onto note cards and filed in alphabetical or numerical order. Most of the problems were due to misfiling or to inaccurate labeling on the folder shots as to years missing. In many instances, folders had to be combined (if M*** B***** and M*** *. B***** were actually the same person, yet were filed separately). Often Wt9s were either in the wrong folder or attached to the wrong year. Nominal wife-swapping was fairly common, too. If these problems were discovered before the folders in question had been filmed, no note was made of the problem. If the folders had been filmed, note was made of the problem as well as any solutions attempted. This information is recorded alphabetically under the names of the taxpayers concerned in each case. A typed list, in alphabetical order by taxpayer name, compiled from this file, will be given to each filer. In a few cases, no names were recorded but the problems are noted in numerical order according to counter number.
2.2.6 General problems in microfilming
As the microfilming progressed, many unexpected problems cropped up.
The microfilmers straightened out as many of these problems as they could, but there were probably some snags which went undetected. Many of these snags are due to misfiling, while the rest can be blamed on the actual microfilming process. Misfiling resulted in whole returns or perhaps just enclosures (schedules, informationals, Wt9s) being in the wrong folder. Very often, too, the enclosures were stapled to the wrong year within the proper folder. If such errors were discovered "too late", but before the folder had been completely filmed, the informationals, etc. were refilmed, this time with the proper year, or a note was made of the error if the informationals were in the wrong folder, and they were later refilmed in the proper folder. If an error was discovered immediately after the incorrect shot, the next shot might read "disregard previous shot." If a big enough batch of errors was discovered in one folder, the caption "disregard all previous shots for this folder" might occur immediately before the first shot in the refilming of the folder. This process was used, too, if much information belonging in a folder already filmed, was discovered in another folder later on in the filming. If, however, the folder was a large one, only the years to which the additional data related, were refilmed, with an explanatory caption. If a folder was refilmed out of order, a sign to that effect was placed on the folder for the first shot. If a return was refilmed out of chronological order, but within the proper folder, an "out of order" or "out of chronological order" sign was filmed with the return. Occasionally, misfiling resulted in a taxpayer who was not in the sample, being filed among the taxpayers within one of the WAIS name groups. If this was discovered before filming, no note was made of it, as it was not filmed. If it was filmed and then detected, the folder was refilmed with the caption "disregard all previous shots for this folder - not in sample." 
The captions filmed on the folder were designed to aid in filing the facsimile returns. The undetected errors in filing and in microfilming should be few. However, any such errors which are detected during the print filing and coding processes should be discussed with the supervisor so they can be straightened out or so that they can be noted.
2.3 Filing Instructions
2.3.1 Filing materials needed for this job
1. A bundle of electrostatic prints
These bundles are kept, in order, in the file drawers into which you will file them. Only three bundles are placed in each file drawer, but, after the returns have been placed in folders, you will probably be able to place four or five bundles of returns into each drawer.
2. Print log (see following page)
Each bundle you take to your desk is to be marked in this log, which will be in the possession of your supervisor. Write the counter number of the first print (the one on the top of the bundle) in the space provided. When you finish a bundle, write the counter number of the last print in the space provided for it. If a number from that sequence of numbers is missing (i.e., if you have prints #1009 through 1127, except for #1100), see the supervisor. If any other problems occur, see the supervisor.
3. Manila folders
4. Three listings to help in assignment of identification numbers
(1) There is a listing of the taxpayers in the sample, by social security numbers. This list is in numerical order, and contains the social security number, name, identification number, etc. of the taxpayers.
(2) There is a listing of the taxpayers by last name, in alphabetical order. This lists the identification number, social security number, etc.
(3) A third listing, by identification number, which matches the gummed label listing, is available.
5. Gummed labels
This set of labels contains the identification materials for the taxpayers. There should be a label for each taxpayer who was filing between 1946 and 1960.
  YEARS 000110111111110
  ID# 0201****
  SS# *********
  LASTNAME, FIRSTNAME
  ADDRESS
  CITY
6. Typed error and problem list
This list, compiled from the Microfilm Error and Problem File, is designed to help you straighten out most of the filing and coding problems as they occur.
7. Coding-filing log (see following page)
There should be one such log sheet placed in each folder, and the work done on the folder should be recorded, along with any pertinent remarks, on this log sheet.

PRINT LOG
1. The first numbers written in the first and last counter number spaces were transcribed from the Reel Log.
2. Under these numbers, record the counter numbers from the first and last images in the corresponding bundle.
3. If the print counter numbers do not match the prerecorded numbers, see your supervisor.
4. Record your initials and the date for each bundle you file.
  Columns: Reel # | First Counter # | Last Counter # | Remarks | Filer - Date

1959-1964 Income Tax Returns
Coding-Filing Log
  Name of Head
  Social Security Number of Head
  Identification Number of Head
  Columns: Function | Date | Initials | Remarks
  Functions (one row each):
    Returns Filed
    ID Number Assigned
    Demographic Coding
    Summary Information Keypunching
    Summary Information Edit Correction (1)
    Summary Information Edit Correction (2)
    Summary Information Edit Correction (3)
    Details and Assets Coding
    Details and Assets Keypunching
    Details and Assets Edit Correction (1)
    Details and Assets Edit Correction (2)
    Details and Assets Edit Correction (3)

2.3.2 Filing Procedure
Our objective is to produce a set of folders, each of which contains all the returns of a married couple or single person for all the years 1959-1964, or all the years available. Each person in each folder is to be given an identification number, unique to that person, which will allow us (at a later time) to integrate the 1946-1960 returns of the person with his 1959-1964 returns. If the identification number you assign is in error, the integration process will be slowed up considerably, and integration may never occur.
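The Print Log check described above (compare the bundle's first and last counter numbers with the prerecorded ones, and look for a missing number in the sequence) can be sketched as follows. This is only an illustration; the function name and its arguments are hypothetical, and the manual itself describes a manual check by the filer.

```python
from typing import List, Sequence, Tuple


def check_bundle(first_logged: int,
                 last_logged: int,
                 print_numbers: Sequence[int]) -> Tuple[bool, List[int]]:
    """Compare a bundle's counter numbers with the prerecorded Reel Log range.

    Returns (endpoints_match, missing_numbers). Either a mismatch or a
    non-empty list of missing numbers means: see the supervisor.
    """
    present = set(print_numbers)
    # Counter numbers inside the logged range that have no print in the bundle.
    missing = [n for n in range(first_logged, last_logged + 1) if n not in present]
    endpoints_match = (
        bool(present)
        and min(present) == first_logged
        and max(present) == last_logged
    )
    return endpoints_match, missing


# The manual's example: prints #1009 through 1127, except for #1100.
bundle = [n for n in range(1009, 1128) if n != 1100]
print(check_bundle(1009, 1127, bundle))
```

For the manual's own example the endpoints agree with the log but number 1100 is reported missing, which is the situation in which the filer is told to see the supervisor.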
2.3.2.1 Assembly
The first step in the construction of a folder is to gather all the prints (returns, Wt9s, or other materials) for a given couple or single person into a single stack. In general this will be a simple job because of the order in which the microfilming was done. An additional microfilm shot was taken to indicate the way the state archives had filed these returns. This "folder shot" contains the state's identification materials for the people in the folder, as well as an indication of any years missing from the series. If the folder shot indicates that the returns were out to field audit, the manila folder should be stickered and filed empty except for a colored slip of paper to be taken from a box on the supervisor's desk. The images showing that the returns were out to field audit should be placed in the folder labeled FIELD AUDIT, on the supervisor's desk.
2.3.2.2 Order of returns
In most cases the prints will be ordered in the folder in the reverse of the order in which they were microfilmed. Filming order was 1964-1959; folder order will be 1959-1964. The following outline shows the proper order for items in a folder:
Number
1 Coding-Filing log with the date, your initials, and any problems you encountered with the folder
2 Folder Shot
3 1959 front and back pages (a single print)
4 1959 second and third pages (a single print)
5 1959 information returns (and Wt9 after 1962)
6 1959 schedules, assessment notices, correspondence
7 Similarly 1960, 1961, 1962, 1963, 1964
When the prints are in the proper order, they should be stapled together (once) in the upper right-hand corner. Any returns from years before 1959 should be stapled into a separate bundle and placed immediately behind the 1959-1964 group of returns in the folder. The old (pre-'59) returns should be compared with the returns in the old files. If they merely duplicate the returns in the old files, they should remain in the new folder.
If some returns which are missing in the old folder occur in the new folder, see the supervisor.
2.3.2.3 Problems in filing
1. Disagreement between folder shot information and returns actually present
If the folder shot indicates that a year is missing, but the return for that year is actually present in the folder, record this information in the space next to the missing year indicator on the folder shot, along with the date, your initials and any explanation about why the discrepancy occurred, e.g.:
  1963 return missing (63 present out of order 8-11-66 AB)
If the folder shot shows that a return is present, despite the fact that the return is missing from the folder, and the counter numbers indicate that no such return was photographed, see the supervisor for a list of known microfilm errors.
2. Broken marriages
Divorcees and widows constitute another major filing problem. The state puts the returns of a divorcee or widow into a separate folder after a divorce or the death of the husband, while WAIS wants to keep all the ex-wife's or widow's returns with those of the ex-husband or deceased husband, just as WAIS wants all of the returns she filed before her marriage in the husband's folder. Since the state files a wife's returns from years previous to her marriage in her husband's folder, there is no problem for WAIS in this area. However, the separate filing for divorcees and widows does constitute a definite problem.
  1. Check all returns in the folder for a wife's name and social security number. If the name is present for some years but not all years, check for returns filed under the wife's name. If you find any, file them in his folder.
  2. When you look for an ID sticker for a male on the sticker listing, check the next few stickers for a wife (or wives). A wife's identification number would be identical to her husband's except for the 7th (next to the last) digit. (His 7th digit is 0; hers is 1, 2, etc. depending on how many times he married.)
If you find a wife for a man who didn't mention a wife on his 1959-64 returns, check for returns under her name and file any you find in her husband's folder.

3. If a couple filed together for several years (for instance 1959-1962) and suddenly there are no more returns for either in the folder:
a. Check for correspondence about the death of the husband; look under the wife's name for returns and file her returns in his folder.
b. See if there is any indication that the couple left the state and is therefore no longer subject to Wisconsin taxes.

2.4 Assigning Identification Numbers

2.4.1 The construction of WAIS identification numbers

The WAIS identification number is an eight-digit number with three major divisions:

The name group identifier (digits 1-2): 01-51 for the 50 name groups
The folder identifier (digits 3-6): consecutive beginning with 0001, but with major gaps existing
The person identifier (digits 7-8):

    Person        Code
    Male head     00
    First wife    10
    Second wife   20
    Third wife    30

Thus 01 000100 identifies the male head in the first folder in the 01 (b******) name group.

2.4.2 Procedure for assigning identification numbers to taxpayers already in WAIS files

Most taxpayers who filed 1959-1964 returns also filed 1946-1960 returns, were interviewed by WAIS, or are beneficiaries of taxpayers who filed 1946-1960 returns. Therefore an identification number already exists for these people. IT IS IMPERATIVE THAT THE 1959-1964 RETURNS BE IDENTIFIED BY THE SAME NUMBER AS OTHER WAIS RECORDS FOR THE PERSON.

2.4.2.1 Finding the previously assigned identification number

Each person in the folder (the male and each of his wives) must have a separate identification number. For each person, then:

1. Find his social security number from his returns or from the Folder Shot.

2. On the listing of persons already in WAIS files (ordered by social security number) find the social security name and the identification number associated with the person.
If you find the taxpayer's number, go on to step 3.

If the social security number of the taxpayer is not on the list ordered by social security number, look for the taxpayer on the list ordered alphabetically by name. If you find the taxpayer's identification number, go on to step 3. Be Careful! Many taxpayers have identical names. When you are in doubt, check the folder in which 1946-1960 returns are kept and compare the two prints of the taxpayer's 1959 or 1960 returns.

If you cannot find the taxpayer on the list ordered by name, check over the list ordered by identification number (or the gummed labels themselves) to try to find the taxpayer. If you are unsuccessful, put the folder in a residual stack with others containing taxpayers whose identification number has not yet been assigned.

3. Find the appropriate gummed label, remove it from its backing and affix it to the folder in the positions described below. Some taxpayers have two identification numbers. Find both labels for those who have them.

IF A MARRIED COUPLE HAVE DIFFERENT NAME GROUP OR FOLDER IDENTIFIERS, SEE THE SUPERVISOR.

2.4.2.2 Positioning of gummed labels on folders

1. Males with Identification Numbers Ending in 00 -- Labels are to be placed on the lip of the folder on the left hand side so that the identification number, social security number, and the name are clearly visible.

2. Females with Identification Numbers Ending in 10, 20, 30 -- Labels are to be placed to the right of the male's label in ascending order (10, 20, 30 left to right).

3. Males or females with Identification Numbers Ending in 01, 11, 02, 12, 03, 13, etc. -- Labels are to be placed on the face of the folder in the same left to right order as that of labels on the lip. Place these folders in the residual stack with persons who have no identification number unless the taxpayer has one number ending in 0 and a second number ending in some digit greater than 0.
Taxpayers who have both identification numbers should have a label in each appropriate position and the folders should be filed in the drawer. If you find a taxpayer who has three identification numbers, see your supervisor.

2.4.3 Assigning identification numbers to the residual stack

Since there are no labels for taxpayers in this group, you will print on a blank label the identification number and social security number (first line) and the full name of the taxpayer. This hand printed label will be affixed to the lip of the folder. Each taxpayer in each folder must have a label affixed to the lip of the folder.

The folders in the residual stack are of two major types:

1. Those with labels on the lip for one spouse but not for the other(s) (new spouses of existing taxpayers).
2. Those with no labels on the lip at all (new taxpayer families).

We will discuss each type separately.

1. New spouses of existing taxpayers

In most cases these will be the first wives of previously single men, but new spouses can be females whom the male head married after he had been widowed or divorced. Others may be women who did not work before 1959 or 1960, but who worked (and so filed) during the years 1961-1964. A few "new spouses" are males who did not work from 1946-1960 but who began to file after 1960, or whose returns were missed during the first microfilming job.

In the first half of the project, only the wives who filed were given identification numbers. Therefore, many men with previously assigned ID#s have wives who will have to be assigned a new ID#. The following steps are to be followed in each such case:

(1) Check the remaining printed gummed labels to make sure that the spouse does not in fact have an existing identification number. The ordering of the gummed labels should make this relatively easy.
(2) When you are satisfied that no label exists for this spouse, print a label with the required information (identification number, social security number and name) and affix it to the lip of the folder. The first six digits of the identification number must be identical to those of the spouse's number; the last two digits should be 00 for a male; 10, 20, 30 for his wives.

However, it is usually impossible to tell from the 1959-64 returns whether a man was married previously and then divorced or widowed. We don't want to assign #1 wife and #2 wife the same ID#, and it would take too long to check the old file for every male taxpayer with an ID# and a "new" (un-"identified") wife to see which number wife this is for him. We must therefore assign the new wife a completely unique wife number, which tells us only that she is married to a man who had an ID#, though she did not have one. This number will reveal nothing about the number of times her husband married. The number is to be identical to the husband's ID# except for the last two digits, which are always to be "NN."

When each taxpayer in the folder has been given an identification number ending with 0, file the folder under its identification number.

2. New taxpayer families

For those folders with no lip labels:

(1) Print a label for each person in the folder and affix it to the lip of the folder. Omit the identification number on these labels.

(2) Alphabetize the folders according to the head's name.

(3) Check each folder against the remaining printed labels to make sure there is no existing identification number for the taxpayer(s). Remove any for whom labels exist and handle them as you would have if you had matched them before. Check to see whether prints of 1959 or 1960 returns are identical if there is some doubt.

(4) Assign identification numbers with folder identifiers beginning (4001) to each label on each remaining folder and file the folders in identification number order.
As you assign identification numbers to these folders, write the identification number, the social security number and the name of the taxpayer to whom you assign the identification number on a yellow sheet in sequence, and submit the completed sheet (dated and initialed) to the supervisor when you have finished the name group.

3. Coding Manual

3.1. Materials needed

3.1.1 Folder(s)

You will work with a group of folders, each of which contains all the returns which were collected for the single person or married couple whose name(s) is (are) written on the top of the folder or on gummed labels on the top of the folder. The folders also contain a filing/coding log sheet.

All folders are filed in file-cabinets in the corridor. They are separated by "name-group" and, within the name-groups, ordered by identification number. A name-group may fill more than one drawer (check the label on the outside of the drawer) and in some cases a drawer may contain more than one name-group.

When you begin the coding, the coding supervisor will assign a name-group (or drawer) to you. Begin with the first drawer for this name-group and work until you have completed the coding of all folders for this particular name-group. Make sure that you don't skip any of the folders in the group.

3.1.2 Coding sheets

You will receive a stack of coding sheets on which you will code the information that will later be required to integrate the new tax returns with the existing WAIS files.

The top part of the coding sheet refers to "general" information, which should be obtained both from the top of the folder (see above, section 3.1.1) and from the most recent return in the folder (see detailed instructions in section 3.5, fields (1) through (8A)). The numbered fields in this part are allotted a specific number of boxes, which in some cases is exactly the number required to contain a certain field (e.g.
the Social Security number), and in other cases is designed to contain up to a maximum number of characters (e.g. the fields pertaining to the taxpayer's name). In the latter group, we assumed that the number of boxes allotted would be sufficient for all returns. If you find any name which contains more characters than the maximum number allotted, see the coding supervisor.

The bottom part of the coding sheet contains specific information which has to be coded for each year in which the taxpayer filed a return. For these fields, the boxes have not been divided into specific numbers of digits; the codebook specifies for each field how many digits it should contain. Do not code fewer or more digits than the number specified in the codebook.

Mike VonSchneidemesser 1967. UPDATEAL - A Program to Update Tape Files. February 9, 1967. WAIS paper 667-025. Maintenance System - Files, Data, Etc. Programs.

Michael von Schneidemesser
WAIS 667-025
January 2, 1968
2nd Revision

UPDATEAL - A Program to Correct and Maintain Tape Files

Contents:

I. General Description
II. Input Card Description
III. Example File, Input Card Examples, and Outputs
IV. Phase 1 Error Messages
V. Preparation and Checking of Input Cards
VI. Some Features of UPDATEAL
    1. Making Changes to a Group of Records
    2. Specifying a Blank Character or Space
    3. Safeguards Against Inadvertent Changes
    4. C-Card versus M-Card
    5. A-Card or Numbered Cards
    6. Making a New Tape File from Cards
VII. Adapting the Source Decks to a Specific File
VIII. Running the Program

III. EXAMPLE FILE, INPUT CARD EXAMPLES, AND OUTPUTS

To illustrate the features of UPDATEAL, to give some examples for coding of input cards, and to provide a set of source decks as convenient as possible for adaptation to other files, a fictitious file has been created. On the basis of this file a set of cards was made up to demonstrate the working of the program and to generate the various outputs.
There now follows a description of the file formats and listings of the fictitious example file and of the sample set of cards, as well as of the outputs generated.

Example File Format

Cols.   Number of    Description         Possible Values        Order of Importance
        Characters   of Field            in Field               if Sortfield
1-2     2            constant            '66'                   -
3       1            item type           alphabetic             3
4-11    8            ID number           numeric                1
12-13   2            coded information   alphanumeric           -
14-15   2            year                numeric                2
16-17   2            card number         numeric                4
18-78   61           coded information   alphanumeric           -
79-80   2            sequence number     numeric or blank       5
81      1            constant            code for record mark

The sortfield length is thus 15 (the number of characters in all sortfields). For the remainder let us assume that the minimum identification length (See Chapter VI-3) has been set to 6 characters, and that the "Stand-for-Blank" character is the '$' sign (See Chapter VI-2).

Layout of File Listings

Files in this write-up are listed with one record per line and the characters are printed in blocks of 10 followed by two blank spaces each.

Layout of UPDATEAL Printed Output

All Phases. The first line will always contain an identification of the program and the phase and the date, giving the year and the day of the year. The second line contains a listing of some program parameters to identify the program configuration. Then follow messages and record printouts. Whenever a record is printed out, either due to an error or because it was requested by an input card, the record will be displayed in segments of 100 unedited characters per line. Any record and any message due to an error will always be printed starting in column 1, while record printouts requested by the user will start in column 20 with the accompanying message starting in column 10. At the end of each phase, counts detailing kinds of cards, records, and errors will be listed.

Phase 1. An error in an input card will be indicated by a 40 position message followed by a relevant input card.
A 'W' in position 39 indicates a warning type message, which indicates that the program may be continued through the next phase with such an error. Some messages are followed by a printing of the record created to facilitate recognition of the error.

Phase 2. Messages are usually self-explanatory. Records printed on the request of the user are preceded by 'BEFORE' or 'AFTER', designating the record as it was found on the file to be changed, and the record in the form it will appear on the updated file, respectively. Naturally a record to be inserted into the file can be displayed only in its 'AFTER' status and a record dropped only in its 'BEFORE' status.

Phase 3. No messages except those provided by the sort module.

Phase 4. Self-explanatory.

EXAMPLE FILE BEFORE UPDATEAL RUN

66S1100012 3NA5401 TH IS IS THE FIRST FICT ITIOUS REC ORD FOR A FICTITIOUS FILE 00
66S1100012 3NA5401 TH IS IS THE SECOND FIC TITIOUS RE CORD FOR A FICTITIOU S FILE 02
66S1100012 3NA5401 TH IS IS THE THIRD FICT ITIOUS REC ORD FOR A FICTITIOUS FILE 03
66U1100012 3NN5401 TH IS IS THE FOURTH FIC TITIOUS RE CORD FOR A FICTITIOU S FILE
6681100200 3NA5401 TH IS IS AN E RRONEOUS RRRECOR D FOR A FI CTITIOUS F ILE 10
6682100000 4NN5024 AN OTHER RECO RD 01
6682100000 4NA5124 AN OTHER RECO RD 01
66D2200000 2NN5023 TH IS IS SOME OTHER FIC TITIOUS RE CORD FOR A FICTITIOU S FILE 01
66D2200000 3NN5023 AN D ANOTHER FICTITIOUS RECORD FO R A FICTIT ITIOUS FIL E 01
66S2200000 3NN5123 TH IS IS THE SECOND LAS T RECORD O F THE FICT ITIOUS FIL E 01
66D9999900 0NA5929 TH IS IS THE LAST RECOR D OF THE F ICTITIOUS FILE 09

INPUT CARDS

UPDATEAL PHASE 1 RESULTS. 67352
PROGRAM PARAMETERS - RECORD LENGTH 081, IDENTIFICATION LENGTH 15, MINIMUM ID LENGTH 06.
000018 CARDS READ OF WHICH 000004 WERE A-CARDS, 000006 C-CARDS, 000002 D-CARDS, 000002 M-CARDS, AND 000004 NUMBERED CARDS.
000011 RECORDS PUT OUT ON ALTER TAPE - THAT EXCLUDES DUMMY IDENTIFICATION RECORDS.
OF THIS TOTAL 000002 WERE A-TYPE RECORDS, 000002 D-TYPE RECORDS, 000004 C-TYPE RECORDS, 000001 M-TYPE RECORDS, AND 000002 NUMBERED-TYPE RECORDS.
000000 ERRORS IN THE INPUT DATA, 000000 OF WHICH WERE SERIOUS.

UPDATEAL PHASE 2 RESULTS. 67352
PROGRAM PARAMETERS - RECORD LENGTH 081, IDENTIFICATION LENGTH 15.

THIS UPDATE RECORD COULD NOT BE PUT OUT ON THE UPDATED FILE
66S11000123NA5401 THIS DUPLICATES A RECORD ALREADY ON THE FICTITIOUS FILE 02
BECAUSE THE FOLLOWING RECORD ON THE OLD FILE HAS THE SAME IDENTIFICATION.
66S11000123NA5401 THIS IS THE SECOND FICTITIOUS RECORD FOR A FICTITIOUS FILE 02

IDENTIFICATION AND ACTION SECTION OF UPDATE CARD 1100012354S0103 CJ . AFFECTED RECORDS FOLLOW
BEFORE 66S11000123NA5401 THIS IS THE THIRD FICTITIOUS RECORD FOR A FICTITIOUS FILE 03
AFTER 66S11000123NA5401A C-CARD CREATED THIS DUPLICATE RECORD A FICTITIOUS FILE 02

IDENTIFICATION AND ACTION SECTION OF UPDATE CARD 1100012354T9901 AJ . AFFECTED RECORDS FOLLOW
AFTER 66T11000123NN5499 AND THIS JUST BEHIND THE THREE RECORDS 01

NO RECORD IN OLD FILE FOR THIS UPDATE RECORD 21000000 DJ

IDENTIFICATION AND ACTION SECTION OF UPDATE CARD 21000004 MJ . AFFECTED RECORDS FOLLOW
BEFORE 66821000004NN5024 ANOTHER RECORD 01
AFTER 66821000004CC5024 ANOTHER RECORD MODIFIED 01
BEFORE 66821000004NA5124 ANOTHER RECORD 01
AFTER 66821000004CC5124 ANOTHER RECORD MODIFIED 01

IDENTIFICATION AND ACTION SECTION OF UPDATE CARD 2200000351S23 CJ . AFFECTED RECORDS FOLLOW
BEFORE 66S22000003NN5123 THIS IS THE SECOND LAST RECORD OF THE FICTITIOUS FILE 01
AFTER 66U11000123NN5401 THIS IS THE SECOND LAST RECORD OF THE FICTITIOUS FILE 01

IDENTIFICATION AND ACTION SECTION OF UPDATE CARD 9999900059D DJ . AFFECTED RECORDS FOLLOW
BEFORE 66D99999000NA5929 THIS IS THE LAST RECORD OF THE FICTITIOUS FILE 09

000011 OLD RECORDS READ IN, 000011 UPDATE RECORDS READ.
000009 NEW RECORDS PUT OUT, PLUS 000004 RECORDS PUT OUT ON RECYCLE-FILE.
000001 RECORDS FROM OLD FILE WERE DROPPED, DUE TO 000002 D-TYPE RECORDS FROM THE UPDATE-ALT-FILE.
000004 RECORDS WITH MODIFIED FIELDS, INCLUDING SORTFIELDS, DUE TO 000004 C-TYPE RECORDS.
000002 RECORDS WITH MODIFIED FIELDS-EXCEPT THEIR SORTFIELDS-, DUE TO 000001 M-TYPE RECORDS.
000003 NEW RECORDS INSERTED INTO THE NEW FILE, DUE TO 000004 A-TYPE OR NUMBERED RECORDS.
000002 POTENTIAL ERROR CONDITIONS ENCOUNTERED.

PHASE 2 OUTPUT FILES

NEW - EXAMPLE FILE WITH A, D, M, AND NUMBERED CARD CHANGES MADE

66S1100012 3NA5401 TH IS IS THE FIRST FICT ITIOUS REC ORD FOR A FICTITIOUS FILE 00
66S1100012 3NA5401 TH IS RECORD WILL BE IN SERTED BEH IND THE FI RST RECORD 01
66S1100012 3NA5401 TH IS IS THE SECOND FIC TITIOUS RE CORD FOR A FICTITIOU S FILE 02
66T1100012 3NN5499 AN D THIS JUS T BEHIND T HE THREE R ECORDS 01
6662100000 4CC5024 AN OTHER RECO RD MODIFIE D 01
6662100000 4CC5124 AN OTHER RECO RD MODIFIE D 01
66D2200000 2NN5023 TH IS IS SOME OTHER FIC TITIOUS RE CORD FOR A FICTITIOU S FILE 01
66D2200000 3NN5023 AN D ANOTHER FICTITIOUS RECORD FO R A FICTIT ITIOUS FIL E 01
66D9999900 0NA6029 I NSERTED 08

RECYCLE - EXAMPLE FILE RECORDS WITH C CHANGES

66S1100012 3NA5401A C -CARD CRE ATED THIS DUPLICATE RECORD A FICTITIOUS FILE 02
66U1100012 3NN5401 TH IS IS THE FOURTH FIC TITIOUS RE CORD FOR A FICTITIOU S FILE 01
6681100200 3NA4901THI S WAS A CO RRECTED RECOR D FOR A FI CTITIOUS F ILE 10
66U1100012 3NN5401 TH IS IS THE SECOND LAS T RECORD O F THE FICT ITIOUS FIL E 01

UPDATEAL PHASE 4 RESULTS. 67363
PROGRAM PARAMETERS - RECORD LENGTH 081, IDENTIFICATION LENGTH 15.

THIS ALREADY EXISTING RECORD OF THE ORIGINAL FILE
66S11000123NA5401 THIS IS THE SECOND FICTITIOUS RECORD FOR A FICTITIOUS FILE 02
IS DUPLICATED BY THIS RECORD CREATED THRU A C-CARD, WHICH WAS NOT PUT OUT ON THE FINAL UPDATED FILE.
66S11000123NA5401A C-CARD CREATED THIS DUPLICATE RECORD A FICTITIOUS FILE 02

THIS RECORD CREATED THRU A C-CARD
66U11000123NN5401 THIS IS THE FOURTH FICTITIOUS RECORD FOR A FICTITIOUS FILE 01
IS DUPLICATED BY THIS RECORD ALSO CREATED THRU A C-CARD, WHICH WAS NOT PUT OUT ON THE FINAL UPDATED FILE.
66U11000123NN5401 THIS IS THE SECOND LAST RECORD OF THE FICTITIOUS FILE 01

000011 RECORDS OF PHASE 4 UPDATED FILE PUT OUT, 000009 RECORDS OF PHASE 2 UPDATED FILE WERE READ IN AND PUT OUT, AND 000004 C-TYPE RECORDS WERE READ.
IN 000001 CASES AN ORIGINAL RECORD WAS FOUND TO HAVE THE SAME IDENTIFICATION AS A C-TYPE RECORD,
IN 000001 CASES TWO RECORDS WITH THE SAME IDENTIFICATION WERE CREATED, AND BOTH THRU C-TYPE RECORDS.

PHASE 4 OUTPUT FILE

FINAL - NEW AND RECYCLE MERGED

66S1100012 3NA5401 TH IS IS THE FIRST FICT ITIOUS REC ORD FOR A FICTITIOUS FILE 00
66S1100012 3NA5401 TH IS RECORD WILL BE IN SERTED BEH IND THE FI RST RECORD 01
66S1100012 3NA5401 TH IS IS THE SECOND FIC TITIOUS RE CORD FOR A FICTITIOU S FILE 02
66T1100012 3NN5499 AN D THIS JUS T BEHIND T HE THREE R ECORDS 01
66U1100012 3NN5401 TH IS IS THE FOURTH FIC TITIOUS RE CORD FOR A FICTITIOU S FILE 01
6681100200 3NA4901THI S WAS A CO RRECTED RECOR D FOR A FI CTITIOUS F ILE 10
66B2100000 4CC5024 AN OTHER RECO RD MODIFIE D 01
6682100000 4CC5124 AN OTHER RECO RD MODIFIE D 01
66D2200000 2NN5023 TH IS IS SOME OTHER FIC TITIOUS RE CORD FOR A FICTITIOU S FILE 01
66D2200000 3NN5023 AN D ANOTHER FICTITIOUS RECORD FO R A FICTIT ITIOUS FIL E 01
66D9999900 0NA6029 I NSERTED 08

IV. PHASE 1 ERROR MESSAGES

1. SEQUENCE
The card printed to the right of the message or the card printed below is not sorted properly. A set of numbered cards may not be sequenced correctly.

2. FORMAT
The action section uses an illegal symbol; the card could not be recognized and processed by the program.

3. IDENTIFICATION TOO SHORT OR TOO LONG
The characters coded in the identification may exceed the total sortfield length; the identification may contain fewer characters than the minimum identification length specified in the program; or, for numbered cards or A-Cards, the identification section specified does not contain the same number of characters as are in all sortfields.

4. BAD FIELD-DESIGNATOR AND/OR DATA W
Errors in the data section. Length of field specified does not correspond to the data actually coded; use of non-numeric characters in the field-designator; a field-designator-with-data-field starts in a wrong column. Check all cards with the same identification.

5. A RECORD ADDRESSED BY 2 TYPES OF CARDS W
It is not possible to specify two different actions on any one record or group of records.

6. FIELD CODED EXCEEDS RECORD LENGTH
An attempt was made to specify a data field for which there is no place on the record, because it would exceed the record length. Check all cards with the same identification.

7. FIELDS SPECIFIED FOR RECORD COLLIDE W
The same column(s) on a record have been referenced twice or more by field-designator-with-data-fields. Check all cards with the same identification.

8. CHANGES IN SORTFIELD OF RECORD BELOW
For numbered cards or A-Cards the identification section was found to be different from the sortfield(s) in the record set up. For M-Cards some changes to the sortfield(s) of the record(s) have been specified.

9. NO CHANGE IN SORTFIELD OF RECORD BELOW
C-Cards are specified which do not make any changes to the sortfield(s).

10. NOT ENOUGH CARDS FOR RECORD BELOW W
Not enough numbered cards are used to specify a complete new record. This message may also occur if more than enough cards have been specified.

Note: If errors other than those marked with 'W' are indicated, then the counters for types of cards read in may not be correct.
Also, it is possible that the number of errors indicated is greater than the number of messages printed, because of duplicate errors per card or record.

V. PREPARATION AND CHECKING OF INPUT CARDS

In many applications it may be useful to check the input cards for the legal values that can appear in the specified fields of the file to be updated. It is not necessary to check for those conditions which would preclude a proper functioning of the program itself, since such conditions will be detected by the program. For the kinds of errors the program will detect, see Chapter IV.

Since UPDATEAL is a general purpose program it assumes that any character can occur in any position in the file (for a possible exception see Chapter VI-2). For many files, however, the user knows that certain fields can only contain certain values. For example, the fictitious file used as an example in this write-up allows only numeric characters in positions 4-11, 14-17, and 79-80; and one also knows that positions 14-15 can contain only values between, let us assume, 55 and 68, since this is the year. Depending on how important it is to prevent possible coding errors in these fields, and on how likely it is that errors could have been introduced in the preparation of the input cards, the user may want to check the input cards for these conditions using an edit program.

Such an edit often will be made easier by adopting some restrictive coding conventions like: use only one field designator with data per card; use one field designator with data per logical field; change only the complete logical field, even though some positions will be the same.

To facilitate the job of keypunching, use separate code-sheets for numbered cards and instruct the keypunch personnel that, for the batch with the C, M, A, and D cards, columns 35, 50, 65, and 80 must always be blank and that columns 21-24, 36-39, 51-54, and 66-69 can contain only numeric data or blanks.
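
The keypunch conventions just stated lend themselves to a small edit check of the kind mentioned above. The following is only a minimal sketch in Python, not part of UPDATEAL or of any WAIS edit program; the function name `check_card` and the choice to test each column individually (rather than each four-column field as a whole) are assumptions made for illustration.

```python
# Hypothetical edit check for C, M, A, and D cards (not part of UPDATEAL).
# Column numbers are 1-based, as on the punched card.
BLANK_COLUMNS = (35, 50, 65, 80)
NUMERIC_OR_BLANK_RANGES = ((21, 24), (36, 39), (51, 54), (66, 69))
ALLOWED = set("0123456789 ")


def check_card(card: str) -> list:
    """Return a list of problems found on one 80-column card image."""
    card = card.ljust(80)[:80]  # pad or trim to exactly 80 columns
    problems = []
    for col in BLANK_COLUMNS:
        if card[col - 1] != " ":
            problems.append(f"column {col} must be blank")
    for first, last in NUMERIC_OR_BLANK_RANGES:
        field = card[first - 1:last]
        # Assumption: every column in the range may be a digit or a blank.
        if not all(ch in ALLOWED for ch in field):
            problems.append(f"columns {first}-{last} must be numeric or blank")
    return problems
```

A card that passes returns an empty list; anything else can be printed next to the offending card for the keypunch operator to correct.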
For a technique on how to avoid changing records mistakenly, see Chapter VI-3.

When all cards are punched, sort numerically on columns 20-1. Of course, one can skip those columns from column 18 on which are never used, if the sortfield is less than 18 columns long.

VI. SOME FEATURES OF UPDATEAL

1. Making Changes to a Group of Records

In principle UPDATEAL will act on all records which have values in their sortfield(s) equal to those given in the identification section of an input card. Fewer characters in that section than there are in the sortfield(s) will result in changes to all records having in common those parts of the sortfield(s) which are specified. In other words, only as many characters of the sortfield(s) in the records of the file will be compared to the characters in the identification section as there are positions up to and including the last non-blank character.

This allows one to change a whole sequence of records with one card or one set of cards by specifying only those major sortfield(s) which are common to the records to be changed. It is not necessary to make out a card for each logical record to make the same change(s) to a group of records stored consecutively on the file. Yet to prevent unintended changes to a great number of records through an incompletely prepared input card, the program can be made to reject all cards with not enough characters specified in the identification section (See Chapter VI-3). For an example of this feature see the phase 2 output under section III of this write-up.

2. Specifying a Blank Character or Space

The program makes special use of blanks or spaces. This character is sometimes denoted as 'b'. If, however, a blank is to be entered on a record or it is a part of the identification, i.e., some part of the sortfield is blank, then it is necessary to communicate this to the program thru the use of the "Stand-for-Blank" character.
In the example file version of UPDATEAL the "Stand-for-Blank" character is the dollar sign '$'. This program parameter may be changed to any other character (See card number 0591 in the source program). But one should make sure that this character is not a valid character in the file.

Specifying Blanks in the Identification Section

The "Stand-for-Blank" character may be used each time a character in the sortfield is blank. This applies for all card types. The "Stand-for-Blank" character, however, must be used for the last character in the identification section if it is blank in the sortfield, because the program treats everything up to and including the last non-blank character as the identification.

Specifying Blanks in the Data Section

The "Stand-for-Blank" character must be used whenever a blank should be moved into any position of an existing record. But do not use the "Stand-for-Blank" character in the data section of either a numbered card or an A-Card. A blank field in the data section of a numbered card will be set up as a blank field in the record to be created. Any position of a record to be established which is not specified in the data section of an A-Card will automatically be set to blank.

3. Safeguards Against Inadvertent Changes

The program has a special parameter (minimum-ID-length, card number 0590) which may assume values between 0 and 18. Any input card which has fewer characters in the identification section than specified by this parameter will be rejected as an error.

Incomplete Identification Section

By specifying a minimum-ID-length close to or equal to the sortfield length it is possible to reject input cards which are incompletely prepared and thus could make changes to a whole string of records (See Chapter VI-1).
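
The interplay of the group-matching rule (Chapter VI-1), the "Stand-for-Blank" character, and the minimum-ID-length can be illustrated with a short sketch. This is a Python illustration under stated assumptions, not the Cobol logic of UPDATEAL itself: the function names are hypothetical, each record is represented only by its sortfields concatenated in order of importance, and the constants are the example-file settings quoted in this write-up (minimum ID length 6, sortfield length 15, '$' standing for blank).

```python
# Illustrative sketch only; names and data layout are assumptions.
STAND_FOR_BLANK = "$"   # example-file setting
MIN_ID_LENGTH = 6       # example-file setting
SORTFIELD_LENGTH = 15   # total characters in all sortfields of the example file


def effective_identification(ident: str) -> str:
    """Keep everything up to and including the last non-blank character,
    then turn each Stand-for-Blank character into a real blank."""
    return ident.rstrip(" ").replace(STAND_FOR_BLANK, " ")


def matching_keys(ident: str, sort_keys: list) -> list:
    """Return the sort keys addressed by an identification section.

    Raises ValueError for an identification shorter than the minimum
    ID length or longer than the total sortfield length."""
    prefix = effective_identification(ident)
    if len(prefix) < MIN_ID_LENGTH:
        raise ValueError("identification too short")
    if len(prefix) > SORTFIELD_LENGTH:
        raise ValueError("identification too long")
    # Only as many characters as were specified are compared.
    return [key for key in sort_keys if key.startswith(prefix)]
```

With keys built as ID number, year, item type, card number, and sequence number, an identification of `11000123` would address every record of that ID, while `1100012354U01$$` would address only the record whose sequence number is blank. A five-character identification would be rejected under the example setting of 6.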
Incorrect Identification Section

An incorrectly prepared identification section may result in the wrong record(s) being changed. To avoid this, one can declare any other field in the file to be an additional sortfield, by appropriately specifying this in the program decks (See Chapter III). The file would not actually have to be resequenced on this additional, or dummy, sortfield, since doing so would not affect the arrangement of records in the file if (a) the original or true sortfield(s) provide a unique identifier of each record and (b) this additional or dummy sortfield is used as the most minor one. Also, this dummy sortfield should be chosen in such a manner that it will have the same value for as few records as possible.

Then if by accident the identification section points to a record which was not originally intended to be changed, the program will compare everything in the identification section to the sortfield(s) -- including the dummy one -- in the file, but will not be able to locate the misspecified record since the dummy part of the sortfield(s) will not agree. A "No record in old file for this update record" message will appear in the diagnostic output of phase 2. This "double identifier" method will be 100% effective if properly planned and is a necessity in many applications.

4. C-Card versus M-Card

Both cards serve to make changes to selected fields of existing records and use the same method to do this. However, whenever an "M" appears in the action section, the program will reject not only any attempt to change the values of a field designated as a sortfield, but also any attempt to overlay a sortfield character with the same character. A "C" in the action section legitimizes changes to any field on a record. In fact a C-Card will be rejected if no change to the sortfield(s) has been specified. Besides changing a sortfield it is also possible to change any other field.
C-Cards should be used with great care and in some applications one may not want to use C-Cards at all. If M-type changes and C-type changes are to be made at the same time on one record, it is necessary to make all cards C-Cards. It is not possible to use different types of cards for one record during the same run. 5. A-Cards or Numbered Cards Both card types may be used to specify new records for insertion into a file. In general, the "numbered card" method is easier to use, requires less coding, fewer cards, and is probably safer in respect to error possibilities. Under some circumstances, however, the A-Card method will be advantageous: If the record to be specified consists mainly out of blank spaces and is rather long. Also, if only very few new records are to be specified together with changes on existing records, then a coder may find it easier to use one method only and not try to learn an additional one. 6. Making a New Tape File from Cards, Phase 1 of UPDATEAL may be used for creating new tape files using the A-Card and/or numbered card methods. The program will do the formatting of the file and check for sequence errors and duplicate records. To use UPDATEAL for this purpose remove card number 1104 and 1212 from the source deck. VII. ADAPTING THE SOURCE DECKS TO A SPECIFIC FILE To prepare the Cobol source decks for a specific file it is necessary to make a few parameter changes to certain well identified cards. This should be done by a person, who has at least some elementary knowledge of the Cobol programming language. Also, if used, some parameters of the sort specification program (phase 3) have to be adapted to suit the specific file. The cards which need to be changed are the same in phase 1, 2, and 3. Their number is small. For example, an unblocked file with one sortfield necessitates the (re)punching of at the most 11 cards in a phase. For a blocked file with three separate sortfields the number of cards is 18. General Card Format Col. 
1 not used.
Cols. 2-6 contain the sequence number, which is not always continuous from card to card.
Cols. 7-72 contain the Cobol statements.
Cols. 73-77 contain 'UPDAT', the program identifier.
Col. 78 contains the phase number, 1-4; this field is blank for those cards which may be changed and are the same in each phase.
Cols. 79-80 indicate the changes to be made, if any; otherwise blank.

Indicators for Type of Change in Columns 79-80

Col. 79  Col. 80  Meaning of Indicator and What to Do with Cards So Indicated
1        b        Change or remove for a change in the sortfield(s).
2        b        Value clause has to be the total number of characters in the sortfield(s).
3        b        Value clause has to be the minimum number of characters to be specified in the identification section of the input cards.
b        1        Change or remove for a change in the record length.
b        2        Value clause has to be the number of characters in the logical record.
b        3        Value clause has to be the "Stand-for-Blank" character.
b        4        Remove whenever the logical record does not terminate with a record mark. (A record mark is a necessary character for blocked file handling in the IBM 1400 series computers.)

The following two kinds may be disregarded for normal applications:

6        7        The area size specified by these cards need only be as big as the logical record; to save on core, these areas may be reduced to the logical record size.
8        8        Remove, if phase 1 is to be used to set up a new tape file.

Explanation of Some Specific Statements (by sequence number)

0431-0439: Subentries under FILE-ID-IMAGE define the location and length of the sortfield(s) in the file. "ID1" refers to the major, ID5 to the most minor sortfield. There is no limit to the number of sortfields declared this way, except by the size of the identification section (=18). Fields which are not a sortfield should be declared FILLER. It is necessary to account for each character in a record; note, however, that only alphanumeric characters can be declared, irrespective of actual usage.
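The two constraints on FILE-ID-IMAGE just described (every character of the record is accounted for, and the sortfields together fit the 18-character identification section) can be checked by hand before punching. The sketch below is in Python rather than Cobol and is only an illustration: the function name, the (name, length) representation, and the 80-character example layout are assumptions, not part of UPDATEAL.

```python
ID_SECTION_SIZE = 18  # size of the identification section, as stated in the text


def check_layout(entries, record_length):
    """Check a FILE-ID-IMAGE style layout given as (name, length) pairs.

    Entries named "FILLER" are non-sortfield areas; every other entry is
    treated as a sortfield (ID1, ID2, ...). Returns a list of problems found.
    """
    problems = []
    total = sum(length for _, length in entries)
    if total != record_length:
        problems.append(
            f"layout covers {total} characters, record has {record_length}"
        )
    sort_total = sum(length for name, length in entries if name != "FILLER")
    if sort_total > ID_SECTION_SIZE:
        problems.append(
            f"sortfields total {sort_total} characters, limit is {ID_SECTION_SIZE}"
        )
    return problems


# Hypothetical 80-character record with two sortfields.
layout = [("ID1", 8), ("FILLER", 20), ("ID2", 2), ("FILLER", 50)]
print(check_layout(layout, 80))
```

An empty list means the layout accounts for the whole record and the sortfields fit the identification section.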
It may be advisable to include a size entry in statement 0430; the compiler will then be able to detect size errors in the format specification of FILE-ID-IMAGE.

0441-0449: Subentries for SORTFIELD are all the individual sortfield(s) in order of relative importance. SORTFIELD must correspond in its layout to the use of the sortfield(s) in the identification section of the input cards. Together with FILE-ID-IMAGE it serves to establish the correspondence of the sortfields in the input cards with those in the file. Any unused positions of the 18-character SORTFIELD must be declared FILLER.

0590-0593: Only the value clause is subject to change; all four cards must be present.

1302-1305: See the note on cards 1310-1311.

Suggested Procedure for Making Changes

Get a listing of UPDATEAL in its example file version and use it for reference. Phase 1 will usually be enough.

Make a duplicate deck of the original example file version decks and alter the duplicate decks only. Save the original deck for future use, since it will usually be the most convenient version to start with.

Starting with phase 1, remove all cards which need to be changed. Discard those not needed. Repunch cards to be changed and punch any additional cards. Then make two additional copies of this set of cards and four more copies of those cards appearing in the file section (col. 2-3 = 03). This set of cards will suffice to make all necessary changes to phase 1, 2, and 4.

Now insert one set of cards into the phase 1 deck. Remove and replace cards for phase 2 and 4 in a similar manner. The sequence numbers are the same in all phases, except for the cards in the file section. Since all files require the same format, the same cards can be used for all file descriptions.

To get the correct phase 3, refer to the manufacturer's sort program specifications.

VIII.
RUNNING THE PROGRAM

Put the sorted input cards behind the EXEQ UPDATONE, MJB card of phase 1, followed by the program decks for phases 2 and 3 and 4, if C-Cards were made up. If the input cards contained errors, the program will instruct the operator to cancel the run; otherwise it will go through the various phases, entering a WAIT-Loop whenever tapes have to be mounted or dismounted.

http://www.ssc.wisc.edu/wais/WAIS667025.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667025.txt

John deVries, 1967. A Proposal for the Standardization of WAIS Jobs. February 8, 1967. WAIS paper 667-027. Administration.

John DeVries
WAIS 667-027
February 8, 1967

A Proposal for the Standardization of WAIS Jobs
(Subtitled: "Bureaucracy Sticks up its Ugly Head Again")

1. Introduction

In view of the recent, if not current, confusion about the status of various WAIS jobs, the activities of the staff, expected completion dates, etc., it is my impression that there is an urgent need for a standardized procedure for handling the various jobs dreamt up by the fertile minds of the project staff. The following proposal is an attempt to design a procedure which will enable us:

(a) to allocate the available personnel efficiently and have some idea about "who is doing what" with his time;
(b) to have, hopefully, more realistic and reliable time-estimates;
(c) to arrive, on the basis of time-estimates and personnel allocations, at more realistic deadlines;
(d) to prevent, hopefully, the Parkinsonian expansion of jobs to points where available manpower capacity is exceeded;
(e) to inspect, at regular time-intervals, the progress of the respective jobs and, if required, to change time-estimates and personnel allocations.

2.
Analysis of jobs

In very general terms, most jobs in this project seem to fall somewhere in the following pattern:

General Stage  Subdivision               Possible "Output"           Type of Staff Involved    Remarks

Planning       Definition of problem     WAIS paper (to define       Planner                   In some cases the analyst
                                         problem, propose general                              should be consulted at
                                         procedures)                                           this stage

Analysis       Analysis of problem       Set of instructions         Planner, Analyst          Output not always required
                                         (WAIS paper)

               Flow-charting of          Program flow chart or       Analyst                   If program involved,
               procedure and/or          systems flow chart                                    programmer should be
               program                                                                         consulted in many cases

Programming    Coding                    Source program              Programmer                Analyst should be available
                                                                                               for consultation

               Testing of program,       Program listing, object     Programmer,               Clerical staff can be used
               production run            program, program output     clerical staff            to help debugging

Clerical       Checking computer                                     Clerical staff            Programmer should be
Operations     output                                                                          available for consultation

               Writing corrections                                   Clerical staff,
                                                                     programmer

               Summarising findings      WAIS paper, report to       Clerical staff,           Staff involved depends on
                                         respondents, Monograph,     programmer, analyst,      nature of jobs
                                         etc.                        planner

               Typing                                                Clerical staff
                                                                     (typist, secretary)

               Keypunching               Punched cards (listings)    Keypunchers

               Coding                    Coded sheets                Clerical staff            Programmer or analyst
                                                                                               should be available for
                                                                                               consultation

               Miscellaneous                                                                   Everything I forgot to
               operations                                                                      specify above

3. Types of Staff Available for our Jobs:

I have divided the project staff into four categories, dependent on the nature of the work:

1. Planner: Defines the problem on hand and proposes general procedures for the solution. Frequently also writes up final outcome in publishable form;
2. Analyst: Analyses the specifications supplied by the planner and "translates" them into a detailed set of instructions and/or a flow chart;
3.
Programmer: "Translates" flow charts and/or instructions into source programs, tests these programs and runs production jobs; may also be required to help clerical staff set up control cards for existing programs (e.g. Card edit or pre-edit programs);
4. Clerical Staff: Checks computer output, codes data, writes corrections, etc.

In many cases, the planner and the analyst, or the analyst and the programmer, will be the same person, although this is not necessarily so. In several other cases, where no programming is involved, the "programmer" step will be skipped completely.

4. Elements of a Typical WAIS Job-Design:

In order to achieve the desired standardization, I propose that all WAIS jobs be processed according to the following procedure:

(1) Planning stage: Definition of the problem, outline of general procedures. I suggest that for every job a WAIS paper be written with sufficient detail for others to understand the general purpose and to make intelligent comments. Each job write-up should be discussed in a project meeting before it enters the next stage. There should be a standing rule that no job can enter the analysis stage before the problem is clearly defined and the general procedure has been specified.

(2) Analysis Stage: Formulation of a detailed set of instructions (which for programming jobs should include a flow chart of the program required). For large jobs (e.g. filing of tax returns, keypunching of coded data, etc.) the instructions should, again, be in the form of a WAIS paper to be discussed in a staff meeting; for smaller jobs (e.g. checking on multiple ID numbers) a WAIS paper and discussion would generally not be required, but even in these cases clear instructions should be written up and available to whoever is interested. Besides the instructions, the analysis stage should also produce a time-estimate and an allocation of personnel.
The time-estimate (broken down into planning, analysis, programming and clerical) should be done in terms of man-hours required; the allocation of personnel should be done in terms of individuals allocated to the job, as well as the number of hours per week each individual will spend on this particular job. The combined total will yield an initial estimated completion date. It should be a rule that no job can pass to the programming stage (or the clerical stage in the absence of programming requirements) prior to the completion of the planning and analysis stages (including time-estimates!).

(3) Programming Stage: (Only for jobs which require new programs to be written): includes the writing of the program, the testing and debugging, and the production running.

(4) Clerical operations: This includes a large group of separate operations; every job will usually include at least one of them.

The last two stages can be defined as "processing" stages (either by machine or by manual operations); they will usually be the ones absorbing the largest segment of the allocated time.

5. Additional Rules and Suggestions:

(a) On time-estimate: I suggest that all time-estimates as a rule be revised after 10% of a job (or phase of a job) has been completed, and again after approximately half of the job has been finished. It also seems sensible to review time-estimates if a change in personnel occurs (this may be either in the form of gain or loss of staff or in the form of changes in time available from a staff-member).

(b) On time-estimate revisions: If the "10%" revision rule, as suggested above, is accepted, I suggest that the people working on the job (or the people supervising) keep track of the hours spent per week on this particular job.

(c) On allocation of personnel: There should be a file containing for each individual on the project staff: (i) the total number of hours per week the person works; (ii) for each job a person works on, the number of hours per week spent on it.
(d) On overall efficiency: I feel that most individuals will decrease their productivity if they work on more than three jobs simultaneously.

6. A Summated Format for a Job-Description Sheet

Job:
WAIS paper: #                      Date:
Instructions: (WAIS paper #)       Date:

Planner:                           hours per week
Analyst:                           hours per week
Programmer:                        hours per week
Clerical staff:                    hours per week
                                   hours per week

Time estimates                     Initial     10%     50%
Man-hours
Expected completion date

Final completion date:
Start of processing depends on jobs numbered:
Completion of this job essential for the starting of jobs numbered:

http://www.ssc.wisc.edu/wais/WAIS667027.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667027.txt

Ben Bridges, 1965. WAIS and the SSA. February 23, 1965. WAIS paper 645-036. Benefit File; Social Security.

Ben Bridges
WAIS Paper 645-036
February 23, 1965

WAIS and the SSA

I. DATA

A. Data WAIS now has:

1. Wisconsin Tax Record File for 1947-1959: a one-percent sample of Wisconsin individual income tax returns for the years 1947-1959, selected on a name cluster basis which insured the ability to match returns from one taxpayer for all the years in which he filed. Some 18,000 taxpayers appear in this 13-year shifting panel.

2. Personal Interview File: Mainly a stratified sub-sample of the Wisconsin Tax Record File which includes 1,300 heads of taxpayer units. Collected information needed to explain trends and fluctuations in income (occupational history, labor force participation history, education, age, race, etc.), historical information on the ownership and disposition of assets and debts, and other data.

3. Social Security Account File: Information available on Form 805 (summarizing the benefit status, earnings in jobs covered by Social Security, age, race, and quarters of covered employment since 1951) for 14,000 of the taxpayers who appear in the Wisconsin Tax Record File.

4. Wisconsin Taxpayer Population File for 1962: Limited information on each 1962 Wisconsin taxpayer.

5.
Stock Price and Dividend File: Data from Standard and Poor's for large corporations only.

B. Useful additions to our data files

1. Additions to Wisconsin Tax Record File: Add years since 1959.

2. Social Security Benefit data: Social Security Benefit data for recipients of OASDI benefits have not yet been collected, although collection forms have been designed and the Social Security Administration has contracted to deliver the data to us after we are in a financial position to give our final order for them to proceed.

3. Additions to Wisconsin Taxpayer Population File: Add 1961 and years since 1962.

4. Additions to Stock Price and Dividend File.

5. Federal Tax Record Files.

II. ANALYSES

A. National Bureau of Economic Research project: a study of capital gains and capital gains taxation.

B. Brookings Institution project: a study of income fluctuations and trends. A major part of this study will be an attempt to describe and explain trends and fluctuations in the pre- and post-tax incomes of different population groups. Much attention will also be paid to various proposals for income averaging for tax purposes. Parts of the study which may be of particular interest to the Social Security Administration:

1. Analysis of income patterns (fluctuations and trends) over time for different population groups, particularly for the aged before and after retirement. Breakdowns by earned and property incomes will indicate to some degree the acquisition and dissolution of assets over time by the aged.

2. Tax treatment of the aged (may be deferred until later):
(a) double exemption
(b) retirement credit
(c) special medical care deduction

C. Additional studies which may be of particular interest to the Social Security Administration.

1. Studies of retirement: see the David memo of February 23, 1965.

2. Poverty:
a. To what extent is it temporary?
b. Poverty and the aged.

3. Incidence of Social Security taxes: with respect to average income vs. with respect to annual income.

4.
Tax treatment of the aged:
a. Tax exemption of Social Security benefits.
b. Property tax exemptions for the aged.

5. See memo of September 18, 1963.

http://www.ssc.wisc.edu/wais/WAIS645036.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645036.txt

Wynn Bussmann, Richard Mastronarde, 1967. Remaining Clean-up Operations on File 12 and Files 21-28. February 9, 1967. WAIS paper 667-028. Property File.

Wynn V. Bussmann
Richard Mastronarde
WAIS 667-028
February 9, 1967

Remaining Clean-up Operations on File 12 and Files 21-28*

Each of the checks described below refers to the following procedure: (1) check for errors, (2) correction of those errors, and (3) re-check after correction to insure that no errors were missed and that the corrections themselves were made without error. For each check we are including an estimate of the number of man-hours of planning time (Pl), analyst's time (An), programmer's time (Pr), and clerical time (Cl) that will be required to complete the job. In addition, we are including an estimated completion date (E.C.D.) based on the estimates of required man-hours and the availability of the staff's time for that job.

The following WAIS papers are appropriate references for perspective on the present paper:

(1) "General Card Edit Program XS- " by George Loniello shows the kinds of statements possible for intracard checks.

-------------
*File codes refer to those proposed in WAIS 667-020, p.5. The state of completeness of Files 21-28 is also described in WAIS 667-020. A future WAIS paper will examine the possibilities of expanding these files for useful analysis.
-------------

(2) "Computerized Error Correction..." by James Geffert (WAIS 656-044) suggests a method of constructing interfile checks.

(3) "Portfolio Evaluation..." by Roger Miller (WAIS 656-050, 656-051, and 656-053) outlines general processing procedures for each file.

(4) "Outline and Timetable..."
by Richard Bauman (WAIS 667-020) contains an analysis and projection of work on File 12 and Files 21-28.

(5) "Correcting and Updating..." by Mike von Schneidemesser (WAIS 667-021) gives an approximate sequence for making corrections on File 12 and other WAIS files.

I. Intracard Checks

These checks look for the presence of (a) invalid blanks, codes, and characters; (b) "highly unlikely codes" (e.g., interest and dividends of $50,000 or more in any one year; if found, these "errors" are hand-checked with the returns); and (c) improper contingent codes (where the presence of one code implies the presence of another code elsewhere on the card, but the latter is either missing or incorrect).

A. File 12 (Property File)

Card 01 (interest and dividends), which contains by far the greatest number of records, has been completed. The details of the editing process, along with a description of the format of all the cards, will be explained in a forthcoming WAIS paper by R. Mastronarde.

Pl   An   Pr   Cl   E.C.D.
 0   20    0   20   3-1-67

B. Files 21-28 (Firm Data Files)

These checks examine asset type codes and asset identification numbers to make sure (1) that there are no invalid characters or blanks and (2) that the asset identification number is the proper one given the asset type code and vice-versa.

Pl   An   Pr   Cl   E.C.D.
 4    6    0   20   2-20-67

II. Intrafile Checks

A. File 12

One check that the authors feel should be made would simply insure that the first six digits of the taxpayer's identification number are consistent within families. However, the tape-image record is not blocked by families, and so the extra programming required for this check may render it impractical. The man-hour time estimates below do not include an estimate of the time required for this check. Nevertheless, other intrafile checks are possible; for example, the elimination of duplicate and of multiple eight-digit identification numbers. Such a check has been completed on File 12.
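A duplicate-identification check of the kind just mentioned reduces to counting how often each eight-digit number occurs. The following is a minimal sketch in Python, not the program that was run on File 12; the function name and the sample numbers are invented for illustration, and each record is assumed to have been reduced to its identification number beforehand.

```python
from collections import Counter


def duplicate_ids(ids):
    """Return, in sorted order, every identification number that occurs more than once."""
    counts = Counter(ids)
    return sorted(number for number, n in counts.items() if n > 1)


# Invented eight-digit identification numbers; one of them appears twice.
sample = ["10234567", "10234568", "20998867", "10234567"]
print(duplicate_ids(sample))
```

Whether a repeated number is an error or a legitimate second record still has to be decided by hand, as in the check-correct-recheck procedure the paper describes.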
Another check will verify that sequence numbers for Card 1 are complete. A program already exists for this operation, and the check can be quickly run.

Pl   An   Pr   Cl   E.C.D.
 0    5    0   10   2-25-67

B. Files 21-28

These checks look for duplicate asset identification numbers (which are allowed, for instance, if two firms merge) and for multiple asset identification numbers and asset type codes for one firm. Some of the latter may, of course, be justified if, for example, a firm has issued both stocks and bonds, thus giving rise to two valid asset type codes.

Pl   An   Pr   Cl   E.C.D.
 0   10    0   25   2-25-67

III. Interfile Checks

A. File 12

This check involves preparation of an extract from File 12 containing summary information and then checking that summary information against corresponding items in File 11 (Master File). For example, the extract should contain the sums of interest and dividends; net rent; capital gains; and farm, business, and professional income yearly for each taxpayer in File 12 (see Geffert, op. cit.). These sums are then compared with the corresponding amounts recorded in File 11. In addition to simple sum checks, various consistency checks can be made which will tell us more accurately than the sum checks where an error has been made. Since the program described in WAIS 656-044 was designed to pick out specific items in File 11 and run consistency checks on them, it cannot be used directly for File 12. However, given the general strategy outlined in WAIS 656-044, a program can, we feel, be written to perform consistency checks on File 12. Further, a check should be made to see that anyone with a record of property income in File 11 also has a record in File 12 (a similar check has been completed with a pre-edit program to find all records which are in File 12 but not in File 11 and to drop or change them).

Pl   An   Pr   Cl   E.C.D.
20   25   20   60   3-25-67

B.
Files 21-28

After it has been determined that File 12 is consistent with itself and with other WAIS files, File 12 should be checked against Files 21-28 to insure that every asset type code and asset identification number in File 12 is also in Files 21-28. The converse operation is not practical because Files 21-28 may (and do) validly include assets not held by taxpayers in the WAIS sample. Because there is no program presently available for this operation, the authors will have to confer with the programming staff to get an estimate of the programming time required to produce a program that will carry out this check. Since this conference has not yet taken place, the estimate of programming time and hence the E.C.D. have been left blank.

Pl   An   Pr   Cl   E.C.D.
 1    3         5

http://www.ssc.wisc.edu/wais/WAIS667028.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667028.txt

John deVries, 1967. 1959-1964 Wisconsin Income Tax Returns: Tentative Record Formats. March 15, 1967. WAIS paper 667-030. Formats; Master File - Tax Records.

John deVries
WAIS 667-030
March 15, 1967

1959 - 1964 Wisconsin Income Tax Returns
Tentative Record Formats

If no valid objections are made in the near future, the following set of record formats will be used in the processing of the 1959-1964 Wisconsin Income Tax summary and identification data. For the identification file, the multiple-ID file and the multiple-SSA file, these formats will then also be final; for the master file, the format is only an intermediate one, prior to a consistency program (which will insert keys similar to keys B389-B398 on the old master) and prior to the addition of Social Security earnings information (comparable to fields B410-B441 on the old master).

I have tried to stay, as much as possible, in accordance with the formats for the "old" files, with the result that "new" fields are not always logically placed where they belong, but are grouped together at the end of the amount fields. In a more advanced format (e.g.
after the consistency program) we could regroup the fields in a logically more elegant arrangement if that is felt to be useful.

2. Format: Identification Records

                                                                            Number of   Positions
Field Name                        Source on Punched Cards                   Characters  on Record
 1. Source Code                   Not on punched cards; will be             1           1
                                  generated by program
 2. Identification Number         Card 1, cols. 2-9                         8           2-9
 3. Social Security Number        Card 1, cols. 10-18                       9           10-18
 4. Last Name                     Card 1, cols. 19-35                       17          19-35
 5. Title                         Card 1, cols. 36-38                       3           36-38
 6. First name                    Card 1, cols. 39-51                       13          39-51
 7. Middle name                   Card 1, cols. 52-63                       12          52-63
 8. Street or Box number          Card 1, cols. 64-73                       10          64-73
 9. RR or RFD number              Card 1, cols. 74-76                       3           74-76
10. Street name                   Card 2, cols. 19-35                       17          77-93
11. Street class                  Card 2, cols. 36-39                       4           94-97
12. Post office (city)            Card 2, cols. 40-56                       17          98-114
13. ZIP code or postal zone       Card 2, cols. 57-61                       5           115-119
14. County code                   Card 2, cols. 62-63                       2           120-121
15. Age in 1964                   Card 2, cols. 64-65                       2           122-123
16. Date of death (if any)        Card 2, cols. 66-71                       6           124-129
17. Blanks                        Will be inserted by program               6           130-135
18. Record mark                   Will be inserted by program               1           136

3. Format: Multiple ID-Numbers

I have assumed that virtually nobody in the WAIS sample has more than two ID-Numbers (in a run on the 1946-1960 data, only three persons were found with more than two ID-Numbers); persons with more than two ID-Numbers will have more than one record on this file.

                                                                            Number of   Positions
Field                             Source on Punched Cards                   Characters  on Record
 1. Source Code                   Col. 1 (always "M")                       1           1
 2. ID-Number                     Cols. 3-10                                8           2-9
 3. Social Security Number        Cols. 14-22                               9           10-18
 4. Additional ID-Number          Cols. 23-30                               8           19-26
 5. Blanks                        Inserted by program                       5           27-31
 6. Record Mark                   Inserted by program                       1           32

4.
Format: Multiple Social Security Numbers

I have assumed that virtually nobody has more than two Social Security numbers (although there is no evidence either to prove or to disprove that assumption); persons with more than two Social Security numbers will have more than one record on this file.

                                                                            Number of   Positions
Field                             Source on Punched Cards                   Characters  on Record
 1. Source Code                   Col. 1 (always "S")                       1           1
 2. ID-Number                     Cols. 3-10                                8           2-9
 3. Primary Social Security       Cols. 14-22                               9           10-18
    Number
 4. Multiple Social Security      Col. 23                                   1           19
    Indicator
 5. Secondary (resp. tertiary,    Cols. 24-32                               9           20-28
    etc.) Social Security Number
 6. Blanks                        Inserted by program                       3           29-31
 7. Record Mark                   Inserted by program                       1           32

5. Format: Master File (preliminary)

                                                                            Number of   Positions
Field                             Source on Punched Cards                   Characters  on Record
 1. Source Code                   Inserted by program; always "2"           1           1
 2. ID-Number                     All forms: Card 1, cols. 3-10             8           2-9
 3. Year of return                All forms: Card 1, cols. 11-12            2           10-11
 4. Multiple ID indicator         All forms: Card 1, col. 13                1           12
 5. Multiple Social Security      All forms: Card 1, col. 23                1           13
    indicator
 6. Demographic data (coded)      All forms: Card 1, cols. 24-66            43          14-56
 7. Largest wage                  Form 1: Card 1, cols. 67-74               8           57-64
                                  Form 2: Card 3, cols. 37-44
                                  Form 3: Card 3, cols. 21-28
                                  Form 4: Card 1, cols. 67-74
 8. Second largest wage           Forms 1 and 4: Card 2, cols. 13-20        8           65-72
                                  Form 2: Card 3, cols. 45-52
                                  Form 3: Card 3, cols. 29-36
 9. Total other wages             Forms 1 and 4: Card 2, cols. 21-28        8           73-80
                                  Form 2: Card 3, cols. 53-60
                                  Form 3: Card 3, cols. 37-44
10. Total interest received       Form 1: Card 4, cols. 69-76               8           81-88
                                  Form 2: Card 3, cols. 69-76
                                  Form 3: Card 3, cols. 61-68
                                  Form 4: not supplied
11. Total dividends received      Form 1: Card 5, cols. 13-20               8           89-96
                                  Form 2: Card 3, cols. 61-68
                                  Form 3: Card 3, cols. 53-60
                                  Form 4: not supplied
12. Rent income                   Form 1: Card 5, cols. 21-28               8           97-104
                                  Form 2: Card 4, cols. 13-20
                                  Form 3: Card 3, cols. 69-76
                                  Form 4: not supplied
13. Gain or loss, property        Form 1: Card 5, cols. 29-36               8           105-112
                                  Form 2: Card 4, cols. 21-28
                                  Form 3: Card 4, cols. 13-20
                                  Form 4: not supplied
14. Profits or loss, business     Form 1: Card 5, cols. 37-44               8           113-120
                                  Form 2: Card 4, cols. 29-36
                                  Form 3: Card 4, cols. 21-28
                                  Form 4: not supplied
15. Estate or Trust income        Form 1: Card 5, cols. 53-60               8           121-128
                                  Form 2: Card 4, cols. 45-52
                                  Form 3: Card 4, cols. 37-44
                                  Form 4: not supplied
16. Partnership income            Form 1: Card 5, cols. 45-52               8           129-136
                                  Form 2: Card 4, cols. 37-44
                                  Form 3: Card 4, cols. 29-36
                                  Form 4: not supplied
17. Other income                  Form 1: Card 5, cols. 61-68               8           137-144
                                  Forms 2 and 3: Card 4, cols. 53-60
                                  Form 4: Card 2, cols. 29-36
                                  (includes interest, dividends, etc.)
18. Total income                  Forms 1 and 4: Card 2, cols. 37-44        8           145-152
                                  Form 2: Card 1, cols. 67-74
                                  Form 3: Card 4, cols. 61-68
19. Business expenses             Form 1: Card 2, cols. 45-52               8           153-160
                                  Form 2: Card 2, cols. 13-20
                                  Form 3: Card 4, cols. 69-76
                                  Form 4: not supplied
20. Adjusted gross income         Form 1: Card 2, cols. 53-60               8           161-168
                                  Form 2: Card 2, cols. 21-28
                                  Form 3: Card 1, cols. 67-74
                                  Form 4: not supplied
21. Standard deduction            Form 1: Card 2, cols. 61-68               8           169-176
                                  Form 2: Card 2, cols. 29-36
                                  Form 3: Card 2, cols. 13-20
                                  Form 4: Card 2, cols. 45-52
22. Net taxable income            Form 1: Card 2, cols. 69-76               8           177-184
                                  Form 2: Card 2, cols. 61-68
                                  Form 3: Card 2, cols. 45-52
                                  Form 4: Card 2, cols. 53-60
23. Wisconsin Income Tax paid     Form 1: Card 3, cols. 61-68               8           185-192
                                  Form 2: Card 5, cols. 21-28
                                  Form 3: Card 5, cols. 37-44
                                  Form 4: Card 2, cols. 61-68
24. Union dues                    Form 1: Card 3, cols. 69-76               8           193-200
                                  Form 2: Card 5, cols. 29-36
                                  Form 3: Card 5, cols. 45-52
                                  Form 4: not supplied
25. Medical-dental expenses       Form 1: Card 3, cols. 53-60               8           201-208
                                  Form 2: Card 4, cols. 69-76
                                  Form 3: Card 5, cols. 21-28
                                  Form 4: not supplied
26. Interest paid                 Form 1: Card 5, cols. 69-76               8           209-216
                                  Other forms: not supplied
27. Non-business interest paid    Form 1: Card 3, cols. 45-52               8           217-224
                                  Form 2: Card 4, cols. 61-68
                                  Form 3: Card 5, cols. 13-20
                                  Form 4: not supplied
28. Other deductions              All forms: A-card, cols. 21-28            8           225-232
29. Alimony paid                  Form 1: Card 4, cols. 13-20               8           233-240
                                  Form 2: Card 5, cols. 37-44
                                  Form 3: Card 5, cols. 53-60
                                  Form 4: not supplied
30. Forest crop land              Form 1: Card 4, cols. 21-28               8           241-248
    expenditures                  Form 2: Card 5, cols. 45-52
                                  Form 3: Card 5, cols. 61-68
                                  Form 4: not supplied
31. Total deductions              Form 1: Card 4, cols. 29-36               8           249-256
                                  Form 2: Card 2, cols. 37-44
                                  Form 3: Card 2, cols. 21-28
                                  Form 4: not supplied
32. Net income before federal     Form 1: Card 4, cols. 37-44               8           257-264
    tax and donations             Forms 2, 3 and 4: not supplied
33. Federal income and Social     Form 1: Card 4, cols. 45-52               8           265-272
    Security Tax deductions       Other forms: not supplied
34. Net income before donations   Form 1: Card 4, cols. 53-60               8           273-280
                                  Form 2: Card 2, cols. 45-52
                                  Form 3: Card 2, cols. 29-36
                                  Form 4: not supplied
35. Donations                     Form 1: Card 4, cols. 61-68               8           281-288
                                  Form 2: Card 2, cols. 53-60
                                  Form 3: Card 2, cols. 37-44
                                  Form 4: not supplied
36. Personal exemptions           Form 1: Card 3, cols. 13-20               8           289-296
                                  Forms 2 and 4: Card 2, cols. 69-76
                                  Form 3: Card 2, cols. 53-60
37. Total tax                     Form 1: Card 3, cols. 21-28               8           297-304
                                  Forms 2 and 4: Card 3, cols. 13-20
                                  Form 3: Card 2, cols. 61-68
38. First installment             Form 1: Card 3, cols. 37-44               8           305-312
                                  Other forms: not supplied
39. Non-taxable income            All forms: A-cards, sum of cols.          8           313-320
                                  29-35, 37-43, 45-51, 53-59,
                                  61-67, 69-75
40. Social Security received      All forms: A-cards, cols. 13-20           8           321-328
41. Adjusted taxable income       All forms: L-cards, cols. 21-28           8           329-336
42. Total additional taxes        All forms: L-cards, cols. 61-68           8           337-344
43. Tax to other States           Form 1: Card 3, cols. 29-36               8           345-352
                                  Forms 2 and 4: Card 3, cols. 21-28
                                  Form 3: Card 2, cols. 69-76
44. Total payments and credits    Forms 1 and 4: not supplied               8           353-360
                                  Form 2: Card 3, cols. 29-36
                                  Form 3: Card 3, cols. 13-20
45. Casualty losses               Forms 1 and 4: not supplied               8           361-368
                                  Form 2: Card 5, cols. 13-20
                                  Form 3: Card 5, cols. 29-36
46. Unemployment compensation     Form 3: Card 3, cols. 45-52               8           369-376
                                  Other forms: may be present on A-card
47. Refund of Wisconsin income    Form 3: Card 4, cols. 45-52               8           377-384
    taxes                         Other forms: not supplied
48. Wisconsin Income tax          Form 3: Card 5, cols. 37-44               8           385-392
    withheld                      Other forms: not supplied
49. Type of form                  All forms: Card 1, col. 1                 1           393
50. "Completeness" indicator      Inserted by program; based on             1           394
                                  number of cards present for a
                                  return
51. Social Security number        All forms: Card 1, cols. 14-22            9           395-403
52. Blanks                        Inserted by program                       4           404-407
53. Record mark                   Inserted by program                       1           408

http://www.ssc.wisc.edu/wais/WAIS667030.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667030.txt

Bob Esterly, 1967. Elimination of Invalid Multiple FFID Numbers: Scope, Method, and Results. January 31, 1967. WAIS paper 667-023. Fixed Format Identification File (FFID).

Bob Esterly
WAIS 667-023
January 31, 1967

Elimination of Invalid Multiple FFID Numbers
Scope, Method, and Results

Scope

The elimination of all invalid Multiple Fixed Format Identification numbers (MID's) was quickly (for reasons of long-run time efficiency) broadened to include the development of a listing of valid MID's and the purging from the FFID file of other discrepancies noted. Valid MID's were defined as arising from cases of sons and daughters first filing while residing at the family home and then filing as a private household head.
These are primarily sons, though some daughters either (1) did not marry and remained in a WAIS namegroup, (2) married within a WAIS namegroup, or (3) were assigned a 70 series FFID number. Invalid MID's primarily arose from three situations: (1) filing errors where the returns were erroneously divided, (2) divorced wives who did not remarry and were originally given new FFID numbers and who should be reintegrated with their prior households, and (3) assignment of a second FFID number to an individual in the survey file (1 in 3rd digit) who already possessed an FFID number.

Method of Approach

A listing of all occurrences of the same SSA number with two FFID's was available. In those cases where both FFID's ended in the same two digits (or where one FFID was a survey number) the old files were pulled, integrated, and C, J, I, and N cards produced.* This procedure was both accurate and reasonably efficient. Unfortunately, it was not comprehensive. The invalid MID's eliminated did not include those for which one FFID entry or both did not include an SSA number or for which one SSA number was erroneous. The only way to approach this unknown remainder was to review the alphabetical listings for similarities of names, SSA numbers, and addresses. This introduced a subjective element of balancing the degree of similarity against the time involved to check the files. A high proportion of potential duplicates was investigated, yet some obvious cases were undoubtedly missed since, in large name groups, the absence of a middle initial in one entry could separate two entries by a page or more. In addition, recourse to the files did not always provide sufficient information for making a determination one way or another.

----------------------
*WAIS Paper 645-055; 656-052

During this phase, it was also appropriate to rectify other discrepancies noted in the coding since they would likely not be recognized except through an intensive examination of the FFID listing.
By far the most numerous of these were (1) improper assignment of the sex designating digit of the FFID number, and (2) assignment of either the husband's or wife's first name to both FFID's.

Elimination of invalid MID's was not the only factor requiring a perusal of the alpha-FFID listing. It was also desired to produce an accurate listing of valid MID's. Obviously, the "adjusted" SSA/duplicate FFID listing understated the number of valid MID's by those cases where no SSA number was reflected under one or both FFID numbers or where an SSA number was in error. Considering the de-emphasis of SSA numbers on some tax forms, filers' proclivity for incompleteness, and the fact that persons in some occupations have no SSA number, it was also necessary to investigate many two-FFID cases which resulted in no fewer FFID's, but for which SSA numbers had to be assigned in one of the two cases to assure the possibility of later extraction of a valid MID listing by matching on SSA number.

Results

Summary Count from Alpha-listings**
Valid MID's***             430
MID's eliminated           185
Name corrections            45
FFID number corrections     41
Other****                  148

-------------------------
**Figures represent persons, not FFID's, i.e., 185 ID's were eliminated out of 380 or so involved.
***Includes both obviously valid MID's from SSA match and those determined later to be valid and for which appropriate SSA number adjustments were made; may be understated by the number of MID's reflected on the SSA match but not recognized on the alpha-listing (correctable by tape-extract) and is definitely understated by those valid MID's not reflected on the SSA match and not recognized on the alpha-listing.
****Includes primarily "suspect" cases which were determined in fact to be separate individuals plus some adjustments in SSA numbers, addresses, replacement of outdated information by more recent, and other discrepancies.
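The SSA-match step described under Method of Approach can be illustrated with a minimal Python sketch. It is not the original procedure, which ran on listings and punched cards. The function name, the record layout, and the sample values are hypothetical; the test for a survey number (a "1" as the third digit) follows the description of the survey file given earlier in this paper.

```python
from collections import defaultdict


def find_mid_candidates(records):
    """Group FFID's by SSA number and report SSA numbers held by 2+ FFID's.

    `records` is an iterable of (ffid, ssa) string pairs. Entries with a
    blank or missing SSA number are skipped, which is the gap the memo
    describes: such cases cannot be found by matching on SSA number.
    """
    ffids_by_ssa = defaultdict(set)
    for ffid, ssa in records:
        if ssa:
            ffids_by_ssa[ssa].add(ffid)

    candidates = []
    for ssa in sorted(ffids_by_ssa):
        ffids = sorted(ffids_by_ssa[ssa])
        if len(ffids) < 2:
            continue
        # Both FFID's ending in the same two digits was the memo's trigger
        # for pulling and integrating the old files.
        same_suffix = len({f[-2:] for f in ffids}) == 1
        # Assumption: a survey-file FFID carries "1" as its third digit.
        has_survey = any(f[2] == "1" for f in ffids)
        candidates.append({
            "ssa": ssa,
            "ffids": ffids,
            "pull_files": same_suffix or has_survey,
        })
    return candidates


# Made-up sample listing (not WAIS data).
sample = [
    ("01023400", "111-11-1111"),
    ("01056700", "111-11-1111"),  # same SSA, same last two digits
    ("02034500", "222-22-2222"),
    ("02034501", "222-22-2222"),  # same SSA, different last two digits
    ("03000100", ""),             # no SSA number: invisible to this match
]
for row in find_mid_candidates(sample):
    print(row)
```

Pairs that are not flagged for pulling are not thereby valid; as the text notes, the remainder still had to be reviewed on the alphabetical listings.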
Revision of other files, particularly the new files, will be initiated when a final revised FFID listing is available.hahttp://www.ssc.wisc.edu/wais/WAIS667023.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS667023.txt/nAshok Bhargava 1967Analysis of Folder Shots - Required for Updating Old and New Master Files with Respect to Residual Tax Records for WAIS's Sample April 4, 1967c WAIs paper667-032(("Missing Data (Master File Records)hahttp://www.ssc.wisc.edu/wais/WAIS667032.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS667032.txt.-Some of the data in this document contains confidential information. The confidential information has been removed and preserved in the secure data archive (OLDR). Contact CDHA Data Archivist at cdhadata@ssc.wisc.edu for more information. Ashok Bhargava WAIS 667-032 April 4, 1967 Analysis of Folder Shots - Required for Updating Old and New Master Files With Respect to Residual Tax Records for WAIS's Sample WAIS 667-029 outlined a procedure for updating old and new Master Files with respect to Residual Tax Records for WAIS's sample. The procedure outlined was a multi-phased one. However, after half of phase I - part I it was considered advisable to analyze folder shots from 5 name groups. The reason for this step was to get an idea of the distribution among different types of cases of missing returns and the reasons thereof. The following constraints were kept in mind while choosing the name groups to be analyzed: 1. Family ID #'s in name groups should total roughly 2000. 2. Each of the three different types of name groups should be represented in proportion to their weightage in the population. 3. The name groups chosen should be distributed over the entire population. The following combination of name groups fulfilled all the constraints. 01 - B**** 08 - C**** - C**** 30 - R****,B. - E. 36 - S**** - S**** 50 - Z**** - Z**** Hopefully, we have got a representative sample. 
The missing year returns of different types were divided into 15 separate cases [see appendix A].

Cases 1 - 9: ID#'s in both new and old Master Files
Cases 10 - 12: ID#'s in old Master File only
Cases 13 - 15: ID#'s in new Master File only

These 15 cases were further sub-divided into males and females because the pay-off (in terms of returns found in the Tax Archives) is expected to differ between them.

Note: Cases not covered in the above 15 were included in the nearest approximating case with the higher pay-off.

Expected Pay-Off Table

Case   Males   Females
 1     None    None
 2     None    None
 3     Low     Low
 4     High    Low
 5     High    Low
 6     Low     Low
 7     High    High
 8     High    High
 9     High    High
10     High    High
11     High    Low
12     Low     Low
13     High    Low
14     High    Low
15     Low     Low

Percentages: Expected Pay-Offs

        Males   Females   Total
None     13.2      3.4     8.31
Low      60.8     89.5    73.00
High     26.0      7.1    18.69
        100.0    100.0   100.0

Analysis of Cases

Case 1: All returns present

Case 2: Only 1946 return missing. The 1946 returns were incomplete when the old Master File records were micro-filmed. They are not available in the Tax Archives now. Hence the expected pay-off is nil. The number of persons in the first two cases is 219 (8.31%). See Appendix B.

Case 3: All available after a block (n = 2) missing in initial years. The possible reasons for returns missing are:
a. Person may have moved into state
b. Wives may not have worked
c. Women married into sample
d. Person may have started working.
For all these reasons the initial year returns may be missing from the old Master File. The reasons given above indicate that the pay-off will be low for both males and females. [It is likely that the pay-off for females will be comparatively lower.] There are a substantial number of cases here, 437 (16.51%).

Case 4: Block missing between 1946-58 with returns available before and after. Reasons for block of returns missing may be:
a. May be out on field audit
b. Returns out of file at time of micro-filming for collection of delinquent returns.
c.
May have moved out of state temporarily (e.g. young people going to college in another state).
d. Insufficient income
e. May not have worked
The pay-off in this case is likely to be high in the case of males. But due to reason e. it should be comparatively lower in the case of females. The number of females (164) is greater than the number of males (126). Total 290 (10.95%).

Case 5: Years missing in two or more blocks in 1946-58. The reasons for returns missing are the same as in case 4. The conclusions are also the same. There are few cases in this category, 34 (1.30%). If there is any pay-off it will be in the case of males. Females are more likely to stop working for a year or two at a time.

Case 6: Block missing 1961-64. The reasons for returns missing can be:
a. Out of file for "office or field audit" at time of microfilming
b. Out for collection of delinquent taxes
c. Person may have died
d. Person may have moved out of state
e. Wives may not have worked (However, generally the wives' and husbands' returns will be missing together since they have to file together.)
The pay-off is not likely to be high here, in the case of males or females, unless many returns were out for "office or field audit" or to collect delinquent taxes. The number of cases in this category is 117 (4.46%).

Case 7: Years missing in two or more blocks in 1961-64. Reasons for returns missing may be:
a. Out of file for office audit
b. Out for collection of delinquent taxes
c. Wives may not have worked
d. Insufficient income
The pay-off here should be high in the case of both males and females because returns are filed together. The number in this case is small, 39 (1.38%).

Case 8: 1959-60 missing
Case 9: 1959 or 1960 missing
Reasons for returns missing in cases 8 and 9:
a. Out for collection of delinquent taxes
b. Out for office audit
c. Wives may not have worked
The pay-off should be high for both males and females - though it will probably be a bit lower for females for the third reason.
Number in case 8: 58 (2.20%)
Number in case 9: 89 (3.29%)
These together form a substantial sum, and are important years because of the overlap feature of the two files. Their absence with returns available before and after is hard to explain - thus the pay-off should be high.

Cases 10 - 12 refer to unmatched FFID's from Old Master File

Case 10: Block missing 1946-60 (and 1961-64 missing). The reasons for returns missing in this case are the same as for Case 3. But with returns available for 1959-60, it is hard to explain why they are not available in the new Master File, which also has the same years (1959-60) because of the overlap feature. Here the chances are high that the returns were out for office or field audit, or collection of delinquent returns. The pay-off should therefore be high for the few cases here, 32 (1.25%).

Case 11: Years missing in one or more blocks 1946-60. [All missing 1961-64]. The reasons for returns missing are the same as case 4. Similar to case 4, the pay-off will be high in the case of males and low for females. Also the reasoning about 1959-60 from case 10 above can be applied. This will tend to raise the pay-off somewhat. The number of cases here is 60 (2.45%).

Case 12: Block available in 1946-60 with returns missing before and after the block. Reasons for returns missing may be:
a. Person may have died after moving into state
b. Person may have moved into and out of state (e.g. only here while going to school).
c. Wives did not work before and after the years of returns available.
The number in this case is substantial, 452 (17.07%). Out of these, 227 have returns available only for one or two years. The pay-off here will be negligible. Even in the other cases the pay-off will be low, since these will generally be people who were in the state for a very short period. However, a large number of males may be out for office or field audit or for collection of delinquent taxes.

Cases 13-15 are from the new Master File only.
Case 13: All available 1959-64. [All missing 1946-58]. In this case the females, 197, far outnumber the males. Of the 197 females, 154 have ID#'s ending in NN. These are females who are first or second wives. There is incomplete information about them. It is most likely that their previous returns are available in the old file and the missing years can be cleaned up by a check of the old Master File. For the rest of the cases it is difficult to explain why no returns are available in the old Master File because of the overlap feature of the two files. The pay-off here will be high though the number left after removing the NN's is low, 73 (2.70%).

Case 14: Years missing in one or more blocks with returns available in between. The number of cases here is very small, 30 (1.18%), and most of these are females. The reason for the return missing here will be that females did not work. There will be a high pay-off for males and low for females.

Case 15: Years available with blocks missing on both sides, 1959-64. Reasons for returns missing:
a. Out for field audit
b. Out for collection of delinquent taxes.
c. Moved out of state
d. Came to state to go to school only.
The pay-off will be low unless reasons a and b are predominant. The number in this case is substantial, 564 (21.13%).

Most of the cases in the low pay-off category can be explained by internal checks from
a. Death Extract
b. Benefit File
c. New Master File Codes

The next stage would be to test these theoretical hypotheses with empirical data. For this we would have to go to the Income Tax office and micro-film the returns which we can find. On the basis of this empirical data we will have a better idea of what cases to stress (in case only partial completion of missing year returns is the aim).
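The percentages in the Expected Pay-Offs summary can be approximately re-derived by combining the Expected Pay-Off Table with the per-case counts of Appendix B. The Python sketch below is a cross-check only, not part of the original analysis. The dictionary and function names are hypothetical, and the female counts for cases 1 and 2 (17 and 28) are inferred from the column totals because those two rows are damaged in the surviving copy.

```python
# Expected pay-off per case as (males, females), from the Expected Pay-Off Table.
PAYOFF = {
    1: ("None", "None"), 2: ("None", "None"), 3: ("Low", "Low"),
    4: ("High", "Low"), 5: ("High", "Low"), 6: ("Low", "Low"),
    7: ("High", "High"), 8: ("High", "High"), 9: ("High", "High"),
    10: ("High", "High"), 11: ("High", "Low"), 12: ("Low", "Low"),
    13: ("High", "Low"), 14: ("High", "Low"), 15: ("Low", "Low"),
}

# Persons per case as (males, females), from Appendix B.
# Assumption: females in cases 1 and 2 are inferred from the totals.
COUNTS = {
    1: (53, 17), 2: (121, 28), 3: (253, 184), 4: (126, 164), 5: (18, 16),
    6: (62, 55), 7: (22, 17), 8: (36, 22), 9: (52, 37), 10: (16, 16),
    11: (35, 25), 12: (231, 221), 13: (30, 197), 14: (10, 20), 15: (257, 307),
}


def payoff_shares(sex_index):
    """Return percentage of persons per pay-off level (0 = males, 1 = females)."""
    totals = {"None": 0, "Low": 0, "High": 0}
    for case, counts in COUNTS.items():
        totals[PAYOFF[case][sex_index]] += counts[sex_index]
    n = sum(totals.values())
    return {level: round(100 * count / n, 1) for level, count in totals.items()}


print("Males:  ", payoff_shares(0))
print("Females:", payoff_shares(1))
```

Under these assumptions the shares land within a few tenths of a percentage point of the figures in the summary; the small differences presumably reflect rounding in the original hand tabulation.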
Appendix A

Cases of Returns Not Microfilmed in New and Old Master Files

Cases               46 47 48 49 50 51 52 53 54 55 56 57 58 59 60   59 60 61 62 63 64
Both M.F.'s
  1                  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1    1  1  1  1  1  1
  2                  0  1  1  1  1  1  1  1  1  1  1  1  1  1  1    1  1  1  1  1  1
  3                  0  0  0  0  1  1  1  1  1  1  1  1  1  1  1    1  1  1  1  1  1
  4                  0  0  0  0  1  1  1  0  0  0  0  1  1  1  1    1  1  1  1  1  1
  5                  0  0  0  0  1  0  0  1  0  0  1  1  1  1  1    1  1  1  1  1  1
  6                  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1    1  1  1  0  0  0
  7                  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1    1  1  0  1  0  1
  8                  1  1  1  1  1  1  1  1  1  1  1  1  1  0  0    0  0  1  1  1  1
  9                  1  1  1  1  1  1  1  1  1  1  1  1  1  0  1    0  1  1  1  1  1
Old M.F. (Unmatched FFID's)
  10                 0  0  0  1  1  1  1  1  1  1  1  1  1  1  1
  11                 0  0  0  1  0  0  1  0  1  1  1  0  1  1  1
  12                 0  0  0  1  1  1  1  0  0  0  0  0  0  0  0
New M.F.
  13                                                                1  1  1  1  1  1
  14                                                                0  1  0  1  0  0
  15                                                                0  1  1  1  0  0

M.F. - Master File
0 - Year returns missing
1 - Year returns available

Note: Cases not covered in the above categories are included in the case which is the nearest approximation and has the higher pay-off.

Appendix B

Distribution of ID#'s* Among Different Cases of Years Missing in the New and Old Master Files**

Cases    Males      %    Females      %    Total   % of Total
 1          53                17              70        2.68
 2         121                28             149        5.63
 (1-2)           13.2               3.4                 8.31
 3         253   19.3        184   14.1      437       16.51
 4         126    9.5        164   12.3      290       10.95
 5          18    1.4         16    1.3       34        1.30
 6          62    4.7         55    4.2      117        4.46
 7          22    1.7         17    1.2       39        1.38
 8          36    2.7         22    1.7       58        2.20
 9          52    3.9         37    2.7       89        3.29
10          16    1.2         16    1.2       32        1.25
11          35    2.6         25    1.8       60        2.45
12         231   17.4        221   16.7      452       17.07
13          30    2.3        197   14.7      227        8.52
14          10    0.8         20    1.6       30        1.18
15         257   19.4        307   23.1      564       21.13
Total     1322  100         1326  100       2648      100.00

*2004 Family ID#'s taken from name groups 01, 08, 30, 36, 50 of WAIS
**For meaning of different cases see Appendix A

Ashok Bhargava
April 27, 1967

Correction to WAIS 667-032

WAIS 667-032 had divided the missing years of returns into 15 different cases. Through an oversight, one important case was omitted. This correction introduces it as ...

Case 16 -- All returns missing from 1959-64. Folder exists in new master files.
[Table for cases 16a and 16b, with year columns as in Appendix A. The year-by-year alignment of the entries is not recoverable from this copy; the surviving entries are:]
16a   0 0 1 1 1 1 0 0 0 0 0
16b   0 0 0

Reasons for all returns missing:
a. Returns moved to fiduciary file (after person died).
b. Returns moved to delinquent file.
c. Doomage returns.
d. Returns out for field audit.
e. Income too low.
The pay-off in this case will be high where returns are out for audit or have been moved to the fiduciary file.

Bob Esterly 1967 A Note on Integration of Files April 5, 1967 WAIS paper 667-033 Maintenance System - Files, Data, Etc. Master File - Tax Records

Bob Esterly
WAIS 667 - 033
April 5, 1967

A NOTE ON INTEGRATION OF FILES

At the present time there exist several source files which might be integrated for more convenient access. Obviously, the microfilm prints for years 1959-64 will eventually be combined with microfilm prints for years 1946-59. In addition, there are at least three other information sources which could be included: namely, the interview cover sheets, benefit data file, and age data file.

I. Folder Organization

A proposed organization for the combined file folder is as follows:

Household Number (010XXX)
Head of Household File (010XXX)00
  Survey Cover Sheet*
  Benefit Data*
  Age Data*
  Demographic Code Sheet(s) - old file**
  FFID Code Sheet, I, N Cards***
  Property File Code Sheets*
  Returns, 1946-58 or 1959****
  Code Filing Logs*****
  Demographic Code Sheet - new file
  Returns, 1959 or 1960 - 1964 or 1965******
Spouse File (010XXX)10*******
  (Same data as above as applicable except for no code-filing log.)

-------------
*These data will exist only for selected individuals.
**On occasion, two demographic code sheets will appear for the years 1946-59. This could occur for an individual who was both dependent and household head during those years or for whom one ID number was found invalid and eliminated. The code sheets may be filed together in order of the years they cover.
***If several N Changes to either the original FFID code sheet or I Card exist, all N cards except the most recent may be discarded if the most recent can be determined. ****If a 1959 record exists in the old files, it is the source document for the tape record regardless of the existence of a 1959 return in the new files. *****The code-filing log is used only with the new returns. ******A 1965 return may or may not exist for a given individual. If it does exist it is not coded but is retained for reference. *******Joint husband-wife returns would be filed under the husband's ID number. In cases where a wife is identified but no returns are in the file, a "no wife return" form including ID and demographic data is prepared. These will eventually be punched and included in the file. -------------- Depending on the method of handling certain dependent situations (discussed later) there may also exist some dependent returns in the integrated parent household file. If so, they should be organized comparably with sons following in order after the spouse and daughters after sons. II. Assignment of Identification Numbers & File Integration Aside from having a convenient organization for reference purposes, the most important consideration in the integration process is the handling of identification numbers. The vast majority of WAIS taxpayers should have a unique ID number (e.g., an exception would be a wife divorced from one WAIS taxpayer but who remarried another.) Several categories of individuals should be noted: A. Some individuals not in a WAIS name group have been assigned a 70 name group code; these are largely persons who may be beneficiaries of WAIS taxpayers. A 70 code folder should be set up for these persons regardless of whether benefit source forms exist. 
Individuals who are beneficiaries may eventually draw benefits based on contributions from several individuals, either in or out of the WAIS sample, but this poses no problem for filing since the 70 code ID is unique. Another potential source of 70 code individuals is taxpayers (not in a WAIS name group) whose returns were erroneously microfilmed during the 1959-64 update. Whether these returns should be punched or not is a question to be resolved. B. A similar case exists for individuals included in the survey file but for whom WAIS has no tax returns. These persons have a 1000 series ID code. Again, folders should be set up for all 1000 series numbers assigned, the folders containing at least the survey cover sheets. C. Cases may occur where benefit, survey, or age data (and conceivably new tax returns) are retained under a number eliminated in the purge of invalid multiple ID's from the old file. These inconsistencies must be rectified at the time of integration on both the basic records and tape files. 1. If (as should generally be the case) the same ID deletion was made in both the old and new files, then the benefit, survey and age data can be adjusted as necessary. 2. If one of a pair of ID's was deleted in the old (or conversely, the new) file but no adjustment was made in the new (or, conversely, the old) file, then the unadjusted file (and, as necessary, the benefit, survey or age data files) should be brought into accord with the adjusted file. 3. If both old and new files show a deleted ID number but the adjustments contradict each other, then the returns should be re-examined and the appropriate adjustments made to the integrated files. It would seem that the integration process represents the surest check on the consistency of ID numbers assigned to various files. Old folders from which returns were transferred remain in the files with a notation as to the present location of the returns they once contained. 
New folders, of course, retain labels relating to all relevant ID numbers. In addition, new demographic code sheets indicate the presence of multiple ID numbers and, on the reverse, the duplicate numbers. The existence of these data in conjunction with coding sheets and punched cards relating to ID changes that have been made suggests the possibility of compiling a final list of ID number changes which could be used (for example) to revise the property file records.

D. The 1959-64 Coding-Filing Manual defines the procedures for assigning ID's to the new return folders:

1. Heads of households and spouses in old, or old and new files. Males and females with old head of household and spouse numbers retain the numbers in the new file (if it exists) and hence the integrated file may be placed in one folder.

2. Heads of household, spouses, dependents in new files only. Males and females with no old file numbers (new entrants to the sample) are assigned 4000 series codes (head of household and spouse) and may be retained in the integrated files in that format. (Note that no dependency digit - 01, 02, 11, 12, etc. - is utilized in the 4000 series code.) If a male head of household present in both old and new files is married in a "new file" year, his wife is given his ID number with "NN" in the last two digits since a check to see whether 10, 20, etc., is appropriate was, at the time of coding new files, inefficient. These "NN" wives should be assigned the correct final two digits at the time of integration.

3. Persons who are both dependents and household heads in old files and household heads in the new files. The new returns and some old returns will be filed under a household head number which should be the ID of the integrated folder; the dependent returns from the old file should be relocated (both physically and on tape) to this integrated file and the dependent ID eliminated from the tape.
Cases may arise where no new files exist (e.g., if the individual left the state); the dependent number should still be eliminated and the returns moved to the old head of household folder. 4. Persons who are dependents in the old files, and dependents or household heads in the new file. The old returns will carry a dependency number and the new returns a 4000 series number. The files should be integrated under the 4000 number and the old file dependency number eliminated. 5. Persons who are dependents in the old files and do not appear in the new files. These persons will have a unique, though "dependent" ID. Considering the sequential assignment of the 4000 series codes, it would be inadvisable to assign them such a code. The following alternatives exist: (a) Make no adjustments. The disadvantage is obviously that these would be the only "dependent" ID's, and the only "dependant" returns to remain in the parent's folder. (b) Assign a new code series (e.g., 6000) on a sequential basis utilizing as a guide unmatched old file dependent labels. This has the advantage of comparability with other integrated folders but the disadvantage of additional work in setting up the new folders . (c) Assign unused old file codes to the dependents and set up new folders. III. Other miscellaneous considerations A. Whenever returns are moved from one file folder to another (either dependents, divorced wives who did not remarry and were once assigned new ID's, or other cases of invalid multiple ID's), two rules should be observed: (1) remove no empty folders from the files, and (2) make notations on the old and new folders where the returns went and conversely where they came from. B. As noted earlier, both 1959 and 1960 returns may appear in both the old and new files. The old 1959 return is the source document for the tape file while the new 1960 return is the source document for that year. 
Therefore, in cases of duplication, the new 1959 return and the old 1960 return may be deleted after a check for comparability has been made. This would serve two purposes: (1) some additional relevant information (e.g., supplementary schedules) might have been microfilmed but not coded, (2) a final check to assure that old and new files belong to the same persons would be made. C. In the actual filing process, the 51 name group should precede the 24 name group. D. As with the assignment of ID numbers, the integration process affords a chance to make another check on the condition and accuracy of the files. The gummed labels for the new files indicate the years for which returns (1946-1960) for any given individual exist in the tape record. A visual check of the old returns at the time of integration would provide two important pieces of information: (1) returns in the folder but not in the tape record, and (2) information in the tape record for years for which no returns are present. Any discrepancies should be rectified by recoding, printing-out records from the Master File, and so forth. The observations included here are preliminary and intended to raise further comments and suggestions, of other problems or sources of information that should be included in the integrated files. 
It would seem that the combination of survey cover sheets, benefit and age data with the old files could be accomplished first with the expanded old file to then be combined with the new.hahttp://www.ssc.wisc.edu/wais/WAIS667033.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS667033.txt- John deVries 1967TM1959-1964 Wisconsin Income Tax Data Outline and Flow of Processing OperationsApril 11, 1967 WAIS paper667-035Data Processingc,,John deVries WAIS 667-035 April 11, 1967 1959 - 1964 WISCONSIN INCOME TAX DATA OUTLINE AND FLOW OF PROCESSING OPERATIONS This paper contains essentially the table of contents for the detailed manual of operations, which is currently in preparation and which will, hopefully, be ready in May, 1967. It is followed by a flow-chart indicating the suggested flow of work. Some remarks regarding the paper: 1. The sets of numbers preceding the job title are: a. cross references to the sections in the later instruction manual describing the jobs in detail; b, cross references to a system used to mark output produced by the various jobs, which will help us to identify the different outputs. 2. Most operations will be done per name group (i.e. almost all jobs will be done 52 times); only the jobs in section 14 will be done on all name groups together. 3. The system is set up in such a way that many jobs are "paired": the first job of a pair will be diagnostic in nature (e.g. running a program on a data file to check for certain types of errors); the second job in the pair will be corrective (to correct the errors found in the first job). This implies that the second job in a pair will not be necessary if the first job in the pair did not reveal any errors. In this paper, such a situation will be indicated by parentheses around the number of the second job in a pair. Jobs with numbers in parentheses are: a. corrections of errors found in the job immediately before it; b. 
may not be necessary

Section 4: Processing of Identification Cards
4.1 Listing of original identification cards.
4.2 Sorting of identification cards [by identification code, col. 1].
(4.3) Elimination or correction of incorrect identification cards [with incorrect identification code in column 1].
4.4 Sort of identification cards [by ID-number, cols. 2-9].
(4.5) Correction of incorrect identification cards [with incorrect ID numbers].
4.6 Listing of sorted identification cards.
4.7 Separation of identification cards [by card number, col. 80].
(4.8) Correction of incorrect I-cards [with incorrect card number].
4.9 Card edits for type 1 I-cards [i.e. those with "1" in col. 80; job includes correction of the errors found, as well as verification of the corrections].
4.10 Card edits for type 2 I-cards [with "2" in col. 80; job includes correction and verification].
4.11 I-cards to tape conversion [produces record with format as described in WAIS 667-030; checks for missing and superfluous cards].
(4.12) Correction of incorrect I-tape records [caused by mismatches, missing cards, superfluous cards, etc.].
4.13 Check on multiple social security numbers.
(4.14) Correction of incorrect records on I-tape [more than one person using the same social security number].
4.15 Husband-wife consistency checks on I-record.
(4.16) Correction of husband-wife inconsistencies.
4.17 Sort and merge of I-records [accumulation of completed name group files].

Section 5: Initial Processing of Summary Cards
5.1 Listing of summary cards [including M-, S-, L- and A-cards].
5.2 Separation of summary cards [by identification code, col. 1].
(5.3) Correction of incorrect summary cards [those with incorrect identification code in column 1].

Section 5a: Processing of M-Cards
5.4 Sort of M-cards [by card number, col. 2, and ID-number, cols. 3-10].
(5.5) Correction of incorrect M-cards [with incorrect ID-numbers].
5.6 Listing of sorted M-cards.
5.7 Check for missing and superfluous M-cards.
(5.8) Correction, addition or elimination of incorrect M-cards [mispunches, miscodes, missing or superfluous cards].
5.9 M-card edits [includes correction and verification].
5.10 M-cards to tape conversion.
5.11 Check for multiple social security numbers.
(5.12) Correction of incorrect M-records [more than one person using the same social security number].
5.13 Check for duplicate "secondary" ID-numbers.
(5.14) Correction of incorrect M-records [more than one person with the same "secondary" ID-number, or the "secondary" number of one person identical to the "primary" number of someone else].
5.15 Sort and merge of M-records [accumulation of completed name group files].

Section 6: Processing of S-Cards
6.1 Sort of S-cards [by card number, col. 2, and ID-number, cols. 3-10].
(6.2) Correction of incorrect S-cards [incorrect card numbers or incorrect ID-numbers].
6.3 Listing of sorted S-cards.
6.4 Check for missing or superfluous cards.
(6.5) Correction, deletion or addition of S-cards [found to be miscoded, mispunched, superfluous or missing].
6.6 S-card edits [includes correction and verification].
6.7 S-card to tape conversion.
6.8 Check for duplicate social security numbers ["primary" as well as "secondary"].
(6.9) Correction of incorrect S-records [duplication of social security numbers, "primary" and "secondary"].
6.10 Sort and merge of S-records [accumulation of completed name group files].

Section 7: Processing of A-Cards
7.1 Sort of A-cards [by card number, col. 2, year, cols. 11-12, and ID-number, cols. 3-10].
(7.2) Correction of incorrect A-cards [incorrect card number, year, or ID-number].
7.3 Listing of sorted A-cards.
7.4 Check for missing or superfluous cards.
(7.5) Correction, deletion or addition of A-cards [for miscoded, mispunched, superfluous or missing cards].
7.6 A-card edits [includes correction and verification].
7.7 A-card to tape conversion [accumulation of completed name group files].
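
As an illustration of the sort keys named in job 7.1, the following Python sketch orders 80-column card images by card number (col. 2), year (cols. 11-12), and ID-number (cols. 3-10). It is a modern stand-in, not the original sorter procedure. Treating the three fields as major-to-minor keys in the order listed is an assumption, and the helper names and sample cards are made up for the example.

```python
def field(card, first, last):
    """Return columns first..last of a card image (1-indexed, inclusive)."""
    return card[first - 1:last]


def a_card_sort_key(card):
    """Sort key for job 7.1: card number, then year, then ID-number.

    Assumption: the fields are applied major-to-minor in the order the
    outline lists them.
    """
    return (field(card, 2, 2), field(card, 11, 12), field(card, 3, 10))


def sort_a_cards(cards):
    """Return the card images in job 7.1 order without modifying the input."""
    return sorted(cards, key=a_card_sort_key)


def make_card(card_no, id_number, year):
    """Build a hypothetical 80-column A-card image for testing."""
    return ("A" + card_no + id_number + year).ljust(80)


# Made-up deck (not WAIS data).
deck = [
    make_card("1", "01023400", "61"),
    make_card("2", "01000100", "59"),
    make_card("1", "01000100", "61"),
    make_card("1", "01023400", "59"),
]
for card in sort_a_cards(deck):
    print(card[:12])
```

The same key function could be reused for the L-card and form 1-4 sorts, which the outline describes with identical columns.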
Section 8: Processing of L-Cards
8.1 Sort of L-cards [by card number, col. 2, year, cols. 11-12, and ID-number, cols. 3-10].
(8.2) Correction of incorrect L-cards [with incorrect card number, year, or ID-number].
8.3 Listing of sorted L-cards.
8.4 Check for missing or superfluous cards.
(8.5) Correction, deletion or addition of L-cards [miscodes, mispunches, superfluous or missing cards].
8.6 L-card edit [includes correction and verification].
8.7 L-card to tape conversion [accumulation of completed name group files].

Section 9: Processing of Form 1 Summary Cards
9.1 Sort of form 1 cards [by card number, col. 2, year, cols. 11-12, and ID-number, cols. 3-10].
(9.2) Correction of incorrect cards [with incorrect number, year or ID-number].
9.3 Listing of sorted cards.
9.4 Check on superfluous or missing cards.
(9.5) Correction, deletion or addition of cards [miscoded, mispunched, superfluous or missing cards].
9.6 Separation of form 1 summary cards [by card number, col. 2].
9.7 Card edits card 1 [including correction and verification].
9.8 Card edits card 2.
9.9 Card edits card 3.
9.10 Card edits card 4.
9.11 Card edits card 5.
9.12 Creation of "form 1 tape-record" [according to format in WAIS 667-030].

Section 10: Processing of Form 2 Summary Cards
10.1 Sort of form 2 cards [by card number, col. 2, year, cols. 11-12, and ID-number, cols. 3-10].
(10.2) Correction of incorrect cards [with incorrect card number, year or ID-number].
10.3 Listing of sorted cards.
10.4 Check on missing or superfluous cards.
(10.5) Correction, deletion and addition of cards [miscoded, mispunched, superfluous or missing cards].
10.6 Separation of form 2 summary cards [by card number, col. 2].
10.7 Card edits card 1.
10.8 Card edits card 2.
10.9 Card edits card 3.
10.10 Card edits card 4.
10.11 Card edits card 5.
10.12 Creation of form 2 tape-record [according to format in WAIS 667-030].

Section 11: Processing of Form 3 Summary Cards
11.1 Sort of form 3 cards [by card number, col. 2, year, cols.
11-12, and ID-number, cols. 3-10].
(11.2) Correction of incorrect cards [with incorrect card number, year or ID-number].
11.3 Listing of sorted cards.
11.4 Check on missing or superfluous cards.
(11.5) Correction, addition or deletion of cards [miscoded, mispunched, missing or superfluous cards].
11.6 Separation of form 3 summary cards [by card number, col. 2].
11.7 Card edits card 1.
11.8 Card edits card 2.
11.9 Card edits card 3.
11.10 Card edits card 4.
11.11 Card edits card 5.
11.12 Creation of form 3 tape-record [according to format in WAIS 667-030].

Section 12: Processing of Form 4 Summary Cards
12.1 Sort of form 4 cards [by card number, col. 2, year, cols. 11-12, and ID-number, cols. 3-10].
(12.2) Correction of incorrect cards [with incorrect card number, year or ID-number].
12.3 Listing of sorted cards.
12.4 Check on missing or superfluous cards.
(12.5) Correction, addition or deletion of cards [miscoded, mispunched, missing or superfluous cards].
12.6 Separation of cards [by card number, col. 2].
12.7 Card edits card 1.
12.8 Card edits card 2.
12.9 Card edits card 3.
12.10 Creation of form 4 tape-record [according to format in WAIS 667-030].

Section 13: Processing of Preliminary Master File (by name-groups)
13.1 Merge of summary tape-records form 1, 2, 3, and 4 [by ID-number and year; contains check on duplicate records and "mismatches"].
(13.2) Correction, addition or deletion of incorrect records [duplicates, inconsistencies].
13.3 Check for duplicate Social Security number.
(13.4) Correction of incorrect records [more than one person using the same Social Security number].
13.5 Addition of data from L- and A- records.
(13.6) Correction of incorrect records [mismatches between summary records and L-, resp. A- records, missing records, etc.].
13.7 Internal consistency checks [checks for errors in calculation of intermediate amounts, cf. Geffert's CONSIST-01].
(13.8) Correction of incorrect records [cases where "incorrect computation" was actually caused by mispunches, etc.].
13.9 Inter-year consistency checks [checks for "legitimate" presence or absence of records, consistency of demographic codes, etc.].
(13.10) Correction of incorrect records [inconsistencies due to miscoded or mispunched data; corrections for cases where inconsistencies can be resolved by code expansion, etc.].
13.11 Husband-wife consistency checks.
(13.12) Correction of incorrect records [inconsistencies between coded fields for husband and wife caused by mispunches, miscodes, etc.].
13.13 Sort and merge of summary-records [accumulation of completed name-group files].

Section 14: Processing of Completed Files
14.1 Consistency-check "new" vs. "old" master [checks for presence and absence of records; compares coded data, etc.].
(14.2) Correction of inconsistencies [errors found in "old" or "new" master].
14.3 Consistency-check "new" master vs. "new" ID-file [checks for presence and absence of records; compares coded data, Social Security numbers, etc.].
(14.4) Correction of inconsistencies [errors found in master file or in ID-file].
14.5 Check of M-file vs. "new" master [mainly checks for presence or absence of records].
(14.6) Correction of errors [mainly errors found in M-file].
14.7 Check of S-file vs. "new" master [mainly checks for presence or absence of records].
(14.8) Correction of errors [mainly correction of errors in S-file].
14.9 Merge of "old" and "new" ID-files [partly updating of existing records on old file, partly addition of new records; involves transformation of "old" file to format compatible with the new format, as well as some consistency checks].
(14.10) Correction of errors in combined ID-file [inconsistencies found during file merge].
14.11 Merge of "old" and "new" master files [only possible if transformations on the "old" master have produced records in a format compatible with the "new" records; involves some consistency checking].
(14.12) Correction of errors in combined master file [found during the file merge].

http://www.ssc.wisc.edu/wais/WAIS667035.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667035.txt

John deVries. 1967. 1959-1964 Wisconsin Income Tax Data Compatibility of "Old" and "New" Master Files. April 21, 1967. WAIS paper 667-037. Master File - Tax Records.

John deVries
WAIS 667-037
April 21, 1967

1959-1964 Wisconsin Income Tax Data
Compatibility of "Old" and "New" Master Files

If a smooth merge between our existing master file and the 1959-1964 master file is desired, and if that merge is supposed to take place as soon as possible after completion of the 1959-1964 master file, we should now begin to prepare the existing master file for that merge. One of the recent discoveries in social science data processing is that merges between large files require a high degree of "cleanliness" in each of the component files (this is popularly known as "Day's law"). While for the 1959-1964 addition special care has been taken to maintain rigid control over data quality, the existing master file is known to contain errors and unresolved inconsistencies, to be missing certain records, etc. (See WAIS 667-034 for an extensive listing of items which are known to be wrong.) There is also the consideration that the two files, in order to be merged, must have identical formats (this may seem to be a large job, but the reduction in programming effort will make it worthwhile; also, several standard programs require files to have standard formats for their records).
In an earlier paper (WAIS 667-015, Section 3.3) I summed up which changes would have to be made on the existing master file; this paper goes one step further in that the job to be done is broken down into phases, where these phases must be done sequentially. I propose that the preparation of the existing master file be done in the following steps:

Phase 1: General efforts to increase completeness and consistency:
a) elimination of multiple ID cases (see WAIS 667-034, Section 1);
b) various checks to find and resolve inconsistencies, and to find indications of possible missing records:
(i) tax returns not on Master (667-034, Section 2);
(ii) husband-wife checks (667-034, Section 10);
(iii) inter-year intra-person consistency checks;
c) microfilming of missing returns (667-029, 667-032, 667-034, Section 4) and addition of these records to the Master file.

Phase 2: Machine recoding of first set of demographic codes. (This may, and in my opinion should, include intra-record and inter-record consistency checks on coded data in addition to the ones run in phase 1b.) Fields to be recoded in this set are:
(i) residence location (field 10 in new data) - residual by hand;
(ii) non-resident indicator (field 13A) - residual by hand;
(iii) occupation code (field 14) - residual by hand;
(iv) return filed previous year (field 17);
(v) marital status code (field 18) - residual by hand;
(vi) spouse separate income (field 19) - residual by hand;
(vii) information re recent marriage (field 20) - residual by hand;
(viii) dissolution of marriage (field 20A) - residual by hand;
(ix) next year field indicator (field 31) - residual by hand.

Phase 3: Manual recoding of residuals from Phase 2, as well as correction of inconsistencies found.
Phase 4: Machine recoding of second set of demographic codes:
(i) address change (field 13) - residual by hand;
(ii) industry code (field 15) - residual by hand;
(iii) labor force participation for previous year (field 17A) - residual by hand;
(iv) marital status consistency indicator (field 18A) - residual by hand;
(v) spouse's income reliability indicator (field 19A) - residual by hand.

Phase 5: Manual recoding of residuals from phase 4, as well as correction of additional inconsistencies found.

Phase 6: Machine recoding of remaining fields:
(i) occupation change (field 16) - residual by hand;
(ii) any additional machine-transformation which can be done (phases 2-6 will undoubtedly reveal jobs not included in this expose).

Phase 7: Manual correction of remaining errors and inconsistencies (includes residuals from phase 6).

Phase 8: Machine-transformation of master file to format compatible with new data-format.

http://www.ssc.wisc.edu/wais/WAIS667037.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667037.txt

John deVries, Wynn Bussmann. 1967. Processing 1959-1964 Wisconsin Income Tax Returns: Property Income File Coding Manual. April 24, 1967. WAIS paper 667-038. Property File.

John DeVries
Wynn V. Bussman
WAIS 667-038
April 24, 1967
Draft

PROCESSING 1959-1964 WISCONSIN INCOME TAX RETURNS: PROPERTY INCOME FILE CODING MANUAL

1. Introduction

The Wisconsin Assets and Incomes Study (WAIS) was begun in 1960. Its main purpose is to gather data from Wisconsin income tax returns, process these data into a form which can readily be analysed, and then use the data to find out more about human behavior.
With grants of money from the National Science Foundation, The National Bureau of Economic Research, and The Brookings Institution, WAIS microfilmed the tax returns of a sample of about 20,000 Wisconsin taxpayers for the years 1946-1960, coded up the desired information from the tax returns, and finally had this information transferred to tape files, in which form it now exists. The following is a list of the basic tape files of WAIS: (1) The Master File containing much of the summary information on the tax returns (e.g. income totals from various sources, number of exemptions, total taxes, etc.),* (2) The Property Income File containing specific data on income from assets (interest, dividends, rent, capital gains, farm, business, and professional income) reported by persons in the Master File, (3) The Social Security Account File containing age, race, and earnings data on most people in the Master File, (4) The Personal Interview File containing data from personal interviews of some 1300 people in the Master File, and, (5) Firm Data Files containing data on stock prices and dividends of firms whose stocks (bonds, etc.) were (or may have been) owned by taxpayers in our sample. ------------- *See WAIS Monograph 2 by M. Eugene Moyer, pp. 84-85 for a complete list of Master File information. This monograph also contains general information about WAIS which may be of interest to the reader. ------------- Some work remains to be done on the files before they are completely ready to be used for analysis. Eventually all of the files either will be merged or will be interconnected so that studies can be made using one or all of the files at once. In the meantime, however, WAIS has acquired more funds for updating the files to 1964. The 1959-1964 tax returns have been microfilmed, and prints of the tax returns are in manila folders. 
The coding of the Master File information on these new tax returns is nearing completion, and the coding of the Property Income File information will be begun in the near future (within a year of this writing). Once these new property income data are processed (coded, keypunched, transferred to tape, and edited for errors), they will be integrated with the old Property Income File to provide a longer time span of data. When the data files of WAIS are finally completed, they will provide an extremely valuable source of information -- especially so because it is a rare event when tax return information is available for analysis. Already, the WAIS data have been used in two Ph. D. dissertations; there are several more in the works, and there are plans for still more theses, papers, and possibly several books. This paper is intended to provide instructions for the coding of the property income data for the new (1959-1964) tax returns. All coders working on updating the Property Income File should be thoroughly familiar with the instructions below before starting coding. For further information, the reader is referred to WAIS 645-038 (Coding of Old Tax Returns) and WAIS 667-002 (Updating Master File). 2. General instructions 2.1. Materials needed 2.1.1. Folder(s) You will work with a group of manila folders, each of which contains (microfilm prints of) all the tax returns which were available for the individual or married couple whose name(s) is (are) written on the top of the folder. In some cases, the folder contains the returns of only one person, in which case his (her) name is the only one to appear at the top of the folder. In other cases, the folder contains the returns of two or more people (a married man plus all the women to whom he has been/is married), in which case the names of all these people who filed tax returns should appear at the top of the folder. 
In any case, each folder should contain all the returns for every person whose name is mentioned on the folder and who filed returns. In addition to containing tax returns, the folders will also contain coding sheets (one for each person in the folder) and a filing/coding log sheet. Both these items should be located in the front of the folder. Report to the supervisor if any of the above items are missing from the folder. Before coding, make sure that any returns that are in the folder are listed on the codesheet as having been coded. Also make the reverse check; that is, if the coding sheet indicates that a return should be present for a particular year, make sure that the return for that year is in the folder. Report any discrepancies to the supervisor! After completing coding all the returns in the folder, fill in the required information on the filing/coding log on the line, "Details and Assets Coding," and leave the filing/coding log in the front of the folder. See section 2.3.4. below. All folders are filed in locked cabinets in room 7130. The cabinets should be locked before and after you extract folders from them; the coding supervisor has the keys to the cabinets. The folders are arranged by "name-group." Within each name-group, they are ordered by identification number. A name-group may fill more than one drawer, and in some cases a drawer may contain more than one name-group. When you begin the coding, the coding supervisor will give you a list of the identification numbers of the folders that you are to code. When you pull these folders out of the file drawers in room 7130, check off each one on your list. If you cannot find a folder, report the number to the coding supervisor. 2.1.2. Coding sheets You will receive a stack of coding sheets on which certain information on the tax returns is to be coded. All coding will be done on these coding sheets. There will be a coding sheet for a master information card for each taxpayer. 
There will also be a different coding sheet for each type of property income for each taxpayer. Copies of the coding sheets are reproduced in section 2.3.3. of this paper. 2.1.3. Codebook You will be working with a codebook containing detailed coding instructions for every field which has to be coded, plus information which will help you locate each item on the tax returns. The codebook is reproduced as section 3 of this paper. As the coding progresses, changes may have to be made in the coding instructions. If changes do occur, you will be told what to change; often it will mean that you have to replace a sheet in the codebook by a revised version. It is essential that you make all changes immediately and that you always code the returns according to the most recent coding instructions. A master codebook will be kept by the supervisor; it will contain every change in the coding. Your personal codebook must be an exact copy of the master codebook. Every day, before you start coding, check with the coding supervisor for any changes in the master codebook that have been made from the previous day. If there are changes to be made in your codebook, be sure that after they have been made, your codebook and the master codebook agree in every detail. You can be sure of this by comparing the notes at the top of each page in your codebook with the notes at the top of each page in the master codebook. The notes will be at the top right of each page and will say, "Revision (date)." If these notes are the same for your codebook and the master codebook, the two are in agreement. Do not make any changes on your own initiative; whenever you feel that instructions for a particular field are incomplete or inaccurate, mention this to the coding supervisor so that, if any changes are to be made, all codebooks can be modified at the same time and in the same way. 2.1.4. Lists of firm issue identification numbers You will, in general, be coding information on income from assets (e.g. 
dividends, rent, etc.) received by the taxpayers. Part of this task involves assigning codes (identification numbers) for firms' names. You will be given lists of firms' names along with the identification numbers already assigned to them during the coding of the old tax returns. When you come upon a tax return which lists, for instance, dividends from a particular company, you should search the existing lists for that company's name. If you find the company in one of the lists, then you will code the identification number already issued to the company. If you do not find the company on any of the lists which you have, you must see the coding supervisor who will then assign that company an identification number. The supervisor will keep a record of all such numbers assigned, and you will periodically receive copies of the new numbers and firms' names. These lists you should keep with the lists already in your possession. Do not forget to look for firms on these lists in addition to the original lists before seeing the coding supervisor for an identification number; do not assign identification numbers on your own. 2.2. Coding procedure 2.2.1. Read the codebook and other information and ask questions Before you start any coding, read the codebook and all other coding instructions carefully. Make sure that you understand everything. If you have any questions, be sure to ask someone on the supervisory staff; make sure that someone answers your questions to your own satisfaction. In short, be a pest about asking questions and obtaining answers -- anytime and of anyone. Also make sure that you have all the items you are supposed to have -- especially make sure that your codebook is complete and up to date -- and that you know where to find additional material when you need it. 2.2.2. "Test-coding" When you are certain that you are sufficiently prepared to begin the coding work, notify the supervisor; you will be given a number of returns to be coded.
These returns will be typical of the returns that you will be coding. Several returns may contain special coding difficulties. If you come to a place where there is a little doubt in your mind as to what you should do, STOP! Ask the supervisor for help. Do not make any arbitrary decisions, on your own -- let the supervisory staff make all the mistakes! When you have completed the coding of these test cases, take them to the supervisor who will check your results, discuss coding problems, and explain points which turned out not to be clear. Please note that this is not an "entrance exam;" its main functions are to enable you to discover the specific problems before you start the actual coding and to enable the supervisory staff to ensure that you fully understand your job. Work carefully and write clearly. Do not be afraid to take the time to do it right the first time! After you complete the test-coding, the supervisor will assign a list of folders to you and let you begin to code the regular files. 2.2.3. The regular coding (Since the codesheets have not yet been designed, this section, which will provide some general instructions about the codesheets, cannot yet be written.) 2.2.4. Logging When you have completed all the coding for a folder, write the date and your initials on the line marked "Details and Assets Coding" on the filing/coding log sheet that you will find in the folder. If any problems arose in the coding, make a note under the column "Remarks;" use the back of the sheet if necessary. This will enable us later, if any discrepancies are found, to look at the "Coding History" and possibly explain such discrepancies. 2.3. Miscellaneous instructions 2.3.1. 
"Form S" file If, during the coding of a return, you find that an arbitrary decision must be made about the code to be used, consult with the supervisor; after an agreement has been reached regarding the proper code to be used, fill out a "Form S" indicating the following information: (a) the taxpayer's ID number (b) the year of the returns (c) the number of the field giving you trouble (d) the code you used (e) the exact answer as you found it on the tax form. Do not file these "Form S" sheets in the folders, but give them to the supervisor. Make a note in the "Remarks" column on the filing/coding log sheet indicating that you coded a "Form S" for this taxpayer. 2.3.2. Maintenance of files The folders are filed in the drawers in a specific order (by identification number). If you take folders out of the drawer to be coded, be careful to re-file them in the correct order and then lock the file drawer. Work on the folders in one drawer at a time. When you pull out those folders in a drawer which you will be working on, take one of the cards in the front of the drawer and enter the following information on it: (a) the ID number of the first folder you pulled out (b) the ID number of the last folder you pulled out (c) the date (d) your name (e) "P.F. update coding". Place this card in the front of the drawer. When you have completed coding the folders in the drawer, replace them in the proper order and cross out the information on the card in the front of the drawer, leaving the card in the drawer. 2.3.3. Coding conventions If any field exceeds the maximum size permitted (8 digits for amount fields), do the following: Enter "999,...,97" (i.e., all 9's except for a 7 in the last digit on the right) and fill out a Form S with the pertinent information. 3. Codebook The following pages contain lists of the items which we propose to code from the 1959-1964 tax returns.
The writing of a codebook containing detailed and explicit instructions for coding each item is still in the preliminary planning stages. All monetary amounts will be recorded in dollars and cents. For 1959-1961 tax returns, the interest and dividend amounts are found in Schedules D and E on page 3 of Form 1, the rental income amounts are found in Schedule F on page 3 of Form 1, and the capital gain or loss amounts are found in Schedule G on page 3 of Form 1. The farm and business income forms are the same for all years, 1959-1964. The farm income amounts (cash basis) are found on page 1 of Form 1-Fc with the exception of the original cost of fixed assets and the depreciation allowed in prior years, which are found in the Depreciation Schedule on page 2 of Form 1-Fc. The farm income amounts (accrual basis) are found on page 1 of Form 1-Fi with the exception of the original cost of fixed assets and the depreciation allowed in prior years, which are found in the Depreciation Schedule on page 2 of Form 1-Fi. For 1962-1964 tax returns, the interest and dividends amounts are found on page 2 of Form 1, the rental income amounts are found on page 2 and page 4 (Schedule B) of Form 1, the capital gain or loss amounts are found in Schedule C on page 4 of Form 1, and the business or professional income amounts are found on page 1 of Form 1 B. WAIS will be glad to supply tax forms if they are desired.

3.1 Card 1 - Master control card
1) Taxpayer identification number
2) Social security number
3) For each year between 1959 and 1964:
a) Number of entries for interest and dividends
b) Number of entries for rental income
c) Number of entries for gains/losses from the sale of assets
d) Number of entries for farm income and expenses
e) Number of entries for profit/loss from a business or profession

3.2 Card 2 - Interest and dividends
1) Taxpayer identification number
2) Year
3) Asset type
4) Asset identification number
5) Interest or dividends received for the year

3.3 Card 3 - Rental income
1) Taxpayer identification number
2) Year
3) Cost of buildings & other property
4) Dates of acquisition of buildings and other property
5) Depreciation for the current year
6) Depreciation for prior years
7) Interest and repairs
8) Property taxes paid
9) Gross rent received
10) Proportion occupied by owner

3.4 Card 4 - Gains and losses on sales of assets
1) Taxpayer identification number
2) Year
3) Asset type
4) Asset identification number
5) Date acquired (year - month)
6) Method of acquiring
7) Date sold (year - month)
8) Gross sales price
9) Depreciation allowed in prior years
10) Original cost for income tax purposes
11) Subsequent improvements
12) Expense of sale
13) Amount of gain or loss
14) Was residence replaced by another?

3.5 Card 5 - Farm income and expenses (cash basis)
1) Taxpayer identification number
2) Year
3) Live on farm? (Yes/no/not available)
4) Total number of acres on farms
5) Amount of sale of livestock raised
6) Amount of sale of produce raised
7) Other farm income
8) Profit on sale of livestock and other items purchased
9) Gross profit
10) Expenses
11) Depreciation
12) Total deductions
13) Net farm profit
14) Interest paid
15) Original cost of fixed assets
16) Depreciation deducted in prior years

3.6 Card 6 - Farm income and expenses (accrual basis)
1) Taxpayer identification number
2) Year
3) Live on farm?
(Yes/no/not available)
4) Total number of acres on farms
5) Inventory at end of year
6) Sales during year
7) Other miscellaneous receipts
8) Inventory at beginning of year
9) Gross profit
10) Expenses
11) Depreciation
12) Total deductions
13) Net farm profit
14) Interest paid
15) Original cost of fixed assets
16) Depreciation deducted in prior years

3.7 Card 7 - Profit/loss from business or profession
1) Taxpayer identification number
2) Year
3) Kind of business
4) Proprietor previous year?
5) Status previous year
6) Total receipts (less allowances, rebates, returns)
7) Inventory at the beginning of the year
8) Inventory at the end of the year
9) Gross profit
10) Total business income
11) Rent on business property
12) Interest on business indebtedness
13) Depreciation
14) Depletion of mines, timber, etc.
15) Net profit (or loss) from business
16) Initial cost of fixed assets
17) Depreciation allowed in previous years

http://www.ssc.wisc.edu/wais/WAIS667038.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667038.txt

Bob Esterly. 1967. WICS: WAIS Inventory and Control System. May 2, 1967. WAIS paper 667-039. Administration.

Bob Esterly
WAIS 667-039
Preliminary
May 2, 1967

WICS: WAIS INVENTORY & CONTROL SYSTEM

1. Introduction: During the progress of the WAIS project the desirability of creating an operational inventory and control system for WAIS documents has been observed repeatedly. WAIS papers on the subject include 656-054, M. Von Schneidemesser, A Proposal for Documenting at the Macro Level the Programs and Files of WAIS; 667-014, R. Bauman, On a WAIS Inventory and a Proposal for a Finder File; and 667-016, M. Von Schneidemesser, Documentation and Housekeeping Procedures for WAIS Staff. Concrete progress can be pointed to in the form of the Program/File Catalog created under WAIS 667-016 and the Erasable Tape Use Chart. Still, an adequate inventory does not exist and control is not rigorous.
WICS is offered for comment, criticism and revision: at a minimum, it may serve as a foothold for further progress.

2. The General Plan of Control and Inventory

2.1 WAIS Documents and Files to be Inventoried

2.1.1 WAIS 667-014 may be referred to for a comprehensive outline of documents, programs and files. The following list is general and abbreviated but convenient:
a. Tapes: All tapes are produced by a processing operation, though the complexity of these operations may vary greatly.
b. Data Card Decks: (1) Card decks may be punched directly from source documents or coding forms or may be created through a processing operation. (2) Program and control card decks should be differentiated from data card decks for the purpose of inventory and control.
c. Data Listings: All listings are produced by a processing operation.
d. Programs: (1) Programs may include narrative materials, flow charts, code sheets, and program card decks or listings. (2) Programs may be multi-purpose or single-purpose.
e. Control Card Decks
f. Source Documents
g. (Permanent) Memory Unit Files

2.1.2 The intention of WICS is to inventory documents, programs, and files - so far as is possible - according to identification based on the tape processing operation. An operation is defined as any computer application and is terminated by production of the desired final output. Hence, it may involve one program, one processing run and very few inputs and outputs; or it may involve several processing runs with inputs and programs introduced at various stages and output created at various stages. All but the desired terminal output are defined as operation outputs.

2.2 Comments on the Documents, Programs and Files

2.2.1 The documents and files with which WICS is primarily concerned (tapes, card decks, listings, programs and source files) have a number of characteristics which complicate organization of an inventory system:
a.
The designation of tapes, cards, listings, and source documents used in, or generated by, a given operation are therefore related and the designation should, if possible, reflect that relationship. This system of identification is, however, clearly impossible in the case of multi-purpose programs, source document files and some card decks which may be generated prior to the processing operation. b. Tapes, cards and listings have varying degrees of permanence. c. A given tape or card deck may be output in one operation and input to another. d. Programs may be specific to given operations on certain files or they may be multi-purpose. If multi-purpose, control cards should be considered. e. The inventory of any given program must deal with explanatory narratives, flow charts, code sheets, as well as various card decks and listings. f. WICS, as treated in this paper, does not include a control system for operations not involving computers; for example, coding and punching data directly from source documents. WICS identification could be assigned the final product and a provision made for reference to other documents covering these operations. g. Programs intended to be single purpose may turn out - with slight modifications - to be multipurpose. Likewise, programs, listings, tapes and decks intended to be permanent may turn out to be temporary and conceivably the reverse might hold. 2.2.2 What capabilities must be built into an inventory system? Figure 1 attempts to illustrate the document/file creation flow with respect to a given operation. Items to be inventoried are related to the processing operation. The common bond among these items is the processing operation as documented on a Tape Operation Log. Figure 2 illustrates the input/output relationships between programs (and card decks, tapes and listings) and the processing operation through the association of entries on the Tape Operation Log with entries in various inventory files. 
2.2.3 Inputs to the processing operation may be considered as the following: a. Source Data Decks (1) Reflecting original data taken directly from documents. (2) Reflecting original data taken from coding sheets. (3) Reflecting updating or error data from coding sheets. b. Source Data Tapes (1) Reflecting original data and coding process. (2) Reflecting updating or revision operations. c. Tapes and Card Decks Produced as Output in a Prior Operation d. Program and Control Card Decks Output of a processing operation includes intermediate operation tapes, card decks and listings, and final output tapes, card decks and listings.
FIGURE 1. [Flow diagram. Labels: Source Data Files or Coding Forms; Update/Revise Operation; Operation Tapes; Program Card Decks; Source Data Tapes; Process Run 1; Operation; Operation Listings; Control Card Decks; Process Run 2; Output Tapes.]
FIGURE 2. [Diagram. Labels: Source Document File; Card Deck File; Tape File; Listing File; Program File; Control Card File; Tape Operation Log; Intermediate Processing Operation(s); Final Operation and Output; Memory Unit File.]
3. WICS Identification System 3.1 It would be desirable in terms of simplicity to identify all types of records by a single identification rule. In the WICS case, however, the variety of records makes this impossible. At least ten items deserve consideration for inclusion in the identification system: Physical Format Identification, a one letter identification of the format: D = document, P = program, L = data listing, C = data card deck, M = memory unit, W = WAIS tape, S = SSRI tape and K = control card deck. Alphabetical File Reference, a two letter reference to the primary file involved in an operation: MA = master, BN = benefit, PR = property, SR = survey, FF = identification and so forth. If no file predominates, XX could be used. Existing Tape Number, SSRI or WAIS tape number preassigned to every tape. Program Name, alphabetic name as designated by the programmer.
Operation Number, three digit number assigned sequentially to every tape operation (or alternatively, one letter and two numbers to increase the feasible range and facilitate distinction from the Inventory Serial). Operation Serial, two digit decimal following each operation number to allow for identification of more than one listing, or more than one card deck produced by a given operation. Since tapes are uniquely identified by the tape number, these digits could be used to identify tapes in a series; e.g., ten reels of Master File. Inventory Serial, a three digit number assigned sequentially to identify records not produced through processing operation; e.g., programs, and card decks punched directly from source documents or coding forms. Operation Description, alphabetic abbreviation of operation. Data Description, alphabetic statement of nature of data created either by non-processing or processing operation. Micro-Macro Data Indicator, one digit pair of numbers or letters denoting whether data pertains to individual taxpayers or is aggregative. 3.2 Of these possible designations, it is mandatory to the system to use the Format Identification, Tape Number, Operation Number, and Operation Serial. The Format Identification can be applied to all records but the Tape Number only to tapes and the Operation Number and Serial only to products of processing operations. The question is: How are records not tapes or products of processing operations to be uniquely identified? The use of an Alphabetic File Reference in conjunction with an Inventory Serial will accomplish this purpose. The Macro-Micro Data Indicator could be included if desired. The identification would be completed with, for programs, the Program Name, or for other records, Data or Operation Description. 3.3 A set of example identifications might look like the following: Figure 3. Digit 1 2 3 4 5 6 7 8 9. 
Source Documents*      D M A 0 2 0          [description]
Programs               P X X 0 1 1          [program name]
Data Listings          L B N A 0 6 . 0 6
Data Card Decks**      C P R 0 4 2          [description]
Data Card Decks***     C S R A 0 1 . 0 2    [description]
Control Card Decks     K B N A 0 4 . 0 1    [prog. reference ?]
Memory Unit Files      M S R B 1 1 . 1 4    [description]
Data Tapes             S 5 5 1 A 0 6 . 1 1  [description]
*Use of the Alphabetic File Reference would allow organizing inventory cards sequentially within major file groups. The Inventory Serial would be assigned sequentially to each recorded category. **Punched directly from Source Documents or Coding Sheets. ***Created by Processing Operation. 4. Use of WICS Documents 4.1 Programs 4.1.1 At the time a program is being prepared, the programmer should obtain two records: a Program Description Sheet (Figure 7) and a Program Inventory Card (Figure 6). Either he or the analyst should also obtain a copy of the Tape Operation Log (Figure 4). The Tape Operation Log should already have a sequential, centrally assigned operation designation; e.g., A10. The programmer/analyst would assign the remainder of the designation as MA, BN, XX, PR, etc. The Program (Program Description Sheet and Inventory Card) would also be assigned a designation as described in Section 3. The "program" is identified with the program card decks and these would carry the program number. (If several decks are involved, further identification could be noted on the decks.) If control cards are involved, a separate inventory card identified per Section 3 should be prepared. 4.1.2 The Program Description Sheet should be filled out completely (this record serves the same purpose as the Program Catalog Sheet of WAIS 667-016). When the program has been completed, a number of records may exist: a. Descriptive Narrative Notes b. Flow Charts c. Coding Sheets d. Assorted Card Decks (Program and Control) e.
Assorted Listings The descriptive notes, flow charts, coding sheets, and listings should be located in one folder bearing the designation of the program and reference to their existence and location made on the Program Description Sheet. The folders of program materials should be maintained in a central location unless in current use by programmers. An "outcard" could be used in these files to indicate when a file is removed. The program and control card decks would be represented by the inventory cards, again both the decks and inventory cards should be under central control unless in current use. 4.2 Source Documents, Source Document Card Decks, and Revision Decks 4.2.1 All source documents and the card decks produced directly from these documents (or via coding sheets, but not via any processing operation) and revision card decks created directly from coding forms as well as the revised source document card decks must be assigned a logical code. Again, a number of documents exist: a. Source Documents b. Source Card Decks (with data in original form) c. Updating or Revision Card Decks d. Source Card Decks (with data in updated or revised form) When an inventory system is implemented, all existing source documents must be assigned identification as noted in Section 3. Source card decks (original data form) could carry the same designation except the first digit would be C rather than D. When many card decks were involved, additional identification indicating their placement in a series could be indicated on the decks. Updating or revision decks could be assigned a serial, but since they would be utilized in a particular operation, it would be more meaningful to assign them an operation number. (WICS makes no particular attempt to control coding sheets in general, but it would be logical to designate them with the operation number also.) Updated or revised source card decks are operation output and should carry an operation number if created by computer. 
4.2.2 A Card Deck Inventory Card (Figure 6) should be prepared for each card deck produced and retained and central control established. The assignment of identification codes will assure a logical order of inventory cards in the file. The "outcard" system of control over document files may be retained. 4.3 Tape Records 4.3.1 Since all tapes are products of a processing operation, those tapes which are created prior to implementation of an inventory system and still in existence must be assigned an operation number substantiated by a Tape Operation Log and Tape Control Card File. To what extent it is desirable to carry over the identification on this dummy Log to card files and listings already in existence is a problem for discussion. Certainly the most important records should be identified consistent with the system. 4.3.2 Inventory and control of tapes is complicated by the fact that some tapes may carry revised information, or completely different if it is scratched and reused in a subsequent operation. The Tape Control Card File forms (Figure 5) incorporate some format changes (when compared to those in Figure 6) to accommodate this. The card file carries only the WAIS or SSRI number in the space designated at the top (unless it is a permanent source tape in which case the complete identification could be included). The various operation numbers are indicated in the far-right column. The contents of the tape for any given operation should be noted at the bottom of the form. A space is also provided to indicate whether the tape (for any given operation) is permanent, temporary, or scratch. The extent to which central location of the tapes could be effected is questionable, but at least central control of their use and actual location could be arranged. It seems likely that for permanent tapes, a Tape Description Sheet, analogous to the Program Description Sheet, and including information as appears on the File Catalog Sheet of WAIS 667-016 should be prepared. 
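The Tape Control Card described in 4.3.2 can be pictured as a small record: one card per tape number, with one entry per operation that used the tape. The following Python sketch is only an illustration under stated assumptions. The class and field names (`TapeUse`, `TapeControlCard`, `record_use`, `may_reuse`) are hypothetical, and the rule that a tape may be reused only when its latest use is marked scratch is an assumption, not something the paper prescribes.

```python
from dataclasses import dataclass, field
from typing import List, Optional

VALID_STATUS = ("P", "T", "S")  # permanent, temporary, scratch


@dataclass
class TapeUse:
    """One entry on the card: which operation used the tape and what it left there."""
    operation: str   # operation number, e.g. "A06"
    status: str      # "P", "T" or "S"
    contents: str    # short description of the tape contents after the operation


@dataclass
class TapeControlCard:
    """Hypothetical model of one Tape Control Card (Figure 5)."""
    tape_number: str                      # preassigned WAIS or SSRI tape number
    uses: List[TapeUse] = field(default_factory=list)

    def record_use(self, operation: str, status: str, contents: str) -> None:
        if status not in VALID_STATUS:
            raise ValueError(f"status must be one of {VALID_STATUS}")
        self.uses.append(TapeUse(operation, status, contents))

    def current(self) -> Optional[TapeUse]:
        """Most recent use, i.e. what the tape holds now."""
        return self.uses[-1] if self.uses else None

    def may_reuse(self) -> bool:
        """Assumed rule: only an unused tape or a scratch tape may be overwritten."""
        last = self.current()
        return last is None or last.status == "S"


card = TapeControlCard("S551")
card.record_use("A06", "S", "intermediate sort output")
card.record_use("A07", "P", "master file, reel 1")
```

Keeping every use on the card, rather than only the latest one, mirrors the column of operation numbers on the form and preserves the history of a reused tape.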
4.4 Output Documents and the Tape Operation Log 4.4.1 The Tape Operation Log is designed to accumulate data on each step of a processing operation with appropriate identification of inputs and outputs, purpose, steps involved and progress at any given time. The inputs would show a check in the input column and their (already assigned) identification. The outputs would be listed with an appropriate check and assigned a code reflecting their form (tape, cards, list), file reference (or, for tape, tape number), operation number and operation serial. Any number of processes and any number of inputs and outputs can be reflected for a single tape operation by adding additional sheets, and when the operation is complete, the Tape Operation Log will present a condensed history of the operation with appropriate references to input, output, purpose, etc. The Log "Process" column could be used to identify separate processes within the given operation. The "Nonprocessing Operation" column could show a check followed by narrative reference to the nature of the nonprocessing step. This would add desirable completeness to the WICS system. The "Repeat Process" column could be checked and followed by a comment such as: "Repeat process 2 for namegroup 10". 4.4.2 All outputs from the operation must be reflected on Tape Control, Card Deck, or Listing Inventory Cards unless immediately destroyed or scratched and this fact noted on the Log. The surviving outputs will then so far as is possible be centrally located and checked out as necessary. If a record is subsequently destroyed or scratched the Tape Operation Log should be amended and the inventory record removed. Note that in the case of tapes and card decks, an input may display an identification assigned as an output in some previous operation. In most cases, this would seem not only acceptable but desirable since it allows quick identification of the previous operation. 5.
Complementarity With Job Standardization Reference is made here to WAIS 667-027, J. De Vries, A Proposal for the Standardization of WAIS Jobs. WICS, as presented, offers no particular assistance for planning project activities. In that context, the standard plan of preparing a WAIS paper covering definition and procedure of all new activities, detailed instructions, program flow charts, and so forth would be a very useful asset in project control. With respect to time-estimates and personnel allocation, the De Vries paper presents a possible format for a Job-Description Sheet. Such a form could be tied into the WICS system by carrying the Operation Number, and its preparation would predate the Tape Operation Log. The significant complementarity of WICS occurs through the Tape Operation Log. Paragraph 1 (e) of WAIS 667-027 states as a purpose of job standardization: "to inspect, at regular time-intervals, the progress of the respective jobs and, if required, to change time-estimates and personnel allocations". The Tape Operation Log provides the desired information on actual performance (acting as a feedback to the estimates on the Job Description Sheet) in detailed form with regard to processing operations and in summary form for nonprocessing operations.
Figure 4. Tape Operation Log (8 1/2 x 11") Date Initiated Date Completed Programmer Analyst Operation Number Sheet No. Process Input Inter. Output Final Output Nonproc. Oper. Repeat Process Record Identification Date Process Completed Perm Temp Scratch Output Input, Output or Process Description
FIGURE 5. TAPE CONTROL CARD FILE (5x7") CARD NUMBER PERMANENT LOCATION TAPE NUMBER DATE OUT TO LOCATION EXP. RET. DT. RET. DATE P T S TAPE USE NUMBER TAPE USE NUMBER CONTENTS (CONT ON REVERSE)
FIGURE 6.
PROGRAM, CARD DECK, & LISTING INVENTORY FILE (5x7") CARD NUMBER PERMANENT LOCATION IDENTIFICATION DATE OUT TO LOCATION RETURN EXP ACT DATE OUT TO LOCATION RETURN EXP ACT CONTENTS:
FIGURE 7. PROGRAM DESCRIPTION SHEET (8 1/2 x 11") PROGRAM NAME PROGRAMMER PROGRAM NUMBER SUMMARY DESCRIPTION OF PROGRAM INPUTS OUTPUTS LOCATION, LABELING OF CARD DECKS, LISTINGS PHASING IMPLICATIONS LOCATION, DESCRIPTION OF ASSOCIATED WAIS PAPERS, FLOW CHARTS, NARRATIVE
6. Summary The variety of files, their interdependence, and the diversity of computer processing necessarily complicate WICS. This paper attempts to take the view of setting up a comprehensive system and allowing necessary compromise from there. Still, no simple system will offer what a more complex system will; namely: a. Inventory and Identification File, for all tapes, card decks, listings, source documents, and programs and related records. b. Control System to ensure that the physical location of each document is known at any time. c. Operation Progress Record to allow quick checks on project operations. d. Meaningful Identification System to facilitate use of all documents. Whether a system such as this can be implemented is a function of several unknowns such as the cooperation of project members, the additional time costs imposed by its use and the likely errors and omissions of this writer who knows little about data processing.
http://www.ssc.wisc.edu/wais/WAIS667039 http://www.ssc.wisc.edu/wais/textFiles/WAIS667039.txt
John deVries, 1967. Detailed Outline for Further Processing of 1959-1964 Wisconsin Income Tax Summary Data Vol. I -- The Processing of the ID Cards. May 3, 1967. WAIS paper 667-040. Master File - Tax Records.
John deVries WAIS 667-40 May 3, 1967 Detailed Outline For Further Processing of 1959-1964 Wisconsin Income Tax Summary Data Vol. I -- The Processing of the ID-cards. 1.
Introduction Prior to this stage, 1959-1964 Wisconsin Income Tax returns have been microfilmed and filed, ID numbers have been assigned, demographic data have been coded (See WAIS 667-002), identification cards and summary information cards have been punched (See WAIS 667-013, first revision) and the cards are being returned to WAIS. A brief outline of the further processing of the cards from here on has been sketched in WAIS 667-015; this paper constitutes an initial attempt to outline all procedures in detail and should serve as an instruction manual for the clerical staff. Several elements have been added; WAIS 667-035 gives an idea how the operations described in this paper fit in with the remaining jobs. 2. Instructions to Clerical Staff As was stated above, several steps have already been taken in the processing of the Wisconsin Income Tax Returns for 1959-1964. The three papers mentioned above (667-002, 667-013, 667-015) should give you a fairly good idea of what has happened so far and of what is going to happen. The operations which follow will constitute your job, as well as that of a number of other people. The instructions in this manual have been written out with considerable detail, not because the operations are complicated (as a matter of fact, most of them are not complicated at all), but because the system as a whole is fairly complicated; if we forget one operation or handle it incorrectly, the whole system may go wrong. All stages in the operations are designed to help us to spot, investigate and correct errors which may have been made in previous stages. Therefore, if any one element is, by mistake, left out, we may be stuck with a number of errors in our data. Follow the instructions carefully and make sure to record all the actions and results on the proper log sheet, in the proper position. If anything unusual occurs, mention it to the supervisor. 
Some remarks about the logging: a) You have probably found out that our data are separated into name-groups; each name-group has a two-digit number in the range from 01 to 51 (inclusive). There is also a residual group of people for whom we want to collect data but who, for some reason or another, do not fall into any of our name-groups; this group has 70 as its first two digits. Since all processing, up to a certain stage, proceeds by name-group, we want to log the various processing stages by name-groups as well. You will find all the log sheets in a folder marked "Name-Group Processing Log Sheets." b) Each step is described on the log sheet by a brief name, followed by the number of the section in this manual which describes the action. When you begin a particular action, mark the date and your initials in the appropriate columns. c) When you have completed the job, mark the date of completion in the correct position on the log sheet. Put the material you worked with (cards, listings, etc.) back where you found it, unless there are specific instructions to the contrary. d) In several steps you will receive, with the input, output of some kind (usually punched cards or listings). In such cases, the instructions will indicate that output is either certainly or possibly expected. If for a particular stage output is marked as certainly expected but you did not receive any output, see the supervisor. If no output is expected but you do receive something, see the supervisor as well. e) If output is expected, a number or a combination of letters and numbers will be indicated on the log sheet; copy this code onto the output you received so that there is no problem in finding or identifying the output when it is needed for some other job. [See Section 3.1 of this manual for an explanation of the numbering system].
f) If output is indicated as possibly expected but you did not receive anything, mark this on the log sheet by writing an "X" in the place besides the output indicator. g) Sometimes a particular phase is not required because a prior phase did not produce any output. Mark this on the log sheet by placing an "X" in the "date started" and "date completed" columns, as well as your initials. h) In several cases, a job will have to be run at least twice. In that case there will be one blank line following the line for the first running. If it is necessary to run the job more than two times, fill out a special sheet which you can get from the supervisor, and file it with the name-group log sheets. The supervisor will also specify the code-numbers for the output you receive in these extra runs. All the following jobs have been described as practically independent operations, mainly because it is impossible to decide beforehand how the work is going to be divided. This may mean that you will be assigned more than one job in a sequence; in those cases it of course does not make sense to put cards into a box as the completion of one job and then immediately take them out of the box as the beginning of the next job. 3. General Remarks 3.1 A Note On the Construction of Input-Output Codes The codes used to identify inputs, outputs, etc. consist of a sequence of letters and numbers. The first digit is always going to be a letter, identifying the type of output: "C" indicates that the output is a deck of IBM cards, "L" indicates that the output is a listing, "T" indicates that the material is on a magnetic tape, "P" indicates a computer program (usually a deck of cards), "K" indicates a set of control cards, to be used in conjunction with a computer program. The next two digits are always numeric and indicate the name-group to which the particular output belongs. 
In the write-up of the jobs, these two digits will be indicated as "XY"; always substitute the name-group digits when you have to mark output, pick up material, etc. The following series of digits will always be numeric, though not always of equal length. These numbers will refer to the segment in this manual which describes the operation which produces this particular output. Programs usually do not refer to name-groups specifically, and are not produced as output of any given stage. The code for programs will therefore usually be a "P", followed by the numbers indicating the job in which the program is to be used. [This system will not be used if the program has not been specifically generated for this operation; in that case the program will have a distinct name, e.g. UPDATEALI.] There are some inputs which have been produced prior to this stage; their code will carry a number lower than "4.1" as their last digits. 3.2 A Note On The Terms "Input" and "Output" For those of you who are not familiar with these terms, the following explanation should be sufficient to bring you up to date. Input - that which you need to do a job. In our usage, the input for a job may consist of decks of punched cards, computer listings, computer tapes, etc. (for each job, we will indicate exactly what input you will need and how to recognize it). Output - that which you receive after the completion of the job; this, again, may be punched card decks, listings, tapes, etc. Again, we will specify what output you can expect from a given job. There is one complication with outputs: for some jobs, certain types of output will not always be produced (e.g. if a particular input does not contain errors, you will naturally not receive an error listing). In such cases, we will indicate that there is a chance that you will not receive output of that type. For many jobs, the input you began with will also be part of the output. 4.
Detailed Description of the Operations 4.1 Listing of Original Identification Cards Purpose: Before we begin to handle the cards (with the risk of losing some of them in the process), we need a permanent record of all the cards we received from the keypunchers, in the order in which we received them. Input: Card deck CXY 1, program deck P 4.1. Note: There should usually be less than a full box of cards in CXY 1; if there is more than one box, make sure to indicate clearly which one is the first box, which is the second one, etc. Action: Insert program deck P 4.1 in the front of the cards in CXY 1; submit this deck at the input-output room in the Commerce Building. Your output should be available the next day or the day after that. Output: Listing LXY 4.1, as well as your input card decks CXY 1 and P 4.1. After you have picked up input and output, file the input cards back where you found them, write the appropriate output code on the output listing, file it in a binder marked "Identification listings", then record the completion of the job on the log sheet. 4.2 Sorting of Identification Cards Purpose: This job is needed to ensure that the input deck really contains only cards which legitimately belong there. Input: Deck of cards marked CXY 1. Action: Take this deck of cards down to the Basic Machine Room (Room 4441) and sort on column 1 (alphabetically). All cards should have an "I" in column 1. All cards with something other than "I" in column 1 should be collected in one separate deck. Output 1. The sorted deck of identification cards (this should largely be identical to the deck before sorting). Mark the deck CXY 4.2.1, store it where you picked up the original deck and check it off on the log sheet. 2. Possibly, a deck of incorrect cards (this should be very small). If you have some, mark this deck CXY 4.2.2, store it on the shelf labelled "Error Cards" and check it off on the log sheet. 
If you don't get any error cards, put an "X" on the log sheet and mark off job 4.3 as not required. N.B. If the original deck is small you can do this job much faster by using a sorting needle. Be sure to check both punches if you use the needle! 4.3 Elimination of Incorrect Identification Cards Purpose: This job is needed to either eliminate or correct those cards which in job 4.2 were split off from the main group because they did not have an "I" in column 1. Note: It is possible that this job can be skipped. If there are no cards marked CXY 4.2.2 (check the log sheet!) mark the job as not required and proceed to job 4.4. Input: Cards CXY 4.2.2, listing LXY 4.1, Cards CXY 4.2.1 Action: Each card in deck CXY 4.2.2 will have to be either eliminated or corrected. Locate the card on listing LXY 4.1. The card probably belongs to the same person as the card immediately before it or the one immediately after it. Check this by comparing columns 2-9 on your incorrect card with columns 2-9 on the cards before and after it (columns 2-9 contain the ID-number). If you can identify to whom the incorrect card belongs, take the folder which contains that person's income tax returns from the files (make sure that you filled out a "folder-replacement card"!) and check the contents of the code sheet which you will find in the front of the folder. Then write the data which the incorrect card should have contained on an 80-column code sheet (for the specific contents of the columns see WAIS 667-013). Mark the correction on listing LXY 4.1. If you cannot determine to whom the incorrect card belongs, place it in a box marked CER (this is a box containing miscellaneous unidentified error cards). Cross off the card on listing LXY 4.1 in this case. When you have completed the corrections for all incorrect identification cards, take the codesheet(s) down to the keypunch room (Room 4470) to have cards punched. 
When the cards have been punched, add them to the deck of cards marked CXY 4.2.1 and change the label of that deck to CXY 4.3.1. Note: All incorrect cards you originally found in CXY 4.2.2 should either end up in box CER (if you cannot identify them) or be destroyed (after the corrections have been punched and added to CXY 4.2.1). Output: Cards CXY 4.3.1 (contain all cards from CXY 4.2.1 plus the corrected cards from CXY 4.2.2). 4.4 Sort of Identification Cards by ID Number Purpose: This job is intended to spot all cards with a possibly incorrect ID-number; also, to put all identification cards in order by increasing ID-number. Input: Deck of cards CXY 4.3.1 [if job 4.3 was not required, the input deck will be marked CXY 4.2.1]. Action: Sort this deck (in the Basic Machines Room, 4441) on columns 2-9. During the sort, eliminate the following categories of cards from the main deck: a) all cards with a digit other than "0" (Zero) or "N" in column 9; b) all cards with a digit other than "0" (Zero), "1", "2", or "N" in column 8; c) all cards with digits other than 0, 1 or 4 in column 4; d) all cards with columns 2 and 3 not equal to XY (where XY is the specific name group you are working on!). Output: 1. A large deck of cards. These are the identification cards, sorted by identification number and purged from cards with obviously illegitimate ID numbers. The deck of cards has to be labelled CXY 4.4.1, stored and checked off on the log sheet. 2. A small deck of cards: All the cards which were eliminated during the sort, on the basis of criteria 4.4 a-d. This output does not necessarily appear! If you get something, label the cards CXY 4.4.2, check it off on the log sheet and store the cards on the shelf marked "Error Cards". If you don't get anything, check the output column on the log sheet with an "X" and mark off job 4.5 as not required. 
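The elimination criteria of job 4.4 are mechanical enough to write down as a small program. The Python sketch below is only an illustration of criteria a) through d); the manual describes a manual sort on a card sorter, not a program. The function names `passes_id_checks` and `sort_and_purge` are hypothetical, each card is assumed to be a text string whose first character is column 1, and the ordering of the letter "N" relative to digits in the sort is an assumption (Python string order), not a statement about the card sorter.

```python
from typing import Iterable, List, Tuple


def passes_id_checks(card: str, name_group: str) -> bool:
    """Return True if the card's ID number (columns 2-9) survives criteria a-d of job 4.4."""
    def col(n: int) -> str:
        return card[n - 1]  # card columns are numbered from 1

    return (
        col(9) in ("0", "N")                # a) column 9 must be "0" or "N"
        and col(8) in ("0", "1", "2", "N")  # b) column 8 must be "0", "1", "2" or "N"
        and col(4) in ("0", "1", "4")       # c) column 4 must be 0, 1 or 4
        and card[1:3] == name_group         # d) columns 2-3 must equal the name-group XY
    )


def sort_and_purge(cards: Iterable[str], name_group: str) -> Tuple[List[str], List[str]]:
    """Split cards into the main deck (CXY 4.4.1) and the error deck (CXY 4.4.2).

    The main deck is returned sorted on columns 2-9; the error deck keeps input order.
    """
    main: List[str] = []
    errors: List[str] = []
    for card in cards:
        (main if passes_id_checks(card, name_group) else errors).append(card)
    main.sort(key=lambda c: c[1:9])  # columns 2-9 hold the ID-number
    return main, errors


deck = ["I12034510", "I13034510", "I12012300", "I12034515", "I12934510"]
main_deck, error_deck = sort_and_purge(deck, "12")
```

If `error_deck` comes back empty, that corresponds to marking the output column with an "X" and marking job 4.5 as not required.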
4.5 Elimination of Incorrect Identification Cards Purpose: This job is needed to correct all identification cards which were separated in the previous job because they had an incorrect ID-number. Note 1: This job is not required if job 4.4 did not yield any incorrect cards at all (i.e. no cards marked CXY 4.4.2 exist and the position on the log sheet has been marked with an "X"). Note 2: The processing of this job is almost the same as that for job 4.3. Input: Cards marked CXY 4.4.2, listing LXY 4.1, Cards CXY 4.4.1. Action: Each card in deck CXY 4.4.2 will have to be either eliminated or corrected. Locate the card on listing LXY 4.1. Several situations can arise: a) There may be another card with the same incorrect ID number in CXY 4.4.2; in that case, one card will probably have a "1" in column 80, the other one a "2". If you find two cards which fit the description (i.e. same contents in columns 2-9, one with "1" in column 80 and one with "2"), check the ID-number of the card before your incorrect ones, as well as the number of the card immediately following them, on listing LXY 4.1 (this listing is largely in ID-number order). The correct ID number for your error cards should almost certainly fit between the two IDnumbers you located on LXY 4.1. Now go to the file cabinets containing the folders with tax returns and check if there are any folders between the ones with the ID-numbers you located. If you find one or more folders, check all of them against the information on your error cards to find the folder containing the correct ID-number. If you are able to find the correct IDnumber, write the data, which the incorrect cards should have contained, on an 80 column code sheet (for the specific contents of the columns see WAIS 667-013). If you cannot determine to whom the incorrect cards belong, place them in the box marked CER (this is a box containing various unidentified error cards). Indicate on listing LXY 4.1 what you did with the cards. 
b) If you find only one card for a particular incorrect ID-number, there is a good chance that you will be able to find its "counterpart" on listing LXY 4.1, either immediately before it or immediately after it. If you are able to locate the counterpart, take the folder containing the tax returns for the person with the ID-number you found on the correct card (the counterpart) in columns 2-9, find the proper contents for your incorrect card and write them on the 80 column correction sheet. Make a note on listing LXY 4.1 of the correction you made. If you cannot identify the incorrect card at all, place it in box CER and note this on listing LXY 4.1. After all the corrections for your cards have been coded and the unidentified cards have been put into CER, have the correction sheets keypunched (Room 4470). The corrected cards have to be inserted into deck CXY 4.4.1. Make sure to insert these cards in their proper position! (Remember that this deck has been sorted on columns 2-9). After you have inserted the corrected cards in their proper position, destroy the cards in deck CXY 4.4.2. Change the label of deck CXY 4.4.1 to CXY 4.5 and check this off on the log sheet. Output: Deck CXY 4.5 (contains all cards from deck CXY 4.4.1 plus corrected cards substituted for cards from deck CXY 4.4.2, in increasing ID-number order (columns 2-9)). 4.6 Listing of Sorted Identification Cards Purpose: This job is needed to produce an easily accessible record of the identification cards in ID-number order. Input: Cards marked CXY 4.5, program P 4.1. Note: It is possible that no card deck labelled CXY 4.5 exists; in that case, you will find a deck marked CXY 4.4.1 and you will find job 4.5 marked off as "not required" on the log sheet. If both conditions apply, use CXY 4.4.1 as your input. Action: 1. Insert program deck P 4.1 in front of the cards in CXY 4.5. 2. Submit this deck at the input-output room in the Commerce Building.
Your output should be available the next day or the day after that.
3. After you have picked up your input and output, file the cards where you found them (be sure to separate P 4.1 from the data cards). Write the appropriate output code (LXY 4.6) on the output listing, file the listing in a binder marked "sorted ID-cards", then record the completion of the job on the log sheet.
Output: Listing LXY 4.6, as well as your input card decks P 4.1 and CXY 4.5 (or CXY 4.4.1).

4.7 Separation of Identification Cards
Purpose: This job is to be done to separate the two types of identification cards (the ones with a "1" in column 80 and the ones with a "2"); if there are any cards with something other than "1" or "2" in column 80, they will come out as a separate deck of cards.
Input: The deck of cards labelled CXY 4.5.
Note: It is possible that no deck labelled CXY 4.5 exists; in that case you should be able to find a deck CXY 4.4.1, and job 4.5 must have been marked off as not required.
Action: Sort the input deck on column 80.
Output: Label all cards with a "1" in column 80 CXY 4.7.1 and all cards with a "2" in column 80 CXY 4.7.2; if there are any other cards, combine them into one deck and label it CXY 4.7.3. If you did not get any other cards, mark this with an "X" on the log sheet; also mark off job 4.8 as not required.

4.8 Elimination of I-Error Cards
Purpose: To correct or eliminate cards which were found to have an incorrect card-number (col. 80) in job 4.7.
Note: It is possible that this job is not required. If there is no input deck labelled CXY 4.7.3 - check this on the log sheet - and if you find an "X" in the "date started" and "date finished" columns for job 4.8 on the log sheet, the job can be skipped. Otherwise:
Input: Deck of cards marked CXY 4.7.3, listing LXY 4.6, cards CXY 4.7.1, CXY 4.7.2.
Action: The processing of this job is similar to that for jobs 4.3 and 4.5. All the cards in input deck CXY 4.7.3 have an incorrect card number in column 80.
Take the listing marked LXY 4.6 and locate the incorrect card.
a) There is a good possibility that you will be able to locate one other card with the same ID-number (Cols. 2-9) as the incorrect card. If you find one, check the number in column 80 (on the correct card!). If the number is a "1" and the incorrect card follows the correct one on the listing, your incorrect card should have a "2" in column 80. If the number in column 80 on the correct card is a "2" and the incorrect card precedes the correct one, your incorrect card should have a "1" in column 80. Cards so identified can easily be corrected: mark the correct contents for column 80 on the card and ask the keypunchers to duplicate the card with the corrections.
b) If you do not find a card with the same ID-number on the listing, check the folder for that ID-number, find the correct contents of the I-cards (check WAIS 667-013 for the contents of the fields on the cards) and write these contents on an 80 column coding sheet. When you have completed the job, have cards punched.
c) If you cannot determine to whom the incorrect card belongs, put it in box CER.
d) When you have completed the job, you should have:
1. No more cards in CXY 4.7.3
2. Possibly some cards in box CER
3. A group of corrected cards.
Place the corrected cards in their proper positions in decks CXY 4.7.1 and CXY 4.7.2. (Cards with "1" in column 80 go to CXY 4.7.1, the other ones to CXY 4.7.2.) Both decks are sorted by ID-number (Cols. 2-9). Label the decks with a new number: CXY 4.7.1 becomes CXY 4.8.1, CXY 4.7.2 becomes CXY 4.8.2. Check off the appropriate output columns on the log sheet.
Output: Card decks CXY 4.8.1, CXY 4.8.2.

4.9 Card Edits for I-Cards Type 1
Purpose: This job is required to check for invalid punches and codes on the data cards, by means of a computer program with control cards.
Input: Deck of data cards CXY 4.8.1, program cards PCE, control cards KIXY (where XY stands for the name-group number) and control cards K 4.9.
Action: This job is subdivided into a number of separate phases.
Phase 1: Submitting the Input to the Edit Program
Assemble the input card decks in the following order:
1) PCE
2) KIXY
3) K 4.9
4) CXY 4.8.1
Submit this deck at the input-output room in the Commerce Building. Your output should be available the next day or the day after that.
Output: The original card deck (PCE + KIXY + K 4.9 + CXY 4.8.1), and a listing of incorrect cards, indicating the type of error on the card and the location of the error. Label the listing LXY 4.9.1 and store the input cards where you found them.
Phase 2: Correcting the Errors Found by the Program
1. There is a small possibility that listing LXY 4.9.1 does not contain any error messages. If this is the case, file listing LXY 4.9.1 in a binder marked "Card Edit Listings", mark card deck CXY 4.8.1 as CXY 4.9.2 and check off the "Repeat" column on the log sheet as not required (by putting an "X" in the "Date Started" and "Date Finished" columns).
2. If the listing LXY 4.9.1 contains error messages, each of them will have to be investigated and corrected. Use the card layout as given in WAIS 667-013 (first revision) to determine what each field on the card should contain. If you can determine the error, write the correct contents of the card on an 80 column code sheet. If you cannot determine the error, or if you think the card is correct (it is always possible that the control cards in K 4.9 or in KIXY are in error), see the supervisor. Some errors may not have to be corrected (there may be circumstances where seemingly incorrect situations are acceptable) - let us call them "explained" errors. After you have corrected or "explained" all error messages on the listing, have the 80 column sheets punched. Take the corrected cards and substitute them for the incorrect ones in deck CXY 4.8.1 (remember that the deck has been sorted by ID-number, Cols. 2-9).
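The edit step behind phases 1 and 2 can be pictured with a small sketch. It is not the PCE program: the rule table, the function names and the sample cards are invented for illustration, and the real field definitions are those in WAIS 667-013 and on the control cards. The sketch only shows the idea of listing each offending card with the location of the error, and of setting aside messages that have been "explained".

```python
# Hypothetical edit rules: (first column, last column, allowed characters).
# Columns are 1-based, as on the card. The real rules are on control cards
# KIXY and K 4.9 and are not reproduced here.
DIGITS = set("0123456789")
RULES = [
    (2, 9, DIGITS),        # assumption: the ID-number is purely numeric
    (80, 80, set("1")),    # deck CXY 4.8.1 holds the cards with "1" in column 80
]

def edit_cards(cards, rules=RULES):
    """Return (card position, first col, last col, contents) for each bad field."""
    errors = []
    for pos, card in enumerate(cards, start=1):
        card = card.ljust(80)                 # missing trailing columns are blanks
        for first, last, allowed in rules:
            field = card[first - 1:last]      # convert 1-based columns to a slice
            if any(ch not in allowed for ch in field):
                errors.append((pos, first, last, field))
    return errors

def unexplained(errors, explained):
    """Keep only the messages that have not been marked as "explained"."""
    return [e for e in errors if e not in explained]

def make_card(id_number, card_type):
    """Build an 80-column card image with the ID in columns 2-9."""
    return (" " + id_number).ljust(79) + card_type

# Second card carries the letter O instead of a zero in its ID-number.
deck = [make_card("01000101", "1"), make_card("0100O102", "1")]
first_run = edit_cards(deck)

# After the card has been repunched, the rerun comes back clean.
deck[1] = make_card("01000102", "1")
second_run = edit_cards(deck)
```

A rerun is acceptable once `unexplained(errors, explained)` is empty, which is the stopping rule the manual gives for this job.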
Phase 3: Verifying the Corrections
Take the corrected deck CXY 4.8.1 and repeat the first phase. The listing you receive this time should be marked LXY 4.9.2. If you handled the previous phase correctly and if you did not introduce new errors, listing LXY 4.9.2 should contain either no error messages or only indications of errors which you "explained" when they came out the first time. If this is so, you can file LXY 4.9.2 in the binder "Card Edit Listings", mark your data cards as CXY 4.9.2 and check off the job as done. If there are still some errors (either old "unexplained" ones or errors introduced with your corrections), repeat phase 2, followed again by phase 1. This sequence will have to be repeated until no more "unexplained" errors remain.
Output: The final output will be: original program and control cards (PCE, KIXY and K 4.9), corrected data cards CXY 4.9.2, listings LXY 4.9.1 and LXY 4.9.2.

4.10 Card Edits for I-Cards Type 2
Purpose: This job is required to check for invalid punches and codes on the second type of identification cards, by means of a computer program with control cards to specify the acceptable conditions.
Note 1: The processing of this job follows the same procedure as job 4.9; this section will, therefore, not contain many details but refer to section 4.9 instead.
Note 2: Like job 4.9, this job runs in a number of separate phases.
Phase 1: submitting the input to the edit program
Input: deck of cards marked CXY 4.8.2, as well as a program deck labelled PCE and two decks of control cards, labelled KIXY (where XY stands for the name-group number) and K 4.10.
Action: assemble the decks in the following order:
1) PCE
2) KIXY
3) K 4.10
4) CXY 4.8.2
Submit this deck at the input-output room in the Commerce Building. Your output should be available the next day or the day after that.
Output: the original card deck (PCE + KIXY + K 4.10 + CXY 4.8.2) and a listing of incorrect cards, indicating the type of error on the card and the location of the error. Label the listing LXY 4.10.1, and store the input card decks where you found them.
Phase 2: correcting the errors found by the edit program
1) If there are no error messages on LXY 4.10.1, file the listing, mark the data card deck as CXY 4.10.2 and check off the "Repeat" column on the log sheet (see section 4.9 for more detail).
2) If there are errors on the listing, investigate them and correct them, using the same procedure as was outlined in section 4.9.
Phase 3: verifying the corrections
Submit the corrected deck again by going through phase 1; repeat the procedure until there are no more errors that can be fixed (see section 4.9 for details). Your final output card deck for this section should be labelled CXY 4.10.2 [the last digit can be higher than 2, because it should indicate the number of times the job had to be run; e.g. if you had to repeat twice, the output card deck would be labelled CXY 4.10.3, etc.].
Output: the final output will be: original program and control cards (PCE, KIXY and K 4.10), corrected data cards CXY 4.10.2, listings LXY 4.10.1 and LXY 4.10.2.

4.11 I-Cards to Tape Conversion
Purpose: The purpose of this job is to transfer the data from the punched card deck to a magnetic tape. In the process, the program will check to ensure that every card type "1" is accompanied by one and only one card type "2" with the same ID-number.
Input:
1) Card deck CXY 4.9.2 (the last digit can be a number higher than 2).
2) Card deck CXY 4.10.2 (the last digit can be a number higher than 2).
3) Program labelled P 4.11.
4) A scratch tape. Check with the supervisor, who will tell you how to get this tape and how to make sure that we can identify it later.
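The pairing rule stated in the Purpose of this job can be sketched as follows. This is not program P 4.11, whose internals are not described in this manual; the function name, the wording of the diagnostics and the in-memory card strings are assumptions made for illustration. The only facts taken from the manual are that the ID-number sits in columns 2-9 and the card type in column 80.

```python
from collections import Counter

def check_pairs(cards):
    """Report every ID-number that lacks exactly one card "1" and one card "2".

    Returns a dict mapping ID-number to a list of diagnostics. ID-numbers
    with one card of each type are correct and are left out of the report.
    """
    counts = Counter()
    for card in cards:
        card = card.ljust(80)          # missing trailing columns are blanks
        id_number = card[1:9]          # columns 2-9
        card_type = card[79]           # column 80
        counts[(id_number, card_type)] += 1

    problems = {}
    for id_number in sorted({key[0] for key in counts}):
        notes = []
        for card_type in ("1", "2"):
            n = counts[(id_number, card_type)]   # 0 if no such card was read
            if n == 0:
                notes.append(f"missing card type {card_type}")
            elif n > 1:
                notes.append(f"superfluous card type {card_type} ({n} cards)")
        if notes:
            problems[id_number] = notes
    return problems

def make_card(id_number, card_type):
    """Build an 80-column card image with the ID in columns 2-9."""
    return (" " + id_number).ljust(79) + card_type

deck = [
    make_card("01000101", "1"), make_card("01000101", "2"),   # complete pair
    make_card("01000202", "1"),                               # no card "2"
    make_card("01000303", "1"), make_card("01000303", "2"),
    make_card("01000303", "2"),                               # extra card "2"
]
report = check_pairs(deck)
```

Counting per ID-number rather than comparing neighbouring cards means the sketch does not depend on the two decks being in the same order, although the real decks are sorted on columns 2-9.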
Action: Assemble the input cards in the order: P 4.11, CXY 4.9.2, CXY 4.10.2. Take the assembled input cards, as well as the tape, over to the input-output room in the Commerce Building. Your output should be available there the next day or the day after that.
Output: the original decks of cards (P 4.11, CXY 4.9.2 and CXY 4.10.2), your input tape, as well as a listing. Label the listing LXY 4.11 and file it in a binder marked "Error Listings I-Cards"; store the cards where you found them; give the tape to the supervisor. The listing contains indications of errors on the cards found by the program which copied the information from the cards onto the tape. It is possible that no errors were found; in that case, mark job 4.12 as not required by putting an "X" in the "date started" and "date finished" columns for the job. Label the tape containing the data: TXY 4.11.

4.12 Elimination of Error Records on I-Tape
Purpose: The purpose of this job is to correct errors found by the computer program which copied the information, contained on punch cards, to magnetic tape and which, in doing so, combined all information pertaining to a person in one tape record.
Note: it is possible that this job does not have to be done. If this is the case, you will find an "X" in the "date started" and "date finished" columns.
Input: listing marked LXY 4.11, program UPDATEAL, tape TXY 4.11, scratch tape.
Action: this listing contains three types of errors:
1) missing information
2) superfluous information
3) missing ID-numbers
To understand the meaning of these error diagnostics, you have to keep in mind that information for each person should be contained on one (and no more than one) card "1" (i.e. a card having a "1" punched in column 80) and one (and no more than one) card "2" (i.e. a card having a "2" punched in column 80).
Missing information therefore indicates that a person had either only a card type "1" or only a card type "2" (the missing type will be given on the listing); superfluous information indicates that a person had more than one card type "1" and/or more than one card type "2". "Missing ID-numbers" refers to situations where sets of cards are missing for people with household unit numbers (cols. 4-7) larger than 4000. Since these numbers were all assigned at the same time, they should be consecutive; the program simply indicates "gaps". If you find these diagnostics, see the supervisor.
a) In the case of missing information the process to be followed is quite simple:
1) locate the folder containing the information for the person whose record is in error;
2) locate the code-sheet containing the information which is missing;
3) set up control cards to add the missing information to the tape record (see WAIS 667-025 for precise instructions for setting up control cards; use WAIS 667-030, section 2, for the format to be used).
b) In the case of superfluous information, the process to be followed is not always as simple. The following situations can arise:
1) the two original cards contained identical information (check this on the listing: it provides the contents of both cards). This is a very simple case: no action is required.
2) the two original cards were not identical. The procedure in this case is as follows:
i) find the folder containing the information for the person whose record is in error;
ii) determine which set of information fields is the correct one;
iii) set up control cards to insert the correct information in the tape record (follow the same instructions as for section (a.3) above);
iv) try to determine to whom the "other" card belonged (i.e. the card containing incorrect information for that ID-number) [you can use listing LXY 4.1 to locate the card].
If you can locate the person, set up a set of control cards (as per section a.3) to add the information for that person to the tape-file (be sure to use the correct ID-number).
c) When you have fixed up all the errors on the listing, have the control cards punched.
d) Take the control cards and the tape TXY 4.11, as well as a scratch tape, and run program UPDATEAL (as described in WAIS 667-025).
Output: your input cards (UPDATEAL; store where you found them), input tape TXY 4.11 (store where you found it), corrected data tape (the original scratch tape); label it TXY 4.12.

4.13 Check on Multiple Social Security Numbers
Purpose: The purpose of this job is to assure that the Social Security Numbers on our file are unique, i.e. that no two (or more) persons have the same Social Security Number.
Input: tape TXY 4.12, program deck P 4.13.
Action: submit the input at the input-output room in the Commerce Building. Your output should be available the next day or the day after that.
Output: listing of incorrect records, as well as your input cards and tape. Label the listing LXY 4.13 and file it in the binder marked "Error Listings I-Cards"; store the cards and the tape where you found them.
Note: it is possible that the listing does not contain any errors. In that case, job 4.14 is not required: put an "X" in the "date started" and "date finished" columns for job 4.14.

4.14 Elimination of Incorrect Records on I-Tape
Purpose: the purpose of this job is to correct all cases, found in job 4.13, where two or more people are using the same Social Security Number.
Input: listing LXY 4.13, program UPDATEAL, tape TXY 4.12, a scratch tape.
Action: this job is to be done in a number of separate phases:
Phase 1: locating and identifying the error
Listing LXY 4.13 contains printouts for all pairs (or groups) of people with the same Social Security number.
To find the right "owner" of the number, go through the following steps:
1) Get the folders containing the information for the people whose social security numbers you are checking and check the social security numbers on the tax returns in these folders.
2) If one of the two numbers is incorrect (due to coding or punching errors), set up control cards to correct the social security numbers (see job 4.12, section a.3 for details).
3) If both (or all) numbers seem to be correct (i.e. the income tax returns for both - or all - people carry the same social security number):
i) the group may contain a husband and a wife, who are using the same social security number. In that case, "allot" the number to the husband and change the wife's number to blanks (unless the tax returns give you an indication that she does have another number, or unless the income tax returns give you a clear indication that the husband has another number and the "disputed" number really belongs to the wife), using control cards as described in job 4.12 section (a.3).
ii) two (or more) "people" may in fact be one and the same person (compare the addresses, names, occupation codes, etc., on the code sheets). If "they" are the same person, show the case to the supervisor (we will then be warned of the situation when it occurs on our other files). One of the records on the tape-file will have to be deleted; the supervisor will tell you which one to delete. Set up a control card, again according to detailed instructions in WAIS 667-025.
iii) the records "sharing" the same social security number may belong to different people who are not in any way related to each other. Search their income tax returns for any indication of another social security number. If you find one, make the correction by setting up a control card according to WAIS 667-025. If you don't find any indications that anyone has another social security number, mention this to the supervisor.
Mark off the error on the listing as "cannot be corrected".
Phase 2: making the corrections
When you have gone through all the errors on listing LXY 4.13 and your control cards for the corrections have all been punched, you can run program UPDATEAL to make the corrections (see WAIS 667-025 for instructions on submitting the program) - you need program deck UPDATEAL, your control cards, tape TXY 4.12 and a scratch tape. Mark the output tape (the original scratch tape) TXY 4.14.
Phase 3: verification of the corrections
To make sure that you have corrected all the errors and have not introduced new ones, run job 4.13 again using TXY 4.14 as input. The listing you receive this time should not contain any errors, except the ones you found and marked as "cannot be corrected". If there are other errors, repeat job 4.14.
Output: Your input program cards (UPDATEAL; store where you found them), input tape TXY 4.12 (store where you found it), corrected data tape (the original scratch tape, now labelled TXY 4.14).

4.15 Husband - Wife Checks on I-Tape
Purpose: The purpose of this job is to make sure that the identification information for husband and wife agrees in last name and address.
Input: Tape TXY 4.14, program P 4.15.
Action: submit tape and program deck at the input-output room in the Commerce Building. Your output should be available there the next day or the day after that.
Output: A listing containing information about various inconsistencies on the identification file (as well as your input tape and program). Mark the listing LXY 4.15; store the tape and the program where you found them. It is possible that the program detected no inconsistencies at all; in that case, your output listing will not contain any error indications and job 4.16 will not have to be done. Mark this by putting an "X" in the "date started" and "date finished" columns for job 4.16.
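A check of the kind job 4.15 performs can be sketched as below. Neither the record layout on tape TXY 4.14 nor the workings of program P 4.15 are given in this manual, so everything in the sketch is an assumption made for illustration: records are plain dictionaries, spouses are linked by a made-up `household` key, and each record carries a made-up `role` of "husband" or "wife". The sketch only shows the comparison of last name and address that the Purpose describes.

```python
def normalise(text):
    """Ignore case and spacing differences when comparing fields."""
    return " ".join(text.upper().split())

def husband_wife_inconsistencies(records):
    """Return (household, field, husband value, wife value) for each mismatch.

    The keys "household", "role", "last_name" and "address" are hypothetical;
    they stand in for whatever fields the real tape record contains.
    """
    by_household = {}
    for rec in records:
        by_household.setdefault(rec["household"], {})[rec["role"]] = rec

    findings = []
    for household in sorted(by_household):
        pair = by_household[household]
        if "husband" not in pair or "wife" not in pair:
            continue                   # only one spouse on file: nothing to compare
        for field in ("last_name", "address"):
            h_value = pair["husband"][field]
            w_value = pair["wife"][field]
            if normalise(h_value) != normalise(w_value):
                findings.append((household, field, h_value, w_value))
    return findings

records = [
    {"household": "0001", "role": "husband", "last_name": "Smith",
     "address": "12 Elm St"},
    {"household": "0001", "role": "wife", "last_name": "SMITH",
     "address": "12  Elm St"},
    {"household": "0002", "role": "husband", "last_name": "Jones",
     "address": "4 Oak Ave"},
    {"household": "0002", "role": "wife", "last_name": "Jones",
     "address": "9 Pine Rd"},
]
findings = husband_wife_inconsistencies(records)
```

A mismatch reported this way is only a candidate error: as job 4.16 explains, two different addresses may correctly describe a separated or divorced couple.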
4.16 Elimination of I-Error Records
Purpose: The purpose of this job is to correct all the inconsistencies between husband's and wife's address information on the identification file.
Note: It is possible that no inconsistencies were found; in that case, you will find an "X" in the "date started" and "date finished" columns for this job and you can skip this job.
Input: Listing LXY 4.15, program UPDATEAL, tape TXY 4.14, a scratch tape.
Action: This job is to be done in a number of separate phases.
Phase 1: locating and identifying the error
Listing LXY 4.15 contains indications about inconsistencies between husband's and wife's address fields on the identification file.
1) Take the folder which contains the income tax returns for the couple and check the address(es) they indicate on the returns.
2) If you find that one of the addresses was incorrectly coded or punched, set up control cards to correct the error (see details in section 4.12 and WAIS 667-025).
3) If you find that the two spouses indicate two different addresses, two possibilities arise: a) they may be separated or divorced, or b) they may actually not be married to each other - our identification numbers may be in error. By looking carefully at all tax returns for earlier years you should be able to find clues to what really happened: if at some earlier time they gave the same address, and/or they claimed they were married to each other and/or they now state that they are separated or divorced, you may assume that the two different addresses correctly represent the actual situation; in this case, there is no real inconsistency and you cannot correct anything. Mark this on listing LXY 4.15 as "divorced" or "separated" (whichever is applicable). If there is no indication that the two people ever belonged together, see the supervisor. In this case, one of the ID-numbers is probably incorrect and you may have to do some work to find out what the correct ID-number is.
4) After you have processed all the errors and either explained them (as "divorced" or "separated") or corrected them (by setting up control cards), have the control cards punched.
Phase 2: making the corrections to the file
When your control cards have been punched, submit program UPDATEAL (see WAIS 667-025 for details) - you need program UPDATEAL, your control cards, tape TXY 4.14 and a scratch tape. Submit this at the input-output room in the Commerce Building. After the job has been run, label the output tape (the original scratch tape) TXY 4.16.
Phase 3: verifying the corrections
To make sure that you have made all corrections properly and have not introduced new errors, run job 4.15 again, using TXY 4.16 as input instead of TXY 4.14. If new errors appear or if old, "unexplained" errors reappear, repeat job 4.16. This cycle of repetitions has to be followed until there are no more unexplained errors on listing LXY 4.15!
Output: Your input program cards (UPDATEAL; store where you found them), input tape TXY 4.14 (store where you found it), corrected data tape (the original scratch tape, now labelled TXY 4.16).

4.17 Sort and Merge of I-Records
Purpose: The purpose of this job is to add the segment of the identification file which has just successfully gone through the various checks and corrections to the segments which previously completed the process.
Input: tape TXY 4.16, program P 4.17, the previous identification tape (labelled TI # ZZ, where "ZZ" is a sequence counter indicating the number of segments on that tape) and a scratch tape.
Note: For the very first segment to reach this job, you will not have a tape TI # 00. In that case, you can perform the job by simply marking tape TXY 4.16 as TI # 01.
Action: take your input tapes and program P 4.17 to be run at the input-output room in the Commerce Building, where your output should be available the next day or the day thereafter.
Output: An "updated" tape, containing the combined information from tapes TI # ZZ and TXY 4.16. Mark the updated tape TI # ZZ1, where ZZ1 is a number one higher than ZZ. [Example: Suppose you are adding the fifth segment to a tape containing four other segments. If we assume that your fifth segment contains name-group 05, your input tapes would be TI # 04 and T05 4.16; your output tape would be labelled TI # 05.]

John deVries
WAIS 667-042
June 6, 1967
1959-1964 Wisconsin Income Tax Data: Detailed Instructions for Processing Summary Cards, Vol. II

This paper is a continuation of the manual started in WAIS 667-040; a summary of the total operation can be found in 667-035.

5.1 Listing of Summary Cards
Purpose: The purpose of this job is to produce a permanent record of the summary cards we received from the keypunchers.
Input: Summary cards CXY.2 (there may be several boxes of cards for many of the name-groups; they will be marked "Summary cards name-group XY, box 1", etc.; make sure that you include all boxes of summary cards for that name-group), program deck P 4.1.
Action: Assemble the input deck by inserting the program deck in front of the cards in the first box. Submit this deck at the input-output room in the Commerce Building; this is also the place where you pick up your output the next day or the day after that. Your output will contain a listing; label the listing LXY.5.1 and file it in a binder marked "Summary listings".
Note: If there are several boxes of summary cards for a name-group, you can save yourself some energy by taking the cards to the loading dock in the Social Science Building and asking the pick-up service to move them. Ask the supervisor for advice on this.
Output: Your original input cards (P 4.1 and CXY.2) - store these cards where you found them; and listing LXY.5.1.

5.2 Separation of Summary Cards
Purpose: To separate the different types of cards which were punched as one job (and therefore were mixed together), but which have to be processed separately for at least some time.
Input: Deck of cards marked CXY.2. Note: This may actually consist of several boxes of cards.
Action: Take these cards down to the Basic Machines Room (Room 4441) and sort them on column 1. The contents of this column can be alphabetic as well as numeric!
Output: Eight decks of cards plus a remainder. Label your output decks as follows:
--those with "1" in column 1: CXY.5.2.1,
--those with "2" in column 1: CXY.5.2.2,
--those with "3" in column 1: CXY.5.2.3,
--those with "4" in column 1: CXY.5.2.4,
--those with "A" in column 1: CXY.5.2.5,
--those with "L" in column 1: CXY.5.2.6,
--those with "M" in column 1: CXY.5.2.7,
--those with "S" in column 1: CXY.5.2.8,
--all others: combine into one deck, labelled CXY.5.2.9.
For each of the decks labelled CXY.5.2.5 through CXY.5.2.9 there is a possibility that you will not get any cards. If this is the case, check it off on the appropriate output column on the log sheet. Also mark the following jobs as "not required":
--if there are no cards in CXY.5.2.5: jobs 7.1-7.7;
--if there are no cards in CXY.5.2.6: jobs 8.1-8.7;
--if there are no cards in CXY.5.2.7: jobs 5.4-5.15, 14.5-14.6;
--if there are no cards in CXY.5.2.8: jobs 6.1-6.10, 14.7-14.8;
--if there are no cards in CXY.5.2.9: job 5.3;
--if there are no cards in CXY.5.2.5 nor in CXY.5.2.6: jobs 13.5-13.6.

5.3 Elimination of Summary Error Cards
Purpose: This job is required to determine where the various cards in deck CXY.5.2.9 (all of which have an impermissible character in position 1) belong, to correct them if that is possible and to add the corrected cards to the appropriate card decks.
Note: It is possible that this job is not required.
If you cannot find a deck labelled CXY.5.2.9 and if the output column on the log sheet for CXY.5.2.9 is marked with an "X", you can check off job 5.3 as not required (put an "X" in the "date started" and "date finished" columns).
Input: Card deck CXY.5.2.9, listing LXY.5.1.
Action: The action for this job consists of many phases; several of them may not be necessary (a simple rule is: if there are no cards to enter a particular phase, that phase can be skipped). The phases to be executed are:
Phase 1: Sort all cards in CXY.5.2.9 on column 77; keep the cards with a "blank" (no punch) separate, put the other cards together again (they will be processed further in phase 2). The cards with a blank in column 77 are probably L-cards (i.e. they should have an "L" in column 1). To determine if this is so, locate the card on listing LXY.5.1, then check the card immediately before that one. If column 77 on that card contains either a "K" or an "L", your incorrect card should have an "L" in column 1. You can ask the keypunchers to duplicate the card with an "L" in column 1. Note: If you have several cards to be corrected in this or a similar way, save them and take them to the keypunchers as one group. The corrected cards should be added to card deck CXY.5.2.6. If you cannot positively identify the incorrect card as an L-card, place it in CER. Mark the action you have taken (i.e. either corrected or placed in box CER) on list LXY.5.1.
Phase 2: The remaining cards (i.e. those which in phase 1 were put aside because they had a character other than "blank" in column 77) are to be sorted on column 31. Process all cards with a "blank" in column 31 in phase 2; combine all the other cards into one deck and put them aside to be processed in phase 3. For the cards with a blank in column 31: locate the card on listing LXY.5.1. If you find blanks in columns 11-13, 31-76 and 78-80, you can assume that this card should have had an "M" in column 1.
If you don't find this pattern of blanks, put the card in box CER. The cards which you identified can be duplicated by the keypunchers with an "M" in column 1. Corrected cards should be added to deck CXY.5.2.7. Mark the action you took on list LXY.5.1.
Phase 3: The remaining cards (i.e. those you put aside in phase 2 because they did not have a blank in column 31) are to be sorted on column 33. All cards with a blank in column 33 are to be processed in phase 3; combine all the other cards (which entered phase 3 but did not have a blank in col. 33) into one deck; put this deck aside to be processed in phase 4. For the cards with a blank in col. 33: locate the card on listing LXY.5.1. If you find blanks in columns 11-13, 33-76 and 78-80 (but nowhere else on the card!), you can assume that this card should have had an "S" in column 1. If you do not find this grouping of blanks, put the card in box CER. The cards which you identified can be duplicated by the keypunchers with an "S" in column 1. Corrected cards have to be added to deck CXY.5.2.8. Mark the action you took on list LXY.5.1.
Phase 4: The cards you put aside in phase 3 (those with other than blank in column 33) are to be sorted on column 76. All cards with numerics are to go to phase 5; keep all cards with either a blank or an alphabetic in column 76 to be processed in phase 4. Locate the cards with blanks or alphabetics in column 76 on list LXY.5.1; check the card immediately before that one on listing LXY.5.1. If the card immediately before it has an "A" or a "K" in column 77 and if columns 3-10 on that card and on your incorrect one have the same contents, or if the card two ahead of your incorrect one on listing LXY.5.1 has a "K" in column 77 and columns 3-10 on that card have the same contents as columns 3-10 on your incorrect one, column 1 on your incorrect card should contain an "A".
If you find that the cards listed on LXY.5.1 don't satisfy any of the sets of conditions given above, put the incorrect card in box CER. Cards you were able to identify can be duplicated by the keypunchers with an "A" in column 1. Cards so corrected have to be added to deck CXY.5.2.5. Mark the action you took on list LXY.5.1.
Phase 5: The remaining cards (i.e. all those not processed in phases 1-4) belong to decks CXY.5.2.1 - 4. Locate the incorrect card on listing LXY.5.1. To determine in which deck your card belongs, check the following:
(i) If columns 11-12 contain "59", "60" or "61", column 1 must be a "1" (and your card belongs in deck CXY.5.2.1);
(ii) If columns 11-12 contain "62", column 1 must be either "2" or "4";
(iii) If columns 11-12 contain "63" or "64", column 1 must be either "3" or "4";
(iv) If columns 11-12 contain anything else, put the card in box CER;
(v) If column 2 contains "4" or "5", column 1 must be either "2" or "3".
If the above conditions do not help you out, take the folder containing the income tax returns for the ID-number contained in columns 3-10 on the card and, with the aid of the card layouts as given in WAIS 667-013 and the tax returns, identify the card and determine the correct contents of column 1. If you can identify the card, have the keypunchers duplicate it with the appropriate correction in column 1; corrected cards should be added to the appropriate deck: if column 1 contains a "1", to deck CXY.5.2.1; if column 1 contains a "2", to deck CXY.5.2.2; if column 1 contains a "3", to deck CXY.5.2.3; if column 1 contains a "4", to deck CXY.5.2.4. Put all unidentified cards in box CER. Mark all actions you took on listing LXY.5.1.
Output: Updated data decks CXY.5.2.1-8; unidentified cards in CER.

5.4 Sort of M-Cards
Purpose: This job has to be run to produce a sorted file of M-cards; it will also indicate some incorrect cards.
Input: Card deck CXY.5.2.7.
Action: Sort the input deck on card number (column 2) and on ID-number (columns 3-10). Put aside all cards with numbers other than "1" or "2" in column 2; also all cards with alphabetics anywhere in columns 3-10, except "NN" in columns 9-10; also all cards which do not contain "XY" (where XY stands for the number of the name group you are working on) in columns 3-4. Combine all the cards you put aside (if any) into one deck.
Output: A sorted deck of M-cards; label it CXY.5.4.1. Possibly a small deck of incorrect cards; label it CXY.5.4.2. If you do not get any cards in CXY.5.4.2, mark off job 5.5 as not required (by putting an "X" in the "date started" and "date completed" columns on the log sheet for job 5.5).
5.5 Correction of Incorrect M-Cards
Purpose: This job is needed to correct the errors found in job 5.4.
Note: If you find an "X" in the "date started" and "date completed" columns on the log sheet for job 5.5, and if you cannot find input cards CXY.5.4.2, job 5.5 does not have to be done.
Input: Deck CXY.5.4.2, listing LXY.5.1, deck CXY.5.4.1.
Action: Each one of the cards in CXY.5.4.2 is incorrect or seems to be incorrect. Locate the card on listing LXY.5.1. Three types of error can have occurred:
1. Columns 3-4 do not contain "XY" (where XY stands for the number of the name group you are working on). If columns 3-4 contain a legitimate name group number (i.e., less than 52 or equal to 70) and if the card is an M-card (check the format in WAIS 667-013), you can add the card to the appropriate deck of data cards if that has not been processed beyond job 5.4. If the processing for that name group has gone further than job 5.4, put the card in box CER (do this also if columns 3-4 do not contain a legitimate name-group number; i.e., greater than 70, between 51 and 70, or equal to 00). Mark the action you took on listing LXY.5.1.
2. Column 2 on your incorrect card contains a number higher than "2". Check the card immediately before the incorrect one on listing LXY.5.1.
If that card also has an "M" in column 1, has in column 2 a number one lower than the one on your incorrect card, contains the same code in columns 3-10 as your incorrect card and has a "C" in column 77, your card is correct and can be inserted in the appropriate place in deck CXY.5.4.1. (Remember that this deck has been sorted on column 2 and columns 3-10). If any of the above conditions is not satisfied, place your incorrect card in box CER and mark this on listing LXY.5.1.
3. Columns 3-10 contain an alphabetic punch (other than "NN" in columns 9-10) or a blank. Check the card immediately before your incorrect one on listing LXY.5.1 and compare the contents of columns 3-10 for that card with columns 3-10 on your incorrect card. If you find agreement on all columns except the incorrect one(s) (the alphabetic punches which caused the card to be put in CXY.5.4.2 in the first place), you can assume that the cards belong to the same person; you can have the card corrected (ask the keypunchers to duplicate the card with the correct contents in columns 3-10). The corrected card can be inserted in its proper place in deck CXY.5.4.1. If you do not find agreement between the two cards, check your incorrect one with the one following it on listing LXY.5.1. If you find the correct contents in columns 3-10 on this card, you can make your correction on the incorrect card; if you still don't find agreement, put your incorrect card in box CER.
After you have processed all cards in deck CXY.5.4.2, you should have: a) no more cards in CXY.5.4.2; b) all cards which you could not correct in box CER. Change deck CXY.5.4.1 to CXY.5.5.
Output: Updated deck CXY.5.5.
5.6 Listing of Sorted M-Cards
Purpose: This job is needed to produce a list of the M-cards in the order in which they are sorted.
Input: Deck of M-cards CXY.5.5, program deck P.4.1.
Action: Assemble the input by putting P.4.1 in front of CXY.5.5.
Submit this deck at the input-output room in the Commerce Building; this is also the place where you pick up your output the next day or the day after that. Your output will be a listing (besides your original input cards). Label the listing LXY.5.6 and file it in a binder marked "sorted M-listings".
Output: Your input decks P.4.1 and CXY.5.5 (store these where you found them) and listing LXY.5.6.
5.7 Check for Missing and Superfluous M-Cards
Purpose: This job is required to ensure that we did not lose cards we should have had, and that we did not include cards which do not belong in this particular file.
Input: Card deck CXY.5.5, program deck P.5.7.
Action: Assemble the input deck by placing the program deck in front of the data deck CXY.5.5, and submit this at the input-output room in the Commerce Building. Your output should be available the next day or the day after that. Your output should contain, besides your input cards, a listing with information about incorrect M-cards. Label the listing LXY.5.7 and file it in a binder marked "M-error listings".
Note: It is possible that listing LXY.5.7 does not contain any messages regarding incorrect cards. If this is so, job 5.8 is not required; mark this by putting an "X" in the "date started" and "date completed" columns for job 5.8 on the log sheet.
Output: Input decks P.5.7 and CXY.5.5 (store these where you found them) and listing LXY.5.7.
5.8 Correction, Addition or Elimination of Incorrect M-Cards
Purpose: To correct the errors found in job 5.7.
Note: It is possible that this job is not required: If you find that your input listing (LXY.5.7) does not contain any error messages and if you find the "date started" and "date completed" columns on the log sheet marked with an "X", you can proceed to the next job.
Input: Listing LXY.5.7, card deck CXY.5.5, listing LXY.5.6.
Action: Listing LXY.5.7 contains indications about missing or superfluous cards in deck CXY.5.5. The following situations may occur:
1.
Missing cards: The program which checked for missing and superfluous cards in job 5.7 used the following rule: If a particular card had a "C" in column 77, there should be at least one other M-card for that same person. This means that two different kinds of errors could have caused the program to produce this particular error message: a) the "C" should have been a "Z" (indicating that no more M-cards for this person were to be expected), or b) a card was lost or misplaced (due to a mispunch in columns 3-10). To decide what was the cause of the error message, get the folder which contains the income tax returns for the person whose ID-number has been punched in columns 3-10. Take the codesheet for that person (you will usually find this in the front of the folder) and look on the back of this sheet. You will find here all the ID-numbers under which we have records for that person. There should be one less M-card than there are ID-numbers. With this information and the information on listing LXY.5.6 (which lists all the M-cards pertaining to a person) you will be able to determine whether the "missing card" message on listing LXY.5.7 was caused by a mispunch or a missing card. If there is really a missing card, code the contents of that card on an 80-column sheet (see WAIS 667-013, first revision, page 8 for the contents of the specific columns); if there is a mispunch, ask the keypunchers to duplicate the incorrect card with a "Z" in column 77. Mark the action you took on listing LXY.5.7 (write "card added" in the margin if you had to add a missing card).
2. Superfluous cards: Program P.5.7, using the rule specified under "missing cards", would decide that a card was superfluous if it was preceded by a card with the same ID-number (columns 3-10) as the superfluous card and a "Z" in column 77. Two kinds of errors could have caused this inconsistency: a) the preceding card should have had a "C" in column 77, or b) there is really a superfluous card.
To decide which error caused the program to produce the error message, get the folder with the tax returns for the person whose ID-number is in columns 3-10 of your card, and check the back of the codesheet (see above for details). If the card is legitimately there, change column 77 of the preceding card from "Z" to "C"; if the card is really superfluous, put it in box CER (unless it is completely identical to another M-card; in that case, you can destroy it). Mark all the actions you take on listing LXY.5.6; if you remove cards (either to box CER or destroyed), mark this on list LXY.5.1 as well.
When you have processed all the cards for which you found error messages on listing LXY.5.7, take your 80-column sheets (if you have any) to the keypunchers and have these cards punched; add these cards, as well as the ones you had corrected by duplication, in their appropriate positions to deck CXY.5.5 (remember that these cards were sorted by card number, col. 2, and ID-number, columns 3-10); finally, change the number of card deck CXY.5.5 to CXY.5.8.
Output: Card deck CXY.5.8.
5.9 M-Card Edits
Purpose: This job is required to check for invalid punches and codes on the M-cards by means of a computer program with control cards.
Note 1: The processing of this job follows the same procedure as job 4.9 (see WAIS 667-040); this section will, therefore, not contain many details, but will refer to section 4.9 instead.
Note 2: Like job 4.9, this job runs in a number of separate phases.
Phase 1: Submitting the Input to the Edit Program
Input: Deck of cards marked CXY.5.8, program deck PCE, control card decks KMXY (where XY stands for the name group number) and K.5.9.
Note: It is possible that you cannot find a deck CXY.5.8; in that case, if job 5.8 was checked off as not required (an "X" in the "date started" and "date completed" columns), use deck CXY.5.5 instead; if that also does not exist, use deck CXY.5.4.1.
Action: Assemble the decks in the following order:
1) PCE
2) K.5.9, except the last card
3) KMXY
4) The last card of K.5.9
5) CXY.5.8 (or CXY.5.5 or CXY.5.4.1)
Submit this assembled deck at the input-output room in the Commerce Building. Your output should be available the next day or the day after that.
Output: The original card deck (PCE + KMXY + K.5.9 + CXY.5.8) and a listing of incorrect cards, indicating the type of error on the card and the location of the error. Label the listing LXY.5.9.1 and store the input card decks where you found them.
Phase 2: Correcting the Errors Found by the Edit Program
1. If there are no error messages on LXY.5.9.1, file the listing in a binder labelled "M-error listings", mark the data card deck (i.e., your input card deck CXY.5.8, or CXY.5.5, or CXY.5.4.1) as CXY.5.9.1 and check off the "repeat" column on the log sheet (see section 4.9 for more detail).
2. If there are errors on the listing, investigate them and correct them, using the same procedure as was outlined for job 4.9. Mark all the changes you made on listing LXY.5.6.
Phase 3: Verifying the Corrections
Run the corrected deck CXY.5.8 (or CXY.5.5 or CXY.5.4.1) again through phases 1 and 2; repeat the procedure until there are no more errors that can be corrected (see section 4.9 for details). Your final output data deck for this section should be labelled CXY.5.9.2 [the last digit indicates the number of times job 5.9 was run on this set of data cards; if the job was run more than twice, the number will be higher than 2; if the job was run only once, the number will be "1"].
Output: The final output will be the original program and control cards PCE, KMXY and K.5.9 (store these where you found them), corrected data cards CXY.5.9.2, listings LXY.5.9.1 and LXY.5.9.2.
5.10 M-Cards to Tape Conversion
Purpose: This job is needed to transfer the data from the punched cards to a magnetic tape.
Input: Card deck CXY.5.9.2 [the last digit can be a digit other than 2], program labelled P.5.10, and a scratch tape. Check with the supervisor, who will tell you how to get this tape and how to make sure that we can identify it later.
Action: Assemble the input cards in the order: P.5.10, CXY.5.9.2. Take the assembled input cards, as well as the tape, over to the input-output room in the Commerce Building. Your output should be available there the next day or the day after that.
Output: The original input cards (P.5.10 and CXY.5.9.2) - store these where you found them, and your tape (which now contains the data from CXY.5.9.2). Label the tape TXY.5.10.
5.11 Check for Multiple Social Security Numbers
Purpose: This job is to ensure that no two (or more) persons have the same social security number.
Input: Tape TXY.5.10, program deck P.5.11.
Action: Submit the input at the input-output room in the Commerce Building. Your output should be available there the next day or the day after that.
Output: Your input cards (P.5.11) and tape (TXY.5.10), as well as a listing of incorrect records. Label the listing LXY.5.11 and file it in a binder marked "M-error listings"; store the cards and the tape where you found them.
Note: It is possible that listing LXY.5.11 does not contain any errors. In that case, job 5.12 is not required; put an "X" in the "date started" and "date completed" columns for job 5.12 on the log sheet.
5.12 Correction of Incorrect M-Records
Purpose: This job is required to correct all cases, found in job 5.11, where two or more persons are using the same social security number.
Note 1: The processing of this job is similar to that for job 4.14, described in WAIS 667-040; this section will therefore omit many details and, instead, refer to section 4.14.
Note 2: It is advantageous to do this job after completion of job 4.14 for the same name group.
Decisions made during the execution of that job will in most cases be helpful in correcting errors in job 5.12.
Input: Listing LXY.5.11, program UPDATEAL, tape TXY.5.10, a scratch tape.
Action: This job is to be done in a number of separate phases.
Phase 1: Locating and identifying the errors (for details see section 4.14 in WAIS 667-040).
1. From the tax returns, check the social security number of the people whose records you are investigating;
2. Set up control cards to correct mispunched or miscoded social security numbers;
3. For husband-wife groups, decide who is the legal owner of the social security number; change the other one to blanks, unless there is a different social security number written on one of the tax returns;
4. For remaining cases, try to find additional social security numbers;
5. Mark the remainder on listing LXY.5.11 as "cannot be corrected";
6. Mark all the changes you made on listing LXY.5.6.
Phase 2: Making the corrections (for details see job 4.14)
1. Have your control cards punched (Room 4441).
2. Run program UPDATEAL. Mark the output tape (the original scratch tape) TXY.5.12.
Phase 3: Verification of the Corrections
Run job 5.11 again, using TXY.5.12 as input. The output listing you receive this time should not contain any errors except the ones you found and marked as "cannot be corrected" when you ran job 5.12 the first time. If there are other errors, repeat job 5.12.
Output: Your input program cards (UPDATEAL; store where you found them), input tape TXY.5.10 (store where you found it), corrected data tape (the original scratch tape, now labelled TXY.5.12).
5.13 Check for Duplicate ID-numbers
Purpose: This job is required to check for the presence of duplicate "secondary" ID-numbers on the M-file.
Input: Program P.5.13, tape TXY.5.12.
Action: Submit the input cards and tape at the input-output room in the Commerce Building. This is where you pick up your output the next day or the day after that.
Your output will contain, besides the original input cards and tape, a listing. Label the listing LXY.5.13 and file it in a binder marked "M-error listings."
Note: It is possible that listing LXY.5.13 does not contain any error messages. In that case, put an "X" in the "date started" and "date completed" columns for job 5.14 on the log sheet.
Output: Original program cards (store where you found them), original data tape (store where you found it) and listing LXY.5.13.
5.14 Correction of Incorrect M-Records
Purpose: This job is needed to correct the errors found in job 5.13 [two or more people sharing the same ID-number].
Note 1: If you cannot find listing LXY.5.13 and if job 5.14 is marked off with an "X" in the "date started" and "date completed" columns on the log sheet, this job is not required.
Note 2: This job has the same "structure" as job 4.14 (described in WAIS 667-040); we will refer to that section for details.
Input: Listing LXY.5.13, program UPDATEAL, tape TXY.5.12, a scratch tape.
Action: This job is to be done in a number of separate phases.
Phase 1: Locating and identifying the errors
1. Take the folders containing the tax returns and codesheets for the persons whose ID-numbers seem to be in error;
2. Check the "secondary" ID-numbers which have been written on the back of the codesheet;
3. If the error listed on LXY.5.13 was caused by a wrong code or punch, set up a control card to correct the wrong ID-number (see WAIS 667-025 for details, WAIS 667-030 section 3 for the format to be used);
4. If the codesheets do not reveal any mispunch or miscode, compare the income tax returns belonging to the two persons. It is possible that the two persons are in fact one and the same individual. If you find a case like this, check it with the supervisor.
5. The two persons may be different ones, connected in some way (e.g., brothers, sisters, or both ex-wives of the same person).
If you can determine any such connection, you may be able to determine that one of the numbers listed on LXY.5.13 is incorrect (e.g., second wife given an ID-number ending in "10"). In those cases, too, you can set up control cards to make the correction.
6. If all of the above approaches fail, there is really nothing you can do about correcting the error. Make a note of this on listing LXY.5.13.
7. If you make changes, mark them on listing LXY.5.6; if you delete any records, make a note on listing LXY.5.1 as well.
Phase 2: Making the Corrections (for details see job 4.14)
1. Have your control cards punched (Room 4441).
2. Run program UPDATEAL. Mark the output tape (the original scratch tape) TXY.5.14.
Phase 3: Verification of the Corrections
Run job 5.13 again, using TXY.5.14 as input. The output listing you receive this time should not contain any errors except the ones you marked as "cannot be corrected" on listing LXY.5.13 when you ran job 5.14 the first time. If there are other errors, repeat job 5.14.
Output: Your input program cards (UPDATEAL; store where you found them); input tape TXY.5.12 (store where you found it); corrected data tape (the original scratch tape, now labelled TXY.5.14).
5.15 Sort and Merge of M-Records
Purpose: This job is needed to add the segment of the M-file which has now gone through a number of consistency checks to the segments which previously completed that process.
Input: Tape TXY.5.14, program P.5.15, the previous M-tape (labelled TM#ZZ, where "ZZ" is a sequence counter indicating the number of segments on that tape).
Note: For the very first segment to reach this job, you will not have a tape TM#00. For that case, you can perform job 5.15 by simply marking tape TXY.5.14 as TM#01.
Action: For all segments after the first one, take your input tapes and program P.5.15 to the input-output room in the Commerce Building, where your output should be available the next day or the day thereafter.
Output: An "updated" tape, containing the combined information from tapes TM#ZZ and TXY.5.14. Mark the output tape TM#ZZ1, where ZZ1 is a number one higher than ZZ.
Example: Suppose you are adding the fifth segment to a tape containing four other segments. If we assume that your fifth segment contains name group 05, your input tapes would be TM#04 and T05.5.14; your output tape would be labelled TM#05.
6.1 Sort of S-Cards
Purpose: This job has to be run to produce a sorted file of S-cards; it will also indicate some incorrect cards.
Input: Card deck CXY.5.2.8.
Action: Sort the input deck on card number (column 2) and on ID-number (columns 3-10). Put aside all cards with numbers other than "1" or "2" in column 2; also all cards with alphabetics anywhere in columns 3-10, except "NN" in columns 9-10; also all cards which do not contain "XY" in columns 3-4 (where "XY" stands for the number of the name group you are working on). Combine all the cards you put aside (if any) into one deck.
Output: Sorted deck of S-cards; label it CXY.6.1.1. Possibly a small deck of incorrect cards; label it CXY.6.1.2. If you did not get any cards in CXY.6.1.2, mark off job 6.2 as not required, by putting an "X" in the "date started" and "date completed" columns on the log sheet.
6.2 Correction of Incorrect S-Cards
Purpose: This job is required to correct the errors found in job 6.1 (i.e., cards with incorrect ID-numbers or suspect card numbers).
Note 1: It is possible that this job does not have to be done; if you cannot find the input deck and if you find an "X" in the "date started" and "date completed" columns for job 6.2 on the log sheet, proceed to the next job.
Note 2: This job is similar to job 5.5; we will refer to that section for details.
Input: Deck CXY.6.1.2, listing LXY.5.1, deck CXY.6.1.1.
Action: Each one of the cards in deck CXY.6.1.2 is incorrect or seems to be incorrect. Locate the card on listing LXY.5.1.
Three types of error can have occurred (see section 5.5 for details):
1. Columns 3-4 do not contain "XY" (where "XY" stands for the number of the name group you are working on). If columns 3-4 contain a legitimate name-group number (i.e., less than 52 or equal to 70) and if the card is an S-card (check format in WAIS 667-013, Section 5.3), add the card to the appropriate deck if that has not been processed beyond job 6.1; otherwise, put the incorrect card in box CER.
2. Column 2 contains a number higher than "2". If you find another S-card on listing LXY.5.1 with a number one lower than your incorrect one in column 2, and with the same contents in columns 3-10 as your incorrect one, and with a "C" in column 77, your card is correct; add it to deck CXY.6.1.1 in the proper place (remember that this deck is sorted by ID-number, cols. 3-10, and card number, col. 2). If any of the above conditions is not satisfied, place the incorrect card in box CER.
3. Columns 3-10 contain an alphabetic punch (other than "NN" in columns 9-10). Try to determine the correct ID-number from the other cards on listing LXY.5.1; if you can determine the correct number, have the card duplicated with corrections by the keypunchers, then insert the corrected cards in their proper position in deck CXY.6.1.1. If you cannot determine the correct ID-number, put the incorrect card in box CER.
After all cards in CXY.6.1.2 have been processed, you should have: a) no more cards in CXY.6.1.2, and b) corrected cards inserted in CXY.6.1.1, and c) cards which cannot be corrected in box CER. Change the label of deck CXY.6.1.1 to CXY.6.2. Make a note of all the changes you made (including putting cards in CER) on listing LXY.5.1.
Output: Deck CXY.6.2 (updated input deck CXY.6.1.1).
6.3 Listing of Sorted S-Cards
Purpose: This job is needed to produce a list of the S-cards in the order in which they have been sorted.
Input: Deck of data cards CXY.6.2, program deck P.4.1.
Note: It is possible that you cannot find a deck CXY.6.2; in that case job 6.2 should have been marked as not required ("X" in the "date started" and "date completed" columns for job 6.2 on the log sheet) and there should be a deck labelled CXY.6.1.1. Use this deck as your input deck instead of CXY.6.2.
Action: Assemble the input deck by placing P.4.1 in front of CXY.6.2 (or CXY.6.1.1). Submit this combined deck at the input-output room in the Commerce Building; this is also the place where you pick up your output the next day or the day after that. Your output will be a listing (besides your original input cards). Label the listing LXY.6.3 and file it in a binder marked "sorted S-listings".
Output: Your input decks P.4.1 and CXY.6.2 (or CXY.6.1.1) - store these where you found them, and listing LXY.6.3.
6.4 Check for Missing or Superfluous Cards
Purpose: This job is required to ensure that we did not lose cards we should have had in our file, and that we did not include any cards which do not belong in this file.
Input: Card deck CXY.6.2, program deck P.5.7.
Note: It is possible that you cannot find deck CXY.6.2. In that case, you should find job 6.2 marked as not required and you should use cards CXY.6.1.1 instead of CXY.6.2.
Action: Assemble the input deck by placing the program cards P.5.7 in front of the data cards CXY.6.2 (or CXY.6.1.1) and submit this at the input-output room in the Commerce Building. Your output should be available the next day or the day after that. Your output should contain, besides your input cards, a listing with information about incorrect or missing S-cards. Label the listing LXY.6.4 and file it in a binder marked "S-error listings".
Note: It is possible that the listing does not contain any messages regarding inconsistencies. If this occurs, job 6.5 is not required; mark this off by writing an "X" in the "date started" and "date completed" columns for job 6.5 on the log sheet.
Output: Your input decks P.5.7 and CXY.6.2 (or CXY.6.1.1) - store these where you found them; and listing LXY.6.4.
6.5 Correction, Deletion or Addition of S-Cards
Purpose: This job is required to correct the errors found in job 6.4.
Note 1: It is possible that this job is not required; if you find that your input listing does not contain any error messages and if you find an "X" in the "date started" and "date completed" columns on the log sheet for this job, proceed to the next one.
Note 2: The processing of this job is similar to that for job 5.8; we will refer to that section for details.
Input: Listing LXY.6.4, card deck CXY.6.2, listing LXY.6.3.
Note: It is possible that you cannot find card deck CXY.6.2; use CXY.6.1.1 instead if you find job 6.2 marked on the log sheet as not required.
Action: Listing LXY.6.4 contains indications about missing or superfluous cards in deck CXY.6.2 (or CXY.6.1.1). Several situations may occur (for details see job 5.8):
1. Missing cards: a) The S-card immediately before the missing card can have a "C" in column 77 where it should have had a "Z", or b) the card is really missing. On the back of the codesheet you will find all the social security numbers this person has used on his tax returns; there should be one S-card less than there are social security numbers. Determine, with the aid of the information on the codesheet and the contents of listing LXY.6.3 (which contains all S-cards belonging to a person), which type of error occurred. If there is a card missing, code the contents on an 80-column sheet; if there is no card missing, have the contents of column 77 on the preceding card changed from "C" to "Z".
2. Superfluous cards: a) The preceding card should have had a "C" in column 77, or b) there is really a superfluous card. The information on the back of the codesheet and listing LXY.6.3 should enable you to determine what happened.
If the card is not superfluous, change column 77 on the preceding card from "Z" to "C"; if the card is really superfluous, drop it if there is another S-card completely identical to it, or put it in box CER if the card is unique.
When all the errors have been determined, have the cards punched from your codesheets; add the corrected cards in their appropriate positions to deck CXY.6.2 (or CXY.6.1.1) and change the number of the card deck to CXY.6.5. (Remember that card deck CXY.6.2, or CXY.6.1.1, has been sorted on ID-number, columns 3-10, and card number, column 2). Mark the changes you made to the cards on listing LXY.6.3; if you deleted a card (either destroyed or put in box CER), mark this also on listing LXY.5.1.
Output: Card deck CXY.6.5 (updated, corrected deck CXY.6.2 or CXY.6.1.1).
6.6 S-Card Edits
Purpose: This job is required to check for invalid punches and codes on the S-cards by means of a computer program with control cards.
Note 1: The processing of this job follows the same procedure as job 4.9 (described in WAIS 667-040); this section will, therefore, not contain many details, but will refer to job 4.9 instead.
Note 2: This job is to be done in a number of separate phases.
Phase 1: Submitting the Input to the Edit Program
Input: Deck of cards marked CXY.6.5, program deck PCE, control cards KMXY (where XY stands for the number of the name group you are working on), and K.6.6.
Note: It is possible that you cannot find a deck marked CXY.6.5. If job 6.5 was marked as not required, use deck CXY.6.2 instead. If you don't find this either, and if job 6.2 was also marked on the log sheet as not required, use deck CXY.6.1.1 as your input data card deck.
Action: Assemble the decks in the following order:
1) PCE
2) K.6.6, except the last card
3) KMXY
4) The last card of K.6.6
5) CXY.6.5 (or CXY.6.2 or CXY.6.1.1)
Submit this assembled deck at the input-output room in the Commerce Building. Your output should be available there the next day or the day after that.
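The deck selection and assembly order for Phase 1 can be sketched in Python as a reading aid. This is only an illustration under stated assumptions: decks are modelled as lists of card strings, and the function names `choose_data_deck` and `assemble_edit_deck`, together with the way the log-sheet "X" marks are passed in, are hypothetical and not part of the original procedure.

```python
def choose_data_deck(group, decks_on_hand, jobs_not_required):
    """Pick the data deck for job 6.6, following the Note in Phase 1.

    group             -- two-digit name-group number, e.g. "05" (the "XY")
    decks_on_hand     -- set of deck labels that can actually be found
    jobs_not_required -- set of job numbers marked with an "X" on the log sheet
    Returns the deck label to use, or None if the situation is not covered.
    """
    c65 = f"C{group}.6.5"
    c62 = f"C{group}.6.2"
    c611 = f"C{group}.6.1.1"
    if c65 in decks_on_hand:
        return c65
    if "6.5" in jobs_not_required:
        if c62 in decks_on_hand:
            return c62
        if "6.2" in jobs_not_required and c611 in decks_on_hand:
            return c611
    return None  # assumption: anything else is referred to the supervisor


def assemble_edit_deck(pce, k_control, kmxy, data):
    """Return the cards in submission order:
    PCE, K.6.6 without its last card, KMXY, last card of K.6.6, data deck."""
    if not k_control:
        raise ValueError("control deck K.6.6 must contain at least one card")
    return [*pce, *k_control[:-1], *kmxy, k_control[-1], *data]
```

For example, with job 6.5 and job 6.2 both marked as not required and only deck C05.6.1.1 on hand, `choose_data_deck("05", {"C05.6.1.1"}, {"6.5", "6.2"})` selects C05.6.1.1.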
Output: The original card decks (PCE + KMXY + K.6.6 + CXY.6.5) and a listing of incorrect cards, indicating the type of error on the card and the location of the error. Label the listing LXY.6.6.1 and store the input card decks where you found them.
Phase 2: Correcting the Errors Found by the Edit Program
1) If there are no error messages on LXY.6.6.1, file the listing in a binder labelled "S-error listings", mark the data card deck as CXY.6.6.1 and check off the "repeat" column on the log sheet as not required (see job 4.9 for more detail).
2) If there are error messages on the listing, investigate the errors and correct them, using the same procedure as was outlined for job 4.9.
3) Mark all corrections you are making to the cards on listing LXY.6.3.
Phase 3: Verifying the Corrections
Run the corrected deck CXY.6.5 (or CXY.6.2 or CXY.6.1.1) again through phases 1 and 2; repeat this procedure until there are no more errors that can be corrected (see job 4.9 for details). Your final output data deck for this section should be marked CXY.6.6.2 [N.B. the last digit indicates the number of times job 6.6 was run and can therefore be higher than 2 if the job was run more than twice, or lower than 2 if the job was run only once because there were no errors].
Output: The final output will be: the original program and control cards PCE, KMXY and K.6.6 (store these where you found them), corrected data cards CXY.6.6.2, listings LXY.6.6.1 and LXY.6.6.2.
6.7 S-Card to Tape Conversion
Purpose: This job transfers the data from the punched cards to a magnetic tape.
Input: Card deck CXY.6.6.2 (the last digit can be a number other than 2), program labelled P.6.7 and a scratch tape. Check with the supervisor, who will tell you how to get this tape and how to make sure that we can identify it later.
Action: Assemble the input cards in the order: P.6.7, CXY.6.6.2. Take the assembled input cards, as well as the tape, over to the input-output room in the Commerce Building.
Your output should be available the next day or the day after that.
Output: The original input cards (P.6.7 and CXY.6.6.2; store these where you found them) and your tape, which now contains the data. Label the tape TXY.6.7.
6.8 Check for Duplicate Social Security Numbers
Purpose: This job is to ensure that no two, or more, persons have the same social security number.
Input: Tape TXY.6.7, program deck P.6.8.
Action: Submit the input (cards and tape) at the input-output room in the Commerce Building. Your output should be available there the next day or the day after that.
Output: Your input cards (P.6.8) and tape (TXY.6.7), as well as a listing of incorrect records. Label the listing LXY.6.8 and file it in a binder marked "S-error listings"; store the cards and the tape where you found them.
Note: It is possible that listing LXY.6.8 does not contain any error messages. In that case, you can mark off job 6.9 as not required by putting an "X" in the "date started" and "date completed" columns for job 6.9 on the log sheet.
6.9 Correction of Incorrect S-Records
Purpose: This job is required to correct all cases, found in job 6.8, where two or more persons are using the same social security number.
Note 1: The processing of this job is similar to that for job 4.14; this section will refer to section 4.14 (in WAIS 667-040) for details.
Note 2: It is advantageous to do this job after completion of job 4.14 for the same name group. Decisions made during the execution of that job will frequently be helpful in correcting inconsistencies in job 6.9.
Input: Listing LXY.6.8, program UPDATEAL, tape TXY.6.7, a scratch tape.
Action: This job is to be done in a number of separate phases.
Phase 1: Locating and identifying the errors (for details see job 4.14)
1. From the tax returns, check the social security numbers of the people whose records you are investigating.
If the error involves one or more "secondary" social security numbers, you have to check the back of the codesheets for these persons as well (all secondary numbers should be written here).
2. Set up control cards to correct mispunched or miscoded numbers.
3. For husband-wife groups, decide who is the legal "owner" of the social security number; change the other one to blanks, unless there is a different social security number written on one of the returns (this applies to the "primary" numbers; if at least one of the numbers involved is a "secondary" number, you may have to delete the whole record).
4. Mark the remainder of the errors on listing LXY.6.8 as "cannot be corrected".
5. Mark all the changes you made on the record in listing LXY.6.3; if you deleted any records, indicate this also on listing LXY.5.1.
Phase 2: Making the corrections (for details see job 4.14)
1. Have your control cards punched in room 4441.
2. Run program UPDATEAL. Mark the output tape (the original scratch tape) TXY.6.9.
Phase 3: Verification of the Corrections
Run job 6.8 again, using TXY.6.9 as input. The output listing you receive this time should not contain any errors except the ones you found and marked as "cannot be corrected" when you ran job 6.8 the first time. If there are other errors, repeat job 6.9.
Output: Your input program cards (UPDATEAL; store where you found them), input tape TXY.6.7 (store where you found it), and the corrected data tape (the original scratch tape, which now should contain the corrected data from TXY.6.7; label it TXY.6.9).

6.10 Sort and Merge of S-Records
Purpose: This job is needed to add the segment of the S-file which has now gone through a number of consistency-checks to the segments which previously completed the process.
Input: Tape TXY.6.9; program P.6.10; the previous S-tape (labelled TS#ZZ, where "ZZ" is a sequence counter indicating the number of segments on that tape) and a scratch tape.
Note 1: It is possible that you cannot find a tape TXY.6.9; if job 6.9 was marked off as not required (an "X" in the "date started" and "date completed" columns for job 6.9 on the log sheet), use tape TXY.6.7 instead.
Note 2: For the very first segment to reach this job, you will not have a tape TS#00. For that case, job 6.10 can be performed by simply marking tape TXY.6.9 (or TXY.6.7) as TS#01.
Action: For all segments after the first one, take your input tapes and program P.6.10 to the input-output room in the Commerce Building, where your output should be available the next day or the day thereafter.
Output: An "updated" tape, containing the combined information from tapes TS#ZZ and TXY.6.9. Mark the output tape TS#ZZ1, where ZZ1 is a number one higher than ZZ. (For an example, see Section 5.15.)

7.1 Sort of A-Cards
Purpose: This job has to be run to produce a sorted file of A-Cards; it will also indicate some incorrect cards.
Input: Card deck CXY.5.2.5.
Action: Sort the input deck on card number (column 2), on year (columns 11-12), and on ID-number (columns 3-10) numerically [Note: There may be some cards with "NN" in columns 9-10; this is acceptable]. Put aside the following groups of cards:
--all those with punches other than "1" or "2" in col. 2;
--all those with alphabetics anywhere in columns 3-10 (except "NN" in columns 9-10);
--all those with punches other than [XY] in columns 3-4 (where [XY] stands for the number of the name-group you are working on);
--all those with columns 11-12 less than "59" or more than "64".
Combine all these cards into one deck.
Output: Sorted deck of A-Cards; label it CXY.7.1.1. Possibly also a small deck of incorrect cards; label this deck CXY.7.1.2. If you do not get any cards in CXY.7.1.2, mark off job 7.2 as not required (by putting an "X" in the "date started" and "date completed" columns on the log sheet).
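The put-aside rules of job 7.1 can be restated as a small program. What follows is only an illustrative sketch in Python, not part of the original procedure: it assumes each card is held as an 80-character string, the function names are hypothetical, and a non-numeric year is treated as out of range (the text does not say so explicitly). The sort key is taken to be ID-number, year and card number, which is the order in which job 7.2 describes the resulting deck.

```python
from typing import List, Tuple


def must_put_aside(card: str, name_group: str) -> bool:
    """Return True if an A-card falls in one of the job 7.1 put-aside groups."""
    card = card.ljust(80)     # pad short card images to 80 columns
    card_no = card[1]         # column 2
    id_number = card[2:10]    # columns 3-10
    year = card[10:12]        # columns 11-12

    if card_no not in ("1", "2"):
        return True
    # "NN" in columns 9-10 is acceptable; letters elsewhere in 3-10 are not.
    checked = id_number[:6] if id_number[6:] == "NN" else id_number
    if any(ch.isalpha() for ch in checked):
        return True
    if id_number[:2] != name_group:
        return True
    # Assumption: a non-numeric year is treated as out of range.
    if not year.isdigit() or not 59 <= int(year) <= 64:
        return True
    return False


def sort_a_cards(deck: List[str], name_group: str) -> Tuple[List[str], List[str]]:
    """Split a deck into (sorted good cards, put-aside cards)."""
    put_aside = [c for c in deck if must_put_aside(c, name_group)]
    good = [c for c in deck if not must_put_aside(c, name_group)]
    # Final order: ID-number, then year, then card number.
    good.sort(key=lambda c: (c[2:10], c[10:12], c[1]))
    return good, put_aside
```

In this sketch the first list plays the role of deck CXY.7.1.1 and the second that of deck CXY.7.1.2; an empty second list corresponds to marking job 7.2 as not required.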
7.2 Correction of Incorrect A-Cards
Purpose: This job is required to investigate and, where necessary, to correct the errors found in job 7.1.
Note 1: If you find an "X" in the "date started" and "date completed" columns on the log sheet for this job, and if you don't find card deck CXY.7.1.2, you can proceed to the next job.
Note 2: This job is similar to job 5.5; we will refer to that section for details.
Input: Deck CXY.7.1.2, listing LXY.5.1, deck CXY.7.1.1.
Action: Each of the cards in CXY.7.1.2 is incorrect or seems to be incorrect. Locate the card on listing LXY.5.1. The following types of error can have occurred (see section 5.5 for details).
1. Columns 3-4 do not contain "XY" (where "XY" stands for the number of the name-group you are working on). If columns 3-4 contain the number of a legitimate name-group (i.e. < 52 or = 70), and if the card is an A-card (check the format in WAIS 667-013), add the card to the appropriate deck if and only if that deck has not been processed beyond job 6.1; otherwise put the incorrect card in box CER.
2. Column 2 contains a number higher than "2". If you find another A-card on LXY.5.1 with a number one lower than your incorrect one in column 2, with the same contents in columns 3-12 (ID-number and year!) as your incorrect one, and with a "C" in column 77, your card is correct; add it to deck CXY.7.1.1 in the proper place (this deck has been sorted by ID-number, year and card number). If any of the above conditions is not satisfied, place the incorrect card in box CER.
3. Columns 3-12 contain an alphabetic punch (other than "NN" in columns 9-10). Try to determine the correct ID-number or year from the other cards on LXY.5.1; if you can determine the correct number, have the card duplicated with corrections by the keypunchers, then insert the corrected card in its proper position in deck CXY.7.1.1. If you cannot determine the correct ID-number or year, put the incorrect card in box CER.
4. Columns 11-12 contain numerics outside the valid range. Valid punches for these columns are 59 through 64 (i.e. > 58 and < 65, the same range as in job 7.1). If you find a card with numerics outside this range, locate the card on listing LXY.5.1. It is possible that you will be able to find another card for this person (check columns 3-10, which contain the ID-number) either immediately before your incorrect one or immediately after it. If you find that the card immediately before your incorrect one belongs to the same person, has a valid year punched in columns 11-12, and has a "C" in column 77, you can assume that your incorrect card should have the same contents in columns 11-12 as the card immediately before it; or if you find that the card immediately following it has the same ID-number, has a number higher than "1" in column 2 and has a valid year punched in columns 11-12, you can assume that your incorrect card should have the same contents in columns 11-12 as the card immediately following it. If any of the conditions above is not satisfied, locate the folder containing the tax returns for the person whose card you are trying to correct. Search through the returns to locate all miscellaneous amounts you can find; for each such amount you should be able to determine the year for which this amount was reported. Compare these years with the years indicated in columns 11-12 for the A-cards you can find in listing LXY.5.1; this may help you to determine the correct contents of columns 11-12. If you can identify the correct year, have the card duplicated with corrections, and insert the corrected card at the proper place in deck CXY.7.1.1; if you cannot identify the correct year, place the card in box CER.
After all cards in CXY.7.1.2 have been processed, you should have:
a) no more cards in CXY.7.1.2, and
b) corrected cards inserted in CXY.7.1.1, and
c) cards which could not be corrected in box CER.
Make a note on listing LXY.5.1 for all cards you put in box CER. Change the label of deck CXY.7.1.1 to CXY.7.2.
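The neighbour rule for error type 4 above (take the year from the card immediately before or immediately after the incorrect one) can be sketched as follows. This is a hedged illustration in Python, not the procedure's own program: the listing is assumed to be a list of 80-character card images in listing order, the function names are hypothetical, and the valid range is assumed to be 59 through 64 as in job 7.1. A result of None stands for "go to the tax returns".

```python
from typing import List, Optional


def valid_year(year: str) -> bool:
    """Columns 11-12 are valid when they hold a number from 59 to 64."""
    return year.isdigit() and 59 <= int(year) <= 64


def infer_year(listing: List[str], i: int) -> Optional[str]:
    """Try to infer columns 11-12 of listing[i] from its neighbours."""
    card = listing[i]
    person = card[2:10]                      # columns 3-10, the ID-number

    if i > 0:
        before = listing[i - 1]
        # Same person, valid year, and a "C" in column 77.
        if (before[2:10] == person
                and valid_year(before[10:12])
                and before[76:77] == "C"):
            return before[10:12]

    if i + 1 < len(listing):
        after = listing[i + 1]
        # Same person, card number higher than "1", and a valid year.
        if (after[2:10] == person
                and after[1:2].isdigit() and int(after[1:2]) > 1
                and valid_year(after[10:12])):
            return after[10:12]

    return None                              # consult the tax returns instead
```

The sketch only proposes a year; duplicating the card with corrections and inserting it in deck CXY.7.1.1 remain clerical steps.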
Output: Deck CXY.7.2 (updated input deck CXY.7.1.1).

7.3 Listing of Sorted A-Cards
Purpose: This job is needed to produce a listing of the A-cards sorted by ID-number, year and card number.
Input: Deck of data cards CXY.7.2, program deck P.4.1.
Note: If you find an "X" in the "date started" and "date completed" columns on the log sheet for job 7.2, use deck CXY.7.1.1 instead of CXY.7.2.
Action: Assemble the input deck by placing P.4.1 in front of CXY.7.2 (or CXY.7.1.1). Submit this combined deck at the input-output room in the Commerce Building; this is also the place where you pick up your output the next day or the day after that. Your output will be a listing (besides your original input cards). Label the listing LXY.7.3 and file it in a binder marked "Sorted A-listings".
Output: Your input decks P.4.1 and CXY.7.2 (or CXY.7.1.1); store these where you found them, and listing LXY.7.3.

7.4 Check for Missing or Superfluous Cards
Purpose: This job is required to ensure that we did not lose cards we should have had, and that we did not include cards which do not belong in the A-file.
Input: Card deck CXY.7.2, program deck P.7.4.
Note: If job 7.2 was marked as "not required" on the log sheet, use card deck CXY.7.1.1 instead of CXY.7.2.
Action: Assemble the input deck by placing P.7.4 in front of the data cards CXY.7.2 (or CXY.7.1.1) and submit this at the input-output room in the Commerce Building. Your output should be available there the next day or the day after that. Your output should contain, besides your input cards, a listing with information about incorrect or missing A-cards. Label the listing LXY.7.4 and file it in a binder marked "A-error listings".
Note: It is possible that the listing does not contain any messages regarding inconsistencies. If this occurs, job 7.5 is not required; mark this on the log sheet by putting an "X" in the "date started" and "date completed" columns for job 7.5.
Output: Your input decks P.7.4 and CXY.7.2 (or CXY.7.1.1) -- store these where you found them, and listing LXY.7.4.

7.5 Correction, Deletion or Addition of A-Cards
Purpose: This job is needed to correct the errors and inconsistencies found in job 7.4.
Note 1: If you find the "date started" and "date completed" columns for this job on the log sheet marked with an "X", you can proceed to the next job.
Note 2: This job is similar to job 5.8; we will refer to that section for details.
Input: Listing LXY.7.4, card deck CXY.7.2, listing LXY.7.3.
Note: If job 7.2 was marked on the log sheet as "not required", use deck CXY.7.1.1 instead of CXY.7.2.
Action: Listing LXY.7.4 contains indications about missing or superfluous cards in deck CXY.7.2 (or CXY.7.1.1). Several situations may occur:
1. Missing cards: The program which checked for missing and superfluous cards in job 7.4 used the following rules:
a) If a particular card had a "C" in column 77, there should be at least one other A-card for that year for that person;
b) There should be as many A-cards for that person and year as the number of the A-card (for that person and year) with a "Z" in column 77.
This means that several situations could have caused the error message to be printed on listing LXY.7.4:
(i) the "C" should have been a "Z" (indicating that no more A-cards for this person and year were to be expected);
(ii) column 2 on the last card for that person and year contains an incorrect (too high) number;
(iii) a card was lost or, due to mispunches in cols. 3-12, misplaced.
From the information on listing LXY.7.3, supported (if necessary) by the source information in the folder containing the tax returns of the person whose ID-number should be in columns 3-10 of the missing card, you should be able to determine what caused the error message on listing LXY.7.4.
You can determine how many A-cards were supposed to be present by looking at the tax return for the year indicated in columns 11-12 and finding all the "miscellaneous amounts" (see WAIS 667-013, page 28 for the various possible types of "miscellaneous amounts"; each card can contain up to six different amounts). After you have determined what caused the error message, make the appropriate correction: if there is really a card missing, code the contents of the missing card on an 80-column sheet; at the end of the job, have all such cards punched and insert them at the proper place in deck CXY.7.2 (or CXY.7.1.1); remember that the cards in this deck are sorted by ID-number, year of return and card number. If the error was caused by a mispunch in an existing card, have it duplicated with the appropriate corrections and insert the corrected card at the proper place in deck CXY.7.2 (or CXY.7.1.1). Mark the change on listing LXY.7.3.
2. Superfluous cards: Using rules similar to those regarding missing cards, the program run in job 7.4 decided a card was superfluous if:
(i) it was preceded by a card with the same ID-number and year (cols. 3-12) as the "superfluous" card, and a "Z" in column 77;
(ii) there were more cards with the same ID-number and year than was indicated by the card number (col. 2) on the last one of them (i.e., the one with a "Z" in col. 77).
The following kinds of errors could have caused the inconsistency:
(i) the "Z" in the "last" card should have been a "C" (while the card following it should have a "Z" in column 77);
(ii) column 2 on the last card contains an incorrect (too low) number;
(iii) there is actually a superfluous card.
Determine what the cause of the error is by locating the cards on listing LXY.7.3; this will usually be sufficient to indicate the first two types of error. If listing LXY.7.3 does not clarify the situation enough, locate the tax returns.
Check the "miscellaneous amounts" for the year which is punched in columns 11-12 of your superfluous card(s); this will usually confirm whether the card is superfluous or not. If the superfluous card is completely identical to another card for that person and year, you don't have to go through all that trouble: you can just destroy the card (mark this on listings LXY.7.3 and LXY.5.1). If the card is superfluous, but you cannot determine where it belongs, put it in box CER (mark this, too, on listings LXY.7.3 and LXY.5.1). If there is an error on the card (such as the first two types indicated), have the card duplicated with corrections; insert the corrected card at the proper place in deck CXY.7.2 (or CXY.7.1.1). Mark the correction on listing LXY.7.3.
After you have made all corrections to the cards listed as incorrect on listing LXY.7.4, change the label of deck CXY.7.2 (or CXY.7.1.1) to CXY.7.5.
Output: Card deck CXY.7.5 (corrected deck CXY.7.2 or CXY.7.1.1).

7.6 A-Card Edits
Purpose: This job is required to check for invalid punches and codes on the A-Cards by means of a computer program with control cards.
Note 1: The processing of this job follows the same procedure as job 4.9 (described in WAIS 667-040); this section will, therefore, refer to section 4.9 for most details.
Note 2: Like job 4.9, this job runs in a number of separate phases.

Phase 1: Submitting the Input to the Edit Program
Input: Deck of data cards CXY.7.5, program deck PCE, control card decks KMXY (where XY stands for the name group number) and K.7.6.
Note: It is possible that you cannot find a deck CXY.7.5; in that case, use deck CXY.7.2 if job 7.5 was marked off as "not required", or deck CXY.7.1.1 if job 7.2 was marked off as "not required" too.
Action: Assemble the deck in the following order:
1) PCE
2) K.7.6, except the last card
3) KMXY
4) the last card of K.7.6
5) CXY.7.5 (or CXY.7.2 or CXY.7.1.1).
Submit this assembled deck at the input-output room in the Commerce Building.
Your output should be available there the next day or the day after that.
Output: The original card deck (PCE + KMXY + K.7.6 + CXY.7.5) and a listing of incorrect cards, indicating the type of error on the card and the location of the error. Label the listing LXY.7.6.1 and store the input cards where you found them.

Phase 2: Correcting the Errors Found by the Edit Program
1) If there are no error messages on LXY.7.6.1, file the listing in a binder labelled "A-error listings", mark the data card deck as CXY.7.6.1, and check off the "Repeat" column on the log sheet (see section 4.9 in WAIS 667-040 for details).
2) If there are errors on the listing, investigate them and correct them, using the same procedure as was outlined for job 4.9.

Phase 3: Verifying the Corrections
Run the corrected deck CXY.7.5 (or CXY.7.2 or CXY.7.1.1) again through phases 1 and 2; repeat the procedure until there are no more errors that can be corrected (see job 4.9 for details). Your final output data deck for this section should be labelled CXY.7.6.2 [or a number higher than 2 if job 7.6 was run more than twice, or a number lower than 2 (i.e., 1) if job 7.6 was run only once and no errors were found at all].

Output: The final output will be: original program and control cards PCE, KMXY and K.7.6, corrected data cards CXY.7.6.2, and listings LXY.7.6.1 and LXY.7.6.2.

7.7 A-Card to Tape Conversion
Purpose: This job is needed to transfer the data from the punched cards to a magnetic tape.
Input: Card deck CXY.7.6.2 (the last digit can be a number other than 2), program labelled P.7.7, and a scratch tape. Check with the supervisor, who will tell you how to get this tape and how to make sure that we can identify it later.
Action: Assemble the input cards in the order: P.7.7, CXY.7.6.2. Take the assembled input cards, as well as the tape, over to the input-output room in the Commerce Building. Your output should be available there the next day or the day after that.
Output: The original input cards (P.7.7 and CXY.7.6.2; store these where you found them) and your tape (which now contains the data); label the tape TXY.7.7.

8.1 Sort of L-Cards
Purpose: This job has to be run to produce a sorted file of L-cards; it will also indicate some incorrect cards.
Input: Card deck CXY.5.2.6.
Action: Sort the input deck on card number (column 2), year (columns 11-12) and ID-number (columns 3-10) numerically. The only alphabetics which may occur are "NN" in columns 9-10. Put aside all cards with the following characteristics:
a) cards with a number other than "1" in column 2;
b) cards with numerics higher than "64" or lower than "59" in columns 11-12;
c) cards with punches other than "XY" in columns 3-4 (where "XY" stands for the number of the name group you are working on);
d) cards with alphabetics anywhere in columns 2-12, except "NN" in columns 9-10.
Combine all the cards you put aside (if any) into one deck.
Output: Sorted deck of L-cards; label it CXY.8.1.1. Possibly also a small deck of incorrect cards; label that CXY.8.1.2. If there are no incorrect cards, mark off job 8.2 as not required by putting an "X" in the "date started" and "date completed" columns for that job on the log sheet.

8.2 Correction of Incorrect L-Cards
Purpose: This job is required to correct the errors found in job 8.1.
Note 1: If you find an "X" in the "date started" and "date completed" columns on the log sheet for this job, and if you do not find input cards CXY.8.1.2, you can proceed to the next job.
Note 2: This job is similar to jobs 5.5 and 7.1; we will refer to those sections for details.
Input: Deck CXY.8.1.2, listing LXY.5.1, deck CXY.8.1.1.
Action: Each one of the cards in CXY.8.1.2 is incorrect or seems to be incorrect. Locate the card on listing LXY.5.1. The following types of error can have occurred (see sections 5.5 and 7.1 for details).
1. Columns 3-4 do not contain "XY" (where "XY" stands for the number of the name group you are working on).
If columns 3-4 contain a legitimate name group number (i.e. < 52 or = 70) and if the card is an L-card (check the format in WAIS 667-013), add the card to the appropriate deck if that deck has not yet been processed beyond job 8.1; in all other circumstances, put the incorrect card in box CER.
2. Column 2 contains a number other than "1". If you find another L-card on listing LXY.5.1 with a number one lower than your incorrect one in column 2, with the same contents in columns 3-12 (ID-number and year!) as your incorrect one, and with otherwise the same contents as your incorrect one (i.e., columns 13-80), you can delete your incorrect card. If you find another card and the contents are different, or if you don't find another card: get the folder containing the tax returns for the person whose ID-number you find in columns 3-10 of your incorrect card. Locate all assessments for the year indicated in columns 11-12 of your incorrect card. If you find more than one assessment for the same person for the same year (this is very unlikely, but not impossible!), there should be more than one L-card; if you find only one assessment for that person and year, there should be only one L-card, which should have a "1" in column 2. If your incorrect card just has an incorrect number, have it duplicated with a corrected column 2, and add it to deck CXY.8.1.1 (remember that this is sorted by ID-number, year and card number); if your incorrect card does not belong where it is, place it in box CER.
3. Columns 3-12 contain an alphabetic punch (other than "NN" in columns 9-10). Try to determine the correct ID-number or year from the other cards on LXY.5.1; if you can determine the correct number, have the card duplicated with corrections by the keypunchers, then insert the corrected card in its proper position in deck CXY.8.1.1. If you cannot determine the correct ID-number or year, put the incorrect card in box CER.
4. Columns 11-12 contain numerics outside the valid range.
Valid punches for these columns are > 58 and < 65. If you find a card with numerics outside this range, locate the card on listing LXY.5.1. It is possible that you will be able to find another card for this person (check columns 3-10, which contain the ID-number) immediately before your incorrect one. If you find a card with an "L" or a "K" in column 77 (for the same person, of course!) and you don't find an L-card for that person and year, you may assume that the contents of columns 11-12 of your incorrect card should be the same as those for the card you found on LXY.5.1. If you do not find any card for that person with a "K" or "L" in column 77, place your incorrect card in box CER; if you do find a card with "K" or "L" in column 77, but you also find an L-card for that person for the designated year, use the tax returns to determine what happened. If your incorrect card just has an incorrect punch, have the card duplicated with corrections and add it at the proper place to deck CXY.8.1.1; if your incorrect card does not seem to belong in our files, put it in box CER.

http://www.ssc.wisc.edu/wais/WAIS667042.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667042.txt

John deVries. 1967. Requirements for an Integrated Data Processing System for WAIS Data Files. June 15, 1967. WAIS paper 667-044. Administration Maintenance System - Files, Data, Etc.

1. General requirements
In all cases where a data file is so large that maintaining it with a computer-based maintenance system is cheaper than re-creating the file whenever changes are to be made, a file-maintenance system should be produced which will allow for the following types of changes to be made:
a) add records;
b) delete records; where multiple observations exist, either (i) delete all observations for a particular unit, or (ii) delete only specific observations for a particular unit;
c) delete variables within all observations;
d) add variables to all observations (provided that these variables can be constructed by specific operations on already existing variables);
e) change the contents of specific variables in specific observations.

1.1. The concept of the "double identifier"
Another essential feature of a file-maintenance system results from the necessity to safeguard against the application of changes to the wrong records, due to mispunches, clerical errors, etc. The "double identifier" concept simply means that a unit for whom observations are contained on the file is identified by two fields on the record, instead of one. Many of the WAIS files already carry this feature in the form of an ID-number (the "primary" identifier) and a Social Security number (the "secondary" identifier) for the same individual. The apparently redundant secondary identifier must be used by the file-maintenance system in the following way:
(i) the maintenance program locates the record, indicated by the primary identifier, on the file affected (this file has been sorted on the primary identifier);
(ii) the program tests, on that record, the contents of the secondary identifier-field (as indicated on the change-generating item);
(iii) if the test sub (ii) fails, the change is not made, and a message to that effect is produced on the printer;
(iv) if the test sub (ii) passes, the change is made.

1.2. Record of changes made
As an additional safeguard against the introduction of additional errors as a result of improper correction of other errors (other fields in the corrected record may be affected), a file-maintenance program should produce a printed version of the affected record, before as well as after the change was made. For very small files, where it is desirable as well as possible to maintain a printed version of the current file, this device will allow the researcher to "update" his printed file at a very low cost and with a minimal time-lapse after his tape-file has been updated. In general, such printouts allow for some feedback to ensure that the change was made correctly; they may also be used effectively in the investigation of the consequences of the change for other variables in the same record.

1.3. Activity-date
Not only do files, as a whole, require a date-field or some other field, contained in the file-label, to specify when the latest activity took place; individual records require a similar field. When a record is changed by the maintenance-program, the activity-date field on the new record should reflect the date, or the number of the updating cycle, when that particular change was made. Especially for files undergoing many changes in relatively short periods of time, this feature is helpful to "trace back" the status of specific records in previous stages.
These requirements are, in my opinion, minimum general standards for a file-maintenance system. For variable-length records, additional features are essential to deal with addition and deletion of specific variables for specific observations (involving search-routines, checks for record-length, etc.).

2. Specific requirements for WAIS data files
There are, in addition to the general requirements stated above (which apply, ideally, to all files which deserve proper maintenance), requirements caused by the specific nature of the WAIS file-complex:

2.1. Interlocking files
The most distinctive aspect of the WAIS file-complex is the fact that many of the files are "interlocking", i.e. there are overlaps between records on different files for the same individual. This implies that changes made to a record on one file will frequently affect records on other files; in many cases, changes will have to be made on several files simultaneously. So far, decisions regarding the files affected by a change, and regarding the specific actions to be taken, have been made by the staff. The danger with this approach is that errors, ignorance and logical fallacies (such as the assumption that the tape-files contain exactly what they should contain according to the source-documents) will introduce inconsistencies in the files, certain files will be "forgotten", etc. A much more rigorous control-procedure therefore seems essential to maintain the internal and external consistency which is desirable in a complicated data-archive such as WAIS. There is an existing proposal for rather limited file-control through the ID-file (see WAIS 667-024); in view of other aspects of the longer-range WAIS plans, as well as the very strict requirements for data-quality control set in general for data-archives (which, in my opinion, are not met by the existing proposal), I will submit a much more elaborate proposal in section 3 of this paper.

2.2. History File [HF] and Selection File [SF]
Another specific aspect of the WAIS-complex, directly resulting from the multitude of files and the overlaps between them, is the History File. This file was created to be an "archive", indicating the presence or absence of records on the various WAIS files, for all individuals for whom at least one of the WAIS files has at least one record.
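To make the History File idea concrete, the following Python sketch builds a presence/absence table from the sets of ID-numbers found on each component file. It is an illustration under stated assumptions only: the function name, the in-memory dictionary layout and the sample file names are hypothetical, and nothing here describes how the actual HF was laid out on tape.

```python
from typing import Dict, Set


def build_history_file(component_files: Dict[str, Set[str]]) -> Dict[str, Dict[str, bool]]:
    """Map each ID-number to a presence flag per component file.

    `component_files` maps a file name to the set of ID-numbers that
    have at least one record on that file (hypothetical layout).
    """
    # Every individual with a record on at least one file gets an HF entry.
    all_ids: Set[str] = set()
    for ids in component_files.values():
        all_ids |= ids

    return {
        pid: {name: pid in ids for name, ids in component_files.items()}
        for pid in sorted(all_ids)
    }
```

For example, calling it with `{"master": {"001", "002"}, "property": {"002"}}` yields an entry for "001" marked present on "master" and absent on "property". Re-running such a build after every change to a component file is one way to keep the table current, which is the maintenance gap the text goes on to describe.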
While the creation of the HF was an important step towards reliable file-control, an important and regrettable omission in the scheme was that the file, as it now exists, does not have any maintenance-programs available; furthermore, changes which have subsequently been made to other WAIS files have not been reflected in the HF and thus already invalidate the current version of the HF. The Selection File, the creation of which has been proposed in WAIS 667-012, is essentially a logical extension of the History File; instead of simply indicating presence or absence of records, an attempt is made to specify the various "categories" of absence which can be inferred from the available information. As such, the creation of the SF would require the construction of a file-maintenance system if it is ever to be used effectively as a tool for the analyses of WAIS data files or subsets thereof. In addition, an extraction program will be required to facilitate the creation of specific subsets of WAIS data files for specific analysis-purposes.

3. Proposal for a WAIS file-maintenance system
From the discussion in the first two sections of this paper, we can see that the following interconnected problems require a solution within a reasonably short time:
a) creation of a Selection File;
b) creation of a maintenance system for the Selection File;
c) creation of a general extract system based on the Selection File;
d) creation of a WAIS data file maintenance-system allowing for rigorous control over all data files, giving feedback on changes already made and making standard decisions regarding potential effects of changes on other WAIS files.
The following proposal is an attempt to solve all these problems simultaneously. The plan has been set up in phases, which can be implemented separately, to produce the highest payoff in the shortest time. The following phases are suggested (several of these can be produced simultaneously if so desired):
1. Create an HF which corresponds to the status quo of each of the component files. There are two ways to achieve this: (i) rerun the programs which were used to create the HF initially (i.e. re-create the file), making corrections as required in the various stages; (ii) check each of the component files against the existing HF and correct the inconsistencies by means of a special single-purpose program.
2. Write a maintenance program for the up-to-date HF, to add records, delete records, change ID-numbers on records or change fields on selected records.
3. Run checks on internal consistency within records on the newly created HF and correct all errors thus revealed (for component files using existing updating programs, for the HF using the maintenance system produced in phase 2).
4. Run checks on external consistency between the HF and each of the component files; correct errors found. [N.B. if the HF was created through comparisons of the old HF with all component files, this step may not be necessary; its main function is to ensure that the creation of internal consistency in phase 3 did not introduce external inconsistencies.]
5. Run phases 3 and 4 over to eliminate newly introduced internal or external inconsistencies; this cycling has to be repeated until no more inconsistencies can be found.
6. Create a Selection File (i.e. expansion of the information contained on the HF); amend the HF maintenance system to operate on the SF instead.
7. Create a two-stage extract program to be used in connection with the SF; the two stages would have the following specifications:
Stage 1:
Input: (i) Selection File; (ii) control statements specifying: a) files affected; b) conditions for records to be extracted; c) operations on missing records, based on "gap-plugging codes" (e.g. no extract or construction of dummy-record, etc.).
Output: (i) tape-file of extract keys specifying records to be extracted from specified component file(s); (ii) printer-record indicating which data records are to be extracted.
Stage 2: Input: (i) extract keys produced in stage 1 above; (ii) component file(s) to be extracted. Output: extracted records.
8. Create a general, program-controlled WAIS file maintenance system, based on the information in the SF and consisting of the following stages:
a) Control cards specifying the changes to be made are run against the SF. Validity checks on the changes are made; the consequences, for all WAIS files, of the changes are investigated and file-specific change-items are generated on the basis of the information contained in the SF. Rejected change-items will be printed out at this stage; it may be possible to generate some warnings about potential inconsistencies resulting from changes at this stage as well.
b) File-specific change-items (from stage a) are sorted, in this stage, by "label", thus combining all changes to be made to specific files, in the order in which they have to be made.
c) Changes are made to all files affected. A suggested order is: (i) identification file; (ii) master file; (iii) property file; (iv) benefit file; (v) 805 data file; (vi) survey file. All changes to be made should use the features discussed in section 1 of this paper, i.e. the "double identifier", the "before-and-after" printout, and the activity-field updating. The programs used could be file-specific modifications of generalized file-maintenance programs (UPDATEAL, described in WAIS 667-025, or the general file-maintenance system currently being developed by the SSRI Computation Division should be considered for this stage); this would seem to be more desirable than to have six separate file-specific programs.
Before changes are made to the separate files, tests should be made, primarily of course the test of the "secondary identifier", but possibly also some tests about inconsistencies which would be caused by the change to be made. If definite inconsistencies would result, or if the identifier tests fail, the change should not be made. A notification of this should appear on the printer-file; also, "reversal-items" should be generated to eliminate changes to be made for this observation in subsequent files. If the tests do not fail, the changes should be made, and both a printout recording the change and a record indicating the change made should be produced, the latter to be used in stage d) below.
d) All changes actually made to the component files (as indicated by the records produced in stage c) above) are now made on the SF to make this file compatible with all component files in their updated version.
e) A clerical stage: the printer-file produced two types of output, each of which requires special clerical follow-up: (i) "before-and-after" records of the changes made. These outputs can be checked against a list of the original change-specifications, then filed in binders (to keep a record of the changes made to the file) and eventually destroyed; (ii) rejects due to errors and warnings of potential errors resulting from the changes: in most of these cases, the reasons (for rejects) or consequences (for warnings) will require further investigation, followed by changes to be applied in a following updating cycle.
http://www.ssc.wisc.edu/wais/WAIS667044.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667044.txt
Richard Bauman, Ashok Bhargava. 1967. Transcribing and Imputing AGI and NTI Fields for the 443 Character Master File. June 19, 1967. WAIS paper 667-048. Master File - Tax Records.
Richard Bauman Ashok Bhargava WAIS 667-048 19 June 1967
Transcribing and Imputing AGI and NTI Fields for the 443 Character Master File
1. General
1.1.
This procedure should provide additional summary income data (transcribed or imputed AGI and NTI) for a subset of records for which this data was not previously available in the "normal" AGI and NTI fields in the Master File (AGI-B153 and NTI-B171 or B306). Most of these records are the "Form Type 6" (Incomplete) Records, where the "Taxable Income Incomplete Form" field (B378) contains either AGI or NTI (depending upon year filed and the "block or column" indicator).
1.2. Although the immediate application of the following procedure is in generating input for the longitudinal analysis of income (see "PREDAVID" in von Schneidemesser, Plan of Operations..., WAIS 667-046), the following procedure should generate useable AGI and NTI fields for all Master File Records affected.
1.3. AGI-NTI Computation Key: An indicator of the type of imputation shall be appended as field B442 to the old 442 character Master File, resulting in a new 443 character Master File. The indicator is described below.
1.4. It would be useful to have a count of the new indicators, by type, in the Master File.
2. Determination of AGI and NTI, and assigning the AGI-NTI computation key (B442)
2.1. If AGI(B153) and at least one NTI field (B171 or B306) do not equal Blank or MI, assign AGI-NTI computation key (B442) = 0.
2.2. If AGI(B153) and both NTI(B171 and B306)(all) = Blank or MI, and form type Indicators (B387) = 6 or (B388) = 1, and Taxable Income Incomplete Form (B378) does not equal Blank or MI, and ...
2.2.1. Block or column indicator (B379) = Blank or 0; Year (B11) = 46-52 or 57-58:
2.2.1.1. Let AGI(B153) = T.I(B378) + $450; if T.I(B378) > $4550.
2.2.1.2. Let AGI(B153) = T.I(B378) x 1.1; if $0 < T.I(B378) < $4550.
2.2.1.3. Let AGI(B153) = T.I(B378); if T.I(B378) < $0.
2.2.1.4. Let NTI, standard deduction basis (B171) = T.I(B378).
2.2.1.5. Let (B442) = 1.
NOTE: T.I(B378) = (Reported) NTI by whatever basis; if the taxpayer itemised, imputed AGI is likely to be less than actual AGI.
2.2.2.
Block or column indicator (B379) = 1; Year (B11) = 53-56:
2.2.2.1. Let AGI(B153) = T.I(B378).
2.2.2.2. Let NTI, standard deduction basis (B171) = T.I(B378) - $450; if T.I(B378) > $5000.
2.2.2.3. Let NTI, standard deduction basis (B171) = T.I(B378) x .91; if $0 < T.I(B378) < $5000.
2.2.2.4. Let NTI, standard deduction basis (B171) = T.I(B378); if T.I(B378) < $0.
2.2.2.5. Let (B442) = 2.
NOTE: T.I(B378) = AGI; and the taxpayer used a table which gave tax based on the standard deduction, therefore, no imputation is made (except in 2.2.2.2 and 2.2.2.4 where the table was not applicable).
2.2.3. Block or column indicator (B379) = 2; Year (B11) = 53-58.
2.2.3.1. Let AGI(B153) = T.I(B378) + $450; if T.I(B378) > $4550.
2.2.3.2. Let AGI(B153) = T.I(B378) x 1.1; if $0 < T.I(B378) < $4550.
2.2.3.3. Let AGI(B153) = T.I(B378); if T.I(B378) < $0.
2.2.3.4. Let NTI, standard deduction basis (B171) = T.I(B378).
2.2.3.5. Let (B442) = 3.
NOTE: Logic is the same as 2.2.1.; however, taxpayer did have the option under 2.2.2.
3. List out ID number and year for all records not imputed under section 2.2.
3.1. The second stage of the imputation procedure involves those cases, from the above listing, where T.I(B378) = Blank, but either AGI(B153) or NTI, standard deduction basis --(but not both)-- does not equal Blank or MI.
3.1.1. If AGI(B153) = Blank or MI, and NTI, standard deduction basis (B171) does not equal Blank or MI:
3.1.1.1. Let AGI(B153) = NTI(B171) + $450; if NTI(B171) > $4550.
3.1.1.2. Let AGI(B153) = NTI(B171) x 1.1; if $0 < NTI(B171) < $4550.
3.1.1.3. Let AGI(B153) = NTI(B171); if NTI(B171) < $0.
3.1.1.4. Let (B442) = 4.
3.1.2. NTI (both B171 and B306) = Blank or MI and AGI amount present.
3.1.2.1. NTI, standard deduction basis (B171) = AGI(B153) - $450; if AGI(B153) > $5000.
3.1.2.2. NTI, standard deduction basis (B171) = AGI(B153) x .9; if $0 < AGI(B153) < $5000.
3.1.2.3. NTI, standard deduction basis (B171) = AGI(B153); if AGI(B153) < $0.
3.1.2.4. Let (B442) = 5.
3.1.3.
If all AGI, NTI(B153, B171, B306) = Blank, and T.I(B378) = Blank, let (B442) = 6.
http://www.ssc.wisc.edu/wais/WAIS667048.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667048.txt
Richard Bauman. 1967. Development of WAIS - Social Security Benefit Payments Data File. June 16, 1967. WAIS paper 667-047. Benefit File.
Richard A. Bauman WAIS 667-047 June 16, 1967
OUTLINE: DEVELOPMENT OF WAIS--SOCIAL SECURITY BENEFIT PAYMENTS DATA FILE
I. Introduction
 A. Goals of benefit data collection
  1. A major source of non-taxable income data
  2. Goals of cooperative research project
   a. Study timing of retirement and
   b. Resulting changes in sources and levels of incomes
 B. Goals of benefit data processing
  1. Preserve most of the detail of recorded data in conversion to machine readable form
  2. Allow for analyses on individual (recipient) basis as well as on account (contributor) basis
  3. Provide record which is compatible with most other WAIS data files, i.e. a yearly total amount of benefits received by persons in WAIS sample
 C. Size and coverage of Benefit Data File
  1. Number of accounts and years covered by sample: Out of a total of 3217 accounts identified by the SSA as being in claims status, data have been received for 3084. All benefits received from these accounts between Jan. 1, 1946 and Dec. 31, 1964 are included. Some data for earlier and later periods are available but are incomplete.
  2. Preliminary creation of Benefit-Year Record version of Benefit Data File (for about 2400 of the 3084 accounts). Complete file will contain data:
   a. for about 4625 persons (compare to about 19,500 persons in '46-'60 Master Tax File)
   b. covering about 31,000 year records (compare to about 134,000 year records in Master Tax File)
   c. with an average per person record length of about 6.7 years (compare to 6.9 years in the Master Tax File)
   Average length of merged Tax-Benefit will probably exceed 10 years.
  3. Miscellaneous - Further comparison of Master Tax, Benefit Data Files
   a. Husband-wife units can easily be formed and studied
   b. Suspension of benefits because of work allows some validation and/or imputation of earnings amounts
   c. Differences in samples, e.g.
    i. Remarriage to husband not in name-groups - woman leaves Master Tax File sample but remains in Benefit sample
    ii. Individual (unit) leaves state--usually leaves Master but does not leave Benefit sample
II. Collection of Source Documents
 A. Sources of Benefit Data Sample
  1. Original Form 805 (Earnings Records) matching project completed Mar., '64. 2727 of 14,457 matched 805 records were identified as claims cases by the SSA.
  2. Supplemental 805 matching project completed Sept., '65. 479 additional identified claims cases out of 2315 additional 805 matches were found.
  3. Negotiations with SSA in late '64 and early '65 resulted in cooperative research project in which the SSA agreed to gather data from its Master Claims File and Payment Centers for the original 2727 accounts in claims status. During the actual collection of data for this original group of accounts, SSA also agreed to supply benefit data for 490 additional accounts (479 from the supplemental 805, and a net addition of 11 accounts which were originally in "residual" matched categories (e.g. multiple accounts or duplicate accounts in claims status)).
 B. Receipt of Benefit Data
  1. Format of Source Documents
   a. Most of the data covering benefits received after Jan., 1962 is contained on the SSA's Master Claims Tape File. Printouts of the tape records were sent to WAIS for all persons who were in our sample (via "identified claims cases").
   b. All other data (covering i.) benefits received before Jan. 1, 1962 and ii.) all other information for Jan. 1962 and later not found on the Master Tape) was collected from the various SSA payments centers around the U.S. on a 2 page form (SSA Form #9249) prepared for the WAIS study.
Information contained on both forms is substantially the same although there are several notable exceptions, e.g. no indication of retroactive payments is made on the Master Tape printout, therefore treatment of benefits received on a "cash" basis is impossible for data received in this form.
  2. Timing of Receipts
   Approximately 20 separate mailings of benefit data were made over a 17 month period, beginning in June 1965 and ending in October 1966. This long collection period was due in part to the process of gathering the data. Master Tape printouts were made in Baltimore; forms and printouts were sent to the Payment Centers for addition of pre-1962 data and annotation; the Payment Centers returned completed forms to Baltimore and then the documents were mailed to WAIS. The other reason for this long period was that the request for supplemental claims data only partially meshed with the original collection process. During the 17 months data for 3084 SS accounts were received. No explanation for the missing data for the remaining 133 accounts has been given.
III. Processing the Benefit Payments Data
 A. Data Flow (Simplified Flowchart -- p. 8; WAIS 667-001)
  1. Design of formats for coding and keypunching
   a. compatibility of printout, 9249 data
   b. flexibility of number of beneficiaries per account and number of entries for payment history required fairly complex "tree" format.
  2. Logging of source documents -- positive control on collection of data
  3. Assignment of WAIS ID numbers to beneficiaries
   a. Matches--updates address on existing ID record
   b. Non-matches in tax sample--create new ID's
   c. Non-matches out of tax sample--new ID's
   d. Concurrent correction of identification in related files
  4. Coding, pre-coding, and keypunching of data
   a. Timing caused different tactical considerations. A large number of "normal" documents could be keypunched directly
   b. Scattered receipts of dissimilar documents more easily coded by expert and then keypunched
  5.
Card-to-tape (Card Image Records)
  6. Single card edits. Utility programs developed for this and later stages:
   a. File Maintenance--add, delete cards
   b. Edit program
   c. Pre-edit program
   d. Selective print and punch program
   e. Whole record update program.
  7. Intra-file checks among the various cards in the file
  8. Interfile edits--Logging card file, ID File, Card Image Benefit Records File--to update ID's, utilize control of logging procedure
  9. Create summary (analysis) file--Benefit Year Records. Programs:
   a. Original years program
   b. Intermediate (rearrange records in ID order)
   c. In-between records creation program
   d. Expand BYR format to include DOB, DOD.
 B. Handling difficult cases.
  1. Simultaneous benefits for individuals--Approximately 125 persons in sample receiving benefits from two accounts at same time. We require total in BYR.
  2. "Non-sample" Social Security accounts--e.g. non-filing wife of taxpayer receiving benefits from her own account. Data for these were annotated on some source documents received. (About 65 cases)
  3. Duplications and corrections (source documents received on different dates).
  4. Capability of updating BYR and C.I. records independently.
 C. Effects of time delays on Processing
  1. Piecemeal development of system with incomplete source file.
  2. Difficult logical system--e.g. woman changes beneficiary status from wife to widow.
  3. ID correction.
  4. Changes in WAIS and SSRI staff.
  5. Non-edited updates to C.I. Tape.
http://www.ssc.wisc.edu/wais/WAIS667047.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667047.txt
Barbara Aldrich. 1967. The Coding of the 1960-64 Master File. June 30, 1967. WAIS paper 667-050. Master File - Tax Records.
Barbara Aldrich WAIS 667-050 June 30, 1967
THE CODING OF THE 1960-64 MASTER FILE
The coding of the 1960-64 tax data was done from December, 1966, to May, 1967.
The purpose of this paper is to give a brief summary of the operation as well as some suggestions which may apply to a future operation of a similar type.
Coder Training
The initial step in the training of each coder was the test coding of a sample of thirty folders. These particular test folders were selected because of their wide variety of coding situations. When the clerk had coded a few folders, he and the supervisor studied the completed test folders, looking for possible misunderstandings and error patterns. After completion of his test folders, the coder was assigned a name group. These folders were also checked by the supervisor until she felt the coder was experienced enough to work without error checking. After this point, occasional spot checks were made on the coder's work. Several devices set up in the codebook were a great help to the coders; among these were the Form S, the consistency indicator, and the reliability indicator.
Form S
The Form S was particularly helpful in instances where the occupation and/or industry was not clearly defined or did not clearly fall into any of our code categories. By using the Form S, we were able to put occupations into categories that were more indicative of the type of work done than would be a 99 (unknown), 9700 (unclassifiable industry), or a 9900 (unknown industry). Examples of this are telephone operators and expediters, both of whom were coded as other clerical workers with a Form S.
Inconsistency Indicators
The inconsistency indicators were frequently used when the answers given by the taxpayer were not consistent with what the return indicated.
Reliability Indicators
The reliability indicator was a great aid to a coder who was not absolutely certain that a particular occupation or industry code was reliable. This was especially useful when the distinction between skilled and unskilled labor was not clear.
An improvement, however, would be to locate this code in the field following occupation and industry codes, rather than at the end. Social Security Numbers A year-to-year change in the social security number that the taxpayer listed was one of the problems which the coder frequently encountered. In earlier returns, the tax department assigned a four digit number to identify a taxpayer with no social security number; in later years they assigned a nine digit number beginning with a nine. This number was frequently assigned to the taxpayer for more than one year. When there was one "legitimate" social security number and a nine digit number assigned by the tax department, both numbers were recorded on the code sheet, along with a multiple social security indicator. The four digit number assigned by the tax department was never recorded, since it was never the same in succeeding years. County Codes and Tax Districts The returns of non-residents require an explanation of the inconsistent county codes. The taxpayer was given the out-of-state code (98) in the identification portion of the code sheet. However, when the tax department codes for county and tax district were indicated, a Wisconsin county and tax district were shown in nearly all cases. Thus, a non-resident's code sheet will reflect his non-resident status as well as the Wisconsin district to which his taxes are apportioned. When the tax department failed to show a district, the coder used AAAA (the tax department code for no district), or, in later returns, ZZZZ (the WAIS code for no district.) Another problem occurred when the tax department assignment of tax districts varied from year to year, although the taxpayer had not changed his address. Since the coder had no way of knowing which district was correct, he always coded whichever district the tax department had indicated, but recorded a change of address only when the return clearly indicated it. 
This problem occurred almost exclusively in the Milwaukee and suburban districts.
Occupation and Industry Codes
The categories which caused the greatest problems were the occupation and industry codes. Several references were used in coding these. The Dictionary of Occupational Titles was an aid, but was not very helpful in drawing the dividing line between a skilled and an unskilled laborer. An invaluable guide in determining industry codes was the Classified Directory of Wisconsin Manufacturers, published by the Wisconsin Manufacturers Association. The list of specific Wisconsin corporations in Appendix D of the codebook was not very complete since only those corporations which had main offices in Wisconsin were listed. This eliminated such large employers as General Motors, Allen Bradley, and Wisconsin Telephone Company. The College Placement Annual was also helpful in coding industry codes of nationwide industries. Another coding rule which (if not explained) would appear to show coding inconsistencies concerns some of the more general occupation codes. In some cases, an actual occupation change was made (e.g. waitress to beautician) but both occupations fell within the same code (in this case, personal service occupations). Here, a change of occupation was recorded, even though the occupation code would not reflect it.
Age Group and Dependents
The age category and number of dependents developed into a problem when no dependents were actually listed and the amount deducted made it impossible to know. Here the code 99 was used for the number of dependents and 9999 was used for the dependents' age code. (An example of this predicament is a single year return on which no dependents are listed and a $30 deduction is claimed. Here the coder had no way of knowing if the taxpayer(s) were husband and wife over 65, a couple with one dependent, or a single person with a dependent who was claiming an additional $10 as head of family).
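For anyone reading the coded file, the practical consequence is that 99 and 9999 are placeholders for "unknown", not counts or ages. A minimal sketch of how a reader of the data might treat them, assuming the two values arrive as integers; the function name `decode_dependents` and the use of `None` for "unknown" are illustrative choices, not part of the WAIS codebook:

```python
# Sentinel values described in the text: 99 = number of dependents unknown,
# 9999 = dependents' age code unknown.
UNKNOWN_DEPENDENTS = 99
UNKNOWN_AGE_CODE = 9999


def decode_dependents(num_dependents, age_code):
    """Return (dependents, age_code), with None wherever the coder
    recorded the 'unknown' sentinel instead of a real value."""
    dependents = None if num_dependents == UNKNOWN_DEPENDENTS else num_dependents
    ages = None if age_code == UNKNOWN_AGE_CODE else age_code
    return dependents, ages


known = decode_dependents(2, 1100)      # hypothetical coded values
unknown = decode_dependents(99, 9999)   # the undeterminable case
```

Keeping the sentinels out of any arithmetic (for example, averages of dependents per return) avoids counting 99 as a real number of dependents.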
Suggestions
There are several steps which might be taken to encourage more accuracy and speed in a future coding operation. One of these would be to write the code for a particular concern next to its listing in the Classified Directory of Wisconsin Manufacturers. Then, in future cases, it could be assured that the same code was always used for a specific company. (Diversification in manufacturing often presented problems in selecting an industry code.) Another aid would be to assign a standard industry code to the concerns listed in Appendix B of this paper. In summary, the coding operation was made more reliable by thorough training of coders, built-in coding checks and good reference material. But, as with anything, the knowledge we have gained from this operation would enable us to do a better job another time. This paper is an attempt to impart that knowledge.
APPENDIX A
Clerical Costs (Average Costs/Folder)
 Filing $ .25
 Coding $ .40
 Coder Training $ .07
 Total Clerical Costs/Folder $ .72
Key Punching Costs
 Punching and Verifying $ 2.00
APPENDIX B
List of Companies which should be included in Standard Industry Code Listing:
 A.C. Spark Plug
 Allen Bradley
 Beloit Corporation
 General Motors
 Johnson's Wax
 Kohler Company
 Ladish Company
 Regal Ware
 West Bend Company
 Wisconsin Telephone Company
http://www.ssc.wisc.edu/wais/WAIS667050.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667050.txt
Martin David. 1967. An Inventory of Some Major Data Processing Activities. June 15, 1967. WAIS paper 667-045. Administration Analysis Data Processing General Papers (Regarding WAIS).
Martin H. David WAIS 667-045 June 15, 1967
AN INVENTORY OF SOME MAJOR DATA PROCESSING ACTIVITIES
Preliminary: Comments, additions, and corrections invited.
CONTENTS
Activity   General Description                     Most Knowledgeable Persons
BN         Processing for benefit file analysis    David Bhargava ** Bauman* Duddleston (Alley)
LA         Longitudinal analysis                   David-Miller v. Schneidemesser Bauman (Gillus)
MF         Master File Update                      deVries-Aldrich Bhargava (Gates)
PF         Property file editing                   Miller Bussman
HSF        History and Selection File              deVries Bauman Bhargava Mansfield
* Early processing phases ** Coordinator
Organization, Notation, etc.
This paper is intended to bring into focus some of the major activities of WAIS staff and the implied programming needs of the project in the immediate future. This paper is a coordination and planning effort along the lines suggested in 667-027. Members of the project will have to fill in details on some of these operations where my knowledge is lacking. Furthermore each activity ends in a series of general and poorly defined steps to indicate possible consequences of present file handling for the future and possible generalizations of present work for the future. Each activity is divided into steps that specify an input and an output together with the operation that transforms input to output. Operations enclosed in boxes are operable; those without boxes must be designed, programmed, and tested. The sub-steps in an activity are not necessarily listed in order of priority, so that a flow chart of operations has been constructed indicating the interdependencies of sub-steps and activities. The flow chart follows the listing of sub-steps. Files are given an acronym with two properties: The acronym is unique to the degree of updating and the format of the file. The degree of updating is indicated by the number in parentheses. Files derived from a file with a given degree of updating should be given the corresponding number, suggested as an activity number in deVries WAIS 667-044. In addition each file should carry logical record counts and programming adequate to determine that records have not been dropped, added, or duplicated.
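The control just described (showing that an operation has not dropped, added, or duplicated records) can be illustrated with a short sketch. This is an illustration in present-day Python, not a description of any WAIS program; the function name `compare_record_keys` and the use of (ID, year) pairs as record keys are assumptions made for the example.

```python
from collections import Counter


def compare_record_keys(before, after):
    """Compare the record keys of a file before and after an operation.

    Returns three sorted lists of keys:
      dropped    - keys that occur fewer times after than before
      added      - keys that did not occur before at all
      duplicated - keys that occurred before and occur more often after
    """
    before_counts = Counter(before)
    after_counts = Counter(after)
    dropped = sorted(k for k, n in before_counts.items() if after_counts[k] < n)
    added = sorted(k for k in after_counts if k not in before_counts)
    duplicated = sorted(
        k for k, n in before_counts.items() if after_counts[k] > n
    )
    return dropped, added, duplicated


# Hypothetical keys: (WAIS ID, year filed)
before = [("00010101", 46), ("00010101", 47), ("00020101", 46)]
after = [("00010101", 46), ("00010101", 46), ("00030101", 50)]
dropped, added, duplicated = compare_record_keys(before, after)
```

A plain comparison of total record counts would pass in this example (three records in, three out), which is why the sketch compares counts key by key.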
Acronyms of closely related files may be indicated by acronyms identical except for a non-parenthetical number, preferably not 1 or 0 because of spelling problems in distinguishing one from 1 and zero from 0. Each operation must be spelled out in a job plan or WAIS paper to make clear the substance of the operations, while the flow charts indicate an aggregative system in which those operations must be executed.
Job Plan for Extraction and Integration of Beneficiary and Income Data
Abbreviation   Description (Record Count column blank)
BN             Beneficiary card image tape
BNAN           Merged Benefit analysis file
BNYR           Annual Benefit payment record
ID             WAIS Identification File (excluding multiple ID's)
805            Wage earnings record tape (format in WAIS paper 645-063)
WAIS ID        Wisconsin Assets and Income Studies - Identification number (excluding multiple ID's)
SSA ID         Social Security account number
MF             Tax Record File
SAD            Supplementary age and death data file
*Any of these files containing multiple ID's will be postscripted (M).
BN - Benefit Analysis
M indicates that the file is in current multiple ID status
1 indicates that the file has been updated with corrections available 6/67 and removal of multiple ID's.
Input Operation Output (Documentation Reference)
1. ID(M), BN(M) IDBENE (See Job #6701) SSA, WAIS ID error listing 2
2. Add, drop cards (WAIS 667-025) UPDATEAL alternatively LONIELLO BN(1) corrected card listing 6
3. ID (M) UPDATE FFID ID(1) 46 ID change cards MF(M) MA-UPDATE (WAIS 667-024) MF(1) 8
4. ID(1) SORT SSA SEQUENCE (SSRI-UWCC) ID(1) SSA 6
5. ID(1) SSA 805 MERGE 805-ID (WAIS 656-019) 805(1) 7
6. BN(1) ID(1) IDBENE (Job Plan 67021) RID(1) BN(1) 7 9 Stop is error checks
7. RID(1) 805(1) CREATE E1(805) R2ID R2ID(1) E1(805)
8. R2ID(1) MF(1) CREATE E2 (MF) E2(MF)
9. BN(1) Loniello BNYR (WAIS 667-001) BNYR(1)
10. BNYR(1) UPDATEAL BNYR (WAIS 667-025) BNYR(2)
11. E2(MF) BNYR(2) MERGE + SELECT E2(MF)-E3(BNYR) E6(MF-BNYR)
12. E1(805) E6(MF-BNYR) MERGE BNAN(1)
13.
BNAN(1) CROSS-SECTION ANALYSIS XTAB TAB A1 TAB A2 REGRESSION REGRESSION B1
14. BNAN To Step M2
[Flow chart of steps 1-14, flattened in transcription: Level BN(M) ID(M) 805 SAD MF(M) A B C D E F G H 1 BN(2) BN(1) 9 BNYR(1) 10 BNYR(2) ID(M) 3 ID(1) ID(1)SSA 6 RID(1) 7 R2ID(1) 805 805(1) E1(805) SAD 1 Add Age for 70's, WAIS ID# to ( ) MF(M) MF(1) 8 E2 (MF) 11 E6(MF-BNYR) 12 BNAN(1) 13 14(To M2) [TAB A1, TAB A2, REGRESSION B1]]
Longitudinal Analysis-of-Income
L1 Test file TF(1) DAVID RGR TF(2) (BINARY)
L2 TF(2) XTAB Tabulations B (specified WAIS 667-031 & supplement)
L3 MF(1), 805(1) .1. EXT01 .2. PREDAVID .3. MDAVID .4. DAVID .5. XTAB (Detailed in Plan of Operations on following pages) Tabulations C (not specified) MILLER-67 (1)
L4 BNCN(1) .1 Longitude EXT BEXT 03(1) .2 Same as L3.2-L3.5 BEXT 03(1) Tabulations D (not specified)
L5 MUF(1) Same as L3 Tabulations E (not specified)
L6 MF(1), 805(1) .1 Modify pre-david to select income components .2 Same as L3.3-3.5 Tabulations F (not specified)
PLAN OF OPERATIONS FOR GENERATING THE LONGITUDINAL-ANALYSIS-OF-INCOME FILE
[Flow chart, flattened in transcription; the numbers refer to the notes below: 805(1) 4 MF(1) 1 ( ) EXT-01 LIST OF FORM-TYPE 6, NEG. AGE, NO NTI 442 Char. MA-F 1 EXT-01 6 PREDAVID RECORDS REJECTED WITHOUT AGI 10 PRE-EXT-03 8 FORM TYPE 6, AGI NTI IMPUTED 9 M DAVID MVS EXT-03 11 DAVID 12 RRG MILLER-67 13 TABULATORS (XTAB) TABLES, GRAPHS, etc.]
Notes explaining Plan of Operations
1) Format in WAIS 645-056.
2) Includes cards to correct Coded Data, to make ID changes to eliminate both valid and invalid Multiple ID's (see WAIS 667-023), to correct various fields on selected records, to add records not previously on Master File, etc.
3) See WAIS 667-003.
4) Format in WAIS 645-063.
5) See WAIS 656-012.
6) Format in WAIS 645-057.
7) Listing of ID#, YR with appropriate message.
8) Includes all records for male individuals if: (a) 47 < year filed < 59. (b) At least four pairs of consecutive years are available for a person. (c) All records put out have a nonzero AGI amount - either given by taxpayer or imputed.
(See R. Bauman's description) Blocked 3 x 442.
9) Tape containing all records for which AGI and/or NTI was blank in the 442 Character Master, and for which an AGI and NTI amount could be imputed from taxable income incomplete form (B378). This tape may be merged into the 442 Character Master to replace the corresponding records without AGI and NTI if desired. Blocked 5 x 442.
10) Records for which no AGI (or NTI) is available and also could not be imputed are listed by ID and Year with message "No AGI for."
11) Format described in WAIS 667-031 and this paper (Table 3 3-1).
12) Appends parameters of model A, B, and C to EXT-03 and puts out the whole file in binary form.
13) Binary tape, unblocked. Format in WAIS 667-031.
Master File Update
MF1 MFU(0) EDITING UPDATE (WAIS 667-035, 667-040, 667-042) MFU(1)
MF2 MF(1) GENERATE MISSING RECORDS (WAIS 667-029, 667-032) List of missing records L1
MF3 L1 FILE & CODE (WAIS 667-002) MFUU(0)
MF4 MFUU(0) To MF1
Property File Analyses
P1 PF(M) CARD EDIT SSRI error correction listings error indication PEF (tape or cards)
P2 PF(M) PEF ID CHANGE CARDS CORRECTION LISTING file listing merged card image file ID PF card 1 PEF card 1 PF card 2 PEF card 2 etc.
P3 1960-64 Tax Returns (Temporarily suspended pending further financing) P.3.1 ENCODING DATA (WAIS 667-035) 3.2 Keypunching 3.3 Modify & repeat P.1 partial listing as in P.2
P4 Corrected file listing from P2, P3 B5500 ERROR CORRECTION FOR PROPERTY FILE TEXT EDITOR Keypunch PF(1)
Marital Unit Analyses
M1 MF(1) MARITAL SUMMARY MUF(1) L6
M2 BNAN MARITAL SUMMARY BNMU(1) ?
http://www.ssc.wisc.edu/wais/WAIS667045.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667045.txt
Martin David. 1967. Marital Unit and Family Integration on WAIS Files, with an Application to EXT 01 Binary Version (BNAN(1)). June 22, 1967. WAIS paper 667-051. Extract 01 Marital Unit Master File - Tax Records.
Martin David WAIS 667-051 July 31, 1967 1st Revision & 2nd Revision
Marital Unit and Family Integration on WAIS Files, with an Application to EXT 01 BINARY VERSION BNAN (1)
1. General
The purpose of this plan is to produce a subroutine that can be easily used with either the Master File or various extracts. The first planned implementation is with the BINARY EXT 01 [11/66] SSRI TAPES 600, 577, 597. The subroutine accepts a matrix of data for several individuals and produces corresponding aggregates for the marital unit and family.
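As a rough illustration of what such a subroutine does, the sketch below sums one individual variable over all individuals of a household in a given year and then copies the total back onto each individual record. These correspond to the SUM and DISASSOCIATE operations defined in section 4 of this paper, and the household is identified by the first six digits of the WAIS ID, as the paper states. Everything else is an assumption made for the example: the use of Python, the record layout as dictionaries, and the field names `wais_id`, `year`, `agi` and `family_agi`. Aggregation to the marital unit is not shown, because the index j must first be derived as described in section 3.

```python
from collections import defaultdict


def household_key(wais_id):
    """Household number: the first six digits of the 8-digit WAIS ID."""
    return wais_id[:6]


def sum_over_individuals(records, variable):
    """SUM: add one individual variable over all individuals in the
    same household and year."""
    totals = defaultdict(int)
    for rec in records:
        totals[(household_key(rec["wais_id"]), rec["year"])] += rec[variable]
    return dict(totals)


def disassociate(records, totals, new_name):
    """DISASSOCIATE: copy each household-year total onto every
    individual record of that household and year."""
    out = []
    for rec in records:
        key = (household_key(rec["wais_id"]), rec["year"])
        out.append({**rec, new_name: totals[key]})
    return out


# Hypothetical individual-year records
records = [
    {"wais_id": "12345601", "year": 60, "agi": 4000},
    {"wais_id": "12345602", "year": 60, "agi": 1500},
    {"wais_id": "65432101", "year": 60, "agi": 3000},
]
family_agi = sum_over_individuals(records, "agi")
with_family = disassociate(records, family_agi, "family_agi")
```

Keying the totals on (household, year) mirrors the paper's distinction between household constants and household time variables: the same household has a different total in each year.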
The data have the following logical structure:
Type                              Notation   Explanation
1. Exogenous constants            A          Fixed over all units and years
2. Exogenous time variables       B(t)       Fixed over all units, variable in time
3. Household constants            Hi         Fixed for a household all years
4. Marital unit constants         Mj         Fixed for a marital unit all years
5. Individual unit constants      Ik         Fixed for an individual all years
6. Household time variables       Hi(t)      Household variable, can be attributed to household members
7. Marital unit time variables    Mj(t)      Marital unit variables, can be attributed to individuals, households
8. Individual variables           Ik(t)      Individual variables, may be attributed to marital units or households
The data are nested in the following way:
Household i = 1 2 N
Marital Unit j = 1 2 m1 1 2 m2 1 2 mN
Individual k = 1 2 n11 1 2 n21 1 2 nm1 1 2 n12 1 2 n22 1 2 nm2 1 2 n1N 1 2 n2N 1 2 nmN
For convenience we will generally drop subscripts. We characterize the data on BINARY EXT 01 as an example:
Format EXT 01*   Variable                          Type
V1               WAIS ID (8-digits)                I**
V2               Sex                               I
V3               Dependent #                       Ik
V4               Year                              B(t)
V5               Marital Status                    I(t)
V6               Race                              I
V7               # of Dependents                   I
V8               Occupation                        I(t)
V9               Return reason                     I(t)
V10              County                            H(t)
V11              City Designation                  H(t)
V12              County prior year                 I(t)
V13              AGI                               I(t)
V14              NTI                               I(t)
V15              W & S                             I(t)
V16              Dividends                         I(t)
V17              Gain or loss on sale of assets    I(t)
V18              Self-employment income            I(t)
V19              Interest                          I(t)
V20              Rent                              I(t)
V21              Trust income                      I(t)
V22              Marriage details                  I
V23              Year of Birth                     I
V24 does not equal Record Mark
*SSRI TAPES 600, 577, 597
**The first 6 digits are household constants (Hi). The seventh and eighth digits make the ID# an individual constant.
2. Our problem is to aggregate variables of the class I(t) meaningfully to variables of the class M(t) or H(t). Variables that are already of a type F(t) need not be aggregated. Variables of the type I(t) may be qualitative or quantitative.
Logical combinations lead to aggregates in the former case, while arithmetic combinations lead to aggregates in the latter case.

3. The subscript i is represented by the Household number, or 1st 6 digits of the WAIS ID (1st 6 digits of V1 on EXT 01 BINARY VERSION). The subscripts j and k are not directly available. They can be calculated according to the chart below.

ASSIGNMENTS OF j & k
Column headings: V1 DIGIT 8; V1 DIGIT 7; NUMBER OF RECORDS IN YEAR t PER HOUSEHOLD; MARITAL STATUS, V5; ASSIGN j =; ASSIGN k =; Case (A, B, C, D, E, F, G)
[Chart body not recoverable from this copy. Surviving cell text: "0 on this record, 0 for other record"; "1-3 on either record"; "0 on this record"; "1-3 this record, 0 for other"; "female"; "Other".]

*  Assign 2, 3, etc. in order of digit 7 of ID#. Print out all records.
** Assign in order of the index (Digit 8 of ID#) + (V2), beginning with 3, so that dependents can be distinguished from case B and D.

For instance, inspection of digits 7 and 8 of the WAIS ID, the number of records for this household in this year, and the Marital Status Codes permits unique assignment of j = 1 and k = 1 as the head of the marital unit.

4. The data for I(t) can now be converted to data for M(t) or H(t) by the following operations:

4.1. ASSOCIATE Ik*(t) TO M*(t) (ASSOCIATE Ik*(t) TO H*(t), etc.)
Designate a value k and rename the variable as a household variable.
Example: Sex (k = 1) to Sex of marital unit head; EXT 01 (V2) to Output (V3*).
Required information: input variable #; output variable #; k = k* (or j = j*, etc.)

4.2. SUM I(t) OVER k TO GET M(t) (SUM I(t) OVER k, j TO GET H(t))
$M(t) = \sum_{k=1}^{n_m} I_k(t)$
Required information: input variable #; output variable #; summing indices (j; k; or both)

4.3. DISASSOCIATE M(t) TO I(t) (DISASSOCIATE H(t) TO I(t))
This is the inverse of the associate operation. The variable M(t) is assigned to all sub-units. Unlike the Associate operation, a particular sub-unit is not designated.
Example: Family income, H(t), to Family income of this individual's family, I(t).
As a result, for a given household a constant H(t) appears on all individual records.

4.4.
LOGICAL SUM I(t) OVER k
Apply Boolean logic to binary categories.
Example:
   Race I1: 0 White, 1 Non-white
   Race I2: 0 White, 1 Non-white
This method works in general, as follows: $M(t) = I_1 \cup I_2$

7. A summary file will now be specified for BINARY EXT 01 on P.3.

Format EXT SUMMARY
Variable                          Characters  Type    Operation        Source (EXT 01)  Index
V1*   WAIS Household #            8           Hi      ASSOCIATE        V1               j = 1, k = 1
V2*   Marital Unit Index          1           Mj(t)   ASSOCIATE        1/               j (step 3), k = 1
V3*   Sex of Head                 1           Mj(t)   ASSOCIATE        V2               k = 1
V4*   Year                        2           B(t)    ASSOCIATE        V4               k = 1
V5*   Marital Status of Head      1           Mj(t)   LOGICAL SUM 2/   V5               k = 1
V6*   Race                        1           Mj(t)   LOGICAL SUM 3/                    k
V7*   Dependents                  1           Mj(t)   SUM OVER         V3
V8*   Occupation of Head          2           Mj(t)   ASSOCIATE        V8               k = 1
V9*   Occupation of Spouse        2           Mj(t)   ASSOCIATE        V8               k = 2
V10*  Return reason head          1           M(t)    ASSOCIATE        V9               k = 1
V11*  Return reason spouse        1           M(t)    ASSOCIATE        V9               k = 2
V12*  AGI Marital Unit            8           M(t)    SUM OVER         V13              k
z     AGI Family Unit             8           F(t)    SUM OVER         V13              j, k
V13*  AGI Family Unit             8           M(t)    DISASSOCIATE     z                j
V14*  NTI                         8           M(t)    SUM OVER         V14              k
V15*  W and S                     8           M(t)    SUM OVER         V15              k
V16*  W and S of Head             8           M(t)    ASSOCIATE        V15              k = 1
V17*  Total Dividends             8           M(t)    SUM OVER         V16              k
V18*  Gain or loss of assets      8           M(t)    SUM OVER         V17              k
V19*  Self-employment income      8           M(t)    SUM OVER         V18              k
V20*  Total interest              8           M(t)    SUM OVER         V19              k
V21*  Total Rent                  8           M(t)    SUM OVER         V20              k
V22*  Trust income                8           M(t)    SUM OVER         V21              k
V23*  Birth year head             4           M(t)    ASSOCIATE        V23              k = 1
V24*  Birth year of spouse        4           M(t)    ASSOCIATE        V23              k = 2
V25*  Record Mark                 1

1/ The index is derived in step 3, not directly available on input tape.
2/ Recode V5: 0 Single, 1 Other.
3/ Recode V6: 0 White, 1 Non-white.
4/ If SUM I(t) satisfies Mj(t) = 2 I1(t), ASSOCIATE V3, k = 1.

Note: For V9, V11, V24, code high-order 9...9 if no spouse is present. E.g. Occupation 999; Return reason 99; Birth Year of Spouse 9999.

8. Operational considerations for BINARY EXT 01.

8.1. Input BINARY EXT(01) [11/66] SSRI TAPES 600, 577, 597

8.2.
Read all records with this household number (1st 6 digits of WAIS ID (V1, EXT(01))). Sort records into the sequence defined by Year (V4), Dependent # (digit 8 of V1), and Sex (V2) (males before females). The records will then be arrayed within each year as follows:

00 - record for male family head
01 - record for female family head, or 1st spouse, or single female
02 - record for 2nd spouse
...
10 - 1st male dependent
11 - 1st female dependent
20 - 2nd male dependent
21 - 2nd female dependent
Etc.

(It is highly unlikely that more than 10 numbers will appear in this list for any household #.)

8.3. Assign j and k.

8.4. The assignment of k implies that a wife is not unique over time. The head of a unit may change if the male dies and is survived by his wife.

8.5. Write a tape in the format EXTRACT SUMMARY. The number of output records for each marital unit should equal the number of input records, less the number of records assigned k > 1. Give a count of:
( ) # of Households
( ) # of Marital Units
( ) # of input individuals
( ) # of marital unit year records
( ) # of input records

9. Reformat the EXTRACT SUMMARY as follows:

Format EXTRACT LONGITUDINAL SUMMARY
V1      WAIS Household #
V2      Marital Unit Index
V3      1947 Data Availability. Code: 0 No data either spouse 1/; 1 Data available this year
V4-15   Data availability, years 1948-1959
V16     Date of last available year (including 1960)
V17     Number of years available (1947-1959)
V18     Race (first year reported) (V6*)
V19     Birth year (V23*) - historically 1st head 2/
V20     Birth year (V24*) - historically 1st spouse 3/
V21     2nd spouse (0 - No; 1 - Yes)
V22     Birth year - historically 2nd spouse
V23     Historically last head - birth year 4/
V24     Male termination date 5/

The remaining variables can be assigned as follows:

From            1947  1948  1949  1950  1951  1952  1953  1954  1955  1956  1957  1958  1959
V3*             25    37    49    61    73    85    97    109   121   133   145   157   169
V5*             26
V7*             27
V8*             28
V9*             29
V10*            30
V11*            31
V12*            32
V15*            33
V16*            34
V19*            35
PROPERTY INC.   36    48    60    72    84    96    108   120   132   144   156   168   180
Record Mark     181

9.1. Property Income = (V17* + V18* + V20* + V21* + V22*)

9.2. The number of output records should equal the number of marital units.

1/ Assign high-order 9...9 to all vectors for this year so that the variable contains a 9 in the decade larger than possible data input.
2/ The variable is defined as V23* on the first year record for this marital unit.
3/ The variable is defined as the first non-zero value of V24.
4/ If V23*(0) ≠ V23*(t), define a historically last head.
5/ Define male termination date as t such that V23*(0) ≠ V23*(t') for any t' > t.

Marital Unit and Family Integration on Benefit Data

This part of the paper deals with the application of the subroutine to Benefit data. The input file we use is BNAN(1) -- format in WAIS 678-011. The data have the same logical structure as outlined on page 1 and are nested as on page 2. The variables j and k are also assigned as shown on page 4, and the records will be sorted into sequence as outlined on page 9 (8.2.). Note for v marital status is not available. Assume couple is married if two records exist and sum, i.e. disregard the marital status test if there are exactly two records per household; if there are more than two records, treat the 1st two records as married and print out all data involved.
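The pairing rule for Benefit data can be sketched in Python as follows. This is only an illustration under stated assumptions: the record type and its field names are hypothetical, the sex coding (0 = male, 1 = female) is inferred from the "males before females" ordering in 8.2, and the function expects the records of a single household.

```python
from typing import List, NamedTuple, Optional, Tuple


class BenefitRecord(NamedTuple):
    household: str     # first 6 digits of the WAIS ID
    dependent_no: int  # dependent number used in the 8.2 sort (hypothetical field name)
    sex: int           # assumed coding: 0 = male, 1 = female


def married_pair(
    records: List[BenefitRecord],
) -> Tuple[Optional[Tuple[BenefitRecord, BenefitRecord]], bool]:
    """Return (pair, needs_printout) for the records of one household.

    Exactly two records: treat them as a married couple.
    More than two: treat the first two in sort order as married and
    flag the household so that all of its data can be printed.
    Fewer than two: no couple is assumed.
    """
    # Sort by dependent number, then sex (males before females).
    ordered = sorted(records, key=lambda r: (r.dependent_no, r.sex))
    if len(ordered) < 2:
        return None, False
    needs_printout = len(ordered) > 2
    return (ordered[0], ordered[1]), needs_printout
```

Returning the flag instead of printing inside the function leaves the "print out all data involved" step to the caller, which can then list every record of the flagged household.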
Format BNAN(1)
Variable                                                Type
V1         WAIS ID #                                    I
V2         Sex                                          I
V3         Year of Birth                                I
V4         Year of Death                                I
V5         Race                                         I
V6         805 Earnings, 1937 to date                   I
V7         805 Earnings, 1951 to date                   I
V8         # Covered quarters, 1947 to date             I
V9         # Covered quarters, 1951 to date             I
V10        SSA # indicator                              I
V11        Year of 1st Tax Record                       I
V12        Year of last Tax Record                      I
V13        Year of first Benefit Record                 I
V14        Year of termination of SSA Benefits          I
V15        Beneficiary History                          I
V16-V33    Data Availability *
V34-V47    AGI
V48-V61    Wages and Salaries
V62-V75    Self-Employment
V76-V89    Property Income
V90-V103   Medical-Dental Expenses
V104-V117  Total Deductions
V118-V131  Occupation
V132-V145  Marital Status
V146-V159  # of Dependents
V160-V171  805 Earnings
V172-V189  SSA Benefits
V190-V203  Asset Gain or Loss
V204-V217  Occupation (recoded)
V218       Marital Status Change
V219       OG Max I: Occupation held longest
V220       OG Max II: Occupation held second longest
V221       Proportion of time held OG Max I
V222       Proportion of time held OG Max II
V223       Indicator of labor force change
V224       Indicator of occupation change

(* Subscripts are omitted.)

Format of Extract for Benefit Analysis of Marital and Family Units

Format Variable                                         Type   Operation     Source    Position in BNAN(1)
V1*         WAIS Household #                            H      Associate     BNAN(1)   V1
V2*         Marital Unit Index                          M      Associate     BNAN(1)   Calculate
V3*         Sex of head                                 M      Associate     BNAN(1)   V2
V4*         Year of birth - first head                  M      Associate     BNAN(1)   V3
V5*         Year of birth - first spouse                M      Associate     BNAN(1)   V3
V6*         Second spouse (1 - Yes) 1/                  M      Associate     BNAN(1)   Calculate
V7*         Year of birth - second spouse               M      Associate     BNAN(1)   V3
V8*         Year of death - head                        M      Associate     BNAN(1)   V4
V9*         Year of death - spouse 2/                   M      Associate     BNAN(1)   V4
V10*        Race (first year reported)                  M(0)   Logical Sum   BNAN(1)   V5
V11*        # Covered quarters, head, 1951 to date      M      Associate     BNAN(1)   V9
V12*        # Covered quarters, spouse, 1951 to date    M      Associate     BNAN(1)   V9
V13*        Year of 1st Tax Record                      M      Associate     BNAN(1)   V11
V14*        Year of last Tax Record                     M      Associate     BNAN(1)   V12
V15*        Year of 1st Benefit Record                  M      Associate     BNAN(1)   V13
V16*-V33*   Data availability                           M(t)   Logical Sum   BNAN(1)   V16-V33
V34*-V47*   AGI - Marital Unit                          M(t)   Sum Over      BNAN(1)   V34-V47
V48*-V61*   AGI - head                                  M(t)   Associate     BNAN(1)   V34-V47
V62*-V75*   W&S + Self Emp. - M.U.                      M(t)   Sum Over      BNAN(1)   V48+V62, V49+V63, ..., V61+V75
V76*-V89*   W&S + Self Emp. - Head                      M(t)   Associate     BNAN(1)   V48+V62, V49+V63, ..., V61+V75
V90*-V103*  Medical-Dental Expenses - M.U.              M(t)   Sum Over      BNAN(1)   V90-V103
V104*-V117* Total Deductions - M.U.                     M(t)   Sum Over      BNAN(1)   V104-V117
V118*-V131* Occupation of head                          M(t)   Associate     BNAN(1)   V118-V131
V132*-V145* Occupation of Spouse                        M(t)   Associate     BNAN(1)   V118-V131
V146*-V159* Marital Status of head                      M(t)   Logical Sum   BNAN(1)   V132-V145
V160*-V173* # of dependents                             M(t)   Sum Over      BNAN(1)   V146-V159
V174*-V187* Property Income - M.U.                      M(t)   Sum Over      BNAN(1)   V76-V89
V188*-V199* 805 Earnings - M.U.                         M(t)   Sum Over      BNAN(1)   V160-V171
V200*-V211* 805 Earnings - head                         M(t)   Associate     BNAN(1)   V160-V171
V212*-V229* SSA benefits                                M(t)   Sum Over      BNAN(1)   V172-V189
V230*-V249* SSA family benefits                         F(t)   Disassociate  Z(t)
            Z = SSA benefits                            F(t)   Sum Over      BNAN(1)   V172-V189
V250*       Record Mark

1/ Code 1 if a wife number > 1 occurs in any t.
2/ First spouse if any ambiguity.

Positions of the time series in the extract (years listed: 1947 through 1964):

Series                              First position   Last position
Data Available                      V16*             V33*
AGI - M.U.                          V34*             V47*
AGI - Head                          V48*             V61*
W&S + Self Emp. - M.U.              V62*             V75*
W&S + Self Emp. - Head              V76*             V89*
Medical-Dental Expenses             V90*             V103*
Total Deductions                    V104*            V117*
Occupation - Head                   V118*            V131*
Occupation - Spouse                 V132*            V145*
Marital Status - Head               V146*            V159*
# of dependents                     V160*            V173*
Property Income - M.U.              V174*            V187*
805 Earnings - M.U.                 V188*            V199*
805 Earnings - Head                 V200*            V211*
SSA Benefits - M.U.                 V212*            V229*
SSA Benefits - Family               V230*            V249*
Record Mark                         V250*

http://www.ssc.wisc.edu/wais/WAIS667051.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667051.txt

Ashok Bhargava, 1967
Subject-Wise Index of WAIS Papers
July 11, 1967
WAIS paper 667-055
Administration

Ashok Bhargava
WAIS 667-055
July 11, 1967
PRELIMINARY

Subject-Wise Index of WAIS Papers

It is proposed to have a subject-wise index of WAIS papers. This is the first step in the preparation of this index. The subjects are in alphabetical order; within each subject class, the WAIS papers are listed in chronological order. Any suggestions for additions, deletions, new sub-headings, or subject headings are welcome.

SUBJECTS
1. Administration
2. Age Data
3. Analyses
4. Averaging Studies
5. Benefit File
6. Consistency of Data
7. Cross Tabulations
8. Data Processing
9. Data - 1962 State Tax Roll
10. Extract - 01
11. Fixed Format Identification File (FFID)
12. Fixed Format Identification - 805 File
13. Formats
14. General Papers
15. History File
16. Kahn Output
17. Maintenance System
18. Master File - Tax Records
19. Medical Expense Data
20. Missing Data (Master File Records)
21. Miscellaneous
22. Programs
23. Property File
24. Proposals - for Theses, Analyses, etc.
25. Selection File
26. Social Security
27. Social Security Earnings Data
28. Survey File and Data
29. WAIS Sample

These will also be filed as cards:
1. Social Security Benefit Data - See under Benefit File.
2. Tax Averaging Studies - See under Averaging Studies.
3. Interview File - See under Survey.
4. ID File - See under Fixed Format Identification File.
5. State Tax Roll Records, 1962 - See Data, 1962.

WAIS #   Author   Title   Date

I.
Administration 645-059 Geffert "General Description" 4-28-65 645-061 Ryshpan "Project Description" 4-27-65 667-010 Moyer, Hinckley Indexing WAIS Tables and Listings 9-29-66 667-014 Bauman On a Wais Inventory and a Proposal 667-016 Von Schneidemesser for a Finder File 10-7- 66 Documentation and Housekeeping 667-027 deVries Procedures for WAIS Staff 11-4- 66 Proposal for the Standardization 667-039 Esterly of WAIS 2-8- 57 WICS: WAIS Inventory & Control 667-044 deVries System 5-2- 67 Requirements for an Integrated Data Processing System for WAIS Data Files 6-15-67 667-045 David An Inventory of Some Major Data II. Age Data Processing Activities 6-15-67 645-065 Moyer & Cassidy "A Proposal to Determine Birth 656-030 Moyer & Geffert and Death Date Data for Persons in. the Master File Who Have No Social Security Account Number" 6-18-65 Listings and Procedure for Getting 656-048 Dey Age and Death Data 11-9 -65 Search for the Age Data 3-21-66 656-055 Bauman Card Format for Supplementary 667-052 Esterly Age Data 4-19-66 Integration of Age Data and Tax III. Analysis Return Files 7-6- 67 645-003 Miller "WAIS Memorandum on Initial 645-005 Moyer Tabulations of Tax Return Data 7-28-64 "Methods of Devising a Distribution 645-007 David of Unrealized Capital Gains 9-22-64 "Note on a Recursive Model of Income 645-011 Moyer Determination for the Wisconsin Assets and Incomes Study" 10-16-64 A Simple Model of Optimal Asset Management Including the Investor's Estate Motive 11-4- 64 WAIS # Author Title Date III. Analysis (cont.) 645-015 Moyer The Locked-In Effect and the Aging Investor 11-30-64 645-025 Moyer A Report on the U.W. 
Economic Study, Summer, 1964 1-13-65 645-030 Geffert DAIS Wisconsin Income Distribution 2-2- 65 645-035 Moyer An Estimate of Untaxed Wisconsin Income in 1959 and of Non-Filing 2-19-65 Individuals with Income in 1963 645-041 Moyer Comparison of Earnings Reported on the Income Tax with that Re- 3-18-65 ported to the Social Security Ad- ministration 645-071 Cassidy A List of Tables in the Wisconsin Summary Statistics 6-10-65 645-072 Durant Initial Estimates of Earnings - Dynamics Model Based on WAIS Income 6-17-65 Data for the Years 1958-59 656-016 Moyer New FFID's from Survey 8-25-65 656-019 Ryshpan Fixed Format ID File Maintenance System 9-3- 65 656-024 VonSchneidemesser An Additional Capability for Up & Moyer dating the FFID-File 10-21-65 656-045 VonSchneidemesser Utility Print Programs for the Fixed Format Identification 3-9- 66 656-052 VonSchneidemesser A Definition and Description of the Fixed Format ID File 4-4- 66 656-060 VonSchneidemesser Some Considerations About the WAIS ID Number 5-3- 66 656-062 VonSchneidemesser Report on the Matching of the FFID File with Master, SSA FORM C05 and 6-16-66 State Roll Extract 667-023 Esterly Elimination of Invalid Multiple FFID Numbers Scope, Method, and 1-31-67 Results IV. Averaging Studies Format of SSRI WAIS Tax Extract 645-040 Durant #1 File 4-12-65 645-049 David Preliminary Tabulations SSRI DAIS TAY EXTRACT FILE #1 4-27-65 WAIS # Author Title Date IV. Averaging Studies (cont.) 645-050 Miller Creation of 1964 Tax Averaging Law File No. 
1--Computation 4-27-65 Year 1958 645-051 Miller Format of 1964 Tax Averaging Law File #1 4-28-65 645-052 Miller Functions Derived from 1964 Tax Averaging Law File 11 4-20-65 656-004 Miller Proposed Scope and Outline of Averaging Monograph 7-14-65 656-005 Groves Chapter 1 (Averaging Monograph) 7-20-65 656-009 Durant Results of Consistency Checks on Averaging Records 3-2- 65 656-010 Moyer & Begum Summary of Wistab Cards for the Preliminary Tabulations 8-3- 65 656-029 David Notes on Future Work in the Area of Averaging Studies 11-3- 65 656-034 Duchan Description of Initial Cross tabulations from 1964 TAX Averaging 12-8- 65 Law File 656-038 Groves Eligibility Tables 1-10-66 656-041 Duchan & Moyer Changes in Computation of Income and Deductions for the Averaging 2-3- 66 Tables 656-047 Miller Computations for Income and De- ductions for the Averaging Tables 3-17-66 656-058 Duchan Description of DAIS Tax Averaging Tables Using the Federal Definition 5-12-66 of Personal Deductions 667-004 Wiegner & Bauman Summary of Specifications for Age Occupation from 1964 Tax Averaging 8-22-66 Law File 667-017 Bauman & Duchan On Interpretation of the Tax Aver aging Tables and a Proposal for 11-7- 66 Further Summary Tables V. Benefit File 645-036 Bridges WIAIS and the SSA 2-23-65 656-006 Bauman Social Security Claims 7-22-65 656-028 Bauman Codes for "Unknown" Lump Sum Death Payment 10-22-65 WAIS # Author Title Date V. Benefit File (cont.) 656-032 Bauman Notes on the Processing of Social Security Benefits Data 11-18-65 667-001 VonSchneidemesser Benefit Data Processing Plan 7-7- 66 667-013 Bauman Summary and Timetable for Com Bauman pletion of Benefit Year Records and 11-10-66 Social Security Data Records 667-036 David Analysis of Social Security Benefit Data 4-19-67 667-047 Bauman Development of WAIS-Social Security Benefit Payments Data File 6-16-67 VI. 
Consistency of Data 645-020 Geffert Proposal for Consistency Check 12-15-64 645-022 Bauman & Seavey Report of Check-coding of Inter view Schedule and Booklet 1-4- 65 645-044 Miller First Summary Report Consistency Check Error Tabulations 4-1- 65 645-060 Geffert Consistency Check Program Edits 4-29-65 656-009 Durant Results of Consistency Checks on Averaging Records 8-2- 65 656-020 Moyer & Begum Reported on the Checking of Ron's Inconsistent Coded Data 4 Messages 9-7- 65 with recommendations VII. Cross-Tabulations 645-003 Miller WAIS Memorandum on Initial Tabu lations of Tax Return Data 7-28-64 645-021 Durant Proposed Format for WAIS Crosstabs 12-15-64 645-031 Durant Proposed Format of an Extraction Record to be Used in WAIS Cross 2-3- 65 Tabulations 645-040 Durant Revision of WAIS Working Paper 645-031--Proposed Format of an 3-2- 65 Extraction Record for Extract #1 File to be Used in WAIS Cross Tabu lations 645-053 Miller Initial Crosstabulations from 1964 Tax Averaging Law File #1 4-23-65 WAIS # Author Title Date VII. Cross Tabulations (cont.) 656-001 Geffert Document "MEANS", a Program to Calculate Means From WISTAB 6-30-65 Output 656-034 Duchan Description of Initial Cross tabulations from 1964 TAX Averaging 12-0- 65 Law File 667-031 David Cross Tabulations for the Longitud inal Analysis 3-16-67 VIII. 
Data Processing 645-001 Durant Document: Proposal for Wisconsin Income Tax Data Processing Pro 7-16-64 cedures 645-034 David Data Processing on WAIS 2-8- 65 645-035 Moyer Tape Record for Medical Expense Data 3-2- 65 645-043 Bauman Editing Corrections Used in In terview Coding 3-30-65 645-059 Geffert General Description 4-28-65 645-061 Ryshpan Project Description 4-27-65 656-032 Bauman Notes on the Processing of Social Security Benefits Data 11-100-65 656-056 Bauman, Geffert Problem Definition for Microfilming & Moyer 1959-1964 Tax Returns 4-21-66 667-001 VonSchneidemesser Benefit Data Processing Plan 7-7- 66 667-002 & Bauman Processing 1959-1964 Wisconsin Tax Moyer. Geffert, Bauman, deVries & Filing and Coding Manual 7-26-66 667-007 Hinckley Some Thoughts on the WAIS Survey deVries File and Its Readiness for Analysis 9-26-66 667-013 deVries Processing 1959-1964 Wisconsin In come Tax Returns 10-7- 66 667-015 deVries Outline for Further Processing of the 1959-1964 Wisconsin Income Tax 10-18-66 Data 667-033 Esterly A Note on Integration of Files 4-5- 67 667-035 deVries 1959-1964 Wisconsin Income Tax Data Outline and Flow of Processing 4-11-67 Operations WAIS # Author Title Date VIII. Data Processing (cont.) 667-033 Bussman & Processing 1959-64 Wisconsin Income 4-24-67 deVries Tax Returns: Property Income File Coding Manual 667-052 Esterly Integration of Age Data and Tax 7-6- 67 Return Files IX. Data 1962 (State Tax Roll Records) 645-069 Moyer Format of 1962 State Tax Roll Records 6-7- 65 645-071 Cassidy A List of Tables in the Wisconsin Summary Statistics 6-10-65 656-042 Moyer 1962 Deductions of Wisconsin Tax payers 2-21-66 X. 
Extract - 01 645-049 David Preliminary Tabulations SSRI WAIS TAX EXTRACT FILE #1 4-27-65 645-057 Durant Format of SSRI WAIS TAX EXTRACT #1 FILE: Format of SSRI WAIS TAX EX 4-28-65 TRACT #1A File 645-070 Durant Report on the Running of EXTRACT-01 Which Created EXTRACT 1 and A Files as outlined in WAIS Working Paper 645-057, April 23, 1965 (1st Revision) 6-9- 65 656-012 Durant Operating Instructions 3-25-65 656-023 VonSchneidemesser The Updating of Extract 01 10-21-65 656-051 David Preliminary Marital Unit and Family Integration on WAIS Files, 6-22-67 with An Application to EXT01 BINARY VERSION XI. Fixed Format Identification File (FFID) 645-012 Geffert Proposed Identification and SS Record Format 11-16-64 645-055 Ryshpan Updating the Fixed Format ID File 4-27-65 645-053 Ryshpan Fixed Format ID Tape 4-29-65 645-063 Ryshpan Identification and Social Security Record Format 4-29-65 656-016 Moyer New FFID's from Survey 8-25-65 WAIS 0 Author Title Date XI. Fixed Format Identification File_ (FFID) (cont.) 656-019 Ryshpan Fixed Format ID File Maintenance System 9-,3- 65 656-024 VonSchneidemesser & An Additional Capability for Up Moyer dating the FFID File 10-21-65 656-045 VonSchneidemesser Utility Print Programs for the Fixed Format Identification 3-9- 66 656-060 VonSchneidemesser Some Considerations About the WAIS ID Number 5-3- 66 656-062 VonSchneidemesser Report on the Matching of the FFID File With Master, SSA FORM 005 and 6-16-66 State Roll Extract 667-023 Esterly Elimination of Invalid Multiple FFID Numbers Scope, Method, and Results 1-31-67 XII. Fixed Format Identication-805 File 645-012 Geffert Proposed Identification and SS Record Format 11-16-64 III. Formats 645-004 Geffert Format, Card Record 9-16-64 645-012 Geffert Proposed Identification and SS Re cord Format 11-16-64 645-017 Moyer & Roubal Document: The Keypunching of the Tax Forms. 
12-2- 64 645-021 Durant Proposed Format for WAIS Crosstabs 12-15-64 (Kahn Output - after 645-024) Proposed Format of an Extraction 645-031 Durant Record to be Used in WAIS Cross 2-3- 65 Tabulations 645-038 Moyer Tape Record for Medical Expense (coding) Data 3-2- 65 645-038 (medical) Durant Revision of WAIS Working Paper 645-040 645-031 - Proposed Format of an 3-2- 65 Extraction Record for Extract #1 File to be Used in WAIS Cross Tab ulations and Regressions 645-045 Moyer Card Format for Multiple Social Security Number Cases 4-8- 65 WAIS 0 Author Title Date XIII. Formats cont-. 645-046 Durant Assignment of Entry Codes to Up date and Correct Existing WAIS Master Records (7 Digit Pre-Con 4-20-65 sistency Master Record) 645-047 Barger Extracted Record for Tax Averaging 4-23-65 645-051 Miller Format of 1964 Tax Averaging Law File #1 4-28-65 645-057 Durant Format of SSRI WAIS TAX EXTRACT #1 FILE: Format of SSRI WAIS TAX EX 4-28-65 TRACT #1A File 645-058 Ryshpan Fixed Format ID Tape 4-29-65 645-069 Moyer Format of 1962 State Tax Roll Records 6-7- 65 645-074 Bauman Logging of Social Security Claims Data 6-24-65 656-006 Bauman Social Security Claims 7-22-65 656-011 Durant Assignment of Entry Codes to Update Existing WAIS Post-Consistency 400 8-12-65 Character Master Record File (9'. Digit Amount Fields). 656-031 Geffert Document - Proposed History File Format 11-18-65 656-055 Bauman Card Format for Supplementary Age Data 4-19-66 667-030 deVries 1959-1964 Wisconsin Income Tax Re turns Tentative Record Formats 3-15-6788XIV. General Papers (regarding WAIS) 645-025 Moyer. A Report on the U.W. Economic Study, 1-13-65 Summer, 1964 645-030 Geffert WAIS Wisconsin Income Distribution 2-2- 65 645-036 Bridges WAIS and the SSA 2-23-65 Sample of Wis. I. 
Tax Returns - Miller (after 667-043) 645-059 Geffert General Description 4-28-65 645-061 Ryshpan Project Description 4-27-65 645-073 Moyer & Bauman Some Preliminary Tests of Whether the Method of Choosing Name Groups Influenced some Characteristics of the WAIS Tax Sample 6-21-65 WAIS # Author Title Date XIV. General Papers (receding WAIS) (cont_.) 656-044 Geffert Computerized Error Correction 2-24-66 667-011 Moyer Applied to Income Tax Returns 9-29-66 667-045 David A Profile of the Wisconsin Income 6-15-67 XV. History File Taxpayer, 1964 11-18-65 An Inventory of Some Major Data 4-21-66 Processing Activities 6-28-67 Document - Proposed History File 1-4-66 Format 1-4-66 History File Job Plan (General) History File (present condition) Kahn Tape File Description Format of Kahn Records 656-031 Geffert 656-057 Geffert 667-049 Loniello XVI. Kahn Output (Paper after 645-0_4) 656-035 Geffert 656-036 Geffert XVII. Maintenance. System-files, data, etc. 656-012 Durant Operating Instructions 8-25-65 656-013 Durant Operating Instructions 8-25-65 656-014 Durant Operating Instructions 8-25-65 656-015 Durant Operating Instructions 8-25-65 656-019 Ryshpan Fixed Format ID File Maintenance System 9-3- 65 656-024 VonSchneidemesser An Additional Capability for Up & Moyer dating the FFID File 10-21-65 667-003 VonSchneidemesser The WAIS Master File Maintenance System 8-8- 66 667-010 Moyer & Hinckley Indexing WAIS Tables and Listings 9-29-66 667-014 Bauman On a WAIS Inventory and a Proposal for a Finder File 10-7- 66 667-016 VonSchneidemesser Documentation and Housekeeping Procedures for WAIS Staff 11-4-`66 667-021 VonSchneidemesser Correcting and Updating the Various WAIS Files and Approximate Sequence 1-12-67 of Steps 667-024 VonSchneidemesser WAIS File Maintenance 2-3- 67 667-025 VonSchneidemesser UPDATEAL-A Program to Update Tape Files 2-9- 67 667-044 deVries Requirements for an Integrated Data Processing System for WAIS Data Files 6-15-67 WAIS # Author 11 Title Date XVIII. 
Master File - Tax Records 645-001 Durant Document: Proposal for Wisconsin Income Tax Data Processing Pro 7-16-64 cedures 645-004 Geffert Format, Card Record 9-16-64 645-008 Durant Actual Computer Times and Flow of Sorting and Master Creation Runs 10-21-64 645-009 Geffert Proposed Method of Merging SS In formation with Wisconsin Income Tax Data 10-23-64 645-017 Moyer & Roubal Document: The Keypunching of the Tax Forms 12-2- 64 645-020 Geffert Proposal for Consistency Check 12-15-64 645-038 Moyer Tape Record for Medical Expense (coding) Data 3-2- 65 645-047 Barger Extracted Record for Tax Averaging 4-23-65 645-054 Durant Programming Systems Involved in the Creation and Updating of the WAIS 4-23-65 Master Income File 645-056 Geffert Master File Format 4-27-65 \1 I 645-067 Geffert Utility Print Programs for the Master File PRINTMAS 5-26-65 645-068 Durant Report on the Second Updating of the WAIS Master File 6-2- 65 656-011 Durant Assignment of Entry Codes to Update Existing WAIS Post-Consistency 400 8-12-65 Character Master Record File (9 Digit Amount Fields) 656-013 Durant Operating Instructions 3-25-65 656-014 Durant Operating Instructions 8-25-65 656-015 Durant Operating Instructions 8-25-65 656-037 Geffert Logical Construction of the 400 Character Master File 1-7- 66 656-056 Bauman, Moyer, Problem Definition for Microfilming & Geffert 1959-1964 Tax Returns 4-21-66 656-062 VonSChneidemesser Report on the Matching of the FFYR File with Master, SSA FORM 805 and 6-16-66 State Roll Extract 667-002 Moyer, Geffert, Processing 1959-1964 Wisconsin Tax Bauman, deVries, Returns Filing and Coding Manual 7-26-66 & Hinckley 667-003 VonSchneidemesser 667-013 deVries 667-015 deVries 667-030 deVries 667-034 VonSchneidemesser 667-037 deVries 667-040 deVries 667-042 deVries \I-. 
667-046 VonSchneidemesser 667-0400 Bhargava 667-051 David 1959-1964 Wisconsin Income Tax Re turns Tentative Record Formats (3-15-67) 3-15-67 improving the 1946-60 Master File Tape 4-7- 67 1959-64 Wisconsin Income Tax Data Compatibility of Old and New Master Files 4-21-67 Detailed Outline for Further Proces sing of 1959-1964 Wisconsin Income Tax Summary Data Vol. I - The Pro cessing of the ID-Cards 5-3- 67 1959-1964 Wisc. Income Tax Data: De tailed Instructions for Processing Summary Cards 6-6- 67 Plan of Operations for Generating the Longitudinal-Analysis-of-Income File 6-16-67 Transcribing and Inputing AGI & NTI Fields for the 443 Character Master File 6-16-67 Preliminary Marital Unit and Family Integration on WAIS Files, With an Application to EXT01 BINARY VERSION 6-22-67 WAIS # Author Title Date XVIII. Master File - Tax Records The WAIS Master File Mainten ance System 8-8- 66 Processing 1959-1964 Wisconsin Income Tax Returns 10-7- 66 Outline for Further Processing of the 1959-1964 Wisconsin Income Tax Data 10-18-66 XIX. Medical Expense Data 645-033 Moyer Tape Record for Medical Expense Data 3-2- 65 (Medical) XX. Missing Data (Master File Records) 667-029 Bhargava Proposed Procedure for Updating Old and New Master Files with Respect to 2-23-67 Residual Tax Records for WAIS's Sample 667-032 Bhargava Analysis of Folder Shots-Required for Updating Old and New Master Files 4-4---67 With Respect to Residual Tax Records for WAIS's Sample WAIS # Author 13 Title Date XX. Missing Data (Master File Records) (Cont.) 667-043 Bauman Residual Tax Records-Analysis of Results of Visit to Tax De 6-13-67 partment XXI. Miscellaneous 645-073 Moyer & Some Preliminary Tests of Whether Bauman the Method of Choosing Name Groups 6-21-65 Influenced Some Characteristics of the DAIS Tax Sample 656-018 Geffert Document - Transformation of Multi 8-24-65 dimensional Arrays to One-Dimension al Arrays XXII. 
Programs 645-014 Durant Description of Tax 04 Program 11-21-64 645-016 Durant Edit Procedure (Keypunching) 12- 1-64 645-027 Durant Description of Tax 06 2-1- 65 645-023 Durant Description of Tax 03 2-1- 65 645-054 Durant Programming Systems Involved in the Creation and Updating of the WAIS 4-28-65 Master Income File 645-060 Geffert Consistency Check Program Edits 4-29-65 645-066 Geffert Modifications of WISTAB 6-29-65 645-067 Geffert Utility Print Programs for the Master File PRINTMAS 5-26-65 645-070 Durant Report on the Running of EXTRACT-01 which Created EXTRACT 1 and 1A Files 6-9- 65 as Outlined in WAIS Working Paper 645-057 656-001 Geffert Document "MEANS" a Program to Cal culate means from WISTAB Output 6-30-65 656-012 Durant Operating Instructions 8-25-65 656-013 Durant Operating Instructions 8-25-65 656-014 Durant Operating Instructions 8-25-65 656-015 Durant Operating Instructions 8-25-65 656-021 Moyer Report on the Construction of the "Average of Variables" Tape 9-17-65 656-025 VonSchneidemesser To Use the Program FFYR: HEADR: Redate Last Year 10-21-65 TITLE: FFYR 1410 Program WAIS # Author XXII. Programs 656-026 VonSchneidemesser 656-045 VonSchneidemesser 667-025 VonSchneidemesser XXIII. 
Property File 645-005 Moyer 645-006 Moyer 645-013 Seavey 645-015 Moyer 645-017 Moyer & Roubal 645-022 Bauman & Seavey 656-039 Geffert 656-050 Miller 656-053 Miller 667-020 Bauman Outline and Timetable for Preliminary Processing of WAIS Files Relevant to Analyses of Property Income DATA 11-22-66 667-028 Bussman Remaining Clean-up Operations on File 12 and Files 21-28 2-9- 67 14 Title Date Purpose and Operation of the 10-21-65 Programs FFID, FFIDS, FFIDE Utility Print Programs for the 3-9- 66 Fixed Format Identification UPDATEAL - A Program to Update Tape 2-9- 67 Files Methods of Devising a Distribution 9-22-64 of Unrealized Capital Gains Gathering a List of Prices 10-13-64 Attitudes of Individual Investors 11-17-64 The Locked-In Effect and the Aging 11-30-64 Investor Document: The Keypunching of the 12-2- 64 Tax Forms Report of Check-coding of Inter 1-4- 65 view Schedule and Booklet Notes on Portfolio Construction 1-21-66 Portfolio Evaluation from Wisconsin 3-23-66 Individual Income Tax Returns: I. General Considerations Portfolio Evaluation from Wisconsin 4-12-66 Individual Income Tax Returns 667-033 Bussman & deVries Processing 1959-64 Wisconsin Income Tax Returns: Property Income File Coding Manual 4-24-67 XXIV. Proposals - for Analyses, Theses, etc. 645-003 Miller WAIS Memorandum on Initial Tabu lations of Tax Return Data 7-23-64 645-005 Moyer Methods of Devising a Distribution of Unrealized Capital Gains 9-22-64 645-007 David Note on a Recursive Model of Income Determination for the Wisconsin As sets and Incomes Study 10-16-64 WAIS # Author 15 Title Date XXIV. Proposals - for Analyses, Theses, etc. 
645-010  Durant  Proposal for an Econometric Analysis of the Earnings Dynamics of Taxpayer Units for a Constant Sample of Wisconsin Income Taxpayers for the Period 19xx-19xx  11-2-64
645-013  Seavey  Attitudes of Individual Investors  11-17-64
645-019  Bauman  Methods of Constructing a Life Income Measure  12-11-64
645-023  Seavey  Proposal for Interview Analysis  1-5-65
645-037  David  Notes on the Use of Wisconsin Assets and Incomes Data in Studies of Retirement  2-23-65
645-042  Moyer  A Proposal For a Thesis on the Methodology of WAIS  3-29-65
645-065  Moyer & Cassidy  A Proposal to Determine Birth and Death Date Data for Persons in the Master File Who Have No Social Security Account Number  5-24-65
645-072  Durant  Initial Estimates of Earnings-Dynamics Model based on WAIS Income Data for the years 195E-59  6-17-65
656-002  Moyer  A Proposal for Indexing Tables and Other Tools of Analysis  7-8-65
656-004  Miller  Proposed Scope and Outline of Averaging Monograph  7-14-65
656-046  Geffert  Completion Estimates  3-15-66
667-033  Esterly  A Note On Integration of Files  4-5-67
667-046  VonSchneidemesser  Plan of Operations for Generating the Longitudinal-Analysis-of-Income File  6-16-67
667-051  David  Preliminary--Marital Unit and Family Integration on WAIS Files, With An Application to EXT01 BINARY VERSION  6-22-67

XXV. Selection File
656-033  Moyer  The Treatment of Wives in WAIS' Tax Sample  11-24-65
667-012  Bauman  A Plan for Preparation of a Longitudinal Selection File  10-6-66
667-018  Bauman  Summary and Timetable for Completion of Benefit Year Records and Social Security Data Records  11-10-66

XXVI.
Social Security 645-009 Geffert Proposed Method of Merging S: Information with Wisconsin Income Tax Data 10-23-64 645-036 Bridges WAIS and the SSA 2-23-65 645-041 Moyer A Comparison of Earnings Reported on the Income Tax With That Reported to the Social Security Adminis tration 3-18-65 645-045 Moyer Card Format for Multiple Social Sec urity Number Cases 4-8 -65 656-006 Bauman Social Security Claims 7-22-65 656-007 Moyer Table Specifications for the Social Security Covered Earnings Project 7-20-65 656-005 Moyer Approximate Costs for the SSA Tables 7-28-65 667-018 Ryshpan Fixed Format ID File Maintenance System 9-3- 65 XX.VII. Social Security Earnings Data-305 645-012 Geffert 645-063 Ryshpan 645-074 Bauman 656-007 Moyer 656-008 Moyer 656-022 VonSchneidemesser 656-027 VonSchneidemesser Proposed Identification and SS Record Format Identification and Social Security record Format 4-29-65 Logging of Social Security Claims Data 6-24-65 Table Specifications for the Social Security Covered Earnings Project 7-28-65 Approximate Costs for the SSA Tables 7-28-65 Integration of New 805 Data with Old 805's 10-21-65 Identification Code for Social Security Administration Punch Card Files 10-21-65 Report on the Matching of the FFID File with Master, SSA FORM 805 and State Roll Extract 6-16-66 11-16-64 656-062 VonSchneidemesser Number Author Title Date XXVIII. 
Survey Data and File
645-002  Moyer  Sampling Procedures for the 2,000 Name Interview Sample  7-17-64
645-018  Moyer  The Response Rates of the Interview Sample  12-11-64
645-023  Seavey  Proposal for Interview Analysis  1-5-65
645-024  Moyer  ID Numbers Assigned to Individuals from the Unmatched State File  1-5-65
645-026  Moyer  A System of Weights for the Interview Data  1-26-65
645-029  Geffert  Proposed Flow Diagram of Weighting Procedures  2-2-65
645-032  Bauman  A List of Hand Matches of Interviews and Tax Forms  2-5-65
645-033  Seavey  Suggested Additions to Coding and Punching of Stock and Bond Information in Assets Booklet  2-8-65
645-039  Moyer  The Weighting of the Interview Data for the Report to Respondents  3-2-65
645-043  Bauman  Editing Corrections Used in Interview Coding  3-30-65
645-062  Bauman  A Summary of Information Contained on Data Cards for Interviews and Booklet Questions from the WAIS Survey  4-28-65
645-064  Weininger  Distribution of Response and Non Response (by Category) by "Key" Stratification Variables  5-18-65
656-016  Moyer  New FFID's from Survey  8-25-65
656-043  Moyer  A Weighting Function for WAIS Interview  2-22-66
656-049  Moyer  Cards Formats for the WAIS Interview Survey  3-22-66
656-061  Moyer  Tables From the Survey  5-5-66
667-005  Moyer  Some Evidence of the Validity of the Weighting Function  9-22-66
667-006  Lieberman  FORMAT for X-TAB INPUT
667-007  deVries  Some Thoughts on the WAIS Survey File and its Readiness for Analysis  9-26-66
667-008  Moyer  An Additional Comment on the Survey Weights  9-28-66
667-009  deVries  Check for Presence of Proper Cards in the Survey File  9-28-66
667-022  Lieberman  Processing the WAIS Survey  1-19-67
667-026  Lieberman  COVERSHEET FORMAT  2-6-67
667-041  Lieberman  WAIS Report on the WAIS Survey  5-23-67

XXIX.
WAIS Sample
645-073  Moyer & Bauman  Some Preliminary Tests of Whether the Method of Choosing Name Groups Influenced Some Characteristics of the WAIS Tax Sample  6-21-65
656-003  Bauman  Analysis of Variance Tests for the Effects of Criteria and Sampling Techniques on Name Group Size in WAIS's Sample  7-8-65
656-033  Moyer  The Treatment of Wives in WAIS' Tax Sample  11-24-65
656-056  Bauman, Geffert, & Moyer  Problem Definition for Microfilming 1959-1964 Tax Returns  4-21-66

http://www.ssc.wisc.edu/wais/WAIS667055.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS667055.txt

Mike VonSchneidemesser, 1966. Some Considerations About the WAIS ID Number. May 3, 1966. WAIS paper 656-060. Fixed Format Identification File (FFID); General Papers (Regarding WAIS).
http://www.ssc.wisc.edu/wais/WAIS656060.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656060.txt
Some of the data in this document contains confidential information. The confidential information has been removed and preserved in the secure data archive (OLDR). Contact CDHA Data Archivist at cdhadata@ssc.wisc.edu for more information.

M. von Schneidemesser
WAIS Paper 656-060
May 3, 1966

Some Considerations About the WAIS ID Number

On the occasion of expanding the WAIS files with the 60-64 returns, it seems appropriate to reconsider the current ID number and the family unit concept. It is suggested that we abandon the family unit concept and retain only a husband and wife unit, for the following reasons:

(1) Up to now the family unit has never been used in any research work.
(2) If it were used, the results would be poor, since we did not succeed in integrating all families.
(3) The biggest problem in integrating the new returns with the old ones would consist in trying to integrate dependent children with their parents, and then to integrate those with the 46-59 returns. This can only be done manually; we do not expect to succeed to a satisfactory degree, it will hold up the flow of operations, and thus it will cost us a lot.
(4) Many of the clerical and conceptual problems in our previous work were caused by this ideal but hardly practical concept.
(5) The family unit concept results in many people having two ID numbers. This is unfortunate for most statistical computations. The number of cases with two ID's is expected to go up the more years we have in our folder, due to dependents setting up their own households.

For these reasons we may want to assign new family unit numbers to all dependents who are not a wife of somebody. There are two ways of doing this, but before going into the technical details of these processes, there are some other problems we should consider, since they could be resolved together with the family unit problem.

(a) Gene Moyer estimates that up to 25% of the new 60-64 returns are new filers and therefore need new numbers and folders. I estimate that up to 20% of all the people currently in our file are dependents who would need new numbers if separated from their parent folder. This means that we have to increase our numbers by up to 45%. Even more numbers are needed when we get the 65-70 returns, etc. Currently in some name groups we have numbers up to ..0800.. At 1000 the high income supplement numbers start. This means we have to find ways of accommodating all our new filers and dependents.
(b) It would be nice if name group 51 could be integrated again with 24 (N******).
(c) Occasionally some staff members wanted to introduce some additional codes in our ID number. This would be the time to do it.

Methods of assigning new family numbers

Method I

Wives so far are uniquely identified by a "......10" or "......20" in the ID number. If we give all other dependents their own family ID number, position 8 in the ID becomes obsolete. We could therefore simply exchange position 7 with position 8 in the ID and consider the new position as an additional digit in the consecutive ID number.
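The exchange of positions 7 and 8 can be sketched in a few lines. This is only an illustration, not the program the paper calls for: the function name is hypothetical, and it assumes the ID is held as an 8-character digit string and that every dependent other than a wife has already been given a separate family number, so that position 8 is "0".

```python
def exchange_positions_7_and_8(old_id: str) -> str:
    """Swap positions 7 and 8 (1-based) of an 8-digit ID string.

    Hypothetical helper: the 8-character string representation is an
    assumption made for this sketch.
    """
    if len(old_id) != 8 or not old_id.isdigit():
        raise ValueError(f"expected an 8-digit ID, got {old_id!r}")
    # Positions 1-6 are unchanged; the last two characters trade places.
    return old_id[:6] + old_id[7] + old_id[6]


# A wife coded "......10": the wife marker moves to position 8 and the
# freed "0" in position 7 joins the consecutive number.
print(exchange_positions_7_and_8("34012310"))  # 34012301
```

Applying the function twice restores the original ID, which makes the change easy to verify on a test file before converting the real ones.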
This would increase the number of available ID's by a factor of 10 and should be sufficient until the year 2000, even in the face of a rapid population increase. This would take care of separating dependents and increasing the number supply (problem (a)). Generally this will create sufficient gaps in the numbering sequence to allow us to insert new filers in their approximate alphabetical sequence.

To implement this we need to write one program accepting records up to 500 positions long, so that it can read the Master file, ID file, Capital Gains file, Benefit file and Interview file. The program has to check for dependent daughters (coded "......11" etc.) and change their numbers to 01 or insert them behind their brothers, if available. Then it should exchange positions 7 and 8.

Method II

This is a proposal to renumber the whole file. That will give us a file sorted alphabetically, with the ID numbers assigned correspondingly. All the objectives (a) to (c) will be accomplished. Basically only two programs are needed. Proceed as follows:

Before the integration of the new returns with the old ones, take the most recent ID tape sorted on ID and sort out all people with a "70" ID, and the interview or benefit ID's which have a "1" or a "2" in position 3 of the ID. When pulling out these records, append a field for the new consecutive number: the name group stays the same; put a 9 for the 1000 ID and an 8 for the 2000 ID. Then sort this on the new number.

The remainder of the ID file should be written out on another tape with the following changes made: for every "......10" or "......20" ID number which is associated with a husband, overlay her middle and first name with the husband's first and middle name. Move a consecutive number (0,...,9, 0,...,9, 0,...,9 etc.) into the last position of the middle name of all these wives. This procedure will prevent the wives from getting separated from their husbands in the subsequent sort on middle, first and last name.
Assign consecutive new ID numbers within the name group and merge this file with the "70", interview and benefit numbers, using the new ID.

To change the ID numbers in our existing files, write a program which accepts all WAIS record formats. This program should match the input file, on the old ID number, with the ID tape having the new ID numbers, and substitute the new ID for the old ID when matched. An output file of nonmatching records from our existing files will give us a chance to put these missing FFID's on the new ID tape. When integrating the new 60-64 returns we can list out stickers containing the new ID's and the old ID's, making it easy to match the old with the new folders.

For all people who came into the sample from 60 on, we will then have numbers available up to 8000. For all practical purposes this would result in a reduction of our effective ID number from 8 to 7 positions, since the wives are identified in position 7. Position 8 could be left blank or used for whatever purpose we want. People who enter the file through a 60-64 return could be assigned numbers beginning with a new full hundred; that will clearly identify people as to when they entered the sample, since a new alphabetical sequence starts here.

Treatment of divorced wives

If a wife gets divorced, keep her in her former husband's folder if we still have her returns; that implies she would continue to use her former husband's name. Only when she marries again and her new husband is in the sample would we give her a second ID number, which identifies her as the wife of her new husband, and file her returns from that time on with the new husband. These cases, luckily, will be exceedingly rare, thus minimizing the problems of more than one ID. It would be even better if we could avoid these few multiple ID's completely.
But the proposed method would minimize changing ID's and thus confusion, since ID changes will be necessary only when an unmarried sample woman marries a sample man.

Mike VonSchneidemesser, 1966. A Definition and Description of the Fixed Format ID File. April 4, 1966. WAIS paper 656-052. Fixed Format Identification File (FFID).
http://www.ssc.wisc.edu/wais/WAIS656052.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656052.txt
Some of the data in this document contains confidential information. The confidential information has been removed and preserved in the secure data archive (OLDR). Contact CDHA Data Archivist at cdhadata@ssc.wisc.edu for more information.

M. von Schneidemesser
WAIS Paper 656-052
April 4, 1966

A Definition and Description of the Fixed Format ID File

During the updating of the Fixed Format Identification File (FFID file), frequent questions arose as to what address and person are relevant for the FFID file. From this I concluded that the purpose and meaning of the FFID file should once and for all be rigorously defined so that (a) future updating will be done in a manner consistent with earlier practices, and (b) the coding personnel need not continuously consult a more experienced WAIS staff member in order to make a correct decision if an unusual case comes up.

I. The Purpose of the FFID File

The FFID file (the layout of which is described in WAIS 645-058) should

(a) List with the complete name every individual contained in one or more of the following WAIS files:
    (1) The Master file (i.e., individuals who have at least one tax return in the WAIS folders).
    (2) The Form 805 Data file from the Social Security Administration.
    (3) The Benefit file from the Social Security Administration.
    (4) The Interview file.
(b) Associate the individual with a unique FFID number and the one Social Security number used by WAIS, also called the "primary" SS#.
(c) Give the latest known address where the individual can be reached.

II.
Description of Some Aspects of the Major Sectors of the FFID Record

(1) The record type identifier (position 1) contains an "I" if the data came from a fixed format ID card, an "N" if the data have been changed with an N change card, and a "J" if the data were extracted from the 1962 Wisconsin Income tax tape. The different types of update cards are described in WAIS 645-055.

(2) The FFID number (position 2-9 on the layout and the code sheet). This identification number is described in general in WAIS 645-038 by Gene Moyer, pp. 2-5. For the treatment of wives see WAIS 656-033. (By the way: this last number means the 33rd WAIS paper written in the fiscal year 1965-66.) Only one individual can be associated with any one ID number. But one individual may appear under more than one FFID number, if he belonged to more than one family since 1946. The year (position 122-123) gives the last year for which this individual was a member of this particular family.

The first two digits (position 2-3) of the number designate the name cluster. N****** appears under either 24 or 51. For all persons with a last name not in one of our name groups, a "70" is coded.

The third digit, i.e., the first column of the running number in the name group (position 4-7), will generally be a "0". A "1" in this position indicates a person in the interview sample for whom no Master file entry exists. A "2" in this position indicates a person from the Benefit file without a Master file entry. A person in both the Interview file and the Benefit file but not on the Master file should be coded "1". For example: 3410**** is a person of name group 34 who appears in the Interview file and possibly also in the Benefit file. Care should be taken to assign "1" or "2" type family unit numbers only to those individuals who do not have any family member in the sample yet. If they do, they should be coded under their family FFID number.
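The layout of the FFID number described so far can be summarized in a short sketch. It is illustrative only: the class, function, and field names are hypothetical, the number is assumed to be held as an 8-digit string (record positions 2-9), and the label attached to a third digit of "0" is an assumption, since the text says only that this is the general case.

```python
from dataclasses import dataclass

# Meaning of the third digit of the FFID number (first column of the
# running number). The wording for "0" is an assumption of this sketch.
THIRD_DIGIT_MEANING = {
    "0": "general case",
    "1": "interview sample, no Master file entry",
    "2": "Benefit file, no Master file entry",
}


@dataclass(frozen=True)
class FFIDNumber:
    name_group: str       # record positions 2-3
    running_number: str   # record positions 4-7
    family_position: str  # record positions 8-9

    @property
    def outside_name_groups(self) -> bool:
        # "70" is coded for last names not in any of the name groups.
        return self.name_group == "70"

    @property
    def third_digit_meaning(self) -> str:
        return THIRD_DIGIT_MEANING.get(
            self.running_number[0], "not described in the text"
        )


def parse_ffid(number: str) -> FFIDNumber:
    """Split an 8-digit FFID number into its three fields."""
    if len(number) != 8 or not number.isdigit():
        raise ValueError(f"expected an 8-digit FFID number, got {number!r}")
    return FFIDNumber(number[0:2], number[2:6], number[6:8])


example = parse_ffid("34100000")
print(example.name_group, "|", example.third_digit_meaning)
```

Keeping the three fields as strings preserves leading zeros, which matter here because a running number such as "0012" and the third-digit flag are read by column rather than by numeric value.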
The position inside the family is represented by the last two columns (position 8-9). A "7" in position 8 indicates a lump sum death payment (see WAIS 656-028), but these cases are not contained on the FFID file. Wives divorced after 1946 who did not file a tax return should not be entered on the FFID. But the code 10 in the family unit should be reserved for them in case they show up in the Benefit file as a beneficiary. The following wife (if ever acquired) should then be coded as 20.

(3) The Social Security Number (SS#) (see page 6 of WAIS 645-038). Due to errors made by the taxpayers, this number is not always unique for one individual. We therefore have primary SS#'s and one or more secondary SS#'s. These secondary SS#'s will be contained in the History file. Only the primary SS# will be on the FFID file (position 10-13), and it has to be the same for any other entry of that individual in the FFID file. If Form 805 data are available, the SS# given there is the primary SS#, so that we can reliably locate any individual in the 805 file.

Note: If any changes in the FFID or SS# have to be made on the FFID file, these changes should also be made on the Master file and the Interview and Benefit files, if applicable. The only admissible change in the Form 805 data is a change in the FFID number.

(4) The Name (position 19-62) should be the most complete and most consistently used version found on the returns. Generally this should be the most recent version, too. If a taxpayer begins to use a different first name, for example, it seems advisable to make the change only if he used that particular name for at least two years in a row. In some of the versions of the FFID file a "1" or a "2" can be found in position 62, that is the last column of the Middle Name field.
This gives the reason why the FFID record did not match a Master file record: if "1", the person appears in the Interview file and we do not have a Master file record for him; if "2", the person appears in the Benefit file and no tax return has been filed during the time for which the Master file has been created, or the person is not in one of our name groups.

(5) The Address and the Year (position 63-123) always give the last known place where the individual can or could have been reached (he may be dead by now). The year column therefore may not necessarily stand for the year in which the individual resided at this place. The year always indicates the last year for which we have returns for that particular ID number. If the individual's address is outside Wisconsin, the state's abbreviated name should be coded in positions 114-117 of the Post Office field, that is position 70-73 on card 2 of the code sheet. Also enter a 98 in the county code field.

III. Provisional Treatment of the Death Date and the Secondary SS#

Columns 63-79 of card 1 of the FFID code sheet are so far unused. It is suggested that columns 63-64 be used for the death date and columns 71-79 for the secondary SS#, if these data exist at the time an I or N card is coded. If no SSA 805 data were found for a person, code and punch "NO 805" in columns 74-79. In the process of updating the FFID file with our present methods these two fields will be ignored. But in case we decide to append fields for these data to our FFID files, those cards can readily be used for putting this information into the file.
In any case, cards with such an entry should be kept for future use in connection with the History file and other problems.

Jan Smith, Mark Lieberman, 1967. Progress Report on the WAIS Survey. May 23, 1967. WAIS paper 667-041. Survey Data and File.
http://www.ssc.wisc.edu/wais/WAIS667041.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS667041.txt
Some of the data in this document contains confidential information. The confidential information has been removed and preserved in the secure data archive (OLDR). Contact CDHA Data Archivist at cdhadata@ssc.wisc.edu for more information.

Jan Smith, Mark Lieberman
WAIS 667-041
May 23, 1967

Progress Report on the WAIS Survey

WAIS 667-022 was a description of the processing that was to be done on the survey. Although some of this work has been done, much remains unfinished. Happily, the problem is not one of insurmountable bottlenecks or of unforeseen circumstances. The original time estimates of the old paper seem to be accurate to within 10% or so. Instead the problem has been partly lack of computer time and partly unavailability of staff time. Neither of these problems should occur this summer. In addition, the change of staff that will occur at the end of the semester should not decrease efficiency (and can well be expected to increase it).

This paper is a progress report on the processing of the survey. We will first discuss the work completed since February, 1967. Next will come a description of work currently in progress. Finally, we shall discuss some future plans on the survey and a date of completion of the processing.

Step I has been completed to the limits of our present knowledge, i.e., all C and J changes that we know about from processing that was completed while Gene Moyer was here and those that John deVries suggested were run. Survey interviews were changed appropriately. Concomitantly, we made a survey inventory that told us how much actual paper we had, as opposed to what the survey tape maintains we have.
This inventory shows we have 1291 schedules and 1130 assets booklets. This contrasts with 1300 ID's on the Geffert extract and no count of assets booklets. Eight additional schedules apparently were lost. We will continue to carry these people on the survey tape, although the difficulty in further editing their records is obvious. Presence and Absence. Lost Schedules Interview # ID-# ***** ***** ***** ***** ***** ***** ***** ***** Step II, the check for presence and absence of cards, has not yet been completed. We have had unanticipated difficulties on the programming of this step. However, we feel we are close to solving these, and expect the program to be run within 2 to 3 weeks. Testing output from this program suggests possibly 50-100 ID's with inconsistent card sets. It is possible the time estimate on this step is too low; however, it is too early to tell this for sure, and a new estimate will wait until we get final output to look at. Preliminary results from presence and absence show that the required cards (30,31,48) for the assets booklet are missing for a few ID's. Some of these may be the ID's that don't have assets booklets, some may have assets booklets and these cards may be missing. This will be reviewed and corrected in very short order. III and IV are obviously incomplete as the inputs for them are not yet ready. V is unfinished for the same reason, although use of Jim Geffert's SURWT program suggests there are three weight cards with no survey and 3 surveys with no weight cards. Action on this has been postponed until updated survey tape is available. It is in the editing procedure that we have made the most progress. Twenty-four of the edits specification sets have run and checked out completely. It should take about fifteen hours to wrap up the writing. We are, then, very nearly done with this work, which comprised the bulk of the time estimate I made for all processing. 
It is impossible to make any estimates of time limits for any other sections of the processing, since the inputs for these sections are not yet prepared. However, we do not anticipate substantial revisions of our original estimates.

We should mention that cards 12, 13, and 14 are missing from our survey tape, along with the pages from the Moyer Master Code Book describing the content of these cards. Gene Moyer said Ken Letterman was using these data for a project. The data have not been seen around WAIS since at least the summer of 1966. We have all assumed Professor Letterman is storing the data somewhere. But we have made no effort at all to contact him, so we do not know what he is doing or has done with these data. Although he supposedly has edited them, they must certainly be re-edited. They will have improper ID numbers, as they have not been present for our C changes. We suggest these data be integrated as soon as possible.

Finally, we feel a total revision of the WAIS survey codebook is in order. The Moyer codebook is often inaccurate and incomplete, making it hard to use. In addition to the information presently included, we propose the inclusion of frequency counts for some of the key variables. When the new codebook is written, the author should keep in mind that the document will carry the WAIS project to other universities who will not know what the codes "meant to say." The new codebook will have to be many times more precise than the present book. Thus, although we originally intended only to revise the old book, this is impossible. The new book promises to be quite huge, and its production will surely be much harder than we had anticipated in February of 67.

Several other comments should be made regarding various aspects of the survey processing: weighting, reformatting, and extracting. We have previously discussed some difficulties in Gene Moyer's weighting system. Van Bussman has a forthcoming WAIS paper describing his proposal to revise the weights.
Since the details of the weighting are beyond the competence of either of the authors of this paper, we can only recommend that a decision be made and new weights somehow calculated so we can put these weights on a new survey master. Furthermore, we are fairly sure that a different set of weights is necessary for the assets books, since the response rate was different from that of the survey.

We are discussing the possibility of reformatting some card codes that are presently difficult to use. This is low priority, as we can use the ones presently existing. However, with an eye toward making our data as usable as possible, we have some suggestions for standardizing our codes.

Finally, we feel proposals for a new survey extract are in order. There already exists a newer one than in February. SSRI 492 is a 534 character extract Jim Geffert made that will supersede the older, shorter one. This new extract is undocumented as yet, but the task is small and Jim would give us the data. Nevertheless, there is nothing sacred about what he extracted, and WAIS may want to make one or several new extracts. Since the extract constitutes our data in its final usable form, considerable advance thought needs to be given to it.

The only remaining statement concerns a final due date for the survey. The date will be presented with the following assumptions:

1) Jan Smith and one other project assistant will be here most of the summer.
2) These people will be giving the survey top priority, working on something else only when the survey work is stalled or done.
3) A minimum of unexpected data difficulties will occur.
4) Incomplete planning will not cause many tie-ups in the later stages of the processing.
5) Computer time is readily available, hopefully.
6) No one gets hoof and mouth disease, there are no fires, and the mylar oxide of our tapes doesn't decay.
Keeping these in mind, we feel confident of an August 1967 completion date for the survey, with the new codebook coming hopefully within a month of that time - plus or minus two months.

Roger Miller, 1965. Proposed Scope and Outline of Averaging Monograph. July 14, 1965. WAIS paper 656-004. Averaging Studies; Proposals - For Analyses, Theses, etc.

Roger F. Miller
WAIS Paper 656-004
July 14, 1965

Proposed Scope and Outline of Averaging Monograph

It seems impossible to do a complete job of analysing and estimating individual behavioral responses to changes involving tax averaging devices in this monograph. This does not mean that some effort in this direction should not be made, however. It does mean that the main focus of this monograph is on tax averaging devices per se, with the effects they may have on the distribution of income and the amount of government revenue from a given pattern of income streams through time. But we will still need to have the framework of a model to distinguish relevant variables and to suggest ways in which results of applying averaging devices to given income streams need to be qualified. In addition, we will not wish to deliberately ignore reliable quantitative results of other studies. Parts III and V of the following outline are appropriate places to invoke the discipline and control of a formal model and of other studies.

Averaging Proposals for the Taxation of Fluctuating Individual Incomes
by Martin H. David, Harold M. Groves, and Roger F. Miller
of the Wisconsin Assets and Incomes Studies Committee
Social Systems Research Institute
University of Wisconsin

G.       I. Introduction: First Principles and Basic Issues
            A. The Need for Averaging, and Its Costs.
            B. The Income Concept.
G. & M.  II. Tax Averaging Devices
            A. Catalog and Analysis of Averaging Proposals.
            B. Data Needs for Appraising the Proposals.
            C. Design of the Data Analysis.
M.       III. Models of Income and Portfolio Behavior
            A. Accounting Relations.
            B. Behavioral Relations.
D. & M.  IV.
Wisconsin Assets and Incomes Data
            A. Sources and Nature of the Data. (Moyer)
            B. Patterns of Income Variation. (Durant?)
               1. Cross Sections.
               2. Time Series and Averages.
               3. Analysis by Source.
               4. Analysis by Characteristics of Households.
               5. Analysis by Income Concept.
D. & M.  V. Application of Averaging Devices to the Data
            A. Revenue Act of 1964
            B. Groves-Simons
            C. Bolt
            D. Vickrey
            (for each of A through D: 1. Distributional Aspects; 2. Revenue Aspects)
All      VI. Conclusions.

http://www.ssc.wisc.edu/wais/WAIS656004.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656004.txt

Gene Moyer, 1965. A System of Weights for the Interview Data. January 27, 1965. WAIS paper 645-026. Survey Data and File.

Gene Moyer
WAIS 645-026
1/27/65
Draft

A System of Weights for the Interview Data

In order to make our interview data representative of its tax sample "population", it is necessary to develop weights for each of the interviews which reflect the percentages of the subcategory population from which they were drawn.

There are two apparent problems in developing a weighting scheme which are not actually relevant. The first is that some of our respondents had a double probability of being chosen for the sample because of errors in classification. Some of them were in the unmatched master file as well as in the unmatched state file and could have been chosen twice. In fact many of them were chosen twice, although an effort was made to eliminate all those before the sample was given to the Wisconsin Survey Research Laboratory. If the weights are based solely on the percentage of the population which responded to the interview, these a priori probabilities are no longer relevant.

The second of these problems is that our original sampling percentages were in error because of errors in the definition of populations. Again, however, if the weights are based solely on the proportion of the populations which responded, then these a priori sampling percentages are no longer relevant.
To base these weights on a posteriori probabilities is only possible because of our use of finite populations, but this is an advantage on which we should capitalize in every way possible.

The scheme proposed in this paper is simple and is based on the same "keys" which were given to respondents before the sample was drawn. These "keys" were nine digit numbers, each digit of which described some characteristic of the taxpayer with whom it was associated. For a definition of these keys, see "Sampling Procedures for the 2000 Name Interview Sample," WAIS paper 645-002.

There are three obvious subsets of respondents to the schedule and three similar groupings of respondents to the booklet. Weights for the schedule should be developed separately from those for the booklet, because the proportions responding are different for these two parts of the survey.

Before any weighting can be done, another matching should be done of the new complete master file and the 1962 tax roll, so that the information on our respondents will be as complete as possible. Having done this matching and having made our information as complete as possible, it is necessary for us to determine the number of people who have "keys" which are identical. Each of these groups can then be considered a population from which our respondents were drawn.

Not all of these small populations will represent the same major population. Some of them (those with the entire set of nine "key" digits) will represent the matched master file and 1962 tax roll population. A second group will represent the unmatched master file population. These will have seven digits in their "key". The third group will represent the unmatched 1962 tax roll population. These people will have only three digits in their key. The weights for each of these groups will have to be devised in the same way, even though our information is much more limited for the last two groups, because there seems little possibility of doing better.
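The grouping described above, together with the weights derived in the next paragraph (group population divided by number of respondents), can be sketched as follows. This is a minimal illustration and not the project's own program: the function names and the sample keys are hypothetical, and it assumes that the length of a key (nine, seven, or three digits) is what identifies the major population a group belongs to.

```python
from collections import Counter

# Key length -> major population, following the description above.
POPULATION_BY_KEY_LENGTH = {
    9: "matched master file and 1962 tax roll",
    7: "unmatched master file",
    3: "unmatched 1962 tax roll",
}


def population_for_key(key: str) -> str:
    """Name the major population a key belongs to, judged by its length."""
    if len(key) not in POPULATION_BY_KEY_LENGTH:
        raise ValueError(f"unexpected key length for {key!r}")
    return POPULATION_BY_KEY_LENGTH[len(key)]


def group_weights(population_keys, respondent_keys):
    """Return {key: population count / respondent count} per key group.

    Groups with no respondents receive no weight, since there is no
    interview to attach it to.
    """
    population = Counter(population_keys)
    respondents = Counter(respondent_keys)
    weights = {}
    for key, responded in respondents.items():
        if key not in population:
            raise ValueError(f"respondent key {key!r} not in the population")
        weights[key] = population[key] / responded
    return weights


# Made-up keys: three groups, one from each major population.
population = ["123456789"] * 4 + ["1234567"] * 6 + ["123"] * 3
respondents = ["123456789"] * 2 + ["1234567"] * 3 + ["123"]
print(group_weights(population, respondents))
print(population_for_key("1234567"))
```

The same function serves for the booklet weights: passing the keys of booklet respondents instead of schedule respondents gives a separate set of weights, as the text requires.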
To determine the weights for the ith group of people with identical keys, let

   x_i = the total population of the group (in one of the tax samples)
   y_i = the number of people in the group who responded to the schedule
   z_i = the number of people in the group who responded to the booklet
   w_i = some weight to be attached to the schedule values of these people
   v_i = some weight to be attached to the booklet values of these people
   theta_i = some population value (e.g. total value of housing owned by members of the group).

Since the proportion of these people who responded to the schedule is y_i/x_i, we would expect that the value of theta_i which we would find in the sample is

   (1) E(S_i) = (y_i/x_i) theta_i   (if theta_i is a schedule value)
or
   (2) E(B_i) = (z_i/x_i) theta_i   (if theta_i is a booklet value)

We desire to determine the values of w_i and v_i such that

   (3) w_i E(S_i) = theta_i
and
   (4) v_i E(B_i) = theta_i

Substituting (1) into (3) and (2) into (4) we have

   w_i (y_i/x_i) theta_i = theta_i   and   v_i (z_i/x_i) theta_i = theta_i.

Dividing each side by theta_i and multiplying each side by x_i/y_i or x_i/z_i, we get

   w_i = x_i/y_i   and   v_i = x_i/z_i,

the weights for our interview schedule.

http://www.ssc.wisc.edu/wais/WAIS645026.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645026.txt

Gene Moyer, Richard Bauman. 1965. A List of Hand Matches of Interviews and Tax Forms. February 5, 1965. WAIS paper 645-032. Survey Data and File.

http://www.ssc.wisc.edu/wais/WAIS645032.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645032.txt

Some of the data in this document contain confidential information. The confidential information has been removed and preserved in the secure data archive (OLDR). Contact CDHA Data Archivist at cdhadata@ssc.wisc.edu for more information.

Gene Moyer R.
Bauman
WAIS 645-032
February 5, 1965
Draft

A List of Hand Matches of Interviews and Tax Forms

Recently all of the interviews from respondents whose name was selected from the unmerged state file in our name groups were checked to see if the tax forms for these people were actually available. On page 2 is a list of the identification numbers of people who were actually matched, their original and new keys, and the method we used to match them.

For the eleven cover sheets at the end of the table whose identification numbers have been changed, the code for the last two digits is as follows:

   90 Divorced wife
   91 Divorced wife who has remarried
   92 Widow of a respondent who was living in 1962
   93 Separated from a male respondent, but not divorced

A new category has been added to the cover sheet code. 2000-2999 is now the code for those cases which should have been coded but were not because of an interviewer error.

These cover sheets will be repunched with the old key in columns 70-78. Col. 80 will contain the code for the type of matching as follows:

   1. Matched on name and address
   2. Matched on name and occupation
   3. Matched on name and dependent ages
   4. Combination of 2 and 3
   5. Other
   6. Matched on SS Number
   7. One person drawn twice
   8. Matched on Interviewer information
   9. NOT ASCERTAINED

"C" cards have been punched so that data cards can be associated with the new Identification Numbers.

James Geffert. 1965. Utility Print Program for the Master File PRINTMAS. May 26, 1965. WAIS paper 645-067. Master File- Tax Records Programs.

WAIS 645-067
James Geffert
26 May 1965

Utility Print Program for the Master File PRINTMAS

Purpose: This document describes the utility printing program for the WAIS Master files and the control cards necessary for its use.

General Description of Program: PRINTMAS reads all versions of the WAIS Master file and creates printed output.
Variables in the printed records are labeled with the names of the variables and their corresponding entry codes (if any). The user must specify on a control card which version of the master file is to be read, which records are to be printed, and on what computer records are to be printed. The program is written in 1410 Autocoder for the Commerce IBM 1410.

CONTROL CARD FORMAT (Both cards required)

Columns  Description
1-13     PRINTCONTROL1  Identifies this card as the first PRINTMAS control card.
14-16    310      Indicates to program that the 310 character master record is to be read as input.
         400      Indicates to program that the 400 character master record is to be read as input.
         ETC.
17-19    CRD      Indicates to program that records to be printed are identified by punched cards. (See Specification Cards)
         ALL      Indicates to program that all records are to be printed. If this option is chosen a limit must be set in Columns 20-25.
20-25    XXXXXX   Blank if CRD in 17-19. Indicates to program the maximum number of records to be printed. If the first 100 records are desired, columns 17-19 would be punched ALL and 20-25 would contain 000100.
26-32    OFFLINE  Indicates to program that printing is to be done on a machine other than the IBM 1410. The offline machine must be specified in Columns 33-36.
         ONLINE   Indicates to program that printing is to be done on the 1410. This option should be chosen for runs only, since 1410 time is expensive relative to 1401 or 1460 time.
33-36    XXXX     Blank if ONLINE is punched in 26-32.
         1401     Indicates to program that printing is to be done on the Commerce IBM 1401.
         1460     Indicates to program that printing is to be done on the UWCC's IBM 1460.
37-44    SSRI XXX Indicates to computer operator (by typewriter message) the first input tape.
45-52             Second input tape.
53-60             Third input tape.
61-68             Fourth input tape (blank if using 310 character master record)
69-76    SSRI XXX Indicates the tape user provides for offline printing option.
If the 1401 is used as offline device enter SCRATCH.

1-13     PRINTCONTROL2  Identifies this card as the second PRINTMAS control card.
14-80    Any title to be printed on each page of printed output.

Specification Cards

1. For one person, single year only:
   Column 1- 8   WAIS Identification Number
   Column 9-10   Year of tax form
   Column 11-80  Blank
2. For one person, all years:
   Column 1- 8   WAIS Identification Number
   Column 9-80   Blank
3. For one household, all persons, all years:
   Column 1- 6   First six digits of WAIS identification number
   Column 7-80   Blank

N.B. Specification cards must be sorted by Columns 1-10 to conform to master file sequence. Specification cards (if any) follow the PRINTCONTROL cards.

http://www.ssc.wisc.edu/wais/WAIS645067.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645067.txt

James Geffert. 1965. "MEANS", A Program to Calculate Means from WISTAB Output. June 30, 1965. WAIS paper 656-001. Cross Tabulations Programs.

James Geffert
WAIS 656-001
30 June 1965
Document

"MEANS", a program to calculate means from WISTAB output.

A program has been written to calculate means from the tape output of WISTAB*, a general cross tabulation program. The MEANS program requires two tapes from separate WISTAB runs, one of which contains the dividends (amounts) and the other the divisors (frequencies). The tapes must have tables of the same size and in the same order; that is, the dividend and divisor cells must correspond. At present the user can specify only a title which appears at the top of each page of printed output. Each line of printed means corresponds exactly to each line of printed output from the WISTAB runs which created the tapes used by the MEANS program.

-----------------
*McCoy and Kenyon, WISTAB Users Manual, University of Wisconsin, School of Commerce Data Processing Center, Madison, July 1964.
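The cell-by-cell division that the MEANS paper describes can be sketched as follows. This is only an illustration in Python, not the original program; the function name, the nested-list representation of a table, and the choice of `None` for a cell whose frequency is zero are all assumptions, since the paper does not say how empty cells were printed.

```python
def cell_means(amounts, frequencies):
    """Divide each amount cell by the corresponding frequency cell.

    Both arguments are tables (lists of rows) of the same shape, standing in
    for the two WISTAB output tapes. A cell with zero frequency yields None
    (an assumption made for this sketch).
    """
    if len(amounts) != len(frequencies):
        raise ValueError("tables must have the same number of rows")
    means = []
    for amount_row, freq_row in zip(amounts, frequencies):
        if len(amount_row) != len(freq_row):
            raise ValueError("rows must have the same number of cells")
        means.append(
            [a / f if f else None for a, f in zip(amount_row, freq_row)]
        )
    return means


# Hypothetical 2x2 tables: dollar amounts and the matching counts.
amounts = [[100.0, 30.0], [0.0, 8.0]]
frequencies = [[4, 3], [0, 2]]
means = cell_means(amounts, frequencies)
```

The shape check reflects the stated requirement that the two tapes hold tables of the same size and in the same order.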
http://www.ssc.wisc.edu/wais/WAIS656001.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656001.txt

Janet Whitaker. 1968. Asset Identification Code Revised. August 1, 1968. WAIS paper 689-005. Property File.

Janet Whitaker
WAIS 689-005
August 1, 1968

Asset Identification Code Revised

The Asset Identification Code for the Property File is presented in WAIS 678-031 and WAIS 678-031 Supplement. Attached is a revised code which is compiled from Property File Asset Identification Code and Survey File Asset and Asset Uses Code. In essence, the two codes serve identical purposes; thus, one standard code is appropriate. In many cases, there is only one asset "issuer" for a particular type of asset (or asset use); for such assets, the asset type code and asset identification number are presented together to give an exact code.

Asset Type Code   Type of Asset   Asset Identification Number
01, 51   Commercial bank savings accounts
02, 52   Savings and loan association shares   9999
03, 53   Mutual life insurance policies.
04, 54   Credit union shares
05       Postal savings deposits   0008
06       Notes and mortgages-obligations of individuals   0000
07       Notes and mortgages-obligations of business firms   0001
29       Notes and mortgages-obligations of unknown persons or firms   0011
08, 58   Bonds issued by states or local communities
09       Bonds issued by firms in LFF
24, 74   Bonds issued by firms not in LFF
10       Stock issued by firms in LFF
11, 61   Stock issued by firms not in LFF
27, 77   Stock issued by banks not in LFF
32       Preferred stock issued by firms in LFF
33       Preferred stock issued by firms not in LFF
12, 62   Dividends from cooperatives
13, 63   Debt instruments other than bonds
20       Farm residences   0009
14       Non-farm residences   0002
15       Other non-farm real estate (including easements)   0003
28       Options on non-farm real estate   0028
22       Other farm real estate (including easements)   0010
30       Options on farm real estate   0030
16       Sales of proprietorships   0004
17       Sales of partnerships   0005
18       Other types of income-producing assets including personal property and federal tax rebate   0006
19       U.S. bonds   0007
21       Stock options, warrants, script, etc., issued by firms in US
26, 76   Stock options, warrants, script, etc., issued by firms not in US
25, 73   Retirement funds, profit sharing funds, etc.
46       Investment clubs   0050
23, 73   Capital gains distributions from mutual funds or other investment funds   6767
67       Capital gain from unknown source
31       Capital gain on sale of commodities   1001
55       Dividends - firm not ascertained   5555
77       Interest - firm not ascertained   7777
88       Short tax form and incomplete information   8888
95       Partial tax return present, sheet with information not available.   9593
43       Reinvested proceeds-did not volunteer use   0043.
00       Did not reinvest proceeds-did not volunteer use   0000
44       Did not reinvest proceeds-used for short term consumption   0044
45       Did not reinvest proceeds-used to purchase consumer durables   0045
34       Did not reinvest proceeds-used for gift   0034
35       Did not reinvest proceeds-used for travel   0035
36       Did not reinvest proceeds-used for other consumption (make out Form 1)   0036
37       Did not reinvest proceeds-used as income   0037
41       Used for investment in education   0041
42       Used for general metes, in portfolio   0042
66       More than allowed number of responses given   6666
88       Don't know   0000
99       Not ascertained   9999

http://www.ssc.wisc.edu/wais/WAIS689005.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS689005.txt

Mike VonSchneidemesser. 1968. WAIS FILE Maintenance II. April 22, 1968. WAIS paper 678-056. Maintenance System - Files, Data, Etc.

Michael von Schneidemesser
WAIS 678-056
April 22, 1968
Revised April 29, 1968

WAIS File Maintenance II: Programs to match the files via the ID file.

The need for and the format of the file indicators on the ID file are described in WAIS paper 667-024. These indicators tell the absence or presence of records on other files for each individual. The present paper outlines how these indicators can be kept up to date.

The indicators are created and can be updated by matching the ID file against the other WAIS files. This is done by the IDxxx programs, which are outlined further down. All matching is done on ID-number.

The ID file updating program (John Ryshpan's UPDATFID, WAIS 656-019, p. 9) has been adapted to take the 128 character ID file and to blank out the indicator field. A blank indicator field therefore means that a matching operation has not been made yet.

It is not necessary to run the four matching programs together or in a specific sequence. Which programs should be run after any changes to any file have been made depends on:
   a) the kind of changes which were made to the various files
   b) the use to which the indicators should be put.
NOTE: The Form 805 file contains an indicator for the presence or absence of Benefit file records. If this indicator needs to be kept up to date, it is necessary to run IDBENE before IDAGE8, whenever changes to the Benefit file have been made.

IDMAPROP
Matches the ID file with the Master file and the Property file to create the Master-Property indicator in col. 124 of the ID file.
Indicator values:
   0 - no Master, no Property
   1 - Master, no Property
   2 - both Master and Property
[Flow diagram: ID (10x128), MA (5x400), and PROP (Property, card-image) enter IDMAPROP; ID (10x128), "125 updated", leaves it.]
Listed by the program:
   MA and/or PROP without ID.
   PROP without MA, but ID.
   MA indicates certain PROP records logically required, but not available.
   MA and PROP present, but fields do not match (amounts, cardtypes)
Programmer: none yet
The consistency check between Master and Property is optional.

IDAGE8
Matches the ID file with the Form 805 file and the Age data file to create the 805-Age indicator in column 125 of the ID file.
Indicator values:
   0 - no 805, no Age data
   1 - 805, no Age data
   2 - no 805, but Age data
   3 - both 805 and Age data
[Flow diagram: AGE DATA and BENEFIT (card-image) enter SAD-Revise, which lists "AGE + BENE match, but disagree" and yields AGE expanded (1); ID (10x128), 805 (10x383) (2), and AGE expanded enter IDAGE8; 805 incl. ID + AGE (3) (10x383) and ID (10x128), "125 updated", leave it.]
Inconsistencies:
   1. 805 not matching ID
   2. AGE not matching ID
   3. SS#'s of ID + 805 disagree
   4. 805 + AGE disagree in DOB or Race
Programmer: Bill Katke
His current version is a two-step procedure. For operational convenience this should be turned into a one-step procedure.

(1) Format of Age records extracted from Benefit file:

Col.     Contents                   Source fields on Benefit file
1        'A'
2-9      WAIS ID#                   card 2 or 3, sequence 00, col. 13-20
10-24    blank
25       '3' for Benefit file
26-30    blank
31-36    date of birth              card 2 or 3, sequence 00, col. 23-28
37-38    blank
39-41    age in years at death      computed from card 2, sequence 00, col. 27-28 and col. 60-61; blank if col. 60-61 is blank
42-51    blank
52-57    date of death              card 2, col.
56-61
58-80    blank

(2) A version of the 805 file sorted on ID-number has been created, into which the ID#'s of the ID file have been placed ("805 with good ID#'s"). To avoid sorting the original 805 data into ID# sequence after each match on SS#, it is necessary to update the 805 file's ID#'s whenever ID#'s are being changed.

(3) Col.
   1-18      always from 805 if present, otherwise from ID file
   19-123    from ID file
   124-381   from 805 if present, if not available then 126-131 from Age data
   382       from ID file col. 126

IDBENE
Matches the ID file with the Benefit file to create the Benefit indicator in column 126 of the ID file.
Indicator values:
   0 - no Benefit record
   1 - Benefit present
[Flow diagram: ID (10x128) and BENEFIT (card-image) enter IDBENE; ID (10x128), "126 updated", leaves it.]
Listed by the program:
   BENEFIT without ID
   BENEFIT with ID, but SS#'s disagree
Programmer: Dennis Alley

IDINT
Matches the ID file with the Interview (Survey) file to create the Survey indicator in column 127 of the ID file.
Indicator values:
   0 - no survey, no coversheet
   1 - survey and coversheet
   2 - coversheet only, i.e., non-respondent
[Flow diagram: ID (10x128), SURVEY (card-image), and COVERSHEET enter IDINT; ID (10x128), "127 updated", and SURVEY + COVERSHEET (card-image) leave it.]
Listed by the program:
   SURVEY without ID
   COVERSHEET without SURVEY
   SURVEY without COVERSHEET
Programmer: Ken Nelson
His current version matches the Coversheets in a separate step. The match with the coversheets is optional and redundant after the first time, when they are integrated into the Survey file as card 00.

http://www.ssc.wisc.edu/wais/WAIS678056.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS678056.txt

Roger Miller. 1966. Portfolio Evaluation from Wisconsin Individual Income Tax Returns: IV.
- XIII. April 12, 1966. WAIS paper 656-053. Property File.

Roger F. Miller
WAIS Paper 656-053
April 12, 1966

Portfolio Evaluation From Wisconsin Individual Income Tax Returns

IV. Processing of Files 13 and 14 (Stage Two)
V. Internal File Integrations (Stage Three)
VI. Integration of External Files 23-25 (Stage Four)
VII. Extraction of Asset Data from Internal Files, Separated According to External Files (Stage Five)
VIII. Supplementing External File 41 (Stage Six)
IX. Integration of External Files 21 and 22 (Stage Seven)
X. Asset and Yield Valuations (Stages Eight and Nine)
XI. Reintegration with Other Taxpayer Data (Stage Ten)
XII. Quality Checking Crosstabs (Stage Eleven)
XIII. Final Comments and Outline of Additional Work

**The rest of this document could not be translated to basic text. Please view the PDF file.**

**Terms and topics from paper, listed for searching purposes**
Reading in of records Cover Sheet Interviewer's Supplement File 13 Intrarecord Intrafile Interfile checks Electronic checks editing process errors in editing instructions and card format specifications Question 36 Q36 Q24 Q36 Q40 editing action sheets not ascertained File 12 variable record length staff assignments Checking 30-series files assignment to output files reliability ratings increment counters Output tape taxpayer's I.D. asset I.D. Class A real estate output files external files internal files crosstabulations portfolio value Treasury studies realized and unrealized gains and losses by total income and major asset type file flow diagram critical path analysis programming and data preparation efforts variable length record calculations ratings counts original data files

http://www.ssc.wisc.edu/wais/WAIS656053.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656053.txt

Roger Miller. 1966. Define Variables. November 17, 1966. WAIS paper 667-019. Tables.

Roger Miller
WAIS 667 - 019
November 1966

Define variables

N1i = # yrs. ith person filed in yrs. 1946-1953
N2i = # yrs.
ith person filed in yrs. 1954-1960
N3i = N1i + N2i
CG1i = sum_{t=1}^{8} Capital Gains reported in 46-53
CG2i = sum_{t=9}^{15} Capital Gains reported in 54-60
CG3i = CG1i + CG2i
DIRi = (1/N3i) sum_{t=1}^{15} (Dividends + Rent)it
AGI*i = (1/N3i) sum_{t=1}^{15} (AGI - CG)it - DIRi
Si = Sex (M or F)
X1i = # yrs. reported CGi ≠ 0 in 46-53
X2i = # yrs. reported CGi ≠ 0 in 54-60
X3i = X1i + X2i
Y1i = # yrs. reported DIRi = (Dividends + Rent)i ≠ 0 in 46-53
Y2i = # yrs. reported DIRi = (Dividends + Rent)i ≠ 0 in 54-60
Y3i = Y1i + Y2i
Ai = Age of ith person in 1950
Note: t = 1 - 8 => 1946-53, t = 9 - 15 => 1954-60

(12) Cell Entries (for given values of page defining and row defining variables)
   (1) sum_i CGji (j defined by column)
   (2) sum_i 1, for all persons ever reporting CGi ≠ 0 (Xji ≠ 0)
   (3) sum_i 1, for all persons ever reporting DIRi ≠ 0 (Yji ≠ 0)
   (4) Total records having CGi ≠ 0 (sum_i Xji)
   (5) 1/2
   (6) 1/3
   (7) 1/4
   (8) 2/3
   (9) % that 1 is of its col total
   (10) % that 2 is of its col total
   (11) % that 3 is of its col total
   (12) % that 4 is of its col total

(5) Row Definition - Classes of Ai
   (1) NA  (2) < 40  (3) 40 - 60  (4) over 60  (5) all
With headings and spacings each "page" will be spread over two sheets of printout.

(11) Column Definition by Nji within j:
   j=1 (t = 1 - 8):   Nji = 0-4, 5-8, all
   j=2 (t = 9 - 15):  Nji = 0-3, 4-7, all
   j=3 (t = 1 - 15):  Nji = 0-3, 4-7, 8-11, 12-15, all

(60) Page Definitions
   (1) by S (M, F, all)
   (2) AGI*: (a) < 10,000  (b) 10,000-25,000  (c) over 25,000  (d) all
   (3) DIR: (a) Y3i = 0  (b) < 100  (c) 100 - 500  (d) over 500  (e) all
       (b -> d) Y3i > 1

Bill Gates
WAIS 667-019 Programming Supplement
August 2, 1967

Programming Specifications for Tables 1A-1D, 2A-2D, 3A-3D, 4A-4D

Source: PROGRAM MILLER with the subroutines GET, RESET, REFORM, ACCUM and OUTPUT appended. No modifications were necessary. They were tested on EXT-01F (3 reels, SSRI 600, 577, 597) as of August 1, 1967.
When a new extract file is created a production run will be made, provided the "driver" does not have to be completely reconstituted because of file format changes. The variables now specified probably do not correspond to those in 667-019.

Documentation: The program was tested under the name MILLER; that name has been pulled and PROGRAM TAB16EX has been inserted for the anticipated September production run. A program listing of MILLER will be retained with the latest program listing. The test data produced will be available in Professor Miller's office. The program source deck should be close by the latest listing, and at present both are stored in the INCOME Research card file cabinets.

http://www.ssc.wisc.edu/wais/WAIS667019.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667019.txt

Robert Frost. 1979. Caveat: Possible Excess of 'Not in Probates'. July 30, 1979. WAIS paper 790-005. WAIS-Wealth: Sample Processing.

FROST
WAIS 790-005
JULY 30, 1979

Caveat: Possible Excess of 'Not in Probates'

Decedent address information contained in the WAIS archive most often lists the mailing address. While this is little problem in most cases, there is a possibility of error near the edges of counties where the postal address is in another county. For example, a decedent listed as R.R.4, Ripon would be listed as a County 20 (Fond du Lac) resident, when in fact that address may well be in County 24 (Green Lake). In cases where the township is listed, there is no problem; given townships are of necessity associated with their correct counties. Note well, however, that in cities/towns where this was a major problem (Marshfield (71 & 37), Wisconsin Dells (56,11,1) and Watertown (28,14)), decedents of such areas were checked in all relevant counties.
The problem, however, does remain with small towns or rural areas.

http://www.ssc.wisc.edu/wais/WAIS790005.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS790005.txt

Robert Frost. 1979. Valuation of Gifts in WAIS-Wealth. August 21, 1979. WAIS paper 790-006. WAIS-Wealth: Sample Processing.

FROST
WAIS 790-006
AUG 21, 1979

Valuation of Gifts in WAIS-Wealth

Dollar values for WAIS-Wealth estates are, of course, recorded in current, date-of-death values. A 1969 decedent thus has an estate expressed in 1969 dollars, etc. However, under Wisconsin probate practices, if that 1969 decedent gifted a sum in, say, 1947, the earlier dollars would be counted as being of equal value to those 22 years later, a practice which is clearly inaccurate.

To solve this problem, all gifts given one or more years before death are indexed to date of death value and are entered into the data base as such. The result is that GIFDAT14-A-D express the year of gifts, but GFTVAL14 expresses their aggregate value in date of death dollars; as such, the two variables are not strictly comparable.

The formula used is as follows:

   ( gift value / CPI for year of gift x (1.03)^(tt-to) ) x CPI for year of death

or

   ( gift value / CPI_to x (1.03)^(tt-to) ) x CPI_tt

where to is the year of the gift and tt is the year of death.

CPI information from Stat Abstracts (Dept. of Commerce, 1967-72).

http://www.ssc.wisc.edu/wais/WAIS790006.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS790006.txt

Janet Smith. 1968. Survey Progress Report. July 10, 1968. WAIS paper 689-002. Survey Data and File.

Janet Smith
WAIS 689-002
10 July 1968

Survey Progress Report

Nearly two years ago, John deVries described, in WAIS 667-007, his doubts about the validity of Survey data. The greatest source of such doubts lies in the codebook. The "Master Codebook," which apparently is the actual codebook used by coders and keypunchers, is incomplete and inaccurate. Leslie Appleton's Codebook is considerably more accurate, but still incomplete.
The WAIS Interview Codebook (to be presented this summer) will be more accurate and complete, I will include enumerations and explanations of seemingly unsolvable codebook-data discrepancies. Codebooks are usually devised to categorize data, to translate individual responses into meaningful categories. Such is not the case with the various survey codebooks. The Appleton Codebook (I am only distinguishing this version, not blaming Miss Appleton for the unfortunate errors and omissions in that codebook) has served as a "data guide," as an interpretation of what the data is supposed to resemble. In writing the WAIS Interview Codebook, I have attempted to determine what the codebook says the data is, compare this with the existing data and analyze and solve the discrepancies. In some cases, the data is actually wrong; it is corrected. In other cases, the codebook is incomplete; it is updated. However, in still other cases, the codebook and data simply do not resemble each other at all; such cases are difficult to solve, and if they can't be solved, they are noted, enumerated, and discussed in the text of the WAIS Interview Codebook itself. Thus, I hope that the WAIS Interview Codebook will be a more complete and accurate, and less ambiguous description of the survey data. Next, I'd like to discuss the "deVries allegations" first in general terms, and then in more specific terms. Generally, most of the "deVries allegations" deal with a code book interpretation of the survey data. As I have already discussed, the most recent codebook (Appleton) is incomplete and ambiguous. Many "errors" have been corrected by simply listing all legitimate codes for a particular question. This produces a much lengthier, but certainly more complete, codebook, which is more accurate, and leaves less room for conjecture about the meanings of codes and questions. 
Another general cause of data errors, illegitimate codes, etc., is the presence of many terminated interviews (I am not sure of the exact number of interviews which were started and subsequently terminated by or for respondents). The presence of these interviews in the data file itself must be questioned (and will be at a later date).

Specifically, there are many data errors in the Survey File. I'll follow John's format in presenting my comments to his arguments.

3.2. The effect of "blanks"

1. Interviewers, editors, coders and keypunchers all made mistakes which produced blanks. Many, many of these errors have been found and corrected; it has been considerably easier to find them where a pattern resulted from the repeated mistake(s) of one interviewer or coder, etc.

2. Allowable blanks exist, as in contingency questions. For example, the interviewer asked whether or not the respondent was employed; if he stated that he was employed, unemployment questions do not apply to him; the interviewer was to skip those questions on that condition; the unemployment questions and codes remain blank. Standard data processing techniques now call for using an "inapplicable" code in such cases. There are two reasons for not following the norm:

The first reason involves the expense and time used in actually filling the "inapplicable" columns. The alternative methods of "filling" are: recoding and keypunching, remote typing, and special-purpose batch programs. The last of these is the most practical, fastest and cheapest.

The second reason concerns the codebook. This problem is twofold. First, there is the case of a contingency within a contingency. Normally, the presence of an "inapplicable" code would indicate that a set of questions weren't appropriate; but, if there is one such set within another, an "inapplicable" code in the inner set would be ambiguous. There are many multiple contingencies, especially in demographic sections.
Second, in many codes, all ten numbers are used; since it is unwise to mix alphabetics and numerics within the same code, an "eleventh" decimal is needed. Even if that were available, though, it would be difficult to distinguish why a particular question is "inapplicable" if it lies within a multiple contingency.

Thus, blanks remain as legitimate codes, meaning "question(s) inapplicable." Although "blank" does not appear in the codes for all contingent questions in the WAIS Interview Codebook, the instructions following the independent question are "Go to Qx." Perhaps this isn't the best solution, but it's the cheapest, easiest and least time-consuming approach. It will present few analysis problems as long as these conditions are remembered and allowed for.

3. Most of these coding errors have been located; they are considerably easier to find and correct in contingency questions.

4, 5, 6. If the indicating codes are correct, most of these errors can be repaired. The planned coding of Card 49 will summarize many indicators, and code, for the first time, many indicating questions formerly not punched (e.g., IS R NOW WORKING?)

3.3 Summary of findings

1. Blanks (clearly illegitimate)
Generally, there are three kinds of errors here:
   a. Codebook discrepancies (See above)
   b. Incomplete interviews (See above, also)
   c. True errors
An example of a) concerns the number of employees responsible to R. This question (Q59) is dependent on Q58 (which, incidentally, is dependent on Q56, wherein a source of error may lie). Q58: DO YOU WORK FOR SOMEONE ELSE OR ARE YOU SELF-EMPLOYED? There are three answers: "someone else," "owns controlling interest in corporation," and "self-employed." If the R was self-employed, he was then asked how many employees were responsible to him. The Interviewer Instructions allow for people who don't fit any of these categories; such people manage life insurance agencies, etc.; these people followed the Q59 pattern.
However, a code indicating this was not allowed for and reported in the codebook. This is another source of error (which incidentally has been cleared up in the future codebook). Incomplete interviews produce such errors as blanks in the education code.

2. Blanks (possibly legitimate)
   -- number of children. The (old) codebook states that the codes are 00, 01, 02, ... 99. The data shows the codes as 0, 1, 2, ... 99. This may be the source of error (two-digit code versus right-justified code). The codebook has been corrected.

3. Impossible codes
   -- 26 = business debt.
   -- wife's educational level. This code is similar to, but not the same as, the educational code. See WAIS 678-003 for a discussion of this error.
   -- The codebook code for "number of friends and relatives helped" and "numbers of persons who lived with R" is 1 = Yes, 3 = None. The data code is 0 = None, 1-9 = Number of people. The codebook has been corrected.

4. Unlikely codes (Most of the deVries examples are yet to be reviewed.)
   -- 00 = R never held a job (or a previous job)

5. Combinations of information
   a) Educational level vs. highest degree received: I will run these tables again to see if errors still exist following EDIT.
   b) In many cases, when R took short courses or extension courses which were vocational in nature, the school was coded in the undergraduate state and college columns. This may be the source of "error" for those 30 R's. In any case, these columns refer to "state where first degree received," and "state of undergraduate college attended."
   c) In some cases, the undergraduate college and state were simply repeated in the graduate college and state columns; these errors have been corrected.

Conclusions

The past two years' work on the Survey has produced definite clarifications to many "deVries allegations."
Future plans include concluding EDIT-ing the Survey File, concluding the WAIS Interview Codebook, and runs of WISTAB and X-TAB on a revised extract for editing purposes and, eventually, for analysis purposes. By September 1, hopefully, the Survey will be ready for final analysis. Any notes, comments, or suggestions concerning this paper, or the Survey itself, are very welcome.

http://www.ssc.wisc.edu/wais/WAIS689002.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS689002.txt

Larry Schroeder. 1968. Naming Files on the B5500. August 8, 1968. WAIS paper 689-006. Data Processing General Papers (Regarding WAIS).

Larry Schroeder
WAIS 689-006
August 8, 1968

Naming Files on the B5500

Because of our very prolific B5500 programmers and the ease with which one can write a program on it and change it just a little to yield a different program, we must implement some WAIS-wide program-naming conventions so as to keep track of these files more easily. This paper is not meant to force upon the entire project any certain convention. Instead it is intended to 1) give an example of how property file programs and data files will be named and 2) more importantly, become an object for discussion so that sometime in the near future the programming staff can implement some agreed-upon conventions.

Each of the three major types of files will be discussed -- source program files, object program files and data files. Test files, both data and programs, will also be mentioned, and the paper concludes with a short example from our own property file. Throughout the paper the main objective is to design a two-part naming system using not more than 14 alpha characters which will allow users of the B5500 to most easily locate all of a specific type of file and tell at a glance whether it is a source program, object program, or data file.

Source programs are of course those programs written in one of the various programming languages, for example, FORTRAN, ALGOL, or COBOL.
Since these are written by specific programmers it would seem reasonable to use the programmer's name -- either last name, first name or unique (but recognizable) combination of letters from it -- as the first part of the 2-part source program name. The second part can be uniquely made up by the programmer himself. For example,

SCHRADER/ATTEMPT
GATES/RETREEV
WILDE/MAN

If each programmer always used this code all of his source programs could be easily found by simply doing a TFD under his code name.

Object decks are programs which have been compiled into machine language from a programmer's source file. It is the object program which is executed on the various data files to provide the desired (at least usually desired) output. Because of its integral relation with 1) a specific WAIS file (e.g., Property file or master file) and with 2) a specific source program, it is suggested that the first part name of an object deck correspond with a specific WAIS file (or, in the case of a general program, be called GENERAL/) and that the second part name correspond with the second part name of the source file from which the object deck was compiled. Examples include:

PRPERTY/ATTEMPT
GENERAL/RETREEV
MASTER/MAN
SURVEY/EXAMPLE

Data files are, of course, the basis of the WAIS project. For this reason perhaps the first part name can include WAIS and the second part name give some indication of its characteristics. If we reach the point where both property file and master file data are on disk we will have to differentiate them. The second part name could be creation date or format specification. Some examples:

WAISP/AUG768
WAISM/JUL468
WAISPF/FORM1B
WAISMF/CARD2C

A separate question concerns test programs and data files. Here the programmer need not be as particular about using a specific naming convention similar to all other WAIS programmers. However, it does seem useful to the programmer himself to use some type of unique identifier.
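The conventions suggested so far for source, object, and data files can be sketched as a small checker. This is only an illustration in Python, not anything that ran on the B5500: the function name, the set of object-program prefixes, and the reading of "14 alpha characters" as a limit on letters and digits in both parts combined are all assumptions drawn from the examples above.

```python
import re

# Assumption: first-part names used in the object-deck examples above.
OBJECT_PREFIXES = {"PRPERTY", "MASTER", "SURVEY", "GENERAL"}


def classify_file_name(name: str) -> str:
    """Classify a two-part name such as 'WAISP/AUG768' (hypothetical helper)."""
    parts = name.split("/")
    if len(parts) != 2 or not all(re.fullmatch(r"[A-Z0-9]+", p) for p in parts):
        raise ValueError(f"not a two-part name: {name!r}")
    first, second = parts
    # Assumption: the 14-character limit applies to both parts together.
    if len(first) + len(second) > 14:
        raise ValueError(f"more than 14 characters: {name!r}")
    if first.startswith("WAIS"):
        return "data"
    if first in OBJECT_PREFIXES:
        return "object program"
    # Any other first part is taken to be a programmer's code name.
    return "source program"
```

For instance, `classify_file_name("GATES/RETREEV")` is read as a source program and `classify_file_name("GENERAL/RETREEV")` as the object program compiled from it, because the two names share their second part.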
For example I use LARRY/ for test programs and TEST/ for test data.

An example of these above-suggested conventions taken from the property file is as follows: I have written an ALGOL program which takes an 80 column record which had been segmented into 3 segments and puts these segments "back together" to form an 80 column record. I named this program in source form SCHRDER/PUTEK1A. When compiled it was called PROPRTY/PUTEK1A and test run on TEST/CD1A, which was a data file I created. The program was finally run for production on the data file WAIS/CD1A.

http://www.ssc.wisc.edu/wais/WAIS689006.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS689006.txt

tes the years 1963-64. The second line, though different from the old format 805, is not changed, because its layout was not specified by the SSA.

To operate the program CH805, place it in the read hopper and watch the console for a message to install the new delivery of 805 data (contained on SSRI 366) and a scratch tape, which will contain the output when the program terminates. This output on scratch can then be used as an input for program JR 5, after it has been adapted to a one-reel input of 805 data. The output from this run then should be sorted into the already existing WAIS-805 data file (JR tape "Reformatted Form 805,E"). The sort program for this is

S0805: JOB Sort Old and New 805 Files
1410 sort program

NOTE: Program JR 5 has been modified so that a "2" has been placed in position 47 (last position of the sex field) of line one.
This way we can distinguish records of the second delivery from those of the first delivery by checking for a "2" in column 138 of the reformatted 805 tape.

1965 An Additional Capability for Updating the FFID File October 21, 1965 WAIS paper 656-024 Fixed Format Identification File (FFID) Maintenance System - Files, Data, Etc.

http://www.ssc.wisc.edu/wais/WAIS656024.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656024.txt

Gene Moyer Mike VonSchneidemesser

Gene Moyer VonSchneidemesser WAIS 656-024 October 21, 1965 An Additional Capability for Updating the FFID File

In coding the benefit file, it became obvious that WAIS needs to be capable of keeping its ID file up-to-date without repunching the entire FFID. If we find that a person has the same address in 1965 that he had in 1960, we should be able to replace the latest date we knew the person resided at the given address with 65. The program

FFYR: HEADR: Redate Last Year
TITLE: FFYR, 1410 program

is designed to take care of this need. It will be appended to the already existing "Fixed Format ID Maintenance System" and added to that program collection (Jon Ryshpan, WAIS 656-019).

The inputs to this program are
(a) the already existing tape with FFID data sorted on WAIS ID number (JR tape FID,B)
(b) a card deck with the following layout:

Cols. 1-8    WAIS ID#
      9-17   SS# (no dashes); this entry is not required
      18-19  Last two digits of the latest year the person resided at the address
      20-74  blanks
      75-80  "REDATE"; this identifying code is not required

The program then will overlay cols.
122-123 of the FFID records with the year given in columns 18-19 of the card input.

Gene Moyer James Geffert 1965 Listings and Procedure for Getting Age and Death Data November 9, 1965 WAIS paper 656-030 Age Data

Gene Moyer James Geffert WAIS 656-030 November 9, 1965 Listings and Procedure for Getting Age and Death Data

In order to facilitate gathering age and death data for those persons who have no social security number or who could not be found by the Social Security Administration, we need a list of these people and a form on which to record the data we gather. To keep the amount of paper handling to a minimum, James Geffert is working on a listing which will have the recording form directly after the name and address of the person for whom data is to be gathered. Each entry on the list will have the following approximate format:

Year Sequence FFID Number
1a. County of death        Death number
4. Date of death MO DAY YR
6. Check if non-white
7. Marital Status M NM W D
8. Date of birth MO DAY YR
9. Age in years
10a. Usual Occupation
14. Mother's maiden name
15. Armed services record
16. Social Security #

This listing will be sorted alphabetically within "last year filed" (79-80 of FFID) so that there will be a maximum of 15 alphabetically sorted lists.

Procedure:

(1) Death record indices are arranged alphabetically for the entire state by year of death. Each entry has the following information: name, date of death, place of death. The records themselves are arranged differently depending on whether death occurred before or after January 1, 1960. Before 1960 (1947-1959), these are arranged by year of death and county. After 1960 (1960-1965), these are arranged by year and death number. Therefore take a listing for year t and the index for year t + 1. Run through our list and the index, recording the date and place of death for any name which agrees on both lists.
If there are two or more names on the death index which are the same, record all the dates and counties of death so that the decision about which one is proper can be made when you look at the death certificate itself. Check the indices of t+2 and t+3 for duplicate names as well. For these duplicate names, use the form shown on page 4.

(2) Having compared our list with the index for year t+1, also check the indices for years t+2 and t+3. Then go to the records.

(3) If the date of death is 1959 or earlier, find the record by checking all deaths in the county of death for that year. If death occurred after 1959, find the record by checking the death date.

(4) If there is only one person with this identical name, check section 2 of the death certificate, "Usual Residence," and the FFID address:
A. If the addresses are the same
(1) Record the township (town) name in the address area if it is a rural address.
(2) Fill in the form on the listing.
B. If the addresses are different
(1) Cross through the address on the FFID and write in the address on the death certificate above it. Do not forget to change the county number.
(2) Fill in the form on the listing.

(5) If there are two or more persons with this identical name, check section 2 of the death certificate.
A. If the usual residence and the FFID are the same for one of these persons, fill out the listing form for that person only.
B. If no address is the same, fill out the "Form for Persons with Identical Names" for each person with this name who has a death certificate in the file.

(6) Bring each list back to the campus for punching when it is completed.

APPENDIX A

Form for Persons with Identical Names

Year-sequence number
2c: city or town        county number
2d: street address
1a. County of death        Death Number
4. Date of death MO DAY YR
6. Check if non-white
7. Marital status M NM W D
8. Date of birth MO DAY YR
9. Age in years
10a. Usual Occupation
14. Mother's maiden name
15. Armed services record
16.
Social security #

APPENDIX B

The following are sample death certificates used by the state from 1947 to the present.

http://www.ssc.wisc.edu/wais/WAIS656030.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656030.txt

Wynn Bussmann 1968 Outline of Operations That Must Be Performed on the Property File and Stock Price and Dividends File (Formerly Firm Data Files) in Order to Reconstruct Portfolios for a Sample of Taxpayers March 6, 1968 WAIS paper 678-049 Property File

Wynn V. Bussmann WAIS 678-049 March 6, 1968 Outline of operations that must be performed on the Property File and Stock Price and Dividends File (formerly Firm Data Files) in order to reconstruct portfolios for a sample of taxpayers.

I. Introduction

The author's dissertation involves reconstructing stock portfolios for a sample of 130 taxpayers (the 130 Sample) of the WAIS Survey sample of 1300 Wisconsin taxpayers. This paper merely outlines the steps that must be undertaken in order to process the data into reconstructed portfolios. Although this paper refers specifically to the author's dissertation, the fundamental ideas of portfolio reconstruction hold also for the entire WAIS sample. Thus, the basic steps necessary to process the entire set of WAIS data to reconstruct portfolios are the same in either case; in that light, then, this paper may serve as a guide for future work on portfolio reconstruction for the entire WAIS sample.

II. Operations to Property File data

A. Extract the interest and dividends card and the capital gains card for the 130 Sample onto tape (tape 1). This step requires some modification of Mark Lieberman's extract program.

B. Reformat tape 1 using Mark Lieberman's reformat program. The new format will be the same as the formats of cards 1 and 2 on pp. 19-24 of WAIS 678-031.

C. Design the print-outs of reformatted tape 1 (tape 1 R) to be used by the coder and console operator. Larry Schroeder is currently working on these designs for the entire Property File.

D.
Obtain separate print-outs of the dividends and interest card and the capital gains card. This step requires a program that is being written by Mark Wilde.

E. Code data from source documents onto print-outs. This requires a codebook to be written by the author.
1. Perform visual edits of the existing data (e.g., dollar amounts, asset types, and identification numbers).
2. Record additional data (e.g., units for dividends, number of shares for dividends and capital gains, sequence numbers, etc.).
3. Eliminate invalid duplicates, resolve valid duplicates, and purge the file of interest data (the author is concerned only with stock portfolios and will use the total interest income only as a control variable).
4. Add new data which was not coded or not recorded in the formation of the original file. This step involves use of the Survey data (1) to check on the Property File data, and (2) to provide additional data on ownership of stocks which are not available from the tax returns.

F. Enter the code sheet data produced by Step E into the file by means of the console to the B5500 to get tape 2 R. Steps E1 and E2 will be entered in one pass at the console, and Steps E3 and E4 will be entered at a second pass at the console.

G. Code 1960-1964 tax return and Survey data for the 130 Sample on coding sheets appropriate for use at the console.

H. Merge the 1960-1964 data with the 1946-1959 data on tape 2 R for the 130 Sample to get tape 3 (1946-1964 Property File data for the 130 Sample). If there are not standard merge programs available for this purpose, one will have to be written, unless the merge can take place as the 1960-1964 file is being formed at the console.

Note: An alternative way of performing Steps G and H is to follow the traditional route of coding, keypunching onto cards, and then putting the cards onto tape.
Since the anticipated volume for the 130 Sample is small, it is felt that performing Steps G and H via the console would provide a useful pilot study to determine the alternative costs of file formation via the console. From such information, the decision may be made concerning which method to use for the entire Property File.

I. Edit and correct tape 3 using Bill Gates' edit program. If the necessary revisions are made in time, the edit program may operate during the file correction and formation at the console, thus providing "instant" detection of errors which may be corrected right at the console.

What results from Section II is a sample Property File ready to be used for stock portfolio reconstruction. Section III outlines the steps necessary to obtain a complete Stock Price and Dividend File for the stocks owned by the taxpayers in the 130 Sample (SPDF-130).

III. Operations to stock price and dividends data

A. Extract onto tape (or cards) from tape 3 the identification numbers of firms that are supposed to be on the Lorie-Fisher File (LFF) to get FID-1. Also extract the identification numbers of firms that are not on LFF to get FID-2. This step requires a program to be written.

B. Run FID-1 on the LFF tapes to extract onto another tape, LFF-1, the relevant information for the firms listed in FID-1. The program, which has yet to be written, should also kick out the identification numbers of firms that are on FID-1 but are not on LFF, so that they may be added to FID-2 to get FID-3.

Note: We are expecting to replace the six defective LFF tapes at Commerce with new LFF tapes which will run from 1926 to 1966. Code books and formats of the new LFF tapes are to be sent along with the tapes from Standard Statistics, Inc., N. Y., which currently has title to all of the Lorie-Fisher, Center for Research on Security Prices data.
Once we have received the code books and the tapes, we can decide what data we wish to extract from them and in what format we wish to extract them.

C. Run tape FID-3 on the Firm Identification Files, FID (which are on cards and which link a firm's name with its identification number), to match the numbers on FID-3 with the names on FID and print out a list of matched names and numbers, FNID. This step may be done by hand if the volume is small; otherwise a program needs to be written to perform the match.

D. Obtain the data for the firms' stocks on FNID from the Schaffner Library (Chicago) and from newspapers. These data will be coded onto code sheets from which a file, SPD-1, will be made (either by console or otherwise).

E. Merge SPD-1 with LFF-1 to get SPD-2.

F. Edit and correct SPD-2.

At this stage, we have two files ready to be merged to produce reconstructed portfolios. A reconstructed portfolio is essentially a time path of the month-end values of the whole portfolio and of individual stocks in the portfolio. Various embellishments are also envisaged. For instance, we shall want to know how much total appreciation there is and what proportion is long-term as opposed to short-term. We shall want to know also other characteristics of the taxpayer such as his age, total income, breakdown of income by source, etc. Because the exact content of the final portfolio file has not yet been decided upon, the outline of the steps necessary for its construction shall be included at a later date as part IV of this paper.
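The notion of a reconstructed portfolio as a time path of month-end values can be illustrated with a minimal sketch. It assumes, purely for illustration, that the two edited files have already been reduced to month-end share counts and month-end prices per stock; the function name, the dictionary layout, and the "TOTAL" key are hypothetical and are not taken from the paper, whose final file content is still undecided.

```python
from typing import Dict, List


def month_end_values(shares: Dict[str, List[float]],
                     prices: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Return month-end values per stock plus a 'TOTAL' series for the portfolio."""
    values: Dict[str, List[float]] = {}
    n_months = None
    for stock, held in shares.items():
        px = prices[stock]
        if len(px) != len(held):
            raise ValueError(f"{stock}: shares and prices cover different months")
        if n_months is None:
            n_months = len(held)
        elif n_months != len(held):
            raise ValueError(f"{stock}: series length differs from other stocks")
        # Value of one holding at each month end: shares held times price.
        values[stock] = [n * p for n, p in zip(held, px)]
    months = n_months or 0
    # Whole-portfolio value is the sum over stocks at each month end.
    total = [sum(values[s][m] for s in shares) for m in range(months)]
    values["TOTAL"] = total
    return values
```

Appreciation and the other embellishments mentioned above would be derived from these series together with purchase dates and prices, which this sketch does not model.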
For now, parts I-III will serve as a guide for work in the immediate future.

http://www.ssc.wisc.edu/wais/WAIS678049.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS678049.txt

Last Name
38- 50    13     5   First Name
51- 62    12     6   Middle Name
63- 96    33     7   Address
97-117    21     8   Post Office
118-119    2     9   Zone
120-121    2    10   County Code
122-123    2    11   "bb"
124-124    1    12   " "

http://www.ssc.wisc.edu/wais/WAIS645058.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645058.txt

Srilekha Dey 1966 Search for the Age Data March 21, 1966 WAIS paper 656-048 Age Data

Srilekha (Cuccu) Dey WAIS Paper 656-048 March 21, 1966 SEARCH FOR THE AGE DATA

Several hundred people had stopped filing their income tax returns at various times between 1947 and 1964. We had the information regarding age for a vast number of these people, but there remained a paltry 2,500 whose ages we did not have, and we had to think of devious means of obtaining them. After thinking over various methods which seemed quite futile, we decided upon five possible methods which looked most hopeful.

(1) Investigating Wisconsin death records kept at the State Office Building.
(2) Matching ID's in the Benefit data file, which had ages for those ID's that were entered there.
(3) The parents in the Benefit data file whose children appeared in our list of ageless often had their children's ages recorded in the parents' file.
(4) Hunting through our files to see if by any chance a reference had been made as to the age of the person.
(5) The Motor Vehicle Department would be able to get some ages through their records if given the names (inclusive of middle name or initial) and the address. However, they can only look for those people who have renewed their licenses within the last four years.
By means of the first method, which required checking against indices and then counter-checking against actual death certificates, I have found 466 positive ages and also have 207 tentative ages. These are people whose names are common and several appear in the indices. The 2nd, 3rd and 4th methods seemed to overlap slightly but have yielded 269 definite ages. Counting the doubtful 207 as certain, we still have 1,518 people whose ages remain unknown to us and we have to give their names and addresses to the Motor Vehicle Department. However, I am not sure as to how many of the 207 will yield positive results, so the number for the department may be slightly larger.

                                                          Percentage
Number checked                                    2500
Number of positive death record matches            466      18.64
Number of positive additional death record         207       8.28
Number of positive ages from Benefit file and
  number of cases found in our record              269      10.76
Number of cases to be given to license bureau     1518      62.56

http://www.ssc.wisc.edu/wais/WAIS656048.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656048.txt

Marcia Hinckley Gene Moyer 1966 Indexing WAIS Tables and Listings September 29, 1966 WAIS paper 667-010 Administration Cross Tabulations Maintenance System - Files, Data, Etc.

Marcia Hinckley, Gene Moyer WAIS 667-010 September 29, 1966 Indexing WAIS Tables and Listings

Since WAIS has already produced numerous tables and listings and seems destined to produce countless more, some method of indexing these must be implemented, so that each table or listing can be located quickly and easily. The indexing system must be simple enough to be readily understood by someone relatively unfamiliar with the project, yet it must also be complete enough to help this stranger to WAIS find the desired tables, variables, etc. without undue mental anguish and proverbial haystack-searching. The indexing method outlined by Gene Moyer in WAIS paper 656-002 is too complicated to be practical for this purpose.
Several indices should be devised, instead, using the very program control cards and run cards used to produce the tables, in the case of new tables; a simple index of existing tables and an index of all WAIS listings should be compiled by hand.

As the tables are run they should be collected in some logical order and bound into books, each of which would receive a title descriptive of the tables contained therein. A general index should list the titles of all the tables in the book in alphabetical order. There should also be two more detailed indices in the front of each book: 1) a Variable Index in alphabetical order by variable name; 2) a Position in Record Index in numerical order by the positions of the variables in the records. This variety of data arrangements should facilitate tracking down tables, variables and the data in any position in a record.

As the tables (both WISTAB and CROSSTAB) are run, they will be gathered into books, and their control cards and run cards will be collected into an index file. When a book of tables seems complete, these index cards will be used to produce the three indices to be placed in the front of the book.

An XTAB variable card is used just once in any run, so the card will simply be punched for the book name and page number and then transferred from the original input deck to the index file. Each card will contain:

1. The variable name                                  columns 1-8
2. Position in the record                             columns 48-52 and columns 54-57
3. The name of the book                               columns 59-74
4. The page number in the book (left-justified)       columns 76-80

One WISTAB control card is normally used in each run, and, as for the CROSSTAB Index, the card can usually be punched for book name and page number, and transferred from the input deck to the index file. However, since an X-card or a Z-card may be used in more than one table, any such card would have to be duplicated to produce one card for each table in which the original card was used.
Also, if the columns are punched beyond 58 (i.e., 59, 60, etc.), the card will have to be duplicated up to column 58 and then punched for book name and page number. Each card will contain:

1. The variable name                                  columns 5-14
2. Position in the record                             columns 15-17 and columns 18-20
3. The name of the book (left-justified)              columns 59-74
4. The page number in the book                        columns 76-80

TABLE TITLE INDEX

The run cards from the XTAB programs and the CROSSTAB cards from the WISTAB programs for all the tables in a given book, once the book seems complete, should be put in alphabetical order by hand according to table title. This will result in an input deck for producing a listing by table title, giving the run number as well, e.g.:

Table Title                                                               Run #
Qualifying Status by AGI  No Interpolated years allotted in B test          1
Qualifying Status FNT1  No interpolated years allowed in B test             1

VARIABLE INDEX

The control cards (for WISTAB tables) will be separated from the variable identification cards (for CROSSTAB tables) by sorting the cards on the first two columns. The two decks produced will then be sorted on the variable name: WISTAB cards, columns 5-14; CROSSTAB cards, columns 1-8. The resulting decks must then be merged by hand, resulting in an input deck for producing a printed listing of variable names, with other related information, in alphabetical order, e.g.:

Variable name    Position in Record    Book Name        Page number
AG1              132                   Tax Averaging    35
POTAVINC         104                   Tax averaging    11

POSITION IN RECORD INDEX

After the Variable Index has been printed, the deck will again be sorted into WISTAB and CROSSTAB decks. These decks will then be sorted according to position in the record, columns 15-17 in WISTAB; columns 48-52 in CROSSTAB.
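The card layouts and the two sort keys described above can be sketched in Python. This is a modern illustration under stated assumptions, not the card-sorter procedure itself: the function names are hypothetical, the card type is passed in explicitly because the paper does not say what the first two columns contain, and only the first position field of each card (the one named as the sort key) is read.

```python
def parse_index_card(card: str, kind: str) -> dict:
    """Read one 80-column index card; kind is 'XTAB' or 'WISTAB' (assumed labels)."""
    card = card.ljust(80)

    def col(first: int, last: int) -> str:
        # Card columns are 1-based and inclusive.
        return card[first - 1:last].strip()

    if kind == "XTAB":
        name, position = col(1, 8), col(48, 52)
    elif kind == "WISTAB":
        name, position = col(5, 14), col(15, 17)
    else:
        raise ValueError(f"unknown card type: {kind!r}")
    return {
        "name": name,
        "position": int(position),
        "book": col(59, 74),
        "page": int(col(76, 80)),
    }


def variable_index(cards: list) -> list:
    """Alphabetical order by variable name, as for the Variable Index."""
    return sorted(cards, key=lambda c: c["name"])


def position_index(cards: list) -> list:
    """Numerical order by position in record, as for the Position in Record Index."""
    return sorted(cards, key=lambda c: c["position"])
```

Parsing both card types into one common record is what replaces the hand merge of the two decks: once every card carries the same four fields, a single sort yields either index.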
The two decks will then be merged by hand, and the resulting input deck will produce a printed listing, ordered numerically by position in record, e.g.:

Variable name    Position in Record    Book Name    Page number
RACE             9                                  30
DRYER            127                                12

The Position in Record Index will not give sufficient information to identify the record itself, but this can be ascertained by referring to the page given, finding the table title on that page and referring to the Table Title Index to discover the run number.

http://www.ssc.wisc.edu/wais/WAIS667010.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS667010.txt

Kolloft 1979 Case Dispositions Code for WAIS-Wealth June 26, 1979 WAIS paper 789-010 WAIS-Wealth: Sample Processing

Koffolt WAIS 789-010 June 26, 1979 Case Dispositions Code for WAIS-Wealth

Code
01 Death confirmed by SSA, not found in vital ('0' Series) -- Temporary code: Further info to come from SSA
02 Death confirmed by SSA & vital, will but no proceedings found in probate (confirmed not in probate--NIP)
03 Death confirmed by SSA & vital, estate probated but probate file either still open or lost (confirmed NIP)
04 Death confirmed by SSA & vital, case not found in probate, but correct match confirmed by other sources in probate (confirmed NIP)
05 Death confirmed by SSA & vital, case not found in probate (unconfirmed NIP)
06 Non-soc sec decedent with 3 features hand-matched between WAIS archive & vital, not found in probate
07 Non-soc sec decedent with 2 or less features hand-matched between WAIS archive & vital, not found in probate
08 Non-soc sec decedent (hand-matched) with case found in probate but no asset info ('99' series confirmed NIP)
09 Non-soc sec decedent in which probate decedent did not match WAIS and/or vital decedent ('no match')
10 Death confirmed by SSA & mn. vital, not found in probate
11 Death confirmed by SSA & (loosely) by fl. vital, not in probate
12 Death confirmed by SSA & CA. vital, not in probate
13 Death confirmed by SSA & IL.
vital, not in probate
14 Death confirmed by SSA, resided in another state which WAIS-Wealth has not searched

http://www.ssc.wisc.edu/wais/WAIS789010.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS789010.txt

Gene Moyer 1965 The Treatment of Wives in WAIS' Tax Sample November 24, 1965 WAIS paper 656-033 Master File - Tax Records WAIS Sample

http://www.ssc.wisc.edu/wais/WAIS656033.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656033.txt

Gene Moyer WAIS Paper 656-033 November 24, 1965 THE TREATMENT OF WIVES IN WAIS' TAX SAMPLE

Women's names as well as men's were among the 50 names chosen at random. Indeed, of the 50 names chosen, almost 1/3 (30%) were women. When WAIS went to the archives, however, it found a completely different situation. Returns were filed in family folders under the name of the male head of the family. In families which were unbroken from 1946-1960, therefore, we got a group of wives whose initials did not necessarily place them in our name groups.

For example, one name group was G. B____. All persons with the last name B______ (deleted to protect confidentiality) and with a first initial "G" were to be included in the sample; all other persons with the last name B____ were to be excluded. This was criterion 3(a), but the same provision applies to groups chosen by 3(b). Consider, however, George B_____ who is married to Mary B____. Mary is included in our sample even though she should strictly have been excluded. Marvin B_____ was excluded from the sample. His wife, Gertrude, however, is also excluded from the sample simply because she married Marvin B_____ rather than a George or a Gerald B______. While Mary and Gertrude may not be perfect substitutes for each other, WAIS assumed that all the Marys and all the Gertrudes were pretty much the same and so this substitution of one group of wives for another probably made little difference in the sample or even in the randomness of the sample since the choice of G. B_______ rather than M. B____ or L.
B_____ was random.

A more serious problem, still involving wives, arose from the placing of families in the male head's folder. All families were not stable units during this period. Some families were not formed until some years after 1946, even though both husband and wife may have filed from 1946 until the marriage as single persons or as spouses of other persons. The men in these groups pose no problem. If they were in the sample in 1946 (or 1947), they remained in it until 1959 or 1960 unless they died, moved out of the state, or failed to file because of low incomes.

Women, however, present many problems. As a general rule, when a woman marries for the first time, the tax department places her returns for the years she was single into the folder of her new husband. If her husband subsequently dies, her returns remain in his folder as long as she remains a single widow. If she remarries, her returns for the single years are put into her new husband's folder along with returns for the subsequent married years. If she divorces this first husband, her returns for the years after the divorce are kept in a separate folder until she remarries, at which time they are placed in the folder of her new (second) husband. While there were errors made in this procedure because searches for early returns are time-consuming and matching is difficult when names change, the returns were in general arranged in this way.

Because of the thirteen (or fifteen) year time period of the sample, and because the sample included only 0.775 percent of the taxpayers on the rolls for 1958, this procedure has many implications for the sample. Six cases can be identified:

1. A woman with any maiden name marries a man with a "sample" name after 1946 and remains married through 1960: All her returns from 1946 to 1960 are included in the sample under her husband's household identification number.

1a. A woman with any name (incl.
married woman) whose husband dies after '46 will be filed with the husband, if he is in the sample. If she marries again a man in the sample, her single years only after death of first husband are filed with her new husband.

2. A woman with a "sample" maiden name marries a man with a non-sample name after 1946 and retains his name until 1960: None of her returns are included in the sample.

3. A woman is married in 1946 to a man (in the sample) who dies before 1960. She does not remarry by 1960: All her returns are in her husband's folder and are coded under one household identification number.

4. A woman is married in 1946 to a man with a "sample" name, divorces him after 1946 and does not remarry by 1960: Her returns for the years she is married are under her husband's household identification number. Her returns for subsequent years are available under the same household identification number if her name is still in the sample.

5. A woman is married in 1946 to a man in the sample. He dies or she divorces him after 1946; she marries before 1960 a man with a "sample" name: Her returns for the years of the marriage are under her first husband's household number; for the year of the divorce or death and subsequent years, her returns are under her second husband's household number. An integration deck exists so that these may be merged if this is desired.

6. A woman is married in 1946 to a man in the sample. He dies or she divorces him after 1946; she marries before 1960 a man with a "non-sample" name: Her returns for the years of the marriage are under her first husband's household number. No other returns were available because they were in the folder of a man not in the sample.

Errors not corrected by the time WAIS microfilmed its sample or inconsistencies caused by impossible matches make additional "cases." These cannot be easily characterized. WAIS has corrected these when possible, but returns which were not available remain unavailable.
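The six numbered cases above amount to a small decision table, which can be sketched in Python as follows. This is an illustration only and not part of the original paper: the function and parameter names are hypothetical, case 1a and the uncharacterized error "cases" are deliberately left out, and any situation the six cases do not describe returns None.

```python
from typing import Optional


def wife_case(married_in_1946: bool,
              husband_in_sample: bool,
              maiden_name_in_sample: bool = False,
              ended_by: Optional[str] = None,
              remarried_by_1960: bool = False,
              new_husband_in_sample: bool = False) -> Optional[int]:
    """Return the case number (1-6) that applies, or None if none does."""
    if ended_by not in (None, "death", "divorce"):
        raise ValueError("ended_by must be None, 'death' or 'divorce'")
    if not married_in_1946:
        # Cases 1 and 2: she marries after 1946 and stays married through 1960.
        if ended_by is not None:
            return None
        if husband_in_sample:
            return 1
        return 2 if maiden_name_in_sample else None
    # Cases 3-6 all start from a 1946 marriage to a man in the sample
    # that ends through his death or a divorce.
    if not husband_in_sample or ended_by is None:
        return None
    if not remarried_by_1960:
        return 3 if ended_by == "death" else 4
    return 5 if new_husband_in_sample else 6
```

For example, a woman married in 1946 to a man in the sample who divorces him and later marries a man with a non-sample name falls under case 6, so only the returns for the years of the first marriage are available.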
A Note on the Integration (Cross-Index) File

The cross-index file exists in rudimentary form only. It was formed by comparing identification records and social security account numbers and punching out any cases which were matches. WAIS personnel checked the deck as well as possible by looking at cases which seemed "doubtful" and by changing identification records clearly in error. Errors probably remain in the file.

WAIS is currently contemplating the construction of a "History" file which will include all of a person's identification numbers and an indication of the number of year records present in the Master File for each person in that file. Edits generated in the construction of the "History" file should allow WAIS to correct the errors in the current integration deck, and the corrected "History" file should allow integration to be done without significant error.

VonSchneidemesser Appendix to WAIS 656-033 January 13, 1967 The Treatment of Wives in the WAIS Tax Sample

Two additions to the six cases, p. 2 - 4

7. A woman was married to a man with a "nonsample" name. Her husband dies after 1946. She files after his death using her first name, which puts her into the sample. All her returns as far back as just happen to be available will be included in the sample, even though she was married then. (Ex: Robert and Geraldine B______)

8. A woman with any name, whose husband died after 1946, will be filed with her husband, if he is in the sample. If she marries again a man in the sample, her single widowed years only after death of the previous husband should be filed with her new husband.

Martin David 1967 Cross Tabulations for the Longitudinal Analysis March 16, 1967 WAIS paper 667-031 Cross Tabulations Tables

Martin David March 16, 1967 WAIS Paper No. 667-031 Cross Tabulations For the Longitudinal Analysis

1. Read output tape SSRI No.

2.
Transform variables 43 ( ( )i Model A), 54( ( )i Model B), and 63( ( )i Model C) as follows: 2.1 ( )i(Model A) - ( )i(Model A) - ( )i(Model A) [1959-t] Variable No. 70 43 44 52 2.2 ( )i(Model B). ( )i(Model B) - ( )i(Model B) [1959-t] Variable No. 71 54 55 52 2.3 ( )i(Model C) - ( )i(Model C) - ( )i(Model C) [1959-t] - ( )i(Z(c,t)) Variable No. 72 60 61 52 68 2.4 Transform variables 46-48, 56-57, 63-65 as follows, if the variable is X, v is number of observations (v.5?) y - x z - v-4 Enter y in the attached table y = F(P,Z). Define if Pi designates a column in the Table F Po as that column of the table for, which F(Po,Z) < y < F(Po+1, Z) (You may border the table with zeros on the left and high order 9999--- to make this hold.) Replace variables 46-48, 56-57, 63-65 by Po. Appendix Table S-Significance points of t (Reproduced from Ronald Fisher and, Dr. F. Yates: Statistical Tables for Biological, Medical and Agricultural Research (River and Boyd Ltd., Edinburgh, by kind permission of the authors and publishers) - P - 0-9 0.6 1 0.5 0-4 0,3 01 0 0-325 O-S10 0-727 1 'coo 1-376. 1-963 3-0-78 6-314 12'7o6 31-921 1.63-637 2 0-142 0-289 0-445 O'617 o-816 1.061 r-386 1.886 2-920 4.303 6-965 9-945 1636-619 31198 3 0-137 0-277 0-434 0184 0-765 0-978 1.638 4*54t 5-841 32-924 4 0-134 0'2j 0,414 O-Efin 0--7A I 0-941 2So 1.533 2:353 3-182 3-747 4-64 8-6so 1.190 2.132 2-776 5 0-132 0:67 0-406 0-559 0'727 0-920 1156 1-476 2-015 2'571 3-3165 4-3x 669 6 0-131 0-265 0-404 0-553 0-718 j906 It34 V440 1.943 3447 3-1 3107 S959 7 0-130 0-261 0-402 0.549 0-711 a-896 1-41S 1.893 2-361 3.499 5-403 0-130 0-262 0-399 0-546 a-7o6 o-8&) 1119 1'397 1.860 2-300 3-335 0-041 Longitudinal Analysis of Income OUTPUT RECORD FORMAT Variable Name VO - V42 WAIS Paper No. 667-031 Roger F. 
Miller, Martin David Revised: March 15, 1967' V43 V44 V45 V46 V47 V48 V49 V50 (Same as Table 3.3-1 INPUT FORMAT FOR REGRESSION ANALYSIS PROGRAM, REVISED 2/27/67) Model A Coefficient ( )i Model A Coefficient ( )i Model A Coefficient ( )i Model A t-ratio ( )i Model A t-ratio ( )i Model A t-ratio ( )i Model A Adjusted R2 Model A Standard error of estimate ( )i ( )i ( )i V51 V52 V53 V54 V55 V56 V57 V58 V59 V60 V61 V62 V63 V64 V65 V66 V67 V68 V69 ( )i,t-l t average date of filing no. observations Model B Coefficient ( )i Model B Coefficient ( )i Model B Model B Model B Model B Model C Model C Model C Model C Model C Model C Model C Model C z (c, t) Record mark t-ratio ( )i t-ratio Ri Adjusted R2 Standard error of Coefficient ( )i Coefficient ( )i Coefficient ( )i ( )i ( )i ( )i Adjusted R2 Standard error of estimate estimate ( )i t-ratio t-ratio t-ratio Roger F. Miller Martin David. February 27, 1967 Supplement to Job Plan I. For each person also do following regressions Model B yit= Yit/Yct = ( )i + ( )i(t-59) + ( )it Model C Yit = ( )i + ( )iYet + ( )i(t-59) + ( )it (unobserved residuals) Collect data on these ( )i - ( )i parameters in your output for tabulation, by individual, including standard errors, t ratios, and r2 or r (both with and without degree of freedom adjustment). II. Model A t( ) = coefficient ( ) / variance of coefficietnt 1/2 Model B t( )i = coefficient ( )/variance of coefficient 1/2 Model C t( )i t( ) t( ) t( ) computed analagously to t ( )i in Model B The standard error of estimate ( ) is normally calculated from the unexplained variance as ( ) where k is the number of parameters estimated WAIS Paper No. 667-031 corrected 2/27/67 by Martin David Table 3.3-1 INPUT FORMAT FOR REGRESSION ANALYSIS PROGRAM FIELD TAG NUMBER OF CHARS. 
CONTENTS OF FIELD LABELS OF MASTER FILE FIELDS EXTRACTED, RECODED, OR COMPUTED TOTAL CHARS V0 V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27 V28 V29 V30 V31 V32 V33 V34 V35 V36 V37 V38 V39 V40 V41 V42 V43 V44 V45 V46 V47-V50 V51-V54 V55-V58 V59-V62 V63-V66 V67-V70 V71-V74 V75-V78 V79-V82 V83-V86 V87-V90 V91-V94 V95 8 2 1 1 3 3 2 2 3 3 3 1 1 2 2 2 2 2 2 2 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 81 1 1 8 11 11 11 11 11 11 11 11 11 11 11 11 1 WAIS ID NUMBER BIRTH YEAR SEX MARITAL STATUS CHANGE (MS) INITIAL RESIDENCE LOACATION (RESI) FINAL RESIDENCE LOCATION (RESF) OCCUPATION GROUP HELD LONGEST (OFMAXI) OCCUPATION GROUP HELD SECON LONGEST (OGMAXII) PROPORTION OF TIME HELD OGMAXI (OGRI) PROPORTION OF TIME HELD OGMAXII(OGRII) PROPORTION OF TIME HELD OGMAX1+=OGMAXII(OGRS) INDICATOR LABOR FORCE STATUS INDICATOR OG CHANGE (OG) (LFS) INDICATOR, LF CHANGE (LF) FIRST (DETAILED) OCCUPATION HELD (OI) LAST (DETAILED) OCCUPATION HELD (OF) FIRST YEAR FILED NUMBER OF YEARS FILED (N) NUMBER OF MISSING RECORDS (N*) NUMBER OF GAPS IN THE FILE (G) NUMBER OF "MISSING INFO." INDICATORS IN AGI MEAN AGI MEAN WAGES & SALARIES MEAN DIVIDEND INCOME MEAN CAP, GAINS OR LOSSES MEAN SELF-EMPLOYMENT INCOME, PROFIT/LOSS MEAN INTEREST INCOME MEAN RENT MEAN NTI MEAN OTHER INCOME MEAN TRUST AND ESTATE INCOME MEAN SUM OF SOURCES ERROR STANDARD DEVIATION OF AGI ST. DEV. OF WAGES AND SALARIES ST. DEV. OF DIVIDEND INCOME ST. DEV. OF CAP. GAINS ST. DEV. OF SELF-EMPLOYMENT INCOME ST. DEV. OF INTEREST INCOME ST. DEV. OF RENT ST. DEV. OF NTI ST. DEV. OF OTHER INCOME ST. DEV. TRUST AND ESTATE INCOME ST. DEV. 
OF SUM OF SOURCES ERROR NUMBER OF SOURCES OF WAGES IN 1947 AGI CALCULATION INDICATOR FOR 1947 POSSIBLE JOINT RETURN INDICATOR FOR 1947 AGI FOR 1947; CODE "8000000" FOR NO RETURN SAME INFORMATION AS V42-45, FOR 1948 SAME INFORMATION AS V42-45, FOR 1949 SAME INFORMATION AS V42-45, FOR 1950 SAME INFORMATION AS V42-45, FOR 1951 SAME INFORMATION AS V42-45, FOR 1952 SAME INFORMATION AS V42-45, FOR 1953 SAME INFORMATION AS V42-45, FOR 1954 SAME INFORMATION AS V42-45, FOR 1955 SAME INFORMATION AS V42-45, FOR 1956 SAME INFORMATION AS V42-45, FOR 1957 SAME INFORMATION AS V42-45, FOR 1958 SAME INFORMATION AS V42-45, FOR 1959 RECORD MARK B-9 B-409 B-9 (POSISTION 8) B-27 (POS. 23) B-27 (POS. 12-14) B-27 (POS. 12-14) B-27 (POS. 18-19) B-27 (POS. 18-19) B-27 (POS. 8-19) B-27 (POS. 18-19) B-27 (POS. 18-19) B-27 (POS. 18-19) B-27 (POS. 18-19) B-27 (POS. 18-19) B-27 (POS. 18-19) B-27 (POS. 18-19) B-11 B-11 B-11 B-11 B-153 B-144, B-153, B-378 B-36, B-45, B-54 B-72 B-90 B-99 B-117 B-63 B-81 B-171 B-306 B-378 B-126 B-108 B-135 B-36 B-45 B-54 B-72 B-90 B-99 B-117 B-63 B-81 B-126 B-144 B-153 B-378 B-36 B-45 B-54 B-72 B-90 B-99 B-117 B-63 B-81 B-121 B-306 B-378 B-126 B-108 B-135 B-36 B-45 B-54 B-72 B-90 B-99 B-117 B-63 B-81 B-126 B-381 B-387 B-368 B-387 B-27 B-153 B-144 B-153 B-378 SAME SOURCES AS 1947 SAME SOURCES AS 1947 SAME SOURCES AS 1947 SAME SOURCES AS 1947 SAME SOURCES AS 1947 SAME SOURCES AS 1947 SAME SOURCES AS 1947 SAME SOURCES AS 1947 SAME SOURCES AS 1947 SAME SOURCES AS 1947 SAME SOURCES AS 1947 SAME SOURCES AS 1947 SAME SOURCES AS 1947 8 10 11 12 15 18 20 22 25 28 31 32 33 34 36 38 40 42 44 46 48 56 64 72 80 88 96 104 112 120 128 136 144 152 160 168 176 184 192 200 208 216 224 225 226 227 235 246 257 268 279 290 301 312 323 334 345 356 367 368hahttp://www.ssc.wisc.edu/wais/WAIS667031.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS667031.txt$Richard Bauman Ashok Bhargava 1967JDResidual Tax Records- Analysis of Results of Visit to Tax Department June 13, 1967r WAIS 
paper 667-043. Missing Data (Master File Records). http://www.ssc.wisc.edu/wais/WAIS667043.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS667043.txt

Some of the data in this document contains confidential information. The confidential information has been removed and preserved in the secure data archive (OLDR). Contact CDHA Data Archivist at cdhadata@ssc.wisc.edu for more information.

Dick Bauman
Ashok Bhargava
WAIS 667-043
June 13, 1967

RESIDUAL TAX RECORDS - ANALYSIS OF RESULTS OF VISIT TO TAX DEPARTMENT

WAIS 667-029 and 667-032 set out the problems of Residual Tax Records in WAIS's sample. This paper sets out the results of a visit to the Tax Department. The visit was made to estimate what percentage of the missing records we could find, or the reasons for their non-existence.

All records for 1946-1954 have been microfilmed by the Tax Department at the four district offices, and were thus no longer available in the files. The 1955-1958 returns are in the Records section ("purged" file) of the Tax Department. We were thus able to get only the 1959-64 records for checking.

We checked the Tax Department files against the following WAIS records:

1. Folder shots of the first 35 household ID#'s in the name groups #01, 08, 30, 36, and 50 (henceforth referred to as the sample name groups). These correspond to cases 1 to 9 in WAIS 667-032.
2. Folder shots of the first 15 household ID#'s in the 4000's in the sample name groups. These correspond to cases 13-15 in WAIS 667-032.
3. A sample of about 30 unmatched FFID's in each of the sample name groups. The sample consisted mainly of males, and especially those who had filed in the later years of the 1946-1960 sample. Thus, the sample was biased: females and early filers were excluded.

          46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
   a. M    0  0  0  0  0  0  0  0  0  1  1  1  1  1  0   Included
   b. F    0  0  0  0  0  0  0  0  0  1  1  1  1  1  0   Excluded
   c. M    0  0  0  1  1  1  0  0  0  0  0  0  0  0  0   Excluded
   d. F    0  0  0  1  1  1  0  0  0  0  0  0  0  0  0   Excluded

   1 - Return exists
   0 - Return not available

   "a" and "b" correspond to cases 10 and 11 in WAIS 667-032. "c" and "d" correspond to case 12 in WAIS 667-032. We took only cases under "a" because the pay-off was expected to be the highest. "b" is covered at the same time as "a", since most females under this heading were wives. There seemed little chance of getting any further returns from the early filers.
4. ID#'s with "All Returns Missing" (1959-64) for the sample name groups. These correspond to cases 16a and 16b in WAIS 667-032 (correction).

The results are shown in Tables 1, 2, and 3.

Interpretation of Results

TABLE 1 - Analysis of Folder Shots* (after visit to Tax Office 1959-64)

                                              By name group     Total       %    % distribution in 3, 4, 5+
1. All returns available in WAIS records      19 26 7 17 14        83   44.36
2. Explained                                  4 3 6                13    6.95
3. As on folder shot                          18 14 21 14 13       80   42.87     80     88%
4. Returns available                          4 1 1 1               7    3.70      7    7.7%
5. Audit and Estate                           1 3                   4    2.00           4.3%
   Total                                      48 42 32 35 33      187  100.00     91  100.00

*Name groups 01, 08, 30, 36, 50: first 35 household ID's (except unmatched FFID's); first 15 household ID's in the 4000's (except all returns missing).
+These are cases where returns can be found.

Table 1 gives the results of the checks with the first 35 household ID's and the first 15 household ID's in the 4000's. These should give a total of 250 in the sample name groups. However, the table shows a total of 187. This is because a few of the first 35 ID's are in the unmatched FFID's, and a few of the first 15 ID's in the 4000's are in the "All Returns Missing".

(A ∪ B) - (A ∩ C) - (B ∩ D) = 187

Of the 187 cases, 96 had "All Returns Available in WAIS Records" or were explained. We are thus left with 91 cases in which we could expect a pay-off. The distribution of these 91 cases is given in the last two columns of Table 1.

1. In 80 cases the folder shots were correct - no further information was available in the tax records.
2.
In 7 cases (7.7%) returns were available in the Tax Records. Of these, 4 were in the B**** name group. In each of the 4 cases the returns were found under B*****. This discrepancy in spelling - with one or two *'s - is the cause of a considerable amount of confusion in this name group. (We suspect there is similar confusion in some other name groups also.)
3. 3 files were out for audit and 1 was in the Estate Department. Presumably we will have returns available in these cases.

TABLE 2 - Analysis of Unmatched FFID's (1959-64)

                                         01   08   30   36   50   Total       %
1. Returns Available                      8   --   --   --   --       8    5.33
2. Names Not in 1959-64 Tax Files        22   30   30   30   30     142   94.67
   Total                                 30   30   30   30   30     150  100.00

Table 2 shows the results of the checks with the unmatched FFID's. In 8 cases returns were available for some years in 1959-64. All 8 cases were in the B**** name group, and were due to the change of spelling (one to two *'s).

TABLE 3 - Analysis of Folder Shots (All Returns Missing)*

                                         01   08   30   36   50   Total       %
1. Explained                             --    6    8   --    1      15    11.2
2. As On Folder Shot                      8   14   12    3    7      54    40.6
3. 1959-60 (Correct)                     --    4    3    4    1      12     9.0
4. Returns Available                      7    5    4    2    3      21    15.8
5. Estate Dept.                           4    6    2    2    5      19   14.24
6. Delinquent and Audit                   1   10    4    1    1      17    12.1
7. Not in Files                          --    1    2   --    1       4     3.0
   Total                                 20   46   35   12   19     132   100.0

*All the "All Returns Missing" in name groups 01, 08, 30, 36, 50.

Table 3 gives the analysis of folder shots for the ID#'s with "All Returns Missing." (It also includes cases with only 1959-60 returns available, which have already been coded in the old file and are therefore filed with the "All Returns Missing".)

1. In 15 cases (11.2%) we found an explanation for all the returns missing. (This explanation is available in our records also.)
2. In 54 (40.6%) cases there were no returns, and no explanation why all the returns were missing. A number of these cases had information returns in the files (of which we had copies in our records).
3.
1959-60 returns only: There were 12 cases, and the tax records did not have any more information in these cases.
4. Returns were available in 21 (15.8%) of the cases. Of the 7 cases under B****'s, 2 had returns available under B*****.
5. 19 persons' returns had been checked out to the Estate Department (these were the same persons as shown in our records as having returns in the Estate Department).*
   *We had a discussion with the people in charge of the files. It was not clear whether these returns were returned to the Tax Department files after the Estate Department cleared them, or whether they stayed in the Estate Department. They told us that females' returns were definitely returned but were not sure about the males! In either case these returns can be obtained.
6. 17 returns were in the delinquent file or out for audit. These can also be obtained, either after they are re-filed, or from where they are at present.
7. We ran into 4 queer cases, where we had a person's name and address, but no file existed in the Tax Department. These have probably been misfiled.

Recommended Strategy For Complete Search

1. Get copies of Tax Department microfilms (for 1947-1956 returns) in our name groups for use on a reader-printer.
2. 1955-58: Look at "Purged records" of the Tax Department.
   a. Delinquent and audit should pose less of a problem here.
3. Go to estate files for deceased males whose returns have not been returned to the tax files.
4. There are some name groups where there could be name change problems.
   a. B**** -- B*****
   b. C**** -- C****
   c. C**** -- C****
   d. B**** -- B****
   e. F**** -- F****
   f. N**** -- N****
   g. P**** - P**** -- P**** - P****
   h. S**** -- S****
   i. T**** -- T****
5. Since we are keeping files for households, it may be possible to extend the sample to include widows (not re-married) who filed separately after the husband's death.

Mark Lieberman, 1968. Terminal Operation: A Progress Report. March 22, 1968. WAIS paper 678-053. Edits Programs.

Mark Lieberman
WAIS 678-053
March 22, 1968

Terminal Operation: A Progress Report

This paper will give a brief explanation of the current state of WAIS's knowledge of B-5500 remote operations. Our B-5500 activity grows daily, as does our awareness of problems and our competence in solving them. However, our present system of working at the remotes, while more sophisticated than we would have expected a few months ago, is still far from smooth. The context of WAIS's operation is essentially that envisioned since September, although the feelings of "Well, this may be possible" are now replaced by feelings of "Sure we can do it this way, but do we want to?"

It appears that all WAIS files can be updated using the current version of EDITOR when EDITOR is linked to my programs ONTODISK and TAKEOFF. ONTODISK can take any file and break it into sections so EDITOR can handle all characters of the record. Presently, ONTODISK exists only in a form suitable for 80 column records, but the modifications needed for records of any other size are small.

FIGURE 1 shows how an 80 character record appears after ONTODISK has segmented it. The remote operator sees two 40 character records. Each half of a record can be easily distinguished since the first half has a + as its initial character while the second is preceded by 8 quotes. Eight quotes also mark the end of the first 40 character set. Furthermore, each 40 character set is uniquely numbered by EDITOR, with each record getting two numbers.
If n is the number of a card when each card is given a consecutive sequence number, the EDITOR will assign (2n-1)(100) and 2n(100) to the half card images. Thus, in the figure, 1900 and 2000 are the first and second half of card 10. The extended string of quotes after 2300 is the result of the method used to end the printout and does not indicate program failure.

After the file has been updated, TAKEOFF reconstructs the record and produces new updated records on any medium required (currently, it produces only cards, but new tape or disk files are easily produced also). TAKEOFF can be initiated on the remote without having to go to UWCC. A future modification of TAKEOFF will tell the remote operator how many new records were produced by a message printed on the remotes. This will be a step toward file integrity throughout the editing process. As a further precaution, when TAKEOFF creates a file for further processing, it does not destroy the original file. Rather, this file is maintained as a backup file by UWCC, and can be reloaded anytime within about a month of its last use.

The problem of file integrity is one Bill Gates and I have spent considerable time on. The result of our worrying is the HISTORY OF FILE UPDATING sheet. This sheet is to follow any set of records from the time it is edited until the time it is completely corrected. A crucial item on this sheet is the "# of records" which appears in items 2, 6a, 6b, and 8. Equality in these items is designed to ensure that no records are lost in the processing of the file. As EDIT, ONTODISK, EDITOR, and TAKEOFF produce record counts each time they are used, this sheet should provide an excellent safety precaution. To be sure the sheet is used, remote operators receive instructions to process no files unless they have such a sheet attached to the job.

It is only proper to dim the picture by briefly sketching some of the problems that WAIS must solve.
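The half-card numbering and the record-count check described above can be sketched in a few lines of Python. This is an illustration only, not the ONTODISK, EDITOR, or TAKEOFF code: every function name here is hypothetical, and the marker characters (the + and the quotes) that ONTODISK attaches to each half are deliberately left out.

```python
def half_card_numbers(n):
    """EDITOR sequence numbers for the two halves of card n (cards count from 1)."""
    return (2 * n - 1) * 100, 2 * n * 100


def split_card(card):
    """Split one 80-column card image into two 40-character halves."""
    if len(card) != 80:
        raise ValueError("expected an 80-column card image")
    return card[:40], card[40:]


def rejoin_cards(halves):
    """Rebuild 80-column cards from consecutive pairs of 40-character halves."""
    if len(halves) % 2:
        raise ValueError("a half card is missing")
    return [halves[i] + halves[i + 1] for i in range(0, len(halves), 2)]


def counts_agree(*record_counts):
    """True when all record counts entered on the history sheet are equal."""
    return len(set(record_counts)) <= 1


print(half_card_numbers(10))  # (1900, 2000), the two halves of card 10
```

Comparing the counts from items 2, 6a, 6b, and 8 with `counts_agree` mirrors the equality check the sheet asks the operator to make by hand.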
On a trivial level, we do not yet know how to use ONTODISK so that we create the maximum size file EDITOR can handle. Eventually, we hope to find this out. On a somewhat less trivial level, we are still uncertain about how to use the B-5500 file security system and the UWCC file backup process. The file security system has the irritating tendency to make files appear to vanish even though they are on the disk. This occurs because only the person who runs ONTODISK may access the file thus created unless he specifies otherwise. The problem arises when these specifications do not work (cause of failure unknown, naturally).

The file backup system at UWCC is imperfect despite its use since September. Occasionally entire disk files are lost. Sometimes this happens, and UWCC does not even seem aware of it. When the disk fails, the file on it is lost, and the only way to get it back is either to recreate it or to load it from a disk back-up tape; even if the latter is possible, a full day's work on that file is lost. In fairness, however, it should be pointed out that these systems errors are occurring very infrequently, and their continued decrease in frequency is predicted. Even the menace of halt loads, i.e. complete restarts of the B-5500 system, is very infrequent now. Also, a new operating system, MARK VIII, gives remote jobs almost immediate access to the computer. Hence, less time is spent waiting for the scheduler to put one's job into the mix of jobs it time-shares with.

The last great problem is certainly the element of operator training. Remote operation, while glamorous, is easily botched. There are constant opportunities to ruin a file. Even calm tempered people can find that problems on the remotes can be irritating and seemingly causeless. It may well be that the ultimate success or failure of a remote updating system depends not on the software or the hardware of the computer, but on the humans who run the remotes.
It is in this area that WAIS must excel if remote operations are to become the time and money saving operations they can become.

HISTORY OF FILE UPDATING                                        PAGE___OF___

Is this sheet a continuation of a former sheet?   Yes (answer from 5)   No (answer from 1)

1. FILE ID                          PRIMARY USER
2. DATE OF EDIT PRODUCTION RUN      # OF RECORDS
3. DATE FILE PUT ON DISK
4. # OF CORRECTIONS TO BE MADE:   REPLACEMENTS   DELETE CARDS   ADDED CARDS
5. DATE JOB GIVEN TO REMOTE OPERATOR
6. REMOTE OPERATOR REPORT
   a. DATE JOB BEGUN                # OF RECORDS
   b. DATE JOB ENDED                # OF RECORDS
   IS DISK FOR DATE IN 6b AVAILABLE?   Yes   No
   If no, what is the date of the recreated file?
   RETURN ALL OUTPUT TO PRIMARY USER
7. RESULTS OF CHECK OF TTY OUTPUT:   SPOT CHECK or 100% CHECK (check one)
   How many TTY OPERATOR errors were found ___
8. DATE OF RERUN OF EDIT            # OF RECORDS
   Were there still errors?   No   Yes
   COMMENTS ON NATURE OF THESE ERRORS (e.g. "Errors at remotes" or "Missed some errors before" etc.)
9. IF FILE IS NOW COMPLETELY CORRECTED, WRITE DATE OF COMPLETION; OTHERWISE CHECK CONTINUATION BOX
   CONTINUATION or Date of file completion

REMOTE OPERATOR'S LOG
OPR DATE TIME TIME ADD CHANGES OTHER TIME LI H/L or COMMENTS OF LI IN DEL ENDED EOJ

MX 4: EDITOR/EDITOR=5 BOJ 1618 FROM 01/09 24910 0 : *'L I ST 300:+010501002701 400: ''''''''''''' 500:+010501004500 900:+0105010051500 21i2163451612 1000: '''''''''''''''''' '110.0:+010501006200 2570 590 1200: ''''''''''''''''''''''' 13000:+010_501006810 '3412621e326244262 1400: '''''''''''''''''''''''' 11.500:+010501007100 1600: ''''''''''''''''''''' 1700:+010501007800 1800: ''''''''''''''''' 1900:+010501009000 2000: '''''''''''''''70 100:+010501002000 , 137157 1630 0 700:+010501005600 2229 209 '''''''''''''' ''''''''''''''''''' ''''''''''''''' '''''''''''''' '''''''''''' ''''''''''''''''''' ''''''''''''''''''' 570 5 ''''''''''''''''' M 0 22 L32625 148 6-111.763430 470 530 2100:+01050101.0300 2259/1526946 2200: '''''''''''''''''''''''' c.
2300: +01_05610105000a3/415241 160 2500:+010501010501 3580 600 610 '''''''''''''''''' I ''''''''''''''''''''''''''''''''' 2500:*SAVE SURVEY/CARDS DUP. FILE: SURVEY /CARDS 2500:*SAVE: SURVEY/CARD5 DUP. FILE SURVEY /CARD5 2500: *QUIT- 0 0 MINS 5 SEC; PROC 0 0 MINS 9 SECS I / 0. 3 3 MINS 18 18 SECS TOTAL. EDITOR/EDITOR=5 EOJ 1621 ?PD SURVEY/CARD5 SURVEY/CARDS ?LO_ #STATION 1/9 LOGGED-OUT AThahttp://www.ssc.wisc.edu/wais/WAIS678053.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS678053.txtH^Mark Lieberman 1966Format for X-Tab InputSeptember 22, 1966 WAIS paper667-006r"Survey Data and File TablesoGbG\Mark Lieberman WAIS Paper 667-006 Revised September 30, 1966 FORMAT FOR THE X-TAB INPUT The following list is a supplement to Appendix A of WAIS paper 656-061, "Tables from the Survey." The "card" and "card column" rows allow reference to the actual position in the survey from which the information came. The "input array position is the number of the position In the input array as the data go into the X-TAB subroutine. The MNEMONIC is that used on the X-TAB variable identification cards that are already prepared. It is, of course, possible to use different MNEMONICs for the game tape position as long an new variable identification cards are written. A brief description of the actual data appears on the far right. Format for Extract From Survey Card Input Array "711 11-12 1 5-16 13-14 17-11 1 15-16 13-20 17118 Z101 U-20 Position IM: * USE?1HDl USUMM2 usvIND3 REICINDI RE! UD2 Rte..:; IND3 Data 1 2 11-14 I 11-22 6 First use of WINDFALL SecmA~ We of WIMP"ALL Third use of WINDFALL First reason for using WINDFALL this way Second reason for acing WINDFALL this Third reason for using WI ALL this way 15 23 24 25 26 27 28 29 30 TS 31 32 !7 35 Y 36 41 39 42 40 43 46 44 47 45 46 50 48 h; V 7 ASSTOWN1 First asset owned - 2 , ASSTOWN2 Second asset owned nod ? 
ASSTOWN3 Third asset owned 15 ASSTOWN9 Ninth asset owned is I CUDRINC1 First asset owned for current income 4 20 CURRIM04 CAPGAIN1 Fourth asset owned for current income First Asset owned for capital appreciation CAMAIM 1'' LIQUDTY1 Fourth asset owned for capital appreciation First asset owned for LIQUDITY UqU-M-4 K; MCCONS 2? SOVISTR 3is INCOMAS 1 Fourth asset owned for LIQUIDITY R's description of portfolio Satisfaction with portfolio distribution First asset R wants to increase Third asset R wants to increase Card Input Array Card Column WISTAB Position MNEMONIC Data 02 51 49 .33 DECREAS1 First asset R wants to decrease I a 0 a 0 0 a 53 51 35 DECREAS3 Third asset R wants to decrease 54 52 36 PLNCHNGE Will R-make these changes? 55-"-6 53-54 37 WHYNOCG1 First reason for not changing portfolio 57-58 55-56 38 WHYNOCG2 Second reason for not changing portfolio 59-60 57-58 39 WHYNOCG3 Third reason for not changing portfolio 03 13-14 59 40 NEVOWN1 First asset R has never owned I 15-16 60 41 NEVOWN2 a 0 a 0 0 0 0 0 a 0 0 0 0 a 20-21 67 48 NEVOWN0 Ninth asset R has never owned 47: 68 49 QUALPRTI First quality of ideal portfolio 48. 
69 50 QUALPRT2 Second quality of ideal portfolio 48 70 51 QUALPRT3 Third quality of ideal portfolio Q 13-14 ii-72 52 REDELIVER R's Highest academic level 15 13, 53 RVOCSCHL Length of time in vocational sob o1 16 74 54 RHIDEGR R's highest degree 17-18 75-76 55 STRCOLL Undergraduate colleges state 17-20 75-78 56 RCOLL 4DIGIT code for undergraduate college 2122 79-80 57 STRGRDDG State of graduate work 23-24 79-82 58 COLLGRAD 4 DIGIT code for graduate college 25-26 83-84 59 RFIELD Field of highest degree 27-28 85-86 60 RBRTBYR Year R was born 29 87-89 61 RGREWUP Place R lived before age 18 32 90-90 62 RSZPLGRW Size of home town 36-38 91-93 63 RMTHRHME R's mother's home city and state) 42-44 94-96 64 RFTHRHME R's father's home 4s 97 65 RDADSED Amount of father's education 46 47 98-99 66 RDADSOCC R's father's job 48 100 67 MAT Marital status 49-50 101,102 68 YRMSTAT Year marital status began Card Input Array and column WISTAB Position MNEMONIC Data 15- 1.3-14 103-104- 69 #CHLDRN Number of R's children 06 13-14 105-106 70 WEDLEVEL Highest level of wife's education 15 107 71 WVOCSCHL Amount of wife's vocational schooling 35 108 72 WPAY 'T How important was R's wife's pay 36-37 109-110 73 WWKTENUR now many years wife worked since marriage 45 ill 74 WEMP5963 Did wife work full time from 1959-1963 08 13 112 75 REMPSTAT R's employment status 14 113 76 WHYUNE14P Cause of R's unemployment 15-16 114-115 77 ROCC R's occupation 21-22 116-117 78 RINDSTBY R's industry 24 118 79 #E1IPL!E Number of people R manages 25 - 119 80 RRLYSALD Salary basis 26-27 120-121 81 OTDtE1 First arrangement for overtime pay 28-29 122-123 82 OTIMB2 Second arrangement for overtime pay 38-39 124-125 83 HRSPRWR Mean hrs. 
per week R worked last year 40-41 126-127 84 LOWWKS Number of weeks of lean than 40 hours 42-43 128-129 85 HIWKS Number of weeks of more than 40 hours 52-53 130-131 86 #IMPJOBS Number of important jobs R has hard 09 13-14 132-133 87 ) TD JB Most important job R has had 15 16 134-135 88 NSTIMIND Industry of most important job 21 136 89 #C(QKED Number of firms R worked for in last 5 yea 22 137 90 MVSHELPD Have moves helped R get ahead? 28 138 91 #JOBSUC? How often has B looked for new job 29.30 139-14 92 NEWJOB1 opportunities? First type of opportunity B looked for 31-32 141-142 93 NEWJ0B2 Second type of opportunity R looked for 33-34 143-1 94 NEWJOB3 Third type of opportunity R looked for 35 14 95 UNMPBIST Frequency of unemployment 45-46 146-147 96 MffUNEU1 Unemployment compensation/wk, ($), assoc. 61 15 97 #PTIMJBS Number of part time jobs since 1959 62-63 151-1s 98 P27JOB1 First part time job 64-65 153-15 99 WKSFTJB1 Number of weeks worked in 1963 66-67 155-15 1.00 1U PTJB1 Hrs/wk worked in 1963 3 Card Input Array and Column WISTAB Position MNEMONIC 10 13 157 101 PRBJOBCG1 Probability of a job change 14-15 158-159 102 WHYJBCG1 First expected advantage of change 16-17 160-161 103 WHYJBCG2 Second expected advantage of change 18-19 162-163 104 WRYJBCG3 Third expected advantage of change 20-21 164-165 105 ATTNWJB1 First attraction of this new job 2 166 106 SSRRRET Covered by social security or railroad retirement 26 167 107 INCPAYT Does R have sick pay? 29 168 108 %EMPPYS % of benefits employer pays 30 169 109 LNGSCKPY Length of sick pay 3 170 110 TRMSCKPY Terms of sick pay 32 171 111 INRETIR Is R in a retirement program? 33 172 112 FINRETPL How is retirement program financed? 14 65-70 173-178 113 VALDON1 Value of donationsiin kind ($) (1963) 71 179 114 SGNMV-C Sign of (mkt. value - cost) of these donations 78 180 115 TAXADV Was there a tax advantage in this gift? 
L5 13 181 116 #RELHLPD Number of people R helped outside his own family in 1963 38 182 117 #LVDWR Number of people lived with R since 1-1-49 6 67 183-184 3.18 #TRSTTOR Number of trusts whose beneficiary is in family 68 185 119 #GIFTTOR Number of gifts received since 1949 7 186 120 TRGFTINH Probability of R's getting gift or inheritance 16 13 187 121 TRANS1 First type of transfer payment received 25 188 122 TRANS2 Second type of transfer payment received ,7 17-23 189-194 123 AMTINHE1 Amount of first inheritance ($) - assoc. 80 195 124 #INHER Number of inheritances since 1949 31-32 196-197 125 YRLSTINH Year of latest inheritance 13 198 126 HMSATISF Satisfaction with home 14-15 199-200 127 REHMSAT1 First reason for satisfaction with home 16-17 201-202 128 REHMSAT2 Second reason for satisfaction with home 18-19 20s-2O4 129 REHMSAT3 Third reason for satisfaction with home 20-21 205-206 130 #ROOMS Number of rooms to R's house Card WISTAB Card Column O 22 1.3 14 207-208 15-19 209-213 20-24 1,214-218 LO--U 30 38-39 219-220 40-43 221-224 2 67-71 + 225229 31-35 37-39 230 232 40-42 44 233 45 234 24 13 235 15.16 236-237 17-19 238-240 20-22 241-243 28 244 31-35 245-249 24 58.62 250-254 25 32-s6 25 52-56 255-259 59-61 260-262 2 13 263 14-1.8 264-268 20.21.. 69-270 35 271 :' 7 13-1 4. 19-23 25 52-5 .72-277 ' - 62 12 S 5 Input Array Position MNEMONIC Data 131 YRBOTHMB Year R bought home 132 USRVALUI R's estimate of home value -- assoc. 133 UM MM 1 R's investment in home; value is 0. if both are not ascertained; if only one addend is 134 #UNITSHM Number of units in R's home 135 MORNTRC1 Amount of monthly rent received -- assoc. 138 fOtN'1FURN Does R rent a furnished home? 139 WHYN(>RNT Why does R neither own nor rent? 140 HOWFMACQ Method of acquiring farm 141 YEFMACQ Year. R acquired his farm 142. ORIGACR Acres originally in farm 143 CURRACR Acres presently in farm 144 FMOWTTYP Method of owning farm e.g., partnership 145 VALHSYD1 R's estimated of value of house and yard -- assoc. 
136 AMTHMTGE1 amount of mortgage -- assoc. 137 RNMIL1 Monthly rent plus utilities 146 147 148 149 150 151. 1.52 F! D TG1 Value of mortgage on R's farm LAAMLD1 Value-of land and building 12-31-1953 ACRNTF'RM Number of acres rented from others HOWB .EN Method of rent payment e.g. cash, income share 63FMRNTI 1963-Rent assoc. RSHARE R's share' of farm income WIWNRTPM Why R neither owns nor rents 153 FARHVAL1 Net worth of R' farm -- assoc, 25 -02-36) Input Array Card Column Position 027 49 278 Card VALINNOV VALEXSTA 51 280 52 281 28 13-19. + 82-288 158 22-28 8 13-19 22.28 31-37 289-295 4046 49-55 58-6 29 13 d4-15 16-17 18-1.9 20 First, early, or late adapter of farm innovations R's evaluation of ag. exprmt1. station. suggestions 159. 296 160, 97-298 161 299-300 1.62 301.-302 163 303 16.4 304 305 165 306 166 307 167 308 168 309 169 310 170 311 171 312 172 313 173 314 174 315 175 316 176 317 177` Sc3NINGGG REINCCGI. REINGCG2 REINGCG3 AMTINGCG VALCOAGT Frequency of consulting county agent VALLBULL Frequency of examining ag. exp. sta. bulletin HEADING1 Income of head of family*--assoc. Probable direction of 63-68 income change First reason for expected change in direction Second reason for expected change in direction Third reason for expected change in direction Expected percentage of change Networth MANAGER1 First asset managers title MANAGER2 Second asset managers title MGRJOB1 First job of Mgr. MCR..TOB2 Second job of Mgr. WIFEASST Does R's wife have investments FIXRET Amount of fixed return income ANTSTK Amount of common and preferred stock AMTBUS Amount of holding in a business #REST #TVS DRYER WASHER DSHWSHR Number of parcels of real estate Number of TV`s R owns Does R own a dryer? Does R own a washer? Does K own a dishwasher? Card Card Column Input Array Position MNEMONIC t 30 30 318 178 #AIRCOND Number of room air conditioners 35 319 179 CENTACON Does R own a central air conditioner? 38 320 180 FREEZER Does respondent own a separate freezer? 
41 321 181 RIPI Does respondent own a hi-fi or stereo? 44 322 182 #CARS Number of cars and/or trucks R owns 31 13-14 323-324 183 BOAT1 Description of first boat 8 33 325 184 TAXMAIN Will R. gain from 1964 tax cut 34-35 326-327 185 RE&.T%GNi First way R will make use of gains 36-37 328-329 186 REATZ=2 Second way R will make use of gains 38-39 330-331 137 REAT3[GN3 Third way R will make use of gains 28 13-19 22-28 + '32-338 188 JTRTINC1 Joint return income -- assoc. 31-37 40-46 31-37 + 39-345 189 WIFEINC1 Wife's income -- assoc, 40-46 49-55 + 46-352 190 GHLDINC1 Other family members Income -- assoc, 58-64 24 23-26 53-356 191 63FM ML1 1963 farm property tax {$) -- assoc, 5 38-39 157-358 192 AGW0ODT[ Acres under woodland tax law 41,-42 c59-361 193 ACF0RC3'"X. Acres under forest crop tax law 21 462-364 194 WPLCRWUP Size of wife's home town_ 06 13-20 365 195 WSZPLGRW Place wife grew up G9-70 c66-367 196 #TRSTSXR Number of trusts established by R's family 115-1-) 368 197 ERR63DON Error in 1963 donation 369 198 IRINH Error in inheritance received 370 199 ERRASVAL Error in home's value 371 200 EURPRIHS Error in home's purchase price 372 201 ERRRMDFP Error in. home improvements 373 202 ERRNTRG Error in rent received 374 203 ERRMTGE1 Error in first mortgage 375 204 'ERRMTGE2 Error in second mortgage Card Input Array Card Column WISTAB Position MNEMONIC Data O 22 15-19 376 205 EPR$NTPP Error in monthly rent paid 377 206 IRRUTIL Error in utilities paid 378 207 MUMS .Error in value of house and yard 379 208 EFLUMM"1 Error in first farm mortgage 380 209 Error. in second farm mortgage 381 210 EERI D Error in value of land and building 382 211 E RLNTRT Error in land rent paid 383 212 ERMP 1M.AC Error in farm machinery 384 213 ERRL.VSTK Error in livestock 385 214 ERRFEED Error In value feed 386 215 ERRILDWG Error in head wage 387 216. 
ERRUUPRP Error in head property income 388 217 EI RWFWG Error in wife's wage 389 218 ERRWPRD Error in wife's property income 390 219 EL'RCIIWG Error in child's wage 391 220 ERRGUPRD Error in child's property income 392 221 ERRT'MDET Error in farm debt 393-399 222 WEIGHT1 W GITI - FOP14AT F7.3 400 223 WEIGHT2 WEIGHT2 - 0 or 1 weight The following is a series of 10 assets, each accompanied by 7 reasons for owing or not owing this asset. Since each asset will have the same format, only the codes for the first asset will appear in their entirety the others will follow in abbreviated form. 401 224 OWNINS Does R own insurance? 402 225 CI1INGINS Does R own insurance for current income? 403 226 CPAPPINS Does R on insurance for capital appreciation? 404. 227 LIQINS Does R own insurance for liquidity? 405 228 INCRIN S Does R wish to increase his insurance? 406 229 DECRINS Does R wish to decrease his insurance? 407 230 NEVINS 1 if R never owned insurance 408-409 231 RM40INS Why R never owned insurance MNEMONICS for most further assets are derived by removing the last 3 digits of the appropriate code and replacing them by 3 digits representing the proper assets e.g. if R owns home equity for capital appreciation, the, MNEMONIC is CPAPPHEQ. Card Input Array Card Column WISTAB Position MNEMONIC Data OWNUMEQ Does R own home equity CRINCHEQ Does R own home equity for current increase RENONEQ Why R has no home equity OWZNCBD Does R own government bonds? RENOGBD Why R never owned government bonds? OWNLBD Does R own local bonds? RENOLBD Why R does not own local bonds? ONNCBD Does R own corporate bonds? a RENOCBD Why R does not own corporate bonds? OWNSACC Does R own a savings account? CRINCSAC Does R own a savings account for current income? CPAPPMC Does R own a savings account for capital appreciation? LIQSACC Does *R own a savings account for liquidity? INCRSACC Does R grant to increase his savings account? DECRSACC Does R want to decrease his savings account? 
NEVSAG 1 if RR never owned a savings account. RENOSAC Why R never owned a savings account? OWNCSTK Does R own common stock? CRINCCS' Does R own common stock for current income? 448 266 449 267 450 268 451 269 452 270 53-454 271 455 272 456 273 a 0 458 275 a 0 LIQCSTK Does P own common stock for liquidity? Column WISTAB Card 459 460 461. 462-463 464 465 466 467 468 469 470 471-472 473 474 475 476 480-481 482 489-490 491 08 23 492 10 27 493 IC% 47 494 15 63-64 495-496 29 22 499 29 41-43 500-502 44-44-6 503-505 47-49 506-508 50-51 509-510 Card Input Array Position 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 10 MNEMONIC Data INCRCSTK Does R want to increase common stock? DECRCSTX Does R want to decrease common stock? HEVCSTK Did R ever own common stock? RENOCSTK 1 if R never owned common stock OWNPSTK Does R own preferred stock? CRINCPST Does R own preferred stock for current income CPAPPPST Does R own preferred-stock for capital appreciation? LIQPSTK Does R own preferred stock for liquidity? INCRPSTK Does R want to increase preferred stock? DECRPSTK Does R want to decrease preferred stock? NEVPSTK 1 if R never owned preferred stock RENOPSTK Why R owns no preferred stock? OWNREST Does R own real estate? CRINCRES Does R own real estate for current income? CPAPPRES Does R own real estate for capital appreciation. LIQREST Does R own real estate for liquidity? a e RENOREST Why R does not own real estate? OWNBUS Does R own business 303 RENOBUS Why R does not own business 304 NOASSETS I if R owns no Assets 305 RSELPEIIP Is R self employed? 306 RHLTRItiS Does R own health insurance? 307 DEFINC Does R have deferred income? 308 .TRANS Number of transfer payments received 309 INVMGR Does R have help in managing investments? 310 #BOND Number of bond lot R owns (corporate, state,, etc. 
311 #BONDSL Number of lots of bonds R sold since =2-3 -55 312 #LOANS Number of loans R owned on 12-31-6.3 313 #INVCLB'S Number of investment clubs R belongs to 29 52-53 511-51.2 314 Card Input Array 0 MNEMONIC Data #STKOPT Number of stock options since 12-31-58 54-56 513-515 315 STOCKS Number of lots of stock 57-58 516-517 316 #STCKSLD Number of lots of stock R sold since 1.2-31-58 59-60 513-519 317 'BUS Number of businesses R owns 51--62 520-521 318 #BUSSLD Number of businesses sold since 12-31-58 63-64 522-523 319 #CLCOR S Number of interests in closely held corporations 65-6 524-526 320 #tFARBEST Number of parcels of real estate? 527-529 321 #RESTSLD Number of real estate parcels sold? 530 322 GRLYINS Is life insurance currently in force 531-532 323 OASSTS Number of other income producing assets 29 2_ 533 324 NETWORTR Net worth - 9 categories 29 21 533 325 NEORTI1 Net word. - 4 categories The following are recoded forms of data which appear elsewhere. These forms are identical to those are recoded from, except that blanks in these fields were recoded as 999...9. All recoded items end in "2", 09 0 14 17 22 22 22 23 23 I 24 2' 25 25. 26 27 ,.w 26 24 All 45-46 146-147 326 AMTUNEM2 amount of unemployment compensation 65-70 173-178 327 VALDON2 Amount of donation in kind 17-23 189-194 328 AMTINHE2 Amount inheritance 49-63 13.19 209-213 329 USEVALU2 Home value 20-24 214-218 330 IFIVINM2 R's investment in home 40-43 221-224 331 AOORNTRC2 Monthly rent received 67-71 225-229 332 AMMI'MTGE2 Amount of mortgage 37-39 230-232 .333 RNTUTIL2 Rent/mth + utilities payments 31-35 245-249 334 VALUSXD2 Value of farm house and yard 58-624 2:0254 335 MAIM= Mortgage owed on the farm 32-36 52-56 255-259 336 LANDb'LD2 Value of the farm land and building; 14-1.3 264-268 337 63 NT2 1963 farm rent 13-17 272-277 338 FA AL2 Farm net worth etc., 282-288 339 HEADING2 Income of the head of the family 17-19 etc. 
289-295  340  FAMINC2   Family income  13-19 etc.
332-338  341  JTRTINC2  Joint return income  13-19 etc.
339-345  342  WIFEINC2  Wife's income  31-37 etc.
346-352  343  GHLDINC2  Other family income  49-55
353-356  344  63FMTX2   1963 farm real estate tax  23-26
534      345  SEX       Sex, taken from ID number

http://www.ssc.wisc.edu/wais/WAIS667006.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS667006.txt

Roger Miller, 1966. Portfolio Evaluation from Wisconsin Individual Income Tax Returns: II. General File Processing, and III. Processing of Files 11 and 12 (Stage One). March 31, 1966. WAIS paper 656-051. Analysis.

Roger F. Miller
WAIS Paper 656-051
March 31, 1966

Portfolio Evaluation from Wisconsin Individual Income Tax Returns
II. General File Processing, and III. Processing of Files 11 and 12 (Stage One)

II. General File Processing

A. File Flows

(1) One could envision reading all the data from all Files into the computer at once and performing all processing at once, but this is obviously impractical due to tape drive and core limitations as well as programmer limitations. On the other hand, there seems to be little necessity for going to the other extreme and ever processing one file by itself. Even if more than two Files are actually processed at once, however, it seems desirable to keep the descriptions simple, and this is best accomplished by dealing with only two files at a time (plus a listing of what other files have in them).

(2) The processing can be described by the following stages, which also describe the input and output Files of each stage (the records included in output Files are those that come from the given input files):

(a) Stage one:
Inputs: Files 11 and 12, and lists of persons in Files 13 and 14
Outputs:
File 31: All records of persons properly only in 11
File 32: All records of persons properly only in 11 and 13
File 33: All records of persons properly only in 11, 13 and 14
File 34: Integrated records of persons only in both 11 and 12
File 35: Integrated records of persons only in 11, 12 and 13
File 36: Integrated records of persons in all of 11, 12, 13 and 14
File 37: A list of persons in 14 but not in 13
File 38: All records of persons in 12 but not in 11 for some year
File 39: All records of persons not in 12, but who have sources in 11 indicating they should also be in 12, for some year

(b) Stage two:
Inputs: Files 13 and 14, and lists of persons in Files 32, 33, 35 and 36
Outputs:
File 51: All records of persons only in File 13
File 52: All records of persons only in 13 and 32
File 53: All records of persons only in 13 and 35
File 54: Integrated records of persons only in 13, 14 and 33
File 55: Integrated records of persons only in 13, 14 and 36
File 56: Integrated records of persons only in 13 and 14
File 57: Records of persons only in File 14

(c) Stage three:
Inputs: Pairwise, Files 32 and 52, 35 and 53, 33 and 54, 36 and 55
Outputs:
File 71: Integrated records of Files 32 and 52
File 72: Integrated records of Files 35 and 53
File 73: Integrated records of Files 33 and 54
File 74: Integrated records of Files 36 and 55

(d) Stage four:
Inputs: Files 23, 24 and 25
Output: File 41: Integrated records of input files

(e) Stage five:
Inputs: Files 34, 38, 51, 57, and 71-74, and a listing of asset identifications on File 41
Outputs:
File 42: By asset identification, all data from the input files on assets which are also in 41
File 43: By asset identification, all data from the input files on assets not also in 41

(f) Stage six:
Inputs: Files 41 and 42
Output: File 61: File 41 with added data from File 42

(g) Stage seven:
Inputs: Files 21 and 22, and a listing of asset identifications in File 43
Outputs:
File 62: Integrated records for assets in 21, 22 and 43
File 63: Records for assets only in 21 and 43
File 64: Records for assets only in 22 and 43

(h) Stage eight:
Inputs: Files 42 and 61
Output: File 81: By taxpayer identification, all data on File 42 with additions based on File 61

(i) Stage nine:
Inputs: File 43 and, successively, Files 62, 63 and 64
Outputs:
File 82: Like 81, data from 43, additions based on 62
File 83: Like 81, data from 43, additions based on 63
File 84: Like 81, data from 43, additions based on 64

(j) Stage ten:
Inputs: Files 34, 51, 71-74, and 81-84
Output: File 91: By taxpayer I.D., integrated data from all input files (the analysis file)
(k) Stage eleven:
Input: File 91
Outputs: Crosstabs allowing comparisons with other sources as a check on the quality of the final output.

(3) Comments which may clarify the above:
(a) As laid out, there is very little leeway for changing the ordering of the stages.
(b) It appears that the procedures will be tapebound because of the many output files at some Stages, but this can be handled by using a format which will accept several of the output files on the same output tape, each record having a tag indicating the file to which the record belongs. The files may then be separated by a simple sort. For example: in Stage one Files 31-33 and 57 could be written on one tape, and 34-36 and 38 on another tape. Also, hopefully some output files will be empty, namely 37-39, 57 and 64.
(c) Any output in Files 37-39 and 57 should be investigated for correction and retry.
(d) One of the considerations involved in the above scheme is to minimize read and write times. Notice: Stage one splits off File 31, which never reenters at a later stage; Stage seven drops the data in 21 and 22, which is of no further use in subsequent stages; etc. It is possible that a more efficient scheme can be devised.

B. General Outline of the Procedure at Each Stage
(1) Read in the relevant records for the Stage. (In Stage one it is the records of one person from Files 11 and 12 and an indication if he is on the input listings from 13, 14.)
(2) Perform intra-record checks and flag errors, gaps, etc.
(3) Perform intra-file checks and flag errors, gaps, etc.
(4) Perform inter-file checks and flag errors, gaps, etc.
(5) Perform any calculations called for at this Stage.
(6) Assign reliability ratings if called for at this Stage. (Generally, retention of the flags from (2)-(4) above will suffice until Stage five.)
(7) Assign to an output File.
(8) Increment counters.
(9) Write on output tape.

C.
Outline of Work to Follow
(1) In the sections which follow I will present the detailed specifications for Steps B(2) - B(6) above, and comments on any special aspects of the other steps, for each Stage described in A(2) above.
(2) Step B(7) should be clearly specified already from the description of the Outputs in A(2) above.
(3) Step B(8) involves more than simply counting the number of records in each file. In addition we need, by File, counts on the various error and gap plugging flags.
(4) Step B(9) in general is simply a copying operation for most Stages, merely writing out all the information that was read in Step B(1) plus the results of the manipulations in Steps B(2) - B(7).
(5) Warning: operations of the sorting type may be called for between Stages in order to prepare the output of one Stage for use as an input in a subsequent Stage.
(6) In many cases, the specifications which follow will outline some work for the staff to fill in, and a summary of references to such places in the specifications will be given at the end of each section.
(7) Before Stage One, Files 12-14 should be checked to make sure the I.D.'s are correct and up to date.
(8) Preparation of Files 23-25 should be undertaken soon enough so that they will be ready by the time we are ready to run Stage Four.

III. Processing of Files 11 and 12 (Stage One)

A. Step (2): Intra-Record Checks
(1) File 11: this has already been performed for this File, but much of the program will probably have to be embodied in this Step again because of Step (4) below.
(2) File 12: The main thing that can be done here is to check for completeness. The presence of data in certain fields implies there should also be data in certain other fields, and if that data is missing the field and record should be flagged to so indicate.
(a) An obvious example occurs when all we had for class A Real Estate was the Net Rent Received, and this was positive so that only the Gross Rent Received field is punched and there is no minus sign to tell us this is really a Net figure. The other fields pertaining to this property will be blank. A new field for Net as opposed to Gross Rent Received could be created, the data entered there, and a flag attached to the field indicating that no supporting detail is present. This will be necessary for the work in Step (4) below.
(b) Not all such blank fields are indications of a real lack of data, but some are. Clearly if Gains or Losses on Sales of Assets has anything in it, certain fields necessarily should have data: Cost of the Asset is an example. However, fields such as Subsequent Improvements on the Property may be properly blank even if the asset type indicates it is Real Estate.
(c) Obviously what is needed here is for one of the staff to sit down with our Phase II coding and keypunching manuals and make a detailed list of the possibilities so that we can distinguish them.
(d) The following checks are trivial but should be made in any event:
(i) Where applicable, are both asset type and asset ID complete and within the range of possible numbers?
(ii) For Gains and Losses, is the Year Asset Sold the same as the Year of Return?
(iii) Is the Net Gain or Loss connected with the Sales of Assets equal to the sum of Amount Received from the sale and Depreciation in Prior Years, minus the sum of Cost of the Asset and Subsequent Improvements on the Property, within the "rounding error limits" described more fully in C.(1)(a) below? (In making this check, watch for Net Losses that were recorded as Net Gains and make the correction if necessary.)

B. Step (3): Intra-File Checks
(1) These are distinct from Steps (2) and (4) in that they do not consider data from other files but they do consider data from other records in the same File.
In this case, they are concerned with inter-year checking of the records of an individual.
(2) File 11: Some checking has already been performed, as in the husband-wife program. Other inter-year checks for this File and also on File 12 are better performed jointly in Step (4).

C. Step (4): Inter-File Checks
(1) Within Year Checks: The data in 12 are really supporting data for the corresponding Sources fields in 11. A number of possibilities occur.
(a) Correct item: data in 12 are present and complete enough to verify the total in the corresponding Source field in 11. This verification is done by manipulating the data in 12 with a program which duplicates the steps performed on the tax form to yield the corresponding Source entry on 11. A Source is considered verified if it agrees with the result from manipulating 12 within the "rounding" error limits occasioned by our having dropped the cents in keypunching. For example, the total of the interest and dividends fields in 12 must be less than the sum of the two fields for interest and dividends in 11 by as much as n dollars (where n is the number of distinct interest and dividend items in 12) before the discrepancy is considered "significant", but the sum from 12 need only be 2 dollars greater than the sum in 11 before it is considered "significant" (in the sense of indicating a real error). Similar "rounding" tolerance limits may be evolved for the other types of income except for the Business and Professional or the Farm income Sources. For them "exact" verification is not possible because gross receipts and total expenses were not recorded. However, even for these two items, we should at least check for the presence of data in the corresponding locations of the Files. In the case of Rental Income or Gains and Losses, care must be taken to take proper account of dropping cents in an item which is negative or to be subtracted in arriving at the net figures.
For example, it appears to me that for each asset in the Gain or Loss part of 12, the net Gain amount in 12 must be too large by 2 dollars or too small by 3 dollars to be significant. One of our staff should make a complete table of such critical levels of discrepancies. Watch for possible sign discrepancies on Net Rent.
(b) Correctable item: Given that a "significant" discrepancy is discovered, it may be correctable:
(i) If there is a Sum of Sources error in File 11, it may be removable (or brought within the rounding tolerances) by substituting the comparable sum from 12. Such correction should follow checking all sources for discrepancies and take all such discrepancies into account singly, jointly, etc. to get the combination yielding the closest agreement. Also watch for cases where changing the signs on Net Business or Farm incomes in 11 will now remove a discrepancy which it would not do before in the correction program for File 11 alone.
(ii) Equal and offsetting discrepancies may be found if the only taxpayer error was to switch the lines on which data was to be entered in the Source fields for File 11. Then we must reverse the switch. In checking for this type of error the program should consider all of the Source fields in 11, not just those corresponding to the type of data found in File 12.
(c) Uncorrectable discrepancies: those remaining after the measures in (b) above have been taken must simply be flagged and the amounts of the discrepancies entered in new fields for future reference. If the amounts are large, or there are too many discrepancies, the records should be printed out. Possibly there has been a faulty I.D. assigned to the records of one of the files.
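The two tolerance rules stated in (a) can be sketched as follows. This is only an illustration under stated assumptions: the function names and argument layout are hypothetical, all amounts are taken to be whole dollars with cents already dropped, and the gain-or-loss thresholds are the ones the memo offers tentatively ("it appears to me"), not a finished table of critical levels.

```python
def interest_dividend_check(items_12, interest_11, dividends_11):
    """Compare File 12 interest/dividend items with the two File 11 Source fields.

    Hypothetical helper. items_12 is a list of whole-dollar items from File 12;
    interest_11 and dividends_11 are the whole-dollar Source fields in File 11.
    Returns (difference, significant).
    """
    n = len(items_12)
    diff = sum(items_12) - (interest_11 + dividends_11)
    # Rule from the text: the File 12 sum must fall short by as much as n dollars,
    # or exceed the File 11 sum by 2 dollars, before the discrepancy counts.
    significant = diff != 0 and (diff <= -n or diff >= 2)
    return diff, significant


def gain_check(net_gain, received, depreciation, cost, improvements):
    """Check one asset's recorded Net Gain against its supporting fields.

    Hypothetical helper. The expected figure follows A.(2)(d)(iii):
    (Amount Received + Depreciation in Prior Years)
    - (Cost of the Asset + Subsequent Improvements).
    """
    expected = (received + depreciation) - (cost + improvements)
    diff = net_gain - expected
    # Tentative rule from the text: too large by 2 dollars or too small by 3.
    significant = diff >= 2 or diff <= -3
    return diff, significant
```

A discrepancy flagged by either function would then be passed to the correction attempts in (b) before being treated as uncorrectable under (c).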
(d) Missing records: If either 12 contains data for a given year but 11 has no entries in the Source fields corresponding to 12*/, or if there are corresponding Source entries in 11 and 12 lacks a record for that year, then this should be flagged for use in Step (7), since those persons' records should be assigned to Files 38 or 39 respectively. Notice that it is this condition for any one year that is sufficient to require such assignment, although if nothing can be done about it we may wish to resubmit the records. These cases should be few but are important to track down and possibly correct. (Up to now we have had no check on the completeness of File 12.)
(2) Inter-Year Checks: There are no "exact" checks here beyond those already carried out on File 11, the results of which should be flagged. What can be done is limited and perhaps postponable to a later Stage.
-------------
*/ The ultimate extreme case of this would be when this year's record is missing from File 11 altogether.
-------------
(a) Clearly we want to note additional record gaps (an intermediate year's record is missing in both files), and should create a pair of dummy records for later use with a flag indicating the records are dummies. As output of this Stage these records would merely have the taxpayer's I.D. and the year indicated (in addition to the flag).
(b) For use in assigning reliability ratings in a later Stage we should flag cases where the data indicate a possible error, although at this Stage it is not possible to make a final determination as to whether the error is real or not. Two cases are fairly clear (can the staff suggest others?):
(i) In some year the taxpayer reports the sale of a property whose asset type indicates it is probably income producing property, but the taxpayer has not reported income from this asset in years during which the asset was held (as indicated by purchase and sale dates).
(ii) A taxpayer ceases reporting income from an asset for which he had been reporting a sizeable amount of income (perhaps more than 50 dollars?) in preceding years, but he does not report any sale of the asset and the asset type indicates it is salable (not savings deposits, etc.).

D. Other Steps
(1) Step (5): there are no computations called for at this Stage other than those involved in the checking process.
(2) Step (6): simply retain the flags and amounts of dollar discrepancies for use in a later Stage.
(3) Step (7): this is the only place where the listings of Files 13 and 14 need be consulted.
(4) Step (8): it is desirable to count flags in such a way that we get individual totals for the different files. In addition we should also count certain joint flags. Those from C.(2)(b) above are a case in point: (i) is a probable case of non-reporting of income, while (ii) is a probable case of non-reporting of a gain or loss. However, the intersection cannot occur for a single asset, so that a person whose records get both of these flags very likely has had the I.D.'s of the assets assigned inconsistently by the coders. The staff should investigate other possibly meaningful intersections to count.

E. Other Comments on Stage One
(1) At present File 11 has a fixed record length but File 12 is on card image tape and the actual records are of variable length. Can we work in future Stages with variable length records? Even in Stage one this will be a problem. In reserving core for reading in both Files it will be necessary to allow for the maximum record length. Do we know what this is likely to be? How much tape will be "wasted" if we use fixed record lengths in the output files that have data from File 12? Geffert should try to answer these questions at once. If these problems are too severe, the entire procedure I am using may have to be revised rather radically.
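The two questions raised in E.(1), the maximum record length and the tape "wasted" by padding, could be answered from a simple tally of File 12 record lengths. The sketch below is illustrative only: the function name is hypothetical, and the sample lengths (multiples of an 80-character card image) are made up, not taken from File 12.

```python
def fixed_length_waste(record_lengths):
    """Estimate padding if every record is written at the maximum length.

    Hypothetical helper. record_lengths is a list of actual record lengths
    in characters. Returns (max_length, wasted_characters, wasted_fraction).
    """
    if not record_lengths:
        raise ValueError("at least one record length is required")
    max_length = max(record_lengths)
    total_fixed = max_length * len(record_lengths)
    wasted = total_fixed - sum(record_lengths)
    return max_length, wasted, wasted / total_fixed


# Made-up example: four persons with 1, 2, 3 and 1 card images each.
max_length, wasted, fraction = fixed_length_waste([80, 160, 240, 80])
print(max_length, wasted, round(fraction, 3))
```

The maximum length also gives the amount of core that would have to be reserved for reading one File 12 record.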
(2) References for staff assignments
(a) Definite assignments are called for by II.C.(7), (8), III.A.(2)(c), C.(1)(a) (last sentence), D.(4) and E.(1).
(b) Possible additional staff attention is called for by II.C.(5), III.A.(2)(d)(iii), and C.(2)(b).

http://www.ssc.wisc.edu/wais/WAIS656051.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656051.txt

Gene Moyer, 1966. Tables from the Survey. May 5, 1966. WAIS paper 656-061. Survey Data and File Tables.

Gene Moyer
WAIS paper 656-061
May 5, 1966

Tables from the Survey

I. General Considerations

The basic reason for constructing these tables is to make up a report to respondents. As respondents become more and more insistent, this becomes more and more important. In addition, however, these tables serve at least two other purposes. They allow us to do some elementary validation of the data by comparing Lorenz Curves with Federal and State tax Lorenz Curves, and they allow us to see what the data look like. In order to achieve these results, we are mainly constructing two or three dimension tables. After we see these tables, we hopefully may be able to specify more complex tables for use in publications from the data.

The tables in this paper are divided into the following categories:
II. Tables for Lorenz Curves
III. Tables to Indicate Portfolio Optimality
IV. Tables on the Education of Respondents
V. Tables on the Occupations of Respondents
VI. Tables on the Protection of Respondents Against the Future
VII. Tables on Intergenerational Transfers
VIII. Tables on the Homes of Non-Farm Respondents
IX. Tables on the Experience of Farmers in the Sample

These categories do not exhaust the list of categories in the survey, and they are not mutually exclusive. Still they represent a start on the processes of understanding the survey data and of publishing the results. The numbers in parentheses after each variable indicate the position of the variable in the extract which is Appendix A.
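As an illustration of how the position numbers in parentheses might be used, the sketch below pulls a variable out of one extract record. It assumes, and this is an assumption rather than something the paper states, that positions are 1-based, inclusive character positions in a fixed-width record; the helper names and the sample record are hypothetical.

```python
def field(record, first, last):
    """Return the characters at 1-based, inclusive positions first..last."""
    return record[first - 1:last]


def as_int(text):
    """Convert a numeric field to int, treating an all-blank field as missing."""
    text = text.strip()
    return int(text) if text else None


# Made-up record: blank except for a value placed at positions 282-288,
# the positions cited for Head's income in the table specifications.
record = " " * 281 + "0012500" + " " * 77

heads_income = as_int(field(record, 282, 288))
wifes_income = as_int(field(record, 340, 346))  # blank in this made-up record
print(heads_income, wifes_income)
```

Returning None for a blank field keeps missing values distinct from a reported zero when the tables are built.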
These tables should be run on X-Tab even though they are only two or three dimensional tables. The reasons for this are that chi-square and various statistics are available from this program and that we may want some N dimensional tables at a later date. By constructing an extract compatible with X-Tab and the CDC machines we can get this. In addition, graduate school funds will help finance the construction of the tables.

Unless otherwise specified, the brackets on income will be (lower bounds) (-infinity; 3,000; 5,000; 7,000; 10,000; 15,000; 25,000). The brackets on net worth are predetermined from the survey itself.

II. Tables for Lorenz Curves

A. Federal Concept
1 Column: Joint return income (333-339) (negative; 0; 1,000; 2,000; 3,000; 4,000; 5,000; 6,000; 7,000; 8,000; 9,000; 10,000; 15,000; 25,000; 50,000; 50,001)
Row: Other family members income (347-353) (same categories as above)
Associated variable: Amount of joint return income
Associated variable: Amount of other family members income

B. State Concept
1 Column: Head's income (282-288)
Row: Wife's income (340-346)
Associated variable: Amount of Head's income
Associated variable: Amount of Wife's income

III. Tables to Indicate Portfolio Optimality

A. Uses of Windfall Income
Page: Family Income Class (289-295)
1 Column: First use of windfall income (11-12); Row: First reason (17-18)
2 Column: Second use (13-14); Row: Second reason (19-20)
3 Column: Third use (15-16); Row: Third reason (21-22)
4-6 [Repeat these three tables using Page: Net worth class (304)]

B.
Portfolio Quality
Page: Net worth class (304)
Column: Speculative, Conservative, other (44)
1 Row: Ideal quality for a portfolio (68)
2 Row: Second most important quality (69)
3 Row: Least important quality (70)
4 Row: First use of windfall income (11-12)
5 Row: Satisfaction with distribution (45)
6 Row: First reason R does not plan to change (53-54)
7 Row: Second reason R does not plan to change (55-56)
8 Row: Title of first mentioned manager (305)
9 Row: Title of second mentioned manager (306)

IV. Tables on the Education of Respondents
Page: Place reared (29-31) (000-799; 800-998; 999)
Column: Education level (71)
1 Row: Income of head (282-288)
2 Row: Industry of R's occupation (116-117)
3 Row: Occupation of R (114-115)
4 Row: Number of relatives helped (181)
5 Row: Education of father (95)
6 Row: Occupation of father (96)
7 Row: Number of children (103-104)
8 Row: State in which R's college is located (76-77)
9 Row: Place wife reared (363-365)
10 Row: Education of wife (105-106)

V. Tables on the Occupations of Respondents

A. Occupation and Work History
Column: Occupation (114-115)
1 Row: Industry (116-117)
2 Row: Occupation of wife (108-109)
3 Row: Occupation of R's father (98-99)
4 Row: Number of employees responsible to R (118)
5 Row: Hourly, salaried? (119)
6 Row: Number of important jobs R has had (130-131)
7 Row: Most important job R has had (132-133)
8 Row: Number of companies R has worked for (136)
9 Row: Have moves helped R? (137)
10 Row: Frequency of unemployment (145)
11 Row: Number of part-time jobs since 1959 (152)
12 Row: Hours per week on first part-time job (155-156)
13 Row: Number of relatives helped (181)
14 Row: Number of rooms in home (205-206)
15 Row: Present value of home (209-213)
16 Row: Family income (289-295)
17 Row: Head's income (282-288)
18 Row: Hours worked per week (124-125)
19 Row: First use of windfall income (11-12)
20 Row: Second use of windfall income (13-14)
21 Row: Third use of windfall income (15-16)

B.
Industry and Work History
Page: Occupation (114)
Column: Industry of occupation (116-117)
1 Row: Industry of most important occupation (134-135)
2 Row: Number of important jobs R has had (132-133)
3 Row: Number of companies R has worked for (136)
4 Row: Have job changes helped R? (137)
5 Row: Frequency of unemployment (145)
6 Row: Weekly compensation amounts (Lower bounds: blank; 0; $25; $50; $75; 99)

C. Occupation and Income Stability
Page: Family income (289-295)
Column: Occupation (114-115)
1 Row: First person who worked while R was unemployed (148)
2 Row: Second person (149)
3 Row: Third person (150)
4 Row: Fourth person (151)
5 Row: Probable sign of income change, 1964-1968 (296)
6 Row: First reason (297-298)
7 Row: Second reason (299-300)
8 Row: Third reason (301-302)
9 Row: Probable amount of income change (303)
10 Row: Probability of job change (157)
11 Row: First reason (158-159)
12 Row: Second reason (160-161)
13 Row: Third reason (162-163)
14 Row: First mentioned attraction of new job (164-165)

VI. Tables on the Protection of Respondents Against an Uncertain Future
Page: Family income
Column: Net worth class
1 Row: Covered by SS or RR retirement? (166)
2 Row: Does health insurance pay income payments? (168)
3 Row: Percentage of premiums employer pays
4 Row: Length of time job sick-pay would continue (170)
5 Row: How is retirement program financed? (172)

VII. Tables on Intergenerational Transfers

A. The Incidence and Amount of Inheritance
Page: Net worth class (304)
Column: Number of inheritances received
1 Row: Year latest inheritance received
2 Associated variable: Mean amount of inheritance received since 1949

B. The Relationship of Inheritance to Asset Holdings and to Occupation
Page: Family income (289-295)
Column: Total amount of inheritance received since 1949 (Lower bounds: -infinity; 0; 100; 1,000; 2,000; 3,000; 5,000; 10,000; 15,000; NA).
1 Row: Present value of house (209-213)
2 Row: Value of farm house and yard (245-249)
3 Row: Net worth class (304)
4 Row: Occupation (114-115)
5 Row: Occupation of father (98-99)
6 Row: Probable sign of income change (296)
7 Row: Age (85-86)
8 Row: Likelihood of receiving a gift or inheritance (186)
9 Row: Number of trusts to family (183-184)
10 Row: Number of gifts to family (185)

VIII. Tables on the Homes of Respondents

A. Satisfaction with Home
Page: Family income (289-295)
Column: Satisfaction with home (198)
1 Row: First reason for satisfaction (199-200)
2 Row: Second reason for satisfaction (201-202)
3 Row: Third reason for satisfaction (203-204)
4 Row: Year acquired home (207-208)
5 Row: Number of units in home (219-220)
6 Row: Value of home (209-213)
7 Row: Monthly rent plus utilities (230-232)
8 Row: How is it R neither owns nor rents? (234)

B. The Capital Gains in Owned Housing
Page: Family income (289-295)
Page: Year acquired home (207-208)
Page: Number of rooms (205-206)
Column: Value of home (209-213)
1 Row: Investment in home (214-218)
2 Associated variable: Value of home (209-213)
3 Associated variable: Amount of mortgage (225-229)

C. Rental of Furnished and Unfurnished Housing
Page: Does R rent furnished? (233)
Column: Family income (289-295)
Row: Monthly rent plus utilities (230-232) (Lower bounds: blank; 0; $50; $75; $100; $125; $150; $200; $250; NA)
Associated variable: Mean amount of monthly rent plus utilities (230-232)

IX. Tables on the Experience of Farmers in the Sample

A. The Value of the Farm
Page: Net worth class (304)
Column: Market value of land and buildings (255-259) (Lower bounds: blank; 0; 1,000; 5,000; 10,000; 20,000; 30,000; 100,000; 200,000; NA)
1 Row: Market value of house and yard (245-249) (Lower bounds: as above)
2 Row: Type of ownership (244)
3 Row: Current acres (241-243) (Lower bounds: blank; 0; 50; 100; 150; 200; 250; 300; 500; 997; 999)
4 Row: Acres rented from others (260-262) (Lower bounds: as above)

B.
Farm Management Practices Page: Family income (289-295) Column: Farm assets less debts (272-277) (Lower bounds: as IXA1) 1 Row: Early, late adapter (278) 2 Row: Value of recommendations from experiment station (279) 3 Row: Value of county agents advice (280) 4 Row: Use of experiment station bulletins (281) C. Farm Property Taxes Page: 1963 property taxes (354-357) (Lower bounds: blank; 0; 100; 200; 300; 500; 1,000; 9,997; 9,999) Column: Number of acres under Woodland Tax Law (358-359) Row: Number of acres under Forest Crop Tax Law (360-362) (Lower bounds for (both) above blank; 0; 5; 10; 20; 25; 30; 50; 100; 200; 300; 500; 997; 999) APPENDIX AFormat for Extract from Survey Columns Tape Position 5-12 1-8 20 9 22 10 13-18 11-16 19-24 17-22 25-33 23-31 34-45 32-43 46-60 44-58 13-21 59-67 47-49 68-70 13-26 71-84 27-28 85-86 29-32 87-90 36-38 91-93 42-44 94-96 45-47 97-99 48-50 100-102 13-14 103-104 13-15 105-107 35-37 108-110 45 111 13 112 14 113 15- 6 114-115 21-22 116-117 24 118 25 119 26-29 120-123 38-43 124-139 52-53 130-131 Number of Record Position Number Data 8 8 9 Identification number Race 02 1 03 6 9 12 15 04 14 2 4 3 3 3 10 Sex 16 Uses of Windfall 22 Reasons for using windfall this way 31 Assets owned 43 Reasons for owning assets 58 Optimality of portfolio 67 Assets never owned 70 Ideal quality of portfolio second, least important 84 Education of R 86 Age of R 90 Place R grew up (with size indication) 93 Place mother now living 96 Place father now living 99 Father's education and occupation 05 06 2 102 Marital status 104 Number of children 107 Education of wife 110 Employment status of wife 111 Employment tenure of wife 112 Employment status 113 Reason unemployed 115 Occupation 117 Industry of occupation 118 Number of employees 119 Hourly, salaried? 123 Overtime payments? 
129 Hours worked per week 131 Number of important jobs R has had 15 16 17 (All Cards) 13-14 132-133 2 133 Most important job R has had 15-16 134-135 2 135 Industry of most important job 21 136 1 136 Number of companies R has worked for 22 137 1 137 Have moves helped R? 28-34 138-144 7 144 Job opportunities R looked for 35 145 1 145 Frequency of unemployment 45-46 146-147 2 147 Weekly compensation amount 57-60 148-151 4 151 Persons who worked while R unemployed 61 152 1 152 Number of part-time jobs since 1959 64-65 153-154 2 154 Weeks worked at first part-time job in 1963 66-67 155-156 2 156 Hours per week at first part-time job in 1963 13-19 157-163 7 163 Probability of job change 20-21 164-165 2 165 Attraction of new job (1st mention) 26 166 1 166 Covered by SS or RR retirement? 28-33 167-172 6 172 Health insurance, other protection 65-71 173-179 7 179 Value of 1963 donations, in kind with indication of sign of gain 78 180 1 180 Was there a tax advantage? 13 181 1 181 Number of relatives helped 38 182 1 182 Number of persons who had lived with R 66-67 183-184 2 184 Number of trusts to family 68 185 1 185 Number of gifts 71 186 1 186 Likelihood of receiving gift or inheritance 13 187 187 Letter of first transfer payment received 25 188 1 188 Letter of second transfer payment received 17-23 189-194 6 194 Amount of inheritance received since 1949 80 195 1 195 Number of inheritance received since 1949 31-32 196-197 2 197 Year latest received 13-19 198-204 7 204 Satisfaction with home and reasons 20-21 205-206 2 206 Number of rooms 0 22 13-14 207-208 208 Year bought home 15-19 209-213 5 213 Present value of home 20-24+ 214-218 5 218 R's investment in home 26-30 38-39 219-220 2 220 Number of units in home 40-43 221-224 4 224 Amount of monthly rent received 67-71+1 23 225-229 5 229 Amount of mortgage 31-35 37-39+ 230-232 3 232 Monthly rent plus utilities 40-42 44 233 1 233 Does R rent furnished? 45 234 1 234 How does R neither own nor rent? 
24 13 235 1 235 Method of acquiring farm 15-16 236-237 2 237 Year acquired 17-19 238-240 3 240 Original acres 20-22 241-243 3 243 Current acres 28 244 1 244 Ownership status 0 31-35 245-249 5 249 Market value of house and yard 2 58-62+ 32-36 250-254 5 254 Current mortgage balance 52-56 255-259 5 259 Value of land and buildings 59-61 260-262 3 262 Number of acres rented from others 26 13 263 1 263 Method of paying for land rent 14-18 264-268 5 268 Amount paid in 1963 20-21 269-270 2 270 Proportion of income which was R's share 35 271 I 271 How does it happen R neither owns nor rents? 14 27 13-17+ 19-23+ 25-29+ (25) 52-56 272-277 6 277 Value of farm assets less debts -(24) 58-62 -(25) 32-36 - (27) 31-35 49-52 278-281 4 281 Management questions 13-19+ 282-288 288 Head's income 28 22-28 4 13-19+ 22-28+ 31-37+ 289-295 7 295 Family income 40-46+ 49-55+ 58-64 29 13 296 1 296 Probable sign of income change 14-19 297-302 6 302 Reasons 20 303 303 Amount 21 304 1 304 Net worth class 23-24 305-306 2 306 Title of asset managers 25-26 307-308 2 308 Job of managers 29 29 309 1 309 Wifes asset value 30 310 1 310 Amount of fixed return investment 31 311 1 311 Amount of common or preferred stock 32 312 1 312 Holdings in a business 33 313 1 313 Number of parcels of real estate 13 314 1 314 Number of TV's owned 20 315 1 315 Clothes dryer? 23 316 1 316 Clothes washer? 26 317 1 317 Dishwasher 30 318 1 318 Room air conditioner (number) 35 319 1 319 Central air conditioner 38 320 1 320 Freezer 41 321 1 321 Hifi or stereo? 44 322 2 322 Number of cars and trucks 31 13-14 323-324 2 324 Description of first named boat or motor 48 33-39 325-331 7 331 Use of tax cut 28 13-19+ 22-28+ 332-338 7 338 "Joint return" income 31-37+ 40-46 31-37+ 7 345 Wife's income 339-345 40-46 49-55+ 346-352 7 352 Other family members income 58-64 24 23-26 353-356 356 1963 Farm R.E. 
taxes 25 38-39 357-358 358 Acres under Woodland Tax Law 40-42 359-361 361 Acres under Forest Crop Tax Law 06 18-21 362-365 4 365 Place wife reared (with size indication) 15 69-70 366-367 2 367 Number of trusts established by R 368 1 368 (record mark)

http://www.ssc.wisc.edu/wais/WAIS656061.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656061.txt

Bill Gates, 1971. Format: Identification Control File. August 13, 1971. WAIS paper 712-004. Fixed Format Identification File (FFID).

Bill Gates
WAIS 712-004
August 13, 1971

FORMAT: IDENTIFICATION CONTROL FILE

MODE: BCD.
RECORD SIZE: 192 characters (24 words).
MEDIA: B5500 disk and tape.
BLOCKING FACTOR: 5 (120 words).
STRUCTURE: Nine independent file segments sequential by identification number.
FILE SIZE: 30146 records.

Data        Field                              Field  Field
Descriptor  Name                               Size   Position
SC          Source code                        1      1
ID          WAIS-ID                            8      2-9
SS          SS#                                9      10-18
LNAME       Last name                          17     19-35
TITLE       Title                              3      36-38
FNAME       First name                         13     39-51
MIDNAM      Middle name                        12     52-63
STBOX       Street or box #                    10     64-73
RRRFD       RR or RFD                          3      74-76
STNAME      Street name                        17     77-93
STCLAS      Street class                       4      94-97
POCITY      Post office (city)                 17     98-114
ZIP         Zip code                           5      115-119
CNTY        County code                        2      120-121
TBIRTH      Birth date (MMYYYY) see BIRDIS     6      122-127
DEATH       Death date (YYMMDD)                6      128-133
SSS         SS status                          1      134
LNS         Last name status                   1      135
MCH         Match of FFID's                    1      136
MSS         Multiple SS #'s                    1      137
MID         Multiple ID's                      1      138
PMAS        Presence MASTER                    1      139
PPROP       Presence PROPERTY                  1      140
PE805       Presence E805                      1      141
PBEN        Presence BENEFIT                   1      142
PSURV       Presence SURVEY                    1      143
ID2         WAIS-ID (primary or secondary)     8      144-151
ID3         WAIS-ID (secondary)                8      152-159
ID4         WAIS-ID (secondary)                8      160-167
SS2         Secondary SS#                      9      168-176
UPDATE      Record update status               2      177-178
EBIRTH      Birth data from E805 (MMYYYY)      6      179-184
BIRDIS      Birth data discrepancy             1      185

Expansion: 192 - 185 = 7 characters

Field Interpretation:

Descriptor  Values
SC          0 - 1960-1964 data
            1 - 1960-1964 no wife returns
            2 - 1960-1964 all returns missing
            3 -
1960-1964 residuals
            4 - 1946-1959 I or N
            5 - 1962 tax roll (old SC = J)
            6 - BENEFIT (old SC = b)

TBIRTH      The first two characters are the month of birth (may be "99") and the last four characters are the year of birth. This field represents a consolidation of E805 birth data and tax birth data (see BIRDIS).

Descriptor  Values
SSS         0 - OK
            1 - New FFID incorrect (implies 60-64 master incorrect)
            2 - Old FFID incorrect
            3 - Both incorrect
LNS         Values are the same as SSS
MCH         0 - Both FFID files
            1 - New only
            2 - Old only
MSS         0 - One SS#
            1 - No SS#
            2 - Multiple SS#'s
MID         0 - ID is primary
            1 - Multiple ID's and ID is primary with ID2 secondary
            2 - ID is a valid secondary and ID2 is primary
            3 - ID is an invalid secondary and ID2 is primary
BIRDIS      1 - Difference LEQ 1 YEAR (TBIRTH <- EBIRTH)
            2 - Difference LEQ 3 YEARS (TBIRTH <- EBIRTH)
            3 - Difference GTR 3 YEARS (TBIRTH <- self)
            4 - TBIRTH = data, EBIRTH = 990000 (TBIRTH <- self)
            5 - 129999 < TBIRTH < 990000, EBIRTH = data (TBIRTH <- EBIRTH)
            6 - TBIRTH = 990000, EBIRTH = data (TBIRTH <- EBIRTH)
            7 - 129999 < TBIRTH < 990000, EBIRTH = 990000 (TBIRTH <- self)
            8 - TBIRTH = EBIRTH = 990000 (TBIRTH <- 990000)

Note: Values 1-3, both have birth data. Values 4-7, there is some birth data indication. Value 8, absolutely no indication.
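The field positions in the format listing lend themselves to a fixed-width reader. What follows is a minimal Python sketch under stated assumptions, not the original B5500 program: the names `FFID_LAYOUT`, `parse_ffid`, `split_tbirth` and `build_demo_record` are illustrative, the demo values are invented, and only the positions (1-based, inclusive) are taken from the listing.

```python
# Sketch only: positions are transcribed from the FFID format listing
# (1-based, inclusive). All function names here are hypothetical.
RECORD_SIZE = 192  # characters per record, per the format header

FFID_LAYOUT = [
    ("SC", 1, 1), ("ID", 2, 9), ("SS", 10, 18), ("LNAME", 19, 35),
    ("TITLE", 36, 38), ("FNAME", 39, 51), ("MIDNAM", 52, 63),
    ("STBOX", 64, 73), ("RRRFD", 74, 76), ("STNAME", 77, 93),
    ("STCLAS", 94, 97), ("POCITY", 98, 114), ("ZIP", 115, 119),
    ("CNTY", 120, 121), ("TBIRTH", 122, 127), ("DEATH", 128, 133),
    ("SSS", 134, 134), ("LNS", 135, 135), ("MCH", 136, 136),
    ("MSS", 137, 137), ("MID", 138, 138), ("PMAS", 139, 139),
    ("PPROP", 140, 140), ("PE805", 141, 141), ("PBEN", 142, 142),
    ("PSURV", 143, 143), ("ID2", 144, 151), ("ID3", 152, 159),
    ("ID4", 160, 167), ("SS2", 168, 176), ("UPDATE", 177, 178),
    ("EBIRTH", 179, 184), ("BIRDIS", 185, 185),
]


def parse_ffid(record):
    """Slice one 192-character record into a dict keyed by descriptor."""
    if len(record) != RECORD_SIZE:
        raise ValueError(f"expected {RECORD_SIZE} characters, got {len(record)}")
    return {name: record[first - 1:last] for name, first, last in FFID_LAYOUT}


def split_tbirth(tbirth):
    """Return (month, year): first two and last four characters of TBIRTH."""
    return tbirth[:2], tbirth[2:]


def build_demo_record(values):
    """Build a blank-padded record from a few field values (test data only)."""
    chars = [" "] * RECORD_SIZE
    for name, first, last in FFID_LAYOUT:
        width = last - first + 1
        chars[first - 1:last] = values.get(name, "").ljust(width)[:width]
    return "".join(chars)


demo = build_demo_record({"SC": "0", "ID": "00012345", "LNAME": "DOE",
                          "TBIRTH": "071921", "MID": "0"})
parsed = parse_ffid(demo)
print(parsed["ID"], parsed["LNAME"].strip(), split_tbirth(parsed["TBIRTH"]))
```

Keeping the layout as data rather than as hard-coded slices means the seven expansion characters (186-192) can be assigned later by appending entries to the list.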
Presence Indicators:

Descriptor  Values
PMAS        Blank - No comparison made
            0 - Comparison made (no match)
            1 - Comparison and match
PPROP       Same as PMAS
PE805       0 - No age and no E805 record
            1 - E805 record present
            2 - Age data record
            3 - Age data and E805 (E805 record)
            9 - E805 and FFID nonmatch
PBEN        Blank - No comparison made
            0 - No BENEFIT
            1 - BENEFIT
PSURV       Blank - No comparison made
            0 - No SURVEY
            1 - SURVEY non-respondent (coversheet only)
            2 - SURVEY respondent
            3 - SURVEY substitute who subsequently became a non-respondent (coversheet only)
            4 - SURVEY substitute who subsequently became a respondent
            5 - Original SURVEY sample member who was replaced by a substitute (no coversheet)

http://www.ssc.wisc.edu/wais/WAIS712004.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS712004.txt

Bill Gates, 1971. Documentation: GATES/10PER - Ten Percent FFID Sample [executed July 27-28]. August 13, 1971. WAIS paper 712-005. Fixed Format Identification File (FFID).

http://www.ssc.wisc.edu/wais/WAIS712005.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS712005.txt

Some of the data in this document contains confidential information. The confidential information has been removed and preserved in the secure data archive (OLDR). Contact CDHA Data Archivist at cdhadata@ssc.wisc.edu for more information.

Bill Gates
WAIS 712-005
August 13, 1971

Documentation: GATES/10PER - Ten Percent FFID Sample [executed July 27-28]

It was the purpose of this program to use the nine full groups of the FFID file as the basis for drawing a random sample of ten percent of the households in the WAIS name group-cluster sample. The general requirements were as follows:

1) extract 1 in 10 households counting only primary FFID records (MID = 0,1).
2) extract all wives where MID = 0,1 of the wife.
3) exclude children, 7X records, households in the 1000s, and secondary records (MID = 2,3).
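The first requirement above can be illustrated with a short Python sketch. It is an assumption-laden simplification, not a reconstruction of GATES/10PER: it treats every primary record (MID = 0 or 1) as one household, takes every tenth one from a random start, and leaves out the rules for wives, children, 7X records and households in the 1000s, whose encoding is not given here. The function name `one_in_ten` and the record dictionaries are hypothetical.

```python
import random


def one_in_ten(records, interval=10, rng=None):
    """Systematic 1-in-`interval` sample of primary records (MID 0 or 1).

    Assumption: each primary record stands for one household. The paper's
    other exclusion rules are not modelled in this sketch.
    """
    rng = rng or random.Random()
    primaries = [r for r in records if r["MID"] in ("0", "1")]
    start = rng.randrange(interval)  # one random start for this file segment
    return primaries[start::interval]


# Invented test data: 200 records cycling through MID values 0-3.
segment = [{"ID": f"{i:08d}", "MID": str(i % 4)} for i in range(200)]
sample = one_in_ten(segment, rng=random.Random(1))
print(len(sample), [r["ID"] for r in sample[:3]])
```

Calling the function once per file segment, each time with a fresh start, would correspond to one independent start per segment.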
This was accomplished using the B-5500 disk files FFID/23JUN1-9 with nine independent starts creating nine separate output samples. In addition all the sampled ID's have been listed along with their MID status (0,1).* Also, extract files are full record extract, i.e., we have all fields such as PSURV, PE805, TBIRTH, EBIRTH, etc. The breakdown on record counts is listed below by name group sections:

section   count of records   drawn from (MID = 0, 1, 2, 3)
01-08     325                3649 (3650)**
09-16     340                3622 (3626)
17-22     283                3212
23-27     370                4035 (4037)
28-33     353                3798 (3801)
34-39     289                3204 (3206)
40-43     271                2916
44-47     242                2683 (2685)
48-70     271                3012*** (3013)
Total     2744

*Filed with production listing of program.
**Counts after Dave Dicken's updates.
***Actually 3013 records except that the last two records duplicate each other (the very last will be deleted).

Some additional comments are in order. These disk files have been written as multiple-files to single reels where the primary tape is U7150 (300 ft) and backup is U4747 (300 ft) with 5500 labels. This will call for 3 EOF's to be handled per file, e.g.,

label-1a   file 1   label-1b   label-2a   data   label-2b   ...
EOF        EOF      EOF        EOF        EOF    EOF
A          B        C          D          E      F

I am not exactly certain labels C and D both occur but at least one of them does and we can easily determine if they both do when we utilize the 1108.

An additional question is raised by the occurrence of ******** three times. These records have been examined and they are exact duplicates all with three secondary IDs (********, ********, ********). A complete scan of the listings revealed no other such happenings. However, some recollection of past events revealed that this individual had been affected by survey files ID changes so that the duplicates were most likely introduced then. In addition careful retracing of record counts leads one to the same conclusion. It was determined that his wife's and the next individual's records had been overlaid. They were replaced and the sample was redrawn.
Finally, name group 70 should have been excluded but was not, so that the last section 48-70 contains seven records that may be deleted. Neither of the above two problems will have any impact on our processing plans.

son).
3. The exposure counter value for the last shot of the day.

Example Daily Microfilm Log

Name: I. M. Afilmer

Date      Time   Counter   Folder
4-12-64   8:00   28-1498   Grump, Alfred X.
4-12-64   5:00   29-4318   Heartburn, Janet E.
4-13-64   8:00   29-4319   Heartburn, Kermit J.
4-13-64   5:00   30-2097   Irwin, Publish R.

(b) Reel Log

The Reel Log is the most important record you will keep. BE CAREFUL in filling it out because it must agree exactly with what is actually on the reel. At the beginning of each reel of film enter the following information in your Reel Log.

1. The Reel Number of the new reel and the date.
2. The exposure counter of the first shot.
3. The name from the first folder you will shoot on this reel.

If you should reach the end of a name group in the middle of a reel, enter the following information in your Reel Log.

1. The Reel Number of this reel and the date.
2. The exposure counter of the last shot for the finished name group.
3. The name from the last folder shot.
4. The name from the first folder in the next name group.
5. The exposure counter value for the first shot in the next name group.

Example Reel Log

Reel #   Date      Name Group, Count   Name
1        4-19-66   01001               B******, A*** A.
1        4-19-66   01896               B***, C**** N. -1956
2        4-19-66   02001               B***, C**** N. -1956
2        4-20-66   02385               C*****, A*****

6. The accuracy of the exposure counter is especially important. Be sure that the counter is changed properly as each new reel is used.

(c) Film Record

When you finish a roll of film, mark the first name group and the last name group contained on that roll on the box. We will want to read these on a reader when they have been developed.

(d) Name Cluster Record

On the sheet containing the name group designations record the order in which your group microfilmed them. e.g.
- if B***** was first, number it 1; if you had to skip B**** for some reason, and B**** was second, number B**** 2, and give B**** the order number in which you finally microfilm it. The number will help us to identify the roll of film which contains each name group.

4. Questions

If you have any questions about the procedures outlined in this paper, or need additional information, call the writers at 262-3122 or 262-1981 or Martin David at 262-5831.

Daily Microfilm Log
Enter first folder and last folder processed daily
Microfilm Machine #
Date   Time   Counter #   Folder Name and Date
Microfilm Operator

Reel Log
Enter first year record and last year record on each reel. When you finish filming a name group enter:
(1) Reel, date, counter # and name of last record in the group.
(2) Reel, date, counter # and name of first record in new group.
(3) Be sure to adjust the counter for each new name group.
Reel #   Date   Counter #   Folder Name   Year of record
Microfilmer

File Log
Surname   First name in group   Last name in group   Record Location   Drawer or Shelf #   Date Pulled   Beginning Name & Initial   Ending Name & Initial   Date Returned   If not pulled explain reason

http://www.ssc.wisc.edu/wais/WAIS656056.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656056.txt

Roger Miller, 1966. Computations of Income and Deductions for the Averaging Tables. March 17, 1966. WAIS paper 656-047. Averaging Studies.

Roger F. Miller
WAIS Paper 656-047
March 1966

Computations of Income and Deductions For the Averaging Tables

I. Summary

A. The original treatment of negative capital gains in (5)(b)(ii) of 645-052 was correct.
B. The suggested revisions of the computations of deductions in 656-041 are deceptive in their plausibility and in some cases may introduce bias into the results.
C. The suggested revisions of the treatment of exemptions in 656-041 are incorrect.
D. Other comments on and corrections to 656-041 and 656-042.

II. Argumentation

A. Treatment of negative items of income in reconstruction.
References: Statute: Section 1304, Paragraph (c), Subparagraphs (1)-(3). Proposed Ruling: Section 1.1304-4, (b), (2).

(1) I read the statement in the rules as saying that when two persons who did not file jointly in the base year are required to reconstruct their incomes for that base year, then "...an item of income of any individual may not be decreased by a loss incurred by the individual with whom he combines his income...".

(2) The example immediately following in the Rules applies this to a case involving a negative capital gain. This is the authority, just "rediscovered", for (5)(b)(ii) in 645-052.

(3) The section also applies to two other "adjustments to taxable income": (i) Foreign and possessions income; and (ii) Income attributable to gifts etc. -- See Rules: Section 1.1302-3, Paragraph (c). However, our input to the program does not have separate observations on these variables.

(4) The action described in the aforementioned (5)(b)(ii) should perhaps be extended to all capital gains as reported, not just when reconstructing income: see Rules: Section 1.1302-4, Paragraph (a), last sentence. At the time of writing 645-052 I felt this would be a much bigger step than that in (5)(b)(ii) and would raise the nasty question of separating short term from long term gains and losses in much greater degree than its use in (5)(b)(ii). In my hasty letter of November 12, 1965 replying to Duchan's urgent plea for a quick justification or reference, I inadvertently applied my recollection of this judgement to the question of why (5)(b)(ii)'s treatment of capital losses was included, rather than to the more relevant question as to why this treatment was not extended to all capital losses. Mea Culpa.

B. The revised treatment of computation of deduction.

(1) I regret that I was not consulted on these revisions, because I think there is a serious possibility they bias the results. Whether or not they do so is an empirical question which may or may not be easily answered.
The surface plausibility of these revisions may prove ephemeral.

(2) The shift to the use of a concept allowing Federal Standard Deductions may buy only spurious precision. I did not take this step in 645-052 because we had made the decision to take the Wisconsin Deductions as they were. It is very true that anyone would take at least the Federal Standard Deduction. They cannot do worse. However, we have not adjusted (how can we?) for all other differences in income and itemized deduction concepts. Suppose that the ratio of G to N in the State concept as a percentage of the ratio of G to N in the Federal concept is a constant (or a zero mean random variable) cross sectionally for any given level of gross income, regardless of whether the standard or itemized deductions are used. Then to move to allow the Federal Standard Deduction would distort the results, as some persons in the same gross income class would have their net incomes artificially increased. This is a systematic, not a random change, and it is not clear that it would not bias the results. The simplest case of this occurs when a person has heavy deductions and therefore itemizes during his base years, and then has virtually no itemizable deductions and a rise of income in the computation year. If he is near the border of averageability (note we use a flexible border) then a shift to the Federal Standard Deduction in the computation year may drop him under the border. It would be interesting to explore this question further, and I suggest we might wish to run the tables both ways to see if it makes any difference. Possibly the refinement may be an improvement, but still may not be worth the extra effort involved.

(3) All of the preceding was written prior to the receipt of Moyer's 656-042. Table A below is derived from his Table 1. The discrepancy in the Federal number of returns may be due to these returns having had so little income that no deductions of any sort were taken.
If so, they probably correspond more to the State short forms and should be added to the Standard Deduction number of returns. This would reduce the per return Federal figures on lines 7 and 10 and increase the ratios on those lines substantially: the ratio on line 7 to about 0.85, the ratio on line 10 to about 0.70. Now the ratio of 0.70 to 0.85 is approximately equal to the ratio of 0.59 to 0.73. It appears to me that this establishes a prima facie case for the hypothesized equality of ratios in (2) above, and against trying to push the deductions closer to the Federal Standard.

---------------
TABLE A
1962 Income and Deductions of Wisconsin per Return, and Number of Returns, Both State and Federal.*

Line                                                      State       Federal     Ratio of State to Federal
1   Total Number of Returns                               1,733,743   1,407,472   1.23
2   Number of Returns with Itemized Deductions            563,745     493,951     1.14
3   Number of Returns with Standard Deductions            1,169,998   638,480     1.83
4   Line 1 - Lines (2+3)**                                0           275,041     ?
5   AGI per Return - All Returns                          4,280       5,270       0.81
6   AGI per Return - Itemized Deduction Returns           5,914       8,053       0.73
7   AGI per Return - Standard Deduction Returns           3,478       5,787       0.64
8   Deductions per Return - All Returns                   524         791         0.66
9   Deductions per Return - Itemized Deduction Returns    915         1,557       0.59
10  Deductions per Return - Standard Deduction Returns    336         539         0.612

*Source: Table 1 in WAIS paper 656-042
**This discrepancy raises questions in my mind about the validity of all the contents of the Source, and about the computations in the table above. The per return figures use the number of returns in lines 1-3 above without any attempt to allocate the discrepancy.
---------------

(4) Moyer and Duchan are legally correct with respect to the allocation of deductions in the reconstruction of incomes.

(a) I am fearful that there has been a misreading of the instructions in 645-052, no doubt due to some ineptness in my presentation. The difficulty seems to revolve around the successive references to precedent sections in each successive section.
In (5) one is referred back to (4), which refers one back to (3) in certain circumstances. The reason for this was to define for the programmer a sequence of loops and the conditions under which any particular case was to go through a particular loop in the program. I felt that in view of the time pressures under which we were working it was necessary to keep the succession of loops and the corresponding decision rules for their use as simple as possible unless there were compelling reasons for increasing the complexity of the treatment.

(b) Leaving aside the problem of treatment of capital gains in reconstruction discussed in A. above, the problem of the Federal Standard Deduction discussed in B. above, and the problem of the treatment of exemptions which will be discussed in C. below, the only point of difference between the Moyer-Duchan proposal and that in 645-052 is their sections 2.4.1.1 and 2.4.1.3 on pages 5 and 6, discussing the treatment of the computation year and base period years in which they are single for persons married and filing separately in the computation year. Their section 2.3 adds nothing (except an error in 2.3.1), since it doesn't hurt to run joint filers through the allocation, which only affects the Si's, and these do not appear in the argument of Bt which is redefined as max ( J1). The sole issue under discussion, therefore, is the treatment of deductions in 2.4.1.1 and in 2.4.1.3 of 656-041 as compared to that in (5)(b)(i) of 645-052.

(c) The only distinctions between 2.4.1.1 and 2.4.1.3 of 656-041 are: (i) The treatment of exemptions, the discussion of which is postponed to C. below; and (ii) the omission of 1/2 J from the arguments of Bt in 2.4.1.3. If we accept for the base years the allocation of exemptions used in 2.4.1.3, then the treatment of exemptions is identical in 656-041 and 645-052. Let S'i denote the Si defined in 2.4.1.3, for any year, and Si be the Si defined in (4)(c) of 645-052.
Then, with S'i the 2.4.1.3 definition and Si the 645-052 definition:

\[
S'_i - S_i =
\begin{cases}
D_1 & \text{if } P_1 \le 0.15 \\
D_1 P_2 - D_2 P_1 & \text{if } 0.15 < P_1 < 0.85 \\
-D_2 & \text{if } P_1 \ge 0.85
\end{cases}
\]

is the precise remaining discrepancy in how the S's are defined. It seems unlikely to me that these would be very large amounts. In the case of the extreme values of P1, the absolute value of the discrepancy involves the actual deductions of the person with the (relatively) very small AGI. A man with AGI of $85,000 with a wife having AGI of $15,000 would have his value of S understated by the amount of his wife's actual deductions, and her S would be overstated by precisely the same amount. If this is a base year, her overstatement is very unlikely to result in S > 1/2J, so that his understatement is the only relevant fact.

(d) In the intermediate range of values of P1, the discrepancy can be expressed as

\[
\frac{G_1 G_2}{G_1 + G_2}\left(\frac{D_1}{G_1} - \frac{D_2}{G_2}\right)
\]

so that empirical exact proportionality of D to G makes the discrepancy vanish. Across income classes close proportionality seems indicated by Moyer's Table 4 in 656-042. It is possible that there is a sex-linked difference, but in the absence of other information I think that we cannot refute an hypothesis which claims that variation in the ratio of D to G among independent persons is a zero mean, small variance random variable, so that for base years when the reconstructing persons are not married proportionality of D to G is a close approximation to reality.

(e) In a computation year the above considerations are not as compelling because the couple has some freedom as to whom deductions are attributed. Empirically one would expect that all or almost all deductions of any significant size are attributed to the marriage partner having the highest marginal tax rate, i.e., to the person having the highest AGI. If this is so, the lower income person has a Di approximately zero and the discrepancy in S vanishes even for the higher income person in the case of extreme values of P1.
Possibly one might gain enough through averaging to make it worthwhile to shift deductions to the lower income, lower marginal rate partner, but it seems unlikely.

(f) In the intermediate range of values of P1, in the computation year, the freedom to attribute deductions may yield a substantial discrepancy between the two definitions of S, and this seems to me to be the only case worth worrying about. Theoretically the discrepancy ranges from -0.85(D1 + D2) to +0.85(D1 + D2). However, this seems to me to be an anomaly of the law. Consider two persons each having $50,000 of separate AGI per year, and deductions of a combined total of $20,000 which they are free to manage. For 1959-1963 they attributed $10,000 to each person, so that each had a base (ignoring capital gains and exemptions) of $40,000. Suppose in 1964 H makes all the payments which are deductible; his net income becomes $30,000 while that of his wife becomes $50,000. She has an apparent increase of $10,000 of income while he has an apparent decrease of $10,000. As the law is written there is no point to their doing this, because the (1.33E + $3,000) rule means that none of the change in income is averageable. However, we are concerned partly with investigating what might happen if either part of this rule were changed, and also if negative fluctuations could be averaged. Suppose all positive fluctuations could be averaged but no negative ones. The wife above has $10,000 of averageable income. The following year the average bases become $37,500 for H and $42,500 for W, allowing them to continue this game. In order to prevent this, some rule such as the proportional allocation rule would have to be adopted.

(g) I therefore think we should retain proportional allocation of deductions in the computation year. In the only case where it is likely to make any difference, the possibility of induced behavior contrary to the intent and purpose of the law makes it the most relevant procedure.
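The two-earner example in (f) can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, proportional allocation is taken to mean splitting combined deductions in proportion to each person's AGI, and the 0.15 and 0.85 cut-offs for extreme values of P1 are deliberately left out because the example sits in the intermediate range.

```python
def net_incomes(agi, deductions):
    """Net income per person: AGI less the deductions attributed to them."""
    return [g - d for g, d in zip(agi, deductions)]


def allocate_proportionally(agi, deductions):
    """Split the couple's combined deductions in proportion to AGI.

    Assumption: this is the intermediate-range rule only; the treatment of
    extreme income shares is not modelled.
    """
    total_agi = sum(agi)
    total_deductions = sum(deductions)
    return [total_deductions * g / total_agi for g in agi]


agi = [50_000, 50_000]          # H and W, separate AGI in 1964
claimed_1964 = [20_000, 0]      # H makes all the deductible payments

as_claimed = net_incomes(agi, claimed_1964)
as_allocated = net_incomes(agi, allocate_proportionally(agi, claimed_1964))

print(as_claimed)    # net incomes when deductions follow who paid them
print(as_allocated)  # net incomes under proportional allocation
```

With deductions attributed as claimed, the nets are $30,000 and $50,000 against bases of $40,000; under proportional allocation both stay at $40,000, so neither partner shows a fluctuation.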
(h) These same considerations justify retaining the 1/2 J in the argument of Bt. In the above example, the husband would have $10,000 of negative fluctuation in 1964 which would be so tabulated by us, unless we retain this feature. On the face of it, the proportional allocation of deductions and exemptions prevents this. But suppose the husband and wife are President and Board Chairman of a closely held corporation and artificially vary their AGI by annually switching salaries. The variation in NTI is induced via AGI in this case. Do we wish to tabulate such a negative fluctuation that is offset by the marriage partner's positive fluctuation? If averaging of negative fluctuations is ever allowed, surely such a possibility would be blocked, and this feature would probably do so effectively because it would positively prevent the switching of income among husband and wife from resulting in a decreased tax liability.

(i) The above arguments do not gainsay the fact that this particular feature of the Moyer-Duchan proposal is closer to the law as it is written than the original specifications. All that is questioned is the relevance of the results. I imagine that distinct runs, one with each procedure, would be virtually identical in tabular output. I would not wish to see the original procedure abandoned lightly, but I have no objection to separate runs with both procedures provided this can be afforded, in terms of both money and time.

C. The revised treatment of exemptions

(1) The revisions in 656-041 are inconsistent and erroneous, as nearly as I can determine. There is no justification for using E1 in place of max (E1, E1). See sections (5) and (8) in my 645-050, and the last paragraph on page 2 of 656-041. The use of E1 for each person results in double counting of exemptions to the extent of min (E1, E2) in 656-041.
(2) From the standpoint of married persons filing separately, handing exemptions back and forth may operate in a manner similar to handing deductions back and forth. For the reasons set out in B.(4)(e) - (h) above, I think retention of the allocation procedure is desirable for exemptions too.

D. Other comments and corrections:

(1) In 656-041, Section 2.3.1, the quantity -- (C1 + C2) is omitted from the expression for J. When the remarks in C. above are also taken into account, there should be a period following the word "correct" in 2.3.1, and nothing more.

(2) The third row of Table II in 656-041 is incorrectly labelled. It should say "Not Applicable" in column one, "Same as Table I" in column four, and "Same as married joint return base period year in Table I" spread over columns two and three. It will be noted that there will then be three small blocks and one double block in Table II indicating the places where Moyer and Duchan recommend revision. One block (Row 1, Col. 4) involves only the exemptions change. One small block and the double block involve only the allocation of deductions and exemptions, and the remaining small block also involves the question of including 1/2J in the argument of Bt. These four blocks, involving five of the distinct circumstances possible, do not require distinct program loops in the original procedure. The recommended changes would require five new loops and decision functions. The only one that should be seriously considered, I believe, is the one in the fifth row, third column.

(3) Moyer's 656-042 needs reworking. The discrepancy on number of Federal returns (see Table A above) requires explanation if not resolution. More fundamental is the problem of explaining too much.
It is possible that, say, 1,000 persons file returns that in aggregate have the same total AGI and deductions on State and Federal returns; then a few additional State returns would make the State aggregate totals different from the Federal totals, but there may be no real discrepancy to explain. The most desirable procedure would separate the returns into four groups: State-only filers, Federal-only filers, and the State and Federal returns of those who filed both. Perhaps the AGI and deductions of the "only" filers could be estimated separately, but I doubt that it is feasible. Failing this, however, we still should be concerned with explaining the differences in average total deductions. These differences are relatively greater than the differences in the aggregates. On page four Moyer says that the difference in deductibility of certain taxes accounts for "almost the entire amount of the difference in Federal and State itemized deduction amounts." Actually, the maximum this could account for is $170,881 of a total of $253,075, only about 2/3 of the difference.
On a per return basis, however, this accounts for only $346 of a $642 difference, or less than 54%.

http://www.ssc.wisc.edu/wais/WAIS656047.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS656047.txt

Gene Moyer, 1965. Format of 1962 State Tax Roll Records. June 7, 1965. WAIS paper 645-069. Data - 1962 (State Tax Roll Records) Formats.

Gene Moyer
WAIS 645-069
June 7, 1965

Format of 1962 State Tax Roll Records

Columns   Number of columns   Data
1-18      18    Last name of taxpayer
19-29     11    First name and middle initial
30-38      9    Social Security number
39-61     23    Street address
62-78     17    Post office
79         1    "X" if post office is in Wisconsin
                "R" if post office is outside Wisconsin
80         1    Type of return to be sent to the taxpayer for 1963 income year
81-84      4    Tax district
85-86      2    County of residence code (same as that used by WAIS)
87-93      7    Adjusted Gross Income (in whole dollars)
94-100     7    Net Taxable Income (in whole dollars)
101        1    "1" if taxpayer used 10% standard deduction
                "2" if taxpayer itemized deductions
102-106    5    Amount of exemption (last two columns for cents)
107-114    8    Net Normal Tax (last two columns for cents)
115-122    8    Tax paid to other states (last two columns for cents)
123-124    2    Month tax year ended
125-126    2    Tax year
                [Note: If a "00" appears in columns 125-126, the remaining computations were "blocked," i.e. not done. This code was used when there was some reason to believe that regular computations (on the computer) would result in an error message. Persons who filed 1961 returns after June 30, 1962 compose the bulk of these "00" records.]
127        1    Type of return filed
                blank = regular return
                1 = amended return
                2 = tentative return
                3 = final return
128        1    "Manner" of filing return
                blank = regular
                1 = late with tax liability
                2 = late without tax liability
129        1    Did the amount of deduction equal 10% of AGI for those persons with a 1 in column 101?
                blank = yes
                1 = no
130-137    8    Amount of tax withheld (last two columns for cents)
138-145    8    Amount of tax paid by declaration (last two columns for cents)
146-153    8    Amount of the balance of the tax liability (last two columns for cents)
154        1    Code for amount of the balance
                0 = no tax
                1 = tax due
                2 = refund payable
                3 = refund not issued by machine
155-161    7    Validation number
                [Note: This number is significant for our purposes because each return was given a validation number. Therefore each husband and wife who filed a single return have the same validation number. Husbands and wives can be identified by this number.]
162-169    8    Amount of tax paid (last two columns for cents)
170-176    7    Delinquency charges (last two columns for cents)
177-183    7    Adjustments (last two columns for cents)
184        1    Code for adjustments
                1 = debit
                2 = credit
185-191    7    (For prior delinquents only) transfer of credit
192        1    Code for adjusted balance
                0 = no tax
                1 = tax due
                2 = refund payable
                3 = refund not issued by machine
193-199    7    Adjusted balance (last two columns for cents)
200        1    Was an extension granted for this person?
                blank = no
                1 = yes
201        1    Coded "B" if spouse is filing a return on the same tax return blank
202        1    (record mark)

http://www.ssc.wisc.edu/wais/WAIS645069.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645069.txt

Gene Moyer, 1965. Objectives of WAIS and of this Thesis. June 28, 1965. WAIS paper 645-075. Administration Miscellaneous Proposals - For Analyses, Theses, etc.

**This document could not be translated to basic text. Please view the PDF file.**

**Terms and topics from paper, listed for searching purposes**

I. Objectives of WAIS and of This Thesis permanent file of the data II.
The Design of the Income Tax Sample Wisconsin Department of Taxation The returns covered a five year time span archives exact last name group sizes name groups tax roll books Table I Comparison of Name Groups Produced by Different Criteria 1958 tax roll books wives of men in the name groups photographing of returns The Units of Analysis microfilm Recordak Company of Chicago sorting returns identification number family unit relationships of household members Data Processing System error check and correction Table II Number of Errors Discovered in Master File by Kind of Error Type of error V. A Comparison of the Population of Wisconsin State Income Tax Filers with other State Popultaions PIt = Wisconsin Personal Income in year t C59 = Total income earned by Wisconsin residents in 1959 FAGIt = Adjusted Gross Income of Wisconsin residents SAGIt SNTIt = Net Taxable Income The Income of Wisconsin Residents as Reported by Four Sources of Income Data in 1962. Table I Some Relationships Among Four Estimates of Individuals' Income in Wisconsin The Income of Wisconsin by Source: Personal Income and Adjusted Gross Income Table II Source of Personal Income and Federal Adjusted Gross Income in Wisconsin during 1959 Table III Definitions of Income: Personal Income, Wisconsin State Tax Department, U.S. Internal Revenue Service, Bureau of the Census, 1959. 
Table IV Sources of Personal Income and Federal Adjusted Gross Income in Wisconsin during 1959: Adjusted to Approximate the State Income Tax Definition The Income of Wisconsin: Four Estimates Adjusted to the State Income Tax Definition The Distribution of Income among Taxpayers and Income Recipients in Wisconsin Table VI State Returns and State Adjusted Gross Income by Net Taxable Income Class, 1962 Table VII Federal Returns and Federal Adjusted Gross Income of Wisconsin in 1962 by Adjusted Gross Income Classes Table VIII Persons and Families Who Reported 1959 Income to the 1960 Census by Census Income Class A Comparison of the Number of Persons in the Three Populations Table X Some Relationships Among Three Populations Of Income Recipients in Wisconsin in 1962 Appendix A Document: The Coding of the Wisconsin State Tax Forms (1946-1960) 1. (2-9) The Identification Number 2. (399-407) Social Security Number 3A. Years Filed 4, 5. Name and Address of Taxpayer 6. Residence Location 7. (15-16) County Prior Year 8. (17) Address Change 9. (18-19) Occupation Code 10. (20) Occupation Change 11. (21) Return Filed in Previous Year? 12. (22) Partnership Name Given? 13. (23) Spouse's Name Given? Does spouse have separate income? 14. (24) Was the taxpayer married during the tax year? Were marriage details given? 15. (25) Does the taxpayer claim the "Head of Family" exemption? 16. (26-27) Number of Dependents 17, 18, 19. Non-Calendar year; Gift Information III. The Coding of Income from Interest, Dividends, Rent, and Capital Gains Class A Property code sheet Rent Income Appendix II Instructions for Coding Fixed Format ID Cards Appendix III Code Sheet for Taxpayers Not Using Printed Tax Form for Interest or Dividends Code Sheet for Taxpayers Not Using Printed Tax Form for Capital Gains New Outline of Thesis Chapters Special Problems in WAIS Data A. Long Time Series B. 
The Method of Choosing the Original Name Groups Appendix 1 A Document on Coding B Document on Keypunching C Formats of Major WAIS Tape Files D Document on Programming E Current Status of the Benefit File Processing F Interview Schedule, Booklet, and Instructions G Data in the Social Security Account File. IV Supplementary Data Social Security Account File The Social Security Benefit File State Tax Population Files State Summary Statistics Victor M. Cassidy WAIS Paper 645-071 June 10, 1965 A List of Tables in the Wisconsin Summary Statistics The Property Income File Stock Price and Dividend File The 1960-1965 Tax Record File The Donations File The High School Graduates File Additional Future Fileshahttp://www.ssc.wisc.edu/wais/WAIS645075.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645075.txt> "= < Gene Moyer 19652+The Coding of the Wisconsin State Tax FormstFebruary 25, 1965  WAIS paper645-038a"Formats Medical Expense DataJC**This document could not be translated to basic text. Please view the PDF file.** Some of the data in this document contains confidential information. The confidential information has been removed and preserved in the secure data archive (OLDR). Contact CDHA Data Archivist at cdhadata@ssc.wisc.edu for more information.**Terms and topics from paper, listed for searching purposes** Gene Moyer WAIS 645-038a February 25, 1965 1st Revision Document: the Coding of the Wisconsin State Tax Forms (1946-1960) I. The Integration Process II. The Coding "code sheet" 1. The Identification Number name cluster Doomage 2. Social Security Number 3A. Years Filed 4, 5. Name and Address of Taxpayer 6. Residence location 7. County Prior Year 8. Address Change 9. Occupation Code Dictionary of Occupational titles (U.S.G.P.O.) 10. Occupation Change 11. Return Filed in Previous Year 12. Partnership Name Given? 13. Spouse's Name Given? 14. Was the taxpayer married during the tax year? Were marriage details given? 15. Does the taxpayer claim "Head of Family" exemption? 
17, 18, 19. Non-Calendar year; Gift Information III. The Coding of Income from Interest, Dividends, Rent, and Capital Gains Type of Asset capital gains stocks firms Rent Income Class A Property Instructions for Coding Fixed Format ID Cards Code Sheet for Coding Fixed Format ID Cards Code Sheet for Taxpayers Not Using Printed Tax Form for Interest or Dividends Code Sheet for Taxpayers Not Using Printed Tax Form for Capital Gains

http://www.ssc.wisc.edu/wais/WAIS645038a.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645038a.txt

Gene Moyer, 1965. Tape Record for Medical Expense Data. March 2, 1965. WAIS paper 645-038. Formats Medical Expense Data.

Gene Moyer
WAIS 645-038
March 2, 1965
Draft

Tape Record for Medical Expense Data
(all amounts are rounded to the nearest dollar)

Columns   Number of columns   Variable number   Variable name
1          1     1    "M"
2-9        8     2    Our identification number
10-18      9     3    Social Security number
19-25      7     4    1947 gross income
26-32      7     5    1947 wife's gross income
33-37      5     6    1947 total medical expenses (of both husband and wife)
38-44      7     7    1948 gross income
45-51      7     8    1948 wife's gross income
52-56      5     9    1948 total medical expense
57-63      7    10    1949 gross income
64-70      7    11    1949 wife's gross income
71-75      5    12    1949 total medical expense
76-82      7    13    1950 gross income
83-89      7    14    1950 wife's gross income
90-94      5    15    1950 total medical expense
95-101     7    16    1951 gross income
102-108    7    17    1951 wife's gross income
109-113    5    18    1951 total medical expense
114-120    7    19    1952 gross income
121-127    7    20    1952 wife's gross income
128-132    5    21    1952 total medical expense
133-139    7    22    1953 gross income
140-146    7    23    1953 wife's gross income
147-151    5    24    1953 total medical expense
152-158    7    25    1954 gross income
159-165    7    26    1954 wife's gross income
166-170    5    27    1954 total medical expense
171-177    7    28    1955 gross income
178-184    7    29    1955 wife's gross income
185-189    5    30    1955 total medical expenses
190-196    7    31    1956 gross income
197-203    7    32    1956 wife's gross income
204-208    5    33    1956 total medical expense
209-215    7    34    1957 gross income
216-222    7    35    1957 wife's gross income
223-227    5    36    1957 total medical expense
228-234    7    37    1958 gross income
235-241    7    38    1958 wife's gross income
242-246    5    39    1958 total medical expense
247-253    7    40    1959 gross income
254-260    7    41    1959 wife's gross income
261-265    5    42    1959 amount paid to doctors
266-270    5    43    1959 amount paid to dentists
271-275    5    44    1959 amount paid to druggists
276-280    5    45    1959 amount paid to not ascertained
                      N.B.: if the coder cannot ascertain the proper category of a person to whom money for medical care had been paid, the amount paid to that person will be placed in variable 45.
281-285    5    46    1959 amount paid to hospitals
286-290    5    47    1959 health insurance premium
291-295    5    48    1959 amount paid to others (make out card)
296        1    49    Record mark

http://www.ssc.wisc.edu/wais/WAIS645038.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645038.txt

Gene Moyer, 1965. The Weighting of the Interview Data for the Report to Respondents. March 2, 1965. WAIS paper 645-039. Survey Data and File.

Gene Moyer
WAIS 645-039
March 2, 1965
First Revision

The Weighting of the Interview Data for the Report to Respondents

Because of the need to report our findings to respondents as quickly as possible, some "short cut" weighting system is necessary. Two possible methods are available to us. One is to make use of ~Bj, the a priori percentage at which the jth interview was sampled. While the true percentage rate at which this jth interview was sampled (Bj) is different from ~Bj, because of errors in matching, (Bj - ~Bj) is probably very near zero. If there were no non-response bias, then (~Bj)^-1 would probably be an adequate weight for the jth interview.
There is, however, some evidence that there is a non-response bias in the distribution of the 1962 incomes of our respondents. Let

yi = the number of persons in the ith 1962 income bracket who were chosen in the sample of 2069,
Xi = the number of actual respondents in the ith 1962 income bracket,
Zi = the number of non-respondents (for any reason) in the ith 1962 income bracket.

Values of Xi, Zi, and yi are given in the following table:

1962 Income Bracket (H)     1     2     3     4     5     6     7     8     9    Sum
Xi                         34   113   304   357   176    71   113    12     8   1188
Zi                         15    91   196   151    73    37    64    13     3    643
Sum = yi                   49   204   500   508   249   108   177    25    11   1831

Since there is some non-response bias but it is not systematic, a second term in the weight for each interview is necessary so that the interviews will in general "represent" the income distribution of the original sample of 2069 persons. A weight, then, for the jth interview of Wj = (yi/Xi)(Bj)^-1 should insure that our statistics will in general reflect the experience of all Wisconsin taxpayers. So that these weights can be constructed easily, cards with the following format will soon be punched on the computer:

Columns   Number of columns   Variable number   Variable name
1          1    1    W
2-9        8    2    Taxpayer identification number
10-13      4    3    Serial of interview
14         1    4    Was a booklet returned? 1 = yes, 0 = no
15-17      3    5    yi
18-20      3    6    Xi
21-23      3    7    Sampling rate (a priori)
24-32      9    8    Key
33-80     48    -    Blank

If there were no variation in the response rates, we would expect ( ) and ( ). To test the null hypothesis that ( ) we compute ( ) where ( ). The hypothesis that ( ) is rejected at the .01 significance level and response - income bias seems almost certain. This result gives rise to the question of whether the bias is at all systematic. One test of systematic bias is to calculate the ~Bi in a regression of the form X/y = ~B0 + ~B1 H + ~B2 H^2 + e. If the R^2 (squared multiple correlation coefficient) from a regression such as this is high, the null hypothesis of no relationship can be rejected.
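The weight Wj = (yi/Xi)(Bj)^-1 defined above can be sketched in a few lines of Python using the bracket counts from the table. This is only an illustration under stated assumptions, not the program used by WAIS: the function name interview_weight is hypothetical, and the sampling rate is taken as a fraction (for example 0.05) although the memo speaks of a percentage.

```python
# Bracket counts copied from the table above (income brackets 1-9).
X = [34, 113, 304, 357, 176, 71, 113, 12, 8]   # respondents, X_i
Z = [15, 91, 196, 151, 73, 37, 64, 13, 3]      # non-respondents, Z_i
Y = [x + z for x, z in zip(X, Z)]              # persons sampled, y_i = X_i + Z_i


def interview_weight(bracket: int, sampling_rate: float) -> float:
    """Return W_j = (y_i / X_i) * (1 / B_j) for an interview in bracket i.

    bracket is 1-based. sampling_rate is the a priori rate expressed as a
    fraction; that convention is an assumption of this sketch.
    """
    if not 1 <= bracket <= len(X):
        raise ValueError("bracket must be between 1 and 9")
    if sampling_rate <= 0:
        raise ValueError("sampling_rate must be positive")
    i = bracket - 1
    return (Y[i] / X[i]) / sampling_rate


# Non-response adjustment factor y_i / X_i for each bracket.
adjustment = [y / x for y, x in zip(Y, X)]
```

The adjustment factor is largest for bracket 8 (25/12), which has the lowest response rate in the table, so interviews from that bracket are weighted up the most.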
For this regression R^2 = .00235, yielding an F statistic for the whole relationship (H and H^2) of .0001238. Since the tabled value for F with 2 and 7 degrees of freedom is 4.74, the null hypothesis that {B1 = B2 = 0} is accepted and we would conclude that the bias is not systematic.

http://www.ssc.wisc.edu/wais/WAIS645039.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645039.txt

Wynn Bussmann, 1968. The Supplementary Firm Data File for Van's 260 Sample. July 19, 1968. WAIS paper 689-003. Property File.

Wynn V. Bussmann
WAIS 689-003
19 July 1968

The Supplementary Firm Data File for Van's 260 Sample

1. Origin

In April, 1968, the author took a random sample of 260 taxpayers from the WAIS survey. Data from this 260 Sample will be used to analyse corporate stock portfolios of the taxpayers in the sample. Interest and dividend data (old Property File card 1, new Property File card 1) and capital gains data (old Property File card 4, new Property File card 2) were extracted from the old Property File (presently on tape 443 at DPLS and tape 454 (U7125) at UWCC) and then reformatted to agree with the formats in the supplement to WAIS 678-031. These tasks were accomplished through the use of Mark Lieberman's extract and reformat programs, LIEBS/HYBRIDB for card 1 and LIEBS/HYBRID for card 2. The interest, dividends, and capital gains data for the 260 Sample were then extracted onto tape and printed out as code sheets (described in WAIS 678-055) using Mark Wilde's programs, PROPRTY/LIST1 and PROPRTY/LIST2.

In the meantime, WAIS coders had coded onto coding sheets the interest, dividends, and capital gains data for the 260 Sample for the years 1960-1964, which are presently being keypunched. The author then hand extracted from these and from the printouts of the 1946-1960 data for the 260 Sample a list of assets that are not on the Lorie-Fisher tapes (LFF).
During May, 1968, he obtained monthly price and dividend data on 162 (42.4%) of the 382 dividend-producing assets not in LFF but owned by members of the 260 Sample. These monthly price and dividend data were obtained from Standard and Poor's Security Owner's Stock Guide at the Schaffner Library of Northwestern University, 339 E. Chicago Avenue, Chicago, Illinois.

2. Format

The format for the Stock Guide data (S & P Data) and the codes are as follows:

Table 1 -- Format and Codes for S & P Data

Columns   # of columns   Datum and comments
1-2        2    Asset type. See (the supplement to) WAIS 678-031, pp. 47-48, for a complete list of asset types.
3-6        4    Asset identification number
7          1    Blank
8          1    Code:
                0      Signifies the beginning (along with a 1 in col. 50) or the end (along with a 2 in col. 50) of a span of months for which the particular asset is not found in the Stock Guide (see lines 1 and 2 of the sample codes, p. 14)
                1      The card contains a price datum only (see line 3, p. 14)
                2      The card contains a dividend datum only (see line 4, p. 14)
                3      The card contains a stock dividend datum only (see line 5, p. 14)
                4      The card contains a stock split datum only (see line 6, p. 14)
                5      The card contains a capital gains distribution only, usually from a mutual fund (see line 7, p. 14)
                6      As of the month on the card, the stock does not appear subsequently in the Stock Guide, although someone in the 260 Sample owned it after that date (see line 8, p. 14)
                8      As of the month on the card, the stock does not appear in the Stock Guide, although someone in the 260 Sample owned it before that date (see line 9, p. 14)
                7(a)   Followed by a date and the A.I.C. of another firm, signifies a merger with that other firm, the other firm's A.I.C. being the surviving A.I.C. (see line 10, p. 14)
                7(b)   Followed by a date and two A.I.C.'s, signifies the merger of those firms whose A.I.C.'s follow the date, the A.I.C. in cols. 1-6 being the surviving A.I.C. (see line 11, p. 14)
                9      Not used
9          1    Blank
10-11      2    Month. 99 implies not ascertained.
12-13      2    Day. 00 implies end of month; 88 implies sum of payments for the year (see line 12, p. 14); 99 implies not ascertained.
14-15      2    Year. 99 implies not ascertained.
16         1    Blank
17-23      7    Price. Prices quoted were month-end closing, if the stock was traded on an exchange; otherwise, bid prices were coded unless the asked price was the only price quoted, in which case the asked price was coded. Decimal points were coded in col. 20; prices quoted in 16ths were rounded to the nearest even number; e.g., 16-3/16 was coded as 016.188; leading and trailing zeros were coded if needed.
24         1    Blank
25-32      8    Dividend or capital gains distribution. Decimal points were coded in col. 27; leading and trailing zeros were coded if needed.
33         1    Blank
34-40      7    Stock dividend. Decimal points were coded in col. 37; leading and trailing zeros were coded if needed. The datum coded was in percent; e.g., a 50% stock dividend was coded 050.000. 888.888 implies that stock of another company was paid; 999.999 implies not ascertained.
41         1    Blank
42-46      5    Stock split. Slashes were coded in col. 44; the split was always reduced (expanded) to the lowest whole numerator and denominator; e.g., 2-1/2 for 1 was coded 05/02.
47-49      3    Code. See Table 2, pp. 6-10, for the codes used.
50         1    Code:
                1      Signifies the beginning of a missing span (see explanation for code 0 in col. 8)
                2      Signifies the end of a missing span (see explanation for code 0 in col. 8)
                3      Not used
                4      Includes extras
                5      Paid so far during the year. This code is not used with code 88 in cols. 12-13. See line 13, p. 14.
                6      Codes 4 and 5
                7      Codes 4 and 8
                8      Payment date is in year following year of ex-date. That is, if the card contains a dividend datum, then the date on the card refers to the date on which an investor must own the stock in order to receive payment of the dividend. Whenever ex-date is unavailable, record date was coded; if record date was unavailable, then payment date was coded. Thus, a taxpayer could own a stock on the ex-date, say Dec. 20, 1955, but may not receive payment until, say, Jan. 20, 1956. See line 14, p. 14 for the coding of this information.
51-55      5    Alphabetic abbreviation of the firm's name
56-80     25    Blank

3. Further Explanation of Codes

Table 2 -- Codes Used in Cols. 47-49, in Order of Code Number

Code Meaning 01 Also stock 02 Optional 03 Adjusted 04 Excludes optional dividends 05 Is in addition to another entry of the same date. This code is usually used when a mutual fund makes a dividend payment and a capital gains distribution on the same date. See lines 15-16, p. 14. 06 Code 30 and code 05 07 In Canadian funds, less 15% non-resident tax 08 Taxable as ordinary income 09 Special distribution from capital gains 10 Suspicious entry 11 Excludes $0.70 payable in cash or stock 12 Excludes capital gains of $0.20 13 " " $1.11 14 " " $0.84 15 Less Wisconsin dividend tax 16 Excludes capital gains of $0.60 17 " " $0.58 18 " " $1.20 19 " " $0.097 20 " " $0.62 21 " " $0.22 22 " " $0.28 23 Excludes realized profits of $0.50 24 Excludes capital gains of $0.21 25 Excludes capital gains of $0.555 26 " " $0.55 27 Capital gains dividend payable in stock, or at holder's option, in cash 28 Excludes capital gains of $0.44 29 " " $0.81 30 Payable in stock unless cash requested 31 Payable in cash or stock 32 Excludes capital gains of $0.45 33 Optional in stock or in cash 34 Excludes capital gains of $0.14 35 Taxable as ordinary income 0.97, capital gains Jan.
dividend of $0.50 36 Excludes capital gains of $0.56 37 " $0.31 38 Excludes $0.90 optional in stock or cash 39 Excludes capital gains of $0.89 40 Excludes $0.03 from 1951 security profits 41 Excludes capital gains of $0.40 42 " " $0.17 43 " $0.09 44 " " $0.12 45 " $0.43 46 " $0.50 47 " " $0.48 48 $0.35 optional 49 Excludes capital gains of $0.75 50 " " $0.33 51 " $0.37 52 " " $0.35 53 " $0.23 54 Excludes capital gains of $0.092 55 " $0.38 56 " $0.00625 57 " " $0.54 58 " " $0.51 59 " " $0.92 60 Not used 61 Excludes security profits of $0.24 62 " " " $0.90 63 Special capital gains distribution 64 Excludes capital gains of $0.30 65 " " $0.27 66 " " $0.97 67 " " $0.47 68 Excludes capital gains 69 " ' $0.26 70 " ' $0.57 71 Excludes capital gains of $0.66 adjusted to $0.33 after 2 for 1 split 72 Excludes security profits of $0.625 adjusted to $0.3125 after stock dividend 73 Excludes capital gains of $0.64 74 " " $0.88 75 " $0.11 76 " $0.26375 77 " " $0.29 78 $1.09 79 " " $0.36 80 " " $0.15 81 " $0.61 82 " " $0.215 83 Excludes capital gains of $0.70 84 " " $0.18 85 " $0.67 86 " $1.10 87 " " $0.53 88 " $0.46 89 " $0.63 90 " $0.65 91 " $0.49 92 " $0.59 93 " $0.295 94 " " $0.10 95 " $0.175 96 Initial to the public 97 Ex Phillips - Eckardt Electronic stock 98 Less tax of origin country 99 Not used 100 Excludes capital gains of $0.13 101 " " $0.04 102 Represents 10% paid-in-capital plus surplus 103 Excludes capital gains of $0.004 104 " " $0.19 105 " $0.52 106 " $0.495 107 " $0.32 108 " " $1.38 109 " " $0.73 110 " " $0.043 111 " " $0.74 112 $1.03 113 Excludes capital gains of $0.304 114 " " $0.39 115 " " $0.16 116 Distribution of Class A stock The codes in Tables 2 and 3 are taken (in almost all cases) verbatim from the footnotes in the Stock Guide; certain codes, however (e.g., code 10 -- suspicious entry), are the author's own invention. Table 3 Codes Used in Cols. 
47-49 for Capital Gains Distributions, in Order of Dollar Amounts Code Meaning 103 Excludes capital gains of $0.004 56 " " $0.00625 102 " " $0.04 110 " " $0.043 43 " " $0.09 54 " " $0.092 19 " " $0.097 94 " " $0.10 75 " " $0.11 44 " " $0.12 100 " " $0.13 34 " " $0.14 80 " " $0.15 115 " " $0.16 42 " " $0.17 95 " " $0.175 84 " " $0.18 104 $0.19 12 $0.20 24 $0.21 82 $0.215 21 $0.22 53 $0.23 61 $0.24 69 $0.26 76 $0.26375 65 $0.27 22 $0.28 77 $0.29 93 $0.295 64 $0.30 113 $0.304 37 $0.31 107 $0.32 50 $0.33 52 $0.35 79 $0.36 51 $0.37 55 $0.38 114 $0.39 41 $0.40 45 $0.43 32 $0.44 28 $0.45 88 $0.46 67 $0.47 47 $0.48 91 $0.49 106 $0.495 23 (or 46) " $0.50 58 $0.51 105 " $0.52 87 " " $0.53 57 $0.54 26 $0.55 25 " $0.555 36 " " $0.56 70 " " $0.57 17 " $0.58 92 " $0.59 16 " " $0.60 81 " $0.61 20 " $0.62 89 $0.63 73 $0.64 90 " " $0.65 85 " " $0.67 83 " " $0.70 109 $0.73 111 $0.74 49 " $0.75 29 " " $0.81 14 " " $0.84 74 " $0.88 39 " $0.89 62 " " $0.90 59 " " $0.92 66 " " $0.97 112 $1.03 78 " " $1.09 86 " " $1.10 13 " " $1.11 18 " " $1.20 108 $1.38 4. Explanation of Examples, page 14. Lines 1 and 2 tells us that from January, 1952 to March, 1956, inclusive, no data are found in the Stock Guide for the asset with asset identification code (A.I.C.) 115980. Line 3 tells us that the October, 1950, month-end price for asset with asset identification (A.I.C.) 610353 was 27.000. Line 4 tells us that owners of stock with A.I.C. 119572 on August 13, 1951 received a dividend of $0.75 per share. Line 5 tells us that owners of stock with A.I.C. 116288 on December 24, 1952 received a stock dividend of 50%. Line 6 tells us that owners of stock with A.I.C. 116424 on February 15, 1961 received a stock split of 2 for 1. Line 7 tells us that owners of stock with A.I.C. 111700 (in this case, the Boston Fund, a mutual fund) received a distribution of capital gains of $0.34 per share. Line 8 tells us that data for stock with A.I.C. 
610353 do not appear in the Stock Guide for December, 1954 or thereafter.

Examples of S&P data codes (SSRI FORTRAN coding form, card columns 1-80; only part of the form is legible)

Line
1    1 1598 0 0 010052 1SEC
2    1 1598 0 0 030056 2SEC
3    6 1035 3 1 100050 027.000 SPQ
4    1 1957 2 2 08135 00.75000 SVP
5    1 1628 8 3 122452 050.000 SOT
6    . . . SMP
7    . . . BFI
8    . . . SPQ
9    . . . SMP
10   . . . AMI
11   . . . MSE
12   . . . BOG
13   . . . 5BSI
14   . . . 8RBR
15   . . . 45 BFI
16   . . . 05 BFI

Line 9 tells us that data for stock with A.I.C. 116424 do not appear in the Stock Guide for or before March, 1960.

Line 10 tells us that as of February 24, 1961, the company with A.I.C. 119643 merged with the company with A.I.C. 114368 to result in a new company with A.I.C. 114368.

Line 11 tells us that as of sometime in November (the exact day is not known), 1960, companies with A.I.C.'s 119515 and 615334 merged to form a new company with A.I.C. 118604.

Line 12 tells us that the sum of dividend payments from stock with A.I.C. 118814 for 1959 was $1.25.

Line 13 tells us that the sum of dividend payments up to July 24, 1947 for stock with A.I.C. 111744 is $0.50. This entry is made probably because the dates and/or amounts of previous payments during that year for that stock are not available.

Line 14 tells us that an investor owning shares of stock with A.I.C. 115640 on December 20, 1955 would not receive the $0.65 per share dividend until sometime after December 31, 1955.

Lines 15 and 16 tell us that stock with A.I.C. 111700 paid a regular dividend of $0.20 and a capital gains distribution of $0.43 on December 10, 1956.

Table 4 -- Firm's names, A.I.C.'s and Alphabetic Abbreviations

Name Alphabetic Abbreviations A.I.C. Aero Supply Mfg.Co., Inc. AER 61-0348 Aircraft Radio Corp. ARC 11-9167 Air Products & Chemicals APD 61-4824 Allied Paper Corp.
APP 61-5341 *Allied Supermarkets ASU 61-4416 *Allis (Louis) Co. ALC 11-1124 *American Auto. Ins. Co. AAI 11-8374 American Foreign Power Co., Inc. FP 11-1204 American Marietta Inc. AMI 11-9643 *American Super-Power Corp. ASW 11-8361 American Thread TH 61-0347 *Amerline Corp. AML 11-1284 *Ametek AME 61-4998 Ampco Metal Inc. AMP 11-1292 Amphenol-Borg Elect. Corp. ABE 11-1294 *Ansul Chemical Co. ACH 11-1314 Arizona Public Service Co. APS 11-1342 Arkansas Fuel (Oil) AFO 11-9051 Arkansas Louisiana Gas Co. AKG 11-1346 Arkansas Natural Gas Co. AKS 11-8377 Atlanta Gas Light Co. AGL 61-0534 Avon (Allied) Products API 61-0913 Bankers National Life Ins. Co. BNL 11-8787 Barlow and Seelig BRW 61-0350 Beneficial Standard Life Ins. Co. BSL 11-1594 *Berkey Photo Inc. BKY 61-4312 Binks Mfg. Co. BIN 61-0519 Borg (Geo. W.) BOG 11-8814 Boston Fund Inc. BFI 11-1700 Broad Street Investing Corp. BSI 11-1744 California Electric Power Co. CAP 11-1832 *California Oregon Power COP 11-1840 *Canada General Fund CGF 11-1884 Canadian Atlantic Oil CLC 61-0406 Canadian Delhi Oil Ltd CDO 11-1888 Canadian Marcony CMW 11-8952 Ceco Steel Products Corp. CSP 11-1888 Central Maine Power Co. CMP 11-1920 Chesebrough-Ponds Inc. CBM 11-1984 Chemical Fund Inc. CFI 11-1968 *Chicago Musical Instrument CMI 11-8557 *Churchill Downs Inc. CDI 11-9157 Clark Oil & Refining COR 11-2112 Club Aluminum Products Co. CLB 61-0349 Coastal States Gas Prod. Co. CSG 11-2136 Commonwealth Oil Co. COC 61-1288 Compo Shoe Machinery Co. CEM 11-8973 Connecticut Light & Power Co. CLP 11-2252 Consolidated Freightways CFR 11-9801 Consolidated Papers, Inc. CPS 11-2268 Continental Ill. National Bank & Trust CNB 27-8915 Continental National Life Ins. CIS 61-0469 Dumont (Allen B.) Laboratories DUM 11-9405 C.I.A. common *Eaton & Howard (Balanced Fund) EHB 11-2584 *Eaton & Howard (Stock Fund) EHS 11-2588 FMC Corp. FMC 11-2724 Fidelity Fund Inc. FFI 11-2796 Fireman's Ins. Co. 
(Newark) FIN 11-8702 First Mortgage Investors FMI 11-2924 First Western Financial Corp. FWF 11-2972 Frigikar Corp. FRG 11-3096 Fundamental Investors FII 61-4417 General Box Co. GBX 11-9712 Ginn & Co. GNN 61-3268 Haloid Xerox Inc. HLC 11-3424 Hamilton Mfg. Co. HMC 11-3432 Harnischfeger Corp. HRC 11-3464 Heileman (G.) Brewing Co. HIL 11-8969 Helene Curtis Industries Inc. HCI 11-3516 Hevi-Duty Electric Co. HVY 11-8150 High Voltage Eng. HVE 11-8169 Holiday Inns of America HIA 61-4663 Houston Corp. HC 11-3616 Hudson Pulp & Paper Corp. HPP 11-3624 Incorporated Income Fund IIF 11-3676 Incorporated Investors Inc. III 11-3680 Interstate. Bakeries ISB 11-9923 Iowa Southern Utilities Co. ISU 11-3808 Johnson Service Co. JSC 11-3880 Kaiser Steel Corp. common KSC 11-3904 preferred KSCPF 33-3904 Kentucky Utilities KYU 11-8422 Larsen (The) Co. LSN 11-4064 Leath & Co. LH 61-4834 Lilly (Eli) Co.Cl.B. LIL 61-5382 Line Material LMT 61-0102 Lone Star Steel LSS 11-8000 Longview Fibre Co. LFC 61-0828 Lucky Stores Inc. LSI 11-4204 Macwhyte Co. MWC 11-4280 Madison Gas & Electric MGE 11-8566 Marine Capital Corp. MCC 11-4340 Martin Marietta Co. ML 11-4368 Massachusetts Investors Growth Stk Fund MIG 11-4380 Massachusetts Investors Second Fund MIS 11-9515 Massachusetts Investors Trust MIT 11-4384 Massey Harris Co. MS E 11-8604 Meredith Publishing Co. MPC 11-8623 Meyer (Geo. J.) Mfg. Co. MGJ 11-4520 Morgan Guarantee Trust Co. MGT 61-4752 Mosinee Paper Mills Co. MPM 11-4692 Narrangansett Capital Corp. NCC 11-4740 National Dairy Products Corp. ND 11-4784 National Investors Corp. NIC 11-4796 National Union Radio NUM 61-1229 National Video Corp. NVD 11-4824 Nekoosa-Edwards Paper Co. NEP 11-4852 Nesbitt (John J.) Inc. NJJ 11-4856 New York Trust Co. NYT 61-0632 Northern Illinois Gas Co. GAS 11-4940 Northern Indiana Public Service NIPS 11-4944 Northern States Power Co. (Minn.) Com. NSP 11-4960 Northwest Engineering C1.B. NENB 33-8437 C1.A. NENA 11-8437 Northwestern Nat. Ins. Co. 
(Milwaukee) NNI 11-5000 Nun-Bush Shoe Co. NBS 11-5020 One William Street Fund, Inc. OWS 11-5092 Opemiska Copper Mines 0CM 61-0357 Otter Tail Power Co. OTP 11-5116 Pacific Power & Light Co. PPW 11-5176 Perini Corp. PER 11-5280 Permanente Cement Co. PMC 11-5288 Phillips-Eckardt Electronic PEK 61-4801 Preway Iuc. PRE 11-5432 Progress Mfg. Co. PMC 11-9291 Public Service of New Hampshire PNH 61-1444 Public Service of New Mexico PNM 11-9711 Puritan Fund Inc. PFI 11-5484 Rap-In-Wax (Rap Industries RIW 11-8316 after 1961) Reading & Bates Offshore Drilling Co. C1.A. RBODA 33-5548 RBOD 11-5548 River Brand Rice Mills Inc. RBR 11-5640 Robbins & Meyers RMI 11-9826 Rorer (Wm. H.) Inc. ROR 11-5724 Sams (Howard W.) & Co., Inc. SHW 11-5940 Sapphire Petroleum (Ltd.) SAF 11-8230 Schlitz (Joseph) Brewing SJB 61-4108 Schulte (D.A.) Inc. SHU 61-0518 Schuster (Ed.) & Co., Inc. SEC 11-5980 Servomation Corp. SVM 61-4977 Sonotone Corp. SON 11-8237 Soundview Pulp SVP 11-9572 Speed Queen Comp. SPQ 61-0353 Springfield Fire & Marine Ins. SFM 51-0126 St. Louis Public Service SLP 11-9771 Standard Accident Ins. Co. SAI 11-6420 Standard Dredging Corp. SDR 11-8266 Standard Motor Products, Inc. SMP 11-6424 Standard Oil, Ky. SKY 11-9140 Strategic Materials Inc. SMC 61-4488 Stroock (S.) & Co., Inc. SOK 61-0517 Syntex SYN 61-4499 Technical Material TMC 61-4328 Texas Eastern Transmission Corp. TET 11-6636 Texas Industries TXI 61-4731 Thermogas Co. TMG 61-4489 Towmotor Co. TOW 61-0105 Transcontinental Gas Pipe Line 11-6740 Corp. TGP Travelers Ins. Cor. TIC 11-9496 Trav-ler Radio Corp. (TraV-ler Industries) TVL 11-6760 Universal Cyclops Steel UCS 61-4997 Vance-Sanders Inc. VSN 61-3009 Virginia Electric & Power Co. VEL 11-7016 Webb & Knapp WN 61-0359 Wellington Fund Inc. WFI 11-7108 Western Power & Gas Co. WPG 11-7164 Western Publishing Co. WPB 11-7168 Weyenberg Shoe & Mfg. Co. WEY 11-7200 Wieboldt Stores WIE 61-0633 Wisconsin Bankshares WBK 11-9023 Wisconsin Fund Inc. 
WFI 11-7260 Wisconsin Power & Light preferred WPLPF 33-9031 common WPL 11-9031 Wolverine Shoe & Tanning WST 61-3008 Wright Hargreaves Mines Ltd. WRT 11-9982 Zero Mfg. Co. ZER 11-7356 The alphabetic abbreviations that were used are the ticker symbols that appear in the Stock Guide, if the ticker symbol is provided; otherwise, the author made up the abbreviation (which usually consists of the first letters of the important words in the name). A total of 15,871 cards were coded and put onto tape 785 (DPLS) unsequenced. This tape must be properly sequenced and edited according to (card image) edits which the author has written.hahttp://www.ssc.wisc.edu/wais/WAIS689003.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS689003.txther awards, the combined amount will be shown with the appropriate combined symbols (see example 2.). If two Group 1 a. Part I and Part II on the first page need not be completed. Part I will be completed by the University of Wisconsin and Part II, information is contained on the attached printout sheet beneficiaries of the same sex are involved, use an additional form for one of the entries. An explanation of the action symbols to be used is shown in the lower right corner of the second page. In addition to these symbols, an "A" will be shown under the amount when the rate was adjusted because of legislation, (see example 1). Other changes in rate such as adjustment for family maximum, savings clause, etc. should be explained under "Remarks" or on the reverse side of the page. Group 2. Death Cases a. Complete the following information in Part II of the first page. l. Item' 7 Date of birth - Enter for adult beneficiaries 'other than those for whom a DOB is shown. on the. printouts. If other than the wage earner and/or spouse. are shown on the account, enter the DOB under "remarks"-with' the appropriate claim "E", "F", "D" symbol such as "E", "F", "D". 
If the WE and spouse are not shown on the printouts but were "on the rolls" previously, enter the DOB as shown on the Form OA-C101 in file. Otherwise, enter the date as shown on the Form SS-5.
2. Item 10 - Date of Death - Enter date of death of wage earner.
b. Enter the payment record in Item 11 of the adult beneficiary(ies) in the same manner as described for the cases in Group 1, for the months 1/46 - 12/61.
Group 3. Records not found on the master tape
a. If there is no record of the case in file, annotate "NR" in the "Remarks" under Part II.
b. If an award was disallowed or denied, enter "disallowance" or "denied" and the date of such action in the "Remarks" column under Part II.
c. If the beneficiary was previously entitled and was receiving benefits which were terminated prior to 1/62, complete Part II page 1 and also Item 11 page 2 showing all payment records through December 1964. This is the only group for which information is needed for all years 1946-1964.
If the folder is not available, show the reason under "Remarks" as follows:
If there is no record of the folder, enter "NR".
If the folder is in the Federal Records Center, requisition the folder and complete the necessary information on Form SSA-9249.
If the folder has been permanently transferred to another payment center, enter the name of the payment center.
If the folder is in operations, complete the necessary information when the folder is available.
Please return completed material to DRS on a weekly basis until the job has been completed. Send each shipment under a covering memorandum to the Division of Research and Statistics, Attention: Mrs. Ziskin, Room 4-F23a, Operations Building.
Time spent on the completion of the study forms should be reported on the Production Time Report under "Special Work" and footnoted in accordance with Section 133 of the Payment Center Report Manual.
If you have any questions about the project, do not hesitate to contact the Operations Branch.
Richard E.
Branham Director, Enclosures Example No. 1 - Deductions for Earnings "A" became entitled to benefits of $110.00 in 3/58. His benefits were adjusted to $.117.00 in 1/59 because of legislation. His benefits were suspend in 3/59 because he reported that he had returned to work in that month and expected to earn over $1,200. In 7/59, he reported that he had stopped working and his benefit payments were resumed. Jan. Feb. March April May June July Aug. Sept. Oct. Nov. Dec. In this case, the symbol "S" will be used to indicate the months for which "A" received no payment. (Under "Suspension" include benefits withheld to recover an overpayment, suspensions pending development of Payee, and other miscellaneous reasons for suspension.) Month 1958 1959 CL. SYM. Man Woman CL. SYM. $ Man woman A. $7117.0 $ A 110.0 117.0 I Example No. 2 - Retroactive Payment A and B were awarded monthly benefits of $100 and $50 respectively and their first check represented payments for 11/8 - 2/59. A died in September 1961 and B was entitled to D benefits A lump-sum death payment was combined with D's monthly benefit amount. Month 195 1959 1 0 1961 CL. SYM. Man Woman S . Man Woman . SYM. Woman CL. SYM. Man Woman Jan. Feb. Mar. Apr. May June July Aug. Sept. Oct. Nov. Dec. R R l+00.0 200.0 100.0 50.0 R R T 337.5 82.5 Example No. 3 - Underpayment A and B were entitled to benefits 1/60. A was entitled to $100.00 a month and B was entitled to $50.00 a month. B had temporary deductions in 1960. Her annual report for 1960, processed in 4/61, showed earnings of less than $100 in 6/60 and 7/60. She indicated that she was working all of 1961 and benefits were withheld for all of these months. Jan. Feb. Mar. Apr. May June July Aug. Sept. Oct. Nov. Dec. 1960 I 196 Month CL. SYM. Man CL. Woman SYM. Man Woman $100.0 $ $ 100 .0 Example No. 
4 - Man is auxiliary beneficiary A woman became entitled to a benefit of $116 under her own account number and her husband became entitled to a benefit of $58.00 as a B1 I 1958 Month I CL. SYM. C Man Woman Jan. Feb. Mar. Apr. May June July Aug. Sept. Oct. Nov. De B1A $58.0 $116.0 Codes - File Reference 1. NOP - Number of payments on the account 2. NOB - Number of beneficiaries on the account LMOA - Last month of activity on the account PIA - Primary insurance amount 5. PCOC - Payment center office code 1. N. Y. 2. Philadelphia 3. Birmingham 4. Chicago 5. S. F. 6. K. C. \ 7. Baltimore -- PIC - Payment identification code 8. MPA - Monthly payment amount 9. SCC - State and county code 10. Current pay (only shown in current pay cases) 11. Payee name and address 12. BIC - Beneficiary identification code and subscript A - OAIB - Primary Beneficiary B - Wife (whose entitlement or benefit amount at the time of filing, is not dependent on having a child in her care) B1- Husband (dependent) B2- Wife (whose entitlement or benefit amount at the time of filing, is dependent on having a child in her care) C - Child (including disabled child) 3. A - Primary - Wife C -- Child - Widower) E - Widowed mother under age 62 with child(ren) in her care F - Parent - Combined husband-wife check payment 7. NOB - Number of beneficiaries in the payment - Widow Dl- Widower - Mother widow) El- Mother divorced wife) F - Parent 13. Beneficiary name 14. DOB - date of birth 15. Sex and race 16. DOE - Date of entitlement 17. Payment status code A - Adjustment C - Current pay - Deferral M - Matured deferred - Conditional (suspension) T - Termination 18. 
TOC - Type of claim 0 - Death claim 1 - Life claim 2- Reduced claim 3 - Death claim (for disabled child or for mother entitled solely because of disabled child) 4 - Life claim (for disabled child or for young wife entitled solely because -of disabled child) 5 - Disabled claim (other than disabled child's claim 6 - Reduced (DIB) disability claim 1 7 - BIB claim (for disabled child or young wife entitled solely because of disabled child) 19. LDA ARM For internal use only SAC 20. Hist. - Last history posting a. RFD - Reason for deduction Space - no deductions 0 - 9 - deductions Y - previous entitlement to another type of benefit T - termination ' A - withdrawn for adjustment b. WIC - Work identification code 0 - No work indication 2 - Worked c. BPD - Beneficiary payment designation 0 - Not paid 1 - Paid d. XR - Cross reference account number indicated 21. ARD - Annual report data 22. Type of earnings code as reported on latest annual report 1 - Wages 2 - Self-employment 3 - Self-employment and wages 23. Representative payee data 1. Date reflects the date of selection of payee 2. (Codes represent type of payee, custody and guardianship) a. b. Incomplete 1963 estimated 1964 estimated or actual All history postings for the beneficiary from 1/62 on (other than Baltimore Payment Center, PCOC - code 7) a. Date of each change b. Monthly benefit amount c. RFD, WIC, BPD - refer to item 20hahttp://www.ssc.wisc.edu/wais/WAIS656006.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS656006.txt> "= < Gene Moyer 19652+The Coding of the Wisconsin State Tax FormsDFebruary 25, 1965  WAIS paper645-038a"Formats Medical Expense DataZS**This document could not be translated to basic text. Please view the PDF file.****Terms and topics from paper, listed for searching purposes** Gene Moyer WAIS 645-038a February 25, 1965 1st Revision Document: the Coding of the Wisconsin State Tax Forms (1946-1960) I. The Integration Process II. The Coding "code sheet" 1. 
The Identification Number name cluster Doomage
2. Social Security Number
3A. Years Filed
4, 5. Name and Address of Taxpayer
6. Residence location
7. County Prior Year
8. Address Change
9. Occupation Code Dictionary of Occupational Titles (U.S.G.P.O.)
10. Occupation Change
11. Return Filed in Previous Year
12. Partnership Name Given?
13. Spouse's Name Given?
14. Was the taxpayer married during the tax year? Were marriage details given?
15. Does the taxpayer claim "Head of Family" exemption?
17, 18, 19. Non-Calendar year; Gift Information
III. The Coding of Income from Interest, Dividends, Rent, and Capital Gains
Type of Asset capital gains stocks firms Rent Income Class A Property
Instructions for Coding Fixed Format ID Cards
Code Sheet for Coding Fixed Format ID Cards
Code Sheet for Taxpayers Not Using Printed Tax Form for Interest or Dividends
Code Sheet for Taxpayers Not Using Printed Tax Form for Capital Gains
http://www.ssc.wisc.edu/wais/WAIS645038a.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645038a.txt

Gene Moyer 1965
Tape Record for Medical Expense Data
March 2, 1965
WAIS paper 645-038
Formats Medical Expense Data

Gene Moyer WAIS 645-038 March 2, 1965 Draft
Tape Record for Medical Expense Data (all amounts are rounded to the nearest dollar)

Columns    Number of Columns    Variable Number    Variable Name
1          1                    1                  "M"
2-9        8                    2                  Our identification number
10-18      9                    3                  Social Security number
19-25      7                    4                  1947 gross income
26-32      7                    5                  1947 wife's gross income
33-37      5                    6                  1947 total medical expenses (of both husband and wife)
38-44      7                    7                  1948 gross income
45-51      7                    8                  1948 wife's gross income
52-56      5                    9                  1948 total medical expense
57-63      7                    10                 1949 gross income
64-70      7                    11                 1949 wife's gross income
71-75      5                    12                 1949 total medical expense
76-82      7                    13                 1950 gross income
83-89      7                    14                 1950 wife's gross income
90-94      5                    15                 1950 total medical expense
95-101     7                    16                 1951 gross income
102-108    7                    17                 1951 wife's gross income
109-113    5                    18                 1951 total medical expense
114-120    7                    19                 1952 gross income
121-127    7                    20
1952 wife's gross income
128-132    5                    21                 1952 total medical expense
133-139    7                    22                 1953 gross income
140-146    7                    23                 1953 wife's gross income
147-151    5                    24                 1953 total medical expense
152-158    7                    25                 1954 gross income
159-165    7                    26                 1954 wife's gross income
166-170    5                    27                 1954 total medical expense
171-177    7                    28                 1955 gross income
178-184    7                    29                 1955 wife's gross income
185-189    5                    30                 1955 total medical expenses
190-196    7                    31                 1956 gross income
197-203    7                    32                 1956 wife's gross income
204-208    5                    33                 1956 total medical expense
209-215    7                    34                 1957 gross income
216-222    7                    35                 1957 wife's gross income
223-227    5                    36                 1957 total medical expense
228-234    7                    37                 1958 gross income
235-241    7                    38                 1958 wife's gross income
242-246    5                    39                 1958 total medical expense
247-253    7                    40                 1959 gross income
254-260    7                    41                 1959 wife's gross income
261-265    5                    42                 1959 amount paid to doctors
266-270    5                    43                 1959 amount paid to dentists
271-275    5                    44                 1959 amount paid to druggists
276-280    5                    45                 1959 amount paid to not ascertained
N.B.: If the coder cannot ascertain the proper category of a person to whom money for medical care had been paid, the amount paid to that person will be placed in variable 45.
281-285    5                    46                 1959 amount paid to hospitals
286-290    5                    47                 1959 health insurance premium
291-295    5                    48                 1959 amount paid to others (make out card)
296        1                    49                 Record mark
http://www.ssc.wisc.edu/wais/WAIS645038.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS645038.txt

Gene Moyer 1965
The Weighting of the Interview Data for the Report to Respondents
March 2, 1965
WAIS paper 645-039
Survey Data and File

Gene Moyer WAIS 645-039 March 2, 1965 First Revision
The Weighting of the Interview Data for the Report to Respondents

Because of the need to report our findings to respondents as quickly as possible, some "short cut" weighting system is necessary. Two possible methods are available to us. One is to make use of ~Bj, the a priori percentage at which the jth interview was sampled.
While the true percentage rate at which this jth interview was sampled (Bj) is different from ~Bj because of errors in matching, (Bj - ~Bj) is probably very near zero. If there were no non-response bias, then (~Bj)^-1 would probably be an adequate weight for the jth interview. There is, however, some evidence that there is a non-response bias in the distribution of the 1962 incomes of our respondents. Let
yi = the number of persons in the ith 1962 income bracket who were chosen in the sample of 2069,
Xi = the number of actual respondents in the ith 1962 income bracket,
Zi = the number of non-respondents (for any reason) in the ith 1962 income bracket.
Values of Xi, Zi, and yi are given in the following table:

1962 Income Bracket (H)     1     2     3     4     5     6     7     8     9     Sum
Xi                         34   113   304   357   176    71   113    12     8    1188
Zi                         15    91   196   151    73    37    64    13     3     643
Sum = yi                   49   204   500   508   249   108   177    25    11    1831

Since there is some non-response bias but it is not systematic, a second term in the weight for each interview is necessary so that the interviews will in general "represent" the income distribution of the original sample of 2069 persons. A weight, then, for the jth interview of Wj = (yi/Xi) (~Bj)^-1, where i is the income bracket of the jth interview, should ensure that our statistics will in general reflect the experience of all Wisconsin taxpayers. So that these weights can be constructed easily, cards with the following format will soon be punched on the computer:

Columns    Number of Columns    Variable Number    Variable Name
1          1                    1                  W
2-9        8                    2                  Taxpayer identification number
10-13      4                    3                  Serial of interview
14         1                    4                  Was a booklet returned? 1 - yes, 0 - no
15-17      3                    5                  yi
18-20      3                    6                  Xi
21-23      3                    7                  Sampling rate (a priori)
24-32      9                    8                  Key
33-80      48                   -                  Blank

If there were no variation in the response rates, we would expect ( ) and ( ). To test the null hypothesis that ( ) we compute ( ) where ( ). The hypothesis that ( ) is rejected at the .01 significance level and response-income bias seems almost certain.
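The weight and the test of equal response rates can be sketched as follows. This is only an illustration under stated assumptions, not the project's program: the function names are hypothetical, the sampling rate passed in is an arbitrary example value, and the Pearson chi-square statistic is one standard way to test for equal response rates across brackets, which may differ from the test whose formulas are missing in the text.

```python
# Bracket counts from the table (1962 income brackets 1-9).
X = [34, 113, 304, 357, 176, 71, 113, 12, 8]   # respondents
Z = [15, 91, 196, 151, 73, 37, 64, 13, 3]      # non-respondents
y = [x + z for x, z in zip(X, Z)]              # persons sampled per bracket


def interview_weight(bracket, sampling_rate):
    """Wj = (yi / Xi) * (a priori sampling rate)^-1.

    `bracket` is the 1-based income bracket of the interview;
    `sampling_rate` is a proportion (hypothetical input format).
    """
    i = bracket - 1
    return (y[i] / X[i]) / sampling_rate


def chi_square_response_rates():
    """Pearson chi-square for equal response rates across brackets.

    Assumed test: compares observed respondents and non-respondents
    in each bracket with the counts expected under the pooled rate.
    """
    pooled = sum(X) / sum(y)
    stat = 0.0
    for xi, zi, yi in zip(X, Z, y):
        expected_resp = yi * pooled
        expected_non = yi * (1 - pooled)
        stat += (xi - expected_resp) ** 2 / expected_resp
        stat += (zi - expected_non) ** 2 / expected_non
    return stat


if __name__ == "__main__":
    print(interview_weight(bracket=1, sampling_rate=0.01))
    print(chi_square_response_rates())
```

With nine brackets the statistic has 8 degrees of freedom, for which the tabled .01 critical value is about 20.09.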
This result gives rise to the question of whether the bias is at all systematic. One test of systematic bias is to calculate the ~Bi in a regression of the form X/y = ~B0 + ~B1H + ~B2H2 + e. If the R2 (multiple correlation coefficient) from a regression such as this is high, the null hypothesis of no relationship can be rejected. For this regression R2 = .00235, yielding an F statistic for the whole relationship (H and H2) of .0001238. Since the tabled value for F with 2 and 7 degrees of freedom is 4.74, the null hypothesis that {B1 = B2 = 0} is accepted and we would conclude that the bias is not systematic.shahttp://www.ssc.wisc.edu/wais/WAIS645039.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS645039.txto Ashok Bhargava 1967Selection File Job Plans July 17, 1967s WAIS paper678-001"History File Selection File\ Ashok Bhargava WAIS 678-001 July 17, 1967 SELECTION FILE JOB PLAN (PHASE I) The creation of the selection file will be done in the eight steps outlined in WAIS 667-044 by John DeVries. Since the present History File is invalidated by the ID changes, we will have to recreate the History File in the first stage. History File Format Positions Field Item Description Input Position in Size File Input File 1 - 8 8 M IS ID # FFID 2 - 9 9 - 17 9 Social Security # FFID 10 - 18 18 1 Original sample, high income (1000's), Benefit FFID 2 - 4 (70's) 0 - Original sample HP 1 1 - High income (1,000's) 7 - Benefit (70's) 19 - 56 38 Record Absent/Present 10 - 11 each year 57 - 58 2 500 - Record absent HP yea 01 - Record present 10 - 11 First year filed return 59 - 60 2 Last year filed return EP 10 - 11 61 - 62 2 First year filed (not 1946) NP 10 - 11 63 - 64 2 Total # year records appear 10 - 11 65 -102 38 Interest, Dividends, Capital Gains, Rent, 55 - 99 Business (present/absent indicator, 1946-64). each year Binary code, Rent Interest Dividends Capital Bus. 
Gains Absent 0 0 0 0 0 Present 1 2 4 8 16 Field positions Size Input Location in Item Description 103-121 122-140 141-197 198-235 236-242 19 Spouse separate income (1.946-64) - acme indicator as MF. Return reason code (previous year) same indicator as MF. Residence location - same indicator as MF. County prior year - saw indicator as MF: Year 46-52 indicator 0 - Total income >_ 3500 (1946-48) 0 - Total income E 5000 (1949-52) 1 - Otherwise Form type (1946-52) - same indicator as HP Month of death Year of death Month of birth Year of birth Year of last known address County code - same as FFID Absent-present indicator 0 - absent 1 - present Race - same as ID/805 Chosen in survey f 0 - absent 1 - present Response Asset booklet 19 57 38 243-249 250-251 252-253 7 7 254-255 256-257 258-259 260-261 2 2 2 266 2 2 2 267 268 1 1 269 270 1 1 1 1File Input File lie 23 each year MF 21 each year MF 12-14 each year MF 15-16 each year NP 127-135 each yr. MF 387 each year 7 Age 52-57 BF Card 2 Sequence 00 Field 56-61 Age 31-32 Age 35-36 FFID 122-123 FFID 120-121 ID/805 ID/805 131 SF ID # SF By hand SF By hand Positions 271-279 19 Absence/Presence indicator PF 10-11 each year 280-281 2 0 - absent PF 1011 1 - present First year filed 282-283 2 Last year filed PF 10-11 284-302 19 Absence-presence indicator BYR 27-28 each year 303-304 2 First year filed BYR 27-28 305-306 2 Last year filedhahttp://www.ssc.wisc.edu/wais/WAIS678001.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS678001.txtC'TRichard Bauman 1967~wTable Specifications for the Series P Tabulations from EXT 01A: Occupation Distributions and Summary Demographic Tables July 20, 1967 WAIS paper678-002i*#Cross Tabulations Extract 01 Tablesc%%Richard A. Bauman WAIS 678-002 July 20,1967 Table Specifications for the Series P Tabulations from EXT 01A: Occupation Distributions and Summary Demographic Tables I. 
Table Specifications

Table P1
Page Variables: Sex 01, Birth Year Group 1 (10 yr intervals)
Row Variable: Year 1 (single years)
Column Variable: Dependents 1
Cell Entries: Simple F.C., Row %

Table P2
Page Variables: Year 3 (Census years - other years grouped), Sex 01, Marital Status 1 (Single - Married), Birth Year Group 1
Row Variable: County of Residence 1 (In - Out of State)
Column Variable: County of Residence (Prior Year) 1 (In - Out of State)
Cell Entries: Simple F.C., Row %, Column %

Table P3
Page Variables: Sex 01, Marital Status 1
Row Variable: Year 1
Column Variable: Birth Year Group 1
Cell Entries: Simple F.C., Row %, Mean A.V. - AGI 1

Table P4
Page Variables: Sex 01, Marital Status 1, Birth Year Group 1
Row Variable: Year 1
Column Variable: Return-Reason 1 (Detailed)
Cell Entries: Simple F.C., Row %, Mean A.V. - AGI 1

Table P5
Page Variables: Year 3, Sex 01
Row Variable: County of Residence (Prior Year) 1
Column Variable: Return-Reason 1
Cell Entries: Simple F.C., Row %, Column %

Table P6
Page Variables: Sex 01, Birth Year Group 1
Row Variable: Year 1
*Column Variable: Trust Income 2 (Neg, Zero, Positive, NA)
Cell Entries: Simple F.C., Row %, Mean A.V. - Trust Income 1

Table P7
Page Variable: Sex 01
Row Variable: Year 1
*Column Variable: AGI 2
Cell Entries: Simple F.C., Row %, Mean A.V. - AGI 1

Table P8
Page Variable: Sex 01
Row Variable: Year 1
*Column Variable: NTI 2
Cell Entries: Simple F.C., Row %, Mean A.V. - NTI 1

Table P9
Page Variable: Sex 01
Row Variable: Year 1
*Column Variable: W + S 2
Cell Entries: Simple F.C., Row %, Mean A.V. - W + S 1

Table P10
Page Variable: Sex 01
Row Variable: Year 1
*Column Variable: Dividend 2
Cell Entries: Simple F.C., Row %, Mean A.V. - Dividend 1

Table P11
Page Variable: Sex 01
Row Variable: Year 1
*Column Variable: Cap Gain 2
Cell Entries: Simple F.C., Row %, Mean A.V. - Cap Gain 1

Table P12
Page Variable: Sex 01
Row Variable: Year 1
*Column Variable: Self Inc. 2
Cell Entries: Simple F.C., Row %, Mean A.V. - Self Inc.
1

Table P13
Page Variable: Sex 01
Row Variable: Year 1
*Column Variable: Interest 2
Cell Entries: Simple F.C., Row %, Mean A.V. - Interest 1

Table P14
Page Variable: Sex 01
Row Variable: Year 1
*Column Variable: Rent 2
Cell Entries: Simple F.C., Row %, Mean A.V. - Rent 1

Table P15
Page Variable: Sex 01
Row Variable: Year 1
*Column Variable: Trust Income 2
Cell Entries: Simple F.C., Row %, Mean A.V. - Trust Income 1

*N.B. The last column should be omitted in these tables.

Table P16
Page Variables: Sex 01, Birth Year Group 1
Row Variable: Year 1
Column Variable: Occupation 2 (Broad Groups)
Cell Entries: Simple F.C., Row %, Mean A.V. - AGI 1

Table P17
Page Variables: Sex 01, Birth Year Group 1
Row Variable: Year 1
Column Variable: Occupation 2
Cell Entries: Simple F.C., Row %, Mean A.V. - W + S 1

Table P18
Page Variables: Sex 01, Birth Year Group 1
Row Variable: Year 1
Column Variable: Occupation 2
Cell Entries: Simple F.C., Row %, Mean A.V. - Self Inc. 1

Table P19
Page Variable: Sex 01
Row Variable: Year 1
Column Variable: Occupation 2
Cell Entries: Simple F.C., Row %, Mean A.V. - AGI 1

Table P20
Page Variables: Sex 01, Birth Year Group 1
Row Variable: Year 1
Column Variable: Occupation 3 (In LF (Grouped) - Not in LF (Detailed))
Cell Entries: Simple F.C., Row %, Mean A.V. - AGI 1

Table P21
Page Variables: Year 3, Occupation 2, Sex 01, Marital Status 1
**Row Variable: Birth Year Group 2 (5 yr. intervals)
Column Variable: AGI 3
Cell Entries: Simple F.C., Row %, Column %

**N.B. The last row should be omitted in this table.

Table P22
Page Variables: Birth Year Group 3 (Known - Unknown), Occupation 2, Sex 01, Marital Status 1
Row Variable: Year 1
Column Variable: AGI 3
Cell Entries: Simple F.C., Row %

II. Input Record Format

The BINARY EXT 01A [11/66] SSRI TAPES 600, 577, 597 contain the input for these tables. The format follows:
Binary Position Variable No. Ext.
01A Data 1 1-8 ID Number 2 7 (ID) Sex 3 8 (ID) Dependent number 4 3-10 Year Excluded (See 2) 11 Sex 0-Male, 1-Female 5 12 Marital Status Excluded (See 23) 13-14 Age 6 15 Race 7 16-17 Number of Dependents 8 18-19 Occupation (Detailed Code) 9 20 Return - Reason 10 21-22 County of Residence (Current Year) 11 23 City Designation (Current Year) 12 24-25 County of Residence (Prior Year) 13 26-32 AGI 14 33-39 NTI 15 40-46 W + S 16 47-53 Dividends 17 54-60 Cap Gains 18 61-67 Self Employment Income 19 68-74 Interest 20 75-81 Rent 21 82-88 Trust Income 22 91 Marriage Details 23 89-90 Year of Birth Excluded - See WAIS 92-95 ID Positions 7, 8; Recoded Year 645-047 96 Record Mark III. Grouped Occupation Codes The following grouped occupation code will be assigned by the "driver" using V* and will be designated as V24 in the input array: If V8= Assign V24= Description 01-10 01 Professional 11-17 02 Semi-Professional 18, 19, 22 03 Managerial 20 04 Self-Employed Business 21 05 Self-Employed Farm 25 06 Clerical 26 07 Sales 27, 28, 29 08 Service 24, 30, 31 09 Skilled 23, 32, 33, 34 10 Semi-, Unskilled 35, 36, 37, 38 11 Not in Labor Force < 01 or > 38 12 Not Ascertained IV. Interval Boundaries (Upper) for Classifying Variables Sex 01 (2 Boundaries) 0, 1 Dependents 1 (8 Boundaries) 0, 1, 2, 3, 4, 5, 7, 10 Year 1 (15 Boundaries) 1946, ... (1) ..., 1960 Year 3 (5 Boundaries) 1948, 1949, 1958, 1959, 1960 Marital Status 1 (2 Boundaries) 0, 1 Occupation 2 (V24) (12 Boundaries) 01, ... (1),..., 12 Occupation 3 (V8) (7 Boundaries) 00, 34, 35, 35, 37, 38, 99 Return - Reason 1 (10 Boundaries) 0, ... (1) ..., 9 County of Residence 1 (5 Boundaries) 00, 72, 97, 98, 99 County of Res (Prior Year) I (5 Boundaries) 00, 72, 97, 98, 99 AGI 2 (4 Boundaries) -1, 0, 9999998, 9999999 AGI 3 (5 Boundaries) 999, 1999, 2999, 4999, 7999, 8000 NTI 2 (4 Boundaries) -1, 0, 9999998, 9999999 W + S 2 " Dividends 2 " Cap Gains 2 " Self Inc. 
2 "
Interest 2 "
Rent 2 "
Trust Income 2 "
Birth Year Group 1 (10 Boundaries) 1884, 1894, 1904, 1914, 1924, 1929, 1939, 1950, 9999
Birth Year Group 2 (19 Boundaries) 1864, 1869, 1874, 1879, ... (5) ..., 1944, 1960, 9999
Birth Year Group 3 (2 Boundaries) 9998, 9999

V. Suggested Priority

The tables should be run in the following order:
First Pass: Tables P3-P5, P16-P19
Second Pass: Tables P1-P2, P6-P15, P20
Production of Tables P21 and P22 will be postponed.

Bill Gates WAIS 678-002 Programming Supplement August 7, 1967
Programming Specifications for the Series P Tabulations from EXT-01F: Occupation Distributions and Summary Demographic Tables

Source.
(1) Modification of the PROGRAM DAVID source deck, which serves as a basic driver for XTAB and accomplishes a straight presentation of twenty-three variables to the XTAB working array from EXT-01F, a 3-reel binary file, not labeled (SSRI 600, 577, 597).
(2) The XTAB used allows 32,000 computer words to be used for tables, and 10-digit boundaries are permissible. An alias for this version of XTAB is "Moyer's XTAB".
(3) The file structure: three logical records per physical record. A contingency to the basic driver is that when a parity error occurs in a physical record, the three logical records are not presented to XTAB but are lost. (See 678-002 for format of the file.)

Program Modification Specifications:
1. Additional Occupation Classification assigned to position 24 is presented to the XTAB routine.
1.2. The description presented in 678-002 will be repeated here for sake of clarity.

If V8=       Magnitude    V24=    Description
01-10        8            01      Professional
11-17        12           02      Semi-Professional
18, 19, 22   9            03      Managerial
20           7            04      Self-Employed Business
21           4            05      Self-Employed Farm
25           2            06      Clerical
26           10           07      Sales
27-29        6            08      Service
24, 30, 31   3            09      Skilled
23, 32-34    1            10      Semi-, Unskilled
35-38        5            11      Not in Labor Force
<01 or >38   11           12      Not Ascertained

1.3. A special purpose FORTRAN subroutine called NEWOC was written to accomplish the assignment of V24.
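The V8 to V24 assignment tabulated in 1.2 can be sketched as a chain of tests taken in the order of the Magnitude column. This is an illustrative Python sketch, not the FORTRAN subroutine NEWOC itself; the function name `newoc` and the reading of Magnitude as a rank of group size (1 = largest, tested first) are assumptions.

```python
def newoc(v8):
    """Return the grouped occupation code V24 for a detailed code V8.

    Tests run in the order of the Magnitude column (assumed to rank
    the groups by size, 1 = largest).
    """
    if v8 in (23, 32, 33, 34):
        return 10   # Semi-, Unskilled
    if v8 == 25:
        return 6    # Clerical
    if v8 in (24, 30, 31):
        return 9    # Skilled
    if v8 == 21:
        return 5    # Self-Employed Farm
    if 35 <= v8 <= 38:
        return 11   # Not in Labor Force
    if 27 <= v8 <= 29:
        return 8    # Service
    if v8 == 20:
        return 4    # Self-Employed Business
    if 1 <= v8 <= 10:
        return 1    # Professional
    if v8 in (18, 19, 22):
        return 3    # Managerial
    if v8 == 26:
        return 7    # Sales
    if v8 < 1 or v8 > 38:
        return 12   # Not Ascertained
    return 2        # Semi-Professional (only 11-17 remain)


if __name__ == "__main__":
    for code in (5, 17, 23, 40):
        print(code, newoc(code))
```

Ordering the tests by expected frequency means the most common codes exit after the fewest comparisons; the mapping itself does not depend on the order, since the V8 ranges do not overlap.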
NEWOC is called from SUBROUTINE GET (standard part of driver) just prior to the presentation of the twenty-three variables to the XTAB working array.
1.4. The approach in NEWOC was to structure a series of IF tests and, when a test is true, assign V24. The first IF test checks for Group 10 (V24), the largest group, in order to minimize the time spent in the subroutine. After V24 has been assigned, control is returned to SUBROUTINE GET, which then presents twenty-four variables to XTAB.
2. The Year 4 variable is a cross relation variable. In the case of V13, V15, or V18 a new V25 is created if any of the three is equal to 9999999, i.e., not ascertained.
2.1. V13 is AGI
V15 is W & S
V18 is Self-Inc
V25 is Year 4 variable or 99
V4 is tax or calendar year variable
2.2. The description and conditions of V25 are summarized below. When V13 or V15 or V18 equals 9999999, then V25 is assigned the value 99. When this is not true, V25 is set equal to V4.
2.3. The approach is identical to that described in 1.3 and 1.4, except that before the twenty-four variables are presented to XTAB a twenty-fifth variable is created by SUBROUTINE W25, which is called directly after SUBROUTINE NEWOC. Then twenty-five variables are presented to XTAB for distribution.

Documentation:
1. A program listing of DAVID will be attached to the final tested modified program with subroutines. The name of that program will be PSERIES. This should assure that the basic DAVID (driver) can be reconstructed.
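The rule for V25 given in 2.2 reduces to a single test on three variables. A minimal sketch, assuming the record is held as a mapping from variable number to value; the function name `w25`, the dictionary layout, and the sample year value are illustrative and are not taken from the FORTRAN source:

```python
NOT_ASCERTAINED = 9999999


def w25(record):
    """Return V25 for a record keyed by variable number.

    V25 is 99 when AGI (V13), W + S (V15) or Self-Inc (V18) is not
    ascertained; otherwise it equals the year variable V4.
    """
    if NOT_ASCERTAINED in (record[13], record[15], record[18]):
        return 99
    return record[4]


if __name__ == "__main__":
    # Hypothetical records; the encoding of the year in V4 is assumed.
    complete = {4: 57, 13: 5200, 15: 4800, 18: 0}
    missing_agi = {4: 57, 13: NOT_ASCERTAINED, 15: 4800, 18: 0}
    print(w25(complete), w25(missing_agi))
```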
The modifications in PROGRAM PSERIES will reflect the current status of the program source deck that was once PROGRAM DAVID.
http://www.ssc.wisc.edu/wais/WAIS678002.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS678002.txt

William Duddleston 1967
Benefit Analysis Format for E1(805) Tape
August 1, 1967
WAIS paper 678-003
Benefit Analysis Formats Social Security Earnings Data - 805

William Duddleston WAIS 678-003 October 13, 1967 Revision 2
Revised by Bill Katke
Benefit Analysis Format For E1(805) Tape

The job plan for the extraction and the integration of beneficiary and income data, which was sketched out by Mr. David in WAIS 667-045, called for the creation of various specialized extract tapes. In this paper, E1(805) will be briefly described and its format will be presented. This tape will contain pertinent information regarding certain accounts extracted from the 805 file and the Reformatted ID file (the FFID l file with a SS claims indicator in column 126). An account will be extracted if positions 2-9 on both tapes are identical and if either one of these two conditions is satisfied:
a) account has a 1 in col. 126 of the RID tape.
b) positions 123-130 (birth year) fall in the range 00 < xx < 04; 60 < xx < 99.
This extract will provide us with age and death data and social security earnings data for all individuals in our sample who have received social security payments or who reached age 60 by 1964. This extract will play an integral role in subsequent benefit analyses. E1(805) will be used as one of the input tapes which will create BNAN(1), the extract Mr. David will use in his study of the retirement decision. In addition, E1(805) or the R2ID (FID tape with Indicators) must be used in creating E2(MF), an extract of the Master File containing all the accounts listed on E1(805). E2(MF), in turn, will be integrated with the benefit year record tape (BNYR) to create E6(MF-BNYR), the other input tape which will be used to create BNAN(1).
Position No. of Cols.
Variable Name Input File Position on Input file 1 1 L - 2-9 8 WAIS ID 805 2-9 10-18 9 Social Security # 805 10-18 19-20 2 County Code 805 120-121 21-22 Last Year of Data 805 122-123 23 1 Multiple Account Number location 805 124 24 1 indication that name on Record 805 125 2 2 does not agree with Finder Card 126-127 Month-Birth 805 27-30 4 Year of Birth 815 129-130 Race Indication 805 131 32 1 Sex (alpha) 805 132-138 33-37 5 Indication of Railroad activity 805 139-143 38-39 2 Newly Posted Credit Earnings Item 805 144-145 40-41 2 Additional Earnings Indication 805 146-143 42-43 2 Active Earnings Discrepancy 605 148-149 44-48 5 Account in Benefit Status Other 805 150-154 than Disability 49-52 4 Benefit Status Other than 805 155-158 53-55 3 Disability was Terminated 159-161 Account in Disability Benefit Status 805 56-59 or Disability Freeze Status 805 16 2-165 Disability Status was Terminated 60-63 4 Credit Indication 805 166-169 64-68 5 Earnings Statement Issued in year 805 170-174 65-71 Indicated 805 175-177 Indication of Self-Employment 72-74 3 Activity 805 178-180 Indication of Delinquint Self 75-76 2 Employment Item 835 181-182 Indication of Agricultural Activity 77-85 9 Earnings, 1937 to Date 803 183-191 86-87 2 Wage Quarters of Coverage, 1947 805 192-193 33-39 2 to Date 805 194-195 Self-Employment: Quarters of Coverage 1951 to Date Position No. of Cols. 
90-91 2 92-100 9 101-102 2 103-104 2 105-108 4 109 1 110-113 4 114-118 4 119 122 4 123 1 124-127 4 128 -131 4 132 1 133-136 4 137-140 4 142 143-146 147-150 1 1 Position Variable Name on Input File Input File Agricultural Quarters of Coverage 805 196-197 1955 to Date Earnings 1951 to Date 805 198-206* Wage Quarters of Coverage 1951 805 207-208 to Date Self-Employment Quarters of 805 209-210 Coverage 1951 to Date 1951 Earnings 805 211-218 1951 Self-Employment Quarters 805 219 of Coverage 1952 Earnings 805 220-227* 1953 Earnings 805 228 1953 Quaterly Wage Quarters of 805 237-240 Coverage Pattern 1953 Self-Employment Quarters of 805 241 Coverage 805 242-249* 1954 Earnings 1954 Quarterly Wage Quarters of 805 250-253 Coverage Pattern 1954 Self-Employment Quarters of 805 254 Coverage 1955 Earnings 805 255-262 1955 Quarterly Wage Quarters of 805 263-266 Coverage Pattern 1955 Self-Employment Quarters of 805 257 Coverage 1955 Agricultural Quarters of 805 268 Coverage 19.56 Earnings 865 269-276* 1956 Quarterly Wage Quarters of 805 277-280 Coverage Pattern 1956 Self-Employment Quarters of 805 281 Coverage 1956 Agricultural Quarters of 805 282 Coverage 1957 Earnings 805 283-290 1957 Quarterly Wage Quarters of 805 291-294 Coverage. Pattern 1957 Self-Employment Quarters of 805 295 Coverage Position No. 
of Columns  Variable Name  Input File  Position on Input File

162       1   1957 Agricultural Quarters of Coverage               805   296
163-166   4   1958 Earnings                                        805   297-304*
167-170   4   1958 Quarterly Wage Quarters of Coverage Pattern     805   305-308
171       1   1958 Self-Employment Quarters of Coverage            805   309
172       1   1958 Agricultural Quarters of Coverage               805   310
173-176   4   1959 Earnings                                        805   311-318*
177-180   4   1959 Quarterly Wage Quarters of Coverage Pattern     805   319-322
181       1   1959 Self-Employment Quarters of Coverage            805   323
182       1   1959 Agricultural Quarters of Coverage               805   324
183-186   4   1960 Earnings                                        805   325-332*
187-190   4   1960 Quarterly Wage Quarters of Coverage Pattern     805   333-336
191       1   1960 Self-Employment Quarters of Coverage            805   337
192       1   1960 Agricultural Quarters of Coverage               805   338
193-196   4   1961 Earnings                                        805   339-346
197-200   4   1961 Quarterly Wage Quarters of Coverage Pattern     805   347-350
201       1   1961 Self-Employment Quarters of Coverage            805   351
202       1   1961 Agricultural Quarters of Coverage               805   352
203-206   4   1962 Earnings                                        805   353-360*
207-210   4   1962 Quarterly Wage Quarters of Coverage Pattern     805   361-364
211       1   1962 Self-Employment Quarters of Coverage            805   365
212       1   1962 Agricultural Quarters of Coverage               805   366
213-216   4   1963 Earnings                                        805   367-374*
217-220   4   1963 Quarterly Wage Quarters of Coverage Pattern     805   375-378
221       1   1963 Self-Employment Quarters of Coverage            805   379
222       1   1963 Agricultural Quarters of Coverage               805   380
223       1   Claims Indicator                                     805   381
224       1   RID Indicator (claim paid 1) (claims not paid 0)     RID   126
225-228   4   Date of Death Month/year                             SAD   52-57
229       1   Record Mark                                          805   382

*Birth Year: the program will insert the appropriate century:
00 < year < 59 becomes 1900 < year + 19 < 1959
60 < year < 99 becomes 1860 < year + 18 < 1899

*Sex Variable: Male <-- 0, Female <-- 1

http://www.ssc.wisc.edu/wais/WAIS678003.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS678003.txt

Paul Robertson 1967. Procedural Standards for WAIS Projects. July 6, 1967. WAIS paper 678-004. Administration.

Paul Robertson July 6, 1967 WAIS 678-004

PROCEDURAL STANDARDS FOR WAIS PROJECTS

Given the limited funds and personnel which are available to WAIS, it is imperative that we be able to make efficient use of our time. To do this, we must develop methods of accurately judging the amounts of time which will be required to complete the various stages of the jobs which are proposed and of properly assigning staff members to each. This will not be a simple process since the necessary methods can only be refined through experience, but the effort should pay off in the long run. To accomplish this, it is recommended that the following procedures for planning and carrying out jobs be adopted:

(1) The suggestions made in 667-027 should be implemented. Jobs should be carefully planned in advance so that changes in objective which are made during the analysis and programming stages will be the exception rather than the rule. Time estimates ought to be made in advance and revised at regular intervals. Finally, job-description sheets should be required for each project and, in the case of large projects, for each significant step.

(2) The practice of having the staff members submit weekly time-sheets should be continued. Together with the job-description sheets, the time-sheets can be used to allot assignments to each staff member in such a way that all are busy yet none is over-worked.
Work assignments should be made weekly.

(3) Periodic "position statements" for all WAIS activities should be issued. These statements should list which stages of each project have been completed, which have not, and how long it is estimated will be required to complete those which remain. All currently relevant WAIS working papers should also be listed. Because of the interlocking nature of many of the projects, they should all be reported in the same paper so that all staff members will be able to keep track of the progress of each job which affects their particular activities. It is recommended that the first of the papers be issued as soon as possible and that subsequent ones follow quarterly.

It is strongly recommended that the following procedures which are outlined in 667-027 be implemented as soon as possible.

(a) A list of priorities covering all WAIS jobs currently in progress should be established. Barring emergencies, the list should be unalterable. Whenever a new job is approved, it should be added to the list.

(b) The order of stages to be followed in undertaking any job, as given on pages 3-5 of 667-027, should be adhered to as closely as is feasible. Personnel should be assigned to specialize in each stage.

Taken together, these procedures should make it possible both to refine our methods of making time-estimates and to keep tabs on where each project stands at the moment and how well it is proceeding.

B. It is also felt that procedures for the composition and filing of WAIS working papers ought to be revised. The working papers serve four main purposes:

1. As reports on the planning of new activities;
2. As instruction booklets for methods of carrying out those activities;
3. As vehicles for gripes on old procedures and suggestions as to new ones; and
4. As reports on completed jobs.

At present, there are quite a number of the first three varieties of papers and very few of the fourth.
This is unfortunate because it makes it difficult to determine which projects have been completed and in what ways they may differ from the original plans. It is therefore recommended that a paper be submitted whenever a project is completed which details the final form of the project and the amount of time spent on it. In the case of larger projects, it may be advisable to have a report submitted for each of the larger stages as it is finished. Papers for publication which deal with aspects of WAIS may easily be assigned a number and placed in the files along with other working papers.

There are two other major defects with our system of working papers as it stands now. For one thing, there is no simple way of determining which papers are current and which have been replaced by other papers. Accordingly, it is proposed that in the future, whenever a paper is written which revises another paper in the WAIS files, a note be placed on the top of the first pages which clearly indicates which paper is now outdated. This would make it much easier for the present staff to refresh their memories and for new staff members to become acclimatized.

The second defect is that the papers are not properly indexed. To remedy this, it is recommended that the current author index be corrected and updated and that a subject index be created. Both indices should be typed on file cards (one card per paper) so that they can be easily kept up-to-date. Ashok Bhargava has already begun work on compiling these indices. It is hoped that the secretary will have time to type the completed versions.

Job-Description Sheet (Preliminary)

Job Title:
Job Number:
WAIS Paper: #          Date:
Instructions: [WAIS Paper #   ]          Date:

Planner:             hours per week.
Analyst:             hours per week.
Programmer:          hours per week.
Clerical Staff:      hours per week.
                     hours per week.
                     hours per week.

Time-Estimate:
Man-hours:
Expected Completion Date:
Actual Completion Date:**
Initial     10%*     50%*     Time Actually Spent**

Start of processing depends on jobs numbered:
Completion of this job essential for jobs numbered:
Comments:

* To be filled in after ten and fifty percent, respectively, of the job have been completed.
** To be filled in after completion of job.

WAIS Weekly Progress Report

A. Name:          Date:
B. Total number of hours worked this week:
C. Number of hours spent on each job on which you worked:
   Hours     Jobs
D. Next stages of jobs, as you see them, and how many hours you expect to spend on each in the coming week (barring unforeseen assignments):
   Stage     Hours
E. How the current week's work (reported in Section C) compared with the expectations given in Section D of last week's report and with the assignments which you received for the week:
Comments:

http://www.ssc.wisc.edu/wais/WAIS678004.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS678004.txt

John deVries 1967. File Standards for WAIS Files (Preliminary). July 6, 1967. WAIS paper 678-006. Administration; Maintenance System - Files, Data, Etc.; Programs.

John deVries Working Paper July 6, 1967 WAIS 678-006

File Standards For WAIS Files (Preliminary)

The following set of standards should be adopted for all files to be created and, where feasible, also for permanent WAIS files currently in existence.

1. Documentation

1.1. File Catalog

While the current version of the file catalog sheet requires many of the entries which are essential for efficient file-handling, it was felt that additional specific entries should be provided to promote uniform and complete file-documentation. The following sheet is a suggested means to achieve this uniform documentation (see next page). In addition to the entries on the suggested "file description sheet", the file catalog should contain, for each file:

(i) a complete and up-to-date record format specifying variable-names, field tags or entry codes, valid ranges for each variable, etc.
A standard COBOL record-description could be used here;
(ii) a complete listing of all WAIS papers (and other written materials, where relevant) which give information about the file;
(iii) a "file history sheet" giving essentially the same information as is currently requested on the back side of the "file catalog sheet" (see Appendix).

WAIS FILE DESCRIPTION SHEET

1. File Identification:
1.1. File name
1.2. External file identification
1.3. Usage restrictions
1.4. Data base
1.5. Type of file

2. Physical Attributes of the File (if tape):
2.1. Blocking factor
2.2. Logical record length
2.3. Variable length record terminator
2.4. Padding characters
2.5. Number of reels
2.6. Format and content of tape labels
2.7. Parity
2.8. File origin (computer)
2.9. Density

3. Physical Attributes of the Observation (only required if "observation" and "logical record" do not coincide):
3.1. Observation length
3.2. Variable length observation terminator

4. Logical Attributes of the File
4.1. Sorting sequence (refer to format for variable-names, etc.).

1.2. "File Folder"

In addition to the entries in the file catalog, each file should have a "documentation folder", containing:

(i) a code book giving a detailed description of the file as a whole and of individual variables. For many WAIS files, a selected set of relevant WAIS papers could fill this requirement;
(ii) a current listing of a small set of records. For small files, a complete listing would be advisable; for large files, some records from the beginning and some from the end of the file would be sufficient. For multi-reel files, some records from the beginning and end of each reel should be printed;
(iii) frequency tables or marginals for all variables in the file;
(iv) a complete printout of a test file. This could be either a small sample taken from the main file or (preferably) a set of fictitious records, which can be used to check and debug programs using the file;
(v) frequency tables or marginals for the test file data.
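The frequency tables or marginals called for in items (iii) and (v) can be produced mechanically once a record format exists. The following is a minimal sketch in Python, assuming fixed-width records; the layout, field names and test records are hypothetical and are not taken from any WAIS file.

```python
from collections import Counter

# Hypothetical layout: field name -> (first column, last column), 1-indexed, inclusive.
LAYOUT = {"year": (1, 2), "sex": (3, 3)}


def marginals(records, layout):
    """Return one frequency table (a Counter) per field named in the layout."""
    tables = {name: Counter() for name in layout}
    for record in records:
        for name, (first, last) in layout.items():
            # Slice the fixed-width field and tally its raw punched value.
            tables[name][record[first - 1:last]] += 1
    return tables


# A small set of fictitious records, in the spirit of item (iv).
TEST_RECORDS = ["590", "591", "600", "590"]
TABLES = marginals(TEST_RECORDS, LAYOUT)
```

Tallying the raw field contents, rather than converting them to numbers first, means that blanks and stray alphabetic punches show up in the table instead of being lost.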
2. Contents

All files produced or in use by WAIS should satisfy the following conditions:

(i) (tape files only): standard header and trailer labels should be carried on each reel, specifying at least:
-- file name and/or code,
-- reel number,
-- creation date,
-- edition number (cycle number),
-- [trailer labels] record counts and block counts,
-- trailer labels should differentiate between End-of-Reel and End-of-File.
Preferably, labels should be compatible with B5500 operating system standards; if possible, compatibility with the 3600 operating system as well would be desirable.

(ii) all files should have been purged of impermissible characters (alphabetics, special characters, etc., where not permitted). N.B. This requirement implies that "MI" and blanks in amount-fields and coded data for the WAIS Master File ought to be recoded to specific numeric values.

(iii) ideally, all files should have been purged of invalid codes (i.e. codes outside the valid range).

(iv) ideally, all files should have been checked for intra-record and inter-record consistency; inconsistencies should be corrected where possible. Where correction is impossible (either because the source documents are inconsistent too or because the source documents are not available), code expansion and/or insertion of "flags" would be advisable.

(v) card files should have a "lead" card as an equivalent of the tape label mentioned above, specifying at least:
-- file name and/or code,
-- creation date,
-- [if possible] number of cards.
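Conditions (ii) and (iii) amount to two separate tests on each coded field: first for impermissible characters, then for values outside the valid range. A minimal sketch in Python follows; the function name, the valid range of 1-5 and the sample values are assumptions made for illustration only.

```python
def check_field(value, valid_range):
    """Classify one coded field as 'ok', 'impermissible' or 'invalid'.

    'impermissible' covers alphabetics, blanks and special characters
    (condition ii); 'invalid' covers numeric codes outside the valid
    range (condition iii).
    """
    if not (value.isascii() and value.isdigit()):
        return "impermissible"
    low, high = valid_range
    if not low <= int(value) <= high:
        return "invalid"
    return "ok"


# Hypothetical one- and two-column fields checked against an assumed range of 1-5.
SAMPLES = ["3", "7", "MI", "  "]
RESULTS = [check_field(value, (1, 5)) for value in SAMPLES]
```

Keeping the two outcomes distinct matters for the recoding step mentioned in the N.B.: an "MI" or blank field needs a specific numeric code assigned, whereas an out-of-range code needs checking against the source documents.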
APPENDIX

FILE CATALOG SHEET

Identification code
File name
Type of file
Arrangement, sorting sequence
Format described in
Other relevant papers
Summary description of file

http://www.ssc.wisc.edu/wais/WAIS678006.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS678006.txt

John deVries 1967. WAIS Program Performance Standards (Preliminary). July 11, 1967. WAIS paper 678-005. Administration; Programs.

John deVries July 11, 1967 WAIS 678-005

WAIS PROGRAM PERFORMANCE STANDARDS (Preliminary)

Besides documentation standards, which are discussed elsewhere, the following "performance" standards can be specified for programs written for WAIS or processing any of the WAIS files:

1. Label checks

(i) Most programs are "file-specific" or can be made "file-specific" by minor adjustments to a more general program. For all file-specific programs, standard checks on input labels can be specified. The program can contain the file-code or file-name of each of the input files, while the edition-number can be specified at running time by means of either a control-card or an operator-provided type-in. If an input-file at running time is not the one expected by the program, the program should, in most cases, not be run (there should, however, be simple methods to force the acceptance of non-standard input files or other files with a label which is not in total agreement with the label expected by the program).

(ii) Programs should maintain block-counts and record-counts on input files and compare, at the end of a run, the counts with those given in the trailer label for the input file(s). If a discrepancy occurs, a rerun from the start should in most cases be initiated.

(iii) As a logical derivation from the file-standards supplied elsewhere, all programs which produce output files on tape, on cards or on (semi-) permanent disk or similar devices should provide all of these files with standard file-labels (see file-standards for requirements).

2. Re-run Procedures

All WAIS programs with an expected running time exceeding 15 minutes should have standard re-run procedures; this will decrease the probability of loss of computer time.

3. Documentation For Listings

If a program produces a printer file as one of its outputs, the printer file should contain several entries. At the beginning of the file there should be a "header", specifying:
a) The program which produced the file;
b) The cycle number and the date of running;
c) All input files and all other output files;
d) A short statement regarding the purpose of the program, and, where applicable, the larger job or project of which the program produces a part.

On the listing, each page should have a header line clearly identifying each page (giving program name, date of running and page number as a minimum). At the end, a summary of all the counts maintained by the program should be given:
a) Block-counts and record counts for all input files;
b) Block-counts and record counts for all output files (including a printer file where it is produced!);
c) Separate counts for various types of errors (if the program does check for errors);
d) The total running time, if the computer's system does not provide this automatically.

http://www.ssc.wisc.edu/wais/WAIS678005.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS678005.txt

Jan Smith 1967. Survey Processing Proposals. August 10, 1967. WAIS paper 678-012. Survey Data and File.

Jan Smith WAIS 678-012 August 10, 1967

Survey Processing Proposals

The Survey File contains a great quantity of data. The data can be grouped into logical groups, each group containing numerous one, two, and three digit code fields. The file now exists in card images (46 different card type formats). Most card-types contain one or more logical data groups; however, a few groups overlap into two or three card-types (e.g. employment record of R). Some card-types have multiple sequence cards. For example, card 37 data is concerned with stock lots.
If a respondent has numerous lots of stock, his record contains numerous card 37's (one sequence card per lot of stock). To complicate problems, no two respondents have exactly the same card-type combination records.

In order to facilitate working with this file, a more standardized and more specific tape format is needed. Each respondent must have certain card-types (the required card-types and card-type segments are 2, 3, 4, 8, 10, 21, 28, and 29). Various codes on some of these required card-types indicate that certain other cards are required for a certain R. Codes on some of these dependent card-types indicate whether or not other card-types are necessary. Thus, each R has a set of required fixed format card-types, as well as a set of dependent fixed (but variable in number) formats. These circumstances necessitate variable length records; however, to make the file more adaptable to other files, and to utility and general purpose programs, a variation of the variable length record is another alternative. Following are the analyses of four different methods of data processing.

1. Card-image records

The file now exists in card image format. This format was chosen due to the availability of George Loniello's "General Card Edit XS--Program". Logical groups can be coded one to a card image, or to a series of dependent card images. For example, card images 11, 12, and 13 all deal with contributions to charitable organizations; these cards can easily be pulled from the Survey File and processed and analysed as part of another study. Furthermore, card-image format is easily adapted to general purpose and utility programs, and it is fairly easy to write programs for.

However, there are significant disadvantages to card-image format. The Loniello card edit program has never really worked as it was designed to; for all practical purposes, this program does not fit the bill for Survey File (or New Tax Data File) edits.
Because card-image is restricted to an 80 character length, and because the logical data groups vary in length, there is some overflow of logical groups onto more than one card image, as well as large blank fields for other, shorter logical groups. General purpose and utility programs must be designed to handle 46 different card image formats, and up to about 220 card images for a given respondent. Such length overflows the computer's working storage area, making processing more difficult. Some general purpose programs work only for one format. This is true of the Loniello Card Edit Program, which requires that card-images be batched as to card image type.

Checks on card combinations (Presence and Absence) may show illegal combinations. For example, at least five R's claim they are farmers; at the same time they claim they are non-farmers. Actually, they are part-time farmers and should have both farm and non-farm card images. Appropriate codes should be added.

2. Fixed length records

Fixed length format involves reformatting the existing file so that only one record exists for each respondent. The card edit program would not be necessary, since intra-record editing could be performed by COBOL area checks. Such a format would be fairly easy to sort and update. All data would be on one record.

However, a fixed length record must be as long as the longest record. For the Survey File, the length would be about 10,000 characters, which is much too long for ease in handling and for use with general and utility programs (e.g., UPDATEAL is built for a maximum of 1000 characters).

3. Variable length records

Variable length records are used in disc file processing, where there are thousands of respondents, and thousands of possible codes for each respondent. Code fields are numbered or indicated in some way, so that only the appropriate codes appear for a given respondent. This would be the ideal format, as far as processing is concerned.
However, such records cannot be used easily with utility programs or by other programmers. Special purpose programs would be necessary -- the Survey File is not large enough to warrant the required effort and expense.

4. Combination length records

A combination format of fixed length and variable length records is also possible. All required information and coding indicators can be formatted into one fixed length record. The coding indicators would signify which other logical groups were necessary for a certain respondent. Each logical group would be of fixed length, and the groups would be added together like building blocks. Multiple sequence cards could be so indicated in the fixed length record, and a repeating format used in the trailer records.

General programs could be adapted to handle such a format. If COBOL were used to create the new file, the data division could easily be converted into a codebook. Also, logical data groups could be pulled out of the Survey File and put onto tape or printer for analysis and integration with other files.

The Survey File is a source file for information regarding certain aspects of investment behavior. Such segments of data as are necessary for a particular analysis can be easily pulled from the source file. The entire record for a certain respondent can be printed and analyzed just as easily for a complete analysis of the behavior of one person.

Input                   Logic                                                   Output                  Pl Pr Cl
Survey Tape VI 65       Reformat to tape format                                 Survey Tape VII 75      20 1 10 20
Survey Tape VII 75*     Edit (Check for Presence and Absence) 76
                        UPDATEAL
                        Agree? N: 76; Y:                                        Survey Tape VIII 90     5 2 10 50
Survey Tape XI 105**    Create new extract by calling on 65 Data Division       Survey Extract 110      2 0 1 1
Total estimates                                                                                         41 4 34 161

* Replaces 70 and 80
** Replaces 100
Note time estimate changes

http://www.ssc.wisc.edu/wais/WAIS678012.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS678012.txt

Paul Menchik 1979. Letter to Buchler -- SSA. June 18, 1979. WAIS paper 789-008. WAIS-Wealth: General.

MENCHIK WAIS 789-008 JUNE 18, 1979

INSTITUTE FOR RESEARCH ON POVERTY
THE UNIVERSITY OF WISCONSIN
262-6358 AREA CODE 608
SOCIAL SCIENCE BUILDING
1160 OBSERVATORY DRIVE
MADISON, WISCONSIN 53706

Paul L. Menchik
June 18, 1979

Warren Buckler
Room 2F1 Meadows E Bldg.
Social Security Administration
6401 Security Blvd.
Baltimore, Maryland 21235

Dear Mr. Buckler,

I have enclosed the two decks of computer cards constructed according to your specifications. There are 626 cases in this sample. The utility of the processing by your office will be immeasurably great. A check for $850 will be sent to your office shortly in another letter.

Thank you very much for your help.

Sincerely,
Paul L. Menchik
PM/RLF

http://www.ssc.wisc.edu/wais/WAIS789008.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS789008.txt

John deVries 1967. 1959-1964 Wisconsin Income Tax Data: Detailed Instructions for Processing Summary Cards Vol. III. July 25, 1967. WAIS paper 678-008. Master File - Tax Records.

John deVries WAIS 678-008 July 25, 1967

1959-1964 Wisconsin Income Tax Data: Detailed Instructions For Processing Summary Cards, Volume III

This paper is the third volume in the trilogy "New Master Data"; the first two volumes were previously issued as WAIS 667-040 and WAIS 667-042; a summary of the total operation can be found in WAIS 667-035.

9.1. Sort of Form 1 Cards

Purpose: This job has to be run to produce a sorted file of Form 1 summary cards; it will also indicate some incorrect cards.

Input: Card deck CXY.5.2.1.
Action: Sort the input deck on card number (column 2), on year (columns 11-12) and on ID-number (columns 3-10) numerically. Note: there may be some cards with "NN" in columns 9-10; this is acceptable. Put aside the following groups of cards:
-- all those with punches outside the range 1-5 in column 2 (i.e. numerics 0 or 6-9 or alphabetics);
-- all those with alphabetics anywhere in columns 3-12 (except "NN" in columns 9-10);
-- all those with punches other than [XY] in columns 3-4 (where [XY] stands for the number of the name-group you are working on);
-- all those with columns 11-12 not equal to "59" or "60" or "61".
Combine all the cards you put aside into one deck.

Output: Sorted deck of Form 1 Cards; label it CXY.9.1.1. Possibly also a small deck of incorrect cards; label this deck CXY.9.1.2. If you do not get any cards in CXY.9.1.2., mark off job 9.2. as not required (by putting an "X" in the "date started" and "date completed" columns for that job on the log sheet).

9.2. Correction of Incorrect Cards

Purpose: This job is required to correct the errors found in job 9.1.

Note 1: If you find an "X" in the "date started" and "date completed" columns on the log sheet for this job, and if you do not find input cards CXY.9.1.2., you can proceed to the next job.

Note 2: This job is similar to jobs 5.5. and 7.1. as described in WAIS 667-042; we will refer to those sections for details.

Input: Deck CXY.9.1.2., listing LXY.5.1., deck CXY.9.1.1.

Action: Each one of the cards in CXY.9.1.2. is incorrect or seems to be incorrect. Locate the card on listing LXY.5.1. The following types of error can have occurred (see sections 5.5. and 7.1. in WAIS 667-042 for detail):

1. Columns 3-4 do not contain "XY" (where "XY" stands for the number of the name-group you are working on).
If columns 3-4 contain a legitimate name-group number (< 52 or = 70) and if the card is a Form 1 Card (check the format in WAIS 667-013), add the card to the appropriate deck if that deck has not yet been processed beyond job 9.1.; in all other circumstances, put the incorrect card in box CER.

2. Column 2 contains a number outside the range 1-5.

Locate the card on listing LXY.5.1. If you find at least one other card with the same contents in columns 1 and 3-12 as your incorrect card and adjacent to it on listing LXY.5.1., check the following:

a) if the correct card is before your incorrect one on listing LXY.5.1., has a "C" in column 77 and has a number less than 5 in column 2, your incorrect card should have a number in column 2 one higher than that on the correct card. You can have your card duplicated with corrections by the keypunchers.

b) if the correct card is before your incorrect one, but one of the conditions specified under a) above does not apply, place your card in box CER; make a note of this on listing LXY.5.1.

c) if the correct card comes after your incorrect card, has a number higher than "1" in column 2, and your incorrect card has a "C" in column 77, your incorrect card should have a number in column 2 one lower than that on the correct card. Have the card corrected by the keypunchers.

d) if the correct card comes after your incorrect one, but one of the conditions specified under c) above does not apply, place your card in box CER; make a note of this on listing LXY.5.1.

Corrected cards can be inserted in their proper place in deck CXY.9.1.1. (remember that this has been sorted by ID-number, year and card number (columns 3-10, 11-12 and 2)).

3. Columns 3-12 contain an alphabetic punch (other than "NN" in columns 9-10).

Try to determine the correct ID-number or year from the other cards on LXY.5.1.
If you can determine the correct contents, have the card duplicated with corrections by the keypunchers, then insert the corrected cards in their proper position in deck CXY.9.1.1. If you cannot determine the correct ID-number or year, put the incorrect card in box CER; make a note of this on listing LXY.5.1.

4. Columns 11-12 contain numerics outside the valid range.

Valid punches for these columns are "59", "60" and "61". If you find other cards with the same contents in columns 3-10 on listing LXY.5.1., adjacent to the incorrect card, check the following conditions:

a) if the correct card is before the incorrect one, and it has a "1" in column 1 and a "C" in column 77 and a number in column 2 one lower than your incorrect card has in column 2, you can assume that your incorrect card should have the same contents in columns 11-12 as the correct one does; have the keypunchers duplicate the card with corrections.

b) if the correct card is before the incorrect one, and any of the other conditions specified under a) above does not apply, put the card in box CER and make a note of this on listing LXY.5.1.

c) if the correct card is after the incorrect one, and it has a "1" in column 1 and a number in column 2 one higher than the incorrect one does, and your incorrect card has a "C" in column 77, you can assume that your incorrect card should have the same contents in columns 11-12 as the correct one does; have the keypunchers duplicate the card with corrections.

d) if the correct card is after the incorrect one, but not all conditions specified under c) above apply, put the incorrect card in box CER and make a note of this on listing LXY.5.1.

e) if you find a group of incorrect cards, preceded or followed by a correct card, you can handle the group together (provided all cards in the group contain the same error!) according to conditions a-d above.

5. Combinations of errors 1-4 above.
In most cases, the combination of two or more errors on one card makes it virtually impossible to recognize the card; if you find such cases, put them in box CER and make a note of that on listing LXY.5.1. If you can still recognize the card, you can try to process all the errors separately as spelled out in the instructions above.

After all cards in CXY.9.1.2. have been processed, you should have:
   a) no more cards in CXY.9.1.2., and
   b) corrected cards inserted in CXY.9.1.1., and
   c) cards which could not be corrected in box CER.

Change the label of deck CXY.9.1.1. to CXY.9.2.

Output: Deck CXY.9.2. (updated input deck CXY.9.1.1.).

9.3. Listing of Sorted Cards

Purpose: This job is needed to produce a listing of the Form 1 Cards sorted by ID-number, year and card number.

Input: Deck of cards CXY.9.2., program deck P.4.1.

Note: If you find an "X" in the "date started" and "date completed" columns on the log sheet for job 9.2., use deck CXY.9.1.1. instead of CXY.9.2.

Action: Assemble the input deck by placing P.4.1. in front of CXY.9.2. (or CXY.9.1.1.). Submit this combined deck at the input-output room in the Commerce Building; this is also the place where you pick up your output the next day or the day after that. Your output will be a listing (besides your original input cards). Label the listing LXY.9.3. and file it in a binder marked "Sorted Summary Listings Form 1".

Output: Your input decks P.4.1. and CXY.9.2. (or CXY.9.1.1.); store these where you found them, and listing LXY.9.3.

9.4. Check for Missing or Superfluous Cards.

Purpose: This job is required to ensure that we did not lose cards we should have had, and that we did not include cards which do not belong in this particular file.

Input: Card deck CXY.9.2., program deck P.9.4.

Action: Assemble the input deck by placing the program deck in front of the data deck CXY.9.2., and submit this at the input-output room in the Commerce Building. Your output should be available the next day or the day after that.
Your output should contain, besides your input cards, a listing with information about incorrect cards. Label the listing LXY.9.4. and file it in a binder marked "Form 1 error listings".

Note: It is possible that listing LXY.9.4. does not contain any messages regarding incorrect cards. If this is so, job 9.5. is not required; mark this by placing an "X" in the "date started" and "date completed" columns on the log sheet for job 9.5.

Output: Input decks P.9.4. and CXY.9.2. (store these where you found them) and listing LXY.9.4.

9.5. Correction, Deletion or Addition of Cards.

Purpose: To correct the errors found in job 9.4.

Note: It is possible that this job is not required; if you find that your input listing (LXY.9.4.) does not contain any error messages and if you find the "date started" and "date completed" columns on the log sheet for job 9.5. marked with an "X", you can proceed to the next job.

Input: Listing LXY.9.4., card deck CXY.9.2., listing LXY.9.3.

Action: Listing LXY.9.4. contains indications about missing or superfluous cards in deck CXY.9.2. The following situations may occur:

1. Missing Cards

The program which checked for missing and superfluous cards in job 9.4. used the following rules: if a particular card had a "C" in column 77, there should be at least one additional card for that ID-number and year; if the card had a "Z" or a "K" or an "L" or an "A" in column 77, there should be no cards following for that ID-number and year. There should be as many cards for a particular ID-number, year combination as indicated in column 2 (the card number) of the card with the "Z", "K", "L" or "A" in column 77.
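The column 77 and column 2 rules just stated can be sketched in Python. This is only an illustration of the rules as worded here, not a reconstruction of program P.9.4.; the names `check_person_year` and `make_card`, the message texts, and the representation of a card as an 80-character string are all assumptions. Columns are counted from 1 in the text, so column 77 is index 76 and column 2 is index 1.

```python
TERMINAL_CODES = {"Z", "K", "L", "A"}  # column 77: last card of a person-year
CONTINUE_CODE = "C"                     # column 77: at least one more card follows


def make_card(card_number, code):
    """Build an 80-column test card (hypothetical helper, fixed ID and year)."""
    columns = [" "] * 80
    columns[0] = "1"               # column 1
    columns[1] = str(card_number)  # column 2: card number
    columns[2:10] = "00000001"     # columns 3-10: ID-number
    columns[10:12] = "60"          # columns 11-12: year
    columns[76] = code             # column 77
    return "".join(columns)


def check_person_year(cards):
    """Return diagnostics for the cards of one ID-number, year combination.

    `cards` must already be sorted by card number (column 2).
    """
    messages = []

    def report(message):
        # Report each kind of diagnostic only once per person-year.
        if message not in messages:
            messages.append(message)

    last_index = len(cards) - 1
    for index, card in enumerate(cards):
        code = card[76]
        if code == CONTINUE_CODE:
            if index == last_index:
                report("missing card")      # "C" promises a following card
        elif code in TERMINAL_CODES:
            if index != last_index:
                report("superfluous card")  # nothing may follow a final card
        else:
            report("invalid code in column 77")

    # The card number on the final card states how many cards there should be.
    if cards and cards[-1][76] in TERMINAL_CODES and cards[-1][1].isdigit():
        expected = int(cards[-1][1])
        if len(cards) < expected:
            report("missing card")
        elif len(cards) > expected:
            report("superfluous card")
    return messages
```

For example, a person-year consisting of cards 1 and 3 only, with "C" and "Z" in column 77, is reported as having a missing card, because the final card announces three cards and only two are present.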
An error message of a missing card can thus have been caused by:

   a) the "C" in column 77 of the last card for an ID-number, year combination should have been a "Z", "K", "L" or "A"; or
   b) a card was lost or misplaced (due to a mispunch in columns 3-12 or 1); or
   c) the last card for an ID-number, year combination has an incorrect card number in column 2 (too high!).

To decide what was the cause of the error message (if you cannot decide that on the basis of the printout on LXY.9.4. alone), get the folder containing the tax returns for the person whose ID-number is in columns 3-10 on the printout, and check the return for that person, for the year which is indicated in columns 11-12. Use the card formats given in WAIS 667-013 (first revision), pages 10-14, in combination with the information on the tax return. You will notice that the information from the returns is punched, beginning with the top of page 1 and ending at the bottom of page 3. The rule is that the last card for an ID-number, year combination is the one containing the last piece of information (e.g. if a taxpayer does not provide any information on page 3, and the last information is at the end of page 2, there should be four cards for that person-year).

From the inspection of card formats and tax returns you should be able to decide what happened:

   a) if there is a card (or more than one) missing, code the contents of the missing card(s) on 80-column sheets. Be sure to read the general instructions on pages 1-5 of WAIS 667-013 before you do this!
   b) if the last card present has incorrect contents in column 77, have the card duplicated with correction by the keypunchers. Be sure to use the correct code (see page 14 in WAIS 667-013).
   c) if the last card present has an incorrect card number (column 2), have the card duplicated with correction by the keypunchers.

2.
Superfluous Cards

Using rules similar to the ones specified above, the program decided that a card was superfluous if:

   a) the card prior to the superfluous one, for the same person-year, had an "A", "K", "L" or "Z" in column 77; or
   b) there are two (or more) cards with identical contents in columns 1-12; or
   c) there are more cards for a person-year combination than the number in column 2 of the last card for that person-year indicates.

To decide what caused the error, proceed in the same way as specified above for missing cards; corrections will usually be made by duplicating existing cards with corrections (usually column 2 or column 77), or deletion of really superfluous cards.

3. Miscellaneous Errors

The main error printed out under this heading will be the case where a card has an invalid code in column 77 (i.e. not "C", "Z", "A", "K" or "L"). Be very careful in correcting these cases: other cards for that same person-year may be affected.

Note: Due to the way the program has been set up, the diagnostics printed out regarding errors don't always mean what they seem to mean; for example:
   -- if a card has a wrong card number, e.g. "1" instead of "2", it will be printed out as "superfluous card";
   -- if the wrong card number caused the card to be sorted behind the last card for that person-year combination, e.g. "5" instead of "1" in a case where only four cards were required, printouts for "missing card" as well as for "superfluous card" will be generated.

After you have corrected all errors, insert the cards you want to add (new cards and corrected old ones) in their proper position in deck CXY.9.2. (remember that this has been sorted by ID (cols. 3-10), year (cols. 11-12) and card number (col. 2)!). Mark the changes you made on listing LXY.9.3.; if you deleted cards, mark this on listing LXY.5.1. as well. Change the label of your input card deck CXY.9.2. to CXY.9.5.

Output: Card deck CXY.9.5. (corrected deck CXY.9.2.).

9.6.
Separation of Form 1 Summary Cards

Purpose: This job is required to separate the summary cards by card number to allow for detailed validity checks on each subgroup.

Input: Card deck CXY.9.5.

Note: If job 9.5. has been marked off on the log sheet with an "X" in the "date started" and "date completed" columns, use deck CXY.9.2. instead.

Action: Sort your input deck on column 2, numerically. You should get five decks of cards; label them as follows:
   -- cards with a "1" in column 2: CXY.9.6.1.,
   -- cards with a "2" in column 2: CXY.9.6.2.,
   -- cards with a "3" in column 2: CXY.9.6.3.,
   -- cards with a "4" in column 2: CXY.9.6.4.,
   -- cards with a "5" in column 2: CXY.9.6.5.
If you get any cards in addition to these five decks, yell for help. (In two previous stages, checks were made for this type of error; if we have not found all such cases, we might as well give up!)

Output: Card decks CXY.9.6.1-5.

9.7. Card Edits Card 1

Note: Jobs 9.7.-9.11. all follow the same procedure, which has been described in detail in section 4.9. of WAIS 667-040. For all these sections, the only specifications given will be those referring to specific inputs and outputs. The total procedure is outlined in section 4.9.

Phase 1: Input: Assemble in the following order: 1) PCE, 2) K.9.7., except the last card, 3) KMXY, 4) the last card of K.9.7., 5) CXY.9.6.1. Output: Listing LXY.9.7.1.

Phase 2: Mark the changes you made on listing LXY.9.3.

Phase 3: Final output: CXY.9.7.2. [the last number should indicate the total number of times the job was run and could thus be a number other than 2].

9.8.-9.11. Card Edits Cards 2-5

These jobs are all practically the same as job 9.7.; the differences in input-labels and output-labels are summarized in tabular form below:

Job:                 9.7.         9.8.         9.9.         9.10.         9.11.
Control Card Input   K.9.7.       K.9.8.       K.9.9.       K.9.10.       K.9.11.
Data Deck Input      CXY.9.6.1.   CXY.9.6.2.   CXY.9.6.3.   CXY.9.6.4.    CXY.9.6.5.
Phase 1 Output       LXY.9.7.1.   LXY.9.8.1.   LXY.9.9.1.   LXY.9.10.1.   LXY.9.11.1.
Phase 3 Output       CXY.9.7.2.   CXY.9.8.2.   CXY.9.9.2.   CXY.9.10.2.   CXY.9.11.2.

9.12. Creation of Tape-Record

Purpose: To create a tape-record combining the information from the separate card-decks.

Input: Card decks CXY.9.7.2., CXY.9.8.2., CXY.9.9.2., CXY.9.10.2., CXY.9.11.2. [Note that for each one of these decks, the last digit can be some number other than 2. For each deck, make sure that the job producing it -- as indicated by the first pair of numeric codes -- was completed; check this from the log sheet: a job has been completed if the "date completed" column on the log sheet has been filled out], program deck P.9.12., a scratch tape. Check with the supervisor who will tell you how to get this tape and how to make sure that we can identify it later.

Action: Assemble the input cards in the order: P.9.12., CXY.9.7.2., CXY.9.8.2., CXY.9.9.2., CXY.9.10.2. and CXY.9.11.2. Take the assembled input cards, as well as the tape, over to the input-output room in the Commerce Building. Your output should be available there the next day or the day after that.

Output: The original input card decks -- store these where you found them, and your tape (which now contains the data from your input card decks). Label the tape TXY.9.12.

10.1. Sort of Form 2 Cards

Note: Section 10 follows essentially the same structure as section 9; for all the jobs in this section, therefore, details will be found in the corresponding jobs in section 9. Write-ups in section 10 will only specify input-files, output-files and circumstances where the jobs differ in small details.

Input: Card deck CXY.5.2.2.

Action: Sort input cards on card number (col. 2), on year (columns 11-12) and on ID-number (columns 3-10) numerically. The only valid year for columns 11-12 is "62"; reject all cases with different contents.

Output: Sorted data deck CXY.10.1.1., possibly incorrect cards CXY.10.1.2. (if you don't find any incorrect cards, check off job 10.2. as not required).

10.2.
Correction of Incorrect Form 2 Cards

Note: For details see job 9.2.

Input: Cards CXY.10.1.2. (incorrect cards), listing LXY.5.1., deck CXY.10.1.1. (sorted, correct data).

Action: Correct all cards in CXY.10.1.2.; note that the only valid year (columns 11-12) for this deck is "62".

Output: Deck CXY.10.2. (updated input deck CXY.10.1.1.).

10.3. Listing of Sorted Cards

Note: For details see job 9.3.

Input: Deck of cards CXY.10.2., program P.4.1. (if job 10.2. was not run, use cards CXY.10.1.1. instead).

Action: Assemble the input (P.4.1. in front of CXY.10.2.) and submit at the input-output room in the Commerce Building.

Output: Input decks P.4.1. and CXY.10.2., listing LXY.10.3.

10.4. Check for Missing or Superfluous Cards

Note: For details, see job 9.4.

Input: Data cards CXY.10.2., program P.10.4.

Action: Assemble input cards by placing P.10.4. in front of CXY.10.2., then submit at the input-output room in the Commerce Building.

Output: Input decks P.10.4. and CXY.10.2. (store these where you found them), listing LXY.10.4.

10.5. Correction, Deletion or Addition of Cards

Note: For details, see job 9.5.

Input: Listing LXY.10.4., card deck CXY.10.2., listing LXY.10.3.

Action: See job 9.5.

Output: Card deck CXY.10.5.

10.6. Separation of Form 2 Summary Cards

Note: For details, see job 9.6.

Input: Card deck CXY.10.5.

Action: See job 9.6.

Output: Card decks CXY.10.6.1., CXY.10.6.2., CXY.10.6.3., CXY.10.6.4., CXY.10.6.5.

10.7.-10.11. Card Edits Cards 1-5

Note: For details, see jobs 9.7.-9.11. The following table specifies the labels for input cards and output cards and listings:

Job                  10.7.         10.8.         10.9.         10.10.         10.11.
Control Card Input   K.10.7.       K.10.8.       K.10.9.       K.10.10.       K.10.11.
Data Deck Input      CXY.10.6.1.   CXY.10.6.2.   CXY.10.6.3.   CXY.10.6.4.    CXY.10.6.5.
Phase 1 Output       LXY.10.7.1.   LXY.10.8.1.   LXY.10.9.1.   LXY.10.10.1.   LXY.10.11.1.
Phase 3 Output       CXY.10.7.2.   CXY.10.8.2.   CXY.10.9.2.   CXY.10.10.2.   CXY.10.11.2.

10.12. Creation of Form 2 Tape-Record

Note: For details, see job 9.12.
Input: Card decks CXY.10.7.2., CXY.10.8.2., CXY.10.9.2., CXY.10.10.2., CXY.10.11.2., program P.10.12., a scratch tape.

Action: See job 9.12.

Output: Tape TXY.10.12. plus all input card decks.

11.1. Sort of Form 3 Cards

Note: Section 11 follows essentially the same structure as section 9; for all the jobs in this section, therefore, details will be found in the corresponding jobs in section 9. Write-ups in section 11 will only specify input-files, output-files and those circumstances where the jobs differ in small details.

Input: Card deck CXY.5.2.3.

Action: See job 9.1. for details. The only valid years for columns 11-12 are "63" and "64".

Output: Data decks CXY.11.1.1., possibly CXY.11.1.2. (if no cards are found to belong in CXY.11.1.2., check off job 11.2. as not required).

11.2. Correction of Incorrect Form 3 Cards

Note: For details, see job 9.2.

Input: Cards CXY.11.1.2. (incorrect cards), listing LXY.5.1., deck CXY.11.1.1. (sorted, correct data).

Action: Correct all cards in CXY.11.1.2.; the only valid years (columns 11-12) for this deck are "63" or "64".

Output: Deck CXY.11.2. (updated input deck CXY.11.1.1.).

11.3. Listing of Sorted Cards

Input: Deck of cards CXY.11.2., program P.4.1. (if job 11.2. was not required, use deck CXY.11.1.1. instead of CXY.11.2.).

Action: Assemble the input (P.4.1. in front of CXY.11.2.) and submit at the input-output room in the Commerce Building.

Output: Input decks P.4.1. and CXY.11.2., listing LXY.11.3.

11.4. Check on Missing or Superfluous Cards

Note: For details, see job 9.4.

Input: Data cards CXY.11.2., program P.11.4.

Action: Assemble input cards by placing P.11.4. in front of CXY.11.2., then submit at the input-output room in the Commerce Building.

Output: Input decks P.11.4. and CXY.11.2. (store these where you found them), listing LXY.11.4.

11.5. Correction, Addition or Deletion of Cards

Note: For details, see job 9.5.

Input: Listing LXY.11.4., card deck CXY.11.2., listing LXY.11.3.

Action: See job 9.5.
Output: Card deck CXY.11.5.

11.6. Separation of Form 3 Summary Cards

Note: For details, see job 9.6.

Input: Card deck CXY.11.5.

Action: See job 9.6.

Output: Card decks CXY.11.6.1., CXY.11.6.2., CXY.11.6.3., CXY.11.6.4., CXY.11.6.5.

11.7.-11.11. Card Edits Cards 1-5

Note: For details, see jobs 9.7.-9.11. The following table specifies the labels for input cards and output cards and listings:

Job                  11.7.         11.8.         11.9.         11.10.         11.11.
Control Card Input   K.11.7.       K.11.8.       K.11.9.       K.11.10.       K.11.11.
Data Deck Input      CXY.11.6.1.   CXY.11.6.2.   CXY.11.6.3.   CXY.11.6.4.    CXY.11.6.5.
Phase 1 Output       LXY.11.7.1.   LXY.11.8.1.   LXY.11.9.1.   LXY.11.10.1.   LXY.11.11.1.
Phase 3 Output       CXY.11.7.2.   CXY.11.8.2.   CXY.11.9.2.   CXY.11.10.2.   CXY.11.11.2.

11.12. Creation of Form 3 Tape-Record

Note: For details, see job 9.12.

Input: Card decks CXY.11.7.2., CXY.11.8.2., CXY.11.9.2., CXY.11.10.2., CXY.11.11.2., program P.11.12., a scratch tape.

Action: See job 9.12.

Output: Tape TXY.11.12. plus all input card decks.

12.1. Sort of Form 4 Cards

Note: Section 12 follows essentially the same structure as section 9; for all the jobs in this section, therefore, details will be found in the corresponding jobs in section 9. Write-ups in section 12 will only specify input-files, output-files and those circumstances where the jobs differ in small details.

Input: Card deck CXY.5.2.4.

Action: See job 9.1. for details. The only valid years for columns 11-12 are "62", "63" and "64". The only valid card numbers in column 2 are "1", "2" or "3".

Output: Data decks CXY.12.1.1., possibly CXY.12.1.2. (If no cards are found to belong in CXY.12.1.2., check off job 12.2. as not required.)

12.2. Correction of Incorrect Form 4 Cards

Note: For details, see job 9.2.

Input: Cards CXY.12.1.2. (incorrect cards), listing LXY.5.1., deck CXY.12.1.1. (sorted, correct cards).
Action: Correct all cards in CXY.12.1.2.; the only valid years (columns 11-12) for this deck are "62", "63" or "64"; the only valid card numbers (column 2) are "1", "2" or "3".

Output: Deck CXY.12.2. (updated input deck CXY.12.1.1.).

12.3. Listing of Sorted Cards

Input: Deck of data cards CXY.12.2., program P.4.1. (if job 12.2. was not required, use deck CXY.12.1.1. instead of CXY.12.2.).

Action: Assemble the input (P.4.1. in front of CXY.12.2.) and submit at the input-output room in the Commerce Building.

Output: Input decks P.4.1. and CXY.12.2., listing LXY.12.3.

12.4. Check on Missing or Superfluous Cards

Note: For details, see job 9.4.

Input: Data cards CXY.12.2., program P.12.4.

Action: Assemble the input cards by placing P.12.4. in front of CXY.12.2., then submit at the input-output room in the Commerce Building.

Output: Input decks P.12.4. and CXY.12.2. (store these where you found them), listing LXY.12.4.

12.5. Correction, Addition or Deletion of Cards

Note: For details, see job 9.5.

Input: Listing LXY.12.4., card deck CXY.12.2., listing LXY.12.3.

Action: See job 9.5.

Output: Card deck CXY.12.5.

12.6. Separation of Form 4 Summary Cards

Note: For details, see job 9.6.

Input: Card deck CXY.12.5.

Action: See job 9.6. Note that this form contains only cards 1, 2 and 3 (punched in column 2).

Output: Card decks CXY.12.6.1., CXY.12.6.2. and CXY.12.6.3.

12.7.-12.9. Card Edits Cards 1-3

Note: For details, see jobs 9.7.-9.11. The following table specifies the labels for input cards and output cards and listings:

Job     Control Card Input   Data Deck Input   Phase 1 Output   Phase 3 Output
12.7.   K.12.7.              CXY.12.6.1.       LXY.12.7.1.      CXY.12.7.2.
12.8.   K.12.8.              CXY.12.6.2.       LXY.12.8.1.      CXY.12.8.2.
12.9.   K.12.9.              CXY.12.6.3.       LXY.12.9.1.      CXY.12.9.2.

12.10. Creation of Form 4 Tape-Record

Note: For details, see job 9.12.

Input: Card decks CXY.12.7.2., CXY.12.8.2., CXY.12.9.2., program P.12.10., a scratch tape.

Action: See job 9.12.

Output: Tape TXY.12.10. plus all input card decks.

13.1.
Merge of Summary Tape-Records

Purpose: This job is required to combine the summary-records, which were produced in sections 9-12, into one file. Several checks on consistency will be made in the process.

Input: Data tapes TXY.9.12., TXY.10.12., TXY.11.12., TXY.12.10., a scratch tape, program deck P.13.1.

Action: Take the tapes and the program deck and submit all this at the input-output room in the Commerce Building. This is the place where your output should be available the next day or the day after that.

Output: Your original input tapes (label the scratch tape TXY.13.1.); store these where you found them; program deck P.13.1., and a listing; label the listing LXY.13.1. and file it in a binder marked "Summary error-listings".

13.2. Correction, Addition or Deletion of Records

Purpose: This job is required to correct the errors found during job 13.1.

Input: Listing LXY.13.1., tape TXY.13.1., a scratch tape, program UPDATEAL, program P.13.2.

Action: The action in this job is subdivided into three separate phases:

Phase 1: Identification of the Error

Listing LXY.13.1. contains the following types of error:

1. Duplicate Records

The program determined that duplicate records existed if it found two (or more) records with the same ID-number, Social Security number and year. The two records found will be printed out on listing LXY.13.1. as well. Several situations could have caused the program to produce the error message; to decide what caused the error and how to correct it, you will need the folder containing the returns for the person whose ID-number is contained in positions 2-9 of the record. The following situations probably describe only part of the different complications which could have arisen:

   a) a record was punched twice. In such cases, the two records printed out on LXY.13.1. ought to be completely identical (except, possibly, for keypunching errors).
If this is the case, you can delete one of the records (if there are keypunching errors, delete the record with the largest number of errors; correct remaining errors on the record you are not deleting). b) the year of the return (pos. 11-12 in the record) was punched incorrectly for at least one of the records. From the returns you find in the folder, combined with the contents of the tape-records, printed out in LXY.13.1., you may be able to determine whether this was the case, and if that were so, which of the two records has to be corrected. You can correct the incorrect record by means of UPDATEAL control cards. c) The ID-number (pos. 2-9 in the record) was punched incorrectly for at least one of the records listed. Since the program, in order to produce this error message, must also have found equal contents in the Social Security field (pos. 395-403), this case will occur mainly if one of two situations exists: (i) the two records belong to a husband-wife unit, where the wife uses her husband's Social Security number, or (ii) the two persons whose records became combined on the tape do not have (or, at least, indicate) a Social Security number. If the first case occurred, you should be able to verify this by checking the tax returns in the folder. You can then decide which record is carried under an incorrect ID-number and correct it accordingly (using UPDATEAL control cards). If the first case did not occur, you should be able to determine, from the tax returns, which record really belongs to the ID-number -- year return indicated in positions 2-11; chances are very small that you will be able to determine the correct ID-number for the other record. Delete the incorrect record from the file, unless you can positively identify the correct ID-number (in that case, have the supervisor check your conclusion!). 2. 
Mismatched Records

The program determined that mismatched records existed if it found two or more records for an ID-number, but not all records carried the same Social Security number. The following cases are examples of what could have happened:

   a) there was a mispunch in one of the Social Security numbers. You should be able to verify this quite easily: the Social Security numbers concerned should resemble each other closely (generally, in such cases, 7 or 8 of the 9 positions are correct); also, the returns you find in the folder should match the records printed out. If a mispunch occurred, you can correct the mistake by means of an UPDATEAL control card.
   b) the person has "multiple" Social Security numbers. If this is the case, you should find a "1" in the multiple Social Security number indicator (pos. 1); also, you should find all Social Security numbers for that person written on the back of the code sheet (which has been filed with the tax returns) -- these numbers should contain at least those numbers under which tape records are carried. The top number on the list is the "primary" number. If some records for a person are carried under a "secondary" number, set up UPDATEAL control cards to change all "secondary" numbers for a person to the "primary" number you found on the back of that person's code sheet.
   c) one of the records has an incorrect ID-number. If this is the case, the "incorrect" Social Security number must not be indicated on any of the person's tax returns and/or code sheets; also, the record containing the "incorrect" number on listing LXY.13.1. should not reflect any return in the person's folder.

After you have determined which of the returns do not belong to the person whose ID-number they carry, you may attempt to find the right "owner" of the return. Assume that the ID-number is incorrect but that the Social Security number is correct.
Now locate that Social Security number on a listing which has been sorted by Social Security number and connects Social Security numbers with ID-numbers. If your Social Security number is not contained on that listing, you will in most cases not be able to find out to whom the record belongs; the only possible action in such case is to delete the record from the file (using UPDATEAL control cards). If you do find the Social Security number on the listing, locate the ID-number which is connected with it; then take the folder which contains the tax returns for the person with that ID-number. Try to match the record printed in LXY.13.1. with the return contained in the folder. If the two sets of information match, the ID-number under which the tax returns were filed in the folder is your correct ID-number; you can set up control cards to correct the situation. If the two sets of information do not match, you will have to delete the incorrect record from the tape-file, using UPDATEAL control cards.

Phase 2: Correction of the Error

If you have identified all the errors on listing LXY.13.1. in phase 1 of this job, you can set up UPDATEAL control cards to make the changes. Check the format of the control cards in WAIS 667-025 (first revision); check the format of your data-record in WAIS 667-030. Indicate the action you took briefly on LXY.13.1. When all your control cards have been coded and keypunched, you can run program UPDATEAL to make the corrections. Your input is: program UPDATEAL (card deck), your control cards, data tape TXY.13.1., a scratch tape. Your output will be: your original input data, with the scratch tape now containing the corrected file. Label the scratch tape TXY.13.2.

Phase 3: Verification of the Corrections

In order to verify the corrections you made in phase 2 of this job, submit program P.13.2. (this is a restricted version of program P.13.1. which produced the tape-file originally), using TXY.13.2. as input.
If you corrected all errors on LXY.13.1. and if you did not introduce any new errors with your corrections, your output listing will indicate this. If there are still errors left, repeat job 13.2. as many times as are required to correct all inconsistencies indicated by program P.13.2. Output: Your final output will be tape TXY.13.2. (i.e. the last version produced, where program P.13.2. did not indicate any remaining errors). 13.3. Check for Duplicate Social Security Number Purpose: This job has to be done to determine cases where two persons have returns containing the same Social Security number. Input: Data tape TXY.13.2., program P.13.3. Action: Submit the input at the input-output room in the Commerce Building; this is also the place where you pick up the output the next day or the day after that. Output: Your input data, as well as a listing. Label the listing LXY.13.3. and file it in a binder marked "Summary error-listings". Note: If listing LXY.13.3. does not contain any error messages, you can mark off job 13.4. as not required. 13.4. Correction of Incorrect Records Purpose: This job is required to correct the inconsistencies found in job 13.3. (you will find an "X" in the "date started" and "date completed" columns on the log sheet if this job is not required). Input: Listing LXY.13.3., data tape TXY.13.2., a scratch tape, program deck UPDATEAL, program deck P.13.3. Action: The action in this job is subdivided into a number of separate phases: Phase 1: Locating and Identifying the Error Listing LXY.13.3. contains printouts for all pairs (or groups) of people with the same Social Security number. To find the right "owner" of the number, go through the following steps: 1) Check listing LXY.4.13.; this is a list giving information comparable to that contained on LXY.13.3., but from the ID-cards rather than the summary-cards. LXY.4.13. should contain notes indicating the action taken in job 4.14. (the equivalent of job 13.4.). 
For all cases which were listed on LXY.13.3. as well as on LXY.4.13., you can accept the decision made in job 4.14. and correct the record on TXY.13.2. accordingly. For cases on LXY.13.3. for which there is no equivalent on LXY.4.13.:

2) Get the folders containing the information for the people whose Social Security numbers you are checking; check the Social Security numbers on the tax returns in these folders;

3) If one of the numbers is incorrect (due to coding or punching errors), set up control cards to correct the Social Security numbers (see job 4.12., section 9.3. for details; the data format to be used is given in WAIS 667-030, pages 5-9).

4) If both (or all) numbers seem to be correct (i.e. the tax returns for both -- or all -- people carry the same Social Security number):

   (i) the group may contain a husband and a wife who are using the same Social Security number. In that case, "allot" the number to the husband and change the wife's number to blanks (unless the tax returns give you an indication that she does have another number, or unless the tax returns give you a clear indication that the husband has another number and the "disputed" number really belongs to the wife), using control cards as described in job 4.12., section a.3.
   (ii) two (or more) "people" may in fact be one and the same person (compare the addresses, names, occupation codes, etc., on the code sheets). If "they" are the same person, show the case to the supervisor. Some of the records on the tape-file will have to be corrected; the supervisor will tell you how to correct the case.
   (iii) the records "sharing" the same Social Security number may belong to different people who are not in any way related to each other. Search their tax returns for any indication of another Social Security number. If you find one, make the correction by setting up a control card according to WAIS 667-025.
If you don't find any indications that anybody has another Social Security number, mention this to the supervisor. Mark off the error on listing LXY.13.3. as "cannot be corrected". Phase 2: Making the Corrections When you have gone through all the errors on LXY.13.3. and your control cards for the corrections have all been punched, you can run program UPDATEAL to make the corrections (see WAIS 667-025 for instructions on submitting the program) -- you need program deck UPDATEAL, your control cards, tape TXY.13.2. and a scratch tape. Mark the output tape (the original scratch tape) TXY.13.4. Phase 3: Verification of the Corrections To make sure that you have corrected all the errors and have not introduced new ones, run job 13.3. again, using TXY.13.4. as input. The listing you receive this time should not contain any errors except the ones you found and marked as "cannot be corrected". If there are new (or remaining) errors, repeat job 13.3. Output: Your input program cards (UPDATEAL; store where you found them), input tape TXY.13.2. (store where you found it), corrected data tape (the original scratch tape, now labelled TXY.13.4.). 13.5. Addition of Data From L- and A- Records Purpose: This job is required to add the information regarding miscellaneous amounts and assessments to the summary records. Input: Data tapes TXY.13.4., TXY.7.7. and TXY.8.7.; program deck P.13.5., a scratch tape. Action: Submit the program with the tapes to be run at the input-output room in the Commerce Building. Your output should be available there the next day or the day after that. Your output should contain, beside your input program and tapes, a listing. Label the listing LXY.13.5. and file it in a binder marked "Summary error-listings". Output: Your input program cards (store these where you found them), your input tapes (label the scratch tape TXY.13.5.; store all data tapes where you found them) and listing LXY.13.5. 13.6. 
Correction of Incorrect Records Purpose: This job is required to correct inconsistencies found in job 13.5. Input: Listing LXY.13.5., tape TXY.13.5., program UPDATEAL, a scratch tape. Action: This job is to be done in three separate phases: Phase 1: Locating and Identifying the Error The program which produced the listing LXY.13.5. checked for the following types of error: 1. Missing L-record: the summary record carried a key indicating that an L-record for that person and year was expected, but in job 13.5. no such record was found. 2. Missing A-record: the input summary record carried a key indicating that an A-record for that person and year was expected, but no such record was found. 3. Superfluous L-record: there was an L-record for a person and year where no such record was expected [N.B. this includes cases where an L-record was present for a person-year combination for which no summary-record was present]. 4. Superfluous A-record: there was an A-record for a person and year where no such record was expected. 5. Mismatched L-record: although, for a specific person-year combination, an L-record was expected and found, the Net Taxable Income on the summary record (pos. 177-184) and the "Income previously taxed" on the L-record (pos. 13-20) are not identical. In all cases, where a match (which would result in an error) could have been made (errors 3, 4 and 5), the match was not made. The error diagnostic was printed, together with printouts of the record(s) affected. Fields which would have been affected by a match (if it had been made) are left blank on the tape-record. For all cases, you can determine what the situation should be by checking the tax returns for the person-year combination you are trying to correct. If it turns out that a match which had not been made, should be made, you will have to set up control cards for an UPDATEAL run (see WAIS 667-025 for control card formats, 667-030 for record format). 
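The five error types listed above amount to a small decision procedure for each person-year combination. The sketch below is only an illustration of that logic: the program that produced LXY.13.5. is not reproduced in this paper, so the function name, the dictionary keys and the handling of an A-record that has no summary record are all assumptions made for the example.

```python
from typing import List, Optional


def diagnose(summary: Optional[dict],
             l_record: Optional[dict],
             a_record: Optional[dict]) -> List[int]:
    """Return the error types (1-5) that apply to one person-year combination.

    `summary` is None when no summary record exists; its keys
    ("expects_l", "expects_a", "net_taxable_income") are hypothetical names.
    """
    # With no summary record nothing is expected, so any L-record found is
    # superfluous (the N.B. under error 3). Treating A-records the same way
    # is an assumption; the text only mentions L-records.
    expects_l = summary is not None and summary["expects_l"]
    expects_a = summary is not None and summary["expects_a"]

    errors = []
    if expects_l and l_record is None:
        errors.append(1)  # missing L-record
    if expects_a and a_record is None:
        errors.append(2)  # missing A-record
    if l_record is not None and not expects_l:
        errors.append(3)  # superfluous L-record
    if a_record is not None and not expects_a:
        errors.append(4)  # superfluous A-record
    if (expects_l and l_record is not None
            and summary["net_taxable_income"] != l_record["income_previously_taxed"]):
        errors.append(5)  # mismatched L-record (pos. 177-184 vs. pos. 13-20)
    return errors
```

An empty result means the match could be made without a diagnostic; for types 3, 4 and 5 the match is withheld and has to be decided by hand from the tax returns.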
Fields which can be affected are: for addition of data from an L-record: 41 (pos. 329-336), 42 (pos. 337-344), 50 (pos. 394, see note in this section); for addition of data from an A-record: 28 (pos. 225-232), 39 (pos. 313-320), 46 (pos. 369-376), 50 (pos. 394, see note in this section).
The case may occur where it turns out that a match had been made illegally; in that case, you have to correct the fields which could have been affected by the match (see the paragraph immediately before this one).
There may be cases where you have to locate the missing amounts on the tax returns (e.g. if errors of type 1 or 2 occurred). Record formats (667-030) and keypunching instructions (667-013, first revision, especially pages 28 and 29) may be useful to help you find these amounts.
Note: Field 50 was assigned unique values, giving information about the "completeness" of the return, the presence or absence of L-records and the presence or absence of A-records. The codes used can be found in the following table:

                 Record complete             Record incomplete
                 No A-record    A-record     No A-record    A-record
   No L-record        0            2              1            3
   L-record           4            6              5            7

An example: if a record was complete, had an L-record but no A-record, the code in pos. 394 would be 4. If you want to add an A-record, the code changes to 6.
For all changes you have to make, code control cards to be used with UPDATEAL (see WAIS 667-025 for control card format, WAIS 667-030 for record format). When you have determined, for all cases on LXY.13.5., how to correct the inconsistencies, have the control cards punched.
Note: Make a notation on LXY.13.5. for every change you are going to make; this will enable us later to determine what happened.
Phase 2: Correcting the Errors
When your control cards have been punched, submit program UPDATEAL (see WAIS 667-025 for details) -- you need program deck UPDATEAL, your control cards, tape TXY.13.5. and a scratch tape. Submit this at the input-output room in the Commerce Building.
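As an aside on the field 50 codes tabulated in Phase 1: the eight values behave as the sum of three independent flags (1 for an incomplete record, 2 for an A-record, 4 for an L-record). This reading is inferred from the table and is not stated in the paper. The sketch below uses hypothetical function names to work out the new code when a control card adds an L- or A-record.

```python
def field_50(complete: bool, has_l_record: bool, has_a_record: bool) -> int:
    """Compute the code for pos. 394 from the three conditions in the table."""
    code = 0
    if not complete:
        code += 1  # record incomplete
    if has_a_record:
        code += 2  # A-record present
    if has_l_record:
        code += 4  # L-record present
    return code


def decode_field_50(code: int) -> dict:
    """Split a field 50 code back into its three conditions."""
    if not 0 <= code <= 7:
        raise ValueError("field 50 must be a single digit from 0 to 7")
    return {
        "complete": code & 1 == 0,
        "has_a_record": bool(code & 2),
        "has_l_record": bool(code & 4),
    }


# The example from the note: complete record, L-record, no A-record.
before = field_50(complete=True, has_l_record=True, has_a_record=False)
# Adding an A-record changes the code.
after = field_50(complete=True, has_l_record=True, has_a_record=True)
print(before, after)  # 4 6
```

Decoding the existing value first and changing only the flag in question avoids overwriting the completeness information when a record is added.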
After the job has been run, label the scratch tape (which now contains the corrected file) TXY.13.6.
Phase 3: Verifying the Corrections
To make sure that you have made all corrections properly and have not introduced new errors, run a special program P.13.6., using TXY.13.6. as input. Output will be a listing containing the same type of error messages as LXY.13.5.; label it LXY.13.6. If this list contains any new errors, repeat job 13.6.
Output: The final output for this job (in addition to the original input) is: updated tape TXY.13.6., listing LXY.13.6.
13.7. Internal Consistency-Checks
Purpose: This job is required to check for consistency between amounts within each return; an error listing is produced.
Input: Tape TXY.13.6., program P.13.7.
Action: Submit program P.13.7., with the input tape, to be run at the input-output room in the Commerce Building. Your output should be available there the next day or the day after that. Besides your input program deck and tape, your output will contain a listing; label the listing LXY.13.7. and file it in a binder marked "Summary error-listings".
Output: Tape TXY.13.6., program P.13.7., listing LXY.13.7.
13.8. Correction of Incorrect Records
Purpose: This job is required for the correction of errors found in job 13.7.
Input: Listing LXY.13.7., tape TXY.13.6., program UPDATEAL, a scratch tape, program P.13.7., another scratch tape, control cards K.13.8.
Action: This job is to be done in a number of separate phases:
Phase 1: Locating and Identifying the Error
Listing LXY.13.7. contains messages regarding records with inconsistencies between fields (e.g. cases where (A + B) should be C, but where addition of A and B does not give a value equal to C). For all these cases, you have to locate the amounts on the tax return and check each one of them.
If all amounts on the record are identical to the ones indicated on the tax return (and if, in other words, the calculation made by the taxpayer was incorrect), you don't have to take any action, except to write on the listing that this case was incorrect on the tax returns. If one or more of the amounts on the record disagreed with the information on the tax return, you have to set up a control card to correct the record; note on the listing which amounts were changed and what their new values are. Phase 2: Correcting the Error on the File When all your control cards (see 667-025 for the format) have been punched, run program UPDATEAL to correct the file (667-025 gives instructions about the submitting of this program): you need program deck UPDATEAL, your control cards, tape TXY.13.6. and a scratch tape. After the running of UPDATEAL, your scratch tape contains the corrected file; label it TXY.13.8.1. Phase 3: Verifying the Corrections Made This phase will not only verify the corrections you made in the previous phase; it will also "flag" the records containing inconsistencies which cannot be corrected. Your inputs are: program P.13.7., control cards K.13.8., tape TXY.13.8.1. and a scratch tape. Assemble the program input by placing K.13.8. behind P.13.7. and submit this, together with the input tapes, at the input-output room in the Commerce Building. Your output should consist of the original input items, as well as a listing. Label the listing LXY.13.8. Compare the messages with those on LXY.13.7.; the only error messages remaining on LXY.13.8. should be the ones which you marked on LXY.13.7. as "cannot be corrected". If there are additional errors, repeat this job (using TXY.13.8.1. as your input tape in phase 2). The final phase 3 output tape should be labelled TXY.13.8.2. Output: The output of job 13.8. is, besides the input data: listing LXY.13.8., data tape TXY.13.8.2. 13.9. 
Inter-Year Consistency Checks
Purpose: This job is to be run to check the consistency of information in different records for an individual.
Input: Tape TXY.13.8.2., program P.13.9.
Action: Submit tape and program at the input-output room of the Commerce Building; your output should be available there the next day or the day after that. It should contain a listing; label this listing LXY.13.9. and file it in a binder marked "Summary error-listings".
Output: Data tape TXY.13.8.2., program P.13.9., listing LXY.13.9.
13.10. Correction of Incorrect Records
Purpose: This job is needed to correct the inconsistencies between records for an individual, as they were found in job 13.9.
Input: Listing LXY.13.9., tape TXY.13.8.2., program UPDATEAL, a scratch tape, program P.13.9., control card K.13.10.
Action: This job is to be done in a number of separate phases:
Phase 1: Locating and Identifying the Error
Program P.13.9. checked for a large number of inter-record inconsistencies, most of which cannot be specified at the time of this writing. Ask the supervisor for a list of possible errors for job 13.10. For all errors on LXY.13.9., you will have to investigate the tax returns for the years affected, for the person whose records are in error. The nature of the errors is quite varied, but in almost all cases you will have to check returns for more than one year to correct the error. There may be quite a few cases where you cannot correct the error, because the tax returns themselves are incorrect. Mark such cases on LXY.13.9. as "cannot be corrected".
Phase 2: Correcting the Errors
For the situations where you can correct the error, mark the action to be taken on LXY.13.9., then set up control cards to correct the file using program UPDATEAL. WAIS 667-025 specifies card formats and gives other relevant information about submitting the program; WAIS 667-030 gives the format of the tape-record (pages 5-9).
Have the control cards punched by the keypunchers; then submit the cards with the program and data tape TXY.13.8.2., as well as a scratch tape, to be run at the input-output room in the Commerce Building. Your output should contain the corrected data on the scratch input tape; label the tape TXY.13.10.1.
Phase 3: Verifying the Corrections Made
This phase is used to verify the corrections you made in the preceding phase; it is also intended to "flag" the cases which could not be corrected. To accomplish all this, submit program P.13.9., with control card K.13.10. following the program cards, and tape TXY.13.10.1. and a scratch tape, to be run at the input-output room in the Commerce Building. Your output will contain, besides your input cards and tapes, a listing similar in nature to LXY.13.9. Label the listing LXY.13.10., then compare the entries with those on LXY.13.9. If LXY.13.10. contains only entries which you marked on LXY.13.9. as "cannot be corrected", store the listing in a binder labelled "Flagged master records", and label the output tape for this phase (your original scratch tape!) TXY.13.10.2. If LXY.13.10. contains other entries in addition to the ones marked "cannot be corrected" on LXY.13.9., return the output tape for this phase (the original scratch tape) to the pool of scratch tapes and start again at phase 1 of this job to correct the additional errors from LXY.13.10.; for phase 2, use TXY.13.10.1. as your input tape. Repeat phases 1, 2 and 3 as often as is necessary to leave only errors which cannot be corrected.
Output: The final output consists of, besides the original input items (LXY.13.9., TXY.13.8.2., P.13.9., K.13.10.), output tapes TXY.13.10.1. (unflagged corrected file) and TXY.13.10.2. (flagged corrected file) and listing LXY.13.10. (summarizing the flagged records).
13.11. Husband-Wife Consistency Checks
Purpose: This job is required to check for inconsistencies between records for husband-wife units.
Input: Program P.13.11., tape TXY.13.10.2.
Action: Submit the program, with the input tape, to be run at the input-output room in the Commerce Building; your output should be available there the next day or the day after that. Your output should contain, besides your input items, a listing; label the listing LXY.13.11. and file it in a binder marked "Summary error-listings".
Output: Your input items (P.13.11., TXY.13.10.2.) and error listing LXY.13.11.
13.12. Correction of Incorrect Records
Purpose: This job is required to correct the inconsistencies found in job 13.11.
Input: Error listing LXY.13.11., data tape TXY.13.10.2., program deck UPDATEAL, program P.13.11., a scratch tape.
Action: This job is to be done in a number of separate phases:
Phase 1: Locating and Identifying the Error
Program P.13.11. checked for a large number of different inconsistencies between the records for a husband-wife unit, mainly related to the coded data and to presence or absence of records. At the time of this writing, no complete list of the different types of errors can be provided; ask the supervisor for an up-to-date list of possible errors for job 13.12. For all the errors on LXY.13.11., you will have to investigate the tax returns for the years affected, for the persons whose records show inconsistencies. The nature of the possible errors is quite varied, but in almost all cases you will have to check the returns for the husband as well as for the wife to determine the cause of the inconsistency. There may be quite a few cases where you cannot correct the error, because the tax returns themselves are incorrect. Mark such cases on LXY.13.11. as "cannot be corrected".
Note: Many of the errors on LXY.13.11. will have been spotted on LXY.13.9. already, although the error message is different. In most cases you will be able to recognize these situations, however; especially for the cases where errors are marked on LXY.13.9. as "cannot be corrected", checking LXY.13.9.
will help you in deciding what to do with the errors listed on LXY.13.11.
Phase 2: Correcting the Errors
For the cases where you can correct the error, mark the action to be taken on LXY.13.11., then set up control cards to correct the file using program UPDATEAL; check WAIS 667-025 for control card formats and submission procedures. Have the control cards punched by the keypunchers, then submit the cards with the program deck and data tape TXY.13.10.2., as well as a scratch tape, to be run at the input-output room in the Commerce Building. Your output should contain the corrected data on the scratch input tape; label the tape TXY.13.12.
Phase 3: Verifying the Corrections
This phase is used to verify the corrections you made in the preceding phase. Submit program P.13.11. again, using tape TXY.13.12. as input tape. The error listing you receive this time should be labelled LXY.13.12.; there should be no errors on it other than the ones which you marked on LXY.13.11. as "cannot be corrected". If there are other errors on the list, repeat job 13.12. This process has to be repeated until there are only inconsistencies left which cannot be corrected.
Output: The final output consists of, besides the original input items (LXY.13.11., TXY.13.10.2., P.13.11.), output tape TXY.13.12., listing LXY.13.12.
13.13. Sort and Merge of Summary-Records
Purpose: This job is needed to add the segment of the Master File which has now gone through a number of consistency-checks, to the segments which previously completed that process.
Input: Tape TXY.13.12., program P.13.13., the previous Master tape(s), labelled MASTER #ZZ, where "ZZ" is a sequence counter indicating the number of segments on the file, and a scratch tape.
Note: For the very first segment to reach this job, you will not have a tape MASTER #00. For that special case, you can perform job 13.13. by simply marking tape TXY.13.12. as MASTER #01.
Action: For all segments after the first one, take your input tapes and program P.13.13. to the input-output room in the Commerce Building, where your output should be available the next day or the day thereafter.
Output: An "updated" tape, containing the combined information from tapes TXY.13.12. and MASTER #ZZ; label it MASTER #ZZ1, where ZZ1 is a number one higher than ZZ. For an example, see job 5.15. in WAIS 667-042.
http://www.ssc.wisc.edu/wais/WAIS678008.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS678008.txt
Wynn Bussmann. 1967. Processing Property File, 1959-1964 Tax Returns: Instructions for Coding Asset Types and Asset Identification Numbers. July 27, 1967. WAIS paper 678-009. Property File.
Wynn V. Bussmann
WAIS 678-009
July 27, 1967
Processing Property File, 1959-1964 Tax Returns: Instructions for Coding Asset Types and Asset Identification Numbers
I. Introduction
There are several basic documents that you will be working with: (1) the taxpayer's folder containing microfilm prints of his tax returns, (2) a large booklet containing lists of firms' names and corresponding identification numbers, (3) a master residual list book supplementing (2) above, and (4) xeroxed sheets containing asset types. No coding sheets are required; all of your coding will be done directly on the tax forms.
The coding of asset types and asset identification numbers is the first step in processing all the income from property that the taxpayer reports on his 1959-1964 tax returns. For our purposes, we distinguish among six types of property income: (1) interest income (from bonds, notes, savings accounts, etc.), (2) dividend income (from stocks), (3) capital gains income (from the sale of an asset, e.g., sale of stock, sale of residence, sale of livestock or machinery, etc.), (4) rent income (from leasing an apartment or land, etc.), (5) farm income (from operating a farm), and (6) business and professional income (from operating a business).
The only type of property income that will require coding of asset types and asset identification numbers are the first three; that is, interest, dividends and capital gains. Hence, except for using other information on the tax return to clarify the relevant information for you, you will be concerned only with those parts of the tax returns that deal with interest, dividend, and capital gains income. Briefly, your job is to assign a six-digit number to each asset giving rise to property income (interest, dividends, and capital gains) on the tax returns. The first two digits of the six-digit number identify the type of asset. These numbers together with the corresponding asset type designations are contained on the "xeroxed sheets containing asset types" (the fourth basic document listed in paragraph one of this paper), which is more fully described in Section II.A. below. The last four digits (asset identification number) of the six-digit number identify the particular firms or institutions issuing the asset and are contained in the asset identification lists. For example, if you were to code up dividends from A.T. & T. stock, you would code asset type 10 (because A.T. & T. stock is listed on the New York Stock Exchange) and asset identification number 4716 (which happens to be the number assigned to "Amer Telephone and Tel" on the New York Stock Exchange listing described in Section II.B. below). The instructions contained in this paper are intended to facilitate your job of coding; they deal with the most common problems which may arise on the tax returns. If you come to a problem which is not dealt with in this paper, ask the coding supervisor how to resolve it. Please do not make arbitrary decisions on your own. Even though the decision you make may be the correct one, we would not know about it unless you happened to remember to tell us. 
It is better to let the supervisory staff make the decisions partly because then they know what decisions have been made, and also, if the wrong decision was made (Heaven forbid), you remain blameless. II. Basic Documents A. Asset type sheets The asset type sheets (two or three of them) contain the asset type numbers that are associated with various types of assets. For example, if you find that a taxpayer receives interest from a bank, you would code asset type 01 and then the four-digit asset identification number of that particular bank (which you would find by looking under the name of the bank in the bank section of the asset identification lists). Similarly, if he receives interest from a savings and loan company, you would code asset type 02 and then the four-digit asset identification number of that particular savings and loan company. And so on. Before you start any coding, look over the asset type sheets to make sure that you understand each item on them. In looking over the asset type sheets, you will notice that some designations not only have an asset type assigned to them, but also have an asset identification number assigned to them. For example, "Notes and Mortgages, obligations of individuals" has 06-0000. What this means is that you do not have to look up the asset identification number in the asset identification lists; just assign asset type 06 and asset identification number 0000 to all interest received from notes and mortgages which are obligations of individuals (as opposed to obligations of business firms, etc.). If you are unsure what asset type to assign, ask the supervisor for help! B. Asset identification lists The asset identification lists are contained in a large booklet of computer print-out sheets with a brown cover. 
The booklet contains lists of (1) firms whose stocks are listed on the New York Stock Exchange (LFF), (2) banks in Wisconsin, (3) co-operatives in Wisconsin, (4) savings and loan institutions in Wisconsin, and (5) a residual list of firms not in any of the first four categories. Each firm name is associated with a four digit identification number. The firms' names are alphabetically arranged in a column on the left of each page with the identification numbers in a column on the right. In the bank list, there is a (usually) solid column of numbers (25 digits wide) between the column of bank names and the column of bank identification numbers; disregard these numbers as they have nothing to do with your coding. Thus, the bank identification numbers are the last four numbers on each line of the bank list. The listing of firms whose stocks are traded on the New York Stock Exchange (LFF) also has mixed in with it some of the residual firms. The way you can tell them apart is that the LFF-listed firms have either two or four blocks of numbers (or R's) preceding the asset identification numbers which, again, comprise the rightmost column of numbers on each page. The residual firms listing contains nothing between the firm's name and its identification number. However, the listing of the residual firms that are mixed in with the LFF listing contains two letters just preceding the identification numbers. These letters are WB, WC, WD, and WP, and they tell whether the asset issued by the firm is a bond, a common stock, a debenture, or a preferred stock, respectively. These letters may help you in determining the type of asset listed on a tax return. Another notable feature of the asset identification lists is in the residual listing, where some firms have the designation "Add 50" just before the asset identification number. This designation tells you to add 50 to the asset TYPE. 
For instance, on the asset type sheets there are some designations (e.g., "Bonds issued by firms not in LFF") for which there are two asset types (in this case, 24, 74). The "Add 50" just tells you to assign the higher number (which is 50 more than the smaller number) as the asset type for that asset. If you come across an "Add 50" firm for which there is only one asset type, do NOT simply add 50 to the asset type, but see the coding supervisor. As you work with the asset identification lists, you will become more and more familiar with them. However, there are some idiosyncrasies which should be pointed out. First, if you do not find a firm listed on a list, try again by looking for the firm listed under a different part of its name. For example, when trying to locate "A.J. Christensen (Co.)", you would presumably first look under Christensen. Unfortunately, you do not find it there because it is listed under the A's as A.J. Christensen. Similarly, T.E. Esser is listed under the T's and not under the E's. (Both of these examples come from the residuals where most of this type of error occurs.) Second, watch out for abbreviations, which may cause firms to be listed out of alphabetic order. For example, The Great Northern Iron Ore Company is listed under "Grt Northern Iron Ore." In this case and in others, there may be enough firms between the abbreviated name and the written-out name to put the company you are looking for several pages from where you would first expect to find it. Other notorious abbreviations to watch out for are St., St, and Saint (the period in St. causes the St.'s to be listed in a different place than the St's); Am,, Am, Amer, Amer., and American (same reason), etc. Third, watch out for misspellings or variations in spelling such as Milwauke for Milwaukee and Nielsen, Nielson, Neilson, Neilsen, etc. Fourth, watch out for items which are out of alphabetic order. For example, Schuster Co. 
is listed before the Sche...'s, not after the Schur...'s as it should be. The only way to catch this type of error without wasting lots of time is to start by looking at whole pages when looking for an item. Thus, once you have found the Sch's (for example), look through them for Schuster. It helps, I guess, if you are not extremely familiar with the alphabet. Any way, the idea is to look in every conceivable place for a firm before asking the coding supervisor for a new number for that firm. She then writes down the name of the firm, the asset type assigned, and the new asset identification number. Every week, this list of new residual firms will be alphabetized and stored in a master residual list book to be kept by the coding supervisor. You or the coding supervisor should, of course, refer to the list when looking for a firm. Do not forget to look for a firm in the master residual list book before asking the coding supervisor for a new number. D. Tax forms Tax forms are the same for the years 1959 through 1961; tax forms for the years 1962-1964 are the same but are different from those for years 1959-1961. For the years 1959-1961, you will be concerned mainly with page three of Form 1. On that page are Schedules C through C. Schedule C (1959-1961) is "Summary of Other Income" and includes the total income from interest, dividends, capital gains, etc. You should first check this schedule to see if the taxpayer has any property income. If property income is indicated, then you can proceed to look for the schedules or attached sheets on which each property income source is listed. Even if no property income is indicated on Schedule C, you should check for the presence of property income on other schedules or attached sheets; taxpayers sometimes forget to complete Schedule C. 
The schedules which contain individual property income sources and on which you will code the asset type and asset identification numbers are Schedule D (interest received), Schedule E (dividends received), and Schedule C (gain or loss from sale of assets). For the years 1962-1964, you will be concerned mainly with pages two and four of Form 1. On page two are listed dividend income (line four), interest income (line 5), and total capital gains income (line 7). If there is capital gains income, then you should also see Schedule C on page four, which lists the gains (losses) separately for each asset sold. Whenever there is not enough room for the taxpayer to report all of his property income (from interest, dividends, and capital gains) on the tax form, he will attach a separate schedule which should be included with his tax returns for that year. Sometimes the taxpayer will write something like "See Attached Schedule" in the appropriate schedule or line of the tax form. Be sure that you look for all such schedules so that none is missed. For the years 1963 and 1964, statements from banks, savings and loan institutions, co-ops, etc. as to how much interest or dividends the taxpayer received during the year from that particular source should be included with the tax returns. These statements may in some cases help you to determine the correct code to give a company. For example, suppose that the taxpayer says he received interest from "The First National Bank". In order to look up the asset identification number for the bank from the lists provided to you, you will need to know where the bank is located. The reason for this is that the First National Bank of Madison has a different number from the number assigned to The First National Bank of Oconomowoc. There may be (and you should look for) a statement from the bank containing the address of the bank thereby locating the bank for you. 
Hint: don't forget to check through the returns for ALL the years in the folder if there is incomplete information for one year; the taxpayer may provide clues to the missing information in other years. Again, if you cannot easily find a logical answer to a question, do not make an arbitrary decision on your own -- ask the coding supervisor. On farm schedules, you should code up as a capital gain (loss) all livestock and other items purchased and then sold for a gain (loss). The best way to do this is to first check the "Summary of Income and Deductions..." schedule found on page one of Form 1-Fe. If there is an entry on line four, "Profit on sale of livestock and other items purchased" (In 1961, this is line two, "Profit on sale of livestock purchased" primarily for sale), then look for the individual items elsewhere so that you can properly code them. The first place to look is just above the "Summary of Income and Deductions,.." schedule in the "4. Sale of Livestock and Other Items Purchased" schedule (this is the schedule and not line four in the "Summary of Income and Deductions" schedule). Here the farmer lists the sales of hogs, cattle, etc. which he bought and made a profit on. These items should be coded. Another place to look for items to be coded is on depreciation schedules. Sometimes the farmer will sell items on which he also claims depreciation that year and will include these in his depreciation schedule. So, look for items on the depreciation schedule for which there is a sales price and/or gain or loss recorded; these items should be coded. 
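To recap the coding scheme from Sections I and II: each asset receives a six-digit number made up of a two-digit asset type followed by a four-digit asset identification number, and an "Add 50" entry in the residual listing raises the asset type by 50 only where the asset type sheets show a pair of types. The sketch below is a hedged illustration of those rules. The function name is hypothetical, and the set of paired types contains only the one pair quoted in this paper (24, 74); the full set is on the asset type sheets, which are not reproduced here.

```python
# Asset types known from the text to have an "Add 50" counterpart (24 -> 74).
# Assumption: the real list on the asset type sheets is longer.
PAIRED_TYPES = {24}


def asset_code(asset_type: int, asset_id: int, add_50: bool = False) -> str:
    """Build the six-digit code: two-digit asset type + four-digit asset ID."""
    if not 0 <= asset_type <= 99:
        raise ValueError("asset type must fit in two digits")
    if not 0 <= asset_id <= 9999:
        raise ValueError("asset identification number must fit in four digits")
    if add_50:
        if asset_type not in PAIRED_TYPES:
            # The paper says NOT to add 50 in this case.
            raise LookupError("no paired asset type: see the coding supervisor")
        asset_type += 50
    return f"{asset_type:02d}{asset_id:04d}"


# A.T. & T. dividends: asset type 10, identification number 4716.
print(asset_code(10, 4716))  # 104716
# Notes and mortgages that are obligations of individuals: 06-0000.
print(asset_code(6, 0))  # 060000
```

Raising an error for an unpaired "Add 50" firm mirrors the instruction to take such cases to the coding supervisor instead of deciding alone.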
One last instruction: if you find anything unusual about the tax return or any of the other items with which you are working, or if you have any question at all, ASK THE CODING SUPERVISOR.
http://www.ssc.wisc.edu/wais/WAIS678009.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS678009.txt
Jan Smith. 1967. Revisions for Survey File Processing. July 27, 1967. WAIS paper 678-010. Survey Data and File.
http://www.ssc.wisc.edu/wais/WAIS678010.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS678010.txt
Some of the data in this document contains confidential information. The confidential information has been removed and preserved in the secure data archive (OLDR). Contact CDHA Data Archivist at cdhadata@ssc.wisc.edu for more information.
Jan Smith
WAIS 678-010
July 27, 1967
Revisions for Survey File Processing
This paper may - or may not - report specific information about some of the processing; adjust time estimates, processing sequences, and completion data; and suggest future action.
I. Completed Processing
In WAIS 667-041, Mark Lieberman presented a sequential processing of the file; some of these steps have now been completed, and are so indicated in the revised flow chart herein.
Mark's I (my 10) has been completed. Some merged as well as split records were located and have been corrected. Mark estimated four hours for this step; I would say that this step took us about seven hours.
Step 30 (July, 1967) is an expansion of Mark's IV (March, 1967). The inventory resulting from IV showed us exactly how much paper we actually had in our source file, and that eight interview schedules have been lost. The 30 inventory finished in July, 1967 shows us how much of our source file had actually been answered and keypunched.
There are 1300 interview records on the survey file; 194 are lacking assets booklets in both the file and inventory; there are six assets booklets started by the R, but terminated before completion, and so stated by or for R on the booklet (Gene Moyer indicated that one of these was not to be punched, but it was anyway). Seven more interviews were terminated by or for R before completion, and these do not have assets booklets. Eight R's (see 667-041) have records on tape, but no source documents (one of these also lacks assets booklet card images). Therefore, there should be 1084 completed interviews with both schedules and assets booklets in the inventory and on file.

   (1) R's on tape file and source file              1300
   (2) I and AB on tape file, not source file          (7)
       I on tape file, not source file                 (1)
   (3) AB on tape file, not source file                (1)
   (4) I terminated                                    (7)
   (5) AB terminated                                   (6)
   (6) AB not on tape file, not source file          (194)

   (2) Schedule lost, but present on survey file.
   (3) Moyer said, "Don't punch", but is still on file.
   (4) Interview not completed.
   (5) Assets booklet started, but not finished, and so stated by/for R.

(1) will be processed as outlined in II of this paper.
(2) will be processed for consistency only, but will be left on the tape file, since the source documents were present at one time (coversheet codes checked) and they represent valid samples.
(3) was considered by Gene Moyer to be an invalid response and should be dropped from the file.
(4), (5) and (6) are more difficult to handle. The answers that are present are probably accurate; however, the file is most useful only if all questions were asked and answered. Whether or not these records should stay on the file should be resolved before the processing is really completed. Of the 194 missing assets booklets, some R's refused to answer them, and some just did not bother to return the booklet if it had been left with them. Such differences may affect a decision regarding the validity of the interview.
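The figure of 1084 completed interviews follows from subtracting every tabulated exception from the 1300 records on the survey file. The short check below simply restates that arithmetic, assuming (as the total implies) that the six exception counts do not overlap; the variable names are made up for the example.

```python
total_records = 1300  # (1) R's on the survey file

# Counts shown in parentheses in the inventory table.
exceptions = {
    "I and AB on tape file, not source file": 7,
    "I on tape file, not source file": 1,
    "AB on tape file, not source file": 1,
    "I terminated": 7,
    "AB terminated": 6,
    "AB not on tape file, not source file": 194,
}

excluded = sum(exceptions.values())
completed = total_records - excluded
print(excluded, completed)  # 216 1084
```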
The source file inventory is now updated; perhaps not perfectly, but it is at least fairly reliable.

50 (Mark's II) is now completed. The indicators and dependencies as explained in WAIS 667-009 are incorrect. Corrected specifications are available, and hopefully will be programmed into a Presence and Absence program in the near future. This program will be used as a final check in 70.

60 (Lieberman's VI) has been a trouble spot. Specifications for all cards (except 12, 13, and 14) are available. However, the program for these codes is rather unreliable. Nevertheless, they will be run, to give preliminary edit results.

II. Processing Sequence Revisions

* = completed step and time estimate.
+ = a change in time estimate.

[Flow chart: revised processing sequence. Columns: Input, Logic, Output, Time estimated, Actual time. Inputs include the current survey tape, C-cards, interview schedules, coversheet and weight cards, and card edit specifications; logic boxes include UPDATEAL, the inventory, the Loniello card edit, presence and absence, the COBOL card edit, the pre-edit program, and WISTAB; outputs run from Survey Tape II through Survey Tape IX and the survey extract.]

10 is completed. Estimate was four hours; actual time was seven hours.
20 (Lieberman's III) is designed to eliminate inconsistencies between the coversheet cards, the source file, and the tape file.

30 is completed. Estimated time, sixteen hours; actual time, forty-one hours.

40 (Lieberman's V) will show final weights, or weight discrepancies, depending on the decision concerning the validity of the Moyer weighting system.

20, 30, and 40 are not of absolute importance in processing the survey, but they may increase the validity of the survey file.

50 is completed. Estimated time, thirty-two hours; actual time, twenty-seven hours. Lieberman's II has, however, been split into 50 and 70, basically due to the rigidity of the specifications in the existing Loniello Presence and Absence program.

60 is in a state similar to that of 50 and 70. Specifications are ready for the Loniello Card Edit program; however, the program at this point will accept card input but not tape input. A possible solution for this step is to edit the actual data cards themselves (the greatest difficulty would be in finding all 28,000 cards) and then transfer the edited data cards to tape to form a new survey file.

70 is a continuation of 50. A new presence and absence program will be run (as soon as it is written), the survey tape VI updated, and the program run again for verification. New time estimates include programmer's time for the new presence and absence program.

80 is a continuation of 60. The COBOL card edit program will be used here to find all errors; the survey tape VIII will be updated and run again for verification. New time estimates are for programming and specifying.

90, 100 and 110 follow Lieberman's VII, VIII, and IX.

III. Suggestions

1. A decision should be reached concerning the 13 unfinished interviews, and the 194 interviews without assets booklets.

2. A new presence and absence program should be written.

3. Cards 12, 13, and 14 must be located, coded, and put on the survey file.

4.
Decisions must be reached regarding the necessity of processing the coversheet and weight cards.

The survey file has had many unforeseen and unpredicted problems. The processing is directly dependent on the availability and reliability of the presence and absence and card edit programs. If at least one of these is available by August 10, and the other program is available by August 13, the survey file processing could be completed by October 1. If these programs are not available by that time, however, such a deadline could not really be met. If the survey processing continues into the fall semester, the undergrads will rise again as Lieberman and Smith join forces and complete the aging survey.

Ashok Bhargava 1967
Beneficiary Analysis: Extract and Merge Tapes for Beneficiary Analysis
August 4, 1967
WAIS paper 678-011
Benefit Analysis Formats Social Security Earnings Data - 805

Ashok Bhargava
WAIS 678-011
4 August 1967

Beneficiary Analysis
Extract and Merge Tapes for Beneficiary Analysis

1. Introduction

1.1 This is a format for BNAN (1) for the Beneficiary Analysis mentioned in 667-045. The tape is made from:
(a) Benefit Year Record (BNYR) - format in 667-001
(b) E1-805 - format in 678-003
(c) 400 Character MF - format in 645-056

1.2 Before the BNAN (1) can be created, the following steps (outlined on page 6 in 667-045) will have to be completed:
(a) BNYR (2) - Creation (BYR with multiple entitlement)
(b) E1(805) - Creation (RID(1), 805(1), SAD - merge)
(c) Update of the MF.
These will be merged to form the BNAN (1).

1.3 The ID #'s in this tape have to fulfill the same criteria as for the E1(805), and hence will be the same as those on the E1(805); i.e., age must have values corresponding to birth dates between 1860 < xxxx < 1904, or benefit data must be available.

2.
Data Assembly

Format   Item                                      Acronym
V1       WAIS ID #                                 ID
V2       Sex                                       Sex
V3       Year of Birth                             BYR
V4       Year of Death                             DYR
V5       Race                                      Race
V6       805 Earnings, 1937 to date                E 37
V7       805 Earnings, 1951 to date                E 51
V8       # Covered quarters, 1947 to date          NQ 47
V9       # Covered quarters, 1951 to date          NQ 51
V10      Year of last available record             LLYR
V11      Year of 1st Tax Record                    FTYR
V12      Year of last Tax Record                   LTYR
V13      Year of first Benefit Record              FBYR
V14      Year of termination of SSA Benefits       TYR
V15      Beneficiary History                       BH

Income History Matrix

Vector A (Years Data Available) has one variable per year, from 1947 (V16) through 1964 (V33). Field widths are followed by the cumulative width in parentheses.

Vector  Item                      Width    Variables    Source
A       Years Data Available      1 (1)    V16-V33      Calculated
B       AGI                       8 (9)    V34-V47      Field B153 in Master File*
C       Wages & Salaries          8 (17)   V48-V61      Fields B36 + B45 + B54 in Master File
D       Self-Employment           8 (25)   V62-V75      Fields B99 + B117 in Master File
E       Property Income           8 (33)   V76-V89      Fields B63 + B72 + B81 + B90 in Master File
F       Medical-Dental Expenses   8 (41)   V90-V103     Field B198 in Master File
G       Total Deductions          8 (49)   V104-V117    Fields B261 + B279 + B297 in Master File
H       Occupation                2 (51)   V118-V131    Part of field B11 in Master File
I       Marital Status            1 (52)   V132-V145    Part of field B11 in Master File
J       # of Dependents           2 (54)   V146-V159    Part of field B11 in Master File
K       805 Earnings              5 (59)   V160-V171    From E1(805)
L       SSA Benefits              5 (64)   V172-V189    From BNYR positions 29-34

V190: Record Mark

* Master File refers to the 400 character MF in WAIS 645-056.

3. Codes to be used for special fields (calculated).

3.1 Year of death (V4): If death data are available, use actual year. If no death data available, use 9999.

3.2 Year of last available record (V10): It is the value of t, such that:
If t < (V10), At ≠ 0 for all t, and
if t > (V10), At = 0 for all t.

3.3 Year of 1st tax record (V11): It is the value of t, such that:
If t < (V11), Bt = Ct = Dt = Et = 0 for all t.
t = (V11), not (Bt = Ct = Dt = Et = 0).
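Rules 3.2 and 3.3 can be sketched in Python. This is a minimal sketch under stated assumptions: the yearly vectors are held in dictionaries keyed by four-digit year, the function names are hypothetical, V10 is read as the latest year whose availability code is non-zero, and 0 is returned when no year qualifies.

```python
from typing import Dict, Sequence


def first_tax_year(tax: Dict[int, Sequence[int]]) -> int:
    """V11: earliest year t in which B_t, C_t, D_t, E_t are not all zero."""
    for year in sorted(tax):
        if any(amount != 0 for amount in tax[year]):
            return year
    return 0  # assumption: 0 signals that no tax data exist


def last_available_year(avail: Dict[int, int]) -> int:
    """V10, read here as the latest year t with A_t != 0."""
    years = [year for year, code in avail.items() if code != 0]
    return max(years) if years else 0  # assumption: 0 when nothing is available


# Hypothetical amounts (B, C, D, E) and availability codes A, by year.
tax = {
    1947: (0, 0, 0, 0),
    1948: (0, 0, 0, 0),
    1949: (1200, 1200, 0, 0),
    1950: (0, 0, 0, 0),
}
avail = {1947: 0, 1948: 1, 1949: 5, 1950: 0}

print(first_tax_year(tax), last_available_year(avail))  # 1949 1949
```

The same scan, run from the latest year backwards, would give the year of the last tax record.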
3.4 Year of last tax record (V12): It is the value of t, such that:
If t > (V12), Bt = Ct = Dt = Et = 0 for all t.
t = (V12), not (Bt = Ct = Dt = Et = 0).

3.4.1 E.g.
No data: FTYR = LTYR = 0000
All sample years: FTYR = 1947, LTYR = 1959.

3.5 Year of first benefit record (V13): It is the value of t such that:
If t < (V13), Gt = 0 for any t.
t = (V13), Gt ≠ 0.

3.6 Year of termination of SSA benefits (V14): It is the value of t such that:
If t > (V14), Gt = 0 for all t.
t = (V14), Gt ≠ 0.

3.7 Data Availability Codes (V16-V33)

0  None
1  805 (≠ 0)
2  SSA benefits ≠ 0
3  SSA benefits and 805 ≠ 0
4  Tax Records only
5  (1) + (4)
6  (2) + (4)
7  (3) + (4)

The vector (V16-V33) is then encoded according to the records read into this merge:
(a) If MF record, add 4 to code.
(b) If BNYR record, add 2 to code.
(c) If wage earnings in the 805 record for this year are ≠ 0, add 1 to code.
The sum is registered in the appropriate position: 1947 becomes V16, 1948 becomes V17, ..., 1964 becomes V33.

4.1 Data from all files must be scrubbed to delete alphanumerics. If alpha occurs anywhere in a record, insert numerics as follows:
MI, NA: 90000000
All other: 90000000, and print record.

4.2 If source field is blank, recode 000 --- 0 for vectors B-E, and I-L. This will result in 00 --- 0 in any year as well as any column for which no data are available. This is for amount fields.

4.3 For Vectors H, I, J (Occupation, Marital Status, # of dependents): If record shows blank, recode high order 9's.

Format of BNAN(1)

Fixed fields (for all years)

Input File   Position in Input File   Item Description                                        Variable   Field Size   Positions
E1(805)      2-9                      WAIS ID#                                                V1         8            1-8
E1(805)      30                       Sex                                                     V2         1            9
E1(805)      25-28                    Year of birth                                           V3         4            10-13
E1(805)      223-224                  Year of death                                           V4         2            14-15
E1(805)      29                       Race                                                    V5         1            16
E1(805)      74-82                    805 Earnings, 1937 to date                              V6         9            17-25
E1(805)      88-96                    805 Earnings, 1951 to date                              V7         9            26-34
E1(805)      83-84                    # Covered quarters, 1947 to date                        V8         2            35-36
E1(805)      97-98                    # Covered quarters, 1951 to date                        V9         2            37-38
MF/E1(805)   10-11/21-22              Year of last available record                           V10        2            39-40
MF           10-11                    Year of 1st Tax Record                                  V11        2            41-42
MF           10-11                    Year of last Tax Record                                 V12        2            43-44
BNYR         27-28                    Year of 1st Benefit Record                              V13        2            45-46
BNYR         73/27-28                 Year of termination of SSA Benefits 1/                  V14        2            47-48
BNYR         62-72                    Beneficiary history - SSA Beneficiary ID Codes          V15        11           49-59

Variable fields (by years)

Input File   Position in Input File          Year        Item Description                      Variable     Field Size    Positions
BNAN         Calculate                       1947        Data Availability                     V16          1             60
BNAN         Calculate                       1948-       Data Availability (each year)         V17-V33      17            61-77
MF           145-153                         1947        AGI ($ amounts) 2/                    V34          8             78-85
MF           145-153                         1948-1960   AGI                                   V35-V47      8 e.y. 3/     86-189
MF           28-34 + 37-43 + 46-52 5/        1947        Wages and Salaries ($ amounts) 2/     V48          8             190-197
MF           28-34 + 37-43 + 46-52 5/        1948-1960   Wages and Salaries                    V49-V61      8 e.y.        198-301
MF           91-99 + 109-117 5/              1947        Self-Employment ($ amounts) 2/        V62          8             302-309
MF           91-99 + 109-117 5/              1948-1960   Self-Employment                       V63-V75      8 e.y.        310-413
MF           55-63 + 64-72 + 73-81 + 82-90 5/  1947      Property Income ($ amounts) 2/        V76          8             414-421
MF           55-63 + 64-72 + 73-81 + 82-90 5/  1948-1960 Property Income                       V77-V89      8 e.y.        422-525
MF           190-196                         1947        Medical-dental expenses 2/            V90          8             526-533
MF           190-196                         1948-1960   Medical-dental expenses               V91-V103     8 e.y.        534-637
MF           253-259 + 271-277 + 289-295 5/  1947        Total deductions ($ amounts)          V104         8             638-645
MF           253-259 + 271-277 + 289-295 5/  1948-1960   Total deductions                      V105-V117    8 e.y.        646-749
MF           18-19                           1947        Occupation                            V118         2             750-751
MF           18-19                           1948-1960   Occupation                            V119-V131    2 e.y.        752-777
MF           23                              1947        Marital Status                        V132         1             778
MF           23                              1948-1960   Marital Status                        V133-V145    1 e.y.        779-791
Input File   Position in Input File   Year        Item Description       Variable     Field Size   Positions
MF           26-27                    1947        # of dependents        V146         2            792-793
MF           26-27                    1948-1960   # of dependents        V147-V159    2 e.y.       794-819
E1(805)      101-104                  1951        805 Earnings 4/        V160         5            820-824
E1(805)      106-109                  1952        805 Earnings           V161         5            825-829
E1(805)      111-114                  1953        805 Earnings           V162         5            830-834
E1(805)      120-123                  1954        805 Earnings           V163         5            835-839
E1(805)      129-132                  1955        805 Earnings           V164         5            840-844
E1(805)      139-142                  1956        805 Earnings           V165         5            845-849
E1(805)      149-152                  1957        805 Earnings           V166         5            850-854
E1(805)      159-162                  1958        805 Earnings           V167         5            855-859
E1(805)      169-172                  1959        805 Earnings           V168         5            860-864
E1(805)      179-182                  1960        805 Earnings           V169         5            865-869
E1(805)      189-192                  1961        805 Earnings           V170         5            870-874
E1(805)      199-202                  1962        805 Earnings           V171         5            875-879
BNYR         29-32                    1947        SSA benefits 4/        V172         5            880-884
BNYR         29-32                    1948-1960   SSA benefits           V173-V189    5 e.y.       885-969
                                                  Record Mark            V190         1            970

1/ If X in field 73 in input file (BNYR), take year of record from field 27-28 (BNYR) and put into output file.
2/ Position 1: sign indicator. Positions 2-8: unsigned amount.
3/ Each year.
4/ Position 1: sign indicator. Positions 2-5: unsigned amount.
5/ These fields have to be added.

Martin David
WAIS 678-011 Addition 1

Additions to BNAN Tape Format

190  OG max I: Occupation held longest
191  OG max II: Occupation held 2nd longest
192  Proportion of time held OG max I
193  Proportion of time held OG max II
194  Proportion of time held OG max I + OG max II
195  Indicator Labor Force status
196  Indicator Occupation Change
197  Indicator LF Change
198  Marital Status Change
199  Record mark

These variables can be created using subroutines in MDAVID, written in COBOL by M. v. Schneidemesser.

http://www.ssc.wisc.edu/wais/WAIS678011.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS678011.txt

Richard Bauman, Wynn Bussmann 1967
Supplementary Documentation of Firm Identification Files
August 15, 1967
WAIS paper 678-013
Property File

Richard A. Bauman
Wynn V.
Bussmann
WAIS 678-013
August 15, 1967

Supplementary Documentation of Firm Identification Files: Sources and Descriptions of Files 21-28*

I. Identification

WAIS coders identified all assets listed by taxpayers (on tax returns) or respondents (to the Survey and/or Assets Diary File) by a six-digit identification number. The six-digit identification number is a combined code. The first two digits identify the type of asset and were found in the Code Book - Capital Gains Study. An up-to-date list of the codes used is found in Table 1.

Table 1
Explanation of asset type codes

Asset type code   Type of asset
01, 51            Commercial bank savings accounts
02, 52            Savings and loan association shares
03, 53            Mutual life insurance policies
04, 54            Credit union shares
05                Postal savings deposits
06                Notes and mortgages - obligations of individuals
07                Notes and mortgages - obligations of business firms
29                Notes and mortgages - obligations of unknown persons or firms
08, 58            Bonds issued by states or local communities
09                Bonds issued by firms in LFF**
24, 74            Bonds issued by firms not in LFF
10                Stock issued by firms in LFF
11, 61            Stock issued by firms not in LFF
27, 77            Stock issued by banks not in LFF

* File code numbers refer to those proposed in WAIS 667-020. This paper is intended to supplement the Code Book: Capital Gains Study, WAIS 645-038, and WAIS 667-020.

** Abbreviations of files are as follows: LFF (Lorie-Fisher File) is Merrill or File 21, and CF is Compustat or File 22. The reader is referred to 667-020 p. 5 for a brief description of all the files.
Table 1 (Continued)

Asset type code   Type of asset
12, 62            Dividends from cooperatives
13, 63            Debt instruments other than bonds
20                Farm residences
14                Non-farm residences
18                Other types of assets including personal property and federal tax rebate
15                Other non-farm real estate (including easements)
28                Options on non-farm real estate
22                Other farm real estate (including easements)
30                Options on farm real estate
16                Sales of proprietorships
17                Sales of partnerships
19                U.S. bonds
21                Stock options, warrants, scrip, etc. issued by firms in LFF
26, 76            Stock options, warrants, scrip, etc. issued by firms not in LFF
25, 75            Retirement funds, profit sharing funds, etc.
23, 73            Capital gains distributions from mutual funds or other investment funds
99                Asset not ascertained

The last four digits of the identification number identify the individual firm or institution issuing the asset. Table 2 contains a list of those asset types for which a further breakdown to individual asset issuers was either not desired or impossible.

Table 2
Asset type codes and identification numbers where there was no further breakdown to individual asset issuers

Asset type code   Identification number
05                0008
05                0000
07                0001
29                0011
14                0002
18                0006
15                0003
28                0028
22                0010
30                0030
18                0004
17                0005
19                0007
20                0009
99                9999

For those asset types for which a further breakdown to individual asset issuers was desired, Table 3 lists the files in which the asset identification numbers were found.
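The split of the six-digit identification number into its two parts can be sketched as follows. This is an illustrative Python sketch, not part of the original coding procedure; the names `AssetId` and `split_asset_id` are hypothetical, and the example value combines asset type 19 (U.S. bonds) with identification number 0007 from Table 2.

```python
from typing import NamedTuple


class AssetId(NamedTuple):
    asset_type: str  # first two digits: type of asset
    issuer: str      # last four digits: firm or institution issuing the asset


def split_asset_id(code: str) -> AssetId:
    """Split a six-digit identification number into its two parts."""
    if len(code) != 6 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected six digits, got {code!r}")
    return AssetId(asset_type=code[:2], issuer=code[2:])


print(split_asset_id("190007"))  # AssetId(asset_type='19', issuer='0007')
```

The parts are kept as strings so that leading zeros, as in 0007, survive the split.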
Table 3
Location of asset identification numbers

Asset type code   Name and number of file listing asset identification numbers
01, 51            Wis Bank (24) or Residual (27, 28)
02, 52            Wis Loan (25) or Residual (27, 28)
03, 53            Residual (27, 28)
04, 54            Residual (27, 28)
08, 58            Residual (27, 28) or Wis SEC (26)
09                LFF (21)
24, 74            Residual (27, 28) or Wis SEC (26)
10                LFF (21)
11, 61            Residual (27, 28) or Wis SEC (26)
27, 77            Wis Bank (24) or Residual (27, 28) or Wis SEC (26)
12, 62            Residual (27, 28)***
13, 63            Wis SEC (26) or Residual (27, 28)
21                LFF (21)
26, 76            Residual (27, 28) or Wis SEC (26)
25, 75            LFF (21) or Residual (27, 28)
23, 73            LFF (21) or Residual (27, 28)

*** Co-ops are listed separately at the present time. However, when the asset identification numbers were issued, there was no separate co-op list, so that the asset identification numbers are intermingled with the rest of the residuals.

CF (Compustat - File 22) consists essentially of a subsample of firms in LFF, so that the LFF identification numbers were used in coding and the CF identification numbers were ignored. There is a modified version of the condensed LFF tapes which links the CF and LFF identification numbers; the tape is in the Commerce Building at the data processing center.

In the diagram we have attempted to reconstruct the steps that were undertaken in collecting the firm data files and in using them in the coding of the Property File data.

Diagram
Coding of asset identification numbers

[Diagram: flow of files and print-outs in four steps. Numerals on the boxes refer to the notes in Section II.]

Step 1: Formation of LFF and WisSEC files
Boxes: LFF cards; check to delete duplicates; non-duplicate LFF; WisSEC; Wis. Department of Securities list of registered securities; complete WisSEC file; LFF-WisSEC matches; assignment of asset identification numbers.

Step 2: Formation of WisBank and WisLoan
Boxes: Wisconsin Savings & Loan Annual Report 1962; assignment of asset identification numbers; WisLoan file on cards; print-out of WisLoan; WisBank ID cards from Tom Velk; print-out of WisBank.

Step 3: Formation of a sample residual list
Boxes: tax forms; print-out of non-duplicate LFF-WisSEC; print-out of WisBank; print-out of WisLoan; duplicates 1, 2, and 3; residual list from sample of tax forms; sample residual list (8000's) ("Alpha List"); WisBank and WisLoan found; necessary deletion of duplicates from sample residual list; sample residual list (8000's) with duplicates deleted (File 27); sample residual list on cards; print-out of sample residual list (File 27).

Step 4: Production coding and formation of a complete residual list
Boxes: print-outs of non-duplicate LFF-WisSEC, WisBank, WisLoan, and sample residual list; tax forms; production coding; coded tax forms (keypunching); new residual list of firms; Property File data cards; new residual list on cards; duplicates with other lists; sample residual list on cards; complete residual list on cards; print-out of complete residual list.

II. Notes to diagram

1 The LFF cards were obtained from the University of Chicago; each one contains the name of a firm on the left, a four-digit identification number on the right, and either two or four groups of numbers (or "R's") in between. The "R's" apparently signify that at the end of 1960, which is the most recent date for which the tapes contain data, the firm's securities were still being traded on the New York Stock Exchange (N.Y.S.E.).
The first block of numbers seems to be the date that the security was first listed on the N.Y.S.E. The second block seems to be the date that the security was last traded on the N.Y.S.E. When there are four blocks of numbers, then the last two blocks apparently have something to do with either dates of mergers or name changes. To be sure, however, we should ask either Fisher or Lorie.

The identification numbers on the LFF tapes are five-digit numbers. Counting the leading minuses in the card identification numbers as zeros, the card identification numbers seem to be the first four digits of the tape identification numbers. When the last digit in the tape identification number is zero, the four-digit card identification number is followed by blanks. However, when the last digit in the tape identification number is nine, the four-digit card identification number is followed by either a "+A," a "+B," or a "+F." This much is known from examining the data for the first ten firms on the LFF tapes.

Unfortunately, there is an exception to the above rules. The tape identification number for Louisville Gas & Electric Co. is 00059. According to the pattern established for the other nine of the first ten companies on the tapes, the card identification number for Louisville Gas & Electric Co. should be either -005+A, -005+B, or -005+F. Louisville Gas & Electric Co. has two card identification numbers: -059+A and -059+F. We assume that this is an error and that the correct card identification numbers for Louisville Gas & Electric Co. are -005+A and -005+F.

It may be that the different letters at the end of the card identification numbers signify different kinds of securities, such as common and preferred stock, but this is only conjecture at present.

2 The format for the bank identification cards was provided to us by Tom Velk, currently at McGill University in Montreal, Canada, and is as follows:

Table 4
Format of bank identification cards

Cols.    Description
1-50     Bank name
51       Response to Hooker-Stubles question ****
52       Zone ****
53-54    State (48 - Wisconsin)
55-56    Class ****
57-58    FDIC district
59-60    FR district
61-63    Metropolitan area
64-66    County
67-68    City
69-70    Population class
71       Type ****
         Branch banks
72       No. of offices
73       Holding company
74       Weekly reporting
75-76    Blank
77-80    Bank identification number

In addition, WAIS has two decks of bank data cards which are being stored in the coders' room. The formats of these cards will be described in a future WAIS paper.

**** The reader is referred to the dissertations of Hooker and Stubles for the explanation of these and other codes.

3 Some members of the supervisory staff apparently went through some of the tax forms before the production coding was started and wrote down the names of firms not on the merged LFF-WisSEC list. These firms were later (after alphabetizing and elimination of duplicates) assigned asset identification numbers starting with 8000 and running to 9000. This is called File 27 in WAIS 667-020. When the production coding was later started, the additional residual firms found were then assigned identification numbers starting with 9000 and running into the "Add 50's." Thus, File 27 is the original residual list with which the coders started working.

4 Periodic checks were made to ensure that multiple asset identification numbers were not assigned to a firm. There is, of course, no guarantee that all Property File data cards which have data for the same firm have a unique asset identification number until a check for duplicates has been run. Until that time, no evidence of possible duplicate asset identification numbers should be discarded.

5 All of the firm identification cards are stored in the wooden cabinet in room 7413.
6 Although the coders worked with a merged LFF-WisSEC list in assigning asset identification numbers, they assigned different asset types to firms in the two lists (10 for firms in LFF and 11 or 61 for firms in WisSEC). The residual cards that are not "Add 50's" have an N punched in Col. 75.

http://www.ssc.wisc.edu/wais/WAIS678013.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS678013.txt

William Duddleston 1967
Procedure for Integrating Simultaneous Benefit Claims Cases Into the Benefit Year Record File
August 17, 1967
WAIS paper 678-014
Benefit File

William Duddleston
WAIS 678-014
August 17, 1967

Procedure for Integrating Simultaneous Benefit Claims Cases Into the Benefit Year Record File

Over the last thirty years, women have been joining the labor force in ever increasing numbers. As a consequence of this increased participation, more women become eligible for social security payments on their own account (OAIB). However, many of these women, who are eligible as a primary beneficiary, are also entitled to social security benefits on another account. Dual entitlements crop up in various instances, but the majority of the cases involve a woman receiving benefits on her own account and her husband's (deceased or living) account.

There are 125 simultaneous benefit accounts in our benefit file. These individual accounts have a single WAIS ID but two legitimate social security numbers. The existing program (George Loniello's SS Payments), which is used to create a benefit year record (BYR) for each identified claims case, is not equipped to handle more than one social security account number for each beneficiary. Therefore a system had to be devised to handle simultaneous benefit claims cases.

The payment histories of these 125 claims cases were quite messy and not amenable to programming. Dick Bauman decided that it would be advisable to code the benefit year records for these accounts manually.
Appendix III in WAIS 667-001 contains the basic format for the benefit year record tape. This format had to be expanded to incorporate all the exceptions that were discovered while coding the simultaneous benefit claims cases. This format, along with explanatory codes, appears on the following pages.

Once the entire benefit year record tape (BYR) is created, the 125 claims cases will have to be deleted from this tape. Mike von Schneidemesser's UPDATEAL program will be utilized to delete these records. The benefit year records which were coded and computed by hand are presently on cards. These cards will be read onto tape; this tape will then be merged with the master BYR tape. Hopefully we will then have a complete and integrated benefit year record.

Appendix III(a) - Format For Benefit Year Record Of Identified Claims Accounts Receiving Simultaneous Benefits

(1) Format

Positions   No. of Positions   Data
1-8         8                  WAIS ID for beneficiary
9-17        9                  OAIB SSA # for beneficiary (See Note (1))
18-26       9                  SSA # - earliest date of payment (See Note (2))
27-28       2                  Year of record
29-34       6                  Total amt. of benefits paid that year
35-46       12                 Monthly Payment Code (See Code (3))
47-50       4                  Blank
51          1                  No Record Indication (See Code (3))
52-60       9                  SSA # - simultaneous benefit account (See Note (1))
61          1                  Number of benefit accounts for this beneficiary - entire history
62-72       11                 Social Security beneficiary ID codes (3)
73          1                  X if this is last year of SS benefit history
74          1                  Record creation method (See Code (3))
75          1                  Indicator of simultaneous benefit paid during year (See Code (3))

(2) Explanatory Notes

1- Positions 9-17 contain the OAIB account number (primary beneficiary account - PIC "A").

2- Positions 18-26. The social security number that appears in these positions is the one with the earliest payment date. When both social security account numbers have the same date of initial payment, the social security number of the primary beneficiary (OAIB) is given priority.
The second social security number is relegated to positions 52-60.

3- Positions 52-60 are also reserved for the social security number that has the later date of initial payment.

(3) Codes

(a) Monthly Payment Record Code

Single Acct. Code   Simultaneous Benefit Code   Explanation
blank               blank                       Benefits not paid during month because not yet entitled or terminated in previous month
1                   S (1)                       Benefits Paid During Month
2                   R                           Benefits Not Paid - Retroactive
3                   W                           Benefits Not Paid - Worked
4                   A                           Benefits Not Paid - Adjusted
5                   T                           Benefits Not Paid - Terminated
6                   O                           Benefits Not Paid - Entitled To Another Type of Benefit
7                   L                           Benefits Paid During Month Include Lump Sum Death Payment
8                   D                           Benefits Not Paid - Disallowed Claim
9                   N                           Benefits Not Paid - Denied
0                   E                           Benefits Paid During Month Include Payments for Excess Deductions in Previous Month

(1) See field 75.

(b) Codes For No Record

Code   Explanation
N      No information -- from "No Record" code.
E      No payment record during year(s).

(c) Social Security ID Codes for Entire History

Code   Location    Explanation               Values
A      (Col. 62)   Primary Beneficiary       1 - Yes, 0 - No
C      (Col. 63)   Child Beneficiary         1 - Yes, 0 - No
B1     (Col. 64)   Dependent Male            1 - Yes, 0 - No
D1     (Col. 65)   Widower                   1 - Yes, 0 - No
B2     (Col. 66)   Wife and Mother           1 - Yes, 0 - No
B      (Col. 67)   Wife                      1 - Yes, 0 - No
E      (Col. 68)   Widow                     1 - Yes, 0 - No
E1     (Col. 69)   Mother (Divorced wife)    1 - Yes, 0 - No
D      (Col. 70)   Widow                     1 - Yes, 0 - No
F      (Col. 71)   Parent                    1 - Yes, 0 - No
G      (Col. 72)   Lump Sum Paid             1 - Yes, 0 - No

(d) Codes for Record Creation Method

Code   Explanation
0      Some monthly payment information for this year.
C      This year record extrapolated from previous year's record, and no monthly payment information is given this year.

(e) Codes for Indicator of whether or not simultaneous benefits paid during year

Code    Explanation
blank   Only one social security number for the beneficiary.
0       No simultaneous benefits were paid during the entire year. Only one account was eligible for benefit payments (SS # in Cols.
18-26). Single account code in effect for monthly payment record (Cols. 35-46).

1       Simultaneous benefits were paid on both accounts each month of the year from January through December (Cols. 35-46). Since a 1 indicates that simultaneous benefits were paid every month, the symbol S (the Simultaneous Benefit Code for indicating that simultaneous benefits were paid in a particular month) has been dispensed with in this case and replaced with the conventional 1 (Single Acct. Code - Monthly Payment Records, Cols. 35-46), which indicates that benefits were paid that month. However, when some peculiarity occurs during one particular month, such as a lump sum death payment, the Simultaneous Benefit Monthly Payment Code will be utilised.

2       Simultaneous benefit claims were paid during part of the year. In the monthly payment field positions, the months in which simultaneous payments were made are indicated by the letter S or other appropriate alphabetic simultaneous benefit payment codes.

3       Simultaneous benefit claims were paid in some of the months, but one of the social security accounts was terminated sometime during the year. In the month in which one of the accounts was terminated, a T will appear.

http://www.ssc.wisc.edu/wais/WAIS678014.pdf http://www.ssc.wisc.edu/wais/textFiles/WAIS678014.txt

Mike VonSchneidemesser 1967
Elimination of Multiple ID Numbers Valid for the 1946-60 Tax Return File
August 18, 1967
WAIS paper 678-015
Fixed Format Identification File (FFID) Maintenance System - Files, Data, Etc. Master File - Tax Records

Michael von Schneidemesser
WAIS 678-015
August 18, 1967

Elimination of Multiple ID Numbers Valid for the 1946-60 Tax Return File

After Robert Esterly had cleaned up the WAIS files (Master, property and ID files) in regard to invalid multiple ID numbers (WAIS 667-023), the valid multiple ID numbers had to be eliminated to achieve the goals stated in WAIS 667-034, point 1, and WAIS 667-037, phase 1a.

PROCEDURES

A.
Creating a file of the relevant individuals

Multiple ID numbers were extracted with the program MULTIPID. This program uses the ID file sorted on SS number. The program punches a card for each of the ID records which have a common SS number. (See appendix for the format.) To this card deck, 4 cards for two individuals without SS number were added. These were found by a hand check method using a listing of all people without SS number, sorted on name.

The card deck so generated was separated into two groups:

Group I has a dependent number in the first ID field (col. 9 = 1 or greater). This deck contained a number of inconsistencies, which were sorted out and cleaned up. These were cards with a dependent ID number in both the first and second ID field, and ID number combinations which gave one person two different positions in one household, like a father who is also a son.

Group II has a husband or wife or independent head of household number in the first ID field (col. 9 = 0). Cards with an independent ID number in both the first and second ID field were sorted out. Those which were both male numbers and those which had one female and one male number were removed, and the cause of this inconsistency was removed from the files. The remaining cases were females according to both numbers. They were checked for validity, and whatever remained are valid multiple ID numbers which will remain even under the new definition, since these are women who were members of two or more households during the period covered by the WAIS files.

The cleaned up Group I and Group II cards, as well as the still remaining valid multiple ID numbers, are kept in the Multiple ID file box. Group I and Group II cover the same population and the cards should look alike, except for the ID numbers being interchanged and the year being different.

B.
Reconstituting the valid multiple ID mode in the WAIS files

Use the Group I deck and match on the second ID number. Then replace the ID number with the first ID number from the Multiple ID card for all year records up to and including the year given on the card.

C. Changing the IDs on the basic files

Master file: C-cards were reproduced from the Group I deck and submitted to the MAUPDATE program.

Property file: C-cards of the same format as for the Master file were generated; however, at this time no attempt was made to do anything to the file.

ID file: J-cards were made from Group I and submitted to UPDATEFID.

Age data file: A manual check with a listing of Group I was made for any affected Age data cards.

Benefit file: A manual check with the listing was made, and 2 cases were submitted to the UPDATEAL-Benefit version.

Form 805: Not applicable.

History file: Nothing was attempted.

D. Refiling the source documents

The Group I deck was listed. This listing was used to pull all tax returns and demographic and property file code sheets, amend them with the new ID number, and place them in the folder identified by the second ID number. The original folder was marked with red pen: "...moved to...." The FFID code sheet was left in the original folder as a reference, with the new ID number written over the old one.

E. Other considerations

The Master file contains a code for address change. This code was checked for all individuals who were integrated, so that the code 99 (previous return not present) was appropriately corrected; correction cards were run on the Master file for this.

COUNTS

Extracted by MULTIPID were 1088 Multiple ID cards, and 4 Multiple ID cards without SS#'s were added to this.
This comes out to 546 multiple ID cases, of which there were:

458 valid multiple ID cases which were changed on the files;

24 cases of valid multiple IDs which could not be changed: 2 cases where a sample woman married (another) man in the sample, and 22 cases where the woman appears in the tax sample and also under a different name in the Benefit file (name group 70);

64 cases of invalid multiple IDs which were not detected previously or are results of Benefit miscodings.

APPENDIX

Format of Multiple ID cards extracted by program "MULTIPID" from the ID file sorted on SS#.

Cols.    Description
1        'M' to indicate multiple ID card
2-9      ID number, taken from ID record, same position
10-18    SS number, taken from ID record, same position
19-62    Name, taken from ID record, same position
63-64    Year, taken from ID record, position 122-123
65-72    Blank
73-80    ID number, taken from the other record with the same SS number (if blank, then cols. 2-64 of this card are from a third or fourth multiple ID record)

http://www.ssc.wisc.edu/wais/WAIS678015.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS678015.txt

Richard Bauman
1967
Additions and Revisions to WAIS 678-002 (Table Specifications for Series P Tabulations)
August 22, 1967
WAIS paper 678-016
Cross Tabulations Extract 01 Tables

Richard A. Bauman
WAIS 678-016
August 22, 1967

Additions and Revisions to WAIS 678-002 (Table Specifications for Series P Tabulations)

Legend: Add = Additional Specification; Rev = Revised Specification

I. Table Specifications

Rev Tables P3, P4, P16, P17, P18, P19, P20
Replace Year 1 with Year 4

Add Table P23
Page Variables: Sex 01, Birth Year Groups 1
Row Variable: Year 1
Column Variable: Occupation 2
Cell Entries: Simple F.C., Row %

Add Table P24
Page Variables: Sex 01, Birth Year Groups 1
Row Variable: Year 4
Column Variable: Propinc 2
Cell Entries: Simple F.C., Row %, Mean A.V.-AGI

Add Table P25
Page Variables: Sex 01, Birth Year Group 1
Row Variable: Year 4
Column Variable: W+S 3
Cell Entries: Simple F.C.
Row %, Mean A.V.-AGI

Add Table P26
Page Variables: Sex 01, Birth Year Group 1
Row Variable: Year 4
Column Variable: Selfinc 3
Cell Entries: Simple F.C., Row %, Mean A.V.-AGI

Add Table P27
Row Variable: Year 1
Column Variable: Propinc 3
Cell Entries: Simple F.C., Row %

Add Table P28
Row Variable: Year 1
Column Variable: W+S 4
Cell Entries: Simple F.C., Row %

Note: A comparable table for Selfinc (26 Boundaries) is Table 2 in "Master Frequency Counts of AGI Plus Other Tables".

Add Table P29
Page Variables: Sex 01, Year 5
Row Variable: CounRes 1
Column Variable: CresPri 1
Cell Entries: Simple F.C., Row %, Col. %, Mean A.V.-AGI

Add Table P30
Page Variable: Sex 01
Row Variable: CounRes 2
*Column Variable: Year 5
Cell Entries: Simple F.C., Row %, Col. %, Mean A.V.-AGI
*Suppress last column

Add Table P31
Page Variables: Sex 01, Marital Status 1, Birth Year Group 1
*Row Variable: Year 4
Column Variable: Return - Reason 2
Cell Entries: Simple F.C., Row %, Mean A.V.-AGI
*Suppress last row

Add IIIA. Transformation of Year Variable (To exclude N.A. Amount Fields from Mean A.V. Tables)

Assign Year 4 (v25) as follows: If one or more of AGI (v13), W+S (v15), or Selfinc (v18) are N.A. (coded 9999999), let Year 4 (v25) = 99; otherwise let v25 = v4.

Add IIIB. Creation of Property Income Variable

The Property Income Sum Variable (Propinc 1 - v26) is defined as

v16 (Dividends) + v17 (Cap Gains) + v19 (Interest) + v20 (Rent) + v21 (Trust Income) = v26 (Propinc 1)

Propinc 1 should be computed whenever all fields (v16, v17, v19, v20, v21) contain valid amounts (i.e., not coded 9999999). If one or more of the augends are N.A. (coded 9999999), Propinc 1 should be coded 9999999.

IV.
Interval Boundaries

Add Year 4 - v23 (16 Boundaries)
46,...(1)...,60,99

Add Year 5 - v23 (6 Boundaries)
48,49,58,59,60,99

Add CounRes 2 (76 Boundaries)
0,1,...(1)...,72,97,98,99

Add W+S 3 (11 Boundaries)
-1,0,999,1999,2999,3999,4999,6999,9999,9999998,9999999

Add W+S 4 (29 Boundaries)
-1,0,499,999,1499,1999,2499,2999,3499,3999,4499,4999,5999,6999,7999,8999,9999,10999,12499,14999,17499,19999,24999,37499,49999,74999,99999,9999998,9999999

Add Selfinc 3 (11 Boundaries)
-1,0,999,1999,2999,3999,4999,6999,9999,9999998,9999999

Add Propinc 2 (12 Boundaries)
-1,0,999,1999,2999,3999,4999,6999,9999,9999998,9999999

Add Propinc 3 (31 Boundaries)
-999,-1,0,199,499,999,1499,1999,2499,2999,3499,3999,4499,4999,5999,6999,7999,8999,9999,10999,12499,14999,17499,19999,24999,37499,49999,74999,99999,9999998,9999999

Add Retres 2 (5 Boundaries)
1,5,6,8,9

Rev V. Priority

First Pass: Tables P3, P16, P17, P18, P19, P20
Second Pass: Tables P1, P5-P15, P27-P31
Third Pass: Tables P24-P26

Tables P2, P4, P21, P22, and P23 will be postponed.

http://www.ssc.wisc.edu/wais/WAIS678016.pdf
http://www.ssc.wisc.edu/wais/textFiles/WAIS678016.txt

Wynn Bussmann
1967
Processing 1947-1959 Tax Returns: Property File Corrections re. Stock Portfolio Evaluation
September 11, 1967
WAIS paper 678-018
Property Files

Wynn V. Bussmann
WAIS 678-018
11 September 1967

Processing 1947-1959 Tax Returns: Property File Corrections re. Stock Portfolio Evaluation

I. Introduction

During the process of editing the 1947-1959 Property File in the late spring of 1967, more and more "errors"* were discovered in the file. The job of correcting these errors finally became rather complicated, so the editing process was halted temporarily while we determined the best course of action to take. (At about the same time, it was discovered that the card edit program was not only very inefficient but also unreliable, so that the card edits were stopped on both the Property File and the Survey.)
This paper is an attempt to describe the existing errors in the Property File, discuss their importance to stock portfolio evaluation, and put forth various proposals for correcting the errors.

II. The Errors, Their Importance, and Their Corrections

Although no precise distinction can be made among the effects of the various errors, for purposes of discussion we may view the errors as falling into two categories: (1) those that directly affect the potential size and scope of analyses, and (2) those that directly affect the reliability of the data.

-------------------------
*By "errors" I mean errors or desired alterations in existing data and omissions of desired data. For convenience, I shall drop the quotation marks and use errors in this broad sense throughout this paper.

A. Errors Affecting Analyses

1. Numbers of shares missing on the capital gains cards (card 4)

When the 1947-1959 Property File data were coded, no provision was made for coding the numbers of shares of stocks sold on card 4. Admittedly, many taxpayers -- especially those with smaller portfolios -- neglect to record on their tax returns the numbers of shares of the stocks which they sold during the previous year. But many other taxpayers do provide such information. As long as a stock is listed in the Lorie-Fisher file (in which case, we have the stock's prices and dividend rates through time),