=======================
Cor867 Layout Overview:
=======================

cor867: Document providing the user with an overview of the ICD-9 coding system and how that system fits with the WLS data. Discusses the ICD-9 Code Set, Coding in Practice, and Public/Private Data.

cor867a: Document providing the user with a more detailed explanation of the procedural coding for the different variable categories that are briefly explained later in this cor.
\public\cors\cor867a.doc

cor867b: Spreadsheet providing the user with information on what the numeric ICD-9 codes mean in practice and an explanation of why we used a specific code to represent a specific ailment.
\public\cors\cor867b)decision_tree

cor867c: Document used to train new ICD-9 coders.
\public\cors\cor867c.doc

cor867d: Decision tree used in training new ICD-9 coders to use the many ICD-9 resources.
\public\cors\Appendices\R-Death and Disability Coding\cor867d.xls

cor867e: Documentation of our procedures for, and the outcomes of, testing the reliability between ICD-9 coders.

cor867f: Spreadsheet of the complete ICD-9 code set. If the interactive website we used during coding, http://www.icd9coding1.com/flashcode/home.jsp , is no longer available, this document has the complete list of ICD-9 codes. The actual code is in the ‘code’ column (column F). The ‘class’ column (column A) is important because it denotes the type of code: a zero is a regular three-digit code, a one is an E code, a two is a V code, and a three is a Volume 3 Heading code. All of these are explained in ‘Explaining the ICD-9 Code Set.’

cor867g: Spreadsheet of raw and constructed variables coded using ICD-9, and descriptions of them.
\public\cors\Appendices\R-Death and Disability Coding\cor867g.xls

==============================
Explaining the ICD-9 Code Set:
==============================

The International Classification of Diseases, Revision 9 (ICD-9) is a classification system developed by the World Health Organization to maintain health statistics; it is also used by the insurance sector for medical billing purposes. The codes in this classification system are grouped into 17 broad categories of disease and injury, each with an associated range of three-digit numerical codes for particular types of diseases. Items 18 through 20 in the list below are the supplementary code sets (E codes, V codes, and Volume 3 Heading codes) that are explained later in this section.

1. Infectious and Parasitic Diseases (001-139)
2. Neoplasms (140-239)
3. Endocrine, Nutritional, and Metabolic Diseases and Immunity Disorders (240-279)
4. Diseases of the Blood and Blood-forming Organs (280-289)
5. Mental Disorders (290-319)
6. Diseases of the Nervous System and Sense Organs (320-389)
7. Diseases of the Circulatory System (390-459)
8. Diseases of the Respiratory System (460-519)
9. Diseases of the Digestive System (520-579)
10. Diseases of the Genitourinary System (580-629)
11. Complications of Pregnancy, Childbirth, and Puerperium (630-676)
12. Diseases of the Skin and Subcutaneous Tissue (680-709)
13. Diseases of the Musculoskeletal System and Connective Tissue (710-739)
14. Congenital Anomalies (740-759)
15. Certain Conditions Originating in the Perinatal Period (760-779)
16. Symptoms, Signs and Ill-Defined Conditions (780-799)
17. Injury and Poisoning (800-999)
18. E-Codes (E800-E999)
19. V-Codes (V01-V85)
20. ICD Volume 3 Headings (00-99)

Within these seventeen categories of types of diseases, specific diseases are further delineated at two levels. For example, liver cancer can be coded generally using the three-digit code 155. This 155 code corresponds to the general category "Malignant neoplasm of liver and intrahepatic bile ducts."
Employing a decimal system to further specify, liver cancer can be coded exactly as 155.0, "Liver, primary," which describes the exact site of the malignant neoplasm. In the same general three-digit category, cancer of the intrahepatic bile ducts is coded exactly as 155.1. That is the core of the ICD-9 coding system.

For our purposes, we did not use the decimal points in our code set, but we referenced them to find out which specific disorders each three-digit code included. Because responses to the open-ended questions were often ambiguous, when we picked a three-digit code we relied mainly on the fact that most three-digit codes include a decimal specification for "unspecified."

In addition to the 17 categories, there are the E codes, V codes, and Volume 3 Heading codes. When we first started using ICD-9 as our code set for the cause of death variables, it included only the three-digit codes in the 17 categories. However, as we encountered variables that dealt with causes of disability rather than cause of death, we encountered more and more responses that did not fit any of the three-digit codes, and we added the E codes. After this addition, we again ran into responses that could easily be coded with V codes and Volume 3 Heading codes. We chose to add those as well, to maintain as much specificity as we could while using ICD-9 as our code set. For this reason, some variables in our study (the cause of death variables) use only the three-digit codes and E codes, while others (the disability variables) use those as well as the V and Volume 3 Heading codes. This may seem to undermine uniformity, but once the nature of the variables and the additional code sets are explained, it will be clear that these differences do not affect the data in a negative way.

The E codes are loosely "Accident codes."
They are set up the same way as the "standard" three-digit codes; the main difference is that there are 25 categories that deal entirely with accidents that presumably cause an injury of some sort, or death. For example, the E codes include categories such as "Misadventures to Patients During Surgical and Medical Care," "Accidental Falls," "Motor Vehicle Traffic Accidents," and "Injuries Resulting from Operations of War." The E codes are used in all coding, as are the "standard" three-digit codes. A small note about E codes: we use a 5 in place of the "E," so in our code set and data all of the E codes are actually four-digit codes that start with a 5 to denote that they are E codes and not "standard" three-digit codes.

The V codes are "factors that influence health status and contact with health services." Many of these codes are as vague as the title of the V codes, and they are mainly used in variables that have to do with disabilities. They have 9 main categories, which include, for example, "Persons with a Condition Influencing their Health Status," "Persons with Genetic Susceptibility to Disease," and "Persons with Potential Health Hazards Related to Personal and Family History." Again, the format, with decimals giving more specificity, is the same. In the same way that we gave the "E" in the E codes a numerical value, we gave the "V" a two-digit numeric value, since the V codes have only two digits, not three. We substituted "60" for the V; for example, the code "V49" becomes "6049."

The last sub-code set of the ICD-9 code set is the Volume 3 Heading codes. These are mainly operations, surgeries, and procedures. Again, the two-place decimal system that adds specificity is the same as for all the other codes; however, like the V codes, these codes have only two digits. In our code set these codes are designated by adding a "70" in front of the actual two-digit ICD-9 code.
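
The numeric substitutions described above (5 for the E, 60 for the V, and 70 in front of a Volume 3 heading) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure the coders used: the function name to_numeric_code and the volume3 flag are hypothetical, and storing the result as an integer is an assumption.

```python
import re

E_PREFIX = "5"      # replaces the letter E (E880 -> 5880)
V_PREFIX = "60"     # replaces the letter V (V49 -> 6049)
VOL3_PREFIX = "70"  # prepended to two-digit Volume 3 headings (35 -> 7035)


def to_numeric_code(icd9: str, volume3: bool = False) -> int:
    """Convert a printed ICD-9 code to the all-numeric form (hypothetical helper).

    Any decimal detail (e.g. the ".0" in "155.0") is dropped, because the
    code set described in the text does not use the decimal points.
    """
    head = icd9.strip().upper().split(".")[0]
    if volume3:
        # Volume 3 headings are two digits; the caller must flag them
        # explicitly so they are not confused with other codes.
        if re.fullmatch(r"\d{2}", head):
            return int(VOL3_PREFIX + head)
        raise ValueError(f"not a two-digit Volume 3 heading: {icd9!r}")
    if re.fullmatch(r"E\d{3}", head):
        return int(E_PREFIX + head[1:])
    if re.fullmatch(r"V\d{2}", head):
        return int(V_PREFIX + head[1:])
    if re.fullmatch(r"\d{3}", head):
        return int(head)
    raise ValueError(f"unrecognized ICD-9 code: {icd9!r}")
```

With this sketch, to_numeric_code("V49") gives 6049 and to_numeric_code("155.0") gives 155. Because the three numeric prefixes produce four-digit values, they cannot collide with the standard three-digit codes.
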
For example, the Volume 3 Heading code "35" becomes "7035."

=================================================
Coding in Practice:
(For more detail and examples, see cor867a and b)
=================================================

Although cor867a and cor867b go into the more specific rules of coding with ICD-9, the main procedure for cause of death and disability coding should be explained here. Many times the respondent gave more than one cause of death or reason for disability. Early on it was decided that we would code all the ailments listed by the respondent, in the order they were mentioned. This decision was, for the most part, kept. To that rule we added that if a response was something to the effect of, "Alzheimer, old age, and a heart attack, but the Alzheimer isn't what killed him," we would not code an ailment that the respondent specifically said did not contribute to the death or disability. In a similar fashion, if the response was something like, "stroke and then he had a heart attack, but the heart attack is what killed him," we decided to code the ailment specified as the main contributor to the death first, followed by the other ailment mentioned. In this last example, we would first code the heart attack and then the stroke.

The "Cause of Death Variable Rule" is, in its entirety: "Code every ailment mentioned in the order the respondent lists them, unless the respondent specifically says it was one ailment over the other, then code the specified one first, and the other one/s after it, and if the respondent says a specified ailment specifically did not lead to the death, then do not code it at all."

The "Disability Variable Rule" is a bit different, and more complicated. It focuses on coding the symptom first, and then the cause, if it is clear from the response that both a cause and a symptom are present and able to be coded. Determining causality was the main concern with these variables.
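
The ordering logic of the "Cause of Death Variable Rule" stated above can be sketched as follows. This is a minimal illustration, not the coders' actual tooling: the Mention class, its field names, and the order_codes function are hypothetical, and the three-digit codes in the usage note are illustrative choices that may differ from what the coders assigned.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    """One ailment mentioned in a response (hypothetical structure)."""
    code: int                # numeric code assigned to the ailment
    main: bool = False       # respondent singled it out as what caused the death
    ruled_out: bool = False  # respondent said it did not lead to the death


def order_codes(mentions: list[Mention]) -> list[int]:
    """Apply the cause-of-death ordering rule to mentions in response order."""
    # Ailments the respondent explicitly ruled out are not coded at all.
    kept = [m for m in mentions if not m.ruled_out]
    # sorted() is stable, so the singled-out ailment moves to the front
    # while every other ailment keeps the order in which it was mentioned.
    ordered = sorted(kept, key=lambda m: not m.main)
    return [m.code for m in ordered]
```

For the response "stroke and then he had a heart attack, but the heart attack is what killed him," taking 436 for the stroke and 410 for the heart attack as illustrative codes, order_codes([Mention(436), Mention(410, main=True)]) returns [410, 436]. The Disability Variable Rule is not covered by this sketch, since it depends on judging which mention is the symptom and which is the cause.
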
For now, the main point is that in most cases multiple responses were given, and in the private data up to 6 ailments were coded for a single case. For analysis and variable creation reasons, many people choose to use just the first ailment coded. Keep in mind that, following the above coding rules, the first code is not always the ultimate cause of death or the underlying disability. There are plans to create a variable that holds our best estimate of the ultimate cause of death, but as of now the public data uses only the first code. This is especially important because of our decision to collapse this potentially identifying information for the public release of the data.

========================
Public vs. Private Data:
========================

As just mentioned, the data were collapsed for the public release into the previously listed 20 categories. So, in the public data release, there are 18 cause of death codes (the 17 three-digit categories plus the E codes) and 20 disability codes (all 20 categories). This note about collapsing is even more important when the coding rules are taken into account: the codes were collapsed based on the first code listed. Because of the nature of our coding rules, this may not truly represent the ultimate cause of death or disability.
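
The collapse of the first code into one of the 20 listed categories can be sketched as below. This is an illustration under assumptions, not the actual release procedure: the function names are hypothetical, the categories are assumed to be numbered 1 through 20 in the order of the list in "Explaining the ICD-9 Code Set," and the values actually stored in the public release may differ.

```python
# (low, high) bounds of the 17 three-digit categories, in list order.
THREE_DIGIT_RANGES = [
    (1, 139), (140, 239), (240, 279), (280, 289), (290, 319), (320, 389),
    (390, 459), (460, 519), (520, 579), (580, 629), (630, 676), (680, 709),
    (710, 739), (740, 759), (760, 779), (780, 799), (800, 999),
]


def public_category(code: int) -> int:
    """Map one numeric code to an assumed category number from 1 to 20."""
    if 5800 <= code <= 5999:   # E codes E800-E999, stored with a leading 5
        return 18
    if 6001 <= code <= 6085:   # V codes V01-V85, stored with a leading 60
        return 19
    if 7000 <= code <= 7099:   # Volume 3 headings 00-99, stored with a leading 70
        return 20
    for number, (low, high) in enumerate(THREE_DIGIT_RANGES, start=1):
        if low <= code <= high:
            return number
    raise ValueError(f"code {code} is outside the listed ranges")


def collapse_first(codes: list[int]) -> "int | None":
    """Collapse a case to the category of its first code, as the text describes."""
    return public_category(codes[0]) if codes else None
```

For a case coded [436, 410], collapse_first returns 7 (Diseases of the Circulatory System), determined by the first code alone. The later codes play no part, which is why the public category may not reflect the ultimate cause of death or disability.
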