Jane Allyn Piliavin, Sociology at UW Madison


DUE DATES: See deadlines

Your task for this exercise is to develop a set of questions (called an index) to measure a complex concept, and then to test a simple and obvious bivariate hypothesis involving that concept. You will administer the questions to a minimum of 40 people, check the inter-item correlations to be sure it is appropriate to sum them in a composite index (i.e., check the reliability of the index), compare the index against an open-ended question addressing the same concept as the index, as an indication of measurement validity, and test your bivariate hypothesis with both the index and the open-ended question. This exercise is mainly about measuring things using a survey instrument. It is not mainly about hypothesis testing.
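
To make the reliability check concrete, here is a minimal Python sketch with invented scores. It computes a Pearson correlation between two items and Cronbach's alpha, one commonly used reliability statistic. This handout does not say which statistic the class analysis will report, so treat the choice of alpha as an assumption; the function names and the data are hypothetical.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns.

    Each column holds one item's coded scores for all respondents,
    already reverse coded where needed and with no missing values.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # each respondent's index
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Invented scores: three items answered by five respondents.
item1 = [1, 2, 3, 4, 5]
item2 = [2, 2, 3, 5, 5]
item3 = [1, 3, 3, 4, 4]

print(pearson(item1, item2))
print(cronbach_alpha([item1, item2, item3]))
```

Items that correlate weakly or negatively with the others pull alpha down, which is one sign that they may not belong in the summed index.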


You are strongly encouraged to do this exercise in teams of 2 to 4 people (no more than 4). The only disadvantage of a team is scheduling; this will be outweighed by the advantages unless you have a very complex schedule. Developing good questions is a task often done better by several heads than one; teams can collect the data more efficiently and can divide the labor in coding and preparing tables. Team members will normally write a group report plus an individual evaluation, and this is what I strongly prefer. If time schedules or personality conflicts make preparing a group report too difficult, some or all of the report may be written separately by each group member (or subsets of the group).

Questionnaire Development

(Teams must work together on this part.)

NOTE: As stated on the syllabus, a rough draft of your questions is due as homework and will be returned with suggestions on a schedule indicated in the syllabus. Submit a xerox of the questions; keep your copy -- preferably a copy for each member. Make sure all students' names are on the draft so all can get credit for the homework. Please submit all the ideas if you disagree. In general, if you agree on what concept to measure but disagree about which are the best questions, it is OK to include all the possible questions in your questionnaire and let the reliability analysis we run on it tell you which ones are the best. I will be available for extended office hours to give groups individual assistance on this.

ANOTHER NOTE: The instructions below assume that your multiple-item index is dependent and everything else is independent. It will actually work out fine if your multiple-item index and open-ended question serve as the independent variable and you have a simple attitude or behavior for your dependent variable; you would still include a few additional independent background factors as possible control variables. This would require minor changes in the organization of the write-up. See me if you think you are in this situation. I will give you a revised set of instructions.

  1. Central concept. Pick a relatively general attitude, belief, or behavioral pattern, one which could obviously have a wide variety of possible measures. This concept must be at least ordinal: that is, there must be one continuum on which people are higher or lower. Your task is to develop a variety of closed-ended questions, which can be summed to form an index to measure the concept better than the individual items would have separately. About 90% of your effort in writing questions should go into these multiple measures of the central concept. You will also write one very general open-ended question to capture the person's opinion -- in his or her own words -- ON THE SAME CONSTRUCT AS THE CLOSED-ENDED QUESTIONS; we will use this measure as a check on the validity of the index.
    1. Closed-ended questions. You should write 10-15 closed-ended questions, each measuring a different aspect of the same variable. At least one should be a rather general question, to capture the main idea of what you are interested in. The others should ask about different dimensions or themes relevant to your concept. Unless this is impossible, some of these items should be worded positively so that agreeing means a person is on the high end of your concept, while others should be worded negatively so that disagreeing means a person is on the high end. (This is a standard procedure in surveys, done to avoid the problem of "yea-saying.")
      The answer categories should provide a ranking to express more or less of the attitude you are measuring, and should be presented to respondents in that rank order. (See example.) You should generally have four or six response categories for each question. It is usually best to avoid a "neutral" middle response category (but note that the example uses one).
      TO CHECK YOURSELF: You are going to add up the responses to these items, so it has to make sense to do this. All the different questions measuring the central variable should give you a score from high to low on the same conceptual variable/construct. Positively worded items will yield a high score for agreeing, while negatively worded items will yield a high score for disagreeing. If you have difficulty deciding whether a particular item should be treated as a positive or a negative, it probably has a problem. If you have difficulty with most of the items, either they or your concept (or your understanding of it) have more serious problems. SEE ME RIGHT AWAY IF THIS HAPPENS TO YOU.
    2. Open-ended question. Include in your questionnaire one general and straightforward open-ended question that asks people to use their own words to tell you what they think about the same thing that is the subject of the multi-item dependent variable. You will code their responses and use the results to evaluate the measurement validity of the multi-item index.
  2. Independent variables. Select one independent variable that you really believe has to be related to your central variable; that is, the relationship is REALLY OBVIOUS. (For example: if your concept is attitudes to gun control, ask about membership in the NRA.) This is the variable you will use for your hypothesis. It may be a basic background variable, or may itself be an attitude or behavior you can measure with one item. You should also measure 2-4 other independent variables, usually background/demographic characteristics (e.g., gender, age) that you suspect will make a difference in your dependent variable. These are to give you the chance to explore other possible relationships. All independent variables should be easy to measure with one question each; they can be nominal, ordinal or interval measures.
  3. Format: Following the principles described in the text and in class, refine your questions to make them as good as you can, both in their content and in the physical structure of your questionnaire or interview schedule. All questions should meet formal criteria such as:
    1. Unbiased; if biased items are used, they should be balanced.
    2. Clear, unambiguous, single-barrelled, grammatical.
    3. Closed-ended categories are exhaustive, mutually exclusive, reasonable in range and precision, and fit with the stem of the question.
    4. Legible, logical physical presentation.
  4. Order: Items should be ordered so that respondents answer the open-ended question BEFORE they see the closed-ended dependent variable questions.

    On the two pages shown in the link is an example questionnaire developed by four students in an earlier semester's class. The project focused on developing an index to measure attitudes towards capital punishment. Their obvious hypothesis is that attitudes towards capital punishment will be related to political liberalism - conservatism. They also asked a set of other questions about which they were merely curious: gender, age, religion, and whether respondent had lived outside the country.

    * Example of a Questionnaire with Coding Information *

Data Collection

(Teams also do this part together.)

  1. Make copies of your team's questionnaire. Each team must collect data from a minimum of 40 people, and each person must do a minimum of 10.
  2. Do convenience sampling, but purposively try to get as much variation as possible on the variables you are studying. That is, try to get people you expect to have different opinions on your central concept/dependent variable and to differ on your independent variables. DO NOT USE YOUR FRIENDS AND FAMILY. Also, do not get groups of people to fill them out together; any chance for people to talk to each other while answering is likely to produce error and bias.
  3. As you collect the data, record any information that might be relevant to understanding people's answers, or that might alert you to problems with the questions.
    1. You may ask the questions orally (i.e., interview people) or let people write their own answers; just say which you did. Don't do some one way and others another way.
    2. Ask people to tell you or to write in the margins if there is anything they find offensive, difficult to answer, unclear, etc.
    3. Watch and listen to the respondents for signs of difficulty or confusion, irritation at questions, hesitation in answering an item, giving the form back to you blank, changing answers, laughing, explaining the answer, etc. Take notes about these observations on the form itself, or on a separate paper you later staple to the questionnaires.
  4. Put a unique three-digit number or code on each questionnaire, right on the actual paper used for data collection. USE THESE SAME NUMBERS ON YOUR CODE SHEET, BELOW. Note that each questionnaire must have a unique number. One way to do this is to use three-digit numbers, where the first digit indicates the team member. In this system, code number 104 is partner #1's 4th questionnaire, and 310 is partner #3's 10th. It is useful to use some numbering system that makes it easy for you to know who collected the data. Don't use letters, please.
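
The numbering system suggested in step 4 can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and the sketch assumes at most 9 team members and at most 99 questionnaires per member.

```python
def make_id(member, sequence):
    """Build a three-digit ID: first digit is the team member,
    last two digits are that member's running questionnaire count."""
    if not 1 <= member <= 9:
        raise ValueError("member must be between 1 and 9")
    if not 1 <= sequence <= 99:
        raise ValueError("sequence must be between 1 and 99")
    return member * 100 + sequence


def split_id(code):
    """Recover (member, sequence) from a three-digit ID."""
    member, sequence = divmod(code, 100)
    if not 1 <= member <= 9 or sequence == 0:
        raise ValueError("not a valid three-digit questionnaire ID")
    return member, sequence


def all_unique(codes):
    """True if no ID appears more than once in the team's combined data."""
    codes = list(codes)
    return len(codes) == len(set(codes))


print(make_id(1, 4))    # partner #1's 4th questionnaire
print(split_id(310))    # partner #3's 10th questionnaire
```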

Data Organization and Coding

  1. Follow these instructions for preparing your code sheet for computer analysis exactly (see example data sheet below).

    *** Example of a Data Sheet ***

    Teams must make their coding decisions together. Teams may roster their data together or individually, but all the team's data must be submitted together as one package, and partners' data must use a compatible form to be analyzed together on the computer. THE WHOLE TEAM MUST USE THE SAME FORMAT FOR THE SUMMARY SHEET. That is, the variables must be listed in the same order across the page. The sheet must be legible: it must be written in dark pencil or pen, the letters and numbers must be large enough to read, and it must be neat and clear enough for someone to translate it without your being there.
  2. Assign a computer "name" to each variable. This is 1-8 characters (letters or numbers; first must be a letter) which will remind you of the content of the item. Call the open-ended item "OPEN." You can call sex "SEX." One system that might be helpful is to use a 6-letter name, and end with the number of the item on your questionnaire.
  3. Plan a layout that leaves room for everything. You can use poster size paper if that helps. Or work lengthwise on 8.5 by 11 paper and tape pages together. Remember that you will be reading it and entering the data into the computer and I may need to refer to it as well. Each subject's data should be visible from beginning to end without turning pages.
  4. Make the first column the ID number. Then make a column for each independent variable question. Next, make one column for each closed-ended dependent variable question in the order they appear on the questionnaire, putting the computer name at the top of each column. After the last dependent variable item make a last column labeled OPEN.
  5. Take a blank questionnaire and make a "code book." That is (see example) write next to each answer the code number that will be assigned for that answer. Here you must decide what the "high" scores on your closed-ended questions will mean. In the example, high scores were pro-capital punishment. Notice that some items are reverse coded. Be extremely careful in designating which items are scored from high to low, and which from low to high, if you have positive and negative items (which you should). Indicate on the code book which items are added into your index and which items are independent variables. Make sure to include what the categories of your open-ended codes mean.
  6. Now go through all of the questionnaires, writing the numbers to be assigned for all independent and closed-ended dependent measures on each questionnaire. Then transfer the numbers to the code sheets. IF THE RESPONDENT HAS NOT ANSWERED A QUESTION DO NOT GIVE HIM/HER A ZERO. We will discuss in class what to do with such missing data.
  7. Add up all the numbers for the closed-ended items comprising the index/dependent variable, and put this sum in a column labeled INDEX. DO NOT INCLUDE THE OPEN-ENDED QUESTION. Double-check your addition. This sum is an important check for accuracy of data entry into the computer. Leave this sum blank for any respondent with missing data until you have decided what method to use for estimating those values.
  8. Open-ended question: The entire team must agree on 2-5 ordinal categories into which all the open-ended answers may be sorted according to the dependent variable. You develop these categories AFTER you see the range of responses; create categories that differentiate among the range of answers you received. (Please ask for advice if your answers seem strange.) OPERATIONALIZE these categories by explaining in some detail how you will classify the various answers.
    In the example shown above, the students categorized responses to the question "How do you feel about capital punishment and why do you feel that way?" as strongly oppose (1), moderately oppose (2), ambiguous (3), moderately favor (4), and strongly favor (5). The attributes, therefore, ranged from 1 to 5. (PLEASE NOTE: THIS IS ONLY AN EXAMPLE, AND THIS MAY NOT BE SUITABLE FOR YOUR DATA.)

    In your write-up, state what your categories are and what rules you use for classifying each questionnaire; discuss in detail any cases that proved difficult to classify.
    Finally, enter the numbers for your classification of the open-ended question in the OPEN column on the data sheet, and write the answer (or a summary of the answer) in the place provided.

  9. Gather all team members' data together for data entry and later submission to me as a set. At this meeting, discuss how you did your sampling and any problems any of you had during administration of the questionnaire, such as respondents' not understanding questions. Each team member will need to know all of these problems if the write-up is not going to be done as a group. This is also when you will decide how to handle missing data and enter it on the summary sheets.
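
The naming rule in step 2 and the coding rules in steps 5 through 7 can be sketched in Python. The item names, the 1-to-5 response scale, and the choice of which item is reverse coded are all invented for illustration; substitute your own code book. A missing answer is stored as `None`, never as zero, and the index is left blank (`None`) for any respondent with missing data.

```python
import re

# Hypothetical code book: item name -> True if the item is negatively
# worded and must be reverse coded before it is added into the index.
CODEBOOK = {"CAPPUN1": False, "CAPPUN2": True, "CAPPUN3": False}
SCALE_MIN, SCALE_MAX = 1, 5  # assumed response scale


def valid_name(name):
    """Computer name rule: 1-8 letters or digits, starting with a letter."""
    return re.fullmatch(r"[A-Za-z][A-Za-z0-9]{0,7}", name) is not None


def code_item(raw, reverse):
    """Return the coded value; reverse-coded items flip the scale."""
    if raw is None:
        return None  # unanswered: keep it missing, do not assign zero
    if not SCALE_MIN <= raw <= SCALE_MAX:
        raise ValueError("answer outside the response scale")
    return SCALE_MIN + SCALE_MAX - raw if reverse else raw


def index_score(raw_answers):
    """Sum the coded closed-ended items; None if any item is missing."""
    coded = [code_item(raw_answers.get(name), reverse)
             for name, reverse in CODEBOOK.items()]
    if any(value is None for value in coded):
        return None
    return sum(coded)


print(index_score({"CAPPUN1": 5, "CAPPUN2": 1, "CAPPUN3": 4}))     # 5 + 5 + 4 = 14
print(index_score({"CAPPUN1": 5, "CAPPUN2": None, "CAPPUN3": 4}))  # None
```

Comparing a sum computed this way with the one you added by hand is a quick way to catch both addition and reverse-coding mistakes.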

Check each other's work! You must make sure everyone is doing things the same way. This is extremely important. If one team member deviates from the rest, everyone's data will be garbage, no matter who is "right." If in doubt, at least make sure that the whole team goes the same way. I will help with coding questions if they are brought up before the due date; you will lose points for any errors in the data when they are submitted for my computer analysis. (NOTE: If one team member fails to have his or her data ready by an agreed deadline, please tell me. I will help you determine how to handle it.)

Enter the data into a computer data file as I will show you in class. Do a listing and frequencies (see the instructions that follow under "Computer Analysis of Data").
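
The listing and frequencies will come from the software shown in class. As a rough Python analogue (not the course procedure), a frequency table simply counts how often each code occurs for one variable. The data rows and the function name below are invented for illustration.

```python
from collections import Counter

# Invented data: one dictionary per questionnaire.
rows = [
    {"ID": 101, "SEX": 1, "OPEN": 4},
    {"ID": 102, "SEX": 2, "OPEN": 2},
    {"ID": 201, "SEX": 2, "OPEN": None},  # respondent skipped the question
    {"ID": 202, "SEX": 1, "OPEN": 4},
]


def frequencies(rows, variable):
    """Count each code for one variable; unanswered items are
    tallied under the label 'missing' rather than as zero."""
    counts = Counter(
        "missing" if row[variable] is None else row[variable]
        for row in rows
    )
    return dict(counts)


for variable in ("SEX", "OPEN"):
    print(variable, frequencies(rows, variable))
```

A code that should not exist (say, a 7 on a five-category item) shows up immediately in such a table, which is why frequencies are a useful error check.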

We will run a standard set of tables. If you want to request something unusual, be sure an explanation is attached when you submit your disk, listing, and frequencies.

**Example of a Syntax File**

SUBMIT YOUR CODE SHEETS, CODE BOOK, AND HARD COPY OF OUTPUT ON (OR BEFORE) the date indicated on the schedule. Make sure to indicate the name of your data and syntax files. EACH PARTNER SHOULD KEEP A XEROX COPY OF THE GROUP'S DATA SHEETS. Do not try to save money by not doing this.

Computer Analysis of Data

(done for team together)

You will do the data entry and some initial analyses of your data: a listing of your data as it is stored in the computer, and frequencies on all of the variables. Check the listing against what you wrote on the code sheet, and find and correct any errors before you hand your file name, code book, and frequencies in to me. It is also worth checking the computer list back against your original questionnaires. Statistical data must be carefully handled at each step, or you can produce garbage numbers which yield garbage interpretations.

I will do the remainder of the analyses, including a test of the reliability of the items in the index, a test of the relationship between your index and your open-ended measure of the same concept, and tests of your "obvious" hypothesis and any other independent-dependent variable relationships you have set up. These tests, which can be seen as construct validity tests, will be done for both the index and the open-ended question, as two separate measures of your concept. If any of your independent variables are continuous measures, the tests of the hypotheses involving these measures will be done with correlation coefficients. Otherwise, mean differences will be tested using analyses of variance. For testing the hypotheses using the open-ended question as the measure of your concept, chi-square tests on contingency tables will be used. This may involve re-categorizing some independent variables in order to avoid very small sample sizes in some cells of the tables. This will be explained in class or in lab.
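
For readers who want to see what a chi-square test on a contingency table computes, here is a minimal Python sketch. It is not the procedure used for the class analyses; the function names and the table of counts are invented, and the code assumes every row and every column contains at least one respondent.

```python
def expected_counts(table):
    """Expected cell counts if the row and column variables were unrelated."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a
    table of observed counts given as a list of rows."""
    expected = expected_counts(table)
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Invented counts: rows are two categories of an independent variable,
# columns are three categories of the coded open-ended question.
observed = [[12, 5, 3],
            [4, 6, 10]]

statistic, df = chi_square(observed)
smallest = min(min(row) for row in expected_counts(observed))
print(statistic, df, smallest)
```

The smallest expected count is worth inspecting: when it is very small, categories are often combined, which is the re-categorizing mentioned above.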



Questions? Comments? Please contact jpiliavi@ssc.wisc.edu
