Paper #2 - Analyzing Data from the General Social Survey

For this assignment, you will select a hypothesis, or a set of inter-related hypotheses, about inequality in American society and test it with data from the General Social Survey (GSS). The GSS is a national survey of American adults that has been conducted since 1972 (annually in most years through the early 1990s, and every other year since 1994). It includes questions on socioeconomic status, family background, and a wide variety of social and political attitudes. Social scientists use these data widely to test theories about inequality and social attitudes. The University of California-Berkeley has made GSS data easily accessible over the web, along with on-line procedures for data analysis. This assignment gives you the opportunity to perform your own test of one of the models that we have been discussing in class.

Procedures

  1. Begin by browsing the data that are available in the GSS. Although the surveys have collected a wide variety of information, they are not exhaustive, so your first task is to familiarize yourself with the questions that the GSS has asked. To do this, go to the University of California-Berkeley GSS web site. Click on Browse codebook, then START. The codebook can be browsed in several different ways. I recommend that you click on Codebook by Year of Interview and then Sequential Variable List. This will display variables by topical group. You can click on any variable name to see the exact question asked in the survey. If you are browsing by year of interview, you can also scroll down to see in which years that question was included. Some questions are asked in every survey year, others only occasionally. To get you started, I have included a list of personal background variables below, along with some attitude questions.
  2. Once you have scanned the GSS codebook, select a hypothesis (or set of inter-related hypotheses) that you can test using GSS data. Identify the variables that you will need to test this hypothesis. Remember that the GSS identifies variables with (often cryptic) mnemonic names of eight characters or fewer (e.g., EQWLTH, RINCOME). You must use these names, spelled exactly as they appear in the codebook, in order to obtain the statistical calculations that you need for this paper.
  3. The University of California at Berkeley site enables you to run crosstabulations and comparisons of means. To do this, click on the appropriate link and follow the instructions.
  • For crosstabulations, click on Frequencies or crosstabulation, then START. In the crosstabulation window, enter your dependent variable (i.e., the characteristic that you are attempting to explain) as the "row" variable, and your causal (or independent) variable(s) as the "column" variable(s). You can add a third "control" variable by entering its name in the "control variable" box. For Percentaging, select Column, and set Statistics to YES. Column percentages let you compare the distribution of the dependent variable across the categories of the causal variable. You should also set Question Text to YES in order to get the exact wording of the questions that you are analyzing. If you wish to confine your analysis to a specific year of data (or to a specific subgroup of respondents), enter the appropriate variable name and code in the Filters box. For example, to restrict your analysis to the sample interviewed in 2002, type year(2002) in the Filters box.
  • To perform a comparison of means, click on Comparison of means, then START. In this window, enter your dependent variable as "dependent" and your causal variable as "row." You can also enter control variables or filters, as for crosstabulations.
  • When you have filled in all the necessary information, click on Run the table. The results of your analysis will automatically pop up in a new window.

4. Print a copy of each of your analytical tables. You will need to append these to your lab report.

5. Once you have obtained your data, examine your tables. What patterns, if any, do they reveal? (E.g., are poor people more socially and politically alienated, as the culture of poverty argument suggests? Does the gap between male and female earnings persist across all educational levels?) Do your findings support or refute your hypothesis? Are there unexpected patterns? Can you think of alternative explanations for these findings?

6. Write up your lab report. Your report should include the following sections:

  • Statement of Hypothesis: Identify your hypothesis or hypotheses and explain why you expect the hypothesis to be true (or not true). Identify how your hypothesis relates to one of the arguments that we have discussed.
  • Methodology: Identify the variables that you used to test your hypothesis. Comment on the question used by the GSS to obtain this information. Is it a "fair" (i.e., valid and reliable) indicator of the characteristic or attitude that your hypothesis is concerned with? Are there any biases in question wording that we must consider in interpreting the results? Are there any problems with the sample? (E.g., are there very small numbers of respondents in some groups? If so, does this affect your analysis?)
  • Presentation of Results: Summarize your findings verbally.
  • Analysis and Discussion: Interpret your findings. Explain what they tell us about your hypothesis and about inequality in America.
  • Statistical Appendix: Attach your statistical table(s) to your report.
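
For those who want to see what the two procedures in step 3 actually compute, here is a minimal Python sketch. It is not part of the Berkeley site and is not required for the paper. The records are made up for illustration (they are not GSS data), and the function names column_percentages and compare_means are hypothetical. The first function builds a crosstabulation percentaged by column, with the dependent variable in the rows; the second computes the mean of a dependent variable within each category of the causal variable.

```python
from collections import Counter, defaultdict

# Made-up toy records, NOT real GSS data. Each tuple holds:
# (education category, subjective class category, income in dollars)
respondents = [
    ("college", "middle", 52000),
    ("college", "middle", 61000),
    ("college", "working", 40000),
    ("high school", "working", 28000),
    ("high school", "working", 31000),
    ("high school", "middle", 37000),
    ("high school", "working", 24000),
]

education = [r[0] for r in respondents]  # causal (column) variable
klass = [r[1] for r in respondents]      # dependent (row) variable
income = [r[2] for r in respondents]     # dependent variable for means


def column_percentages(rows, cols):
    """Crosstabulate rows by cols and percentage within each column."""
    cell_counts = Counter(zip(rows, cols))
    col_totals = Counter(cols)
    return {
        (r, c): 100.0 * n / col_totals[c]
        for (r, c), n in cell_counts.items()
    }


def compare_means(values, groups):
    """Mean of the dependent variable within each category of groups."""
    sums = defaultdict(float)
    counts = Counter()
    for value, group in zip(values, groups):
        sums[group] += value
        counts[group] += 1
    return {group: sums[group] / counts[group] for group in counts}


pct = column_percentages(klass, education)
means = compare_means(income, education)

for (row, col), value in sorted(pct.items()):
    print(f"{row:10s} | {col:12s} | {value:5.1f}%")
for group, mean in sorted(means.items()):
    print(f"mean income, {group}: {mean:.0f}")
```

In this made-up data, 75 percent of the "high school" column falls in the "working" row, against about 33 percent of the "college" column. That is the kind of column-by-column comparison to look for when you examine your own tables in step 5.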

Your paper is due on Friday, March 1, at 4:30 p.m. in my office (Leighton 227).

Here are some variables from the 2002 GSS that might be of interest, to get you started.

age AGE OF RESPONDENT

SEX Respondents sex

WRKSTAT Labor force status

INCOME98 Total family income

rincom98 RESPONDENTS INCOME

class SUBJECTIVE CLASS IDENTIFICATION

incom16 RS FAMILY INCOME WHEN 16 YRS OLD

MARITAL Marital status

AGEWED Age when first married

DIVORCE Ever been divorced or separated

SPWRKSTA Spouse labor force status

DEGREE Rs highest degree attained

PADEG Fathers highest degree

MADEG Mothers highest degree

SPDEG Spouses highest degree

BORN Was R born in this country

XNORCSIZ Expanded N.O.R.C. size code (e.g., metropolitan, urban, rural, etc.)

POLVIEWS Think of self as liberal or conservative

RELIG Rs religious preference

RELITEN Strength of affiliation

fund HOW FUNDAMENTALIST IS R CURRENTLY

fund16 HOW FUNDAMENTALIST WAS R AT AGE 16

equal8 SOCIAL STANDING DUE TO ABILITY

usclass1 TRADITIONAL CLASS DIVISIONS STILL REMAIN

usclass2 ACHIEVEMENT DEPENDS ON FAMILY BACKGROUND

usclass3 ACHIEVEMENT DEPENDS ON EDUC AND ABILITY

usclass4 ONES OWN EFFORTS DONT COUNT

CONFINAN Confid in banks, financial institutions

CONBUS Confidence in major companies

CONCLERG Confidence in organized religion

CONEDUC Confidence in education

CONFED Confid. in exec branch of fed govt

DRINK Ever drink alcoholic beverages?

DRUNK Ever drink too much?

GOVAID Ever receive welfare, unemp insur, etc.

GETAID Ever received welfare?

UNION Does R or spouse belong to union

FEWORK Should women work

premarsx SEX BEFORE MARRIAGE

teensex SEX BEFORE MARRIAGE -- TEENS 14-16

welfare1 WELFARE MAKES PEOPLE WORK LESS

welfare2 HELPS PEOPLE OVERCOME DIFFICULT TIMES

For the 2002 GSS, racial/ethnic identity is included in a variable called RACECEN1.  For all previous years, it is called RACE.