Paper #2 - Analyzing Data from the General Social Survey
For this assignment, you will select a hypothesis or set of inter-related
hypotheses relating to inequality in American society and test that hypothesis
with data from the General Social Survey (GSS). The GSS is a national survey of
American adults that has been conducted regularly since 1972 (every other year
since 1994). It includes questions on socioeconomic status, family background,
and a wide variety of social and political attitudes. These data are widely
used by social scientists to test theories about inequality, social attitudes,
and related topics. The University of California-Berkeley has made GSS data
easily accessible on the web, along with on-line procedures for data analysis.
This assignment gives you the opportunity to perform your own test of one of
the models that we have been discussing in class.
Procedures
1. Begin by browsing the data that are available in the GSS. Although the
surveys have collected a wide variety of information, they are not exhaustive,
so your first task is to familiarize yourself with the questions and
information that the GSS has collected. To do this, go to the University of
California-Berkeley GSS web site. Click on Browse codebook, then START. The
codebook can be browsed in several different ways. I recommend that you click
on Codebook by Year of Interview and then Sequential Variable List. This will
display variables by topical groups. You can click on any variable name to see
the exact question asked in the survey. If you are browsing by year of
interview, you can also scroll down to see in which years that particular
question was included. Some questions are asked regularly, others only
occasionally. To get you started, I have included a list of personal
background variables below, along with some attitude questions.
2. Once you have scanned the GSS codebook, select a hypothesis (or set of
inter-related hypotheses) that you can test using GSS data. Identify the
variables that you will need to use to test this hypothesis. Remember that the
GSS identifies variables with (often cryptic) mnemonic names of eight
characters or fewer (e.g., EQWLTH, RINCOM). You must use these names, spelled
exactly as they appear in the codebook, in order to obtain the statistical
calculations that you need for this paper.
3. The University of California-Berkeley site enables you to run
crosstabulations and comparisons of means. To do this, click on the
appropriate link and follow the instructions.
- For crosstabulations, click on Frequencies or crosstabulation, then START.
In the crosstabulation window, enter your dependent variable (i.e., the
characteristic that you are attempting to explain) as the "row" variable,
and the causal (or independent) variable(s) as the "column" variable(s).
You can add a third "control" variable by entering that variable name in
the "control variable" box. For Percentaging, click on Column; for
Statistics, click on YES. You should also check Question Text YES in order
to get the exact wording of the questions that you are analyzing. If you
wish to confine your analysis to a specific year of data (or restrict your
analysis to a specific subgroup of respondents), enter the appropriate
variable name and code in the Filters box. For example, to restrict your
analysis to the sample interviewed in 2002, type year(2002) in the Filters
box.
- To perform a comparison of
means, click on Comparison of means, then START. In this
window, enter your dependent variable as "dependent" and your
causal variable as "row." You can also enter control variables
or filters, as for crosstabulations.
- When you have filled in all
the necessary information, click on Run the table. The results of
your analysis will automatically pop up in a new window.
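
If it helps to see what the web site is computing, the two procedures above can
be sketched in a few lines of Python. This is only an illustration under stated
assumptions, not the Berkeley site's own code: the records are invented toy
data (not GSS results), the answer categories and income values are made up,
and the function names crosstab_column_pct, compare_means, and in_2002 are
hypothetical helpers. The point to notice is that column percentaging makes
each category of the independent variable sum to 100, so the columns can be
compared directly.

```python
from collections import Counter, defaultdict

# Hypothetical toy records, NOT real GSS data. The keys borrow GSS-style
# names, but the categories and income values (in $1,000s) are invented.
records = [
    {"year": 2002, "degree": "college", "eqwlth": "oppose", "income": 60},
    {"year": 2002, "degree": "college", "eqwlth": "oppose", "income": 50},
    {"year": 2002, "degree": "college", "eqwlth": "oppose", "income": 50},
    {"year": 2002, "degree": "college", "eqwlth": "favor", "income": 40},
    {"year": 2002, "degree": "hs", "eqwlth": "favor", "income": 30},
    {"year": 2002, "degree": "hs", "eqwlth": "favor", "income": 20},
    {"year": 2002, "degree": "hs", "eqwlth": "favor", "income": 25},
    {"year": 2002, "degree": "hs", "eqwlth": "oppose", "income": 25},
    {"year": 2000, "degree": "hs", "eqwlth": "oppose", "income": 10},
    {"year": 2000, "degree": "college", "eqwlth": "favor", "income": 90},
]


def in_2002(record):
    """Play the role of typing year(2002) in the Filters box."""
    return record["year"] == 2002


def crosstab_column_pct(rows, row_var, col_var, keep):
    """Return {column value: {row value: percent}}; each column sums to 100."""
    counts = defaultdict(Counter)
    for r in rows:
        if keep(r):
            counts[r[col_var]][r[row_var]] += 1
    table = {}
    for col, cell_counts in counts.items():
        total = sum(cell_counts.values())
        table[col] = {row: 100.0 * n / total for row, n in cell_counts.items()}
    return table


def compare_means(rows, dependent, group_var, keep):
    """Return the mean of the dependent variable within each group."""
    values = defaultdict(list)
    for r in rows:
        if keep(r):
            values[r[group_var]].append(r[dependent])
    return {group: sum(v) / len(v) for group, v in values.items()}


# Dependent variable as the row, independent variable as the column.
table = crosstab_column_pct(records, "eqwlth", "degree", in_2002)
means = compare_means(records, "income", "degree", in_2002)

for degree, column in table.items():
    print(degree, {answer: round(pct, 1) for answer, pct in column.items()})
print(means)
```

In this toy example the two records from 2000 are dropped by the filter before
any percentages or means are computed, which is the same order of operations
you should expect when you use the Filters box.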
4. Print a copy of each of your analytical tables. You will need to append
these to your lab report.
5. Once you have obtained your data, examine your tables. What patterns, if
any, do they reveal? (E.g., are poor people more socially and politically
alienated, as the culture of poverty argument suggests? Does the gap between
male and female earnings persist across all educational levels?) Do your
findings support or refute your hypothesis? Are there unexpected patterns? Can
you think of alternative explanations for these findings?
6. Write up your lab report. Your report should include the following
sections:
- Statement of Hypothesis:
Identify your hypothesis or hypotheses and explain why you expect the
hypothesis to be true (or not true). Identify how your hypothesis relates
to one of the arguments that we have discussed.
- Methodology: Identify
the variables that you used to test your hypothesis. Comment on the
question used by the GSS to obtain this information. Is it a
"fair" (i.e., valid and reliable) indicator of the
characteristic or attitude that your hypothesis is concerned with? Are
there any biases in question wording that we must consider in interpreting
the results? Are there any problems with the sample? (E.g., there are very
small numbers of respondents in some groups. Does this affect your
analysis?)
- Presentation of Results:
Summarize your findings verbally.
- Analysis and Discussion:
Interpret your findings. Explain what they tell us about your hypothesis
and about inequality in America.
- Statistical Appendix:
Attach your statistical table(s) to your report.
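
The sample-size question raised under Methodology becomes more pressing once
you add a control variable, because every extra variable splits the
respondents into smaller groups. The sketch below is one hedged way to think
about it: it computes a mean and a count for each combination of control and
causal variable and flags thin cells. The data are invented (not GSS results),
means_with_control is a hypothetical helper, and the default cutoff of 30
respondents is only a common rule of thumb that I am assuming here, not a GSS
or course requirement.

```python
from collections import defaultdict

# Hypothetical toy records, NOT real GSS data; income is in invented $1,000s.
toy = [
    {"degree": "hs", "sex": "male", "income": 30},
    {"degree": "hs", "sex": "male", "income": 34},
    {"degree": "hs", "sex": "male", "income": 32},
    {"degree": "hs", "sex": "female", "income": 24},
    {"degree": "hs", "sex": "female", "income": 26},
    {"degree": "hs", "sex": "female", "income": 25},
    {"degree": "college", "sex": "male", "income": 60},
    {"degree": "college", "sex": "male", "income": 64},
    {"degree": "college", "sex": "female", "income": 50},
    {"degree": "college", "sex": "female", "income": 52},
    {"degree": "college", "sex": "female", "income": 54},
]


def means_with_control(rows, dependent, group_var, control_var, min_n=30):
    """Mean and N of the dependent variable for each (control, group) cell.

    min_n is an assumed rule-of-thumb cutoff for flagging small cells.
    """
    cells = defaultdict(list)
    for r in rows:
        cells[(r[control_var], r[group_var])].append(r[dependent])
    summary = {}
    for key, values in cells.items():
        n = len(values)
        summary[key] = {"mean": sum(values) / n, "n": n, "small": n < min_n}
    return summary


# The toy data set is tiny, so a cutoff of 3 is used purely for illustration.
result = means_with_control(toy, "income", "sex", "degree", min_n=3)
for (degree, sex), cell in sorted(result.items()):
    note = "  <-- few respondents, interpret with caution" if cell["small"] else ""
    print(degree, sex, cell["mean"], "N =", cell["n"], note)
```

A gap that appears within every level of the control variable is more
convincing than one that appears only in the pooled data, but a gap that rests
on a flagged cell deserves a sentence of caution in your Methodology section.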
Your paper is due on Friday, March 1, at 4:30 p.m. in my office (Leighton 227).
Here are some variables from the 2002 GSS that might be of
interest, to get you started.
age       AGE OF RESPONDENT
SEX       Respondents sex
WRKSTAT   Labor force status
INCOME98  Total family income
rincom98  RESPONDENTS INCOME
class     SUBJECTIVE CLASS IDENTIFICATION
incom16   RS FAMILY INCOME WHEN 16 YRS OLD
MARITAL   Marital status
AGEWED    Age when first married
DIVORCE   Ever been divorced or separated
SPWRKSTA  Spouse labor force status
DEGREE    Rs highest degree attained
PADEG     Fathers highest degree
MADEG     Mothers highest degree
SPDEG     Spouses highest degree
BORN      Was R born in this country
XNORCSIZ  Expanded N.O.R.C. size code (e.g., metropolitan, urban, rural, etc.)
POLVIEWS  Think of self as liberal or conservative
RELIG     Rs religious preference
RELITEN   Strength of affiliation
fund      HOW FUNDAMENTALIST IS R CURRENTLY
fund16    HOW FUNDAMENTALIST WAS R AT AGE 16
equal8    SOCIAL STANDING DUE TO ABILITY
usclass1  TRADITIONAL CLASS DIVISIONS STILL REMAIN
usclass2  ACHIEVEMENT DEPENDS ON FAMILY BACKGROUND
usclass3  ACHIEVEMENT DEPENDS ON EDUC AND ABILITY
usclass4  ONES OWN EFFORTS DONT COUNT
CONFINAN  Confid in banks, financial institutions
CONBUS    Confidence in major companies
CONCLERG  Confidence in organized religion
CONEDUC   Confidence in education
CONFED    Confid. in exec branch of fed govt
DRINK     Ever drink alcoholic beverages?
DRUNK     Ever drink too much?
GOVAID    Ever receive welfare, unemp insur, etc.
GETAID    Ever received welfare?
UNION     Does R or spouse belong to union
FEWORK    Should women work
premarsx  SEX BEFORE MARRIAGE
teensex   SEX BEFORE MARRIAGE -- TEENS 14-16
welfare1  WELFARE MAKES PEOPLE WORK LESS
welfare2  HELPS PEOPLE OVERCOME DIFFICULT TIMES
For the 2002 GSS, racial/ethnic identity is included in a
variable called RACECEN1. For all previous years, it is called RACE.