Assignment 3, Sociology 3080, Fall 2022
Computing Phase
1. You are asked to
a. set up your variables, using the code provided under the heading Variables below;
b. obtain appropriate graphs and descriptive statistics (for central tendency, dispersion and shape) for the non-dichotomous variables used (for dichotomies, a table or barplot will be sufficient);
c. obtain 3 two-way crosstabs, with an odds ratio and a value of Q for each;
The crosstabs should be between the variables goodscore and over50: the first should include the entire sample, the second only males, and the third only females;
d. obtain a doubledecker, using goodscore as dependent, female and over50 as predictors; and
e. run regression equations.
Initially, you should try to predict score from age, female, native_English, minority, teaching_prof and looks. If a predictor appears to be unnecessary, you should drop it in a second equation.
As usual, you should hand in the syntax you have used to do this. (35 points)
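For part (c), here is a minimal sketch of how an odds ratio and Q could be computed from a 2x2 table in base R. It assumes that Q means Yule's Q, (ad - bc)/(ad + bc). The data frame `toy` and the helper `or_and_q()` are hypothetical names made up for illustration, and the 20 cases are invented rather than taken from the evals data.

```r
# Hypothetical toy data: 20 made-up cases, NOT the evals data.
toy <- data.frame(
  female    = rep(c(0, 1), each = 10),
  over50    = rep(rep(c(0, 1), each = 5), times = 2),
  goodscore = c(1, 1, 1, 0, 0,  1, 0, 0, 0, 0,   # males
                1, 1, 1, 1, 0,  1, 1, 0, 0, 0)   # females
)

# Hypothetical helper: odds ratio and Yule's Q for a 2x2 table
# with cells a, b (first row) and c, d (second row).
or_and_q <- function(tab) {
  ad <- tab[1, 1] * tab[2, 2]
  bc <- tab[1, 2] * tab[2, 1]
  c(odds_ratio = ad / bc, Q = (ad - bc) / (ad + bc))
}

# Entire sample, then males only, then females only.
tab_all    <- with(toy, table(over50, goodscore))
tab_male   <- with(subset(toy, female == 0), table(over50, goodscore))
tab_female <- with(subset(toy, female == 1), table(over50, goodscore))

res_all    <- or_and_q(tab_all)
res_male   <- or_and_q(tab_male)
res_female <- or_and_q(tab_female)
print(rbind(all = res_all, male = res_male, female = res_female))
```

Under this definition the two statistics carry the same information, since Q = (OR - 1)/(OR + 1): an odds ratio of 1 corresponds to Q = 0, and odds ratios below 1 give negative values of Q.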
Writing Phase
Your written report should include:
2. a brief discussion of what your crosstabs, the statistic(s) associated with them, and your doubledecker tell us. (16 points)
3. an explanation of why some variables were dropped from regression equations in which they were originally present, if this has been done, or why none were dropped. (Just saying that something was or wasn’t ‘statistically significant’ is only the beginning of a full answer.) (6 points)
4. a table showing the coefficients from your final equation, in presentation form. This is illustrated in the text (pp. 240-41). (8 points)
5. a set of brief explanations of the following elements of your regression output: SEE (or, as R refers to it, the ‘residual standard error’), R-squared, and b. Just explain what measure each of these refers to, and how we interpret its value for these data. (10 points)
6. a discussion of the message of your regression results.
Specifically, what does the value of each of the individual bs in your final equation tell us?
Among the dichotomous predictors, which seems to have the greatest impact? By how much?
How much impact would a 10-year change in age have? A five-point rise in the value of looks?
(16 points)
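The impact questions above rest on a simple piece of arithmetic: the predicted change in score is b multiplied by the change in the predictor, holding the other predictors constant. A minimal sketch follows, using invented, noise-free numbers (not the evals data) so that the coefficients are known in advance. The name `toy_reg` is hypothetical.

```r
# Hypothetical toy data built from a known equation, NOT the evals data:
# score = 4.5 - 0.01 * age + 0.05 * looks, with no error term.
toy_reg <- data.frame(
  age   = c(30, 35, 40, 45, 50, 55, 60, 65),
  looks = c(5, 3, 7, 4, 6, 2, 8, 5)
)
toy_reg$score <- 4.5 - 0.01 * toy_reg$age + 0.05 * toy_reg$looks

fit <- lm(score ~ age + looks, data = toy_reg)
b <- coef(fit)

# Predicted change in score = b * (change in the predictor),
# holding the other predictor constant.
effect_age_10  <- unname(10 * b["age"])    # ten years older
effect_looks_5 <- unname(5 * b["looks"])   # five points higher on looks
print(c(age_10_years = effect_age_10, looks_5_points = effect_looks_5))
```

With real data the fit will not be exact, so the same multiplication applies to the estimated bs from your own final equation.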
Variables
You will need the file ‘evals.rdata’. It contains results from student ratings of instructors at a university in Texas. Two of the variables you will work with can be used under their original names. The others can be created with the following code:
female <- as.numeric(gender == "female")
looks <- bty_avg
native_English <- as.numeric(language == "english")
teaching_prof <- as.numeric(rank == "teaching")
minority <- as.numeric(ethnicity == "minority")
over50 <- as.numeric(age >= 50)
goodscore <- as.numeric(score >= 4.3)
You can either key in this code, or copy and paste it from the file ‘variable_setup.r’. The latter procedure is recommended.
You should execute this code immediately after loading the workspace ‘evals2.rdata’. Then you will have all the variables you will need.
score – the mean rating of an instructor by a class, on a scale from 1 to 5, where higher scores are better
goodscore – coded: 0 – up to 4.2; 1 – 4.3+
age – age of the instructor, in years
over50 – coded: 0 – age up to 49; 1 – 50+
female – coded: 0 – M; 1 – F
native_English – coded: 0 – not a native speaker of English; 1 – a native speaker of English
minority – coded: 0 – not a member of a visible minority; 1 – a member of a visible minority
teaching_prof – coded: 0 – regular faculty; 1 – faculty with only teaching responsibilities
looks – the average of 6 ratings of the instructor’s appearance, on a scale from 0 to 10
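After running the setup code, it may be worth confirming that each recode matches the coding listed above. Here is a minimal sketch on small invented vectors. The `_toy` names and their values are hypothetical, chosen so that nothing in the loaded workspace is overwritten. With the real workspace loaded, the same `table()` call could be applied to, for example, gender and female.

```r
# Hypothetical stand-ins for two workspace variables, NOT the evals data.
gender_toy <- c("female", "male", "male", "female", "male")
age_toy    <- c(36, 59, 50, 49, 62)

# Same recoding pattern as the setup code.
female_toy <- as.numeric(gender_toy == "female")   # 1 = F, 0 = M
over50_toy <- as.numeric(age_toy >= 50)            # 1 = 50+, 0 = up to 49

# Each original value should line up with exactly one code.
check_female <- table(gender_toy, female_toy)
check_over50 <- table(age_toy, over50_toy)
print(check_female)
print(check_over50)
```

In each cross-check table, every row should have all of its cases in a single column. A row split across both columns would indicate a recoding problem.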