Regression, job category, and pay discrimination (part 2)

In the first post in this series,
we looked at the regression of
pay on group membership:

https://douglasadowning.wordpress.com/2017/11/14/how-to-use-regression-to-study-pay-discrimination-1/

However, there is more that needs to be done.
Pay also varies by job category, so investigate
that effect by performing a regression with
pay as the dependent variable and indicator
variables for the job categories as the independent
variables. For this example, there are 4 job categories, so you
need 3 indicator variables (one fewer than the
number of job categories). Job category 1
is identified by the indicator values 0 0 0;
category 2 by 0 0 1; category 3 by 0 1 0;
and category 4 by 1 0 0. The coefficient of
each indicator variable will be the difference
between the average pay of that category
and the average pay of job category 1.
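
As a small illustration of this coding scheme, here is a sketch in Python. The function name job_indicators is made up for this example, and it assumes that the three values are listed in the order ind4, ind3, ind2 (so that 0 0 1 means ind2 = 1), which matches the variable names used in the regressions in this post.

```python
def job_indicators(category):
    """Return the indicator values (ind4, ind3, ind2) for job category 1-4.

    Category 1 is the baseline, so all three of its indicators are 0.
    """
    if category not in (1, 2, 3, 4):
        raise ValueError("category must be 1, 2, 3, or 4")
    ind2 = int(category == 2)
    ind3 = int(category == 3)
    ind4 = int(category == 4)
    # Assumed ordering: same as the text, where 0 0 1 identifies category 2.
    return (ind4, ind3, ind2)


for c in (1, 2, 3, 4):
    print(c, job_indicators(c))
```

Only three indicators are needed because category 1 is already identified by all three being zero; a fourth indicator would be redundant once the regression includes a constant term.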

Here is the regression:

pay = 5 ind2 + 10 ind3 + 20 ind4 + 10

r-squared = 0.93184

The coefficient of 5 for variable “ind2” indicates that
the pay of job category 2 is (on average) 5 dollars
more than the pay of job category 1; the pay of
category 3 is 10 dollars more than category 1;
and the pay of category 4 is 20 dollars more
than category 1. The constant term of 10 is the
average pay of job category 1. The r-squared value is close to,
but less than, 1, indicating that the job
categories explain most, but not all, of the variation in pay.
Whenever the r-squared value for this regression is less than 1,
the company has some explaining to do about why
pay is not fully determined by the categories that
supposedly matter.
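
Here is a rough sketch of this regression in Python with NumPy. The data are NOT taken from the spreadsheet: they are a small made-up data set (one employee in each combination of job category and group) chosen so that it is consistent with the coefficients reported in this post. The coding of the group indicator (1 for group 2) is also an assumption.

```python
import numpy as np

# Hypothetical data, constructed to be consistent with the reported results.
category = np.array([1, 1, 2, 2, 3, 3, 4, 4])
group = np.array([0, 1, 0, 1, 0, 1, 0, 1])  # assumed: 1 = group 2
job_premium = {1: 0, 2: 5, 3: 10, 4: 20}
pay = np.array([12 + job_premium[c] - 4 * g for c, g in zip(category, group)],
               dtype=float)

# Indicator variables for categories 2, 3, 4 plus a constant term (last column).
X = np.column_stack([category == 2, category == 3, category == 4,
                     np.ones(len(pay))]).astype(float)

# Ordinary least squares.
coef, *_ = np.linalg.lstsq(X, pay, rcond=None)

# r-squared = 1 - (residual sum of squares) / (total sum of squares).
residuals = pay - X @ coef
r_squared = 1.0 - np.sum(residuals**2) / np.sum((pay - pay.mean())**2)

print("ind2, ind3, ind4, constant:", np.round(coef, 5))
print("r-squared:", round(r_squared, 5))
```

With these made-up numbers the fitted coefficients are 5, 10, 20 and a constant of 10, with an r-squared of about 0.93184, the same as the regression above.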

Note: ideally we will perform this calculation for
all employees of the organization. In that case, we
don’t do any statistical inference testing, such
as using a t-statistic, because we don’t have data
for a sample (we have all of the data for the entire
population we are interested in). We are using the
regression as a descriptive set of statistics,
rather than as inferential statistics. This is
good: if you can, it is best to do this analysis
when you have the data for the entire population.

Next, investigate whether the groups are
distributed evenly across job categories, by
performing the regression using the group indicator
variable as the dependent variable, and the job category
indicator variables as the independent variables. In this case, we
have:


group = (0 * ind2) + (0 * ind3) + (0 * ind4) + 0.5

r-squared = 0.00000

All of the coefficients are zero, and the r-squared
value is zero, indicating that there is no connection
between group membership and job category. From the
frequency table in the previous post it is
clear that the group members are distributed
evenly across all of the job categories; in more
complicated cases this would not be as obvious,
which is why the regression analysis
is helpful. This means that this employer does not
seem to discriminate when it assigns workers to
job categories (although there could still be
discrimination if the jobs are distributed
evenly but one group has higher qualifications).
It also means that the pay difference between
the groups cannot be explained by differences in
job category, so it seems to be caused by discrimination.
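
The same check can be sketched in Python with NumPy. Again, the data here are a made-up example (one employee in each combination of job category and group), not the spreadsheet data, and the 0/1 coding of the group indicator is an assumption.

```python
import numpy as np

# Hypothetical data: each job category has one member of each group.
category = np.array([1, 1, 2, 2, 3, 3, 4, 4])
group = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype=float)  # assumed 0/1 coding

# Job category indicators plus a constant term (last column).
X = np.column_stack([category == 2, category == 3, category == 4,
                     np.ones(len(group))]).astype(float)

# Regress the group indicator on the job category indicators.
coef, *_ = np.linalg.lstsq(X, group, rcond=None)

residuals = group - X @ coef
r_squared = 1.0 - np.sum(residuals**2) / np.sum((group - group.mean())**2)

print("ind2, ind3, ind4, constant:", coef)
print("r-squared:", r_squared)
```

For this balanced example the three indicator coefficients and the r-squared come out as zero (up to rounding error), and the constant is 0.5. If one group were concentrated in the higher-paying categories, the coefficients and the r-squared would move away from zero.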

Finally, perform the regression with pay as the
dependent variable and both the group indicator
and the job category indicators as the independent
variables:


pay = 5 ind2 + 10 ind3 + 20 ind4 - 4 group + 12

r-squared = 1.00000

The r-squared value indicates that all of the variation
in pay is accounted for by group and job category.
The coefficients of the indicator variables are the
same as they were in the earlier regressions, which
did not include all of the variables (only the constant
term changes). Also, the r-squared value is the
sum of the r-squared values from the two separate pay
regressions (pay on group, and pay on job category):

1.00000 = 0.06816 + 0.93184

These two results follow from the fact that group
membership and job category are uncorrelated, so
there is no multicollinearity issue when they are
both included as independent variables in the
regression.
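
To see the r-squared values add up, here is a sketch in Python with NumPy that runs all three pay regressions. The data are a made-up example (one employee in each combination of job category and group) constructed to be consistent with the coefficients in this post; they are not the spreadsheet data. The helper function fit and the coding of the group indicator (1 for group 2) are assumptions for this illustration.

```python
import numpy as np

def fit(columns, y):
    """Least-squares fit of y on the given columns plus a constant term.

    Returns (coefficients with the constant last, r-squared).
    """
    X = np.column_stack(list(columns) + [np.ones(len(y))]).astype(float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    r2 = 1.0 - np.sum(residuals**2) / np.sum((y - y.mean())**2)
    return coef, r2

# Hypothetical data, constructed to be consistent with the reported results.
category = np.array([1, 1, 2, 2, 3, 3, 4, 4])
group = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype=float)  # assumed: 1 = group 2
job_premium = {1: 0, 2: 5, 3: 10, 4: 20}
pay = np.array([12 + job_premium[c] - 4 * g for c, g in zip(category, group)])

ind2, ind3, ind4 = category == 2, category == 3, category == 4

coef_group, r2_group = fit([group], pay)                  # pay on group only
coef_job, r2_job = fit([ind2, ind3, ind4], pay)           # pay on job category only
coef_full, r2_full = fit([ind2, ind3, ind4, group], pay)  # pay on both

print("full regression coefficients:", np.round(coef_full, 5))
print("r-squared (group, job, full):",
      round(r2_group, 5), round(r2_job, 5), round(r2_full, 5))
```

With these made-up numbers the full regression gives 5, 10, 20, -4 and a constant of 12, and the two separate r-squared values (about 0.06816 and 0.93184) sum to the full r-squared of 1. This additivity depends on group and job category being uncorrelated; with unbalanced data the separate r-squared values generally would not add up this way.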

In summary: this employer seems to be discriminating
against group 2 members by paying them less than group 1
members doing the same job.

See the spreadsheet at this link:

http://myhome.spu.edu/ddowning/fos/discrgr.xlsx


……………..
–Douglas Downing