Step 1: Choose your topic

• Choose a topic for which theories, research, and studies already exist, so that they can serve as your reference and starting point.

– In the research tradition, these existing theories/research/studies from EXPERTS will provide “grounding” or foundation for your paper.

– It’s not what YOU say, it’s what EXPERTS say/found.

Step 2: Have you researched/reviewed existing studies on your topic and used these studies to pick your X variables (at least 3 required)?

• Your topic (your Y variable) must be expressed in numbers, and these numbers must vary from one observation to the next.

• The general format is a question such as “Why is there variation in the Y variable?” or “Why is the Y variable high/large for some and low/small for others?”

– For example, if Y=GPA of college students in each U.S. state, then the question becomes “Why do students in some U.S. states have higher GPAs than students in other states?”

• To answer the above question, you must have a minimum of 3 explanations/reasons (these are your X variables).

– Although some X variables are obvious (for example, X1=number of hours students study per week, with the expectation that students who study more get higher GPAs), be sure to use theories/existing studies to choose the X variables. These theories/studies will also tell you whether you should include more than 3 X variables.

• Make sure that your Y and X variables can all be expressed in numerical values.

– This is the only way that Excel can run your regression model.

• Make sure you know where (which reliable websites) you will get the numerical values of your Y and X variables.
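
The two data requirements in this step (every variable must be numeric, and Y must vary) can be checked before the data goes into a regression. The following is only a minimal sketch in Python, not part of the original workflow, which uses Excel. The function names `is_numeric` and `check_data` and the sample values are hypothetical and invented for illustration.

```python
def is_numeric(value):
    """True for int/float values; bool is excluded because it subclasses int."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_data(y_values, x_columns):
    """Return a list of problems; an empty list means both requirements hold.

    y_values  : list of Y observations
    x_columns : dict mapping an X variable name to its list of observations
    """
    problems = []
    columns = {"Y": y_values, **x_columns}
    # Requirement 1: every Y and X value must be a number.
    for name, values in columns.items():
        if not all(is_numeric(v) for v in values):
            problems.append(f"{name} has non-numeric values")
    # Requirement 2: Y must take at least two different values.
    if len(set(y_values)) < 2:
        problems.append("Y has no variation")
    return problems


# Made-up values, for illustration only.
good = check_data([2.9, 3.5, 3.1], {"X1": [10, 20, 15]})
bad = check_data([3.2, 3.2, 3.2], {"X1": [10, 12, "many"]})
print(good)
print(bad)
```

An empty list means the data passes both checks; otherwise each string names the variable that needs attention.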

Research Question: Why do some students have higher GPAs than others?

• Y = GPA on a 4.0 scale

• X1=number of study hours per week, which is an actual number

• X2=course load, measured by the number of course credits currently being taken or by the average course load per semester

• X3=number of hours a student works per week, which is an actual number

Note that I chose X1, X2, and X3 because studies/theories say that each of them affects a student’s GPA, not because I think they should be my X variables.

Step 3: Have you constructed your initial regression equation and specified your Y and X variables?

General Equation: Y = a + b1 X1 + b2 X2 + b3 X3

Specific Equation: GPA = a + b1 Study Hours + b2 Course Load + b3 Work Hours
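
To make the equation concrete, here is a rough sketch of how a, b1, b2, and b3 could be estimated by ordinary least squares. It uses plain Python rather than Excel, purely for illustration. The function name `fit_ols` is hypothetical, and the six data rows are invented: they were constructed to lie exactly on GPA = 3.0 + 0.04 Study Hours - 0.02 Course Load - 0.01 Work Hours, so the fit should recover those numbers. Real data will not fit this cleanly.

```python
def fit_ols(rows, y):
    """Return [a, b1, b2, ...] that minimize the sum of squared errors.

    rows : list of observations, each a sequence of X values
    y    : list of Y values, one per observation
    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
    """
    # Design matrix: a leading 1 for the intercept a, then the X values.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-9:
            raise ValueError("X variables are (nearly) perfectly collinear")
        A[col], A[pivot] = A[pivot], A[col]
        # Eliminate this column from every other row.
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Invented observations: (Study Hours, Course Load, Work Hours) per student.
rows = [
    (10, 15, 20),
    (20, 12, 0),
    (5, 18, 30),
    (15, 15, 10),
    (25, 9, 5),
    (8, 12, 25),
]
# Invented GPAs, generated from the equation stated in the lead-in.
gpa = [2.90, 3.56, 2.54, 3.20, 3.77, 2.83]

a, b1, b2, b3 = fit_ols(rows, gpa)
print(f"GPA = {a:.2f} {b1:+.2f} Study Hours {b2:+.2f} Course Load {b3:+.2f} Work Hours")
```

Each b coefficient is read as the change in GPA associated with a one-unit increase in that X variable, holding the other X variables fixed.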

Hypotheses (based on theories/studies/research):

• The more hours students study (higher value of X1), the higher their GPA (higher value of Y) → expect b1>0

• The more credits students take (higher value of X2), the more pressure on their grades, the lower their GPA (lower value of Y) → expect b2<0

• The more hours students work (higher value of X3), the less time/energy they have to study, the lower their GPA (lower value of Y) → expect b3<0
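
Once a regression has been run, the three hypotheses above can be compared with the estimated coefficients. The snippet below is one possible way to do that in Python. The names `EXPECTED_SIGNS` and `compare_signs` are hypothetical, and the estimates in `example` are invented for illustration; they are not real results.

```python
# Hypothesized signs from the three bullets above:
# +1 means "expect b > 0", -1 means "expect b < 0".
EXPECTED_SIGNS = {"Study Hours": +1, "Course Load": -1, "Work Hours": -1}


def compare_signs(estimates, expected=EXPECTED_SIGNS):
    """Map each X variable to True if its estimate has the hypothesized sign.

    An estimate of exactly 0 counts as not matching either sign.
    """
    return {name: estimates[name] * sign > 0 for name, sign in expected.items()}


# Invented estimates, for illustration only (not real results).
example = {"Study Hours": 0.04, "Course Load": 0.01, "Work Hours": -0.01}
result = compare_signs(example)
print(result)
# {'Study Hours': True, 'Course Load': False, 'Work Hours': True}
```

This only compares signs; it says nothing about whether an estimate is statistically significant.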

Step 4: Have you decided how you will measure (in numbers) your Y and X variables?

Y = GPA (out of 4.0)

X1 = study time in hours per week (average)

X2 = number of course credits per semester (average)

X3 = work hours per week (average)

Step 5: Have you decided where you will get the data for your Y and X variables?