Sampling, Normal Curve,
and Hypotheses
in Quantitative Research
Types of Sampling
Simple Random Sample
Stratified Random Sample
Cluster sampling
Systematic
Convenience
Simple Random Sample
Every subset of a specified size n from the population has an equal chance of being selected.
Stratified Random Sample
The population is divided into two or more groups called strata, according to some criterion,
such as geographic location, grade level, age, or income, and subsamples are randomly selected
from each stratum.
Cluster Sample
The population is divided into subgroups (clusters) like families. A simple random sample is
taken of the subgroups and then all members of the cluster selected are surveyed.
Systematic Sample
Every kth member (for example, every 10th person) is selected from a list of all population
members.
Convenience Sample
Selection of whichever individuals are easiest to reach.
It is done at the “convenience” of the researcher.
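The four probability-based designs above can be sketched in a few lines of Python. This is a minimal illustration only: the student list, grade strata, and family clusters are hypothetical, and the random starting point in the systematic sample is an added assumption (the definition above only says "every kth member").

```python
import random

def simple_random_sample(population, n, rng):
    """Every subset of size n has the same chance of being chosen."""
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random subsample from each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters, then survey every member of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

def systematic_sample(population, k, rng):
    """Take every k-th member of the list (random start is an assumption)."""
    start = rng.randrange(k)
    return population[start::k]

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frames, invented for illustration.
students = [f"student_{i}" for i in range(1, 101)]
grades = {g: [f"grade{g}_student{i}" for i in range(25)] for g in (9, 10, 11, 12)}
families = {f"family_{i}": [f"family{i}_member{j}" for j in range(4)]
            for i in range(10)}

srs = simple_random_sample(students, 10, rng)
strat = stratified_sample(grades, 5, rng)
clus = cluster_sample(families, 3, rng)
syst = systematic_sample(students, 10, rng)

print(len(srs), {g: len(s) for g, s in strat.items()}, len(clus), len(syst))
```

A convenience sample has no counterpart here because it involves no random selection at all.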
Now you decide:
• including 5 people from every sports
team on a collegiate campus
• including every teacher from 4
elementary schools chosen from a
group of 11 elementary schools in a
school district with 45 elementary
schools total
• including 25 employees whose names were
drawn from a hat containing the names of 250 school employees
• including all people who attend parent-teacher
conferences
• including every 20th student from a list of
2000 students in a particular high school
Errors in Sampling
Non-Observation Errors
◦ Sampling error: naturally occurs
◦ Coverage error: people sampled do not match the
population of interest
◦ Underrepresentation
◦ Non-response: won’t or can’t participate
As the researcher, you will never eliminate ALL
elements of BIAS…but it is your job to minimize
the impact of bias on your research project by
carefully planning out your research design.
The Normal Curve
The Normal Distribution:
The Normal curve is a mathematical abstraction
which conveniently describes ("models") many
frequency distributions of scores in real-life.
length of time before someone
looks away in a staring contest:
length of pickled gherkins:
Francis Galton (1876) 'On the height and weight of boys aged 14, in town and
country public schools.' Journal of the Anthropological Institute, 5, 174-180:
[Figure: height of 14 year-old children, country vs. town; x-axis: height (inches), in 2-inch bins from 51-52 to 69-70; y-axis: frequency (%)]
An example of a normal distribution: the length of Sooty's magic wand.
[Figure: frequency of different wand lengths; x-axis: length of wand]
Properties of the Normal Distribution:
1. It is bell-shaped and asymptotic at the extremes.
2. It's symmetrical around the mean.
3. The mean, median and mode all have same value.
4. It can be specified completely, once mean and SD
are known.
5. The area under the curve is directly proportional
to the relative frequency of observations.
e.g. here, 50% of scores fall below the mean, as
does 50% of the area under the curve.
e.g. here, 85% of scores fall below score X,
corresponding to 85% of the area under the curve.
Relationship between the normal curve and the
standard deviation:
All normal curves share this property: the SD cuts off a
constant proportion of the distribution of scores:
[Figure: normal curve; x-axis: number of standard deviations either side of the mean (-3 to +3); y-axis: frequency; bands marked 68%, 95%, and 99.7%]
About 68% of scores fall in the range of the mean plus and minus 1 SD;
about 95% in the range of the mean +/- 2 SDs;
about 99.7% in the range of the mean +/- 3 SDs.
e.g. IQ is normally distributed (mean = 100, SD = 15).
68% of people have IQs between 85 and 115 (100 +/- 15).
95% have IQs between 70 and 130 (100 +/- (2*15)).
99.7% have IQs between 55 and 145 (100 +/- (3*15)).
[Figure: normal curve with the central 68% marked between 85 (mean - 1 SD) and 115 (mean + 1 SD)]
We can tell a lot about a population just from knowing
the mean, SD, and that scores are normally distributed.
If we encounter someone with a particular score, we can
assess how they stand in relation to the rest of their
group.
e.g. someone with an IQ of 145 is quite unusual (3 SDs
above the mean).
IQs 3 SDs or more above the mean occur in only about 0.15% of the
population [ (100 - 99.7) / 2 ].
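The 68 / 95 / 99.7 figures and the IQ example can be checked with Python's standard library. This is a sketch using the mean of 100 and SD of 15 quoted above; the helper name `share_within` is made up for the illustration.

```python
from statistics import NormalDist

# IQ modelled as a normal distribution with the slide's mean and SD.
iq = NormalDist(mu=100, sigma=15)

def share_within(dist, k):
    """Proportion of scores within k standard deviations of the mean."""
    lo = dist.mean - k * dist.stdev
    hi = dist.mean + k * dist.stdev
    return dist.cdf(hi) - dist.cdf(lo)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(iq, k):.4f}")

# Proportion of people with an IQ of 145 or more (3 SDs above the mean).
above_145 = 1 - iq.cdf(145)
print(f"IQ of 145 or more: {above_145:.4%}")
```

The exact upper-tail share comes out slightly below the 0.15% obtained from the rounded 99.7% figure, which is why the rule of thumb is stated with "about".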
Population: all possible values
Sample: a portion of the population
Statistical inference: generalizing from a
sample to a population with a calculated
degree of certainty
Two forms of statistical inference
◦ Hypothesis testing
◦ Estimation
Parameter: a characteristic of the population, e.g.,
population mean µ
Statistic: calculated from data in the sample, e.g.,
sample mean (x̄)
P-hat: a sample proportion, symbolized by p̂
Distinctions Between Parameters and Statistics
(Vocabulary Review)
             Parameters        Statistics
Source       Population        Sample
Notation     Greek (e.g., μ)   Roman (e.g., x̄)
Vary         No                Yes
Calculated   No                Yes
Hypothesis Testing Steps
A. Null and alternative hypotheses
B. Significance level
C. Test statistic
D. P-value and interpretation
General Example:
A criminal trial is an example of hypothesis testing
without the statistics.
In a trial a jury must decide between two hypotheses. The
null hypothesis is
H0: The defendant is innocent
The alternative hypothesis or research hypothesis is
H1: The defendant is guilty
The jury does not know which hypothesis is true. They
must make a decision on the basis of evidence presented.
In the language of statistics convicting the defendant is
called rejecting the null hypothesis in favor of the
alternative hypothesis. That is, the jury is saying that
there is enough evidence to conclude that the defendant
is guilty (i.e., there is enough evidence to support the
alternative hypothesis). We say, “We reject the null.”
If the jury acquits it is stating that there is not enough
evidence to support the alternative hypothesis. Notice that
the jury is not saying that the defendant is innocent, only
that there is not enough evidence to support the
alternative hypothesis. That is why we never say that we
accept the null hypothesis…we say, “We fail to reject the
null.” (Although non-stats people often do this wrong!)
Specific Example:
Crazy guy...but good at explanations!
Another Specific Example:
A department store manager determines that a new
billing system will be cost-effective only if the mean
monthly account is more than $170.
What null and alternative hypotheses
can we write for this situation?
The system will be cost effective if the mean account
balance for all customers is greater than $170.
We express this belief as our research hypothesis, that is:
H1: μ > 170 (this is what we want to determine)
Thus, our null hypothesis becomes:
H0: μ ≤ 170 (we assume this is true until proven otherwise)
Interpretation
The P-value answers the question: What is the
probability of the observed test statistic …
when H0 is true?
Thus, smaller and smaller P-values provide
stronger and stronger evidence against H0
Small P-value = strong evidence for HA
Interpreting the p-value…
The smaller the p-value, the more statistical evidence exists to support the
alternative hypothesis.
• If the p-value is less than 1%, there is overwhelming evidence that supports
the alternative hypothesis.
• If the p-value is between 1% and 5%, there is strong evidence that supports
the alternative hypothesis.
• If the p-value is between 5% and 10%, there is weak evidence that supports
the alternative hypothesis.
• If the p-value exceeds 10%, there is no evidence that supports the alternative
hypothesis.
We observe a p-value of .0069, hence there is overwhelming evidence to
support H1: μ > 170.
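The slides report the p-value for the billing example but not the sample figures behind it. As a rough sketch of how such a one-sided p-value is obtained, the code below assumes a sample of 400 accounts with a mean of $178 and a known population SD of $65. All three numbers are hypothetical, chosen only so the result lands close to the reported value, and a z-test is assumed because the SD is treated as known.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 170          # value stated in H0 (from the example)
n = 400            # hypothetical sample size
sample_mean = 178  # hypothetical sample mean
sigma = 65         # hypothetical known population SD

# Test statistic: how many standard errors the sample mean lies above 170.
z = (sample_mean - mu0) / (sigma / sqrt(n))

# One-sided p-value: probability of a z at least this large when H0 is true.
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, one-sided p = {p_value:.4f}")
```

Because H1 is μ > 170, only the upper tail counts; a two-sided alternative would double the p-value.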
[Figure: p-value scale: 0 to .01, overwhelming evidence (highly significant); .01 to .05, strong evidence (significant); .05 to .10, weak evidence (not significant); above .10, no evidence (not significant); the observed p-value is marked in the overwhelming-evidence region]
Conclusions of a Test of
Hypothesis…
If we reject the null hypothesis, we conclude that there is enough
evidence to infer that the alternative hypothesis is true.
If we fail to reject the null hypothesis, we conclude that there is
not enough statistical evidence to infer that the alternative
hypothesis is true. This does not mean that we have proven that
the null hypothesis is true!
Prior to testing, you would decide on a level of
significance…
Your computed “p-value” will indicate whether you
should reject the null or fail to reject the null.
Let’s examine some p-values and make decisions:
If p = .45, we would __________________.
If p = .20, we would __________________.
If p = .09, we would __________________.
If p = .01, we would __________________.
If p = .009, we would __________________.
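The decision rule itself is a single comparison. The sketch below assumes a significance level of .05, which is an assumption for illustration (the slides leave the level to the researcher); pass a different `alpha` to see how the decisions change.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the significance level chosen before testing."""
    if p_value < alpha:
        return "reject the null"
    return "fail to reject the null"

for p in (0.45, 0.20, 0.09, 0.01, 0.009):
    print(f"p = {p}: {decide(p)}")
```

Note that p = .09 leads to different decisions at the .05 and .10 levels, which is why the level must be fixed before looking at the data.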
In summary…
*Sampling is critically important to your study.
*Null and alternative hypotheses are the
foundation of research investigations.
*Interpreting the p-value provides evidence as to
whether you have “statistically significant”
evidence to support your claim or not.
Correlation &
Regression
Correlation
Finding the relationship between two
quantitative variables without being
able to infer causal relationships
Correlation is a statistical technique
used to determine the degree to which
two variables are related
Scattergram (or scatterplot)
• Rectangular coordinates
• Two quantitative variables
• One variable is called independent or
predictor (X) and the second is called
dependent or criterion (Y)
• Points are not joined
• No frequency table
[Figure: sample scattergram with X and Y axes]
Example
Wt. (kg)     67   69   85   83   74   81   97   92   114  85
SBP (mmHg)   120  125  140  160  130  180  150  140  200  130
[Figure: scatter diagram of weight (kg) and systolic blood pressure (mmHg)]
Scatter plots
The pattern of data is indicative of the type of
relationship between your two variables:
➢ positive relationship
➢ negative relationship
➢ no relationship
[Figure: positive relationship: height in cm (y-axis) against age in weeks (x-axis)]
[Figure: negative relationship: reliability against age of car]
[Figure: no relation]
An Example
A familiar statement from parents to children:
If you want to get ahead, stay in school.
Underlying this nagging parental advice is the following claimed
empirical relationship:
LEVEL OF EDUCATION ==(+)==> LEVEL OF SUCCESS IN LIFE
Suppose we collect data by means of a survey asking
respondents (say a representative sample of the population aged 35-55)
to report the number of years of formal EDUCATION they
completed and also their current INCOME (as an indicator of
SUCCESS). We then analyze the association between the two
interval variables in this reformulated hypothesis.
LEVEL OF EDUCATION ==(+)==> LEVEL OF INCOME
(# of years reported)       ($000 per year)
Since these are both continuous variables, we analyze their association
by means of a scattergram or scatterplot.
Data collected from two different societies:
Years of Education versus Yearly Income
An Example (cont.)
Note that the two scattergrams are drawn with the
same horizontal and vertical scales to facilitate
comparison between the two charts.
Both scattergrams show a clear positive association
between the two variables, i.e., the plotted points in
both form an upward-sloping pattern running from
Low – Low to High – High.
At the same time there are obvious differences
between the two scattergrams (and thus between the
relationships between INCOME and EDUCATION in
societies A and B).
Questions For Discussion
In which society, A or B, is the hypothesis more powerfully confirmed?
In which society, A or B, is there a greater incentive for people to stay in school?
Which society, A or B, does the U.S. more closely resemble?
How might we characterize the difference between societies A and B?
An Example (cont.)
We can visually compare and contrast the nature of
the associations between the two variables in the two
scattergrams by drawing a number of vertical strips
in each scattergram.
Points that lie within each vertical strip represent
respondents who have (just about) the same value
on the independent (horizontal) variable of
EDUCATION.
Within each strip, we can estimate (by “eyeball
methods”) the average magnitude of the dependent
(vertical) variable INCOME and put a mark at the
appropriate level.
Average Income for Selected Levels of Education
We can connect these marks to form a line of averages that is
apparently (close to being) a straight line.
An Example (cont.)
Now we can assess two distinct characteristics of the
relationships between EDUCATION and INCOME in
scattergrams A and B.
How much does the average level of INCOME change
among people with different levels of EDUCATION?
How much dispersion of INCOME is there among people with
the same level of EDUCATION?
An Example (cont.)
In both scattergrams, the line of averages is upward-sloping, indicating a
clear apparent positive effect of EDUCATION on INCOME.
But in the scattergram for society A, the upward slope of the line of
averages is fairly shallow.
The line of averages indicates that average INCOME increases
by only about $1000 for each additional year of EDUCATION.
On the other hand, in the scattergram for society B, the upward
slope of the line of averages is much steeper.
The graph in Figure 1B indicates that average INCOME
increases by about $4000 for each additional year of
EDUCATION.
In this sense, EDUCATION is on average more “rewarding” in
society B than A.
An Example (cont.)
There is another difference between the two scattergrams.
In scattergram A, there is almost no dispersion within each vertical strip (and
almost no dispersion around the line of averages as a whole).
In scattergram B, there is a lot of dispersion within each vertical strip (and
around the line of averages as a whole).
We can put this point in simpler language.
In society A, while additional years of EDUCATION produce rewards in
terms of INCOME that are modest (as we saw before), these modest rewards are
essentially certain.
In society B, while additional years of EDUCATION produce on average much
more substantial rewards in terms of INCOME (as we saw before), these large
expected rewards are highly uncertain and are indeed realized only on average.
For example, in scattergram B (but not A), we can find many pairs of cases such
that one case has (much) higher EDUCATION but the other case has (much)
higher INCOME.
An Example (cont.)
This means that in society B, while EDUCATION has a big impact on
INCOME, there are evidently other (independent) variables (maybe
family wealth, ambition, career choice, athletic or other talent, just
plain luck, etc.) that also have major effects on LEVEL OF INCOME.
In contrast, in society A it appears that LEVEL OF EDUCATION
(almost) wholly determines LEVEL OF INCOME and that essentially
nothing else matters.
Another difference between the two societies is that, while both
societies have similar distributions of EDUCATION, their INCOME
distributions are quite different.
(A is quite egalitarian with respect to INCOME, which ranges
only from about $40,000 to about $60,000, while B is
considerably less egalitarian with respect to INCOME, which
ranges from under $10,000 to at least $100,000, and
possibly higher.)
In summary, in society A the INCOME rewards of EDUCATION are
modest but essentially certain, while in society B the INCOME rewards
of EDUCATION are substantial on average but quite uncertain in
individual cases.
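The distinction between the slope of the line of averages and the dispersion around it can be made concrete with simulated data. Everything below is synthetic: the intercepts, noise levels, and sample size are assumptions picked to echo the description of societies A and B (about $1000 versus about $4000 per year of EDUCATION), not the data behind the original scattergrams.

```python
import random
from math import sqrt

def slope_and_r(xs, ys):
    """Least-squares slope of y on x, and Pearson's r."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sxx, sxy / sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
education = [rng.randint(8, 20) for _ in range(200)]  # years of schooling

# Society A (synthetic): shallow slope, almost no scatter around the line.
income_a = [36_000 + 1_000 * e + rng.gauss(0, 500) for e in education]
# Society B (synthetic): steep slope, a lot of scatter around the line.
income_b = [4_000 * e + rng.gauss(0, 12_000) for e in education]

slope_a, r_a = slope_and_r(education, income_a)
slope_b, r_b = slope_and_r(education, income_b)

print(f"A: about ${slope_a:,.0f} per extra year, r = {r_a:.2f}")
print(f"B: about ${slope_b:,.0f} per extra year, r = {r_b:.2f}")
```

The slope captures how "rewarding" EDUCATION is on average, while r captures how certain that reward is; a society can score high on one and low on the other.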
Correlation Coefficient: r
➢ It is also called Pearson's correlation or product moment correlation
coefficient.
➢ It measures the nature and strength of the association between two
variables of the quantitative type.
➢ The sign of r denotes the nature of the association, while the value of r
denotes the strength of the association.
➢ If the sign is +, the relation is direct (an increase in one
variable is associated with an increase in the other variable, and a
decrease in one variable is associated with a decrease in the other
variable).
➢ If the sign is -, the relationship is inverse or indirect
(an increase in one variable is associated with a decrease
in the other).
➢ The value of r ranges between (-1) and (+1).
➢ The value of r denotes the strength of the association, as illustrated
by the following diagram.
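[Figure: scale of r from -1 to +1: perfect indirect correlation at -1; strong from -1 to -0.75; intermediate from -0.75 to -0.25; weak from -0.25 to 0; no relation at 0; weak from 0 to 0.25; intermediate from 0.25 to 0.75; strong from 0.75 to 1; perfect direct correlation at 1]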
If r = 0, there is no association or
correlation between the two variables.
If 0 < |r| < 0.25: weak correlation.
If 0.25 ≤ |r| < 0.75: intermediate correlation.
If 0.75 ≤ |r| < 1: strong correlation.
If |r| = 1: perfect correlation.
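These cut-offs translate directly into a small lookup. The function name `describe_r` is made up for this sketch, and applying the bands to the magnitude of r (so that negative values get the same strength labels) is an assumption based on the symmetric diagram.

```python
def describe_r(r):
    """Label a correlation coefficient using the strength bands above."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size == 0:
        return "no correlation"
    if size < 0.25:
        strength = "weak"
    elif size < 0.75:
        strength = "intermediate"
    elif size < 1:
        strength = "strong"
    else:
        strength = "perfect"
    direction = "direct" if r > 0 else "indirect"
    return f"{strength} {direction} correlation"

for value in (0.1, 0.5, 0.759, -0.94, 1.0):
    print(value, "->", describe_r(value))
```

These bands are a rule of thumb from the slides; other texts draw the lines in slightly different places.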
How to compute the simple correlation
coefficient (r)
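$$ r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right)\left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}} $$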
This slide used for explanation only…not a required “understanding” slide.
Example:
A sample of 6 children was selected, and data about their
age in years and weight in kilograms were recorded as
shown in the following table. It is required to find the
correlation between age and weight.
Serial no.   Age (years)   Weight (kg)
1            7             12
2            6             8
3            8             12
4            5             10
5            6             11
6            9             13
These 2 variables are of the quantitative type. One
variable (age) is called the independent variable and
denoted (X); the other (weight) is called the dependent
variable and denoted (Y). To find the relation between age and
weight, compute the simple correlation coefficient
using the following formula:
$$ r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right)\left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}} $$
Serial no.   Age (years) (x)   Weight (kg) (y)   xy          x²          y²
1            7                 12                84          49          144
2            6                 8                 48          36          64
3            8                 12                96          64          144
4            5                 10                50          25          100
5            6                 11                66          36          121
6            9                 13                117         81          169
Total        ∑x = 41           ∑y = 66           ∑xy = 461   ∑x² = 291   ∑y² = 742
$$ r = \frac{461 - \dfrac{41 \times 66}{6}}{\sqrt{\left(291 - \dfrac{(41)^2}{6}\right)\left(742 - \dfrac{(66)^2}{6}\right)}} $$
r = 0.759
strong direct correlation
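The same calculation can be reproduced step by step in Python. This sketch uses the six age and weight pairs from the table and follows the computational formula term by term; the variable names are illustrative.

```python
from math import sqrt

age = [7, 6, 8, 5, 6, 9]           # x, in years
weight = [12, 8, 12, 10, 11, 13]   # y, in kg

n = len(age)
sum_x, sum_y = sum(age), sum(weight)
sum_xy = sum(x * y for x, y in zip(age, weight))
sum_x2 = sum(x * x for x in age)
sum_y2 = sum(y * y for y in weight)

# Numerator and denominator of the computational formula for r.
numerator = sum_xy - sum_x * sum_y / n
denominator = sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
r = numerator / denominator

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print(f"r = {r:.3f}")
```

The column totals match the table, and r comes out at about 0.76, in line with the 0.759 shown above (the small gap is rounding).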
EXAMPLE: Relationship between Anxiety and
Test Scores
Anxiety (X)   Test score (Y)   X²          Y²          XY
10            2                100         4           20
8             3                64          9           24
2             9                4           81          18
1             7                1           49          7
5             6                25          36          30
6             5                36          25          30
∑X = 32       ∑Y = 32          ∑X² = 230   ∑Y² = 204   ∑XY = 129
Calculating Correlation Coefficient
$$ r = \frac{(6)(129) - (32)(32)}{\sqrt{\left(6(230) - 32^2\right)\left(6(204) - 32^2\right)}} = \frac{774 - 1024}{\sqrt{(356)(200)}} = -0.94 $$
r = -0.94
Indirect strong correlation
exercise
Multiple Correlation Tables
Repeated correlations with multiple variables
t-tests, & ANOVAs
and their application to the statistical
analysis of neuroimaging
Adapted from
Carles Falcon &
Suz Prejawa
Populations and samples
Population: z-tests
Sample (of a population): t-tests
NOTE: a sample can be 2 sets of scores, e.g. fMRI data from 2 conditions
Comparison between Samples
Are these groups
different?
Comparison between Conditions (fMRI)
Reading aloud (script) vs. “Reading” finger spelling (sign)
Reading aloud vs. Picture naming
t-tests
[Figure: bar chart with 95% CI for Exp. 1 and Exp. 2; labels: left hemisphere, lesion site, right hemisphere; legend: comp, infer]
• Compare the mean between 2 samples/conditions
• If 2 samples are taken from the same population,
then they should have fairly similar means
• If 2 means are statistically different, then the
samples are likely to be drawn from 2 different
populations, i.e. they really are different
t-test in word-forming area
• Exp. 1: activation patterns
are similar, not significantly
different: they are similar
tasks and recruit the word-forming area in a similar way
• Exp. 2: activation patterns
are very (and significantly)
different: reading aloud
recruits the word-forming
area significantly more than
naming
[Figure: bar chart with 95% CI for Exp. 1 and Exp. 2; labels: left hemisphere, lesion site, right hemisphere; legend: comp, infer]
Formula
Difference between the means divided by the pooled standard
error of the mean
$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} $$
Example slide only…you DON’T have to know this formula!!
Formula cont.
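$$ t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}}, \qquad s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} $$
where subscripts 1 and 2 refer to Cond. 1 and Cond. 2.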
Types of t-tests
One sample vs. hypothesized mean: 1 sample compared to a
predicted value
Independent samples: 2 experimental conditions and
different participants were assigned to each condition
Paired samples (also called dependent means test): 2 experimental
conditions and the same participants took part in both
conditions of the experiment
Research Question Example
• Let’s pretend you came up with the
following theory…
Having a baby increases brain volume
(associated with possible structural
changes)
Some Problems with a Population-Based Study
• Cost
• Not able to include everyone
• Too time consuming
• Ethical right to privacy
Realistically, researchers can only do
sample based studies.
Paired Sample T-Tests: Pre and Post
Hypothesize
• H0: There is no difference in brain size
before or after giving birth
• HA: The brain is significantly smaller or
significantly larger after giving birth
(difference detected)
Absolute Brain Volumes (cm3)
        Before Delivery   6 Weeks After Delivery   Difference
        1437.4            1494.5                   57.1
        1089.2            1109.7                   20.5
        1201.7            1245.4                   43.7
        1371.8            1383.6                   11.8
        1207.9            1237.7                   29.8
        1150.7            1180.1                   29.4
        1221.9            1268.8                   46.9
        1208.7            1248.3                   39.6
Sum     9889.3            10168.1                  278.8
Mean    1236.1625         1271.0125                34.85
SD      113.8544928       119.0413426              14.80
For a paired test, t is the mean difference divided by the standard error of the
differences, not by the difference between the two SDs:
t = 34.85 / (14.80 / √8) = 34.85 / 5.23 = 6.66
DF = 7
Results: p < .001
Women have a significantly larger brain after giving birth.
http://www.danielsoper.com/statcalc/calc08.aspx
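The paired-samples statistic can be worked directly from the eight before and after volumes listed above. This is a sketch of the standard paired form (mean of the differences over their standard error); it stops at t and the degrees of freedom, since the p-value needs a t-distribution table or calculator.

```python
from math import sqrt
from statistics import mean, stdev

before = [1437.4, 1089.2, 1201.7, 1371.8, 1207.9, 1150.7, 1221.9, 1208.7]
after = [1494.5, 1109.7, 1245.4, 1383.6, 1237.7, 1180.1, 1268.8, 1248.3]

# A paired test works on the within-person differences only.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

mean_diff = mean(diffs)
se_diff = stdev(diffs) / sqrt(n)   # standard error of the mean difference
t = mean_diff / se_diff
df = n - 1

print(f"mean difference = {mean_diff:.2f} cm3")
print(f"t = {t:.2f} on {df} degrees of freedom")
```

Worked this way, t is about 6.66 on 7 degrees of freedom. Because each woman serves as her own control, the large between-person spread in brain volume drops out, which is what makes the paired design sensitive to a fairly small average change.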
The concentration of cholesterol (a type of fat) in the blood is associated with the risk of
developing heart disease, such that higher concentrations of cholesterol indicate a higher
level of risk, and lower concentrations indicate a lower level of risk. If you lower the
concentration of cholesterol in the blood, your risk of developing heart disease can be
reduced. Being overweight and/or physically inactive increases the concentration of
cholesterol in your blood. Both exercise and weight loss can reduce cholesterol
concentration. However, it is not known whether exercise or weight loss is best for
lowering cholesterol concentration. Therefore, a researcher decided to investigate
whether an exercise or weight loss intervention is more effective in lowering cholesterol
levels. To this end, the researcher recruited a random sample of inactive males that were
classified as overweight. This sample was then randomly split into two groups: Group 1
underwent a calorie-controlled diet and Group 2 undertook the exercise-training
program. In order to determine which treatment program was more effective, the mean
cholesterol concentrations were compared between the two groups at the end of the
treatment programs.
This table provides useful descriptive statistics for the two groups
that you compared, including the mean and standard deviation.
This table provides the actual results from the
independent t-test.
This study found that overweight, physically
inactive male participants had statistically
significantly lower cholesterol
concentrations (5.80 ± 0.38 mmol/L) at the
end of an exercise-training programme
compared to after a calorie-controlled diet
(6.15 ± 0.52 mmol/L), t(38) = 2.428, p =
0.020.
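The reported t can be roughly reconstructed from the summary statistics alone. In this sketch the group size of 20 per group is an assumption inferred from the 38 degrees of freedom (38 = 20 + 20 - 2), and the function name is made up for the illustration.

```python
from math import sqrt

def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t = difference in means / standard error of the difference."""
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se

# Diet group first, exercise group second; n = 20 each is an assumption.
t = t_from_summary(6.15, 0.52, 20, 5.80, 0.38, 20)
print(f"t = {t:.2f}")
```

The result is close to the reported 2.428; the small gap comes from working with means and SDs rounded to two decimals.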
Note the mean for each of the two groups in the
“Group Statistics” section. This output shows that
the average weight for European cars is 2431
pounds, versus 2221 pounds for Japanese cars.
To see the results of the t-test for the difference in the two means, find the p-value for
the test. The p-value is labeled as “Sig.” in the SPSS output (“Sig.” stands for
significance level). To find the correct “Sig.”, look in the section of the “Independent
Samples Test” output labeled “t-test for Equality of Means” and you will find a column
labeled “Sig. (2-tailed).” This is the correct column, not the column labeled “Sig.” in the
section of the “Levene’s Test for Equality of Variances” section. Finally, read the “Sig.”
value in the second row, the row labeled “Equal variances not assumed”. We will use
the second row since we almost never have any reason to think a priori that the amount
of variation within each group will be the same (the p-value in the two rows is usually
almost the same anyway). In the above example the p-value is .002, implying that the
difference in means is statistically significant at the .05 and .01 levels.
Comparison of more than 2
samples or complicated designs
Each comparison
brings its own
p-value…too much!
Could be p = .15,
yielding no results!
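One common way to read this concern is in terms of the familywise error rate: if each comparison is tested at .05, the chance of at least one false positive grows with the number of comparisons. The sketch below assumes the tests are independent, which is a simplification, and the function name is illustrative.

```python
def familywise_error(alpha, comparisons):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** comparisons

for k in (1, 3, 6, 10):
    print(f"{k} comparisons at .05: {familywise_error(0.05, k):.3f}")
```

With three groups there are three pairwise comparisons, and the overall rate is already close to .15 rather than .05.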
ANOVA in word-forming area
• Is activation in the word-forming
area a) different for naming
and reading and b) influenced
by age, and if so (a + b), how?
[Figure: bar chart with 95% CI for Naming and Reading; labels: left hemisphere, lesion site, right hemisphere; legend: comp, infer]
Design: TASK (Naming, Reading Aloud) by AGE (Young, Old)
• H1 & H0: reading difference
• H2 & H0: age difference
• H3 & H0: reading/age difference
• Reading causes significantly
stronger activation in the word-forming area, but only in the older
group, so the word-forming area is
more strongly activated during
reading but this seems to be
affected by age
ANOVA
ANalysis Of VAriance (ANOVA)
– Still compares the differences in means between groups but it
uses the variance of data to “decide” if means are different
Many different types of ANOVA…just the basics in this
class, however.
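To show how variance is used to "decide", here is a minimal one-way ANOVA sketch: the F statistic is the variance between group means divided by the variance within groups. The three groups of scores are hypothetical, and the function name is made up for the illustration.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)

    # Between-groups sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: spread of scores around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical scores for three groups (not data from the slides).
group_1 = [27, 30, 25, 29, 26]
group_2 = [24, 22, 25, 23, 26]
group_3 = [23, 21, 24, 22, 25]

f_stat, df_b, df_w = one_way_anova_f([group_1, group_2, group_3])
print(f"F({df_b},{df_w}) = {f_stat:.2f}")
```

A large F means the group means differ by more than the within-group scatter would explain; turning F into a p-value requires the F distribution, which statistical software supplies.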
I’m throwing this in only if you
want to explore detailed types of
ANOVAs…
2-way ANOVA for independent groups (between-subject design)
          Condition I           Condition II
Task I    Participant group A   Participant group B
Task II   Participant group C   Participant group D
repeated measures ANOVA (within-subject design)
          Condition I           Condition II
Task I    Participant group A   Participant group A
Task II   Participant group A   Participant group A
mixed ANOVA (both)
          Condition I           Condition II
Task I    Participant group A   Participant group B
Task II   Participant group A   Participant group B
NOTE: You may have more than 2 levels in each condition/task
A manager wants to raise the productivity at his company
by increasing the speed at which his employees can use a
particular spreadsheet program. As he does not have the
skills in-house, he employs an external agency which
provides training in this spreadsheet program. They offer 3
courses: a beginner, intermediate and advanced course.
He is unsure which course is needed for the type of work
they do at his company, so he sends 10 employees on the
beginner course, 10 on the intermediate and 10 on the
advanced course. When they all return from the training,
he gives them a problem to solve using the spreadsheet
program, and times how long it takes them to complete the
problem. He then compares the three courses (beginner,
intermediate, advanced) to see if there are any differences
in the average time it took to complete the problem.
Descriptive Statistics…
ANOVA Results…
What can be concluded from ANOVA
• There is a significant difference
somewhere between groups
• NOT where the difference lies
• Finding exactly where the difference lies
requires further statistical analysis =
post hoc analysis
Tukey Post-Hoc tests…
• There was a statistically significant
difference between groups as determined
by one-way ANOVA (F(2,27) = 4.467, p =
.021). A Tukey post-hoc test revealed that
the time to complete the problem was
statistically significantly lower after taking
the intermediate (23.6 ± 3.3 min, p = .046)
and advanced (23.4 ± 3.2 min, p = .034)
course compared to the beginners course
(27.2 ± 3.0 min). There were no
statistically significant differences between
the intermediate and advanced groups
(p = .989).
Conclusions
• T-tests compare the means of 2 samples or conditions
• ANOVAS compare 2 groups in more
complicated scenarios or more than 2
groups
Tables used in this presentation are
courtesy of: https://statistics.laerd.com/spss-tutorials/
Additional tests…
• More statistical tests available for more
complicated situations:
• ANCOVAs (same idea as ANOVA but in pre/post
test situations, ANCOVA can be used if your
original groups are statistically different).
• Factor Analysis (investigation into separate factors
which may explain correlations of several
variables)
• MANOVAs (same principle as ANOVA but multiple
dependent variables).
Please answer these questions
Final Thoughts:
1. The goal of this course was to give you an overview of
educational research methods and statistical tests.
Please discuss the degree to which you feel that this
goal was accomplished.
2. Explain which activity of this class was most effective
at increasing your knowledge of educational research
and why it was so effective.
3. Through which activity or activities do you feel you
gained immediately-useful information to help you
improve your understanding of educational research?
4. Which of the statistical tests presented in this course
do you feel you understand the most and why? The
least and why?
5. Did you find that any instructional activities in the
course were not beneficial to your understanding of
educational research?
Sampling, Normal Curve,
and Hypotheses
in Quantitative Research
Types of Sampling
Simple Random Sample
Stratified Random Sample
Cluster sampling
Systematic
Convenience
Simple Random Sample
Every subset of a specified size n from the population has an equal chance of being selected.
Stratified Random Sample
The population is divided into two or more groups called strata, according to some criterion,
such as geographic location, grade level, age, or income, and subsamples are randomly selected
from each strata.
Cluster Sample
The population is divided into subgroups (clusters) like families. A simple random sample is
taken of the subgroups and then all members of the cluster selected are surveyed.
Systematic Sample
Every kth member ( for example: every 10th person) is selected from a list of all population
members.
Convenience Sample
Selection of whichever individuals are easiest to reach.
It is done at the “convenience” of the researcher.
Now you decide:
• including 5 people from every sports
team on a collegiate campus
• including every teacher from 4
elementary schools chosen from a
group of 11 elementary schools in a
school district with 45 elementary
schools total
• including 25 employees whose names were
drawn from a hat 250 school employees
• including all people who attend parent-teacher
conferences
• including every 20th student from a list of
2000 students in a particular high school
Errors in Sampling
Non-Observation Errors
◦ Sampling error: naturally occurs
◦ Coverage error: people sampled do not match the
population of interest
◦ Underrepresentation
◦ Non-response: won’t or can’t participate
As the researcher, you will never eliminate ALL
elements of BIAS…but it is your job to minimize
the impact of bias on your research project by
carefully planning out your research design.
The Normal Curve
The Normal Distribution:
The Normal curve is a mathematical abstraction
which conveniently describes ("models") many
frequency distributions of scores in real-life.
length of time before someone
looks away in a staring contest:
length of pickled gherkins:
Francis Galton (1876) 'On the height and weight of boys aged 14, in town and
country public schools.' Journal of the Anthropological Institute, 5, 174-180:
Francis Galton (1876) 'On the height and weight of boys aged 14, in town and
country public schools.' Journal of the Anthropological Institute, 5, 174-180:
Height of 14 year-old children
16
country
town
14
10
8
6
4
2
0
51
-5
2
53
-5
4
55
-5
6
57
-5
8
59
-6
0
61
-6
2
63
-6
4
65
-6
6
67
-6
8
69
-7
0
frequency (%)
12
height (inches)
Frequency of different wand lengths
An example of a normal distribution - the length of
Sooty's magic wand...
Length of wand
Properties of the Normal Distribution:
1. It is bell-shaped and asymptotic at the extremes.
2. It's symmetrical around the mean.
3. The mean, median and mode all have same value.
4. It can be specified completely, once mean and SD
are known.
5. The area under the curve is directly proportional
to the relative frequency of observations.
e.g. here, 50% of scores fall below the mean, as
does 50% of the area under the curve.
e.g. here, 85% of scores fall below score X,
corresponding to 85% of the area under the curve.
Relationship between the normal curve and the
standard deviation:
All normal curves share this property: the SD cuts off a
constant proportion of the distribution of scores:
[Figure: normal curve. x-axis: number of standard deviations either side of mean, from -3 to +3; y-axis: frequency; bands marked 68%, 95% and 99.7%.]
About 68% of scores fall in the range of the mean plus and minus 1 SD;
95% in the range of the mean +/- 2 SDs;
99.7% in the range of the mean +/- 3 SDs.
e.g. IQ is normally distributed (mean = 100, SD = 15).
68% of people have IQs between 85 and 115 (100 +/- 15).
95% have IQs between 70 and 130 (100 +/- (2*15)).
99.7% have IQs between 55 and 145 (100 +/- (3*15)).
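These proportions can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist` with the IQ mean and SD given above; the helper name `share_within` is made up for the illustration.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # mean and SD as given for IQ above

def share_within(k):
    """Proportion of scores within k SDs either side of the mean."""
    low = iq.mean - k * iq.stdev
    high = iq.mean + k * iq.stdev
    return iq.cdf(high) - iq.cdf(low)

for k in (1, 2, 3):
    print(f"mean +/- {k} SD: {share_within(k):.1%}")
```

The computed areas are roughly 68.3%, 95.4% and 99.7%, which is why the rule is stated with "about".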
[Figure: normal curve of IQ scores; the central 68% lies between 85 (mean - 1 SD) and 115 (mean + 1 SD).]
We can tell a lot about a population just from knowing
the mean, SD, and that scores are normally distributed.
If we encounter someone with a particular score, we can
assess how they stand in relation to the rest of their
group.
e.g. someone with an IQ of 145 is quite unusual (3 SDs
above the mean).
IQs 3 SDs or more above the mean occur in only about 0.15% of the
population [ (100-99.7) / 2 ].
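A short sketch of the same reasoning in Python: convert the score to a number of SDs from the mean, then take the area of the curve above it. The function names are illustrative, and the model assumes IQ is exactly normal with mean 100 and SD 15.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

def z_score(score):
    """How many SDs a score lies above (+) or below (-) the mean."""
    return (score - iq.mean) / iq.stdev

def share_at_or_above(score):
    """Area under the curve to the right of the score."""
    return 1 - iq.cdf(score)

print(z_score(145))
print(f"{share_at_or_above(145):.2%}")
```

The computed tail area is about 0.13%, slightly below the 0.15% obtained from the rounded 99.7% figure.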
Population: all possible values
Sample: a portion of the population
Statistical inference: generalizing from a
sample to a population with a calculated
degree of certainty
Two forms of statistical inference
◦ Hypothesis testing
◦ Estimation
Parameter: a characteristic of the population, e.g.,
population mean µ
Statistic: calculated from data in the sample, e.g.,
sample mean (x̄)
P-hat: a sample proportion, symbolized by p̂
Distinctions Between Parameters and Statistics
(Vocabulary Review)
              Parameters         Statistics
Source        Population         Sample
Notation      Greek (e.g., μ)    Roman (e.g., x̄)
Vary          No                 Yes
Calculated    No                 Yes
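The "Vary" row can be seen in a small simulation. This is a sketch under invented assumptions: a population of 10,000 simulated scores, samples of 50, and an arbitrary cutoff of 115 for the proportion. In real research the population is usually not fully observed, so μ cannot be computed the way it is here.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 scores (invented for illustration).
population = [rng.gauss(100, 15) for _ in range(10_000)]
mu = mean(population)  # parameter: one fixed value for the whole population

def draw_sample_mean(n):
    """Statistic: the mean (x-bar) of one random sample of size n."""
    return mean(rng.sample(population, n))

# Five different samples give five different values of x-bar.
sample_means = [draw_sample_mean(50) for _ in range(5)]

# p-hat: proportion of one sample scoring above 115 (an arbitrary cutoff).
one_sample = rng.sample(population, 50)
p_hat = sum(x > 115 for x in one_sample) / len(one_sample)

print(f"mu = {mu:.2f}")
print("x-bar values:", [round(m, 2) for m in sample_means])
print(f"p-hat = {p_hat:.2f}")
```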
Hypothesis Testing Steps
A. Null and alternative hypotheses
B. Significance level
C. Test statistic
D. P-value and interpretation
General Example:
A criminal trial is an example of hypothesis testing
without the statistics.
In a trial a jury must decide between two hypotheses. The
null hypothesis is
H0: The defendant is innocent
The alternative hypothesis or research hypothesis is
H1: The defendant is guilty
The jury does not know which hypothesis is true. They
must make a decision on the basis of evidence presented.
In the language of statistics convicting the defendant is
called rejecting the null hypothesis in favor of the
alternative hypothesis. That is, the jury is saying that
there is enough evidence to conclude that the defendant
is guilty (i.e., there is enough evidence to support the
alternative hypothesis). We say, “We reject the null.”
If the jury acquits, it is stating that there is not enough
evidence to support the alternative hypothesis. Notice that
the jury is not saying that the defendant is innocent, only
that there is not enough evidence to support the
alternative hypothesis. That is why we never say that we
accept the null hypothesis…we say, “We fail to reject the
null.” (Although non-stats people often get this wrong!)
Specific Example:
Crazy guy...but good at explanations!
Another Specific Example:
A department store manager determines that a new
billing system will be cost-effective only if the mean
monthly account is more than $170.
What null and alternative hypotheses
can we write for this situation?
The system will be cost effective if the mean account
balance for all customers is greater than $170.
We express this belief as our research hypothesis, that is:
H1: μ > 170 (this is what we want to determine)
Thus, our null hypothesis becomes:
H0: μ ≤ 170 (we assume this is true until proven otherwise)
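The four hypothesis testing steps can be sketched for this example as a one-sided z-test. The slides do not give the sample data, so the sample size (400), sample mean (178), population SD (65) and significance level (.05) below are assumptions made purely for illustration; only the value 170 comes from the example.

```python
from math import sqrt
from statistics import NormalDist

# A. Hypotheses: the test is upper-tailed because H1 is mu > 170.
mu0 = 170.0     # boundary value from the example

# B. Significance level (assumed, not given in the slides).
alpha = 0.05

# Hypothetical sample results (assumed for illustration).
n = 400         # number of accounts sampled
x_bar = 178.0   # sample mean monthly account
sigma = 65.0    # population SD, treated as known

# C. Test statistic: how many standard errors x_bar lies above mu0.
z = (x_bar - mu0) / (sigma / sqrt(n))

# D. P-value: probability of a z this large or larger when mu = mu0.
p_value = 1 - NormalDist().cdf(z)

decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"z = {z:.2f}, p-value = {p_value:.4f}, decision: {decision}")
```

With these assumed inputs, z is about 2.46 and the p-value is about .0069, so H0 is rejected at the .05 level.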
Interpretation
The P-value answers the question: What is the
probability of the observed test statistic …
when H0 is true?
Thus, smaller and smaller P-values provide
stronger and stronger evidence against H0
Small P-value → strong evidence for H1
Interpreting the p-value…
The smaller the p-value, the more statistical evidence exists to support the
alternative hypothesis.
• If the p-value is less than 1%, there is overwhelming evidence that supports
the alternative hypothesis.
• If the p-value is between 1% and 5%, there is strong evidence that supports
the alternative hypothesis.
• If the p-value is between 5% and 10%, there is weak evidence that supports
the alternative hypothesis.
• If the p-value exceeds 10%, there is no evidence that supports the alternative
hypothesis.
We observe a p-value of .0069, hence there is overwhelming evidence to
support H1: μ > 170.
[Figure: p-value scale.
0 to .01: Overwhelming Evidence (Highly Significant)
.01 to .05: Strong Evidence (Significant)
.05 to .10: Weak Evidence (Not Significant)
above .10: No Evidence (Not Significant)
The observed p-value (p = .0069) falls in the first band.]
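The evidence scale can be written as a small lookup function. This is a sketch: the function name is invented, and the slides do not say which band a p-value of exactly .01, .05 or .10 belongs to, so the treatment of those boundaries here is an assumption.

```python
def evidence_for_alternative(p_value):
    """Map a p-value to the evidence label used on the scale above."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.01:
        return "overwhelming evidence (highly significant)"
    if p_value < 0.05:   # boundary handling is an assumption
        return "strong evidence (significant)"
    if p_value < 0.10:   # boundary handling is an assumption
        return "weak evidence (not significant)"
    return "no evidence (not significant)"

print(evidence_for_alternative(0.0069))
```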
Conclusions of a Test of
Hypothesis…
If we reject the null hypothesis, we conclude that there is enough
evidence to infer that the alternative hypothesis is true.
If we fail to reject the null hypothesis, we conclude that there is
not enough statistical evidence to infer that the alternative
hypothesis is true. This does not mean that we have proven that
the null hypothesis is true!
Prior to testing, you would decide on a level of
significance…
Your computed “p-value” will indicate whether you
should reject the null or fail to reject the null.
Let’s examine some p-values and make decisions:
If p = .45, we would __________________.
If p = .20, we would __________________.
If p = .09, we would __________________.
If p = .01, we would __________________.
If p = .009, we would __________________.
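The decision depends on the significance level chosen beforehand, which the exercise does not specify. The sketch below assumes the common levels .05 and .01 and the convention "reject when p is less than the significance level"; some textbooks reject when p is less than or equal to it, which changes the answer only for a p-value sitting exactly on the boundary.

```python
def decide(p_value, alpha):
    """Reject the null when the p-value is below the chosen significance level."""
    if p_value < alpha:
        return "reject the null"
    return "fail to reject the null"

# Significance levels are assumptions; the p-values are those listed above.
for alpha in (0.05, 0.01):
    print(f"significance level = {alpha}")
    for p in (0.45, 0.20, 0.09, 0.01, 0.009):
        print(f"  p = {p}: {decide(p, alpha)}")
```

Note that the same p-value can lead to different decisions at different significance levels, which is why the level is fixed before testing.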
In summary…
*Sampling is critically important to your study.
*Null and alternative hypotheses are the
foundation of research investigations.
*Interpreting the p-value provides evidence as to
whether you have “statistically significant”
evidence to support your claim or not.
